This project analyzes Netflix movie data to determine whether movie durations are getting shorter over time. It uses Python's pandas, seaborn, and matplotlib libraries for data analysis and visualization.
The data is sourced from a CSV file named `netflix_data.csv`, which contains information about Netflix shows and movies, including columns like title, genre, release year, and duration.
- Data Loading: Load the dataset using pandas.
- Data Cleaning: Remove rows with missing values and keep only the movies in the dataset.
- Data Exploration: Visualize movie durations over the years using scatter plots.
- Genre Color Mapping: Assign specific colors to different genres.
- Visualization: Create scatter plots with genre-based coloring to analyze movie durations over the years.
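The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (`type`, `genre`, `release_year`, `duration`), the `"Movie"` label, and the genre-to-color mapping are all assumptions, and the plot uses plain matplotlib rather than seaborn to keep the example short.

```python
import pandas as pd

# Assumed column names; adjust them to match the real CSV header.
REQUIRED_COLUMNS = ["type", "genre", "release_year", "duration"]

# Hypothetical genre-to-color mapping; any genre not listed gets DEFAULT_COLOR.
GENRE_COLORS = {
    "Children": "red",
    "Documentaries": "blue",
    "Stand-Up": "green",
}
DEFAULT_COLOR = "black"


def clean_movies(df):
    """Drop rows with missing values in the needed columns and keep only movies."""
    df = df.dropna(subset=REQUIRED_COLUMNS)
    # Assumes the `type` column labels movies as "Movie".
    return df[df["type"] == "Movie"].reset_index(drop=True)


def load_movies(path="netflix_data.csv"):
    """Load the CSV with pandas and apply the cleaning step."""
    return clean_movies(pd.read_csv(path))


def genre_colors(movies):
    """Return one color per movie, based on its genre."""
    return movies["genre"].map(GENRE_COLORS).fillna(DEFAULT_COLOR)


def plot_durations(movies):
    """Scatter plot of duration against release year, colored by genre."""
    # Imported here so the cleaning helpers work without a plotting backend.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(12, 8))
    ax.scatter(
        movies["release_year"],
        movies["duration"],
        c=list(genre_colors(movies)),
    )
    ax.set_xlabel("Release year")
    ax.set_ylabel("Duration (min)")
    ax.set_title("Movie duration by year of release")
    return fig
```

With the dataset in place, `plot_durations(load_movies())` would produce the figure, and `plt.show()` or `fig.savefig(...)` would display or save it.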
The analysis suggests that the average duration of movies has been declining over time, with noticeable variation across genres.
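One way to check a trend like this numerically is to average durations per release year and fit a straight line through those averages. The sketch below assumes a DataFrame of movies with `release_year` and `duration` columns (assumed names); the helper functions are illustrative and not taken from the notebook.

```python
import pandas as pd


def mean_duration_by_year(movies):
    """Average movie duration for each release year, sorted by year."""
    return movies.groupby("release_year")["duration"].mean().sort_index()


def duration_trend(movies):
    """Least-squares slope of yearly mean duration, in minutes per year.

    A negative value means the yearly average is getting shorter.
    """
    yearly = mean_duration_by_year(movies)
    years = pd.Series(yearly.index, index=yearly.index, dtype="float64")
    # Slope of a simple linear fit: cov(x, y) / var(x).
    return years.cov(yearly) / years.var()


def mean_duration_by_genre(movies):
    """Average duration per genre, to compare variation across genres."""
    return movies.groupby("genre")["duration"].mean().sort_values()
```

A slope computed this way weights every year equally regardless of how many movies it contains, so years with few releases can dominate; it is a rough indicator rather than a formal test.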
- The code for data loading, cleaning, exploration, and visualization is provided in Python using libraries like pandas, seaborn, and matplotlib.
- The analysis code is available in the provided Jupyter Notebook.
- Ensure you have Python and the required libraries installed.
- Download the `netflix_data.csv` dataset and place it in the same directory.
- Open the Jupyter Notebook or Python script and run the code cells to perform the analysis.
The project uses the Netflix movie dataset and leverages the power of pandas, seaborn, and matplotlib for data analysis and visualization.