A comprehensive Exploratory Data Analysis (EDA) of the Netflix titles dataset. This project analyzes Netflix's content library to uncover insights about movie and TV show distributions, release years, and content trends.
This EDA project examines Netflix's dataset to answer key questions about:
- Distribution of movies vs TV shows
- Trends in content release years
- Content patterns and statistics
- Visualization of key metrics
- Data Processing: Using Pandas for data manipulation and analysis
- Statistical Analysis: Comprehensive data exploration and summary statistics
- Visualizations: High-quality plots using Matplotlib and Seaborn
- Insights: Actionable findings from Netflix content data
- Reproducible: Well-documented code for easy reproduction
The analysis uses the Netflix titles dataset containing information about:
- Movie and TV show titles
- Release years
- Type (Movie/TV Show)
- Genres and descriptions
- Content metadata
- Python: 3.7+
- Data Processing: Pandas
- Visualization: Matplotlib, Seaborn
- Analysis: NumPy, SciPy
- Python 3.7+
- pip or conda
- Clone the repository:
git clone https://github.com/Viblla/netflix-eda.git
cd netflix-eda
- Install dependencies:
pip install -r requirements.txt
- Obtain the Netflix dataset:
  - Download netflix_titles.csv from Kaggle or another source
  - Place it in the project root directory
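Before running the script, it can help to confirm that the dataset is where the script is expected to look for it. The following is a minimal sketch, not part of the repository: `find_dataset` is a hypothetical helper, and it assumes the file keeps the name netflix_titles.csv and sits in the project root.

```python
from pathlib import Path


def find_dataset(root=".", name="netflix_titles.csv"):
    """Return the dataset path if the file exists under root, else None.

    Hypothetical helper: the default file name and location are assumptions
    based on the setup steps in this README.
    """
    path = Path(root) / name
    return path if path.is_file() else None


dataset = find_dataset()
if dataset is None:
    print("netflix_titles.csv not found in the current directory")
else:
    print(f"Found dataset at {dataset.resolve()}")
```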
Run the analysis script:
python eda.py
This will:
- Load and explore the Netflix dataset
- Generate statistical summaries
- Create visualization plots
- Save visualizations as PNG files
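The load-and-summarize steps above might look roughly like the sketch below. It is not the contents of eda.py: `load_titles` and `summarize` are hypothetical names, the column names `type` and `release_year` are assumed to match the commonly distributed Kaggle file, and the inline sample rows are made-up placeholders so the snippet runs without the real dataset.

```python
import io

import pandas as pd

# Made-up placeholder rows standing in for netflix_titles.csv.
# Column names are an assumption about the real file.
SAMPLE_CSV = """show_id,type,title,release_year
s1,Movie,Example Film A,2019
s2,TV Show,Example Series B,2020
s3,Movie,Example Film C,2020
s4,Movie,Example Film D,2015
"""


def load_titles(source):
    """Read a titles CSV from a file path or a file-like object."""
    return pd.read_csv(source)


def summarize(df):
    """Collect a few basic summary figures for a titles DataFrame."""
    return {
        "rows": len(df),
        "missing_per_column": df.isna().sum().to_dict(),
        "type_counts": df["type"].value_counts().to_dict(),
        "year_range": (
            int(df["release_year"].min()),
            int(df["release_year"].max()),
        ),
    }


df = load_titles(io.StringIO(SAMPLE_CSV))
summary = summarize(df)
print(summary)
```

To try this against the real data, pass the path instead of the in-memory sample, for example `load_titles("netflix_titles.csv")`.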
Analysis of the distribution between movies and TV shows in Netflix's catalog.
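A chart of this kind can be produced by counting the values of the type column and plotting them as bars. The sketch below is an illustration under assumptions rather than the project's actual code: it uses plain Matplotlib, assumes a `type` column holding "Movie" and "TV Show", and uses a handful of placeholder rows instead of the real dataset. The output file name matches the one listed in the project structure.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder rows; the real analysis would read netflix_titles.csv instead.
df = pd.DataFrame({"type": ["Movie", "TV Show", "Movie", "Movie", "TV Show"]})

# Count titles per type and express each count as a share of the catalog.
type_counts = df["type"].value_counts()
type_share = (type_counts / type_counts.sum() * 100).round(1)

fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(type_counts.index, type_counts.values)
ax.set_xlabel("Type")
ax.set_ylabel("Number of titles")
ax.set_title("Movies vs TV shows")
fig.tight_layout()
fig.savefig("movies_vs_tv_shows.png", dpi=150)
plt.close(fig)

print(type_share)
```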
Distribution of movie release years showing content trends over time.
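One way to build such a distribution is to filter the rows to movies, count titles per release year, and plot the counts in chronological order. This is a hedged sketch, not the code in eda.py: the `type` and `release_year` column names are assumptions, the rows are placeholders, and plain Matplotlib is used for brevity.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder rows; the real analysis would read netflix_titles.csv instead.
df = pd.DataFrame(
    {
        "type": ["Movie", "Movie", "TV Show", "Movie", "Movie"],
        "release_year": [2015, 2019, 2020, 2020, 2020],
    }
)

# Keep movies only, then count titles per release year in year order.
movies = df[df["type"] == "Movie"]
per_year = movies["release_year"].value_counts().sort_index()

fig, ax = plt.subplots(figsize=(8, 4))
ax.bar(per_year.index, per_year.values)
ax.set_xlabel("Release year")
ax.set_ylabel("Number of movies")
ax.set_title("Movie release years")
fig.tight_layout()
fig.savefig("movie_release_years.png", dpi=150)
plt.close(fig)

print(per_year)
```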
- Netflix catalog includes both movies and TV shows
- Content spans multiple decades with trends in recent releases
- Significant growth in content availability
- Diverse distribution of content types
netflix-eda/
├── eda.py # Main EDA script
├── movies_vs_tv_shows.png # Visualization: Movies vs TV shows
├── movie_release_years.png # Visualization: Release year distribution
├── requirements.txt # Python dependencies
├── README.md # This file
└── LICENSE # MIT License
See requirements.txt for the complete dependency list. Key packages:
- pandas
- matplotlib
- seaborn
- numpy
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
This project is licensed under the MIT License - see the LICENSE file for details.
Feel free to open an issue or discussion for questions about the analysis.
For more Netflix data analysis, check out similar EDA projects on Kaggle.

