- Introduction
- Architecture
- Project Structure
- Libraries
- Installation
- Usage
- Streamlit Interface
- Deployment
- Reports
- Tests & CI
Netflix EDA is an Exploratory Data Analysis (EDA) project that analyzes the trends, genres, and performance of Netflix titles. It explores historical and forecast data files, analyzes popular movies and TV shows, and generates detailed reports that visualize insights from the data.
Here is a Mermaid diagram showing the main components and their interactions:
flowchart LR
A[Data Import] --> B[Data Cleaning]
B --> C[Data Exploration]
C --> D[Data Visualization]
D --> E[Generate Reports]
E --> F[Streamlit Dashboard]
- Data Import: Loading raw data from CSV files.
- Data Cleaning: Preparing and cleaning the data.
- Data Exploration: Statistical exploration and identifying key trends.
- Data Visualization: Generating charts to better understand the data.
- Generate Reports: Generating detailed PDF and CSV reports.
- Streamlit Dashboard: Displaying the results on an interactive web interface.
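The first three stages above can be sketched as follows. This is a minimal illustration, not the project's actual code: it uses only the standard library so that it runs without dependencies (the project itself lists pandas), and the sample rows, column names, and function names are all assumptions.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for a raw CSV file; the column names are
# assumptions, not the project's actual schema.
RAW_CSV = """title,type,release_year
Show A,TV Show,2019
Movie B,Movie,2020
Movie B,Movie,2020
Movie C,Movie,
"""


def import_data(text):
    """Data Import: parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_data(rows):
    """Data Cleaning: trim whitespace, drop incomplete rows and duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        title = row["title"].strip()
        kind = row["type"].strip()
        year = row["release_year"].strip()
        if not (title and kind and year):
            continue  # skip rows with missing values
        key = (title, kind, year)
        if key in seen:
            continue  # skip exact duplicates
        seen.add(key)
        cleaned.append({"title": title, "type": kind, "release_year": int(year)})
    return cleaned


def explore_data(rows):
    """Data Exploration: count titles per type."""
    return Counter(row["type"] for row in rows)


rows = clean_data(import_data(RAW_CSV))
counts = explore_data(rows)
print(dict(counts))
```

Keeping each stage as its own function mirrors the flowchart: each step takes the previous step's output, so a stage can be tested or replaced without touching the others.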
- assets/: Contains visual resources such as icons and CSS styles.
- data/: Folder containing raw (raw) and processed (processed) data files.
- env/: Virtual environment for managing project dependencies.
- notebooks/: Exploratory analysis with Jupyter.
- reports/: Reports generated during analysis, in PDF format.
- scripts/: Python scripts for analysis, report generation, etc.
- visualization/: Contains scripts and tools for visualization.
- requirements.txt: List of project dependencies.
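A quick way to confirm that a checkout matches the layout listed above is a small check like the one below. It is a sketch under stated assumptions: `missing_entries` is a hypothetical helper, not part of the project, and `env/` is left out because it is created locally during installation.

```python
from pathlib import Path

# Folder names taken from the list above; requirements.txt is checked as a file.
EXPECTED_DIRS = ["assets", "data", "notebooks", "reports", "scripts", "visualization"]
EXPECTED_FILES = ["requirements.txt"]


def missing_entries(root):
    """Return the expected folders and files that are absent under root."""
    root = Path(root)
    missing = [f"{name}/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    missing += [name for name in EXPECTED_FILES if not (root / name).is_file()]
    return missing


if __name__ == "__main__":
    # Run from the repository root; prints the gaps, or a short confirmation.
    print(missing_entries(".") or "layout looks complete")
```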
- pandas → Data management and analysis.
- matplotlib → Data visualization.
- seaborn → Advanced data visualization.
- numpy → Numerical computation.
- scikit-learn → Statistical models and machine learning.
- streamlit → Interactive web interface to visualize data and EDA results.
- pytest → Unit and integration testing.
- fpdf → PDF report creation.
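To check that the libraries listed above are available in the active environment, something like the following could be used. It is a sketch, not part of the project: `missing_packages` is a hypothetical helper. Note that scikit-learn is installed under that name but imported as `sklearn`.

```python
from importlib.util import find_spec

# Install name (as used with pip) mapped to import name.
# Only scikit-learn differs between the two.
PACKAGES = {
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "numpy": "numpy",
    "scikit-learn": "sklearn",
    "streamlit": "streamlit",
    "pytest": "pytest",
    "fpdf": "fpdf",
}


def missing_packages(packages=PACKAGES):
    """Return the install names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, module in packages.items()
        if find_spec(module) is None
    ]


print(missing_packages() or "all listed packages are importable")
```

`find_spec` only locates the module without importing it, so the check stays fast even for heavy packages.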
- Clone the repository:
git clone https://github.com/YourUsername/Netflix_EDA.git
cd Netflix_EDA
- Create and activate a Python environment:
python3.11 -m venv env
source env/bin/activate
- Install dependencies:
pip install -r requirements.txt
- Install the project in development mode:
pip install -e .
- Run the Streamlit interface to explore the data interactively:
streamlit run scripts/streamlit_app.py
The project includes a Streamlit interface that allows you to visualize the EDA interactively. You can easily explore trends of Netflix movies and TV shows, view charts on popular genres, ratings, and much more.
To start the Streamlit interface, simply run:
streamlit run scripts/streamlit_app.py
The web interface will launch, and you can interact with your data via a visual dashboard.
- You can try this project on Streamlit instead of installing it locally.
Analysis results are generated as PDF reports in the reports/ folder, containing visualizations and statistical summaries.
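Report filenames such as Netflix_Analysis_Report_20250406.pdf appear to carry a date stamp in YYYYMMDD form. Assuming that pattern, a dated path could be built as sketched below; `report_path` is a hypothetical helper, and the naming scheme is inferred from that single filename rather than confirmed by the project.

```python
from datetime import date
from pathlib import Path


def report_path(day, reports_dir="reports"):
    """Build a dated report path inside reports_dir (assumed naming pattern)."""
    return Path(reports_dir) / f"Netflix_Analysis_Report_{day:%Y%m%d}.pdf"


print(report_path(date(2025, 4, 6)).as_posix())
```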
Example of a generated report:
./reports/Netflix_Analysis_Report_20250406.pdf
To run unit and integration tests:
pytest --cov=netflix_eda --cov-report=term-missing --cov-report=xml
MIT © 2025 [YourUsername]