Arralle21/Netflix-Data-Visualization-Python-R

Netflix Content Analysis & Visualization

Abdullahi Mohamed Jibril

Submission Date: April 20, 2025

A comprehensive data analysis project that explores Netflix's content library using both Python and R visualization techniques.

Project Overview

This project analyzes Netflix's content library dataset to extract insights about content types, genres, ratings, and temporal trends. The analysis is implemented in both Python and R to demonstrate different visualization approaches.

Files in this Repository

netflix_visualization.py - Python script for data preparation, cleaning, and visualization
Netflix_visualizations.R - R script for complementary data analysis and visualization
Netflix_shows_movies.csv - Source dataset
netflix_dataset.csv - Cleaned dataset used by the R script

Generated visualizations:

top_genres.png - Bar chart of top 15 Netflix genres
content_type_distribution.png - Distribution of content types
ratings_distribution.png - Distribution of content ratings
yearly_content_additions.png - Content added to Netflix by year
movie_duration_distribution.png - Distribution of movie durations
top_genres_r.png - R-generated bar chart of top genres
rating_by_type_r.png - R-generated visualization of ratings by content type

Key Features

Data Cleaning: Handles missing values and prepares data for analysis
Exploratory Data Analysis: Provides statistical summaries and distributions
Visualizations: Creates charts of Netflix content types, genres, ratings, and trends over time
Cross-language Implementation: Demonstrates both Python and R approaches
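As a rough illustration of the data-cleaning step, the sketch below fills missing text fields and drops unusable rows with pandas. It is not taken from netflix_visualization.py: the function name `clean_titles` and the column names (`type`, `director`, `country`) are assumptions chosen for the example, and the sample rows are made up.

```python
import pandas as pd


def clean_titles(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy of a titles table (hypothetical column names)."""
    cleaned = df.copy()
    # Assumed text columns: replace missing values with a placeholder label.
    for col in ("director", "country"):
        if col in cleaned.columns:
            cleaned[col] = cleaned[col].fillna("Unknown")
    # Rows without a content type cannot be grouped later, so drop them.
    cleaned = cleaned.dropna(subset=["type"])
    # Remove exact duplicate rows and renumber the index.
    return cleaned.drop_duplicates().reset_index(drop=True)


# Made-up sample rows, only to exercise the function.
sample = pd.DataFrame({
    "title": ["A", "B", "B", "C"],
    "type": ["Movie", "TV Show", "TV Show", None],
    "director": [None, "Jane Doe", "Jane Doe", "John Roe"],
    "country": ["United States", None, None, "India"],
})
cleaned_sample = clean_titles(sample)
print(cleaned_sample)
```

Whether to fill, drop, or keep missing values depends on the chart being built; a placeholder such as "Unknown" keeps rows available for counts by type or rating.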

Technologies Used

Python: pandas, matplotlib, seaborn, numpy
R: tidyverse, ggplot2, dplyr, stringr

How to Run

Python Script

    python netflix_visualization.py

R Script

    Rscript Netflix_visualizations.R

Sample Insights

Distribution of Movies vs TV Shows in the Netflix library
Most popular content genres
Content ratings analysis
Trends in content additions over time
Movie duration analysis
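One way the genre ranking could be computed is sketched below. It assumes each title stores its genres as a comma-separated string in a column named `listed_in`; that column name, the helper `top_genres`, and the sample rows are assumptions for illustration, not details confirmed by the repository.

```python
import pandas as pd


def top_genres(df: pd.DataFrame, column: str = "listed_in", n: int = 15) -> pd.Series:
    """Count individual genres in a comma-separated column and keep the top n."""
    genres = (
        df[column]
        .dropna()            # skip titles with no genre information
        .str.split(",")      # "Dramas, Comedies" -> ["Dramas", " Comedies"]
        .explode()           # one row per genre
        .str.strip()         # remove the space left after each comma
    )
    return genres.value_counts().head(n)


# Made-up sample rows, only to exercise the function.
titles = pd.DataFrame({
    "listed_in": ["Dramas, International Movies", "Comedies", "Dramas, Comedies", None],
})
genre_counts = top_genres(titles)
print(genre_counts)
```

The resulting Series can be passed to a bar plot (for example `genre_counts.plot.barh()`) to produce a chart along the lines of top_genres.png.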

Prerequisites

Python 3.x with pandas, matplotlib, seaborn, and numpy
R with tidyverse, ggplot2, dplyr packages
Netflix dataset CSV file in the working directory

Notes

The R script expects the cleaned CSV output from the Python script. Run the Python script first to generate netflix_cleaned.csv.
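The run order described in the note can be enforced with a small guard, sketched here under stated assumptions. The helper `run_r_if_ready` is hypothetical and not part of the repository; it takes the cleaned file name from the note and the R script name from the file list, and it assumes `Rscript` is on the PATH.

```python
from pathlib import Path
import subprocess


def run_r_if_ready(cleaned_csv="netflix_cleaned.csv",
                   r_script="Netflix_visualizations.R",
                   runner=subprocess.run):
    """Run the R script only if the cleaned CSV already exists.

    Returns True if the R script was invoked, False if the CSV was missing.
    The `runner` parameter exists so the call can be replaced in tests.
    """
    if not Path(cleaned_csv).is_file():
        print(f"{cleaned_csv} not found; run netflix_visualization.py first.")
        return False
    # check=True raises CalledProcessError if Rscript exits with an error.
    runner(["Rscript", r_script], check=True)
    return True
```

Calling `run_r_if_ready()` from the repository directory after the Python script has finished would then start the R visualizations, and it does nothing beyond printing a hint if the cleaned file is absent.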


THANKS
