**1. Importing the Data & Visualization Libraries**

In [None]:
# Import the analysis and plotting libraries
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.express as px
import plotly.graph_objs as go
from plotly.offline import init_notebook_mode, iplot
from datetime import datetime
import warnings

# Embed plotly.js in the notebook instead of loading it from a CDN
init_notebook_mode(connected=False)

# Suppress warnings to keep the notebook output readable
warnings.filterwarnings('ignore')

# Load the Netflix titles dataset and preview it
data = pd.read_csv('/kaggle/input/netflix-shows/netflix_titles.csv')
data

**2. Handling Missing Values**

In [None]:
data.columns

In [None]:
print(data.isnull().sum())
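
Raw null counts are easier to act on when expressed as a share of all rows, since that helps decide which columns to fill with a placeholder and which rows to drop. The following is a minimal sketch on a made-up frame; `toy` and `missing_pct` are illustrative names, not part of the notebook.

```python
import pandas as pd

# Hypothetical miniature of the dataset, for illustration only
toy = pd.DataFrame({
    "director": ["A", None, None, "D"],
    "country": ["US", "IN", None, "UK"],
    "duration": ["90 min", "1 Season", "100 min", "2 Seasons"],
})

# isnull().mean() gives the fraction of missing values per column
missing_pct = toy.isnull().mean().mul(100).sort_values(ascending=False)
print(missing_pct)
```

On the real data, the same expression would be `data.isnull().mean().mul(100)`.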


In [None]:
data.dtypes

In [None]:
data.describe()

In [None]:
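# Fill missing categorical values with a placeholder.
# Assigning the result back avoids calling fillna(inplace=True) on a column
# selection, which newer pandas versions warn about and which has no effect
# on the DataFrame under Copy-on-Write.
fill_cols = ['country', 'director', 'cast', 'rating']
data[fill_cols] = data[fill_cols].fillna('Unknown')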



In [None]:
print(data.isnull().sum())

In [None]:
data.info()

In [None]:
data = data.dropna(subset=['date_added','duration'])


**Correcting the date format**

In [None]:
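# Parse the date strings once, after stripping any stray whitespace.
# format='mixed' requires pandas 2.0 or later.
# Converting to '%d/%m/%Y' strings and parsing again is unnecessary and risky:
# without dayfirst=True, a string such as '05/03/2021' is read as May 3.
data['date_added'] = pd.to_datetime(data['date_added'].str.strip(), format='mixed')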


data.info()
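
One thing to watch when a date passes through a `'%d/%m/%Y'` string: `pd.to_datetime` reads ambiguous strings month-first unless told otherwise. The sketch below uses a made-up date to show the difference; all variable names are illustrative.

```python
import pandas as pd

original = pd.Timestamp("2021-03-05")       # 5 March 2021
as_text = original.strftime("%d/%m/%Y")     # '05/03/2021'

# Default parsing treats the first number as the month
month_first = pd.to_datetime(as_text)

# Two ways to recover the intended date
day_first = pd.to_datetime(as_text, dayfirst=True)
explicit = pd.to_datetime(as_text, format="%d/%m/%Y")

print(month_first.date(), day_first.date(), explicit.date())
```

Keeping the column as a datetime after the first parse avoids the ambiguity altogether.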

**Question 1: What is the distribution of content types on Netflix?**

In [None]:
plt.figure(figsize=(6, 6))
sns.countplot(data=data, x='type')
plt.title('Distribution of Movies and TV Shows')
plt.xlabel('Content Type')
plt.ylabel('Count')
plt.show()


**Question 2: What are the most common genres listed in the dataset?**

In [None]:
genre_counts = data['listed_in'].str.split(',').explode().str.strip().value_counts()

plt.figure(figsize=(18, 15))
genre_counts.head(15).plot(kind='bar')
plt.title('Top 15 Genres on Netflix')
plt.xlabel('Genre')
plt.ylabel('Count')
plt.xticks(rotation=45)
plt.show()
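
The `split` / `explode` / `strip` / `value_counts` chain used above is easiest to follow on a few rows. This is a small sketch with made-up genre strings (the values and the names `toy`, `counts`, and `no_strip` are assumptions for illustration); it also shows why the `strip` step matters.

```python
import pandas as pd

# Hypothetical comma-separated genre strings, some with stray spaces
toy = pd.Series([
    "Dramas, Comedies",
    "Dramas",
    "Dramas,Thrillers ",
    "Comedies, Dramas",
])

# One row per genre, whitespace removed, then counted
counts = toy.str.split(",").explode().str.strip().value_counts()

# Without strip, ' Comedies' and 'Comedies' count as different genres
no_strip = toy.str.split(",").explode().value_counts()

print(counts)
print(len(counts), "distinct genres with strip;", len(no_strip), "without")
```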


**Question 3: Which countries have the highest number of movies and TV shows on Netflix?**


In [None]:
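# Split co-productions into individual countries and count each one.
# Drop the 'Unknown' placeholder and empty strings so they are not plotted as countries.
countries = data['country'].str.split(',').explode().str.strip()
country_counts = countries[~countries.isin(['Unknown', ''])].value_counts()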

plt.figure(figsize=(10, 8))
country_counts.head(15).plot(kind='bar')
plt.title('Top 15 Countries with Most Content on Netflix')
plt.xlabel('Country')
plt.ylabel('Count')
plt.xticks(rotation=45)
plt.show()


**Question 4: Which are the top 10 countries by content type?**

In [None]:
country_type_counts = data.groupby(['country', 'type']).size().unstack().fillna(0)

country_type_counts.nlargest(10, ['Movie', 'TV Show']).plot(kind='barh', stacked=True, figsize=(12, 8))
plt.title('Top 10 Countries by Content Type')
plt.xlabel('Count')
plt.ylabel('Country')
plt.show()
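
The grouping above treats a co-production string such as `'United States, India'` as a country of its own and keeps the `'Unknown'` placeholder. A possible variant, assuming the same comma-separated format as in Question 3, splits the countries first and ranks by total titles. It is sketched here on a made-up frame; `toy`, `exploded`, `by_country_type`, and `top` are illustrative names.

```python
import pandas as pd

# Hypothetical rows, for illustration only
toy = pd.DataFrame({
    "country": ["United States, India", "India", "United States",
                "Unknown", "United States"],
    "type": ["Movie", "Movie", "TV Show", "Movie", "Movie"],
})

# One row per (title, country) pair
exploded = toy.assign(country=toy["country"].str.split(",")).explode(
    "country", ignore_index=True
)
exploded["country"] = exploded["country"].str.strip()
exploded = exploded[exploded["country"] != "Unknown"]

# Movie / TV Show counts per country, ranked by total titles
by_country_type = pd.crosstab(exploded["country"], exploded["type"])
top = by_country_type.loc[by_country_type.sum(axis=1).nlargest(10).index]
print(top)
```

With this approach a co-production counts once for each participating country, so the totals can exceed the number of titles.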


**Question 5: What is the distribution of content ratings on Netflix?**

In [None]:
plt.figure(figsize=(10, 6))
sns.countplot(data=data, x='rating', order=data['rating'].value_counts().index)
plt.title('Distribution of Content Ratings on Netflix')
plt.xlabel('Rating')
plt.ylabel('Count')
plt.xticks(rotation=45)
plt.show()


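**Additional question: Which actors appear in the most titles on Netflix?**

In [None]:
# Count appearances per actor, skipping the 'Unknown' placeholder used for missing casts
actors = data['cast'].str.split(',').explode().str.strip()
actor_counts = actors[actors != 'Unknown'].value_counts()

plt.figure(figsize=(12, 8))
actor_counts.head(15).plot(kind='bar', color='coral')
plt.title('Top 15 Actors by Number of Netflix Titles')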
plt.xlabel('Actor')
plt.ylabel('Number of Appearances')
plt.xticks(rotation=45)
plt.show()


**Question 6: How has the number of movies and TV shows added to Netflix changed over time?**

In [None]:
# Group by the year each title was added to Netflix, not by its release year
year_added = data['date_added'].dt.year.rename('year_added')
yearly_content_added = data.groupby([year_added, 'type']).size().unstack().fillna(0)

yearly_content_added.plot(kind='bar', stacked=True, figsize=(12, 8))
plt.title('Movies vs. TV Shows Added to Netflix Over the Years')
plt.xlabel('Year Added')
plt.ylabel('Number of Titles')
plt.show()


**Conclusion**
   
The analysis of the Netflix dataset shows how the platform's catalog is distributed across genres, countries, ratings, and time:

Common Genres: The dataset reveals that genres like Drama, Comedy, and Action are the most frequently listed. This suggests a strong focus on these categories to cater to a broad audience base.

Content by Country: The United States, India, and the United Kingdom lead in the number of movies and TV shows available on Netflix. This indicates a significant presence of content from these regions, likely due to their large entertainment industries and audiences.

Top 10 Countries by Content Type: The United States dominates the list of top countries by content type, followed by India and the United Kingdom. This reflects the global reach of these countries' entertainment content on the platform.

Distribution of Content Ratings: The analysis of content ratings shows that Netflix caters to a wide audience, with a substantial amount of content rated for mature audiences (e.g., TV-MA, R). However, there is also a significant presence of family-friendly ratings like PG and TV-PG.

Content Growth Over Time: The number of movies and TV shows added to Netflix has grown significantly over the years, with a notable increase around 2017-2018. This trend is consistent with an aggressive content expansion strategy aimed at attracting and retaining subscribers globally.

Overall, these insights demonstrate Netflix's strategic content diversification across genres, countries, and age groups, contributing to its global success and appeal to a wide range of audiences.