# The pandemic changed the dynamics of businesses across the globe, and 'Media & Entertainment' is one such industry.
# With cinema halls closed, OTT platforms witnessed a massive surge in adoption.

In [None]:
import warnings

import numpy as np  # fundamental package for scientific computing with Python
import pandas as pd  # high-performance, easy-to-use data structures and data analysis
import matplotlib
import matplotlib.pyplot as plt  # for plotting
from matplotlib import rcParams
import seaborn as sns  # for making plots with seaborn
import plotly.express as px
import plotly.graph_objs as go
from plotly.offline import init_notebook_mode, iplot
import cufflinks as cf  # plotly bindings for pandas
from IPython.display import Image, display

color = sns.color_palette()

# Enable plotly and cufflinks offline mode (one call each is enough)
init_notebook_mode(connected=True)
cf.go_offline()

# Suppress unnecessary warnings so that the presentation looks clean
warnings.filterwarnings("ignore")

In [None]:
ott=pd.read_csv('../input/movies-on-netflix-prime-video-hulu-and-disney/MoviesOnStreamingPlatforms_updated.csv')

In [None]:
ott.head()

In [None]:
ott.columns

In [None]:
ott.shape

In [None]:
ott['Country'].value_counts()

In [None]:
ott.dtypes

In [None]:
# Correlate the numeric columns only; text columns cannot be correlated
corr = ott.select_dtypes(include='number').corr()
# Plot figsize
fig, ax = plt.subplots(figsize=(10, 8))
# Generate heat map, allow annotations and place floats in map.
# seaborn labels the ticks from corr's index and columns, so no manual ticks are needed.
sns.heatmap(corr, cmap='magma', annot=True, fmt=".2f", ax=ax)
# Show plot
plt.show()

# Checking the missing values and dropping unnecessary columns

In [None]:
def missing_percentage(data):
    """
    Return the count and percentage of missing values for every column
    of `data` that has at least one missing value, worst first.
    """
    # Use the `data` argument throughout instead of the global `ott`
    total = data.isnull().sum().sort_values(ascending=False)
    total = total[total != 0]
    percent = total / len(data) * 100
    return pd.concat([total, percent], axis=1, keys=['Total', 'Percent'])

In [None]:
missing = missing_percentage(ott)

fig, ax = plt.subplots(figsize=(20, 5))
sns.barplot(x=missing.index, y='Percent', data=missing, palette='Reds_r')
plt.xticks(rotation=90)

display(missing.T.style.background_gradient(cmap='Reds', axis=1))
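
For readers who want to try the missing-value check without the Kaggle file, here is a minimal sketch of the same idea. The `missing_summary` helper and the `toy` frame are made up for illustration and are not part of the dataset; `isnull().mean()` gives the missing fraction per column directly.

```python
import numpy as np
import pandas as pd


def missing_summary(data: pd.DataFrame) -> pd.DataFrame:
    """Count and percentage of missing values per column, worst first."""
    total = data.isnull().sum()
    percent = data.isnull().mean() * 100  # mean of booleans = missing fraction
    summary = pd.DataFrame({"Total": total, "Percent": percent})
    # Keep only the columns that actually have gaps
    return summary[summary["Total"] > 0].sort_values("Total", ascending=False)


# Hypothetical toy frame standing in for `ott`
toy = pd.DataFrame({
    "Title": ["A", "B", "C", "D"],
    "IMDb": [7.1, np.nan, 6.4, 8.0],
    "Rotten Tomatoes": [np.nan, np.nan, "80%", np.nan],
})
toy_missing = missing_summary(toy)
print(toy_missing)
```

Columns without any missing value (here `Title`) are left out of the summary, which keeps the bar chart short.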

In [None]:
ott.columns

Rotten Tomatoes has around 69% missing values, so drop it.
Age doesn't seem to add much value to our analysis either, so drop it too.

In [None]:
ott.drop(['Rotten Tomatoes','Unnamed: 0','Type','ID','Age'],axis=1, inplace=True)
ott.head()

In [None]:
ott['Language'].value_counts()

# Which genres bagged the maximum, minimum and average ratings across OTT platforms?

In [None]:
ott.groupby('Genres').IMDb.agg(['count','max','min','mean'])

# What is the rating range for movies?

In [None]:
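# rating distribution
rcParams['figure.figsize'] = 10,8
# `fill` replaces the deprecated `shade` argument in recent seaborn versions
g = sns.kdeplot(ott.IMDb, color="Red", fill=True)
g.set_xlabel("Rating")
g.set_ylabel("Density")  # a KDE plots density, not raw frequency
plt.title('Ratings Range',size = 15)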

# What is the typical runtime range across the platforms?

In [None]:
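# runtime distribution
rcParams['figure.figsize'] = 10,8
# `fill` replaces the deprecated `shade` argument in recent seaborn versions
g = sns.kdeplot(ott.Runtime, color="Red", fill=True)
g.set_xlabel("Runtime (mins)")
g.set_ylabel("Density")  # a KDE plots density, not raw frequency
plt.title('Runtime Range',size = 15)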

**Nearly all the movies run well below 200 mins (i.e. 3 hrs 20 mins), and the distribution peaks at around 120 mins**

**Hence, most movies run roughly between 100 and 130 mins**
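
A reading taken from a density plot can be cross-checked with quantiles. The sketch below uses a made-up `runtime` series (not the real `ott['Runtime']` column) just to show the calls; on the notebook data you would pass `ott['Runtime'].dropna()` instead.

```python
import pandas as pd

# Hypothetical runtimes in minutes, standing in for ott['Runtime']
runtime = pd.Series([85, 92, 100, 104, 110, 118, 120, 125, 130, 190])

# Quartiles: the middle 50% of titles fall between q1 and q3
q1, median, q3 = runtime.quantile([0.25, 0.5, 0.75])

# Share of titles inside the 100-130 minute band (bounds inclusive)
share_in_band = runtime.between(100, 130).mean()

print(f"q1={q1}, median={median}, q3={q3}, share in band={share_in_band:.0%}")
```

If the interquartile range sits inside the band and the share is high, the "most movies" statement holds up.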

# Which OTT platform hosted the maximum and minimum number of movies?

In [None]:
total_movies_Netflix = len(ott[ott['Netflix'] == 1].index)
total_movies_Hulu = len(ott[ott['Hulu'] == 1].index)
total_movies_Prime =len(ott[ott['Prime Video'] == 1].index)
total_movies_Disney = len(ott[ott['Disney+'] == 1].index)


print(total_movies_Netflix)
print(total_movies_Hulu)
print(total_movies_Prime)
print(total_movies_Disney)

In [None]:
tags=['Netflix','Hulu', 'Prime Video','Disney+']
counts=[total_movies_Netflix,total_movies_Hulu,total_movies_Prime,total_movies_Disney]
ott_platform = pd.DataFrame(
    {'Platform': tags,
     'MovieCount': counts,
    })
ott_platform

In [None]:
fig = px.pie(ott_platform,names='Platform', values='MovieCount')
# One `pull` value per slice (four platforms)
fig.update_traces(rotation=45,pull=[0.1,0.03,0.03,0.03],title='MOVIE DISTRIBUTION ACROSS PLATFORMS')
fig.show()

1. Prime Video (owned by Amazon) hosts the maximum number of movies: 12,354 (71.1%).
2. Netflix stands second with around 3,560 movies hosted on its online streaming platform (i.e. 20.5%).
3. Hulu, which is owned by Walt Disney, and Disney Plus (which recently acquired Hotstar in India) together have 1,467 movies available for viewers (i.e. around 8.44%), which is less than half of what Netflix has.
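
The pie-chart percentages are each platform's count divided by the sum of all four counts. Since one title can be on several platforms, that sum can be larger than the number of rows. A minimal sketch with made-up 0/1 flags (the `toy` frame is hypothetical; only the column names match the dataset):

```python
import pandas as pd

# Hypothetical 0/1 availability flags, shaped like the platform columns in `ott`
toy = pd.DataFrame({
    "Netflix":     [1, 0, 0, 1, 0],
    "Hulu":        [0, 0, 1, 0, 0],
    "Prime Video": [0, 1, 1, 1, 1],
    "Disney+":     [0, 0, 0, 0, 0],
})
platforms = ["Netflix", "Hulu", "Prime Video", "Disney+"]

# Summing a 0/1 column counts the titles available on that platform
counts = toy[platforms].sum()

# Share of all platform listings, in percent (this is what the pie shows)
share = (counts / counts.sum() * 100).round(1)

print(pd.DataFrame({"MovieCount": counts, "SharePct": share}))
print("rows:", len(toy), "listings:", counts.sum())
```

Here 5 titles produce 7 listings, so the shares describe listings, not unique titles.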

# Which movie had the longest Runtime ?

In [None]:
movies_with_longer_runtime = ott.sort_values('Runtime',ascending = False).head(15)
fig = px.bar(movies_with_longer_runtime, x='Title', y='Runtime', color='Runtime', height=700)
fig.show()

**It can be seen that the movie 'Colorado' has the highest Runtime followed by 'Law of the Lawless'**

In [None]:
Image("../input/colorado-th/colorado.JPG")

# How many movies were released each year from 1913 onwards?

In [None]:
year_wise_movie_release= ott.groupby('Year')['Title'].count().reset_index().rename(columns = {'Title':'Number_of_Movies'})
fig = px.bar(year_wise_movie_release, x='Year', y='Number_of_Movies', color='Number_of_Movies', height=500)
fig.show()

1. **Surprisingly, the year with the maximum number of movies in this dataset is 2017 (around 1401)**
2. **This number fell to just 689 two years later (less than 50% of the 2017 figure)**
3. **Only 2 movies in the dataset date from the year 1913**
4. **For this year, i.e. 2020, the dataset has only 147 movies in the kitty, which suggests movie production has taken a massive hit due to the pandemic**
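
The "less than 50%" comparison is simple arithmetic on the counts quoted above. A small sketch, using only those quoted figures (the 2018 count is not quoted, so it is left out):

```python
# Yearly title counts as quoted in the text above
movies_per_year = {2017: 1401, 2019: 689, 2020: 147}

peak = movies_per_year[2017]

# Each year's count as a percentage of the 2017 peak
ratio_to_peak = {
    year: round(count / peak * 100, 1)
    for year, count in movies_per_year.items()
}
print(ratio_to_peak)
```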

# Just for our knowledge, 1913 is the year in which '*Raja Harishchandra*', widely regarded as the first full-length Indian feature film, was directed and produced by *'Dadasaheb Phalke'* in India

In [None]:
Image("../input/raja-harishch/rjha.JPG")

# Which are the 15 most favourite genres amongst OTT viewers ?

In [None]:
favourite_genres = ott.groupby('Genres')['Title'].count().reset_index().rename(columns = {'Title':'Number_of_Movies'}).sort_values('Number_of_Movies',ascending = False).head(15)
fig = px.bar(favourite_genres, x='Genres', y='Number_of_Movies', color='Number_of_Movies', height=700)
fig.show()

1. **It is no surprise that 'Drama' is the most favourite genre across the globe. We all love watching family-centric movies loaded with emotions and happy fairy-tale endings.**
2. **Documentaries have evolved with time, and the spectrum of topics is now quite wide. From veganism to capitalism, there are some very interesting documentaries, especially on Netflix.**
3. **Surprisingly, the Action genre is 15th in the queue, which might be because of its age restrictions.**
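
One caveat on the ranking above: grouping on `Genres` treats every combined label as its own genre. Assuming the column holds comma-separated labels with no spaces (an assumption about the file, illustrated here with a made-up `toy` frame), individual genres could be counted like this:

```python
import pandas as pd

# Hypothetical rows; the real 'Genres' column is assumed to be comma-separated
toy = pd.DataFrame({
    "Title": ["A", "B", "C", "D"],
    "Genres": ["Drama", "Action,Drama", "Documentary", "Action,Comedy,Drama"],
})

genre_counts = (
    toy.dropna(subset=["Genres"])                      # skip titles with no genre
       .assign(Genre=lambda d: d["Genres"].str.split(","))
       .explode("Genre")                               # one row per (title, genre)
       .groupby("Genre")["Title"].count()
       .sort_values(ascending=False)
)
print(genre_counts)
```

Counted this way, a title tagged 'Action,Drama' adds one to both Action and Drama, so the ranking may differ from the combined-label chart.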

# Which are the top 15 directors who have maximum movies/shows published on the OTT platforms ?

In [None]:
top_15_directors = ott.groupby('Directors')['Title'].count().reset_index().rename(columns = {'Title':'Number_of_Movies'}).sort_values('Number_of_Movies',ascending = False).head(15)
fig = px.bar(top_15_directors, x='Directors', y='Number_of_Movies', color='Number_of_Movies', height=650)
fig.show()

1. **Jay Chapman, who is a renowned showrunner in Hollywood, tops the list.**
2. **Joseph Kane, who is 2nd on the list, has some world-class movies to his name, 'Flame of Barbary Coast' (1945) being one of them.**

# If you are interested in watching some of Jay Chapman's work over a weekend, let me list that down for you.

In [None]:
Image("../input/jaychapman/jaychapman.JPG")

# Which countries hosted maximum movies on OTT ?

In [None]:
top_15_countries = ott.groupby('Country')['Title'].count().reset_index().rename(columns = {'Title':'Number_of_Movies'}).sort_values('Number_of_Movies',ascending = False).head(15)
fig = px.bar(top_15_countries, x='Country', y='Number_of_Movies', color='Number_of_Movies', height=650)
fig.show()

1. **There is no doubt that Hollywood leads: the United States has the maximum number of movies on the OTT platforms, with over 8776.**
2. **Indian cinema is at the core of every Indian and an inherent part of life. Movies, cricket and politics run in our blood, as they say. Many regional cinemas are now entering the mainstream and being hosted on OTT platforms, and they are among the major contributors. India comes right after the USA with around 1064 movies on OTT.**
3. **What surprised me, though, is how few titles come from France and Canada, even though both have some of the world's most prestigious film festivals to their name. TIFF (Toronto International Film Festival) is a badge of honour for many filmmakers across the globe.**



# Which 15 shows/movies hosted on any OTT platform have the highest IMDb ratings?

In [None]:
best_15_works = ott.sort_values('IMDb',ascending = False).head(15)
fig = px.bar(best_15_works, x='Title', y='IMDb', color='IMDb', height=600)
fig.show()

1. **David Letterman's 'My Next Guest' tops the list. I remember watching the episode where he interviewed SRK at his house (Mannat, Bandra, Mumbai) and then in the studio. It is one of the best interviews I have ever seen. Both of them are equally smart and humorous. It is streaming on Netflix, and I urge you to watch it if you haven't.**
2. **Second on the list is the sports movie "Down, But Not Out!", which captures all the action as four amateur women boxers step into the ring for the first time. Each fighter (Anna, Daria, Aga and Alicja) goes all out to win as she faces an unknown opponent at a boxing competition organized by the amateur boxing association within their league.**
3. **Surprisingly, 'Natsamrat', a well-renowned Marathi drama written by V. V. Shirvadkar and now made into a movie, stands in the top 10 with an IMDb rating of 9.1. Nana Patekar starred in the movie, which was directed by Mahesh Manjrekar.**

# Let me show you a snippet from the interview :)

In [None]:
Image("../input/srkdavid/srkdavid.JPG")

In [None]:
Image("../input/natsamrat/nstrm.JPG")

# Which genre has the highest runtime on OTT platforms ?

In [None]:
genre_runtime = ott.sort_values('Runtime',ascending = False).head(15)
fig = px.bar(genre_runtime, x='Genres', y='Runtime', color='IMDb', height=600)
fig.show()

1. **'Action,Adventure,Drama,Romance,Western' has the highest runtime among these 15 titles, at 1256 minutes.**
2. **Surprisingly, the lowest of these 15 is 'Drama,Mystery,Thriller', at 240 minutes. Bollywood lost its ability to create such movies long ago, and Hollywood seems to be rowing the same boat.**
3. **Though 'Documentary' is the 2nd most common genre on OTT, its runtime is quite low, at only 256 minutes.**
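
Note that the chart above is built from the 15 longest individual titles, so each bar reflects one or two movies rather than a whole genre. To compare genres as groups, one option is to aggregate the runtime per genre label. The sketch below uses a made-up `toy` frame and illustrative output column names (`titles`, `mean_runtime`, `total_runtime`), not the real data:

```python
import pandas as pd

# Hypothetical rows standing in for ott[['Genres', 'Runtime']]
toy = pd.DataFrame({
    "Genres": ["Drama", "Drama", "Documentary", "Western", "Western"],
    "Runtime": [120, 100, 60, 200, 90],
})

genre_runtime_stats = (
    toy.groupby("Genres")["Runtime"]
       .agg(titles="count", mean_runtime="mean", total_runtime="sum")
       .sort_values("mean_runtime", ascending=False)
)
print(genre_runtime_stats)
```

The mean is usually the fairer comparison, because the total grows with the number of titles in a genre.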

# Analysing 'Language-Runtime' relationship:

In [None]:
# Languages of the 15 longest titles
language_runtime = ott.sort_values('Runtime',ascending = False).head(15)
fig = px.bar(language_runtime, x='Language', y='Runtime', color='IMDb', height=600)
fig.show()

**Russian and Swedish movies are making a mark on OTT platforms, followed by Arabic, German, Spanish, French, etc.**

# Conclusion:

1. **Though 'Documentary' is the 2nd most favourite genre across all OTT platforms, its runtime is very low and it ranks 10th on the highest-runtime list. It seems the audience enjoys informative content, and its taste is not limited to Romance, Comedy, Action, etc.**
2. **France and Canada are known to host world-class film festivals but fail to host a good amount of content on the OTT platforms. Combined, they have hosted only around 317 movies. If they hosted more movies (even in other languages, with subtitles enabled), it would be a great contribution to the loyal world-cinema audience, who love watching meaningful content. Had Narcos stayed off Netflix because of its Spanish language, it would never have garnered such love and attention from all corners of the world. The same applies to 'Money Heist' as well.**
3. **Though Prime Video is not the major source of income for the global e-commerce giant Amazon, it is hosting 71.1% of the total OTT content, while Netflix, whose entire business depends on publishing video content, is hosting around 20%. This is something the OTT competitors have to look into, and it seems an interesting area of research.**
4. **Genre-specific OTT platforms can prove to be an interesting area to experiment in, for content creators and investors alike. More foreign content from across geographies has an excellent opportunity to break down the walls.**

In [None]:
Image("../input/thatsallfolks/thatsall.JPG")

# If you enjoyed this, please UPVOTE. Also, mention in the comments where I can improve. Thank you :)