# **Streaming Services Analysis**

## *Analysis Overview:*

This analysis will dive into data from four leading streaming platforms—Netflix, Hulu, Amazon Prime, and Disney+—to explore trends, viewership patterns, and key performance metrics. By examining each service's unique offerings and subscriber behaviors, we aim to uncover insights that highlight their market positions and user preferences in the ever-evolving digital entertainment landscape.

## *Content Volume and Trends*


### Distribution of Content by Platform

This histogram shows how each platform's titles are distributed by release year. Amazon Prime appears to have the highest count of releases, with Netflix close behind.

In [None]:
import pandas as pd
import plotly.express as px
import polars as pl

# Load each platform's title catalog (three via polars, Amazon Prime via pandas).
disney = pl.read_csv("/Users/scotttow123/Documents/Streaming_Services/Data/disney_plus_titles.csv")
hulu = pl.read_csv("/Users/scotttow123/Documents/Streaming_Services/Data/hulu_titles.csv")
netflix = pl.read_csv("/Users/scotttow123/Documents/Streaming_Services/Data/netflix_titles.csv")
prime = pd.read_csv("/Users/scotttow123/Documents/Streaming_Services/Data/amazon_prime_titles.csv")

# Tag every row with its platform so the catalogs stay distinguishable after merging.
disney = disney.with_columns(pl.lit("Disney+").alias("platform"))
hulu = hulu.with_columns(pl.lit("Hulu").alias("platform"))
netflix = netflix.with_columns(pl.lit("Netflix").alias("platform"))

# Convert the polars frames to pandas so all four can be concatenated together.
disney = disney.to_pandas()
hulu = hulu.to_pandas()
netflix = netflix.to_pandas()

prime["platform"] = "Amazon Prime"

# Stack the four catalogs into a single frame.
data = pd.concat([disney, hulu, netflix, prime], ignore_index=True)

# Histogram of release years colored by platform, with marginal box plots.
fig_hist = px.histogram(data, x="release_year", color="platform", marginal="box",
                        title="Distribution of Content Release Years by Platform",
                        opacity=0.7, nbins=30)
fig_hist.show()
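
To check which platform has the most titles without reading it off the histogram, one could count rows per platform directly. The sketch below is only illustrative: `toy` is a hypothetical stand-in for the combined `data` frame, and its rows are invented rather than taken from the datasets. With the real frame, the same `value_counts` calls would apply to `data["platform"]`.

```python
import pandas as pd

# Hypothetical stand-in for the combined `data` frame built above;
# these rows are invented for illustration only.
toy = pd.DataFrame({
    "platform": ["Amazon Prime", "Amazon Prime", "Amazon Prime",
                 "Netflix", "Netflix", "Hulu", "Disney+"],
    "release_year": [2019, 2020, 2021, 2020, 2021, 2018, 2019],
})

# Number of titles per platform, largest first.
platform_counts = toy["platform"].value_counts()

# Each platform's share of the combined catalog, as a fraction of all rows.
platform_share = toy["platform"].value_counts(normalize=True).round(3)

print(platform_counts)
print(platform_share)
```

Comparing the raw counts alongside the shares makes it easier to say how close "close behind" actually is.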

### Content Count Over Time 

This line graph tracks the number of titles per release year for each platform. Amazon Prime again has the highest release count across the years, while Disney+ appears to have remained relatively steady over time.

In [None]:
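# Count titles for every (release year, platform) pair.
content_over_time = data.groupby(['release_year', 'platform']).size().reset_index(name='count')

# One line per platform, showing its title count by release year.
fig_line = px.line(content_over_time, x='release_year', y='count', color='platform',
                   title="Content Count Over Time by Platform")
fig_line.show()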

### Movies and TV Shows Released Over Time

This visualization highlights the release patterns of movies and TV shows over time across all platforms. It shows how much more prominent TV shows have become over the years, with their count increasing more than that of movies.

In [None]:
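# Count titles per release year and type, with one column per type; missing pairs become 0.
content_stream = data.groupby(['release_year', 'type']).size().unstack().fillna(0)

# Stacked area chart with release year (the index) on the x-axis.
fig_stream = px.area(content_stream, title="Movies and TV Shows Released Over Time")
fig_stream.show()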
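
The growing prominence of TV shows can also be expressed as a number by computing the share of TV shows among each year's releases. This is a minimal sketch under assumptions: `toy` is a hypothetical frame with invented rows, and the `type` labels `"Movie"` and `"TV Show"` are assumed to match those in the real data.

```python
import pandas as pd

# Hypothetical stand-in for `data`; rows are invented for illustration only.
toy = pd.DataFrame({
    "release_year": [2018, 2018, 2018, 2019, 2019, 2020, 2020, 2020, 2020],
    "type": ["Movie", "Movie", "TV Show", "Movie", "TV Show",
             "Movie", "TV Show", "TV Show", "TV Show"],
})

# Titles per release year, one column per type; absent pairs count as 0.
counts = toy.groupby(["release_year", "type"]).size().unstack(fill_value=0)

# Fraction of each year's releases that are TV shows.
tv_share = counts["TV Show"] / counts.sum(axis=1)

print(tv_share)
```

A rising `tv_share` would support the reading of the area chart, whereas raw counts alone can rise simply because more titles of every type were released.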

## *Content Duration Analysis*

### Duration of Content by Platform

This plot compares the average content duration on each platform, shown separately for movies and TV shows. Amazon Prime appears to have the longest durations for both movies and TV shows, while Disney+ has the shortest.
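
Averaging durations requires numeric values, and movies and TV shows are typically measured in different units. The sketch below assumes a `duration` column holding strings such as `"110 min"` for movies and `"3 Seasons"` for TV shows. That format is an assumption about the data, and `toy` and `duration_value` are hypothetical names with invented rows. It extracts the leading number and averages it per platform and type, so that minutes and seasons are never mixed.

```python
import pandas as pd

# Hypothetical stand-in for `data`; rows and the duration format are assumptions.
toy = pd.DataFrame({
    "platform": ["Amazon Prime", "Amazon Prime", "Disney+", "Disney+", "Netflix"],
    "type": ["Movie", "TV Show", "Movie", "TV Show", "Movie"],
    "duration": ["110 min", "3 Seasons", "80 min", "1 Season", None],
})

# Pull the leading integer out of strings such as "110 min" or "3 Seasons";
# rows with a missing duration become NaN.
toy["duration_value"] = pd.to_numeric(
    toy["duration"].str.extract(r"(\d+)", expand=False)
)

# Mean duration per platform and type (minutes for movies, seasons for TV shows),
# skipping rows where no duration could be parsed.
avg_duration = (
    toy.dropna(subset=["duration_value"])
       .groupby(["platform", "type"])["duration_value"]
       .mean()
)

print(avg_duration)
```

Grouping by `type` as well as `platform` keeps the two units separate, which matters when comparing "longest" and "shortest" across platforms.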