
Forecasting the number of subscriptions Netflix will reach in a given period helps the company plan, strategize, and make data-driven decisions. It supports operational efficiency, financial planning, and content strategy in the highly competitive streaming industry. If you want to learn how to forecast the number of subscriptions for a streaming service like Netflix, this article is for you. In this article, I’ll take you through the task of Netflix Subscriptions Forecasting using Python.

# Netflix Subscriptions Forecasting: Process We Can Follow

Using techniques like time series forecasting, Netflix can estimate the expected number of new subscribers in
a given time period and better understand the growth potential of their business. Below is the process we can
follow to forecast subscription counts for Netflix:

- Gather historical Netflix subscriptions growth data
- Preprocess and clean the data
- Explore and analyze time series patterns
- Choose a time series forecasting model (e.g., ARIMA, LSTM)
- Train the model using the training data
- Forecast future Netflix subscription counts


So the process for forecasting Netflix subscriptions starts with collecting a dataset of the
historical growth in Netflix subscribers.

In the section below, I’ll take you through the task of Netflix Subscriptions Forecasting using Time Series Forecasting and the Python programming language.

# Netflix Subscriptions Forecasting using Python
Let’s start this task by importing the necessary Python libraries and the dataset:

In [1]:
# Importing necessary Python libraries
import sys

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import plotly.graph_objs as go
import plotly.express as px
import plotly.io as pio
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# Use a white background for all Plotly figures
pio.templates.default = "plotly_white"

# reading the data
#data = pd.read_csv('Netflix Subscriptions.csv')
#print(data.head())

# When running in Google Colab, mount Google Drive and work from it
if 'google.colab' in sys.modules:
    from google.colab import drive

    project_path = "/content/drive/My Drive/"
    drive.mount('/content/drive/', force_remount=True)
    sys.path.append(project_path)
    %cd $project_path

Mounted at /content/drive/
/content/drive/My Drive


In [2]:
df = pd.read_csv("data/Netflix-Subscriptions.csv")
df.head()

  Time Period  Subscribers
0  01/04/2013     34240000
1  01/07/2013     35640000
2  01/10/2013     38010000
3  01/01/2014     41430000
4  01/04/2014     46130000


The dataset contains subscription counts of Netflix at the start of each quarter from 2013 to 2023. Before moving forward, let’s convert the Time Period column into a datetime format:

In [3]:
df['Time Period'] = pd.to_datetime(df['Time Period'], format='%d/%m/%Y')
print(df.head())

  Time Period  Subscribers
0  2013-04-01     34240000
1  2013-07-01     35640000
2  2013-10-01     38010000
3  2014-01-01     41430000
4  2014-04-01     46130000
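
The explicit `format` argument matters here because the dates are written day first. As a small illustration using three of the date strings shown above, the same text parses to different quarters depending on which format is assumed:

```python
import pandas as pd

raw = pd.Series(["01/04/2013", "01/07/2013", "01/10/2013"])

# Day-first format, as used in this dataset: 01/04/2013 is 1 April 2013
day_first = pd.to_datetime(raw, format="%d/%m/%Y")

# Month-first reading of the same strings: 01/04/2013 becomes 4 January 2013
month_first = pd.to_datetime(raw, format="%m/%d/%Y")

print(day_first.dt.month.tolist())    # [4, 7, 10]
print(month_first.dt.month.tolist())  # [1, 1, 1]
```

With the month-first reading, every row would land in January and the quarterly spacing would be lost.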


Now let’s have a look at the quarterly subscription growth of Netflix:

In [4]:
fig = go.Figure()
fig.add_trace(go.Scatter(x=df['Time Period'],
                         y=df['Subscribers'],
                         mode='lines', name='Subscribers'))
fig.update_layout(title='Netflix Quarterly Subscriptions Growth',
                  xaxis_title='Date',
                  yaxis_title='Netflix Subscriptions')
fig.show()

In the above graph, the growth of Netflix subscribers shows no obvious seasonal pattern. So a non-seasonal forecasting technique like ARIMA is a reasonable choice for this dataset.
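
Reading seasonality off a line chart is a judgment call, so one rough cross-check is to look at the autocorrelation of quarter-over-quarter changes. For quarterly data, a seasonal series tends to show a spike at lag 4 compared with neighbouring lags. The helper name `change_autocorrelations` and the synthetic series below are assumptions made for illustration, not part of the original notebook:

```python
import pandas as pd

def change_autocorrelations(series, max_lag=6):
    """Autocorrelation of period-over-period changes for lags 1..max_lag."""
    changes = series.diff().dropna()
    return {lag: changes.autocorr(lag=lag) for lag in range(1, max_lag + 1)}

# Synthetic quarterly series with a built-in four-quarter pattern (not Netflix data)
pattern = [0, 5, 0, -5]
seasonal = pd.Series([t + pattern[t % 4] for t in range(40)], dtype=float)

acf = change_autocorrelations(seasonal)
print({lag: round(value, 2) for lag, value in acf.items()})
```

The same helper could be called as `change_autocorrelations(df['Subscribers'])`. With only about forty quarterly observations the estimates are noisy, so treat the result as a hint rather than a test. The `plot_acf` function imported earlier gives a visual version of the same check.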

Now let’s have a look at the quarterly growth rate of subscribers at Netflix:

In [5]:
# Calculate the quarterly growth rate
df['Quarterly Growth Rate'] = df['Subscribers'].pct_change() * 100

# Create a new column for bar color (green for positive growth, red for zero, negative, or missing growth)
df['Bar Color'] = df['Quarterly Growth Rate'].apply(lambda x: 'green' if x > 0 else 'red')

# Plot the quarterly growth rate using bar graphs
fig = go.Figure()
fig.add_trace(go.Bar(
    x=df['Time Period'],
    y=df['Quarterly Growth Rate'],
    marker_color=df['Bar Color'],
    name='Quarterly Growth Rate'
))
fig.update_layout(title='Netflix Quarterly Subscriptions Growth Rate',
                  xaxis_title='Time Period',
                  yaxis_title='Quarterly Growth Rate (%)')
fig.show()
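
To make the growth-rate calculation concrete, here is a small worked example using the first five subscriber counts shown earlier. `pct_change()` divides each quarter's change by the previous quarter's value, so the first row has no previous value and comes out as NaN:

```python
import pandas as pd

# The first five subscriber counts shown in the df.head() output
subscribers = pd.Series([34240000, 35640000, 38010000, 41430000, 46130000])

# Quarter-over-quarter growth in percent; the first entry is NaN
growth_pct = subscribers.pct_change() * 100

# The same calculation written out by hand for the second row
manual = (35640000 - 34240000) / 34240000 * 100

print(growth_pct.round(2).tolist())
print(round(manual, 2))  # about 4.09
```

Because NaN fails the `x > 0` test, the first bar is assigned the color red even though it has no height.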

Now let’s have a look at the yearly growth rate:

In [7]:
# Calculate the yearly growth rate from the last reported subscriber count in each year
# (years not fully covered by the data, such as 2013, which starts in April, are only partial)
df['Year'] = df['Time Period'].dt.year
yearly_subscribers = df.groupby('Year')['Subscribers'].last()
yearly_growth = yearly_subscribers.pct_change().fillna(0) * 100

# Pick a bar color per year (green for positive growth, red otherwise)
bar_colors = yearly_growth.apply(lambda x: 'green' if x > 0 else 'red')

# Plot the yearly subscriber growth rate using bar graphs
fig = go.Figure()
fig.add_trace(go.Bar(
    x=yearly_growth.index,
    y=yearly_growth,
    marker_color=bar_colors,
    name='Yearly Growth Rate'
))
fig.update_layout(title='Netflix Yearly Subscriber Growth Rate',
                  xaxis_title='Year',
                  yaxis_title='Yearly Growth Rate (%)')
fig.show()
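
Another way to look at annual growth, sketched here as an alternative rather than as the notebook's method, is to compare each quarter with the same quarter one year earlier. With quarterly rows this is `pct_change(periods=4)`. The example uses the five rows shown in the earlier output:

```python
import pandas as pd

dates = pd.to_datetime(
    ["01/04/2013", "01/07/2013", "01/10/2013", "01/01/2014", "01/04/2014"],
    format="%d/%m/%Y",
)
subscribers = pd.Series(
    [34240000, 35640000, 38010000, 41430000, 46130000], index=dates
)

# With quarterly rows, a shift of 4 periods compares against the same quarter last year
yoy_growth_pct = subscribers.pct_change(periods=4) * 100

# The first four quarters have no year-earlier value and are dropped here
print(yoy_growth_pct.dropna().round(2))
```

This produces one value per quarter instead of one per calendar year, which avoids the problem of partially covered first and last years.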