
# Instagram Reach Forecasting

Instagram reach forecasting predicts how many people a post, story, or other piece of content will reach, based on historical data and other factors. For professionals who use Instagram, such a forecast can inform content strategy: knowing how past content performed helps them decide when to publish, what type of content to post, and how to engage their audience.

In this notebook, we forecast Instagram reach from a dataset of historical reach figures. We visualize and analyze the data, then predict future reach with a time series model (SARIMA).



## Data Loading and Initial Exploration

We'll begin by importing the necessary libraries and loading our dataset.


In [None]:

import pandas as pd
import plotly.graph_objs as go
import plotly.io as pio

# Use a light theme for every Plotly figure in the notebook.
pio.templates.default = "plotly_white"

# The CSV is read with latin-1 encoding, as in the original notebook.
data = pd.read_csv("/mnt/data/Instagram-Reach.csv", encoding='latin-1')
data.head()



## Data Preprocessing

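So that pandas treats the 'Date' column as dates rather than plain strings, we convert it to a datetime type. This enables date-based operations later on, such as extracting the day of the week.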


In [None]:

data['Date'] = pd.to_datetime(data['Date'])
data.head()
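
As a quick sanity check on this kind of conversion, the sketch below parses a small made-up frame and counts rows that could not be parsed. The values are invented and only the column names mirror the dataset; `errors="coerce"` is an optional choice here, not something the notebook itself uses.

```python
import pandas as pd

# Toy frame with invented values; only the column names mirror the dataset.
sample = pd.DataFrame({
    "Date": ["2022-04-01", "2022-04-02", "not a date"],
    "Instagram reach": [7620, 12859, 16008],
})

# errors="coerce" turns unparseable strings into NaT instead of raising.
sample["Date"] = pd.to_datetime(sample["Date"], errors="coerce")

print(sample.dtypes)
print("Unparseable rows:", sample["Date"].isna().sum())
```

A non-zero count of unparseable rows would be a signal to inspect the raw file before going further.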



## Time Series Forecasting

To forecast the reach, we will use Time Series Forecasting methods. We'll start by analyzing the trends and seasonal patterns in the Instagram reach data.



### Trends and Seasonal Patterns

We decompose the series into trend, seasonal, and residual components, using a multiplicative model with a period of 100 observations.


In [None]:

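import matplotlib.pyplot as plt
from plotly.tools import mpl_to_plotly
from statsmodels.tsa.seasonal import seasonal_decompose

# Multiplicative decomposition; period is the cycle length in observations.
result = seasonal_decompose(data['Instagram reach'], model='multiplicative', period=100)

# result.plot() creates and returns its own Matplotlib figure,
# so no separate plt.figure() call is needed.
fig = result.plot()

# Convert to Plotly for an interactive view. If the conversion fails with
# your Matplotlib version, call plt.show() instead.
fig = mpl_to_plotly(fig)
fig.show()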
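
To make the three components concrete, here is a minimal hand-rolled sketch of a multiplicative decomposition on invented data. It follows the classical recipe (moving-average trend, per-position seasonal factors, ratio residual) and is not claimed to match `seasonal_decompose` number for number; the series, the weekly period, and all variable names are assumptions for illustration.

```python
import numpy as np
import pandas as pd

period = 7  # assumed weekly cycle for this toy series
n = 70
t = np.arange(n)

# Invented data: a linear trend multiplied by a repeating weekly pattern.
pattern = np.array([0.8, 0.9, 1.0, 1.1, 1.3, 1.0, 0.9])
observed = pd.Series((100 + 2 * t) * pattern[t % period])

# Trend: centered moving average over one full period.
trend = observed.rolling(window=period, center=True).mean()

# Seasonal: mean detrended value per position in the cycle, rescaled to mean 1.
detrended = observed / trend
seasonal_idx = detrended.groupby(t % period).mean()
seasonal_idx = seasonal_idx / seasonal_idx.mean()
seasonal = pd.Series(seasonal_idx.to_numpy()[t % period], index=observed.index)

# Residual: the part that trend and seasonal factors do not explain.
resid = observed / (trend * seasonal)

print(seasonal_idx.round(3))
```

In a multiplicative model the components multiply back to the observed series, so residuals close to 1 indicate that trend and seasonality explain most of the variation.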



### Autocorrelation and Partial Autocorrelation

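To choose the orders for our SARIMA model, we plot the autocorrelation (ACF) and partial autocorrelation (PACF) of the series. In the usual Box-Jenkins reading, the PACF guides the autoregressive order p and the ACF guides the moving-average order q.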


In [None]:

pd.plotting.autocorrelation_plot(data["Instagram reach"])


In [None]:

from statsmodels.graphics.tsaplots import plot_pacf
plot_pacf(data["Instagram reach"], lags = 100)
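
The numbers behind an autocorrelation plot can also be inspected directly. The sketch below uses `Series.autocorr` on an invented series with a 7-step cycle; the data and the lag range are assumptions, and the plotting helpers above may use a slightly different estimator, so treat the values as indicative only.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
t = np.arange(140)

# Invented series: a 7-step sine cycle plus a little noise.
series = pd.Series(10 * np.sin(2 * np.pi * t / 7) + rng.normal(0, 1, size=t.size))

# Pearson correlation between the series and a copy shifted by `lag` steps.
acf = {lag: series.autocorr(lag=lag) for lag in range(1, 15)}
for lag, value in acf.items():
    print(f"lag {lag:2d}: {value:+.2f}")
```

Peaks at multiples of the cycle length are the kind of pattern that suggests a seasonal term in the model.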



### SARIMA Model Training

We fit a SARIMA model using the orders chosen from the autocorrelation and partial autocorrelation plots: (p, d, q) = (8, 1, 2), with the same values reused for the seasonal part at a period of 12.


In [None]:

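import statsmodels.api as sm

# Non-seasonal orders: AR terms, differencing, MA terms.
p, d, q = 8, 1, 2

# The same orders are reused for the seasonal part, with a period of 12.
model = sm.tsa.statespace.SARIMAX(data['Instagram reach'],
                                  order=(p, d, q),
                                  seasonal_order=(p, d, q, 12))
# fit() returns a results object; it replaces the unfitted model here.
model = model.fit()
model.summary()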
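
The differencing orders in the model can be illustrated on their own. With d = 1 and a seasonal difference of 1 at period 12, the series is differenced once at lag 1 and once at lag 12 before the ARMA terms are fitted. The sketch below applies the same two operations with pandas to an invented series; it shows the transformation only, not the model fit.

```python
import numpy as np
import pandas as pd

t = np.arange(48)

# Invented series: linear trend plus a pattern that repeats every 12 steps.
series = pd.Series(5.0 * t + 20.0 * np.sin(2 * np.pi * t / 12))

regular_diff = series.diff(1)          # d = 1: removes the linear trend
seasonal_diff = regular_diff.diff(12)  # D = 1, s = 12: removes the 12-step pattern

# What remains after both differences is numerically zero for this toy series.
print(seasonal_diff.dropna().abs().max())
```

On real data the differenced series will not vanish, but it should look closer to stationary than the original.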



### Making Predictions

Now that our model is trained, we'll use it to make forecasts for the future reach of the Instagram account.


In [None]:

# `end` is inclusive, so this returns 101 out-of-sample steps.
predictions = model.predict(len(data), len(data)+100)

# The model was fitted on the default integer index, so both traces are
# plotted against row position rather than calendar date.
trace_train = go.Scatter(x=data.index,
                         y=data["Instagram reach"],
                         mode="lines",
                         name="Training Data")
trace_pred = go.Scatter(x=predictions.index,
                        y=predictions,
                        mode="lines",
                        name="Predictions")

layout = go.Layout(title="Instagram Reach Time Series and Predictions",
                   xaxis_title="Observation index",
                   yaxis_title="Instagram Reach")

fig = go.Figure(data=[trace_train, trace_pred], layout=layout)
fig.show()
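
The forecast above is indexed by row position rather than by calendar date. If the data holds exactly one observation per day with no gaps (an assumption, not something verified here), the forecast steps can be mapped onto dates as sketched below. The dates and forecast values are invented stand-ins for `data['Date']` and `predictions`.

```python
import pandas as pd

# Invented stand-ins for data['Date'] and the forecast values.
dates = pd.Series(pd.date_range("2022-04-01", periods=5, freq="D"))
forecast_values = [100.0, 110.0, 120.0]

# Assumes one observation per calendar day with no gaps.
future_dates = pd.date_range(start=dates.iloc[-1] + pd.Timedelta(days=1),
                             periods=len(forecast_values), freq="D")
forecast = pd.Series(forecast_values, index=future_dates)
print(forecast)
```

Plotting `data['Date']` against the reach and `forecast.index` against the forecast would then put both traces on a shared date axis.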



## Analyzing Reach

We now visualize reach over time as a line chart and a daily bar chart, look at its distribution with a box plot, and then compare reach across days of the week.


In [None]:

fig = go.Figure()
fig.add_trace(go.Scatter(x=data['Date'], 
                         y=data['Instagram reach'], 
                         mode='lines', name='Instagram reach'))
fig.update_layout(title='Instagram Reach Trend', xaxis_title='Date', 
                  yaxis_title='Instagram Reach')
fig.show()


In [None]:

fig = go.Figure()
fig.add_trace(go.Bar(x=data['Date'], 
                     y=data['Instagram reach'], 
                     name='Instagram reach'))
fig.update_layout(title='Instagram Reach by Day', 
                  xaxis_title='Date', 
                  yaxis_title='Instagram Reach')
fig.show()


In [None]:

fig = go.Figure()
fig.add_trace(go.Box(y=data['Instagram reach'], 
                     name='Instagram reach'))
fig.update_layout(title='Instagram Reach Box Plot', 
                  yaxis_title='Instagram Reach')
fig.show()


In [None]:

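# Add the weekday name (e.g. 'Monday') for each date.
data['Day'] = data['Date'].dt.day_name()
data.head()


In [None]:

# Mean, median, and standard deviation of reach for each weekday.
day_stats = data.groupby('Day')['Instagram reach'].agg(['mean', 'median', 'std']).reset_index()
day_stats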


In [None]:

fig = go.Figure()
fig.add_trace(go.Bar(x=day_stats['Day'], 
                     y=day_stats['mean'], 
                     name='Mean'))
fig.add_trace(go.Bar(x=day_stats['Day'], 
                     y=day_stats['median'], 
                     name='Median'))
fig.add_trace(go.Bar(x=day_stats['Day'], 
                     y=day_stats['std'], 
                     name='Standard Deviation'))
fig.update_layout(title='Instagram Reach by Day of the Week', 
                  xaxis_title='Day', 
                  yaxis_title='Instagram Reach')
fig.show()
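
Grouping by day name sorts the days alphabetically, so the bars do not appear in calendar order. One possible way to order them, sketched on an invented stats table (the `mean` values are made up), is an ordered categorical:

```python
import pandas as pd

# Invented stats table; the real one comes from the groupby above.
day_stats = pd.DataFrame({
    "Day": ["Friday", "Monday", "Saturday", "Sunday",
            "Thursday", "Tuesday", "Wednesday"],
    "mean": [46.0, 52.0, 47.0, 53.0, 48.0, 54.0, 51.0],
})

weekday_order = ["Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday"]

# An ordered categorical makes sort_values follow calendar order.
day_stats["Day"] = pd.Categorical(day_stats["Day"],
                                  categories=weekday_order, ordered=True)
day_stats = day_stats.sort_values("Day").reset_index(drop=True)
print(day_stats)
```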



