<a href="https://colab.research.google.com/github/s-vgustavo/analytics-studies/blob/main/Instagram_Analytics.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Instagram analytics using Instagram Graph API

This script collects data from Instagram for marketing analytics.

Current status: the code collects data through the Instagram Graph API using the requests library and separates the results into feed and reels metrics.

Metrics obtained: number of feed posts, likes on feed posts, comments on feed posts, number of reels posted, likes and comments on reels, profile impressions and profile reach.

Future improvements: an automation that stores the data and continually builds a timeline of the metrics.

## 1. Defining endpoint parameters and authentication tokens
The Instagram Graph API, like the Facebook Graph API, uses access tokens, app secrets, page IDs and account IDs, all of which you must have at hand before running this code.
You can find more information at these links: https://towardsdatascience.com/discover-insights-from-your-instagram-business-account-with-facebook-graph-api-and-python-81d20ee2e751, https://developers.facebook.com/docs/graph-api/get-started/

In [None]:
# You can fill in the initial information here.
# These values are later concatenated into URLs, so keep them as strings.
#app_id = str
#app_secret = str
#page_id = str
#instagram_account_id = str

We start by importing the libraries. For this code, we use requests, json, datetime and pandas.

We also need to define a parameters dictionary. It contains the access token, client ID, client secret, graph domain, graph version, endpoint base, page ID, Instagram account ID and Instagram username.

In [None]:
# Importing libraries
import requests
import json
import datetime
import pandas as pd

# Defining the parameters dictionary
params = dict()
params['access_token'] = 'string' 
params['client_id'] = 'string'     
params['client_secret'] = 'string'     
params['graph_domain'] = 'https://graph.facebook.com'
params['graph_version'] = 'v15.0'
params['endpoint_base'] = params['graph_domain'] + '/' + params['graph_version'] + '/'
params['page_id'] = 'string'                 
params['instagram_account_id'] = 'string'
params['ig_username'] = 'string'
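
The endpoint_base entry is reused for every request further down. As a minimal sketch of how the pieces of the dictionary combine into a full endpoint URL, the helper below joins the base with path segments. The name build_url is hypothetical (it is not part of the Graph API or of this notebook), and the account ID is a made-up placeholder.

```python
def build_url(params, *path_parts):
    """Join the endpoint base with one or more path segments."""
    base = params['endpoint_base'].rstrip('/')
    # Strip stray slashes so segments are always separated by exactly one '/'.
    return '/'.join([base, *(str(part).strip('/') for part in path_parts)])


# Placeholder values for illustration only.
example_params = {
    'graph_domain': 'https://graph.facebook.com',
    'graph_version': 'v15.0',
    'instagram_account_id': '1234567890',
}
example_params['endpoint_base'] = (
    example_params['graph_domain'] + '/' + example_params['graph_version'] + '/'
)

media_url = build_url(example_params, example_params['instagram_account_id'], 'media')
print(media_url)
```

Keeping the IDs as strings matters here: the notebook concatenates them directly into URLs, which would fail with integers.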

With the parameters dictionary in place, we can begin by requesting data about the access token. This matters for two reasons:

1) The first access token we get is the short-lived one, which is valid for only a couple of hours. If you, like me, plan to run this code regularly, you will need to obtain the long-lived one.

2) The access token data contains the expiration time, so we can check when the token will need to be renewed.

In [None]:
# Defining endpoint parameters to obtain the access token data
endpointParams = dict()
endpointParams['input_token'] = params['access_token']
endpointParams['access_token'] = params['access_token']

# Defining the URL
url = params['graph_domain'] + '/debug_token'

# Requesting access token data
data = requests.get(url, endpointParams)
access_token_data = json.loads(data.content)
access_token_data

Like the other data obtained from the Instagram Graph API, the access token data is returned as JSON, which json.loads converts into a Python dictionary.
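
To make the JSON-to-dictionary step concrete, here is a small sketch that decodes two sample bodies and checks for an error object before using the result. The payload values are made up, and parse_graph_response is a hypothetical helper name, not something provided by the API or by requests.

```python
import json

# Sample bodies shaped like Graph API responses (all values are made up).
ok_body = '{"data": {"is_valid": true, "expires_at": 1700000000}}'
error_body = (
    '{"error": {"message": "Invalid OAuth access token.", '
    '"type": "OAuthException", "code": 190}}'
)


def parse_graph_response(body):
    """Decode a JSON body and raise if it carries an 'error' object."""
    payload = json.loads(body)
    if 'error' in payload:
        err = payload['error']
        raise RuntimeError(f"Graph API error {err.get('code')}: {err.get('message')}")
    return payload


token_info = parse_graph_response(ok_body)
# The decoded payload is a plain dict, so nested keys are read with [].
print(type(token_info).__name__, token_info['data']['expires_at'])

try:
    parse_graph_response(error_body)
except RuntimeError as exc:
    print(exc)
```

Checking for the error key first avoids a confusing KeyError later, when the code tries to read data from a failed response.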

With the code below, we will print the expiration date of the access token.

In [None]:
print("Token Expires: ", datetime.datetime.fromtimestamp(access_token_data['data']['expires_at']))
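
Since the point of reading the expiration time is to know when the token must be renewed, it can help to turn the timestamp into a remaining lifetime. The sketch below is one way to do that; describe_expiry is a hypothetical helper, and treating an expires_at of 0 as "no expiration" is an assumption you should confirm against the token documentation.

```python
import datetime

UTC = datetime.timezone.utc


def describe_expiry(expires_at, now=None):
    """Return a short message describing a token's remaining lifetime."""
    if expires_at == 0:
        # Assumption: 0 means the token has no expiration time.
        return 'Token has no expiration time'
    now = now or datetime.datetime.now(UTC)
    remaining = datetime.datetime.fromtimestamp(expires_at, tz=UTC) - now
    if remaining.total_seconds() <= 0:
        return 'Token has expired'
    hours = remaining.seconds // 3600
    return f'Token expires in {remaining.days} days and {hours} hours'


# Fixed example values so the result does not depend on the current date.
example_now = datetime.datetime(2024, 1, 1, tzinfo=UTC)
example_expires_at = int(
    (example_now + datetime.timedelta(days=59, hours=3)).timestamp()
)
message = describe_expiry(example_expires_at, now=example_now)
print(message)
```

With the notebook's own data, the call would be describe_expiry(access_token_data['data']['expires_at']).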

The code in this notebook already uses the long-lived access token. If this is your first time running it, you will need to use the short-lived token to obtain the long-lived one, as described in the next section.

## 2. Obtaining the long-lived access token

In [None]:

# The following code is based on this Facebook API Endpoint:
# https://graph.facebook.com/{graph-api-version}/oauth/access_token?grant_type=fb_exchange_token&client_id={app-id}&client_secret={app-secret}&fb_exchange_token={your-access-token}
# Run this only while params['access_token'] still holds the short-lived token.
# It is commented out here because the access token is already the long-lived one.

# Defining the URL
##url = params['endpoint_base'] + 'oauth/access_token'

# Defining endpoint parameters
##endpointParams = dict() 
##endpointParams['grant_type'] = 'fb_exchange_token'
##endpointParams['client_id'] = params['client_id']
##endpointParams['client_secret'] = params['client_secret']
##endpointParams['fb_exchange_token'] = params['access_token']

# Requesting data
##data = requests.get(url, endpointParams )
##long_lived_token = json.loads(data.content)

# Changing access token so it is the long-lived.
##params['access_token'] = long_lived_token['access_token']
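
The commented steps above can also be wrapped in functions, so the exchange runs only when you call it explicitly. This is a sketch under the same assumptions as the commented code; the function names and the timeout value are hypothetical choices, and the credentials are placeholders.

```python
def build_exchange_request(params):
    """Return the URL and query parameters for the token exchange call."""
    url = params['endpoint_base'] + 'oauth/access_token'
    query = {
        'grant_type': 'fb_exchange_token',
        'client_id': params['client_id'],
        'client_secret': params['client_secret'],
        'fb_exchange_token': params['access_token'],
    }
    return url, query


def exchange_for_long_lived_token(params, timeout=30):
    """Perform the exchange and return the long-lived token string."""
    # Imported here so the request builder above works without the dependency.
    import requests

    url, query = build_exchange_request(params)
    response = requests.get(url, params=query, timeout=timeout)
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.json()['access_token']


# Placeholder credentials for illustration only; no request is sent here.
example_params = {
    'endpoint_base': 'https://graph.facebook.com/v15.0/',
    'client_id': 'app-id-placeholder',
    'client_secret': 'app-secret-placeholder',
    'access_token': 'short-lived-token-placeholder',
}
exchange_url, exchange_query = build_exchange_request(example_params)
print(exchange_url)
print(sorted(exchange_query))
```

With real credentials, params['access_token'] = exchange_for_long_lived_token(params) would replace the short-lived token in place.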



## 3. Obtaining media insights

We will start by resetting the endpoint parameters and requesting the basic fields for media content: id, caption, media product type, media type, media url, permalink, thumbnail url, timestamp, username, like count and comments count.

Then, we will do the same for profile insights: impressions and reach.

Both requests return JSON, which we will use to print the results later.

In [None]:
# Basic Media Insights
# API Endpoint: https://graph.facebook.com/{graph-api-version}/{ig-user-id}/media?fields={fields}

# Defining URL
url = params['endpoint_base'] + params['instagram_account_id'] + '/media'

# Defining endpoint parameters
endpointParams = dict()
endpointParams['fields'] = 'id,caption,media_product_type,media_type,media_url,permalink,thumbnail_url,timestamp,username,like_count,comments_count'
endpointParams['access_token'] = params['access_token']

# Requesting data
data = requests.get(url, endpointParams )
basic_insight = json.loads(data.content)

# Basic profile insights
# Defining URL
url = params['endpoint_base'] + params['instagram_account_id'] + '/insights'

# Defining endpoint parameters
endpointParams = dict() 
endpointParams['metric'] = 'impressions,reach'
endpointParams['period'] = 'week'
endpointParams['access_token'] = params['access_token'] 

# Requesting data
profile_data = requests.get(url, endpointParams)
profile_data_json = json.loads(profile_data.content)
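
One thing to keep in mind with the media request is that the Graph API returns long lists in pages: when more items are available, the response carries a paging object with a next URL. The sketch below shows one way to follow those links. collect_all_pages and fetch_page are hypothetical names, and the pages used here are stand-ins so the example runs without network access.

```python
def collect_all_pages(first_page, fetch_page):
    """Gather the 'data' items from a first page and every following page.

    fetch_page is any callable that takes a URL and returns the decoded
    JSON dictionary for that page.
    """
    items = []
    page = first_page
    while True:
        items.extend(page.get('data', []))
        next_url = page.get('paging', {}).get('next')
        if not next_url:
            break  # no further pages
        page = fetch_page(next_url)
    return items


# Stand-in pages (made-up URLs and ids) in place of real API responses.
fake_pages = {
    'https://example.invalid/page2': {'data': [{'id': '3'}]},
}
first = {
    'data': [{'id': '1'}, {'id': '2'}],
    'paging': {'next': 'https://example.invalid/page2'},
}
all_items = collect_all_pages(first, fake_pages.__getitem__)
print([item['id'] for item in all_items])
```

In the notebook, a call along the lines of collect_all_pages(basic_insight, lambda u: requests.get(u).json()) would gather every page before building the DataFrame. Without it, the counts below cover only the first page of media.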


To inspect the data we obtained, we will create a DataFrame so we can see the information in tabular form.

In [None]:
# Creates a temporary dataframe for understanding and collecting data
df_temp = pd.DataFrame(basic_insight['data'])
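
Each row of this DataFrame is one media item and each requested field becomes a column. As a rough illustration of what that structure allows, the sketch below builds a small frame from made-up records and summarizes it by media_product_type with a groupby. The sample values and the column names posts, likes and comments are assumptions for the example only.

```python
import pandas as pd

# Made-up records shaped like the items in basic_insight['data'].
sample_media = [
    {'id': '1', 'media_product_type': 'FEED', 'like_count': 10, 'comments_count': 2},
    {'id': '2', 'media_product_type': 'REELS', 'like_count': 30, 'comments_count': 5},
    {'id': '3', 'media_product_type': 'FEED', 'like_count': 7, 'comments_count': 1},
]
df_sample = pd.DataFrame(sample_media)

# One row per product type: number of posts, total likes, total comments.
summary = df_sample.groupby('media_product_type').agg(
    posts=('id', 'count'),
    likes=('like_count', 'sum'),
    comments=('comments_count', 'sum'),
)
print(summary)
```

The same groupby applied to df_temp would give the feed and reels totals in a single table.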

Finally, we print the results.

In [None]:
# Obtaining metrics
timestamp = datetime.date.today()
print('Date: ', timestamp)

# Feed metrics: number of posts, total likes and total comments
post_feed_instagram = len(df_temp[df_temp['media_product_type'] == 'FEED'])
likes_feed = sum(df_temp[df_temp['media_product_type'] == 'FEED']['like_count'])
comentarios_feed = sum(df_temp[df_temp['media_product_type'] == 'FEED']['comments_count'])
print('\n\nFeed posts: ', post_feed_instagram, '\nFeed likes: ', likes_feed, '\nFeed comments: ', comentarios_feed)

# Reels metrics: number of reels, total likes and total comments
reels_instagram = len(df_temp[df_temp['media_product_type'] == 'REELS'])
likes_reels = sum(df_temp[df_temp['media_product_type'] == 'REELS']['like_count'])
comentarios_reels = sum(df_temp[df_temp['media_product_type'] == 'REELS']['comments_count'])
print('\n\nReels posted: ', reels_instagram, '\nReels likes: ', likes_reels, '\nReels comments: ', comentarios_reels)

# Profile metrics: first reported value of each requested metric
profile_impressions = profile_data_json['data'][0]['values'][0]['value']
profile_reach = profile_data_json['data'][1]['values'][0]['value']
print('\n\nProfile impressions: ', profile_impressions, '\nProfile reach: ', profile_reach)
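
The profile values above are read by position, which assumes the metrics come back in the order they were requested. A slightly more defensive sketch looks each metric up by its name field instead. The response below is made up to mirror the shape of profile_data_json, and metrics_by_name is a hypothetical helper.

```python
# Made-up response shaped like profile_data_json (note the swapped order).
sample_profile = {
    'data': [
        {'name': 'reach', 'period': 'week',
         'values': [{'value': 950}, {'value': 1020}]},
        {'name': 'impressions', 'period': 'week',
         'values': [{'value': 2400}, {'value': 2610}]},
    ]
}


def metrics_by_name(insights, value_index=0):
    """Map each metric name to one of its reported values."""
    return {
        entry['name']: entry['values'][value_index]['value']
        for entry in insights.get('data', [])
    }


profile_metrics = metrics_by_name(sample_profile)
print(profile_metrics['impressions'], profile_metrics['reach'])
```

Passing value_index=-1 picks the last reported value rather than the first, which may be the more recent period depending on how the API orders them.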