# Effectiveness of Social Media Influencer Marketing

## By: Quinn O'Rourke

### Imports

In [1]:
from aif360.datasets import StandardDataset
from aif360.metrics import  BinaryLabelDatasetMetric
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
#import seaborn as sns
from sklearn.metrics import confusion_matrix
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_curve, auc



### Introduction

For this assignment, I'm going to break down the impact of influencer social media marketing on brands and brand reception. To do this, I'm going to combine several datasets, mostly from Kaggle and cited as needed below, and compare brand metrics before and after sponsored campaigns by some of the largest influencers, on Instagram in particular. I'll also be looking at which metrics are most important in boosting a brand's reputation and/or brand awareness, whether it's average likes, total followers, et cetera.

My research question is as follows: How effective are macro-influencer social media marketing campaigns, especially on Instagram, in boosting a specific company's awareness and/or brand reputation? To judge how effective sponsored posts from the top Instagram influencers are, I will manually collect each company's metrics (Instagram followers/likes, Google search metrics, etc.) and compare them against an existing dataset of top Instagram influencers across a variety of metrics. I will also use existing survey data on how certain segments interpret influencer-sponsored posts to guide my research and explain any results I find.

I will be doing my research on the top influencers by different metrics in 3 different industries and seeing how their sponsored posts impacted the brands they marketed for. The industries, which I selected for their variety and the different insights they might offer, are the following: sports, music, and fashion/modeling. The analysis below might produce different insights if I had selected different industries or analyzed micro-influencers instead, but this should give us a solid base for finding what type of influencer campaigns, if any, are effective in increasing a brand's reputation/awareness.

### Analysis:

In [28]:
#Both datasets are from Kaggle, links are here: 
#Online Influencer Customer Survey: https://www.kaggle.com/datasets/thedevastator/maximizing-revenues-through-online-influencer-ma
survey_df = pd.read_csv('Whatsgoodly - Thought Catalog Influencers.csv')
#Top Instagram Influencer Data: https://www.kaggle.com/datasets/surajjha101/top-instagram-influencers-data-cleaned
insta_df = pd.read_csv('top_insta_influencers_data.csv')

In [69]:
survey_df.head()

Unnamed: 0,index,Question,Segment Type,Segment Description,Answer,Count,Percentage
0,0,What do you think when an influencer is obviou...,Mobile,Global results,Is this product cool?,268,0.226
1,1,What do you think when an influencer is obviou...,Mobile,Global results,This is lame,532,0.449
2,2,What do you think when an influencer is obviou...,Mobile,Global results,Get that money!,293,0.247
3,3,What do you think when an influencer is obviou...,Mobile,Global results,Other (comment),91,0.077
4,4,What do you think when an influencer is obviou...,Web,Web,Is this product cool?,0,0.0


In [34]:
insta_df.head()

Unnamed: 0,rank,channel_info,influence_score,posts,followers,avg_likes,60_day_eng_rate,new_post_avg_like,total_likes,country
0,1,cristiano,92,3.3k,475.8m,8.7m,1.39%,6.5m,29.0b,Spain
1,2,kyliejenner,91,6.9k,366.2m,8.3m,1.62%,5.9m,57.4b,United States
2,3,leomessi,90,0.89k,357.3m,6.8m,1.24%,4.4m,6.0b,
3,4,selenagomez,93,1.8k,342.7m,6.2m,0.97%,3.3m,11.5b,United States
4,5,therock,91,6.8k,334.1m,1.9m,0.20%,665.3k,12.5b,United States


In [38]:
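# First, let's find the top macro-influencers on Instagram for each of the given variables, across the 3 selected industries (sports, music, fashion/modeling)
# sort the 'avg_likes' column in descending order
# note: the values are strings such as '8.7m' and '996.2k', so this sort (and the other sorts in this section) is alphabetical, not numeric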
insta_top_avglikes_df = insta_df.sort_values('avg_likes', ascending=False)
insta_top_avglikes_df.head()

Unnamed: 0,rank,channel_info,influence_score,posts,followers,avg_likes,60_day_eng_rate,new_post_avg_like,total_likes,country
28,29,realmadrid,90,6.9k,123.4m,996.2k,0.48%,588.3k,6.8b,Spain
39,40,Shakira,88,2.0k,76.1m,975.1k,0.41%,304.7k,1.9b,
135,136,brunamarquezine,84,3.0k,43.1m,969.7k,1.05%,444.2k,2.9b,United States
85,86,tatawerneck,86,5.6k,53.9m,959.8k,0.51%,266.5k,5.4b,Brazil
99,100,kritisanon,76,2.7k,50.2m,897.2k,1.21%,604.4k,2.4b,India
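
Columns such as `avg_likes`, `followers`, and `total_likes` are stored as strings with `k`/`m`/`b` suffixes, so `sort_values` orders them alphabetically, which is why `996.2k` ranks above `8.7m` in the output above. Below is a minimal sketch of converting them to numbers before sorting, assuming the suffixes mean thousand, million, and billion. `parse_count` is a hypothetical helper written for this illustration, not part of the dataset or pandas.

```python
import math

# Multipliers for the suffixes seen in the dataset (assumed meaning).
SUFFIXES = {"k": 1e3, "m": 1e6, "b": 1e9}

def parse_count(text):
    """Convert strings like '8.7m' or '996.2k' to a float; NaN if unparseable."""
    if not isinstance(text, str):
        return float(text) if text is not None else math.nan
    value = text.strip().lower()
    if not value:
        return math.nan
    multiplier = SUFFIXES.get(value[-1])
    if multiplier is not None:
        value = value[:-1]      # drop the suffix before converting
    else:
        multiplier = 1.0
    try:
        return float(value) * multiplier
    except ValueError:
        return math.nan

# avg_likes values copied from the tables above.
avg_likes = {
    "cristiano": "8.7m",
    "kyliejenner": "8.3m",
    "realmadrid": "996.2k",
    "Shakira": "975.1k",
}
ranked = sorted(avg_likes, key=lambda name: parse_count(avg_likes[name]), reverse=True)
print(ranked)
```

In the notebook this could be applied with something like `insta_df['avg_likes'].map(parse_count)` before calling `sort_values`, which would likely change which accounts appear at the top.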


In [55]:
# sort the '60_day_eng_rate' column in descending order
insta_60_eng_df = insta_df.sort_values('60_day_eng_rate', ascending=False)
insta_60_eng_df[0:29]

Unnamed: 0,rank,channel_info,influence_score,posts,followers,avg_likes,60_day_eng_rate,new_post_avg_like,total_likes,country
167,168,rkive,83,0.11k,37.0m,10.9m,NaN%,0,1.2b,
69,70,roses_are_rosie,82,0.82k,61.8m,4.6m,9.72%,6.0m,3.8b,
64,65,sooyaaa__,82,0.83k,62.9m,4.5m,9.43%,5.9m,3.8b,
38,39,lalalalisa_m,70,0.87k,80.9m,5.8m,9.00%,7.2m,5.1b,
118,119,zayn,82,0.16k,46.5m,4.7m,8.81%,4.0m,773.5m,United States
75,76,milliebobbybrown,80,0.28k,57.6m,4.0m,8.63%,5.0m,1.1b,United States
156,157,georginagio,74,0.73k,39.1m,2.2m,8.56%,3.3m,1.6b,
49,50,jennierubyjane,76,0.86k,68.9m,5.1m,8.36%,5.7m,4.4b,
83,84,zacefron,86,0.66k,54.5m,2.3m,8.18%,4.4m,1.5b,United States
114,115,harrystyles,57,0.59k,46.9m,4.7m,6.38%,2.9m,2.8b,United States
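
The engagement-rate column has the same string-sorting issue: `NaN%` sorts first, and a rate like thv's `25.80%` would sort below `9.72%`. A small sketch of parsing the percentages and ignoring missing values, using rates copied from the tables in this notebook. `parse_rate` is an illustrative helper name, not an existing function.

```python
import math

def parse_rate(text):
    """Convert '1.39%' to 1.39; return NaN for missing values such as 'NaN%'."""
    try:
        return float(str(text).strip().rstrip("%"))
    except ValueError:
        return math.nan

rates = {
    "thv": "25.80%",
    "roses_are_rosie": "9.72%",
    "rkive": "NaN%",
    "cristiano": "1.39%",
}

# String sort: 'NaN%' lands first and '25.80%' lands below '9.72%'.
by_string = sorted(rates, key=lambda name: rates[name], reverse=True)

# Numeric sort: drop missing rates, then order by value.
parsed = {name: parse_rate(value) for name, value in rates.items()}
by_value = sorted(
    (name for name in parsed if not math.isnan(parsed[name])),
    key=lambda name: parsed[name],
    reverse=True,
)
print(by_string)
print(by_value)
```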


In [41]:
# sort the 'total_likes' column in descending order
insta_totallike_df = insta_df.sort_values('total_likes', ascending=False)
insta_totallike_df.head()

Unnamed: 0,rank,channel_info,influence_score,posts,followers,avg_likes,60_day_eng_rate,new_post_avg_like,total_likes,country
102,103,thv,83,0.06k,49.3m,15.4m,25.80%,12.6m,987.4m,
199,200,raisa6690,80,4.2k,32.8m,232.2k,0.30%,97.4k,969.1m,Indonesia
148,149,addisonraee,85,0.30k,40.1m,3.1m,2.49%,994.4k,957.9m,
169,170,antogriezmann,83,0.87k,36.5m,1.1m,0.57%,203.6k,955.4m,France
132,133,hrithikroshan,85,0.58k,43.7m,1.6m,3.82%,1.6m,949.9m,Côte d'Ivoire


#### Selected accounts (the top-rated accounts in each of the metrics analyzed above, plus the most followed, which is the default order seen in insta_df.head()):

Sports: realmadrid (soccer club), cristiano (soccer player), paulodybala (soccer player)

Music: rkive (BTS music group member), thv (BTS music group member), Shakira (musician)

Modeling/Acting: kyliejenner, brunamarquezine, addisonraee

#### Most recent sponsorship posts for each of the accounts above:

realmadrid: does not do sponsored posts on its main account

cristiano: https://www.instagram.com/p/Cnjkb2FhFFt/ 1/18/23 paid partnership with LiveScore

paulodybala: https://www.instagram.com/p/CmMi5ZIIQnW/ 12/15/22 sponsored post by OneFootball

rkive: https://www.instagram.com/p/CqZijhPvxkZ/ 3/29/23 sponsored post by Bottega Veneta

thv: https://www.instagram.com/p/CrAUv3_vp9P/ 04/14/23 sponsored post by Celine by hedi slimane

Shakira: https://www.instagram.com/p/CkBjH35DpxT/ 10/22/22 sponsored post with YouTube Music (Google)

kyliejenner: https://www.instagram.com/p/CrQ-4wSpbhC/ 4/20/22 sponsored post with Dolce & Gabbana

brunamarquezine: https://www.instagram.com/p/CoGVbJNpJxz/ 1/31/23 sponsored post with AbsolutBrasil

addisonraee: https://www.instagram.com/p/CftzxiCD4WN/ 07/07/22 sponsored post with Samsung Mobile/Google

#### Given all the sponsored posts above, I'm going to create my own dataframe with the brand metrics before/after each post, then analyze it to see whether most of these posts boosted brand awareness/reputation

In [62]:
#websites used:
# For Google Trends data, used Google Trends (https://trends.google.com/), comparing the week before the post date with the week after it
# For Google PageRank data, used a combination of the Wayback Machine and https://dnschecker.org/pagerank.php (used here as a proxy for brand reputation, i.e. how favorably PageRank views the site)

research_data = [{'Influencer': 'cristiano', 'PostDate': "1/18/23", 'Company': 'LiveScore', 
                  'BeforeGoogleTrends': 38, 'AfterGoogleTrends': 45,
                 'BeforeGooglePageRank': 5, 'AfterGooglePageRank': 5},
            {'Influencer': 'paulodybala', 'PostDate': "12/15/22", 'Company': 'OneFootball',
                 'BeforeGoogleTrends': 21, 'AfterGoogleTrends': 100,
                'BeforeGooglePageRank': 5, 'AfterGooglePageRank': 6},
            {'Influencer': 'rkive', 'PostDate': "3/29/23", 'Company': 'Bottega Veneta',
                 'BeforeGoogleTrends': 72, 'AfterGoogleTrends': 89,
                 'BeforeGooglePageRank': 5, 'AfterGooglePageRank': 5},
            {'Influencer': 'thv', 'PostDate': "04/14/23", 'Company': 'Celine by hedi slimane',
                 'BeforeGoogleTrends': 30, 'AfterGoogleTrends': 45,
                 'BeforeGooglePageRank': 4, 'AfterGooglePageRank': 5},
            {'Influencer': 'Shakira', 'PostDate': "10/22/22", 'Company': 'Google',
                 'BeforeGoogleTrends': 93, 'AfterGoogleTrends': 96,
                 'BeforeGooglePageRank': 10, 'AfterGooglePageRank': 10},
            {'Influencer': 'kyliejenner', 'PostDate': "4/20/22", 'Company': 'Dolce & Gabbana',
                 'BeforeGoogleTrends': 71, 'AfterGoogleTrends': 61,
                 'BeforeGooglePageRank': 6, 'AfterGooglePageRank': 6},
            {'Influencer': 'brunamarquezine', 'PostDate': "1/31/23", 'Company': 'Absolut Vodka Brazil',
                 'BeforeGoogleTrends': 54, 'AfterGoogleTrends': 51,
                 'BeforeGooglePageRank': 5, 'AfterGooglePageRank': 5},
            {'Influencer': 'addisonraee', 'PostDate': "07/07/22", 'Company': 'Google',
                 'BeforeGoogleTrends': 95, 'AfterGoogleTrends': 97,
                 'BeforeGooglePageRank': 10, 'AfterGooglePageRank': 10}]
                

# create the DataFrame from the list of dictionaries
research_df = pd.DataFrame(research_data)
research_df

Unnamed: 0,Influencer,PostDate,Company,BeforeGoogleTrends,AfterGoogleTrends,BeforeGooglePageRank,AfterGooglePageRank
0,cristiano,1/18/23,LiveScore,38,45,5,5
1,paulodybala,12/15/22,OneFootball,21,100,5,6
2,rkive,3/29/23,Bottega Veneta,72,89,5,5
3,thv,04/14/23,Celine by hedi slimane,30,45,4,5
4,Shakira,10/22/22,Google,93,96,10,10
5,kyliejenner,4/20/22,Dolce & Gabbana,71,61,6,6
6,brunamarquezine,1/31/23,Absolut Vodka Brazil,54,51,5,5
7,addisonraee,07/07/22,Google,95,97,10,10


In [67]:
before_trends_avg = research_df['BeforeGoogleTrends'].mean()
after_trends_avg = research_df['AfterGoogleTrends'].mean()
print(before_trends_avg)
print(after_trends_avg)
before_rank_avg = research_df['BeforeGooglePageRank'].mean()
after_rank_avg = research_df['AfterGooglePageRank'].mean()
print(before_rank_avg)
print(after_rank_avg)

59.25
73.0
6.25
6.5
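
The averages above hide how uneven the per-campaign changes are. As a rough check, the sketch below computes the change for each campaign from the Google Trends values in `research_df` and compares the mean with the median. The `campaigns` list is simply those values re-typed in plain Python for illustration.

```python
from statistics import mean, median

# (company, before, after) Google Trends values copied from research_df.
campaigns = [
    ("LiveScore", 38, 45),
    ("OneFootball", 21, 100),
    ("Bottega Veneta", 72, 89),
    ("Celine by hedi slimane", 30, 45),
    ("Google (Shakira)", 93, 96),
    ("Dolce & Gabbana", 71, 61),
    ("Absolut Vodka Brazil", 54, 51),
    ("Google (addisonraee)", 95, 97),
]

changes = [after - before for _, before, after in campaigns]
increases = sum(change > 0 for change in changes)

print("mean change:  ", mean(changes))    # pulled up by OneFootball's +79
print("median change:", median(changes))
print("campaigns with an increase:", increases, "of", len(changes))

# Mean change with the largest mover excluded, as a sensitivity check.
trimmed = sorted(changes)[:-1]
print("mean change without the largest mover:", round(mean(trimmed), 2))
```

Because one campaign accounts for most of the total change, the median is probably the safer summary of a typical campaign in a sample this small.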


In [68]:
before_trends_var = research_df['BeforeGoogleTrends'].var()
after_trends_var = research_df['AfterGoogleTrends'].var()
print(before_trends_var)
print(after_trends_var)
before_rank_var = research_df['BeforeGooglePageRank'].var()
after_rank_var = research_df['AfterGooglePageRank'].var()
print(before_rank_var)
print(after_rank_var)

787.9285714285714
612.2857142857143
5.642857142857143
4.857142857142857


In [78]:
#let's see industry-specific trend avgs too, will talk about in findings
#rows 0-1 are the sports accounts, rows 2-4 the music accounts, rows 5-7 the modeling/acting accounts
before_sports_trends_avg = research_df.iloc[0:2]['BeforeGoogleTrends'].mean()
after_sports_trends_avg = research_df.iloc[0:2]['AfterGoogleTrends'].mean()
print(before_sports_trends_avg)
print(after_sports_trends_avg)
before_music_trends_avg = research_df.iloc[2:5]['BeforeGoogleTrends'].mean()
after_music_trends_avg = research_df.iloc[2:5]['AfterGoogleTrends'].mean()
print(before_music_trends_avg)
print(after_music_trends_avg)
before_model_trends_avg = research_df.iloc[5:8]['BeforeGoogleTrends'].mean()
after_model_trends_avg = research_df.iloc[5:8]['AfterGoogleTrends'].mean()
print(before_model_trends_avg)
print(after_model_trends_avg)

29.5
72.5
65.0
76.66666666666667
73.33333333333333
69.66666666666667
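
Positional slices such as `iloc[0:2]` depend on the row order of `research_df`, so grouping by an explicit industry label is a less fragile way to get per-industry averages. Below is a minimal sketch in plain Python, assuming the industry assignments from the account selection earlier in the notebook. The `industry_of` mapping and `rows` list are illustrative names, with values re-typed from `research_df`.

```python
from collections import defaultdict
from statistics import mean

# Industry labels follow the account selection earlier in the notebook.
industry_of = {
    "cristiano": "sports", "paulodybala": "sports",
    "rkive": "music", "thv": "music", "Shakira": "music",
    "kyliejenner": "modeling", "brunamarquezine": "modeling", "addisonraee": "modeling",
}

# (influencer, before, after) Google Trends values copied from research_df.
rows = [
    ("cristiano", 38, 45),
    ("paulodybala", 21, 100),
    ("rkive", 72, 89),
    ("thv", 30, 45),
    ("Shakira", 93, 96),
    ("kyliejenner", 71, 61),
    ("brunamarquezine", 54, 51),
    ("addisonraee", 95, 97),
]

grouped = defaultdict(list)
for influencer, before, after in rows:
    grouped[industry_of[influencer]].append((before, after))

summary = {}
for industry, pairs in grouped.items():
    before_avg = mean([b for b, _ in pairs])
    after_avg = mean([a for _, a in pairs])
    summary[industry] = (before_avg, after_avg, after_avg - before_avg)
    print(f"{industry}: {before_avg:.2f} -> {after_avg:.2f} ({after_avg - before_avg:+.2f})")
```

With pandas, the same grouping could be written by adding an industry column via `research_df['Influencer'].map(industry_of)` and calling `groupby` on it.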


In [70]:
# Important survey question responses to note for findings/implications:
survey_df[0:4] #mobile users, good general overview on sponsored posts

Unnamed: 0,index,Question,Segment Type,Segment Description,Answer,Count,Percentage
0,0,What do you think when an influencer is obviou...,Mobile,Global results,Is this product cool?,268,0.226
1,1,What do you think when an influencer is obviou...,Mobile,Global results,This is lame,532,0.449
2,2,What do you think when an influencer is obviou...,Mobile,Global results,Get that money!,293,0.247
3,3,What do you think when an influencer is obviou...,Mobile,Global results,Other (comment),91,0.077


In [71]:
survey_df[8:12] #looking at model account follower metrics, most of the audience is female so this is of interest

Unnamed: 0,index,Question,Segment Type,Segment Description,Answer,Count,Percentage
8,8,What do you think when an influencer is obviou...,Gender,Female voters,Is this product cool?,71,0.165
9,9,What do you think when an influencer is obviou...,Gender,Female voters,This is lame,220,0.51
10,10,What do you think when an influencer is obviou...,Gender,Female voters,Get that money!,100,0.232
11,11,What do you think when an influencer is obviou...,Gender,Female voters,Other (comment),40,0.093


In [73]:
survey_df[12:16] #looking at sports account follower metrics, mostly male so this is of interest

Unnamed: 0,index,Question,Segment Type,Segment Description,Answer,Count,Percentage
12,12,What do you think when an influencer is obviou...,Gender,Male voters,Is this product cool?,197,0.262
13,13,What do you think when an influencer is obviou...,Gender,Male voters,This is lame,311,0.414
14,14,What do you think when an influencer is obviou...,Gender,Male voters,Get that money!,192,0.256
15,15,What do you think when an influencer is obviou...,Gender,Male voters,Other (comment),51,0.068
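
To get a feel for whether the male/female gap in the survey is larger than sampling noise, one option is a two-proportion z-test on the counts shown above. This is only a sketch: it assumes the respondents are independent random samples of each group, which the survey source does not confirm. `two_proportion_z` is a helper written here for illustration.

```python
import math

def two_proportion_z(successes_a, total_a, successes_b, total_b):
    """Two-sided z-test for a difference between two independent proportions."""
    p_a = successes_a / total_a
    p_b = successes_b / total_b
    pooled = (successes_a + successes_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution, via erfc.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Counts from the survey rows above; each total is the sum of the four answers.
male_total = 197 + 311 + 192 + 51     # 751 male respondents
female_total = 71 + 220 + 100 + 40    # 431 female respondents

z_cool, p_cool = two_proportion_z(197, male_total, 71, female_total)
z_lame, p_lame = two_proportion_z(311, male_total, 220, female_total)
print(f"'Is this product cool?' male vs female: z = {z_cool:.2f}, p = {p_cool:.4f}")
print(f"'This is lame' male vs female: z = {z_lame:.2f}, p = {p_lame:.4f}")
```

A positive `z_cool` means a larger share of male respondents chose "Is this product cool?", and a negative `z_lame` means a smaller share of them chose "This is lame".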


### Findings and Implications

After conducting the above analysis, I got a lot of insight into how an influencer marketing campaign can impact different social media accounts in different industries in a variety of ways. 

With regard to brand awareness, influencer-driven campaigns were fairly effective overall: average Google Trends interest in the sponsoring brands rose from 59.25 in the week before the post to 73.0 in the week after, an average increase of 13.75 points. The effect varied enormously, though, with some brands increasing almost 5x (OneFootball, from 21 to 100) and others actually dipping slightly (Dolce & Gabbana, from 71 to 61), and smaller companies were more likely to increase at a higher rate. Another interesting development was how differently the industries/audiences were impacted. Averaging the rows of research_df by industry, sponsors of the sports accounts (cristiano, paulodybala) gained 43 points on average, sponsors of the music accounts (rkive, thv, Shakira) gained about 11.7, and sponsors of the modeling/acting accounts (kyliejenner, brunamarquezine, addisonraee) lost about 3.7. Looking at the survey results, and given that followers of sports accounts are mostly men while followers of model/fashion accounts are mostly women, we get better insight into why there's such a disparity between influencer-driven campaigns. While 51% of female respondents thought obvious sponsored posts were lame and only 16% were interested in the actual product, 41% of male respondents thought they were lame and 26% were interested in the product. This could be an indication that if you want more effective brand awareness from an influencer-driven campaign, you should make sure the target audience is majority men and/or interested in sports.

Influencer-driven campaigns seemed to have little to no effect on brand reputation, however. Google PageRank was unchanged for six of the eight brands and rose by a single point for the other two (OneFootball and Celine), and most comments on the posts were neutral, with very little of the discussion relating to the brand itself. This holds across all the major accounts regardless of industry/audience and might be an indication that a top Instagram influencer's approval won't have much of an effect on a brand's reputation, likely because these influencers advertise a large number of products and their endorsements aren't personalized.

Overall, these findings could have big implications for advertisers considering social media influencer marketing. Many of the possible conclusions drawn from this research require more in-depth analysis (is it the same on platforms besides Instagram, what about micro-influencers, etc.), but there are certain conclusions we can make here that offer some useful insights. There seems to be a real opportunity for smaller companies that want to target male/sports fans, as OneFootball and LiveScore did, to greatly increase brand awareness almost overnight while, according to the survey results, minimizing the number of people who skip over the product itself. However, if you're a bigger brand that already has high awareness, you may want to avoid influencer marketing altogether. It doesn't seem to have much of an impact on brand reputation, likely because a large share of people see obvious influencer-sponsored posts as "lame" or just a cash grab, and if you already have high brand awareness there's little need for it.

In short, influencer marketing could be a powerful tool if used correctly, but it does seem to have very specific uses: it appears to suit a small sports-related brand far better than a large fashion brand. With only eight campaigns in the sample, there's definitely more research that could be done, but the insights from this study offer a useful starting point.