In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [7]:
import plotly.express as px

## Followers

In [38]:
df_followers = pd.read_csv('linkedin_data/followers_job.csv')
df_followers

Unnamed: 0,Job function,Total followers
0,Engineering,166
1,Business Development,155
2,Research,117
3,Education,80
4,Information Technology,47
5,Marketing,47
6,Operations,39
7,Arts and Design,34
8,Sales,34
9,Community and Social Services,33


In [39]:
total_followers = df_followers['Total followers'].sum()

# Calculate the percentage of each 'Job function'
df_followers['Percentage'] = (df_followers['Total followers'] / total_followers) * 100

In [45]:
df_followers.head(10)

Unnamed: 0,Job function,Total followers,Percentage
0,Engineering,166,17.025641
1,Business Development,155,15.897436
2,Research,117,12.0
3,Education,80,8.205128
4,Information Technology,47,4.820513
5,Marketing,47,4.820513
6,Operations,39,4.0
7,Arts and Design,34,3.487179
8,Sales,34,3.487179
9,Community and Social Services,33,3.384615


"Engineering" and "Business Development" have the highest follower counts, accounting for approximately 17.03% and 15.90% of total followers, respectively. These two job functions are the most common among the LinkedIn audience in this dataset. "Research" and "Education" also have a notable presence, with around 12.00% and 8.21% of total followers, respectively. "Information Technology" and "Marketing" each account for around 4.82%, which shows that tech-related and marketing roles also form a visible part of the audience.

## Content Metrics

In [29]:
df_posts = pd.read_csv('linkedin_data/posts.csv')
df_posts

Unnamed: 0,Post title,Post link,Post type,Campaign name,Posted by,Created date,Campaign start date,Campaign end date,Audience,Impressions,Views (Excluding offsite video views),Offsite Views,Clicks,Click through rate (CTR),Likes,Comments,Reposts,Follows,Engagement rate,Content Type
0,,https://www.linkedin.com/feed/update/urn:li:ac...,Organic,,"Alevtina Evgrafova, PhD, ACC ICF",07/21/2023,,,All followers,26,,,6,0.230769,0,0,0,,0.230769,
1,Wondering what are latest technology trends? K...,https://www.linkedin.com/feed/update/urn:li:ac...,Organic,,"Alevtina Evgrafova, PhD, ACC ICF",07/20/2023,,,All followers,204,,,5,0.024510,3,0,0,,0.039216,
2,,https://www.linkedin.com/feed/update/urn:li:ac...,Organic,,"Alevtina Evgrafova, PhD, ACC ICF",07/13/2023,,,All followers,43,,,2,0.046512,0,0,0,,0.046512,
3,Business innovations are based on such applied...,https://www.linkedin.com/feed/update/urn:li:ac...,Organic,,"Alevtina Evgrafova, PhD, ACC ICF",07/08/2023,,,All followers,536,,,9,0.016791,5,0,0,,0.026119,
4,💜 We deeply appreciate the dedication of these...,https://www.linkedin.com/feed/update/urn:li:ac...,Organic,,"Alevtina Evgrafova, PhD, ACC ICF",07/02/2023,,,All followers,91,,,4,0.043956,2,0,0,,0.065934,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
70,"Welcome to Benjamin, who is our Lead Product D...",https://www.linkedin.com/feed/update/urn:li:ac...,Organic,,"Alevtina Evgrafova, PhD, ACC ICF",09/29/2022,,,All followers,109,,,5,0.045872,0,0,1,,0.055046,
71,"We would like to welcome Rim, our SMM and Mark...",https://www.linkedin.com/feed/update/urn:li:ac...,Organic,,"Alevtina Evgrafova, PhD, ACC ICF",09/28/2022,,,All followers,146,,,4,0.027397,4,0,3,,0.075342,
72,,https://www.linkedin.com/feed/update/urn:li:ac...,Organic,,"Alevtina Evgrafova, PhD, ACC ICF",09/26/2022,,,All followers,118,,,8,0.067797,0,0,0,,0.067797,
73,Please take part in our 3-5 minute survey - We...,https://www.linkedin.com/feed/update/urn:li:ac...,Organic,,"Alevtina Evgrafova, PhD, ACC ICF",09/23/2022,,,All followers,160,,,6,0.037500,9,0,2,,0.100000,


In [30]:
# drop reposts (rows without a post title)
df_posts.dropna(subset=['Post title'], inplace=True)
df_posts.reset_index(drop=True, inplace=True)

In [35]:
df_posts.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 62 entries, 0 to 61
Data columns (total 20 columns):
 #   Column                                 Non-Null Count  Dtype  
---  ------                                 --------------  -----  
 0   Post title                             62 non-null     object 
 1   Post link                              62 non-null     object 
 2   Post type                              62 non-null     object 
 3   Campaign name                          0 non-null      float64
 4   Posted by                              62 non-null     object 
 5   Created date                           62 non-null     object 
 6   Campaign start date                    0 non-null      float64
 7   Campaign end date                      0 non-null      float64
 8   Audience                               62 non-null     object 
 9   Impressions                            62 non-null     int64  
 10  Views (Excluding offsite video views)  16 non-null     float64
 11  Offsite 

After dropping the untitled rows, the dataset contains 62 posts. Now, let's examine how these posts performed. On LinkedIn, post performance can be evaluated with key metrics such as impressions, engagement rate, likes, comments, and reposts.

Impressions are the number of times a post has been displayed in users' feeds. High impressions indicate that the post is being seen by a larger audience. The engagement rate is the ratio of total interactions (likes, comments, reposts, and clicks) to the total number of impressions. A higher engagement rate signifies that the post is capturing viewers' interest and generating meaningful interactions.
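As a rough sketch of the engagement-rate calculation described above, the snippet below recomputes the rate as (clicks + likes + comments + reposts) / impressions for three rows copied from the posts table. The formula is an assumption based on that description; LinkedIn's export may count interactions slightly differently, so the recomputed value will not necessarily match the `Engagement rate` column for every post. The column name `Engagement rate (recomputed)` is introduced here for illustration only.

```python
import pandas as pd

# Three rows copied from the posts table above (only the columns needed here).
sample = pd.DataFrame({
    "Impressions": [204, 536, 91],
    "Clicks": [5, 9, 4],
    "Likes": [3, 5, 2],
    "Comments": [0, 0, 0],
    "Reposts": [0, 0, 0],
})

# Assumed formula: total interactions divided by impressions.
interactions = sample[["Clicks", "Likes", "Comments", "Reposts"]].sum(axis=1)
sample["Engagement rate (recomputed)"] = interactions / sample["Impressions"]

print(sample)
```

For these three rows the recomputed values agree with the exported `Engagement rate` column, which makes the assumed formula a plausible reading of how the metric is built.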

In [20]:
def plot_histogram(data, x_label, chart_title):
    """Plot a histogram of `data`, using `x_label` as the x-axis column."""
    fig = px.histogram(data,
                       x=x_label,
                       title=chart_title,
                       height=450)
    fig.show()

In [33]:
# plot histogram of impressions
plot_histogram(df_posts['Impressions'], "Impressions", "Distribution of Impressions")

In [34]:
# plot histogram of engagement rate
plot_histogram(df_posts['Engagement rate'], "Engagement rate","Distribution of Engagement Rate")

The charts show that the majority of posts receive approximately 100 to 300 impressions, which suggests that they reach a moderate audience on LinkedIn. With engagement rates below 0.2, there is room for improvement in capturing the audience's interest and encouraging more interactions with the posts.
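The reading of the histogram can be cross-checked numerically. The sketch below uses a hypothetical helper, `share_between`, to compute the fraction of posts whose impressions fall inside a range. The sample values are the impressions of the titled posts visible in the printed table above, which is only a subset of the 62 posts, so the numbers it prints are illustrative rather than a result for the full dataset.

```python
import pandas as pd

def share_between(values: pd.Series, low: float, high: float) -> float:
    """Return the fraction of values that fall within [low, high] (inclusive)."""
    return values.between(low, high).mean()

# Impressions of the titled posts visible in the printed table above;
# this is only a subset of the 62 posts, so the result is illustrative.
impressions = pd.Series([204, 536, 91, 109, 146, 160], name="Impressions")

print(f"Share with 100-300 impressions: {share_between(impressions, 100, 300):.0%}")
print(f"Median impressions: {impressions.median()}")
```

On the full data, the same helper could be called as `share_between(df_posts['Impressions'], 100, 300)`, and with `df_posts['Engagement rate']` and bounds of 0 and 0.2 to check the engagement-rate observation.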

In [37]:
df_posts[df_posts['Impressions'] > 400]

Unnamed: 0,Post title,Post link,Post type,Campaign name,Posted by,Created date,Campaign start date,Campaign end date,Audience,Impressions,Views (Excluding offsite video views),Offsite Views,Clicks,Click through rate (CTR),Likes,Comments,Reposts,Follows,Engagement rate,Content Type
1,Business innovations are based on such applied...,https://www.linkedin.com/feed/update/urn:li:ac...,Organic,,"Alevtina Evgrafova, PhD, ACC ICF",07/08/2023,,,All followers,536,,,9,0.016791,5,0,0,,0.026119,
7,"🚀 Founders of RESEARCHPRENEURS, Alevtina and J...",https://www.linkedin.com/feed/update/urn:li:ac...,Organic,,"Alevtina Evgrafova, PhD, ACC ICF",05/17/2023,,,All followers,419,,,34,0.081146,9,0,2,,0.107399,
43,That was the #websummit2022 in #Lisbon!\nVery ...,https://www.linkedin.com/feed/update/urn:li:ac...,Organic,,Rim Silini,11/07/2022,,,All followers,956,,,141,0.14749,20,0,0,,0.167364,
