<img width="10%" alt="Naas" src="https://landen.imgix.net/jtci2pxwjczr/assets/5ice39g4.png?w=160"/>

# LinkedIn - Get posts stats from profile
<a href="https://app.naas.ai/user-redirect/naas/downloader?url=https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master/LinkedIn/LinkedIn_Get_posts_stats_from_profile.ipynb" target="_parent"><img src="https://naasai-public.s3.eu-west-3.amazonaws.com/open_in_naas.svg"/></a>

**Tags:** #linkedin #profile #post #stats #naas_drivers #content #automation #csv

**Author:** [Florent Ravenel](https://www.linkedin.com/in/florent-ravenel/)

This notebook retrieves post statistics from any LinkedIn profile.<br>
It returns a dataframe and saves it as a CSV file in your local storage.<br><br>
**Available columns:**
- **ACTIVITY_ID:** Post unique ID.
- **PAGINATION_TOKEN:** Token used to decode published date.
- **PUBLISHED_DATE:** Date and time the post was published.
- **AUTHOR_NAME:** Name of post author.
- **AUTHOR_URL:** LinkedIn URL of post author.
- **SUBDESCRIPTION:** Subdescription of post (time since publication).
- **TITLE:** First sentence of post.
- **TEXT:** Content of post.
- **CHARACTER_COUNT:** Number of characters in the post.  
- **TAGS:** List of the hashtags. 
- **TAGS_COUNT:** Number of hashtags.
- **EMOJIS:** List of emojis.
- **EMOJIS_COUNT:** Number of emojis.
- **LINKS:** Links used in post.
- **LINKS_COUNT:** Number of links.
- **PROFILE_MENTION:** People mentioned in post. 
- **COMPANY_MENTION:** Companies mentioned in post.
- **CONTENT:** Type of content.
- **CONTENT_TITLE:** Title of content.
- **CONTENT_URL:** URL of content.
- **CONTENT_ID:** ID of content.
- **IMAGE_URL:** Image URL linked in post.
- **POLL_ID:** Poll unique ID.
- **POLL_QUESTION:** Poll question.
- **POLL_RESULTS:** Poll results.
- **POST_URL:** Post URL.
- **VIEWS:** Number of people who saw the content (only available for your own posts).
- **COMMENTS:** Number of people who wrote something in the comment section.
- **LIKES:** Number of people who clicked the like (or another reaction) button.
- **SHARES:** Number of people who shared the content.
- **ENGAGEMENT_SCORE:** Ratio of likes plus comments to views (0 if you are not the author of the post).
- **DATE_EXTRACT:** Date of last extraction.
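
As an illustration of how `ENGAGEMENT_SCORE` relates to the other columns, here is a minimal sketch. The formula `(LIKES + COMMENTS) / VIEWS` is an assumption inferred from the sample row shown later in this notebook (14 likes, 1 comment, 1114 views, score 0.0135). The helper name `add_engagement_score` is hypothetical and is not part of `naas_drivers`.

```python
import pandas as pd


def add_engagement_score(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with an ENGAGEMENT_SCORE column (hypothetical helper)."""
    out = df.copy()
    views = out["VIEWS"].astype(float)
    interactions = out["LIKES"] + out["COMMENTS"]
    # Assumed formula: (likes + comments) / views, forced to 0 when views are missing.
    score = (interactions / views).where(views > 0, 0.0)
    out["ENGAGEMENT_SCORE"] = score.round(4)
    return out


# First row mirrors the sample post; second row simulates a post you did not author.
posts = pd.DataFrame({"VIEWS": [1114, 0], "LIKES": [14, 3], "COMMENTS": [1, 2]})
scored = add_engagement_score(posts)
print(scored)
```

Rows with no views get a score of 0, which matches the note that the score is 0 for posts you did not author.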

## Input

### Get common variables, functions

In [1]:
%run "../common.ipynb"

## Model

### Get your posts

In [2]:
df_posts = get_data(LK_PROFILE_POSTS)
print("✅ Init posts fetched:", len(df_posts))
df_posts.head(1)

[Errno 2] No such file or directory: '/home/ftp/Naas Content Engine/LinkedIn/Inputs/LINKEDIN_PROFILE_POSTS_ACoAABCNSioBW3YZHc2lBHVG0E_TXYWitQkmwog.csv'
✅ Init posts fetched: 0
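
The output above is what a first run looks like: the CSV does not exist yet, the error is printed, and the post count is 0. `get_data` is defined in `common.ipynb`, which is not shown here, so the following is only a guess at that behavior. `load_csv_or_empty` is a hypothetical stand-in, not the actual implementation.

```python
import os
import tempfile

import pandas as pd


def load_csv_or_empty(path: str) -> pd.DataFrame:
    """Hypothetical stand-in for get_data: read a CSV or fall back to an empty dataframe."""
    try:
        return pd.read_csv(path)
    except FileNotFoundError as e:
        # Report the problem but keep the notebook running with zero rows.
        print(e)
        return pd.DataFrame()


with tempfile.TemporaryDirectory() as tmp:
    # The file was never created, so the fallback branch is taken.
    missing = load_csv_or_empty(os.path.join(tmp, "posts.csv"))
    print("Init posts fetched:", len(missing))
```

Returning an empty dataframe instead of raising lets the next step treat "no file yet" as "fetch everything".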


### Update last posts
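This step fetches the last 7 posts from the LinkedIn API (the count comes from `LINKEDIN_POSTS_UPDATE`) and merges them into the posts already extracted. An existing extract is refreshed only if it is more than 300 minutes old (`min_updated_time`).<br>
PS: On the first execution, all posts are retrieved.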

In [4]:
def update_posts(df_posts,
                 profile_url,
                 key="POST_URL",
                 no_posts=LINKEDIN_POSTS_UPDATE,
                 min_updated_time=300):
    # Init output
    df = pd.DataFrame()
    df_new = pd.DataFrame()
    
    # If posts were already extracted, refresh them only when the last extraction is old enough
    if len(df_posts) > 0:
        if "DATE_EXTRACT" in df_posts.columns:
            last_update_date = df_posts["DATE_EXTRACT"].max()
            time_last_update = datetime.now() - datetime.strptime(last_update_date, "%Y-%m-%d %H:%M:%S")
            minute_last_update = time_last_update.total_seconds() / 60
            if minute_last_update > min_updated_time:
                # If df posts not empty get the last X posts (new and already existing)
                df_new = linkedin.connect(LI_AT, JSESSIONID).profile.get_posts_feed(profile_url,
                                                                                    limit=no_posts)
            else:
                print(f"🛑 Nothing to update. Last update done {int(minute_last_update)} minutes ago.")
    else:
        # No existing posts (first run): fetch the full history
        df_new = linkedin.connect(LI_AT, JSESSIONID).profile.get_posts_feed(profile_url,
                                                                            limit=-1)

    # Concat new and existing posts; for duplicated keys, keep the freshly fetched row
    df = pd.concat([df_new, df_posts]).drop_duplicates(key, keep="first")

    # Return all posts
    print("✅ Updated posts fetched:", len(df))
    return df.reset_index(drop=True)

df_update = update_posts(df_posts, LINKEDIN_PROFILE_URL)
df_update.head(1)

✅ Updated posts fetched: 25


| | ACTIVITY_ID | PAGINATION_TOKEN | PUBLISHED_DATE | AUTHOR_NAME | AUTHOR_URL | SUBDESCRIPTION | TITLE | TEXT | CHARACTER_COUNT | TAGS | ... | POLL_ID | POLL_QUESTION | POLL_RESULTS | POST_URL | VIEWS | COMMENTS | LIKES | SHARES | ENGAGEMENT_SCORE | DATE_EXTRACT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 6950488391608098816 | dXJuOmxpOmFjdGl2aXR5OjY5NTA0ODgzOTE2MDgwOTg4MT... | 2022-07-06 18:39:26+0200 | Florent Ravenel | https://www.linkedin.com/in/ACoAABCNSioBW3YZHc... | 19 hours ago | ⚡️📊 Do you want to learn how to create a data ... | ⚡️📊 Do you want to learn how to create a data ... | 862 | #naas #jupyternotebook #github #linkedin #noti... | ... | | | | https://www.linkedin.com/feed/update/urn:li:ac... | 1114 | 1 | 14 | 0 | 0.0135 | 2022-07-07 14:05:30 |
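
The merge step in `update_posts` depends on row order: the fresh rows are concatenated first, so `drop_duplicates(..., keep="first")` keeps the newly fetched stats whenever a `POST_URL` appears in both dataframes. The toy example below shows that behavior in isolation. The URLs and like counts are made up for illustration.

```python
import pandas as pd

# Posts already stored in the CSV (made-up data).
existing = pd.DataFrame({
    "POST_URL": ["https://example.com/post/1", "https://example.com/post/2"],
    "LIKES": [10, 4],
})

# Posts just fetched: post/2 has updated stats, post/3 is new.
fresh = pd.DataFrame({
    "POST_URL": ["https://example.com/post/2", "https://example.com/post/3"],
    "LIKES": [9, 1],
})

# Fresh rows come first, so keep="first" retains them over the stored copies.
merged = (
    pd.concat([fresh, existing])
    .drop_duplicates(subset="POST_URL", keep="first")
    .reset_index(drop=True)
)
print(merged)
```

Reversing the concat order would keep the stale stored rows instead, so the order of the list passed to `pd.concat` matters.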


## Output

### Save dataframe

In [5]:
save_data(df_update, LK_PROFILE_POSTS)

👌 Well done! Your Dependency has been sent to production. 

PS: to remove the "Dependency" feature, just replace .add by .delete
✅ Dataframe successfully saved in CSV: LinkedIn/Inputs/LINKEDIN_PROFILE_POSTS_ACoAABCNSioBW3YZHc2lBHVG0E_TXYWitQkmwog.csv
