<img width="8%" alt="LinkedIn.png" src="https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master/.github/assets/logos/LinkedIn.png" style="border-radius: 15%">

# LinkedIn - Update metrics from posts in Notion content calendar
<a href="https://bit.ly/3JyWIk6">Give Feedback</a> | <a href="https://github.com/jupyter-naas/awesome-notebooks/issues/new?assignees=&labels=bug&template=bug_report.md&title=LinkedIn+-+Update+metrics+from+posts+in+Notion+content+calendar:+Error+short+description">Bug report</a>

**Tags:** #linkedin #profile #post #feed #naas_drivers #notion #automation #analytics #naas #scheduler #content #plotly #html #csv #image

**Author:** [Florent Ravenel](https://www.linkedin.com/in/florent-ravenel/)

**Last update:** 2023-10-04 (Created: 2022-03-22)

**Description:** This notebook tracks the performance of your LinkedIn posts by automatically updating their metrics in your Notion content calendar.


<div class="alert alert-info" role="info" style="margin: 10px">
<b>Disclaimer:</b><br>
This code is in no way affiliated with, authorized, maintained, sponsored or endorsed by LinkedIn or any of its affiliates or subsidiaries. It uses an independent and unofficial API. Use at your own risk.

This project violates LinkedIn's User Agreement Section 8.2, and because of this, LinkedIn may (and will) temporarily or permanently ban your account. We are not responsible for your account being banned.
<br>
</div>

## Input

### Import libraries
The libraries below are required to run this notebook.

In [None]:
import naas
from naas_drivers import linkedin, notion
from datetime import datetime
import pandas as pd
import os
import requests

### Setup variables

**Pre-requisite**
- Configure naas integration into your Notion
- Duplicate <a href="https://naas-official.notion.site/724fec443b134f288b356001bb1543bd?v=c82a8005a5bf4862b7c967a9689aa799">content calendar template</a>
- Share integration on the template

**Mandatory**

[Learn how to get your cookies on LinkedIn](https://www.notion.so/LinkedIn-driver-Get-your-cookies-d20a8e7e508e42af8a5b52e33f3dba75)
- `li_at`: Cookie used to authenticate Members and API clients
- `JSESSIONID`: Cookie used for Cross Site Request Forgery (CSRF) protection and URL signature validation
- `linkedin_url`: URL of the LinkedIn profile whose posts are tracked
- `notion_token`: Token of the Notion integration shared with the database
- `notion_database_url`: URL of the Notion database (content calendar)

**Optional**
- `csv_output`: Path of the CSV file in which the posts are saved locally.
- `limit`: The initial number of posts to be fetched during the first execution.
- `update`: The number of posts to be refreshed in each update.
- `cron`: This variable represents the CRON syntax used to run the scheduler. More information here: https://crontab.guru/#0_12,18_*_*_1-5
- `refresh_interval`: This variable sets the minimum time interval (in minutes) for data refresh when using this template manually. This helps to prevent excessive calls to the LinkedIn API.
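
As a minimal sketch of how `refresh_interval` can gate a refresh, the snippet below compares the elapsed minutes since the last extraction with the interval. The helper name `needs_refresh` and its `now` parameter are hypothetical and only serve the illustration; the timestamp format matches the one parsed from `DATE_EXTRACT` later in this notebook.

```python
from datetime import datetime


def needs_refresh(last_extract, refresh_interval, now=None):
    """Return True when more than `refresh_interval` minutes have elapsed."""
    # `now` is a hypothetical parameter so the check is reproducible;
    # it defaults to the current time.
    now = now or datetime.now()
    elapsed = now - datetime.strptime(last_extract, "%Y-%m-%d %H:%M:%S")
    return elapsed.total_seconds() / 60 > refresh_interval


# Illustrative timestamps, not real data
print(needs_refresh("2023-10-04 08:00:00", 30, now=datetime(2023, 10, 4, 8, 45)))  # True
print(needs_refresh("2023-10-04 08:00:00", 30, now=datetime(2023, 10, 4, 8, 10)))  # False
```

With the default of 30 minutes, running the notebook again 10 minutes after an extraction would skip the LinkedIn call, while a run 45 minutes later would fetch fresh data.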

In [None]:
# Mandatory
li_at = naas.secret.get("LINKEDIN_LI_AT") or "YOUR_LINKEDIN_LI_AT" #example: AQFAzQN_PLPR4wAAAXc-FCKmgiMit5FLdY1af3-2
JSESSIONID = naas.secret.get("LINKEDIN_JSESSIONID") or "YOUR_LINKEDIN_JSESSIONID" #example: ajax:8379907400220387585
linkedin_url = "https://www.linkedin.com/in/xxxxx/" # EXAMPLE "https://www.linkedin.com/in/myprofile/"
notion_token = "ENTER_YOUR_NOTION_TOKEN_HERE" # EXAMPLE : "secret_eaLtxxxxxxxzuBPQvParsFxxxxxxx"
notion_database_url = "ENTER_YOUR_NOTION_DATABASE_URL_HERE"  # EXAMPLE : "https://www.notion.so/naas-official/fc64df2aae7f4796963d14edec816xxxxx"

# Optional
csv_output = f"LINKEDIN_POSTS_{linkedin_url.split('https://www.linkedin.com/in/')[-1].split('/')[0]}.csv"
limit = 5
update = 3
cron = "0 8 * * *"
refresh_interval = 30

## Model

### Get your posts from CSV
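The CSV file name is derived from the profile URL defined in the setup cell. The sketch below isolates that string manipulation; the function name `csv_name_for` is hypothetical and the URL is the placeholder used in the setup comments.

```python
def csv_name_for(linkedin_url):
    """Build the CSV file name from a LinkedIn profile URL."""
    # Same string operations as the setup cell: keep what follows "/in/"
    # and drop the trailing slash.
    profile_id = linkedin_url.split("https://www.linkedin.com/in/")[-1].split("/")[0]
    return f"LINKEDIN_POSTS_{profile_id}.csv"


print(csv_name_for("https://www.linkedin.com/in/myprofile/"))  # LINKEDIN_POSTS_myprofile.csv
```

On the first run this file does not exist yet, so the cell below returns an empty DataFrame and the next step fetches the initial `limit` posts.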

In [None]:
def read_csv(file_path):
    try:
        df = pd.read_csv(file_path)
    except FileNotFoundError:
        # First run: no CSV yet, so return an empty DataFrame
        return pd.DataFrame()
    return df

df_posts_init = read_csv(csv_output)
print("✅ Posts:", len(df_posts_init))
df_posts_init.head(1)

### Get or update last posts 
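The cell below concatenates freshly fetched posts in front of the stored ones and drops duplicates on `POST_URL` with `keep="first"`, so the fresh metrics replace the stored ones. The sketch below reproduces that "first row wins" behaviour with plain Python dictionaries instead of pandas; `merge_posts` and the sample URLs are hypothetical.

```python
def merge_posts(new_posts, stored_posts, key="POST_URL"):
    """Merge two lists of post dicts, keeping the first row seen per key."""
    merged = {}
    # New posts come first, so they take priority over stored rows
    for post in new_posts + stored_posts:
        merged.setdefault(post[key], post)
    return list(merged.values())


# Illustrative data, not real posts
new_posts = [{"POST_URL": "https://example.com/post-b", "LIKES": 12}]
stored_posts = [
    {"POST_URL": "https://example.com/post-a", "LIKES": 3},
    {"POST_URL": "https://example.com/post-b", "LIKES": 7},
]
merged = merge_posts(new_posts, stored_posts)
print(merged)
```

Here `post-b` keeps the refreshed value of 12 likes, and `post-a`, which was not part of the refresh, is carried over unchanged.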

In [None]:
def update_posts(
    li_at,
    JSESSIONID,
    df_posts,
    linkedin_url,
    limit=5,
    update=3,
    refresh_interval=60,
    key="POST_URL",
):
    # Init output
    df = pd.DataFrame()
    df_new = pd.DataFrame()

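    # If posts are already stored, refresh only when the last extract is older
    # than refresh_interval; otherwise (first run) fetch the initial `limit` posts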
    if len(df_posts) > 0:
        if "DATE_EXTRACT" in df_posts.columns:
            last_update_date = df_posts["DATE_EXTRACT"].max()
            time_last_update = datetime.now() - datetime.strptime(
                last_update_date, "%Y-%m-%d %H:%M:%S"
            )
            minute_last_update = time_last_update.total_seconds() / 60
            if minute_last_update > refresh_interval:
                # If df posts not empty get the last X posts (new and already existing)
                df_new = linkedin.connect(li_at, JSESSIONID).profile.get_posts_feed(
                    linkedin_url,
                    limit=update
                )
            else:
                print(
                    f"🛑 Nothing to update. Last update done {int(minute_last_update)} minutes ago."
                )
    else:
        df_new = linkedin.connect(li_at, JSESSIONID).profile.get_posts_feed(
            linkedin_url,
            limit=limit
        )

    # Concat new and stored posts; new rows take priority on duplicate keys
    df = pd.concat([df_new, df_posts]).drop_duplicates(key, keep="first")

    # Return all posts
    print("✅ Updated posts:", len(df))
    return df.reset_index(drop=True)

df_posts = update_posts(
    li_at,
    JSESSIONID,
    df_posts_init,
    linkedin_url,
    limit=limit,
    update=update,
    refresh_interval=refresh_interval
)
df_posts.head(1)

### Save DataFrame in CSV and send to production

In [None]:
# Save dataframe in CSV
df_posts.to_csv(csv_output, index=False)

# Send CSV to production (It could be used with other scripts)
naas.dependency.add(csv_output)

### Get Notion database
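Two small transformations happen in the cell below: the database ID is taken from the last segment of the database URL, and each page (a table of `Name` / `Type` / `Value` rows) is pivoted into a single row. The sketch below shows both in plain Python rather than pandas; the function name, the `?v=xxxxx` view suffix, and the sample property rows are assumptions made for the illustration.

```python
def notion_database_id(notion_database_url):
    """Return the last URL segment without any '?v=<view id>' suffix."""
    return notion_database_url.split("/")[-1].split("?v=")[0]


# Placeholder URL from the setup comments, with a hypothetical view suffix
url = "https://www.notion.so/naas-official/fc64df2aae7f4796963d14edec816xxxxx?v=xxxxx"
print(notion_database_id(url))  # fc64df2aae7f4796963d14edec816xxxxx

# Illustrative page properties, one dict per Name/Type/Value row
page_rows = [
    {"Name": "Name", "Type": "title", "Value": "My post"},
    {"Name": "Views", "Type": "number", "Value": 120},
    {"Name": "Post URL", "Type": "url", "Value": "https://example.com/post-a"},
]
# Pivot: property names become keys, the Type column is dropped
row = {r["Name"]: r["Value"] for r in page_rows}
print(row)
```

Note that `get_notion_db` returns an empty DataFrame when the URL does not start with `https://www.notion.so/`.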

In [None]:
def get_notion_db(notion_database, key, token):
    # Init
    df_output = pd.DataFrame()
    if not notion_database.startswith("https://www.notion.so/"):
        return df_output
    
    # Get database
    database_id = notion_database.split("/")[-1].split("?v=")[0]
    pages = notion.connect(token).database.query(database_id, query={})
    
    # Loop on page
    for page in pages:
        # Get page_id
        page_id = page.id
        
        # Create dataframe from page
        df = page.df()
        
        # Remove empty pages
        page_title = df.loc[df.Name == key, "Value"].values[0]
        if page_title == "":
            notion.connect(token).blocks.delete(page_id)
            print(f"Page '{page_id}' empty => removed from database")
        else:
            # Pivot rows to columns
            columns = df["Name"].unique().tolist()
            new_df = df.copy()
            new_df = new_df.drop("Type", axis=1)
            new_df = new_df.T
            for i, c in enumerate(new_df.columns):
                new_df = new_df.rename(columns={c: columns[i]})
            new_df = new_df.drop("Name").reset_index(drop=True)

            # Add page ID
            new_df["PAGE_ID"] = page_id

            # Concat dataframe
            df_output = pd.concat([df_output, new_df])
    return df_output

df_notion = get_notion_db(
    notion_database_url,
    "Name",
    notion_token
)
print("✅ Notion DB:", len(df_notion))
df_notion.head(1)

### Get rows to be updated
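The cell below selects a post for update when the sum of its views, likes, comments and shares differs from the sum stored in Notion, or when the post is not in Notion yet. The sketch below expresses that rule in plain Python; `posts_to_update` and the sample values are hypothetical.

```python
METRICS = ("VIEWS", "LIKES", "COMMENTS", "SHARES")


def posts_to_update(posts, notion_scores):
    """Return posts whose metric total differs from the total stored in Notion."""
    changed = []
    for post in posts:
        score = sum(post[m] for m in METRICS)
        # A post missing from Notion has no stored score, so it is always selected
        if notion_scores.get(post["POST_URL"]) != score:
            changed.append(post)
    return changed


# Illustrative data, not real metrics
posts = [
    {"POST_URL": "a", "VIEWS": 100, "LIKES": 5, "COMMENTS": 1, "SHARES": 0},
    {"POST_URL": "b", "VIEWS": 50, "LIKES": 2, "COMMENTS": 0, "SHARES": 0},
    {"POST_URL": "c", "VIEWS": 10, "LIKES": 0, "COMMENTS": 0, "SHARES": 0},
]
notion_scores = {"a": 106, "b": 40}  # "c" is not in Notion yet
to_update = posts_to_update(posts, notion_scores)
print([p["POST_URL"] for p in to_update])  # ['b', 'c']
```

Comparing totals keeps the check cheap, but offsetting changes (for example one like fewer and one comment more) leave the total unchanged and would not be detected; passing `force_update=True` to `get_updated_rows` returns every row regardless.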

In [None]:
def get_updated_rows(
    df_posts,
    df_notion,
    force_update,
):
    # Init
    df = df_posts.copy()
    
    # Check if df is not empty
    if len(df) == 0:
        return pd.DataFrame()
    
    # Cleaning and filter
    df.COMPANY_MENTION = df.COMPANY_MENTION.fillna("")
    df.PROFILE_MENTION = df.PROFILE_MENTION.fillna("")
    df = df.fillna("None")
    df = df[df["ACTIVITY_ID"].astype(str) != "None"]

    # Get page ID
    engagements = {}
    page_ids = {}
    if len(df_notion) > 0:
        # Total engagement currently stored in Notion for each page
        metrics = ["Views", "Likes", "Comments", "Shares"]
        df_notion["Score"] = df_notion[metrics].fillna("0").astype(float).sum(axis=1)
        for index, row in df_notion.iterrows():
            engagements[row["Post URL"]] = row["Score"]
            page_ids[row["Post URL"]] = row["PAGE_ID"]
    df["Score"] = df["POST_URL"].map(engagements)
    df["PAGE_ID"] = df["POST_URL"].map(page_ids).fillna("None")
    
    # Return all rows if force update is True
    if force_update:
        return df.reset_index(drop=True)
    
    # Check Score
    df["SCORE"] = df["VIEWS"] + df["COMMENTS"] + df["LIKES"] + df["SHARES"]
    df = df[df["SCORE"].astype(float) != df["Score"].astype(float)]
    return df.reset_index(drop=True)

df_update = get_updated_rows(df_posts, df_notion, False)
print("✅ Rows to update:", len(df_update))
df_update.head(len(df_update))

## Output

### Update posts in Notion
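Before a page is created, the cell below picks a page title: the post title when it exists, otherwise a "Repost" label built from the shared content title. The sketch below isolates that fallback as a standalone function; the name `resolve_title` and the sample titles are hypothetical. Missing values appear as the string `"None"` because of the earlier `fillna("None")`.

```python
def resolve_title(title, content_title):
    """Pick the Notion page title, falling back to a 'Repost' label."""
    if str(title) != "None":
        return title
    if str(content_title) != "None":
        return f"Repost - {content_title}"
    return "Repost"


print(resolve_title("My launch post", "None"))   # My launch post
print(resolve_title("None", "Shared article"))   # Repost - Shared article
print(resolve_title("None", "None"))             # Repost
```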

In [None]:
def update_dynamic_properties(page, row):
    # Page properties : dynamic
    page.number("Engagment score", float(row.ENGAGEMENT_SCORE))
    page.number("Views", int(row.VIEWS))
    page.number("Likes", int(row.LIKES))
    page.number("Comments", int(row.COMMENTS))
    page.number("Shares", int(row.SHARES))
    return page

def update_content_notion(df, notion_database, notion_token):
    # Init
    if len(df) == 0:
        print(f"🛑 Nothing to update in Notion.")
        return
    database_id = notion_database.split("/")[-1].split("?v=")[0]
    
    # Loop in data
    for i, row in df.iterrows():
        # Init
        page_id = row.PAGE_ID
        title = row.TITLE
        content_title = row.CONTENT_TITLE
        if str(title) == "None" and str(content_title) != "None":
            title = f"Repost - {content_title}"
        elif str(title) == "None" and str(content_title) == "None":
            title = "Repost"
        post_url = row.POST_URL
        print(f"➡️ Start update for '{title}'")
        print(f"🔗 Post URL: {post_url}")

        # Create or update page
        try:
            if str(page_id) == "None":
                # Create new page in notion
                page = notion.connect(notion_token).Page.new(database_id=database_id).create()
                page.title("Name", title)
                print(f"✅ Page created in Notion.")

                # Page properties : static
                page.date("Publication Date", row.PUBLISHED_DATE)
                page.select("Content type", row.CONTENT)
                page.select("Platform", "LinkedIn")
                page.select("Status", "Published ✨")
                page.select("Author", row.AUTHOR_NAME)
                profile_mention = row.PROFILE_MENTION
                if str(profile_mention) != "None":
                    if len(profile_mention) > 2:
                        page.rich_text("Profile mention", profile_mention)
                company_mention = row.COMPANY_MENTION
                if str(company_mention) != "None":
                    if len(company_mention) > 2:
                        page.rich_text("Company mention", company_mention)
                page.number("Nb hashtags", int(row.TAGS_COUNT))
                tags = row.TAGS
                if str(tags) == "None":
                    tags = ""
                else:
                    if len(tags) < 2:
                        tags = ""
                page.rich_text("Hashtags", tags)
                page.number("Nb emojis", int(row.EMOJIS_COUNT))
                emojis = row.EMOJIS
                if str(emojis) == "None":
                    emojis = ""
                else:
                    if len(emojis) < 2:
                        emojis = ""
                page.rich_text("Emojis", emojis)
                page.number("Nb links", int(row.LINKS_COUNT))
                links = row.LINKS
                # Only set the link property when links were actually extracted
                if str(links) != "None":
                    if len(links) > 2:
                        page.link("Links", links)
                page.number("Nb characters", int(row.CHARACTER_COUNT))
                page.link("Post URL", post_url)
                content_url = row.CONTENT_URL
                if str(content_url) != "None":
                    page.link("Content URL", content_url)
                print("✅ Static data updated in page properties.")

                # Page blocks text
                text = row.TEXT
                if str(text) != "None":
                    split_text = text.split("\n")
                    for t in split_text:
                        page.paragraph(t)
                    print("✅ Post updated in page blocks.")
                    
                # Add linkedin logo as page icon
                notion.client.pages.update(
                    page_id=page.id, icon={"type": "external", "external": {"url": "https://upload.wikimedia.org/wikipedia/commons/thumb/c/ca/LinkedIn_logo_initials.png/800px-LinkedIn_logo_initials.png"}}
                )
                print(f"✅ Icon successfully updated in page.")

                # Add image to background
                image_url = row.IMAGE_URL
                if str(image_url) != "None":
                    if image_url.startswith("https://media"):
                        notion.client.pages.update(
                            page_id=page.id, cover={"type": "external", "external": {"url": image_url}}
                        )
                        print(f"✅ Background successfully updated in page.")
            else:
                page = notion.connect(notion_token).page.get(page_id)

            # Page properties : dynamic
            page = update_dynamic_properties(page, row)

            # Update page
            page.update()
            print(f"✅ Post stats updated in page properties.")
        except Exception as e:
            print(f"❌ Error creating or updating page '{title}' in Notion", e)
            print(row)
            raise
                
update_content_notion(df_update, notion_database_url, notion_token)

### Add scheduler
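The `cron` variable uses the standard five-field CRON syntax. As a rough sketch, the snippet below splits an expression into named fields to make the default `0 8 * * *` easier to read; `describe_cron` is a hypothetical helper and does not validate the field values.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression):
    """Map each of the five CRON fields to its name."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))


print(describe_cron("0 8 * * *"))
# {'minute': '0', 'hour': '8', 'day_of_month': '*', 'month': '*', 'day_of_week': '*'}
```

Read this way, `0 8 * * *` means minute 0 of hour 8 on every day, and `0 12,18 * * 1-5` (the crontab.guru example linked in the setup section) means 12:00 and 18:00 from Monday to Friday.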

In [None]:
# The default setting below runs the notebook every day at 8:00.
# To change it, see https://crontab.guru/ for the required CRON syntax.
naas.scheduler.add(cron=cron)

# to de-schedule this notebook, simply run the following command:
# naas.scheduler.delete()