<img width="10%" alt="Naas" src="https://landen.imgix.net/jtci2pxwjczr/assets/5ice39g4.png?w=160"/>

# Twitter - Get posts stats
<a href="https://app.naas.ai/user-redirect/naas/downloader?url=https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master/Twitter/Twitter_Get_posts_stats.ipynb" target="_parent"><img src="https://naasai-public.s3.eu-west-3.amazonaws.com/open_in_naas.svg"/></a>

**Tags:** #twitter #post #comments #naas_drivers #snippet #content #notion

**Author:** [Maxime Jublou](https://www.linkedin.com/in/maixmejublou)

This notebook fetches the stats of your Twitter posts (public, organic, and non-public metrics), saves them to a CSV file, and shares that file as a naas asset.

## Input

### Import libraries

In [1]:
import naas
from os import path

### Setup Twitter

In [2]:
TWITTER_CONSUMER_KEY = naas.secret.get('TWITTER_CONSUMER_KEY') or 'YourCredential'
TWITTER_CONSUMER_SECRET = naas.secret.get('TWITTER_CONSUMER_SECRET') or 'YourCredential'
TWITTER_BEARER_TOKEN = naas.secret.get('TWITTER_BEARER_TOKEN') or 'YourCredential'
TWITTER_ACCESS_TOKEN = naas.secret.get('TWITTER_ACCESS_TOKEN') or 'YourCredential'
TWITTER_ACCESS_TOKEN_SECRET = naas.secret.get('TWITTER_ACCESS_TOKEN_SECRET') or 'YourCredential'
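
Each credential above falls back to the placeholder string 'YourCredential' when the naas secret is missing, so a misconfigured setup would only surface later as an authentication error. A minimal sketch of an early check, assuming a hypothetical helper `find_missing_credentials` (it is not part of naas or tweepy) and illustrative values:

```python
PLACEHOLDER = "YourCredential"  # fallback value used in the setup cell


def find_missing_credentials(credentials: dict) -> list:
    """Return the names of credentials that are empty or still set to the placeholder."""
    return [
        name
        for name, value in credentials.items()
        if not value or value == PLACEHOLDER
    ]


# Illustrative values only; in the notebook these would be the TWITTER_* variables.
example = {
    "TWITTER_CONSUMER_KEY": "abc123",
    "TWITTER_BEARER_TOKEN": "YourCredential",
    "TWITTER_ACCESS_TOKEN": "",
}
missing = find_missing_credentials(example)
print(missing)
```

Running such a check before connecting makes it clear which secret still needs to be stored.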

### Setup Outputs

In [3]:
# Outputs
name_output = "TWITTER_TWEETS_STATS"
csv_output = path.join("Outputs", f"{name_output}.csv")
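
The CSV path above points into an "Outputs" folder, and writing the file later will fail if that folder does not exist. A minimal sketch of creating it up front, assuming a hypothetical helper named `ensure_parent_dir`; the demonstration runs in a temporary directory so nothing is left behind:

```python
import os
import tempfile
from os import path


def ensure_parent_dir(file_path: str) -> str:
    """Create the parent directory of file_path if needed and return the path unchanged."""
    parent = path.dirname(file_path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    return file_path


# Demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    demo_output = ensure_parent_dir(path.join(tmp, "Outputs", "TWITTER_TWEETS_STATS.csv"))
    created = path.isdir(path.dirname(demo_output))
print(created)
```

In the notebook, the same helper could wrap the existing `path.join("Outputs", ...)` expression.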

## Model

### Create a naas driver for Twitter

In [4]:
try:
    import tweepy
except ImportError:
    !pip install tweepy --user
    import tweepy
import pandas as pd
import json
from typing import List
import datetime
import pydash
from numpy import inf

tweet_fields = ["author_id", "created_at", "source", "public_metrics"]
tweet_personal_fields = ["author_id", "created_at", "source", "public_metrics", "non_public_metrics", "organic_metrics"]


class Twitter:
    
    # Authenticate as an app.
    __bearer_token : str
    
    # Authenticate as a user.
    __consumer_key : str
    __consumer_secret : str
    __access_token : str
    __access_token_secret : str
    
    # Twitter v2 auth
    __app_client : tweepy.Client
    __user_client : tweepy.Client
    
    __me : pd.Series
    
    def connect(self, bearer_token:str, consumer_key:str, consumer_secret:str, access_token:str, access_token_secret:str) -> "Twitter":
        self.__bearer_token = bearer_token
        
        self.__app_client = tweepy.Client(
            bearer_token=self.__bearer_token
        )
        
        self.__consumer_key = consumer_key
        self.__consumer_secret = consumer_secret
        self.__access_token = access_token
        self.__access_token_secret = access_token_secret
        
        
        self.__user_client = tweepy.Client(
            consumer_key=consumer_key,
            consumer_secret=consumer_secret,
            access_token=access_token,
            access_token_secret=access_token_secret
        )
        
        self.__me = self.get_me()
        
        return self
    
    @property
    def app_client(self):
        return self.__app_client
    
    @property
    def user_client(self):
        return self.__user_client
    
    def get_user(self, username:str) -> pd.Series:
        users = self.__app_client.get_users(usernames=[username])
        if users is None:
            return None
        
        return pd.Series(users.data[0].data)
    
    def get_me(self):
        me = self.__user_client.get_me()
        if me is None:
            return None
        return pd.Series(me.data.data)
    
    def get_my_tweets(self, **kwargs):
        return self.get_users_tweets(self.__me.id, **kwargs)


    def get_users_tweets(self,
                         user_id:str,
                         tweet_count=200,
                         tweet_fields :List[str]=tweet_fields,
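                         start_time=None,
                         end_time=None)-> pd.DataFrame:
        # Resolve the default window at call time: defaults written as
        # datetime.datetime.now() in the signature are evaluated only once,
        # when the function is defined.
        if start_time is None:
            start_time = datetime.datetime.now() - datetime.timedelta(days=30)
        if end_time is None:
            end_time = datetime.datetime.now()
        should_stop = False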
        tweets_array = []
        next_token = None
        
        while len(tweets_array) < tweet_count and should_stop is False:
            tweets_left_to_fetch = tweet_count - len(tweets_array)
            
            if tweets_left_to_fetch > 100:
                max_results = 100
            elif tweets_left_to_fetch < 5:
                max_results = 5
            else:
                max_results = tweets_left_to_fetch
            
        
            tweets = self.__app_client.get_users_tweets(
                id=user_id,
                max_results=max_results,
                start_time=start_time,
                end_time=end_time,
                pagination_token=next_token,
            )
            next_token = pydash.get(tweets, 'meta.next_token', None)
            if next_token is None:
                should_stop = True

            is_own_tweets = user_id == self.__me.id
            # tweets.data is None when the page contains no tweets.
            for tweet in tweets.data or []:
                tweet_id = tweet.id


                if is_own_tweets is True:
                    rich_tweet_response = self.__user_client.get_tweet(tweet_id, tweet_fields=tweet_personal_fields, user_auth=True)
                    if len(rich_tweet_response.errors):
                        rich_tweet_response = self.__user_client.get_tweet(tweet_id, tweet_fields=tweet_fields, user_auth=True)
                else:
                    rich_tweet_response = self.__app_client.get_tweet(tweet_id, tweet_fields=tweet_fields, user_auth=False)


                rtd = rich_tweet_response.data
                
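                # Note: the URL and author fields below come from the authenticated
                # account, so they are only accurate when user_id is your own.
                tweets_array.append({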
                    "TWEET_ID": rtd.id,
                    "TWEET_URL": f'https://twitter.com/{self.__me.username}/status/{rtd.id}',
                    "CREATED_AT": rtd.created_at,
                    "AUTHOR_ID": rtd.author_id,
                    "AUTHOR_NAME": self.__me['name'],
                    "AUTHOR_USERNAME": self.__me.username,
                    "TEXT": rtd.text,
                    "PUBLIC_RETWEETS": pydash.get(rtd, 'public_metrics.retweet_count', 0),
                    "PUBLIC_REPLIES": pydash.get(rtd, 'public_metrics.reply_count', 0),
                    "PUBLIC_LIKES": pydash.get(rtd, 'public_metrics.like_count', 0),
                    "PUBLIC_QUOTES": pydash.get(rtd, 'public_metrics.quote_count', 0),
                    "ORGANIC_RETWEETS": pydash.get(rtd, 'organic_metrics.retweet_count', 0),
                    "ORGANIC_REPLIES": pydash.get(rtd, 'organic_metrics.reply_count', 0),
                    "ORGANIC_LIKES": pydash.get(rtd, 'organic_metrics.like_count', 0),
                    "ORGANIC_QUOTES": pydash.get(rtd, 'organic_metrics.quote_count', 0),
                    "USER_PROFILE_CLICKS": pydash.get(rtd, 'non_public_metrics.user_profile_clicks', 0),
                    "IMPRESSIONS": pydash.get(rtd, 'non_public_metrics.impression_count', 0),
                })
                
        # Create final dataframe
        as_types = {
            "PUBLIC_RETWEETS": int, 
            "PUBLIC_REPLIES": int,
            "PUBLIC_LIKES": int,
            "PUBLIC_QUOTES": int,
            "ORGANIC_RETWEETS": int,
            "ORGANIC_REPLIES": int,
            "ORGANIC_LIKES": int,
            "ORGANIC_QUOTES": int,
            "USER_PROFILE_CLICKS": int,
            "IMPRESSIONS": int
        }
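        if not tweets_array:
            # No tweets in the requested window: return an empty frame instead
            # of failing on the column casts below.
            return pd.DataFrame()
        df = pd.DataFrame(tweets_array).astype(as_types)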
        df["ENGAGEMENTS"] = (df["PUBLIC_RETWEETS"] + df["PUBLIC_REPLIES"] + df["PUBLIC_LIKES"] + df["PUBLIC_QUOTES"] + df["USER_PROFILE_CLICKS"])
        df["ENGAGEMENT_RATE"] = df["ENGAGEMENTS"] / df["IMPRESSIONS"]
        df = df.round({'ENGAGEMENT_RATE': 4})
        df = df.fillna({"ENGAGEMENT_RATE": 0})
        df['ENGAGEMENT_RATE'] = df['ENGAGEMENT_RATE'].replace(inf, 0)
        df['ENGAGEMENT_RATE'] = df['ENGAGEMENT_RATE'].apply(lambda x: 0 if x < 0 else x)        
        return df.reset_index(drop=True)
        
    
twitter = Twitter()
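
The fetch loop in `get_users_tweets` requests pages until it has `tweet_count` tweets or no `next_token` remains, keeping each page size between 5 and 100 (the bounds used in the code above). A minimal sketch of that loop against an in-memory stand-in; `clamp_page_size`, `fake_fetch_page`, and `fetch_all` are hypothetical names for illustration, not tweepy functions:

```python
from typing import List, Optional, Tuple


def clamp_page_size(remaining: int, low: int = 5, high: int = 100) -> int:
    """Keep the requested page size within [low, high]."""
    return max(low, min(high, remaining))


FAKE_TIMELINE = list(range(230))  # stand-in for a user's tweet IDs


def fake_fetch_page(max_results: int, token: Optional[int]) -> Tuple[List[int], Optional[int]]:
    """Return one page of items and the token of the next page (None when exhausted)."""
    start = token or 0
    page = FAKE_TIMELINE[start:start + max_results]
    next_start = start + len(page)
    next_token = next_start if next_start < len(FAKE_TIMELINE) else None
    return page, next_token


def fetch_all(count: int) -> List[int]:
    """Collect up to `count` items, one clamped page at a time."""
    items: List[int] = []
    token: Optional[int] = None
    while len(items) < count:
        page, token = fake_fetch_page(clamp_page_size(count - len(items)), token)
        items.extend(page)
        if token is None:
            break
    # The minimum page size can overshoot the target, so trim the result.
    return items[:count]


print(len(fetch_all(203)), len(fetch_all(500)))
```

The final trim is a choice made in this sketch; the notebook's loop keeps whatever the last page returns, so it can yield a few more rows than `tweet_count`.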

### Connect Twitter driver

In [5]:
twitter.connect(TWITTER_BEARER_TOKEN, TWITTER_CONSUMER_KEY, TWITTER_CONSUMER_SECRET, TWITTER_ACCESS_TOKEN, TWITTER_ACCESS_TOKEN_SECRET)

<__main__.Twitter at 0x7f27308ef340>

### Get our user

In [6]:
user = twitter.get_me()
user

id          1328105689898766336
name                    Naas.ai
username            JupyterNaas
dtype: object

### Get our tweets

In [9]:
df_contents = twitter.get_my_tweets(start_time=datetime.datetime(year=2022, month=6, day=1))
print("Content fetched:", len(df_contents))
df_contents#.head(1)

Content fetched: 25


|  | TWEET_ID | TWEET_URL | CREATED_AT | AUTHOR_ID | AUTHOR_NAME | AUTHOR_USERNAME | TEXT | PUBLIC_RETWEETS | PUBLIC_REPLIES | PUBLIC_LIKES | PUBLIC_QUOTES | ORGANIC_RETWEETS | ORGANIC_REPLIES | ORGANIC_LIKES | ORGANIC_QUOTES | USER_PROFILE_CLICKS | IMPRESSIONS | ENGAGEMENTS | ENGAGEMENT_RATE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1541292314190036992 | https://twitter.com/JupyterNaas/status/1541292... | 2022-06-27 05:28:18+00:00 | 1328105689898766336 | Naas.ai | JupyterNaas | RT @josephjacks_: “Demand for open source far ... | 14 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 14 | 0.0 |
| 1 | 1539948667511128064 | https://twitter.com/JupyterNaas/status/1539948... | 2022-06-23 12:29:07+00:00 | 1328105689898766336 | Naas.ai | JupyterNaas | @HamelHusain @choldgraf @GaelVaroquaux 🤙 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 33 | 0 | 0.0 |
| 2 | 1539375289150709760 | https://twitter.com/JupyterNaas/status/1539375... | 2022-06-21 22:30:43+00:00 | 1328105689898766336 | Naas.ai | JupyterNaas | Why so many tools? \n🙃 https://t.co/1K9b84NGWE | 0 | 0 | 3 | 0 | 0 | 0 | 3 | 0 | 3 | 116 | 6 | 0.0517 |
| 3 | 1537765266528120833 | https://twitter.com/JupyterNaas/status/1537765... | 2022-06-17 11:53:04+00:00 | 1328105689898766336 | Naas.ai | JupyterNaas | RT @ravenel_jeremy: Is anyone here doing Whats... | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.0 |
| 4 | 1537429386445565956 | https://twitter.com/JupyterNaas/status/1537429... | 2022-06-16 13:38:24+00:00 | 1328105689898766336 | Naas.ai | JupyterNaas | RT @ravenel_jeremy: Fun fact: the 3 layers of ... | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0.0 |
| 5 | 1537419061159772161 | https://twitter.com/JupyterNaas/status/1537419... | 2022-06-16 12:57:22+00:00 | 1328105689898766336 | Naas.ai | JupyterNaas | Wondering how to read a dataframe from your fi... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 57 | 1 | 0.0175 |
| 6 | 1537386096673308673 | https://twitter.com/JupyterNaas/status/1537386... | 2022-06-16 10:46:23+00:00 | 1328105689898766336 | Naas.ai | JupyterNaas | Do you want to download files in your AWS buck... | 0 | 0 | 2 | 0 | 0 | 0 | 2 | 0 | 0 | 68 | 2 | 0.0294 |
| 7 | 1537339593900572674 | https://twitter.com/JupyterNaas/status/1537339... | 2022-06-16 07:41:36+00:00 | 1328105689898766336 | Naas.ai | JupyterNaas | Did you know you can send daily billing notifi... | 2 | 0 | 2 | 1 | 2 | 0 | 2 | 0 | 3 | 444 | 8 | 0.018 |
| 8 | 1536961711927971840 | https://twitter.com/JupyterNaas/status/1536961... | 2022-06-15 06:40:02+00:00 | 1328105689898766336 | Naas.ai | JupyterNaas | ⚡️We started a challenge this Monday: aggregat... | 3 | 0 | 4 | 0 | 3 | 0 | 4 | 0 | 2 | 418 | 9 | 0.0215 |
| 9 | 1536827399505121281 | https://twitter.com/JupyterNaas/status/1536827... | 2022-06-14 21:46:19+00:00 | 1328105689898766336 | Naas.ai | JupyterNaas | RT @josephjacks_: If you are consistently grow... | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0.0 |
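
The ENGAGEMENTS and ENGAGEMENT_RATE columns follow the formulas in `get_users_tweets`: engagements are the sum of public retweets, replies, likes, quotes, and profile clicks, and the rate divides that sum by impressions, falling back to 0 when there are no impressions. A small worked sketch using the values of rows 2 and 7 from the output above; the helper name `engagement_rate` is hypothetical:

```python
def engagement_rate(retweets, replies, likes, quotes, profile_clicks, impressions):
    """Return (engagements, rate); the rate is 0.0 when impressions is 0 or negative."""
    engagements = retweets + replies + likes + quotes + profile_clicks
    if impressions <= 0:
        return engagements, 0.0
    return engagements, round(engagements / impressions, 4)


row_2 = engagement_rate(0, 0, 3, 0, 3, 116)  # 6 engagements over 116 impressions
row_7 = engagement_rate(2, 0, 2, 1, 3, 444)  # 8 engagements over 444 impressions
print(row_2)  # (6, 0.0517)
print(row_7)  # (8, 0.018)
```

Retweets of other accounts (rows 0, 3, 4, and 9) report 0 impressions, which is why their rate is 0.0 even though retweets are counted as engagements.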


In [12]:
df_contents[df_contents["TWEET_ID"].astype(str) == "1541248373117685760"] 

|  | TWEET_ID | TWEET_URL | CREATED_AT | AUTHOR_ID | AUTHOR_NAME | AUTHOR_USERNAME | TEXT | PUBLIC_RETWEETS | PUBLIC_REPLIES | PUBLIC_LIKES | PUBLIC_QUOTES | ORGANIC_RETWEETS | ORGANIC_REPLIES | ORGANIC_LIKES | ORGANIC_QUOTES | USER_PROFILE_CLICKS | IMPRESSIONS | ENGAGEMENTS | ENGAGEMENT_RATE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|


## Output

### Save and share your csv file

In [10]:
# Save your dataframe in CSV
df_contents.to_csv(csv_output, index=False)

# Share output with naas
csv_link = naas.asset.add(csv_output)

#-> Uncomment the line below to remove your asset
# naas.asset.delete(csv_output)

👌 Well done! Your Assets has been sent to production.




PS: to remove the shared asset, replace .add with .delete in the cell above.
