In [None]:

#!pip install tweepy
#!pip install pandas

# Loading required libraries 

1. The ``tweepy`` library (https://docs.tweepy.org/en/stable/) provides access to the Twitter API, which we use to retrieve and store the relevant tweets and their metadata. To extract tweets with Tweepy, we first had to apply for developer credentials, which consist of private consumer keys and access tokens.
2. ``pandas`` is imported for some initial data transformations and for reading in the politicians' Twitter handles.

In [54]:
import tweepy
import pandas as pd
import datetime

Next, the personal consumer keys and access tokens must be provided in order to access the Twitter API. To obtain these keys and tokens, one must apply for Twitter developer credentials.

In [11]:
# Define your access credentials

consumer_key = input("Please input your consumer key")
consumer_secret = input("Please input your consumer secret")

access_token = input("Please input your access token")
access_token_secret = input("Please input your access token secret")
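
Values typed into `input()` can remain visible in the cell output. As an alternative, here is a minimal sketch that reads a credential from an environment variable and only falls back to a hidden prompt when it is unset. The helper `read_credential` and the variable name `TWITTER_CONSUMER_KEY` are hypothetical and chosen for illustration; they are not part of Tweepy.

```python
import getpass
import os


def read_credential(name, env=None, prompt=getpass.getpass):
    """Return the credential stored under `name`, prompting only if it is unset."""
    env = os.environ if env is None else env
    value = env.get(name)
    if value:
        return value
    # getpass hides the typed value instead of echoing it like input()
    return prompt(f"Please input {name}: ")


# Example with an in-memory mapping instead of the real environment
# (the key name and value are made up for illustration)
demo_env = {"TWITTER_CONSUMER_KEY": "abc123"}
demo_key = read_credential("TWITTER_CONSUMER_KEY", env=demo_env)
```

In the notebook, the same helper could be called once per credential, for example `consumer_key = read_credential("TWITTER_CONSUMER_KEY")`.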

### Authentication and connection to the Twitter API

In the next chunk, the previously stored consumer key and consumer secret are passed to Tweepy's ``OAuthHandler``. The access token and access token secret, which are also stored as strings in the previous chunk, are then set on the handler. Finally, an ``API`` object is created from it. The ``wait_on_rate_limit`` argument is set to ``True`` so that Tweepy waits whenever one of Twitter's rate limits is reached instead of raising an error.

In [13]:
auth = tweepy.OAuthHandler(
    consumer_key,
    consumer_secret
)

auth.set_access_token(
    access_token,
    access_token_secret
)

api = tweepy.API(auth, wait_on_rate_limit=True)

### Testing authentication status

Here, we test whether the authentication was successful using Tweepy's ``verify_credentials`` method. If the credentials are valid, we print a confirmation; if an error occurs, we print an error message.

In [14]:
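try:
    api.verify_credentials()
    print("Authentication OK")
except Exception as error:
    # Catch Exception rather than using a bare except,
    # which would also swallow KeyboardInterrupt
    print("Error during authentication:", error)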

Authentication OK


### Testing whether we are able to extract and store tweets

Here, we extract the most recent tweet from Johannes's account and store it in an object. For this purpose, we use Tweepy's ``user_timeline`` method and specify
- the user name (``screen_name``)
- the number of tweets we want to request (``count``)
- whether we want to include retweets (``include_rts``)
- that the full, untruncated tweet text should be returned (``tweet_mode``)

In [None]:
tweets = api.user_timeline(
    screen_name = "halkenhaeusser",
    count = 1,
    include_rts = False,
    tweet_mode = 'extended'
)

print(tweets)                            

In [17]:
type(tweets)

tweepy.models.ResultSet
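
A `ResultSet` behaves like a Python list of status objects, so it can be iterated over directly. The following minimal sketch pulls a few fields out of each status. The helper `summarize_statuses` is hypothetical, and `SimpleNamespace` objects with made-up values stand in for real statuses so that the example runs without API access. It assumes the statuses were requested with `tweet_mode = 'extended'`, so that a `full_text` attribute is present.

```python
from types import SimpleNamespace


def summarize_statuses(statuses):
    """Return one small dict per status with its id, timestamp and full text."""
    return [
        {
            "id": status.id,
            "created_at": status.created_at,
            "full_text": status.full_text,
        }
        for status in statuses
    ]


# Stand-in for the object returned by api.user_timeline(...); values are made up
fake_tweets = [
    SimpleNamespace(id=1, created_at="2021-11-26 09:00:49", full_text="Hello"),
]
summary = summarize_statuses(fake_tweets)
```

With the notebook's own objects, the call would be `summarize_statuses(tweets)`.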

In [18]:
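# Request the OAuth authorization URL; this fails if the consumer key or secret is rejected.
# Note: tweepy.TweepError exists in Tweepy 3.x; Tweepy 4.x renamed it to tweepy.TweepyException.
try:
    redirect_url = auth.get_authorization_url()
except tweepy.TweepError:
    print('Error! Failed to get request token.')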

### Saving Twitter handles of German MPs and printing a random handle

Since we want to be able, in principle, to analyze the tweets of every German MP who uses Twitter, we need a resource that lists the Twitter handles of all MPs. For this purpose, we draw on the work of Markus Konrad from the WZB. In the next chunk, we read in the CSV file containing the Twitter handles using ``pandas``. We then keep only the column with the handles and drop all MPs who do not have a Twitter account. Finally, we use the ``sample`` method to draw a random handle and print it to check that the previous code worked.

In [19]:
# Read the WZB table of MPs and keep only the rows that have a Twitter handle
wzb_df = pd.read_csv("https://raw.githubusercontent.com/WZBSocialScienceCenter/mdb-twitter-network/master/data/deputies_twitter_20190702.csv")

twitter_df = wzb_df[["twitter_name"]].dropna()

# Draw one random handle and extract it as a plain string without padding
twitter_sample = twitter_df.sample(n = 1)
twitter_string = twitter_sample["twitter_name"].iloc[0].strip()

print(twitter_string)

gruenclaudia


### Printing the whole dataframe containing every MP

In the following chunk, we print the full data frame created by Markus Konrad. As we can see, it contains several columns with further information.

In [27]:
wzb_df

Unnamed: 0,meta.status,meta.edited,meta.uuid,meta.username,meta.questions,meta.answers,meta.standard_replies,meta.url,personal.degree,personal.first_name,...,personal.location.city,personal.location.postal_code,personal.picture.url,personal.picture.copyright,party,parliament.name,parliament.uuid,parliament.joined,parliament.retired,twitter_name
0,1,2019-06-07 13:28,3a08158b-eb25-4cff-9594-363493dfb4d0,alexander-graf-lambsdorff,21,10,0,https://www.abgeordnetenwatch.de/profile/alexa...,,Alexander,...,,,https://www.abgeordnetenwatch.de/sites/abgeord...,<p>Graf Alexander Lambsdorff</p>,FDP,Bundestag,60d0787f-e311-4283-a7fd-85b9f62a9b33,,,
1,1,2019-06-07 13:28,d623efb5-7afc-4b56-a794-dfe6c1def0f9,martin-schulz-1,35,0,0,https://www.abgeordnetenwatch.de/profile/marti...,,Martin,...,Würselen,,https://www.abgeordnetenwatch.de/sites/abgeord...,,SPD,Bundestag,60d0787f-e311-4283-a7fd-85b9f62a9b33,,,martinschulz
2,1,2019-06-07 13:28,e0209560-579a-4f65-bed9-a189a7146690,michael-theurer,3,3,0,https://www.abgeordnetenwatch.de/profile/micha...,,Michael,...,Horb am Neckar,,https://www.abgeordnetenwatch.de/sites/abgeord...,<p>Michael Theurer</p>,FDP,Bundestag,60d0787f-e311-4283-a7fd-85b9f62a9b33,,,
3,1,2019-06-07 13:28,a96583f1-e2b0-4e41-aa4c-e0494cdff322,fabio-de-masi,21,21,0,https://www.abgeordnetenwatch.de/profile/fabio...,,Fabio,...,"Hamburg, Berlin",,https://www.abgeordnetenwatch.de/sites/abgeord...,Karen Demarowitz,DIE LINKE,Bundestag,60d0787f-e311-4283-a7fd-85b9f62a9b33,,,fabiodemasi
4,1,2019-06-07 13:28,a5a53285-3632-4854-af0e-1bf301417955,sarah-ryglewski,11,11,0,https://www.abgeordnetenwatch.de/profile/sarah...,,Sarah,...,Bremen,,https://www.abgeordnetenwatch.de/sites/abgeord...,,SPD,Bundestag,60d0787f-e311-4283-a7fd-85b9f62a9b33,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
713,1,2019-06-07 13:28,0987e856-e99a-4cd5-93bd-0a97e37347aa,verena-hartmann,11,2,0,https://www.abgeordnetenwatch.de/profile/veren...,,Verena,...,,,https://www.abgeordnetenwatch.de/sites/abgeord...,Verena Hartmann,AfD,Bundestag,60d0787f-e311-4283-a7fd-85b9f62a9b33,,,
714,1,2019-06-07 13:28,b8da0922-d041-4a36-a6e2-e98a61f2be86,martin-reichardt,1,0,0,https://www.abgeordnetenwatch.de/profile/marti...,,Martin,...,,,https://www.abgeordnetenwatch.de/sites/abgeord...,,AfD,Bundestag,60d0787f-e311-4283-a7fd-85b9f62a9b33,,,m_reichardt_afd
715,1,2019-06-07 13:28,d14354bf-24ee-49e2-ab32-6ca5ebb8248d,marcus-buhl,1,0,1,https://www.abgeordnetenwatch.de/profile/marcu...,,Marcus,...,Illmenau,,https://www.abgeordnetenwatch.de/sites/abgeord...,,AfD,Bundestag,60d0787f-e311-4283-a7fd-85b9f62a9b33,,,
716,1,2019-06-07 13:28,680d201b-fbfc-4436-939e-9df5fa84d233,anton-friesen,32,31,0,https://www.abgeordnetenwatch.de/profile/anton...,Dr.,Anton,...,,,https://www.abgeordnetenwatch.de/sites/abgeord...,,AfD,Bundestag,60d0787f-e311-4283-a7fd-85b9f62a9b33,,,drfriesenmdb


### Storing the last 10 tweets of the sample MP

Here we test whether we can store a manually specified number of tweets for a selected MP, again using the ``user_timeline`` method. Retweets are excluded via the ``include_rts`` argument, and ``tweet_mode`` is set to ``'extended'`` because the tweet text would otherwise be truncated. Because ``count`` is applied before retweets are filtered out, fewer than 10 tweets may be returned.

In [None]:
tweets_sample_mp = api.user_timeline(
    screen_name = twitter_string, 
    # Number of tweets to request (the API allows up to 200 per call)
    count = 10,
    include_rts = False,
    # Necessary to obtain full_text;
    # otherwise the text is truncated to 140 characters
    tweet_mode = 'extended'
)

tweets_sample_mp
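
The same request could be repeated for several MPs. Below is a minimal sketch, assuming a callable `fetch_timeline(handle, count)` that wraps the API call. Both `collect_timelines` and `fake_fetch` are hypothetical names introduced for illustration, and `fake_fetch` returns made-up strings instead of real tweets. Handles that fail, for example because an account no longer exists, are recorded rather than aborting the loop.

```python
def collect_timelines(handles, fetch_timeline, count=10):
    """Fetch up to `count` tweets per handle; return (tweets_by_handle, failed)."""
    tweets_by_handle = {}
    failed = []
    for handle in handles:
        # Remove padding and a leading "@" before using the handle
        clean = handle.strip().lstrip("@")
        try:
            tweets_by_handle[clean] = list(fetch_timeline(clean, count))
        except Exception as error:  # e.g. a deleted or protected account
            failed.append((clean, str(error)))
    return tweets_by_handle, failed


def fake_fetch(handle, count):
    """Hypothetical stand-in for a call such as api.user_timeline(...)."""
    if handle == "missing":
        raise ValueError("user not found")
    return [f"{handle} tweet {i}" for i in range(count)]


collected, failed = collect_timelines([" alice", "missing"], fake_fetch, count=2)
```

With the notebook's `api` object, `fetch_timeline` could be `lambda handle, count: api.user_timeline(screen_name = handle, count = count, include_rts = False, tweet_mode = 'extended')`.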

We use pandas' ``json_normalize`` function to transform the tweets' JSON data into a flat table.

In [21]:
# Extract the raw JSON dictionary of each tweet

json_data = [r._json for r in tweets_sample_mp]

tweets_df = pd.json_normalize(json_data)
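
To illustrate what the flattening does, here is a small sketch with made-up records that imitate the nesting of a tweet's JSON; the field values are invented. Nested dictionaries become columns whose names join the keys with a dot, which is where column names such as `quoted_status.lang` come from.

```python
import pandas as pd

# Made-up records that imitate the nesting of a tweet's JSON
records = [
    {"id": 1, "full_text": "Hello", "user": {"screen_name": "alice", "followers_count": 10}},
    {"id": 2, "full_text": "World", "user": {"screen_name": "bob", "followers_count": 20}},
]

# Nested keys are flattened into dotted column names such as "user.screen_name"
flat_df = pd.json_normalize(records)
flat_columns = list(flat_df.columns)
```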

In [85]:
# Look at the resulting DataFrame
tweets_df

Unnamed: 0,created_at,id,id_str,full_text,truncated,display_text_range,source,in_reply_to_status_id,in_reply_to_status_id_str,in_reply_to_user_id,...,quoted_status.coordinates,quoted_status.place,quoted_status.contributors,quoted_status.is_quote_status,quoted_status.retweet_count,quoted_status.favorite_count,quoted_status.favorited,quoted_status.retweeted,quoted_status.possibly_sensitive,quoted_status.lang
0,2021-11-26 09:00:49+00:00,1464157182186995715,1464157182186995715,Herzlichen Glückwunsch @ABaerbock @SteffiLemk...,False,"[0, 126]","<a href=""http://twitter.com/download/android"" ...",,,,...,,,,,,,,,,
1,2021-11-25 19:04:51+00:00,1463946804488192001,1463946804488192001,Mood https://t.co/Wt51ykKWNo,False,"[0, 4]","<a href=""http://twitter.com/#!/download/ipad"" ...",,,,...,,,,,,,,,,
2,2021-11-25 13:38:48+00:00,1463864751474163718,1463864751474163718,❤️💚💛 https://t.co/aPFHeKkaDD,False,"[0, 4]","<a href=""http://twitter.com/download/android"" ...",,,,...,,,,False,17.0,176.0,False,False,False,de
3,2021-11-24 17:09:52+00:00,1463555480245223430,1463555480245223430,"Manchmal sind es die kleinen Dinge, die einen ...",False,"[0, 128]","<a href=""http://twitter.com/#!/download/ipad"" ...",,,,...,,,,,,,,,,
4,2021-11-24 14:50:09+00:00,1463520318358704136,1463520318358704136,Und auch @krusehamburg @schneidercar Sabine Po...,False,"[0, 120]","<a href=""http://twitter.com/#!/download/ipad"" ...",1.463519e+18,1.4635193639407288e+18,937403100.0,...,,,,,,,,,,
5,2021-11-24 14:46:21+00:00,1463519363940728835,1463519363940728835,"Jetzt ist er endlich da. Einfach war es nicht,...",False,"[0, 127]","<a href=""http://twitter.com/#!/download/ipad"" ...",,,,...,,,,,,,,,,
6,2021-11-24 10:45:44+00:00,1463458809226178561,1463458809226178561,@KuehniKev Wer muss es Olaf sagen?,False,"[11, 34]","<a href=""http://twitter.com/download/android"" ...",1.463457e+18,1.463456677592719e+18,2764750000.0,...,,,,,,,,,,
7,2021-11-24 10:43:59+00:00,1463458367347777536,1463458367347777536,@nnienass https://t.co/DX5ByQlGmB,False,"[9, 9]","<a href=""http://twitter.com/download/android"" ...",1.463453e+18,1.4634530472716124e+18,147947500.0,...,,,,,,,,,,


In [47]:
tweets_df['created_at']

0    Fri Nov 26 09:00:49 +0000 2021
1    Thu Nov 25 19:04:51 +0000 2021
2    Thu Nov 25 13:38:48 +0000 2021
3    Wed Nov 24 17:09:52 +0000 2021
4    Wed Nov 24 14:50:09 +0000 2021
5    Wed Nov 24 14:46:21 +0000 2021
6    Wed Nov 24 10:45:44 +0000 2021
7    Wed Nov 24 10:43:59 +0000 2021
Name: created_at, dtype: object
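
The `object` dtype indicates that the timestamps are stored as plain strings. A minimal sketch of converting such strings into timezone-aware datetimes follows; it uses two of the values printed above, and the variable names are chosen for illustration. The format string is an assumption derived from the layout of these values.

```python
import pandas as pd

raw_created_at = pd.Series(
    ["Fri Nov 26 09:00:49 +0000 2021", "Thu Nov 25 19:04:51 +0000 2021"],
    name="created_at",
)

# Format string matching the "Fri Nov 26 09:00:49 +0000 2021" layout;
# utc=True returns timezone-aware timestamps in UTC
parsed_created_at = pd.to_datetime(
    raw_created_at, format="%a %b %d %H:%M:%S %z %Y", utc=True
)
```

In the notebook, the same call could be applied to `tweets_df['created_at']`, after which date-based filtering and sorting become possible.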