<div>
<img src=https://www.institutedata.com/wp-content/uploads/2019/10/iod_h_tp_primary_c.svg width="300">
</div>


# Lab 3.2.2 
# *Mining Social Media with Twitter*

## The Twitter API and Tweepy Package

The Twitter API provides access to tweets and comments, and allows an application to post tweets to the user's timeline. 

Twitter requires developers to create and authenticate an app before they can use the API. Following recent policy changes, however, new developers must be approved before they can create an app. Twitter gives no indication of how long approval takes.

### 1. Apply for Developer Access

Go to https://blog.twitter.com/developer/en_us/topics/tools/2018/new-developer-requirements-to-protect-our-platform.html
and read the advice.


Apply at https://developer.twitter.com/en/apply-for-access.html. Where asked, state that you will use the app as a student to explore the Tweepy Python library, fetch recent tweets, and create and delete a tweet.

### 2. Create Your Twitter App
Go to https://developer.twitter.com/en/portal/projects-and-apps and click on the "+ Create App" button. Give your app a name and then you will be able to generate Consumer Keys (API key & secret) and Authentication Tokens (Access token & secret). Copy-paste these authentication details for use in this lab.

### 3. Load Python Libraries

In [1]:
import tweepy
import json
import pprint

### 4. Authenticate from your Python script

You could assign your authentication details explicitly, as follows:

In [None]:
"""
NOT USED FOR PRIVACY REASONS

consumer_key = ''      # your consumer key (string) goes in here
consumer_secret = ''   # your consumer secret key (string) goes in here
access_token = ''      # your access token (string) goes in here
access_token_secret = ''  # your access token secret (string) goes in here"""

A better way would be to store these details externally, so they are not displayed in the notebook:

- create a file called "auth.json" in your "notebooks" directory, and save your credentials there in JSON format, using the key names that the loading code below expects:

`{`<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`"api_key": "your consumer key (string) goes in here",` <br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`"api_secret": "your consumer secret key (string) goes in here",` <br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`"access_token": "your access token (string) goes in here",` <br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`"access_token_secret": "your access token secret (string) goes in here",` <br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`"bearer_token": "your bearer token (string) goes in here"` <br>
`}`

(Nb. Parsers are very fussy. Make sure each key:value pair has a comma after it except the last one!)  

Use the following code to load the credentials:  

In [None]:
# IPython magic: make sure your working directory is where the file is
%pwd

In [21]:
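path_auth = 'auth.json'
with open(path_auth) as f:  # the file is closed once it has been read
    auth = json.load(f)
pp = pprint.PrettyPrinter(indent=4)
# For debugging only (this would print your secrets in the notebook output):
#pp.pprint(auth)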

consumer_key = auth['api_key']
consumer_secret = auth['api_secret']
access_token = auth['access_token']
access_token_secret = auth['access_token_secret']
bearer_token = auth['bearer_token']
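
A missing or empty key in the JSON file otherwise only shows up later as a `KeyError` or a failed login. The following is a minimal sketch of an early check, assuming the key names used in the cell above; `load_credentials` and `REQUIRED_KEYS` are hypothetical names for illustration, not part of Tweepy:

```python
import json

# Key names assumed from the loading cell above; adjust them to match your file.
REQUIRED_KEYS = ("api_key", "api_secret", "access_token", "access_token_secret")


def load_credentials(path, required=REQUIRED_KEYS):
    """Read a JSON credentials file and fail early if a key is missing or empty."""
    with open(path, encoding="utf-8") as f:
        creds = json.load(f)
    missing = [key for key in required if not creds.get(key)]
    if missing:
        raise KeyError(f"Missing credential(s) in {path}: {', '.join(missing)}")
    return creds
```

Calling `load_credentials('auth.json')` would then return the same dictionary as the cell above, or stop with a message naming the keys that still need to be filled in.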

Security considerations: 
- this method only keeps your credentials invisible as long as nobody accesses this notebook while it's running on your computer 
- if you wanted another user to have access to the executable notebook without divulging your credentials you should set up an OAuth 2.0 workflow to let them obtain and apply their own API tokens when using your app
- if you just want to share your analyses, you could use a separate script (which you don't share) to fetch the data and save it locally, then use a second notebook (with no API access) to load and analyse the locally stored data
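
The last option above can be sketched as two small helpers: one that runs where the credentials live and writes the fetched tweets to disk, and one that reads them back in a notebook with no API access. This is only an illustration; `save_tweets` and `load_tweets` are hypothetical names, and the code assumes each tweet object exposes `id` and `text` attributes, as the Tweepy objects used later in this lab do:

```python
import json


def save_tweets(tweets, path):
    """Write the id and text of each tweet-like object to a JSON file."""
    records = [{"id": str(t.id), "text": t.text} for t in tweets]
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)
    return len(records)


def load_tweets(path):
    """Read the saved records back; no credentials or API access are needed."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

Only the resulting JSON file and the analysis notebook would then need to be shared.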

### 5. Exploring the API

Here is how to connect to Twitter using the Tweepy library:

In [22]:
# OAuth 1.0a user context: the app's keys plus the access token pair
auth = tweepy.OAuthHandler(consumer_key, consumer_secret, access_token, access_token_secret)
#auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
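
Creating the `api` object does not by itself contact Twitter, so it can help to confirm that the credentials were accepted before going further. A minimal sketch, assuming the `api` object created above: `authenticated_screen_name` is a hypothetical helper name, while `verify_credentials` is a method of Tweepy's `API` class. Depending on the Tweepy version, bad credentials may raise an exception or return `False`, so the helper tolerates a non-user return value:

```python
def authenticated_screen_name(api):
    """Return the screen name of the authenticated account, or None."""
    user = api.verify_credentials()
    # Older Tweepy versions may return False instead of a user object.
    return getattr(user, "screen_name", None)
```

In the notebook, `print(authenticated_screen_name(api))` should show your own handle if the four credentials are valid.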

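In a notebook cell, type `api.` (or `client.` once the client below has been created), put the cursor after the '.', and hit the [tab] key to see the available members and methods of that object. The next cell sets up a Tweepy `Client`: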

In [23]:
#set up a new client
client = tweepy.Client(bearer_token, consumer_key, consumer_secret, access_token, access_token_secret)
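
Tab completion only works interactively. Outside a notebook, much the same list can be produced with Python's built-in `dir()`. A small sketch, where `public_members` is a hypothetical helper name:

```python
def public_members(obj):
    """Return the names of an object's public attributes and methods."""
    return [name for name in dir(obj) if not name.startswith("_")]
```

For example, `public_members(client)` lists the method names available on the client created above, and the same call works on any response object.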

In [5]:
# Get a user's data
handler = client.get_user(username="NicolaiPHansen")

In [6]:
handler_id = handler.data.id
#get tweets
tweets = client.get_users_tweets(id = handler_id)

In [7]:
#display data in tweets
tweets.data

[<Tweet id=879587245040193536 text=RT @P4wnyhof: Only 5 Days left for our amazing june community giveaway with @msi_de :O! Follow + rt! #gg #giveaway #msi
 
 https://t.co/5aKzU…>,
 <Tweet id=858168558940979201 text=Relaxing Saturday by the pool, all we need now are some cocktails.… https://t.co/CNSButBLgN>,
 <Tweet id=799091668179910656 text=Look I can take care of more than one pretty flower becs.travels… https://t.co/EfNmeWBm0q>,
 <Tweet id=796516158182686720 text=Trump might be the next US president but the world is still… https://t.co/MxgIAgQZWO>,
 <Tweet id=796129299090776065 text=Found a couple of friends hanging in gardens by the bay @ Gardens by… https://t.co/qekRfHVu9f>,
 <Tweet id=597431220331540480 text=Worlds best airport security @CPHAirports que starts even before getting in to the security area http://t.co/5PFFIBIY9U>,
 <Tweet id=584659871393525760 text=Italian craftsmanship 101 build a sturdy door to keep intruders out http://t.co/E27PjBd919>,
 <Tweet id=584638029450600

In [8]:
# Print the individual tweet IDs
for tweet in tweets.data:
    print(tweet.id)

879587245040193536
858168558940979201
799091668179910656
796516158182686720
796129299090776065
597431220331540480
584659871393525760
584638029450600448
582642580090957824
581350597317754880
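
The loop above assumes that `tweets.data` is a list. When a request matches no tweets, the `data` member of a Tweepy response can be `None`, and iterating over it would fail. A minimal sketch that guards against this, with `tweet_ids` as a hypothetical helper name:

```python
def tweet_ids(response):
    """Return the tweet IDs in a response, or an empty list if there is no data."""
    return [tweet.id for tweet in (response.data or [])]
```

With the response fetched above, `tweet_ids(tweets)` would return the same IDs that the loop prints.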


In [9]:
# Use the API (v1.1) rather than the client to display the IDs
for tweet in api.home_timeline(count=5):
    print(tweet.id)

1485366839546765313
1485354115609681920
1485344218885373952
1485310125682434057
1485305233915928580


Consult the Tweepy and Twitter API documentation. Print a few of the response members below:

In [10]:
#Use specific tweet id to retrieve information about a tweet
test = client.get_tweet(id = 879587245040193536, expansions = ['author_id'], user_fields = 'username')

In [11]:
print(test)

Response(data=<Tweet id=879587245040193536 text=RT @P4wnyhof: Only 5 Days left for our amazing june community giveaway with @msi_de :O! Follow + rt! #gg #giveaway #msi

https://t.co/5aKzU…>, includes={'users': [<User id=3094388649 name=Nicolai Plambeck Han username=NicolaiPHansen>]}, errors=[], meta={})
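
As the printed response shows, the expanded user objects arrive in `includes`, separate from the tweet in `data`. The sketch below matches the two up. It assumes that the tweet carries an `author_id` attribute (which requesting the `author_id` expansion should provide) and that `includes['users']` holds objects with `id` and `username`; `author_username` is a hypothetical helper name:

```python
def author_username(response):
    """Look up the username of a tweet's author in the response's includes."""
    users = (response.includes or {}).get("users", [])
    author_id = getattr(response.data, "author_id", None)
    for user in users:
        if user.id == author_id:
            return user.username
    return None  # no matching user was included
```

For the response above, `author_username(test)` would be expected to give the handle shown in the `includes` part of the output.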


This will fetch recent tweets from accounts you follow:

In [12]:
# Recent tweets from accounts you follow:
tweets = api.home_timeline()
for tweet in tweets:
    print(tweet.text)

Tiempo Río de la Plata
https://t.co/SEc55gtXe0 https://t.co/oZi5yvwWFc
75-year-old Frenchman found dead after trying to cross the Atlantic in a canoe https://t.co/8xcd8FDKNg vía @NAUTICA… https://t.co/OMNFJrCgJj
RT @jacobkjaernoer: Test all you can. It’s a mistake not to! @EsbenBjerre @peterfalktoft https://t.co/GeKdpUm1JW
Hearthone Arena charity Marathon is floating through my mind since seeing @summit1g enjoy arena again 😅
Ab aufs Schlachtfeld - mit unseren neuesten "Waffen" 🔥

🔥 MAG B660M BAZOOKA DDR4
🔥 MAG VAMPIRIC 300R MIDNIGHT GREEN… https://t.co/L1sHraZPE1
Numbers down bad on #warzone youtube

Still making a video every day?

Hell jeah 

This gunsmith has so much more t… https://t.co/ux39QZz5Y0
Nej, Lisan Al-Ghaib😭
75-year-old Frenchman found dead after trying to cross the Atlantic in a canoe https://t.co/8xcd8FDKNg vía @NAUTICA… https://t.co/Kc9QkcoEgj
75-year-old Frenchman found dead after trying to cross the Atlantic in a canoe - https://t.co/rDq7y33Wfd NEWS https://t.co/MuPu

The request to see your own recent tweets is similar, but uses the `user_timeline` endpoint. Try this below:

In [13]:
# Tweets posted by the authenticated user
tweets = api.user_timeline()
for tweet in tweets:
    print(tweet.text)

RT @P4wnyhof: Only 5 Days left for our amazing june community giveaway with @msi_de :O! Follow + rt! #gg #giveaway #msi

https://t.co/5aKzU…
Relaxing Saturday by the pool, all we need now are some cocktails.… https://t.co/CNSButBLgN
Look I can take care of more than one pretty flower becs.travels… https://t.co/EfNmeWBm0q
Trump might be the next US president but the world is still… https://t.co/MxgIAgQZWO
Found a couple of friends hanging in gardens by the bay @ Gardens by… https://t.co/qekRfHVu9f
Worlds best airport security @CPHAirports que starts even before getting in to the security area http://t.co/5PFFIBIY9U
Italian craftsmanship 101 build a sturdy door to keep intruders out http://t.co/E27PjBd919
Italian craftsmanship 101 build a sturdy bridge for you and your family to use http://t.co/bureBFmAA7
@OliverFolkmann @NicolaiPHansen Es ist wahnsinnig aber ich bin ein großer fan!!
@OliverFolkmann aberdoch die Folkmann am Twitter. Ich möchte das sehen!!
@helmbaek oh yeah
@ThamesWaterJo
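
Several of the tweets above end in "…" because the v1.1 timeline endpoints return shortened text by default. Passing `tweet_mode="extended"` to the timeline call asks for the longer form, in which case each status carries a `full_text` attribute rather than `text`. A minimal sketch that works in either mode, with `status_text` as a hypothetical helper name:

```python
def status_text(status):
    """Return full_text when present (extended mode), else the default text."""
    full_text = getattr(status, "full_text", None)
    return full_text if full_text is not None else status.text
```

A loop such as `for tweet in api.user_timeline(tweet_mode="extended"): print(status_text(tweet))` would then print the longer text where it is available.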

Now, instead of printing the text of each tweet, print the `created_at` and `id_str` attributes:

In [14]:
tweets = api.user_timeline()
for tweet in tweets:
    print(f'Tweet created at: {tweet.created_at}, tweet ID: {tweet.id_str}')

Tweet created at: 2017-06-27 06:28:35+00:00, tweet ID: 879587245040193536
Tweet created at: 2017-04-29 03:58:23+00:00, tweet ID: 858168558940979201
Tweet created at: 2016-11-17 03:27:54+00:00, tweet ID: 799091668179910656
Tweet created at: 2016-11-10 00:53:45+00:00, tweet ID: 796516158182686720
Tweet created at: 2016-11-08 23:16:30+00:00, tweet ID: 796129299090776065
Tweet created at: 2015-05-10 16:01:17+00:00, tweet ID: 597431220331540480
Tweet created at: 2015-04-05 10:12:30+00:00, tweet ID: 584659871393525760
Tweet created at: 2015-04-05 08:45:42+00:00, tweet ID: 584638029450600448
Tweet created at: 2015-03-30 20:36:30+00:00, tweet ID: 582642580090957824
Tweet created at: 2015-03-27 07:02:38+00:00, tweet ID: 581350597317754880
Tweet created at: 2015-03-19 18:12:50+00:00, tweet ID: 578620159717466112
Tweet created at: 2015-03-18 13:39:45+00:00, tweet ID: 578189046624714752
Tweet created at: 2015-03-18 13:19:22+00:00, tweet ID: 578183917422927872
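
For analysis it is usually more convenient to collect these attributes than to print them. The sketch below is one way to do that, assuming each status has a `created_at` datetime and an `id_str` string as in the output above; `to_records` and `tweets_per_year` are hypothetical helper names:

```python
from collections import Counter


def to_records(tweets):
    """Convert status objects into plain dicts with an ISO 8601 timestamp."""
    return [
        {"id_str": t.id_str, "created_at": t.created_at.isoformat()}
        for t in tweets
    ]


def tweets_per_year(tweets):
    """Count how many tweets were created in each calendar year."""
    return Counter(t.created_at.year for t in tweets)
```

The list returned by `to_records(tweets)` can be saved as JSON or loaded into a data frame, and `tweets_per_year(tweets)` gives a quick view of activity over time.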


In [26]:
# Look up the user's numeric ID from the handle
handler = 'NicolaiPHansen'
handler_id = client.get_user(username=handler).data.id

In [30]:
tweets = client.get_users_tweets(id = handler_id, tweet_fields = ['author_id', 'created_at'])
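
A single call to `get_users_tweets` returns one page of results. When more are available, the response's `meta` dictionary should contain a `next_token`, which can be passed back as `pagination_token` to fetch the following page. The sketch below follows that chain for a limited number of pages; `fetch_user_tweets` and its `max_pages` parameter are hypothetical, chosen here for illustration:

```python
def fetch_user_tweets(client, user_id, max_pages=3, **kwargs):
    """Collect tweets page by page, following meta['next_token']."""
    collected = []
    token = None
    for _ in range(max_pages):
        params = dict(kwargs)
        if token:
            params["pagination_token"] = token
        response = client.get_users_tweets(id=user_id, **params)
        collected.extend(response.data or [])
        # No next_token means this was the last page.
        token = (response.meta or {}).get("next_token")
        if not token:
            break
    return collected
```

For example, `fetch_user_tweets(client, handler_id, tweet_fields=['author_id', 'created_at'])` would gather up to three pages. Keeping `max_pages` small helps to stay within the API's rate limits.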

In [31]:
tweets.data

[<Tweet id=879587245040193536 text=RT @P4wnyhof: Only 5 Days left for our amazing june community giveaway with @msi_de :O! Follow + rt! #gg #giveaway #msi
 
 https://t.co/5aKzU…>,
 <Tweet id=858168558940979201 text=Relaxing Saturday by the pool, all we need now are some cocktails.… https://t.co/CNSButBLgN>,
 <Tweet id=799091668179910656 text=Look I can take care of more than one pretty flower becs.travels… https://t.co/EfNmeWBm0q>,
 <Tweet id=796516158182686720 text=Trump might be the next US president but the world is still… https://t.co/MxgIAgQZWO>,
 <Tweet id=796129299090776065 text=Found a couple of friends hanging in gardens by the bay @ Gardens by… https://t.co/qekRfHVu9f>,
 <Tweet id=597431220331540480 text=Worlds best airport security @CPHAirports que starts even before getting in to the security area http://t.co/5PFFIBIY9U>,
 <Tweet id=584659871393525760 text=Italian craftsmanship 101 build a sturdy door to keep intruders out http://t.co/E27PjBd919>,
 <Tweet id=584638029450600

You can create a tweet as follows:

In [24]:
# create a tweet (using the client (V2)):
tweet = client.create_tweet(text='This was created with tweepy')

(Nb. Don't abuse this feature! If you try to generate a zillion tweets in a loop, Twitter will ban your account.)

Tweets can be deleted by reference to their ID (the value of their `id` or `id_str` attribute):

In [29]:
# delete a tweet:
status = client.delete_tweet(1485421348461883392)
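
Rather than copying the ID by hand, it can be taken from the response returned when the tweet was created. The sketch below assumes that the `data` member of the `create_tweet` response is a dictionary containing an `'id'` entry; check this against your own response before relying on it. `create_then_delete` is a hypothetical helper name:

```python
def create_then_delete(client, text):
    """Post a tweet, then delete it again using the ID from the response."""
    created = client.create_tweet(text=text)
    tweet_id = created.data["id"]  # assumption: data is a dict with an 'id' key
    deleted = client.delete_tweet(tweet_id)
    return tweet_id, deleted.data
```

Running `create_then_delete(client, 'This was created with tweepy')` once is enough to see both calls in action without leaving a test tweet on your timeline.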

You can follow a Tweeter:

In [33]:
# follow:
api.create_friendship(screen_name = '@YouTube')

User(_api=<tweepy.api.API object at 0x7fbf49bdd700>, _json={'id': 10228272, 'id_str': '10228272', 'name': 'YouTube', 'screen_name': 'YouTube', 'location': 'San Bruno, CA', 'description': 'like and subscribe.', 'url': 'https://t.co/bUisN3Gqbw', 'entities': {'url': {'urls': [{'url': 'https://t.co/bUisN3Gqbw', 'expanded_url': 'http://youtube.com', 'display_url': 'youtube.com', 'indices': [0, 23]}]}, 'description': {'urls': []}}, 'protected': False, 'followers_count': 74087217, 'friends_count': 1206, 'listed_count': 79775, 'created_at': 'Tue Nov 13 21:43:46 +0000 2007', 'favourites_count': 5953, 'utc_offset': None, 'time_zone': None, 'geo_enabled': False, 'verified': True, 'statuses_count': 41623, 'lang': None, 'status': {'created_at': 'Mon Jan 24 01:11:51 +0000 2022', 'id': 1485420047103410176, 'id_str': '1485420047103410176', 'text': '@WalklateElla the retro arcade style is def a vibe! 🕹🎶', 'truncated': False, 'entities': {'hashtags': [], 'symbols': [], 'user_mentions': [{'screen_name': 

or unfollow:

In [35]:
# unfollow:
api.destroy_friendship(screen_name = '@YouTube')

User(_api=<tweepy.api.API object at 0x7fbf49bdd700>, _json={'id': 10228272, 'id_str': '10228272', 'name': 'YouTube', 'screen_name': 'YouTube', 'location': 'San Bruno, CA', 'description': 'like and subscribe.', 'url': 'https://t.co/bUisN3Gqbw', 'entities': {'url': {'urls': [{'url': 'https://t.co/bUisN3Gqbw', 'expanded_url': 'http://youtube.com', 'display_url': 'youtube.com', 'indices': [0, 23]}]}, 'description': {'urls': []}}, 'protected': False, 'followers_count': 74087222, 'friends_count': 1206, 'listed_count': 79775, 'created_at': 'Tue Nov 13 21:43:46 +0000 2007', 'favourites_count': 5953, 'utc_offset': None, 'time_zone': None, 'geo_enabled': False, 'verified': True, 'statuses_count': 41623, 'lang': None, 'status': {'created_at': 'Mon Jan 24 01:11:51 +0000 2022', 'id': 1485420047103410176, 'id_str': '1485420047103410176', 'text': '@WalklateElla the retro arcade style is def a vibe! 🕹🎶', 'truncated': False, 'entities': {'hashtags': [], 'symbols': [], 'user_mentions': [{'screen_name': 



---



> > > > > > > > > © 2021 Institute of Data


---



