<div>
<img src=https://www.institutedata.com/wp-content/uploads/2019/10/iod_h_tp_primary_c.svg width="300">
</div>


# Lab 2.2.2 
# *Mining Social Media with Twitter*

## The Twitter API and Tweepy Package

The Twitter API provides access to tweets and comments, and allows an application to post tweets to the user's timeline. 

Twitter requires developers to create and authenticate an app before they can use the API. Since the 2018 policy changes, new developers must also be approved before they can create an app, and Twitter gives no indication of how long approval takes.

### 1. Apply for Developer Access

Go to https://blog.twitter.com/developer/en_us/topics/tools/2018/new-developer-requirements-to-protect-our-platform.html
and read the advice.
![image.png](attachment:image.png)

Apply at https://developer.twitter.com/en/apply-for-access.html
![image.png](attachment:image.png)

Then check https://developer.twitter.com/en/review each day until the status changes from the one shown here:
![image.png](attachment:image.png)

### 2. Create Your Twitter App
![image.png](attachment:image.png)

### 3. Load Python Libraries

In [8]:
import tweepy
import json
import pprint

### 4. Authenticate from your Python script

You could assign your authentication details explicitly, as follows:

In [14]:
my_consumer_key = ''      # your consumer key (string) goes in here
my_consumer_secret = ''   # your consumer secret key (string) goes in here
my_access_token = ''      # your access token (string) goes in here
my_access_token_secret = ''  # your access token secret (string) goes in here

A better way would be to store these details externally, so they are not displayed in the notebook:

- create a file called "auth_twitter.json" in your "notebooks" directory, and save your credentials there in JSON format:

`{   "consumer_key": "your consumer key (string) goes in here",` <br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`"consumer_secret": "your consumer secret key (string) goes in here",` <br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`"access_token": "your access token (string) goes in here",` <br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`"access_token_secret": "your access token secret (string) goes in here"` <br>
`}`

(NB: JSON parsers are very fussy. Make sure every key:value pair except the last is followed by a comma, and that there is no comma after the last one.)
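To see how strict the parser is, the short sketch below feeds two strings to `json.loads`: one valid, and one with a trailing comma after the last pair. The sample strings and the `try_parse` helper are made up for illustration and are not part of the lab files.

```python
import json

# Hypothetical file contents, for illustration only.
valid_text = '{"consumer_key": "abc", "consumer_secret": "def"}'
trailing_comma_text = '{"consumer_key": "abc", "consumer_secret": "def",}'


def try_parse(text):
    """Return the parsed dict, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        # Report where the parser gave up, which helps locate the stray comma
        print(f"Invalid JSON: {err.msg} (line {err.lineno}, column {err.colno})")
        return None


good = try_parse(valid_text)           # parses to a dict
bad = try_parse(trailing_comma_text)   # rejected because of the trailing comma
```

If loading your own credentials file fails, the line and column in the error message usually point close to the offending comma or quote.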

Use the following code to load the credentials:  

In [None]:
import os
os.getcwd()  # make sure your working directory is where the file is

In [None]:
import json
import pprint

path_auth = 'auth_twitter.json'

# Read the credentials file; the context manager closes it afterwards
with open(path_auth) as f:
    auth = json.load(f)

pp = pprint.PrettyPrinter(indent=4)
# For debugging only (this would print your secrets):
#pp.pprint(auth)

# These keys must match the ones used in auth_twitter.json
my_consumer_key = auth['consumer_key']
my_consumer_secret = auth['consumer_secret']
my_access_token = auth['access_token']
my_access_token_secret = auth['access_token_secret']

Security considerations: 
- this method only keeps your credentials invisible as long as nobody accesses this notebook while it's running on your computer 
- if you wanted another user to have access to the executable notebook without divulging your credentials you should set up an OAuth 2.0 workflow to let them obtain and apply their own API tokens when using your app
- if you just want to share your analyses, you could use a separate script (which you don't share) to fetch the data and save it locally, then use a second notebook (with no API access) to load and analyse the locally stored data
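
The last option above (fetch in a private script, analyse in a shareable notebook) could be sketched roughly as follows. The helper names `save_records` and `load_records` and the file name `tweets.json` are assumptions made for this example; the hand-made record stands in for real API output.

```python
import json


def save_records(records, path):
    """Write a list of JSON-serializable dicts to a local file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)


def load_records(path):
    """Read the list of dicts back; no API access is needed for this step."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# In the private fetch script, the records might come from Tweepy, e.g.
#   records = [status._json for status in api.user_timeline()]
# Here a hand-made record stands in for real API output.
sample_records = [{"id_str": "188042123604525056",
                   "text": "What a beautiful day in sydney!!!"}]
```

The private script would call `save_records(sample_records, 'tweets.json')`, and the shared notebook would only ever call `load_records('tweets.json')`, so it never needs the credentials.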

### 5. Exploring the API

Here is how to connect to Twitter using the Tweepy library:

In [11]:
auth = tweepy.OAuthHandler(my_consumer_key, my_consumer_secret)
auth.set_access_token(my_access_token, my_access_token_secret)
api = tweepy.API(auth)

In the next cell, put the cursor after the '.' and hit the [tab] key to see the available members and methods of the `api` object:

In [0]:
api.

Consult the Tweepy and Twitter API documentation. Print a few of the response members below:

This will fetch recent tweets from accounts you follow:

In [12]:
# Recent tweets from accounts you follow:
tweets = api.home_timeline()
for tweet in tweets:
    print(tweet.text)

ولدي شاطر بس مايدرس 
- ولدهم . https://t.co/BO4K3Znx9V
Только что опубликовано видео @ Канонерка https://t.co/Ap751cJfJV
Людвиг фон Мизес. Цитаты. №162 Об основной иллюзии классического либерализма
 #минархизм #minarchism… https://t.co/JSBmGEhX2c
مايضحكني الا انا وخويي كنا بنشغل اغنيه ونسينا اسمها وكلنا صلينا على النبي عشان نتذكرها
Мюррей Ротбард. Цитаты. №118 Экстремальные ситуации не должны служить поводом для формирования этики обычной жизни… https://t.co/7Wjxb1ZHYP
RT @Press_Focusnik: «Борьба с коррупцией - НЕТ!»
Россия - это, когда женатый министр здравоохранения Омской области спокойно поехал н

The request to see your own recent tweets is similar, but uses the `user_timeline` endpoint. Try this below:

In [13]:
tweets = api.user_timeline()
for tweet in tweets:
    print(tweet.text)

What a beautiful day in sydney!!!


Now, instead of printing the text of each tweet, print the `created_at` and `id_str` attributes:

In [17]:
for tweet in tweets:
    print(tweet.created_at)
    print(tweet.id_str)

2012-04-05 23:15:10
188042123604525056


In [18]:
tweet

Status(_api=<tweepy.api.API object at 0x104136460>, _json={'created_at': 'Thu Apr 05 23:15:10 +0000 2012', 'id': 188042123604525056, 'id_str': '188042123604525056', 'text': 'What a beautiful day in sydney!!!', 'truncated': False, 'entities': {'hashtags': [], 'symbols': [], 'user_mentions': [], 'urls': []}, 'source': '<a href="http://twitter.com/download/android" rel="nofollow">Twitter for Android</a>', 'in_reply_to_status_id': None, 'in_reply_to_status_id_str': None, 'in_reply_to_user_id': None, 'in_reply_to_user_id_str': None, 'in_reply_to_screen_name': None, 'user': {'id': 545828792, 'id_str': '545828792', 'name': 'jin', 'screen_name': 'gentley99', 'location': '', 'description': '', 'url': None, 'entities': {'description': {'urls': []}}, 'protected': False, 'followers_count': 0, 'friends_count': 124, 'listed_count': 0, 'created_at': 'Thu Apr 05 09:22:04 +0000 2012', 'favourites_count': 0, 'utc_offset': None, 'time_zone': None, 'geo_enabled': False, 'verified': False, 'statuses_count'
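
The `Status` repr above shows that the raw API payload is kept as a plain dictionary in `_json`. As a rough sketch, fields can be pulled out of such a dictionary with ordinary indexing. The `sample` dict below is a hand-copied subset of the output above, standing in for `tweet._json`, and `summarise` is a hypothetical helper name.

```python
from datetime import datetime

# Subset of the payload shown above, standing in for tweet._json.
sample = {
    "created_at": "Thu Apr 05 23:15:10 +0000 2012",
    "id_str": "188042123604525056",
    "text": "What a beautiful day in sydney!!!",
    "user": {"screen_name": "gentley99", "followers_count": 0},
}


def summarise(payload):
    """Pick a few fields out of a raw tweet dictionary."""
    # Timestamp format as it appears in the payload above
    created = datetime.strptime(payload["created_at"], "%a %b %d %H:%M:%S %z %Y")
    return {
        "id": payload["id_str"],
        "created": created,
        "author": payload["user"]["screen_name"],
        "text": payload["text"],
    }


summary = summarise(sample)
print(summary["author"], summary["created"].isoformat())
```

Note that `_json` is an underscore-prefixed attribute, so relying on it is a convenience for exploration rather than a documented contract; the named attributes such as `tweet.created_at` are the safer route where they exist.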

You can create a tweet as follows:

In [19]:
# create a tweet:
tweet = api.update_status('Test: Made with Tweepy')

TweepError: Read-only application cannot POST.

(NB: Don't abuse this feature! If you try to generate a zillion tweets in a loop, Twitter will ban your account.)
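
One way to stay well clear of that kind of loop is to pace any batch of posts. The sketch below is only an illustration: `post_paced` and its `delay_seconds` default are hypothetical, the right spacing depends on Twitter's current limits, and in practice `post_fn` would be something like `api.update_status`.

```python
import time


def post_paced(messages, post_fn, delay_seconds=60, sleep=time.sleep):
    """Call post_fn once per message, pausing between consecutive calls."""
    results = []
    for index, message in enumerate(messages):
        if index > 0:
            sleep(delay_seconds)  # wait before every post except the first
        results.append(post_fn(message))
    return results


# Demonstration with stand-ins, so nothing is actually posted or delayed.
sent = []
pauses = []
post_paced(["first", "second", "third"], sent.append, sleep=pauses.append)
```

Passing the `sleep` function in as a parameter is a design choice made here so the pacing logic can be tried out without waiting or touching the API.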

Tweets can be deleted by reference to their `id_str` attribute:

In [0]:
# delete a tweet:
status = api.destroy_status(tweet.id_str)

You can follow a Twitter user:

In [0]:
# follow:
api.create_friendship(screen_name='YouTube')

or unfollow:

In [0]:
# unfollow:
api.destroy_friendship(screen_name='YouTube')

In [25]:
# Extract up to 100 recent tweets from a given user's timeline;
# tweet_mode="extended" returns the untruncated text as `full_text`
posts = api.user_timeline(screen_name="elonmusk", count=100, lang="en", tweet_mode="extended")



In [26]:
posts

[Status(_api=<tweepy.api.API object at 0x104136460>, _json={'created_at': 'Thu May 13 09:54:35 +0000 2021', 'id': 1392780304138473473, 'id_str': '1392780304138473473', 'full_text': 'Energy usage trend over past few months is insane https://t.co/E6o9s87trw https://t.co/bmv9wotwKe', 'truncated': False, 'display_text_range': [0, 73], 'entities': {'hashtags': [], 'symbols': [], 'user_mentions': [], 'urls': [{'url': 'https://t.co/E6o9s87trw', 'expanded_url': 'https://cbeci.org', 'display_url': 'cbeci.org', 'indices': [50, 73]}], 'media': [{'id': 1392780299965140994, 'id_str': '1392780299965140994', 'indices': [74, 97], 'media_url': 'http://pbs.twimg.com/media/E1QmTbWVkAIjDXp.jpg', 'media_url_https': 'https://pbs.twimg.com/media/E1QmTbWVkAIjDXp.jpg', 'url': 'https://t.co/bmv9wotwKe', 'display_url': 'pic.twitter.com/bmv9wotwKe', 'expanded_url': 'https://twitter.com/elonmusk/status/1392780304138473473/photo/1', 'type': 'photo', 'sizes': {'thumb': {'w': 150, 'h': 150, 'resize': 'crop'}, 'medium

In [27]:
# Print the 5 most recent tweets from the account
print("show the 5 recent tweets: \n")
for i, tweet in enumerate(posts[0:5], start=1):
    print(str(i) + ') ' + tweet.full_text + '\n')

show the 5 recent tweets: 

1) Energy usage trend over past few months is insane https://t.co/E6o9s87trw https://t.co/bmv9wotwKe

2) Tesla &amp; Bitcoin https://t.co/YSswJmVZhP

3) @WholeMarsBlog Haha

4) @blockfolio 🤣🤣

5) @garyblack00 Subscription rolls out in about a month
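
The printed tweets still contain HTML entities (`&amp;`), `t.co` links and @mentions, which usually get in the way of text mining. A minimal cleaning sketch, assuming that dropping links and mentions is acceptable for the analysis at hand, might look like this; `clean_tweet` and the two regular expressions are illustrative choices, not part of Tweepy.

```python
import html
import re

URL_PATTERN = re.compile(r"https?://\S+")
MENTION_PATTERN = re.compile(r"@\w+")


def clean_tweet(text):
    """Unescape HTML entities, drop links and @mentions, tidy whitespace."""
    text = html.unescape(text)            # '&amp;' -> '&'
    text = URL_PATTERN.sub("", text)      # remove http(s) links
    text = MENTION_PATTERN.sub("", text)  # remove @username tokens
    return " ".join(text.split())         # collapse leftover whitespace


# Two of the tweets printed above, used as sample input.
examples = [
    "Tesla &amp; Bitcoin https://t.co/YSswJmVZhP",
    "@garyblack00 Subscription rolls out in about a month",
]
cleaned = [clean_tweet(t) for t in examples]
```

With real data this would be applied to `tweet.full_text` for each item in `posts`.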






---






> > > > > > > > > © 2019 Institute of Data


---






