<div>
<img src=https://www.institutedata.com/wp-content/uploads/2019/10/iod_h_tp_primary_c.svg width="300">
</div>


# Lab 3.2.2 
# *Mining Social Media with Twitter*

## The Twitter API and Tweepy Package

The Twitter API provides access to tweets and comments, and allows an application to post tweets to the user's timeline. 

Twitter requires developers to create and authenticate an app before they can use the API. 

### 1. Apply for Developer Access


Apply at https://developer.twitter.com/en/apply-for-access.html for elevated access (https://developer.twitter.com/en/portal/products/elevated). Where asked, state that you will use the app as a student to explore the tweepy Python library, fetch recent tweets, and create and delete a tweet.

### 2. Create Your Twitter App
Go to https://developer.twitter.com/en/portal/projects-and-apps and click on the "+ New Project" button. Give your project a name, select Student as the use case and enter a brief project description. Then click on the "Create new" button to create a new App. Choose a "Development" app environment, give your app a name and you will be able to see your API Key, API Key Secret and Bearer Token. You will also be able to generate an Access Token and Secret. Copy-paste these authentication details for use in this lab.

### 3. Load Python Libraries

In [5]:
#!pip install tweepy
#!pip install textblob

In [68]:
import tweepy
import json
import pprint
import pandas as pd
import matplotlib.pyplot as plt
from textblob import TextBlob

Ensure your version of tweepy is 4.5.0 or later:

In [69]:
tweepy.__version__

'4.12.1'
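
Checking the version by eye is easy to get wrong, because version strings do not compare correctly as text (`'4.12.1' < '4.5.0'` is true for strings). The following is a minimal sketch of a numeric comparison, assuming a plain `major.minor.patch` version string with no suffix; the helper name `meets_minimum` is illustrative and not part of tweepy:

```python
def meets_minimum(version, minimum=(4, 5, 0)):
    """Return True if a 'major.minor.patch' string is at least `minimum`."""
    # Convert "4.12.1" to (4, 12, 1) so the parts compare as numbers, not text.
    parts = tuple(int(p) for p in version.split(".")[:3])
    return parts >= minimum


print(meets_minimum("4.12.1"))  # True
print(meets_minimum("4.4.0"))   # False
```

In the notebook this could be called as `meets_minimum(tweepy.__version__)`.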

### 4. Authenticate from your Python script

You could assign your authentication details explicitly, as follows:

In [70]:
bearer_token = "YOUR_BEARER_TOKEN"  # your bearer token (string) goes in here
api_consumer_key = "YOUR_CONSUMER_KEY"  # your consumer key (string) goes in here
api_consumer_secret = "YOUR_CONSUMER_SECRET"  # your consumer secret key (string) goes in here
access_token = "YOUR_ACCESS_TOKEN"  # your access token (string) goes in here
access_token_secret = "YOUR_ACCESS_TOKEN_SECRET"  # your access token secret (string) goes in here

A better way would be to store these details externally, so they are not displayed in the notebook:

- create a file called "auth_twitter.json" in your "notebooks" directory, and save your credentials there in JSON format:

`{`<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`"bearer_token": "your bearer token (string) goes in here",`<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`"consumer_key": "your consumer key (string) goes in here",`<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`"consumer_secret": "your consumer secret key (string) goes in here",`<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`"access_token": "your access token (string) goes in here",`<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`"access_token_secret": "your access token secret (string) goes in here"`<br>
`}`

(NB: JSON parsers are strict. Make sure every key:value pair except the last one is followed by a comma.)  

Use the following code to load the credentials:  

In [71]:
# make sure your working directory is where the file is
%pwd

'C:\\Users\\alice\\Desktop\\IOD\\Lab 3'

In [101]:
path_auth = 'auth_twitter.json'
with open(path_auth) as f:  # the file is closed once it has been read
    auth = json.load(f)
pp = pprint.PrettyPrinter(indent=4)
# For debugging only:
#pp.pprint(auth)

my_bearer_token = auth['bearer_token']
my_consumer_key = auth['consumer_key']
my_consumer_secret = auth['consumer_secret']
my_access_token = auth['access_token']
my_access_token_secret = auth['access_token_secret']
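
The loading step above assumes that the file is valid JSON and contains all five keys. The following is a minimal sketch of a stricter check that fails with a readable message instead of a later `KeyError`; the names `REQUIRED_KEYS` and `parse_credentials` are illustrative and not part of tweepy:

```python
import json

# The five keys this lab expects to find in auth_twitter.json.
REQUIRED_KEYS = (
    "bearer_token",
    "consumer_key",
    "consumer_secret",
    "access_token",
    "access_token_secret",
)


def parse_credentials(text):
    """Parse credential JSON text and confirm every expected key is present."""
    try:
        creds = json.loads(text)
    except json.JSONDecodeError as err:
        raise ValueError(f"auth file is not valid JSON: {err}") from err
    missing = [key for key in REQUIRED_KEYS if key not in creds]
    if missing:
        raise ValueError(f"auth file is missing keys: {missing}")
    return creds


# A trailing comma after the last pair is rejected by the parser.
try:
    parse_credentials('{"bearer_token": "x",}')
except ValueError as err:
    print("rejected:", err)
```

With the file described above, this could be used as `with open(path_auth) as f: auth = parse_credentials(f.read())`.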

Security considerations: 
- this method only keeps your credentials invisible as long as nobody accesses this notebook while it's running on your computer 
- if you wanted another user to have access to the executable notebook without divulging your credentials you should set up an OAuth 2.0 workflow to let them obtain and apply their own API tokens when using your app
- if you just want to share your analyses, you could use a separate script (which you don't share) to fetch the data and save it locally, then use a second notebook (with no API access) to load and analyse the locally stored data
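
The last option in the list can be sketched as follows: a private script saves the fetched records to a local file, and the shared notebook only reads that file. This is an illustration under stated assumptions; the file name `tweets_sample.json`, the helper names, and the placeholder records are hypothetical and stand in for data fetched from the API:

```python
import json
import os
import tempfile


def save_records(records, path):
    """Write a list of plain dicts to a JSON file (done by the private script)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)


def load_records(path):
    """Read the records back (done by the shared notebook; no credentials needed)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Placeholder records; in the private script these would come from the API.
sample = [
    {"id": "1", "text": "placeholder tweet text"},
    {"id": "2", "text": "another placeholder"},
]
path = os.path.join(tempfile.gettempdir(), "tweets_sample.json")
save_records(sample, path)
print(load_records(path) == sample)  # True
```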

### 5. Exploring the API

Here is how to connect to Twitter using the Tweepy library:

In [102]:
client = tweepy.Client(
    bearer_token=my_bearer_token,
    consumer_key=my_consumer_key,
    consumer_secret=my_consumer_secret,
    access_token=my_access_token,
    access_token_secret=my_access_token_secret
)

In the next cell, uncomment the line, put the cursor after the '.' and press the [tab] key to see the available members and methods of the client object:

In [74]:
#client.

To obtain your followers, enter your Twitter id. An easy way to find this is via https://tweeterid.com/.

In [88]:
myid = '1598799841081327621' #enter your id
followers = client.get_users_followers(id=myid)

In [89]:
followers.data[0].id
#try id, name, username

TypeError: 'NoneType' object is not subscriptable
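
The `TypeError` above most likely arises because `followers.data` is `None` when a request returns no results (for example, when the account has no followers), so indexing it fails. The following is a minimal sketch of a guard; `first_item` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for API responses, not real tweepy objects:

```python
from types import SimpleNamespace


def first_item(response):
    """Return the first item of response.data, or None when there is no data."""
    data = getattr(response, "data", None)
    return data[0] if data else None


# Stand-ins for an empty response and a response with one user.
empty = SimpleNamespace(data=None)
one = SimpleNamespace(data=[SimpleNamespace(id=1, username="example_user")])

print(first_item(empty))         # None
print(first_item(one).username)  # example_user
```

In the notebook, `first_item(followers)` would return `None` instead of raising when there are no followers.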

You can enter a query such as the following to look up recent tweets based on a search string (excluding retweets).

In [90]:
query = 'winterolympics -is:retweet'

tweets = client.search_recent_tweets(query=query, tweet_fields=['context_annotations', 'created_at'],
                                     media_fields=['preview_image_url'], expansions='attachments.media_keys',
                                     max_results=100)

In [91]:
tweets.data[0].context_annotations

[{'domain': {'id': '6', 'name': 'Sports Event'},
  'entity': {'id': '1237760060828213249', 'name': 'Olympics'}},
 {'domain': {'id': '6', 'name': 'Sports Event'},
  'entity': {'id': '1284174028442230785', 'name': 'Winter Olympics'}},
 {'domain': {'id': '12',
   'name': 'Sports Team',
   'description': 'A sports team organization, like Arsenal and the Boston Celtics'},
  'entity': {'id': '1400530139130130433', 'name': 'Canada Olympic Team'}},
 {'domain': {'id': '46',
   'name': 'Business Taxonomy',
   'description': 'Categories within Brand Verticals that narrow down the scope of Brands'},
  'entity': {'id': '1557697289971322880',
   'name': 'Sports & Fitness Business',
   'description': 'Brands, companies, advertisers and every non-person handle with the profit intent related to sports nutrition, athletic apparel, sports apps, fitness venues'}},
 {'domain': {'id': '131',
   'name': 'Unified Twitter Taxonomy',
   'description': 'A taxonomy of user interests. '},
  'entity': {'id': '84790
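
Each context annotation in the output above is a dictionary with a `domain` and an `entity`, each of which has a `name`. The following is a minimal sketch that reduces such a list to readable pairs, assuming every item has both keys; `annotation_names` is an illustrative helper, and the sample is the first two entries copied from the output above:

```python
def annotation_names(context_annotations):
    """Collect (domain name, entity name) pairs from a context_annotations list."""
    return [
        (item["domain"]["name"], item["entity"]["name"])
        for item in context_annotations
    ]


# First two entries copied from the output above.
sample = [
    {"domain": {"id": "6", "name": "Sports Event"},
     "entity": {"id": "1237760060828213249", "name": "Olympics"}},
    {"domain": {"id": "6", "name": "Sports Event"},
     "entity": {"id": "1284174028442230785", "name": "Winter Olympics"}},
]

for domain, entity in annotation_names(sample):
    print(f"{domain}: {entity}")
# Sports Event: Olympics
# Sports Event: Winter Olympics
```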

Consult the Tweepy and Twitter API documentation. Print a few of the response members below:

In [92]:
username = 'nzherald' #choose a username
query = 'from:' + username
recent_tweets = client.search_recent_tweets(query=query, 
                                    tweet_fields=['attachments','author_id','context_annotations','created_at','entities','geo','id','in_reply_to_user_id','lang','possibly_sensitive','public_metrics','referenced_tweets','source','text','withheld'],
                                     max_results=10)
print(recent_tweets)

Response(data=[<Tweet id=1598846941705691137 text='‘Things are tough for everyone’: Auckland mayor’s latest cost-cutting call https://t.co/BvuIK7Xcy2 https://t.co/uTK2u7xYSR'>, <Tweet id=1598842081216282631 text="'Unacceptable' World Cup act slammed https://t.co/jSsAyTKrQl https://t.co/HXnpV9QMul">, <Tweet id=1598838378568585216 text='Podular creditors owed $5.2m https://t.co/EwmMPIDPfn https://t.co/e3zmTsVMFV'>, <Tweet id=1598834494819237888 text='Phil Gifford: Eight lasting rugby memories from 2022\xa0#HeraldPremium  https://t.co/TMNuPT1Xam https://t.co/9XFpMpnjQK'>, <Tweet id=1598830229824356352 text='#BREAKING | Homicide investigation launched into the death of Taranaki man https://t.co/CbzCZ5L9v7 https://t.co/6leCVQkEob'>, <Tweet id=1598825995364073472 text='Holiday hell: ACC dilemma keeps NZ woman from returning to Australia https://t.co/uhoXNMFSwp https://t.co/3H05JrZIkr'>, <Tweet id=1598820704962617353 text='#UPDATE | ‘Good progress’: Police say they are ‘confident’ of finding 
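
The queries used so far are built by joining a search term or a `from:` operator with `-is:retweet`. The following is a minimal sketch of a helper that assembles such strings; `build_query` and its parameters are hypothetical, and only the two operators already used in this lab are covered:

```python
def build_query(term=None, username=None, exclude_retweets=True):
    """Assemble a search query string from the operators used in this lab."""
    parts = []
    if term:
        parts.append(term)
    if username:
        parts.append(f"from:{username}")
    if not parts:
        # An exclusion on its own does not describe what to search for.
        raise ValueError("a search term or username is required")
    if exclude_retweets:
        parts.append("-is:retweet")
    return " ".join(parts)


print(build_query(term="winterolympics"))
# winterolympics -is:retweet
print(build_query(username="nzherald", exclude_retweets=False))
# from:nzherald
```

The resulting string can be passed as the `query` argument of `client.search_recent_tweets`.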

The following few cells will fetch recent tweets from accounts you follow:

In [93]:
users = client.get_users_following(id=myid, max_results = 1000, user_fields=['profile_image_url'])
print(len(users.data)) #number of accounts you follow

6


In [94]:
following_ids = []
for user in users.data:
    following_ids.append(user.id)
print(following_ids)

[27726303, 1604444052, 498292566, 14765253, 14301074, 20998647]


If you do not follow any accounts, create a list called following_ids with a list of ids of interest that can be looked up from https://tweeterid.com/.

In [95]:
for f in following_ids[:5]:  # the slice limit (5) can be increased to show tweets from more accounts
    query = 'from:' + str(f) + ' -is:retweet'

    # get at most 10 tweets from each account followed
    tweets = client.search_recent_tweets(query=query,
                                         tweet_fields=['author_id', 'created_at'],
                                         max_results=10)
    if tweets.data is not None:
        for tweet in tweets.data:
            print(tweet.author_id, tweet.created_at, tweet.text)

27726303 2022-12-03 00:16:00+00:00 Putting distance between the moment that your anger is triggered and your angry response can feel difficult at times—but it's doable. Here's how to start, by @DrRubinKhoddam https://t.co/Z7ySqywxAp
27726303 2022-12-02 22:18:00+00:00 "The client-therapist relationship is the single most important factor in a successful therapeutic outcome." But what does that actually mean? Here's one therapist's take. https://t.co/i4szcGhk3E
27726303 2022-12-02 20:21:00+00:00 Excessive noise may be doing more harm to your physical and mental well-being than you realize. @DrSamoonAhmad explains how. https://t.co/mLGzywv9tN
27726303 2022-12-02 18:32:00+00:00 "Children in detention have been subject to forced behavior therapy under the supervision of a psychologist and a cleric. In some instances, children have been prescribed psychiatric drugs." https://t.co/bYpYJ3wO9J
27726303 2022-12-02 17:04:00+00:00 When it comes to relationships, some people prefer to live out temp

The request to see your own recent tweets is similar, but uses a single call to the `get_users_tweets` endpoint. Try this below:

In [122]:
data_tweets = client.get_users_tweets(14765253)
data_tweets

Response(data=[<Tweet id=1598859958853976066 text='#BREAKING | Whangarei stabbing: Two hospitalised after fight; man appears in court https://t.co/xDYNL8a9X3 https://t.co/3hfBXq4WEw'>, <Tweet id=1598857311174402049 text='Kaipara mayor unrepentant over ban on karakia https://t.co/7odr7fzBij https://t.co/fqzmYqN7cq'>, <Tweet id=1598854211147825152 text="'Regret is not rape': Lawyers for rapist Harvey Weinstein attack accusers https://t.co/PunShponVw https://t.co/eheRqLEJoO">, <Tweet id=1598846941705691137 text='‘Things are tough for everyone’: Auckland mayor’s latest cost-cutting call https://t.co/BvuIK7Xcy2 https://t.co/uTK2u7xYSR'>, <Tweet id=1598842081216282631 text="'Unacceptable' World Cup act slammed https://t.co/jSsAyTKrQl https://t.co/HXnpV9QMul">, <Tweet id=1598838378568585216 text='Podular creditors owed $5.2m https://t.co/EwmMPIDPfn https://t.co/e3zmTsVMFV'>, <Tweet id=1598834494819237888 text='Phil Gifford: Eight lasting rugby memories from 2022\xa0#HeraldPremium  https://t.c

Now, instead of printing the text of each tweet, print only the `author_id` and `created_at` attributes:

In [127]:
query = 'from:' + str(14765253) + ' -is:retweet'
tweets = client.search_recent_tweets(query=query, 
                                    tweet_fields=['author_id', 'created_at'],
                                     max_results=10)
if tweets.data is not None:
    for tweet in tweets.data:
        print(tweet.author_id, tweet.created_at)

14765253 2022-12-03 02:11:51+00:00
14765253 2022-12-03 02:01:34+00:00
14765253 2022-12-03 01:51:02+00:00
14765253 2022-12-03 01:38:43+00:00
14765253 2022-12-03 01:09:50+00:00
14765253 2022-12-03 00:50:31+00:00
14765253 2022-12-03 00:35:49+00:00
14765253 2022-12-03 00:20:23+00:00
14765253 2022-12-03 00:03:26+00:00
14765253 2022-12-02 23:46:36+00:00


You can create a tweet as follows:

In [103]:
# create a tweet:
tweet = client.create_tweet(text='Test: Made with Tweepy')

(NB: Don't abuse this feature! If you try to generate a zillion tweets in a loop, Twitter may suspend your account.)

Tweets can be deleted by reference to their `id`, which is available in the response returned by `create_tweet`:

In [104]:
tweet.data['id']

'1598856214443864065'

In [105]:
# delete a tweet:
status = client.delete_tweet(tweet.data['id'])

You can follow a user:

In [106]:
# follow:
client.follow_user('10228272') #YouTube

Response(data={'following': True, 'pending_follow': False}, includes={}, errors=[], meta={})

or unfollow:

In [107]:
# unfollow:
client.unfollow_user('10228272')

Response(data={'following': False}, includes={}, errors=[], meta={})



---



> > > > > > > > > © 2022 Institute of Data


---


