<center> <h2> Scraping Tweets using Tweepy </h2></center>

## Outline
1. <a href='#1'>Tweepy</a>
2. <a href='#2'>Authenticating with Twitter Via Tweepy</a>
3. <a href='#3'>Getting Information About a Twitter Account</a>
4. <a href='#4'>Getting Your Own Account’s Information</a>


## 1. Tweepy
* [**Tweepy library**](http://www.tweepy.org/)—**one of the most popular Python Twitter clients**
* Easy access to Twitter’s capabilities
* [Tweepy’s documentation](http://docs.tweepy.org/en/latest/)
* [Additional information and the Tweepy source code](https://github.com/tweepy/tweepy)

### 1.1. Installing Tweepy 
* Run the following command in the Anaconda Prompt:
> `pip install tweepy`
* Windows users **should run the Anaconda Prompt as an Administrator**

### 1.2. Importing Tweepy
* Once installed, you can import tweepy and start using it!

In [1]:
import tweepy

In [3]:
import TwitterCredentials as keys

In [4]:
keys.consumer_API_key

'R3zpLRyNMTniP3I6RJYIh7bEF'

In [5]:
!more TwitterCredentials.py

consumer_API_key = "R3zpLRyNMTniP3I6RJYIh7bEF"
consumer_API_secret_key = "kniWSpJH18645bPHl3O0SHhMqO1zSDlmDeOFOgSFJ4tyCbYzm6"
access_token = "1244015025267675141-eStB8yY6Ce1XTxt7vucYlWZbPDjg8j"
access_token_secret = "7etDbFPgHYTUTI1ZQdq5XZmhd42Gx8BZxnJ0SmJPDaHwf"
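Keeping the four keys in a plain-text module such as `TwitterCredentials.py` makes them easy to leak when a notebook is shared. One common alternative, sketched here as an assumption rather than as part of the original setup, is to read them from environment variables. The variable names (`TWITTER_CONSUMER_KEY` and so on) and the helper `load_credentials` are hypothetical.

```python
import os

# Hypothetical environment variable names; choose whatever fits your setup.
ENV_NAMES = {
    "consumer_API_key": "TWITTER_CONSUMER_KEY",
    "consumer_API_secret_key": "TWITTER_CONSUMER_SECRET",
    "access_token": "TWITTER_ACCESS_TOKEN",
    "access_token_secret": "TWITTER_ACCESS_TOKEN_SECRET",
}


def load_credentials(environ=os.environ):
    """Return the four credentials as a dict, or raise if any is missing."""
    missing = [env for env in ENV_NAMES.values() if not environ.get(env)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    # Map each field name used in TwitterCredentials.py to its value.
    return {field: environ[env] for field, env in ENV_NAMES.items()}
```

The dictionary keys mirror the attribute names in `TwitterCredentials.py`, so `creds["consumer_API_key"]` can stand in for `keys.consumer_API_key`.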


## 2. Authenticating with Twitter Via Tweepy 
* A **Tweepy `API` object** is your gateway to using the Twitter APIs
* Must first **authenticate with Twitter** before scraping tweets

### 2.1. Creating and Configuring an `OAuthHandler`
* We need an authentication handler that manages the authentication with Twitter
* Tweepy has a dedicated class for this purpose: **`OAuthHandler`**
* `OAuthHandler` takes two arguments: the consumer API key and the consumer API secret key of our Twitter app:

In [6]:
#Step 1
auth = tweepy.OAuthHandler(keys.consumer_API_key, keys.consumer_API_secret_key)

* We also need to define an access token
* The access token is the “key” for opening the Twitter API treasure box

In [7]:
#Step 2
auth.set_access_token(keys.access_token, keys.access_token_secret)

### 2.2. Creating the Tweepy API Object 
* The last step is to define a Tweepy API Object with which we can interact:

In [8]:
#Step 3
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)

* **`auth`** is the **`OAuthHandler`**
* **`wait_on_rate_limit=True`** tells Tweepy to **wait until the rate-limit window resets (up to 15 minutes)** each time it reaches a given API method’s rate limit, which **prevents violations**
* **`wait_on_rate_limit_notify=True`** tells Tweepy to display a command-line message whenever it is waiting on a rate limit (this parameter exists in Tweepy 3.x; it was removed in Tweepy 4.0)
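To make the waiting behavior concrete: Twitter's v1.1 responses carry an `x-rate-limit-reset` header holding the Unix time at which the current window ends. The sketch below is not Tweepy's implementation; it only illustrates the arithmetic of how long a client would need to sleep. The function name `seconds_until_reset` is hypothetical.

```python
import time


def seconds_until_reset(reset_epoch, now=None):
    """Seconds a client should sleep before the rate-limit window reopens.

    reset_epoch: value of the x-rate-limit-reset header (Unix seconds).
    now: current Unix time; defaults to time.time().
    """
    if now is None:
        now = time.time()
    # Never return a negative wait if the window has already reset.
    return max(0, int(reset_epoch) - int(now))


# A window that resets 10 minutes from "now" means a 600-second wait.
print(seconds_until_reset(reset_epoch=1_600_000_600, now=1_600_000_000))  # 600
```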

### 2.3. Authentication Steps: Summary

In [9]:
import tweepy
import TwitterCredentials as keys

auth = tweepy.OAuthHandler(keys.consumer_API_key, keys.consumer_API_secret_key)
auth.set_access_token(keys.access_token, keys.access_token_secret)
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)

## 3. Getting Information About a Twitter Account
* **`API`** object’s **`get_user` method** returns a **`tweepy.models.User` object** containing information about a specific user’s Twitter account 

In [10]:
neu = api.get_user('Northeastern', tweet_mode="extended")

In [11]:
neu

User(_api=<tweepy.api.API object at 0x000002C6A8C75FC8>, _json={'id': 46477409, 'id_str': '46477409', 'name': 'Northeastern U.', 'screen_name': 'Northeastern', 'location': 'Boston, MA', 'profile_location': None, 'description': 'Northeastern is a global, experiential, research university built on a tradition of engagement with the world. #NUexperience', 'url': 'https://t.co/tUhI2oiKQI', 'entities': {'url': {'urls': [{'url': 'https://t.co/tUhI2oiKQI', 'expanded_url': 'http://northeastern.edu', 'display_url': 'northeastern.edu', 'indices': [0, 23]}]}, 'description': {'urls': []}}, 'protected': False, 'followers_count': 44014, 'friends_count': 881, 'listed_count': 694, 'created_at': 'Thu Jun 11 20:08:56 +0000 2009', 'favourites_count': 9865, 'utc_offset': None, 'time_zone': None, 'geo_enabled': True, 'verified': True, 'statuses_count': 26011, 'lang': None, 'status': {'created_at': 'Thu Jun 18 00:11:36 +0000 2020', 'id': 1273407982664134656, 'id_str': '1273407982664134656', 'full_text': 'As

### 3.1. The `get_user` Method
* Calls the Twitter API’s [`users/show` method](https://developer.twitter.com/en/docs/accounts-and-users/follow-search-get-users/api-reference/get-users-show)
* Can currently be called **up to 900 times every 15 minutes**
* **`tweepy.models` classes** correspond to returned **JSON objects**
* **`User` class** corresponds to a Twitter [**user object**](https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/user-object)
* **`tweet_mode="extended"`** requests the full text of tweets longer than 140 characters (up to 280 characters), which is then returned untruncated in the **`full_text`** attribute
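As a quick check on what the 900-calls-per-15-minutes limit means in practice, the arithmetic below converts it into an average spacing between calls. The helper name `min_seconds_between_calls` is hypothetical, and the sketch assumes calls are spread evenly across the window.

```python
def min_seconds_between_calls(max_calls, window_minutes=15):
    """Average spacing (in seconds) that keeps a client within the limit."""
    return window_minutes * 60 / max_calls


# 15 minutes = 900 seconds, so 900 calls allow one call per second on average.
print(min_seconds_between_calls(900))  # 1.0
```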

### 3.2. Key Attributes of a Twitter User Object
| Attribute	| Description
| :---	| :---
| `id` 	| The integer representation of the unique identifier for this User.
| `id_str` 	| The string representation of the unique identifier for this User.
| `name` 	| The name of the user, as they’ve defined it.
| `screen_name` 	| The screen name, handle, or alias that this user identifies themselves with. 
| `location` 	| The user-defined location for this account’s profile.
| `url` 	| A URL provided by the user in association with their profile.
| `description` 	| The user-defined UTF-8 string describing their account.
| `followers_count`	| The number of followers this account currently has.
| `friends_count` 	| The number of users this account is following (AKA their “followings”).
| `statuses_count` 	| The number of Tweets (including retweets) issued by the user.
| `created_at` 	| The UTC datetime that the user account was created on Twitter.
| `profile_image_url_https` 	| An HTTPS-based URL pointing to the user’s profile image.

* Full list of attributes: https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/user-object

* The unique ID for the account:

In [13]:
neu.id

46477409

* The user-defined name of the account:

In [14]:
neu.name

'Northeastern U.'

* The unique screen name (the account’s Twitter handle):

In [15]:
neu.screen_name

'Northeastern'

* The user-defined description of the account:

In [16]:
neu.description

'Northeastern is a global, experiential, research university built on a tradition of engagement with the world. #NUexperience'

* The number of followers this account currently has:

In [18]:
neu.followers_count

44014

* The number of accounts this account is currently following (its “followings”):

In [19]:
neu.friends_count

881

* The number of Tweets (including retweets) issued by the user:

In [20]:
neu.statuses_count

26011
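The attribute lookups above can be gathered into one dictionary, which is handy when comparing several accounts. This is a minimal sketch: `summarize_user` and the `follower_ratio` field are hypothetical names, and a `SimpleNamespace` stands in for the `User` object so the block runs without a Twitter connection.

```python
from types import SimpleNamespace

FIELDS = ("id", "name", "screen_name", "description",
          "followers_count", "friends_count", "statuses_count")


def summarize_user(user):
    """Copy selected attributes of a User-like object into a plain dict."""
    # Attributes the object lacks are recorded as None.
    summary = {field: getattr(user, field, None) for field in FIELDS}
    followers = summary["followers_count"]
    friends = summary["friends_count"]
    # Followers per followed account; None when it cannot be computed.
    summary["follower_ratio"] = (
        round(followers / friends, 1)
        if friends and followers is not None else None
    )
    return summary


# Stand-in object with the values shown above; with Tweepy, pass `neu` instead.
demo = SimpleNamespace(id=46477409, name="Northeastern U.",
                       screen_name="Northeastern",
                       followers_count=44014, friends_count=881,
                       statuses_count=26011)
print(summarize_user(demo)["follower_ratio"])  # 50.0
```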

### 3.3. Getting the Most Recent Status Update
* `User` object’s **`status` property** returns a **`tweepy.models.Status`** object
* Corresponds to a Twitter [**tweet object**](https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/tweet-object)
* **`full_text`** attribute gives the entire untruncated tweet

In [21]:
neu.status.full_text

'As an innovator in global banking with extensive experience working in emerging markets, Bill Winters is the Chief Executive of @StanChart. He joins @PresidentAoun for a discussion about business in the age of global change Thursday at 11:30. Watch: https://t.co/OZ0su3lfHs https://t.co/ZyPnXw0ZwW'

In [22]:
type(neu.status)

tweepy.models.Status

In [23]:
neu.status

Status(_api=<tweepy.api.API object at 0x000002C6A8C75FC8>, _json={'created_at': 'Thu Jun 18 00:11:36 +0000 2020', 'id': 1273407982664134656, 'id_str': '1273407982664134656', 'full_text': 'As an innovator in global banking with extensive experience working in emerging markets, Bill Winters is the Chief Executive of @StanChart. He joins @PresidentAoun for a discussion about business in the age of global change Thursday at 11:30. Watch: https://t.co/OZ0su3lfHs https://t.co/ZyPnXw0ZwW', 'truncated': False, 'display_text_range': [0, 273], 'entities': {'hashtags': [], 'symbols': [], 'user_mentions': [{'screen_name': 'StanChart', 'name': 'Standard Chartered', 'id': 58354465, 'id_str': '58354465', 'indices': [128, 138]}, {'screen_name': 'PresidentAoun', 'name': 'President Aoun', 'id': 393395654, 'id_str': '393395654', 'indices': [149, 163]}], 'urls': [{'url': 'https://t.co/OZ0su3lfHs', 'expanded_url': 'http://facebook.com/northeastern', 'display_url': 'facebook.com/northeastern', 'indices': [2
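The `entities` dictionary visible in the output above lists the accounts and links a tweet refers to. The sketch below pulls those out alongside the text. `describe_status` is a hypothetical helper, and the `SimpleNamespace` is a stand-in built from the values shown above (with shortened text), not a real `Status` object.

```python
from types import SimpleNamespace


def describe_status(status):
    """Return the tweet text plus the screen names and URLs it references."""
    # Extended mode exposes `full_text`; the default mode exposes `text`.
    text = getattr(status, "full_text", None) or getattr(status, "text", "")
    entities = getattr(status, "entities", None) or {}
    mentions = [m["screen_name"] for m in entities.get("user_mentions", [])]
    urls = [u["expanded_url"] for u in entities.get("urls", [])]
    return {"text": text, "mentions": mentions, "urls": urls}


# Stand-in for `neu.status`, using entity values from the output above.
demo_status = SimpleNamespace(
    full_text="Stand-in tweet mentioning @StanChart and @PresidentAoun",
    entities={
        "user_mentions": [{"screen_name": "StanChart"},
                          {"screen_name": "PresidentAoun"}],
        "urls": [{"expanded_url": "http://facebook.com/northeastern"}],
    },
)
info = describe_status(demo_status)
print(info["mentions"])  # ['StanChart', 'PresidentAoun']
```

With a live connection, `describe_status(neu.status)` would be the equivalent call.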

## 4. Getting Your Own Account’s Information
* Use the **`me` method** of the `API` object to get the `User` object for the authenticated account

In [24]:
me = api.me()

In [25]:
me.name

'Aneesha'

In [26]:
me.screen_name

'aneesha011'
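The `me()` call above matches the Tweepy 3.x API used throughout this notebook. As far as the Tweepy changelog indicates, version 4.0 dropped `API.me()`, and `API.verify_credentials()` returns the authenticated account instead. The sketch below works with either version; `get_own_account` is a hypothetical helper, and the stub objects only imitate an `API` instance so the block runs offline.

```python
from types import SimpleNamespace


def get_own_account(api):
    """Return the authenticated account's User object."""
    # Tweepy 3.x exposes `me`; fall back to `verify_credentials` otherwise.
    if hasattr(api, "me"):
        return api.me()
    return api.verify_credentials()


# Stubs imitating the two API flavors (assumptions for illustration only).
old_style_api = SimpleNamespace(me=lambda: "user-from-me")
new_style_api = SimpleNamespace(verify_credentials=lambda: "user-from-verify")

print(get_own_account(old_style_api))  # user-from-me
print(get_own_account(new_style_api))  # user-from-verify
```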