&copy; 2019 by Pearson Education, Inc. All Rights Reserved. The content in this notebook is based on the book [**Python for Programmers**](https://amzn.to/2VvdnxE).

In [1]:
# enable high-res images in notebook 
%config InlineBackend.figure_format = 'retina'

# 12. Data Mining Twitter 
# Objectives
* **Data-mine Twitter** with **Tweepy** library
* Search **past tweets** with the **Twitter Search API**
* Sample the **live tweet stream** with the **Twitter Streaming API**
* Tweet object **metadata**
* Use **NLP** to **clean and preprocess tweets** for analysis
* Perform **sentiment analysis** on tweets
* **Spot trends** with **Twitter’s Trends API**
* **Map tweets** using **folium** and **OpenStreetMap**

<hr style="height:2px; border:none; color:#000; background-color:#000;">

# 12.1 Introduction 
* Popular **big-data source**  
* **Data mining** &mdash; searching large collections of data for **insights**
* **Sentiment** in tweets can help **make predictions**  
    * **Stock prices**
    * **Election results**
    * Likely **revenues** for a **new movie**
    * **Success** of a company’s **marketing campaign**
* Spot **faults in competitors’ products** 
* Spot **trending topics**
* **Connect to Twitter** with easy-to-use **Web services**

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### What Is Twitter? 
* **Tweets**
    * Short messages
    * Initially limited to **140 characters**
    * Increased in 2017 for most languages to **280 characters**
* **Most open social network**&mdash;anyone can generally choose to follow anyone else
* **Free programmatic access** to a small portion of **last 7 days' tweets**
    * Can get **paid access** to larger portions of the **all-time tweets database**
* [**Hundreds of millions of tweets are sent every day** with many thousands sent per second](http://www.internetlivestats.com/twitter-statistics/)
* Can tap into the **live stream** and get up to 1% of live tweets
    * **Like “drinking from a fire hose”** 

<hr style="height:2px; border:none; color:#000; background-color:#000;">

# 12.2 Overview of the Twitter APIs 
* **Web services** are **methods** that you call in the **cloud**
* Each **method** has a **web service endpoint** represented by a **URL**
* **Caution**: **apps can be brittle**
    * Internet connections can be lost, services can change, some services not available everywhere, ... 
* [Twitter API categories, subcategories and individual methods](https://developer.twitter.com/en/docs/api-reference-index.html)

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### Rate Limits and Restrictions
* Twitter expects developers to **use its services responsibly**
* **Understand rate limits** before using any method or you could get **blocked**
* Some methods list both **user rate limits** and **app rate limits**
    * We use **app rate limits** in the demos
    * **User rate limits** are for apps in which individuals log into their own Twitter accounts
    * [Details on rate limiting](https://developer.twitter.com/en/docs/basics/rate-limiting)
    * [Specific rate limits on individual API methods](https://developer.twitter.com/en/docs/basics/rate-limits) — also see each API method’s docs
* **Follow Twitter’s rules/regulations or your developer account could be terminated.** 
	* [Terms of Service](https://twitter.com/tos), [Developer Agreement](https://developer.twitter.com/en/developer-terms/agreement-and-policy.html), [Developer Policy](https://developer.twitter.com/en/developer-terms/policy.html), [Other restrictions](https://developer.twitter.com/en/developer-terms/more-on-restricted-use-cases)

<hr style="height:2px; border:none; color:#000; background-color:#000;">

# 12.4 Getting Twitter Credentials—Creating an App 
* [Apply for a developer account](https://developer.twitter.com/en/apply-for-access) to use the APIs
* Must get **credentials** to use Twitter APIs
    * Part of the **OAuth 2.0 authentication process**
    * **Tweepy handles OAuth 2.0 authentication details for you**
* To get credentials, you’ll [**create an app**](https://developer.twitter.com) 
    * **Each app has separate credentials**
    * I present **details on creating apps** in my [**Python Fundamentals LiveLessons videos for Section 12.4**](https://learning.oreilly.com/videos/python-fundamentals/9780135917411/9780135917411-PFLL_Lesson12_04) and [**Python for Programmers Section 12.4**](https://learning.oreilly.com/library/view/Python+for+Programmers,+First+Edition/9780135231364/ch12.xhtml#ch12lev1sec4)

<hr style="height:2px; border:none; color:#000; background-color:#000;">

# 12.5 What’s in a Tweet? 
* Twitter API returns **JSON (JavaScript Object Notation)** objects
* Text-based, human and computer readable
* Like Python **dictionaries**
* **JSON object format** &mdash; **all strings in double quotes (")**
>```python
{propertyName1: value1, propertyName2: value2}
```
* **JSON array format (like Python list)**:
>```python
[value1, value2, value3]
```
* **Tweepy handles the JSON for you** behind the scenes
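
As a rough illustration of the JSON-to-Python correspondence described above, the sketch below parses a small JSON string with the standard library's `json` module. The string is a hand-abbreviated fragment modeled on a tweet object, not a complete API response. Tweepy performs an equivalent conversion for you.

```python
import json

# Abbreviated, hand-written JSON modeled on a tweet object (illustrative only)
tweet_json = '''
{"id": 1037404890354606082,
 "truncated": true,
 "place": null,
 "entities": {"hashtags": [{"text": "SolarProbe"}], "symbols": []}}
'''

tweet = json.loads(tweet_json)  # JSON object -> dict, JSON array -> list

print(type(tweet).__name__)                          # dict
print(type(tweet['entities']['hashtags']).__name__)  # list
print(tweet['truncated'], tweet['place'])            # JSON true/null become True/None
print(tweet['entities']['hashtags'][0]['text'])      # SolarProbe
```

Note that JSON's `true` and `null` arrive in Python as `True` and `None`, which is why the sample tweet later in this section shows Python-style values.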

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### Key Properties of a Tweet Object 
* **Tweet object** contains **metadata**, including
    * **text** of the tweet
    * **extended tweet** for tweets up to **280 characters**
    * **when** it was created
    * **who** created it
    * lists of **hashtags**, **URLs**, **`@`-mentions**, **images**, **videos** and more
* [Our table of many key **tweet metadata attributes**](https://learning.oreilly.com/library/view/python-for-programmers/9780135231364/ch12.xhtml#ch12lev1sec5)
* [Complete list of the tweet object attributes](https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/tweet-object.html)
* [General overview of all the JSON objects that Twitter APIs return, and links to the specific object details](https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/intro-to-tweet-json)

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### Sample Tweet JSON
* **Sample JSON** for this tweet from the **`@nasa`** account: 
>```
@NoFear1075 Great question, Anthony! Throughout its seven-year 
mission, our Parker #SolarProbe spacecraft... https://t.co/xKd6ym8waT
```

<hr style="height:2px; border:none; color:#000; background-color:#000;">

```python
{'created_at': 'Wed Sep 05 18:19:34 +0000 2018',
 'id': 1037404890354606082,
 'id_str': '1037404890354606082',
 'text': '@NoFear1075 Great question, Anthony! Throughout its seven-year 
          mission, our Parker #SolarProbe spacecraft… https://t.co/xKd6ym8waT',
 'truncated': True,
 'entities': {'hashtags': [{'text': 'SolarProbe', 'indices': [84, 95]}],
    'symbols': [],
    'user_mentions': [{'screen_name': 'NoFear1075',
        'name': 'Anthony Perrone',
        'id': 284339791,
        'id_str': '284339791',
        'indices': [0, 11]}],
    'urls': [{'url': 'https://t.co/xKd6ym8waT',
        'expanded_url': 'https://twitter.com/i/web/status/1037404890354606082',
        'display_url': 'twitter.com/i/web/status/1…',
        'indices': [117, 140]}]},
 'source': '<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>',
```

```python
 'in_reply_to_status_id': 1037390542424956928,
 'in_reply_to_status_id_str': '1037390542424956928',
 'in_reply_to_user_id': 284339791,
 'in_reply_to_user_id_str': '284339791',
 'in_reply_to_screen_name': 'NoFear1075',
 'user': {'id': 11348282,
    'id_str': '11348282',
    'name': 'NASA',
    'screen_name': 'NASA',
    'location': '',
    'description': 'Explore the universe and discover our home planet with 
            @NASA. We usually post in EST (UTC-5)',
    'url': 'https://t.co/TcEE6NS8nD',
    'entities': {'url': {'urls': [{'url': 'https://t.co/TcEE6NS8nD',
            'expanded_url': 'http://www.nasa.gov',
            'display_url': 'nasa.gov',
            'indices': [0, 23]}]},
    'description': {'urls': []}},
    'protected': False,
    'followers_count': 29486081,
    'friends_count': 287,
    'listed_count': 91928,
    'created_at': 'Wed Dec 19 20:20:32 +0000 2007',
    'favourites_count': 3963,
    'time_zone': None,
    'geo_enabled': False,
    'verified': True,
    'statuses_count': 53147,
    'lang': 'en',
```

```python
    'contributors_enabled': False,
    'is_translator': False,
    'is_translation_enabled': False,
    'profile_background_color': '000000',
    'profile_background_image_url': 'http://abs.twimg.com/images/themes/theme1/bg.png',
    'profile_background_image_url_https': 'https://abs.twimg.com/images/themes/theme1/bg.png',
    'profile_image_url': 'http://pbs.twimg.com/profile_images/188302352/nasalogo_twitter_normal.jpg',
    'profile_image_url_https': 'https://pbs.twimg.com/profile_images/188302352/nasalogo_twitter_normal.jpg',
    'profile_banner_url': 'https://pbs.twimg.com/profile_banners/11348282/1535145490',
    'profile_link_color': '205BA7',
    'profile_sidebar_border_color': '000000',
    'profile_sidebar_fill_color': 'F3F2F2',
    'profile_text_color': '000000',
    'profile_use_background_image': True,
    'has_extended_profile': True,
    'default_profile': False,
    'default_profile_image': False,
    'following': True,
    'follow_request_sent': False,
    'notifications': False,
    'translator_type': 'regular'},
 'geo': None,
 'coordinates': None,
 'place': None,
 'contributors': None,
 'is_quote_status': False,
 'retweet_count': 7,
 'favorite_count': 19,
 'favorited': False,
 'retweeted': False,
 'possibly_sensitive': False,
 'lang': 'en'}

```

<hr style="height:2px; border:none; color:#000; background-color:#000;">

# 12.6 Tweepy
* [**Tweepy library**](http://www.tweepy.org/)—**one of the most popular Python Twitter clients**
> `pip install "tweepy>=3.7"`
* Easy access to Twitter’s capabilities
* [Tweepy’s documentation](http://docs.tweepy.org/en/latest/)
* One function in `tweetutilities.py` file depends on [**geopy**](https://github.com/geopy/geopy) (used later to **plot tweet locations**)
>`conda install -c conda-forge geopy`

<hr style="height:2px; border:none; color:#000; background-color:#000;">

# 12.7 Authenticating with Twitter Via Tweepy
* **Authentication API**—Authenticate with your **Twitter credentials** to use other Twitter APIs
* **`keys.py`** must contain your credentials

In [2]:
import tweepy

In [3]:
import keys  

In [4]:
auth = tweepy.OAuthHandler(keys.consumer_key,
                           keys.consumer_secret)

In [5]:
auth.set_access_token(keys.access_token,
                      keys.access_token_secret)

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### Creating the Tweepy API Object
* A **Tweepy `API` object** is your **gateway** to Twitter APIs

* **CHANGE:** Tweepy 4.0 removed the `wait_on_rate_limit_notify` parameter; Tweepy now always logs a warning when it waits on a rate limit

In [6]:
api = tweepy.API(auth, wait_on_rate_limit=True)

* **`auth`** is the **`OAuthHandler`**
* **`wait_on_rate_limit=True`** &mdash; **wait 15 minutes** when app reaches an API method’s rate limit
    * prevents violations
* **`wait_on_rate_limit_notify=True`** (Tweepy versions before 4.0 only) &mdash; **display a command-line message** when you hit a rate limit
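
Conceptually, waiting on a rate limit amounts to sleeping until the current rate-limit window resets. The helper below is a hypothetical sketch of that calculation, not Tweepy's actual implementation. It assumes the reset time is available as a Unix timestamp, which Twitter's v1.1 API reports in the `x-rate-limit-reset` response header.

```python
import time

def seconds_until_reset(reset_epoch, now=None):
    """Return how long to sleep until a rate-limit window resets.

    reset_epoch: Unix timestamp at which the window resets.
    now: current Unix time; defaults to time.time().
    """
    if now is None:
        now = time.time()
    return max(0, reset_epoch - now)  # never return a negative wait

# Hypothetical values: the window resets 90 seconds from "now"
print(seconds_until_reset(reset_epoch=1_000_090, now=1_000_000))  # 90
```

With `wait_on_rate_limit=True`, you do not write this logic yourself. The sketch only shows why a call may appear to pause for several minutes.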

<hr style="height:2px; border:none; color:#000; background-color:#000;">

# 12.8 Getting Information About a Twitter Account
* **Accounts and Users API**—Access information about an account
* `API` object’s **`get_user` method** returns a **`tweepy.models.User` object** for an account

* **CHANGE:** Now must specify parameter name `screen_name`

In [7]:
nasa = api.get_user(screen_name='nasa')

* Calls the **Twitter API’s [`users/show` method](https://developer.twitter.com/en/docs/accounts-and-users/follow-search-get-users/api-reference/get-users-show)**
* **`tweepy.models` classes** correspond to returned **JSON objects**
* **`User` class** corresponds to a Twitter [**user object**](https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/user-object)
* **`tweepy.models` classes** turn **JSON** into **Tweepy objects** 

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### Getting Basic Account Information for `@nasa`
* IDs and Twitter handles can be used to **track tweets to, from and about users**

In [8]:
nasa.id  # account ID created when the user joined Twitter

11348282

In [9]:
nasa.screen_name  # user’s Twitter handle

'NASA'

In [10]:
nasa.description  # description from the user’s profile

"There's space for everybody. ✨"

* Lots of additional attributes, like `name`, `description`, `followers_count`, `friends_count`, etc.

In [11]:
nasa.followers_count

49700022

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### Getting `@nasa`'s Most Recent Status Update
* `User` object’s **`status` property** returns a **`tweepy.models.Status`** object
* Corresponds to a Twitter [**tweet object**](https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/tweet-object)

In [12]:
nasa.status.text  # most recent tweet's text

'From darkness, light. Dark nebulae are clouds of gas and dust that block visible light coming from behind them. But… https://t.co/ysM7qJRfPq'

* **...** indicates **truncated** tweet text
* **`extended_tweet` property** for tweets between 141 and 280 characters (as of Nov. 2017) 
* **Retweeting** often results in truncation

<hr style="height:2px; border:none; color:#000; background-color:#000;">

# 12.9 Introduction to Tweepy Cursors: Getting an Account’s Followers 
* Twitter API methods often return collections of objects 
    * E.g., list of tweets that match specified search criteria 
* Each method has **maximum number of items returned by one call**
    * A **"page"** of results
* **JSON responses** indicate whether there are **more pages**
* A **`Cursor`** handles **paging** 
    * Invokes a method and **checks for more pages**
    * If so, **calls the method again**  
    * Continues until there are no more results to process
    * If `API` object configured to **wait on rate limits**, **`Cursor`s wait as needed**
* [Tweepy `Cursor` tutorial](http://docs.tweepy.org/en/latest/cursor_tutorial.html)
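
The paging loop described above can be sketched without Tweepy or a network connection. Everything here is a hypothetical stand-in: `fake_get_followers` simulates a paged API method over 45 made-up accounts, and `iterate_items` plays the role of a cursor's item iteration. The sketch borrows the convention of Twitter's cursored endpoints (start at cursor `-1`, stop when the next cursor is `0`), but it is not how Tweepy's `Cursor` is implemented.

```python
# Hypothetical stand-in for a paged API method: 45 fake followers
FAKE_FOLLOWERS = [f'user{i}' for i in range(45)]

def fake_get_followers(cursor=-1, count=20):
    """Return one page of results plus the cursor for the next page.

    A next cursor of 0 means there are no more pages.
    """
    start = 0 if cursor == -1 else cursor
    page = FAKE_FOLLOWERS[start:start + count]
    next_cursor = start + count if start + count < len(FAKE_FOLLOWERS) else 0
    return page, next_cursor

def iterate_items(method, limit=None, **kwargs):
    """Yield items page by page, stopping at limit or when pages run out."""
    cursor, yielded = -1, 0
    while cursor != 0:
        page, cursor = method(cursor=cursor, **kwargs)  # fetch the next page
        for item in page:
            if limit is not None and yielded >= limit:
                return
            yield item
            yielded += 1

first_30 = list(iterate_items(fake_get_followers, limit=30))
print(len(first_30), first_30[0], first_30[-1])  # 30 user0 user29
```

Requesting 30 items with 20 per page takes two calls to the paged method. Asking for larger pages (here, a bigger `count`) reduces the number of calls, which matters when each call counts against a rate limit.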

<hr style="height:2px; border:none; color:#000; background-color:#000;">

## 12.9.1 Determining an Account’s Followers Via the `API` Object’s **`get_followers` Method**
* Calls Twitter’s [**followers/list** method](https://developer.twitter.com/en/docs/accounts-and-users/follow-search-get-users/api-reference/get-followers-list.html)
* Returns groups of 20 by default, but can request up to 200 
* For demonstration purposes, we’ll grab 100 of NASA’s followers

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### Creating a Cursor That Will Call the `get_followers` Method for NASA’s Account

In [13]:
followers = []  # for storing followers' User objects

**CHANGE:** `api.followers` is now `api.get_followers`

In [14]:
cursor = tweepy.Cursor(api.get_followers, screen_name='nasa')

* First argument is **name of Tweepy method to call**
* Additional keyword arguments are passed to method named in first argument

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### Get Results from `Cursor` and Display Them
* Cursor’s **`items` method** calls `api.get_followers` and returns that method’s results

In [15]:
for account in cursor.items(100):  # request only 100 results
    followers.append(account.screen_name)

In [16]:
followers

['Amyaduvanshi22',
 'ShaoboDai',
 'Coranusss',
 'qiongqian1',
 'EdnaVallecilla',
 'iniriama',
 'VThetha',
 'slavacamarada',
 'LuisBur17133698',
 'Liviu_Ilie_',
 'Becki66547121',
 'Parimaldas123',
 'mer63345480',
 'Vinnny_Bomber07',
 'bilskirogers99',
 'bettershuttaup',
 'jonathans0718',
 '15kevilx',
 'kinismoko',
 'bhut_smit2',
 'BbyLibster',
 'khir11111',
 'xiaoluban233',
 'AyeAkalan5',
 'Nelita24478866',
 'IsabelsophiaTo3',
 'Chaosiick',
 'Chukwuma10955',
 'rustam_afzaal',
 'gombert_alix',
 'hotpoison69',
 'moozaa75',
 'PutihbinMas5',
 'IsabelaDeea',
 'syed_minhaj_',
 'TheJhvn',
 'Harshit79219525',
 'CrisMaria2021',
 '2jxjxjdckxkdk',
 'octobernorain',
 'diniiessss',
 'apolonia5k',
 'SeyeAbdoulaye16',
 'SachinK57615647',
 '_______target',
 'Cemt1903',
 'pia_belen_i',
 'Alejand03061632',
 'KatieRohrbacker',
 'traores93946846',
 'orangehopee',
 'kelli_stoffer',
 'Andrea43841356',
 'LakMitr',
 'ElmahiSallam',
 'R_eality_quotes',
 'Naresh22089948',
 'r2investiments',
 'huamulinsen11',
 'S

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### Automatic Paging
* To **get up to 200 followers at a time**, create the `Cursor` with the **`count`** keyword argument
>```python
cursor = tweepy.Cursor(api.get_followers, screen_name='nasa', count=200)
```
* Calling `Cursor` method **`items`** with no argument attempts to get **all followers**
    * Could **take significant time** due to **rate limits**

<hr style="height:2px; border:none; color:#000; background-color:#000;">

# 12.10 Searching Recent Tweets with Tweepy `API` Method **`search_tweets`** 
* **Tweets API**—Search **past 7 days' tweets**, access **live tweet streams** and more
* Returns tweets that **match a query string**
* Only for the **previous seven days’ tweets**
* **Not guaranteed to return all matching tweets**
* Calls Twitter’s **`search/tweets` method**
* Returns 15 tweets at a time by default, but can return up to 100

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### Utility Function `print_tweets` from `tweetutilities.py` 

In [17]:
from tweetutilities import print_tweets

```python
from textblob import TextBlob

def print_tweets(tweets):
    """For each Tweepy Status object in tweets, display the 
    user's screen_name and tweet text. If the language is not
    English, translate the text with TextBlob."""
    for tweet in tweets:
        # the screen name belongs to the Status object's user
        print(f'{tweet.user.screen_name}:', end=' ')
    
        if 'en' in tweet.lang:
            print(f'{tweet.text}\n')
        elif 'und' not in tweet.lang:  # translate to English first
            print(f'\n  ORIGINAL: {tweet.text}')
            print(f'TRANSLATED: {TextBlob(tweet.text).translate()}\n')
```

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### Searching for Specific Words
* **`q` keyword argument** specifies the **query string**
* Should use **`Cursor`** for more than max results

* **NOTE:** To search only English tweets (for example, to avoid a known TextBlob translation issue), add `lang='en'` to the call below
* **CHANGE:** `api.search` is now `api.search_tweets`

In [18]:
tweets = api.search_tweets(q='football', count=10)  # optionally add lang='en'

In [19]:
print_tweets(tweets)

MindnPen: RT @AfricaFactsZone: South Africa, Zimbabwe and Namibia will host the 2027 Cricket World Cup.

South Africa also hosted the 1995 Rugby Worl…

_ksa_football_: 
  ORIGINAL: #المنتخب_السعودي يعود من فيتنام بـ3 نقاط مهمة في المرحله الاصعب من التصفيات النهائية لـ كأس العالم قطر 2022
الف مبر… https://t.co/NHdTjE0tni
TRANSLATED: The Saudi national team returns from Vietnam with 3 important points in the most difficult stage of the final qualifiers for the World Cup Qatar 2022 A thousand just… https://t.co/NHdTjE0tni

GarretDuncan: RT @incredibleffpod: The one and only ⁦@JustinWiggins_⁩ joins the pod this week to talk Chiefs and his 8-1 (now 9-1) start to the fantasy s…

Mufc1985123: @soccer_terra @Blue_Eyed_Socio @FUTWIZ It's football not soccer, stick to your eggball

ayrdrianuuh: RT @JeffKrisko: Since NFL football sucked today, I decided to look at every head coach and decide what they would be if football didn't exi…

Sammy_Jr47: RT @charles_watts: Lauren: "I couldn’t be more exc

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### Searching with Twitter Search Operators (1 of 2)
* Use **Twitter search operators** to refine search results
* The following table shows several Twitter search operators. 
* [For all the operators, click the `operators` link here](https://twitter.com/search-home)
* Example:
>```python
tweets = api.search_tweets(q='from:nasa since:2019-11-15', count=3)
```

| Example&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;	| Finds tweets containing
| :---	| :---
| `python twitter` 	| **Implicit _logical and_ operator**—Finds tweets containing `python` _and_ `twitter`.
| `python OR twitter` 	| **Logical `OR` operator**—Finds tweets containing `python` or `twitter` or both.
| `python ?` 	| **`?` (question mark)**—Finds tweets asking questions about `python`.
| `planets -mars` 	| **`-` (minus sign)**—Finds tweets containing `planets` but not `mars`.
| `python :)` 	| **`:)` (happy face)**—Finds **positive sentiment** tweets containing `python`.
| `python :(` 	| **`:(` (sad face)**—Finds **negative sentiment** tweets containing `python`.
| `since:2018-09-01` 	| Finds tweets **on or after** the specified date, which must be in the form **`YYYY-MM-DD`**.
| `near:"New York City" `	| **Finds tweets that were sent near `"New York City"`**.
| `from:nasa` 	| **Finds tweets from the account `@nasa`**.
| `to:nasa` 	| **Finds tweets to the account `@nasa`**.
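
Because a query is just a string, the operators in the table can be combined programmatically. The helper below is a hypothetical convenience function (`build_query` is not part of Tweepy) that assembles a query string from a few of the operators shown above. The result would be passed as the `q` argument.

```python
def build_query(*terms, from_account=None, exclude=(), since=None):
    """Combine search terms and operators into one query string."""
    parts = list(terms)                          # implicit logical AND
    parts += [f'-{word}' for word in exclude]    # minus sign excludes a word
    if from_account:
        parts.append(f'from:{from_account}')
    if since:
        parts.append(f'since:{since}')           # date must be YYYY-MM-DD
    return ' '.join(parts)

print(build_query('planets', exclude=['mars']))
# planets -mars
print(build_query(from_account='nasa', since='2019-11-15'))
# from:nasa since:2019-11-15
```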

<hr style="height:2px; border:none; color:#000; background-color:#000;">

# 12.11 Spotting Trends: Twitter Trends API
* **“Going viral”** &mdash; thousands or millions of people tweeting at once 
* Twitter maintains a list of **trending topics** worldwide 
* **Twitter Trends API** can return lists of **trending-topic locations** and the **top 50 trending topics** for each **location**

<hr style="height:2px; border:none; color:#000; background-color:#000;">

## 12.11.1 Places with Trending Topics 
* See how to find places with trending topics: https://learning.oreilly.com/videos/python-fundamentals/9780135917411/9780135917411-PFLL_Lesson12_15

<hr style="height:2px; border:none; color:#000; background-color:#000;">

## 12.11.2 Getting a List of Trending Topics with the Tweepy `API`’s **`get_place_trends` Method** 
* Calls **Twitter Trends API’s [`trends/place` method](https://developer.twitter.com/en/docs/trends/trends-for-location/api-reference/get-trends-place)**
* Returns top **50 trending topics for the location**
* [Look up WOEIDs](http://www.woeidlookup.com) (**Yahoo! Where on Earth IDs**)
* Look up WOEIDs programmatically using **Yahoo!’s web services** via [Python libraries like `woeid`](https://github.com/Ray-SunR/woeid)

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### Get Today's Worldwide Trending Topics (1 of 3)

* **CHANGE:** `api.trends_place` is now `api.get_place_trends`

In [20]:
world_trends = api.get_place_trends(id=1)  # list containing one dictionary

* **`'trends'` key** refers to a **list of dictionaries representing each trend**

In [21]:
trends_list = world_trends[0]['trends']

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### Get Today's Worldwide Trending Topics (2 of 3)
* Each trend has **`name`**, **`url`**, **`promoted_content`** (whether it's an **advertisement**), **`query`** and **`tweet_volume`** keys

In [22]:
trends_list[0]

{'name': 'jongdae',
 'url': 'http://twitter.com/search?q=jongdae',
 'promoted_content': None,
 'query': 'jongdae',
 'tweet_volume': 134563}

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### Get Today's Worldwide Trending Topics (3 of 3)
* For **trends with more than 10,000 tweets**, the **`tweet_volume`** is the **number of tweets**; otherwise, it’s `None`
* Filter the list so that it contains only trends with more than 10,000 tweets:

In [23]:
trends_list = [t for t in trends_list if t['tweet_volume']]

In [24]:
from operator import itemgetter 

In [25]:
trends_list.sort(key=itemgetter('tweet_volume'), reverse=True) 

In [26]:
for trend in trends_list:  # show top trending topics
    print(trend['name'])

Chen
#Chat_with_Kep1er
Cardio
Naruto
jongdae
#السعوديه_فيتنام
#HBDchopper
#ว่าแต่เอ็งใครวะ
まんばちゃん
JEONGHYO SEULGI UNZIPPED
Bazinga Behind The Scene
Zhou
#ゲームスタイル性格診断
スープカレー
ゲームのプレイスタイル
sorn
Alfa Romeo
Azeem Rafiq
バルミューダ
ゲーム性
肥前くん
Arctic Monkeys
パンダタイプ
Juana Rivas
#レコメンSexyZoneSP_DAY2
Dolar 10.20
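
The filter-then-sort steps above can be wrapped in a small reusable function. The sketch below runs on made-up data in the same shape as the `'trends'` list; `top_trends` and the sample topics are hypothetical, not part of Tweepy or of the results shown above.

```python
from operator import itemgetter

# Hypothetical sample in the same shape as the 'trends' list
sample_trends = [
    {'name': 'TopicA', 'tweet_volume': 25000},
    {'name': 'TopicB', 'tweet_volume': None},  # volume not reported
    {'name': 'TopicC', 'tweet_volume': 134563},
]

def top_trends(trends, n=10):
    """Return (name, volume) pairs for the n highest-volume trends."""
    with_volume = [t for t in trends if t['tweet_volume']]  # drop None
    ranked = sorted(with_volume, key=itemgetter('tweet_volume'), reverse=True)
    return [(t['name'], t['tweet_volume']) for t in ranked[:n]]

for name, volume in top_trends(sample_trends):
    print(f'{name:<10}{volume:>10,}')
```

Unlike `list.sort`, `sorted` returns a new list, so the original trends list is left unchanged.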


<hr style="height:2px; border:none; color:#000; background-color:#000;">

# 12.12 Cleaning/Preprocessing Tweets for Analysis
* **Data cleaning** is one of data scientists' most common tasks 
* Some **NLP tasks** for **normalizing tweets**
    * Converting text to **same case**
    * Removing **`#` from hashtags**, **`@`-mentions**, **duplicates**, **hashtags**
    * Removing **excess whitespace**, **punctuation**, **stop words**, **URLs**
    * Removing tweet keywords **`RT`** (retweet) and **`FAV`** (favorite) 
    * **Stemming** and **lemmatization**
    * **Tokenization**

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### [**tweet-preprocessor**](https://github.com/s/preprocessor) Library and TextBlob Utility Functions
* `pip install tweet-preprocessor`
* Can automatically remove any combination of:

| Option constant	| Option
| :---	| :---
| **`OPT.MENTION`** | @-Mentions (e.g., `@nasa`)	
| **`OPT.EMOJI`** | Emoji	
| **`OPT.HASHTAG`** | Hashtag (e.g., `#mars`)	
| **`OPT.NUMBER`** | Number	
| **`OPT.RESERVED`** | Reserved Words (`RT` and `FAV`)	
| **`OPT.SMILEY`** | Smiley	
| **`OPT.URL`** | URL	

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### Cleaning a Tweet Containing a Reserved Word and a URL
* The tweet-preprocessor library’s module name is **`preprocessor`** 

In [27]:
import preprocessor as p  # p recommended by docs

In [28]:
p.set_options(p.OPT.URL, p.OPT.RESERVED)  # specify what to clean

In [29]:
tweet_text = 'RT A sample retweet with a URL https://nasa.gov'

In [30]:
p.clean(tweet_text)

'A sample retweet with a URL'
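
To make the cleaning step less of a black box, here is a minimal regular-expression sketch that removes URLs and the reserved words `RT` and `FAV`. It is a simplified stand-in written for illustration, not the tweet-preprocessor library's implementation, and its patterns are assumptions that will miss cases the library handles.

```python
import re

URL_PATTERN = re.compile(r'https?://\S+')         # http or https URLs
RESERVED_PATTERN = re.compile(r'\b(?:RT|FAV)\b')  # uppercase reserved words only

def clean_tweet(text):
    """Remove URLs and reserved words, then collapse extra whitespace."""
    text = URL_PATTERN.sub('', text)
    text = RESERVED_PATTERN.sub('', text)
    return ' '.join(text.split())

print(clean_tweet('RT A sample retweet with a URL https://nasa.gov'))
# A sample retweet with a URL
```

The word boundaries (`\b`) and case-sensitive match keep the pattern from touching words such as `retweet`.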

<hr style="height:2px; border:none; color:#000; background-color:#000;">

# 12.13 Twitter Streaming API
* Streams **randomly selected** live tweets up to a **maximum of 1% of the tweets per day**
* According to https://InternetLiveStats.com
    * **~8500 tweets per second**
    * Nearly **750 million tweets per day**
* So Streaming API gives you **free access to approximately 7.5 million tweets/day**

<hr style="height:2px; border:none; color:#000; background-color:#000;">

## 12.13.1 Creating a Subclass of `StreamListener` 
* A stream **pushes** tweets to your app via a **persistent connection** 
* **Streaming rate** varies tremendously, based on search criteria
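* Subclass of Tweepy’s **`StreamListener`** listens for tweets
    * **CHANGE:** Tweepy 4.0 merged `StreamListener` into **`Stream`**, so you now subclass `tweepy.Stream`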
    * **Notified** when each **new tweet** or other **Twitter message** **arrives**
    * Each message results in a call to a **`StreamListener` method**
    * **Override** only the **methods you need**


<hr style="height:2px; border:none; color:#000; background-color:#000;">

### `StreamListener` Methods
* [`StreamListener` methods](https://github.com/tweepy/tweepy/blob/master/tweepy/streaming.py)  

| Method&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;| Called when...
| :---	| :---
| **`on_connect(self)`** 	| App **successfully connects** to the Twitter stream. This is for statements that should execute only if your app is connected to the stream.
| **`on_status(self, status)`** 	| A **tweet arrives**—**`status`** is a Tweepy **`Status`** object.
| **`on_data(self, data)`** 	| A **tweet arrives**—**`data`** is the **raw JSON** of a Twitter status (tweet).
| **`on_limit(self, track)`** 	| A **limit notice** arrives. This occurs if your search matches more tweets than Twitter can deliver based on its current streaming rate limits. In this case, the limit notice contains the number of matching tweets that could not be delivered.
| **`on_error(self, status_code)`** 	| An **error code** arrives. 
| **`on_timeout(self)`** 	| The **connection times out**—that is, the Twitter server is not responding.
| **`on_warning(self, notice)`** 	| **Twitter sends a disconnect warning** to indicate that the connection might be closed. For example, Twitter maintains a queue of the tweets it’s pushing to your app. If the app does not read the tweets fast enough, `on_warning`’s notice argument will contain a warning message indicating that the connection will terminate if the queue becomes full. 

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### Class `TweetListener` Defined in `tweetlistener.py`
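
The source of `tweetlistener.py` is not reproduced in this notebook. As a rough, Tweepy-free sketch of the pattern such a class relies on (handle each arriving tweet, count it, and stop at a limit), consider the following. `LimitedListener`, `FakeStatus` and `run_stream` are hypothetical names used only for illustration, and the limit-counting behavior is an assumption based on the `limit` argument used later in this section.

```python
from collections import namedtuple

# Hypothetical stand-in for a Tweepy Status object
FakeStatus = namedtuple('FakeStatus', ['screen_name', 'lang', 'text'])

class LimitedListener:
    """Processes tweets until a specified limit is reached."""

    def __init__(self, limit=10):
        self.tweet_count = 0
        self.TWEET_LIMIT = limit

    def on_connect(self):
        """Called once when the (simulated) connection succeeds."""
        print('Connection successful\n')

    def on_status(self, status):
        """Handle one tweet; return False once the limit is reached."""
        self.tweet_count += 1
        print(f'Screen name: {status.screen_name}')
        print(f'   Language: {status.lang}')
        print(f'     Status: {status.text}\n')
        return self.tweet_count < self.TWEET_LIMIT

def run_stream(listener, statuses):
    """Simulate a stream pushing tweets to the listener."""
    listener.on_connect()
    for status in statuses:
        if not listener.on_status(status):
            break  # a real listener would disconnect from the stream here

fake_tweets = [FakeStatus(f'user{i}', 'en', f'football tweet {i}')
               for i in range(10)]
listener = LimitedListener(limit=3)
run_stream(listener, fake_tweets)
print(listener.tweet_count)  # 3
```

Only the methods the app needs are defined, which mirrors the advice to override only the methods you need.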


<hr style="height:2px; border:none; color:#000; background-color:#000;">

## 12.13.2 Initiating Stream Processing

### Subclass of `tweepy.Stream` Receives Tweets
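* **CHANGE:** Previously, the listener subclass was initialized with the `API` object and registered as a listener with a separate `Stream` object; now `TweetListener` subclasses `tweepy.Stream` and receives the credentials directly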

In [31]:
from tweetlistener import TweetListener

In [32]:
tweet_listener = TweetListener(keys.consumer_key, keys.consumer_secret,
                               keys.access_token, keys.access_token_secret,
                               limit=5)

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### Creating a Tweepy **`Stream`** Object to Manage the Connection to the Twitter Stream
* Passes the messages to your `TweetListener` 

In [33]:
# OLD: tweet_stream = tweepy.Stream(auth=api.auth, listener=tweet_listener)

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### Starting the Tweet Stream with the `Stream` Object’s **`filter` Method** 
* `track` parameter specifies a list of search terms
* [Other `filter` method parameters](https://developer.twitter.com/en/docs/tweets/filter-realtime/guides/basic-stream-parameters) for refining your tweet searches
* Streaming API returns full tweet **JSON objects** for tweets that match any of the terms, **not just in the tweet’s text, but also in @-mentions, hashtags, expanded URLs and other information**  
* Might not see search terms in **tweets' text**

In [34]:
# OLD: tweet_stream.filter(track=['football'])  #, is_async=True) 

In [35]:
tweet_listener.filter(track=['football'])

Connection successful

Screen name: Yubiieeee:
   Language: en
     Status: RT @Nigerianscamsss: You mean European international tournaments?
We’re having fun in Africa, South America and CONCACAF

Screen name: DapsMcfc:
   Language: en
     Status: @DonHusam6 @MaybeDoubleA @TunnelTrafford @TyroneMc__ @Djrbiz Point 1: Liverpool (the city not club) will always be rivals, it’s deeper than football…

Point 2: admittedly last year you weren’t our rivals however the previous 2 years we went head to head - that makes us rivals

Point 3: if we ain’t rivals don’t smash up our team coach

Screen name: pompeylive:
   Language: en
     Status: RT @pn_neil_allen: The former #Pompey pair shared a classroom with Roy Keane and Tony Adams. And are doing rather well at Privett Park this…

Screen name: Lns_CFC:
   Language: fr
     Status: RT @B19_WOOOO: La seul nation africaine qui peut titiller les grandes nations européennes et américaines.
 Translated: RT @ B19_WOOOO: The only African nation that ca

Stream connection closed by Twitter


Screen name: Sammy_Jr47:
   Language: en
     Status: RT @charles_watts: Lauren: "I couldn’t be more excited to be taking up the role. I feel proud to be joining the FIFA family and there’s als…



<hr style="height:2px; border:none; color:#000; background-color:#000;">


### Asynchronous vs. Synchronous Streams
* **`is_async=True`** &mdash; would initiate an **asynchronous tweet stream** 
* Without `is_async=True`, the stream is **synchronous** and the next In [] prompt appears **after the stream terminates**    
* in Jupyter, we used a **synchronous stream** ensure that all tweets display
* In IPython, can terminate an asynchronous tweet stream early:  
>`tweet_stream.running=False`    
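
The difference between the two modes can be illustrated without a Twitter connection. The sketch below uses a hypothetical `FakeStream` class (not part of Tweepy) whose `filter` method either blocks until it finishes or hands the work to a background thread; the `running` flag, `limit` parameter and `thread` attribute are assumptions made for this illustration only.

```python
import threading
import time


class FakeStream:
    """Hypothetical stand-in for a tweet stream; not a Tweepy class."""

    def __init__(self):
        self.running = False
        self.received = 0
        self.thread = None

    def _run(self, limit):
        # pretend a tweet arrives every 10 ms until stopped or limit reached
        while self.running and (limit is None or self.received < limit):
            self.received += 1
            time.sleep(0.01)
        self.running = False

    def filter(self, is_async=False, limit=None):
        self.running = True
        if is_async:  # return immediately; work continues in the background
            self.thread = threading.Thread(target=self._run, args=(limit,),
                                           daemon=True)
            self.thread.start()
        else:  # block the caller until the stream ends
            self._run(limit)


sync_stream = FakeStream()
sync_stream.filter(limit=3)  # does not return until 3 "tweets" arrive
print('synchronous stream received', sync_stream.received)

async_stream = FakeStream()
async_stream.filter(is_async=True)  # returns right away
time.sleep(0.05)  # the caller is free to do other work here
async_stream.running = False  # ask the background loop to stop
async_stream.thread.join()
print('asynchronous stream stopped after', async_stream.received)
```

In the synchronous case nothing after the `filter` call runs until the stream ends, which is why the notebook's next prompt appears only after the stream terminates.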


<hr style="height:2px; border:none; color:#000; background-color:#000;">

# 12.14 Tweet Sentiment Analysis 
* Political researchers might use it to understand **how people are likely to vote**
* Companies might use it to see what people are saying about **their products** and **competitors’ products**
* Script **`sentimentlistener.py`** checks **sentiment** on a specified topic for a specified number of tweets
* The script in this example is substantially the same as the previous example, but uses TextBlob to check the sentiment of each tweet as we did in the NLP presentation
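
The tallying step can be sketched independently of Twitter and TextBlob. TextBlob reports a polarity score between -1.0 and 1.0. The sketch below assumes the sign of that score decides the category, which may differ from the thresholds `sentimentlistener.py` actually uses. The helper names `classify` and `tally` and the sample scores are hypothetical.

```python
def classify(polarity):
    """Map a polarity score in [-1.0, 1.0] to a display marker.
    Assumption: the sign alone decides the category."""
    if polarity > 0:
        return '+'
    if polarity < 0:
        return '-'
    return ' '


def tally(polarities):
    """Count how many scores fall into each sentiment category."""
    names = {'+': 'positive', ' ': 'neutral', '-': 'negative'}
    totals = {'positive': 0, 'neutral': 0, 'negative': 0}
    for polarity in polarities:
        totals[names[classify(polarity)]] += 1
    return totals


sample_scores = [0.5, 0.0, -0.25, 0.0]  # made-up polarity values
totals = tally(sample_scores)
for name, count in totals.items():
    print(f'{name.capitalize():>8}: {count}')
```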

In [36]:
run sentimentlistener.py football 10

  Mufc1985123: @soccer_terra @Blue_Eyed_Socio @FUTWIZ It's football not soccer, stick to your eggball

  Kitti02328304: eFootball™ CHAMPION SQUADS A football card game played with famous real-world players! Play with friends! iOS: Android:

+ InimfonUmoh4: @Pacificfccpl @CPLCavalryFC @onesoccer @TELUS Hello forum I am a very talented football from Nigeria... Can anyone help me with contacts to the club because I really want to join this club and I will surely bring improvement to the team if I get signed I would drop my CV together with my video clips if needed Thank you

  LocalCltNews: Justin Fuente out as Virginia Tech head football coach #coach #football #Fuente #Justin #tech #Virginia https:...

  calvinfox96: The expectations of Scottish football, a couldn’t even imagine being that shite

- coffeebreaks40: National Treasures Collegiate Football. All cards numbered to 99 or less. Get a random serial number for $9 @HobbyConnector

- SickoSports: @nickmaraldo @PatMcAfeeShow Not gonn

Stream connection closed by Twitter


  Smithead79: @FootyRustling They should auction their shirts with the money all going to grass roots football.

Tweet sentiment for "football"
Positive: 1
 Neutral: 7
Negative: 2


<hr style="height:2px; border:none; color:#000; background-color:#000;">

# 12.15 Geocoding and Mapping
* Collect **streaming tweets**, then **plot** their **locations** on an **interactive map**
* **Twitter disables precise location info (latitude/longitude) by default** (users must opt in to allowing Twitter to track locations) 
* A large percentage of tweets include the user’s home location information
    * Sometimes invalid or fictitious 
* **Map markers** will show `location` from each tweet’s `User` object

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### [**geopy** library](https://github.com/geopy/geopy)
* **Geocoding**&mdash;translate locations into **latitude** and **longitude**
* **geopy** supports dozens of **geocoding web services**, many with **free or lite tiers**
* We’ll use **OpenMapQuest geocoding service** 

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### OpenMapQuest Geocoding API (1 of 2)
* Convert locations, such as **Boston, MA** into their **latitudes** and **longitudes**, such as **42.3602534** and **-71.0582912**, for plotting on maps
* Currently allows **15,000 transactions per month** on their free tier
* [Sign up](https://developer.mapquest.com/)
* For a presentation of **signing up** and **getting your credentials**, see my [**Python Fundamentals LiveLessons video**](https://learning.oreilly.com/videos/python-fundamentals/9780135917411/9780135917411-PFLL_Lesson12_24) or the [**beginning of Section 12.15 in Python for Programmers**](https://learning.oreilly.com/library/view/python-for-programmers/9780135231364/ch12.xhtml#ch12lev1sec15)

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### [**folium library**](https://github.com/python-visualization/folium) and Leaflet.js JavaScript Mapping Library
* Uses **Leaflet.js JavaScript mapping library** to display maps in a web page 
* Folium can output **HTML documents** for viewing in a **web browser**
* `pip install folium`

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### Maps from OpenStreetMap.org
By default, **Leaflet.js** uses **open source maps** from **`OpenStreetMap.org`**
* To use these maps, **they require the following copyright notice**:

> `Map data © OpenStreetMap contributors`

* They also say: **You must make it clear that the data is available under the Open Database License. This can be achieved by providing a “License” or “Terms” link which links to https://www.openstreetmap.org/copyright or https://www.opendatacommons.org/licenses/odbl/index.html**

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### Collections Required By `LocationListener`
* Requires two collections
    * A **list (`tweets`)** to store the tweets we collect 
    * A **dictionary (`counts`)** to track the **total number of tweets** we collect and the **number that have location data**

In [37]:
tweets = [] 

In [38]:
counts = {'total_tweets': 0, 'locations': 0}

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### Creating the LocationListener 

In [39]:
from locationlistener import LocationListener

In [40]:
location_listener = LocationListener(keys.consumer_key, keys.consumer_secret, 
    keys.access_token, keys.access_token_secret, counts_dict=counts, 
    tweets_list=tweets, topic='football', limit=50)

* **`LocationListener`** uses our **utility function `get_tweet_content`** to place each tweet's **screen name**, **tweet text** and **location** into a **dictionary**

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### Configure and Start the `Stream` of Tweets

In [41]:
import tweepy

In [42]:
#stream = tweepy.Stream(auth=api.auth, listener=location_listener)

In [43]:
location_listener.filter(track=['football'], languages=['en'])

Stream connection closed by Twitter


PanAfricology: South African do u know that as a teenage Jacob Zuma, played football on Robben Island as a defender in the Makana Football Association (MFA) league while ANC &amp; PAC activists were imprisoned on that notorious Island.
#EFFPresser #DeKlerk #ANC #BLM #DeKlerkMemorialService #EFF https://t.co/BGx2wvbfm7



<hr style="height:2px; border:none; color:#000; background-color:#000;">

### Displaying the Location Statistics 

In [44]:
counts['total_tweets']

81

In [45]:
counts['locations']

50

In [46]:
print(f'{counts["locations"] / counts["total_tweets"]:.1%}')

61.7%


<hr style="height:2px; border:none; color:#000; background-color:#000;">

### Geocoding the Locations with Our `get_geocodes` Utility Function 
* **OpenMapQuest** geocoding service **times out** when it **cannot handle your request immediately**
* If so, **`get_geocodes`** **notifies** you, **waits**, then **retries** the request

In [47]:
from tweetutilities import get_geocodes

In [48]:
bad_locations = get_geocodes(tweets)

Getting coordinates for tweet locations...
HTTPSConnectionPool(host='open.mapquestapi.com', port=443): Max retries exceeded with url: /nominatim/v1/search?q=Parts+Unknown&format=json&limit=1&key=L0xOBy7HqIWsNgBvZvgVZx9HX1GSYzri (Caused by ReadTimeoutError("HTTPSConnectionPool(host='open.mapquestapi.com', port=443): Read timed out. (read timeout=1)"))
OpenMapQuest service timed out. Waiting.
HTTPSConnectionPool(host='open.mapquestapi.com', port=443): Max retries exceeded with url: /nominatim/v1/search?q=Jacksonville%2C+FL&format=json&limit=1&key=L0xOBy7HqIWsNgBvZvgVZx9HX1GSYzri (Caused by ReadTimeoutError("HTTPSConnectionPool(host='open.mapquestapi.com', port=443): Read timed out. (read timeout=1)"))
OpenMapQuest service timed out. Waiting.
Done geocoding


<hr style="height:2px; border:none; color:#000; background-color:#000;">

### Displaying the Bad Location Statistics

In [49]:
bad_locations

8

In [50]:
print(f'{bad_locations / counts["locations"]:.1%}')

16.0%


<hr style="height:2px; border:none; color:#000; background-color:#000;">

### Cleaning the Data with a pandas `DataFrame` Before Displaying the Data on a Map
* `DataFrame` will contain **`NaN`** for the **`latitude`** and **`longitude`** of any tweet that **did not have a valid location**
* Remove any such rows via `DataFrame`’s **`dropna` method** 

In [51]:
import pandas as pd

In [52]:
df = pd.DataFrame(tweets)

In [53]:
df

|  | screen_name | text | location | latitude | longitude
| :------ | :------ | :------ | :------ | :------ | :------
| 0 | alburr92 | Cam back to the Panthers, Fuente fired from th... | Charlotte, NC | 35.227087 | -80.843127
| 1 | St6ticz | reckon he hates football https://t.co/SmuirOZWof | Sunderland, England | 54.906379 | -1.375053
| 2 | Britta_Boehler | @_MelanieMartin @ojmason: Like football. Or hu... | Amsterdam/Cologne | NaN | NaN
| 3 | AnasAhmadTJ | @Dee11Fibre @PSGHaryy @jukeyisdead @goal You a... | Gombe \| Kano \| Lagos, Nigeria | NaN | NaN
| 4 | gabrieldud | @IJaSport Don't know if its needed, but would ... | Sao Paulo, Brazil | -23.550651 | -46.633382
| 5 | Schmidtburgh | @Ethan__Tremblay @ksufan97 @thegrant_ksu Don't... | Parts Unknown | 40.138216 | -79.837506
| 6 | MontanaBasque | Everyone in Montana saying FTC or GO Cats Go, ... | Montana | 47.375267 | -109.638758
| 7 | ardeerjoe | @itslegaltender @STVSport @BBCScotland @Scotti... | NORTH AYRSHIRE, Europe | NaN | NaN
| 8 | ItsFoxedUp | Kasper Schmeichel admits plans to leave Leices... | Les-tah | 27.671982 | -12.956325
| 9 | backinblack_wx | @IanBoatmanUGA Apparently they gave you the fa... | Mobile, AL | 30.694357 | -88.043054


In [54]:
df = df.dropna()

In [55]:
df

|  | screen_name | text | location | latitude | longitude
| :------ | :------ | :------ | :------ | :------ | :------
| 0 | alburr92 | Cam back to the Panthers, Fuente fired from th... | Charlotte, NC | 35.227087 | -80.843127
| 1 | St6ticz | reckon he hates football https://t.co/SmuirOZWof | Sunderland, England | 54.906379 | -1.375053
| 4 | gabrieldud | @IJaSport Don't know if its needed, but would ... | Sao Paulo, Brazil | -23.550651 | -46.633382
| 5 | Schmidtburgh | @Ethan__Tremblay @ksufan97 @thegrant_ksu Don't... | Parts Unknown | 40.138216 | -79.837506
| 6 | MontanaBasque | Everyone in Montana saying FTC or GO Cats Go, ... | Montana | 47.375267 | -109.638758
| 8 | ItsFoxedUp | Kasper Schmeichel admits plans to leave Leices... | Les-tah | 27.671982 | -12.956325
| 9 | backinblack_wx | @IanBoatmanUGA Apparently they gave you the fa... | Mobile, AL | 30.694357 | -88.043054
| 10 | InsqneMagic | @markgoldbridge Mark when you playing Football... | Cristiano Ronaldo | 32.698094 | -16.773876
| 11 | Football__Talks | @JeffreyHarharw2 @mikeparry8 @JPickford1 He is... | Lagos | 20.017111 | 103.378253
| 12 | perdricof | @whstancil this is fucking lucy-with-the-footb... | Long Beach, CA | 44.45677 | -78.73597
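
Calling `dropna` with no arguments removes a row if **any** column is missing. As a possible refinement, the sketch below limits the check to the coordinate columns with `dropna`'s `subset` parameter, so a tweet is kept whenever it can be plotted. The three sample rows are made up for illustration.

```python
import pandas as pd

rows = [
    {'screen_name': 'user_a', 'location': 'Boston, MA',
     'latitude': 42.36, 'longitude': -71.06},
    {'screen_name': 'user_b', 'location': 'Parts Unknown'},  # never geocoded
    {'screen_name': 'user_c', 'location': None,  # coordinates but no label
     'latitude': 30.69, 'longitude': -88.04},
]
frame = pd.DataFrame(rows)  # missing keys become NaN

# keep every row that has both coordinates, whatever else is missing
mappable = frame.dropna(subset=['latitude', 'longitude'])
print(len(frame), 'rows collected;', len(mappable), 'rows can be mapped')
```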


<hr style="height:2px; border:none; color:#000; background-color:#000;">

### Creating a Map with Folium

In [56]:
import folium

In [57]:
usmap = folium.Map(location=[39.8283, -98.5795],  # center of U.S.
                   tiles='Stamen Terrain',
                   zoom_start=4, detect_retina=True)

* **`location`** &mdash; sequence containing **latitude** and **longitude** of **map center point**
    * [Geographic center of the continental United States](http://bit.ly/CenterOfTheUS) 
* **`zoom_start`** &mdash; **map’s initial zoom level**
* **`detect_retina`** &mdash; enables folium to use **higher-resolution maps**

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### Creating Folium `Popup` Objects for the Tweet Locations
* **`itertuples`** creates **tuples** from **each row** of the **`DataFrame`**
* Each **tuple** contains a **property** for each **`DataFrame` column**

In [58]:
for t in df.itertuples():
    text = ': '.join([t.screen_name, t.text])
    popup = folium.Popup(text)
    marker = folium.Marker((t.latitude, t.longitude), 
                           popup=popup)
    marker.add_to(usmap)
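
To see what the loop receives from `itertuples` without building a map, the following sketch iterates over a small made-up `DataFrame` and collects the popup text and coordinates for each row. Each row arrives as a named tuple whose fields match the column names, plus an `Index` field holding the row label.

```python
import pandas as pd

frame = pd.DataFrame({'screen_name': ['user_a', 'user_b'],
                      'text': ['first tweet', 'second tweet'],
                      'latitude': [42.36, 30.69],
                      'longitude': [-71.06, -88.04]})

labels = []
for row in frame.itertuples():
    popup_text = ': '.join([row.screen_name, row.text])
    labels.append((row.Index, popup_text, (row.latitude, row.longitude)))

for label in labels:
    print(label)
```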

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### Saving the Map with Map’s **`save`** Method 

In [59]:
usmap.save('tweet_map.html')

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### Displaying the Map in Jupyter 
* The resulting map follows. 
<a href="./tweet_map.html">Interactive tweet map</a>

In [60]:
usmap

<!--
#from IPython.display import IFrame
#IFrame(src="./tweet_map.html", width=800, height=450)
-->

<hr style="height:2px; border:none; color:#000; background-color:#000;">

## Utility Functions
See the following for details: 
* [**Python Fundamentals LiveLessons video**](https://learning.oreilly.com/videos/python-fundamentals/9780135917411/9780135917411-PFLL_Lesson12_26) 
* [Sections 12.5.2-12.5.3 in **Python for Programmers**](https://learning.oreilly.com/library/view/python-for-programmers/9780135231364/ch12.xhtml#ch12lev2sec10)

### `get_tweet_content` Utility Function (2 of 2)
```python
def get_tweet_content(tweet, location=False):
    """Return dictionary with data from tweet (a Status object)."""
    fields = {}
    fields['screen_name'] = tweet.user.screen_name

    # get the tweet's text
    try:  
        fields['text'] = tweet.extended_tweet["full_text"]
    except AttributeError:  # tweet has no extended_tweet attribute
        fields['text'] = tweet.text

    if location:
        fields['location'] = tweet.user.location

    return fields

```
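
The fallback between the extended and standard text can be exercised without a live stream. The sketch below restates the function and calls it with stand-in objects built from `types.SimpleNamespace`, which are assumptions for illustration and not real Tweepy `Status` objects. It catches only `AttributeError`, assuming that is how a missing `extended_tweet` surfaces.

```python
from types import SimpleNamespace


def get_tweet_content(tweet, location=False):
    """Return dictionary with data from tweet (a Status-like object)."""
    fields = {'screen_name': tweet.user.screen_name}

    try:  # prefer the full text when the tweet was extended
        fields['text'] = tweet.extended_tweet['full_text']
    except AttributeError:  # no extended_tweet, so use the standard text
        fields['text'] = tweet.text

    if location:
        fields['location'] = tweet.user.location

    return fields


# stand-in objects with only the attributes the function reads
user = SimpleNamespace(screen_name='example_user', location='Boston, MA')
short_tweet = SimpleNamespace(user=user, text='short text')
long_tweet = SimpleNamespace(user=user, text='truncated text',
                             extended_tweet={'full_text': 'the complete text'})

short_fields = get_tweet_content(short_tweet)
long_fields = get_tweet_content(long_tweet, location=True)
print(short_fields)
print(long_fields)
```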

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### `get_geocodes` Utility Function (1 of 3)
* Receives a **list of dictionaries** containing tweets and **geocodes their locations**
* If **geocoding** is **successful** for a tweet, adds the **latitude** and **longitude** to the tweet’s **dictionary in `tweet_list`**
* Requires class **`OpenMapQuest`** from the **geopy module**

<hr style="height:2px; border:none; color:#000; background-color:#000;">

### `get_geocodes` Utility Function (2 of 3)
```python
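import time  # get_geocodes sleeps between retries
import keys  # your credentials module, including mapquest_key
from geopy import OpenMapQuest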
```

```python
def get_geocodes(tweet_list):
    """Get the latitude and longitude for each tweet's location.
    Returns the number of tweets with invalid location data."""
    print('Getting coordinates for tweet locations...')
    geo = OpenMapQuest(api_key=keys.mapquest_key)  # geocoder
    bad_locations = 0  

    for tweet in tweet_list:
        processed = False
        delay = .1  # used if OpenMapQuest times out to delay next call
        while not processed:
            try:  # get coordinates for tweet['location']
                geo_location = geo.geocode(tweet['location'])
                processed = True
            except Exception:  # assume a timeout, so wait before trying again
                print('OpenMapQuest service timed out. Waiting.')
                time.sleep(delay)
                delay += .1

        if geo_location:  
            tweet['latitude'] = geo_location.latitude
            tweet['longitude'] = geo_location.longitude
        else:  
            bad_locations += 1  # tweet['location'] was invalid
    
    print('Done geocoding')
    return bad_locations

```
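
The wait-and-retry pattern in `get_geocodes` can be tried without a MapQuest key. The sketch below is a variation, not the function above: the hypothetical helper `call_with_retry` lengthens the delay by a fixed step after each failure, as `get_geocodes` does, but also gives up after `max_attempts` tries. The `flaky_geocode` function simulates a service that times out twice, and the `sleep` parameter lets the example record the waits instead of pausing.

```python
import time


def call_with_retry(func, arg, first_delay=0.1, step=0.1,
                    max_attempts=5, sleep=time.sleep):
    """Call func(arg), waiting a little longer after each failure."""
    delay = first_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return func(arg)
        except Exception as error:
            if attempt == max_attempts:
                raise  # out of attempts, so let the caller see the error
            print(f'Attempt {attempt} failed ({error}). Waiting.')
            sleep(delay)
            delay += step


attempts = []


def flaky_geocode(location):
    """Simulated geocoder that times out on its first two calls."""
    attempts.append(location)
    if len(attempts) < 3:
        raise TimeoutError('simulated timeout')
    return (42.3602534, -71.0582912)


waits = []  # record requested delays rather than actually sleeping
result = call_with_retry(flaky_geocode, 'Boston, MA', sleep=waits.append)
print(result, 'after', len(attempts), 'attempts')
```

Capping the number of attempts is a design choice for this sketch: without a cap, a location that always fails would keep the loop running indefinitely.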

<hr style="height:2px; border:none; color:#000; background-color:#000;">

## 12.15.3 Class `LocationListener`
```python
# locationlistener.py
"""Receives tweets matching a search string and stores a list of
dictionaries containing each tweet's screen_name/text/location."""
import tweepy
from tweetutilities import get_tweet_content
from IPython.display import clear_output

class LocationListener(tweepy.Stream):
    """Handles incoming Tweet stream to get location data."""
```

```python
    def __init__(self, consumer_key, consumer_secret, access_token,
                 access_token_secret, counts_dict, tweets_list, topic,
                 limit=10):
        """Configure the LocationListener."""
        self.tweets_list = tweets_list
        self.counts_dict = counts_dict
        self.topic = topic
        self.TWEET_LIMIT = limit
        # call superclass's init with the four credentials
        super().__init__(consumer_key, consumer_secret,
                         access_token, access_token_secret)
```

```python
    def on_status(self, status):
        """Called when Twitter pushes a new tweet to you."""
        # get each tweet's screen_name, text and location
        tweet_data = get_tweet_content(status, location=True)  

        # ignore retweets and tweets that do not contain the topic
        if (tweet_data['text'].startswith('RT') or
            self.topic.lower() not in tweet_data['text'].lower()):
            return

        self.counts_dict['total_tweets'] += 1  # original tweet

        # ignore tweets with no location 
        if not status.user.location:  
            return

        self.counts_dict['locations'] += 1  # tweet with location
        self.tweets_list.append(tweet_data)  # store the tweet
        clear_output()
        print(f'{status.user.screen_name}: {tweet_data["text"]}\n')
        
        # if TWEET_LIMIT is reached, disconnect to terminate streaming
        if self.counts_dict['locations'] >= self.TWEET_LIMIT:
            self.disconnect()

```
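
The filtering rules in `on_status` can be checked offline. The sketch below applies the same three rules (skip retweets, skip off-topic tweets, skip tweets without a location) to plain dictionaries. The helper name `record_tweet` and the sample tweets are hypothetical and are not part of `locationlistener.py`.

```python
def record_tweet(tweet_data, counts, tweets, topic, limit):
    """Apply the listener's rules to one tweet dictionary.
    Returns True while more tweets are wanted."""
    text = tweet_data['text']

    # ignore retweets and tweets that do not contain the topic
    if text.startswith('RT') or topic.lower() not in text.lower():
        return True

    counts['total_tweets'] += 1  # original, on-topic tweet

    if not tweet_data.get('location'):  # ignore tweets with no location
        return True

    counts['locations'] += 1
    tweets.append(tweet_data)
    return counts['locations'] < limit


sample = [  # made-up tweets
    {'text': 'RT @x: football news', 'location': 'Boston, MA'},
    {'text': 'I love baseball', 'location': 'Boston, MA'},
    {'text': 'Football tonight!', 'location': ''},
    {'text': 'Watching football', 'location': 'Boston, MA'},
    {'text': 'football forever', 'location': 'Mobile, AL'},
]

counts = {'total_tweets': 0, 'locations': 0}
tweets = []
for tweet_data in sample:
    if not record_tweet(tweet_data, counts, tweets, 'football', limit=2):
        break  # limit reached

print(counts)
print(f'{counts["locations"] / counts["total_tweets"]:.1%} had locations')
```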

<hr style="height:2px; border:none; color:#000; background-color:#000;">

# More Info 
* See Lesson 12 in [**Python Fundamentals LiveLessons** here on O'Reilly Online Learning](https://learning.oreilly.com/videos/python-fundamentals/9780135917411)
* See Chapter 12 in [**Python for Programmers** on O'Reilly Online Learning](https://learning.oreilly.com/library/view/python-for-programmers/9780135231364/)
* See Chapter 13 in [**Intro Python for Computer Science and Data Science** on O'Reilly Online Learning](https://learning.oreilly.com/library/view/intro-to-python/9780135404799/)
* Interested in a print book? Check out:

| Python for Programmers<br>(640-page professional book) | Intro to Python for Computer<br>Science and Data Science<br>(880-page college textbook)
| :------ | :------
| <a href="https://amzn.to/2VvdnxE"><img alt="Python for Programmers cover" src="../images/PyFPCover.png" width="150" border="1"/></a> | <a href="https://amzn.to/2LiDCmt"><img alt="Intro to Python for Computer Science and Data Science: Learning to Program with AI, Big Data and the Cloud" src="../images/IntroToPythonCover.png" width="159" border="1"></a>

>Please **do not** purchase both books&mdash;_Python for Programmers_ is a subset of _Intro to Python for Computer Science and Data Science_

&copy; 2019 by Pearson Education, Inc. All Rights Reserved. The content in this notebook is based on the book [**Python for Programmers**](https://amzn.to/2VvdnxE).