<img src="https://datasciencedegree.wisconsin.edu/wp-content/themes/data-gulp/images/logo.svg" width="300">


# Lesson 13 Activity -- Using ```tweepy``` for Data Mining

This is an introduction to data collection from <a href="http://www.twitter.com/">Twitter</a> using the [`tweepy`](http://www.tweepy.org/) package.

---

## Getting set up -- things you do about once

### Install Tweepy

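Before you can use it, install the `tweepy` package, either through Anaconda or from a terminal (for example, with `pip install tweepy`). The code in this lesson uses the Tweepy 3.x interface (`api.search`, `tweepy.StreamListener`, `tweepy.TweepError`). Tweepy 4.0 renamed or removed these names, so install a 3.x release (for example, `pip install "tweepy<4"`) if you want to run the cells unchanged.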

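### Make a Twitter account and app

You'll need a Twitter account, and then you'll need to set up an app at <a href="https://apps.twitter.com/">apps.twitter.com</a>.  The app is what gives you the keys and tokens used below.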

### Save your credentials to an external file

Make a plain text file on your computer called `twitter_credentials.py`, and put it anywhere except this directory, so that it does not get shared along with the notebook.  I put mine in my user's home directory.  It will look something like this:

    con_key = 'your consumer key goes here'
    con_secret = 'your consumer secret goes here'
    acc_token = 'your access token goes here'
    acc_secret = 'your access secret goes here'
    
* Save your consumer key, consumer secret, access token, and access secret there.
* Don't share these secrets with others!  
* It's also possible to generate access tokens and secrets from within an app, but now's not the right time for this.

---

## Preliminaries to using tweepy -- things you do once per session

Do these steps once per session.  If you close your notebook or restart the kernel, repeat them before using the Tweepy interface to the Twitter API again.

#### 1. Gain access to the Tweepy library

Import it as you would any other Python library.

In [2]:
import tweepy

#### 2. Load your credentials from the external file

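Run the Python source file that holds your credentials, which is located somewhere else on your computer.  The `%run` magic executes the file and leaves its variables available in the notebook.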

In [17]:
%run ~/source/repos/twitter_credentials.py
# this cell will evaluate silently 🙊, and not print anything.  
# This is desired, because a person with your keys can act as you on Twitter in literally every way 😟


🔐 If you need to check whether one of the four variables, such as `con_key`, has the correct value, insert a cell, print the value, and then delete the cell so the secret does not stay in the notebook.  Keep your credentials secret and safe!!!

In [18]:
print(con_key)

4PblUSttnf86e91yP5C2TP4CW
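
If you would rather not leave a full key in the notebook output, one option is to print a masked version instead. The sketch below is only an illustration: `mask_secret` is a hypothetical helper (not part of tweepy), and the sample value is made up.

```python
def mask_secret(value, visible=4):
    """Return the secret with all but the last `visible` characters replaced by '*'."""
    if len(value) <= visible:
        # Too short to reveal anything safely: hide every character
        return '*' * len(value)
    return '*' * (len(value) - visible) + value[-visible:]

# Made-up placeholder; in the notebook you would pass con_key instead
example_key = 'abcdefgh12345678'
print(mask_secret(example_key))
```

Seeing only the last few characters is usually enough to confirm that the right key was loaded.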


#### 3. Make an `API` object

The `tweepy.API` object handles construction of the Twitter API calls for you.  It's a convenience layer, but it's really dang convenient!

In [19]:
#Use tweepy.OAuthHandler to create an authentication using the given key and secret
auth = tweepy.OAuthHandler(consumer_key=con_key, consumer_secret=con_secret)
auth.set_access_token(acc_token, acc_secret)

#Connect to the Twitter API using the authentication
api = tweepy.API(auth)

--- 

## Using the API

Twitter offers two kinds of API:
* The [REST](https://en.wikipedia.org/wiki/Representational_state_transfer) [API](https://en.wikipedia.org/wiki/Application_programming_interface) allows you to _pull_ information from Twitter, or _push_ information back to Twitter.  For example,  
  💡 if I wanted to have a Python script that ran as a CRON job to automatically tweet for me under certain conditions, I would use the REST API.
* The Streaming API allows us to monitor Twitter in real time, grabbing tweets as they are made.  For example,  
  💡 if I wanted to make a little device powered by a Raspberry Pi that showed interesting tweets in real time on a tiny screen by my desk, I would use the streaming API.

### Method 1. The REST API

We'll use the REST API to run a specific search.  You could also use it to post tweets automatically, or to get information about specific users.

In [50]:
#Use the REST API for a static search
#Our example finds recent tweets using the hashtag #mallorca
#Tweepy URL-encodes the query when it builds the request, so '#' is written directly

#tweet_list = api.search(q='#datascience')
tweet_list = api.search(q='#mallorca')

See [Twitter's search documentation](https://dev.twitter.com/rest/public/search) for examples of query operators.  The documentation shows queries in URL-encoded form, where `#` appears as `%23`.  [This w3schools page](https://www.w3schools.com/tags/ref_urlencode.asp) explains what `%23` and other URL encodings mean.  Tweepy encodes the `q` argument for you, so encoding matters mainly when you build request URLs by hand.
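
To see what URL encoding does to a hashtag query, here is a small sketch using Python's standard `urllib.parse` module. It runs without any Twitter connection and is only meant to illustrate the encoding itself.

```python
from urllib.parse import quote, unquote

# '#' is a reserved character in URLs, so it is percent-encoded as %23
encoded = quote('#datascience')
print(encoded)

# Decoding gives back the original query string
decoded = unquote(encoded)
print(decoded)

# Encoding an already-encoded string changes it again ('%' becomes %25),
# which is one reason to encode a query exactly once
double_encoded = quote(encoded)
print(double_encoded)
```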

The search returns a list-like results object containing one `Status` object per tweet.  Each `Status` is full of data such as the language, the identity of the poster, etc.

In [51]:
tweet_list[0]

Status(_api=<tweepy.api.API object at 0x000001F75971C908>, _json={'created_at': 'Sun Apr 22 16:27:13 +0000 2018', 'id': 988091864493514752, 'id_str': '988091864493514752', 'text': "I'm starting to put a few of my #wildlife &amp; #travel clips on #YouTube. Here is one of my favourites on my… https://t.co/EIrF77Q6pu", 'truncated': True, 'entities': {'hashtags': [{'text': 'wildlife', 'indices': [32, 41]}, {'text': 'travel', 'indices': [48, 55]}, {'text': 'YouTube', 'indices': [65, 73]}], 'symbols': [], 'user_mentions': [], 'urls': [{'url': 'https://t.co/EIrF77Q6pu', 'expanded_url': 'https://twitter.com/i/web/status/988091864493514752', 'display_url': 'twitter.com/i/web/status/9…', 'indices': [111, 134]}]}, 'metadata': {'iso_language_code': 'en', 'result_type': 'recent'}, 'source': '<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>', 'in_reply_to_status_id': None, 'in_reply_to_status_id_str': None, 'in_reply_to_user_id': None, 'in_reply_to_user_id_str': None, 'in_reply_to_
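
The nested dictionary in the output above can be navigated with ordinary Python indexing. The sketch below uses a trimmed-down copy of a few fields from that output, so it runs without a Twitter connection. In the notebook you could presumably start from `tweet_list[0]._json` instead, though the leading underscore suggests that attribute is meant for internal use.

```python
from datetime import datetime

# A few fields copied from the output above (everything else omitted)
tweet_json = {
    'created_at': 'Sun Apr 22 16:27:13 +0000 2018',
    'id': 988091864493514752,
    'truncated': True,
    'entities': {
        'hashtags': [
            {'text': 'wildlife', 'indices': [32, 41]},
            {'text': 'travel', 'indices': [48, 55]},
            {'text': 'YouTube', 'indices': [65, 73]},
        ],
    },
    'metadata': {'iso_language_code': 'en', 'result_type': 'recent'},
}

# Hashtags live two levels down, as a list of small dictionaries
hashtags = [tag['text'] for tag in tweet_json['entities']['hashtags']]

# created_at is a string; parse it into a timezone-aware datetime
created = datetime.strptime(tweet_json['created_at'], '%a %b %d %H:%M:%S %z %Y')

language = tweet_json['metadata']['iso_language_code']
print(hashtags, created.isoformat(), language)
```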

In [45]:
#We can use the built-in dir function to view a list of the attributes of each tweet
dir(tweet_list[0])

['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_api',
 '_json',
 'author',
 'contributors',
 'coordinates',
 'created_at',
 'destroy',
 'entities',
 'favorite',
 'favorite_count',
 'favorited',
 'geo',
 'id',
 'id_str',
 'in_reply_to_screen_name',
 'in_reply_to_status_id',
 'in_reply_to_status_id_str',
 'in_reply_to_user_id',
 'in_reply_to_user_id_str',
 'is_quote_status',
 'lang',
 'metadata',
 'parse',
 'parse_list',
 'place',
 'retweet',
 'retweet_count',
 'retweeted',
 'retweets',
 'source',
 'source_url',
 'text',
 'truncated',
 'user']

In [56]:
#Let's display the language code of each tweet we found.
#To display the text instead, use: [tweet.text for tweet in tweet_list]
[tweet.lang for tweet in tweet_list]

['en',
 'und',
 'fi',
 'es',
 'en',
 'es',
 'es',
 'es',
 'en',
 'en',
 'ca',
 'es',
 'pt',
 'es',
 'de']
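
A list like this is easy to summarize. As a small sketch, the language codes from the output above are copied into a plain list and tallied with `collections.Counter` from the standard library.

```python
from collections import Counter

# Language codes copied from the output above
langs = ['en', 'und', 'fi', 'es', 'en', 'es', 'es', 'es',
         'en', 'en', 'ca', 'es', 'pt', 'es', 'de']

# Count how many tweets were tagged with each language
lang_counts = Counter(langs)

# The two most common languages with their counts
print(lang_counts.most_common(2))
```

In the notebook, the same tally could be built directly from `tweet.lang for tweet in tweet_list`.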

By default, a search returns 15 tweets.  The `count` argument raises this to at most 100 per request; asking for more than 100 still returns 100.

In [57]:
#Asking for 1000 tweets still returns only 100, the per-request maximum
tweet_list = api.search(q='#mallorca', count=1000)
#tweet_list = api.search(q='#datascience', count=100)
len(tweet_list)

100

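If we want more than 100 tweets, we can use a *while* loop.  The `max_id` argument restricts a search to tweets whose id is less than or equal to the given id.  Tweet ids increase over time, so passing the id of the oldest tweet we've seen so far, minus one, steps backward through older tweets without repeating any.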

The `try/except/else` structure lets us fail gracefully in case the API search returns an error (e.g., if we run up against Twitter's rate limits).

In [59]:
num_needed = 1000
tweet_list = []
last_id = None # id of the oldest tweet seen so far
while len(tweet_list) < num_needed:
    #On the first request there is no id yet, so we leave max_id out
    search_args = {} if last_id is None else {'max_id': last_id - 1}
    try:
        new_tweets = api.search(q='#mallorca', count=100, **search_args)
    except tweepy.TweepError as e:
        print("Error", e)
        break
    else:
        if not new_tweets:
            print("Could not find any more tweets!")
            break
        tweet_list.extend(new_tweets)
        last_id = new_tweets[-1].id

len(tweet_list)

1075
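
The paging logic can be tried out without touching Twitter at all. In the sketch below, `fake_search` is a made-up stand-in for `api.search` that pretends exactly 250 tweets exist, and `collect` is a hypothetical helper that repeats the `max_id` loop from the cell above. Unlike that cell, it also trims the result to the number requested.

```python
def fake_search(count, max_id=None):
    """Made-up stand-in for api.search: returns tweet ids, newest first."""
    all_ids = range(250, 0, -1)  # pretend tweets with ids 250 down to 1 exist
    eligible = [i for i in all_ids if max_id is None or i <= max_id]
    return eligible[:count]

def collect(num_needed, page_size=100):
    """Page backward through results until enough are collected or none remain."""
    collected = []
    last_id = None  # id of the oldest item seen so far
    while len(collected) < num_needed:
        # Ask only for items strictly older than the oldest one already seen
        max_id = None if last_id is None else last_id - 1
        page = fake_search(count=page_size, max_id=max_id)
        if not page:
            break  # nothing older is available
        collected.extend(page)
        last_id = page[-1]
    return collected[:num_needed]

print(len(collect(120)), len(collect(1000)))
```

Because each page starts just below the oldest id of the previous page, no item is fetched twice.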

Note that the free REST API limits both how many tweets you can retrieve and how far back they go: you may not be able to retrieve tweets that are more than a week old.  Pay attention to this restriction as you approach your final project topic!

### Method 2. The Streaming API

The Streaming API allows us to monitor Twitter in real time, grabbing tweets as they are made.

The ```tweepy``` package includes a class called ```StreamListener``` which monitors Twitter for us.  However, by default StreamListener does nothing with the tweets it collects.

In this demonstration, we'll modify ```StreamListener``` to make a class that prints each tweet we're interested in to the screen.  Later, you may wish to create your own class which saves information from tweets to a file.

In [11]:
#We create a subclass of tweepy.StreamListener to add a response to on_status

class PrintingStreamListener(tweepy.StreamListener):

    def on_status(self, status):
        print(status.text)
        
    #disconnect the stream if we receive an error message indicating we are overloading Twitter
    
    def on_error(self, status_code):
        if status_code == 420:
            #returning False in on_error disconnects the stream
            return False
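
For the file-saving variant mentioned above, one possible approach is to append each tweet to a JSON Lines file, one tweet per line. The sketch below shows only the file handling, with made-up tweets, so it runs without a stream. `append_tweet`, `read_tweets`, and the file name `tweets.jsonl` are hypothetical names chosen for illustration.

```python
import json
import os
import tempfile

def append_tweet(path, tweet_json):
    """Append one tweet (as a dict) to a JSON Lines file, one tweet per line."""
    with open(path, 'a', encoding='utf-8') as handle:
        # ensure_ascii=False keeps emoji and non-Latin text readable in the file
        handle.write(json.dumps(tweet_json, ensure_ascii=False) + '\n')

def read_tweets(path):
    """Load every tweet back from the JSON Lines file."""
    with open(path, encoding='utf-8') as handle:
        return [json.loads(line) for line in handle if line.strip()]

# Demonstration with made-up tweets in a temporary directory
demo_path = os.path.join(tempfile.mkdtemp(), 'tweets.jsonl')
append_tweet(demo_path, {'id': 1, 'text': 'first made-up tweet'})
append_tweet(demo_path, {'id': 2, 'text': 'second made-up tweet 📈'})
saved = read_tweets(demo_path)
print(len(saved))
```

Inside a listener, `on_status` could then call something like `append_tweet('tweets.jsonl', status._json)` instead of printing, assuming the `_json` attribute shown earlier is available on streamed statuses as well.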
 

Once we have created our subclass, we can set up our own Twitter stream.

In [12]:
#We create and authenticate an instance of our new PrintingStreamListener class

my_stream_listener = PrintingStreamListener()
my_stream = tweepy.Stream(auth = api.auth, listener=my_stream_listener)

We'll use the ```track``` argument of the stream's ```filter``` method to look for tweets with a specific keyword.  You can read more about constructing searches with ```track``` in the <a href="https://dev.twitter.com/streaming/overview/request-parameters#track">Twitter streaming API documentation</a>.

In [13]:
# Now, we're ready to start streaming!  We'll look for recent tweets which use the word "data".
# You can pause the display of tweets by interrupting the Python kernel (use the menu bar at the top)

my_stream.filter(track=['data'])

RT @LemieuxLGM: But if you learn very basic things about political science and history and data analysis it's very, very clearly false, and…
RT @tcoughlin28: Verizon: You’re all out of data. You’ve been charged overage data for this month

Family group chat 30 seconds later: http…
https://t.co…
#Nuggets https://t.co/CUiBBuw6Ky
Use verified data to maximise the mobile marketing ROI https://t.co/Q9113yuZIU
The Best Programming Languages For Data Scientists https://t.co/BbPHnSt3mC https://t.co/Ujf4DWBzam
Kamu seorang yg cerdas dan kreatif. Terima kasih, karena Allah telah mengirimkanmu untuk menolong ku. Tidk hanya da… https://t.co/P52xHu5Uja
RT @AlastairJT: The independent UK Border Agency was obliged in 2009 under the Data Protection Act to begin plans to destroy Windrush board…
3 Ways Marketers Should Use Data Science to Skyrocket Marketing Results https://t.co/t5BGcjo0Sj
RT @tcoughlin28: Verizon: You’re all out of data. You’ve been charged overage data for this month

Family group cha

RT @heiseonline: Big Data: Wie die Erde war und werden könnte https://t.co/l12YKJ0ghU #BigData #NASA
RT @Forbes: 35 globally interesting and free big data source accessible to everyone 
https://t.co/guSVSLp8s2 https://t.co/Euu9aNffoi
boyfriend yang baik ialah boyfriend yang terus on hotspot dia bila keluar nak share dengan girlfriend bila dia tahu… https://t.co/6QOkOWE0Iy
RT @RuthCoo50598510: Judge's handwritten notes released under UK data laws for first time https://t.co/dyJQtqcEER
RT @albertogaruccio: Big opportunities for startups: how Banking ecosystems extend out from the core business into 3 levels

https://t.co/C…
RT @ASapardan: Woii Amien,
Jika anda mau bicara Hutang Negara,
Menkeu Bu Sri Mulyani sudah panjang lebar jelaskan dg panjang lebar disertai…
RT @alres9: 足を引っ張ることにかけては、世界屈指だなあ。
この件はまあ、当然だとは思うけど(^_^;) https://t.co/WPeJf3Ff6W
@carolecadwalla @DamianCollins @CommonsCMS #TheresaMay under your Government clear breaches of the Data Protection… https://t.co/Nrnr6JF13e
RT @alh

RT @reinergallardo: #LaunchSASEC 
"The purpose of the Safety and Security Committee is to gather the pertinent data, to help identify the r…
RT @Forbes: 35 globally interesting and free big data source accessible to everyone 
https://t.co/guSVSLp8s2 https://t.co/Euu9aNffoi
RT @dFaktoHQ: It's time to become data-driven (or die). Nice article focusing also on the importance of combining tools with talent" #ddmgm…
@HisMajestyJT @_ShowtimeRX Facts. Data. Research. https://t.co/X7hcgzMhyu
RT @SweetLilly007: Want to easily communicate to your audience? Signup for ClickSend

|Real Results &amp; Data
|Easy-to-use Products
|Real Busi…
RT @lianabell8: 5 Applications of big data in #socialmedia
#Business #data #marketing

#Digital #CX #smm #smartech #ux #branding #socialsel…
RT @CO2_earth: 📈  411.14 parts per million (ppm) #CO2 in the atmosphere on April 21, 2018  🏵️🏵️  HIGHEST DAILY AVERAGE REPORTED SO FAR IN 2…
RT @kuriharan: Check. “The McGruff Market Cycle Strategy” by Potato McGruff https://

RT @drob: .@chowthedog of DataCamp shares his experience teaching #rstats and the #tidyverse to high schoolers 📊 https://t.co/aIMo0uvaRw ht…
@ngels59785300 @aiww Referendum ILEGAL and many people vote up to 4 times, and there are videos and photos, and the… https://t.co/4Yd03blxQE
RT @Jin_Butterfly: 🌟Singer Brand Reputation Rankings (April)🌟

#1. Wanna One: 12,054,871

#2. @BTS_twt: 8,023,277🏆

#3. TWICE 7,945,940

#4…
RT @wef: Gender #equality? It starts with data https://t.co/YMEAdBm7Mj https://t.co/4ExWBDMtgY
RT @carolecadwalla: NEW: Price comparison websites implicated in new https://t.co/elpqcwdjTS data scandal 
https://t.co/M8r4TQf1Yj
AbeeSukarna Malam, Mas Abee. Mengenai permintaan naik limit, silakan infokan data berikut via DM agar dapat dibantu… https://t.co/m6ZwjHmiWI
RT @tcoughlin28: Verizon: You’re all out of data. You’ve been charged overage data for this month

Family group chat 30 seconds later: http…
RT CoinDeskMarkets: Bitcoin is testing $9K for the 2nd time today

(D

RT @AI_5tephen: Spot on. #Alexa would grind to a halt without this. https://t.co/3v7HhdlSjA
Aqui no RJ é feriado amanhã. Dia de São Jorge. Até aí tudo bem, todo mundo tem suas crenças mas

A data é conhecida… https://t.co/yJjFteRgn9
RT @CasperAPI: Casper API- universal solutions for secure data storage
 We are proud to say that Casper API is a universal one that suits v…
Se é pra ser religioso, vou no caminho de pedir que Deus proteja nossos animais de estimação nesta data.
RT @lmbrownlee1: You know who collects more of your personal data than Facebook? Google—so why aren’t we talking about it?  https://t.co/h8…
When data become available, facts and many more research about public policy could be produced. Should train myself… https://t.co/Wjy1wvL7DW
INSS pode fixar data para fim de benefício concedido judicialmente, diz TNU https://t.co/WkkhyTZOth
Others see blockchains as a distributed ledger and immutable data source that can be applied to logistics, supply c… https://t.co/4wsQmcgin

RT @holly_holl: Cancel Statcast, we have all the data we need now. https://t.co/sa5o1JTDMU
RT @kubernan: Can data science save social media? https://t.co/a64ClxlCFi #Microsoft
Navigating the data using JDE https://t.co/ILZzKzYI1y
RT @SheronWilkie: This is only right and proper . 

But Labour fail to call for same investigations into referendum result following revela…
RT @lwoskie: Full of practical but flexible guidance: iterating on definitions, mapping governance &amp; stakeholders, assessing existing perfo…
RT @metabar_papers: Counting with DNA in metabarcoding studies: how should we convert sequence reads to dietary data? https://t.co/4uedcIjZ…
Yes "science" gets corrupted by lobbyists and corporate greed via ghost writing "scientific" articles, cherry picki… https://t.co/nlks15iMYT
RT @Forbes: 35 globally interesting and free big data source accessible to everyone 
https://t.co/guSVSLp8s2 https://t.co/Euu9aNffoi
@ABNAMROZakelijk 'Data' is meervoud net als 'media'. Hoe is het toch 

RT @Le_BR_on: NESTA MESMA DATA, em 2006, LeBron anota o primeiro triple-double da sua carreira em playoffs.

#LeHistory  https://t.co/w53Bw…
Uma data? — 04/06/17 https://t.co/y9pe2tc5xY
https://t.co…
RT @cardstack: Why decentralized apps need data-driven, algorithmic decision making: https://t.co/E9WCgPkIpa
RT @antoniogm: Google is a far worse enabler of sketchy data use and ads targeting than Facebook will ever be. Glad someone like @mims who…
https://t.co/CMty3yfS3y #ico #mvl #mobility 
#ecosystem #mvlchain https://t.co/lRs4ilDPbf
RT @tcoughlin28: Verizon: You’re all out of data. You’ve been charged overage data for this month

Family group chat 30 seconds later: http…
Why systems need data! https://t.co/HNiFlmFSqz
RT @zarazettirazr: Jangan sembarangan kasih NIK / no KTP pada siapapun tweeps nanti disalah gunakan. Untuk registrasi sim card aja data boc…
RT @TVChosunNews: '드루킹'이 '더불어민주당 대선후보 영남권 경선 현장'에서 직접 '경인선'을 이끄는 모습이 포착됐습니다. 행사 진행요원이나 보안요원들이 주로 사용하는 '오디오 리시버'까지 귀에 꽂은 모습이었습니다. 
ht

RT @RealHistoriPix: CEO of a social networking company caught stealing users’ data. (2018) https://t.co/7fGxeXgjd6
Hackers want your data. Meet the ones who are trying to protect it.: The last two years… https://t.co/Gra0N6NHRI
@EmmeOlivera @JohnPhilboy606 @bookerbmo @cathythumann @TeshPunja @RealCandaceO @PerezHilton @TomArnold @ShaunKing T… https://t.co/wSmYVNfxqA
RT @jn_shine: Instagram is for people who have many clothes, that App is also for people who don't have issues with Mobile Data. Let's just…
RT @rorjatlqoddl: 그러니까 메크로를 몇 건의 기사에 몇 개의 댓글에 몇 개의 추천을 눌렀는지 좀 써봐. 

사건과 관련된 것은 첫날 나오고, 무슨 드루킹 신변잡기나 망상소설 풀어대냐? https://t.co/c4P3BWY0Bd
RT @zelo_street: ARCHIVE Arron Banks Data Abuse EXPOSED: data from his insurance firms used for Leave campaign WITHOUT CONSENT https://t.co…
Missouri governor charged with felony computer data tampering https://t.co/xkocoDUXDE
RT @ItsMeCathi: I repeat. The Trump administration has separated more than 700 children from their parents since October, includ

RT @katestarbird: Today, a group of 4th graders toured our lab. I showed a network graph of Twitter accounts from the Oso landslide. One ki…
RT @CERequena: I'm honored to be participating in #AppCon18 with #tech and government leaders, to represent #app #developers &amp; discuss #Net…
#BTC #ETH #Fishcoin https://t.co/FryLDAYuLx
RT @agusabah1980: @KemnakerRI @hanifdhakiri @jokowi @Pak_JK Data itu kalian yg punya, silahkan. Sedangkan fakta kami yg merasakan. Kudet? K…
@lindaholmes Thank you, Linda! This account shouldn't include unwanted hot takes like this. Report facts based on d… https://t.co/Px8kDZBWgh
RT @tcoughlin28: Verizon: You’re all out of data. You’ve been charged overage data for this month

Family group chat 30 seconds later: http…
RT @PatiMariano: @ammandafalcao O Acre são as tias da ceia de Natal que depois do anúncio do namoro já querem saber a data do casamento hah…
RT @EIaVersos: Mas sempre haverá uma data, palavra, um olhar, um sorriso, uma musica, pra te fazer lembrar

RT @paulsperry_: BREAKING: Obama spying on consumers without security protections backfires in mass hack attacks; ex-president ignored repe…
RT @YooMiPHA: Watch as Mia answers Twitter user @PixyJohnson's question, "Can #ArtificialIntelligence interact with each other?" and @Airdr…
RT @socrata: In honor of #EarthDay, see how local governments like @nycgov and @austintexasgov are using environmental #opendata: https://t…
#BTC #ETH #Fishcoin
The Fishcoin Ecosystem has been designed to create a data ecosystem that operates in parallel w… https://t.co/xq07UONe20
RT @SparkleOps: @SwiftOnSecurity I did once nuke the Iridium sat/data network by driving the car though 600 nested geofences on all 40 trac…
@BrengOV @gemeentearnhem Haltes in het centrum afgesloten en bussen omgeleid, maar de @NS_online app met @9292 data… https://t.co/bcljECVbB4
If your computer and USB drive meet the above requirements, please download the installer manual and the firmware u… https://t.co/80X3DStzJu
#Canada
 #Cyb

RT @ESIP_Erin: 🌎 Happy #EarthDay! Thanks to @NASA for the cool #nasagifs &amp; making #earth #data accessible 🌎 https://t.co/dFVMoJuqW7
@Johnny_Pott @TheOtherPaper1 @tariqnasheed the data provided simple math ..... Arrests, by Race and Ethnicity, 2016… https://t.co/kwcPyLbi7W
בעלי חשבון אינסטגרם? חדש - מעתה גם אתם יכולים להוריד את המידע האישי שלכם https://t.co/BiGVdNtPg3
RT @RakgadiKoolKat: @Joey_MakG Imagine your English data bundles depleting in the middle of an argument and you’re forced to say things lik…
RT @tcoughlin28: Verizon: You’re all out of data. You’ve been charged overage data for this month

Family group chat 30 seconds later: http…
RT @fansunite: We're hiring a Senior Data Architect.  #jobs #yvr #vancouver

https://t.co/DKhLW75G1X https://t.co/HXZLoYsGMd
RT @PatiMariano: @ammandafalcao O Acre são as tias da ceia de Natal que depois do anúncio do namoro já querem saber a data do casamento hah…
RT @goodby_chu_bldg: Googleのストリートビューに劇場も収録されました！かなり内部までしっかり探索できる様になっております。舞台から

mathiasrain Malam, Mas Mathias. Mohon maaf atas keluhan yang dialami. Mengenai keluhan Telkomsel billing, silakan r… https://t.co/FCBiwYnQpk
The @ProjectShivom Shivom is creating a genomics ecosystem on the blockchain. We will offer an open marketplace for… https://t.co/3Ey8B2BlHl
देश का युवा जाग चुका है
.
.
.
.
.
.
.
.

अब ये उठेगा ब्रश करेगा,
और 
1.5 GB Data खतम करेगा। 🤓🤓🤓🤓🤓🤓😀😀@narendramodi @RahulGandhi
@showusyourwork Do you even know what happened in the Senate or do you need to cherry pick your data because you’re… https://t.co/YUNTjX3S0j
RT @Jin_Butterfly: 🌟Singer Brand Reputation Rankings (April)🌟

#1. Wanna One: 12,054,871

#2. @BTS_twt: 8,023,277🏆

#3. TWICE 7,945,940

#4…
The @FishcoinNetwork a Decentralized network of tools for data collection, and monitoring quality and honest transa… https://t.co/mSAxLt1kmg
RT @SGomezSpotify: Selena Gomez is currently listening to " no data available .mp3" by Selena Gomez, null
RT @gulleyj1: @AndytweetM Here is the actual data from the pap

RT @esa: On 25 April, @ESAGaia will publish its much awaited 2nd data release, including high-precision measurements of nearly 1.7 billion…
RT @antoniogm: Google is a far worse enabler of sketchy data use and ads targeting than Facebook will ever be. Glad someone like @mims who…
RT @cksDamiano: Fast SSD VPS Hosting from ThemeVPS. 1GB RAM, 10GB SSD - just $10/month! Data-centers Switzerland Germany and US. Dedicated…
RT @MarekADochnal: Cambridge Analytica sets up new standards of idiotism. Their principals bragging shows such abyss of stupidity the whole…
Game over. Gulf Coast Fury 14u - 15, Rawlings Legends 14u - 0. Go to https://t.co/ZeB2cbsHGP to see results or watc… https://t.co/q6TaoYEDkg
@tarequelaskar I cannot tell you how much I detest the term “data-giri”.
RT @lacasadepapelbr: Quando eu lembro que La Casa de Papel só terá novos episódios em alguma data de 2019: https://t.co/grS0h4QaFx
@PNauticExpress Call it 'whatever:' You are connected inherently: Recognize &amp; use Universa

@Seabeacon7 @BLR_Halos27 @MBeckAZ @TjdVeteran @MonteMathews @JeSuisNola @MNPDNashville Numbers and fbi data collect… https://t.co/NTwgXiC6Lw
RT @nuggetsPAYandID: Nuggets is all about giving consumers the chance to take back control of their data — and today you could become one o…
📈 411.14 parts per million (ppm) #CO2 in the atmosphere on April 21, 2018 🏵️🏵️ HIGHEST DAILY AVERAGE REPORTED SO FA… https://t.co/A31Bx4jSWC
Zuckerberg had prepared Apple attack https://t.co/mWVZySYcAQ
@SPrajwal376 We apologize for the inconvenience. Few customers might have experienced free voice and data service d… https://t.co/mb2RLlsILI
A man that is reported to be six inches in height, is asking a 4 feet 10 inches man to sit down. 
 😂😂😂😂😂😂

Honestly… https://t.co/hO6WtAFMOp
@WorldOf_RPs no data
RT @Jin_Butterfly: 🌟Singer Brand Reputation Rankings (April)🌟

#1. Wanna One: 12,054,871

#2. @BTS_twt: 8,023,277🏆

#3. TWICE 7,945,940

#4…
RT @De_Hedge: WHERE DO YOU STORE YOUR DNA?
Our pilot hedging project, Pr

RT @mrwardbio: @NikeSupport @Nike you say we have until 4/30 to access our data and clear our bands yet your servers are already down 2 wee…
RT @Office365_Tech: Learn how Microsoft encryption helps financial services stay in control of their data and compliance! https://t.co/EElB…
@damian_ferry It’s so hard to determine things even w/statistics. Until you can research how a data set was obtaine… https://t.co/BCmrO2bmCe
RT @Milenastriper: Olá amores, devido alguns contratempos não consegui fazer o sorteio da rifa na sexta, então farei amanhã anoite, ainda t…
RT @slowtaehyung: Ok então eles mudaram a data do burn the stage pra toda quinta, começando a partir de amanhã, antes era 1 episódio por qu…
BEAT @beattoken https://t.co/RUAcW96BOL
Pra sa mga free data copy post ko lng 2 pra mbsa/mlaman ño!

shout out sa Black Arrow Express esp. black arrow dasm… https://t.co/Gf3gjrGzvH
a menina tem a mesma altura que eu e nascemos na mesma data 
GENTE EU TO MUITO ASSUSTADA
RT @lukerosiak: 50 storie

KeyboardInterrupt: 

In [None]:
# Even if you pause the display of tweets, your stream is still connected to Twitter!
# To disconnect (for example, if you want to change which words you are searching for), 
# call the stream's disconnect() method.

my_stream.disconnect()

---

## Suggestions for skills to learn

* Collect 1000 tweets matching a search, or all available in the current time window, whichever comes first.  (The figure of 1000 is arbitrary.)
* Extract just the fields you are most interested in from a search, and create a Pandas data frame
* Follow the graph of followers from a specific Twitter user
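
As a starting point for the second suggestion, the sketch below flattens tweets into rows that keep only a few fields. The tweets are made up, `tweet_to_row` is a hypothetical helper, and the chosen fields are just examples taken from the attribute list shown earlier. The rows are written out as CSV with the standard library so the example has no extra dependencies.

```python
import csv
import io

def tweet_to_row(tweet_json):
    """Keep only the fields of interest from one tweet dictionary."""
    return {
        'id': tweet_json['id'],
        'created_at': tweet_json['created_at'],
        'lang': tweet_json.get('lang'),  # None if the field is missing
        'text': tweet_json['text'],
    }

# Made-up tweets standing in for real search results
sample_tweets = [
    {'id': 1, 'created_at': 'Sun Apr 22 16:27:13 +0000 2018',
     'lang': 'en', 'text': 'hello', 'retweet_count': 3},
    {'id': 2, 'created_at': 'Sun Apr 22 16:28:00 +0000 2018',
     'lang': 'es', 'text': 'hola', 'retweet_count': 0},
]

rows = [tweet_to_row(tweet) for tweet in sample_tweets]

# Write the rows as CSV into an in-memory buffer
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

The same list of dictionaries can be passed to `pandas.DataFrame(rows)` to get a data frame with one column per field.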

---

## Useful resources and links

* [The structure of Tweepy's Status object](https://gist.github.com/dev-techmoe/ef676cdd03ac47ac503e856282077bf2)
* [Tweet Data Dictionary](https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/tweet-object)
* [Standard Operators](https://developer.twitter.com/en/docs/tweets/search/guides/standard-operators) -- premium operators cost money.
* [Twitter operators by product](https://developer.twitter.com/en/docs/tweets/rules-and-filtering/overview/operators-by-product) -- by product they mean *paid access level*
* [How to use Twitter’s Search REST API most effectively](https://www.karambelkar.info/2015/01/how-to-use-twitters-search-rest-api-most-effectively./)
* [Collecting Tweets with Tweepy](http://www.dealingdata.net/2016/07/23/PoGo-Series-Tweepy/)
