# Twitter data

## Copyright and Licensing

You are free to use or adapt this notebook for any purpose you'd like. However, please respect the [Simplified BSD License](https://github.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/blob/master/LICENSE.txt) that governs its use.

# Twitter API Access

Twitter implements **OAuth 1.0A** as its standard authentication mechanism. To use it to make requests to Twitter's API, go to https://dev.twitter.com/apps and create a sample application.

Choose any name for your application, write a description and use `http://google.com` for the website.

Under **Key and Access Tokens**, there are four primary identifiers you'll need to note for an OAuth 1.0A workflow: 
* consumer key, 
* consumer secret, 
* access token, and 
* access token secret (Click on Create Access Token to create those).

Note that you will need an ordinary Twitter account in order to log in, create an app, and get these credentials.

The first time you execute the notebook, fill in all four credentials so that they are saved to the `pkl` file. After that you can remove the secret keys from the notebook, because they will be loaded from the `pkl` file.

**pickle** is a Python standard-library module that performs **serialization**: it converts a Python object or data structure to a byte stream that can be saved to disk and recreated in Python later.

The `pkl` file contains sensitive information that can be used to take control of your Twitter account. **Do not share it.**
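
As a minimal sketch of what pickling does, the following round-trips a dictionary through a temporary file. The dictionary contents and the temporary path are placeholders chosen for illustration; they are not real credentials and not the path used in this notebook.

```python
import os
import pickle
import tempfile

# Placeholder data standing in for the credentials dictionary (not real keys)
demo_credentials = {'Consumer Key': 'abc', 'Consumer Secret': 'xyz'}

# Write to a throwaway file so that no real credentials file is touched
demo_path = os.path.join(tempfile.mkdtemp(), 'demo_credentials.pkl')

with open(demo_path, 'wb') as f:   # pickle writes bytes, hence 'wb'
    pickle.dump(demo_credentials, f)

with open(demo_path, 'rb') as f:   # read the bytes back into a new object
    restored = pickle.load(f)

print(restored == demo_credentials)
```

The restored object is an equal copy of the original, which is why the keys only need to be typed into the notebook once.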

In [1]:
import pickle
import os

Install the `twitter` package to interface with the Twitter API:

In [5]:
#!pip install twitter

Collecting twitter
  Downloading twitter-1.17.1-py2.py3-none-any.whl (55kB)
Installing collected packages: twitter
Successfully installed twitter-1.17.1


In [2]:
CREDENTIALS_PATH = 'C:/ml/twitter_credentials.pkl'

if not os.path.exists(CREDENTIALS_PATH):
    # First run: fill in the four values below; they are then saved to disk
    Twitter = {}
    Twitter['Consumer Key'] = ''
    Twitter['Consumer Secret'] = ''
    Twitter['Access Token'] = ''
    Twitter['Access Token Secret'] = ''
    with open(CREDENTIALS_PATH, 'wb') as f:
        pickle.dump(Twitter, f)
else:
    # Later runs: load the saved credentials and close the file afterwards
    with open(CREDENTIALS_PATH, 'rb') as f:
        Twitter = pickle.load(f)

## Example 1. Authorizing an application to access Twitter account data

In [3]:
import twitter

## use twitter keys from pkl file to create twitter api object in python

# create authentication object
auth = twitter.oauth.OAuth(Twitter['Access Token'],
                           Twitter['Access Token Secret'],
                           Twitter['Consumer Key'],
                           Twitter['Consumer Secret'])

# create twitter api object
twitter_api = twitter.Twitter(auth = auth)

# Nothing to see by displaying twitter_api except that it's now a defined variable
print(twitter_api)

<twitter.api.Twitter object at 0x0000000004A52E10>


## Example 2. Retrieving trends

Twitter identifies locations using the **Yahoo! Where On Earth ID**.

The Yahoo! Where On Earth ID for the entire world is 1.
See https://dev.twitter.com/docs/api/1.1/get/trends/place and
http://developer.yahoo.com/geo/geoplanet/

See also the BOSS PlaceFinder: https://developer.yahoo.com/boss/placefinder/

In [4]:
WORLD_WOE_ID = 1
US_WOE_ID = 23424977

Look for the WOEID for [san-diego](http://woeid.rosselliot.co.nz/lookup/san%20diego%20%20ca)

You can change it to another location.

In [9]:
SANDIEGO_LOCAL_WOE_ID = 2487889
PHILLY_LOCAL_WOE_ID = 2471217

# Prefix ID argument in .trends.place() with the underscore for query string parameterization.
#     - Without underscore, twitter package appends the ID value to the URL itself as a special case keyword argument.

# get top 50 trends from various areas via a trend object in JSON
world_trends = twitter_api.trends.place(_id = WORLD_WOE_ID) # = 1
us_trends = twitter_api.trends.place(_id = US_WOE_ID) # = 23424977
sd_trends = twitter_api.trends.place(_id = SANDIEGO_LOCAL_WOE_ID) # = 2487889
philly_trends = twitter_api.trends.place(_id = PHILLY_LOCAL_WOE_ID) # = 2471217


# see the world trends response; it is a list with a single element, so the [:2] slice returns all of it
world_trends[:2]

[{'as_of': '2017-07-25T12:34:56Z',
  'created_at': '2017-07-25T12:31:20Z',
  'locations': [{'name': 'Worldwide', 'woeid': 1}],
  'trends': [{'name': '#FelizMartes',
    'promoted_content': None,
    'query': '%23FelizMartes',
    'tweet_volume': 19310,
    'url': 'http://twitter.com/search?q=%23FelizMartes'},
   {'name': 'Mbappé',
    'promoted_content': None,
    'query': 'Mbapp%C3%A9',
    'tweet_volume': 77402,
    'url': 'http://twitter.com/search?q=Mbapp%C3%A9'},
   {'name': '#Barcelona92',
    'promoted_content': None,
    'query': '%23Barcelona92',
    'tweet_volume': 12267,
    'url': 'http://twitter.com/search?q=%23Barcelona92'},
   {'name': '#TuesdayThoughts',
    'promoted_content': None,
    'query': '%23TuesdayThoughts',
    'tweet_volume': 22969,
    'url': 'http://twitter.com/search?q=%23TuesdayThoughts'},
   {'name': '土用の丑の日',
    'promoted_content': None,
    'query': '%E5%9C%9F%E7%94%A8%E3%81%AE%E4%B8%91%E3%81%AE%E6%97%A5',
    'tweet_volume': 141920,
    'url': 'http

API responses are in **JSON (JavaScript Object Notation)** format, which is used to transfer data on the web. It is roughly equivalent to nested Python lists and dictionaries, or to a more concise XML.

In an earlier run (7/24/2017, 9:59 AM EST), the top 2 trends were "#FelizLunes" and "#MondayMotivation"; the output shown above is from 7/25/2017.
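
To illustrate how JSON maps onto nested Python lists and dictionaries, here is a small sketch that parses a hand-written JSON string. The string is shaped like a heavily shortened trends response, and its values are invented for illustration.

```python
import json

# Hand-written JSON shaped like a shortened trends response (illustrative values)
raw = ('[{"locations": [{"name": "Worldwide", "woeid": 1}], '
       '"trends": [{"name": "#Example", "tweet_volume": null}]}]')

parsed = json.loads(raw)

print(type(parsed))                            # JSON array  -> Python list
print(type(parsed[0]))                         # JSON object -> Python dict
print(parsed[0]['trends'][0]['tweet_volume'])  # JSON null   -> Python None
```

The `twitter` package does this conversion for you, which is why the responses can be indexed like ordinary lists and dictionaries.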

In [21]:
# see type of the trends object
print(type(sd_trends),'\n')

# see the keys from the 1st record/object
print(list(sd_trends[0].keys()),'\n')

# see the trend data from the 1st object
print(sd_trends[0]['trends'])

<class 'twitter.api.TwitterListResponse'> 

['trends', 'as_of', 'created_at', 'locations'] 

[{'name': '#InsecureHBO', 'url': 'http://twitter.com/search?q=%23InsecureHBO', 'promoted_content': None, 'query': '%23InsecureHBO', 'tweet_volume': 127537}, {'name': '#ElTriEng', 'url': 'http://twitter.com/search?q=%23ElTriEng', 'promoted_content': None, 'query': '%23ElTriEng', 'tweet_volume': None}, {'name': '#WWEBattleground', 'url': 'http://twitter.com/search?q=%23WWEBattleground', 'promoted_content': None, 'query': '%23WWEBattleground', 'tweet_volume': 227952}, {'name': '#NW88JR', 'url': 'http://twitter.com/search?q=%23NW88JR', 'promoted_content': None, 'query': '%23NW88JR', 'tweet_volume': None}, {'name': '#SDCC2017', 'url': 'http://twitter.com/search?q=%23SDCC2017', 'promoted_content': None, 'query': '%23SDCC2017', 'tweet_volume': 102633}, {'name': 'Jared Kushner', 'url': 'http://twitter.com/search?q=%22Jared+Kushner%22', 'promoted_content': None, 'query': '%22Jared+Kushner%22', 'tweet_vo
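
Several trends in the output above have a `tweet_volume` of `None`. The following sketch, using hypothetical records in the same shape, shows one possible way to rank trends by volume without tripping over those missing values. Treating `None` as 0 is an assumption made here for illustration.

```python
# Hypothetical trend records in the same shape as the 'trends' list above
sample_trends = [
    {'name': '#A', 'tweet_volume': 127537},
    {'name': '#B', 'tweet_volume': None},
    {'name': '#C', 'tweet_volume': 227952},
]

# Treat a missing volume (None) as 0 so that the comparison does not raise a TypeError
ranked = sorted(sample_trends, key=lambda t: t['tweet_volume'] or 0, reverse=True)

for trend in ranked:
    print(trend['name'], trend['tweet_volume'])
```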

## Example 3. Displaying API responses as pretty-printed JSON

In [10]:
import json

# pretty-print the JSON output, indenting each nesting level by one space
print(json.dumps(us_trends[:2], indent = 1))

[
 {
  "trends": [
   {
    "name": "#TuesdayThoughts",
    "url": "http://twitter.com/search?q=%23TuesdayThoughts",
    "promoted_content": null,
    "query": "%23TuesdayThoughts",
    "tweet_volume": 22969
   },
   {
    "name": "#ThingsToAvoidAtAPublicPool",
    "url": "http://twitter.com/search?q=%23ThingsToAvoidAtAPublicPool",
    "promoted_content": null,
    "query": "%23ThingsToAvoidAtAPublicPool",
    "tweet_volume": null
   },
   {
    "name": "Michael Kors",
    "url": "http://twitter.com/search?q=%22Michael+Kors%22",
    "promoted_content": null,
    "query": "%22Michael+Kors%22",
    "tweet_volume": 18091
   },
   {
    "name": "#BoyScoutSpeech",
    "url": "http://twitter.com/search?q=%23BoyScoutSpeech",
    "promoted_content": null,
    "query": "%23BoyScoutSpeech",
    "tweet_volume": null
   },
   {
    "name": "#TravelTuesday",
    "url": "http://twitter.com/search?q=%23TravelTuesday",
    "promoted_content": null,
    "query": "%23TravelTuesday",
    "tweet_volume": 

## Example 4. Computing the intersection of two *sets* of trends

That is, find the trends that different locations have in common.
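
As a quick sketch of the set operation involved, with made-up trend names for two hypothetical locations:

```python
# Made-up trend names for two hypothetical locations
city_a = {'#TravelTuesday', 'Michael Kors', '#LocalNews'}
city_b = {'#TravelTuesday', 'Michael Kors', '#Weather'}

# Elements present in both sets; `city_a & city_b` is equivalent
common = city_a.intersection(city_b)
print(sorted(common))
```

Because sets hold each value only once and ignore order, the result does not depend on how the trends were ranked in either location.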

In [14]:
# create an empty dictionary that will hold one set of trend names per location
trends_set = {}

# from the list value of the 'trends' key from the dictionary in the JSON response, go through each dictionary element w/in
#   the list and get the value from the 'name' key and keep only unique ones
trends_set['world'] = set(trend['name'] for trend in world_trends[0]['trends'])

trends_set['us'] = set(trend['name'] for trend in us_trends[0]['trends'])

trends_set['san diego'] = set(trend['name'] for trend in sd_trends[0]['trends'])

trends_set['philly'] = set(trend['name'] for trend in philly_trends[0]['trends'])

In [16]:
# for each location, print a separator followed by its trend names joined with commas
for loc in ['world','us','san diego','philly']:
    print(('-'*10,loc))
    print(','.join(trends_set[loc]))

('----------', 'world')
Michael Kors,#TongueOutTuesday,#قطار_الحرمين,#PresidentKovind,#MardiConseil,#DíadeGalicia,#BedelliMasayaYatırılıyor,SefaÖzcan Kader,#برنامج_الرسام,#SIRE,#CHEBAY,#Barcelona92,#AltınElbiseliAdam,#افضل_لاعب_امتعك_الموسم_الماضي,#25Jul,#NinjaWarriorAU,초동 60만,#myhyv,#FelizMartes,#Bignardi,Melilla,Marcos Alonso,#ThingsToAvoidAtAPublicPool,#ŞişliyiBatırdınİnönü,#25TemmuzDünyaKiloVermeGünü,#디스패치_도를넘어나대,Mbappé,#Ccs450,#Fan2IsOverParty,#KılıçdaroğluNeSöyledi,横浜優勝,#MyKoreanJagiyaTheReveal,#VenezuelaVotaEn5Dias,#TuesdayThoughts,#santiagoapostol,#NUESTW_If_You,#CFCFCB,#DiadaPatriaGalega,#仰天ニュース,天神祭,#cosasbuenas,#TercaDetremuraSDV,#SempreQuisTer,Artur Almeida,土用の丑の日,#Halilİnalcık,#マツコの知らない世界,#علي_سبيل_المزاج,#تركي_يعتدي_علي_سعوديين,#ابن_فهد_يجلد_ابن_زايد
('----------', 'us')
Michael Kors,Taliban,#njmorningshow,#MIPmornings,23 STEM,#FreeRightsDefenders,Die Young,Attorney General Sessions,#iamup,Win A Copy Of The Book,Alice Cooper,House Intel,#BoyScoutSpeech,#TISL,O God,#25Jul,#

In [17]:
# find trends that are shared by 2 different areas of the world
print(( '='*10,'Intersection of World and US Trends'))
print((trends_set['world'].intersection(trends_set['us'])),'\n')

print(('='*10,'Intersection of US and San Diego Trends'))
print((trends_set['san diego'].intersection(trends_set['us'])),'\n')

print(('='*10,'Intersection of Philadelphia and San Diego Trends'))
print((trends_set['san diego'].intersection(trends_set['philly'])))

{'Michael Kors', '#25Jul', '#FelizMartes', '#TuesdayThoughts', '#CFCFCB', '#ThingsToAvoidAtAPublicPool'} 

{'Michael Kors', '#njmorningshow', 'Taliban', '#MIPmornings', '23 STEM', '#FreeRightsDefenders', 'Die Young', '#iamup', 'Alice Cooper', 'House Intel', '#BoyScoutSpeech', '#TISL', 'O God', '#25Jul', '#FelizMartes', 'Barron Trump', '#ThingsToAvoidAtAPublicPool', '#markets', '#MyIrrationalFearIn4Words', 'A.G.', 'Yano', '#MyDreamTripWouldBe', 'The Daily 202', '#SSZ1035', '#Suffield', 'Mbappe', '#TuesdayThoughts', 'Luis Fonsi y Daddy Yankee', '#LiveAtDaybreak', '#WhatILearnedToday', '#CFCFCB', 'Ben Franklin Bridge', 'HAPPENING TODAY', '#TravelTuesday', '#PetFoodDrive', 'Justice Dept', '#2bittues', '#CAAFB', '#AtlantaAlive', 'Gwinnett Co', 'What to Expect', 'Capitol Hill for 2nd', 'Comerica Park'} 

{'Michael Kors', '#njmorningshow', 'Taliban', '#MIPmornings', '23 STEM', '#FreeRightsDefenders', 'Die Young', '#iamup', 'Alice Cooper', 'House Intel', '#BoyScoutSpeech', '#TISL', '#NationalT

## Example 5. Collecting search results

In [18]:
# Set a variable `q` to a trending topic or anything else for that matter.
q = '#ThingsToAvoidAtAPublicPool' 

number = 100

# See https://dev.twitter.com/docs/api/1.1/get/search/tweets

# search through all tweets for a specified number of tweets about a topic
search_results = twitter_api.search.tweets(q = q, count = number)

# get data from these tweets
statuses = search_results['statuses']

print(len(statuses),'\n')
print(statuses)

100 

[{'created_at': 'Tue Jul 25 12:38:01 +0000 2017', 'id': 889827074546053120, 'id_str': '889827074546053120', 'text': 'RT @StopEatingBees: #ThingsToAvoidAtAPublicPool\n\nR. Kelly', 'truncated': False, 'entities': {'hashtags': [{'text': 'ThingsToAvoidAtAPublicPool', 'indices': [20, 47]}], 'symbols': [], 'user_mentions': [{'screen_name': 'StopEatingBees', 'name': 'Corey Miller', 'id': 633292225, 'id_str': '633292225', 'indices': [3, 18]}], 'urls': []}, 'metadata': {'iso_language_code': 'en', 'result_type': 'recent'}, 'source': '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>', 'in_reply_to_status_id': None, 'in_reply_to_status_id_str': None, 'in_reply_to_user_id': None, 'in_reply_to_user_id_str': None, 'in_reply_to_screen_name': None, 'user': {'id': 311797884, 'id_str': '311797884', 'name': 'Eddie Ferrero', 'screen_name': 'eddie_ferrero', 'location': 'Norman, OK', 'description': 'Abandoned by wolves. Raised by humans. Kind of an asshole. @midnight o

Twitter often returns duplicate results. We can filter them out by checking for duplicate texts:

In [19]:
all_text = []
filtered_statuses = []

# for each tweet, if its text is not already in our list, add the tweet data to filtered_statuses
#   and the tweet text to all_text
for s in statuses:
    if s["text"] not in all_text:
        filtered_statuses.append(s)
        all_text.append(s["text"])

statuses = filtered_statuses 
len(statuses)

87

So we removed 13 duplicate statuses (100 - 87).
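
The same idea can be sketched with a set of already-seen texts, which avoids scanning a growing list for every tweet. The helper name `drop_duplicate_texts` and the sample records are hypothetical; this is one possible variant, not the notebook's own method.

```python
def drop_duplicate_texts(items):
    """Keep the first item for each distinct 'text' value, preserving order."""
    seen = set()
    unique = []
    for item in items:
        if item['text'] not in seen:
            seen.add(item['text'])
            unique.append(item)
    return unique

# Hypothetical records with only the fields needed here
sample = [{'id': 1, 'text': 'hello'},
          {'id': 2, 'text': 'hello'},
          {'id': 3, 'text': 'bye'}]

print([s['id'] for s in drop_duplicate_texts(sample)])
```

For 100 tweets the difference is negligible, but set membership tests stay fast as the number of collected tweets grows.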

In [20]:
# get the actual text/message for each tweet 
[s['text'] for s in statuses]

['RT @StopEatingBees: #ThingsToAvoidAtAPublicPool\n\nR. Kelly',
 '#ThingsToAvoidAtAPublicPool Getting your gear and building a private pool, only available for your closest family.',
 'Drowning. #ThingsToAvoidAtAPublicPool',
 'RT @TapleyMorgan: #CNNBlackmail posting on Reddit and getti mg blackmailed by CNN #ThingsToAvoidAtAPublicPool #CNNMemeWars',
 'The Taliban\n#ThingsToAvoidAtAPublicPool',
 "#ThingsToAvoidAtAPublicPool\n\n...the pool itself is a chlorinated stew of other people's bodily fluids.",
 "That's why i love Hamburger 459\n#ThingsToAvoidّAtAPuَblicًPool\n#TuُeًsdaُyThouُghts\nMiًcِhael Korsُ\nhttps://t.co/l297FVUaU6",
 '#ThingsToAvoidAtAPublicPool this https://t.co/rAjzXYAz7R',
 'RT @geoffreyclark37: The public  #ThingsToAvoidAtAPublicPool',
 'Getting in #ThingsToAvoidAtAPublicPool',
 'The Pool itself !  #ThingsToAvoidAtAPublicPool',
 '#ThingsToAvoidAtAPublicPool\n\nhttps://t.co/F4l6faFLmU',
 '#ThingsToAvoidAtAPublicPool\n\nSomeone giving head in the pool',
 'RT @SheaBrowni

In [29]:
# Show one sample search result as JSON by indexing into the list
print(json.dumps(statuses[34], indent=1))

{
 "created_at": "Tue Jul 25 12:36:40 +0000 2017",
 "id": 889826733511372802,
 "id_str": "889826733511372802",
 "text": "RT @LiAisbackagain: #ThingsToAvoidAtAPublicPool\n@kurteichenwald doing \"research\" on young, nubile little boys for his CP \"investigation for\u2026",
 "truncated": false,
 "entities": {
  "hashtags": [
   {
    "text": "ThingsToAvoidAtAPublicPool",
    "indices": [
     20,
     47
    ]
   }
  ],
  "symbols": [],
  "user_mentions": [
   {
    "screen_name": "LiAisbackagain",
    "name": "LiA is Back Again",
    "id": 888386027135668224,
    "id_str": "888386027135668224",
    "indices": [
     3,
     18
    ]
   },
   {
    "screen_name": "kurteichenwald",
    "name": "Kurt Eichenwald",
    "id": 215207998,
    "id_str": "215207998",
    "indices": [
     48,
     63
    ]
   }
  ],
  "urls": []
 },
 "metadata": {
  "iso_language_code": "en",
  "result_type": "recent"
 },
 "source": "<a href=\"http://twitter.com\" rel=\"nofollow\">Twitter Web Client</a>",
 "in_re

In [31]:
# Pick one status by its index and assign it to the variable t
t = statuses[34]

# Alternative: select a status by its id with a list comprehension, which
# returns a one-element list whose only item is accessed with [0]
#[ status for status in statuses 
#          if status['id'] == 316948241264549888 ][0]

# Explore the variable t to get familiar with the data structure...

print(t['retweet_count'],'\n')
print(t['favorite_count'],'\n')
print(t['entities'],'\n')
print(t['user'],'\n')
print(t['lang'])

5 

0 

{'hashtags': [{'text': 'ThingsToAvoidAtAPublicPool', 'indices': [20, 47]}], 'symbols': [], 'user_mentions': [{'screen_name': 'LiAisbackagain', 'name': 'LiA is Back Again', 'id': 888386027135668224, 'id_str': '888386027135668224', 'indices': [3, 18]}, {'screen_name': 'kurteichenwald', 'name': 'Kurt Eichenwald', 'id': 215207998, 'id_str': '215207998', 'indices': [48, 63]}], 'urls': []} 

{'id': 2391320695, 'id_str': '2391320695', 'name': 'The Deplorable Truth', 'screen_name': 'LeftyLieBuster', 'location': 'Nunyabidness', 'description': '#Bluehand Calling out the LIARS of Progressivism. List Adders get BLOCKED. Assholes get Muted (at the very least). #Gab.ai', 'url': None, 'entities': {'description': {'urls': []}}, 'protected': False, 'followers_count': 1201, 'friends_count': 1154, 'listed_count': 18, 'created_at': 'Sat Mar 15 16:42:13 +0000 2014', 'favourites_count': 17756, 'utc_offset': -25200, 'time_zone': 'Pacific Time (US & Canada)', 'geo_enabled': False, 'verified': False, '

## Example 6. Extracting text, screen names, and hashtags from tweets

In [32]:
#get the actual tweet message for each tweet
status_texts = [status['text'] for status in statuses]

# get the screen names of the users mentioned in each tweet
screen_names = [user_mention['screen_name'] 
                 for status in statuses  # for each tweet retrieved
                     for user_mention in status['entities']['user_mentions'] ]  # for each user_mention in the "entity" dict

# get the hashtags from each tweet
hashtags = [hashtag['text'] 
             for status in statuses # for each tweet retrieved 
                 for hashtag in status['entities']['hashtags']] # for each hashtag value in the "entity" dict

# Compute a collection of all words from all tweets
words = [w 
          for t in status_texts 
              for w in t.split()]
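
The nested comprehensions above flatten one list per tweet into a single list. As a sketch with two hypothetical statuses (only the fields used here are included), this is how a nested comprehension compares with the equivalent explicit loops:

```python
# Two hypothetical statuses with only the fields used here
sample_statuses = [
    {'text': 'RT @alice: hi #one',
     'entities': {'user_mentions': [{'screen_name': 'alice'}],
                  'hashtags': [{'text': 'one'}]}},
    {'text': 'plain #two #three',
     'entities': {'user_mentions': [],
                  'hashtags': [{'text': 'two'}, {'text': 'three'}]}},
]

# Nested comprehension: the outer loop comes first, the inner loop second
tags = [h['text'] for s in sample_statuses for h in s['entities']['hashtags']]

# The equivalent explicit loops
tags_loop = []
for s in sample_statuses:
    for h in s['entities']['hashtags']:
        tags_loop.append(h['text'])

print(tags)
```

A tweet with no hashtags or mentions simply contributes nothing to the flattened list.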

In [33]:
# Explore the first 5 items for each...
print(json.dumps(status_texts[0:5], indent = 1))
print(json.dumps(screen_names[0:5], indent = 1)) 
print(json.dumps(hashtags[0:5], indent = 1))
print(json.dumps(words[0:5], indent = 1))

[
 "RT @StopEatingBees: #ThingsToAvoidAtAPublicPool\n\nR. Kelly",
 "#ThingsToAvoidAtAPublicPool Getting your gear and building a private pool, only available for your closest family.",
 "Drowning. #ThingsToAvoidAtAPublicPool",
 "RT @TapleyMorgan: #CNNBlackmail posting on Reddit and getti mg blackmailed by CNN #ThingsToAvoidAtAPublicPool #CNNMemeWars",
 "The Taliban\n#ThingsToAvoidAtAPublicPool"
]
[
 "StopEatingBees",
 "TapleyMorgan",
 "geoffreyclark37",
 "SheaBrowning",
 "Truthseer1961"
]
[
 "ThingsToAvoidAtAPublicPool",
 "ThingsToAvoidAtAPublicPool",
 "ThingsToAvoidAtAPublicPool",
 "CNNBlackmail",
 "ThingsToAvoidAtAPublicPool"
]
[
 "RT",
 "@StopEatingBees:",
 "#ThingsToAvoidAtAPublicPool",
 "R.",
 "Kelly"
]


## Example 7. Creating a basic frequency distribution from the words in tweets

The **collections** module contains the **Counter** class, which counts how often each item occurs in an iterable.

Its **most_common()** method returns a list of (item, count) tuples sorted from most to least frequent.
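
A minimal sketch on a toy word list (the words are made up for illustration):

```python
from collections import Counter

toy_words = ['pool', 'the', 'pool', 'RT', 'the', 'pool']
counts = Counter(toy_words)

print(counts['pool'])         # count for a single item
print(counts.most_common(2))  # the two most frequent (item, count) pairs
```

Passing a number to `most_common()` limits the result to that many pairs.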

In [34]:
from collections import Counter

# for each list created above, print its 10 most common items (words used, screen names mentioned, hashtags used)
for item in [words, screen_names, hashtags]:
    c = Counter(item)
    print(c.most_common(10)) # top 10
    print()

[('#ThingsToAvoidAtAPublicPool', 82), ('RT', 24), ('the', 19), ('The', 14), ('in', 12), ('pool', 9), ('and', 7), ('a', 5), ('on', 5), ('your', 4)]

[('TWSnyderman', 2), ('StopEatingBees', 1), ('TapleyMorgan', 1), ('geoffreyclark37', 1), ('SheaBrowning', 1), ('Truthseer1961', 1), ('Scottdell11', 1), ('robertrobq', 1), ('Cry_C', 1), ('beaucoupbougee', 1)]

[('ThingsToAvoidAtAPublicPool', 84), ('CNNBlackmail', 2), ('CNNMemeWars', 2), ('scottdell', 2), ('TuesdayThoughts', 2), ('ThingsToAvoidّAtAPuَblicًPool', 1), ('TuُeًsdaُyThouُghts', 1), ('RetroGaming', 1), ('GamersUnite', 1), ('immigrants', 1)]



## Example 8. Create a prettyprint function to display tuples in a nice tabular format

* Using **advanced string formatting**
 * **{:20}** pads the value to a total width of 20 characters; strings are left-aligned by default, so the spaces go at the end
 * **{:^20}** centers the value within 20 characters
 * **{:>20}** right-aligns the value within 20 characters, so the spaces go at the start
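
A short sketch of these format specifications, using a width of 10 and square brackets so that the padding is visible. The sample values are arbitrary.

```python
left = '[{:10}]'.format('left')      # string: padded on the right by default
center = '[{:^10}]'.format('mid')    # centered
right = '[{:>10}]'.format('right')   # padded on the left
number = '[{:6}]'.format(42)         # numbers are right-aligned by default

for row in (left, center, right, number):
    print(row)
```

Note that the default alignment depends on the type: strings go left, numbers go right.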
 

In [44]:
def prettyprint_counts(label, list_of_tuples):
    # center the label within 30 characters, add "|", center the string 'Count' within 6 characters
    # print a line of *'s to separate the header from the data
    # print the key left-aligned within 30 characters and the value right-aligned within 6 characters
    
    print("\n{:^30} | {:^6}".format(label, "Count"))    
    print("*"*40)
    for k,v in list_of_tuples:
        print("{:30} | {:>6}".format(k,v))

In [45]:
for label, data in (('Word', words), 
                    ('Screen Name', screen_names), 
                    ('Hashtag', hashtags)):
    
    c = Counter(data) # count up totals of each item in each of the 3 lists above
    prettyprint_counts(label, c.most_common(10)) # print out top 10 most common items in format from function


             Word              | Count 
****************************************
#ThingsToAvoidAtAPublicPool    |     82
RT                             |     24
the                            |     19
The                            |     14
in                             |     12
pool                           |      9
and                            |      7
a                              |      5
on                             |      5
your                           |      4

         Screen Name           | Count 
****************************************
TWSnyderman                    |      2
StopEatingBees                 |      1
TapleyMorgan                   |      1
geoffreyclark37                |      1
SheaBrowning                   |      1
Truthseer1961                  |      1
Scottdell11                    |      1
robertrobq                     |      1
Cry_C                          |      1
beaucoupbougee                 |      1

           Hashtag             | Co

## Example 9. Finding the most popular retweets

In [46]:
retweets = [
            # Store a tuple of 3 values:
            (status['retweet_count'], 
             status['retweeted_status']['user']['screen_name'],
             status['text'].replace("\n","\\")) 
            
            # ... for each status ...
            for status in statuses 
            
            # ... so long as the status meets this condition.
                if 'retweeted_status' in status
           ]
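
The condition in the comprehension above matters because, in the search results, only retweets carry a `retweeted_status` key. A sketch with hypothetical statuses (only the fields used here) shows the filter, and how the resulting tuples sort by retweet count:

```python
# Hypothetical statuses; the second one is an original tweet, not a retweet
sample_statuses = [
    {'text': 'RT @bob: hello', 'retweet_count': 5,
     'retweeted_status': {'user': {'screen_name': 'bob'}}},
    {'text': 'an original tweet', 'retweet_count': 0},
    {'text': 'RT @eve: bye', 'retweet_count': 9,
     'retweeted_status': {'user': {'screen_name': 'eve'}}},
]

sample_retweets = [
    (s['retweet_count'], s['retweeted_status']['user']['screen_name'], s['text'])
    for s in sample_statuses
    if 'retweeted_status' in s  # skips the original tweet, which lacks the key
]

# Tuples compare element by element, so this orders by retweet count first
top = sorted(sample_retweets, reverse=True)
print(top)
```

Without the `if` clause, the lookup of `retweeted_status` would raise a `KeyError` on the original tweet.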

We can build another `prettyprint` function to print entire tweets with their retweet count.

We also want to split the text of the tweet into up to 3 lines, if needed.
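
One possible way to express that splitting as a reusable helper is sketched below. The function name `split_text` and its parameters are hypothetical; like the slicing approach, it lets the last piece keep whatever text remains.

```python
def split_text(text, width=50, max_lines=3):
    """Cut text into up to max_lines pieces; the last piece keeps any remainder."""
    cut = width * (max_lines - 1)
    pieces = [text[i:i + width] for i in range(0, cut, width)]
    pieces.append(text[cut:])
    return [p for p in pieces if p]  # drop empty pieces for short texts

print([len(p) for p in split_text('a' * 120)])
print(split_text('short tweet'))
```

Joining the pieces always gives back the original text, so nothing is lost for tweets longer than three full lines.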

In [47]:
row_template = "{:^7} | {:^15} | {:50}" # center, center, left align
def prettyprint_tweets(list_of_tuples):
    print()
    print(row_template.format("Count", "Screen Name", "Text"))
    print("*"*60)
    for count, screen_name, text in list_of_tuples:
        print(row_template.format(count, screen_name, text[:50]))
        if len(text) > 50:
            print(row_template.format("", "", text[50:100]))
            if len(text) > 100:
                print(row_template.format("", "", text[100:]))

In [49]:
# Sort by retweet count (descending), slice off the first 5, and display each item in the tuple

prettyprint_tweets(sorted(retweets, reverse = True)[:5])


 Count  |   Screen Name   | Text                                              
************************************************************
  67    | geoffreyclark37 | RT @geoffreyclark37: The public  #ThingsToAvoidAtA
        |                 | PublicPool                                        
  49    | StopEatingBees  | RT @StopEatingBees: #ThingsToAvoidAtAPublicPool\\R
        |                 | . Kelly                                           
  27    |      Cry_C      | RT @Cry_C: #ThingsToAvoidAtAPublicPool warm spots 
        |                 | in the water.                                     
  27    |    Cattereia    | RT @Cattereia: Looking too suspicious in the chang
        |                 | erooms. #ThingsToAvoidAtAPublicPool https://t.co/Z
        |                 | C8iYutgPN                                         
  23    |    sman9876     | RT @sman9876: That's why i love Hamburger 534\#Tue
        |                 | sdِaًyThoّughts\#ThinّgsToَAvoَidّAtAPubl