# Workshop on Spatial Analysis of Twitter

This workshop demonstrates how to acquire Twitter data using the search API and how to conduct simple spatial analyses on the data.

This workshop requires Anaconda3 (64-bit Python 3.7) installed on your computer.

You can access this website at https://bit.ly/2GQ13hl

## 1. Preparation

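Install the packages needed for this workshop. The code below uses Tweepy v4 method names (for example `search_tweets`, `get_retweets`, and `get_place_trends`), so make sure Tweepy 4.0 or later is installed.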

In [1]:
!pip install tweepy

Collecting tweepy
  Using cached https://files.pythonhosted.org/packages/36/1b/2bd38043d22ade352fc3d3902cf30ce0e2f4bf285be3b304a2782a767aec/tweepy-3.8.0-py2.py3-none-any.whl
Collecting requests-oauthlib>=0.7.0 (from tweepy)
  Using cached https://files.pythonhosted.org/packages/c2/e2/9fd03d55ffb70fe51f587f20bcf407a6927eb121de86928b34d162f0b1ac/requests_oauthlib-1.2.0-py2.py3-none-any.whl
Collecting oauthlib>=3.0.0 (from requests-oauthlib>=0.7.0->tweepy)
  Using cached https://files.pythonhosted.org/packages/05/57/ce2e7a8fa7c0afb54a0581b14a65b56e62b5759dbc98e80627142b8a3704/oauthlib-3.1.0-py2.py3-none-any.whl
Installing collected packages: oauthlib, requests-oauthlib, tweepy
Successfully installed oauthlib-3.1.0 requests-oauthlib-1.2.0 tweepy-3.8.0


In [2]:
!pip install folium

Collecting folium
  Downloading https://files.pythonhosted.org/packages/72/ff/004bfe344150a064e558cb2aedeaa02ecbf75e60e148a55a9198f0c41765/folium-0.10.0-py2.py3-none-any.whl (91kB)
Collecting branca>=0.3.0 (from folium)
  Downloading https://files.pythonhosted.org/packages/63/36/1c93318e9653f4e414a2e0c3b98fc898b4970e939afeedeee6075dd3b703/branca-0.3.1-py3-none-any.whl
Installing collected packages: branca, folium
Successfully installed branca-0.3.1 folium-0.10.0


Import the packages needed for this workshop.

In [2]:
# Run the following lines if there is an error loading basemap
#import os
#os.environ['PROJ_LIB'] = '~your anaconda 3 path/Anaconda3/Library/share/'

import tweepy
import pandas as pd
#from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt

Go to this website to create an App and get its keys and tokens: https://developer.twitter.com/en/apps

<img src='images/portal interface.jpg' align='left' width='800'>

<img src='images/Twitter App setting.jpg' align='left' width='600'>

Make sure that in the "User authentication settings" under your app setting, you have selected "read and write" or "read and write and direct message" (or you will not be able to post a tweet).

You can use "http://127.0.0.1:8080" as the Callback URL and any website you know as the Website URL.

Make sure to **regenerate** the keys and tokens after confirming the settings.

In [1]:
# paste your key and secret here.
consumer_key = 'consumer_key'
consumer_secret = 'consumer_secret'
access_token = 'access_token'
access_token_secret = 'access_token_secret'
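
If you prefer not to paste secrets directly into the notebook, one option is to read them from environment variables. The sketch below is only an illustration: the variable names (`TWITTER_CONSUMER_KEY` and so on) and the helper `load_credentials` are assumptions, not part of Tweepy or of this workshop.

```python
import os

# Hypothetical environment variable names; pick whatever names you like.
ENV_NAMES = {
    "consumer_key": "TWITTER_CONSUMER_KEY",
    "consumer_secret": "TWITTER_CONSUMER_SECRET",
    "access_token": "TWITTER_ACCESS_TOKEN",
    "access_token_secret": "TWITTER_ACCESS_TOKEN_SECRET",
}


def load_credentials(env=None):
    """Return the four credentials as a dict, raising KeyError if any is missing."""
    env = os.environ if env is None else env
    missing = [name for name in ENV_NAMES.values() if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {key: env[name] for key, name in ENV_NAMES.items()}
```

Calling `load_credentials()` with no argument reads from `os.environ`; the returned values can then be assigned to `consumer_key`, `consumer_secret`, `access_token`, and `access_token_secret`.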

In [3]:
# Set up for Twitter authentication
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)

In [4]:
# Set up the Tweepy API client; wait_on_rate_limit=True makes it wait automatically when the rate limit is reached
api = tweepy.API(auth, wait_on_rate_limit=True)

---

## 2. Programmatic Manipulation of Twitter

Now, your working environment is ready for Twitter analysis.

Let's first try a few simple Twitter operations programmatically.

Full documentation of the Twitter API and Tweepy is available at:

- [Twitter APIs](https://developer.twitter.com/en/docs.html)
- [Tweepy documentation](http://docs.tweepy.org/en/v4.8.0/)

First, let's post a message on Twitter.

**Note**: if you don't want to disturb your followers with a meaningless tweet, don't run the following block of code.

In [110]:
# Post a tweet from Python
test_tweet = api.update_status("DRILL: I'm creating a robot to tweet!")

In [119]:
test_tweet._json

{'created_at': 'Wed Mar 30 02:12:17 +0000 2022',
 'id': 1508990464728317953,
 'id_str': '1508990464728317953',
 'text': "DRILL: I'm creating a robot to tweet!",
 'truncated': False,
 'entities': {'hashtags': [], 'symbols': [], 'user_mentions': [], 'urls': []},
 'source': '<a href="http://hennarot.forest.usf.edu/main/depts/geosci/students/jinwenxu/" rel="nofollow">Twitter Feed 1224</a>',
 'in_reply_to_status_id': None,
 'in_reply_to_status_id_str': None,
 'in_reply_to_user_id': None,
 'in_reply_to_user_id_str': None,
 'in_reply_to_screen_name': None,
 'user': {'id': 1470935353120854018,
  'id_str': '1470935353120854018',
  'name': 'Jinmin',
  'screen_name': 'Jinmin7517',
  'location': '',
  'description': '',
  'url': None,
  'entities': {'description': {'urls': []}},
  'protected': False,
  'followers_count': 0,
  'friends_count': 1,
  'listed_count': 0,
  'created_at': 'Wed Dec 15 01:55:18 +0000 2021',
  'favourites_count': 0,
  'utc_offset': None,
  'time_zone': None,
  'geo_enabled'

Delete the tweet you just posted.

In [120]:
api.destroy_status(test_tweet.id_str)

Status(_api=<tweepy.api.API object at 0x00000254C7BF6160>, _json={'created_at': 'Wed Mar 30 02:12:17 +0000 2022', 'id': 1508990464728317953, 'id_str': '1508990464728317953', 'text': "DRILL: I'm creating a robot to tweet!", 'truncated': False, 'entities': {'hashtags': [], 'symbols': [], 'user_mentions': [], 'urls': []}, 'source': '<a href="http://hennarot.forest.usf.edu/main/depts/geosci/students/jinwenxu/" rel="nofollow">Twitter Feed 1224</a>', 'in_reply_to_status_id': None, 'in_reply_to_status_id_str': None, 'in_reply_to_user_id': None, 'in_reply_to_user_id_str': None, 'in_reply_to_screen_name': None, 'user': {'id': 1470935353120854018, 'id_str': '1470935353120854018', 'name': 'Jinmin', 'screen_name': 'Jinmin7517', 'location': '', 'description': '', 'url': None, 'entities': {'description': {'urls': []}}, 'protected': False, 'followers_count': 0, 'friends_count': 1, 'listed_count': 0, 'created_at': 'Wed Dec 15 01:55:18 +0000 2021', 'favourites_count': 0, 'utc_offset': None, 'time_zone'

### 2.1 Get the first 100 retweets of a tweet

You can get the tweet ID from the address bar of your browser when you open a tweet.

<img src='images/football_tweet_id.jpg' align='left' width='400'>

<img src='images/football_content.jpg' align='left' width='400'>

Get the retweets of the tweet (up to the 100 most recent).

In [5]:
retweets_workshop = api.get_retweets(1508871111383027712)

Get the display name of the user who posted the first retweet in the list.

In [6]:
retweets_workshop[0]._json['user']['name']

'Joe Jimenez'

Print the screen names, display names, and profile locations of the users who retweeted the tweet.

Note: the Twitter API returns at most 100 retweets of a tweet.

In [7]:
[[tweet.user.screen_name, tweet.user.name, tweet.user.location] for tweet in retweets_workshop]

[['donjoseauto', 'Joe Jimenez', ''],
 ['CoachDaPrato', '𝑪𝒐𝒂𝒄𝒉 𝑫𝒂𝒏𝒊𝒆𝒍 𝑫𝒂 𝑷𝒓𝒂𝒕𝒐', 'Tampa, FL'],
 ['USFHerd', 'USF T w i t t e r Herd', 'Tampa, FL'],
 ['NFL_UNICORN1', 'CFB UNICORN', ''],
 ['CoachTTrickett', 'Travis Trickett', 'Tampa, FL']]
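
To summarize where the retweeting users say they are located, the profile locations can be tallied. A minimal sketch, assuming the screen names and locations shown in the output above have been copied into a plain list (empty locations are skipped):

```python
from collections import Counter

# (screen_name, profile location) pairs copied from the output above
retweeters = [
    ("donjoseauto", ""),
    ("CoachDaPrato", "Tampa, FL"),
    ("USFHerd", "Tampa, FL"),
    ("NFL_UNICORN1", ""),
    ("CoachTTrickett", "Tampa, FL"),
]

# Count each non-empty profile location
location_counts = Counter(location for _, location in retweeters if location)
print(location_counts.most_common())
```

Profile locations are free text typed by users, so in a larger sample the same place may appear under several spellings.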

### 2.2 Current trends in the world

Get the list of cities where trends are available

In [121]:
city_ls = api.available_trends()

Convert the list (in JSON format) into a dataframe (i.e. a table).

In [122]:
df_city = pd.DataFrame(city_ls)

Print a sub-list (20 of the 467) of cities where trends are available

In [123]:
df_city.head(20)

Unnamed: 0,name,placeType,url,parentid,country,woeid,countryCode
0,Worldwide,"{'code': 19, 'name': 'Supername'}",http://where.yahooapis.com/v1/place/1,0,,1,
1,Winnipeg,"{'code': 7, 'name': 'Town'}",http://where.yahooapis.com/v1/place/2972,23424775,Canada,2972,CA
2,Ottawa,"{'code': 7, 'name': 'Town'}",http://where.yahooapis.com/v1/place/3369,23424775,Canada,3369,CA
3,Quebec,"{'code': 7, 'name': 'Town'}",http://where.yahooapis.com/v1/place/3444,23424775,Canada,3444,CA
4,Montreal,"{'code': 7, 'name': 'Town'}",http://where.yahooapis.com/v1/place/3534,23424775,Canada,3534,CA
5,Toronto,"{'code': 7, 'name': 'Town'}",http://where.yahooapis.com/v1/place/4118,23424775,Canada,4118,CA
6,Edmonton,"{'code': 7, 'name': 'Town'}",http://where.yahooapis.com/v1/place/8676,23424775,Canada,8676,CA
7,Calgary,"{'code': 7, 'name': 'Town'}",http://where.yahooapis.com/v1/place/8775,23424775,Canada,8775,CA
8,Vancouver,"{'code': 7, 'name': 'Town'}",http://where.yahooapis.com/v1/place/9807,23424775,Canada,9807,CA
9,Birmingham,"{'code': 7, 'name': 'Town'}",http://where.yahooapis.com/v1/place/12723,23424975,United Kingdom,12723,GB


In [124]:
len(df_city)

467

Look up Tampa in the list of cities.

In [125]:
df_city[df_city['name']=='Tampa']

Unnamed: 0,name,placeType,url,parentid,country,woeid,countryCode
394,Tampa,"{'code': 7, 'name': 'Town'}",http://where.yahooapis.com/v1/place/2503863,23424977,United States,2503863,US


In [44]:
city_id = df_city[df_city['name']=='Tampa']['woeid']

#print city_id
city_id

394    2503863
Name: woeid, dtype: int64

Return the trends in Tampa.

Note: `city_id` is a pandas Series, so extract its first element and convert it to an integer.

In [45]:
# use Tampa as an example
trend_sf = api.get_place_trends(int(city_id.iloc[0]))

Print the raw response in JSON format

In [46]:
# the response is a list holding a single dictionary, so this slice prints the whole response
trend_sf[0:20]

[{'trends': [{'name': '#actualkpopopinions',
    'url': 'http://twitter.com/search?q=%23actualkpopopinions',
    'promoted_content': None,
    'query': '%23actualkpopopinions',
    'tweet_volume': 11618},
   {'name': '#TarHeels',
    'url': 'http://twitter.com/search?q=%23TarHeels',
    'promoted_content': None,
    'query': '%23TarHeels',
    'tweet_volume': None},
   {'name': 'Ferreira',
    'url': 'http://twitter.com/search?q=Ferreira',
    'promoted_content': None,
    'query': 'Ferreira',
    'tweet_volume': 24119},
   {'name': 'Gio Reyna',
    'url': 'http://twitter.com/search?q=%22Gio+Reyna%22',
    'promoted_content': None,
    'query': '%22Gio+Reyna%22',
    'tweet_volume': None},
   {'name': 'Rick Scott',
    'url': 'http://twitter.com/search?q=%22Rick+Scott%22',
    'promoted_content': None,
    'query': '%22Rick+Scott%22',
    'tweet_volume': 31789},
   {'name': '#NASCAR',
    'url': 'http://twitter.com/search?q=%23NASCAR',
    'promoted_content': None,
    'query': '%23NAS

In [47]:
# print first 5 trends in the selected city
trend_sf[0]['trends'][0:5]

[{'name': '#actualkpopopinions',
  'url': 'http://twitter.com/search?q=%23actualkpopopinions',
  'promoted_content': None,
  'query': '%23actualkpopopinions',
  'tweet_volume': 11618},
 {'name': '#TarHeels',
  'url': 'http://twitter.com/search?q=%23TarHeels',
  'promoted_content': None,
  'query': '%23TarHeels',
  'tweet_volume': None},
 {'name': 'Ferreira',
  'url': 'http://twitter.com/search?q=Ferreira',
  'promoted_content': None,
  'query': 'Ferreira',
  'tweet_volume': 24119},
 {'name': 'Gio Reyna',
  'url': 'http://twitter.com/search?q=%22Gio+Reyna%22',
  'promoted_content': None,
  'query': '%22Gio+Reyna%22',
  'tweet_volume': None},
 {'name': 'Rick Scott',
  'url': 'http://twitter.com/search?q=%22Rick+Scott%22',
  'promoted_content': None,
  'query': '%22Rick+Scott%22',
  'tweet_volume': 31789}]
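
As the output above shows, `tweet_volume` can be `None` for some trends. A minimal sketch of ranking trends by volume in plain Python while keeping the trends without a volume at the end; the helper name `sort_by_volume` is hypothetical and the data is a subset of the output above:

```python
# A subset of the fields from the trends printed above
trends = [
    {"name": "#actualkpopopinions", "tweet_volume": 11618},
    {"name": "#TarHeels", "tweet_volume": None},
    {"name": "Ferreira", "tweet_volume": 24119},
    {"name": "Gio Reyna", "tweet_volume": None},
    {"name": "Rick Scott", "tweet_volume": 31789},
]


def sort_by_volume(trends):
    """Sort trends by tweet_volume, largest first; trends with no volume go last."""
    return sorted(
        trends,
        key=lambda t: (t["tweet_volume"] is None, -(t["tweet_volume"] or 0)),
    )


ranked = sort_by_volume(trends)
print([t["name"] for t in ranked])
```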

Organize the Tampa trends in a table (dataframe)

In [48]:
trend_ls = [[trend['name'], trend['url'], trend['tweet_volume']] for trend in trend_sf[0]['trends']]

df_trend = pd.DataFrame(trend_ls,columns=['name','url','tweet_volume'])

In [49]:
# Sort the trends by tweet volume in descending order
df_trend.sort_values("tweet_volume", inplace = True, ascending = False)

# Print the top 10 trends ranked by tweet volume
df_trend.head(10)

Unnamed: 0,name,url,tweet_volume
30,Beyoncé,http://twitter.com/search?q=Beyonc%C3%A9,203721.0
31,Dune,http://twitter.com/search?q=Dune,150948.0
34,Encanto,http://twitter.com/search?q=Encanto,120818.0
42,Zendaya,http://twitter.com/search?q=Zendaya,117653.0
48,Duke,http://twitter.com/search?q=Duke,116909.0
45,World Cup,http://twitter.com/search?q=%22World+Cup%22,111669.0
36,Ariana,http://twitter.com/search?q=Ariana,95088.0
14,billie,http://twitter.com/search?q=billie,68328.0
17,Carolina,http://twitter.com/search?q=Carolina,64003.0
37,Bond,http://twitter.com/search?q=Bond,61730.0


---

## 3. Acquiring Tweets using the Search API

### 3.1 Search using keywords

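Get the first trend in the list returned by Twitter. Even after sorting, `df_trend.name[0]` selects the row whose index label is 0 (the first trend in Twitter's original order), not the trend with the highest tweet volume.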

In [126]:
df_trend.name[0]

'#actualkpopopinions'
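
The difference between label-based and position-based selection on a sorted dataframe can be seen in a small sketch. The three rows are taken from the trend output earlier in this section; the variable names are only for illustration.

```python
import pandas as pd

# Three trends in the order Twitter returned them (index labels 0, 1, 2)
df = pd.DataFrame({
    "name": ["#actualkpopopinions", "Ferreira", "Rick Scott"],
    "tweet_volume": [11618, 24119, 31789],
})

# Sorting reorders the rows but keeps their original index labels
df_sorted = df.sort_values("tweet_volume", ascending=False)

by_label = df_sorted["name"].loc[0]      # row whose index label is 0
by_position = df_sorted["name"].iloc[0]  # first row after sorting

print(by_label, "|", by_position)
```

Use `.iloc[0]` when you want the trend with the highest tweet volume, and `.loc[0]` when you want the first trend in the original order.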

Use this trend as the search keyword.

In [127]:
# Define the search term as a variable
search_words = df_trend.name[0] # you can set the search words to any keyword you are interested in

Search for _n_ tweets using the keyword. The search is not restricted by location, so it returns matching tweets from anywhere in the world.

In [128]:
# Set up a Tweepy cursor and retrieve 5 tweets matching the search parameters
tweets = tweepy.Cursor(api.search_tweets,
              q=search_words,
              lang="en",
              count = 5).items(5)
[tweet.text for tweet in tweets]

["RT @yeongenre: it's not too late to stan seventeen\U0001faf6 stan legends everyone! #actualkpopopinions\n https://t.co/GJmwHX17hi",
 'RT @coffeemaze_: when txt debuted, everyone made fun of their no position/all-rounder concept but now? Suddenly all new 4th gen groups are…',
 'RT @coffeemaze_: when txt debuted, everyone made fun of their no position/all-rounder concept but now? Suddenly all new 4th gen groups are…',
 'RT @fltrjwy: ateez have the best vocal line of 4th gen and one of the best in the industry  #actualkpopopinions https://t.co/MJonq9KuWB',
 'RT @tamyyx1: If only X1 didnt disband they could’ve been the real 4th gen bg leader 🤧\n#actualkpopopinions']

Because many retweets simply repeat the original tweets, we can add a filter to exclude retweets and keep only the original tweets.

In [129]:
new_search = search_words + " -filter:retweets"
new_search

'#actualkpopopinions -filter:retweets'
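
If you build several queries this way, the string concatenation can be wrapped in a small helper. This is a sketch only; `build_query` is a hypothetical function name, not part of Tweepy.

```python
def build_query(keyword, exclude_retweets=True):
    """Return a search query string, optionally excluding retweets."""
    query = keyword.strip()
    if exclude_retweets:
        # Same operator as used in the cell above
        query += " -filter:retweets"
    return query


print(build_query("#actualkpopopinions"))
print(build_query("beach", exclude_retweets=False))
```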

Now, you can see only original tweets are retrieved.

In [130]:
# Use a cursor to retrieve 5 tweets (most recent sample) using the keywords.
tweets = tweepy.Cursor(api.search_tweets,
                       q=new_search,
                       lang="en",
                       count = 5).items(5)

# store the retrieved tweets in a list
tweets_ls = [[tweet.created_at, tweet.user.screen_name, tweet.text, tweet.user.location] for tweet in tweets]

Print to see the list.

In [131]:
tweets_ls

[[datetime.datetime(2022, 3, 29, 21, 9, 31, tzinfo=datetime.timezone.utc),
  'shilala123',
  "Once calling Straykids sales inorganic but at the same time they're the main reason why Twice album sales are inorg… https://t.co/WsKYJ0YhSu",
  ''],
 [datetime.datetime(2022, 3, 29, 18, 39, 38, tzinfo=datetime.timezone.utc),
  'skz_addicted05',
  'who’s ur bias from\n#StrayKids \n\n#BlessingsWait4SKZ \n#actualkpopopinions \n#ODDINARY',
  'Norway | 16 | She/her'],
 [datetime.datetime(2022, 3, 29, 16, 22, 9, tzinfo=datetime.timezone.utc),
  'SoumyaS59729407',
  'As for some Stays too .... Stop discrediting other groups achivements... We all know well enough how each and every… https://t.co/SqI0DFuI5N',
  'stayville'],
 [datetime.datetime(2022, 3, 29, 15, 37, 23, tzinfo=datetime.timezone.utc),
  'gfofbeomgyu',
  "THAT ONE TIME WHEN BEOMGYU ALMOST ENDED YEONJUN'S CAREER 😮\u200d💨😭 \n#BEOMGYU #actualkpopopinions #TXT_BEOMGYU #txt https://t.co/ef6PCBGv4O",
  ''],
 [datetime.datetime(2022, 3, 29, 13,

Convert the list into a dataframe and print it out

In [132]:
tweets_df = pd.DataFrame(tweets_ls)
tweets_df

Unnamed: 0,0,1,2,3
0,2022-03-29 21:09:31+00:00,shilala123,Once calling Straykids sales inorganic but at ...,
1,2022-03-29 18:39:38+00:00,skz_addicted05,who’s ur bias from\n#StrayKids \n\n#BlessingsW...,Norway | 16 | She/her
2,2022-03-29 16:22:09+00:00,SoumyaS59729407,As for some Stays too .... Stop discrediting o...,stayville
3,2022-03-29 15:37:23+00:00,gfofbeomgyu,THAT ONE TIME WHEN BEOMGYU ALMOST ENDED YEONJU...,
4,2022-03-29 13:44:21+00:00,yoooyves,they ate #actualkpopopinions https://t.co/87db...,LOONAPROTECT


### 3.2 Search using keywords and locations

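Search for a keyword (here, "beach") in tweets around Tampa. The `geocode` parameter has the form `latitude,longitude,radius`. Note that the code below uses a radius of `20000mi`; set it to `200mi` to restrict the search to a 200-mile range.

First, let's check the top 10 trending topics in the selected city (Tampa).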

In [133]:
df_trend.head(10)

Unnamed: 0,name,url,tweet_volume
30,Beyoncé,http://twitter.com/search?q=Beyonc%C3%A9,203721.0
31,Dune,http://twitter.com/search?q=Dune,150948.0
34,Encanto,http://twitter.com/search?q=Encanto,120818.0
42,Zendaya,http://twitter.com/search?q=Zendaya,117653.0
48,Duke,http://twitter.com/search?q=Duke,116909.0
45,World Cup,http://twitter.com/search?q=%22World+Cup%22,111669.0
36,Ariana,http://twitter.com/search?q=Ariana,95088.0
14,billie,http://twitter.com/search?q=billie,68328.0
17,Carolina,http://twitter.com/search?q=Carolina,64003.0
37,Bond,http://twitter.com/search?q=Bond,61730.0


The following code may take a few minutes to collect the tweets, depending on the number of tweets.

In [134]:
# new_search = "#new_search -filter:retweets"
#new_search = " -filter:retweets"

# use cursor to send your request with parameters
tweets = tweepy.Cursor(api.search_tweets,
                   q="beach",
                   geocode = "27.9506,-82.4572,20000mi",
                   lang="en").items(100)

# store the results as a list
search_result = [[tweet.user.screen_name, tweet.text, tweet.user.location,tweet.place] for tweet in tweets]
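
The `geocode` value is a single string of the form `latitude,longitude,radius`, where the radius carries a unit of `mi` or `km`. A minimal sketch of a helper that assembles and sanity-checks this string; `geocode_string` is a hypothetical name used only for illustration.

```python
def geocode_string(lat, lon, radius, unit="mi"):
    """Build a 'latitude,longitude,radius' string for the search geocode parameter."""
    if unit not in ("mi", "km"):
        raise ValueError("unit must be 'mi' or 'km'")
    if not -90 <= lat <= 90:
        raise ValueError("latitude must be between -90 and 90")
    if not -180 <= lon <= 180:
        raise ValueError("longitude must be between -180 and 180")
    if radius <= 0:
        raise ValueError("radius must be positive")
    return f"{lat},{lon},{radius}{unit}"


# Downtown Tampa with a 200-mile radius
tampa_200mi = geocode_string(27.9506, -82.4572, 200)
print(tampa_200mi)
```

The resulting string can be passed as the `geocode` argument in the search call above.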

Print the search results, then convert them into a dataframe

In [135]:
search_result

[['vaenergy',
  '@SethQuick5 @mkobach great history, great foodie town, 1 hour to mountains, 1 1/2 hour to the beach, 2 hours to Was… https://t.co/R84RdcotZL',
  'Key Largo FL',
  None],
 ['gwardhome',
  'Bumbling Buffon Beach Cosplay Lawyer @DWUhlfelderLaw starts campaign page @DanielUhlfelder .\nTwice the trolling twi… https://t.co/Zpi1dytICK',
  'Plantation, FL',
  Place(_api=<tweepy.api.API object at 0x00000254C7BF6160>, id='7df9a00dcf914d5e', url='https://api.twitter.com/1.1/geo/id/7df9a00dcf914d5e.json', place_type='city', name='Plantation', full_name='Plantation, FL', country_code='US', country='United States', contained_within=[], bounding_box=BoundingBox(_api=<tweepy.api.API object at 0x00000254C7BF6160>, type='Polygon', coordinates=[[[-80.330201, 26.088262], [-80.1968332, 26.088262], [-80.1968332, 26.160753], [-80.330201, 26.160753]]]), attributes={})],
 ['EdPiotrowski',
  'The ferris wheel at Broadway at the Beach still shines brightly in the colors of the Ukrainian flag to 

In [136]:
df_result = pd.DataFrame(data=search_result, 
                    columns=['user', "text","location","place"])

Preview the first 5 tweets

In [137]:
df_result.head()

Unnamed: 0,user,text,location,place
0,vaenergy,"@SethQuick5 @mkobach great history, great food...",Key Largo FL,
1,gwardhome,Bumbling Buffon Beach Cosplay Lawyer @DWUhlfel...,"Plantation, FL",Place(_api=<tweepy.api.API object at 0x0000025...
2,EdPiotrowski,The ferris wheel at Broadway at the Beach stil...,"Myrtle Beach, SC",
3,claudefla01,Spectacular beach front condo with panoramic o...,miami beach,
4,jakelkapri,"Ugh, quick beach run won’t hurt 😩","Orlando, fl",


Preview the first 5 tweets with geotags

In [138]:
df_result[df_result['place'].notna()].head()

Unnamed: 0,user,text,location,place
1,gwardhome,Bumbling Buffon Beach Cosplay Lawyer @DWUhlfel...,"Plantation, FL",Place(_api=<tweepy.api.API object at 0x0000025...
9,What2WearWhere,@megbraffdesigns What a fab opening! Don't m...,New York,Place(_api=<tweepy.api.API object at 0x0000025...
10,NWJS_jobs,"See our latest Vero Beach, FL job opening. htt...",,Place(_api=<tweepy.api.API object at 0x0000025...
12,donlexofficial,@majorleaguedjz 🅿️ushing 🅿️iano to the world 🌍...,"Miami, FL",Place(_api=<tweepy.api.API object at 0x0000025...
40,D3zMix,Last night in #pcb so of course we have to mak...,FL USA,Place(_api=<tweepy.api.API object at 0x0000025...


### 3.3 Check how many tweets are geotagged

In [77]:
geo_tweets = len(df_result[df_result['place'].notna()]) # tweets that actually have geotags
all_tweets = len(df_result) # all retrieved tweets

print("%s out of the %s retrieved tweets actually have geotags" % (geo_tweets, all_tweets))

21 out of the 100 retrieved tweets actually have geotags


#### Copy tweets with geotags to a new dataframe called "geotags"

In [78]:
geotags = df_result.loc[df_result['place'].notna()].copy()

#### Get the place names and view where the first 5 tweets are from

In [79]:
geotags['place_name'] = geotags.place.apply(lambda s:s.name)

In [80]:
geotags.head()

Unnamed: 0,user,text,location,place,place_name
6,D3zMix,@alt_lyfe heavy out here in #pcb #coyoteugly ...,FL USA,Place(_api=<tweepy.api.API object at 0x0000025...,Panama City Beach
7,divinemoira,Swing through life as fearlessly as you did wh...,,Place(_api=<tweepy.api.API object at 0x0000025...,Sarasota
22,divinemoira,Dance with me… @ Lido Beach Resort https://t.c...,,Place(_api=<tweepy.api.API object at 0x0000025...,Sarasota
24,paulleary,"At Palm Beach Pride, 30 LGBTQ couples who marr...","Miami, FL",Place(_api=<tweepy.api.API object at 0x0000025...,Fort Lauderdale
27,MrsGinaC,Delicious blackened sea food platter and muscl...,"Oswego, IL",Place(_api=<tweepy.api.API object at 0x0000025...,Aunt Kate's


#### Check the place information and parse it into the dataframe

In [81]:
geotags.place[min(geotags.index)]

Place(_api=<tweepy.api.API object at 0x00000254C2A7A3D0>, id='9ebd5acfac2301ba', url='https://api.twitter.com/1.1/geo/id/9ebd5acfac2301ba.json', place_type='city', name='Panama City Beach', full_name='Panama City Beach, FL', country_code='US', country='United States', contained_within=[], bounding_box=BoundingBox(_api=<tweepy.api.API object at 0x00000254C2A7A3D0>, type='Polygon', coordinates=[[[-85.95802, 30.1650609], [-85.7860766, 30.1650609], [-85.7860766, 30.266595], [-85.95802, 30.266595]]]), attributes={})

#### Print the bounding box of the geotag

In [82]:
geotags.place[min(geotags.index)].bounding_box

BoundingBox(_api=<tweepy.api.API object at 0x00000254C2A7A3D0>, type='Polygon', coordinates=[[[-85.95802, 30.1650609], [-85.7860766, 30.1650609], [-85.7860766, 30.266595], [-85.95802, 30.266595]]])

#### Print the coordinates of the bounding box

In [83]:
geotags.place[min(geotags.index)].bounding_box.coordinates[0]

[[-85.95802, 30.1650609],
 [-85.7860766, 30.1650609],
 [-85.7860766, 30.266595],
 [-85.95802, 30.266595]]

#### Generate a column called bounding_box to store the bounding box information

In [84]:
geotags['bounding_box'] = geotags.place.apply(lambda s:s.bounding_box.coordinates[0])

In [85]:
geotags.head()

Unnamed: 0,user,text,location,place,place_name,bounding_box
6,D3zMix,@alt_lyfe heavy out here in #pcb #coyoteugly ...,FL USA,Place(_api=<tweepy.api.API object at 0x0000025...,Panama City Beach,"[[-85.95802, 30.1650609], [-85.7860766, 30.165..."
7,divinemoira,Swing through life as fearlessly as you did wh...,,Place(_api=<tweepy.api.API object at 0x0000025...,Sarasota,"[[-82.588866, 27.293114], [-82.477281, 27.2931..."
22,divinemoira,Dance with me… @ Lido Beach Resort https://t.c...,,Place(_api=<tweepy.api.API object at 0x0000025...,Sarasota,"[[-82.588866, 27.293114], [-82.477281, 27.2931..."
24,paulleary,"At Palm Beach Pride, 30 LGBTQ couples who marr...","Miami, FL",Place(_api=<tweepy.api.API object at 0x0000025...,Fort Lauderdale,"[[-80.20811, 26.080935], [-80.0902351, 26.0809..."
27,MrsGinaC,Delicious blackened sea food platter and muscl...,"Oswego, IL",Place(_api=<tweepy.api.API object at 0x0000025...,Aunt Kate's,"[[-81.31005883828144, 29.949614861868813], [-8..."


#### Compute the latitude and longitude of each bounding box centroid, then check the dataframe

Store the centroids in a column 'point'

In [86]:
geotags['point']  = geotags['bounding_box'].apply(lambda s: [(s[0][1]+s[2][1])/2,(s[0][0]+s[2][0])/2])

Store the latitude of the centroids in the column 'lat'

In [87]:
geotags['lat']  = geotags['bounding_box'].apply(lambda s: (s[0][1]+s[2][1])/2)

Store the longitude of the centroids in the column 'lon'

In [88]:
geotags['lon']  = geotags['bounding_box'].apply(lambda s: (s[0][0]+s[2][0])/2)
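
The centroid used above is the midpoint of two opposite corners, which equals the mean of all four corners when the bounding box is a rectangle. A small sketch with a hypothetical helper `bbox_centroid`, using the Panama City Beach coordinates printed earlier:

```python
def bbox_centroid(ring):
    """Return [lat, lon] as the mean of the corners of a ring of [lon, lat] points."""
    lons = [point[0] for point in ring]
    lats = [point[1] for point in ring]
    return [sum(lats) / len(lats), sum(lons) / len(lons)]


# Bounding box corners in [longitude, latitude] order, as returned by Twitter
panama_city_beach = [
    [-85.95802, 30.1650609],
    [-85.7860766, 30.1650609],
    [-85.7860766, 30.266595],
    [-85.95802, 30.266595],
]

centroid = bbox_centroid(panama_city_beach)
print(centroid)
```

Note the order: the bounding box stores `[longitude, latitude]`, while the helper returns `[latitude, longitude]`, matching the `point` column.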

Print to see the dataframe again.

You'll see the centroids, latitude, and longitude are added as columns in the dataframe.

Note: the point column is redundant with the lat and lon columns. We create all of these columns only to demonstrate mapping in the next step.

In [89]:
geotags.head()

Unnamed: 0,user,text,location,place,place_name,bounding_box,point,lat,lon
6,D3zMix,@alt_lyfe heavy out here in #pcb #coyoteugly ...,FL USA,Place(_api=<tweepy.api.API object at 0x0000025...,Panama City Beach,"[[-85.95802, 30.1650609], [-85.7860766, 30.165...","[30.215827949999998, -85.8720483]",30.215828,-85.872048
7,divinemoira,Swing through life as fearlessly as you did wh...,,Place(_api=<tweepy.api.API object at 0x0000025...,Sarasota,"[[-82.588866, 27.293114], [-82.477281, 27.2931...","[27.3411215, -82.5330735]",27.341121,-82.533074
22,divinemoira,Dance with me… @ Lido Beach Resort https://t.c...,,Place(_api=<tweepy.api.API object at 0x0000025...,Sarasota,"[[-82.588866, 27.293114], [-82.477281, 27.2931...","[27.3411215, -82.5330735]",27.341121,-82.533074
24,paulleary,"At Palm Beach Pride, 30 LGBTQ couples who marr...","Miami, FL",Place(_api=<tweepy.api.API object at 0x0000025...,Fort Lauderdale,"[[-80.20811, 26.080935], [-80.0902351, 26.0809...","[26.150368, -80.14917255]",26.150368,-80.149173
27,MrsGinaC,Delicious blackened sea food platter and muscl...,"Oswego, IL",Place(_api=<tweepy.api.API object at 0x0000025...,Aunt Kate's,"[[-81.31005883828144, 29.949614861868813], [-8...","[29.949614861868813, -81.31005883828144]",29.949615,-81.310059
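
Because the search in section 3.2 is centered on Tampa, one quick sanity check is the great-circle distance from the search center to a centroid. The sketch below uses the haversine formula with an assumed mean Earth radius of 3958.8 miles; `haversine_miles` is a hypothetical helper, and the Sarasota centroid is taken from the table above.

```python
import math

EARTH_RADIUS_MI = 3958.8  # assumed mean Earth radius in miles


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MI * math.asin(math.sqrt(a))


tampa = (27.9506, -82.4572)              # search center used in section 3.2
sarasota = (27.3411215, -82.5330735)     # centroid from the table above

distance = haversine_miles(*tampa, *sarasota)
print(round(distance, 1), "miles")
```

Applying the same function to every row would show which geotagged tweets fall inside a chosen radius.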


---

## 4. Spatial visualization using the folium package

Import the folium package to create an interactive map.

In [90]:
import folium

Create a basemap.

In [91]:
#oahu = folium.Map(location = [21.473,-157.9868],zoom_start = 10)
maptweet = folium.Map()

Add the tweets into the basemap

In [92]:
for i, row in geotags.iterrows():
    folium.Marker(row.point,popup = row.text).add_to(maptweet)

Zoom in to the extent of the tweets

In [93]:
maptweet.fit_bounds([[min(geotags.lat),min(geotags.lon)],[max(geotags.lat),max(geotags.lon)]])
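
The argument passed to `fit_bounds` above is a pair of corners: the south-west corner `[min lat, min lon]` and the north-east corner `[max lat, max lon]`. A minimal sketch of computing these corners from a list of `[lat, lon]` points; `bounds_of` is a hypothetical helper and the points are centroids from the earlier table.

```python
def bounds_of(points):
    """Return [[south, west], [north, east]] for a list of [lat, lon] points."""
    lats = [point[0] for point in points]
    lons = [point[1] for point in points]
    return [[min(lats), min(lons)], [max(lats), max(lons)]]


# Centroids in [lat, lon] order, copied from the 'point' column shown earlier
points = [
    [30.215827949999998, -85.8720483],
    [27.3411215, -82.5330735],
    [26.150368, -80.14917255],
    [29.949614861868813, -81.31005883828144],
]

bounds = bounds_of(points)
print(bounds)
```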

In [94]:
display(maptweet)