# Analysis of Twitter Data
## Geolocation and Interactive Maps
Geolocation is the process of identifying the geographic location of an object such as a mobile phone or a computer. 

Twitter allows its users to attach their location to a tweet when they publish it, in the form of latitude and longitude coordinates. With this information, we can visualise our data on interactive maps.

This lab briefly introduces the GeoJSON format and **Leaflet.js**, a JavaScript library for interactive maps, and discusses how to integrate it with the Twitter data we collected in the previous lab.

### GeoJSON

GeoJSON is a format for encoding geographic data structures. The format supports a variety of geometric types that can be used to draw the desired shapes on a map. For our examples, we just need the simplest structure, a Point. A Point is identified by its coordinates; note that GeoJSON lists them as longitude first, then latitude.

In GeoJSON, we can also represent objects such as a Feature or a FeatureCollection. The first one is basically a geometry with additional properties, while the second one is a list of features.

Our Twitter data set can be represented in GeoJSON as a FeatureCollection, where each tweet is an individual Feature with its own geometry (the aforementioned Point).

This is what the JSON structure looks like:
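
As a minimal sketch, the structure can be written as a Python dictionary and serialised with json.dumps. The coordinates, text and timestamp below are made-up sample values, and the names sample_feature and sample_collection are only illustrative.

```python
import json

# A single tweet as a GeoJSON Feature; all values are illustrative samples.
sample_feature = {
    "type": "Feature",
    "geometry": {
        "type": "Point",
        # GeoJSON order is [longitude, latitude]
        "coordinates": [-0.1276, 51.5072],
    },
    "properties": {
        "text": "Hello from London",
        "created_at": "Sat Mar 21 12:00:00 +0000 2015",
    },
}

# The whole data set is a FeatureCollection: a list of such features.
sample_collection = {
    "type": "FeatureCollection",
    "features": [sample_feature],
}

print(json.dumps(sample_collection, indent=4))
```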

### From Tweets to GeoJSON

Assuming the tweet data has been downloaded into a single file as described in the previous lab, we simply need to iterate over all the tweets looking for the coordinates field, which may or may not be present. Keep in mind that you need to use coordinates, because the geo field is deprecated (see the API documentation).
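
To make the "may or may not be present" case concrete, here is a small sketch of a helper that returns the position of a tweet, or None when it has none. The name get_point is hypothetical, and the sample dictionaries are hand-written stand-ins for real tweets. When present, the coordinates field is itself a GeoJSON Point, with the position in longitude, latitude order.

```python
def get_point(tweet):
    """Return (longitude, latitude) for a geotagged tweet, else None.

    get_point is an illustrative helper, not part of any Twitter library.
    """
    coords = tweet.get('coordinates')  # a missing key and a null value both give None
    if not coords:
        return None
    lon, lat = coords['coordinates']
    return (lon, lat)


# Hand-written stand-ins for real tweets
geotagged = {
    'text': 'hi',
    'coordinates': {'type': 'Point', 'coordinates': [2.35, 48.85]},
}
not_geotagged = {'text': 'hi', 'coordinates': None}

print(get_point(geotagged))
print(get_point(not_geotagged))
```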

The code below reads the data set, looking for tweets where the coordinates are explicitly given. Once the GeoJSON data structure has been created (in the form of a Python dictionary), the data are dumped into a file called geo_data.json:

In [None]:
import json
import sys

fname = "c:\\Program Files\\Anaconda2\\tweets_bigData_dataAnalytic.json"
count = 0  # number of lines read so far, used for the progress dots

# Tweets are stored in "fname"
with open(fname, 'r') as f:
    geo_data = {
        "type": "FeatureCollection",
        "features": []
    }
    tweets = []
    geo2 = []
    for line in f:
        count = count + 1
        if count%500 == 0:
            sys.stdout.write('.')
        if count%35000 == 0:
            sys.stdout.write('\n')
        tweet = json.loads(line)
        tweets.append(tweet)
        if tweet['coordinates']:
            geo_json_feature = {
                "type": "Feature",
                "geometry": tweet['coordinates'],
                "properties": {
                    "text": tweet['text'],
                    "created_at": tweet['created_at']
                }
            }
            geo_data['features'].append(geo_json_feature)
            geo2.append(tweet['coordinates'])
print('\nWait..')
# Save geo data
with open('geo_data.json', 'w') as fout:
    fout.write(json.dumps(geo_data, indent=4))

print('Done..')
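
As a quick sanity check on the result, and to help pick a sensible starting view for a map, one can compute the bounding box of the collected points. This is only a sketch: the function name bounding_box and the two sample features are assumptions for illustration. In the lab it would be called on geo_data.

```python
def bounding_box(geo_data):
    """Return (min_lon, min_lat, max_lon, max_lat) over all Point features.

    Returns None when the collection contains no Point features.
    """
    lons = []
    lats = []
    for feature in geo_data['features']:
        geometry = feature.get('geometry') or {}
        if geometry.get('type') == 'Point':
            # GeoJSON positions are [longitude, latitude]
            lon, lat = geometry['coordinates'][:2]
            lons.append(lon)
            lats.append(lat)
    if not lons:
        return None
    return (min(lons), min(lats), max(lons), max(lats))


# Two made-up features standing in for the real geo_data
sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature",
         "geometry": {"type": "Point", "coordinates": [2.35, 48.85]},
         "properties": {}},
        {"type": "Feature",
         "geometry": {"type": "Point", "coordinates": [-0.13, 51.51]},
         "properties": {}},
    ],
}

print(bounding_box(sample))
```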

In [None]:
import pandas as pd

def populate_tweet_df(tweets):
    """Build a DataFrame with the text, user location, country code and coordinates of each tweet."""
    df = pd.DataFrame()

    df['text'] = [tweet['text'] for tweet in tweets]

    df['location'] = [tweet['user']['location'] for tweet in tweets]

    # 'place' is None when the tweet has no associated place
    df['country_code'] = [tweet['place']['country_code']
                          if tweet['place'] is not None else ''
                          for tweet in tweets]

    # 'coordinates' holds a GeoJSON Point: [longitude, latitude].
    # Use a real NaN (not the string 'NaN') so the columns stay numeric.
    df['long'] = [tweet['coordinates']['coordinates'][0]
                  if tweet['coordinates'] is not None else float('nan')
                  for tweet in tweets]

    df['latt'] = [tweet['coordinates']['coordinates'][1]
                  if tweet['coordinates'] is not None else float('nan')
                  for tweet in tweets]

    return df

df = populate_tweet_df(tweets)

In [None]:
print(df[0:10])

### Interactive Maps with Leaflet.js

*Leaflet.js* is an open-source JavaScript library for interactive maps. You can create maps with tiles of your choice (e.g. from OpenStreetMap or MapBox), and overlay interactive components.

To prepare a web page that will host a map, you simply need to include the library and its CSS by putting the following lines in the head section of your document:

Moreover, we have all our GeoJSON data in a separate file, so we want to load the data dynamically rather than manually put all the points on the map. For this purpose, we can use jQuery, which we also need to include:

The map itself will be placed into a div element:

We’re now ready to create the map with Leaflet:
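
The steps above (library includes, jQuery, the map div, and the map set-up) can be gathered into a single page. Since the rest of the lab is in Python, the sketch below generates that page from a Python string template. The CDN URLs and library versions, the tile URL, the default starting view, and the names PAGE_TEMPLATE and build_map_page are all assumptions made for illustration, not the lab's original snippets.

```python
# HTML page with Leaflet, jQuery, a map div and the map set-up script.
# The CDN URLs and versions below are assumptions; adjust them as needed.
PAGE_TEMPLATE = """<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <!-- Leaflet CSS and JavaScript -->
  <link rel="stylesheet" href="https://unpkg.com/leaflet@1.9.4/dist/leaflet.css">
  <script src="https://unpkg.com/leaflet@1.9.4/dist/leaflet.js"></script>
  <!-- jQuery, used only to load the GeoJSON file -->
  <script src="https://code.jquery.com/jquery-3.7.1.min.js"></script>
  <style>#map { height: 600px; }</style>
</head>
<body>
  <!-- The map is drawn inside this div -->
  <div id="map"></div>
  <script>
    // Create the map and set its starting view: [latitude, longitude], zoom
    var map = L.map('map').setView([%(lat)s, %(lon)s], %(zoom)s);
    // Add OpenStreetMap tiles as the base layer
    L.tileLayer('https://tile.openstreetmap.org/{z}/{x}/{y}.png', {
      attribution: '&copy; OpenStreetMap contributors'
    }).addTo(map);
    // Load the GeoJSON file and add one marker with a popup per tweet
    $.getJSON('%(geojson)s', function (data) {
      L.geoJSON(data, {
        onEachFeature: function (feature, layer) {
          layer.bindPopup(feature.properties.text);
        }
      }).addTo(map);
    });
  </script>
</body>
</html>
"""


def build_map_page(geojson_url='geo_data.json', lat=51.5, lon=-0.1, zoom=5):
    """Return the HTML page as a string (hypothetical helper)."""
    return PAGE_TEMPLATE % {
        'geojson': geojson_url,
        'lat': lat,
        'lon': lon,
        'zoom': zoom,
    }


page = build_map_page()
print(len(page))
```

Writing the returned string to a file such as map.html, next to geo_data.json, gives a page that can be opened in a browser. Browsers generally block $.getJSON on pages opened directly from disk, so the folder most likely needs to be served by a local web server.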