# Social Media Disaster Alert System

## Alert Map Creation with Bokeh Application

Using the dataset created by adding a "label" column to the actual Twitter data, we visualize the locations of tweets on a map of the Greater New York City region, which includes New York City and parts of Yonkers, Long Island, and New Jersey.

In [1]:
#pip install bokeh

In [2]:
import pandas as pd
import numpy as np

import bokeh
from bokeh.io import output_file, show
from bokeh.models import ColumnDataSource, GMapOptions
from bokeh.plotting import gmap
%matplotlib inline

As usual when reading a dataset into a notebook, some cleaning is required first. Once again, we removed the "Unnamed: 0" column and checked the column names and data types.

In [11]:
df = pd.read_csv('./Data/emergency_location.csv')

In [12]:
df.drop(columns='Unnamed: 0', inplace=True)
df.head()

,time,text,lat,lon,label
0,2019-07-30 16:45:02,RT @BloodAid: #Hyderabad #Emergency Need A+ #b...,40.754305,-74.144927,1
1,2019-07-30 16:43:29,RT @HoxworthUC: #CRITICAL APPEAL: Due to multi...,40.856658,-74.101605,1
2,2019-07-30 16:43:00,"RT @terapump_socal: No Leak, No Hassle and No ...",40.625146,-73.771159,0
3,2019-07-30 16:42:56,RT @BloodAid: #Hyderabad #Emergency Need A+ #b...,40.871323,-73.766075,1
4,2019-07-30 16:42:19,RT @BloodAid: #Hyderabad #Emergency Need A+ #b...,40.619937,-73.992653,1


In [13]:
df.shape

(4334, 5)

In [14]:
df.columns

Index(['time', 'text', 'lat', 'lon', 'label'], dtype='object')

In [15]:
df.dtypes

time      object
text      object
lat      float64
lon      float64
label      int64
dtype: object
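
The "Unnamed: 0" column and the `object` dtype of `time` both stem from how the CSV is read. As a possible alternative to dropping the column afterwards, the sketch below assumes the first CSV column is an index saved by an earlier `to_csv` call and lets `read_csv` handle both issues. The inline sample rows are made up for illustration and only stand in for `./Data/emergency_location.csv`.

```python
import io

import pandas as pd

# Hypothetical sample standing in for ./Data/emergency_location.csv;
# the unnamed first column is assumed to be a previously saved index.
sample_csv = io.StringIO(
    ",time,text,lat,lon,label\n"
    "0,2019-07-30 16:45:02,example tweet one,40.75,-74.14,1\n"
    "1,2019-07-30 16:43:00,example tweet two,40.62,-73.77,0\n"
)

# index_col=0 reuses the first column as the index, so no "Unnamed: 0"
# column appears; parse_dates converts "time" from object to datetime64.
df_sample = pd.read_csv(sample_csv, index_col=0, parse_dates=["time"])

print(df_sample.columns.tolist())
print(df_sample.dtypes)
```

With `time` parsed as a datetime, tweets could later be filtered or sorted by timestamp without a separate conversion step.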

The dataset was split into two subsets by "label", relevant (1) and not relevant (0), so that each could be plotted in its own color on the map. Because the dataset is fairly large (4,334 rows), we plotted only the first 50 observations from each subset for demonstration purposes.

In [19]:
df_rel = df[df['label'] == 1][0:50]
df_not_rel = df[df['label'] == 0][0:50]
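
Slicing with `[0:50]` keeps the first 50 rows of each class in file order. If a more representative subset is wanted, one option is to sample at random with a fixed seed. The helper below is a sketch and not part of the original notebook: `split_by_label`, its parameters, and the toy data are hypothetical.

```python
import pandas as pd


def split_by_label(frame, n=50, seed=None):
    """Return (relevant, not_relevant) subsets with at most n rows each.

    With seed=None the first n rows of each class are kept, which matches
    the [0:50] slicing; with an integer seed, rows are sampled at random.
    """
    subsets = []
    for label in (1, 0):
        subset = frame[frame["label"] == label]
        if seed is None:
            subset = subset.head(n)
        else:
            subset = subset.sample(n=min(n, len(subset)), random_state=seed)
        subsets.append(subset)
    return tuple(subsets)


# Toy data for illustration only (coordinates are made up).
toy = pd.DataFrame({
    "lat": [40.70, 40.71, 40.72, 40.73],
    "lon": [-74.00, -74.01, -74.02, -74.03],
    "label": [1, 0, 1, 0],
})
rel, not_rel = split_by_label(toy, n=1)
print(len(rel), len(not_rel))
```

On the real data, `split_by_label(df)` would reproduce the two 50-row subsets, while `split_by_label(df, seed=42)` would draw 50 random rows per class.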

The map uses the standard "roadmap" style seen on Google Maps, with the usual pan and zoom functionality. The two newly created subsets serve as the data sources, and each tweet is plotted at its latitude and longitude. Relevant tweets are shown in red and non-relevant tweets in blue.

In [21]:
import os

output_file("Live_Tweet_Location.html")

# Center the map on New York City.
map_options = GMapOptions(lat=40.7128, lng=-74.0060, map_type='roadmap', zoom=10)

TOOLS = "pan,wheel_zoom,reset,hover,save"

# Read the Google Maps API key from the environment instead of hardcoding it.
# GOOGLE_MAPS_API_KEY is an assumed variable name; set it before running this cell.
plot = gmap(os.environ['GOOGLE_MAPS_API_KEY'], map_options, title='NYC Boroughs', tools=TOOLS)

# Relevant tweets (label == 1).
source1 = ColumnDataSource(data=dict(lat=df_rel['lat'],
                                     lon=df_rel['lon']))

# Non-relevant tweets (label == 0).
source2 = ColumnDataSource(data=dict(lat=df_not_rel['lat'],
                                     lon=df_not_rel['lon']))

# Red markers for relevant tweets, blue markers for non-relevant tweets.
plot.circle(x="lon", y="lat", size=5, fill_color="red", line_color="red", fill_alpha=1.0, source=source1)
plot.circle(x="lon", y="lat", size=5, fill_color="blue", line_color="blue", fill_alpha=1.0, source=source2)
show(plot)
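
The tool list includes `hover`, but no tooltips are configured, so hovering over a point shows only Bokeh's default index and coordinate fields. The sketch below shows one way to attach the tweet text and time to each marker. The helper name `add_tweet_layer` and the toy rows are hypothetical, and the example uses `scatter` instead of `circle`, on the assumption that a recent Bokeh release is installed. It is demonstrated on a plain `figure` so that it runs without a Google Maps API key; the same helper is expected to work with the plot returned by `gmap`.

```python
import pandas as pd
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure


def add_tweet_layer(plot, frame, color):
    """Plot one marker per tweet and show its text and time on hover."""
    source = ColumnDataSource(data=dict(
        lat=frame["lat"],
        lon=frame["lon"],
        text=frame["text"],
        time=frame["time"],
    ))
    renderer = plot.scatter(x="lon", y="lat", size=5, color=color, source=source)
    # "@text" and "@time" refer to columns of the ColumnDataSource above.
    plot.add_tools(HoverTool(renderers=[renderer],
                             tooltips=[("tweet", "@text"), ("time", "@time")]))
    return renderer


# Toy rows for illustration only; a plain figure stands in for the gmap plot.
toy = pd.DataFrame({
    "time": ["2019-07-30 16:45:02", "2019-07-30 16:43:00"],
    "text": ["example relevant tweet", "example non-relevant tweet"],
    "lat": [40.75, 40.62],
    "lon": [-74.14, -73.77],
    "label": [1, 0],
})
demo = figure(title="Hover demo", tools="pan,wheel_zoom,reset,save")
add_tweet_layer(demo, toy[toy["label"] == 1], "red")
add_tweet_layer(demo, toy[toy["label"] == 0], "blue")
```

Binding each `HoverTool` to its own renderer keeps the tooltips tied to the matching data source, so relevant and non-relevant tweets can later show different fields if needed.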