# Mapping Simulation

##### This code simulates mapping processed and classified geotagged tweets in a Flask app. For this example, we generate 700 off-topic samples and 300 on-topic samples. The on-topic samples are both diffuse and highly localized, with the localized samples representing a high-need area such as a disaster epicenter.

## Contents
-  [Import Alberta Floods Data](#Import-Alberta-Floods-Data)
-  [Generate Geo-Coordinates](#Generate-Geo-Coordinates)
-  [Combine DataFrames](#Combine-DataFrames)
-  [Save DataFrame](#Save-DataFrame)
-  [Launch Flask App](#Launch-Flask-App)

In [1]:
import csv
import pandas as pd
import numpy as np

from sklearn.utils import shuffle

## Import Alberta Floods Data

In [2]:
alberta_df = pd.read_csv('../data/CrisisLexT6/2013_Alberta_Floods/2013_Alberta_Floods-ontopic_offtopic.csv')
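
The later cells select columns named `' tweet'` and `' label'`, so the header names in this file carry a leading space. As a minimal sketch, the whitespace could instead be stripped once, right after loading. The in-memory `sample_csv` and its rows are made up for illustration and stand in for the real file.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CrisisLexT6 file: the header has a space
# after each comma, and the rows are invented for this example.
sample_csv = io.StringIO(
    "tweet id, tweet, label\n"
    "1,water is rising downtown,on-topic\n"
    "2,great coffee this morning,off-topic\n"
)

sample_df = pd.read_csv(sample_csv)

# Strip whitespace from the header names so the columns can be
# referenced as 'tweet' and 'label' without the leading space.
sample_df.columns = sample_df.columns.str.strip()

print(list(sample_df.columns))
```

With the names cleaned at load time, the per-dataframe `rename` calls further down would no longer be needed.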

## Generate Geo-Coordinates

### Off-Topic

Select 700 off-topic samples for the mapping demo and assign them to dataframe `flood_df`.

In [3]:
flood_df = alberta_df[[' tweet', ' label']][alberta_df[' label'] == 'off-topic'][:700]

Rename the columns to remove the leading spaces.

In [4]:
flood_df.rename(columns = {' tweet':'tweet', ' label':'label'}, inplace=True)

Reset index in order to merge it with other dataframes.

In [5]:
flood_df.reset_index(drop=True, inplace=True)

Randomly generate geo-coordinates representing a geo-fenced Twitter query and add the coordinates to the dataframe.

In [6]:
# One coordinate pair per off-topic sample (700 rows)
lat = pd.Series(np.random.uniform(33.691060, 34.176593, 700))
long = pd.Series(np.random.uniform(-118.022506, -117.134077, 700))

In [7]:
flood_df['lat'] = lat
flood_df['long'] = long
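
The same coordinate generation is repeated for each sample group, so it could be wrapped in a small helper. The sketch below is one way to do that, not the notebook's own method: `random_points` is a hypothetical name, and it uses NumPy's seeded `default_rng` generator so the simulated map is reproducible.

```python
import numpy as np
import pandas as pd


def random_points(lat_bounds, long_bounds, n, seed=None):
    """Draw n uniform (lat, long) pairs inside a bounding box."""
    rng = np.random.default_rng(seed)
    # Sort each pair so the bounds can be passed in either order.
    lat_low, lat_high = sorted(lat_bounds)
    long_low, long_high = sorted(long_bounds)
    return pd.DataFrame({
        'lat': rng.uniform(lat_low, lat_high, n),
        'long': rng.uniform(long_low, long_high, n),
    })


# Same bounding box as the cell above; the seed value is arbitrary.
points = random_points((33.691060, 34.176593),
                       (-118.022506, -117.134077), 700, seed=42)
print(points.shape)
```

Because `points` has a default 0..n-1 index, its columns can be assigned to a dataframe whose index has been reset, as in the cells above.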

Check that the coordinates were properly added to each sample.

In [8]:
flood_df.head()

| | tweet | label | lat | long |
|---|---|---|---|---|
| 0 | @Jay1972Jay Nope. Mid 80's. It's off Metallica... | off-topic | 33.854271 | -117.643958 |
| 1 | Nothing like a :16 second downpour to give us ... | off-topic | 33.955365 | -117.570415 |
| 2 | Party hard , suns down , still warm , lovin li... | off-topic | 34.095789 | -117.272218 |
| 3 | @Exclusionzone if you compare yourself to wate... | off-topic | 33.800306 | -117.154119 |
| 4 | and is usually viewed in a #heroic light, rece... | off-topic | 34.045244 | -117.623747 |


### On-Topic - Localized

Repeat the process for 100 localized on-topic examples. We expect more off-topic than on-topic tweets during a natural disaster, so the on-topic sample is smaller. These are assigned to dataframe `flood_df2`.

In [9]:
flood_df2 = alberta_df[[' tweet', ' label']][alberta_df[' label'] == 'on-topic'][:100]
flood_df2.rename(columns = {' tweet':'tweet', ' label':'label'}, inplace=True)
flood_df2.reset_index(drop=True, inplace=True)

Generate coordinates within a smaller bounding box to simulate a higher density of emergency-related Twitter traffic.

In [10]:
# np.random.uniform expects (low, high), so the smaller latitude comes first
lat2 = pd.Series(np.random.uniform(34.027232, 34.080250, 100))
long2 = pd.Series(np.random.uniform(-117.378770, -117.314329, 100))

In [11]:
flood_df2['lat'] = lat2
flood_df2['long'] = long2
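
For the localized samples to read as a hotspot on the map, the smaller box should sit inside the query box and hold more on-topic tweets per unit area. The sketch below checks both using the bounds from the cells above, written as (low, high) pairs. The `contains` helper is hypothetical, and area is measured in square degrees, a rough proxy that ignores map projection.

```python
# Bounding boxes copied from the cells above, as (low, high) per axis.
query_lat, query_long = (33.691060, 34.176593), (-118.022506, -117.134077)
local_lat, local_long = (34.027232, 34.080250), (-117.378770, -117.314329)


def contains(outer, inner):
    """Return True if the interval `inner` lies within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


inside = contains(query_lat, local_lat) and contains(query_long, local_long)

# Box areas in square degrees (a rough proxy, ignoring map projection).
query_area = (query_lat[1] - query_lat[0]) * (query_long[1] - query_long[0])
local_area = (local_lat[1] - local_lat[0]) * (local_long[1] - local_long[0])

# On-topic density: 100 localized tweets in the small box versus
# 200 diffuse tweets spread over the full query box.
density_ratio = (100 / local_area) / (200 / query_area)

print(inside, round(density_ratio, 1))
```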

In [12]:
flood_df2.head()

| | tweet | label | lat | long |
|---|---|---|---|---|
| 0 | @NelsonTagoona so glad that you missed the flo... | on-topic | 34.043506 | -117.343093 |
| 1 | @LiseMouskaal 17th Avenue is flooded from McLe... | on-topic | 34.048252 | -117.373203 |
| 2 | @Crackmacs same seems like 1/2 of#yyc is shut ... | on-topic | 34.036121 | -117.3717 |
| 3 | Supreme bug protection. Cooking for a house fu... | on-topic | 34.032642 | -117.369655 |
| 4 | Lies Okotoks tells itself... The river only fl... | on-topic | 34.055958 | -117.370801 |


### On-Topic - Diffuse

Create 200 diffuse samples that are on-topic, representing general conversation about a disaster. These will be assigned to dataframe `flood_df3`.

In [13]:
flood_df3 = alberta_df[[' tweet', ' label']][alberta_df[' label'] == 'on-topic'][100:300]
flood_df3.rename(columns = {' tweet':'tweet', ' label':'label'}, inplace=True)
flood_df3.reset_index(drop=True, inplace=True)

In [14]:
lat3 = pd.Series(np.random.uniform(33.691060, 34.176593, 200))
long3 = pd.Series(np.random.uniform(-118.022506, -117.134077, 200))

In [15]:
flood_df3['lat'] = lat3
flood_df3['long'] = long3

In [16]:
flood_df3.head()

| | tweet | label | lat | long |
|---|---|---|---|---|
| 0 | Flood Party Update #3: My pants took themselve... | on-topic | 34.111649 | -117.520021 |
| 1 | Help rescue workers by staying home for the re... | on-topic | 33.7124 | -117.137996 |
| 2 | @jeanniedixon the water is 3 to 4 metres deep ... | on-topic | 34.016128 | -118.000052 |
| 3 | @thomaskeeper as long as resources aren't take... | on-topic | 33.855105 | -117.198551 |
| 4 | Just drove through southern Alberta. All exits... | on-topic | 33.698364 | -117.152067 |


## Combine DataFrames

Concatenate the three sample groups and shuffle to represent data processed by our classification pipeline.

In [17]:
flood_df = shuffle(pd.concat([flood_df, flood_df2, flood_df3], axis=0))
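
As an alternative sketch, the combine-and-shuffle step can also be done with pandas alone: `ignore_index=True` renumbers the rows during concatenation, and `sample(frac=1)` returns every row in random order. The three small dataframes below are made-up stand-ins for `flood_df`, `flood_df2`, and `flood_df3`, and the `random_state` value is arbitrary.

```python
import pandas as pd

# Tiny hypothetical stand-ins for the three sample groups.
off_topic = pd.DataFrame({'tweet': ['a', 'b', 'c'], 'label': 'off-topic'})
localized = pd.DataFrame({'tweet': ['d'], 'label': 'on-topic'})
diffuse = pd.DataFrame({'tweet': ['e', 'f'], 'label': 'on-topic'})

# Stack the groups and renumber the rows 0..n-1 in one step.
combined = pd.concat([off_topic, localized, diffuse], ignore_index=True)

# sample(frac=1) returns all rows in random order; fixing random_state
# makes the shuffle reproducible between runs.
shuffled = combined.sample(frac=1, random_state=0).reset_index(drop=True)

print(shuffled['label'].value_counts().to_dict())
```

Checking the label counts after shuffling is a quick way to confirm that no rows were lost or duplicated when the groups were combined.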

Reset index and verify proper output.

In [18]:
flood_df.reset_index(drop=True, inplace=True)

In [19]:
flood_df.head()

| | tweet | label | lat | long |
|---|---|---|---|---|
| 0 | @jackshope They are airlifting our crew into C... | off-topic | 34.082321 | -117.335853 |
| 1 | #Skywire was boring, what was exciting was whe... | off-topic | 34.122635 | -117.243592 |
| 2 | @WBrettWilson @TarzanDan whenever works for yo... | off-topic | 34.092197 | -117.266696 |
| 3 | @joshclassenCTV you are dead wrong sir there i... | on-topic | 34.049241 | -117.378255 |
| 4 | I'm at Canada Olympic Park (Calgary, AB) http:... | off-topic | 33.70454 | -117.192101 |


## Save DataFrame

Export to a CSV file for the Flask app to read.

In [20]:
flood_df.to_csv('../data/map_data.csv')
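
By default `to_csv` writes the dataframe index as an unnamed first column, so whatever reads `map_data.csv` will see an extra `Unnamed: 0` column unless it passes `index_col=0`. The sketch below shows the round trip with an in-memory buffer. `demo_df` and its values are invented for illustration, and how the Flask app actually reads the file is not shown in this notebook.

```python
import io

import pandas as pd

# Made-up two-row frame with the same columns as the exported data.
demo_df = pd.DataFrame({
    'tweet': ['road closed by flooding', 'lunch was great'],
    'label': ['on-topic', 'off-topic'],
    'lat': [34.05, 33.90],
    'long': [-117.35, -117.60],
})

buffer = io.StringIO()
demo_df.to_csv(buffer)  # the index is written as an unnamed first column

buffer.seek(0)
naive = pd.read_csv(buffer)  # the index comes back as 'Unnamed: 0'

buffer.seek(0)
restored = pd.read_csv(buffer, index_col=0)  # the index is restored

print(list(naive.columns))
print(list(restored.columns))
```

Passing `index=False` to `to_csv` is another option when the row numbers carry no meaning for the reader.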

## Launch Flask App

**In terminal**:
1. `cd` into the working directory
2. `export FLASK_APP=../assets/dashboard.py`
3. `flask run`

**Visit http://127.0.0.1:5000/**