## Visualization of the data

To predict the route of a specific police squad, we first visualize our dataset on several maps.

In [1]:
import gmaps
import pandas as pd
import feather as fth

import folium
from folium import features
from folium.plugins import HeatMap
from folium.plugins import MarkerCluster
from folium.plugins import HeatMapWithTime

new_york_coordinates = (40.75, -74.00) #starting point for maps - middle of nyc

apikey = "AIzaSyApj3xGPGx1naRs2DZiUlJ6moRftzWzTJU"
gmaps.configure(api_key=apikey)

def load( precinct, squad, date ):
    file = 'geo_squad_route_time_' + squad + precinct + '_' + date + '_Parking_Violations_' + str(int(date[-2:]) + 1) #fiscal year
    datadir = '../../data/nyc_parking_tickets/squad_route/'
    fileformat = '.fth'
    path = datadir + file + fileformat
    
    data = fth.read_dataframe(path)
    data = data[['lat', 'lng', 'Violation Time']]
    
    return data

def prepare( dataIn ):
    #one list of [lat, lng] points per (hour, minute) slot, in ascending order
    dataHourMin = []
    for x in range(1, 13): #hours 01..12
        datatemp = dataIn[dataIn['Violation Time'].str[:2] == str(x).zfill(2)]
        for y in range(0, 60): #minutes 00..59
            datatemp2 = datatemp[datatemp['Violation Time'].str[2:4] == str(y).zfill(2)]
            
            if not datatemp2.empty:
                datatemp2 = datatemp2[['lat','lng']].values.tolist()
                dataHourMin.append(datatemp2)
                
    return dataHourMin
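
The file names that `load` assembles follow a fixed pattern: squad, then precinct, then the date as `MMDDYYYY`, then the fiscal year (two-digit calendar year plus one). The following is a minimal sketch of that naming rule on its own, which can be handy for checking whether a file exists before reading it. `build_path` and `DATADIR` are hypothetical names introduced for illustration, not part of the notebook.

```python
# Hypothetical helper: reproduces only the path construction done inside load()
DATADIR = '../../data/nyc_parking_tickets/squad_route/'

def build_path(precinct, squad, date, datadir=DATADIR):
    """Return the Feather file path for a precinct/squad/date combination."""
    fiscal_year = int(date[-2:]) + 1  # fiscal year = two-digit year + 1
    name = f'geo_squad_route_time_{squad}{precinct}_{date}_Parking_Violations_{fiscal_year}'
    return datadir + name + '.fth'

print(build_path('120', 'A', '09272013'))
```

An empty string for `precinct` or `squad` simply drops that part from the file name, which is how the squad-only and precinct-only files in the analyses are addressed.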


### Data preparation I

To connect the tickets correctly, we have to sort our data by the time the tickets were recorded. We use two different preparation methods for this. The first one allows us to use heatmaps with a time axis: for every time slot (here: 1 minute), we create a separate array of coordinates inside our dataHourMin array.

First we split the dataset into AM and PM and then process the timestamps.

In [2]:
#data grouped by hour and minute, as needed for HeatMapWithTime

def prepTime( data ): 
    print('Processing...')
    
    #print(len(data))
    dataAM = data[data['Violation Time'].str.contains('A')]
    dataPM = data[data['Violation Time'].str.contains('P')]


    dataTime = []
    dataTime.extend(prepare(dataAM))
    dataTime.extend(prepare(dataPM))

    print('Done!')
    
    return dataTime
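
`HeatMapWithTime` takes one list of points per time step, which is what `prepTime` builds. Below is a minimal sketch of the same grouping in plain Python with an adjustable slot width. It assumes the `Violation Time` strings have the form `HHMM` followed by `A` or `P` (for example `0915A`), and that hour `12` marks the start of each half-day as on a 12-hour clock, which differs from the ascending 01..12 order used in `prepare`. `slot_frames` and `slot_minutes` are hypothetical names; the returned labels could be passed as the `index` argument of `HeatMapWithTime`.

```python
from collections import defaultdict

def slot_frames(rows, slot_minutes=1):
    """Group (lat, lng, time) rows into chronological time slots.

    rows: iterable of (lat, lng, 'HHMMA' or 'HHMMP') tuples.
    Returns (frames, labels): frames[i] holds the [lat, lng] points of
    slot i, labels[i] is the slot start as 'HH:MM' on a 24-hour clock.
    """
    slots = defaultdict(list)
    for lat, lng, t in rows:
        # Assumption: 12-hour clock, so 12xxA is just after midnight
        hour = int(t[:2]) % 12 + (12 if t[4] == 'P' else 0)
        minute_of_day = hour * 60 + int(t[2:4])
        slots[minute_of_day // slot_minutes].append([lat, lng])

    frames, labels = [], []
    for slot in sorted(slots):
        start = slot * slot_minutes
        frames.append(slots[slot])
        labels.append(f'{start // 60:02d}:{start % 60:02d}')
    return frames, labels

rows = [
    (40.71, -74.00, '0105P'),
    (40.72, -74.01, '1259A'),
    (40.73, -74.02, '0915A'),
    (40.74, -74.03, '0915A'),
]
frames, labels = slot_frames(rows)
print(labels)
```

A wider slot, for example `slot_minutes=15`, produces fewer and denser frames, which may be easier to read than one frame per minute.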

### Data preparation II

To plot the tickets as points on a map, optionally with connecting lines between them, we only have to sort the dataset by timestamp and write the coordinates (lat, lng) into an array.

In [3]:
#data sorted by time needed for pointmaps

def sortTime( data ):
    
    #print('Processing...')
    
    dataAM = data[data['Violation Time'].str.contains('A')]
    #dataAM['Violation Time'] = dataAM['Violation Time'].str[:4]
    dataPM = data[data['Violation Time'].str.contains('P')]

    dataAM = dataAM.sort_values(by=['Violation Time'])
    dataPM = dataPM.sort_values(by=['Violation Time'])
    #AM rows first, then PM rows
    data = pd.concat([dataAM[['lat', 'lng']], dataPM[['lat', 'lng']]])

    dataTimeSort = data.values.tolist()
    
    #print('Done!')
    
    return dataTimeSort
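
Sorting the raw strings orders each half-day by the leading hour digits, so tickets written between 12:00 and 12:59 end up at the end of their half-day rather than at the beginning. If the `Violation Time` values follow the 12-hour convention with the form `HHMMA`/`HHMMP` (an assumption about the dataset), a sort key along these lines would restore chronological order. `clock_key` is a hypothetical helper, shown here on plain strings.

```python
def clock_key(t):
    """Sort key for 'HHMMA'/'HHMMP' strings on a 12-hour clock."""
    meridiem = 0 if t[4] == 'A' else 1      # AM before PM
    return (meridiem, int(t[:2]) % 12, int(t[2:4]))  # hour 12 counts as 0

times = ['1230P', '0100P', '1215A', '0930A']
print(sorted(times))                 # plain string order
print(sorted(times, key=clock_key))  # chronological order
```

For a connected route this matters because a misplaced ticket adds a line segment that jumps across the map.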

### Visualization II - folium marker and connections

    Data: preparation II
    Map: folium
    Type: map, marker, polyline

#### Analysis I
In this map, every ticket issued on the 29th of September 2016 by squad 'A' is visualized. The parking tickets are spread all over NYC, which means that there is more than one 'A' squad. A combination of precinct and squad is needed to identify a specific squad.

In [10]:
data = sortTime(load('', 'A', '09292016'))

pubs_map = folium.Map(location=new_york_coordinates, zoom_start=10)
HeatMap(data, radius = 15).add_to(pubs_map)

pubs_map

#### Analysis II
In this map, every ticket issued on the 29th of September 2016 in precinct '1' (part of Manhattan) is visualized.

In [9]:
data = sortTime(load('1', '', '09292016'))

pubs_map = folium.Map(location=new_york_coordinates, zoom_start=12)
HeatMap(data, radius = 15).add_to(pubs_map)

pubs_map

#### Analysis III
In this map, every ticket issued on the 29th of September 2016 by squad 'A' in precinct '1' is visualized.
We conclude that combining precinct and squad makes our prediction more specific.

In [23]:
pubs_map = folium.Map(location=[40.7154787, -74.005545], zoom_start=14)

data = sortTime(load('1', 'A', '09292016'))

#add markers
for each in data:
    folium.Marker(location=each, popup=str(each)).add_to(pubs_map)
folium.PolyLine(data, color="green", weight=2.5, opacity=1).add_to(pubs_map)

pubs_map

In [26]:
data = sortTime(load('1', 'A', '09292016'))

pubs_map = folium.Map(location=[40.7154787, -74.005545], zoom_start=14)
HeatMap(data, radius = 15).add_to(pubs_map)

pubs_map

#### Analysis IV
To test this hypothesis, we compare different squad/precinct combinations with each other and visualize them in the map below. Our data is still not specific enough, because it contains many possibly wrong records.

In [29]:
pubs_map = folium.Map(location=new_york_coordinates, zoom_start=11)

data = sortTime(load('112', 'C', '09272016'))
#points = dataTimeSort[:]
#add markers
#for each in points:
    #folium.Marker(location=each, popup=str(each)).add_to(pubs_map)
    #continue
#add lines
folium.PolyLine(data, color="red", weight=2.5, opacity=1).add_to(pubs_map)

data = sortTime(load( '120', 'A', '09272016'))
folium.PolyLine(data, color="green", weight=2.5, opacity=1).add_to(pubs_map)

data = sortTime(load( '103', 'B', '09272016'))
folium.PolyLine(data, color="blue", weight=2.5, opacity=1).add_to(pubs_map)

data = sortTime(load( '107', 'D', '09272016'))
folium.PolyLine(data, color="yellow", weight=2.5, opacity=1).add_to(pubs_map)

data = sortTime(load( '90', 'E', '09272016'))
folium.PolyLine(data, color="black", weight=2.5, opacity=1).add_to(pubs_map)

pubs_map
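
Analysis IV attributes the remaining noise to possibly wrong records. One possible way to reduce their influence, sketched here as an assumption rather than something the notebook does, is to drop points that lie far from the median coordinate of a route before drawing it. `haversine_km`, `drop_far_points` and the 5 km threshold are hypothetical choices for illustration.

```python
import math
from statistics import median

def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lng) points."""
    lat1, lng1, lat2, lng2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lng2 - lng1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def drop_far_points(points, max_km=5.0):
    """Keep only points within max_km of the median coordinate."""
    center = (median(p[0] for p in points), median(p[1] for p in points))
    return [p for p in points if haversine_km(p, center) <= max_km]

# Three nearby points and one that is roughly 15 km away
route = [[40.715, -74.005], [40.716, -74.006], [40.717, -74.004], [40.580, -73.960]]
print(drop_far_points(route))
```

The filter keeps the input order, so the result of `sortTime` could be passed through it before being handed to `folium.PolyLine`. A suitable threshold depends on the size of the precinct.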

# Appendix

### Alternative - Visualization III - gmaps points

    Data: preparation II
    Map: gmaps

In [6]:
#heatmap

data = load('112', 'C', '09272016')
data = data[['lat', 'lng']]

fig = gmaps.figure(center=new_york_coordinates, zoom_level=10)
fig.add_layer(gmaps.heatmap_layer(data))
fig

In [7]:
#pointmap
data = load('112', 'C', '09272016')
data = data[['lat', 'lng']]

fig = gmaps.figure(center=new_york_coordinates, zoom_level=10)
ticket_layer = gmaps.symbol_layer(data, fill_color="red", stroke_color="red", scale=1)
fig.add_layer(ticket_layer)
fig

### IN PROGRESS - Visualization I - folium heatmap with time

    Data: preparation I
    Map: folium
    Type: HeatMapWithTime

Because of a bug, the coordinates are not visualized in the heatmap. An approach that compares the different days of the week per squad can be found in the folium_SquadAnalysisWeekDay notebook.

In [8]:
dataAllHeat = []
dataAllHeatTime = []

tmp = load('120', 'A', '09272013')
tmp = tmp[['lat', 'lng']]
dataAllHeat.extend(tmp.values.tolist())
dataAllHeatTime.append(tmp.values.tolist())

tmp = load('120', 'A', '09272014')
tmp = tmp[['lat', 'lng']]
dataAllHeat.extend(tmp.values.tolist())
dataAllHeatTime.append(tmp.values.tolist())

tmp = load('120', 'A', '09272015')
tmp = tmp[['lat', 'lng']]
dataAllHeat.extend(tmp.values.tolist())
dataAllHeatTime.append(tmp.values.tolist())

tmp = load('120', 'A', '09272016')
tmp = tmp[['lat', 'lng']]
dataAllHeat.extend(tmp.values.tolist())
dataAllHeatTime.append(tmp.values.tolist())

print('Done!')

ArrowIOError: Failed to open local file: ../../data/nyc_parking_tickets/squad_route/geo_squad_route_time_A120_09272013_Parking_Violations_14.fth , error: No such file or directory
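
The error above shows that no file exists for the 2013 date, so the cell stops before any data is collected. Below is a minimal sketch of collecting one frame per year while skipping missing files. `collect_frames`, `path_for` and `read_points` are hypothetical names; `read_points` stands in for a reader such as `load` reduced to `[lat, lng]` lists.

```python
import os

def collect_frames(dates, path_for, read_points):
    """Return (frames, labels) for every date whose file exists.

    path_for(date) -> file path, read_points(path) -> list of [lat, lng].
    Missing files are skipped instead of aborting the whole run.
    """
    frames, labels = [], []
    for date in dates:
        path = path_for(date)
        if not os.path.exists(path):
            print('Skipping missing file:', path)
            continue
        frames.append(read_points(path))
        labels.append(date[-4:])  # year part of an MMDDYYYY date
    return frames, labels
```

The returned `labels` list has one entry per frame, so it stays aligned with the data if it is later used as the time index of the heatmap.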

In [8]:
#folium map

pubs_map = folium.Map(location=new_york_coordinates, zoom_start=10)

#HeatMapWithTime(data, radius = 20, index = ['2013', '2014', '2015', '2016']).add_to(pubs_map)
HeatMapWithTime(dataAllHeatTime, radius = 20).add_to(pubs_map)

pubs_map