# <u>Data, Metadata and APIs</u>

## <u>Part 5: The Google Maps API and Open Data</u>

Now that you've extracted GPS coordinates from JPEG metadata and mapped it using the Google Maps API, you might be wondering what else you can do with the Google Maps API. The short answer is... a lot. 

In this notebook, you'll see how to combine your knowledge of the Google Maps API with your knowledge of data analysis with Pandas.

### <u>Find an Open Data Set that contains Location Data</u>

Here's a data set that tracks the location of all potholes filled by the City of Chicago for the past 7 days. Chicago is [known for its potholes](https://www.wbez.org/shows/curious-city/city-of-big-potholes-is-asphalt-the-best-choice-for-chicagos-streets/8bbd9e7a-b27e-4e00-a868-aa0b826b53b2), so this should be good. 

We will load this _.csv_ file in from a URL so that it is guaranteed to be the most up-to-date as possible:

In [1]:
# Note: the spike in traffic from Fremd may get us IP-banned by Chicago's Open Data portal.
#       If this happens, your teacher will share a static copy of Potholes_Patched.csv,
#       and you'll need to run the code "potholes_DF = pd.read_csv('Potholes_Patched.csv')"

import pandas as pd

potholes_DF = pd.read_csv("Potholes_Patched.csv")

# display the 3 most recent potholes that were filled
potholes_DF[-3:]

Unnamed: 0,ADDRESS,REQUEST DATE,COMPLETION DATE,NUMBER OF POTHOLES FILLED ON BLOCK,LATITUDE,LONGITUDE,LOCATION
106304,600 W 59TH ST,4/4/2022 13:51,4/4/2022 13:52,23,41.787195,-87.640291,POINT (-87.640291146857 41.787194823791)
106305,1000 W 59TH ST,4/4/2022 13:56,4/4/2022 13:57,10,41.787045,-87.649957,POINT (-87.649957148437 41.787044902409)
106306,6328 N LINCOLN AVE,3/31/2022 17:21,4/4/2022 12:14,1,41.996206,-87.717424,POINT (-87.717424109667 41.996205985167)


Check how many potholes were filled in the last week since the spreadsheet was generated:

In [2]:
print(len(potholes_DF))

106307


That's a lot of potholes. Now extract the location data, clean out the "nan" values, and store it as a list of tuples:

In [3]:
import numpy as np

lat = list(potholes_DF["LATITUDE"])

lon = list(potholes_DF["LONGITUDE"])

tuple_list = []

'''
for i in range(len(lat)):
    coord = (lat[i],lon[i])
    tuple_list.append(coord)
'''

tuple_list = [(lat[i],lon[i]) for i in range(len(lat))]

tuple_list = [x for x in tuple_list if not np.isnan(x[1])]

Let's compare the length of *potholes_DF* to *tuple_list* to see how many "nan" values we cleaned out:

In [4]:
print(len(potholes_DF),len(tuple_list))

106307 105955


Depending on the week, there may be a handful of "nan" values to clean out. If you were lucky, there were none.

Now let's look at a few of the tuples in the list:

In [5]:
tuple_list[-10:]

[(41.745690100000004, -87.60546472),
 (41.79259436, -87.79514811),
 (41.96039461, -87.68338403),
 (41.78973533, -87.70494202),
 (41.81948525, -87.69375105),
 (41.99997707, -87.69576195),
 (41.73013202, -87.54693733),
 (41.78719482, -87.64029115),
 (41.7870449, -87.64995715),
 (41.99620599, -87.71742411)]

### <u>Google Maps API with Markers</u>

Let's put a marker every place we found a pothole.
#### *WARNING: Adding more than 500 marker points could potentially crash your kernel!  To combat this, we are creating a list of 500 random entries from the original tuple_list.*

In [6]:
import numpy as np

tuple_list_500 = []
indicies_used = []
for i in range(500):                                # Loop 500 times
    random = np.random.randint(0,500)               # Generate random index number
    if random not in indicies_used:                 # Check if number has already been generated
        indicies_used.append(random)                # Add new number to list of used numbers
        tuple_list_500.append(tuple_list[random])   # Add the tuple from that index to the new list of 500
print(indicies_used[:50])
#indicies_used = [random for np.random.randint(0,500) in range(500) if random not in indicies_used]
print(tuple_list_500[:10])

[36, 205, 298, 470, 194, 135, 412, 185, 437, 190, 498, 90, 202, 170, 269, 55, 104, 292, 281, 458, 487, 165, 220, 132, 107, 289, 6, 264, 243, 125, 395, 270, 468, 192, 424, 187, 332, 372, 59, 154, 108, 46, 474, 288, 339, 31, 464, 172, 491, 448]
[(41.84196943, -87.61880872), (41.94656209, -87.71529293), (41.84916764, -87.62689661), (41.93977365, -87.81562606), (41.89511162, -87.62036795), (41.69626546, -87.70228214), (41.89441744, -87.62509449999999), (41.83390448, -87.65886216), (41.92058579, -87.74879675), (41.94842052, -87.64404654)]


In [7]:
# Import the gmaps python module and load in your API Key:
import gmaps
gmaps.configure(api_key="AIzaSyCLla6Q7krE9xNg6SnNMoGNIzjCLddE9EU")

In [8]:
from ipywidgets.embed import embed_minimal_html # Allows us to create a separte file for the Google Maps

markers = gmaps.marker_layer(tuple_list_500)    # Create markers for each tuple/coordinate
markermap = gmaps.Map()                         # Create a GMap variable
markermap.add_layer(markers)                    # Add the layer of markers to GMap

embed_minimal_html('output/MarkerMap1.html', views=[markermap])
print("*** If no map appears, uncomment the line above, re-run this cell, and check your 'Metadata Part 5' folder to find the new HTML file name \"MarkerMap1.html\". ***")

markermap

*** If no map appears, uncomment the line above, re-run this cell, and check your 'Metadata Part 5' folder to find the new HTML file name "MarkerMap1.html". ***


Map(configuration={'api_key': 'AIzaSyCLla6Q7krE9xNg6SnNMoGNIzjCLddE9EU'}, data_bounds=[(41.647905676006154, -8…

**<u>Question 1:</u>** Look at the marker map at various zoom levels. What do you notice above the graph? Comment on anything interesting you see and try to summarize "the good" and "the bad" in this visualization.

**<u>Your Answer:</u>** Most of the potholes are all in chicago and there aren't much outside chicago.

### <u>Google Maps API to Create a Heatmap</u>

Instead of markers, let's make a heat map:
#### *WARNING: Adding more than 500 marker points could potentially crash your kernel!  To combat this, we are again using the list of 500 random entries from the original tuple_list.*

In [9]:
from ipywidgets.embed import embed_minimal_html # Allows us to create a separte file for the Google Maps

heatm = gmaps.Map()
heatm.add_layer(gmaps.heatmap_layer(tuple_list_500))

embed_minimal_html('output/HeatMap1.html', views=[markermap])
print("*** If no map appears, uncomment the line above, re-run this cell, and check your 'Metadata Part 5' folder to find the new HTML file name \"HeatMap1.html\". ***")

heatm

*** If no map appears, uncomment the line above, re-run this cell, and check your 'Metadata Part 5' folder to find the new HTML file name "HeatMap1.html". ***


Map(configuration={'api_key': 'AIzaSyCLla6Q7krE9xNg6SnNMoGNIzjCLddE9EU'}, data_bounds=[(41.647905676006154, -8…

**<u>Question 2:</u>** Look at the heatmap at various zoom levels. What do you notice above the graph? Comment on anything interesting you see and try to summarize "the good" and "the bad" in this visualization.

**<u>Your Answer:</u>** The most red parts are right in chicago.

### <u>Task 1: Find your own dataset!</u>

You are going to create a marker map **and** a heatmap from a dataset you have found. For Task 1, find a dataset with location data (GPS coordinates!). Fill in the following:

_Name:_ 

_Date:_ 

_Source for Data Set:_ https://coronavirus.jhu.edu/map.html

_URL for Data Set:_ 

_Description of Data Set:_ 

_File Format for Data Set:_ 

_Age of Data Set:_ 

### <u>Task 2: Show some entries fom your dataset</u>

Import your data set as a Pandas Data Frame, then show the last 10 entries:

In [None]:
# Your code here


### <u>Task 3: Create a list of tuples</u>

Use your dataset to create a list of tuples (a list of DD coordinates) representing the locations in your dataset:
#### *WARNING: Adding more than 500 marker points could potentially crash your kernel!  To combat this, create a list of 500 random entries from the original list of tuples.*

In [None]:
# Your code here


### <u>Task 4: Create a marker map from your data</u>

Use the Google Maps API to create a marker map using your list of tuples from above.

In [None]:
# Your code here


### <u>Task 5: Create a heatmap from your data</u>

Use the Google Maps API to create a **heatmap** using your list of tuples from above.

*Note: The Google Maps API can struggle with heatmaps that have more than 1000 datapoints. If your map is not working, try reducing your list to fewer tuples (try creating a list with just the most recent 100 entries in the dataset). Once this works, you can always add in a few more tuples!*

In [None]:
# Your code here


### <u>Task 6: Comment on what you see</u>

Look at your marker map and your heatmap at various zoom levels. Comment on anything interesting or notable that you see. 

**<u>Your Answer:</u>** 

### <u>Task 7: Brainstorm further study</u>

If you had more time and resources, what else would you like to explore using the GPS data in this dataset?

**<u>Your Answer:</u>**