# <u>Data, Metadata and APIs</u>

## <u>Part 5: The Google Maps API and Open Data</u>

Now that you've extracted GPS coordinates from JPEG metadata and mapped them using the Google Maps API, you might be wondering what else you can do with the Google Maps API. The short answer is... a lot.

In this notebook, you'll see how to combine your knowledge of the Google Maps API with your knowledge of data analysis with Pandas.

### <u>Find an Open Data Set that contains Location Data</u>

Here's a data set that tracks the location of all potholes filled by the City of Chicago for the past 7 days. Chicago is [known for its potholes](https://www.wbez.org/shows/curious-city/city-of-big-potholes-is-asphalt-the-best-choice-for-chicagos-streets/8bbd9e7a-b27e-4e00-a868-aa0b826b53b2), so this should be good. 

Ideally, we load this _.csv_ file from a URL so that the data is as up to date as possible. The cell below uses the static fallback copy described in its opening note:

In [8]:
# Note: the spike in traffic from Fremd may get us IP-banned by Chicago's Open Data portal.
#       If this happens, your teacher will share a static copy of Potholes_Patched.csv,
#       and you'll need to run the code "potholes_DF = pd.read_csv('Potholes_Patched.csv')"

import pandas as pd

potholes_DF = pd.read_csv("Potholes_Patched.csv")

# display the 3 most recent potholes that were filled
potholes_DF[-3:]

Unnamed: 0,ADDRESS,REQUEST DATE,COMPLETION DATE,NUMBER OF POTHOLES FILLED ON BLOCK,LATITUDE,LONGITUDE,LOCATION
106304,600 W 59TH ST,4/4/2022 13:51,4/4/2022 13:52,23,41.787195,-87.640291,POINT (-87.640291146857 41.787194823791)
106305,1000 W 59TH ST,4/4/2022 13:56,4/4/2022 13:57,10,41.787045,-87.649957,POINT (-87.649957148437 41.787044902409)
106306,6328 N LINCOLN AVE,3/31/2022 17:21,4/4/2022 12:14,1,41.996206,-87.717424,POINT (-87.717424109667 41.996205985167)
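
The note in the cell above describes a manual fallback: try the live data first, and switch to the static file if the portal blocks the request. A minimal sketch of automating that choice follows. The helper name `load_csv_with_fallback` and the placeholder URL are assumptions for illustration, not part of the original notebook.

```python
import pandas as pd


def load_csv_with_fallback(primary, fallback):
    """Try to read a CSV from `primary`; fall back to `fallback` on failure."""
    try:
        return pd.read_csv(primary)
    except Exception as err:  # e.g. network error, blocked request, missing file
        print(f"Could not read {primary!r} ({type(err).__name__}); using {fallback!r}")
        return pd.read_csv(fallback)


# Hypothetical usage; the URL is a placeholder, not the real portal address:
# potholes_DF = load_csv_with_fallback("https://example.org/potholes.csv",
#                                      "Potholes_Patched.csv")
```

Catching every exception is a deliberately blunt choice for a classroom notebook; stricter code would catch only the network and file errors it expects.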


Check how many entries the spreadsheet contains (each row is one block, which may account for several filled potholes):

In [9]:
print(len(potholes_DF))

106307
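
The row count is not the same thing as the number of potholes, and it is not limited to a 7-day window unless the file is. As a rough sketch, the three rows shown in the table earlier can be used to separate those ideas: parse the completion dates, keep the rows within 7 days of the newest entry, and add up the potholes-per-block column. The variable names are illustrative only.

```python
import pandas as pd

# Three rows copied from the table shown earlier in this notebook.
sample = pd.DataFrame({
    "COMPLETION DATE": ["4/4/2022 13:52", "4/4/2022 13:57", "4/4/2022 12:14"],
    "NUMBER OF POTHOLES FILLED ON BLOCK": [23, 10, 1],
})

# Parse the date strings (month/day/year hour:minute) into timestamps.
sample["COMPLETION DATE"] = pd.to_datetime(sample["COMPLETION DATE"],
                                           format="%m/%d/%Y %H:%M")

# Keep only rows completed within 7 days of the newest entry.
latest = sample["COMPLETION DATE"].max()
recent = sample[sample["COMPLETION DATE"] >= latest - pd.Timedelta(days=7)]

rows = len(recent)
potholes = int(recent["NUMBER OF POTHOLES FILLED ON BLOCK"].sum())
print(rows, potholes)   # 3 rows, 34 potholes
```

The same three steps should apply to the full `potholes_DF`, assuming its date column uses the format shown above.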


That's a lot of potholes. Now extract the location data, clean out the "nan" values, and store it as a list of tuples:

In [10]:
import numpy as np

lat = list(potholes_DF["LATITUDE"])
lon = list(potholes_DF["LONGITUDE"])

# pair each latitude with its longitude
tuple_list = list(zip(lat, lon))

# drop any pair where either coordinate is missing ("nan")
tuple_list = [x for x in tuple_list if not (np.isnan(x[0]) or np.isnan(x[1]))]

Let's compare the length of *potholes_DF* to *tuple_list* to see how many "nan" values we cleaned out:

In [11]:
print(len(potholes_DF),len(tuple_list))

106307 105955


Depending on the week, there may be a handful of "nan" values to clean out. If you were lucky, there were none.
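
Pandas can also do this cleaning before the data ever becomes a list. The sketch below uses a tiny stand-in DataFrame (not the real pothole data) to show `dropna` with the `subset` argument, followed by `zip` to pair the two columns.

```python
import numpy as np
import pandas as pd

# Tiny stand-in for potholes_DF; the second row has missing coordinates.
demo = pd.DataFrame({
    "LATITUDE": [41.787195, np.nan, 41.996206],
    "LONGITUDE": [-87.640291, np.nan, -87.717424],
})

# Drop rows missing either coordinate, then pair the columns row by row.
clean = demo.dropna(subset=["LATITUDE", "LONGITUDE"])
coords = list(zip(clean["LATITUDE"].tolist(), clean["LONGITUDE"].tolist()))

removed = len(demo) - len(coords)
print(removed, "row(s) removed")
print(coords)
```

Dropping rows in the DataFrame keeps latitude and longitude aligned, so a row missing only one of the two values is removed as a whole.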

Now let's look at a few of the tuples in the list:

In [12]:
tuple_list[-10:]

[(41.745690100000004, -87.60546472),
 (41.79259436, -87.79514811),
 (41.96039461, -87.68338403),
 (41.78973533, -87.70494202),
 (41.81948525, -87.69375105),
 (41.99997707, -87.69576195),
 (41.73013202, -87.54693733),
 (41.78719482, -87.64029115),
 (41.7870449, -87.64995715),
 (41.99620599, -87.71742411)]

### <u>Google Maps API with Markers</u>

Let's put a marker every place we found a pothole.
#### *WARNING: Adding more than 500 marker points could potentially crash your kernel!  To combat this, we are creating a list of 500 random entries from the original tuple_list.*

In [13]:
import numpy as np

# Pick 500 distinct index numbers from the whole list (no repeats)
sample_size = min(500, len(tuple_list))
indices_used = np.random.choice(len(tuple_list), size=sample_size, replace=False)

# Add the tuple from each chosen index to the new list of 500
tuple_list_500 = [tuple_list[i] for i in indices_used]
print(tuple_list_500[:10])

[(41.9361212, -87.66369366), (41.9999401, -87.69608763), (41.70531377, -87.71081072), (41.80752865, -87.62105054), (42.01216514, -87.67047259), (41.7670158, -87.61075922), (41.82847362, -87.67312287), (41.83386574, -87.6589391), (42.00794406, -87.81914209), (41.8400597, -87.65422727)]
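
Because the sample is random, the map changes every time the cell is re-run. If you want the same 500 points each time (for example, to compare answers with a classmate), one option is a seeded NumPy random generator. The helper `sample_points` and the made-up coordinates below are assumptions for illustration.

```python
import numpy as np


def sample_points(points, k=500, seed=None):
    """Return up to k distinct entries of `points`, chosen at random."""
    rng = np.random.default_rng(seed)
    k = min(k, len(points))
    chosen = rng.choice(len(points), size=k, replace=False)
    return [points[i] for i in chosen]


# Stand-in data: 2,000 made-up coordinate pairs (not real pothole locations).
fake_points = [(41.0 + i / 10000, -87.0 - i / 10000) for i in range(2000)]

first = sample_points(fake_points, k=500, seed=42)
second = sample_points(fake_points, k=500, seed=42)
print(len(first), len(set(first)), first == second)   # 500 500 True
```

Sampling with `replace=False` is what guarantees 500 distinct entries, and reusing the same seed is what makes the two samples identical.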


In [14]:
# Import the gmaps python module and load in your API Key:
import gmaps
gmaps.configure(api_key="YOUR_API_KEY")  # paste your own key here; never share a notebook that contains a real key
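
An API key typed directly into a notebook is easy to leak when the notebook is shared or uploaded. One common alternative, sketched here, is to read the key from an environment variable. The variable name `GOOGLE_MAPS_API_KEY` and the helper `get_api_key` are assumptions for illustration.

```python
import os


def get_api_key(env_var="GOOGLE_MAPS_API_KEY"):
    """Read the API key from an environment variable (the name is an assumption)."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key


# In the notebook you would then write (not run here):
# gmaps.configure(api_key=get_api_key())
```

With this approach, the notebook file can be shared without the key, as long as each person sets the variable on their own machine.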

In [15]:
from ipywidgets.embed import embed_minimal_html # Allows us to create a separate file for the Google Maps

markers = gmaps.marker_layer(tuple_list_500)    # Create markers for each tuple/coordinate
markermap = gmaps.Map()                         # Create a GMap variable
markermap.add_layer(markers)                    # Add the layer of markers to GMap

# embed_minimal_html('MarkerMap1.html', views=[markermap])
print("*** If no map appears, uncomment the line above, re-run this cell, and check your 'Metadata Part 5' folder to find the new HTML file named \"MarkerMap1.html\". ***")

markermap

*** If no map appears, uncomment the line above, re-run this cell, and check your 'Metadata Part 5' folder to find the new HTML file named "MarkerMap1.html". ***


Map(configuration={'api_key': '<redacted>'}, data_bounds=[(41.659665495591334, -8…

**<u>Question 1:</u>** Look at the marker map at various zoom levels. What do you notice about the map? Comment on anything interesting you see and try to summarize "the good" and "the bad" in this visualization.

**<u>Your Answer:</u>** The closer I zoom in, the farther apart the markers are from each other. The farther I zoom out, the closer the markers are to each other, until they look like they're combining into one.

### <u>Google Maps API to Create a Heatmap</u>

Instead of markers, let's make a heat map:
#### *WARNING: Adding more than 500 data points could potentially crash your kernel!  To combat this, we are again using the list of 500 random entries from the original tuple_list.*

In [16]:
from ipywidgets.embed import embed_minimal_html # Allows us to create a separate file for the Google Maps

heatm = gmaps.Map()
heatm.add_layer(gmaps.heatmap_layer(tuple_list_500))

# embed_minimal_html('HeatMap1.html', views=[heatm])
print("*** If no map appears, uncomment the line above, re-run this cell, and check your 'Metadata Part 5' folder to find the new HTML file named \"HeatMap1.html\". ***")

heatm

*** If no map appears, uncomment the line above, re-run this cell, and check your 'Metadata Part 5' folder to find the new HTML file named "HeatMap1.html". ***


Map(configuration={'api_key': '<redacted>'}, data_bounds=[(41.659665495591334, -8…
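
A heatmap is, in effect, a picture of how many points fall in each small region of the map. As a rough sketch of that idea (not a description of how gmaps renders its layer), the ten coordinates printed earlier in this notebook can be counted on a coarse grid with NumPy's `histogram2d`. The 2 x 2 grid size is an arbitrary choice for illustration.

```python
import numpy as np

# The ten (lat, lon) pairs printed earlier in this notebook.
points = [
    (41.745690100000004, -87.60546472),
    (41.79259436, -87.79514811),
    (41.96039461, -87.68338403),
    (41.78973533, -87.70494202),
    (41.81948525, -87.69375105),
    (41.99997707, -87.69576195),
    (41.73013202, -87.54693733),
    (41.78719482, -87.64029115),
    (41.7870449, -87.64995715),
    (41.99620599, -87.71742411),
]
lats = [p[0] for p in points]
lons = [p[1] for p in points]

# Count how many points fall into each cell of a 2 x 2 latitude/longitude grid.
counts, lat_edges, lon_edges = np.histogram2d(lats, lons, bins=2)
print(counts)
```

Cells with larger counts correspond to the "hotter" areas of a heatmap; a finer grid gives more detail but needs more points to be meaningful.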

**<u>Question 2:</u>** Look at the heatmap at various zoom levels. What do you notice about the map? Comment on anything interesting you see and try to summarize "the good" and "the bad" in this visualization.

**<u>Your Answer:</u>** 

### <u>Task 1: Find your own dataset!</u>

You are going to create a marker map **and** a heatmap from a dataset you have found. For Task 1, find a dataset with location data (GPS coordinates!). Fill in the following:

_Name:_ McDonald's Locations

_Date:_ 2019

_Source for Data Set:_ Kaggle

_URL for Data Set:_ https://www.kaggle.com/datasets/ben1989/mcdonalds-locations/metadata?resource=download

_Description of Data Set:_ "Not Provided"

_File Format for Data Set:_ .csv

_Age of Data Set:_ 3 years

### <u>Task 2: Show some entries from your dataset</u>

Import your data set as a Pandas Data Frame, then show the last 10 entries:

In [17]:
# Your code here
import pandas as pd

McDonalds_DF = pd.read_csv("McDonalds.csv")

# display the last 10 McDonald's locations in the data set
McDonalds_DF[-10:]

Unnamed: 0,geometry.coordinates,properties.addressLine1,properties.addressLine2,properties.addressLine3,properties.drivethruhours.driveHoursFriday,properties.drivethruhours.driveHoursMonday,properties.drivethruhours.driveHoursSaturday,properties.drivethruhours.driveHoursSunday,properties.drivethruhours.driveHoursThursday,properties.drivethruhours.driveHoursTuesday,...,properties.restaurantUrl,properties.restauranthours.hoursFriday,properties.restauranthours.hoursMonday,properties.restauranthours.hoursSaturday,properties.restauranthours.hoursSunday,properties.restauranthours.hoursThursday,properties.restauranthours.hoursTuesday,properties.restauranthours.hoursWednesday,properties.subDivision,properties.telephone
13791,"[-124.238837, 43.392261]",3303 BROADWAY AVENUE,WALNUT CREEK FIELD OFFICE,NORTH BEND,04:00 - 01:00,04:00 - 01:00,04:00 - 01:00,05:00 - 00:00,04:00 - 01:00,04:00 - 01:00,...,,05:00 - 23:00,05:00 - 23:00,05:00 - 23:00,06:00 - 23:00,05:00 - 23:00,05:00 - 23:00,05:00 - 23:00,OR,(541) 756-6617
13792,"[-124.190687, 40.776316]",3450 S BROADWAY,WALNUT CREEK FIELD OFFICE,EUREKA,05:00 - 02:00,05:00 - 02:00,05:00 - 02:00,05:00 - 02:00,05:00 - 02:00,05:00 - 02:00,...,https://my.peoplematter.com/USMCD1000892164/Hi...,06:00 - 23:00,06:00 - 23:00,06:00 - 23:00,06:00 - 23:00,06:00 - 23:00,06:00 - 23:00,06:00 - 23:00,CA,(707) 442-5981
13793,"[-124.252163, 43.389526]",2051 NEWMARK AVE,WALNUT CREEK FIELD OFFICE,COOS BAY,06:00 - 23:00,06:00 - 22:00,06:00 - 23:00,07:00 - 22:00,06:00 - 22:00,06:00 - 22:00,...,,06:00 - 23:00,06:00 - 22:00,06:00 - 23:00,07:00 - 22:00,06:00 - 22:00,06:00 - 22:00,06:00 - 22:00,OR,(541) 888-2479
13794,"[-123.867098, 46.975623]",2501 SIMPSON AVENUE,WALNUT CREEK FIELD OFFICE,HOQUIAM,05:00 - 00:00,05:00 - 00:00,05:00 - 00:00,05:00 - 00:00,05:00 - 00:00,05:00 - 00:00,...,https://my.peoplematter.com/USMCD1000891249/Hi...,05:00 - 23:00,05:00 - 23:00,05:00 - 23:00,05:00 - 23:00,05:00 - 23:00,05:00 - 23:00,05:00 - 23:00,WA,(360) 532-6020
13795,"[-124.28712, 42.053332]",815 CHETCO AVE,WALNUT CREEK FIELD OFFICE,BROOKINGS,05:00 - 00:00,05:00 - 00:00,05:00 - 00:00,05:00 - 00:00,05:00 - 00:00,05:00 - 00:00,...,https://my.peoplematter.com/USMCD1000892290/Hi...,05:00 - 00:00,05:00 - 23:00,05:00 - 23:00,05:00 - 23:00,05:00 - 00:00,05:00 - 23:00,05:00 - 23:00,OR,(541) 469-9572
13796,"[-124.054297, 46.342259]",100 16TH ST SE,WALNUT CREEK FIELD OFFICE,LONG BEACH,06:00 - 00:00,06:00 - 23:00,06:00 - 00:00,06:00 - 23:00,06:00 - 23:00,06:00 - 23:00,...,https://my.peoplematter.com/USMCD1000890791/Hi...,06:00 - 23:00,06:00 - 22:00,06:00 - 23:00,06:00 - 22:00,06:00 - 22:00,06:00 - 22:00,06:00 - 22:00,WA,(360) 642-8212
13797,"[-124.162087, 47.007879]",701 POINT BROWNE AVE NW,WALNUT CREEK FIELD OFFICE,OCEAN SHORES,05:00 - 00:00,05:00 - 00:00,05:00 - 00:00,05:00 - 00:00,05:00 - 00:00,05:00 - 00:00,...,https://my.peoplematter.com/USMCD1000891249/Hi...,05:00 - 23:00,05:00 - 23:00,05:00 - 23:00,05:00 - 23:00,05:00 - 23:00,05:00 - 23:00,05:00 - 23:00,WA,(360) 289-8672
13798,"[-131.672389, 55.34867]",2417 TONGASS AVE,WALNUT CREEK FIELD OFFICE,KETCHIKAN,05:00 - 03:00,05:00 - 00:00,04:00 - 03:00,04:00 - 00:00,05:00 - 00:00,05:00 - 00:00,...,https://my.peoplematter.com/USMCD1000891356/Hi...,05:00 - 00:00,05:00 - 00:00,05:00 - 00:00,05:00 - 00:00,05:00 - 00:00,05:00 - 00:00,05:00 - 00:00,AK,(907) 225-1704
13799,"[-134.576119, 58.364486]",2285 TROUT ST,WALNUT CREEK FIELD OFFICE,JUNEAU,05:00 - 01:00,05:00 - 00:00,05:00 - 01:00,05:00 - 00:00,05:00 - 00:00,05:00 - 00:00,...,https://my.peoplematter.com/USMCD1000891356/Hi...,05:00 - 00:00,05:00 - 00:00,05:00 - 00:00,05:00 - 00:00,05:00 - 00:00,05:00 - 00:00,05:00 - 00:00,AK,(907) 789-4653
13800,"[-135.349288, 57.060111]",913 HALIBUT POINT RD,WALNUT CREEK FIELD OFFICE,SITKA,06:00 - 23:00,06:00 - 23:00,06:00 - 23:00,06:30 - 22:30,06:00 - 23:00,06:00 - 23:00,...,https://my.peoplematter.com/USMCD1000891356/Hi...,06:00 - 23:00,06:00 - 23:00,06:00 - 23:00,06:30 - 22:30,06:00 - 23:00,06:00 - 23:00,06:00 - 23:00,AK,(907) 747-8709
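
Notice that each value in the `geometry.coordinates` column appears to list longitude first and latitude second (a latitude can never be -124). Mapping tools that expect (latitude, longitude) will misplace or reject points if the order is swapped. A quick range check can catch that mistake. The helper name `is_valid_lat_lon` is an assumption for illustration.

```python
def is_valid_lat_lon(lat, lon):
    """Return True if (lat, lon) is a plausible decimal-degree coordinate."""
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


# First row of the table above, stored as [longitude, latitude].
lon, lat = -124.238837, 43.392261

print(is_valid_lat_lon(lat, lon))   # True: correct order
print(is_valid_lat_lon(lon, lat))   # False: swapped, -124 is not a latitude
```

This check only catches swaps when the longitude falls outside the range -90 to 90, so it is a sanity test, not a guarantee.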


### <u>Task 3: Create a list of tuples</u>

Use your dataset to create a list of tuples (a list of decimal-degree (DD) coordinates) representing the locations in your dataset:
#### *WARNING: Adding more than 500 marker points could potentially crash your kernel!  To combat this, create a list of 500 random entries from the original list of tuples.*

In [31]:
# create list of tuples
import json

coordinates = list(McDonalds_DF["geometry.coordinates"])

tuple_list = []

for entry in coordinates:
    if not isinstance(entry, str):       # skip missing ("nan") entries
        continue
    lon, lat = json.loads(entry)         # each entry is a string like "[-124.238837, 43.392261]"
    tuple_list.append((lat, lon))        # gmaps expects (latitude, longitude)

print(tuple_list[-3:])

[(55.34867, -131.672389), (58.364486, -134.576119), (57.060111, -135.349288)]

### <u>Task 4: Create a marker map from your data</u>

Use the Google Maps API to create a marker map using your list of tuples from above.

In [19]:
# Your code here
import numpy as np

# Pick 500 distinct index numbers from the whole list (no repeats)
sample_size = min(500, len(tuple_list))
indices_used = np.random.choice(len(tuple_list), size=sample_size, replace=False)
tuple_list_500 = [tuple_list[i] for i in indices_used]
print(tuple_list_500[:10])

markers = gmaps.marker_layer(tuple_list_500)    # Create markers for each tuple/coordinate
markermap = gmaps.Map()                         # Create a GMap variable
markermap.add_layer(markers)                    # Add the layer of markers to GMap
markermap

[(36.138497, -86.882105), (35.871699, -86.353194), (35.811719, -86.368039), (35.944496, -84.069428), (37.008219, -85.922157), (37.139065, -84.097713), (35.635171, -87.016889), (35.80686, -86.396984), (36.740518, -84.472804), (38.163287, -85.597376)]


### <u>Task 5: Create a heatmap from your data</u>

Use the Google Maps API to create a **heatmap** using your list of tuples from above.

*Note: The Google Maps API can struggle with heatmaps that have more than 1000 datapoints. If your map is not working, try reducing your list to fewer tuples (try creating a list with just the most recent 100 entries in the dataset). Once this works, you can always add in a few more tuples!*

In [22]:
from ipywidgets.embed import embed_minimal_html # Allows us to create a separate file for the Google Maps

heatm = gmaps.Map()
heatm.add_layer(gmaps.heatmap_layer(tuple_list_500))

embed_minimal_html('HeatMap1.html', views=[heatm])
print("*** If no map appears, check your 'Metadata Part 5' folder to find the new HTML file named \"HeatMap1.html\". ***")

heatm

*** If no map appears, check your 'Metadata Part 5' folder to find the new HTML file named "HeatMap1.html". ***

### <u>Task 6: Comment on what you see</u>

Look at your marker map and your heatmap at various zoom levels. Comment on anything interesting or notable that you see. 

**<u>Your Answer:</u>** 

### <u>Task 7: Brainstorm further study</u>

If you had more time and resources, what else would you like to explore using the GPS data in this dataset?

**<u>Your Answer:</u>**