# Lab 6 - Mapping data

In this lab, we will use the [folium package](https://python-visualization.github.io/folium/quickstart.html) to create maps with markers on them and to make [choropleth maps](https://en.wikipedia.org/wiki/Choropleth_map).

### Section 1:  Installing the folium package

Follow the appropriate instructions for how you are running Python and Jupyter notebooks.

#### Jupyter Hub on Lehman 360: 
Run the following code and wait until the \[*\] beside the cell changes to a number before continuing. This may take 5-10 minutes; if it takes longer, try restarting the notebook.

In [None]:
!pip install --user folium

#### Google Colab
Folium should be installed, but there are other problems with getting the maps to display and uploading files.  Instead, use Jupyter Hub on Lehman 360 for this lab.

#### Anaconda on your own computer

1. In Anaconda Navigator, click on Environments in the left menu.
2. Select "Not installed" in the list.
3. Select folium and click "Apply" in the pop-up box.

[Instructions with images](https://docs.anaconda.com/anaconda/navigator/tutorials/manage-packages/#installing-a-package)

If you can't get folium installed, let me know what you have tried and any error messages as soon as possible, so we can figure out how to get it installed.

Now import folium and pandas.

In [None]:
import folium
import pandas as pd
%matplotlib inline

### Section 2:  Folium maps and markers

To create a folium map, we just need to provide a latitude and longitude.  Latitude specifies the north-south position on the globe, with north of the equator being positive and south of the equator being negative.  Longitude specifies the east-west position on the globe, relative to the meridian at Greenwich, UK.  West of the meridian is written as negative and east of the meridian is written as positive.

In [None]:
folium.Map(location=[40.8747, -73.8951])

What is shown in the map?

We can also save the map as a variable and display it that way:

In [None]:
m = folium.Map(location=[40.8747, -73.8951])
m

Let's put a marker on the map `m` at Lehman College, which is located at 40.8733° N, 73.8941° W.

In [None]:
folium.Marker([40.8733, -73.8941],
              popup="Lehman College", 
              tooltip="Click me!").add_to(m)
m

What does the marker look like?

Try clicking on the marker.  In the `Marker()` function, what does the parameter `popup` do?  What does the parameter `tooltip` do?

Add another marker for City College, which is located at 40.8200° N, 73.9493° W.  Use either the popup or tooltip parameter (your choice) to label the marker as City College.
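
Coordinates are often written with compass letters, as above, while folium expects signed numbers. The sign convention from the start of this section can be sketched as a small helper. The function name `to_signed` is made up for this illustration and is not part of folium.

```python
def to_signed(value, direction):
    """Convert a coordinate with a compass letter to a signed number.

    North and east are positive; south and west are negative.
    """
    direction = direction.upper()
    if direction not in ("N", "S", "E", "W"):
        raise ValueError("direction must be one of N, S, E, W")
    return -value if direction in ("S", "W") else value


# Lehman College: 40.8733° N, 73.8941° W
lehman = [to_signed(40.8733, "N"), to_signed(73.8941, "W")]
print(lehman)
```

A list like `lehman` can be passed as the first argument to `folium.Marker()`.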

We can also change the color and image on the marker.  For example, the following code marks the location of Hostos Community College with a red marker with an info sign on it.

In [None]:
folium.Marker([40.8181, -73.9269], 
              popup='Hostos Community College', 
              tooltip="Click for info",
              icon=folium.Icon(color='red', icon='info-sign')
             ).add_to(m)
m

You can usually find the latitude and longitude of a New York landmark by searching for the landmark name together with the word "coordinates".  Find the coordinates for another landmark in New York and add that marker to your map.

### Section 3: Location of Recycling Bins

We'll now create a map of all public recycling bins in New York City from the data set of locations on NYC Open Data:  [https://data.cityofnewyork.us/Environment/Public-Recycling-Bins/sxx4-xhzg](https://data.cityofnewyork.us/Environment/Public-Recycling-Bins/sxx4-xhzg)

To download from the NYC Open Data site:

- click "View Data" (blue button in upper right)
- on the next page, click "Export" (in menu in upper right)
- click "CSV" to download
    
Or use this URL for the CSV file: [https://raw.githubusercontent.com/megan-owen/MAT328-Techniques_in_Data_Science/main/data/Public_Recycling_Bins.csv](https://raw.githubusercontent.com/megan-owen/MAT328-Techniques_in_Data_Science/main/data/Public_Recycling_Bins.csv)
    
Read the CSV file into a dataframe called `bins`.

Is there any missing data?  

There is some missing data, so let's drop all rows with an NaN value.  We have already covered this, but an easier way to drop all rows with at least one missing value is shown below.

In [None]:
bins = bins.dropna()

Display `bins` and look at the index, which is the number at the far left.  Scroll down to rows 15, 16, and 17.  The row with index 16 is missing: it must have contained missing values and been removed.  The jump in the row index from 15 to 17 would cause problems later in this lab, so we will re-index the DataFrame.

In [None]:
bins = bins.reset_index(drop = True)
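
To see what these two steps do, here is a minimal sketch on a small made-up DataFrame. The values are invented and are not from the recycling bins data.

```python
import pandas as pd

# A made-up DataFrame with one missing value in the row with index 1
toy = pd.DataFrame({"Latitude": [40.1, float("nan"), 40.3],
                    "Longitude": [-73.1, -73.2, -73.3]})

dropped = toy.dropna()            # removes the row with index 1
print(list(dropped.index))        # the index labels now skip 1

renumbered = dropped.reset_index(drop=True)
print(list(renumbered.index))     # the index labels are consecutive again
```

Without `drop=True`, `reset_index()` would keep the old index as an extra column named `index`.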

Display `bins`, and look at the index around rows 15, 16, and 17.  What happened?

Create a new folium map variable called `bins_map` centered at 40.7128° N, 74.0060° W (New York City coordinates).

Recall that in a `for` loop over `range(n)`, the counter counts from 0 to n - 1.  In the example below, the counter is `i` and it counts from 0 to 4.

In [None]:
for i in range(5):
    print(i)

We can use this to loop through the rows in our dataframe `bins`.  Below we loop through the first 10 rows of `bins`, store the current row in the variable `row`, and use it to print the latitude and longitude.

In [None]:
for i in range(10):
    row = bins.loc[i]
    print("Coordinates: ", row["Latitude"], row["Longitude"] )
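
The loop above looks up rows by index label with `.loc`, which is why a gap in the index would matter. Here is a minimal sketch, on a made-up DataFrame with invented values, of the difference between `.loc` (label-based) and `.iloc` (position-based).

```python
import pandas as pd

# Made-up data whose index labels skip 1, as can happen after dropna()
toy = pd.DataFrame({"Latitude": [40.1, 40.3]}, index=[0, 2])

# .iloc uses position: the second row is at position 1
second_by_position = toy.iloc[1]["Latitude"]
print(second_by_position)

# .loc uses the index label: there is no row labeled 1
try:
    toy.loc[1]
    label_found = True
except KeyError:
    label_found = False
print(label_found)
```

After `reset_index(drop=True)`, labels and positions agree, so both approaches return the same row.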

Now, let's plot the bins on the map.  First, change the code below to loop through all rows in `bins`, not just the first 10 (how can we find the number of rows?).  We will then use the latitude and longitude from each row to create a new Marker for our map.

In [None]:
for i in range(10):
    row = bins.iloc[i]
    print("Coordinates: ", row["Latitude"], row["Longitude"] )

Next, in this loop, replace the code 

`print("Coordinates: ", row["Latitude"], row["Longitude"] )`

with code that places a folium Marker at latitude `row["Latitude"]` and longitude `row["Longitude"]` on `bins_map`, and display the map.

Note:  You have to rerun the code creating `bins_map` if you make a mistake in adding the markers and want to clear them.

What's the closest bin to Lehman College?

There's a column in the `bins` dataframe called "Park/Site Name".  Can you re-plot the map of the bins, but using the value in this column as the tooltip text?

<details><summary>Answer:</summary>
<code>
for i in range(bins.shape[0]):
    row = bins.iloc[i]
    folium.Marker([row["Latitude"], row["Longitude"]],
                 tooltip = row["Park/Site Name"]
                 ).add_to(bins_map)
bins_map
</code>
</details>



### Section 4: Choropleth Maps

A choropleth map is a map in which areas are shaded or colored in proportion to a statistic (such as the mean) of some property of each area, like income or population density.

For our first choropleth map, we'll use the NYC school district boundaries, originally downloaded from NYC Open Data.

Download the GeoJSON file from [https://github.com/megan-owen/MAT328-Techniques_in_Data_Science/blob/main/data/nyc_school_districts.json](https://github.com/megan-owen/MAT328-Techniques_in_Data_Science/blob/main/data/nyc_school_districts.json)


This GeoJSON file was originally downloaded from [NYC Open Data Planning](https://www1.nyc.gov/site/planning/data-maps/open-data/districts-download-metadata.page) (first row under "School, Police, Health & Fire"), but it is currently unavailable there.

The district math scores are at https://infohub.nyced.org/reports-and-policies/citywide-information-and-data/test-results under "Math Test Results 2013 to 2019".  They are only available as an Excel file, which would need to be opened in Excel and the correct sheet saved as a CSV file.

Instead, download the CSV file directly from [http://comet.lehman.cuny.edu/owen/teaching/mat328/math_district.csv](http://comet.lehman.cuny.edu/owen/teaching/mat328/math_district.csv)

Create a New York City map called `school_map`:

Create a layer showing the school districts, and add it to your map.  You may get a warning that says "IOPub data rate exceeded." and no map is displayed.  This does not prevent us from saving the map in the next step and viewing it that way.

In [None]:
folium.Choropleth(geo_data ="nyc_school_districts.json",
                     fill_opacity=0.5, line_opacity=0.5
                     ).add_to(school_map)
school_map

The following code saves the map as an HTML file in the same place (folder) as this lab.  You should be able to open the file in a web browser to view the map.

In [None]:
school_map.save(outfile='testScores.html')

Let's color the districts by the mean 8th grade math score in 2018.  Read the math district scores CSV file into a dataframe called `all_scores`:

Filter this dataset to only include scores from 2018 and 8th grade (note the `Grade` column is stored as categorical data, so you will need to write `"8"` in your filter).  Call your new dataframe `scores_gr8_2018`:
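
Combining two conditions in a filter uses `&`, with parentheses around each condition. Below is a minimal sketch on a made-up DataFrame. The column name `Year` and all of the values are assumptions for illustration, so check the actual column names in `all_scores` first.

```python
import pandas as pd

# Made-up scores; Grade is stored as text, Year as a number (assumed)
toy = pd.DataFrame({"District": [1, 1, 2, 2],
                    "Grade": ["8", "7", "8", "8"],
                    "Year": [2018, 2018, 2018, 2017],
                    "Mean Scale Score": [600, 590, 610, 605]})

# Comparing with the number 8 matches nothing, because "8" is text
matches_as_number = int((toy["Grade"] == 8).sum())
print(matches_as_number)

# Compare with the string "8" instead, and combine conditions with &
subset = toy[(toy["Grade"] == "8") & (toy["Year"] == 2018)]
print(list(subset["District"]))
```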

Reset the `school_map` by creating it again: 

To add the shading to the map:

In [None]:
#Create a layer, shaded by test scores:
folium.Choropleth(geo_data ="nyc_school_districts.json",
                     fill_color='YlGn',
                     data = scores_gr8_2018,
                     key_on="feature.properties.SchoolDist",
                     columns = ['District', 'Mean Scale Score'],
                     fill_opacity=0.4, line_opacity=0.5
                     ).add_to(school_map)
school_map
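
The `key_on` argument tells folium where in each GeoJSON feature to find the value that is matched against the first column listed in `columns`. Here is a minimal sketch of that lookup on a made-up, heavily simplified GeoJSON dictionary. The real file has many more features and actual coordinates, and the helper `lookup` is hypothetical, not a folium function.

```python
# A made-up, heavily simplified GeoJSON structure with two districts
geo = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature",
         "properties": {"SchoolDist": 10},
         "geometry": {"type": "Polygon", "coordinates": []}},
        {"type": "Feature",
         "properties": {"SchoolDist": 11},
         "geometry": {"type": "Polygon", "coordinates": []}},
    ],
}


def lookup(feature, key_on):
    """Follow a dotted path such as 'feature.properties.SchoolDist'."""
    value = feature
    for part in key_on.split(".")[1:]:   # skip the leading 'feature'
        value = value[part]
    return value


keys = [lookup(f, "feature.properties.SchoolDist") for f in geo["features"]]
print(keys)
```

If a map comes out unshaded, a mismatch between these keys and the values in the `District` column (for example, numbers versus strings) is one thing to check.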


Save your new map to a different .html file.