- To follow along with this lecture, you will need to install the `plotly` package in your PIC16B Anaconda environment.

- If your figures render as a blank screen, setting the default renderer as shown below has fixed the problem for at least one student in the past.

```
import plotly.io as pio 

pio.renderers.default= 'iframe'
```

- Reminder about Intro + project pitch, HW0, HW1.

- Today: Interactive, geographic visualization using Plotly

https://plotly.com/python/

In [11]:
import pandas as pd
from plotly import express as px
import plotly.io as pio 

pio.renderers.default= 'iframe'

# Scatter Maps

https://plotly.github.io/plotly.py-docs/generated/plotly.express.scatter_mapbox.html


In [12]:
coords = pd.DataFrame({
    "lon" : [-118.44300984639733], 
    "lat" : [34.0696449790177],
    "message" : ["We are here!"]
})
coords

|   | lon | lat | message |
|---|-----|-----|---------|
| 0 | -118.44301 | 34.069645 | We are here! |


In [16]:
fig = px.scatter_mapbox(coords, 'lat', 'lon', hover_name="message", mapbox_style="open-street-map")

fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
# fig.show()


Let's break this down a bit. Earlier, we imported the `express` module of `plotly` under the name `px`. This module provides a high-level interface to a variety of Plotly tools. One can also work directly with the low-level `graph_objects` module, which allows a finer level of control over the settings of visualizations. We won't use `graph_objects` in this course. 

The magic happens when we call `px.scatter_mapbox()`. The first argument must be a data frame. The `lat` and `lon` arguments tell `px` which columns contain the latitude and longitude coordinates. The `hover_name` argument specifies what should appear when we hover over the plotted point with our mouse. Two optional arguments that the call above does not set are `zoom`, which controls the initial zoom level of the map (the user can change it afterwards), and `height`, which sets the height of the figure and therefore its aspect ratio. There are many [other parameters](https://plotly.github.io/plotly.py-docs/generated/plotly.express.scatter_mapbox.html) to `px.scatter_mapbox()`. 

The `mapbox_style` argument controls which *map tiles* are used in the visualization, and the call to `fig.update_layout()` sets the margins, that is, the amount of whitespace around the visualization. Finally, calling `fig.show()` actually displays the map. 

Now let's try changing up the zoom level and the map tiles. The `carto-positron` tiles from CartoDB are very low-contrast, which keeps the focus on the data when these tiles are used as a background. 

In [None]:
# different zoom level, use cartoDB tiles

fig = px.scatter_mapbox(

)

fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
fig.show()

Maybe you dream of mountains, valleys, and beaches? 

In [None]:
# different zoom level, use Stamen Terrain tiles
fig = px.scatter_mapbox(

)

fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
fig.show()

Summing up, Plotly makes it unreasonably easy to create attractive, interactive maps in Python. Let's now go from "pretty maps" to "informative, scientific data graphics." 

# Visualizing Climate Measurement Stations

Let's now use our GHCN data on global temperatures to create some interesting visualizations. As a first step, we'll create a set of markers for different climate stations. First, let's grab the data on stations: 

In [None]:
import numpy as np

#url = "https://raw.githubusercontent.com/PhilChodrow/PIC16B/master/datasets/noaa-ghcn/station-metadata.csv"
stations = pd.read_csv("../sql/station-metadata.csv")
stations.head()

For the purposes of geographic plotting, the key columns here are the `LATITUDE` and `LONGITUDE` columns. Let's try plotting! 

Note that it might take a little while for the map to render. There are 27.5k points, which is kind of a lot! 

In [None]:
fig = px.scatter_mapbox(

)

fig.update_layout(mapbox_style="carto-positron")
fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
fig.show()

This is cool and interactive, but there are a few shortcomings if we want to display scientific information. It's hard to make comparisons: for example, it looks like there might be a higher density of stations in the US than in many other areas, but it's hard to be sure from the map above. For comparing densities, *heatmaps* provide a useful approach. Plotly again makes this unreasonably easy. 

In [None]:
fig = px.density_mapbox(

)

fig.update_layout(mapbox_style="carto-positron")
fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
fig.show()

The colors get brighter and more intense the more stations there are in that area. We can notice a few things, such as the very high density of measurement stations in the US and Germany. 

However, it's harder to see patterns when we zoom in much more. If we want to look at patterns within Europe, for example, we might want to increase the radius. 

Experimentation with the [various arguments](https://plotly.github.io/plotly.py-docs/generated/plotly.express.density_mapbox.html) of `px.density_mapbox()`, especially `radius`, is usually necessary to obtain a good result. 

In [None]:
fig = px.density_mapbox(

)

fig.update_layout(mapbox_style="carto-positron")
fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
fig.show()

## Geographic Scatterplots

Another thing we might want to do is color code the climate stations according to some quantitative measure. Let's compute the average temperature in January for each one over the most recent decade, and use this to color code them. 

In [None]:
# from group exercise last week
import sqlite3
conn = sqlite3.connect("../sql/temps.db")

cmd = \
"""
SELECT SUBSTRING(S.id,1,2) country, S.name, ROUND(AVG(T.temp), 1) mean_temp, S.latitude, S.longitude
FROM temperatures T
LEFT JOIN stations S ON T.id = S.id
WHERE (T.month = 1) AND (T.year BETWEEN 2011 AND 2020)
GROUP BY S.name
"""

temp_per_country = pd.read_sql_query(cmd, conn)
conn.close()

In [None]:
temp_per_country #  averages of Jan temperatures from 2011 to 2020, per station
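
To see what the query does without the full database, here is a small sketch that runs it against an in-memory SQLite database. The table contents are made up, `SUBSTR` is used in place of `SUBSTRING` because it is available in older SQLite versions as well, and the `ORDER BY` clause is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stations (id TEXT, name TEXT, latitude REAL, longitude REAL);
CREATE TABLE temperatures (id TEXT, year INTEGER, month INTEGER, temp REAL);
INSERT INTO stations VALUES
    ('USW00000001', 'STATION_A', 34.07, -118.44),
    ('FRM00000002', 'STATION_B', 48.86, 2.35);
INSERT INTO temperatures VALUES
    ('USW00000001', 2011, 1, 14.0),
    ('USW00000001', 2012, 1, 15.0),
    ('USW00000001', 2012, 7, 25.0),   -- July: removed by the month filter
    ('USW00000001', 2005, 1, 10.0),   -- 2005: removed by the year filter
    ('FRM00000002', 2015, 1, 4.5);
""")

cmd = """
SELECT SUBSTR(S.id, 1, 2) country, S.name, ROUND(AVG(T.temp), 1) mean_temp,
       S.latitude, S.longitude
FROM temperatures T
LEFT JOIN stations S ON T.id = S.id
WHERE (T.month = 1) AND (T.year BETWEEN 2011 AND 2020)
GROUP BY S.name
ORDER BY S.name
"""

rows = conn.execute(cmd).fetchall()
conn.close()

for row in rows:
    print(row)
```

Each output row describes one station: the first two characters of its ID (the country code), its name, its mean January temperature over 2011 to 2020, and its coordinates.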

Great! This is the data we need. Now we can supply this data to `px.scatter_mapbox`, passing as the value of `color` the name of the column that we want to use to shade the points. 

In [None]:
fig = px.scatter_mapbox(

)

fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
fig.show()

This plot makes it easy to see that stations near the equator tend to be warmer (at least in January). 

# Saving and Sharing

To save your visualization as HTML, just use `write_html` from `plotly.io`. 

In [None]:
from plotly.io import write_html
write_html(fig, "geo_scatter.html")

You can then send this file to people you'd like to impress! 

If you use Quarto, these links will be helpful:
- https://quarto.org/docs/interactive/widgets/jupyter.html#plotly
- https://quarto.org/docs/get-started/computations/vscode.html

# Choropleths

A *choropleth* is a polygon-based visualization, in which different geographic polygons are assigned different colors. If you've ever seen a map of election results by state, or of CO2 emissions by country, you've seen a choropleth. 

Let's make one! We'll visualize the average January temperature for each country. We need two things: 

1. A data frame containing the average January temperature for each country. 
2. A GeoJSON file containing the coordinates for the country polygons. 

GeoJSON files are pretty complex, but fortunately we don't really need to interact with them too much. The code below uses the `json` module to read a GeoJSON file from the web. This file contains the borders of countries. 

https://plotly.com/python/choropleth-maps/

In [None]:
from urllib.request import urlopen
import json

countries_gj_url = "https://raw.githubusercontent.com/PhilChodrow/PIC16B/master/datasets/countries.geojson"

with urlopen(countries_gj_url) as response:
    countries_gj = json.load(response)

countries_gj

GeoJSON files can be very complicated, and often contain large quantities of metadata. For our purposes, we only need the name of the country and the shape in coordinates, which is supplied by the `geometry` feature: 

In [None]:
type(countries_gj), countries_gj.keys()

In [None]:
type(countries_gj['features']), len(countries_gj['features'])

In [None]:
type(countries_gj["features"][1]), countries_gj["features"][1].keys()

In [None]:
type(countries_gj["features"][1]['properties']), countries_gj["features"][1]['properties'].keys()

In [None]:
countries_gj["features"][1]

In [None]:
len(countries_gj["features"])
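
The cells above walk down the nesting of the file one level at a time. The same structure can be seen in a tiny hand-written GeoJSON document, sketched below. The country, its coordinates, and the `name` property key are all made up; the property keys in the real file may differ.

```python
import json

# A minimal FeatureCollection with a single square "country"
tiny_geojson_text = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {"name": "Squareland"},
      "geometry": {
        "type": "Polygon",
        "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]]
      }
    }
  ]
}
"""

tiny_gj = json.loads(tiny_geojson_text)

feature = tiny_gj["features"][0]               # one feature per country
country_name = feature["properties"]["name"]   # metadata lives in "properties"
ring = feature["geometry"]["coordinates"][0]   # outer boundary as [lon, lat] pairs

print(country_name, feature["geometry"]["type"], len(ring))
```

Note that the boundary ring repeats its first point at the end, which closes the polygon.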

The next thing we need is temperature data and country names!

In [None]:
temp_per_country # average Jan temp from 2011 to 2020, per station, with a country code column

In [None]:
#countries_url = "https://raw.githubusercontent.com/mysociety/gaze/master/data/fips-10-4-to-iso-country-codes.csv"
countries_code = pd.read_csv('../sql/countries.csv')

countries_code

In [None]:
# https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.merge.html
# The merge below is the pandas equivalent of the SQL query
#   SELECT ...
#   FROM countries_code
#   INNER JOIN temp_per_country
#     ON countries_code."FIPS 10-4" = temp_per_country.country


countries_code = countries_code.merge(temp_per_country, 
                                      how='inner', 
                                      left_on='FIPS 10-4', 
                                      right_on='country')

countries_code
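
To make the effect of the inner join concrete, here is a small sketch with made-up rows. The `ISO 3166` and `Name` columns are assumptions about the contents of `countries.csv`; only `FIPS 10-4` appears in the code above.

```python
import pandas as pd

# Hypothetical stand-in for the country code table
codes = pd.DataFrame({
    "FIPS 10-4": ["US", "FR", "JA"],
    "ISO 3166": ["US", "FR", "JP"],                   # assumed column
    "Name": ["United States", "France", "Japan"],     # assumed column
})

# Hypothetical stand-in for `temp_per_country` (one row per station)
temps = pd.DataFrame({
    "country": ["US", "US", "FR", "XX"],
    "name": ["STATION_A", "STATION_B", "STATION_C", "STATION_D"],
    "mean_temp": [13.9, -2.0, 4.2, 7.7],
})

merged = codes.merge(temps, how="inner", left_on="FIPS 10-4", right_on="country")
print(merged[["Name", "name", "mean_temp"]])
```

An inner join keeps only rows with a match on both sides: Japan disappears because it has no stations in `temps`, and the station with the unknown code `XX` disappears because it has no entry in `codes`.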



And now we're done with our data prep! We now need to use `px.choropleth` to create the map. We need to pass the data frame of temperature data, the GeoJSON file, and some additional information.

- `locations`: We need to indicate which column in `countries_code` to use as the identifiers of countries.
- `locationmode`: We need to specify that the values in the column passed to `locations` are names of countries and not, say, FIPS ID codes.
- `color`: We need to state which column should be used to determine the color of each country.



In [None]:
fig = px.choropleth(

)

fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})

## Paired exercise:

Think back to your first days at UCLA. What information would have been helpful for you? Describe at least one example for each type of (geographical) visualization that we've covered today. 

You can assume you have any data you need, and the scope can be as small as the UCLA campus or as large as the world. It can be silly or serious.

For example,

`scatter_mapbox`
- Question: Where are the showers with the best water pressure on campus?
- Each point corresponds to one shower, and the size of the point scales with the strength of the water pressure.