# Introduction to choropleth maps with plotly express
[Plotly Express](https://plotly.com/python/plotly-express/) is a high-level API for creating charts using Python. By high level, we mean that much of the complexity of the code required to create a chart, such as you saw with matplotlib, is hidden from you and handled by the API, so you can create charts with less code. It also lets you create interactive charts, i.e. those that the user can interact with, such as the animated bubble chart of the gapminder data that you saw in week 1. In fact, you used the Plotly Express library to create that chart.

We are using Plotly Express this week to introduce you to the Plotly syntax and to some interactive chart types. Next week we will take this to the next level and use a web framework, Plotly Dash, to create interactive dashboards that can be delivered as web apps.

You might recall that JavaScript (not Python) is used to add interactive elements to a web page. So why are you learning Plotly? Plotly uses D3.js behind the scenes. D3.js is a JavaScript library for producing dynamic, interactive data visualizations in web browsers. Why don't we just learn D3.js then? Well, we could, but it would mean learning another programming language, JavaScript, whereas the foundation language in this IEP minor is Python (and Python is more widely used in data science than JavaScript).

Plotly Express is now a module in the plotly Python package, though it used to be a separate package. Plotly.py is an open-source library built on plotly.js, which in turn is built on d3.js. To add a further source of confusion, there are numerous third-party wrappers for Plotly for different languages.

The plotly express API also includes access to [some datasets](https://plotly.com/python-api-reference/generated/plotly.data.html#module-plotly.data), two of which we have already used (gapminder and iris). We will use some of their other built in datasets in this exercise.
 
### Getting help from the [Plotly community forum](https://community.plotly.com/) and elsewhere
The current version of Plotly.py at the time of writing is 4.9. Some of the examples you may find online are from much earlier versions, and the code is likely to need adapting to work with the current version.

## Getting started

Let's get on with some coding.

You should already have installed the libraries for this week if you have been using the GitHub repository. If not then you will need to install plotly into your Python environment first using `pip install plotly`.

The first step is to import the necessary libraries for use in this notebook. You can use `import plotly.express as px`; however, as of version 4.8 of plotly, you can also use a Plotly Express-powered backend for Pandas plotting. This means you can produce interactive plots directly from a data frame without importing Plotly yourself (plotly still needs to be installed).

In [None]:
import pandas as pd
pd.options.plotting.backend = "plotly"

df = pd.DataFrame(dict(a=[1,3,2], b=[3,2,1]))

# using Plotly Express via the Pandas backend
fig1 = df.plot.bar()
fig1.show()

In [None]:
# using Plotly Express directly
import plotly.express as px
fig2 = px.bar(df)
fig2.show()

### Styling charts
There are four ways you can style and customize figures made with Plotly Express:

1. Controlling common parameters like width & height, titles, labeling and colors using built-in Plotly Express function arguments
2. Updating the figure attributes using [update methods or by directly setting attributes](https://plotly.com/python/creating-and-updating-figures/)
3. Using Plotly's [theming/templating mechanism](https://plotly.com/python/templates/) via the `template` argument to every Plotly Express function
4. Setting default values for common parameters using `px.defaults`

Try out the [examples from the Plotly documentation](https://plotly.com/python/styling-plotly-express/) of these 4 methods by replacing the code in the cell below and running it.

In [None]:
import plotly.express as px
df = px.data.tips()
fig = px.histogram(df, x="day", y="total_bill", color="sex")
fig.show()

### Choropleth map
Choropleth maps display divided geographical areas or regions that are coloured, shaded or patterned in relation to a data variable. [Read the description of their use](https://datavizcatalogue.com/methods/choropleth.html) before moving on to the next code cell.

The following example is copied from the [Plotly documentation](https://plotly.com/python/plotly-express/#maps).

In [None]:
import plotly.express as px

df = px.data.election()

geojson = px.data.election_geojson()

fig = px.choropleth_mapbox(df, 
                           geojson=geojson, 
                           color="Bergeron",
                           locations="district", 
                           featureidkey="properties.district",
                           center={"lat": 45.5517, "lon": -73.7073},
                           mapbox_style="carto-positron", 
                           zoom=9)
fig.show()

Let's try and create a choropleth map using the gapminder data.

In [None]:
import plotly.express as px

gapminder = px.data.gapminder()

px.choropleth(gapminder,               
              locations="iso_alpha",               
              color="lifeExp",
              hover_name="country",  
              animation_frame="year",    
              color_continuous_scale='Plasma',  
              height=600
)

Now modify the code to add the following styling:
- A suitable title for the chart
- Apply the plotly_dark template

In [None]:
# Modify the following code to add:
# - a suitable title for the chart
# - the plotly_dark template

import plotly.express as px
gapminder = px.data.gapminder()
px.choropleth(gapminder,               
              locations="iso_alpha",               
              color="lifeExp",
              hover_name="country",  
              animation_frame="year",    
              color_continuous_scale='Plasma',  
              height=600
)

### Create a choropleth map using England 2011 census data and local authority boundaries

#### Data source
The census boundary data was downloaded from the [UK Data Service](https://census.ukdataservice.ac.uk/get-data/boundary-data.aspx).

You can also get geo boundary data from the [Office for National Statistics (ONS)](https://geoportal.statistics.gov.uk/datasets/census-merged-local-authority-districts-december-2011-generalised-clipped-boundaries-in-great-britain).

The geodata was converted to geojson using an [online conversion tool](https://odileeds.github.io/CSV2GeoJSON/).

The 2011 census data was downloaded from [InFuse](https://infuse.ukdataservice.ac.uk).

#### Geojson data
In the previous gapminder example we relied on Plotly's default geo data, matched using iso_alpha. For this example we only want local authorities in England, drawn with their boundaries as at the 2011 census.

To do this we need to provide our own geojson.

By default, Plotly expects each feature in the geojson to have an `id` field. However, if you run the next cell you will see that our features are identified by the `LAD19CD` property instead.

You could edit the geojson so that LAD19CD is renamed as 'id'. Note: when converting the data using the csv to geodata converter there was also an option to change the field name.

Plotly also allows you to use a different field as the id by passing its path in `featureidkey` (here `properties.LAD19CD`). If you do that, the dataframe column you pass to `locations` must contain values that match that field. This is the approach taken in the cells below, where the GEO_CODE column of the age dataframe is renamed to LAD19CD so that both sides use the same name.
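
To make the matching idea concrete, here is a minimal sketch using only the standard library. It is not Plotly's implementation: the helper `feature_ids` and the tiny hand-made geojson are hypothetical, written to mimic how a dotted path such as `properties.LAD19CD` picks out an id on each feature so that it can be compared with the codes in your data.

```python
# A tiny hand-made GeoJSON in the same shape as the local authority file
# (geometry is left empty because only the properties matter here)
sample_geojson = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature",
         "properties": {"LAD19CD": "E06000001", "name": "Hartlepool"},
         "geometry": None},
        {"type": "Feature",
         "properties": {"LAD19CD": "E06000002", "name": "Middlesbrough"},
         "geometry": None},
    ],
}


def feature_ids(geojson, featureidkey="id"):
    """Follow a dotted path (e.g. 'properties.LAD19CD') on every feature."""
    ids = []
    for feature in geojson["features"]:
        value = feature
        for part in featureidkey.split("."):
            value = value[part]
        ids.append(value)
    return ids


# Codes as they might appear in the dataframe column passed to `locations`;
# the last one is deliberately made up so that it has no boundary
data_codes = ["E06000001", "E06000002", "E06000099"]

known_ids = set(feature_ids(sample_geojson, "properties.LAD19CD"))
unmatched = [code for code in data_codes if code not in known_ids]
print(unmatched)
```

A check like this can be a quick way to find areas that will be missing from the map because their code has no matching boundary.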

In [3]:
#View the geojson data
from urllib.request import urlopen
import json

with urlopen('https://raw.githubusercontent.com/nicholsons/comp0034_week2/master/plotly_express/england_lad_2011.geojson') as response:
    la_coords = json.load(response)

la_coords['features'][0]

{'type': 'Feature',
 'properties': {'LAD19CD': 'E06000002',
  'name': 'Middlesbrough',
  'x': 450402.7095,
  'y': 516635.4855},
 'geometry': {'type': 'MultiPolygon',
  'coordinates': [[[[-1.20029, 54.57694],
     [-1.20026, 54.57692],
     [-1.19959, 54.57664],
     [-1.20029, 54.57694]]],
   [[[-1.19707, 54.58179],
     [-1.19739, 54.58132],
     [-1.19846, 54.58167],
     [-1.20247, 54.58041],
     [-1.20216, 54.57899],
     [-1.20095, 54.57769],
     [-1.19869, 54.57805],
     [-1.19434, 54.57987],
     [-1.19707, 54.58179]]],
   [[[-1.23003, 54.58411],
     [-1.23432, 54.58521],
     [-1.23981, 54.58722],
     [-1.24618, 54.59028],
     [-1.24922, 54.5907],
     [-1.25171, 54.59068],
     [-1.25335, 54.59028],
     [-1.25555, 54.58917],
     [-1.25716, 54.58714],
     [-1.26014, 54.57297],
     [-1.26041, 54.57169],
     [-1.26224, 54.56962],
     [-1.26585, 54.56791],
     [-1.26918, 54.56704],
     [-1.27385, 54.5668],
     [-1.27815, 54.56621],
     [-1.28177, 54.56527],
     [-

In [1]:
import pandas as pd

# Read the data for the over-100s into a data frame and rename GEO_CODE to LAD19CD
col_list = ["GEO_CODE", "GEO_LABEL", "F105"]
age_data = pd.read_csv('Data_AGE_UNIT.csv', usecols=col_list, dtype={"GEO_CODE": str})
age_data = age_data.drop([0], axis=0) # Remove the first row which is secondary heading data
age_data = age_data.rename(columns={"GEO_CODE": "LAD19CD"})

age_data

| | LAD19CD | GEO_LABEL | F105 |
| --- | --- | --- | --- |
| 1 | E06000001 | Hartlepool | 14 |
| 2 | E06000002 | Middlesbrough | 15 |
| 3 | E06000003 | Redcar and Cleveland | 20 |
| 4 | E06000004 | Stockton-on-Tees | 14 |
| 5 | E06000005 | Darlington | 28 |
| ... | ... | ... | ... |
| 320 | E09000030 | Tower Hamlets | 18 |
| 321 | E09000031 | Waltham Forest | 38 |
| 322 | E09000032 | Wandsworth | 47 |
| 323 | E41000052 | Cornwall, Isles of Scilly | 176 |
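
One thing that may be worth checking, assuming the secondary heading row in the CSV contains text: pandas would then read the F105 column as text rather than numbers, and it stays as text after that row is dropped. The sketch below reproduces this with a small in-memory CSV (the second heading row is invented for illustration) and converts the column with `pd.to_numeric`.

```python
import io

import pandas as pd

# In-memory stand-in for the census file; the second line is a
# hypothetical secondary heading row
csv_text = (
    "GEO_CODE,GEO_LABEL,F105\n"
    "Code,Label,Age 100 and over\n"
    "E06000001,Hartlepool,14\n"
    "E06000002,Middlesbrough,15\n"
)

sample = pd.read_csv(io.StringIO(csv_text), dtype={"GEO_CODE": str})
sample = sample.drop([0], axis=0)  # remove the secondary heading row

# The text in the dropped row means the counts were read as strings
numeric_before = pd.api.types.is_numeric_dtype(sample["F105"])

# Convert the counts to numbers so they can be treated as a continuous variable
sample["F105"] = pd.to_numeric(sample["F105"])
numeric_after = pd.api.types.is_numeric_dtype(sample["F105"])

print(numeric_before, numeric_after)
```

If the same turns out to be true of the real file, converting `F105` before plotting should let Plotly Express use a continuous colour scale rather than treating each count as a separate category.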


In [None]:
import plotly.express as px
import pandas as pd
from urllib.request import urlopen
import json

# Read the data for the over-100s into a data frame, drop the secondary heading row, and rename GEO_CODE to LAD19CD
col_list = ["GEO_CODE", "GEO_LABEL", "F105"]
age_data = pd.read_csv('Data_AGE_UNIT.csv', usecols=col_list, dtype={"GEO_CODE": str})
age_data = age_data.drop([0], axis=0)
age_data = age_data.rename(columns={"GEO_CODE": "LAD19CD"})
#age_data = age_data.set_index('LAD19CD')

# Open the json data, in this case from GitHub
with urlopen('https://raw.githubusercontent.com/nicholsons/comp0034_week2/master/plotly_express/england_lad_2011.geojson') as response:
    la_coords = json.load(response)

# Create the choropleth mapbox
fig = px.choropleth_mapbox(data_frame=age_data, 
                           geojson=la_coords, 
                           locations="LAD19CD",
                           featureidkey="properties.LAD19CD",
                           color='F105',
                           center={"lat": 51.509865, "lon": -0.118092},
                           mapbox_style="carto-positron",
                           zoom=5,
                           hover_name="GEO_LABEL",
                           labels={'GEO_LABEL':'Local authority'},
                           title="Number of residents over 100 by local authority in England in the 2011 Census",
                           height=600
                          )


fig.show()