## Exercise 5.02: Creating a Choropleth Plot with GeoJSON Data

In this exercise, we will not only work with GeoJSON data but also see how to create a choropleth visualization.   
Choropleths are especially useful for displaying statistical variables as shaded areas. In our case, the areas will be the outlines of the states of the USA.

The GeoJSON format encodes a variety of geographic data structures, such as Points, LineStrings, and Polygons.
The format has a defined structure that every valid file has to follow.

```json
{
  "type": "Feature",
  "properties": {
    "name": "Dinagat Islands"
  },
  "geometry": {
    "type": "Point",
    "coordinates": [125.6, 10.1]
  }
}
```

Every object that carries additional properties, such as an id or name attribute, is a Feature.
Properties simply allow additional information to be attached to the feature.
The geometry attribute holds the type of shape we are working with, e.g. a Point, and its specific coordinates.
The coordinates define the positions of the “waypoints” of the given type, and therefore the shape of the element that the plotting library displays.
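To make the structure described above concrete, here is a minimal sketch that parses the example feature with Python's built-in `json` module and reads its parts. The variable names (`raw`, `feature`, `lon`, `lat`) are chosen for illustration only. GeoJSON positions are ordered longitude first, then latitude.

```python
import json

# The example feature from above, as a JSON string.
raw = """
{
  "type": "Feature",
  "properties": {"name": "Dinagat Islands"},
  "geometry": {"type": "Point", "coordinates": [125.6, 10.1]}
}
"""

feature = json.loads(raw)

# A GeoJSON position is ordered [longitude, latitude].
lon, lat = feature["geometry"]["coordinates"]

print(feature["type"])                 # kind of GeoJSON object
print(feature["properties"]["name"])   # free-form additional information
print(feature["geometry"]["type"], lon, lat)
```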

---

#### Loading the dataset

Our dataset contains the points that define the outline of each state and a census area value that represents the number of census areas per state.   
Since the `geojson()` method of geoplotlib takes a file path to the GeoJSON file, we don't have to import or load the data ourselves in order to plot it.   

We still want to load the dataset and look at the structure of the GeoJSON file.

In [1]:
# importing the necessary dependencies
import json
import geoplotlib
from geoplotlib.colors import ColorMap
from geoplotlib.utils import BoundingBox

Read the dataset as a JSON object and display the fourth entry to understand its structure.

In [2]:
# displaying the fourth entry of the states dataset
with open('../../Datasets/us_states.json') as data:
    dataset = json.load(data)
    
    fourth_state = dataset.get('features')[3]
    
    # only showing one coordinate instead of all points
    fourth_state['geometry']['coordinates'] = fourth_state['geometry']['coordinates'][0][0]
    print(json.dumps(fourth_state, indent=4))

{
    "type": "Feature",
    "properties": {
        "GEO_ID": "0400000US05",
        "STATE": "05",
        "NAME": "Arkansas",
        "LSAD": "",
        "CENSUSAREA": 52035.477
    },
    "geometry": {
        "type": "Polygon",
        "coordinates": [
            -94.042964,
            33.019219
        ]
    }
}
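The `[0][0]` indexing used above relies on the nesting of a `Polygon`: a list of rings, each of which is a list of positions. GeoJSON also defines a `MultiPolygon` type with one more level of nesting (a list of polygons). Whether this dataset stores any state that way is an assumption. The following sketch shows a hypothetical helper, `first_point`, that handles both cases. Apart from the Arkansas position printed above, the sample coordinates are made up for illustration.

```python
def first_point(geometry):
    """Return the first [lon, lat] position of a Polygon or MultiPolygon."""
    coords = geometry["coordinates"]
    if geometry["type"] == "Polygon":
        # Polygon: list of rings -> list of positions
        return coords[0][0]
    if geometry["type"] == "MultiPolygon":
        # MultiPolygon: list of polygons -> list of rings -> list of positions
        return coords[0][0][0]
    raise ValueError("unsupported geometry type: " + geometry["type"])


# Illustrative geometries; only the first position of the polygon
# is taken from the output above, the rest is invented.
polygon = {
    "type": "Polygon",
    "coordinates": [[[-94.042964, 33.019219], [-94.0, 33.5], [-93.5, 33.0]]],
}
multi_polygon = {
    "type": "MultiPolygon",
    "coordinates": [
        [[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]],
        [[[7.0, 8.0], [9.0, 10.0], [11.0, 12.0]]],
    ],
}

print(first_point(polygon))
print(first_point(multi_polygon))
```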


Our dataset contains a few properties. Only the state name `NAME` and the number of census areas `CENSUSAREA` matter for this exercise.

A census area is a small area with a minimum of 65 households that is defined for the purpose of taking a census for data acquisition.

Extract the names of all the states of the US from the dataset.
- Print the number of states in the dataset
- Print all the states as a list

In [3]:
# listing the states in the dataset
with open('../../Datasets/us_states.json') as data:
    dataset = json.load(data)
    
    states = [feature['properties']['NAME'] for feature in dataset.get('features')]
    print('Number of states:', len(states))
    print(states)

Number of states: 52
['Alabama', 'Alaska', 'Arizona', 'Arkansas', 'California', 'Colorado', 'Connecticut', 'Delaware', 'District of Columbia', 'Florida', 'Georgia', 'Hawaii', 'Idaho', 'Illinois', 'Indiana', 'Iowa', 'Kansas', 'Kentucky', 'Louisiana', 'Maine', 'Maryland', 'Massachusetts', 'Michigan', 'Minnesota', 'Mississippi', 'Missouri', 'Montana', 'Nebraska', 'Nevada', 'New Hampshire', 'New Jersey', 'New Mexico', 'New York', 'North Carolina', 'North Dakota', 'Ohio', 'Oklahoma', 'Oregon', 'Pennsylvania', 'Rhode Island', 'South Carolina', 'South Dakota', 'Tennessee', 'Texas', 'Utah', 'Vermont', 'Virginia', 'Washington', 'West Virginia', 'Wisconsin', 'Wyoming', 'Puerto Rico']
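The count of 52 is higher than the 50 US states because the printed list also contains the District of Columbia and Puerto Rico. If only the states themselves are needed, the names can be filtered. This is a minimal sketch: the helper `only_states` and the constant `NON_STATES` are hypothetical names, and `sample` is just a short excerpt of the list above.

```python
# Names that appear in the printed list but are not US states.
NON_STATES = {"District of Columbia", "Puerto Rico"}


def only_states(names):
    """Return the names that are actual states, preserving their order."""
    return [name for name in names if name not in NON_STATES]


# A short excerpt of the list printed above, for illustration only.
sample = ["Delaware", "District of Columbia", "Florida", "Wyoming", "Puerto Rico"]

print(only_states(sample))
```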


---

#### Creating a Choropleth with GeoJSON data

Create a GeoJSON plot using the `geojson()` method of geoplotlib.

In [4]:
# plotting outline of the united states
geoplotlib.geojson('../../Datasets/us_states.json')
geoplotlib.show()

The visualization above does not give us any information about the states; it simply displays their outlines.    
In a choropleth plot, the shading of the areas is the important feature. In this case, we'll use the census area value as the statistical value that determines the color of the shading.

We therefore have to create a mapping function that converts the numerical value into a color.

**Note:**   
If you're stuck, please take a look at the example provided by the library to understand how to create a custom color mapping.   
https://github.com/andrea-cuttone/geoplotlib/blob/master/examples/choropleth.py

In [5]:
# converting the census area value into a color
cmap = ColorMap('Reds', alpha=255, levels=10)

def get_color(properties):
    # map the CENSUSAREA value linearly onto the color map
    return cmap.to_color(properties['CENSUSAREA'], maxvalue=300_000, scale='lin')
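To illustrate what a linear value-to-color mapping does in principle, here is a small, library-independent sketch. It is not geoplotlib's implementation of `ColorMap.to_color`. The functions `linear_level` and `level_to_rgba` are hypothetical, and the white-to-red ramp is only a rough stand-in for the `'Reds'` color map. The value is scaled by `maxvalue`, clamped to the range 0 to 1, binned into one of `levels` discrete steps, and then converted into an RGBA list.

```python
def linear_level(value, maxvalue, levels=10):
    """Map a value linearly onto a discrete level in 0..levels-1."""
    ratio = min(max(value / maxvalue, 0.0), 1.0)  # clamp to [0, 1]
    return min(int(ratio * levels), levels - 1)


def level_to_rgba(level, levels=10, alpha=255):
    """Interpolate from white (lowest level) to pure red (highest level)."""
    t = level / (levels - 1)
    fade = int(round(255 * (1 - t)))  # green and blue fade out as t grows
    return [255, fade, fade, alpha]


# Arkansas' CENSUSAREA value from the output shown earlier.
level = linear_level(52035.477, maxvalue=300_000)
print(level, level_to_rgba(level))
```

Values above `maxvalue` end up in the darkest level in this sketch, so the choice of `maxvalue` decides how much of the color range the bulk of the data can use.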

Use the `us_states.json` file in the data folder to visualize the different states.

- Pass the color mapping function to the `color` parameter and set the `fill` parameter to `True`.
- In addition, draw a black outline for each state. Use the `color` argument and provide the RGBA value for black.
- Use the `USA` constant of the `BoundingBox` class to set the bounding box.

In [6]:
# plotting the shaded states and adding another layer which plots the state outlines in black
# our BoundingBox should focus on the USA
geoplotlib.geojson('../../Datasets/us_states.json', fill=True, color=get_color)
geoplotlib.geojson('../../Datasets/us_states.json', fill=False, color=[0, 0, 0, 255])

geoplotlib.set_bbox(BoundingBox.USA)
geoplotlib.show()

**Note:**   
In the introduction, we mentioned that geoplotlib works with a layer-based system.   
This means that we can simply stack the same or different layers on top of each other, as in the example above.

To give the user more information about this plot, use the `f_tooltip` argument to provide a tooltip that displays the name and census area value of the currently hovered state.

In [7]:
# adding the f_tooltip that shows the state name and census area value on hover
geoplotlib.geojson('../../Datasets/us_states.json', fill=True, color=get_color, f_tooltip=lambda properties: properties['NAME'] + ' - Census Areas: ' + str(properties['CENSUSAREA']))
geoplotlib.geojson('../../Datasets/us_states.json', fill=False, color=[0, 0, 0, 255])

geoplotlib.set_bbox(BoundingBox.USA)
geoplotlib.show()
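The inline lambda above can become hard to read once the tooltip text grows. One possible alternative is a small named function that receives the same `properties` dictionary. This is a sketch: the name `state_tooltip`, the fallback for missing keys, and the number formatting are assumptions, not part of the exercise.

```python
def state_tooltip(properties):
    """Build the tooltip text for one state from its GeoJSON properties."""
    name = properties.get("NAME", "Unknown")
    area = properties.get("CENSUSAREA")
    if area is None:
        # Fall back to the name alone if the value is missing.
        return name
    # Thousands separator and one decimal place for readability.
    return f"{name} - Census Areas: {area:,.1f}"


# Properties of the Arkansas feature shown earlier (shortened).
print(state_tooltip({"NAME": "Arkansas", "CENSUSAREA": 52035.477}))
```

Such a function could then be passed as `f_tooltip=state_tooltip` in place of the lambda.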