## Choropleth plot with geojson data

In this activity, we not only want to work with geojson data, but also see how to create a choropleth visualization.   
Choropleths are especially useful for displaying statistical variables as shaded areas. In our case, the areas will be the outlines of the states of the USA.   

Note:
The dataset can be found here:   
https://catalog.data.gov/dataset/national-obesity-by-state-b181b

#### Loading the dataset

Our dataset contains the points that define the different states and an obesity value that represents the percentage of people that are obese per state.   
Since the geojson method of geoplotlib works by providing a file path to the geojson file, we don't have to do any importing and loading of the data.   


We still want to load the dataset and look at the structure of the geojson file.

In [2]:
# importing the necessary dependencies
import json
import geoplotlib
from geoplotlib.colors import ColorMap
from geoplotlib.utils import BoundingBox

In [20]:
# displaying one of the entries for the states
with open('data/National_Obesity_By_State.geojson') as data:
    dataset = json.load(data)
    
    first_state = dataset.get('features')[0]
    
    # only showing one coordinate instead of all points
    first_state['geometry']['coordinates'] = first_state['geometry']['coordinates'][0][0]
    print(json.dumps(first_state, indent=4))
    

{
    "type": "Feature",
    "properties": {
        "OBJECTID": 1,
        "NAME": "Texas",
        "Obesity": 32.4,
        "Shape__Area": 7672329221282.43,
        "Shape__Length": 15408321.8693326
    },
    "geometry": {
        "type": "Polygon",
        "coordinates": [
            -106.623454789568,
            31.9140391520155
        ]
    }
}


Extracting the names of all the states provided in the geojson might also be helpful later on.

In [7]:
# listing the states in the dataset
with open('data/National_Obesity_By_State.geojson') as data:
    dataset = json.load(data)
    
    states = [feature['properties']['NAME'] for feature in dataset.get('features')]
    print(states)

['Texas', 'California', 'Kentucky', 'Georgia', 'Wisconsin', 'Oregon', 'Virginia', 'Tennessee', 'Louisiana', 'New York', 'Michigan', 'Idaho', 'Florida', 'Alaska', 'Montana', 'Minnesota', 'Nebraska', 'Washington', 'Ohio', 'Illinois', 'Missouri', 'Iowa', 'South Dakota', 'Arkansas', 'Mississippi', 'Colorado', 'North Carolina', 'Utah', 'Oklahoma', 'Wyoming', 'West Virginia', 'Indiana', 'Massachusetts', 'Nevada', 'Connecticut', 'District of Columbia', 'Rhode Island', 'Alabama', 'Puerto Rico', 'South Carolina', 'Maine', 'Arizona', 'New Mexico', 'Maryland', 'Delaware', 'Pennsylvania', 'Kansas', 'Vermont', 'New Jersey', 'North Dakota', 'New Hampshire']
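
Since the shading will later depend on the obesity values, it can help to collect them per state and check their range. The following is a minimal sketch, assuming the `features`/`properties` layout printed above. The helper name `obesity_by_state` is hypothetical, and the sample stands in for the loaded file: only the Texas value comes from the output above, the second entry is a made-up placeholder.

```python
def obesity_by_state(dataset):
    """Map each feature's NAME property to its Obesity percentage."""
    return {
        feature['properties']['NAME']: feature['properties']['Obesity']
        for feature in dataset.get('features', [])
    }


# Tiny stand-in for the loaded geojson file (assumption for illustration).
# Only the Texas value is taken from the output above; the second entry
# is a made-up placeholder and not part of the dataset.
sample_dataset = {
    'features': [
        {'properties': {'NAME': 'Texas', 'Obesity': 32.4}},
        {'properties': {'NAME': 'Placeholder State', 'Obesity': 25.0}},
    ]
}

values = obesity_by_state(sample_dataset)

# the smallest and largest value give a rough idea of the range to shade
lowest = min(values.values())
highest = max(values.values())
print(lowest, highest)
```

With the real file, `sample_dataset` would be replaced by the `dataset` loaded with `json.load` in the cell above.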


##### **Note:**   
The dataset has been altered. If you download it from the link mentioned in the introduction, please edit the file and remove the object describing Hawaii.   
It lacks geometry data, which leads to an error in geoplotlib.
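
Instead of editing the file by hand, the entry could also be filtered out in Python. This is only a sketch: the helper name `drop_features_without_geometry` is hypothetical, the sample coordinates are illustrative, and it assumes that "lacking geometry" shows up as a missing, `null`, or empty `geometry` entry, which the downloaded file may or may not match exactly.

```python
def drop_features_without_geometry(dataset):
    """Return a copy of a FeatureCollection without geometry-less features."""
    kept = [
        feature for feature in dataset.get('features', [])
        # keep a feature only if it has a geometry with some coordinates
        if feature.get('geometry') and feature['geometry'].get('coordinates')
    ]
    # copy the top-level keys and swap in the filtered feature list
    return {**dataset, 'features': kept}


# Illustrative stand-in for the downloaded file (assumption, not real data).
sample_collection = {
    'type': 'FeatureCollection',
    'features': [
        {
            'type': 'Feature',
            'properties': {'NAME': 'Texas'},
            'geometry': {
                'type': 'Polygon',
                'coordinates': [[[-106.6, 31.9], [-106.0, 32.0],
                                 [-106.0, 31.0], [-106.6, 31.9]]],
            },
        },
        {'type': 'Feature', 'properties': {'NAME': 'Hawaii'}, 'geometry': None},
    ],
}

cleaned = drop_features_without_geometry(sample_collection)
remaining = [feature['properties']['NAME'] for feature in cleaned['features']]
print(remaining)
```

For the real file, the same function could be applied to the result of `json.load`, and the cleaned dictionary written to a new file with `json.dump`.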

---

#### Creating a Choropleth with geojson data

Use the `National_Obesity_By_State.geojson` file in the data folder to visualize the different states.

In [1]:
# plotting the information from the geojson file


The visualization above does not give us any information about the obesity per state, which is exactly what we wanted to display.   
In choropleth plots, the shading of the areas is the important feature. In this case, we'll use the obesity percentage as the statistical value that determines the shade of each state.

Therefore, we have to create a mapping function that converts the numerical value into a color.
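
To illustrate what such a mapping does, here is a library-independent sketch that linearly interpolates between a light and a dark red. It is not the activity's solution and does not use the `ColorMap` class imported earlier. The function name `obesity_to_color`, the value range of 20 to 40 percent, and the two endpoint colors are assumptions chosen for illustration. The function takes a feature's `properties` dictionary and returns an `[R, G, B, A]` list, which is the shape used by the color function in the library's choropleth example.

```python
def obesity_to_color(properties, min_value=20.0, max_value=40.0):
    """Map the Obesity percentage to an [R, G, B, A] color.

    min_value and max_value are assumed bounds for the shading range,
    not values derived from the dataset.
    """
    value = properties.get('Obesity')
    if value is None:
        # fully transparent when a feature has no obesity value
        return [0, 0, 0, 0]

    # normalise the value to the range 0..1 and clamp outliers
    ratio = (value - min_value) / (max_value - min_value)
    ratio = max(0.0, min(1.0, ratio))

    light = (255, 245, 240)  # color used for the lowest values
    dark = (103, 0, 13)      # color used for the highest values

    # linear interpolation per channel, rounded to integer color values
    rgb = [round(low + (high - low) * ratio) for low, high in zip(light, dark)]
    return rgb + [255]


print(obesity_to_color({'Obesity': 32.4}))
```

Higher percentages yield darker shades, and values outside the assumed range are clamped to the endpoint colors.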

**Note:**   
If you're stuck, please take a look at the example provided by the library to understand how to create a custom color mapping.   
https://github.com/andrea-cuttone/geoplotlib/blob/master/examples/choropleth.py

In [2]:
# converting the obesity into a color


In [3]:
# plotting the shaded states and adding another layer which plots the state outlines in white
# our BoundingBox should focus on the USA


**Note:**   
In the introduction, we mentioned that geoplotlib works with a layer-based system.   
This means that we can simply stack the same or different layers on top of each other, like in the example above. 