# More Choropleths
The only built-in maps for choropleths are for countries and U.S. states, and they only work with `px.choropleth`, not `px.choropleth_mapbox`. So if we want to use a web-based map, or if we want to color in some other set of regions, we must supply our own 'geometry' (the shapes that are colored in, like the U.S. states or countries). Other common regions we might want to use for choropleths are states/provinces of other countries, city boundaries, neighborhoods, zip codes, counties, voting districts, etc.

There are a number of standard geographical file types to represent these various regions, including Shapefiles and GeoJSONs. Plotly is designed to work with GeoJSONs, so that's what we'll use.

## GeoJSONs

A JSON file is a standard file format used widely on the internet (primarily by web developers, so you may not have heard of it). A GeoJSON is just a JSON file with some special geographic attributes. In order to read a JSON file we'll need to use a new library, called the `requests` library. `requests` is primarily designed to get files from the internet. We haven't needed to use it yet because our libraries, `pandas` and `plotly`, have taken care of that for us. But `requests` is built to decode JSON as well, so we'll use it here.

As always we'll import it, along with our regular imports:

In [None]:
import pandas as pd
import plotly.express as px
import requests

In order to 'request' a file over the internet, we call `requests.get()` with the address of the file we want, then we add `.json()` on the end to decode the response as JSON (the result is made of ordinary Python dictionaries and lists). Let's use this to import a file with U.S. counties.

In [None]:
address = 'https://github.com/SkyIslandsMath/semester-2/raw/master/data/counties_med.geojson'
counties = requests.get(address).json()

In [None]:
counties

If you look at the `counties` output above, you'll see that there is a lot of information in it, and much of it looks like gibberish to us. There are a lot of individual entries in this file, one for each county. Let's look at the first one to see its structure:
```
{'type': 'Feature',
   'properties': {'STATEFP': '39',
    'COUNTYFP': '071',
    'COUNTYNS': '01074048',
    'AFFGEOID': '0500000US39071',
    'GEOID': '39071',
    'NAME': 'Highland',
    'LSAD': '06',
    'ALAND': 1432479992,
    'AWATER': 12194983},
   'geometry': {'type': 'Polygon',
    'coordinates': [[[-83.869762, 39.05553],
      [-83.865679, 39.247333],
      [-83.834404, 39.245715],
      [-83.835869, 39.223304],
      [-83.813873, 39.223044],
      [-83.80134, 39.231567],
      [-83.784793, 39.262888],
      [-83.590878, 39.378736],
      [-83.372714, 39.377416],
      [-83.375073, 39.347827],
      [-83.386057, 39.31969],
      [-83.376629, 39.290944],
      [-83.393794, 39.268048],
      [-83.381125, 39.249618],
      [-83.362351, 39.240576],
      [-83.344984, 39.250604],
      [-83.343479, 39.233216],
      [-83.353531, 39.197585],
      [-83.385637, 39.055197],
      [-83.611592, 39.01889],
      [-83.673018, 39.02043],
      [-83.70535, 39.021351],
      [-83.816859, 39.020173],
      [-83.872214, 39.021304],
      [-83.869762, 39.05553]]]}},
```
Most of this we don't need to understand. There is some information about the county, followed by its geometry. The geometry is just a list of coordinate pairs (longitude first, then latitude) that trace the boundary of the county. The main thing we need to consider is how we will connect this geometry to a value in our dataset. In general, we need some kind of name or id code that is present in both our data and our GeoJSON. Some obvious possibilities are the `'NAME'` property (this is apparently the geometry for Highland County) and the `'GEOID'` property, which is hopefully some kind of geographical id.
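To make that structure a little more concrete, here is a small sketch that pokes around a GeoJSON with plain Python. It uses a hand-written, heavily shortened stand-in (the name `sample` and the trimmed coordinates are just for illustration), and it assumes the file is a standard `FeatureCollection` with a top-level `'features'` list, which the output above suggests but doesn't show directly.

```python
import json

# A shortened stand-in for the counties file: one feature, with most
# properties and most boundary points left out to keep it readable.
text = '''
{"type": "FeatureCollection",
 "features": [
   {"type": "Feature",
    "properties": {"GEOID": "39071", "NAME": "Highland"},
    "geometry": {"type": "Polygon",
                 "coordinates": [[[-83.869762, 39.05553],
                                  [-83.865679, 39.247333],
                                  [-83.385637, 39.055197],
                                  [-83.869762, 39.05553]]]}}
 ]}
'''

# Decoding JSON text gives us ordinary dictionaries and lists.
sample = json.loads(text)
print(list(sample.keys()))               # ['type', 'features']

# Each entry in 'features' is one region.
first = sample["features"][0]
print(list(first["properties"].keys()))  # ['GEOID', 'NAME']

# Collect the id of every region; these are what our data must match.
geoids = [feature["properties"]["GEOID"] for feature in sample["features"]]
print(geoids)                            # ['39071']
```

If the real file has the same layout, the same lines should work with `counties` in place of `sample`.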

Let's look at our data to see whether either of these is there.

## Making our choropleth

We'll start by looking at the median age of the residents of each county. Our data is located at
`'https://raw.githubusercontent.com/SkyIslandsMath/semester-2/master/data/county_median_age.csv'`.
Let's load it and look at it.

In [None]:
age = pd.read_csv(
    'https://raw.githubusercontent.com/SkyIslandsMath/semester-2/master/data/county_median_age.csv',
    # the line below tells pandas to treat the data in the 'geoid' column as strings
    # rather than numbers, so that any leading zeros are kept - I'll tell you when you need to do this
    dtype={'geoid': str})
age

We can see that the dataset has both county names and geoids. If we look at the county names, though, we see they contain the state names too, as well as the word 'County', some spaces and a comma. None of that appears in our GeoJSON above, and we need them to match exactly. We could try to fix that column, but since we also have the geoids we'll just use them.
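Since the ids have to match exactly, it can be worth checking them before drawing anything. Below is a rough sketch of one way to do that, using a tiny made-up CSV (the median values are invented, and names like `csv_text` and `geojson_ids` are just for this example). It also shows why we read `geoid` as a string: read as a number, an id like `01001` loses its leading zero and no longer matches the GeoJSON.

```python
import io
import pandas as pd

# A tiny made-up CSV in the same shape as our data (values are invented).
csv_text = '''geoid,name,median
01001,"Autauga County, Alabama",38.0
39071,"Highland County, Ohio",41.0
'''

# Read it twice: once letting pandas guess the type, once forcing strings.
as_numbers = pd.read_csv(io.StringIO(csv_text))
as_strings = pd.read_csv(io.StringIO(csv_text), dtype={'geoid': str})

# The ids as they would appear in the GeoJSON 'GEOID' property (strings).
geojson_ids = {'01001', '39071'}

# Ids in our data that have no matching geometry.
unmatched_numbers = set(as_numbers['geoid'].astype(str)) - geojson_ids
unmatched_strings = set(as_strings['geoid']) - geojson_ids

print(unmatched_numbers)  # {'1001'}  (the leading zero was lost)
print(unmatched_strings)  # set()     (everything matches)
```

An empty set means every row in the data has a geometry to color in. Any ids that do show up are rows that would not find a shape on the map.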

Now, when we call our choropleth function, we are really passing it two datasets - the one with the median ages, and the one with the county geometries. We will connect these with their geoids.

In [None]:
px.choropleth_mapbox(
    #the dataframe with our median age data
    age,
    # we use the 'locations=' argument to identify
    # the column in our dataframe that will connect us to our geometry
    locations='geoid',
    # the column in our data that we want to control the color of our counties
    color='median',
    
    #county geometries
    geojson=counties,
    #the argument below tells plotly what part of the geojson connects to our data
    #in this case the property 'GEOID' which we identify with 'properties.GEOID'
    featureidkey='properties.GEOID',
    
    #all of our other options
    color_continuous_scale='Sunset',
    range_color=[28,55],
    hover_data=['name'],
    mapbox_style="carto-positron",
    zoom=3,
    center={
        "lat": 37.0902,
        "lon": -95.7129
    },
    opacity=0.5)

Great! Now you try one. We'll use the same data for our county geometries, but with a different dataset: the median household income for each county. Our data can be found here:
`'https://raw.githubusercontent.com/SkyIslandsMath/semester-2/master/data/county_median_income.csv'`

In [None]:
#load our data using pd.read_csv , save it to a variable named income
# use the argument dtype={'geoid': str} when you read the csv file

#display our data


We can see that we have our geoids again, so we can use the same structure as our last choropleth.

In [None]:
px.choropleth_mapbox(
    #the dataframe with our median income data
    ,
    # we use the 'locations=' argument to identify
    # the column in our dataframe that will connect us to our geometry
    locations=,
    # the column in our data that we want to control the color of our counties
    color=,
    
    #county geometries
    geojson=,
    #the argument below tells plotly what part of the geojson connects to our data
    #in this case the property 'GEOID' which we identify with 'properties.GEOID'
    featureidkey=,
    
    #all of our other options
    color_continuous_scale='Greens',
    range_color=[16000,100000],
    hover_data=['name'],
    mapbox_style="carto-positron",
    zoom=3,
    center={
        "lat": 37.0902,
        "lon": -95.7129
    },
    opacity=0.5)

Neat-o. Feel free to play around with these maps to change their appearances.

## Assignment

Let's look at county poverty levels. We have a dataset with each county's population, population in poverty, and percent of population in poverty here:
```
https://raw.githubusercontent.com/SkyIslandsMath/semester-2/master/data/county_poverty.csv
```
Load it with `pd.read_csv` and use the argument `dtype={'geoid':str}`. Then make two choropleths, one for county populations and one for county poverty rates.