# Introduction

These are troubling times, but the daily news about vaccines offers hope. Multiple companies are releasing vaccines and getting them approved, so we may soon see a path forward.

Using the tools available in Kaggle notebooks, I'll show you how to create an interactive map that tracks, for each US state, the percentage of inhabitants who have been vaccinated against COVID-19.

To get started, if you haven't already, make your own copy of this notebook by clicking on the **[Copy and Edit]** button in the top right corner. 

This notebook is an example of a project that you can create based on what you'd learn from taking Kaggle's [Geospatial Analysis course](https://www.kaggle.com/learn/geospatial-analysis).

# US Vaccine Tracker

We'll use two datasets.  

- The first dataset has the total number of inhabitants of each state, along with latitude and longitude data for each state's capital city.  This dataset is pulled from the US Census Bureau's 2019 population estimates, and I've uploaded it [here](https://www.kaggle.com/peretzcohen/2019-census-us-population-data-by-state).
- The second dataset contains a recent estimate of the total number of people who have been vaccinated in each state.  This [vaccine dataset](https://github.com/owid/covid-19-data/blob/master/public/data/vaccinations/us_state_vaccinations.csv) comes from [Our World in Data](https://ourworldindata.org/), which regularly updates its vaccine datasets from CDC figures.  Every time you run this notebook, you'll use the most recent version of their data.

In the next two code cells, we load and preprocess the data.  The first cell shows a preview of the Pandas DataFrame that we'll use to make the tracker, and the second prints the total percent of the population that has been vaccinated in the US.

In [None]:
# Imports
import pandas as pd
from datetime import date, timedelta
import folium
from folium import Choropleth, Marker
from folium.plugins import MarkerCluster
import math
import matplotlib.pyplot as plt
import seaborn as sns
import geopandas as gpd
from folium import GeoJson 

# Population Data
populationData = pd.read_csv('/kaggle/input/2019-census-us-population-data-by-state/2019_Census_US_Population_Data_By_State_Lat_Long.csv')

# Get the most recent date for filtering (yesterday, formatted as YYYY-MM-DD)
freshDate = (date.today() - timedelta(days=1)).strftime("%Y-%m-%d")

# Vaccination data, for most recent date
vaccinationData = pd.read_csv('https://raw.githubusercontent.com/owid/covid-19-data/master/public/data/vaccinations/us_state_vaccinations.csv')
vaccinationByLocation = vaccinationData.loc[(vaccinationData.date == freshDate)][["location", "people_vaccinated"]]

# Vaccination and population data
vaccinationAndPopulationByLocation = pd.merge(populationData, vaccinationByLocation, left_on='STATE',right_on='location').drop(columns="location")

# Calculate the share vaccinated by state (a fraction between 0 and 1; it is multiplied by 100 when displayed)
vaccinationAndPopulationByLocation["percent_vaccinated"] = vaccinationAndPopulationByLocation["people_vaccinated"] / vaccinationAndPopulationByLocation["POPESTIMATE2019"]

vaccinationAndPopulationByLocation.head()
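
The cell above keeps only rows dated exactly yesterday, so a state with no row for that date would drop out of the tracker. One possible alternative is to keep each state's most recent reported row instead. The sketch below illustrates the idea on a made-up table that only mimics the `date`, `location`, and `people_vaccinated` columns; the helper name `latest_per_location` is hypothetical and not part of the original notebook.

```python
import pandas as pd

# Toy stand-in for the vaccination table; the values are made up for illustration.
toy = pd.DataFrame({
    "date": ["2021-02-01", "2021-02-02", "2021-02-01", "2021-02-02", "2021-02-03"],
    "location": ["Alabama", "Alabama", "Alaska", "Alaska", "Alaska"],
    "people_vaccinated": [100.0, 120.0, 50.0, None, 70.0],
})

def latest_per_location(df):
    """Return each location's most recent row that has a people_vaccinated value."""
    reported = df.dropna(subset=["people_vaccinated"])
    # ISO dates (YYYY-MM-DD) sort chronologically even as plain strings
    newest = reported.sort_values("date").groupby("location").tail(1)
    return newest[["location", "date", "people_vaccinated"]].reset_index(drop=True)

latest = latest_per_location(toy)
print(latest)
```

With this approach, different states can end up with figures from different dates, so it may be worth keeping the `date` column around for display.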

In [None]:
# Calculate the total percent vaccinated in the US
percentageTotal = vaccinationAndPopulationByLocation["people_vaccinated"].sum() / vaccinationAndPopulationByLocation["POPESTIMATE2019"].sum()
print('Percentage Vaccinated in the US: {}%'.format(round(percentageTotal*100, 2))) 
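
Because the merge above is an inner join on state names, any state whose name is spelled differently in the two sources is silently left out of both the table and the total. A quick way to check for this, sketched here on toy frames with made-up numbers, is an outer merge with `indicator=True`. The mismatch shown ("New York" versus "New York State") is an assumption for illustration only.

```python
import pandas as pd

# Toy frames; the names and numbers are illustrative, not real data.
population = pd.DataFrame({
    "STATE": ["Alabama", "New York", "Texas"],
    "POPESTIMATE2019": [1000, 2000, 3000],
})
vaccinated = pd.DataFrame({
    "location": ["Alabama", "New York State", "Texas"],
    "people_vaccinated": [100, 300, 450],
})

# An outer merge keeps unmatched rows from both sides;
# the _merge column records which side each row came from.
check = pd.merge(population, vaccinated, left_on="STATE", right_on="location",
                 how="outer", indicator=True)

unmatched_population = check.loc[check["_merge"] == "left_only", "STATE"].tolist()
unmatched_vaccinated = check.loc[check["_merge"] == "right_only", "location"].tolist()
print(unmatched_population)  # ['New York']
print(unmatched_vaccinated)  # ['New York State']
```

If the check turns up mismatches, one option is to rename the affected entries (for example with `Series.replace`) before merging.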

The next code cells use the data to create a tracker, with each state shaded according to the percentage of its population that has been vaccinated.  You can hover over a state to see its figures.

First, import the US state boundaries.

In [None]:
states_full = gpd.read_file('../input/us-states-cartographic-boundary-shapefiles/cb_2016_us_state_500k.shp')
# Keep only the columns we need; .copy() avoids pandas' SettingWithCopyWarning when columns are added later
states = states_full[['geometry', 'NAME']].copy()
states.head()

In [None]:
# Per-state figures, indexed by state name
# (the median is a safeguard in case a state appears more than once)
state_stats = vaccinationAndPopulationByLocation.groupby('STATE')[["percent_vaccinated", "people_vaccinated"]].median()

# Attach the figures to the boundary data by matching state names;
# regions without vaccination data get NaN
states["per_vac"] = states["NAME"].map(state_stats["percent_vaccinated"])
states["peo_vac"] = states["NAME"].map(state_stats["people_vaccinated"])

# Convert the fraction to a percentage for display
states['per_vac100'] = states['per_vac'] * 100

states.head()

In [None]:
import branca


colormap = branca.colormap.LinearColormap(
    vmin=states["per_vac100"].quantile(0.0),
    vmax=states["per_vac100"].quantile(1),
    colors=["red", "orange", "lightblue", "green", "darkgreen"],
    caption="Percentage of vaccinated people",
)
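
The colormap above spreads its five colors between the lowest and highest state percentages. As a rough illustration of what a linear colormap does, the plain-Python sketch below finds the two neighbouring color stops for a value and blends their RGB components. It is not branca's implementation: the function name `blend_color`, the evenly spaced stops, the clamping of out-of-range values, and the three RGB stops are all assumptions made for this example.

```python
def blend_color(value, vmin, vmax, stops):
    """Linearly interpolate an RGB color for value across evenly spaced stops.

    Assumes stops holds at least two (r, g, b) tuples.
    """
    if vmax == vmin:
        return stops[0]
    # Clamp to the range, then rescale to a position in [0, len(stops) - 1]
    t = min(max((value - vmin) / (vmax - vmin), 0.0), 1.0)
    position = t * (len(stops) - 1)
    # Index of the stop at or below the position (capped so lower + 1 is valid)
    lower = min(int(position), len(stops) - 2)
    frac = position - lower
    lo, hi = stops[lower], stops[lower + 1]
    return tuple(round(a + (b - a) * frac) for a, b in zip(lo, hi))

# Three illustrative stops: red, yellow, green
stops = [(255, 0, 0), (255, 255, 0), (0, 128, 0)]
print(blend_color(10.0, 0.0, 20.0, stops))  # (255, 255, 0)
print(blend_color(20.0, 0.0, 20.0, stops))  # (0, 128, 0)
```

Because `vmin` and `vmax` come from the data, the same state can change color from one run to the next even if its own percentage barely moves.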

In [None]:
from folium.features import GeoJsonPopup, GeoJsonTooltip


m = folium.Map(location=[35.3, -97.6], zoom_start=4)

tooltip = GeoJsonTooltip(
    fields=["NAME", "per_vac100", 'peo_vac'],
    aliases=["State:", "Vaccinated in %:", 'Total vaccinated:'],
    localize=True,
    sticky=False,
    labels=True,
    style="""
        background-color: #F0EFEF;
        border: 2px solid black;
        border-radius: 3px;
        box-shadow: 3px;
    """,
    max_width=800,
)


g = folium.GeoJson(
    states,
    style_function=lambda x: {
        "fillColor": colormap(x["properties"]["per_vac100"])
        if x["properties"]["per_vac100"] is not None
        else "transparent",
        "color": "black",
        "fillOpacity": 0.4,
    },
    tooltip=tooltip,
).add_to(m)

colormap.add_to(m)

m

# Your turn

Here are some ideas for how you might improve on the work here:
- In Kaggle's [Geospatial Analysis course](https://www.kaggle.com/learn/geospatial-analysis), you learn how to use folium to create many different types of interactive maps.  How might you rebuild this map using folium's `Choropleth` class instead?
- If you would like to work with more data sources:
  - The Centers for Disease Control and Prevention (CDC) in the US releases daily vaccine data and has a vaccination progress tracker on its [COVID Data Tracker site](https://covid.cdc.gov/covid-data-tracker/#vaccinations).
  - NBC News also has a well-made [vaccine tracker](https://www.nbcnews.com/health/health-news/map-covid-19-vaccination-tracker-across-u-s-n1252085).
  
Once you have created your own extension of this work, let us know about it in the comments!