# COVID-19 Geographic Map
This tutorial visualizes COVID-19 data on a geographic map, using the preprocessed dataset from the [COVID-19 Time Series](tutorials/covid_19_timeseries.ipynb) and [COVID-19 Bar Chart Race](covid_19_bar_chart_race.ipynb) tutorials.

Click [here](covid_19_geographic_map.ipynb#final-animation) to see the full animation.

### load data

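Here, we import ahlive, open the preprocessed dataset, and process it further: convert cumulative cases to new cases per day, normalize by population, and attach each country's coordinates.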

In [None]:
import ahlive as ah
import pandas as pd
df = ah.tutorial.open_dataset('covid19_global_cases')

# convert cases to new cases per day
df_diff = df.pivot_table(
    'cases', columns='country_region', index='date'
).diff()
df_melt = df_diff.dropna().reset_index().melt(
    'date', value_name='new_cases'
).sort_values('date')

# start in March when testing became available
df_slice = df_melt.loc[df_melt['date'] >= '2020-03-01']

# normalize by population
df_pop = ah.tutorial.open_dataset('covid19_population')[['combined_key', 'population']]
df_norm = df_slice.merge(df_pop, left_on='country_region', right_on='combined_key')
df_norm['new_cases'] = df_norm['new_cases'] / df_norm['population']
df_norm['new_cases'] *= 1e5

# join lats/lons
df_coords = df[['country_region', 'lat', 'long']].drop_duplicates(subset='country_region')
df_norm = df_norm.merge(df_coords, left_on='country_region', right_on='country_region')

display(df_norm)
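
To make the reshaping steps easier to follow, here is a minimal sketch of the same pivot, diff, melt, and normalize sequence on a toy table. The regions `A` and `B`, their case counts, and their populations are made up for illustration and are not part of the tutorial dataset.

```python
import pandas as pd

# Toy cumulative case counts for two made-up regions (illustrative only).
toy = pd.DataFrame({
    "date": pd.to_datetime(["2020-03-01", "2020-03-02", "2020-03-03"] * 2),
    "country_region": ["A"] * 3 + ["B"] * 3,
    "cases": [10, 15, 25, 100, 100, 130],
})
toy_pop = pd.DataFrame({
    "combined_key": ["A", "B"],
    "population": [1_000_000, 2_000_000],
})

# Wide table (one column per region), then the day-over-day difference.
toy_diff = toy.pivot_table("cases", columns="country_region", index="date").diff()

# Back to long form; the first day has no previous day, so dropna removes it.
toy_melt = toy_diff.dropna().reset_index().melt("date", value_name="new_cases")

# New cases per 100,000 inhabitants.
toy_norm = toy_melt.merge(toy_pop, left_on="country_region", right_on="combined_key")
toy_norm["new_cases"] = toy_norm["new_cases"] / toy_norm["population"] * 1e5

print(toy_norm[["date", "country_region", "new_cases"]])
```

Because `melt` is not given a `var_name`, it reuses the name of the pivoted columns, which is why the long table ends up with a `country_region` column again.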

We can now run a test!

In [None]:
ah_df = ah.DataFrame(
    df_norm, 'long', 'lat', label='country_region',
    chart='scatter', cs='new_cases', s='new_cases',
    state_labels='date', crs='platecarree',
    worldwide=True, legend=False, animate='test'
)
ah_df.render()

### tweak animation

A few things stand out:

1. The figure's aspect ratio is not ideal.
2. The colorbar has awkward decimal values.
3. There are negative new cases.
4. The colorbar label is missing.
5. The animation can be smoothed by setting `frames` and `interp`.

In [None]:
df_norm.loc[df_norm['new_cases'] < 0, 'new_cases'] = 0

ah_df = ah.DataFrame(
    df_norm, 'long', 'lat', label='country_region',
    chart='scatter', cs='new_cases', s='new_cases',
    state_labels='date', crs='platecarree',
    worldwide=True, legend=False,
    figsize=(11, 6), vmin=0, vmax=350,
    projection='mollweide', clabel='New Cases / 100k',
    frames=15, interp='cubic',
    animate='test'
).config('cticks', format='.0f', num_colors=10)
ah_df.render()
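
Negative daily values can appear when a cumulative count is revised downward, which is presumably what happens here. The cell above sets them to zero with a boolean mask. As a small sketch on made-up numbers, the same result can also be written with `Series.clip`:

```python
import pandas as pd

# Made-up daily values, including one negative entry.
toy = pd.DataFrame({"new_cases": [3.0, -1.5, 0.0, 7.2]})

# Boolean-mask assignment, as in the tutorial cell.
masked = toy.copy()
masked.loc[masked["new_cases"] < 0, "new_cases"] = 0

# Alternative spelling: clip everything below zero up to zero.
clipped = toy.assign(new_cases=toy["new_cases"].clip(lower=0))

print(clipped)
```

Both variants leave non-negative values untouched; which one to use is a matter of taste.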

### add annotations

Looks mostly good. We can add `title`, `subtitle`, `note`, `borders`, and `ocean`.

In [None]:
ah_df = ah.DataFrame(
    df_norm, 'long', 'lat', label='country_region',
    chart='scatter', cs='new_cases', s='new_cases',
    state_labels='date', crs='platecarree',
    worldwide=True, legend=False,
    figsize=(11, 6), vmin=0, vmax=350,
    projection='mollweide', clabel='New Cases / 100k',
    title='New Confirmed COVID-19 Cases per Day',
    subtitle='Source: JHU CSSE COVID-19', 
    note='*marker size depicts new cases',
    borders=True, ocean=True,
    frames=15, interp='cubic',
    animate='test'
).config('cticks', format='.0f', num_colors=10)
ah_df.render()

### set preset

Since Europe looks a bit crowded, we can display the data another way: with `preset='rotate'` and the `Orthographic` projection. Because the projection's aspect changes from wide to square, we move the source from `subtitle` to `caption` and update `figsize` for a less wide aspect.

In [None]:
ah_df = ah.DataFrame(
    df_norm, 'long', 'lat', label='country_region',
    chart='scatter', cs='new_cases', s='new_cases',
    state_labels='date', crs='platecarree',
    worldwide=True, legend=False,
    figsize=(7, 6), vmin=0, vmax=350,
    clabel='New Cases / 100k',
    title='COVID-19 Across the Globe',
    caption='Source: JHU CSSE COVID-19', 
    note='*marker size depicts new cases',
    borders=True, ocean=True,
    frames=15, interp='cubic',
    projection='Orthographic',
    preset='rotate', animate='test'
).config(
    'cticks', format='.0f', num_colors=10
).config(
    'projection', central_latitude=40
)
ah_df.render()

### final animation

We can also add a time series of the total confirmed cases across the globe. For the sake of this tutorial, only dates from January 1, 2021 onward are shown.

In [None]:
df_cut = df_norm.loc[df_norm['date'] >= '2021-01-01']
df_cumsum = df.groupby('date')['cases'].sum()['2021-01-01':] / 1e6

ah_df = ah.DataFrame(
    df_cut, 'long', 'lat', label='country_region',
    chart='scatter', cs='new_cases', s='new_cases',
    state_labels='date', crs='platecarree',
    worldwide=True, legend=False,
    figsize=(13, 5), vmin=0, vmax=350,
    clabel='New Cases / 100k',
    suptitle='COVID-19 Across the Globe',
    caption='Source: JHU CSSE COVID-19', 
    note='*marker size depicts new cases',
    borders=True, ocean=True,
    frames=15, interp='cubic',
    projection='Orthographic',
    preset='rotate'
).config(
    'cticks', format='.0f', num_colors=10
).config(
    'projection', central_latitude=40
)

ah_df_ts = ah.DataFrame(
    df_cumsum, 'date', 'cases',
    state_labels='date',
    ylabel='Total Confirmed Cases [million]'
).reference(
    y0s='y', inline_labels='y'
).config('ref_inline', suffix='m')

(ah_df + ah_df_ts).render()
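
The global total used for the time series comes from summing the cumulative `cases` over all countries per date, slicing by date label, and converting to millions. A minimal sketch on made-up numbers, assuming the `date` column holds datetimes:

```python
import pandas as pd

# Made-up cumulative counts for two regions over three days.
toy = pd.DataFrame({
    "date": pd.to_datetime(
        ["2020-12-31", "2020-12-31",
         "2021-01-01", "2021-01-01",
         "2021-01-02", "2021-01-02"]
    ),
    "country_region": ["A", "B"] * 3,
    "cases": [1_000_000, 2_000_000, 1_500_000, 2_500_000, 2_000_000, 3_000_000],
})

# Sum over regions for each date; the result is a Series indexed by date.
total = toy.groupby("date")["cases"].sum()

# Label-based slice from 2021-01-01 onward, expressed in millions.
total_millions = total["2021-01-01":] / 1e6

print(total_millions)
```

The slice is inclusive of the start label, so January 1 itself is kept.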

### full code

```python
import ahlive as ah
import pandas as pd

# load dataset
df = ah.tutorial.open_dataset('covid19_global_cases')

# convert cases to new cases per day
df_diff = df.pivot_table(
    'cases', columns='country_region', index='date'
).diff()
df_melt = df_diff.dropna().reset_index().melt(
    'date', value_name='new_cases'
).sort_values('date')

# start in March when testing became available
df_slice = df_melt.loc[df_melt['date'] >= '2020-03-01']

# normalize by population
df_pop = ah.tutorial.open_dataset('covid19_population')[['combined_key', 'population']]
df_norm = df_slice.merge(df_pop, left_on='country_region', right_on='combined_key')
df_norm['new_cases'] = df_norm['new_cases'] / df_norm['population']
df_norm['new_cases'] *= 1e5

# join lats/lons
df_coords = df[['country_region', 'lat', 'long']].drop_duplicates(subset='country_region')
df_norm = df_norm.merge(df_coords, left_on='country_region', right_on='country_region')

# set negative new cases to zero
df_norm.loc[df_norm['new_cases'] < 0, 'new_cases'] = 0

# cut animation
df_cut = df_norm.loc[df_norm['date'] >= '2021-01-01']
df_cumsum = df.groupby('date')['cases'].sum()['2021-01-01':] / 1e6

# render
ah_df = ah.DataFrame(
    df_cut, 'long', 'lat', label='country_region',
    chart='scatter', cs='new_cases', s='new_cases',
    state_labels='date', crs='platecarree',
    worldwide=True, legend=False,
    figsize=(13, 5), vmin=0, vmax=350,
    clabel='New Cases / 100k',
    suptitle='COVID-19 Across the Globe',
    caption='Source: JHU CSSE COVID-19', 
    note='*marker size depicts new cases',
    borders=True, ocean=True,
    frames=15, interp='cubic',
    projection='Orthographic',
    preset='rotate'
).config(
    'cticks', format='.0f', num_colors=10
).config(
    'projection', central_latitude=40
)

ah_df_ts = ah.DataFrame(
    df_cumsum, 'date', 'cases',
    state_labels='date',
    ylabel='Total Confirmed Cases [million]'
).reference(
    y0s='y', inline_labels='y'
).config('ref_inline', suffix='m')

(ah_df + ah_df_ts).render()
```