<a href="https://colab.research.google.com/github/Paraml3sS/Covid-map-visualization/blob/main/covid_vis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
import altair as alt
from vega_datasets import data
import pandas as pd

## Configurations

In [None]:
map_options = {
    "general": {
        "title": "Number of confirmed cases in different countries",
        "category": "equirectangular"
    },
    "style": {
        "color": "#666666",
        "stroke-color": "#FFF",
        "width": 900,
        "height": 540
    }
}

## Get data

In [None]:
countries_background = alt.topo_feature(data.world_110m.url, "countries")
covid = pd.read_csv("https://raw.githubusercontent.com/Paraml3sS/Covid-Visualization/main/covid19_global_confirmed_cases.csv")
original_covid = pd.read_csv("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv", sep=',')

## Prepare data

In [None]:
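# Keep one row per country (the first listed province's coordinates for split countries).
original_covid_without_regions = original_covid.drop_duplicates('Country/Region')

# Attach latitude/longitude to each record (inner join on country name).
covid_with_coordinates = (
    covid.merge(original_covid_without_regions[['Country/Region', 'Lat', 'Long']],
                left_on='country', right_on='Country/Region')
    .drop(columns='Country/Region')
    .sort_values('date', ascending=True)
)

# Replace each date with its day of the year so it can drive the slider.
covid_with_coordinates['date'] = pd.to_datetime(covid_with_coordinates['date']).dt.dayofyear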

In [None]:
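general, style = map_options["general"], map_options["style"]
title, category = general["title"], general["category"]
color, stroke_color = style["color"], style["stroke-color"]
width, height = style["width"], style["height"]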

## Create visualization

In [None]:
slider = alt.binding_range(min=22, max=269, step=1, name="Day of year (2020): ")

select_day = alt.selection_single(
    fields=['date'], init={'date': 269},
    bind=slider
)

In [None]:
background = alt.Chart(countries_background).mark_geoshape(
    fill=color,
    stroke=stroke_color
).project(
    category
).properties(
    title=title,
    width=width,
    height=height
)

In [None]:
confirmed_cases = alt.Chart(covid_with_coordinates
  ).mark_circle(color='yellow').encode(
      latitude='Lat:Q',
      longitude='Long:Q',
      size=alt.Size('confirmed_cases:Q', scale=alt.Scale(range=[0, 2000])),
      tooltip=[
          alt.Tooltip('country:N'),
          alt.Tooltip('confirmed_cases:Q')
      ]
  ).add_selection(select_day).transform_filter(select_day)

In [None]:
alt.data_transformers.disable_max_rows()
background + confirmed_cases

**Why.** I chose to visualize the COVID data on a world map mainly because I have seen this kind of chart a few times and liked it: in my opinion, it is the most convenient way to show general data in a single graph.

**Countries as unique shapes** - represented as discrete values. With so many categories, histograms, line charts, or matrices colored by number of cases would look messy and large. However, maybe there are better ways?

**Circles as number of cases**. I considered scale, circles, and color. Scale would not suit some small countries, and shades of color are harder for people to distinguish, so I chose circles, which work well in any case.
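One detail worth keeping in mind when reading the circles: in Altair (Vega-Lite) the `size` channel of a circle mark sets the circle's area in square pixels, not its radius. The sketch below uses only the standard library to show the consequence. The helper `circle_radius` and the `max_cases` value are hypothetical, and the linear scale starting at zero is an assumption; only the `[0, 2000]` range comes from the chart definition.

```python
import math

SIZE_RANGE = (0.0, 2000.0)  # same range as alt.Scale(range=[0, 2000]) in the notebook


def circle_radius(cases, max_cases, size_range=SIZE_RANGE):
    """Radius in pixels of a circle whose area encodes `cases`.

    Assumes a linear scale with domain [0, max_cases]; `max_cases` is a
    hypothetical value supplied by the caller, not read from the dataset.
    """
    lo, hi = size_range
    area = lo + (hi - lo) * (cases / max_cases)  # linear map: count -> area (px^2)
    return math.sqrt(area / math.pi)             # area = pi * r^2


# Quadrupling the count only doubles the radius.
r_small = circle_radius(1_000_000, max_cases=4_000_000)
r_large = circle_radius(4_000_000, max_cases=4_000_000)
print(round(r_large / r_small, 2))
```

Because the radius grows with the square root of the count, differences between large countries look smaller on screen than the raw numbers suggest, which is one more reason the tooltip is useful.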

**Color - yellow to contrast with the greyish world map**, so even very small circles are easy to see.

**Date** is a continuous variable in this scenario, so I can't think of any way to display the number of confirmed cases over time other than through some interactivity, and a slider helps here. It could instead be displayed with lines representing confirmed cases over time for each country, but that would not look good.
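The slider works on day-of-year numbers rather than calendar dates. As a small sketch using only the standard library (the helper name `day_to_date` is hypothetical), this is one way to map a slider position back to a 2020 calendar date; 22 and 269 are the slider bounds defined in the notebook.

```python
from datetime import date, timedelta


def day_to_date(day_of_year, year=2020):
    """Convert a 1-based day-of-year number to a calendar date."""
    return date(year, 1, 1) + timedelta(days=day_of_year - 1)


print(day_to_date(22))   # 2020-01-22
print(day_to_date(269))  # 2020-09-25
```

So the slider covers 22 January to 25 September 2020.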

**Tooltip** on each circle shows the exact number of confirmed cases where circles are hard to tell apart, and the country name, since not everybody knows all the small countries.




> I can learn all the values given in the dataset through the visualization, as it shows data for all countries over the whole period. The exact number for any given day and country is available via the tooltip. What pops out on this graph is the largest circles, so it is easy to see the areas with the most confirmed cases over time using the slider.

