# 02806 Final project 
> An analysis and visualization of the novel COVID-19 virus

- toc: true 
- badges: true
- author: Georgios Zefkilis & Yucheng Ren
- comments: false
- categories: [data_analysis, visualization]

> Tip: This page is generated from a Jupyter notebook. Some of the code is hidden, and some of it can be revealed by clicking the `Show Code` button. To see the complete notebook, click the `view on github` button above.

# Introduction

The COVID-19 virus has put us in a severe situation and changed our lives in many ways. The impact we felt first is on the economy: many people have started working remotely, others have lost their jobs, and governments around the world are preparing economic stimulus plans. In this project, we use statistics and data visualization to examine how the virus affects the economy.

The first part is a general analysis and visualization of the current COVID-19 situation, such as the distribution of confirmed cases and deaths and their growth trends. The second part looks at how the virus affects the economy, at both the macroeconomic and the microeconomic level.

In [116]:
# hide
# this block contains all the import packages
import pandas as pd
import numpy as np
import altair as alt
import matplotlib.pyplot as plt
import plotly.graph_objects as go
import matplotlib.colors as mcolors

# Macroeconomic

In [355]:
# hide
# import data
path = 'data/'

imfGDP = pd.read_csv(path + 'imf-dm-export-20200423.csv')
stockOMX20 = pd.read_csv(path + 'OMX20.csv')
stockOMX25 = pd.read_csv(path + 'OMX25.csv')
stockCopenhagenAllShare = pd.read_csv(path + 'OMXCopenhagenAllshares.csv')
# not named `data`, because `from vega_datasets import data` further down reuses that name
cliData = pd.read_csv(path + 'MEI_CLI_24042020100529269.csv')
denEmploy = pd.read_csv(path + 'DenmarkEmploymentQuarterly.csv')
denUnemployMonthly = pd.read_csv(path + 'DenmarkUnemploymentMonthly.csv')
imfGlobalGdp = pd.read_csv(path + 'imf-global-gdp.csv')
unemployRate = pd.read_csv(path + 'unemploymentRate.csv')

## Stock Market

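This section tracks three Copenhagen stock indices (OMX 20, OMX 25, and Copenhagen All Shares) and plots their prices over time.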

In [198]:
# hide
# stock data preprocessing
stockOMX20['Symbol'] = 'OMX 20'
stockOMX25['Symbol'] = 'OMX 25'
stockCopenhagenAllShare['Symbol'] = 'Copenhagen All Shares'
stockAll = pd.concat([stockOMX20, stockOMX25, stockCopenhagenAllShare])
stockAll['Date'] = pd.to_datetime(stockAll.Date)
stockAll = stockAll.sort_values(by=['Symbol', 'Date'])
stockAll['Price'] = stockAll['Price'].str.replace(',', '')
stockAll['Price'] = stockAll['Price'].astype(float)
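
The preprocessing above turns text columns into typed ones: dates become timestamps and prices lose their thousands separators before the float conversion. A minimal sketch of the same steps on a tiny table follows. The sample rows and the `'%b %d, %Y'` date format are assumptions made for illustration, not values taken from the real CSV files.

```python
import pandas as pd

# Hypothetical sample with the same 'Date' and 'Price' columns as the stock files;
# the values and the date format are made up for illustration.
sample = pd.DataFrame({
    'Date': ['Apr 02, 2020', 'Apr 01, 2020'],
    'Price': ['1,050.25', '998.10'],
})

# Parse the dates explicitly so the result does not depend on format inference.
sample['Date'] = pd.to_datetime(sample['Date'], format='%b %d, %Y')

# Remove the thousands separator as a literal string, then convert to float.
sample['Price'] = sample['Price'].str.replace(',', '', regex=False).astype(float)

# Sort chronologically so a line chart connects the points in order.
sample = sample.sort_values('Date').reset_index(drop=True)
print(sample)
```

Passing `regex=False` makes it explicit that the comma is a plain character rather than a pattern.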

In [252]:
# collapse-hide
line = alt.Chart(stockAll).mark_line(interpolate='basis').encode(
    x='Date',
    y='Price',
    color='Symbol',
)

nearest = alt.selection(type='single', nearest=True, on='mouseover',
                        fields=['Date'], empty='none')

selectors = alt.Chart(stockAll).mark_point().encode(
    x='Date',
    opacity=alt.value(0),
).add_selection(
    nearest
)

# Draw points on the line, and highlight based on selection
points = line.mark_point().encode(
    opacity=alt.condition(nearest, alt.value(1), alt.value(0))
)

# Draw text labels near the points, and highlight based on selection
text = line.mark_text(align='left', dx=5, dy=-5).encode(
    text=alt.condition(nearest, 'Price', alt.value(' '))
)

# Draw a rule at the location of the selection
rules = alt.Chart(stockAll).mark_rule(color='gray').encode(
    x='Date',
).transform_filter(
    nearest
)

# Put the five layers into a chart and bind the data
alt.layer(
    line, selectors, points, rules, text
).properties(
    width=600, height=300
)

In [356]:
unemployRate.head()

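| | Unemployment rate (Percent) | 1980 | 1981 | 1982 | 1983 | 1984 | 1985 | 1986 | 1987 | 1988 | ... | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | 2020 | 2021 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Albania | 5 | 4.2 | 2.8 | 3.3 | 4.4 | 5.9 | 5.4 | 5.2 | 6 | ... | 13.4 | 15.9 | 17.5 | 17.1 | 15.2 | 13.7 | 12.3 | 12.0 | 11.8 | 11.5 |
| 1 | Algeria | 15.8 | 15.4 | 15 | 14.3 | 16.5 | 16.9 | 18.4 | 20.1 | 21.8 | ... | 11.0 | 9.8 | 10.6 | 11.2 | 10.5 | 11.7 | 11.7 | 11.4 | 15.1 | 13.9 |
| 2 | Argentina | 3 | 5 | 4.5 | 5 | 5 | 6.2 | 6.3 | 6 | 6.5 | ... | 7.2 | 7.1 | 7.3 | 6.5 | 8.5 | 8.4 | 9.2 | 9.8 | 10.9 | 10.1 |
| 3 | Armenia | no data | no data | no data | no data | no data | no data | no data | no data | no data | ... | 17.3 | 16.2 | 17.6 | 18.5 | 18.0 | 17.8 | 20.4 | 17.7 | 19.0 | 18.4 |
| 4 | Aruba | no data | no data | no data | no data | no data | no data | no data | no data | no data | ... | 9.6 | 7.6 | 7.5 | 7.3 | 7.7 | 8.9 | 7.3 | 7.5 | 7.5 | 7.5 |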
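
The unemployment table is in wide form (one column per year) and marks gaps with the text `no data`, so it would need reshaping before it can be plotted per country and year. The following is one possible sketch, not the notebook's own method. It uses a two-country miniature whose values are copied from the preview above, and the names `wide` and `tidy` are introduced here for illustration.

```python
import pandas as pd

# Miniature of the wide table: two countries, two of the year columns.
wide = pd.DataFrame({
    'Unemployment rate (Percent)': ['Albania', 'Armenia'],
    '1980': ['5', 'no data'],
    '2019': ['12.0', '17.7'],
})

# Reshape to one row per (country, year).
tidy = wide.rename(columns={'Unemployment rate (Percent)': 'Country'}).melt(
    id_vars='Country', var_name='Year', value_name='Rate')

tidy['Year'] = tidy['Year'].astype(int)
# 'no data' cannot be parsed as a number, so errors='coerce' turns it into NaN.
tidy['Rate'] = pd.to_numeric(tidy['Rate'], errors='coerce')

# Drop the missing observations and order the rows for plotting.
tidy = tidy.dropna(subset=['Rate']).sort_values(['Country', 'Year']).reset_index(drop=True)
print(tidy)
```

In this long form, a chart library can map `Year` to the x-axis and `Rate` to the y-axis directly.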


In [358]:
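# world-110m.json is TopoJSON, so it is loaded with topo_feature from the hosted
# dataset URL (the github.com/.../blob/... address returns an HTML page, not the data)
url_topojson = 'https://vega.github.io/vega-datasets/data/world-110m.json'
countries = alt.topo_feature(url_topojson, 'countries')

# chart object: plain country shapes, since this file carries numeric ids but no country names
alt.Chart(countries).mark_geoshape(
    fill='lightgray', stroke='white'
).project(
    type='equirectangular'
).properties(
    width=500, height=300
)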

In [359]:
from vega_datasets import data

# the county-level unemployment data and the albersUsa projection need the US county
# shapes; world-110m.json has no 'counties' feature
counties = alt.topo_feature(data.us_10m.url, 'counties')
source = data.unemployment.url

alt.Chart(counties).mark_geoshape().encode(
    color='rate:Q'
).transform_lookup(
    lookup='id',
    from_=alt.LookupData(source, 'id', ['rate'])
).project(
    type='albersUsa'
).properties(
    width=500,
    height=300
)

In [354]:
counties = alt.topo_feature(data.us_10m.url, 'counties')
counties

UrlData({
  format: TopoDataFormat({
    feature: 'counties',
    type: 'topojson'
  }),
  url: 'https://vega.github.io/vega-datasets/data/us-10m.json'
})

## Denmark's GDP

Denmark's annual GDP growth rate over 40 years, based on IMF data and including the forecasts for 2020 and 2021.

In [236]:
# collapse-hide
# first row of the IMF table, skipping its first column
gdp = imfGDP.loc[0][1:]

# one row per year: the index holds the year labels, the values hold the growth rates
gdpDF = pd.DataFrame({'Year': list(gdp.index), 'Value': gdp.tolist()})

alt.Chart(gdpDF).mark_line(point=True).encode(
    alt.X('Year:O'),
    alt.Y('Value:Q', title= 'Growth Rate'),
)

# Microeconomic


Employment and unemployment data for Denmark, broken down by sex.

In [314]:
# hide
# preprocess employment data
# both tables use the same column labels, so one mapping serves both
renameMap = {'ref_area.label': 'Country',
             'indicator.label': 'indicator',
             'source.label': 'source',
             'sex.label': 'sex',
             'classif1.label': 'classif1',
             'obs_status.label': 'obs_status',
             'note_classif.label': 'note_classif',
             'note_indicator.label': 'note_indicator'}

denEmploy = denEmploy.rename(columns=renameMap)
denUnemployMonthly = denUnemployMonthly.rename(columns=renameMap)

# drop the note columns that are not used in the charts
denEmploy = denEmploy.drop(['note_source.label', 'note_classif'], axis=1)
denUnemployMonthly = denUnemployMonthly.drop(['note_source.label', 'note_classif', 'obs_status'], axis=1)

In [338]:
denEmploy

| | Country | indicator | source | sex | classif1 | time | obs_value | obs_status | note_indicator |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Denmark | Employment by sex and age (thousands) | DNK - LFS - EU Labour Force Survey | Sex: Total | Age (5-year bands): Total | 2016Q1 | 2720.0 | Break in series | Frequency: Quarterly \| Break in series: Method... |
| 1 | Denmark | Employment by sex and age (thousands) | DNK - LFS - EU Labour Force Survey | Sex: Total | Age (5-year bands): 15-19 | 2016Q1 | 130.6 | Break in series | Frequency: Quarterly \| Break in series: Method... |
| 2 | Denmark | Employment by sex and age (thousands) | DNK - LFS - EU Labour Force Survey | Sex: Total | Age (5-year bands): 20-24 | 2016Q1 | 236.2 | Break in series | Frequency: Quarterly \| Break in series: Method... |
| 3 | Denmark | Employment by sex and age (thousands) | DNK - LFS - EU Labour Force Survey | Sex: Total | Age (5-year bands): 25-29 | 2016Q1 | 264.7 | Break in series | Frequency: Quarterly \| Break in series: Method... |
| 4 | Denmark | Employment by sex and age (thousands) | DNK - LFS - EU Labour Force Survey | Sex: Total | Age (5-year bands): 30-34 | 2016Q1 | 253.2 | Break in series | Frequency: Quarterly \| Break in series: Method... |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1339 | Denmark | Employment by sex and age (thousands) | DNK - LFS - EU Labour Force Survey | Sex: Male | Age (Youth, adults): 25+ | 2019Q4 | 1337.1 | | Frequency: Quarterly |
| 1340 | Denmark | Employment by sex and age (thousands) | DNK - LFS - EU Labour Force Survey | Sex: Female | Age (Youth, adults): 15+ | 2019Q4 | 1355.3 | | Frequency: Quarterly |
| 1341 | Denmark | Employment by sex and age (thousands) | DNK - LFS - EU Labour Force Survey | Sex: Female | Age (Youth, adults): 15-64 | 2019Q4 | 1328.4 | | Frequency: Quarterly |
| 1342 | Denmark | Employment by sex and age (thousands) | DNK - LFS - EU Labour Force Survey | Sex: Female | Age (Youth, adults): 15-24 | 2019Q4 | 198.2 | | Frequency: Quarterly |


In [337]:
# collapse-hide
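# keep one row per sex and quarter: drop the 'Sex: Total' rows and use only the
# all-ages band, otherwise the overlapping age groups stack into inflated bars
plotData = denEmploy.loc[(denEmploy.sex != 'Sex: Total')
                         & (denEmploy.classif1 == 'Age (5-year bands): Total')
                         & (denEmploy['time'] >= '2018Q1')]

alt.Chart(plotData).mark_bar().encode(
    x='sex:O',
    y= alt.Y('obs_value:Q', title = 'Employment Count (thousands)'),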
    color='sex:N',
    column=alt.Column('time:N', title='Quarterly')
)

In [336]:
# collapse-hide
plotData = denUnemployMonthly.loc[(denUnemployMonthly.sex != 'Sex: Total') & (denUnemployMonthly['time'] > '2019M01')]

alt.Chart(plotData).mark_bar().encode(
    x='sex:O',
    y= alt.Y('obs_value:Q', title='Unemployment Count (thousands)'),
    color='sex:N',
    column= alt.Column('time:N', title='Monthly')
)
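
Both charts above select recent periods by comparing the `time` labels as plain strings. That works because the labels are zero-padded, but it silently depends on the label format. As a hedged alternative, the labels could be parsed into pandas periods first. The sample labels below are hypothetical values in the same shape as the `time` column, and the variable names are introduced here for illustration.

```python
import pandas as pd

# Hypothetical labels shaped like the quarterly and monthly 'time' columns.
quarterly = pd.Series(['2017Q4', '2018Q1', '2019Q4'])
monthly = pd.Series(['2018M12', '2019M01', '2019M02'])

# Parse the labels into real periods instead of comparing raw strings.
quarters = pd.PeriodIndex(quarterly, freq='Q')
months = pd.to_datetime(monthly, format='%YM%m').dt.to_period('M')

# Keep quarters from 2018Q1 onward and months strictly after January 2019.
recent_quarters = quarterly[quarters >= pd.Period('2018Q1', freq='Q')]
recent_months = monthly[months > pd.Period('2019-01', freq='M')]

print(recent_quarters.tolist())
print(recent_months.tolist())
```

With real periods, a quarterly label can no longer be compared against a monthly cutoff by accident.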

# Data sources

https://ilostat.ilo.org

https://www.investing.com

https://www.imf.org

http://www.oecd.org