# TABLE OF CONTENTS


* [1. Introduction](#section-one)
* [2. Set-up](#section-two)
    - [2.1 Installing Packages](#subsection-two-one)
    - [2.2 Importing Packages](#subsection-two-two)
    - [2.3 Wrangle Data](#subsection-two-three)
* [3. Covid-19 Analysis](#section-three)
    - [3.1 Covid-19 cases by country](#subsection-three-one)
    - [3.2 Covid-19 cases by state](#subsection-three-two)
    - [3.3 Impact of state cases on stocks performance](#subsection-three-three)
    - [3.4 Predicting future Covid-19 patterns](#subsection-three-four)
* [4. Stock analysis](#section-four)
    - [4.1 Stock selection](#subsection-four-one)
        - [4.1.1 Bear ETF and Treasury ETF](#subsection-four-one-one)
        - [4.1.2 Technology stocks](#subsection-four-one-two)
        - [4.1.3 Consumer Cyclical stocks](#subsection-four-one-three)
        - [4.1.4 Real estate stocks](#subsection-four-one-four)
        - [4.1.5 Healthcare stocks](#subsection-four-one-five)        
* [5. Portfolio optimization](#section-five)
    - [5.1 Stock correlation](#subsection-five-one)
    - [5.2 Simulation of portfolio performance](#subsection-five-two)
    - [5.3 Portfolio allocation](#subsection-five-three)
    - [5.4 Portfolio returns relationship to new cases](#subsection-five-four)
* [6. Conclusion](#section-six)

<a id="section-one"></a>
# 1. Introduction

<a id="section-two"></a>
# 2. Set-up 

<a id="subsection-two-one"></a>
# 2.1 Installing Packages

In [None]:
!pip install --upgrade pip
!pip install yfinance

<a id="subsection-two-two"></a>
# 2.2 Importing Packages

In [None]:
# Date
from dateutil import relativedelta as rd
from datetime import datetime, date, timedelta

#Data Manipulation
import pandas as pd
import numpy as np
from pandas import DataFrame
from numpy import inf

# Visualization
import matplotlib.pyplot as plt
import plotly as py
import plotly.express as px
import plotly.graph_objects as go
import plotly.offline as pyo
pyo.init_notebook_mode()
import seaborn as sns
import matplotlib.cm as cm
import matplotlib.dates as mdates
from matplotlib.dates import DateFormatter
import geopandas as gpd
import matplotlib as mpl
from scipy.stats.mstats import winsorize
from matplotlib.patches import Ellipse
from matplotlib.text import OffsetFrom
from plotly.subplots import make_subplots
import os

# Regression 
import statsmodels.api as sm
from statsmodels.formula.api import ols
import statsmodels.graphics.api as smg
from scipy.optimize import curve_fit
import yfinance as yf


#Prediction
#from sklearn.preprocessing import MinMaxScaler
#from sklearn.tree import DecisionTreeRegressor
#from sklearn.linear_model import LinearRegression
#from sklearn.model_selection import train_test_split
#from tensorflow.keras.models import Sequential
#from tensorflow.keras.layers import Dense
#from tensorflow.keras.layers import LSTM

<a id="subsection-two-three"></a>
# 2.3 Wrangle Data

In [None]:
# Color Palettes
cnf, dth, rec, act = '#393e46', '#ff2e63', '#21bf73', '#fe9801' 

In [None]:
#Germany Country Data up till November
end_date = "2020-11-30"

world_confirmed = pd.read_csv('../input/covid19report20201202/time_series_covid19_confirmed_global.csv')
germany_confirmed = world_confirmed[world_confirmed['Country/Region'].isin(['Germany'])]
germany_confirmed = germany_confirmed.drop(['Province/State','Country/Region','Lat','Long'],axis=1).transpose()
germany_confirmed.reset_index(inplace = True)
germany_confirmed.columns = ["Date",'Total Confirmed']
#print(germany_confirmed)
world_deaths = pd.read_csv('../input/covid19report20201202/time_series_covid19_deaths_global.csv')
germany_deaths = world_deaths[world_deaths['Country/Region'].isin(['Germany'])]
germany_deaths = germany_deaths.drop(['Province/State','Country/Region','Lat','Long'],axis=1).transpose()
germany_deaths.reset_index(inplace = True)
germany_deaths.columns = ["Date",'Total Deaths']
#print(germany_deaths)
world_recovered = pd.read_csv('../input/covid19report20201202/time_series_covid19_recovered_global.csv')
germany_recovered = world_recovered[world_recovered['Country/Region'].isin(['Germany'])]
germany_recovered = germany_recovered.drop(['Province/State','Country/Region','Lat','Long'],axis=1).transpose()
germany_recovered.reset_index(inplace = True)
germany_recovered.columns = ["Date",'Total Recovered']
#print(germany_recovered)
germanycountry= pd.merge(germany_confirmed,germany_deaths,on='Date',how='outer')
germanycountry= pd.merge(germanycountry,germany_recovered,on='Date',how='outer')
germanycountry['Total Active'] = germanycountry['Total Confirmed']-germanycountry['Total Deaths']-germanycountry['Total Recovered']
germanycountry['Date'] = pd.to_datetime(germanycountry['Date'])
germanycountry= germanycountry[germanycountry['Date']<= end_date]
print(germanycountry)

First, we wrangle the data: we convert the dates to datetime format and drop rows where both `age_group` and `gender` are missing. Next, we compute the cumulative cases and deaths per state. Finally, we merge in the population data so that cases and deaths can be expressed as a percentage of the population.

In [None]:
# Germany state-level data up to the end of November
end_date = "2020-11-30"

germanystate = pd.read_csv('../input/covid19-tracking-germany/covid_de.csv')
germanystate['date'] = pd.to_datetime(germanystate['date'])

#print(germanystate['gender'].isnull().sum()) #missing data from gender
#print(germanystate['age_group'].isnull().sum()) #missing data from age_group
germanystate.dropna(subset=['gender','age_group'], how='all',inplace=True)

state = germanystate.sort_values(['state','date','gender','age_group']).reset_index()
state_cases_per_day=state.groupby(['state','date','gender','age_group']).agg({'cases':'sum','deaths':'sum'}).reset_index()
state_cases_per_day['Total cases']=state_cases_per_day.groupby('state')['cases'].cumsum()
state_cases_per_day['Total deaths']=state_cases_per_day.groupby('state')['deaths'].cumsum()

germanypop= pd.read_csv('../input/covid19-tracking-germany/demographics_de.csv')
#print(germanypop.info())
germanypop = germanypop.replace('female','F')
germanypop = germanypop.replace('male','M')
#print(germanypop.head(20))
germany_cases_pop = pd.merge(germanystate,germanypop,on=['state','gender','age_group'],how='inner')
germany_cases_pop['Total cases']=germany_cases_pop.groupby('state')['cases'].cumsum()
germany_cases_pop['Total deaths']=germany_cases_pop.groupby('state')['deaths'].cumsum()
germany_cases_pop.rename(columns={'cases':'new_cases','deaths':'new_deaths'},inplace=True)
germany_cases_pop.drop(columns=['county'],inplace=True)
germany_cases_pop= germany_cases_pop[germany_cases_pop['date']<= end_date]
print(germany_cases_pop)
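
The percentage-of-population step described in the text above can be sketched on toy data as follows. This is a minimal illustration, not the notebook's actual computation: the helper `per_capita_summary`, the column names `cases_pct` and `deaths_pct`, and all numbers are assumptions made for the example.

```python
import pandas as pd

# Toy data: column names mirror the notebook's files, the numbers are invented.
cases = pd.DataFrame({
    "state": ["A", "A", "B", "B"],
    "cases": [30, 20, 5, 15],
    "deaths": [1, 1, 0, 2],
})
population = pd.DataFrame({"state": ["A", "B"], "population": [1000, 400]})


def per_capita_summary(cases, population):
    """Total cases and deaths per state, as counts and as a percentage of population."""
    totals = cases.groupby("state", as_index=False)[["cases", "deaths"]].sum()
    merged = totals.merge(population, on="state", how="inner")
    merged["cases_pct"] = merged["cases"] / merged["population"] * 100
    merged["deaths_pct"] = merged["deaths"] / merged["population"] * 100
    return merged


summary = per_capita_summary(cases, population)
print(summary)
```

Summing per state before merging keeps one row per state, so the population is not counted once per record.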

<a id="section-three"></a>
# 3. Covid-19 Analysis


<a id="subsection-three-one"></a>
# 3.1 **Covid-19 cases by country**

In [None]:
germanycountry['Recovered%'] = round(germanycountry['Total Recovered']/germanycountry['Total Confirmed']*100,2)
germanycountry["New Active"] = germanycountry["Total Active"].diff()
germanycountry["New Cases"] = germanycountry["Total Confirmed"].diff()
germanycountry["New Deaths"] = germanycountry["Total Deaths"].diff()
germanycountry = germanycountry.replace('', np.nan).fillna(0)
print(germanycountry)
temp = germanycountry.melt(id_vars="Date", value_vars=['New Cases', 'New Deaths'],
                 var_name='Case', value_name='Count')
temp.head()

fig = px.area(temp, x="Date", y="Count", color='Case', height=600, width=1200,
             title='Cases over time (Germany)', color_discrete_sequence = [rec, dth, act])
fig.update_layout(xaxis_rangeslider_visible=True)
fig.show()

> Germany saw a first large spike in Covid-19 cases from March to May, peaking at over 7,000 new cases per day. The second wave arrived in October 2020 with even higher daily counts, peaking at about 32k new cases per day on November 25.
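
A peak day like the one quoted above can be read off programmatically rather than from the chart. The sketch below assumes a cumulative series like `Total Confirmed`. The helper name `daily_peak` and the numbers are invented for illustration.

```python
import pandas as pd

# Invented cumulative counts, for illustration only.
cumulative = pd.Series(
    [100, 150, 260, 300, 310],
    index=pd.date_range("2020-11-23", periods=5, freq="D"),
    name="Total Confirmed",
)


def daily_peak(cumulative):
    """Return (date, value) of the largest day-over-day increase."""
    new_cases = cumulative.diff().dropna()
    return new_cases.idxmax(), new_cases.max()


peak_date, peak_value = daily_peak(cumulative)
print(peak_date.date(), peak_value)
```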

In [None]:
temp = germanycountry[['Date','Total Deaths', 'Total Recovered', 'Total Active']].tail(1)
temp.head()
temp = temp.melt(id_vars="Date", value_vars=['Total Active', 'Total Deaths', 'Total Recovered'])
#print(temp)
fig = px.pie(temp, values = 'value',names = 'variable', title = 'Proportion of Covid-19 cases')
fig.show()

> By the end of November, a large share of cases (28.4%) were still active. To avoid overloading medical facilities, we expect long lockdowns to curb this growth.
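
For reference, the active share shown in the pie chart follows from the definition of `Total Active` used earlier (confirmed minus deaths minus recovered). A minimal sketch with invented totals, using a hypothetical helper `active_share`:

```python
def active_share(confirmed, deaths, recovered):
    """Share of confirmed cases that are still active, in percent."""
    active = confirmed - deaths - recovered
    return active / confirmed * 100


# Invented totals, not the notebook's figures.
example_share = active_share(confirmed=1000, deaths=20, recovered=700)
print(round(example_share, 1))
```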

# Population and population density

<a id="subsection-three-two"></a>
# 3.2 Covid-19 cases by state
> Based on the estimation, Germany's fatality ratio does not appear to be as high as recorded. Next, we look at cases by state, gender and age to see how they relate to cases and deaths in Germany.

In [None]:

for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# Any results you write to the current directory are saved as output.
path_to_file_covid = '../input/covid19-tracking-germany/covid_de.csv'
path_to_file_demo  = '../input/covid19-tracking-germany/demographics_de.csv'
path_to_file_shape = '../input/covid19-tracking-germany/de_state.shp'

#getting the data
covid_de  = pd.read_csv(path_to_file_covid, index_col="date", parse_dates=True) #cases and deaths per state and age and sex
demo_de   = pd.read_csv(path_to_file_demo)    # demography file
shape_de2 = gpd.read_file(path_to_file_shape) # geography file

# replace Umlaute
shape_tmp = shape_de2.replace({'Baden-Württemberg' : 'Baden-Wuerttemberg', 'Thüringen' : 'Thueringen' }).copy()
shape_de = shape_tmp.rename(columns={'GEN': 'state'}).copy()

# conversion factor for later
m2tokm2 = 1/1000000
# Set the coordinate reference system (CRS) to EPSG:3035 (ETRS89-LAEA Europe, units in metres)
shape_de = shape_de.set_crs(epsg=3035, allow_override=True)
#print(shape_de.geometry.crs)

In [None]:
norm_axis1 = 90e3
norm_axis2 = 4.1e3


# map with population
def add_pop_state(state):
    popu = demo_de[demo_de.state == state].population.sum()
    #print(popu)
    shape_de[shape_de.state == state].plot(figsize=(10,10),color= cm.Greens(popu/norm_axis1), edgecolor='gainsboro', zorder=3, ax =  ax1)

#map with population density
# for this get the area from the polygon, i.e. geometry
def get_area(state):   
    return shape_de[shape_de.state == state].geometry.area

def add_dens_state(state):
    dens  = demo_de[demo_de.state == state].population.sum()/float(get_area(state))/(m2tokm2)
    #print(state , '----' , round((dens)/(m2tokm2),2), 'people/km**2')#properly normalised density people/km**2
    shape_de[shape_de.state == state].plot(figsize=(10,10),color= cm.Greens(dens/norm_axis2), edgecolor='gainsboro', zorder=3, ax =  ax2)
    
    
plt.figure() 

# Create a map
ax1 = plt.axes([0., 0., 1., 2.])
shape_de['geometry'].plot(color='whitesmoke', edgecolor='gainsboro', zorder=3, ax = ax1)
for i in shape_de.state:
    add_pop_state(i)
ax1.set_title('Population of the states', fontsize=20)

# add colorbar
fig = ax1.get_figure()
cax = fig.add_axes([1.1, 0.0, 0.1, 2.0])
norm = mpl.colors.Normalize(vmin=0,vmax=norm_axis1)
# named `mappable` so that the statsmodels alias `sm` is not overwritten
mappable = plt.cm.ScalarMappable(norm = norm, cmap='Greens')
mappable.set_array([])
cbar = fig.colorbar(mappable, cax=cax , ax=ax1)
cbar.ax.tick_params(labelsize=15)
# Create a second map
ax2 = plt.axes([1.6, 0., 1., 2.])
shape_de['geometry'].plot(figsize=(10,10),color='whitesmoke', edgecolor='gainsboro', zorder=3, ax = ax2)
for i in shape_de.state:
    add_dens_state(i)
    
# add colorbar
fig2 = ax2.get_figure()
cax2 = fig.add_axes([2.8, 0.0, 0.1, 2.])
norm2 = mpl.colors.Normalize(vmin=0,vmax=norm_axis2)
sm2 = plt.cm.ScalarMappable(norm=norm2,cmap='Greens')
sm2._A = []
cbar = fig.colorbar(sm2, cax=cax2)
cbar.ax.tick_params(labelsize=15)
ax2.set_title('Population density of the states (ppl/sqkm)', fontsize=20)

plt.show()

Here one can see that the population is large in states such as NRW, but the density is of course much higher in the city-states (Berlin, Hamburg and Bremen).
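
The density shown on the right-hand map is population divided by area, with the polygon area converted from square metres to square kilometres. A minimal sketch of that arithmetic, using invented numbers and a hypothetical helper `population_density`:

```python
M2_PER_KM2 = 1_000_000


def population_density(population, area_m2):
    """People per square kilometre, given an area in square metres."""
    return population / (area_m2 / M2_PER_KM2)


# Invented example: a compact city-state versus a large territorial state.
city_state = population_density(2_000_000, 800 * M2_PER_KM2)
large_state = population_density(10_000_000, 50_000 * M2_PER_KM2)
print(city_state, large_state)
```

Even with five times the population, the large state ends up with a much lower density, which is the pattern the two maps show.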

# Cases and deaths per state

In [None]:
norm_axis1 = 90e3
norm_axis2 = 1800

def add_case_per_state(state):
    case = covid_de.loc[covid_de['state'] == state ].cases.sum()
    #print(case)
    shape_de[shape_de.state == state].plot(figsize=(10,10), color= cm.Blues(case/norm_axis1), edgecolor='gainsboro', zorder=3, ax =  ax1)

def add_death_per_state(state):
    death = covid_de.loc[covid_de['state'] == state ].deaths.sum()
    #print(death)
    shape_de[shape_de.state == state].plot(figsize=(10,10), color= cm.YlOrRd(death/norm_axis2), edgecolor='gainsboro', zorder=3, ax = ax2)

plt.figure() 

# Create a map
ax1 = plt.axes([0., 0., 1., 2.])
shape_de['geometry'].plot(color='whitesmoke', edgecolor='gainsboro', zorder=3, ax = ax1)
for i in shape_de.state:
    add_case_per_state(i)
ax1.set_title('Cases per state', fontsize=20)

# add colorbar
fig = ax1.get_figure()
cax = fig.add_axes([1.1, 0.0, 0.1, 2.0])
norm = mpl.colors.Normalize(vmin=0,vmax=norm_axis1)
# named `mappable` so that the statsmodels alias `sm` is not overwritten
mappable = plt.cm.ScalarMappable(norm = norm, cmap='Blues')
mappable.set_array([])
cbar = fig.colorbar(mappable, cax=cax , ax=ax1)
cbar.ax.tick_params(labelsize=15)
# Create a second map
ax2 = plt.axes([1.6, 0., 1., 2.])
shape_de['geometry'].plot(figsize=(10,10),color='whitesmoke', edgecolor='gainsboro', zorder=3, ax = ax2)
for i in shape_de.state:
    add_death_per_state(i)
    
# add colorbar
fig2 = ax2.get_figure()
cax2 = fig.add_axes([2.8, 0.0, 0.1, 2.])
norm2 = mpl.colors.Normalize(vmin=0,vmax=norm_axis2)
sm2 = plt.cm.ScalarMappable(norm=norm2,cmap='YlOrRd')
sm2._A = []
cbar = fig.colorbar(sm2, cax=cax2)
cbar.ax.tick_params(labelsize=15)
ax2.set_title('Deaths per state', fontsize=20)

plt.show()



Cases and deaths occur mainly in the highly populated states. In the following code, we normalise by population and by population density. One interesting point is that the eastern part of Germany is much less affected than the western part.

In [None]:
norm_axis1 = 0.012
norm_axis2 = 1e9


def add_case_per_pop_state(state):
    case_norm = covid_de.loc[covid_de['state'] == state ].cases.sum() / demo_de[demo_de.state == state].population.sum()
    #print(case_norm)
    shape_de[shape_de.state == state].plot(figsize=(10,10), color= cm.Blues(case_norm/norm_axis1), edgecolor='gainsboro', zorder=3, ax =  ax1)

def get_area(state):   
    return shape_de[shape_de.state == state].geometry.area
    
def add_case_per_dens_state(state):
    case_dens = covid_de.loc[covid_de['state'] == state ].cases.sum() / (demo_de[demo_de.state == state].population.sum()/float(get_area(state)))
    #print(case_dens)
    shape_de[shape_de.state == state].plot(figsize=(10,10), color= cm.Blues(case_dens/norm_axis2), edgecolor='gainsboro', zorder=3, ax =  ax2)

    

plt.figure() 

# Create a map
ax1 = plt.axes([0., 0., 1., 2.])
shape_de['geometry'].plot(color='whitesmoke', edgecolor='gainsboro', zorder=3, ax = ax1)
for i in shape_de.state:
    add_case_per_pop_state(i)
ax1.set_title('Cases per state per population', fontsize=20)

# add colorbar
fig = ax1.get_figure()
cax = fig.add_axes([1.1, 0.0, 0.1, 2.0])
norm = mpl.colors.Normalize(vmin=0,vmax=norm_axis1)
# named `mappable` so that the statsmodels alias `sm` is not overwritten
mappable = plt.cm.ScalarMappable(norm = norm, cmap='Blues')
mappable.set_array([])
cbar = fig.colorbar(mappable, cax=cax , ax=ax1)
cbar.ax.tick_params(labelsize=15)
# Create a second map
ax2 = plt.axes([1.6, 0., 1., 2.])
shape_de['geometry'].plot(figsize=(10,10),color='whitesmoke', edgecolor='gainsboro', zorder=3, ax = ax2)
for i in shape_de.state:
    add_case_per_dens_state(i)
    
# add colorbar
fig2 = ax2.get_figure()
cax2 = fig.add_axes([2.8, 0.0, 0.1, 2.])
norm2 = mpl.colors.Normalize(vmin=0,vmax=norm_axis2)
sm2 = plt.cm.ScalarMappable(norm=norm2,cmap='Blues')
sm2._A = []
cbar = fig.colorbar(sm2, cax=cax2)
cbar.ax.tick_params(labelsize=15)
ax2.set_title('Cases per state per population density', fontsize=20)

plt.show()

When dividing by the population (left map), one can see that the south has the most cases per capita. When normalising by the state's population density instead, Bavaria leads, probably because its area is large compared with the other states.
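
The area effect can be made explicit: dividing cases by density (population / area) is the same as multiplying cases per capita by area. The sketch below uses two hypothetical states with identical cases per capita. The helper `cases_per_density` and the numbers are assumptions for illustration.

```python
def cases_per_density(cases, population, area):
    """Cases divided by population density (population / area)."""
    return cases / (population / area)


# Two hypothetical states with the same cases per capita (1%) but different areas.
small = cases_per_density(cases=1_000, population=100_000, area=1_000)
large = cases_per_density(cases=1_000, population=100_000, area=10_000)
print(small, large)
```

The state with ten times the area scores ten times higher, so this measure favours large states regardless of how widespread the infection is.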

In [None]:
germ_gender_sum=germanypop.groupby('gender').population.sum()
germ_covid_sum=germany_cases_pop.groupby('gender',as_index=False)[['new_cases','new_deaths']].sum()
germ_summary_2 = pd.merge(germ_gender_sum,germ_covid_sum,on='gender')
#print(germ_summary_2)
germ_summary_2['positivity_rate'] = germ_summary_2['new_cases'] / germ_summary_2['population']
germ_summary_2['death_rate'] = germ_summary_2['new_deaths'] / germ_summary_2['new_cases']
germ_summary_2['prop_positives'] = germ_summary_2['new_cases'] / germ_summary_2['new_cases'].sum()
germ_summary_2['prop_deaths'] = germ_summary_2['new_deaths'] / germ_summary_2['new_deaths'].sum()
print(germ_summary_2)


fig, ax = plt.subplots(3,2, figsize=(12, 12), facecolor='#f7f7f7')
fig.subplots_adjust(top=0.92)
fig.suptitle('Summary of the situation in Germany (gender)', fontsize=18)

germ_summary_2.set_index('gender').new_cases.plot(kind='bar', ax=ax[0][0], color='gold')
germ_summary_2.set_index('gender').new_deaths.plot(kind='bar', ax=ax[1][0], color='red')
germ_summary_2.set_index('gender').positivity_rate.plot(kind='bar', ax=ax[0][1], color='gold')
germ_summary_2.set_index('gender').death_rate.plot(kind='bar', ax=ax[1][1], color='red')

ax[2][0].pie(germ_summary_2.prop_positives.values, labels=germ_summary_2.gender, autopct='%.0f%%')
ax[2][1].pie(germ_summary_2.prop_deaths.values, labels=germ_summary_2.gender, autopct='%.0f%%')

ax[0][0].set_title('Total Cases', fontsize=14)
ax[1][0].set_title('Total Deceased', fontsize=14)
ax[0][1].set_title('Positivity Rate (%)', fontsize=14)
ax[1][1].set_title('Death Rate (%)', fontsize=14)
ax[2][0].set_title('Proportion of Positives', fontsize=14)
ax[2][1].set_title('Proportion of Victims', fontsize=14)

for row in ax[:2]:
    for axes in row:
        axes.set_xlabel('')
        axes.set_xticklabels(axes.get_xticklabels(), rotation=0)
        axes.grid(axis='y')
# The rates are in the right-hand column; show them as percentages
for axes in (ax[0][1], ax[1][1]):
    axes.yaxis.set_major_formatter(mpl.ticker.PercentFormatter(xmax=1, decimals=2))

plt.show()

> Based on the graphs, both genders contract the virus at about the same rate. However, males seem to have a higher death rate: about 2.5%, compared with 2% for females.

In [None]:
germ_pop_sum=germanypop.groupby('age_group').population.sum()
germ_covid_sum=germany_cases_pop.groupby('age_group',as_index=False)[['new_cases','new_deaths']].sum()
germ_summary = pd.merge(germ_pop_sum,germ_covid_sum,on='age_group')
germ_summary['positivity_rate'] = germ_summary['new_cases'] / germ_summary['population']
germ_summary['death_rate'] = germ_summary['new_deaths'] / germ_summary['new_cases']
germ_summary['prop_positives'] = germ_summary['new_cases'] / germ_summary['new_cases'].sum()
germ_summary['prop_deaths'] = germ_summary['new_deaths'] / germ_summary['new_deaths'].sum()
#print(germ_summary)


fig, ax = plt.subplots(3,2, figsize=(15, 20), facecolor='#f7f7f7')
fig.subplots_adjust(top=0.92)
fig.suptitle('Summary of the situation in Germany (age)', fontsize=18)

germ_summary.set_index('age_group').new_cases.plot(kind='bar', ax=ax[0][0], color='gold')
germ_summary.set_index('age_group').new_deaths.plot(kind='bar', ax=ax[1][0], color='red')
germ_summary.set_index('age_group').positivity_rate.plot(kind='bar', ax=ax[0][1], color='gold')
germ_summary.set_index('age_group').death_rate.plot(kind='bar', ax=ax[1][1], color='red')

ax[2][0].pie(germ_summary.prop_positives.values, labels=germ_summary.age_group, autopct='%.0f%%',labeldistance=None)
ax[2][1].pie(germ_summary.prop_deaths.values, labels=germ_summary.age_group, autopct='%.0f%%',labeldistance=None)

ax[0][0].set_title('Total Cases', fontsize=14)
ax[1][0].set_title('Total Deceased', fontsize=14)
ax[0][1].set_title('Positivity Rate (%)', fontsize=14)
ax[1][1].set_title('Death Rate (%)', fontsize=14)
ax[2][0].set_title('Proportion of Positives', fontsize=14)
ax[2][1].set_title('Proportion of Victims', fontsize=14)
ax[2][0].legend()
ax[2][1].legend()

for row in ax[:2]:
    for axes in row:
        axes.set_xlabel('')
        axes.set_xticklabels(axes.get_xticklabels(), rotation=0)
        axes.grid(axis='y')
# The rates are in the right-hand column; show them as percentages
for axes in (ax[0][1], ax[1][1]):
    axes.yaxis.set_major_formatter(mpl.ticker.PercentFormatter(xmax=1, decimals=2))

plt.show()

> Most cases are contracted by people aged 15-39, even when scaled by population. This is probably because most of the workforce falls within that age range. However, deaths are concentrated among those aged 60 and above, who account for nearly 90% of all deaths.
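
A share like the one quoted for the oldest age groups can be computed directly from the per-group totals. The sketch below is illustrative only: the age-group labels, the counts, and the helper `share_of_groups` are assumptions, not values taken from the dataset.

```python
import pandas as pd

# Invented deaths per age group; the labels are hypothetical.
deaths_by_age = pd.Series(
    {"00-04": 0, "05-14": 0, "15-34": 5, "35-59": 95, "60-79": 400, "80-99": 500}
)


def share_of_groups(counts, groups):
    """Percentage of the total that falls into the listed groups."""
    return counts.loc[groups].sum() / counts.sum() * 100


older_share = share_of_groups(deaths_by_age, ["60-79", "80-99"])
print(older_share)
```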

<a id="subsection-three-three"></a>
# 3.3 Impact of Germany Covid cases on stocks performance


In [None]:
EWG = yf.download("EWG",start = "2019-12-01", end = '2020-11-30')
EWG = EWG["Adj Close"]
EWG = pd.DataFrame(EWG)
EWG.columns = ['Adj_Close']
fig = make_subplots(specs=[[{"secondary_y": True}]])

# Add traces to create subplots
fig.add_trace(
    go.Scatter(x=EWG.index, y=EWG['Adj_Close'], name = 'EWG'),  
    secondary_y=False,
)

fig.add_trace(
    go.Scatter(x=germanycountry['Date'], y=germanycountry['New Cases'], name = 'New COVID19 Cases'), 
    secondary_y=True,
)

# Add figure title
fig.update_layout(
    title_text="EWG and New COVID19 Cases"
)

# Set x-axis title
fig.update_xaxes(title_text="Date")

# Set y-axes titles
fig.update_yaxes(title_text="<b>EWG</b>", secondary_y=False)
fig.update_yaxes(title_text="<b>New COVID19 Cases</b>", secondary_y=True)

fig.show()

The first spike in cases came in early March, when new cases rose above 5,000 per day; EWG, the German market index, dropped sharply from around \$28 to \$18. As cases stabilised from May onwards, EWG gradually recovered towards its previous highs. When the second spike started in late October, EWG dipped only slightly, even though daily cases were much higher, at around 20-30k per day. This suggests that investors still see a negative relationship between daily cases and market returns, but that their reaction is becoming less extreme as they adapt to a Covid-19-stricken world. Going forward, we expect a spike in Covid-19 cases to still weigh on market returns, but on a smaller scale and with a much faster recovery.
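
The size of a price drop like the one described above can be quantified as a maximum drawdown, i.e. the largest decline from a running peak. A minimal sketch on an invented price path (not the downloaded EWG data), with a hypothetical helper `max_drawdown`:

```python
import pandas as pd


def max_drawdown(prices):
    """Largest peak-to-trough decline, returned as a negative fraction."""
    running_peak = prices.cummax()
    drawdown = prices / running_peak - 1
    return drawdown.min()


# Invented price path that falls from 28 to 18 and then partly recovers.
prices = pd.Series([27.0, 28.0, 24.0, 18.0, 22.0, 26.0])
print(round(max_drawdown(prices) * 100, 1))
```

A fall from 28 to 18 is a drawdown of roughly 36%. Applying the same function to separate date windows would allow the two waves to be compared on one scale.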

#  Investigating specific companies 

Many companies have set up physical production plants and retail stores in different German states. Because of state-level lockdowns, these companies might suffer larger losses than others. We take a closer look at a few companies with operations in Bavaria (Bayern), a highly populated state that is going into a second lockdown.

In [None]:
end = date.today()
Siemens = yf.download("SIEGY",start = "2019-12-01", end = end)
Siemens = Siemens["Adj Close"]
Siemens = pd.DataFrame(Siemens)
Siemens.columns = ['Adj_Close']

Adidas = yf.download("ADDYY",start = "2019-12-01", end = end)
Adidas = Adidas["Adj Close"]
Adidas = pd.DataFrame(Adidas)
Adidas.columns = ['Adj_Close']

VOW = yf.download("VOW.DE",start = "2019-12-01", end = end)
VOW = VOW["Adj Close"]
VOW = pd.DataFrame(VOW)
VOW.columns = ['Adj_Close']

In [None]:
fig, ax = plt.subplots(3,1, figsize=(15, 15), facecolor='#f7f7f7')


fig.subplots_adjust(top=0.92)
fig.suptitle('Companies in Bayern', fontsize=18)
fig.tight_layout(pad=8.0)

Siemens.Adj_Close.plot(kind='line', ax=ax[0], color='red')
Adidas.Adj_Close.plot(kind='line', ax=ax[1], color='red')
VOW.Adj_Close.plot(kind='line',ax=ax[2], color ='red')



ax[0].set_title('Siemens', fontsize=14)
ax[1].set_title('Adidas', fontsize=14)
ax[2].set_title('Volkswagen', fontsize=14)



plt.show()

Looking at the market performance of these three companies, all suffered a large drop in prices in March, similar to EWG. However, from late October to November, when the second lockdown took place, no significant drop occurred. These results suggest that the companies' share prices are becoming less correlated with the Covid-19 pandemic, which makes them relatively safe investments. Volkswagen's stock price is analysed further in a separate notebook to decide whether it is a good long-term buy:
https://www.kaggle.com/tohyongkai/volkswagen-analysis 
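
One way to check the decoupling claim numerically, rather than by eye, is to compare the correlation between daily returns and daily new cases in each wave. The sketch below is only an illustration under stated assumptions: the helper `window_correlation`, the date windows, and all numbers are invented, and a correlation on so few points would not be conclusive on real data.

```python
import pandas as pd


def window_correlation(returns, new_cases, start, end):
    """Pearson correlation of daily returns with daily new cases inside [start, end]."""
    frame = pd.concat({"returns": returns, "new_cases": new_cases}, axis=1).dropna()
    window = frame.loc[start:end]
    return window["returns"].corr(window["new_cases"])


# Invented data: four days in the first wave, four days in the second wave.
dates = pd.date_range("2020-03-02", periods=4, freq="D").append(
    pd.date_range("2020-10-26", periods=4, freq="D")
)
returns = pd.Series([-0.01, -0.02, -0.03, -0.04, 0.01, -0.01, 0.01, -0.01], index=dates)
new_cases = pd.Series([100, 200, 300, 400, 1000, 2000, 3000, 4000], index=dates)

first_wave = window_correlation(returns, new_cases, "2020-03-01", "2020-03-31")
second_wave = window_correlation(returns, new_cases, "2020-10-01", "2020-11-30")
print(round(first_wave, 2), round(second_wave, 2))
```

In this toy example the first-wave correlation is strongly negative while the second-wave correlation is much weaker, which is the pattern the text describes.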