# **Investment Strategy - Japan**

# TABLE OF CONTENTS

* [1. SETUP](#section-one)
    - [1.1 Load Packages](#subsection-one-one)   
    - [1.2 Wrangle Data](#subsection-one-two)   
* [2. COVID-19 in Japan](#section-two)
    - [2.1 COVID-19 Cases in Japan](#subsection-two-one)
    - [2.2 Cumulative COVID-19 Cases in Japan](#subsection-two-two)
* [3. The Investment Strategy](#section-three)
    - [3.1 Nikkei 225 and COVID-19](#subsection-three-one)
    - [3.2 BTC and Benchmarks (YTD)](#subsection-three-two)
    - [3.3 Kadokawa and Benchmarks (YTD)](#subsection-three-three)
    - [3.4 Takeda and Benchmarks (YTD)](#subsection-three-four)
    - [3.5 Comparison of All Assets (YTD)](#subsection-three-five)
    - [3.6 Correlation Pair-Plot](#subsection-three-six)
* [4. Portfolio Optimization - Variance, Covariance, Mean, and Sharpe Ratio](#section-four)
    - [4.1 More Details about Benchmarks](#subsection-four-one)
    - [4.2 Mean Return Vector](#subsection-four-two)
    - [4.3 Covariance Matrix](#subsection-four-three)
* [5. REFERENCES](#section-five)

<a id="section-one"></a>
# 1. SETUP

<a id="subsection-one-one"></a>
## 1.1 Load Packages

In [None]:
# for numerical analysis
import numpy as np

# to store and process data in dataframe
import pandas as pd

# to interface with operating system
import os

# for basic visualization
import matplotlib.pyplot as plt

# for advanced visualization
import seaborn as sns; sns.set()

# for interactive visualization
import plotly.express as px
from plotly.subplots import make_subplots
import plotly.graph_objs as go

# for offline interactive visualization
from plotly.offline import plot, iplot, init_notebook_mode
init_notebook_mode(connected=True)

# for trendlines
import statsmodels

# data manipulation
from datetime import datetime as dt
from scipy.stats.mstats import winsorize

import warnings
warnings.filterwarnings("ignore")

<a id="subsection-one-two"></a>
## 1.2 Wrangle Data

Datasets used in this notebook:  
1. [COVID-19 by country - Daily update](https://www.kaggle.com/jcsantiago/covid19-by-country-with-government-response)  
2. Portfolio price data, derived from [Yahoo Finance](https://finance.yahoo.com/)

In [None]:
full_grouped = pd.read_csv('../input/covid19-by-country-with-government-response/covid19_by_country.csv')

full_grouped['Date'] = pd.to_datetime(full_grouped['Date'], format = '%Y-%m-%d')
full_grouped['active'] = full_grouped['confirmed'] - full_grouped['deaths'] - full_grouped['recoveries']
full_grouped = full_grouped[full_grouped['Date'] <= '2020-11-09']
jpn_covid = full_grouped[full_grouped['Country'] == 'Japan']
jpn_covid.tail()

In [None]:
files = []

for dirname, _, filenames in os.walk('../input/porfolio'):
    for filename in filenames:
        files.append(os.path.join(dirname, filename))
        
files = sorted(files)
files

In [None]:
series = [pd.read_csv(f, na_values=['.']) for f in files]

# Define series name, which becomes the dictionary key
series_name = ['btc','daikin','ewj','gold','kadokawa','keyence','makita','n225','prjpx','takeda']

# series name = dictionary key, series = dictionary value
series_dict = dict(zip(series_name, series))

In [None]:
# 1. Nikkei 225
n225 = series_dict['n225']
n225['Date'] = pd.to_datetime(n225['Date'])
n225.rename(columns={'Adj Close':'n225'}, inplace=True)
n225['n225_return'] = n225['n225'].pct_change()
n225['n225_volatility_1m'] = (n225['n225_return'].rolling(20).std())*(20)**(1/2) 
n225 = n225[['Date','n225','n225_return','n225_volatility_1m']]
# Calculate 1-month forward cumulative returns
n225['one_month_forward_n225_return'] = n225['n225_return'][::-1].rolling(window=20, min_periods=1).sum()[::-1]
n225['n225_ytd'] = (n225['n225'] - n225.iloc[0, 1]) / n225.iloc[0, 1]

n225.tail()

In [None]:
# 2. Bitcoin
btc = series_dict['btc']
btc['Date'] = pd.to_datetime(btc['Date'])
btc.rename(columns={'Adj Close':'btc'}, inplace=True)
btc['btc_return'] = btc['btc'].pct_change()
btc['btc_volatility_1m'] = (btc['btc_return'].rolling(20).std())*(20)**(1/2) 
btc = btc[['Date','btc','btc_return','btc_volatility_1m']]
btc['one_month_forward_btc_return'] = btc['btc_return'][::-1].rolling(window=20, min_periods=1).sum()[::-1]
btc['btc_ytd'] = (btc['btc'] - btc.iloc[0, 1]) / btc.iloc[0, 1]

btc.tail()

In [None]:
# 3. Gold
gold = series_dict['gold']
gold['Date'] = pd.to_datetime(gold['Date'])
gold.rename(columns={'Adj Close':'gold'}, inplace=True)
gold['gold_lag1'] = gold['gold'].shift(1)
gold['gold_lag2'] = gold['gold'].shift(2)
gold['gold'] = gold['gold'].fillna(gold['gold_lag1'])
gold['gold'] = gold['gold'].fillna(gold['gold_lag2'])
gold['gold'] = gold['gold'].astype('float64')
gold['gold_return'] = gold['gold'].pct_change()
gold['gold_volatility_1m'] = (gold['gold_return'].rolling(20).std())*(20)**(1/2) 
gold = gold[['Date','gold','gold_return','gold_volatility_1m']]
gold['one_month_forward_gold_return'] = gold['gold_return'][::-1].rolling(window=20, min_periods=1).sum()[::-1]
gold['gold_ytd'] = (gold['gold'] - gold.iloc[0, 1]) / gold.iloc[0, 1]

gold.tail()
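
The gold cell fills missing prices from the previous one or two rows using lagged columns. As a minimal sketch on hypothetical prices (not the actual gold series), the same result can be obtained with a forward fill limited to two consecutive gaps. A run of three or more missing values keeps a gap in both versions.

```python
import numpy as np
import pandas as pd

# Hypothetical price series with gaps of length 2 and 3, for illustration only
prices = pd.Series([1500.0, np.nan, np.nan, 1510.0, np.nan, np.nan, np.nan, 1520.0])

# Lag-based fill, mirroring the gold cell: try the 1-row lag, then the 2-row lag
lag_filled = prices.fillna(prices.shift(1)).fillna(prices.shift(2))

# Forward fill that carries the last valid price across at most two missing rows
ffilled = prices.ffill(limit=2)

# Both approaches leave the third row of the 3-row gap unfilled
print(pd.DataFrame({'lag_filled': lag_filled, 'ffilled': ffilled}))
```

The forward-fill form avoids the two helper columns (`gold_lag1`, `gold_lag2`) and makes the maximum gap length explicit.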

In [None]:
# 4. Kadokawa
kadokawa = series_dict['kadokawa']
kadokawa['Date'] = pd.to_datetime(kadokawa['Date'])
kadokawa.rename(columns={'Adj Close':'kadokawa'}, inplace=True)
kadokawa['kadokawa_return'] = kadokawa['kadokawa'].pct_change()
kadokawa['kadokawa_volatility_1m'] = (kadokawa['kadokawa_return'].rolling(20).std())*(20)**(1/2)
kadokawa = kadokawa[['Date','kadokawa','kadokawa_return','kadokawa_volatility_1m']]
kadokawa['one_month_forward_kadokawa_return'] = kadokawa['kadokawa_return'][::-1].rolling(window=20, min_periods=1).sum()[::-1]
kadokawa['kadokawa_ytd'] = (kadokawa['kadokawa'] - kadokawa.iloc[0, 1]) / kadokawa.iloc[0, 1]

kadokawa.tail()

In [None]:
# 5. Makita
makita = series_dict['makita']
makita['Date'] = pd.to_datetime(makita['Date'])
makita.rename(columns={'Adj Close':'makita'}, inplace=True)
makita['makita_return'] = makita['makita'].pct_change()
makita['makita_volatility_1m'] = (makita['makita_return'].rolling(20).std())*(20)**(1/2)
makita = makita[['Date','makita','makita_return','makita_volatility_1m']]
makita['one_month_forward_makita_return'] = makita['makita_return'][::-1].rolling(window=20, min_periods=1).sum()[::-1]
makita['makita_ytd'] = (makita['makita'] - makita.iloc[0, 1]) / makita.iloc[0, 1]

makita.tail()

In [None]:
# 6. Daikin
daikin = series_dict['daikin']
daikin['Date'] = pd.to_datetime(daikin['Date'])
daikin.rename(columns={'Adj Close':'daikin'}, inplace=True)
daikin['daikin_return'] = daikin['daikin'].pct_change()
daikin['daikin_volatility_1m'] = (daikin['daikin_return'].rolling(20).std())*(20)**(1/2)
daikin = daikin[['Date','daikin','daikin_return','daikin_volatility_1m']]
daikin['one_month_forward_daikin_return'] = daikin['daikin_return'][::-1].rolling(window=20, min_periods=1).sum()[::-1]
daikin['daikin_ytd'] = (daikin['daikin'] - daikin.iloc[0, 1]) / daikin.iloc[0, 1]

daikin.tail()

In [None]:
# 7. EWJ
ewj = series_dict['ewj']
ewj['Date'] = pd.to_datetime(ewj['Date'])
ewj.rename(columns={'Adj Close':'ewj'}, inplace=True)
ewj['ewj_return'] = ewj['ewj'].pct_change()
ewj['ewj_volatility_1m'] = (ewj['ewj_return'].rolling(20).std())*(20)**(1/2)
ewj = ewj[['Date','ewj','ewj_return','ewj_volatility_1m']]
ewj['one_month_forward_ewj_return'] = ewj['ewj_return'][::-1].rolling(window=20, min_periods=1).sum()[::-1]
ewj['ewj_ytd'] = (ewj['ewj'] - ewj.iloc[0, 1]) / ewj.iloc[0, 1]


ewj.tail()

In [None]:
# 8. Keyence
keyence = series_dict['keyence']
keyence['Date'] = pd.to_datetime(keyence['Date'])
keyence.rename(columns={'Adj Close':'keyence'}, inplace=True)
keyence['keyence_return'] = keyence['keyence'].pct_change()
keyence['keyence_volatility_1m'] = (keyence['keyence_return'].rolling(20).std())*(20)**(1/2)
keyence = keyence[['Date','keyence','keyence_return','keyence_volatility_1m']]
keyence['one_month_forward_keyence_return'] = keyence['keyence_return'][::-1].rolling(window=20, min_periods=1).sum()[::-1]
keyence['keyence_ytd'] = (keyence['keyence'] - keyence.iloc[0, 1]) / keyence.iloc[0, 1]

keyence.tail()

In [None]:
# 9. PRJPX
prjpx = series_dict['prjpx']
prjpx['Date'] = pd.to_datetime(prjpx['Date'])
prjpx.rename(columns={'Adj Close':'prjpx'}, inplace=True)
prjpx['prjpx_return'] = prjpx['prjpx'].pct_change()
prjpx['prjpx_volatility_1m'] = (prjpx['prjpx_return'].rolling(20).std())*(20)**(1/2)
prjpx = prjpx[['Date','prjpx','prjpx_return','prjpx_volatility_1m']]
prjpx['one_month_forward_prjpx_return'] = prjpx['prjpx_return'][::-1].rolling(window=20, min_periods=1).sum()[::-1]
prjpx['prjpx_ytd'] = (prjpx['prjpx'] - prjpx.iloc[0, 1]) / prjpx.iloc[0, 1]

prjpx.tail()

In [None]:
# 10. Takeda
takeda = series_dict['takeda']
takeda['Date'] = pd.to_datetime(takeda['Date'])
takeda.rename(columns={'Adj Close':'takeda'}, inplace=True)
takeda['takeda_return'] = takeda['takeda'].pct_change()
takeda['takeda_volatility_1m'] = (takeda['takeda_return'].rolling(20).std())*(20)**(1/2)
takeda = takeda[['Date','takeda','takeda_return','takeda_volatility_1m']]
takeda['one_month_forward_takeda_return'] = takeda['takeda_return'][::-1].rolling(window=20, min_periods=1).sum()[::-1]
takeda['takeda_ytd'] = (takeda['takeda'] - takeda.iloc[0, 1]) / takeda.iloc[0, 1]

takeda.tail()
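
The ten cells above repeat the same steps for each asset: parse the date, rename `Adj Close`, then compute the daily return, the 20-day rolling volatility, the 20-day forward return, and the return relative to the first price. A minimal sketch of a helper that wraps these steps follows. The function name `prepare_asset`, its `window` parameter, and the demo prices are assumptions for illustration and are not part of the original notebook.

```python
import numpy as np
import pandas as pd


def prepare_asset(df, name, window=20):
    """Build the per-asset columns used in this notebook (hypothetical helper)."""
    # Work on a copy so the input frame is left untouched
    out = df[['Date', 'Adj Close']].copy()
    out['Date'] = pd.to_datetime(out['Date'])
    out = out.rename(columns={'Adj Close': name})
    # Simple daily return
    out[f'{name}_return'] = out[name].pct_change()
    # Rolling standard deviation of daily returns, scaled to the window length
    out[f'{name}_volatility_1m'] = out[f'{name}_return'].rolling(window).std() * np.sqrt(window)
    # Sum of daily returns over the current row and the following (window - 1) rows
    out[f'one_month_forward_{name}_return'] = (
        out[f'{name}_return'][::-1].rolling(window=window, min_periods=1).sum()[::-1]
    )
    # Return relative to the first price in the file
    first_price = out[name].iloc[0]
    out[f'{name}_ytd'] = (out[name] - first_price) / first_price
    return out


# Hypothetical prices, for illustration only
demo = pd.DataFrame({
    'Date': ['2020-01-06', '2020-01-07', '2020-01-08'],
    'Adj Close': [100.0, 110.0, 99.0],
})
demo_prepared = prepare_asset(demo, 'demo', window=2)
print(demo_prepared)
```

With such a helper, each asset could be prepared with a call like `prepare_asset(series_dict['takeda'], 'takeda')`. Gold would still need its missing-value fill applied first.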

In [None]:
baseline = pd.merge(n225, ewj, how='left', on='Date')
baseline = pd.merge(baseline, btc, how='left', on='Date')
baseline = pd.merge(baseline, gold, how='left', on='Date')
baseline = pd.merge(baseline, daikin, how='left', on='Date')
baseline = pd.merge(baseline, kadokawa, how='left', on='Date')
baseline = pd.merge(baseline, keyence, how='left', on='Date')
baseline = pd.merge(baseline, makita, how='left', on='Date')
baseline = pd.merge(baseline, takeda, how='left', on='Date')
baseline = pd.merge(baseline, prjpx, how='left', on='Date')

baseline.loc[baseline.Date >= '2020-03-25', "recession"] = 1
baseline["recession"] = baseline["recession"].fillna(0)

baseline2020 = pd.merge(baseline,jpn_covid, how='left', on='Date')

baseline2020.info()

In [None]:
baseline2020.tail()
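
The chain of left merges above keeps only the dates present in the first frame (`n225`). Rows from later frames on other dates are dropped, and dates missing from a later frame become `NaN`. The sketch below shows this behavior on small hypothetical frames and folds the repeated `pd.merge` calls into one `functools.reduce` call. The frames `a`, `b`, and `c` are placeholders, not the notebook's data.

```python
from functools import reduce

import pandas as pd

# Hypothetical frames standing in for n225, ewj, btc, ...
a = pd.DataFrame({'Date': pd.to_datetime(['2020-01-06', '2020-01-07']), 'a': [1.0, 2.0]})
b = pd.DataFrame({'Date': pd.to_datetime(['2020-01-06', '2020-01-08']), 'b': [10.0, 30.0]})
c = pd.DataFrame({'Date': pd.to_datetime(['2020-01-07']), 'c': [200.0]})

# Left-merge every frame onto the first one, in order
merged = reduce(lambda left, right: pd.merge(left, right, how='left', on='Date'), [a, b, c])

# Only the two dates of `a` survive; 2020-01-08 from `b` is dropped
print(merged)
```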

<a id="section-two"></a>
# 2. COVID-19 in Japan

**Data through November 9, 2020 are used.**

<a id="subsection-two-one"></a>
## 2.1 COVID-19 Cases in Japan

In [None]:
temp = jpn_covid.melt(id_vars="Date", value_vars=['confirmed_inc', 'deaths_inc'],
                 var_name='Case', value_name='Count')
temp.head()

fig = px.area(temp, x="Date", y="Count", color='Case', height=600, width=1200,
             title='COVID-19 Cases in Japan')
fig.update_layout(xaxis_rangeslider_visible=True)
fig.show()

<a id="subsection-two-two"></a>
## 2.2 Cumulative COVID-19 Cases in Japan

In [None]:
def cases_over_time(country):
    """Filter the data to one country and plot its cumulative cases over time."""
    # Example inputs: 'Japan', 'South Korea', 'China'. The name must match the 'Country' column exactly.
    # Select the target country's rows from the full_grouped dataset
    selected = full_grouped['Country'] == country
    full_country = full_grouped[selected]

    # Reshape the target country's data to long format
    temp = full_country.melt(id_vars="Date", value_vars=['recoveries', 'deaths', 'active'],
                 var_name='Case', value_name='Count')

    # Plot a stacked area graph of the three case types (recoveries, deaths, and active)
    fig = px.area(temp, x="Date", y="Count", color='Case', height=600, width=700,
             title='Cases over time' + ' - ' + country)
    fig.update_layout(xaxis_rangeslider_visible=True)
    fig.show()

In [None]:
temp = jpn_covid.melt(id_vars="Date", value_vars=['recoveries', 'deaths', 'active'],
        var_name='Case', value_name='Count')

fig = px.area(temp, x="Date", y="Count", color='Case', height=600, width=700,
        title='Cumulative Cases over time' + ' - ' + 'Japan')
fig.update_layout(xaxis_rangeslider_visible=True)
fig.show()

<a id="section-three"></a>
# 3. The Investment Strategy

<a id="subsection-three-one"></a>
## 3.1 Nikkei 225 and COVID-19

In [None]:
# Create figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])

# Add traces to create subplots
fig.add_trace(
    go.Scatter(x=baseline2020['Date'], y=baseline2020['n225'], name = 'Nikkei 225'),  
    secondary_y=False,
)

fig.add_trace(
    go.Scatter(x=baseline2020['Date'], y=baseline2020['confirmed_inc'], name = 'New COVID19 Cases'), 
    secondary_y=True,
)

# Add figure title
fig.update_layout(
    title_text="Nikkei 225 and New COVID19 Cases"
)

# Set x-axis title
fig.update_xaxes(title_text="Date")

# Set y-axes titles
fig.update_yaxes(title_text="<b>Nikkei 225</b>", secondary_y=False)
fig.update_yaxes(title_text="<b>New COVID19 Cases</b>", secondary_y=True)

fig.show()

<a id="subsection-three-two"></a>
## 3.2 BTC and Benchmarks (YTD)

In [None]:
# Set the figure size before plotting so that it applies to this figure
plt.figure(figsize=(20, 10))
plt.title('Performance of Bitcoin and Benchmarks (YTD)', fontsize=20)
plt.plot(baseline2020['Date'], baseline2020['ewj_ytd'],label='EWJ(ETF)', color='blue', linewidth = 1.5)
plt.plot(baseline2020['Date'], baseline2020['btc_ytd'],label='BTC', color='red', linewidth = 2.5)
plt.plot(baseline2020['Date'], baseline2020['prjpx_ytd'],label='PRJPX(Mutual Fund)', color='green', linewidth = 1.5)
plt.xlabel('Date', fontsize=18)
plt.ylabel('YTD Return', fontsize=18)
plt.legend(fontsize=14)

<a id="subsection-three-three"></a>
## 3.3 Kadokawa and Benchmarks (YTD)

In [None]:
# Set the figure size before plotting so that it applies to this figure
plt.figure(figsize=(20, 10))
plt.title('Performance of Kadokawa and Benchmarks (YTD)', fontsize=20)
plt.plot(baseline2020['Date'], baseline2020['ewj_ytd'],label='EWJ(ETF)', color='blue', linewidth = 1.5)
plt.plot(baseline2020['Date'], baseline2020['kadokawa_ytd'],label='Kadokawa', color='red', linewidth = 2.5)
plt.plot(baseline2020['Date'], baseline2020['prjpx_ytd'],label='PRJPX(Mutual Fund)', color='green', linewidth = 1.5)
plt.xlabel('Date', fontsize=18)
plt.ylabel('YTD Return', fontsize=18)
plt.legend(fontsize=14)

<a id="subsection-three-four"></a>
## 3.4 Takeda and Benchmarks (YTD)

In [None]:
# Set the figure size before plotting so that it applies to this figure
plt.figure(figsize=(20, 10))
plt.title('Performance of Takeda and Benchmarks (YTD)', fontsize=20)
plt.plot(baseline2020['Date'], baseline2020['ewj_ytd'],label='EWJ(ETF)', color='blue', linewidth = 1.5)
plt.plot(baseline2020['Date'], baseline2020['takeda_ytd'],label='Takeda', color='red', linewidth = 2.5)
plt.plot(baseline2020['Date'], baseline2020['prjpx_ytd'],label='PRJPX(Mutual Fund)', color='green', linewidth = 1.5)
plt.xlabel('Date', fontsize=18)
plt.ylabel('YTD Return', fontsize=18)
plt.legend(fontsize=14)

<a id="subsection-three-five"></a>
## 3.5 Comparison of All Assets (YTD)

In [None]:
# Set the figure size before plotting so that it applies to this figure
plt.figure(figsize=(20, 10))
plt.title('Comparison of all assets (YTD)', fontsize=20)
plt.plot(baseline2020['Date'], baseline2020['ewj_ytd'],label='EWJ(ETF)', color='red', linewidth = 2.5)
plt.plot(baseline2020['Date'], baseline2020['prjpx_ytd'],label='PRJPX(Mutual Fund)', color='orange', linewidth = 2.5)
plt.plot(baseline2020['Date'], baseline2020['keyence_ytd'],label='Keyence', linewidth = 1.5)
plt.plot(baseline2020['Date'], baseline2020['daikin_ytd'],label='Daikin', linewidth = 1.5)
plt.plot(baseline2020['Date'], baseline2020['makita_ytd'],label='Makita', linewidth = 1.5)
plt.plot(baseline2020['Date'], baseline2020['takeda_ytd'],label='Takeda', linewidth = 1.5)
plt.plot(baseline2020['Date'], baseline2020['kadokawa_ytd'],label='Kadokawa', linewidth = 1.5)
plt.plot(baseline2020['Date'], baseline2020['gold_ytd'],label='Gold', linewidth = 1.5)
plt.plot(baseline2020['Date'], baseline2020['btc_ytd'],label='Bitcoin', linewidth = 1.5)

plt.xlabel('Date', fontsize=18)
plt.ylabel('YTD Return', fontsize=18)
plt.legend(fontsize=10)

<a id="subsection-three-six"></a>
## 3.6 Correlation Pair-Plot

In [None]:
# Draw a scatter-plot matrix of asset returns and new COVID-19 cases and deaths during the pandemic
baseline_returns = baseline2020[["n225_return", "btc_return", "gold_return", "ewj_return", "prjpx_return", 
                  "kadokawa_return", "keyence_return", "daikin_return", "makita_return", "takeda_return", "confirmed_inc", "deaths_inc"]]

sns.pairplot(baseline_returns)

In [None]:
# Draw a heatmap of the correlations between asset returns and new COVID-19 cases and deaths during the pandemic period
baseline_corr = baseline2020[["n225_return", "btc_return", "gold_return", "ewj_return", "prjpx_return", 
                  "kadokawa_return", "keyence_return", "daikin_return", "makita_return", "takeda_return", "confirmed_inc", "deaths_inc"]].corr()

fig, ax = plt.subplots(figsize=(16,5)) 
sns.heatmap(baseline_corr, annot=True, ax = ax)

<a id="section-four"></a>
# 4. Portfolio Optimization - Variance, Covariance, Mean, and Sharpe Ratio

<a id="subsection-four-one"></a>
## 4.1 More Details about Benchmarks

In [None]:
# Basic statistics about EWJ
ewj['ewj_return'].describe()

In [None]:
# Annualized mean daily return of EWJ (assuming 250 trading days)
ewj['ewj_return'].mean() * 250

In [None]:
# Basic statistics about PRJPX
prjpx['prjpx_return'].describe()

In [None]:
# Annualized mean daily return of PRJPX (assuming 250 trading days)
prjpx['prjpx_return'].mean() * 250

In [None]:
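# Benchmark Sharpe ratios (annualized, assuming 250 trading days)
# The daily standard deviation is scaled by sqrt(250) to match the annualized return
rf = 0.0002  # risk-free rate, treated as an annual rate

# EWJ
ewj_sharpe = (baseline2020['ewj_return'].mean() * 250 - rf) / (baseline2020['ewj_return'].std() * np.sqrt(250))
print("EWJ Sharpe Ratio: ", ewj_sharpe)

# PRJPX
prjpx_sharpe = (baseline2020['prjpx_return'].mean() * 250 - rf) / (baseline2020['prjpx_return'].std() * np.sqrt(250))
print("PRJPX Sharpe Ratio: ", prjpx_sharpe)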
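
The Sharpe ratio divides the excess return by the volatility, and both must be on the same time scale. A minimal sketch under the notebook's conventions (250 trading days, `rf = 0.0002` treated as an annual rate) scales the mean daily return by 250 and the daily standard deviation by $\sqrt{250}$. The function name `annualized_sharpe` and the demo returns are hypothetical.

```python
import numpy as np
import pandas as pd


def annualized_sharpe(daily_returns, rf=0.0002, periods=250):
    """Annualized Sharpe ratio from daily returns (hypothetical helper)."""
    # Drop missing values such as the first row produced by pct_change()
    daily_returns = pd.Series(daily_returns).dropna()
    # Annualize the mean linearly and the standard deviation by the square root of time
    annual_return = daily_returns.mean() * periods
    annual_vol = daily_returns.std() * np.sqrt(periods)
    return (annual_return - rf) / annual_vol


# Hypothetical daily returns, for illustration only
demo_returns = [0.01, -0.01, 0.02, 0.00]
demo_sharpe = annualized_sharpe(demo_returns)
print(demo_sharpe)
```

For the benchmarks, a call such as `annualized_sharpe(baseline2020['ewj_return'])` would apply the same calculation.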



<a id="subsection-four-two"></a>
## 4.2 Mean Return Vector

In [None]:
# Calculate the Mean Return Vector from the merged frame, so that every asset uses the same dates
mean = pd.DataFrame({
        'BTC':[baseline2020['btc_return'].mean()],
        'Kadokawa':[baseline2020['kadokawa_return'].mean()],
        'Keyence':[baseline2020['keyence_return'].mean()],
        'Daikin':[baseline2020['daikin_return'].mean()],
        'Gold':[baseline2020['gold_return'].mean()],
        'Makita':[baseline2020['makita_return'].mean()],
        'Takeda':[baseline2020['takeda_return'].mean()],
        'EWJ':[baseline2020['ewj_return'].mean()]
    })
print(mean)


<a id="subsection-four-three"></a>
## 4.3 Covariance Matrix

In [None]:
# List all target assets' daily returns
# All columns come from the merged frame so that rows are aligned by date, not by row position
data = pd.DataFrame({
        'BTC':baseline2020['btc_return'][1:],
        'Kadokawa':baseline2020['kadokawa_return'][1:],
        'Keyence':baseline2020['keyence_return'][1:],
        'Daikin':baseline2020['daikin_return'][1:],
        'Gold':baseline2020['gold_return'][1:],
        'Makita':baseline2020['makita_return'][1:],
        'Takeda':baseline2020['takeda_return'][1:],
        'EWJ':baseline2020['ewj_return'][1:]
    })
print(data)


In [None]:
# Calculate the Covariance Matrix
print(data.cov())
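
Given a mean return vector $\mu$ and a covariance matrix $\Sigma$, a portfolio with weights $w$ has expected return $w^\top \mu$ and variance $w^\top \Sigma w$. The sketch below applies these formulas to two hypothetical assets with equal weights and annualizes with the 250-day convention used above. The demo returns, the weights, and the variable names are assumptions for illustration, not the notebook's portfolio.

```python
import numpy as np
import pandas as pd

# Hypothetical daily returns for two assets, for illustration only
demo_data = pd.DataFrame({
    'A': [0.01, -0.02, 0.03, 0.00],
    'B': [0.00, 0.01, -0.01, 0.02],
})
weights = np.array([0.5, 0.5])  # assumed equal weights that sum to 1
rf = 0.0002                     # risk-free rate, treated as an annual rate

mean_vector = demo_data.mean().to_numpy()  # daily mean return per asset
cov_matrix = demo_data.cov().to_numpy()    # daily covariance matrix

# Annualized portfolio return, volatility, and Sharpe ratio
portfolio_return = (weights @ mean_vector) * 250
portfolio_vol = np.sqrt((weights @ cov_matrix @ weights) * 250)
portfolio_sharpe = (portfolio_return - rf) / portfolio_vol

print(portfolio_return, portfolio_vol, portfolio_sharpe)
```

The same three lines can be applied to `data.mean()` and `data.cov()` once a weight vector with one entry per column of `data` is chosen.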

<a id="section-five"></a>
# 5. REFERENCES

[Yahoo Finance](https://finance.yahoo.com/)  
[Kadokawa Corporation](https://ir.kadokawa.co.jp/global/businessoverview02.php)  
[COVID-19 by country - Daily update](https://www.kaggle.com/jcsantiago/covid19-by-country-with-government-response)  
[Bloomberg](https://www.bloomberg.com/asia)