![](https://www.syl.ru/misc/i/ai/425996/2860180.jpg)

## 1. Motivation

A stock exchange reflects how confident investors are in the current economy and its productivity.
There are dozens of major exchanges around the world.

Are they related to one another?

Stock exchanges are secondary markets, where existing owners of shares can transact with potential buyers. 

##### references:

* https://www.investopedia.com/articles/investing/082614/how-stock-market-works.asp

* https://en.wikipedia.org/wiki/Stock_exchange

`TODO`: current competition and cooperation among those exchanges

## 2. Dataset

##### stock markets

* https://www.kaggle.com/mattiuzc/stock-exchange-data

Daily price data for indexes tracking stock exchanges from all over the world (United States, China, Canada, Germany, Japan, and more). 

The data was all collected from Yahoo Finance, which had several decades of data available for most exchanges.

Prices are quoted in terms of the national currency of where each exchange is located.

`TODO`

##### foreign exchange rate

* https://www.kaggle.com/dhruvildave/currency-exchange-rates

* https://www.kaggle.com/brunotly/foreign-exchange-rates-per-dollar-20002019

* https://fred.stlouisfed.org/series/DEXCHUS

* https://fred.stlouisfed.org/series/DEXUSEU

* https://fred.stlouisfed.org/series/DEXJPUS



In [None]:
import numpy as np 
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

import seaborn as sns
import plotly as py
import plotly.graph_objs as go
import plotly.express as px
from plotly.offline import init_notebook_mode
init_notebook_mode(connected = True)


In [None]:
stock = pd.read_csv('../input/stockexchangesrelationshipsresearch/stock_exchange_data/indexData.csv')

In [None]:
stock.sample(10)

In [None]:
stock.info()

stock.describe()

grouped = stock.groupby('Index')
grouped.describe()

## 3. Preprocessing

In [None]:
# automatically infer the most suitable dtype
d = stock.convert_dtypes()
d.info()

In [None]:
# manually convert the datestring to datetime
d.Date = pd.to_datetime(d.Date)  # d.Date = d.Date.astype('datetime64[ns]')
d.info()

In [None]:
d.dropna(inplace=True)

In [None]:
d.info()

## 4. EDA/Visualization

### 4.1. Overview

```python
import plotly.graph_objects as go

fig = go.Figure(data=go.Ohlc(x=d.Date, open=d.Open, high=d.High, low=d.Low, close=d.Close))
fig.show()
```

[![fm3m79.png](https://z3.ax1x.com/2021/08/06/fm3m79.png)](https://imgtu.com/i/fm3m79)

### 4.2. Adjusted Close

Stock values are stated in terms of the closing price and the adjusted closing price. 

The closing price is the raw price, which is just the cash value of the last transacted price before the market closes. 

The adjusted closing price amends the closing price to account for anything that affects the stock's value after the market closes, such as dividends and stock splits.
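As a rough illustration of what such an adjustment does, the sketch below back-adjusts raw closes for a hypothetical 2-for-1 split. The prices, the split date, and the helper name `adjust_for_split` are made up for illustration; real data providers apply their own methodology.

```python
import pandas as pd

def adjust_for_split(close: pd.Series, split_date: str, ratio: float) -> pd.Series:
    """Back-adjust closes before `split_date` for a `ratio`-for-1 split (hypothetical helper)."""
    adjusted = close.astype(float).copy()
    before_split = adjusted.index < pd.Timestamp(split_date)
    adjusted[before_split] = adjusted[before_split] / ratio
    return adjusted

# Made-up raw closes: the price halves on 2020-01-03 only because of a 2-for-1 split
raw = pd.Series(
    [100.0, 102.0, 51.5, 52.0],
    index=pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03", "2020-01-06"]),
)
adj = adjust_for_split(raw, "2020-01-03", ratio=2)
print(adj)
```

After the adjustment the series no longer shows an artificial drop on the split date, so day-to-day changes stay comparable.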

```python
# sns.relplot(data=d, x='Date', y='Adj Close', hue='Index', height=18, aspect=16/9)
```

[![fm8iEd.png](https://z3.ax1x.com/2021/08/06/fm8iEd.png)](https://imgtu.com/i/fm8iEd)

Can you spot the Great Recession?

### 4.3. Unify values with foreign exchange rates

In [None]:
d.Index.unique()

| Index     | Fullname                                                     | Region             | Currency | link                                                         |
| --------- | ------------------------------------------------------------ | ------------------ | -------- | ------------------------------------------------------------ |
| NYA       | NYSE Composite Index, New York Stock Exchange                | US                 | USD      | https://en.wikipedia.org/wiki/New_York_Stock_Exchange        |
| IXIC      | NASDAQ Composite Index, National Association of Securities Dealers Automated Quotations | US                 | USD      | https://en.wikipedia.org/wiki/Nasdaq_Composite               |
| 000001.SS | SSE Composite Index                                          | Shanghai, China    | CNY      | https://en.wikipedia.org/wiki/SSE_Composite_Index            |
| 399001.SZ | SZSE Component Index                                         | Shenzhen, China    | CNY      | https://en.wikipedia.org/wiki/SZSE_Component_Index           |
| N100      | Euronext 100, pan-European exchange                          | Europe             | EUR      | https://en.wikipedia.org/wiki/Euronext_100                   |
| N225      | Nikkei 225, Tokyo Stock Exchange                             | Japan              | JPY      | https://en.wikipedia.org/wiki/Nikkei_225                     |
| HSI       | Hang Seng Index                                              | Hong Kong, China   | HKD      | https://en.wikipedia.org/wiki/Hang_Seng_Index                |
| TWII      | Taiwan Capitalization Weighted Stock Index, TWSE             | Taiwan, China      | TWD      | https://en.wikipedia.org/wiki/Taiwan_Capitalization_Weighted_Stock_Index |
| J203.JO   | FTSE/JSE All Share Index, Johannesburg Stock Exchange        | Johannesburg, South Africa | ZAR | https://en.wikipedia.org/wiki/JSE_Limited                  |
| GSPTSE    | S&P/TSX Composite index, Toronto Stock Exchange 300 Composite Index | Toronto, Canada    | CAD      | https://en.wikipedia.org/wiki/S%26P/TSX_Composite_Index      |
| NSEI      | NIFTY 50, National Stock Exchange of India                   | Mumbai, India      | INR      | https://en.wikipedia.org/wiki/National_Stock_Exchange_of_India |
| GDAXI     | Deutscher Aktienindex (German stock index)                   | Frankfurt, Germany | EUR      | https://en.wikipedia.org/wiki/DAX                            |
| KS11      | Korea SE Kospi Index                                         | South Korea        | KRW      | https://en.wikipedia.org/wiki/KOSPI                          |



In [None]:
dex_chus = pd.read_csv('../input/stockexchangesrelationshipsresearch/foreign_exchange_rate/DEX_CHUS.csv')
dex_euus = pd.read_csv('../input/stockexchangesrelationshipsresearch/foreign_exchange_rate/DEX_USEU.csv')  # USEU: USD per EUR
dex_jpus = pd.read_csv('../input/stockexchangesrelationshipsresearch/foreign_exchange_rate/DEX_JPUS.csv')

#### 4.3.1. China mainland (399001.SZ & 000001.SS)

In [None]:
dex_chus.head()

In [None]:
dex_chus.shape

In [None]:
dex_chus.info()

In [None]:
dex_chus.DATE = pd.to_datetime(dex_chus.DATE)
dex_chus.head()

In [None]:
# plt.scatter(x=dex_chus.DATE, y=dex_chus.DEXCHUS)

In [None]:
sz = d[d['Index'] == '399001.SZ']
sh = d[d['Index'] == '000001.SS']
print(sz.head())
print(sh.head())

In [None]:
dex_chus.DATE = pd.to_datetime(dex_chus.DATE)
sz_us = sz.merge(right=dex_chus, left_on='Date', right_on='DATE', how='inner')
sz_us.info()
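An inner join silently drops trading days that have no matching exchange-rate observation. A quick way to see how many rows are lost is an outer merge with `indicator=True`. The frames below are hypothetical stand-ins for `sz` and `dex_chus`, with made-up values.

```python
import pandas as pd

# Hypothetical stand-ins for the index prices and the exchange-rate table
prices = pd.DataFrame({
    "Date": pd.to_datetime(["2021-01-04", "2021-01-05", "2021-01-06"]),
    "Adj Close": [14000.0, 14100.0, 14050.0],
})
rates = pd.DataFrame({
    "DATE": pd.to_datetime(["2021-01-05", "2021-01-06", "2021-01-07"]),
    "DEXCHUS": [6.46, 6.45, 6.47],
})

# indicator=True adds a `_merge` column: 'both', 'left_only' or 'right_only'
check = prices.merge(rates, left_on="Date", right_on="DATE", how="outer", indicator=True)
match_counts = check["_merge"].value_counts()
print(match_counts)
```

Rows marked `left_only` are trading days without an exchange rate; they are the ones an inner join would discard.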

In [None]:
sz_us.head()

In [None]:
# drop the DATE column, which duplicates Date
sz_us.drop(axis=1, labels='DATE', inplace=True)

An exception is raised because of invalid values. Next, we apply a custom function based on a regular expression to filter out the invalid rows.

In [None]:
import re

def rg_convert(s):
    # keep strings that start like a decimal number (e.g. '6.4605'); anything else becomes None
    return float(s) if re.match(r'[0-9]+\.[0-9]{,4}', s) else None

In [None]:
sz_us.DEXCHUS = sz_us.DEXCHUS.apply(rg_convert)
sz_us.info()

In [None]:
sz_us.dropna(inplace=True)
sz_us.info()
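An alternative to the regular-expression filter, sketched here rather than used in this notebook: `pd.to_numeric` with `errors='coerce'` turns anything that cannot be parsed into `NaN`, which `dropna` then removes. The sample values are made up, assuming missing observations appear as a placeholder string such as `'.'`.

```python
import pandas as pd

# Made-up sample of a rate column read as strings, with '.' marking a missing observation
raw_rates = pd.Series(["6.4605", ".", "6.4710", "", "6.4512"])

# errors='coerce' converts unparseable entries to NaN instead of raising
numeric_rates = pd.to_numeric(raw_rates, errors="coerce")
valid_rates = numeric_rates.dropna()
print(valid_rates)
```

This avoids maintaining a pattern by hand, at the cost of accepting any string pandas can parse as a number.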

In [None]:
sz_us['AdjustedCloseInUSD'] = sz_us['Adj Close'] / sz_us.DEXCHUS

In [None]:
sz_us.head()

In [None]:
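sh_us = sh.merge(right=dex_chus, left_on='Date', right_on='DATE', how='inner')  # inner join: keep only dates present in both tables
sh_us.drop(axis=1, labels='DATE', inplace=True)
sh_us.DEXCHUS = sh_us.DEXCHUS.apply(rg_convert)
sh_us.dropna(inplace=True)  # drop rows whose exchange rate was invalid, as done for sz_us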

sh_us['AdjustedCloseInUSD'] = sh_us['Adj Close'] / sh_us.DEXCHUS
print(sh_us.info())
print(sh_us.head())

#### 4.3.2. Euro (N100)

In [None]:
dex_euus.info()

In [None]:
dex_euus.DATE = pd.to_datetime(dex_euus.DATE)

In [None]:
dex_euus['DEXUSEU'] = dex_euus['DEXUSEU'].apply(rg_convert)
dex_euus.dropna(inplace=True)
dex_euus.info()

In [None]:
dex_euus.head()

In [None]:
eu = d[d['Index'] == 'N100']
eu_us = eu.merge(right=dex_euus, left_on='Date', right_on='DATE', how='inner')  # inner join: keep only dates present in both tables
eu_us.drop(axis=1, labels='DATE', inplace=True)
eu_us['AdjustedCloseInUSD'] = eu_us['Adj Close'] * eu_us.DEXUSEU  # DEXUSEU is quoted as USD per EUR, so multiply instead of divide

In [None]:
print(eu_us.info())
print(eu_us.head())
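The direction of the quote decides whether to multiply or divide. The sketch below uses a hypothetical helper `to_usd` and made-up numbers: a rate quoted as local currency per USD (like `DEXCHUS` or `DEXJPUS`) divides the price, while a rate quoted as USD per local currency (like `DEXUSEU`) multiplies it.

```python
def to_usd(price: float, rate: float, usd_per_local: bool) -> float:
    """Convert a local-currency price to USD (hypothetical helper).

    usd_per_local=True  -> rate is USD per 1 unit of local currency (multiply)
    usd_per_local=False -> rate is local currency per 1 USD (divide)
    """
    return price * rate if usd_per_local else price / rate

# Made-up values for illustration
eur_price_in_usd = to_usd(1000.0, 1.25, usd_per_local=True)   # 1000 EUR at 1.25 USD/EUR
cny_price_in_usd = to_usd(1300.0, 6.5, usd_per_local=False)   # 1300 CNY at 6.5 CNY/USD
print(eur_price_in_usd, cny_price_in_usd)
```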

#### 4.3.3. Japan (N225)

In [None]:
dex_jpus.info()

In [None]:
dex_jpus.DATE = pd.to_datetime(dex_jpus.DATE)

In [None]:
dex_jpus['DEXJPUS'] = dex_jpus['DEXJPUS'].apply(rg_convert)
dex_jpus.dropna(inplace=True)
dex_jpus.info()

In [None]:
jp = d[d['Index'] == 'N225']
jp_us = jp.merge(right=dex_jpus, left_on='Date', right_on='DATE', how='inner')  # inner join: keep only dates present in both tables
jp_us.drop(axis=1, labels='DATE', inplace=True)
jp_us['AdjustedCloseInUSD'] = jp_us['Adj Close'] / jp_us.DEXJPUS

In [None]:
print(jp_us.info())
print(jp_us.head())

#### 4.3.4. US (NYA, IXIC)

In [None]:
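# .copy() gives independent frames, so adding a column does not trigger SettingWithCopyWarning
nya = d[d['Index'] == 'NYA'].copy()
ixic = d[d['Index'] == 'IXIC'].copy()

# already quoted in USD, so no conversion is needed
nya['AdjustedCloseInUSD'] = nya['Adj Close']
ixic['AdjustedCloseInUSD'] = ixic['Adj Close']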

#### 4.3.5. Put all unified data together

In [None]:
sz_us.drop(axis=1, labels='DEXCHUS', inplace=True)
sh_us.drop(axis=1, labels='DEXCHUS', inplace=True)
eu_us.drop(axis=1, labels='DEXUSEU', inplace=True)
jp_us.drop(axis=1, labels='DEXJPUS', inplace=True)


In [None]:
# keep Date and the USD-denominated close from each dataset, renamed so the columns stay distinct after merging
ny = nya.loc[:, ['Date', 'AdjustedCloseInUSD']].rename(columns={'AdjustedCloseInUSD': 'ny'})
na = ixic.loc[:, ['Date', 'AdjustedCloseInUSD']].rename(columns={'AdjustedCloseInUSD': 'na'})
sz = sz_us.loc[:, ['Date', 'AdjustedCloseInUSD']].rename(columns={'AdjustedCloseInUSD': 'sz'})
sh = sh_us.loc[:, ['Date', 'AdjustedCloseInUSD']].rename(columns={'AdjustedCloseInUSD': 'sh'})
eu = eu_us.loc[:, ['Date', 'AdjustedCloseInUSD']].rename(columns={'AdjustedCloseInUSD': 'eu'})
jp = jp_us.loc[:, ['Date', 'AdjustedCloseInUSD']].rename(columns={'AdjustedCloseInUSD': 'jp'})

In [None]:
print(ny.info())
print(na.info())
print(sz.info())
print(sh.info())
print(eu.info())
print(jp.info())

In [None]:
data = ny.merge(na, on='Date').merge(sz, on='Date').merge(sh, on='Date').merge(eu, on='Date').merge(jp, on='Date') # default is inner join
data.info()
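The chained `merge` calls can also be written as a fold over a list of frames, which scales more easily if further indexes are added. This is a sketch with hypothetical toy frames, not a replacement for the cell above.

```python
from functools import reduce

import pandas as pd

# Hypothetical toy frames, each with a Date column and one value column
dates = pd.to_datetime(["2021-01-04", "2021-01-05", "2021-01-06"])
frames = [
    pd.DataFrame({"Date": dates, "ny": [1.0, 2.0, 3.0]}),
    pd.DataFrame({"Date": dates[1:], "jp": [20.0, 30.0]}),
    pd.DataFrame({"Date": dates[:2], "eu": [100.0, 200.0]}),
]

# Inner-join all frames on Date, one after another
merged = reduce(lambda left, right: left.merge(right, on="Date", how="inner"), frames)
print(merged)
```

Only dates present in every frame survive, which is why the combined table can be much shorter than any single input.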

In [None]:
data.head()

In [None]:
data.set_index(keys='Date', inplace=True)

In [None]:
data.plot()

In [None]:
sns.pairplot(data)

`TODO` explain briefly what correlation is and what it is used for

In [None]:
# correlation
corr = data.corr()
print(corr)
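`DataFrame.corr` computes the Pearson correlation coefficient by default: the covariance of two series divided by the product of their standard deviations, which gives a value between -1 and 1. The sketch below recomputes it by hand on made-up numbers to show what the matrix entries mean; the helper `pearson` is hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up series: b rises with a, c falls as a rises
toy = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [2.1, 3.9, 6.2, 8.1, 9.8],
    "c": [5.0, 4.0, 3.0, 2.0, 1.0],
})

def pearson(x: pd.Series, y: pd.Series) -> float:
    """Pearson r = cov(x, y) / (std(x) * std(y))."""
    return float(x.cov(y) / (x.std() * y.std()))

r_ab = pearson(toy["a"], toy["b"])
r_ac = pearson(toy["a"], toy["c"])
print(r_ab, r_ac)
print(toy.corr())
```

A value near 1 means the two series tend to move together, a value near -1 means they move in opposite directions, and a value near 0 means no linear relationship.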

In [None]:
sns.heatmap(corr)

Implications:

1. Exchanges in the same region move at a similar pace, e.g. ny and na, sz and sh.
2. The Japanese stock market moves very closely with the American stock market: the correlation between ny and jp is 0.926.
3. The American stock market appears to have less influence on the European stock market.
4. The Chinese stock market follows its own pace.
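Correlations between price levels can be inflated when both series simply trend over time. One common cross-check, sketched here on synthetic data rather than on the notebook's `data` frame, is to correlate daily percentage changes instead. The random-walk series and the seed are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 500

# Two synthetic, independently generated price paths with the same upward drift
prices = pd.DataFrame({
    "x": 100 * np.cumprod(1 + 0.001 + 0.01 * rng.standard_normal(n)),
    "y": 100 * np.cumprod(1 + 0.001 + 0.01 * rng.standard_normal(n)),
})

level_corr = prices.corr().loc["x", "y"]
# pct_change gives day-over-day returns; the first row is NaN and is dropped
return_corr = prices.pct_change().dropna().corr().loc["x", "y"]
print(f"levels: {level_corr:.3f}, returns: {return_corr:.3f}")
```

The same two lines applied to `data` would show whether the relationships listed above also hold for daily movements.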

## 5. Evaluation/Limitation

#### market
1. administrative differences / regulatory policies
2. political policies, such as trade wars and tariffs
3. exchange focus: each exchange has its own focus and may have a different appetite

#### dataset
1. market volumes
2. missing value / period
3. year/month/day change
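
The missing-period limitation can be quantified by looking at the date range each index covers. A minimal sketch, assuming a long-format frame with `Index` and `Date` columns like `d`; the rows below are made up.

```python
import pandas as pd

# Made-up rows in the same long format as the stock data
toy = pd.DataFrame({
    "Index": ["NYA", "NYA", "NYA", "N100", "N100"],
    "Date": pd.to_datetime(
        ["1990-01-02", "2000-01-03", "2021-05-28", "2000-01-03", "2021-05-28"]
    ),
})

# First date, last date and number of observations per index
coverage = toy.groupby("Index")["Date"].agg(first="min", last="max", rows="count")

# The common window is bounded by the latest start and the earliest end
common_start = coverage["first"].max()
common_end = coverage["last"].min()
print(coverage)
print(common_start, common_end)
```

Because the merged table is built with inner joins, the analysis effectively covers only this common window.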

