# Table of contents
1. [Introduction](#introduction)
2. [Data Analysis](#data_analysis)
    1. [Libraries](#libraries)
    2. [Data Paths](#data_paths)
    3. [Data Cleaning and Engineering](#data_cleaning_and_engineering)
    4. [Statistics](#statistics)
3. [Data Exploration](#data_exploration)
    1. [Currency Strength in relation to Gold](#curr_strength)
    2. [Intermarket Relations [Commodities]](#intermarket_relations)
    3. [Gold / Oil Ratio](#gold_oil_ratio)
    4. [Risk Appetite](#risk_appetite)
        1. [Volatility Index, Equity Index and USD/JPY](#VIX_S&P_USDJPY) 
        2. [Anomaly Detection in Market Risk Appetite](#anomaly_detection)
    

## **Introduction** <a name="introduction"></a>
Forex is a large global market in which currencies are traded against one another. As the largest market in the world, it boasts a trading volume of almost $7 trillion in a single day. With the rise of AI and Machine Learning, many have attempted to predict future currency prices, but few have had much success.

Predicting the financial markets is akin to predicting the future. With so many unknown and unpredictable factors, building a machine learning model that reliably predicts future events is unlikely (for now). Therefore, instead of trying to predict future prices, this notebook provides a simple analysis of the currency market (only some currency pairs are analysed), its correlations with the wider market and some recent trends that we have been observing.

<img src= "https://www.hedgethink.com/wp-content/uploads/2020/09/e4367cfd7833fa17a680a30b7c32cd9f-1024x576.jpg" alt ="Forex" style='width: 200px;'>
<sub>Source : https://www.hedgethink.com/wp-content/uploads/2020/09/e4367cfd7833fa17a680a30b7c32cd9f-1024x576.jpg</sub>

## **Data Analysis** <a name="data_analysis"></a>
In this section we do a simple analysis of our data that may help us later on in **Data Exploration**.

### Libraries <a name="libraries"></a>

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

In [None]:
import numpy as np
import pandas as pd
import os
from datetime import datetime

import plotly.graph_objects as go
import matplotlib.pyplot as plt
import plotly.express as px

from plotly.subplots import make_subplots

from scipy import stats
import seaborn as sns

### Data Paths <a name="data_paths"></a> 


In [None]:
# Data Paths
daily_eurusd_df = pd.read_csv("../input/xauusdxaueureurusd-daily/data/EUR_USD Historical Data.csv")
xau_eur_df = pd.read_csv("../input/xauusdxaueureurusd-daily/data/XAU_EUR Historical Data (2).csv")
xau_usd_df = pd.read_csv("../input/xauusdxaueureurusd-daily/data/XAU_USD Historical Data (1).csv")
oil_df = pd.read_csv("../input/crude-oil-prices/Oil_Prices.csv")
usd_index_df = pd.read_csv("../input/us-dollar-index/US Dollar Index Futures Historical Data.csv")
us_interest_rates = pd.read_csv("../input/historical-fed-funds/fed-funds-rate-historical-chart_Mar2021.csv")
gold_prices_df = pd.read_csv("../input/gold-and-silver-prices-dataset/gold_price.csv")
daily_usdjpy_df = pd.read_csv("../input/usdjpy-historical-data-2014-2021/USD_JPY Historical Data.csv")
vix_df = pd.read_csv("../input/cboe-vix-historical-data-2014-2021/CBOE Volatility Index Historical Data.csv")
snp_500_df = pd.read_csv("../input/sp-500-historical-data-2014-2021/SP 500 Historical Data.csv")

### Data Cleaning and Engineering <a name="data_cleaning_and_engineering"></a>

In [None]:
# Renaming Columns For Merging Later on

daily_eurusd_df.rename(columns = {'Price' : 'EURUSD_Price', 'Open' : 'EURUSD_Open', "High":"EURUSD_High", "Low":"EURUSD_Low", "Change %":"EURUSD_Change%"}, inplace = True)
xau_usd_df.rename(columns = {'Price' : 'XAUUSD_Price', 'Open' : 'XAUUSD_Open', "High":"XAUUSD_High", "Low":"XAUUSD_Low", "Change %":"XAUUSD_Change%"}, inplace = True)
xau_eur_df.rename(columns = {'Price' : 'XAUEUR_Price', 'Open' : 'XAUEUR_Open', "High":"XAUEUR_High", "Low":"XAUEUR_Low", "Change %":"XAUEUR_Change%"}, inplace = True)

In [None]:
def modify_datetime(df_column):
    """
    Changes Date Format from Feb 08, 2020 --> 08/02/2020 [dd/mm/YYYY]
    """
    df_column["Date"] = df_column["Date"].apply(lambda x:datetime.strptime(x.lower().replace(",", ""), "%b %d %Y").strftime("%d/%m/%Y"))
    return df_column["Date"]


def remove_comma(df_column, column_name):
    """
    Removes Comma from Prices E.g [1,234,234 --> 1234234]
    """
    try:
        df_column[column_name] = df_column[column_name].apply(lambda x: x.replace(",", ""))
    except AttributeError:
        # Values are already numeric (no .replace method), so leave them unchanged
        pass
    return df_column[column_name]


daily_eurusd_df["Date"] = modify_datetime(daily_eurusd_df)
print("No. of Data Points (EURUSD) :", len(daily_eurusd_df))

xau_usd_df["Date"] = modify_datetime(xau_usd_df)
print("No. of Data Points (XAUUSD) :", len(xau_usd_df))

xau_eur_df["Date"] = modify_datetime(xau_eur_df)
print("No. of Data Points (XAUEUR) :", len(xau_eur_df))
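The row-by-row `strptime` conversion above can also be written with pandas' vectorised parser. The following is a minimal sketch, assuming the `Date` strings follow the `Feb 08, 2020` pattern mentioned in the docstring; `to_ddmmyyyy` and the sample values are illustrative and not part of the original notebook.

```python
import pandas as pd


def to_ddmmyyyy(dates: pd.Series) -> pd.Series:
    """Convert strings such as 'Feb 08, 2020' to '08/02/2020' (dd/mm/YYYY)."""
    # Parse with an explicit format so ambiguous dates are never guessed
    parsed = pd.to_datetime(dates, format="%b %d, %Y")
    return parsed.dt.strftime("%d/%m/%Y")


# Illustrative sample values (not taken from the datasets)
sample = pd.Series(["Feb 08, 2020", "Jan 02, 2014"])
converted = to_ddmmyyyy(sample)
print(converted.tolist())
```

Keeping the parsed datetime values instead of formatting them back to strings would additionally allow chronological sorting and date-range filtering, at the cost of changing the merge key used in the rest of the notebook.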

In [None]:
# Merging all the Dataframes together 
merge_df = pd.merge(daily_eurusd_df, xau_usd_df, how="outer", on="Date")
merge_df = pd.merge(merge_df, xau_eur_df, how="outer", on="Date")
# Re-formatting Dataframe
merge_df.dropna(inplace=True)
# Reverse the row order and discard the old index
merge_df = merge_df[::-1].reset_index(drop=True)
# Removes Commas from Columns we need
merge_df["XAUUSD_Price"] = remove_comma(merge_df, "XAUUSD_Price")
merge_df["XAUEUR_Price"] = remove_comma(merge_df, "XAUEUR_Price")
# Make an archive/copy of the original dataframe
_merge_df = merge_df.copy()
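After the commas are removed, the gold price columns are still strings, which is why later cells call `astype(float)` on them. A minimal sketch of doing both steps at once is shown below; `prices_to_float` and the sample values are hypothetical and only illustrate the idea.

```python
import pandas as pd


def prices_to_float(prices: pd.Series) -> pd.Series:
    """Strip thousands separators and convert the result to floats."""
    # astype(str) lets the same function handle values that are already numeric
    cleaned = prices.astype(str).str.replace(",", "", regex=False)
    return pd.to_numeric(cleaned)


# Illustrative mix of formatted strings and a plain number
raw = pd.Series(["1,234.50", "987.25", 1500.0])
clean = prices_to_float(raw)
print(clean.tolist())
```

Converting once, right after the merge, would avoid repeating `astype(float)` in the later statistics and plotting cells.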

### Statistics <a name="statistics"></a>

In [None]:
"""Mean Price"""

mean_eurusd = merge_df["EURUSD_Price"].mean()
# np.float was removed in NumPy 1.24; the built-in float is equivalent here
mean_xauusd = merge_df["XAUUSD_Price"].astype(float).mean()
mean_xaueur = merge_df["XAUEUR_Price"].astype(float).mean()


"""Mode Price"""

mode_eurusd = merge_df["EURUSD_Price"].mode().tolist()
mode_xauusd = merge_df["XAUUSD_Price"].mode().astype(float).tolist()
mode_xaueur = merge_df["XAUEUR_Price"].mode().astype(float).tolist()


"""Plotting Candlestick Graphs with Mean and Mode Values"""

fig = go.Figure(data=[go.Candlestick(x=merge_df['Date'],
                open=merge_df['EURUSD_Open'],
                high=merge_df['EURUSD_High'],
                low=merge_df['EURUSD_Low'],
                close=merge_df['EURUSD_Price'],
                name="Candlestick Graph")])

for i in mode_eurusd:
    x = np.array(["02/01/2014", "08/02/2021"])
    y = np.array([i, i])
    fig.add_trace(go.Scatter(x=x, y=y, name="Mode Value(s)",mode='lines'))

x = np.array(["02/01/2014", "08/02/2021"])
y = np.array([mean_eurusd, mean_eurusd])
fig.add_trace(go.Scatter(x=x, y=y, name="Mean Value",line=dict(color='red', width=1.5, dash='dot')))

fig.update_layout(showlegend=True)
fig.update_layout(xaxis_rangeslider_visible=False)
fig.update_layout(height=600, width=1000, title_text="EURUSD Chart")

fig.show()
print("Mean EURUSD Price (Jan 2014 - Feb 2021) :", round(mean_eurusd, 4))
print("Mode EURUSD Price(s) (Jan 2014 - Feb 2021) :", mode_eurusd)

In [None]:
# Overview of Data
sns.displot(merge_df['EURUSD_Price'])

"""
Skewness is a measure of the symmetrical nature of data. 

Kurtosis is a measure of how heavy-tailed or light-tailed the data is relative to a normal distribution.
"""
print("Skewness: %f" % merge_df['EURUSD_Price'].skew())
print("Kurtosis: %f" % merge_df['EURUSD_Price'].kurt())
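To make the two measures concrete, the sketch below computes them on two tiny made-up series. It relies on the fact that pandas' `kurt()` reports excess kurtosis (Fisher's definition), so a normal distribution scores about 0 and negative values indicate lighter tails. The series are illustrative only.

```python
import pandas as pd

# Made-up series: one symmetric, one with a long right tail
symmetric = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
right_skewed = pd.Series([1.0, 1.0, 1.0, 2.0, 10.0])

# Skewness is 0 for symmetric data and positive for a long right tail
print("Symmetric skew:", symmetric.skew())
print("Right-skewed skew:", right_skewed.skew())

# Excess kurtosis below 0 means lighter tails than a normal distribution
print("Symmetric excess kurtosis:", symmetric.kurt())
```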

## **Data Exploration** <a name="data_exploration"></a>

### Currency Strength in relation to Gold <a name="curr_strength"></a>

Currency strength is often used as an indicator to aid in trading. However, there are many ways to define a currency's strength. Some open-source currency strength meters measure the strength of each currency in relation to USD and then rank the major currencies accordingly. This presents a biased view of currency strength because of the heavy influence of USD, and thus of events occurring in the USA. For example, a rise in AUDUSD does not necessarily indicate a strengthening AUD; it could instead reflect a negative fundamental change occurring in the USA. It is therefore preferable to avoid using another currency as the standard for measuring currency strength.

Looking at currencies in terms of Gold may provide a less biased view of currency strength. In this section, we look at how closely EUR/USD is correlated with the strength of EUR and USD measured in Gold.


In [None]:
fig = go.Figure(data=[go.Candlestick(x=merge_df['Date'],
                open=merge_df['XAUUSD_Open'],
                high=merge_df['XAUUSD_High'],
                low=merge_df['XAUUSD_Low'],
                close=merge_df['XAUUSD_Price'],
                name="Candlestick Graph")])

for i in mode_xauusd:
    x = np.array(["02/01/2014", "08/02/2021"])
    y = np.array([i, i])
    fig.add_trace(go.Scatter(x=x, y=y, name="Mode Value(s)",mode='lines'))

x = np.array(["02/01/2014", "08/02/2021"])
y = np.array([mean_xauusd, mean_xauusd])
fig.add_trace(go.Scatter(x=x, y=y, name="Mean Value",line=dict(color='red', width=1.5, dash='dot')))

fig.update_layout(showlegend=True)  
fig.update_layout(xaxis_rangeslider_visible=False)
fig.update_layout(height=600, width=1000, title_text="XAUUSD Chart")

fig.show()
print("Mean XAUUSD Price (Jan 2014 - Feb 2021) :", round(mean_xauusd, 2))
print("Mode XAUUSD Price(s) (Jan 2014 - Feb 2021) :", mode_xauusd)



fig = go.Figure(data=[go.Candlestick(x=merge_df['Date'],
                open=merge_df['XAUEUR_Open'],
                high=merge_df['XAUEUR_High'],
                low=merge_df['XAUEUR_Low'],
                close=merge_df['XAUEUR_Price'],
                name="Candlestick Graph")])

for i in mode_xaueur:
    x = np.array(["02/01/2014", "08/02/2021"])
    y = np.array([i, i])
    fig.add_trace(go.Scatter(x=x, y=y, name="Mode Value(s)",mode='lines'))

x = np.array(["02/01/2014", "08/02/2021"])
y = np.array([mean_xaueur, mean_xaueur])
fig.add_trace(go.Scatter(x=x, y=y, name="Mean Value",line=dict(color='red', width=1.5, dash='dot')))

fig.update_layout(showlegend=True)  
fig.update_layout(xaxis_rangeslider_visible=False)
fig.update_layout(height=600, width=1000, title_text="XAUEUR Chart")

fig.show()
print("Mean XAUEUR Price (Jan 2014 - Feb 2021) :", round(mean_xaueur, 2))
print("Mode XAUEUR Price(s) (Jan 2014 - Feb 2021) :", mode_xaueur)

In [None]:
# Finding the Difference Between XAUUSD and XAUEUR (computed as XAUUSD - XAUEUR)
merge_df["XAUUSD_XAUEUR_Diff_Price"] = (merge_df["XAUUSD_Price"].astype(float) - merge_df["XAUEUR_Price"].astype(float))
# Note: despite the column name, this ratio is computed as XAUUSD / XAUEUR
merge_df["XAUEUR / XAUUSD Price"] = (merge_df["XAUUSD_Price"].astype(float) / merge_df["XAUEUR_Price"].astype(float))

EUR and USD Currency Strength is taken as (XAUUSD - XAUEUR) in this case, matching the `XAUUSD_XAUEUR_Diff_Price` column computed above.

In [None]:
# Create figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])

# Add traces
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df.XAUUSD_XAUEUR_Diff_Price, name="EUR and USD Currency Strength"),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df.EURUSD_Price, name="EUR/USD"),
    secondary_y=True,
)

# Add figure title
fig.update_layout(
    title_text="EUR/USD Versus EUR and USD Currency Strength"
)

# Set x-axis title
fig.update_xaxes(title_text="Date")

# Set y-axes titles
fig.update_yaxes(title_text="<b>EUR and USD Currency Strength</b>", secondary_y=False)
fig.update_yaxes(title_text="<b>EUR/USD</b> Prices", secondary_y=True)

fig.show()

print("Correlation Between Currency Strength of EUR and USD (XAUUSD - XAUEUR) and EUR/USD :", round(stats.pearsonr(merge_df.XAUUSD_XAUEUR_Diff_Price, merge_df.EURUSD_Price)[0],4))

We can see that the currency strength of EUR and USD is closely correlated with EUR/USD prices. From 2015 onwards, the two even appear to move in sync with each other.
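The calculation behind this comparison can be sketched on a handful of made-up quotes. The numbers below are placeholders, and the gold price in USD is constructed from the other two series under the assumption that the three quotes are mutually consistent; real data will only satisfy this approximately.

```python
import numpy as np
from scipy import stats

# Placeholder quotes (not taken from the datasets)
eurusd = np.array([1.10, 1.12, 1.15, 1.13, 1.18])
xaueur = np.array([1500.0, 1510.0, 1490.0, 1520.0, 1530.0])

# Assumption: consistent quotes, so gold in USD = gold in EUR * EUR/USD
xauusd = xaueur * eurusd

# The two gold-based measures used in the notebook
diff = xauusd - xaueur
ratio = xauusd / xaueur

# Pearson correlation of the difference with EUR/USD
r, p_value = stats.pearsonr(diff, eurusd)
print("Correlation of (XAUUSD - XAUEUR) with EUR/USD:", round(r, 4))
print("Ratio equals EUR/USD:", np.allclose(ratio, eurusd))
```

With consistent quotes the ratio reproduces EUR/USD exactly, so the difference is the measure that adds information beyond the exchange rate itself.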

### Intermarket Relations [Commodities] <a name="intermarket_relations"></a>

Commodities such as Gold and Oil have long been used as a gauge for currencies. Whether by affecting currency prices directly or through their correlations with interest rates, commodities appear to play an essential role in gauging future movements in the Forex market. Let us look at how Gold and Oil correlate with some of the data we have at hand.

In [None]:
# Oil Dataframe
oil_df = pd.read_csv("../input/crude-oil-prices/Oil_Prices.csv")
oil_df.rename(columns = {'Close/Last' : 'Oil_Price', 'Volume' : 'Oil_Volume', "Open": "Oil_Open", "High":"Oil_High", "Low":"Oil_Low"}, inplace=True)
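oil_df = oil_df[::-1].reset_index(drop=True)
# Keep rows from position 712 onwards; copy so the Date assignment below does not act on a view
oil_df = oil_df[712:].copy()
oil_df["Date"] = oil_df["Date"].apply(lambda x:datetime.strptime(x, "%m/%d/%Y").strftime("%d/%m/%Y"))
# reset_index returns a new dataframe, so the result has to be assigned back
oil_df = oil_df.reset_index(drop=True)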

# Gold Dataframe
gold_prices_df = pd.read_csv("../input/gold-and-silver-prices-dataset/gold_price.csv")
gold_prices_df.rename(columns={"date":"Date", "price": "Gold_Price"}, inplace=True)
gold_prices_df["Date"] = gold_prices_df["Date"].apply(lambda x:datetime.strptime(x, "%Y-%m-%d").strftime("%d/%m/%Y"))
gold_prices_df = gold_prices_df.dropna()
# us_interest_rates = us_interest_rates[us_interest_rates['Date'].between("02/01/2014", "08/02/2021")]
start_date = "02/01/2014"
end_date = "08/02/2021"
gold_prices_df = gold_prices_df[gold_prices_df[gold_prices_df.Date==(start_date)].index[0] : gold_prices_df[gold_prices_df.Date==(end_date)].index[0]+1].reset_index().drop("index", axis=1)

# USDX Dataframe
usd_index_df = pd.read_csv("../input/us-dollar-index/US Dollar Index Futures Historical Data.csv")
modify_datetime(usd_index_df)
usd_index_df.rename(columns = {'Price' : 'USDX_Price', "Open": "USDX_Open", "High":"USDX_High", "Low":"USDX_Low", "Vol.":"USDX_Vol", "Change %":"USDX_Change%"}, inplace=True)

# US Interest Rates
us_interest_rates = pd.read_csv("../input/historical-fed-funds/fed-funds-rate-historical-chart_Mar2021.csv")
us_interest_rates.rename(columns={"date":"Date", " value": "US_Interest_Rates_Value"}, inplace=True)
us_interest_rates["Date"] = us_interest_rates["Date"].apply(lambda x:datetime.strptime(x, "%m/%d/%Y").strftime("%d/%m/%Y"))
us_interest_rates = us_interest_rates.dropna()
# us_interest_rates = us_interest_rates[us_interest_rates['Date'].between("02/01/2014", "08/02/2021")]
start_date = "02/01/2014"
end_date = "08/02/2021"
us_interest_rates = us_interest_rates[us_interest_rates[us_interest_rates.Date==(start_date)].index[0] : us_interest_rates[us_interest_rates.Date==(end_date)].index[0]+1].reset_index().drop("index", axis=1)

# Merging Dataframe
merge_df = pd.merge(merge_df, oil_df, how="left", on="Date")
merge_df = pd.merge(merge_df, gold_prices_df, how="left", on="Date")
merge_df = pd.merge(merge_df, usd_index_df, how="left", on="Date")
merge_df = pd.merge(merge_df, us_interest_rates, how="left", on="Date")
merge_df["Gold/Oil"] = merge_df["Gold_Price"] / merge_df["Oil_Price"]
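The commented-out `between` calls above would compare `dd/mm/YYYY` strings alphabetically, which does not follow calendar order. A minimal sketch of a date-window filter on parsed dates is given below; the small dataframe and the `price` column are made up for illustration.

```python
import pandas as pd

# Made-up rows around the analysis window
df = pd.DataFrame({
    "Date": ["31/12/2013", "02/01/2014", "08/02/2021", "09/02/2021"],
    "price": [1.0, 2.0, 3.0, 4.0],
})

# Parse the dd/mm/YYYY strings so comparisons follow calendar order
parsed = pd.to_datetime(df["Date"], format="%d/%m/%Y")

# between() is inclusive on both ends by default
mask = parsed.between(pd.Timestamp("2014-01-02"), pd.Timestamp("2021-02-08"))
window = df.loc[mask].reset_index(drop=True)
print(window)
```

Unlike the index-based slicing used above, this does not require the exact start and end dates to be present in the data.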

Let us first look at **Gold's Correlation with our Data**

In [None]:
# Create figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])

# Add traces
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df.Gold_Price, name="Gold Price"),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df.EURUSD_Price, name="EURUSD Price"),
    secondary_y=True,
)

# Add figure title
fig.update_layout(
    title_text="Gold Prices Versus EURUSD Prices"
)

# Set x-axis title
fig.update_xaxes(title_text="Date")

# Set y-axes titles
fig.update_yaxes(title_text="<b>Gold</b> Prices", secondary_y=False)
fig.update_yaxes(title_text="<b>EURUSD</b> Prices", secondary_y=True)

fig.show()




# Create figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])

# Add traces
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df.Gold_Price, name="Gold Price"),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df.XAUUSD_XAUEUR_Diff_Price, name="XAUUSD - XAUEUR Price"),
    secondary_y=True,
)

# Add figure title
fig.update_layout(
    title_text="Gold Prices Versus XAUUSD - XAUEUR Prices"
)

# Set x-axis title
fig.update_xaxes(title_text="Date")

# Set y-axes titles
fig.update_yaxes(title_text="<b>Gold</b> Prices", secondary_y=False)
fig.update_yaxes(title_text="<b>XAUUSD_XAUEUR_Diff</b> Prices", secondary_y=True)

fig.show()




# Create figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])

# Add traces
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df.Gold_Price, name="Gold Price"),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df["XAUEUR / XAUUSD Price"], name="XAUUSD / XAUEUR Price"),
    secondary_y=True,
)

# Add figure title
fig.update_layout(
    title_text="Gold Prices Versus XAUUSD/XAUEUR Prices"
)

# Set x-axis title
fig.update_xaxes(title_text="Date")

# Set y-axes titles
fig.update_yaxes(title_text="<b>Gold</b> Prices", secondary_y=False)
fig.update_yaxes(title_text="<b>XAUUSD/XAUEUR</b> Prices", secondary_y=True)

fig.show()




# Create figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])

# Add traces
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df.Gold_Price, name="Gold Price"),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df.USDX_Price, name="USDX Price"),
    secondary_y=True,
)

# Add figure title
fig.update_layout(
    title_text="Gold Prices Versus USDX Prices"
)

# Set x-axis title
fig.update_xaxes(title_text="Date")

# Set y-axes titles
fig.update_yaxes(title_text="<b>Gold</b> Prices", secondary_y=False)
fig.update_yaxes(title_text="<b>USDX</b> Prices", secondary_y=True)

fig.show()

In [None]:
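# Restrict the correlation matrix to the numeric columns of interest
# (calling corr() on the full dataframe would also hit the string columns and its result was unused)
corr_df = merge_df[["Gold_Price", "USDX_Price", "EURUSD_Price", "XAUEUR / XAUUSD Price", "XAUUSD_XAUEUR_Diff_Price"]]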

# Correlation Heatmap
corrmat = corr_df.corr()
f, ax = plt.subplots(figsize=(12, 9))
ax.set_title("Correlation Heatmap")
sns.heatmap(corrmat, square=True, annot=True)
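Because the commodity data was attached with left merges, some rows may hold missing values. `DataFrame.corr` handles this by excluding missing values pair by pair, so different cells of the heatmap can be based on different numbers of rows. A minimal sketch with made-up columns `a`, `b` and `c`:

```python
import numpy as np
import pandas as pd

# Made-up data; column "a" has one missing value
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, np.nan],
    "b": [2.0, 4.0, 6.0, 8.0, 10.0],
    "c": [5.0, 3.0, 4.0, 1.0, 0.0],
})

# Pearson correlation; rows with a missing value are dropped per column pair
corrmat = df.corr(method="pearson")
print(corrmat)

# Number of usable rows for each column pair
pair_counts = df.notna().astype(int).T @ df.notna().astype(int)
print(pair_counts)
```

Here the `a`/`b` cell uses four rows while the `b`/`c` cell uses five, which is worth keeping in mind when comparing coefficients across cells.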

<ins>**Anomalies**</ins>

Interestingly, Gold has only around a 0.15 correlation with EURUSD and with XAUUSD / XAUEUR, but a 0.58 correlation with XAUUSD - XAUEUR. From the graphs we can clearly see XAUUSD - XAUEUR diverging from both EURUSD and XAUUSD / XAUEUR during the following periods:

- 2016 Jun - 2016 Dec
- 2020 Mar - 2021 Feb

A move in Gold affects XAUUSD - XAUEUR far more than it affects XAUUSD / XAUEUR or EUR/USD: the difference between the two gold prices scales with the level of the gold price, whereas dividing one by the other largely cancels that level out. Considering some of the global events that occurred during these 2 periods, it is easy to understand why the correlations differ so much between 2 otherwise very similar measures.
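The amplification can be made explicit. The following is a sketch under the assumption that the three quotes are mutually consistent (no arbitrage), so that the USD price of gold equals the EUR price of gold multiplied by EUR/USD. The symbols $G_{USD}$ (XAUUSD), $G_{EUR}$ (XAUEUR) and $E$ (EUR/USD) are introduced here for illustration only:

$$
G_{USD} = E \cdot G_{EUR}
\quad\Longrightarrow\quad
\frac{G_{USD}}{G_{EUR}} = E,
\qquad
G_{USD} - G_{EUR} = G_{EUR}\,(E - 1)
$$

Under that assumption the ratio reduces to EUR/USD itself, while the difference carries the gold price level $G_{EUR}$ as a multiplier. A rally or sell-off in Gold therefore moves the difference even when EUR/USD is unchanged, which would be consistent with the difference correlating more strongly with Gold than the ratio does.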

**Global Events During these Periods**

- 2016 Jun - 2016 Dec
    - Brexit did not create as much of an issue as expected, and as such the financial markets did not take a big hit. The stock market rally during this period may have contributed greatly to the fall in the price of Gold.
    
- 2020 Mar - 2021 Feb
    - COVID-19's initial toll on the stock market and the ensuing stock market rally



<ins>**Insights and Findings**</ins>

Gold's low correlation to the Euro and Dollar is largely expected as Gold's price is mostly determined by its underlying demand and supply.

Although Gold historically acts as a hedge to inflation and the dollar (Gold usually rises dramatically as the dollar falls), its low correlation to the dollar may suggest a general positive outlook in the US economy over this time period. This is especially so from late 2018 to the early 2020's where Gold moved almost in tandem with USDX prices.



It seems that Gold's correlation with our data is rather weak. Now let us look at **Oil's Correlation with our Data**

In [None]:
# Create figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])

# Add traces
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df.Oil_Price, name="Oil Price"),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df.EURUSD_Price, name="EURUSD Price"),
    secondary_y=True,
)

# Add figure title
fig.update_layout(
    title_text="Oil Prices Versus EURUSD Prices"
)

# Set x-axis title
fig.update_xaxes(title_text="Date")

# Set y-axes titles
fig.update_yaxes(title_text="<b>Oil</b> Prices", secondary_y=False)
fig.update_yaxes(title_text="<b>EURUSD</b> Prices", secondary_y=True)

fig.show()




# Create figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])

# Add traces
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df.Oil_Price, name="Oil Price"),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df.XAUUSD_XAUEUR_Diff_Price, name="XAUUSD - XAUEUR Price"),
    secondary_y=True,
)

# Add figure title
fig.update_layout(
    title_text="Oil Prices Versus XAUUSD - XAUEUR Prices"
)

# Set x-axis title
fig.update_xaxes(title_text="Date")

# Set y-axes titles
fig.update_yaxes(title_text="<b>Oil</b> Prices", secondary_y=False)
fig.update_yaxes(title_text="<b>XAUUSD_XAUEUR_Diff</b> Prices", secondary_y=True)

fig.show()




# Create figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])

# Add traces
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df.Oil_Price, name="Oil Price"),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df["XAUEUR / XAUUSD Price"], name="XAUUSD / XAUEUR Price"),
    secondary_y=True,
)

# Add figure title
fig.update_layout(
    title_text="Oil Prices Versus XAUUSD/XAUEUR Prices"
)

# Set x-axis title
fig.update_xaxes(title_text="Date")

# Set y-axes titles
fig.update_yaxes(title_text="<b>Oil</b> Prices", secondary_y=False)
fig.update_yaxes(title_text="<b>XAUUSD/XAUEUR</b> Prices", secondary_y=True)

fig.show()




# Create figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])

# Add traces
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df.Oil_Price, name="Oil Price"),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df.USDX_Price, name="USDX Price"),
    secondary_y=True,
)

# Add figure title
fig.update_layout(
    title_text="Oil Prices Versus USDX Prices"
)

# Set x-axis title
fig.update_xaxes(title_text="Date")

# Set y-axes titles
fig.update_yaxes(title_text="<b>Oil</b> Prices", secondary_y=False)
fig.update_yaxes(title_text="<b>USDX</b> Prices", secondary_y=True)

fig.show()

In [None]:
# Restrict the correlation matrix to the numeric columns of interest
corr_df = merge_df[["Oil_Price", "USDX_Price", "EURUSD_Price", "XAUEUR / XAUUSD Price", "XAUUSD_XAUEUR_Diff_Price"]]

# Correlation Heatmap
corrmat = corr_df.corr()
f, ax = plt.subplots(figsize=(12, 9))
ax.set_title("Correlation Heatmap")

sns.heatmap(corrmat, square=True, annot=True)

<ins>**Insights and Findings**</ins>

Compared to Gold, at first glance there aren't many anomalies in the relationship between Oil and the Euro or Dollar that remain to be discussed.

Oil is generally negatively correlated with the Dollar and moves alongside EUR/USD. With Oil priced in USD and the USA being a net importer of Oil, it is easy to see how this negative correlation between the 2 is established.

Of course, Oil has interesting correlations with other currencies as well. However, these will be discussed in further EDAs that go in depth into the specific commodities/markets/assets that affect the Forex markets.

### Gold / Oil Ratio <a name="gold_oil_ratio"></a>



In [None]:
# Create figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])

# Add traces
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df.Gold_Price, name="Gold Price"),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df.Oil_Price, name="Oil Price"),
    secondary_y=True,
)

# Add figure title
fig.update_layout(
    title_text="Gold Prices Versus Oil Prices"
)

# Set x-axis title
fig.update_xaxes(title_text="Date")

# Set y-axes titles
fig.update_yaxes(title_text="<b>Gold</b> Prices", secondary_y=False)
fig.update_yaxes(title_text="<b>Oil</b> Prices", secondary_y=True)

fig.show()

In [None]:
# Adjust Correlation Dataframe
corr_df = merge_df[["Gold_Price", "Oil_Price"]]

# Correlation Heatmap
corrmat = corr_df.corr()
f, ax = plt.subplots(figsize=(12, 9))
ax.set_title("Gold and Oil Prices Correlation Heatmap")

sns.heatmap(corrmat, square=True, annot=True)

In [None]:
# Overview of Data
sns.displot(merge_df['Gold/Oil'])

print("Skewness: %f" % merge_df['Gold/Oil'].skew())
print("Kurtosis: %f" % merge_df['Gold/Oil'].kurt())
print(merge_df['Gold/Oil'].describe())

In [None]:
# Create figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])

# Add traces
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df["Gold/Oil"], name="Gold/Oil"),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df.US_Interest_Rates_Value, name="US Interest Rate"),
    secondary_y=True,
)

# Add figure title
fig.update_layout(
    title_text="Gold/Oil Prices Versus US Interest Rates"
)


# Set x-axis title
fig.update_xaxes(title_text="Date")

# Set y-axes titles
fig.update_yaxes(title_text="<b>Gold/Oil</b> Prices", secondary_y=False)
fig.update_yaxes(title_text="<b>US Interest Rate</b>", secondary_y=True)

fig.show()




# Create figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])

# Add traces
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df["Gold/Oil"], name="Gold/Oil"),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df.EURUSD_Price, name="EURUSD Price"),
    secondary_y=True,
)

# Add figure title
fig.update_layout(
    title_text="Gold/Oil Prices Versus EURUSD Prices"
)

# Set x-axis title
fig.update_xaxes(title_text="Date")

# Set y-axes titles
fig.update_yaxes(title_text="<b>Gold/Oil</b> Prices", secondary_y=False)
fig.update_yaxes(title_text="<b>EURUSD</b> Prices", secondary_y=True)

fig.show()





# Create figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])

# Add traces
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df["Gold/Oil"], name="Gold/Oil"),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df.Gold_Price, name="Gold Price"),
    secondary_y=True,
)


# Add figure title
fig.update_layout(
    title_text="Gold/Oil Prices Versus Gold Prices"
)

# Set x-axis title
fig.update_xaxes(title_text="Date")

# Set y-axes titles
fig.update_yaxes(title_text="<b>Gold/Oil</b> Prices", secondary_y=False)
fig.update_yaxes(title_text="<b>Gold</b> Prices", secondary_y=True)

fig.show()




# Create figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])

# Add traces
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df["Gold/Oil"], name="Gold/Oil"),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df.Oil_Price, name="Oil Price"),
    secondary_y=True,
)


# Add figure title
fig.update_layout(
    title_text="Gold/Oil Prices Versus Oil Prices"
)

# Set x-axis title
fig.update_xaxes(title_text="Date")

# Set y-axes titles
fig.update_yaxes(title_text="<b>Gold/Oil</b> Prices", secondary_y=False)
fig.update_yaxes(title_text="<b>Oil</b> Prices", secondary_y=True)

fig.show()




# Create figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])

# Add traces
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df["Gold/Oil"], name="Gold/Oil"),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df.USDX_Price, name="USDX Price"),
    secondary_y=True,
)


# Add figure title
fig.update_layout(
    title_text="Gold/Oil Prices Versus USDX Prices"
)

# Set x-axis title
fig.update_xaxes(title_text="Date")

# Set y-axes titles
fig.update_yaxes(title_text="<b>Gold/Oil</b> Prices", secondary_y=False)
fig.update_yaxes(title_text="<b>USDX</b> Prices", secondary_y=True)

fig.show()

In [None]:
# Adjust Correlation Dataframe
corr_df = merge_df[["Gold/Oil", "US_Interest_Rates_Value", "EURUSD_Price", "USDX_Price", "Gold_Price", "Oil_Price"]]


# Correlation Heatmap
corrmat = corr_df.corr()
f, ax = plt.subplots(figsize=(12, 9))
sns.heatmap(corrmat, square=True, annot=True)

Although Gold/Oil Prices do not seem to be closely correlated with the rest of our data, the Gold/Oil Price is nevertheless an important factor to look at.

Gold and Oil are generally thought to move inversely to the Dollar. Gold acts as a "safe haven" investment in crises, while Oil's inverse relationship with the Dollar stems from it being priced in USD: when the Dollar rises, fewer US Dollars are required to purchase a barrel of oil.

Because Gold and Oil move inversely to the Dollar for different reasons, the Gold/Oil Ratio can help us narrow down the specific reason/event behind a price movement of the Dollar.
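A rise in the ratio can come from either side, which is why it is read alongside its two components. The sketch below uses made-up prices to show the same ratio calculation as the `Gold/Oil` column, once with Oil falling and once with Gold rising.

```python
import pandas as pd

# Made-up prices: row 1 halves Oil, row 2 raises Gold instead
prices = pd.DataFrame({
    "Gold_Price": [1500.0, 1500.0, 1800.0],
    "Oil_Price": [60.0, 30.0, 60.0],
})

# Same calculation as the notebook's Gold/Oil column
prices["Gold/Oil"] = prices["Gold_Price"] / prices["Oil_Price"]
print(prices)
```

Both changes lift the ratio above its starting value of 25, so the component charts are needed to tell an Oil-driven move from a Gold-driven one.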

### Risk Appetite <a name="risk_appetite"></a>

Risk management is essential when participating in the markets. Many participants seek to increase earnings while simultaneously attempting to reduce or limit the additional downside risk that this usually brings. This usually comes in the form of diversification and selecting securities which are "safe" and less volatile.

While risk management is essential in protecting one's assets, the global risk appetite does have a significant impact on the Forex Markets, whether directly or indirectly. In this section, we will look at ways to gauge investors' risk appetite and analyse its impact on the Forex Markets.

#### *Volatility Index, Equity Index and USD/JPY* <a name="VIX_S&P_USDJPY"></a>

In [None]:
daily_usdjpy_df["Date"] = modify_datetime(daily_usdjpy_df)
vix_df["Date"] = modify_datetime(vix_df)
snp_500_df["Date"] = modify_datetime(snp_500_df)

daily_usdjpy_df.rename(columns = {'Price' : 'USDJPY_Price', "Open": "USDJPY_Open", "High":"USDJPY_High", "Low":"USDJPY_Low", "Change %":"USDJPY_Change %"}, inplace=True)
vix_df.rename(columns = {'Price' : 'VIX_Price', "Open": "VIX_Open", "High":"VIX_High", "Low":"VIX_Low", "Change %":"VIX_Change %"}, inplace=True)
snp_500_df.rename(columns = {'Price' : 'S&P500_Price', "Open": "S&P500_Open", "High":"S&P500_High", "Low":"S&P500_Low", "Change %":"S&P500_Change %"}, inplace=True)

merge_df = pd.merge(merge_df, daily_usdjpy_df, how="left", on="Date")
merge_df = pd.merge(merge_df, vix_df, how="left", on="Date")
merge_df = pd.merge(merge_df, snp_500_df, how="left", on="Date")

In [None]:
merge_df["S&P500_Price"] = merge_df["S&P500_Price"].astype(str)
merge_df["S&P500_Price"] = remove_comma(merge_df, "S&P500_Price")
merge_df["S&P500_Price"] = merge_df["S&P500_Price"].astype(float)

In [None]:
# Create figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])

# Add traces
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df["S&P500_Price"], name="S&P500 Price"),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df.USDJPY_Price, name="USDJPY Price"),
    secondary_y=True,
)


# Add figure title
fig.update_layout(
    title_text="S&P500 Prices Versus USDJPY Prices"
)

# Set x-axis title
fig.update_xaxes(title_text="Date")

# Set y-axes titles
fig.update_yaxes(title_text="<b>S&P500</b> Prices", secondary_y=False)
fig.update_yaxes(title_text="<b>USDJPY</b> Prices", secondary_y=True)

fig.show()





# Create figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])

# Add traces
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df["S&P500_Price"], name="S&P500 Price"),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(x=merge_df.Date, y=merge_df.VIX_Price, name="VIX Price"),
    secondary_y=True,
)


# Add figure title
fig.update_layout(
    title_text="S&P500 Prices Versus VIX Prices"
)

# Set x-axis title
fig.update_xaxes(title_text="Date")

# Set y-axes titles
fig.update_yaxes(title_text="<b>S&P500</b> Prices", secondary_y=False)
fig.update_yaxes(title_text="<b>VIX</b> Prices", secondary_y=True)

fig.show()

In [None]:
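# Pairwise Pearson correlation of the numeric columns (numeric_only requires pandas >= 1.5)
merge_df.corr(method='pearson', numeric_only=True)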


# Adjust Correlation Dataframe
corr_df = merge_df[["S&P500_Price", "VIX_Price", "USDJPY_Price"]]

# Correlation Heatmap
corrmat = corr_df.corr()
f, ax = plt.subplots(figsize=(12, 9))
ax.set_title("S&P500 / VIX / USDJPY Prices Correlation Heatmap")

sns.heatmap(corrmat, square=True, annot=True)

It is natural to expect a positive correlation between the S&P500 price and the USD/JPY price, as JPY often acts as a safe-haven asset during times of increasing risk. However, this is not the case in the 2015 - 2021 period that we are analysing. On the contrary, our analysis suggests a negative correlation between the two, which comes as a surprise. Upon further inspection, there are 2 major periods which contribute to this clear deviation.

1. 2014 Oct - 2016 July
    - During this period, USD/JPY appears to have rallied on its own, leaving the S&P 500 price behind.
    - A possible explanation is the Bank of Japan announcing Quantitative Easing during this period. This increased the supply of Japanese Yen, lowering its purchasing power relative to USD and pushing Japanese interest rates down.
2. 2020 Mar - 2021 Jan
    - During this period, USD/JPY and S&P 500 prices exhibited a negative correlation.
    - The Fed's Quantitative Easing during the COVID recovery could have contributed to the negative correlation. This time, however, the Fed also cut interest rates to close to 0, which may have caused a more serious deviation in the correlation between S&P500 prices and USD/JPY prices.
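
The period-by-period behaviour described above can be checked with a rolling correlation rather than a single full-sample figure. The following is a minimal sketch, not the notebook's own method: `rolling_correlation`, the 100-row window and the synthetic `demo_df` are assumptions made for illustration. The same call could be pointed at `merge_df` with the `S&P500_Price` and `USDJPY_Price` columns.

```python
import numpy as np
import pandas as pd


def rolling_correlation(df, col_a, col_b, window=250):
    """Pearson correlation of two columns over a trailing window of rows."""
    return df[col_a].rolling(window).corr(df[col_b])


# Synthetic stand-in data (not real prices): the two series move together
# in the first half and in opposite directions in the second half.
n = 1000
t = np.arange(n)
series_a = np.sin(t / 20.0)
series_b = np.where(t < n // 2, series_a, -series_a)
demo_df = pd.DataFrame({"A": series_a, "B": series_b})

roll = rolling_correlation(demo_df, "A", "B", window=100)
full_sample = demo_df["A"].corr(demo_df["B"])

print("Full-sample correlation:", round(full_sample, 3))
print("First complete window:", round(roll.iloc[99], 3))
print("Last window:", round(roll.iloc[-1], 3))
```

In this toy data the full-sample figure hides two opposite regimes that the rolling series makes visible, which is the kind of deviation the two periods above point to.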


As for the S&P500's correlation with VIX, we can expect a clear negative correlation between the two when there is a large change in the S&P500's price, as VIX acts as a gauge of fear in the markets *[during periods of increased fear, a fall in S&P500 prices will often be accompanied by a sharp rise in VIX]*.
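
One way to examine this relationship is to correlate row-over-row percentage changes instead of price levels, since it is described in terms of moves rather than levels. This is a minimal sketch on synthetic data; `change_correlation` and the `Index_Price` / `Vol_Price` column names are hypothetical and not defined elsewhere in this notebook.

```python
import numpy as np
import pandas as pd


def change_correlation(df, col_a, col_b):
    """Pearson correlation between the row-over-row percentage changes of two columns."""
    pair = df[[col_a, col_b]]
    # Percentage change relative to the previous row; the first row has no
    # predecessor and is dropped.
    changes = (pair / pair.shift(1) - 1).dropna()
    return changes[col_a].corr(changes[col_b])


# Synthetic stand-in data (not real prices): the "volatility-like" series is
# built to move four times as far as the "index-like" series, in the
# opposite direction, on every row.
t = np.arange(500)
index_returns = 0.01 * np.sin(t / 3.0)
demo_df = pd.DataFrame({
    "Index_Price": 100 * np.cumprod(1 + index_returns),
    "Vol_Price": 20 * np.cumprod(1 - 4 * index_returns),
})

result = change_correlation(demo_df, "Index_Price", "Vol_Price")
print("Correlation of percentage changes:", round(result, 3))
```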

Perhaps a good way of making use of this information would be to attempt anomaly detection on VIX prices to catch favourable points in the market.
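
As a rough illustration of that idea, the sketch below runs scikit-learn's `IsolationForest` on a single VIX-like series at two contamination levels. The helper `flag_anomalies` and the synthetic `demo_values` are assumptions made for this example and are not the data or code used in this notebook.

```python
import numpy as np
from sklearn.ensemble import IsolationForest


def flag_anomalies(values, contamination=0.05, random_state=42):
    """Return an array of 1 (normal) / -1 (anomaly) labels for a 1-D series."""
    # IsolationForest expects a 2-D array of shape (n_samples, n_features).
    features = np.asarray(values, dtype=float).reshape(-1, 1)
    model = IsolationForest(
        n_estimators=100,
        contamination=contamination,
        random_state=random_state,
    )
    return model.fit_predict(features)


# Synthetic stand-in data (not real VIX values): a calm regime around 15
# with five injected spikes.
rng = np.random.default_rng(0)
calm = rng.normal(loc=15.0, scale=1.0, size=500)
spikes = np.array([40.0, 45.0, 50.0, 55.0, 60.0])
demo_values = np.concatenate([calm, spikes])

labels_05 = flag_anomalies(demo_values, contamination=0.05)
labels_10 = flag_anomalies(demo_values, contamination=0.10)

print("Flagged at 0.05:", int((labels_05 == -1).sum()))
print("Flagged at 0.10:", int((labels_10 == -1).sum()))
```

Because `contamination` sets the share of training points that get flagged, raising it increases the number of anomalies reported whether or not the extra points are genuinely unusual, so the flagged points are best read as candidates for inspection.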

#### *Anomaly Detection in Market Risk Appetite* <a name="anomaly_detection"></a>

**Isolation Forest [VIX, USD/JPY] (0.05 Contamination)**

In [None]:
from sklearn.ensemble import IsolationForest
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from mpl_toolkits.mplot3d import Axes3D
import datetime as dt
import matplotlib.dates as mdates


clf=IsolationForest(n_estimators=100, max_samples='auto', contamination=float(.05),
                        max_features=1.0, n_jobs=-1, random_state=42)
anomaly_df = merge_df[["S&P500_Price", "VIX_Price", "USDJPY_Price", "Date"]].dropna()


clf.fit(anomaly_df[["USDJPY_Price", "VIX_Price"]])
pred = clf.predict(anomaly_df[["USDJPY_Price", "VIX_Price"]])

# Label each row: 1 = normal point, -1 = anomaly
anomaly_df['anomaly'] = pred
# Reset to a positional index so the outlier positions can be used with .loc below
anomaly_df = anomaly_df.reset_index()
outliers = anomaly_df.loc[anomaly_df['anomaly']==-1]
outlier_index = list(outliers.index)
se = anomaly_df['anomaly'].value_counts()
_se_list = ["Normal Points", "Anomalies"]
se.index = _se_list
se = se.rename("Number of Anomalies")
print(se)


X = anomaly_df[["USDJPY_Price", "VIX_Price", "Date"]]
b1 = plt.scatter(X["Date"], X["USDJPY_Price"], c='green',
                 s=3, label="Normal Points")

b2 = plt.scatter(X.loc[outlier_index,"Date"], X.loc[outlier_index,"USDJPY_Price"], c='green',s=6, edgecolor="red", label="Predicted Outliers")

plt.legend(loc="upper right")
plt.xlabel("Date")
plt.ylabel("USDJPY Price")
plt.xticks(rotation=90)
plt.gca().xaxis.set_major_locator(mdates.MonthLocator(bymonthday=1, interval=3))
plt.gca().xaxis.set_minor_locator(mdates.DayLocator(30))

plt.show()

b3 = plt.scatter(X["Date"], X["VIX_Price"], c='green',
                 s=5,label="Normal Points")

b4 = plt.scatter(X.loc[outlier_index,"Date"], X.loc[outlier_index,"VIX_Price"], c='green',s=10, edgecolor="red", label="Predicted Outliers")

plt.legend(loc="upper right")
plt.xlabel("Date")
plt.ylabel("VIX Price")
plt.xticks(rotation=90)
plt.gca().xaxis.set_major_locator(mdates.MonthLocator(bymonthday=1, interval=3))
plt.gca().xaxis.set_minor_locator(mdates.DayLocator(30))

plt.show()

**Isolation Forest [VIX, USD/JPY] (0.1 Contamination)**

In [None]:
from sklearn.ensemble import IsolationForest
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from mpl_toolkits.mplot3d import Axes3D


clf=IsolationForest(n_estimators=100, max_samples='auto', contamination=float(.1),
                        max_features=1.0, n_jobs=-1, random_state=42)
anomaly_df = merge_df[["S&P500_Price", "VIX_Price", "USDJPY_Price", "Date"]].dropna()


clf.fit(anomaly_df[["USDJPY_Price", "VIX_Price"]])
pred = clf.predict(anomaly_df[["USDJPY_Price", "VIX_Price"]])

# Label each row: 1 = normal point, -1 = anomaly
anomaly_df['anomaly'] = pred
# Reset to a positional index so the outlier positions can be used with .loc below
anomaly_df = anomaly_df.reset_index()
outliers = anomaly_df.loc[anomaly_df['anomaly']==-1]
outlier_index = list(outliers.index)
se = anomaly_df['anomaly'].value_counts()
_se_list = ["Normal Points", "Anomalies"]
se.index = _se_list
se = se.rename("Number of Anomalies")
print(se)


X = anomaly_df[["USDJPY_Price", "VIX_Price", "Date"]]
b1 = plt.scatter(X["Date"], X["USDJPY_Price"], c='green',
                 s=3,label="Normal Points")

b2 = plt.scatter(X.loc[outlier_index,"Date"], X.loc[outlier_index,"USDJPY_Price"], c='green',s=6, edgecolor="red", label="Predicted Outliers")

plt.legend(loc="upper right")
plt.xlabel("Date")
plt.ylabel("USDJPY Price")
plt.xticks(rotation=90)
plt.gca().xaxis.set_major_locator(mdates.MonthLocator(bymonthday=1, interval=3))
plt.gca().xaxis.set_minor_locator(mdates.DayLocator(30))

plt.show()

b3 = plt.scatter(X["Date"], X["VIX_Price"], c='green',
                 s=5,label="Normal Points")

b4 = plt.scatter(X.loc[outlier_index,"Date"], X.loc[outlier_index,"VIX_Price"], c='green',s=10, edgecolor="red", label="Predicted Outliers")

plt.legend(loc="upper right")
plt.xlabel("Date")
plt.ylabel("VIX Price")
plt.xticks(rotation=90)
plt.gca().xaxis.set_major_locator(mdates.MonthLocator(bymonthday=1, interval=3))
plt.gca().xaxis.set_minor_locator(mdates.DayLocator(30))

plt.show()

**Isolation Forest [VIX, S&P500] (0.05 Contamination)**

In [None]:
from sklearn.ensemble import IsolationForest
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from mpl_toolkits.mplot3d import Axes3D


clf=IsolationForest(n_estimators=100, max_samples='auto', contamination=float(.05),
                        max_features=1.0, n_jobs=-1, random_state=42)
anomaly_df = merge_df[["S&P500_Price", "VIX_Price", "USDJPY_Price", "Date"]].dropna()


clf.fit(anomaly_df[["S&P500_Price", "VIX_Price"]])
pred = clf.predict(anomaly_df[["S&P500_Price", "VIX_Price"]])

# Label each row: 1 = normal point, -1 = anomaly
anomaly_df['anomaly'] = pred
# Reset to a positional index so the outlier positions can be used with .loc below
anomaly_df = anomaly_df.reset_index()
outliers = anomaly_df.loc[anomaly_df['anomaly']==-1]
outlier_index = list(outliers.index)
se = anomaly_df['anomaly'].value_counts()
_se_list = ["Normal Points", "Anomalies"]
se.index = _se_list
se = se.rename("Number of Anomalies")
print(se)

X = anomaly_df[["S&P500_Price", "VIX_Price", "Date"]]
b1 = plt.scatter(X["Date"], X["S&P500_Price"], c='green',
                 s=3,label="Normal Points")

b2 = plt.scatter(X.loc[outlier_index,"Date"], X.loc[outlier_index,"S&P500_Price"], c='green',s=6, edgecolor="red", label="Predicted Outliers")

plt.legend(loc="upper right")
plt.xlabel("Date")
plt.ylabel("S&P 500 Price")
plt.xticks(rotation=90)
plt.gca().xaxis.set_major_locator(mdates.MonthLocator(bymonthday=1, interval=3))
plt.gca().xaxis.set_minor_locator(mdates.DayLocator(30))

plt.show()

b3 = plt.scatter(X["Date"], X["VIX_Price"], c='green',
                 s=5,label="Normal Points")

b4 = plt.scatter(X.loc[outlier_index,"Date"], X.loc[outlier_index,"VIX_Price"], c='green',s=10, edgecolor="red", label="Predicted Outliers")

plt.legend(loc="upper right")
plt.xlabel("Date")
plt.ylabel("VIX Price")
plt.xticks(rotation=90)
plt.gca().xaxis.set_major_locator(mdates.MonthLocator(bymonthday=1, interval=3))
plt.gca().xaxis.set_minor_locator(mdates.DayLocator(30))

plt.show()

**Isolation Forest [VIX, S&P500] (0.1 Contamination)**

In [None]:
from sklearn.ensemble import IsolationForest
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from mpl_toolkits.mplot3d import Axes3D


clf=IsolationForest(n_estimators=100, max_samples='auto', contamination=float(.1),
                        max_features=1.0, n_jobs=-1, random_state=42)
anomaly_df = merge_df[["S&P500_Price", "VIX_Price", "USDJPY_Price", "Date"]].dropna()


clf.fit(anomaly_df[["S&P500_Price", "VIX_Price"]])
pred = clf.predict(anomaly_df[["S&P500_Price", "VIX_Price"]])

# Label each row: 1 = normal point, -1 = anomaly
anomaly_df['anomaly'] = pred
# Reset to a positional index so the outlier positions can be used with .loc below
anomaly_df = anomaly_df.reset_index()
outliers = anomaly_df.loc[anomaly_df['anomaly']==-1]
outlier_index = list(outliers.index)
se = anomaly_df['anomaly'].value_counts()
_se_list = ["Normal Points", "Anomalies"]
se.index = _se_list
se = se.rename("Number of Anomalies")
print(se)


X = anomaly_df[["S&P500_Price", "VIX_Price", "Date"]]
b1 = plt.scatter(X["Date"], X["S&P500_Price"], c='green',
                 s=3,label="Normal Points")

b2 = plt.scatter(X.loc[outlier_index,"Date"], X.loc[outlier_index,"S&P500_Price"], c='green',s=6, edgecolor="red", label="Predicted Outliers")

plt.legend(loc="upper right")
plt.xlabel("Date")
plt.ylabel("S&P500 Price")
plt.xticks(rotation=90)
plt.gca().xaxis.set_major_locator(mdates.MonthLocator(bymonthday=1, interval=3))
plt.gca().xaxis.set_minor_locator(mdates.DayLocator(30))

plt.show()

b3 = plt.scatter(X["Date"], X["VIX_Price"], c='green',
                 s=5,label="Normal Points")

b4 = plt.scatter(X.loc[outlier_index,"Date"], X.loc[outlier_index,"VIX_Price"], c='green',s=10, edgecolor="red", label="Predicted Outliers")

plt.legend(loc="upper right")
plt.xlabel("Date")
plt.ylabel("VIX Price")
plt.xticks(rotation=90)
plt.gca().xaxis.set_major_locator(mdates.MonthLocator(bymonthday=1, interval=3))
plt.gca().xaxis.set_minor_locator(mdates.DayLocator(30))

plt.show()

**Isolation Forest [VIX, S&P500, USD/JPY] (0.05 Contamination)**

In [None]:
from sklearn.ensemble import IsolationForest
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from mpl_toolkits.mplot3d import Axes3D


clf=IsolationForest(n_estimators=100, max_samples='auto', contamination=float(.05), \
                        max_features=1.0, n_jobs=-1, random_state=42)
anomaly_df = merge_df[["S&P500_Price", "VIX_Price", "USDJPY_Price", "Date"]].dropna()

clf.fit(anomaly_df[["S&P500_Price", "VIX_Price", "USDJPY_Price"]])
pred = clf.predict(anomaly_df[["S&P500_Price", "VIX_Price", "USDJPY_Price"]])

anomaly_df['anomaly'] = pred
anomaly_df = anomaly_df.reset_index()
outliers = anomaly_df.loc[anomaly_df['anomaly']==-1]
outlier_index = list(outliers["index"].index)
se = anomaly_df['anomaly'].value_counts()
_se_list = ["Normal Points", "Anomalies"]
se.index = _se_list
se = se.rename("Number of Anomalies")
print(se)


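# Select only the three plotted features, in the same order as the axis labels
# (after reset_index, column 0 of anomaly_df is the old index, not the S&P500 price)
X = anomaly_df[["S&P500_Price", "VIX_Price", "USDJPY_Price"]].to_numpy()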
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.set_xlabel("S&P500_Price")
ax.set_ylabel("VIX_Price")
ax.set_zlabel("USDJPY_Price")

ax.scatter(X[:, 0], X[:, 1], X[:, 2], s=4, lw=1, label="Inliers",c="green")
ax.scatter(X[outlier_index,0],X[outlier_index,1], X[outlier_index,2],
           lw=2, s=60, marker="x", c="red", label="outliers")
ax.legend()
plt.show()

Above are just some of the ways that one can make sense of market data to draw conclusions and make better trading decisions. Each section will be discussed in more depth in future notebooks.