## How to get market data

For the moment, since most of the data is anonymised, it is quite hard to figure out how external data might be useful and how we would incorporate it in our model. But in case it proves useful, here is a short notebook on how to get external data from yfinance.

First you have to install it.

In [None]:
!pip install yfinance

Getting data is really easy. Start by selecting a ticker and look at the ticker info. Here we start with "SPY", a major US ETF replicating the performance of the S&P 500 index. We use major indices for two reasons: with good volume they reflect the underlying, and most importantly for us they are available as tickers through Yahoo Finance.

In [None]:
import yfinance as yf

In [None]:
SPY = yf.Ticker("SPY")

Various information about the ticker :

In [None]:
SPY.info

Get historical data and plot the close price :

In [None]:
# get historical market data
SPY_histo = SPY.history(start="2015-01-01", end="2020-12-31")

In [None]:
import matplotlib.pyplot as plt
import numpy as np

In [None]:
SPY_histo.index

I add a green band to mark the suspected train period (see this discussion about [matching pattern of bank holidays with outliers dates](https://www.kaggle.com/c/jane-street-market-prediction/discussion/205107#1117838))

In [None]:
import datetime

plt.figure(figsize=(10,10))
plt.plot(SPY_histo.index, SPY_histo['Close'])
plt.xlabel("date")
plt.ylabel("$ price")
plt.title("Stock Price")

start_date = datetime.datetime(2017,11,1)
end_date = datetime.datetime(2019,10,31)
plt.axvspan(start_date, end_date, color='g', alpha=0.5, lw=0)
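
To work with the suspected train period itself rather than only shading it, the history can be sliced by date. This is a minimal sketch on a synthetic stand-in for `SPY_histo`; the helper name `slice_period` and the made-up prices are assumptions for illustration, not part of yfinance.

```python
import numpy as np
import pandas as pd

def slice_period(history, start="2017-11-01", end="2019-10-31"):
    """Return the rows of a date-indexed frame between start and end (inclusive)."""
    return history.loc[start:end]

# Synthetic stand-in for SPY_histo: business days with made-up prices
dates = pd.bdate_range("2015-01-01", "2020-12-31")
fake_histo = pd.DataFrame({"Close": np.linspace(200.0, 370.0, len(dates))}, index=dates)

train_like = slice_period(fake_histo)
print(train_like.index.min(), train_like.index.max())
```

With real data, `slice_period(SPY_histo)` should give the rows under the green band.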

* Volume is also an important feature (monthly smoothing):

In [None]:
plt.figure(figsize=(10,10))
plt.plot(SPY_histo.index, np.log(SPY_histo['Volume']).rolling(window=20).mean())
plt.xlabel("date")
plt.ylabel("log volume (20-day rolling mean)")
plt.title("Volume")

start_date = datetime.datetime(2017,11,1)
end_date = datetime.datetime(2019,10,31)
plt.axvspan(start_date, end_date, color='g', alpha=0.5, lw=0)
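
The smoothing used above is a 20-day rolling mean of the log volume (20 trading days is roughly one month). Here is a small sketch of what it returns, on made-up volume numbers; the helper name `smoothed_log_volume` is hypothetical.

```python
import numpy as np
import pandas as pd

def smoothed_log_volume(volume, window=20):
    """Rolling mean of log-volume; the first window-1 values are NaN."""
    return np.log(volume).rolling(window=window).mean()

# Synthetic stand-in for SPY_histo['Volume'] (made-up numbers)
rng = np.random.default_rng(0)
dates = pd.bdate_range("2019-01-01", periods=60)
fake_volume = pd.Series(rng.integers(50_000_000, 150_000_000, size=60), index=dates)

smooth = smoothed_log_volume(fake_volume)
print(smooth.isna().sum())  # number of leading NaN values
print(smooth.dropna().head())
```

The leading NaN values explain why the smoothed curve starts about a month after the first date of the history.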

Percent change / volume relation (as asked by @carlmcbrideellis [here](https://www.kaggle.com/c/jane-street-market-prediction/discussion/208013))

In [None]:
plt.scatter(SPY_histo['Volume'],SPY_histo['Close'].pct_change(1))
plt.ylabel("daily return")
plt.xlabel("volume")
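
One possible way to put a number on this scatter is the correlation between the size of the daily move and the log volume. Using the absolute return and the log of the volume is my assumption here, and the data below is synthetic (volume is built to grow with the size of the move), so the result says nothing about SPY itself.

```python
import numpy as np
import pandas as pd

def abs_return_volume_corr(history):
    """Pearson correlation between absolute daily return and log volume."""
    abs_ret = history["Close"].pct_change().abs()
    pair = pd.concat(
        {"abs_return": abs_ret, "log_volume": np.log(history["Volume"])}, axis=1
    ).dropna()  # drops the first row, where pct_change is NaN
    return pair["abs_return"].corr(pair["log_volume"])

# Synthetic history: volume is made to increase with the size of the move
rng = np.random.default_rng(1)
rets = rng.normal(0.0, 0.01, size=250)
dates = pd.bdate_range("2019-01-01", periods=250)
fake_histo = pd.DataFrame(
    {"Close": 100.0 * np.cumprod(1.0 + rets),
     "Volume": 1e7 * (1.0 + 50.0 * np.abs(rets))},
    index=dates,
)

corr = abs_return_volume_corr(fake_histo)
print(round(corr, 3))
```

With real data, the call would be `abs_return_volume_corr(SPY_histo)`.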

## Other important market indices

The VIX is a volatility index reflecting the expected (implied) volatility of the S&P 500; high values generally signal market turmoil :

In [None]:
#select ticker
VIX = yf.Ticker("^VIX")
VIX_histo = VIX.history(start="2015-01-01", end="2020-12-31")


plt.figure(figsize=(10,10))
plt.plot(VIX_histo.index, VIX_histo['Close'])
plt.xlabel("date")
plt.ylabel("index level")
plt.title("VIX Close")

start_date = datetime.datetime(2017,11,1)
end_date = datetime.datetime(2019,10,31)
plt.axvspan(start_date, end_date, color='g', alpha=0.5, lw=0)

Here is the CBOE SKEW index, which reflects the perceived tail risk of S&P 500 returns as implied by option prices, also an indicator of market turmoil.

In [None]:
# select ticker
SKEW = yf.Ticker("^SKEW")
SKEW_histo = SKEW.history(start="2015-01-01", end="2020-12-31")

plt.figure(figsize=(10,10))
plt.plot(SKEW_histo.index, SKEW_histo['Close'])
plt.xlabel("date")
plt.ylabel("index level")
plt.title("SKEW index Close")

start_date = datetime.datetime(2017,11,1)
end_date = datetime.datetime(2019,10,31)
plt.axvspan(start_date, end_date, color='g', alpha=0.5, lw=0)

TNX is an index that reflects the 10Y Treasury yield, a key benchmark for performance :

In [None]:
# select ticker
TNX = yf.Ticker("^TNX")
TNX_histo = TNX.history(start="2015-01-01", end="2020-12-31")

plt.figure(figsize=(10,10))
plt.plot(TNX_histo.index, TNX_histo['Close'])
plt.xlabel("date")
plt.ylabel("index level")
plt.title("TNX Close")

start_date = datetime.datetime(2017,11,1)
end_date = datetime.datetime(2019,10,31)
plt.axvspan(start_date, end_date, color='g', alpha=0.5, lw=0)
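
If these indices were to be used as daily features, a first step could be to align their close prices on a common date index. This is a minimal sketch with tiny synthetic frames standing in for the `history()` outputs; the helper name `combine_closes` and all numbers are made up for illustration.

```python
import pandas as pd

def combine_closes(histories):
    """Align the 'Close' column of several date-indexed frames into one frame."""
    closes = {name: h["Close"] for name, h in histories.items()}
    # Outer join on the dates: a date missing for one ticker becomes NaN
    return pd.concat(closes, axis=1).sort_index()

# Synthetic stand-ins for the history() outputs (made-up values)
d_all = pd.bdate_range("2020-01-01", periods=5)
fake_histories = {
    "VIX": pd.DataFrame({"Close": [13.8, 12.5, 14.0, 13.9, 13.4]}, index=d_all),
    "TNX": pd.DataFrame({"Close": [1.88, 1.79, 1.81, 1.83]}, index=d_all[1:]),
}

features = combine_closes(fake_histories)
print(features)
```

The NaN on the first date shows that gaps need an explicit choice (drop, forward-fill, ...) before any modelling.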

## Options chains

Standardized derivatives are among the most used tools on the market. You can get data for those through yfinance. The `options` attribute gives you the available expiration dates.

In [None]:
SPY = yf.Ticker("SPY")
SPY.options

Then you can load the (approximately current) data regarding this expiration date.

In [None]:
opt = SPY.option_chain(SPY.options[0])  # nearest listed expiration; expired dates raise an error

Looking more specifically at calls, you can get individual data for given strikes :

In [None]:
opt.calls
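
To narrow the chain down to strikes around the current price, the calls table can be filtered like any other DataFrame. This is a sketch on a small synthetic table; it assumes the `strike`, `bid` and `ask` columns that yfinance returns in `opt.calls`, and the helper name `calls_near_spot`, the spot price and all quotes are made up.

```python
import pandas as pd

def calls_near_spot(calls, spot, width=0.05):
    """Keep calls whose strike lies within +/- width of spot and add a mid price."""
    near = calls[(calls["strike"] - spot).abs() <= width * spot].copy()
    near["mid"] = (near["bid"] + near["ask"]) / 2
    return near

# Synthetic stand-in for opt.calls (made-up quotes)
fake_calls = pd.DataFrame({
    "strike": [380.0, 400.0, 410.0, 420.0, 450.0],
    "bid": [41.0, 22.5, 14.0, 7.0, 0.4],
    "ask": [41.6, 22.9, 14.3, 7.2, 0.5],
})

near = calls_near_spot(fake_calls, spot=410.0)
print(near)
```

With real data, the call would be `calls_near_spot(opt.calls, spot=...)` with a recent close as the spot.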

## GME update

In their latest shenanigans, a now famous subreddit has managed to push the price of GME (a video game retail chain that was almost doomed before the pandemic) to insane levels. We can look at what is happening right now :

In [None]:
GME = yf.Ticker("GME")
GME.info

shortRatio : 2.81  😂

In [None]:
# get historical market data
GME_histo = GME.history(start="2020-01-01", end="2021-01-30")


plt.figure(figsize=(10,10))
plt.plot(GME_histo.index, GME_histo['Close'])
plt.xlabel("date")
plt.ylabel("$ price")
plt.title("GME Price")

😂😂😂😂

In [None]:
GME.options

In [None]:
opt = GME.option_chain(GME.options[0])  # nearest listed GME expiration; expired dates raise an error

In [None]:
opt.calls

In [None]:
opt.puts