## Time Series Analysis

Install the required packages:

$$ pip install yfinance

$$ pip install finplot

$$ pip install hmmlearn

### Setup

First, let's import a few common modules, make Matplotlib plot figures inline, and define a function to save the figures. We also check that Python 3.5 or later is installed (Python 2.x may work, but it is deprecated, so we strongly recommend Python 3), as well as Scikit-Learn ≥0.20.

In [None]:
# Python ≥3.5 is required
import sys
assert sys.version_info >= (3, 5)

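# Scikit-Learn ≥0.20 is required
# (compare numeric version parts: comparing version strings is lexicographic and unreliable)
import sklearn
assert tuple(int(part) for part in sklearn.__version__.split(".")[:2]) >= (0, 20)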

# Common imports
import numpy as np
import pandas as pd
import os

# to make this notebook's output stable across runs
np.random.seed(42)

# To plot pretty figures
%matplotlib inline
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn

mpl.rc('axes', labelsize=14)
mpl.rc('xtick', labelsize=12)
mpl.rc('ytick', labelsize=12)

# Where to save the figures
PROJECT_ROOT_DIR = "."
CHAPTER_ID = "time_series"
IMAGES_PATH = os.path.join(PROJECT_ROOT_DIR, "images", CHAPTER_ID)
os.makedirs(IMAGES_PATH, exist_ok=True)

def save_fig(fig_id, tight_layout=True, fig_extension="png", resolution=300):
    path = os.path.join(IMAGES_PATH, fig_id + "." + fig_extension)
    print("Saving figure", fig_id)
    if tight_layout:
        plt.tight_layout()
    plt.savefig(path, format=fig_extension, dpi=resolution)

# Ignore useless warnings (see SciPy issue #5998)
import warnings
warnings.filterwarnings(action="ignore", message="^internal gelsd")

### Get stock data

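In [None]:
import yfinance as yf

# Create a ticker object for Microsoft (the same ticker used later in this notebook)
msft = yf.Ticker("MSFT")

# get stock info
msft.info

In [None]:
# get historical market data
hist = msft.history(period="365d")
hist.head()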

In [None]:
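# Plot the closing price using pandas' Matplotlib-backed plotting
hist['Close'].plot(figsize=(6, 4), title='Close')

# 7-day rolling mean of the closing price
plt.figure()
hist['Close'].rolling(7).mean().plot(title='7-day rolling mean of Close')

# 24-day rolling correlation between closing price and volume.
# Rolling.corr() needs a second series; without one it correlates Close with itself.
plt.figure()
hist['Close'].rolling(24).corr(hist['Volume']).plot(title='24-day rolling correlation: Close vs. Volume')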

### Building Hidden Markov Models for Sequential Data

Hidden Markov Models (HMMs) are powerful tools for analyzing sequential data. They are used extensively in finance, speech analysis, weather forecasting, word sequencing, and so on.

Any data source that produces a sequence of outputs can exhibit patterns over time. HMMs are generative models, which means they can generate data once they have learned the underlying structure. In their base form, HMMs cannot discriminate between classes. This is in contrast to discriminative models, which learn to discriminate between classes but cannot generate data.
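To make "generative" concrete, here is a minimal sketch of how an HMM produces data: pick a hidden state, emit an observation from that state's distribution, then move to the next state according to the transition probabilities. The function name `sample_hmm` and the two-state parameters are hypothetical, chosen only for illustration; this is not how `hmmlearn` implements sampling internally.

```python
import numpy as np

def sample_hmm(start_prob, trans_mat, means, stds, n_samples, seed=0):
    """Draw a hidden-state path and observations from a 1-D Gaussian HMM."""
    rng = np.random.default_rng(seed)
    n_states = len(start_prob)
    states = np.empty(n_samples, dtype=int)
    observations = np.empty(n_samples)

    # Draw the initial hidden state from the start distribution
    state = rng.choice(n_states, p=start_prob)
    for t in range(n_samples):
        states[t] = state
        # Emit an observation from the Gaussian attached to the current state
        observations[t] = rng.normal(means[state], stds[state])
        # Move to the next hidden state using the current state's transition row
        state = rng.choice(n_states, p=trans_mat[state])
    return states, observations

# Hypothetical two-state example: a "calm" state and a "volatile" state
start_prob = np.array([0.5, 0.5])
trans_mat = np.array([[0.95, 0.05],
                      [0.10, 0.90]])
means = np.array([0.0, 0.0])
stds = np.array([0.5, 2.0])

states, observations = sample_hmm(start_prob, trans_mat, means, stds, n_samples=200)
print(states[:10])
print(observations[:5])
```

Only the observations would be visible in real data; the state path is the "hidden" part that a fitted model tries to recover.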

### Analyzing stock market data using HMMs

Stock market data is a good example of time series data: each observation is indexed by date. We can see how the stock values of various companies fluctuate over time.
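One common way to quantify these fluctuations is the day-over-day percentage change of the closing price, which this notebook also computes later on real data. The sketch below uses made-up closing prices so that it runs without a network connection.

```python
import numpy as np

# Made-up closing prices for five consecutive trading days (illustrative only)
closing_values = np.array([100.0, 102.0, 99.96, 99.96, 104.958])

# Day-over-day change, expressed as a percentage of the previous day's close.
# np.diff returns one fewer element than its input, so we divide by all
# closes except the last one.
diff_percentage = 100.0 * np.diff(closing_values) / closing_values[:-1]

print(np.round(diff_percentage, 2))
```

Because the result is one element shorter than the price series, any other feature paired with it (such as volume) has to drop its first value to stay aligned.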

### Get the data

In [None]:
from hmmlearn.hmm import GaussianHMM
import yfinance as yf

# get historical market data
msft = yf.Ticker("MSFT")

hist = msft.history(period="365d")
print("\n stock history\n", hist[:5])

### Build an HMM model

#### the number of components = 4

Create and train the HMM using four components. The number of components is a hyperparameter that we have to choose. Here, by selecting 4 as the number of components, we say that the data is being generated using four underlying states. 
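As a rough guide when choosing this hyperparameter, it helps to see how quickly the number of free parameters grows with the number of states. The sketch below counts them for a Gaussian HMM with diagonal covariances; the helper name `diag_gaussian_hmm_param_count` and the choice of two features are assumptions made for illustration.

```python
def diag_gaussian_hmm_param_count(n_components, n_features):
    """Count the free parameters of a Gaussian HMM with diagonal covariances."""
    # Start probabilities sum to 1, so one of them is determined by the others
    start = n_components - 1
    # Each row of the transition matrix sums to 1
    transitions = n_components * (n_components - 1)
    # One mean and one variance per state and per feature
    means = n_components * n_features
    variances = n_components * n_features
    return start + transitions + means + variances

for n in (2, 4, 8, 16):
    print(n, diag_gaussian_hmm_param_count(n, n_features=2))
```

The transition matrix grows quadratically, so with roughly a year of daily data, a large number of states leaves few observations per parameter.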

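In [None]:
from hmmlearn.hmm import GaussianHMM
import yfinance as yf

num_components = 4

# Use the price and volume columns as features
X = hist[["Open", "High", "Low", "Close", "Volume"]].to_numpy()

# Train a Gaussian HMM with four hidden states
model = GaussianHMM(n_components=num_components, covariance_type="diag", n_iter=1000)
model.fit(X)

# Most likely hidden state for each trading day
hidden_states = model.predict(X)
print(hidden_states[:10])
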
### HMMs are generative models. Let's generate 500 samples and plot them

#### the number of components = 8

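In [None]:
from hmmlearn.hmm import GaussianHMM

num_components = 8

# Use the price and volume columns as features (the dividend and split columns are mostly zero)
X = hist[["Open", "High", "Low", "Close", "Volume"]].to_numpy()

model = GaussianHMM(n_components=num_components, covariance_type='diag', n_iter=1000)
model.fit(X)

print(model)

# Most likely hidden state for each trading day
hidden_states = model.predict(X)

# Draw new observations from the trained model
num_samples = 500
samples, _ = model.sample(num_samples)

# Column 0 of the samples corresponds to the first feature (Open)
plt.figure()
plt.plot(np.arange(num_samples), samples[:, 0], c='green')
plt.title('Number of components = ' + str(num_components))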

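In [None]:
import numpy as np
import yfinance as yf
from hmmlearn.hmm import GaussianHMM

msft = yf.Ticker("MSFT")

# get historical market data
hist = msft.history(period="365d")

closing_values = hist['Close'].to_numpy()
volume_of_shares = hist['Volume'].to_numpy()

# Daily percentage change of the closing price
diff_percentage = 100 * np.diff(closing_values) / closing_values[:-1]

# np.diff drops one element, so drop the first volume value to keep the lengths aligned
volume_of_shares = volume_of_shares[1:]

# Stack the two features into an (n_days - 1, 2) training matrix
X = np.column_stack([diff_percentage, volume_of_shares])

num_components = 5

model = GaussianHMM(n_components=num_components, covariance_type="diag", n_iter=1000)
model.fit(X)

# Generate data using model
num_samples = 100
samples, _ = model.sample(num_samples)

# Column 0 of the samples is the generated percentage change
plt.plot(np.arange(num_samples), samples[:, 0], c='black')

plt.show()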

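In [None]:
import yfinance as yf
import finplot as fplt

# Plot one ticker at a time: a multi-ticker download returns multi-level columns,
# which do not match the four-column layout that candlestick_ochl expects.
# AAPL can be plotted the same way by changing the ticker symbol.
df = yf.Ticker('SPY').history(start='2018-01-01', end='2020-04-29')

fplt.candlestick_ochl(df[['Open','Close','High','Low']])

# 50-day and 200-day moving averages of the closing price
fplt.plot(df.Close.rolling(50).mean())
fplt.plot(df.Close.rolling(200).mean())

fplt.show()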