# yo yo yo

## 1. Import packages


In [3]:
import yfinance as yf
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import talib

## 2. Obtain data from yfinance

We will start by collecting data for a single stock.

In [4]:
# Define the start and end dates
start_date = "2021-01-01"
end_date = "2022-02-01"

# Choose your favorite stock
ticker = "NVDA"
data = yf.download(ticker, start=start_date, end=end_date, interval="1d")

[*********************100%%**********************]  1 of 1 completed


We can check that the data we just downloaded is a pandas DataFrame (which is very convenient).

In [5]:
type(data)

pandas.core.frame.DataFrame

We can look at the first five rows of the data.

In [6]:
data.head()

Date,Open,High,Low,Close,Adj Close,Volume
2022-01-03,298.149994,307.109985,297.850006,301.209991,300.821594,39154700
2022-01-04,302.769989,304.679993,283.48999,292.899994,292.522278,52715400
2022-01-05,289.48999,294.160004,275.329987,276.040009,275.684052,49806400
2022-01-06,276.399994,284.380005,270.649994,281.779999,281.416626,45418600
2022-01-07,281.410004,284.220001,270.570007,272.470001,272.118622,40993900
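
To explore this structure without hitting the network, the first few rows shown above can be rebuilt by hand. This is only a sketch: the name `sample` is made up for illustration, and it assumes the downloaded frame has a `DatetimeIndex` named `Date` with one column per price field, as the output above suggests.

```python
import pandas as pd

# Three rows copied from the data.head() output above (no download needed).
# `sample` is a hypothetical stand-in for the real `data` frame.
sample = pd.DataFrame(
    {
        "Open": [298.149994, 302.769989, 289.48999],
        "High": [307.109985, 304.679993, 294.160004],
        "Low": [297.850006, 283.48999, 275.329987],
        "Close": [301.209991, 292.899994, 276.040009],
        "Adj Close": [300.821594, 292.522278, 275.684052],
        "Volume": [39154700, 52715400, 49806400],
    },
    index=pd.to_datetime(["2022-01-03", "2022-01-04", "2022-01-05"]),
)
sample.index.name = "Date"

# Same kind of check as type(data), plus a look at the index and column types
print(isinstance(sample, pd.DataFrame))
print(type(sample.index).__name__)
print(sample.dtypes)

# Rows are addressed by date, columns by name
close_on_jan_4 = sample.loc[pd.Timestamp("2022-01-04"), "Close"]
print(close_on_jan_4)
```

Because the index holds dates, a single day's values can be looked up by timestamp, which is handy when sanity-checking indicator values later.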


## 3. Adding indicator features from TA-lib

Let's first check out which indicators we can add (there are a lot!).

In [7]:
from talib.abstract import Function

# Print every indicator, grouped by category, with its display name and inputs
for group, funcs in talib.get_function_groups().items():
    print(group)
    print('-----------------------------------------')
    for func in funcs:
        f = Function(func)
        print(f"{func} - {f.info['display_name']}")
        print(f"Inputs: {f.input_names}\n")

    print()

Cycle Indicators
-----------------------------------------
HT_DCPERIOD - Hilbert Transform - Dominant Cycle Period
Inputs: OrderedDict([('price', 'close')])

HT_DCPHASE - Hilbert Transform - Dominant Cycle Phase
Inputs: OrderedDict([('price', 'close')])

HT_PHASOR - Hilbert Transform - Phasor Components
Inputs: OrderedDict([('price', 'close')])

HT_SINE - Hilbert Transform - SineWave
Inputs: OrderedDict([('price', 'close')])

HT_TRENDMODE - Hilbert Transform - Trend vs Cycle Mode
Inputs: OrderedDict([('price', 'close')])


Math Operators
-----------------------------------------
ADD - Vector Arithmetic Add
Inputs: OrderedDict([('price0', 'high'), ('price1', 'low')])

DIV - Vector Arithmetic Div
Inputs: OrderedDict([('price0', 'high'), ('price1', 'low')])

MAX - Highest value over a specified period
Inputs: OrderedDict([('price', 'close')])

MAXINDEX - Index of highest value over a specified period
Inputs: OrderedDict([('price', 'close')])

MIN - Lowest value over a specified period
Inp

If the output is truncated, look for an option like "view as a scrollable element" in your notebook viewer.

Or just check out the full list: https://github.com/TA-Lib/ta-lib-python#overlap-studies

We can start by adding some simple indicators. For example, the simple moving average (SMA) takes the closing prices and a custom period; here we compute it over 30, 60, and 120 days.
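
To make the definition concrete, here is a minimal sketch of what an SMA value is: the plain arithmetic mean of the most recent prices. It uses pandas only and the five adjusted closes from the `data.head()` output. The 5-day window is a hypothetical choice so that the toy series is long enough; it is not one of the periods used in this notebook.

```python
import pandas as pd

# First five adjusted closes, copied from the data.head() output
adj_close = pd.Series(
    [300.821594, 292.522278, 275.684052, 281.416626, 272.118622],
    index=pd.to_datetime(
        ["2022-01-03", "2022-01-04", "2022-01-05", "2022-01-06", "2022-01-07"]
    ),
    name="Adj Close",
)

# Hypothetical window length, chosen only for this toy example
window = 5

# The SMA at the latest date is the mean of the last `window` prices
latest_sma = adj_close.tail(window).mean()
print(f"{window}-day SMA on {adj_close.index[-1].date()}: {latest_sma:.4f}")
```

A longer period averages over more days, so the resulting line reacts more slowly to recent price moves.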

In [8]:
# Moving Averages
data['SMA30'] = talib.SMA(data['Adj Close'], timeperiod=30)
data['SMA60'] = talib.SMA(data['Adj Close'], timeperiod=60)
data['SMA120'] = talib.SMA(data['Adj Close'], timeperiod=120)

Now we can check out the head of the DataFrame again, and we'll see three extra columns.

In [9]:
data.head()

Date,Open,High,Low,Close,Adj Close,Volume,SMA30,SMA60,SMA120
2022-01-03,298.149994,307.109985,297.850006,301.209991,300.821594,39154700,,,
2022-01-04,302.769989,304.679993,283.48999,292.899994,292.522278,52715400,,,
2022-01-05,289.48999,294.160004,275.329987,276.040009,275.684052,49806400,,,
2022-01-06,276.399994,284.380005,270.649994,281.779999,281.416626,45418600,,,
2022-01-07,281.410004,284.220001,270.570007,272.470001,272.118622,40993900,,,


Oh no, there are some NaN values. What happened?
This is because, 