<a href="https://colab.research.google.com/github/jharilal/candlestick_analysis/blob/main/Candlestick_Analysis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Candlestick Indicator Analysis



## Outline



1. Problem Questions
2. Project Installs
3. Data Collection
4. Data Preparation
5. Data Analysis
6. Report

## Problem Questions

 - How accurate are technical indicators at predicting stock price movement for the top 100 / 500 companies over the past 5 years?
 - Does the success rate of these indicators differ by industry / sub-industry?
 - Analyze the popular single-candlestick indicators: Hammer / Long-Legged Doji / Shooting Star / Hanging Man / Gravestone / Dragonfly
 - Build this so that the set of technical indicators under analysis can be expanded
 - How often do these indicators occur?

 Together, these questions measure how reliable these indicators are as a basis for trading decisions.
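
One way to make "how accurate" and "how often" concrete is a hit rate and an occurrence count per indicator and sector. The sketch below assumes a hypothetical table of detections with one row per occurrence. The column names (`indicator`, `sector`, `close`, `close_3d`), the made-up numbers, and the rule that a bullish signal "succeeds" when the close is higher three days later are illustrative assumptions, not definitions taken from this notebook.

```python
import pandas as pd

# Hypothetical detections: one row per indicator occurrence (made-up numbers)
detections = pd.DataFrame({
    "indicator": ["Hammer", "Hammer", "Hammer", "Dragonfly"],
    "sector": ["Information Technology", "Information Technology",
               "Utilities", "Information Technology"],
    "close": [100.0, 50.0, 20.0, 10.0],
    "close_3d": [103.0, 49.0, 21.0, 10.5],
})

# Assumption: a bullish indicator "succeeds" if the close is higher 3 days later
detections["success"] = detections["close_3d"] > detections["close"]

# Occurrence count and hit rate for each indicator / sector pair
summary = (
    detections.groupby(["indicator", "sector"])["success"]
    .agg(occurrences="count", hit_rate="mean")
    .reset_index()
)
print(summary)
```

The same grouping could be repeated on sub-industry to address the second question.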


## Project Installs

In [1]:
!pip install yahoofinance
!pip install yfinance

Collecting yahoofinance
  Downloading yahoofinance-0.0.2-py3-none-any.whl (18 kB)
Installing collected packages: yahoofinance
Successfully installed yahoofinance-0.0.2
Collecting yfinance
  Downloading yfinance-0.1.70-py2.py3-none-any.whl (26 kB)
Collecting lxml>=4.5.1
  Downloading lxml-4.7.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (6.4 MB)
Collecting requests>=2.26
  Downloading requests-2.27.1-py2.py3-none-any.whl (63 kB)
Installing collected packages: requests, lxml, yfinance
  Attempting uninstall: requests
    Found existing installation: requests 2.23.0
    Uninstalling requests-2.23.0:
      Successfully uninstalled requests-2.23.0
  Attempting uninstall: lxml
    Found existing installation: lxml 4.2.6
    Uninstalling lxml-4.2.6:
      Successfully uninstalled lxml-4.2.6
ERROR: pip's dependency resolver does

In [2]:
import yfinance as yf
import pandas as pd
msft_tick = yf.download('MSFT')

[*********************100%***********************]  1 of 1 completed


## Data Collection

In [3]:
nasdaq_df = pd.read_html('https://en.wikipedia.org/wiki/Nasdaq-100')

In [4]:
nasdaq_df[3]

,Company,Ticker,GICS Sector,GICS Sub-Industry
0,Activision Blizzard,ATVI,Communication Services,Interactive Home Entertainment
1,Adobe,ADBE,Information Technology,Application Software
2,ADP,ADP,Information Technology,Data Processing & Outsourced Services
3,Airbnb,ABNB,Consumer Discretionary,Internet & Direct Marketing Retail
4,Align,ALGN,Health Care,Health Care Supplies
...,...,...,...,...
97,Workday,WDAY,Information Technology,Application Software
98,Xcel Energy,XEL,Utilities,Multi-Utilities
99,Xilinx,XLNX,Information Technology,Semiconductors
100,Zoom,ZM,Information Technology,Application Software
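
`nasdaq_df[3]` picks the constituents table by position, which would silently point at a different table if the Wikipedia page gains or loses one. A more defensive sketch selects the first table that contains the expected columns. `find_table` is a hypothetical helper, and the stand-in tables below replace the live `pd.read_html` result so the example runs offline.

```python
import pandas as pd

def find_table(tables, required_columns):
    """Return the first DataFrame in tables that has every required column."""
    for table in tables:
        if set(required_columns).issubset(table.columns):
            return table
    raise ValueError(f"No table with columns {required_columns}")

# Stand-ins for the list of DataFrames that pd.read_html returns
tables = [
    pd.DataFrame({"Year": [2020], "Note": ["placeholder table"]}),
    pd.DataFrame({
        "Company": ["Adobe"],
        "Ticker": ["ADBE"],
        "GICS Sector": ["Information Technology"],
        "GICS Sub-Industry": ["Application Software"],
    }),
]

n100_sketch = find_table(tables, ["Ticker", "GICS Sector", "GICS Sub-Industry"])
print(n100_sketch["Ticker"].to_list())
```

Raising an error when no table matches makes a page layout change visible immediately instead of surfacing later as a missing column.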


In [5]:
n100 = nasdaq_df[3]

In [6]:
n100

,Company,Ticker,GICS Sector,GICS Sub-Industry
0,Activision Blizzard,ATVI,Communication Services,Interactive Home Entertainment
1,Adobe,ADBE,Information Technology,Application Software
2,ADP,ADP,Information Technology,Data Processing & Outsourced Services
3,Airbnb,ABNB,Consumer Discretionary,Internet & Direct Marketing Retail
4,Align,ALGN,Health Care,Health Care Supplies
...,...,...,...,...
97,Workday,WDAY,Information Technology,Application Software
98,Xcel Energy,XEL,Utilities,Multi-Utilities
99,Xilinx,XLNX,Information Technology,Semiconductors
100,Zoom,ZM,Information Technology,Application Software


In [7]:
# Keep only the columns needed to label each ticker with its sector and sub-industry
ticker_sec_gics = n100[['Ticker', 'Company', 'GICS Sector', 'GICS Sub-Industry']].copy()

In [8]:
ticker_sec_gics.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 102 entries, 0 to 101
Data columns (total 4 columns):
 #   Column             Non-Null Count  Dtype 
---  ------             --------------  ----- 
 0   Ticker             102 non-null    object
 1   Company            102 non-null    object
 2   GICS Sector        102 non-null    object
 3   GICS Sub-Industry  102 non-null    object
dtypes: object(4)
memory usage: 3.3+ KB


In [9]:
ticker = ticker_sec_gics['Ticker'].to_list()

In [10]:
start_date = '2019-03-31'
end_date = '2021-04-01'

In [11]:
main_df = pd.DataFrame()

In [12]:
# Restrict to a small sample for development; this overrides the full Nasdaq-100 list above
ticker = ['FB', 'AAPL', 'NFLX', 'GOOGL', 'AMZN']

In [13]:
for sym in ticker:

  # Download daily price data for this symbol and drop the adjusted close
  temp_df = yf.download(sym, start=start_date, end=end_date)
  temp_df = temp_df.drop(columns=['Adj Close'])

  # Look up the sector and sub-industry for this symbol
  sym_row = ticker_sec_gics[ticker_sec_gics['Ticker'] == sym].iloc[0]

  # Label every price row with its ticker, sector, and sub-industry
  temp_df.insert(0, 'Ticker', sym)
  temp_df.insert(1, 'GICS Sector', sym_row['GICS Sector'])
  temp_df.insert(2, 'GICS Sub-Industry', sym_row['GICS Sub-Industry'])

  main_df = pd.concat([main_df, temp_df])

[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed


In [14]:
main_df

Date,Ticker,GICS Sector,GICS Sub-Industry,Open,High,Low,Close,Volume
2019-04-01,FB,Communication Services,Interactive Media & Services,167.830002,168.899994,167.279999,168.699997,10381500
2019-04-02,FB,Communication Services,Interactive Media & Services,170.139999,174.899994,169.550003,174.199997,23946500
2019-04-03,FB,Communication Services,Interactive Media & Services,174.500000,177.960007,172.949997,173.539993,27391100
2019-04-04,FB,Communication Services,Interactive Media & Services,176.020004,178.000000,175.529999,176.020004,17847700
2019-04-05,FB,Communication Services,Interactive Media & Services,176.880005,177.000000,175.100006,175.720001,9594100
...,...,...,...,...,...,...,...,...
2021-03-25,AMZN,Consumer Discretionary,Internet & Direct Marketing Retail,3072.989990,3109.780029,3037.139893,3046.260010,3563500
2021-03-26,AMZN,Consumer Discretionary,Internet & Direct Marketing Retail,3044.060059,3056.659912,2996.000000,3052.030029,3312900
2021-03-29,AMZN,Consumer Discretionary,Internet & Direct Marketing Retail,3055.439941,3091.250000,3028.449951,3075.729980,2746000
2021-03-30,AMZN,Consumer Discretionary,Internet & Direct Marketing Retail,3070.010010,3073.000000,3034.000000,3055.290039,2337600


## Data Preparation

The code needs to:

Iterate over the rows:

  - do a basic calculation to determine whether an indicator is present

  - if an indicator is found, move on to recording it

  - save the indicator in another dataframe with:

    - ticker, GICS sector, GICS sub-industry, indicator type, close, volume, uptrend/downtrend/neither, close value in 1d, 3d, 7d, 10d.

For the trend check:

- check that close < open
- check that the symbols are the same
- look at the 2 days prior to the target
- only run the check if an indicator is present
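
The forward close values in the record described above can also be computed without iterating, by shifting the `Close` column within each ticker so that one symbol's prices never leak into another's. This is a sketch on made-up prices for hypothetical tickers. The helper name `add_forward_closes` and the `close_{n}d` column names are assumptions, and the horizons count trading rows rather than calendar days.

```python
import pandas as pd

def add_forward_closes(df, horizons=(1, 3, 7, 10)):
    """Add close_{n}d columns holding the close n rows later for the same ticker."""
    out = df.copy()
    for n in horizons:
        # shift(-n) looks n rows ahead; grouping keeps each ticker separate
        out[f"close_{n}d"] = out.groupby("Ticker")["Close"].shift(-n)
    return out

# Made-up prices for two hypothetical tickers, already sorted by date within each
prices = pd.DataFrame({
    "Ticker": ["AAA"] * 4 + ["BBB"] * 3,
    "Close": [10.0, 11.0, 12.0, 13.0, 50.0, 51.0, 52.0],
})

# Short horizons so the tiny sample still has values to show
labelled = add_forward_closes(prices, horizons=(1, 3))
print(labelled)
```

Rows too close to the end of a ticker's history get `NaN`, which makes it explicit that no outcome is available for them yet.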

In [76]:
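class Indicator:
  """Shared helpers for scanning price rows for candlestick indicators"""

  indicator_data = []

  @staticmethod
  def indicator_detect(df: pd.DataFrame):
    """Walks every row of df and extracts the values needed for detection"""
    for index in range(df.shape[0]):
      cur_row = df.iloc[index]
      cur_ticker, cur_open, cur_high, cur_low, cur_close = Indicator.assign_values(cur_row)
      # Hammer.hammer_detect()

  @staticmethod
  def assign_values(row):
    """Extracts the ticker, open, high, low, and close values from a row"""
    # Positional access: Ticker is column 0; Open, High, Low, Close are columns 3 to 6
    return row.iloc[0], row.iloc[3], row.iloc[4], row.iloc[5], row.iloc[6]

  @staticmethod
  def validate_tickers(target_ticker: str, list_of_assigned_values: list):
    """Ensures that all assigned-value tuples belong to the target ticker"""
    for item in list_of_assigned_values:
      if item[0] != target_ticker:
        return False
    return True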
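
`indicator_data` is declared above but never filled. One possible way to collect detections, sketched below, is to pass the row scan a dictionary of check functions so that new indicators can be added without changing the loop. The function `scan`, the record fields, and the placeholder check `flat_body` are hypothetical; the sector columns are omitted for brevity and the sample prices are made up.

```python
import pandas as pd

def scan(df, checks):
    """Apply every check to every row and collect one record per match."""
    records = []
    for row in df.itertuples(index=True):
        for name, check in checks.items():
            if check(row.Open, row.High, row.Low, row.Close):
                records.append({
                    "Date": row.Index,
                    "Ticker": row.Ticker,
                    "Indicator": name,
                    "Close": row.Close,
                    "Volume": row.Volume,
                })
    return pd.DataFrame(records)

# Placeholder check, for illustration only: the close equals the open
def flat_body(open_, high, low, close):
    return open_ == close

# Made-up prices for a hypothetical ticker
sample = pd.DataFrame(
    {"Ticker": ["AAA", "AAA"], "Open": [10.0, 11.0], "High": [10.5, 11.5],
     "Low": [9.5, 10.5], "Close": [10.0, 11.2], "Volume": [1000, 1200]},
    index=pd.to_datetime(["2021-01-04", "2021-01-05"]),
)

indicator_df = scan(sample, {"FlatBody": flat_body})
print(indicator_df)
```

Building a list of dictionaries and converting it once at the end avoids growing a DataFrame row by row.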




In [83]:
Indicator.assign_values(main_df.iloc[2])

('FB', 174.5, 177.9600067138672, 172.9499969482422, 173.5399932861328)

In [56]:
main_df.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 2525 entries, 2019-04-01 to 2021-03-31
Data columns (total 8 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   Ticker             2525 non-null   object 
 1   GICS Sector        2525 non-null   object 
 2   GICS Sub-Industry  2525 non-null   object 
 3   Open               2525 non-null   float64
 4   High               2525 non-null   float64
 5   Low                2525 non-null   float64
 6   Close              2525 non-null   float64
 7   Volume             2525 non-null   int64  
dtypes: float64(4), int64(1), object(3)
memory usage: 177.5+ KB


In [None]:
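class Trend:
  """Trend checks for the days leading up to an indicator"""

  @staticmethod
  def downtrend(df, index):
    """Checks for the presence of a downtrend within 2 days of the indicator"""
    # Not enough history to look back two rows
    if index < 2:
      return False

    two_day_ago = Indicator.assign_values(df.iloc[index - 2])
    one_day_ago = Indicator.assign_values(df.iloc[index - 1])
    curr_day    = Indicator.assign_values(df.iloc[index])

    three_day = [two_day_ago, one_day_ago, curr_day]

    # All three rows must belong to the same ticker
    if not Indicator.validate_tickers(curr_day[0], three_day):
      return False

    # Both prior days must close below their open
    # (tuple order: ticker, open, high, low, close)
    for i in range(2):
      if three_day[i][4] >= three_day[i][1]:
        return False

    # The open must not rise from two days ago to one day ago
    for i in range(1):
      if three_day[i][1] < three_day[i + 1][1]:
        return False

    return True

  @staticmethod
  def uptrend():
    """Placeholder: not implemented yet"""
    pass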

In [86]:
a = [4, 5, 6, 2]

b, c, d = a[2], a[3], a[1]

print(b)

6


Compare the current ticker against the current predicted value.

In [None]:
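class Predict:

  @staticmethod
  def predictions(df, index):
    """Provides the close value with respect to
       the indicator 1, 3, 5, and 10 days ahead"""
    days = [1, 3, 5, 10]
    temp_list = []
    for day in days:
      if index + day < df.shape[0]:
        # Close is the column at position 6
        temp_list.append(df.iloc[index + day, 6])
      else:
        # Not enough rows after the indicator
        temp_list.append(None)
    pred_1, pred_3, pred_5, pred_10 = temp_list
    return pred_1, pred_3, pred_5, pred_10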

Hammer criteria:


*   high ≈ the higher of the open / close values (little or no upper wick)
*   lower wick (the lower of open / close, minus the low) ≥ 2x the candle body
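
As a worked example of these two criteria, the sketch below checks a single made-up candle. The 0.1% tolerance for "approximately equal" and the factor of 2 follow the `Hammer` class in this notebook; the standalone function `is_hammer`, its parameter names, and the prices are illustrative.

```python
def is_hammer(open_, high, low, close, tolerance=0.001, wick_ratio=2.0):
    """Return True when a candle meets both hammer criteria."""
    body_top, body_bottom = max(open_, close), min(open_, close)
    body = body_top - body_bottom
    # Criterion 1: the high sits within the tolerance of the top of the body
    near_high = body_top * (1 - tolerance) <= high <= body_top * (1 + tolerance)
    # Criterion 2: the lower wick is at least wick_ratio times the body
    long_lower_wick = (body_bottom - low) >= wick_ratio * body
    return near_high and long_lower_wick

# Made-up candle: body from 99.0 to 100.0, high 100.05, low 96.5
print(is_hammer(open_=99.0, high=100.05, low=96.5, close=100.0))

# Same body and high, but a short lower wick (low 98.5)
print(is_hammer(open_=99.0, high=100.05, low=98.5, close=100.0))
```

The first candle has a body of 1.0 and a lower wick of 2.5, so it passes; the second has a lower wick of only 0.5 and fails. A candle whose open equals its close has a body of zero and passes the second criterion for any low at or below the body, so a separate doji check may be needed.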



In [68]:
class Hammer:
  """Identifies hammers and creates hammer entries in indicator_df"""

  @staticmethod
  def hammer_entry():
    """Creates the value of the hammer entry into indicator_df"""
    pass

  @staticmethod
  def hammer_criteria(open, high, low, close):
    """Tests for the criteria of the hammer indicator on a day"""
    # The candle body spans from the lower to the higher of open and close
    upper_bound, lower_bound = max(open, close), min(open, close)
    check_one = Hammer.upper_wick_check(upper_bound, high)
    check_two = Hammer.lower_wick_check(upper_bound, lower_bound, low)
    return check_one and check_two

  @staticmethod
  def hammer_detect(df, index, open, high, low, close):
    """Detects the presence of a hammer and creates an entry if present"""
    crit_check = Hammer.hammer_criteria(open, high, low, close)
    if crit_check:
      curr_trend = Trend.downtrend(df, index)

  @staticmethod
  def upper_wick_check(upper_bound, high):
    """Tests for the upper wick of the hammer. The top of the candle body
       should be approximately the same as the daily high"""
    upper_range = [upper_bound * 0.999, upper_bound * 1.001]
    return upper_range[0] <= high <= upper_range[1]

  @staticmethod
  def lower_wick_check(upper_bound, lower_bound, low):
    """Tests the lower wick of the hammer. It should be at least twice the size
       of the candle body"""
    day_diff = upper_bound - lower_bound
    return (lower_bound - low) >= (2 * day_diff)

In [None]:
class ShootingStar:
  """Identifies and creates a ShootingStar Object"""

In [None]:
# class Longlegdoji:
#   pass

In [None]:
# class Gravestone:
#   pass

In [None]:
# class Dragonfly:
#   pass
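
The remaining indicator classes above are still stubs. A shooting star is commonly described as the mirror image of a hammer: a small body near the low of the day with a long upper wick. The sketch below simply mirrors the hammer thresholds used earlier (0.1% tolerance, wick at least twice the body). Those thresholds, the function name `is_shooting_star`, and the prices are assumptions, not criteria stated in this notebook.

```python
def is_shooting_star(open_, high, low, close, tolerance=0.001, wick_ratio=2.0):
    """Return True when a candle looks like a mirrored hammer (assumed criteria)."""
    body_top, body_bottom = max(open_, close), min(open_, close)
    body = body_top - body_bottom
    # Assumed criterion 1: the low sits within the tolerance of the bottom of the body
    near_low = body_bottom * (1 - tolerance) <= low <= body_bottom * (1 + tolerance)
    # Assumed criterion 2: the upper wick is at least wick_ratio times the body
    long_upper_wick = (high - body_top) >= wick_ratio * body
    return near_low and long_upper_wick

# Made-up candle: body from 99.0 to 100.0, low 98.95, high 103.0
print(is_shooting_star(open_=100.0, high=103.0, low=98.95, close=99.0))
```

Whether the pattern should also require a preceding uptrend, as the hammer requires a downtrend here, is left open.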