# Long / Short Signals based on Technical Indicators
Author: **Peeyush Sharma**; Feedback: **PSharma3@gmail.com**

This notebook captures some basic long / short signals based on technical indicators. The core daily market data was retrieved from a local catalog and then used to generate various technical indicators, stored as CSV files, for a group of equities. This analysis loads that data and generates a list of dates with opportunities for long / short decisions. Decisions are based on the relative value of an equity's closing price over a given duration (say 90 days). We start with a broad selection of dates and gradually add limiting factors to arrive at optimal days to purchase equities in a bear market scenario. Similar methods can be used for normal and bull markets as well, with some adjustments.

This is still very high-level decision making. A stock may be on a long bull run, and the 90-day averages may not capture its full potential in the long run. That kind of analysis is more detailed and is not captured in this publicly shared notebook.

In [33]:
import os
import os.path
from datetime import datetime, timedelta

import pandas as pd

pd.options.mode.chained_assignment = None 
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)

In [34]:
BASE_DIR = '../../../../workspace/HelloPython/HistoricalMarketData/TechnicalIndicators'
DURATIONS = (14, 30, 90, 200) # Roughly for bi-weekly, monthly, quarterly, and 200 days moving averages and other tech indicators

In [35]:
# Retrieve data for Technical Indicators from pre-calculated CSVs
def generate_file_path(symbol, date=None):
    """Build the CSV path for a symbol, optionally suffixed with a date (YYYYMMDD)."""
    if date is not None:
        str_date = datetime.strftime(date, '%Y%m%d')
        file_name = symbol.lower()+'_'+str_date+'.csv'
    else:
        file_name = symbol.lower()+'.csv'
    file_path = os.path.join(BASE_DIR, file_name)
    # os.path.join never returns None, so check the file system instead
    if not os.path.isfile(file_path):
        print('Could not find file for symbol:{}'.format(symbol))
    return file_path, file_name

In [36]:
# Retrieve sample data for a stock for predictions. 
str_date_from = '2012-01-03'
str_date_to = '2020-04-30'

dt_from = datetime.strptime(str_date_from, '%Y-%m-%d')
dt_to = datetime.strptime(str_date_to, '%Y-%m-%d')

# symbols = ['FB', 'MSFT', 'GOOGL', 'NFLX', 'AAPL', 'AMZN', 'WFC', 'TSLA', 'BAC', 'C', 'GS', 'JPM', 'MS', 'MRK', 'NKE']
symbol = 'AAPL'
# for symbol in symbols:
file_path, _ = generate_file_path(symbol)
if file_path is not None:
    try: 
        dfrm = pd.read_csv(file_path)
        dfrm['date'] = pd.to_datetime(dfrm['date'])
        dfrm.set_index('date', inplace=True)
        dfrm = dfrm.loc[dt_from: dt_to, :]


    except FileNotFoundError as e:
        print('Exception reading input data for symbol {}.'.format(symbol))
        print(e)

dfrm.tail(5)

| date | symbol | close | volume | mean_200 | stddev_200 | pcntleStdDevs_200 | pcntleVolume_200 | pcntleClosing_200 | oscillator_200 | accu_dist_200 | ... | stddev_30 | accu_dist_90 | bollingerLower_90 | bollingerUpper_90 | mean_90 | oscillator_90 | pcntleClosing_90 | pcntleStdDevs_90 | pcntleVolume_90 | stddev_90 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2020-04-24 | AAPL | 282.97 | 31627183 | 276.206043 | 28.425625 | 42.446043 | 50.359712 | 61.870504 | 56.987261 | 5301392.0 | ... | 15.342032 | 7145867.0 | 224.419721 | 342.218374 | 283.319048 | 56.987261 | 50.793651 | 74.603175 | 20.634921 | 29.449663 |
| 2020-04-27 | AAPL | 283.17 | 29271893 | 277.352701 | 27.634313 | 39.416058 | 42.335766 | 61.313869 | 57.181756 | 5476160.0 | ... | 15.072771 | 7319212.0 | 224.377218 | 341.442464 | 282.909841 | 57.181756 | 52.380952 | 73.015873 | 15.873016 | 29.266311 |
| 2020-04-28 | AAPL | 278.58 | 28001187 | 277.706642 | 27.333412 | 37.226277 | 38.686131 | 55.474453 | 52.718078 | 2072180.0 | ... | 14.856335 | 2759266.0 | 224.430008 | 340.148088 | 282.289048 | 52.718078 | 49.206349 | 69.84127 | 12.698413 | 28.92952 |
| 2020-04-29 | AAPL | 287.73 | 34320204 | 278.082701 | 27.111808 | 35.036496 | 59.124088 | 65.693431 | 61.61626 | 8842875.0 | ... | 15.041739 | 11719160.0 | 224.838585 | 338.577288 | 281.707937 | 61.61626 | 61.904762 | 66.666667 | 36.507937 | 28.434676 |
| 2020-04-30 | AAPL | 293.8 | 45765968 | 278.196594 | 27.045793 | 34.782609 | 76.086957 | 72.463768 | 67.519206 | 13356050.0 | ... | 15.483024 | 17752700.0 | 225.302608 | 337.158662 | 281.230635 | 67.519206 | 71.428571 | 63.492063 | 50.793651 | 27.964013 |


A sneak peek at the data and its statistical distribution

In [37]:
"""
Plot daily closing values and couple other technical indicators using plotly.express
# Note that this plot may not show up in some platforms
"""
# The plot works, but it increases the size of the GitHub upload by several MB, so it is commented out before upload.
# Re-enabling it also requires `import plotly.express as px`.

# fig = px.line(dfrm, x=dfrm.index, y=['close', 'oscillator_30',  'mean_30'],  title='Time Series with Range Slider and Selectors')
# fig.update_xaxes(
#     rangeslider_visible=True,
#     rangeselector=dict(
#         buttons=list([
#             dict(count=1, label="1m", step="month", stepmode="backward"),
#             dict(count=6, label="6m", step="month", stepmode="backward"),
#             dict(count=1, label="YTD", step="year", stepmode="todate"),
#             dict(count=1, label="1y", step="year", stepmode="backward"),
#             dict(step="all")
#         ])
#     )
# )

# fig.show()


## Strategies

### Buy Side Decisions based on a Waterfall Approach w/ Technical Indicators. 
We start by picking dates where the closing price of AAPL was significantly low relative to its 90-day range. To do so, we identify days when the AAPL closing price was below the lower band of the moving 90-day Bollinger range.
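
The band columns used here come from the pre-calculated CSVs. For readers without those files, the following is a minimal sketch of how such bands could be derived from a closing-price series as a rolling mean plus or minus two rolling standard deviations. The sample AAPL rows shown earlier are consistent with that form (283.319048 - 2 x 29.449663 is approximately 224.4197, the reported `bollingerLower_90`). The function name `bollinger_bands`, the row-count window, and the toy prices are assumptions for illustration, not the author's generator.

```python
import pandas as pd

def bollinger_bands(close: pd.Series, window: int, num_std: float = 2.0) -> pd.DataFrame:
    """Rolling mean with lower/upper bands at num_std rolling standard deviations."""
    mean = close.rolling(window).mean()
    std = close.rolling(window).std()  # sample standard deviation (ddof=1)
    return pd.DataFrame({
        'mean': mean,
        'lower': mean - num_std * std,
        'upper': mean + num_std * std,
    })

# Toy prices, not real market data: a flat range followed by a sharp drop
prices = pd.Series(
    [100.0, 101.0, 99.0, 100.0, 101.0, 99.0, 100.0, 101.0, 99.0, 90.0],
    index=pd.date_range('2020-01-01', periods=10, freq='D'),
)
bands = bollinger_bands(prices, window=10)

# True only where the close falls below the lower band
below_lower = prices < bands['lower']
print(below_lower[below_lower].index.tolist())
```

Rows before the first full window have no band values, so they never qualify as signals in this sketch.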

In [38]:
# Set sample duration for technical indicators
duration = 90

In [40]:
# Start simple. Identify days when closing price was lower than the lower band in Bollinger range
dates_lows_for_buy_ops_1 = [ date for date in dfrm.index if dfrm.loc[date, 'close'] < dfrm.loc[date, 'bollingerLower_'+str(duration)] ]
dates_lows_for_buy_ops_1[-20:]

[Timestamp('2018-11-13 00:00:00'),
 Timestamp('2018-11-14 00:00:00'),
 Timestamp('2018-11-15 00:00:00'),
 Timestamp('2018-11-16 00:00:00'),
 Timestamp('2018-11-19 00:00:00'),
 Timestamp('2018-11-20 00:00:00'),
 Timestamp('2018-11-21 00:00:00'),
 Timestamp('2018-11-23 00:00:00'),
 Timestamp('2018-11-26 00:00:00'),
 Timestamp('2018-11-27 00:00:00'),
 Timestamp('2018-12-07 00:00:00'),
 Timestamp('2018-12-21 00:00:00'),
 Timestamp('2018-12-24 00:00:00'),
 Timestamp('2020-03-12 00:00:00'),
 Timestamp('2020-03-16 00:00:00'),
 Timestamp('2020-03-17 00:00:00'),
 Timestamp('2020-03-18 00:00:00'),
 Timestamp('2020-03-19 00:00:00'),
 Timestamp('2020-03-20 00:00:00'),
 Timestamp('2020-03-23 00:00:00')]
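
The list comprehension above performs one label lookup per row. The same selection can be written as a boolean mask, which is usually faster on long histories. Below is a minimal sketch on a toy frame standing in for `dfrm`; the values are made up and only the column naming follows the notebook.

```python
import pandas as pd

duration = 90

# Toy frame standing in for dfrm; values are illustrative only
toy = pd.DataFrame(
    {'close': [100.0, 92.0, 95.0, 88.0],
     'bollingerLower_90': [93.0, 93.5, 94.0, 94.5]},
    index=pd.to_datetime(['2020-03-09', '2020-03-10', '2020-03-11', '2020-03-12']),
)

lower_col = 'bollingerLower_' + str(duration)

# Column-wise comparison yields a boolean Series aligned with the index
mask = toy['close'] < toy[lower_col]
dates_below_band = toy.index[mask].tolist()

# Same result as the row-by-row comprehension used in the notebook
dates_loop = [d for d in toy.index if toy.loc[d, 'close'] < toy.loc[d, lower_col]]
print(dates_below_band == dates_loop)
```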

Let us dive deeper into the return on a sample day. Pick a sample date from the output above and check whether the closing price that day was indeed lower than on the preceding days. Take March 12th, 2020 as an example, just as the NYSE and the broader market were approaching their lows during the COVID-19 pandemic. The next table captures daily closing prices of AAPL around 3/12/2020.

In [41]:
str_date_from = '2020-03-12'
# Check the closing values preceding it
dt_from = datetime.strptime(str_date_from, '%Y-%m-%d')
dfrm.loc[dt_from - timedelta(days=15):dt_from + timedelta(days=15),['close', 'volume']]

| date | close | volume |
|---|---|---|
| 2020-02-26 | 292.65 | 49678431 |
| 2020-02-27 | 273.52 | 80151381 |
| 2020-02-28 | 273.36 | 106721230 |
| 2020-03-02 | 298.81 | 85349339 |
| 2020-03-03 | 289.32 | 79868852 |
| 2020-03-04 | 302.74 | 54794568 |
| 2020-03-05 | 292.92 | 46893219 |
| 2020-03-06 | 289.03 | 56544246 |
| 2020-03-09 | 266.17 | 71686208 |
| 2020-03-10 | 285.34 | 71322520 |


So 2020-03-12 was indeed a good opportunity to buy AAPL. That said, as it turned out, it was only a local minimum: the AAPL price continued to fall over the next few days. The question is whether we can avoid purchasing at a local minimum and instead wait to get closer to the minimum over a broader period. We will try to make the decision more efficient.
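
One way to quantify whether a signal date was only a local minimum is to compare its close with the lowest close over the following sessions. The sketch below does that on toy prices; the helper name `forward_min_return` and the `horizon` parameter are hypothetical and not part of the original notebook.

```python
import pandas as pd

def forward_min_return(close: pd.Series, signal_date, horizon: int = 5) -> float:
    """Lowest close over the next `horizon` rows, relative to the close on signal_date.

    A clearly negative value means the price kept falling after the signal.
    """
    pos = close.index.get_loc(pd.Timestamp(signal_date))
    future = close.iloc[pos + 1: pos + 1 + horizon]
    if future.empty:
        return float('nan')
    return float(future.min() / close.iloc[pos] - 1.0)

# Toy closes, not real market data
toy_close = pd.Series(
    [100.0, 95.0, 90.0, 97.0],
    index=pd.to_datetime(['2020-01-02', '2020-01-03', '2020-01-06', '2020-01-07']),
)
drop = forward_min_return(toy_close, '2020-01-02', horizon=3)
print(round(drop, 4))
```

Applied to `dfrm['close']` and a date such as 2020-03-12, a helper like this would give a single number for how far the price fell after the signal.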

Let us include some other technical indicators beyond Bollinger bands. Let us look for days when the closing price is in the lowest 10% of its range (below the 10th percentile) but traded volume is in the highest 10% (above the 90th percentile), and go long on the first such day. Note, however, that this strategy should only be applied to companies with robust cash flows and balance sheets. For companies with volatile earnings and weaker balance sheets, high interest at very low prices may simply be the start of a long selling cycle.
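
The `pcntleClosing_*` and `pcntleVolume_*` columns are read from the pre-calculated CSVs, and their exact definition is not shown in this notebook. One plausible reading, stated here purely as an assumption, is the percentage of observations in the trailing window that lie strictly below the latest value. A minimal sketch of that reading, with a hypothetical helper `rolling_percentile` and toy data:

```python
import numpy as np
import pandas as pd

def rolling_percentile(series: pd.Series, window: int) -> pd.Series:
    """Percent of values in each trailing window that are strictly below the latest value."""
    def pct_below_last(values: np.ndarray) -> float:
        return float((values < values[-1]).mean() * 100.0)
    return series.rolling(window).apply(pct_below_last, raw=True)

# Toy data: price falls steadily while volume rises steadily
toy_close = pd.Series([5.0, 4.0, 3.0, 2.0, 1.0])
toy_volume = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0])

close_pct = rolling_percentile(toy_close, window=5)
volume_pct = rolling_percentile(toy_volume, window=5)

# The last row combines a low closing percentile with a high volume percentile
print(close_pct.iloc[-1], volume_pct.iloc[-1])
```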

In [42]:
dates_lows_for_buy_ops_2 = [ date for date in dfrm.index if dfrm.loc[date, 'pcntleClosing_'+str(duration)] < 10 and dfrm.loc[date, 'pcntleVolume_'+str(duration)] > 90]
dates_lows_for_buy_ops_2[-20:]

[Timestamp('2018-02-07 00:00:00'),
 Timestamp('2018-02-08 00:00:00'),
 Timestamp('2018-02-09 00:00:00'),
 Timestamp('2018-02-12 00:00:00'),
 Timestamp('2018-11-02 00:00:00'),
 Timestamp('2018-11-05 00:00:00'),
 Timestamp('2018-11-12 00:00:00'),
 Timestamp('2018-11-14 00:00:00'),
 Timestamp('2018-11-20 00:00:00'),
 Timestamp('2018-12-10 00:00:00'),
 Timestamp('2018-12-20 00:00:00'),
 Timestamp('2018-12-21 00:00:00'),
 Timestamp('2019-01-03 00:00:00'),
 Timestamp('2020-03-09 00:00:00'),
 Timestamp('2020-03-12 00:00:00'),
 Timestamp('2020-03-13 00:00:00'),
 Timestamp('2020-03-16 00:00:00'),
 Timestamp('2020-03-17 00:00:00'),
 Timestamp('2020-03-20 00:00:00'),
 Timestamp('2020-03-23 00:00:00')]

Combining percentile closing and percentile volume did not reduce the count of candidate dates. Let us try stochastic oscillators.
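
For reference, the stochastic oscillator (%K) is commonly defined as 100 x (C - L_n) / (H_n - L_n), where C is the latest close and L_n and H_n are the lowest and highest prices over the look-back window. The sketch below uses closing prices for the highs and lows as well, which is an assumption made to keep the example to a single series; the textbook definition uses intraday highs and lows, and the notebook does not show how its `oscillator_*` columns were built.

```python
import pandas as pd

def stochastic_oscillator(close: pd.Series, window: int) -> pd.Series:
    """%K computed from closes only: position of the latest close within the window range."""
    lowest = close.rolling(window).min()
    highest = close.rolling(window).max()
    spread = highest - lowest
    # Avoid division by zero when the window is completely flat
    return 100.0 * (close - lowest) / spread.where(spread != 0)

# Toy closes, not real market data
toy_close = pd.Series([10.0, 20.0, 15.0, 12.0])
k = stochastic_oscillator(toy_close, window=4)
print(k.iloc[-1])
```

Values near 0 mean the close sits at the bottom of its recent range, which is why the buy-side filter below looks for readings under 25.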

In [43]:
dates_lows_for_buy_ops_3 = [ date for date in dfrm.index if dfrm.loc[date, 'oscillator_'+str(duration)] < 25 and dfrm.loc[date, 'pcntleVolume_'+str(duration)] > 90 ]
dates_lows_for_buy_ops_3[-20:]

[Timestamp('2018-02-07 00:00:00'),
 Timestamp('2018-02-08 00:00:00'),
 Timestamp('2018-02-09 00:00:00'),
 Timestamp('2018-11-02 00:00:00'),
 Timestamp('2018-11-05 00:00:00'),
 Timestamp('2018-11-12 00:00:00'),
 Timestamp('2018-11-14 00:00:00'),
 Timestamp('2018-11-20 00:00:00'),
 Timestamp('2018-12-10 00:00:00'),
 Timestamp('2018-12-20 00:00:00'),
 Timestamp('2018-12-21 00:00:00'),
 Timestamp('2019-01-03 00:00:00'),
 Timestamp('2020-02-27 00:00:00'),
 Timestamp('2020-02-28 00:00:00'),
 Timestamp('2020-03-09 00:00:00'),
 Timestamp('2020-03-12 00:00:00'),
 Timestamp('2020-03-16 00:00:00'),
 Timestamp('2020-03-17 00:00:00'),
 Timestamp('2020-03-20 00:00:00'),
 Timestamp('2020-03-23 00:00:00')]

No significant difference either. March 2020 still shows many days flagged as buying opportunities. Let us bring standard deviation into our filter as well, via the `pcntleStdDevs` column.

In [44]:
dates_lows_for_buy_ops_4 = [ date for date in dfrm.index if dfrm.loc[date, 'oscillator_'+str(duration)] < 25 and dfrm.loc[date, 'pcntleVolume_'+str(duration)] > 90 and dfrm.loc[date, 'pcntleStdDevs_'+str(duration)] > 90]
dates_lows_for_buy_ops_4[-20:]

[Timestamp('2015-08-05 00:00:00'),
 Timestamp('2015-08-11 00:00:00'),
 Timestamp('2015-08-12 00:00:00'),
 Timestamp('2015-08-21 00:00:00'),
 Timestamp('2015-08-24 00:00:00'),
 Timestamp('2015-08-25 00:00:00'),
 Timestamp('2016-01-07 00:00:00'),
 Timestamp('2016-01-08 00:00:00'),
 Timestamp('2016-01-15 00:00:00'),
 Timestamp('2016-01-20 00:00:00'),
 Timestamp('2016-01-26 00:00:00'),
 Timestamp('2016-01-27 00:00:00'),
 Timestamp('2016-06-17 00:00:00'),
 Timestamp('2016-06-24 00:00:00'),
 Timestamp('2018-12-10 00:00:00'),
 Timestamp('2018-12-20 00:00:00'),
 Timestamp('2018-12-21 00:00:00'),
 Timestamp('2019-01-03 00:00:00'),
 Timestamp('2020-03-20 00:00:00'),
 Timestamp('2020-03-23 00:00:00')]

Now we get only March 20th and 23rd as potential buying opportunities for AAPL in 2020. In all, this was a gradual filtering toward optimal purchase days. This example focused on the peak of the COVID-19 impact on financial markets. Generally speaking, filtering up to the level of 'dates_lows_for_buy_ops_4' will not yield any buying opportunity in normal market conditions. An asset manager needs to start with the earlier levels and then apply her assessment of economic conditions to purchase at the optimal time of her choice.
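
The gradual filtering can also be summarized by applying conditions cumulatively and counting how many dates survive each stage. The following is a minimal sketch on a toy frame; the helper `waterfall_counts`, the stage labels, and the values are hypothetical, and the stages mirror only the oscillator, volume, and standard-deviation conditions used in the last two filters.

```python
import pandas as pd

def waterfall_counts(df: pd.DataFrame, stages) -> pd.Series:
    """Apply boolean filters cumulatively; return the number of surviving rows per stage."""
    mask = pd.Series(True, index=df.index)
    counts = {}
    for label, condition in stages:
        mask &= condition(df)
        counts[label] = int(mask.sum())
    return pd.Series(counts)

duration = 90

# Toy indicator values, not real market data
toy = pd.DataFrame({
    'oscillator_90': [10, 20, 60, 5],
    'pcntleVolume_90': [95, 50, 99, 92],
    'pcntleStdDevs_90': [80, 95, 95, 97],
})

stages = [
    ('oscillator < 25', lambda d: d['oscillator_' + str(duration)] < 25),
    ('volume pct > 90', lambda d: d['pcntleVolume_' + str(duration)] > 90),
    ('stddev pct > 90', lambda d: d['pcntleStdDevs_' + str(duration)] > 90),
]
counts = waterfall_counts(toy, stages)
print(counts)
```

Seeing the counts per stage makes it easier to judge at which level the filter becomes too strict for the prevailing market conditions.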

### Sale Side Decisions based on a Waterfall Approach w/ Technical Indicators

In [45]:
dates_highs_for_sale_ops_1 = [ date for date in dfrm.index if dfrm.loc[date, 'close'] > dfrm.loc[date, 'bollingerUpper_'+str(duration)] ]
dates_highs_for_sale_ops_1[-20:]

[Timestamp('2019-10-28 00:00:00'),
 Timestamp('2019-10-31 00:00:00'),
 Timestamp('2019-11-01 00:00:00'),
 Timestamp('2019-11-04 00:00:00'),
 Timestamp('2019-11-05 00:00:00'),
 Timestamp('2019-11-06 00:00:00'),
 Timestamp('2019-11-07 00:00:00'),
 Timestamp('2019-11-08 00:00:00'),
 Timestamp('2019-11-11 00:00:00'),
 Timestamp('2019-11-12 00:00:00'),
 Timestamp('2019-11-13 00:00:00'),
 Timestamp('2020-01-02 00:00:00'),
 Timestamp('2020-01-06 00:00:00'),
 Timestamp('2020-01-08 00:00:00'),
 Timestamp('2020-01-09 00:00:00'),
 Timestamp('2020-01-10 00:00:00'),
 Timestamp('2020-01-13 00:00:00'),
 Timestamp('2020-01-14 00:00:00'),
 Timestamp('2020-01-16 00:00:00'),
 Timestamp('2020-01-17 00:00:00')]

We see right away that there was no sale opportunity at the peak of COVID-19. The first opportunity after that seems to have come around June 10th. Let us check returns around that date.

In [46]:
str_date_from = '2020-06-10'
# Check the closing values around the target date 
dt_from = datetime.strptime(str_date_from, '%Y-%m-%d')
dfrm.loc[dt_from - timedelta(days=15):dt_from + timedelta(days=15),['close', 'volume']]

| date | close | volume |
|---|---|---|


It seems June 10th was a reasonable local high for a sale opportunity. (The table above is empty in this run because the data was loaded only through `str_date_to = '2020-04-30'`; a later end date is needed to reproduce the June figures.) We know that afterward, in late summer 2020, the large tech sector had a strong showing. That presented several further sale opportunities for AAPL. Let us see if we can identify a narrower sale opportunity in early to mid-August, when AAPL hit several peaks. Similar to the buy side, we will now add percentile closing and volume factors.

In [47]:
dates_highs_for_sale_ops_2 = [ date for date in dfrm.index if dfrm.loc[date, 'pcntleClosing_'+str(duration)] > 90 and dfrm.loc[date, 'pcntleVolume_'+str(duration)] > 90]
dates_highs_for_sale_ops_2[-20:]

[Timestamp('2018-08-01 00:00:00'),
 Timestamp('2018-08-02 00:00:00'),
 Timestamp('2018-08-03 00:00:00'),
 Timestamp('2018-09-10 00:00:00'),
 Timestamp('2018-09-11 00:00:00'),
 Timestamp('2018-09-12 00:00:00'),
 Timestamp('2018-09-13 00:00:00'),
 Timestamp('2018-09-17 00:00:00'),
 Timestamp('2019-07-31 00:00:00'),
 Timestamp('2019-08-13 00:00:00'),
 Timestamp('2019-09-11 00:00:00'),
 Timestamp('2019-12-20 00:00:00'),
 Timestamp('2019-12-27 00:00:00'),
 Timestamp('2019-12-30 00:00:00'),
 Timestamp('2020-01-03 00:00:00'),
 Timestamp('2020-01-09 00:00:00'),
 Timestamp('2020-01-14 00:00:00'),
 Timestamp('2020-01-24 00:00:00'),
 Timestamp('2020-01-28 00:00:00'),
 Timestamp('2020-01-29 00:00:00')]

And we did indeed see a somewhat narrower range for the early to mid-August time frame. Let us narrow it down further.

In [48]:
# Sale-side lists are named 'highs' for consistency with the earlier sale-side cells
dates_highs_for_sale_ops_3 = [ date for date in dfrm.index if dfrm.loc[date, 'oscillator_'+str(duration)] > 75 and dfrm.loc[date, 'pcntleVolume_'+str(duration)] > 90 ]
dates_highs_for_sale_ops_4 = [ date for date in dfrm.index if dfrm.loc[date, 'oscillator_'+str(duration)] > 75 and dfrm.loc[date, 'pcntleVolume_'+str(duration)] > 90 and dfrm.loc[date, 'pcntleStdDevs_'+str(duration)] > 90]
# Only the last expression in a cell is rendered, so the output below is the 4th-level list
dates_highs_for_sale_ops_3[-20:]
dates_highs_for_sale_ops_4[-20:]

[Timestamp('2017-03-17 00:00:00'),
 Timestamp('2017-03-21 00:00:00'),
 Timestamp('2017-09-12 00:00:00'),
 Timestamp('2017-09-13 00:00:00'),
 Timestamp('2018-05-02 00:00:00'),
 Timestamp('2018-05-04 00:00:00'),
 Timestamp('2018-06-15 00:00:00'),
 Timestamp('2018-09-10 00:00:00'),
 Timestamp('2018-09-11 00:00:00'),
 Timestamp('2018-09-12 00:00:00'),
 Timestamp('2018-09-13 00:00:00'),
 Timestamp('2018-09-17 00:00:00'),
 Timestamp('2018-09-21 00:00:00'),
 Timestamp('2020-01-14 00:00:00'),
 Timestamp('2020-01-24 00:00:00'),
 Timestamp('2020-01-27 00:00:00'),
 Timestamp('2020-01-28 00:00:00'),
 Timestamp('2020-01-29 00:00:00'),
 Timestamp('2020-01-31 00:00:00'),
 Timestamp('2020-02-03 00:00:00')]

The standard deviation filter did not help much for sale-side decisions. This is likely because AAPL had a strong run without enough volatility to raise the standard deviation significantly.

## Limitations and TODO Items: 
There are several limitations for this kind of analysis. Some of the more noteworthy ones are:
- This type of analysis focuses only on market price during a given time. Market price may not capture the full intrinsic potential of a company's future earnings. This analysis is strictly limited to the range within which price movements occur and needs to be complemented with financial analysis of returns and future earnings potential for a full picture.
- In a low volatility market, the filters used here may list either a lot of days or hardly any to go long / short. The criteria may need to be relaxed or hardened depending on prevailing market conditions.
- The same technical indicator and analysis may not equally apply across stocks and sectors. 
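
As a small illustration of relaxing or hardening the criteria, the thresholds can be exposed as parameters instead of being hard-coded. This is a sketch under assumptions: the function name `buy_signal_dates`, its defaults, and the toy values are hypothetical, and only the column naming follows the notebook.

```python
import pandas as pd

def buy_signal_dates(df, duration=90, max_close_pct=10.0, min_volume_pct=90.0):
    """Dates where the closing percentile is low and the volume percentile is high."""
    close_pct = df['pcntleClosing_' + str(duration)]
    volume_pct = df['pcntleVolume_' + str(duration)]
    mask = (close_pct < max_close_pct) & (volume_pct > min_volume_pct)
    return df.index[mask].tolist()

# Toy percentile values, not real market data
toy = pd.DataFrame(
    {'pcntleClosing_90': [5.0, 12.0, 8.0],
     'pcntleVolume_90': [95.0, 93.0, 85.0]},
    index=pd.to_datetime(['2020-03-09', '2020-03-10', '2020-03-11']),
)

strict = buy_signal_dates(toy)  # thresholds as used in the notebook
relaxed = buy_signal_dates(toy, max_close_pct=15.0, min_volume_pct=80.0)
print(len(strict), len(relaxed))
```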
