# Import Libraries

In [1]:
import numpy as np
import pandas as pd
import requests

import sys
sys.path.append('/home/adedapo/personal_project/daps05ayoade/DeepTrade')

from secrecy import api_key

# Get Stock Data

In [2]:
# Set the stock symbol (in this case, 'IBM' for International Business Machines)
ticker = 'IBM'

# Construct the API URL using the stock symbol and API token
api_url = f'https://www.alphavantage.co/query?function=TIME_SERIES_WEEKLY_ADJUSTED&symbol={ticker}&apikey={api_key}'

# Send a GET request to the API and parse the JSON response
data = requests.get(api_url).json()

In [3]:
df = pd.DataFrame(data['Weekly Adjusted Time Series']).T

In [4]:
df.head()

Unnamed: 0,1. open,2. high,3. low,4. close,5. adjusted close,6. volume,7. dividend amount
2023-10-18,139.28,140.62,136.31,139.97,139.97,10864275,0.0
2023-10-13,142.3,143.415,138.27,138.46,138.46,16386334,0.0
2023-10-06,140.04,142.94,139.86,142.03,142.03,15932918,0.0
2023-09-29,146.57,147.43,139.61,140.3,140.3,23445425,0.0
2023-09-22,145.77,151.9299,144.66,146.91,146.91,23597168,0.0


In [5]:
# Define new column names
column_names = ['open','high','low','close','adj_close','volume','dividend_amount']

# Assign new column names
df.columns = column_names

# Convert all column names to lower case
df.columns = df.columns.str.lower()

In [6]:
df.head()

Unnamed: 0,open,high,low,close,adj_close,volume,dividend_amount
2023-10-18,139.28,140.62,136.31,139.97,139.97,10864275,0.0
2023-10-13,142.3,143.415,138.27,138.46,138.46,16386334,0.0
2023-10-06,140.04,142.94,139.86,142.03,142.03,15932918,0.0
2023-09-29,146.57,147.43,139.61,140.3,140.3,23445425,0.0
2023-09-22,145.77,151.9299,144.66,146.91,146.91,23597168,0.0


# Technical Data

## Exponential Moving Average (EMA)

- **The EMA gives more weight to the most recent prices**, so it reacts more quickly to price changes than the simple moving average (SMA), which weights every price in the window equally.
- It is calculated using the following formula:
  - **EMA<sub>t</sub> = (P<sub>t</sub> - EMA<sub>t-1</sub>) × (2 / (N + 1)) + EMA<sub>t-1</sub>**
- Where:
  - **P<sub>t</sub>**: is the price at time t
  - **EMA<sub>t-1</sub>**: is the EMA value at time t-1
  - **N**: is the span of the moving average.
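
As a minimal sketch of the recursion above, the snippet below computes the EMA by hand and compares it with pandas' `ewm(span=N, adjust=False)`. The function name `ema_recursive`, the toy prices, and the choice to seed the series with the first price are assumptions made for illustration; Alpha Vantage may initialise its EMA differently.

```python
import pandas as pd

def ema_recursive(prices, span):
    """EMA via EMA_t = (P_t - EMA_{t-1}) * 2/(N+1) + EMA_{t-1}.

    Seeding with the first price is an assumption of this sketch.
    """
    alpha = 2 / (span + 1)
    ema = [prices[0]]
    for price in prices[1:]:
        ema.append((price - ema[-1]) * alpha + ema[-1])
    return ema

# Hypothetical toy prices, not real market data
prices = [10.0, 11.0, 12.0, 11.0, 13.0]
manual = ema_recursive(prices, span=3)

# pandas applies the same recursion when adjust=False
pandas_ema = pd.Series(prices).ewm(span=3, adjust=False).mean().tolist()
print(manual)
print(pandas_ema)
```

With `adjust=False`, pandas seeds the series with the first observation as well, so the two results should agree up to floating-point error.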

In [9]:
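# Note: interval=daily with time_period=10 requests a 10-day EMA; the values are matched to the weekly rows by date index
ema_url = f'https://www.alphavantage.co/query?function=EMA&symbol={ticker}&interval=daily&time_period=10&series_type=close&apikey={api_key}'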

ema = requests.get(ema_url).json()

df['ema'] = pd.DataFrame(ema['Technical Analysis: EMA']).T

In [11]:
df.head(2)

Unnamed: 0,open,high,low,close,adj_close,volume,dividend_amount,ema
2023-10-18,139.28,140.62,136.31,139.97,139.97,10864275,0.0,140.7928
2023-10-13,142.3,143.415,138.27,138.46,138.46,16386334,0.0,141.546


## Moving Average Convergence Divergence (MACD)

The Moving Average Convergence Divergence (MACD) is a popular technical indicator used in financial analysis, particularly for stocks, to identify potential buy or sell signals. It's used to detect changes in the strength, direction, momentum, and duration of a trend in a stock's price.

### Construction:

1. **MACD Line**: This is the difference between two exponential moving averages (EMAs) of a stock’s price. Typically, the 12-day EMA minus the 26-day EMA.
   
   $$
   \text{MACD Line} = \text{12-day EMA} - \text{26-day EMA}
   $$

2. **Signal Line**: This is the 9-day EMA of the MACD line.

3. **Histogram**: This is the difference between the MACD line and the Signal line.

   $$
   \text{Histogram} = \text{MACD Line} - \text{Signal Line}
   $$
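
The three components above can be sketched locally with pandas. This is an illustrative approximation, assuming the conventional 12/26/9 periods and an EMA seeded with the first observation; the function name `macd` and the synthetic price series are hypothetical, and the values will not necessarily match the API output.

```python
import pandas as pd

def macd(close: pd.Series, fast: int = 12, slow: int = 26, signal: int = 9) -> pd.DataFrame:
    """Return the MACD line, signal line and histogram for a price series."""
    ema_fast = close.ewm(span=fast, adjust=False).mean()
    ema_slow = close.ewm(span=slow, adjust=False).mean()
    macd_line = ema_fast - ema_slow                                  # fast EMA minus slow EMA
    signal_line = macd_line.ewm(span=signal, adjust=False).mean()   # EMA of the MACD line
    return pd.DataFrame({
        'macd': macd_line,
        'macd_signal': signal_line,
        'macd_hist': macd_line - signal_line,
    })

# Hypothetical steadily rising prices, used only to exercise the function
close = pd.Series([float(x) for x in range(100, 160)])
out = macd(close)
print(out.tail(3))
```

On a steadily rising series the fast EMA stays above the slow EMA, so the MACD line is positive after the first observation.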

### Interpretation:

1. **Crossovers**:
    - **Bullish Crossover**: When the MACD line crosses above the Signal line, it's considered a bullish sign, suggesting it might be a good time to buy.
    - **Bearish Crossover**: When the MACD line crosses below the Signal line, it's considered a bearish sign, suggesting it might be a good time to sell.

2. **Divergence**:
   - If the price of a stock is making a new high, but the MACD is failing to surpass its previous high, this can be an indication of a potential price reversal to the downside (bearish divergence).
   - Conversely, if the stock price is making a new low, but the MACD is not reaching its previous lows, this can indicate a potential upward reversal (bullish divergence).

3. **Histogram**:
   - When the histogram is positive (above the zero line), it indicates the MACD is above the signal line, which can be a bullish sign.
   - When the histogram is negative (below the zero line), the MACD is below the signal line, indicating potential bearishness.

4. **Overextended MACD Values**:
   - If the MACD line moves too far away from the Signal line, it can be an indication that the stock is overbought or oversold and may soon return to normal levels.

In [40]:
macd_url = f'https://www.alphavantage.co/query?function=MACD&symbol={ticker}&interval=daily&time_period=10&series_type=close&apikey={api_key}'

macd = requests.get(macd_url).json()

macd_df = pd.DataFrame(macd['Technical Analysis: MACD']).T

In [43]:
df['macd'] = macd_df['MACD']
df['macd_signal'] = macd_df['MACD_Signal']
df['macd_hist'] = macd_df['MACD_Hist']

In [44]:
df.head(2)

Unnamed: 0,open,high,low,close,adj_close,volume,dividend_amount,ema,rsi,upper_band,middle_band,lower_band,slowk,slowd,macd,macd_signal,macd_hist
2023-10-18,139.28,140.62,136.31,139.97,139.97,10864275,0.0,140.7928,39.9025,143.8718,141.029,138.1862,43.5328,29.5285,-1.2279,-0.9833,-0.2446
2023-10-13,142.3,143.415,138.27,138.46,138.46,16386334,0.0,141.546,29.7462,143.7522,141.305,138.8578,44.0953,61.5547,-1.0952,-0.7553,-0.3399


## Bollinger Bands (BBANDS)

Bollinger Bands is a technical analysis tool developed by John Bollinger in the 1980s. It is designed to provide a relative definition of high and low prices of a market instrument (like a stock) and to identify periods of high or low volatility. Bollinger Bands consist of three bands:

- **Middle Band**: A simple moving average (SMA).
- **Upper Band**: Calculated as the simple moving average plus a specified number of standard deviations (typically two).
- **Lower Band**: Calculated as the simple moving average minus a specified number of standard deviations (typically two).

The formula for the bands is as follows:

**Middle Band (MB)**: 
$$ MB = SMA(N) $$

Where $SMA(N)$ is the simple moving average over $N$ periods.

**Upper Band (UB)**: 
$$ UB = SMA(N) + (K \times \sigma(N)) $$

Where $\sigma(N)$ is the standard deviation of the price over $N$ periods, and $K$ is a multiplier which is usually set to 2.

**Lower Band (LB)**: 
$$ LB = SMA(N) - (K \times \sigma(N)) $$
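
A minimal sketch of these three formulas using pandas rolling windows follows. The function name `bollinger_bands`, the default `n=20`, and the use of the population standard deviation (`ddof=0`) are assumptions of this example; the API call in this notebook uses `time_period=10`, and its exact standard-deviation convention is not confirmed here.

```python
import pandas as pd

def bollinger_bands(close: pd.Series, n: int = 20, k: float = 2.0) -> pd.DataFrame:
    """Middle band = SMA(n); upper/lower = SMA(n) +/- k * sigma(n)."""
    middle = close.rolling(window=n).mean()
    sigma = close.rolling(window=n).std(ddof=0)  # population std is an assumption
    return pd.DataFrame({
        'upper_band': middle + k * sigma,
        'middle_band': middle,
        'lower_band': middle - k * sigma,
    })

# Hypothetical toy prices; the first n-1 rows are NaN because the window is incomplete
close = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
bands = bollinger_bands(close, n=5, k=2.0)
print(bands)
```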

In [18]:
bbands_url = f'https://www.alphavantage.co/query?function=BBANDS&symbol={ticker}&interval=daily&time_period=10&series_type=close&apikey={api_key}'

bbands = requests.get(bbands_url).json()

bbands_df = pd.DataFrame(bbands['Technical Analysis: BBANDS']).T

In [34]:
df['upper_band'] = bbands_df['Real Upper Band']
df['middle_band'] = bbands_df['Real Middle Band']
df['lower_band'] = bbands_df['Real Lower Band']

In [35]:
df.head(2)

Unnamed: 0,open,high,low,close,adj_close,volume,dividend_amount,ema,rsi,upper_band,middle_band,lower_band
2023-10-18,139.28,140.62,136.31,139.97,139.97,10864275,0.0,140.7928,39.9025,143.8718,141.029,138.1862
2023-10-13,142.3,143.415,138.27,138.46,138.46,16386334,0.0,141.546,29.7462,143.7522,141.305,138.8578


## Relative Strength Index (RSI)

The Relative Strength Index (RSI) is a momentum oscillator that measures the speed and change of price movements. It was developed by J. Welles Wilder and introduced in his 1978 book, "New Concepts in Technical Trading Systems." RSI is used to identify overbought or oversold conditions in a traded security.

### Calculation:

RSI is calculated using the following formula:

$$ \text{RSI} = 100 - \frac{100}{1 + RS} $$

Where:

$$ RS = \frac{\text{Average Gain over n periods}}{\text{Average Loss over n periods}} $$

1. Compute the average gain and average loss over a specified period, typically 14 days.
2. Calculate the relative strength (RS), which is the ratio of average gain to average loss.
3. Calculate the RSI using the formula given above.
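
The three steps can be sketched as below. This version uses plain rolling means for the average gain and loss, exactly as the formula above states; Wilder's original RSI uses a smoothed average instead, so values from this sketch may differ from the API's. The function name `rsi_simple` and the toy prices are hypothetical.

```python
import pandas as pd

def rsi_simple(close: pd.Series, n: int = 14) -> pd.Series:
    """RSI from simple average gain and loss over n periods."""
    delta = close.diff()
    gain = delta.clip(lower=0)       # positive changes, zero otherwise
    loss = -delta.clip(upper=0)      # magnitude of negative changes
    avg_gain = gain.rolling(window=n).mean()
    avg_loss = loss.rolling(window=n).mean()
    rs = avg_gain / avg_loss         # inf when there are no losses, giving RSI = 100
    return 100 - 100 / (1 + rs)

# Toy series: changes are +1, +1, -1, +1, so over n=4 the
# average gain is 0.75 and the average loss is 0.25
close = pd.Series([10.0, 11.0, 12.0, 11.0, 12.0])
rsi = rsi_simple(close, n=4)
print(rsi)
```

Here RS = 0.75 / 0.25 = 3, so the last RSI value is 100 - 100 / 4 = 75.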

### Interpretation:

- The RSI oscillates between 0 and 100.
- Traditionally, an RSI above 70 indicates that a security is overbought, suggesting it might be overvalued and is a potential candidate for a price pullback or a reversal.
- Conversely, an RSI below 30 indicates that a security is oversold, suggesting it might be undervalued and is a potential candidate for a price rally or a reversal.
- The standard setting for RSI is a 14-day period, but traders sometimes use different periods based on their strategies and the asset they're trading.

In [None]:
rsi_url = f'https://www.alphavantage.co/query?function=RSI&symbol={ticker}&interval=daily&time_period=10&series_type=close&apikey={api_key}'

rsi = requests.get(rsi_url).json()

df['rsi'] = pd.DataFrame(rsi['Technical Analysis: RSI']).T

In [None]:
df.head(2)

## Stochastic Oscillator (STOCH)

The Stochastic Oscillator is a momentum indicator that measures the position of a stock's latest closing price relative to its high and low range over a specific time period. It provides insights into potential overbought or oversold conditions in the stock, helping traders to identify potential trend reversals.

The formula for the Stochastic Oscillator is:

$$ \%K = \left( \frac{\text{Latest Close} - \text{Lowest Low}}{\text{Highest High} - \text{Lowest Low}} \right) \times 100 $$

Where:
- **Latest Close** is the most recent closing price.
- **Lowest Low** is the lowest price of the stock over the specified period.
- **Highest High** is the highest price of the stock over the specified period.

The Stochastic Oscillator comprises two lines:
1. **%K** – The main line, calculated from the above formula.
2. **%D** – A moving average of the %K value, usually a 3-period simple moving average.
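
A minimal sketch of %K and %D with pandas rolling windows is shown below. The function name `stochastic`, the column names, and the default periods (14 and 3) are assumptions for illustration. Note that the API fields used later in this notebook are `SlowK` and `SlowD`, which apply additional smoothing, so this raw %K is not expected to reproduce them.

```python
import pandas as pd

def stochastic(high: pd.Series, low: pd.Series, close: pd.Series,
               k_period: int = 14, d_period: int = 3) -> pd.DataFrame:
    """Raw %K from the formula above and %D as its simple moving average."""
    lowest_low = low.rolling(window=k_period).min()
    highest_high = high.rolling(window=k_period).max()
    percent_k = (close - lowest_low) / (highest_high - lowest_low) * 100
    percent_d = percent_k.rolling(window=d_period).mean()
    return pd.DataFrame({'percent_k': percent_k, 'percent_d': percent_d})

# Hypothetical three-period example: lowest low = 8, highest high = 12
high = pd.Series([10.0, 12.0, 11.0])
low = pd.Series([8.0, 9.0, 10.0])
close = pd.Series([9.0, 11.0, 10.5])
stoch_toy = stochastic(high, low, close, k_period=3, d_period=1)
print(stoch_toy)
```

For the last row, %K = (10.5 - 8) / (12 - 8) × 100 = 62.5.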

### Interpretation:

- **Overbought & Oversold Levels**: Typically, the Stochastic Oscillator ranges from 0 to 100. A value above 80 is usually considered "overbought," suggesting that a price reversal or correction might be imminent. Conversely, a value below 20 is considered "oversold," indicating potential for a price rise.
  
- **Signal Line Crossovers**: When the %K line crosses above the %D line, it's considered a bullish signal (potential buy), especially if this crossover happens below the 20 level. When the %K line crosses below the %D line, it's seen as a bearish signal (potential sell), especially if this happens above the 80 level.

- **Divergence**: If the stock's price forms a new high or low that isn't confirmed by the Stochastic Oscillator, it may indicate a potential trend reversal. For example, if a stock forms a new high, but the Stochastic Oscillator doesn't, it's a bearish divergence.

In [36]:
stoch_url = f'https://www.alphavantage.co/query?function=STOCH&symbol={ticker}&interval=daily&time_period=10&series_type=close&apikey={api_key}'

stoch = requests.get(stoch_url).json()

stoch_df = pd.DataFrame(stoch['Technical Analysis: STOCH']).T

In [37]:
df['slowk'] = stoch_df['SlowK']
df['slowd'] = stoch_df['SlowD']

In [38]:
df.head(2)

Unnamed: 0,open,high,low,close,adj_close,volume,dividend_amount,ema,rsi,upper_band,middle_band,lower_band,slowk,slowd
2023-10-18,139.28,140.62,136.31,139.97,139.97,10864275,0.0,140.7928,39.9025,143.8718,141.029,138.1862,43.5328,29.5285
2023-10-13,142.3,143.415,138.27,138.46,138.46,16386334,0.0,141.546,29.7462,143.7522,141.305,138.8578,44.0953,61.5547


## Average True Range (ATR)

The Average True Range (ATR) is a technical indicator that measures the volatility of a stock or any other market instrument. It was introduced by J. Welles Wilder in his 1978 book "New Concepts in Technical Trading Systems." ATR does not provide an indication of price direction but instead quantifies the degree of price volatility.

### Calculation:

1. **True Range Calculation**:

   First, you need to calculate the True Range (TR) for each period. The TR for any period is calculated as the greatest of the following:

   - Current high minus the current low.
   - Absolute value of the current high minus the previous close.
   - Absolute value of the current low minus the previous close.

   Mathematically:

   $$
   \text{TR} = \max[(\text{High} - \text{Low}), |\text{High} - \text{Previous Close}|, |\text{Low} - \text{Previous Close}|]
   $$

2. **Average True Range Calculation**:

   The ATR is typically calculated using a 14-day moving average of the TR values, though the period can be adjusted based on the user's preferences. Wilder originally used a smoothed moving average, but many traders now use an exponential moving average.

   For the first period's ATR value:

   $$
   \text{ATR}_1 = \frac{1}{n} \sum_{i=1}^{n} \text{TR}_i
   $$

   Where:
   - $n$ is the number of periods, typically 14.
   - $\text{TR}_i$ is the True Range of the $i$-th period.

   For subsequent days:

   $$
   \text{ATR}_{t} = \frac{\text{Previous ATR} \times (n-1) + \text{TR}_{t}}{n}
   $$

   This recursion weights the previous ATR by $(n-1)/n$ and the latest day's TR by $1/n$, so each new TR value nudges the ATR rather than dominating it, and the influence of older TR values decays gradually.
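
The True Range and the smoothing recursion above can be sketched as follows. The function name `atr_wilder` and the toy data are hypothetical, and the first bar's TR is taken as high minus low because no previous close exists, which is an assumption of this sketch.

```python
import pandas as pd

def atr_wilder(high: pd.Series, low: pd.Series, close: pd.Series, n: int = 14) -> pd.Series:
    """ATR seeded with the mean of the first n TR values, then smoothed recursively."""
    prev_close = close.shift(1)
    # Row-wise max skips the NaN on the first bar, leaving high - low there
    tr = pd.concat(
        [high - low, (high - prev_close).abs(), (low - prev_close).abs()],
        axis=1,
    ).max(axis=1)

    atr = pd.Series(float('nan'), index=close.index)
    if len(tr) < n:
        return atr
    atr.iloc[n - 1] = tr.iloc[:n].mean()
    for i in range(n, len(tr)):
        atr.iloc[i] = (atr.iloc[i - 1] * (n - 1) + tr.iloc[i]) / n
    return atr

# Hypothetical bars; the TR values work out to 1, 2, 2, 2
high = pd.Series([10.0, 11.0, 12.0, 13.0])
low = pd.Series([9.0, 9.0, 10.0, 12.0])
close = pd.Series([9.5, 10.0, 11.0, 12.5])
atr_toy = atr_wilder(high, low, close, n=2)
print(atr_toy)
```

With `n=2` the seed is (1 + 2) / 2 = 1.5, followed by (1.5 + 2) / 2 = 1.75 and (1.75 + 2) / 2 = 1.875.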

### Interpretation and Use:

- The ATR value rises when price movements (either up or down) are large, and it diminishes when price changes are minimal, making it a reflection of volatility.
  
- ATR can be used to place stop-loss orders. For instance, a trader might set a stop-loss at a multiple of the ATR below their entry price for a long position.

- It can be used as a filter for trading systems, where trades might only be considered if the ATR is above a certain level, indicating sufficient volatility for potential profitable price movements.

- ATR does not indicate price direction and only provides an estimate of volatility. It's often used in conjunction with other indicators to develop trading strategies.

In [45]:
atr_url = f'https://www.alphavantage.co/query?function=ATR&symbol={ticker}&interval=daily&time_period=10&series_type=close&apikey={api_key}'

atr = requests.get(atr_url).json()

df['atr'] = pd.DataFrame(atr['Technical Analysis: ATR']).T

In [47]:
df.head(2)

Unnamed: 0,open,high,low,close,adj_close,volume,dividend_amount,ema,rsi,upper_band,middle_band,lower_band,slowk,slowd,macd,macd_signal,macd_hist,atr
2023-10-18,139.28,140.62,136.31,139.97,139.97,10864275,0.0,140.7928,39.9025,143.8718,141.029,138.1862,43.5328,29.5285,-1.2279,-0.9833,-0.2446,2.0628
2023-10-13,142.3,143.415,138.27,138.46,138.46,16386334,0.0,141.546,29.7462,143.7522,141.305,138.8578,44.0953,61.5547,-1.0952,-0.7553,-0.3399,2.0343


## Set Target Column

In [59]:
# Set the index to datetime
df.index = pd.to_datetime(df.index)

# Sort Index in ascending order
df = df.sort_index(ascending=True)

# Create target column
df['target'] = df['adj_close'].shift(-1)

# Convert all columns to float
df = df.astype(float)

# Drop NA values
df.dropna(inplace=True)
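
To illustrate what the target column contains, here is a small sketch on a hypothetical three-row frame (the dates and prices are made up). After sorting in ascending order, `shift(-1)` places the next period's `adj_close` on the current row, and the final row has no next value, so `dropna` removes it.

```python
import pandas as pd

# Hypothetical toy frame, already in ascending date order
toy = pd.DataFrame(
    {'adj_close': [10.0, 11.0, 12.0]},
    index=pd.to_datetime(['2023-01-06', '2023-01-13', '2023-01-20']),
)

# Each row's target is the following period's adjusted close
toy['target'] = toy['adj_close'].shift(-1)

# The last row has no following period, so its target is NaN and it is dropped
toy = toy.dropna()
print(toy)
```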

In [60]:
df.head(2)

Unnamed: 0,open,high,low,close,adj_close,volume,dividend_amount,ema,rsi,upper_band,middle_band,lower_band,slowk,slowd,macd,macd_signal,macd_hist,atr,target
1999-12-17,108.12,112.75,104.5,110.0,58.4579,38810100.0,0.0,58.2416,55.2716,63.3726,59.4363,55.5,45.9012,30.8539,2.2876,3.1375,-0.8499,2.3108,57.7245
1999-12-23,109.06,110.44,107.75,108.62,57.7245,18144100.0,0.0,58.035,50.9433,58.883,57.9307,56.9785,51.4425,64.2224,1.6304,2.3688,-0.7384,1.9492,57.3259


In [61]:
df.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1244 entries, 1999-12-17 to 2023-10-13
Data columns (total 19 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   open             1244 non-null   float64
 1   high             1244 non-null   float64
 2   low              1244 non-null   float64
 3   close            1244 non-null   float64
 4   adj_close        1244 non-null   float64
 5   volume           1244 non-null   float64
 6   dividend_amount  1244 non-null   float64
 7   ema              1244 non-null   float64
 8   rsi              1244 non-null   float64
 9   upper_band       1244 non-null   float64
 10  middle_band      1244 non-null   float64
 11  lower_band       1244 non-null   float64
 12  slowk            1244 non-null   float64
 13  slowd            1244 non-null   float64
 14  macd             1244 non-null   float64
 15  macd_signal      1244 non-null   float64
 16  macd_hist        1244 non-null   float64
 

# Modularised

In [62]:
def get_data_from_url(url: str):
    response = requests.get(url)
    response.raise_for_status()  # Raise an exception if the request was unsuccessful
    return response.json()

def get_technical_data(symbol: str, feature: str, api_key: str):
    url = f'https://www.alphavantage.co/query?function={feature}&symbol={symbol}&interval=daily&time_period=10&series_type=close&apikey={api_key}'
    data = get_data_from_url(url)
    key = 'Technical Analysis: ' + feature
    
    # Check if the response contains an error
    if "Error Message" in data:
        raise ValueError(f"Error retrieving {feature} for {symbol}: {data['Error Message']}")
    if key not in data:
        raise ValueError(f"Unexpected API response structure when fetching {feature} for {symbol}")
    
    if feature == 'BBANDS':
        df = pd.DataFrame(data[key]).T
        df = df.rename(columns={
            'Real Upper Band': 'upper_band',
            'Real Middle Band': 'middle_band',
            'Real Lower Band': 'lower_band'
        })
        for col in ['upper_band', 'middle_band', 'lower_band']:
            df[col] = pd.to_numeric(df[col], errors='coerce')
        
    elif feature == 'MACD':
        df = pd.DataFrame(data[key]).T
        df = df.rename(columns={
            'MACD': 'macd',
            'MACD_Signal': 'macd_signal',
            'MACD_Hist': 'macd_hist'
        })
        for col in ['macd', 'macd_signal', 'macd_hist']:
            df[col] = pd.to_numeric(df[col], errors='coerce')

    elif feature == 'STOCH':
        df = pd.DataFrame(data[key]).T
        df = df.rename(columns={
            'SlowK': 'slowk',
            'SlowD': 'slowd'
        })
        for col in ['slowk', 'slowd']:
            df[col] = pd.to_numeric(df[col], errors='coerce')

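    else:
        # Single-column indicators (EMA, RSI, ATR) are returned as-is: the values stay
        # as strings (object dtype); apply pd.to_numeric if numeric values are needed
        df = pd.DataFrame(data[key]).T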

    return df

def get_stock_data(symbol: str, api_key: str):
    api_url = f'https://www.alphavantage.co/query?function=TIME_SERIES_DAILY_ADJUSTED&symbol={symbol}&outputsize=full&apikey={api_key}'

    # Obtain and process stock data
    data = get_data_from_url(api_url)
    df = pd.DataFrame(data['Time Series (Daily)']).T

    # Assign new column names
    column_names = ['open','high','low','close','adj_close','volume', 'dividend', 'split_coeff']
    df.columns = column_names

    # Convert columns to numeric
    for col in column_names:
        df[col] = pd.to_numeric(df[col], errors='coerce')

    # Obtain technical data and merge
    features = ['EMA', 'MACD', 'BBANDS', 'RSI', 'STOCH', 'ATR']
    for feature in features:
        tech_df = get_technical_data(symbol, feature, api_key)
        df = df.join(tech_df, rsuffix=f'_{feature}')
        
    # Convert all column names to lower case
    df.columns = df.columns.str.lower()
    
    # Set the index to datetime
    df.index = pd.to_datetime(df.index)
    
    # Sort Index in ascending order
    df = df.sort_index(ascending=True)

    # Create target column
    df['target'] = df['adj_close'].shift(-1)
    
    # Drop NA values
    df.dropna(inplace=True)

    return df

In [63]:
df_2 = get_stock_data('AAPL', api_key)

In [64]:
df_2.head()

Unnamed: 0,open,high,low,close,adj_close,volume,dividend,split_coeff,ema,macd,macd_signal,macd_hist,upper_band,middle_band,lower_band,rsi,slowk,slowd,atr,target
1999-12-17,100.87,102.0,98.5,100.0,0.757759,4419700,0.0,1.0,0.7652,0.0243,0.0418,-0.0175,0.9044,0.7891,0.6737,50.9703,49.2644,27.6202,0.0438,0.742604
1999-12-20,99.56,99.62,96.62,98.0,0.742604,2535600,0.0,1.0,0.7611,0.0211,0.0377,-0.0165,0.8764,0.7754,0.6744,47.4399,60.4803,44.3777,0.042,0.776703
1999-12-21,98.19,103.06,97.94,102.5,0.776703,2746400,0.0,1.0,0.7639,0.0211,0.0344,-0.0132,0.8283,0.7638,0.6994,55.1978,78.9724,62.9057,0.0416,0.757304
1999-12-22,102.87,104.56,98.75,99.94,0.757304,2920300,0.0,1.0,0.7627,0.0193,0.0313,-0.012,0.8005,0.7561,0.7118,50.4874,71.6734,70.3754,0.0419,0.784281
1999-12-23,101.81,104.25,101.06,103.5,0.784281,2049400,0.0,1.0,0.7666,0.0198,0.029,-0.0092,0.7947,0.7548,0.715,56.2554,79.4111,76.6856,0.041,0.75253


In [65]:
df_2.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 5996 entries, 1999-12-17 to 2023-10-17
Data columns (total 20 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   open         5996 non-null   float64
 1   high         5996 non-null   float64
 2   low          5996 non-null   float64
 3   close        5996 non-null   float64
 4   adj_close    5996 non-null   float64
 5   volume       5996 non-null   int64  
 6   dividend     5996 non-null   float64
 7   split_coeff  5996 non-null   float64
 8   ema          5996 non-null   object 
 9   macd         5996 non-null   float64
 10  macd_signal  5996 non-null   float64
 11  macd_hist    5996 non-null   float64
 12  upper_band   5996 non-null   float64
 13  middle_band  5996 non-null   float64
 14  lower_band   5996 non-null   float64
 15  rsi          5996 non-null   object 
 16  slowk        5996 non-null   float64
 17  slowd        5996 non-null   float64
 18  atr          5996 non-null   o