# Brief Introduction to Pairs Trading
The goal of any trading strategy is to maximise profits while minimising risk. Pairs trading is the simplest form of statistical arbitrage and can be summarised in one sentence: 'Buy (long) the asset that is undervalued and sell (short) the asset that is overvalued'. The underlying assumption is that the pair of stocks traded in the strategy have shown 'similarities in their behaviours', so that their prices will almost always converge in the long run, even if they diverge in the short term.
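
As a rough illustration of the long/short rule, the sketch below turns two price histories into a trading signal using the z-score of their spread. The function name `zscore_signal`, the plain price difference used as the spread, and the entry threshold of 2.0 are all assumptions made for this example rather than choices taken from the notebook.

```python
from statistics import mean, stdev


def zscore_signal(price_a, price_b, entry=2.0):
    """Return a trade suggestion from the latest z-score of the spread A - B.

    Hypothetical helper: the spread is a simple price difference (no hedge
    ratio) and the mean and standard deviation are taken over the whole
    history supplied.
    """
    spread = [a - b for a, b in zip(price_a, price_b)]
    sd = stdev(spread)
    if sd == 0:
        return "no trade"  # a flat spread carries no signal
    z = (spread[-1] - mean(spread)) / sd
    if z > entry:
        return "short A, long B"  # A looks overvalued relative to B
    if z < -entry:
        return "long A, short B"  # A looks undervalued relative to B
    return "no trade"


price_b = [100.0] * 10
price_a = [100.0, 101.0, 99.0, 100.0, 101.0, 99.0, 100.0, 101.0, 99.0, 110.0]
print(zscore_signal(price_a, price_b))
```

Here the final jump in A pushes the spread well above its historical mean, so the rule suggests shorting A and buying B in the expectation that the spread narrows again.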

Identifying 'correlated' stocks through data processing, cleansing and hypothesis testing takes priority over everything else in pairs trading. Failure to do so would yield spurious relationships between stock pairs and, ultimately, an unprofitable strategy.
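
To see why a formal test matters, the sketch below simulates two independent random walks and compares the correlation of their price levels with the correlation of their daily changes. The helper names (`pearson`, `random_walk`, `changes`) and the simulation parameters are assumptions for illustration only. Level correlations between unrelated random walks are often far from zero, which is the kind of spurious relationship a cointegration test is meant to screen out.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def random_walk(n, rng, start=100.0):
    """Simulated price path: cumulative sum of independent Gaussian steps."""
    level, path = start, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def changes(path):
    """First differences of a price path."""
    return [path[i] - path[i - 1] for i in range(1, len(path))]


rng = random.Random(0)
walk_a = random_walk(2000, rng)
walk_b = random_walk(2000, rng)  # generated independently of walk_a

level_corr = pearson(walk_a, walk_b)
change_corr = pearson(changes(walk_a), changes(walk_b))
print(f"correlation of levels:  {level_corr:+.3f}")
print(f"correlation of changes: {change_corr:+.3f}")
```

The correlation of changes stays close to zero because the steps are independent, whereas the correlation of levels can take almost any value depending on the seed.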

# Methodology
The dataset used in this investigation is sourced from Yahoo Finance using the yfinance Python package, which interacts with Yahoo's API to retrieve historical and real-time stock data. We shall focus only on stocks included in the FTSE 100, FTSE 250 and FTSE 350 indices; the FTSE 350 combines the other two and encompasses the 350 largest companies listed on the London Stock Exchange.
There are several advantages to this choice:
1. **Sufficient Liquidity:** These stocks have higher trading volumes than small-cap stocks, which helps keep bid-ask spreads and execution latency low.
2. **Avoiding the 'Small Cap Trap':** Trading small-cap stocks, even in small volumes, could move their prices unpredictably, rendering the whole strategy useless.

# Importing Data
The dataset consists of the constituents of the FTSE 100, FTSE 250 and FTSE 350, which represent about 90% of the total market capitalisation of the UK equity market. Data are obtained from Yahoo Finance using the yfinance Python package.
1. **Training Data Set:** Data from 2010/01/01 to 2015/12/31 is used for clustering and training the trading model
2. **Back-testing Data Set:** Data from 2016/01/01 to 2023/12/31 is used to backtest the model
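
The split described above can be sketched as follows. The function `split_by_date` and the `(date, price)` row layout are assumptions made for this example; the date boundaries are the ones listed above and are treated as inclusive.

```python
from datetime import date

TRAIN_START, TRAIN_END = date(2010, 1, 1), date(2015, 12, 31)
TEST_START, TEST_END = date(2016, 1, 1), date(2023, 12, 31)


def split_by_date(rows):
    """Split (date, price) rows into training and back-testing sets.

    Hypothetical helper: both boundaries of each window are inclusive and
    rows outside both windows are dropped.
    """
    train = [r for r in rows if TRAIN_START <= r[0] <= TRAIN_END]
    backtest = [r for r in rows if TEST_START <= r[0] <= TEST_END]
    return train, backtest


rows = [
    (date(2009, 12, 31), 1.0),  # before the training window
    (date(2015, 12, 31), 2.0),  # last training day
    (date(2016, 1, 1), 3.0),    # first back-testing day
    (date(2024, 1, 2), 4.0),    # after the back-testing window
]
train, backtest = split_by_date(rows)
print(len(train), len(backtest))
```

With a pandas DataFrame indexed by date, label slicing such as `prices.loc["2010-01-01":"2015-12-31"]` gives the same inclusive behaviour.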

Please note that although the concurrent.futures package could be used to run multiple downloads in parallel, this is not done in my code: Yahoo Finance's public API (without authentication) limits users to 2,000 requests per hour per IP (up to 48,000 requests a day), and parallel downloads were found to exceed this limit, leading to a temporary ban.
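
One way to stay under such a limit while downloading sequentially is to space the requests out. The sketch below is a minimal throttle built on the standard library; the class name `Throttle` and its parameters are hypothetical, and the 2,000-per-hour figure is simply the limit quoted above.

```python
import time


class Throttle:
    """Block so that successive calls are at least `min_interval` seconds apart.

    Hypothetical helper. `clock` and `sleep` can be swapped out, which makes
    the behaviour easy to check without actually waiting.
    """

    def __init__(self, max_per_hour, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 3600.0 / max_per_hour
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Intended use inside the download loop (not executed here):
#     throttle = Throttle(max_per_hour=2000)
#     for ticker in tickers:
#         throttle.wait()
#         ...  # one request per iteration
```

At 2,000 requests per hour the throttle leaves 1.8 seconds between calls, so a few hundred tickers still download in minutes.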

In [None]:
#Import all required packages

import numpy as np
import pandas as pd
import yfinance as yf
from statsmodels.tsa.stattools import coint
import matplotlib.pyplot as plt

In [None]:
# Tickers of FTSE 100, FTSE 250 and FTSE 350 constituents (symbols without the '.L' suffix)
ftse100_tickers = ["ADM", "AAF", "ALW", "AAL", "ANTO", "AHT", "ABF", "AZN", "AUTO", "AV", 
    "BA", "BARC", "BTRW", "BEZ", "BKG", "BP", "BATS", "BLND", "BT-A", "BNZL",
     "CNA", "CCH", "CPG", "CTEC", "CRDA", "DCC", "DGE", "DPLM", "EDV", "ENT", 
    "EZJ", "EXPN", "FCIT", "FRES", "GAW", "GLEN", "GSK", "HLN", "HLMA", "HL",
    "HIK", "HSX", "HWDN", "HSBA", "IHG", "IMI", "IMB", "INF", "ICG", "IAG", 
    "ITRK", "JD", "KGF", "LAND", "LGEN", "LLOY", "LMP", "LSEG", "MNG", "MKS", 
    "MRO", "MNDI", "NG", "NWG", "NXT", "PSON", "PSH", "PSN", "PHNX", "PRU", 
    "RKT", "REL", "RTO", "RMV", "RIO", "RR", "SGE", "SBRY", "SDR", "SMT", 
    "SGRO", "SVT", "SHEL", "SMDS", "SMIN", "SN", "SPX", "SSE", "STAN", 
    "STJ", "TW", "TSCO", "ULVR", "UU", "UTG", "VOD", "WEIR", "WTB", "WPP"
]  

ftse250_tickers = [
    "3IN", "FOUR", "ABDN", "ALFA", "ATT", "ALPH", "AO", "APAX", "ASHM", "DGN",
    "AGR", "AML", "ATG", "AGT", "BME", "BAB", "BGFD", "USA", "BAKK", "BBY", 
    "BCG", "BNKR", "BGEO", "BAG", "BBGI", "AJB", "BBH", "BWY", "BHMG", "BYG", 
    "BRGE", "BRSC", "THRG", "BRWM", "BMY", "BSIF", "BOY", "BREE", "BPT", "BVIC", 
    "BUT", "BRBY", "BYIT", "CCR", "CLDN", "CGT", "CCL", "CHG", "CHRY", "CTY", 
    "CKN", "CMCX", "COA", "CCC", "CWK", "CRST", "CURY", "ROO", "DLN", "DLG", 
    "DSCV", "DEC", "DOM", "DWL", "DRX", "DOCS", "DNLM", "EDIN", "EWI", "ELM", 
    "ESP", "ENOG", "ESNT", "EOT", "ESCT", "FXPO", "FCSS", "FEML", "FEV", "FSV", 
    "FGT", "FGP", "FSG", "FSFL", "FRAS", "FUTR", "GCP", "GEN", "GNS", "GSCT", 
    "GDWN", "GFTU", "GRI", "GPE", "UKW", "GNC", "GRG", "HMSO", "HBR", "HVPE", 
    "HWG", "HAS", "HTWS", "HET", "HSL", "HRI", "HGT", "HICL", "HILS", "HFG", 
    "HOC", "BOWL", "HTG", "IBST", "ICGT", "IGG", "IEM", "INCH", "IHP", "IDS", 
    "INPP", "INVP", "IPO", "ITH", "ITV", "IWG", "JLEN", "JMAT", "JAM", "JMG", 
    "JEDT", "JGGI", "JII", "JFJ", "JTC", "JUP", "JUST", "KNOS", "KLR", "KIE", 
    "LRE", "LWDB", "EMG", "MSLH", "MEGP", "MRC", "MRCH", "MTRO", "MAB", "MTO", 
    "MCG", "GROW", "MONY", "MNKS", "MOON", "MGAM", "MGNS", "MUT", "MYI", "NBPE", 
    "NCC", "NESF", "N91", "NAS", "OCDO", "OSB", "OXIG", "ONT", "PHI", "PAGE", 
    "PIN", "PAG", "PPET", "PAY", "PNN", "PNL", "PHLL", "PETS", "PTEC", "PLUS",
    "PCFT", "PCT", "PPH", "PFD", "PHP", "PRS", "QQ", "QLT", "RPI", "RAT", 
    "RWI", "RSW", "RHIM", "RCP", "ROR", "RS1", "RICA", "SAFE", "SVS", "SDP", 
    "SOI", "SAIN", "SEIT", "SNR", "SEQI", "SRP", "SHC", "SRE", "SSON", "SCT", 
    "SXS", "SPI", "SPT", "SSPG", "STEM", "SUPR", "SYNC", "THRL", "TATE", "TBCG",
    "TEP", "TMPL", "TEM", "TRIG", "TIFS", "TCAP", "TRN", "TPK", "BBOX", "TRY", 
    "TRST", "TFIF", "SHED", "VSVS", "VCT", "VEIL", "VOF", "VTY", "FAN", "WPS", 
    "WOSG", "JDW", "SMWH", "WIZZ", "WG", "WKP", "WWH", "XPS", "ZIG"
]
ftse350_tickers = [
    'STEM', 'KNOS', 'BWY', 'ALPH', 'HOC', 'JD', 'FSG', 'BPT', 'FXPO', 'VTY', 
    'EWI', 'DWL', 'ONT', 'OCDO', 'HVPE', 'ROO', 'HBR', 'ELM', 'JTC', 'JUP', 'IPO',
    'TPK', 'DRX', 'NCC', 'EDV', 'SYNC', 'QLT', 'SXS', 'PSH', 'STJ', 'SDR', 'AML',
    'VSVS', 'RTO', 'APAX', 'ESNT', 'MRC', 'OXIG', 'MNDI', 'MSLH', 'BSIF', 'ENOG',
    'PLUS', 'MRO', 'SGRO', 'PNN', 'JUST', 'GCP', 'SAIN', 'BRBY', 'FAN', 'OSB', 'ABDN',
    'BME', 'CKN', 'PHNX', 'SSPG', 'PSN', 'ZIG', 'IWG', 'RS1', 'GEN', 'CRST', 'KLR',
    'TMPL', 'CWK', 'NAS', 'SMT', 'AGT', 'ASHM', 'RSW', 'INCH', 'BP', 'BBOX', 'VOD',
    'BOY', 'ITRK', 'BYIT', 'HFG', 'MGNS', 'EOT', 'THRG', 'HMSO', 'VOF', 'SSON', 'AGR',
    'MOON', 'PHP', 'BTRW', 'FEML', 'SHEL', 'WG', 'BAKK', 'GRG', 'RWI', 'SVS', 'BREE',
    'LWDB', 'GRI', 'MCG', 'ESP', 'VEIL', 'EMG', 'GFTU', 'HGT', 'KGF', 'SEIT', 'TEP',
    'BLND', 'COA', 'ITH', 'BARC', 'DSCV', 'ICG', 'USA', 'DNLM', 'TRN', 'FUTR', 'AAL',
    'WIZZ', 'HICL', 'DOCS', 'LGEN', 'BOWL', 'FSFL', 'MTRO', 'GNS', 'PAY', 'JII', 'TW',
    'BRWM', 'DOM', 'FEV', 'RIO', 'GAW', 'BBGI', 'HSL', 'MNKS', 'SPI', 'MUT', 'GROW',
    'UKW', 'PPET', 'KIE', 'PAG', 'EDIN', 'HWDN', 'PPH', 'MNG', 'INF', 'GPE', 'MAB',
    'BUT', 'ITV', 'INVP', 'PHLL', 'RCP', 'INPP', 'MYI', 'BBY', 'FGEN', 'RAT', 'HILS',
    'N91', 'PFD', 'FSV', 'SPX', 'BYG', 'VCT', 'GSK', 'SN', 'DGE', 'CHRY', 'SMWH', 'BKG',
    'CRDA', 'LRE', 'ADM', 'MRCH', 'ENT', 'FCSS', 'WPP', 'BMY', 'JAM', 'FRES', 'BRSC',
    'HLN', 'CCC', 'BBH', 'PCFT', 'SRE', 'CNA', 'TRIG', 'CTEC', 'CTY', 'HAS', 'ALW',
    'TRY', 'IBST', 'DLG', 'SEQI', 'JMAT', 'HSX', 'JMG', 'MGAM', 'BRGE', 'TSCO', 'DGN',
    'AV', 'ROR', 'PTEC', 'PHI', 'REL', 'CURY', 'AHT', 'NG', 'DEC', 'LMP', 'DPLM', 'LLOY',
    'IEM', 'PAGE', 'RPI', 'NESF', 'LAND', 'ATT', 'PRU', 'SCT', 'SHED', 'SMIN', 'AUTO',
    'CCL', 'FRAS', 'AZN', 'CLDN', 'UU', 'AJB', 'ASL', 'RKT', 'WOSG', 'CGT', 'EXPN',
    'NWG', 'SBRY', 'PETS', 'BT-A', 'HSBA', 'HIK', 'PCT', 'HRI', 'PNL', 'STAN', 'UTG',
    'TFIF', 'GSCT', 'BEZ', 'MONY', 'MTO', 'ATG', 'JFJ', 'ESCT', 'BNKR', 'ABF', 'CPG',
    'JEDT', 'IDS', 'TBCG', 'BGEO', 'GLEN', 'PIN', 'RHIM', 'BA', 'IGG', 'XPS', 'ULVR',
    'NBPE', 'QQ', 'SVT', 'ALFA', 'WKP', 'CMCX', 'SRP', 'TCAP', 'SDP', 'BAB', 'BATS',
    'JDW', 'BAG', 'THRL', 'HTWS',

]
# Define a class for data retrieval
class Stock_Processor:
    # Yahoo Finance lists London Stock Exchange securities with a ".L" suffix
    # (e.g. "BARC.L"); a bare symbol either fails or resolves to a different,
    # usually US-listed, company.
    def download_stock_data(self, tickers, start_date="2010-01-01", end_date="2023-12-31", interval="1d", suffix=".L"):
        all_data = {}
        for ticker in tickers:
            print(f"Downloading data for {ticker}...")
            try:
                stock_data = yf.download(ticker + suffix, start=start_date, end=end_date, interval=interval)
            except Exception as e:
                print(f"Error downloading data for {ticker}: {e}")
                continue
            # yfinance usually reports a failed download by returning an empty frame rather than raising
            if stock_data.empty:
                print(f"No data returned for {ticker}; skipping")
                continue
            all_data[ticker] = stock_data
        return all_data

    def download_financial_ratio(self, tickers, start_date="2010-01-01", end_date="2023-12-31", interval="1d", suffix=".L"):
        # The ratios come from Ticker.info, a snapshot of current values, so the
        # date and interval arguments are accepted but not used.
        all_data = {}
        for ticker in tickers:
            print(f"Downloading data for {ticker}...")
            try:
                info = yf.Ticker(ticker + suffix).info
                all_data[ticker] = {
                    "Trailing P/E": info.get('trailingPE'),
                    "Forward P/E": info.get('forwardPE'),
                    "Price-to-Book": info.get('priceToBook'),
                    "Debt-to-Equity": info.get('debtToEquity'),
                    "Return on Equity (ROE)": info.get('returnOnEquity'),
                    "Beta": info.get('beta'),
                    "Average Volume": info.get('averageVolume'),
                    "Volume (Current)": info.get('volume'),
                }
            except Exception as e:
                print(f"Error downloading data for {ticker}: {e}")
        # One row per ticker, so that the result can be saved with .to_csv()
        return pd.DataFrame.from_dict(all_data, orient="index")

                
processor = Stock_Processor()
    

# Download price data for stocks in FTSE 100, FTSE 250 and FTSE350 on daily and hourly intervals
ftse100_data_daily = processor.download_stock_data(ftse100_tickers,interval="1d")
ftse250_data_daily = processor.download_stock_data(ftse250_tickers,interval="1d")
ftse350_data_daily = processor.download_stock_data(ftse350_tickers,interval="1d")

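# Note: Yahoo Finance appears to serve hourly bars only for roughly the most recent 730 days,
# so these calls are likely to return no data for the default 2010-2023 range.
ftse100_data_hourly = processor.download_stock_data(ftse100_tickers,interval="1h")
ftse250_data_hourly = processor.download_stock_data(ftse250_tickers,interval="1h")
ftse350_data_hourly = processor.download_stock_data(ftse350_tickers,interval="1h")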

# Download financial ratio data for FTSE 100, FTSE 250 and FTSE 350
# (the ratios are current snapshots, so the interval argument does not change the result)

ftse100_fin_data_daily = processor.download_financial_ratio(ftse100_tickers,interval="1d")
ftse250_fin_data_daily = processor.download_financial_ratio(ftse250_tickers,interval="1d")
ftse350_fin_data_daily = processor.download_financial_ratio(ftse350_tickers,interval="1d")

ftse100_fin_data_hourly = processor.download_financial_ratio(ftse100_tickers,interval="1h")
ftse250_fin_data_hourly = processor.download_financial_ratio(ftse250_tickers,interval="1h")
ftse350_fin_data_hourly = processor.download_financial_ratio(ftse350_tickers,interval="1h")



# Combine price data into a single DataFrame for analysis if needed
ftse100_daily_combined = pd.concat({ticker: data['Close'] for ticker, data in ftse100_data_daily.items()}, axis=1)
ftse250_daily_combined = pd.concat({ticker: data['Close'] for ticker, data in ftse250_data_daily.items()}, axis=1)
ftse350_daily_combined = pd.concat({ticker: data['Close'] for ticker, data in ftse350_data_daily.items()}, axis=1)

ftse100_hourly_combined = pd.concat({ticker: data['Close'] for ticker, data in ftse100_data_hourly.items()}, axis=1)
ftse250_hourly_combined = pd.concat({ticker: data['Close'] for ticker, data in ftse250_data_hourly.items()}, axis=1)
ftse350_hourly_combined = pd.concat({ticker: data['Close'] for ticker, data in ftse350_data_hourly.items()}, axis=1)

# Save price data to CSV files for later use (the trailing slash keeps the files inside the folder)
Output_path = '/Users/perryhui/Desktop/Pairs Trading/'

ftse100_daily_combined.to_csv(Output_path + "FTSE100_data_daily.csv")
ftse250_daily_combined.to_csv(Output_path + "FTSE250_data_daily.csv")
ftse350_daily_combined.to_csv(Output_path + "FTSE350_data_daily.csv")

ftse100_hourly_combined.to_csv(Output_path + "FTSE100_data_hourly.csv")
ftse250_hourly_combined.to_csv(Output_path + "FTSE250_data_hourly.csv")
ftse350_hourly_combined.to_csv(Output_path + "FTSE350_data_hourly.csv")

# Save financial ratio data to CSV files for later use
ftse100_fin_data_daily.to_csv(Output_path + 'FTSE100_data_daily_financial.csv')
ftse250_fin_data_daily.to_csv(Output_path + 'FTSE250_data_daily_financial.csv')
ftse350_fin_data_daily.to_csv(Output_path + 'FTSE350_data_daily_financial.csv')

ftse100_fin_data_hourly.to_csv(Output_path + 'FTSE100_data_hourly_financial.csv')
ftse250_fin_data_hourly.to_csv(Output_path + 'FTSE250_data_hourly_financial.csv')
ftse350_fin_data_hourly.to_csv(Output_path + 'FTSE350_data_hourly_financial.csv')


# Please note that the public API (without authentication) is limited to 2,000 requests per hour per IP (up to 48,000 requests a day),
# while the private API (with OAuth authentication using an API key) is limited to 20,000 requests per hour per IP and 100,000 requests per day per API key.
# Using concurrent.futures to parallelise the downloads will result in a temporary ban.

Downloading data for ADM...
[*********************100%***********************]  1 of 1 completed

1 Failed download:
['ADM']: JSONDecodeError('Expecting value: line 1 column 1 (char 0)')

Downloading data for ANTO...
Failed to get ticker 'ANTO' reason: Expecting value: line 1 column 1 (char 0)
[*********************100%***********************]  1 of 1 completed

1 Failed download:
['ANTO']: YFTzMissingError('$%ticker%: possibly delisted; no timezone found')

[Output truncated: every ticker in the log, from ADM through LAND, fails with one of these two errors.]

# Data Cleansing
In this step, we first exclude stocks that were merged, delisted or exchanged during the sample period; stocks with more than half of their data missing across the training period are also excluded. In practice, stocks with negative prices or low trading volume should be excluded as well, but since none of the stocks investigated show such characteristics, we ignore this for now.
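
The missing-data rule can be sketched as below. The function `drop_sparse_tickers` and the dictionary layout (ticker mapped to a list of prices, with `None` marking a missing observation) are assumptions for this example; it is written in plain Python so that it runs without the downloaded data.

```python
def drop_sparse_tickers(prices, max_missing=0.5):
    """Keep tickers whose share of missing prices is at most `max_missing`.

    Hypothetical helper: `prices` maps each ticker to a list of prices over
    the training period, with None marking a missing observation. Tickers
    with no observations at all are dropped.
    """
    kept = {}
    for ticker, series in prices.items():
        if not series:
            continue
        missing_share = sum(p is None for p in series) / len(series)
        if missing_share <= max_missing:
            kept[ticker] = series
    return kept


sample = {
    "AAA": [1.0, None, 2.0, 3.0],     # 25% missing: kept
    "BBB": [None, None, None, 1.0],   # 75% missing: dropped
    "CCC": [None, None, 1.0, 2.0],    # exactly half missing: kept
    "DDD": [],                        # no data: dropped
}
print(sorted(drop_sparse_tickers(sample)))
```

On a pandas DataFrame with one column per ticker, the equivalent filter would be `df.loc[:, df.isna().mean() <= 0.5]`.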