# Price Data Extraction for Post-Index Rebalancing Arbitrage Strategy

Data source: Refinitiv Datastream via WRDS

This notebook executes codes to extract relevant data, based on the historical records of FTSE100 and FTSE250 rebalancing, for the past 10 years (2013Q1 - 2023Q3).
In this exercise, the following stocks are excluded:
- Stocks that are suspended from trading within the analysis period (+/- 20 days from rebalancing date)
- Stocks which rebalancing dates fall within the announcement date and the ex date of a corporate action
- Stocks that we are unable to obtain a reliable historical data on
- All Q3 2023 rebalancing; at time of study we are unable to obtain 20 days after the rebalancing date

In [1]:
# Import WRDS library
import wrds
import numpy as np
import pandas as pd
from datetime import datetime, timedelta
import os

## Data Extraction

In [2]:
def read_sql_script(fname):
    fd = open(fname, 'r')
    sqlFile = fd.read()
    fd.close()

    return sqlFile


# Get current path
current_dir = os.getcwd()


# Define sql file names
# these will be used as a global variable
query_historical_prices = read_sql_script('/Users/abigail/Desktop/SMU/QF603/get_historical_prices.sql')
query_shares_outstanding = read_sql_script('/Users/abigail/Desktop/SMU/QF603/get_shares_outstanding.sql')

# Establish live connection; requires user login (passwords will be masked)
db = wrds.Connection() # this will be used as a global variable


def get_historical_prices(isin, start_date, end_date):
    
    print(f'Extracting historical prices for {isin}...')

    df =\
    (
        db
        .raw_sql(
            query_historical_prices.format(isin, start_date, end_date), 
            date_cols = ['trade_date']
            )
    )

    if df.empty:
        print('Dataframe is empty. No results was returned!')
    
    print('--------------------------------------------------')

    return df


Enter your WRDS username [abigail]:abigailcjh
Enter your password:········
WRDS recommends setting up a .pgpass file.
Create .pgpass file now [y/n]?: y
Created .pgpass file successfully.
You can create this file yourself at any time with the create_pgpass_file() function.
Loading library list...
Done


In [3]:
ftse_rebal = pd.read_csv('/Users/abigail/Desktop/SMU/QF603/ftse_10y_rebal_records.csv')
# ftse_rebal.head()

# hist_rebal_price = pd.read_csv('')

In [4]:
look_back = 40
look_forward = 40

ftse_rebal["Post Date"] =\
    pd.to_datetime(ftse_rebal["Post Date"], 
                   format = '%d/%m/%Y')

ftse_rebal["start_date"] =\
    (
        ftse_rebal["Post Date"] - timedelta(days = look_back)
    ).dt.strftime('%d/%m/%Y')

ftse_rebal["end_date"] =\
    (
        ftse_rebal["Post Date"] + timedelta(days = look_back)
    ).dt.strftime('%d/%m/%Y')

In [5]:
target_isins = ftse_rebal["ISIN"]
start_dates = ftse_rebal["start_date"]
end_dates = ftse_rebal["end_date"]   


historical_prices =\
    (
        pd.
        concat(
            map(
                get_historical_prices,
                target_isins,
                start_dates,
                end_dates
            )
        )
    )

Extracting historical prices for GI000A0F6407...
--------------------------------------------------
Extracting historical prices for GB00BF8Q6K64...
--------------------------------------------------
Extracting historical prices for GB00BM8NFJ84...
--------------------------------------------------
Extracting historical prices for GB00BMCYKB41...
--------------------------------------------------
Extracting historical prices for GB00B23K0M20...
--------------------------------------------------
Extracting historical prices for GB00BG5KQW09...
--------------------------------------------------
Extracting historical prices for GB00B14SKR37...
--------------------------------------------------
Extracting historical prices for GB0009633180...
--------------------------------------------------
Extracting historical prices for GB0001826634...
--------------------------------------------------
Extracting historical prices for GG00BMD8MJ76...
--------------------------------------------------


--------------------------------------------------
Extracting historical prices for GG00B1ZBD492...
--------------------------------------------------
Extracting historical prices for GB0031544546...
--------------------------------------------------
Extracting historical prices for JE00B6T5S470...
--------------------------------------------------
Extracting historical prices for GB0009039941...
--------------------------------------------------
Extracting historical prices for GB00B018CS46...
--------------------------------------------------
Extracting historical prices for GB0030474687...
--------------------------------------------------
Extracting historical prices for GB00BMV92D64...
--------------------------------------------------
Extracting historical prices for GB0001500809...
--------------------------------------------------
Extracting historical prices for GB00BYV8MN78...
--------------------------------------------------
Extracting historical prices for GB00BJTNFH41...


--------------------------------------------------
Extracting historical prices for GB0002418548...
--------------------------------------------------
Extracting historical prices for GB00BYYW3C20...
--------------------------------------------------
Extracting historical prices for IM00B5VQMV65...
--------------------------------------------------
Extracting historical prices for GB00BYYTFB60...
--------------------------------------------------
Extracting historical prices for GB00BKP36R26...
--------------------------------------------------
Extracting historical prices for GG00BJL5FH87...
--------------------------------------------------
Extracting historical prices for GB0003450359...
--------------------------------------------------
Extracting historical prices for GB0033195214...
--------------------------------------------------
Extracting historical prices for GB0007388407...
--------------------------------------------------
Extracting historical prices for GB00B1JQDM80...


--------------------------------------------------
Extracting historical prices for GB00B03HDJ73...
--------------------------------------------------
Extracting historical prices for GB00BFZNLB60...
--------------------------------------------------
Extracting historical prices for GB0002945029...
--------------------------------------------------
Extracting historical prices for IM00B5VQMV65...
--------------------------------------------------
Extracting historical prices for GB00B012TP20...
--------------------------------------------------
Extracting historical prices for GB00BKX5CN86...
--------------------------------------------------
Extracting historical prices for GB0004915632...
--------------------------------------------------
Extracting historical prices for GB00BJ62K685...
--------------------------------------------------
Extracting historical prices for GB00BGXQNP29...
--------------------------------------------------
Extracting historical prices for GB00B60BD277...


--------------------------------------------------
Extracting historical prices for GB00BLJNXL82...
--------------------------------------------------
Extracting historical prices for GB0007365546...
--------------------------------------------------
Extracting historical prices for GB00B7FC0762...
--------------------------------------------------
Extracting historical prices for GB00B41H7391...
--------------------------------------------------
Extracting historical prices for BMG702782084...
--------------------------------------------------
Extracting historical prices for GB00B1Z4ST84...
--------------------------------------------------
Extracting historical prices for GB00BDVZYZ77...
--------------------------------------------------
Extracting historical prices for GG00BV54HY67...
--------------------------------------------------
Extracting historical prices for GB00BLRLH124...
--------------------------------------------------
Extracting historical prices for GB00BJTNFH41...


--------------------------------------------------
Extracting historical prices for GB00B1QH8P22...
--------------------------------------------------
Extracting historical prices for GB00BVGBWW93...
--------------------------------------------------
Extracting historical prices for IE0002424939...
--------------------------------------------------
Extracting historical prices for GB00BCKFY513...
--------------------------------------------------
Extracting historical prices for GB00B01FLG62...
--------------------------------------------------
Extracting historical prices for GB00BYRJH519...
--------------------------------------------------
Extracting historical prices for GB0004478896...
--------------------------------------------------
Extracting historical prices for GB00BYXJC278...
--------------------------------------------------
Extracting historical prices for GB00B0HZPV38...
--------------------------------------------------
Extracting historical prices for GB0005758098...


--------------------------------------------------
Extracting historical prices for GB00BNGWY422...
--------------------------------------------------
Extracting historical prices for GB00BMQX2Q65...
--------------------------------------------------
Extracting historical prices for GB0001570810...
--------------------------------------------------
Extracting historical prices for GB00BMHTHT14...
--------------------------------------------------
Extracting historical prices for GB00B1YW4409...
--------------------------------------------------
Extracting historical prices for GI000A0F6407...
--------------------------------------------------
Extracting historical prices for GB00BJTNFH41...
--------------------------------------------------
Extracting historical prices for GB00BKRV3L73...
--------------------------------------------------
Extracting historical prices for GB0004228648...
--------------------------------------------------
Extracting historical prices for GB0006834344...


--------------------------------------------------
Extracting historical prices for GB0004866223...
--------------------------------------------------
Extracting historical prices for GB00B0SWJX34...
--------------------------------------------------
Extracting historical prices for GB00B7FC0762...
--------------------------------------------------
Extracting historical prices for GB00B0D5V538...
--------------------------------------------------
Extracting historical prices for GB00B03HDJ73...
--------------------------------------------------
Extracting historical prices for GB00B1VYCH82 ...
Dataframe is empty. No results was returned!
--------------------------------------------------


  pd.


In [8]:
historical_prices.head()

Unnamed: 0,trade_date,security_code,security_name,primary_exchange,refinitiv_code,isin_code,currency,open,high,low,close,volume
0,2023-08-09,18982.0,888 HOLDINGS,LON,26862.0,GI000A0F6407,GBP,1.066,1.136,1.063,1.117,358252.0
1,2023-08-10,18982.0,888 HOLDINGS,LON,26862.0,GI000A0F6407,GBP,1.14,1.14,1.091,1.12,284950.0
2,2023-08-11,18982.0,888 HOLDINGS,LON,26862.0,GI000A0F6407,GBP,1.1,1.331,1.098,1.15,1003896.0
3,2023-08-14,18982.0,888 HOLDINGS,LON,26862.0,GI000A0F6407,GBP,1.147,1.16,1.088108,1.096,1088784.0
4,2023-08-15,18982.0,888 HOLDINGS,LON,26862.0,GI000A0F6407,GBP,1.1,1.126,1.001,1.114,1127118.0


## Data Cleaning

In [6]:

index_close_data = pd.read_csv('/Users/abigail/Desktop/SMU/QF603/FTSE_100_Index_10y.csv', header = 0)

index_close_data["Date"] =\
    (
        index_close_data["Date"]
        .apply(lambda x: datetime.strptime(x,'%d/%m/%y'))
    )


# index_close_data.index[(index_close_data["Date"] == lst_start_date[100])]

In [7]:
index_close_data.head()

Unnamed: 0,Date,Close,Net,%Chg,Open,Low,High,Volume,Turnover - GBP,Flow
0,2012-02-17,5905.07,19.69,0.33%,5885.38,5885.38,5923.62,1101910955,405068.0,-404602.0
1,2012-02-20,5945.25,40.18,0.68%,5905.07,5905.07,5956.33,723669610,270963.0,-133639.0
2,2012-02-21,5928.2,-17.05,-0.29%,5945.25,5916.58,5948.84,846269111,326004.0,-459643.0
3,2012-02-22,5916.55,-11.65,-0.20%,5928.2,5894.6,5937.96,872088943,327016.0,-786659.0
4,2012-02-23,5937.89,21.34,0.36%,5916.55,5900.5,5952.47,1071356849,340406.0,-446253.0


In [None]:
index_close_data.index[(full_ftse_data["trade_date"][x])]

In [10]:
full_ftse_data = historical_prices.copy()

In [None]:
full_ftse_data = historical_prices.copy()

full_ftse_data["index_close"] = \
    (
        full_ftse_data["index_close"]
        .apply(lambda x: datetime.strptime(x,'%d/%m/%y'))
    )



## HEREEEE

In [37]:

close_price = []
trade_date = list(full_ftse_data.trade_date)
count = -1

for date in trade_date:
    count = count  + 1
    if index_close_data.index[(index_close_data["Date"] == date)].empty:
        look_back = 1
        while look_back < 10: 
            if index_close_data.index[(index_close_data["Date"] == date)].empty:
                print(date)
                date = (date - timedelta(days = look_back)).strftime('%d/%m/%Y')
                #(trade_date[4290] - timedelta(days = look_back)).strftime('%d/%m/%Y')
            else:
                idx = index_close_data.index[(index_close_data["Date"] == date)][0]
                close_price.append(index_close_data["Date"][idx])
                print(look_back, count, idx, date)
                break
            looks_back = look_back + 1
    else:
        idx = index_close_data.index[(index_close_data["Date"] == date)][0]
        close_price.append(index_close_data["Date"][idx])
        print(count, idx, date)


0 2897 2023-08-09 00:00:00
1 2898 2023-08-10 00:00:00
2 2899 2023-08-11 00:00:00
3 2900 2023-08-14 00:00:00
4 2901 2023-08-15 00:00:00
5 2902 2023-08-16 00:00:00
6 2903 2023-08-17 00:00:00
7 2904 2023-08-18 00:00:00
8 2905 2023-08-21 00:00:00
9 2906 2023-08-22 00:00:00
10 2907 2023-08-23 00:00:00
11 2908 2023-08-24 00:00:00
12 2909 2023-08-25 00:00:00
13 2910 2023-08-29 00:00:00
14 2911 2023-08-30 00:00:00
15 2912 2023-08-31 00:00:00
16 2913 2023-09-01 00:00:00
17 2914 2023-09-04 00:00:00
18 2915 2023-09-05 00:00:00
19 2916 2023-09-06 00:00:00
20 2917 2023-09-07 00:00:00
21 2918 2023-09-08 00:00:00
22 2919 2023-09-11 00:00:00
23 2920 2023-09-12 00:00:00
24 2921 2023-09-13 00:00:00
25 2922 2023-09-14 00:00:00
26 2923 2023-09-15 00:00:00
27 2924 2023-09-18 00:00:00
28 2925 2023-09-19 00:00:00
29 2926 2023-09-20 00:00:00
30 2927 2023-09-21 00:00:00
31 2928 2023-09-22 00:00:00
32 2929 2023-09-25 00:00:00
33 2930 2023-09-26 00:00:00
34 2931 2023-09-27 00:00:00
35 2932 2023-09-28 00:00:00
36

549 2906 2023-08-22 00:00:00
550 2907 2023-08-23 00:00:00
551 2908 2023-08-24 00:00:00
552 2909 2023-08-25 00:00:00
553 2910 2023-08-29 00:00:00
554 2911 2023-08-30 00:00:00
555 2912 2023-08-31 00:00:00
556 2913 2023-09-01 00:00:00
557 2914 2023-09-04 00:00:00
558 2915 2023-09-05 00:00:00
559 2916 2023-09-06 00:00:00
560 2917 2023-09-07 00:00:00
561 2918 2023-09-08 00:00:00
562 2919 2023-09-11 00:00:00
563 2920 2023-09-12 00:00:00
564 2921 2023-09-13 00:00:00
565 2922 2023-09-14 00:00:00
566 2923 2023-09-15 00:00:00
567 2924 2023-09-18 00:00:00
568 2925 2023-09-19 00:00:00
569 2926 2023-09-20 00:00:00
570 2927 2023-09-21 00:00:00
571 2928 2023-09-22 00:00:00
572 2929 2023-09-25 00:00:00
573 2930 2023-09-26 00:00:00
574 2931 2023-09-27 00:00:00
575 2932 2023-09-28 00:00:00
576 2933 2023-09-29 00:00:00
577 2934 2023-10-02 00:00:00
578 2935 2023-10-03 00:00:00
579 2936 2023-10-04 00:00:00
580 2937 2023-10-05 00:00:00
581 2938 2023-10-06 00:00:00
582 2939 2023-10-09 00:00:00
583 2940 2023-

936 2869 2023-06-30 00:00:00
937 2870 2023-07-03 00:00:00
938 2871 2023-07-04 00:00:00
939 2872 2023-07-05 00:00:00
940 2873 2023-07-06 00:00:00
941 2874 2023-07-07 00:00:00
942 2875 2023-07-10 00:00:00
943 2876 2023-07-11 00:00:00
944 2877 2023-07-12 00:00:00
945 2878 2023-07-13 00:00:00
946 2879 2023-07-14 00:00:00
947 2880 2023-07-17 00:00:00
948 2881 2023-07-18 00:00:00
949 2882 2023-07-19 00:00:00
950 2883 2023-07-20 00:00:00
951 2884 2023-07-21 00:00:00
952 2885 2023-07-24 00:00:00
953 2886 2023-07-25 00:00:00
954 2887 2023-07-26 00:00:00
955 2888 2023-07-27 00:00:00
956 2889 2023-07-28 00:00:00
957 2833 2023-05-10 00:00:00
958 2834 2023-05-11 00:00:00
959 2835 2023-05-12 00:00:00
960 2836 2023-05-15 00:00:00
961 2837 2023-05-16 00:00:00
962 2838 2023-05-17 00:00:00
963 2839 2023-05-18 00:00:00
964 2840 2023-05-19 00:00:00
965 2841 2023-05-22 00:00:00
966 2842 2023-05-23 00:00:00
967 2843 2023-05-24 00:00:00
968 2844 2023-05-25 00:00:00
969 2845 2023-05-26 00:00:00
970 2846 2023-

1330 2864 2023-06-23 00:00:00
1331 2865 2023-06-26 00:00:00
1332 2866 2023-06-27 00:00:00
1333 2867 2023-06-28 00:00:00
1334 2868 2023-06-29 00:00:00
1335 2869 2023-06-30 00:00:00
1336 2870 2023-07-03 00:00:00
1337 2871 2023-07-04 00:00:00
1338 2872 2023-07-05 00:00:00
1339 2873 2023-07-06 00:00:00
1340 2874 2023-07-07 00:00:00
1341 2875 2023-07-10 00:00:00
1342 2876 2023-07-11 00:00:00
1343 2877 2023-07-12 00:00:00
1344 2878 2023-07-13 00:00:00
1345 2879 2023-07-14 00:00:00
1346 2880 2023-07-17 00:00:00
1347 2881 2023-07-18 00:00:00
1348 2882 2023-07-19 00:00:00
1349 2883 2023-07-20 00:00:00
1350 2884 2023-07-21 00:00:00
1351 2885 2023-07-24 00:00:00
1352 2886 2023-07-25 00:00:00
1353 2887 2023-07-26 00:00:00
1354 2888 2023-07-27 00:00:00
1355 2889 2023-07-28 00:00:00
1356 2833 2023-05-10 00:00:00
1357 2834 2023-05-11 00:00:00
1358 2835 2023-05-12 00:00:00
1359 2836 2023-05-15 00:00:00
1360 2837 2023-05-16 00:00:00
1361 2838 2023-05-17 00:00:00
1362 2839 2023-05-18 00:00:00
1363 2840 

1715 2791 2023-03-07 00:00:00
1716 2792 2023-03-08 00:00:00
1717 2793 2023-03-09 00:00:00
1718 2794 2023-03-10 00:00:00
1719 2795 2023-03-13 00:00:00
1720 2796 2023-03-14 00:00:00
1721 2797 2023-03-15 00:00:00
1722 2798 2023-03-16 00:00:00
1723 2799 2023-03-17 00:00:00
1724 2800 2023-03-20 00:00:00
1725 2801 2023-03-21 00:00:00
1726 2802 2023-03-22 00:00:00
1727 2803 2023-03-23 00:00:00
1728 2804 2023-03-24 00:00:00
1729 2805 2023-03-27 00:00:00
1730 2806 2023-03-28 00:00:00
1731 2807 2023-03-29 00:00:00
1732 2808 2023-03-30 00:00:00
1733 2809 2023-03-31 00:00:00
1734 2810 2023-04-03 00:00:00
1735 2811 2023-04-04 00:00:00
1736 2812 2023-04-05 00:00:00
1737 2813 2023-04-06 00:00:00
1738 2814 2023-04-11 00:00:00
1739 2815 2023-04-12 00:00:00
1740 2816 2023-04-13 00:00:00
1741 2817 2023-04-14 00:00:00
1742 2818 2023-04-17 00:00:00
1743 2819 2023-04-18 00:00:00
1744 2820 2023-04-19 00:00:00
1745 2821 2023-04-20 00:00:00
1746 2822 2023-04-21 00:00:00
1747 2823 2023-04-24 00:00:00
1748 2824 

2159 2731 2022-12-08 00:00:00
2160 2732 2022-12-09 00:00:00
2161 2733 2022-12-12 00:00:00
2162 2734 2022-12-13 00:00:00
2163 2735 2022-12-14 00:00:00
2164 2736 2022-12-15 00:00:00
2165 2737 2022-12-16 00:00:00
2166 2738 2022-12-19 00:00:00
2167 2739 2022-12-20 00:00:00
2168 2740 2022-12-21 00:00:00
2169 2741 2022-12-22 00:00:00
2170 2742 2022-12-23 00:00:00
2171 2743 2022-12-28 00:00:00
2172 2744 2022-12-29 00:00:00
2173 2745 2022-12-30 00:00:00
2174 2746 2023-01-03 00:00:00
2175 2710 2022-11-09 00:00:00
2176 2711 2022-11-10 00:00:00
2177 2712 2022-11-11 00:00:00
2178 2713 2022-11-14 00:00:00
2179 2714 2022-11-15 00:00:00
2180 2715 2022-11-16 00:00:00
2181 2716 2022-11-17 00:00:00
2182 2717 2022-11-18 00:00:00
2183 2718 2022-11-21 00:00:00
2184 2719 2022-11-22 00:00:00
2185 2720 2022-11-23 00:00:00
2186 2721 2022-11-24 00:00:00
2187 2722 2022-11-25 00:00:00
2188 2723 2022-11-28 00:00:00
2189 2724 2022-11-29 00:00:00
2190 2725 2022-11-30 00:00:00
2191 2726 2022-12-01 00:00:00
2192 2727 

2598 2681 2022-09-29 00:00:00
2599 2682 2022-09-30 00:00:00
2600 2683 2022-10-03 00:00:00
2601 2684 2022-10-04 00:00:00
2602 2685 2022-10-05 00:00:00
2603 2686 2022-10-06 00:00:00
2604 2687 2022-10-07 00:00:00
2605 2688 2022-10-10 00:00:00
2606 2689 2022-10-11 00:00:00
2607 2690 2022-10-12 00:00:00
2608 2691 2022-10-13 00:00:00
2609 2692 2022-10-14 00:00:00
2610 2693 2022-10-17 00:00:00
2611 2694 2022-10-18 00:00:00
2612 2695 2022-10-19 00:00:00
2613 2696 2022-10-20 00:00:00
2614 2697 2022-10-21 00:00:00
2615 2698 2022-10-24 00:00:00
2616 2699 2022-10-25 00:00:00
2617 2700 2022-10-26 00:00:00
2618 2701 2022-10-27 00:00:00
2619 2702 2022-10-28 00:00:00
2620 2647 2022-08-10 00:00:00
2621 2648 2022-08-11 00:00:00
2622 2649 2022-08-12 00:00:00
2623 2650 2022-08-15 00:00:00
2624 2651 2022-08-16 00:00:00
2625 2652 2022-08-17 00:00:00
2626 2653 2022-08-18 00:00:00
2627 2654 2022-08-19 00:00:00
2628 2655 2022-08-22 00:00:00
2629 2656 2022-08-23 00:00:00
2630 2657 2022-08-24 00:00:00
2631 2658 

3023 2658 2022-08-25 00:00:00
3024 2659 2022-08-26 00:00:00
3025 2660 2022-08-30 00:00:00
3026 2661 2022-08-31 00:00:00
3027 2662 2022-09-01 00:00:00
3028 2663 2022-09-02 00:00:00
3029 2664 2022-09-05 00:00:00
3030 2665 2022-09-06 00:00:00
3031 2666 2022-09-07 00:00:00
3032 2667 2022-09-08 00:00:00
3033 2668 2022-09-09 00:00:00
3034 2669 2022-09-12 00:00:00
3035 2670 2022-09-13 00:00:00
3036 2671 2022-09-14 00:00:00
3037 2672 2022-09-15 00:00:00
3038 2673 2022-09-16 00:00:00
3039 2674 2022-09-20 00:00:00
3040 2675 2022-09-21 00:00:00
3041 2676 2022-09-22 00:00:00
3042 2677 2022-09-23 00:00:00
3043 2678 2022-09-26 00:00:00
3044 2679 2022-09-27 00:00:00
3045 2680 2022-09-28 00:00:00
3046 2681 2022-09-29 00:00:00
3047 2682 2022-09-30 00:00:00
3048 2683 2022-10-03 00:00:00
3049 2684 2022-10-04 00:00:00
3050 2685 2022-10-05 00:00:00
3051 2686 2022-10-06 00:00:00
3052 2687 2022-10-07 00:00:00
3053 2688 2022-10-10 00:00:00
3054 2689 2022-10-11 00:00:00
3055 2690 2022-10-12 00:00:00
3056 2691 

3415 2595 2022-05-26 00:00:00
3416 2596 2022-05-27 00:00:00
3417 2597 2022-05-30 00:00:00
3418 2598 2022-05-31 00:00:00
3419 2599 2022-06-01 00:00:00
3420 2600 2022-06-06 00:00:00
3421 2601 2022-06-07 00:00:00
3422 2602 2022-06-08 00:00:00
3423 2603 2022-06-09 00:00:00
3424 2604 2022-06-10 00:00:00
3425 2605 2022-06-13 00:00:00
3426 2606 2022-06-14 00:00:00
3427 2607 2022-06-15 00:00:00
3428 2608 2022-06-16 00:00:00
3429 2609 2022-06-17 00:00:00
3430 2610 2022-06-20 00:00:00
3431 2611 2022-06-21 00:00:00
3432 2612 2022-06-22 00:00:00
3433 2613 2022-06-23 00:00:00
3434 2614 2022-06-24 00:00:00
3435 2615 2022-06-27 00:00:00
3436 2616 2022-06-28 00:00:00
3437 2617 2022-06-29 00:00:00
3438 2618 2022-06-30 00:00:00
3439 2619 2022-07-01 00:00:00
3440 2620 2022-07-04 00:00:00
3441 2621 2022-07-05 00:00:00
3442 2622 2022-07-06 00:00:00
3443 2623 2022-07-07 00:00:00
3444 2624 2022-07-08 00:00:00
3445 2625 2022-07-11 00:00:00
3446 2626 2022-07-12 00:00:00
3447 2627 2022-07-13 00:00:00
3448 2628 

3819 2607 2022-06-15 00:00:00
3820 2608 2022-06-16 00:00:00
3821 2609 2022-06-17 00:00:00
3822 2610 2022-06-20 00:00:00
3823 2611 2022-06-21 00:00:00
3824 2612 2022-06-22 00:00:00
3825 2613 2022-06-23 00:00:00
3826 2614 2022-06-24 00:00:00
3827 2615 2022-06-27 00:00:00
3828 2616 2022-06-28 00:00:00
3829 2617 2022-06-29 00:00:00
3830 2618 2022-06-30 00:00:00
3831 2619 2022-07-01 00:00:00
3832 2620 2022-07-04 00:00:00
3833 2621 2022-07-05 00:00:00
3834 2622 2022-07-06 00:00:00
3835 2623 2022-07-07 00:00:00
3836 2624 2022-07-08 00:00:00
3837 2625 2022-07-11 00:00:00
3838 2626 2022-07-12 00:00:00
3839 2627 2022-07-13 00:00:00
3840 2628 2022-07-14 00:00:00
3841 2629 2022-07-15 00:00:00
3842 2630 2022-07-18 00:00:00
3843 2631 2022-07-19 00:00:00
3844 2632 2022-07-20 00:00:00
3845 2633 2022-07-21 00:00:00
3846 2634 2022-07-22 00:00:00
3847 2635 2022-07-25 00:00:00
3848 2636 2022-07-26 00:00:00
3849 2637 2022-07-27 00:00:00
3850 2638 2022-07-28 00:00:00
3851 2639 2022-07-29 00:00:00
3852 2584 

4276 2555 2022-03-28 00:00:00
4277 2556 2022-03-29 00:00:00
4278 2557 2022-03-30 00:00:00
4279 2558 2022-03-31 00:00:00
4280 2559 2022-04-01 00:00:00
4281 2560 2022-04-04 00:00:00
4282 2561 2022-04-05 00:00:00
4283 2562 2022-04-06 00:00:00
4284 2563 2022-04-07 00:00:00
4285 2564 2022-04-08 00:00:00
4286 2565 2022-04-11 00:00:00
4287 2566 2022-04-12 00:00:00
4288 2567 2022-04-13 00:00:00
4289 2568 2022-04-14 00:00:00
2022-04-18 00:00:00
17/04/2022


TypeError: unsupported operand type(s) for -: 'str' and 'datetime.timedelta'

In [None]:
# full_ftse_data = pd.read_csv('/Users/abigail/Desktop/SMU/QF603/historical_prices_ftse_full.csv', header = 0)

# full_ftse_data["trade_date"] =\
#     (
#         full_ftse_data["trade_date"]
#         .apply(lambda x: datetime.strptime(x,'%d/%m/%Y'))
#     )



In [None]:
# Here, we only filter for stocks that are listed on LSEG
# There are stocks that somehow the datastream returns the primary stock listed on other exchanges
# Those stocks should not be part of the analysis

lse_historical_prices = full_ftse_data.loc[full_ftse_data.primary_exchange == 'LON', :].copy()
lse_historical_prices.close.isna().sum()

In [None]:
rebal_round = {
    1 : 'Q4',
    2 : 'Q1',
    3 : 'Q1',
    4 : 'Q1',
    5 : 'Q2',
    6 : 'Q2',
    7 : 'Q2',
    8 : 'Q3',
    9 : 'Q3',
    10 : 'Q3',
    11 : 'Q4',
    12 : 'Q4',
}

In [None]:
lse_historical_prices['year'] = lse_historical_prices['trade_date'].dt.year
lse_historical_prices['month'] = lse_historical_prices['trade_date'].dt.month
lse_historical_prices['rebal'] =\
(
    (lse_historical_prices['year'] 
     - 1*(lse_historical_prices['month'] == 1)).astype(str)
    + lse_historical_prices['month'].map(rebal_round)
)

lse_historical_prices.head()

In [None]:
# Remove Q3 2023 Rebal due to incomplete data
lse_historical_prices =\
    lse_historical_prices[lse_historical_prices['rebal'] != '2023Q3']

lse_historical_prices.head()

In [None]:
lse_historical_prices =\
    lse_historical_prices\
    .sort_values(by = ['security_name', 'trade_date'])\
    .reset_index(drop = True)

In [None]:
# full_ftse_data.head()
lst_start_date[100]

## Get FTSE100/Stock to get Beta
Generate FTSE 100 Index Close based on Trade dates of Historical Price DF - get covariance for stock & index and variance for index

## ERRORS M1 (read M2 as well too pls i think taht might be better, was trying to work off what you wrote)
Trying to get data with beta and cov and all but kept getting errors
- was using start date and post date to get price data in a list and run cov/cor/var
- would preferably want to get the price for pre 20/5/3/1 and post 3/5/10/20 close index as well from this loop

In [None]:
lst_isin = list(ftse_rebal["ISIN"])
lst_start_date = list(ftse_rebal["start_date"])
lst_post_date = list(ftse_rebal["Post Date"])
lst_stock_price = list(full_ftse_data["close"])
lst_index_close = list(full_ftse_data["ftse100_close"])

In [None]:
def get_beta(df, isin, start_date, post_date, stock_price, index_close):
    lst_cov = []
    lst_var = []
    lst_beta = []
    lst_corr = []
    
    lst_merge = []
    
    for i in range(len(isin)):
        try:
            index_start = df.index[(df["trade_date"] == start_date[i]) & (df["isin_code"] == isin[i])][0]
            index_end = df.index[(df["trade_date"] == post_date[i]) & (df["isin_code"] == isin[i])][0]

            lst_stock_price = stock_price[index_start:index_end]
            lst_index_close = index_close[index_start:index_end]
            lst_joint = []
            lst_joint.append(lst_stock_price)
            lst_joint.append(lst_index_close)

            corr = np.corrcoef(lst_joint)[0][1]
            cov = np.cov(lst_joint)[0][1]
            var = np.var(lst_index_close)

            beta = cov / var

            lst_cov.append(cov)
            lst_var.append(var)
            lst_beta.append(beta)
            lst_corr.append(corr)
        
        except IndexError:
            
            print(i, isin[i], start_date[i], post_date[i])
            lst_cov.append(0)
            lst_var.append(0)
            lst_beta.append(0)
            lst_corr.append(0)
        next

    lst_merge.append(lst_cov)
    lst_merge.append(lst_var)
    lst_merge.append(lst_beta)
    lst_merge.append(lst_corr)

    return lst_merge
    

In [None]:
# cal_relation = get_beta(full_ftse_data, lst_isin, lst_start_date, lst_post_date, lst_stock_price, lst_index_close)

# ftse_rebal["covariance"] = cal_relation[0]
# ftse_rebal["index_var"] = cal_relation[1]
# ftse_rebal["beta"] = cal_relation[2]
# ftse_rebal["correlation"] = cal_relation[3]


## ERRORS M2
TRIED USING THE CODE TOU WROTE TO GET AROUND IT BUT IT ENDED UP SKIPPING THE ENTIRE LIST :(
- was using post date and (post date index - 20)to get price data in a list and run cov/cor/var
- would preferably want to get the price for pre 20/5/3/1 and post 3/5/10/20 close index as well from this loop

In [None]:
target_isins= ftse_rebal["ISIN"]
rebal_dates = ftse_rebal["Post Date"].dt.strftime('%d/%m/%Y')

stock_price = list(full_ftse_data["close"])
index_close = list(full_ftse_data["ftse100_close"])

target_rebal_prices = []


for isin, rebal_date in zip(target_isins, rebal_dates):
    # Remove stocks that are suspended from trading during the analysis period
    if (isin, rebal_date) in [('GB00BJP5HK17', '19/12/2022'), 
                              ('GB00B1VNST91', '18/06/2018'),
                              ('GB0007892358', '19/06/2017')]:
        continue
    if lse_historical_prices[(lse_historical_prices.isin_code == isin) 
                             & (lse_historical_prices.trade_date == rebal_date)].empty:
        print(f'ISIN {isin} for {rebal_date} is excluded from studies!')
        print(isin, rebal_date)
    
    else:
        sub_df = lse_historical_prices[lse_historical_prices.isin_code == isin]
        rebal_idx =\
        (
            sub_df
            .index[sub_df.trade_date == rebal_date]
            [0]
        )
        
        for delta in [-20, -5, -3, -1, 3, 5, 10, 20]:
            # Ensure that the prices for the days required exist
            assert sub_df['rebal'].loc[rebal_idx] == sub_df['rebal'].loc[rebal_idx + delta],\
            f'ISIN {isin} faced insufficient data pre-rebal on {rebal_date} for delta {delta} days'
        
        pre_20_pd = sub_df.close.loc[rebal_idx - 20]
        pre_5_pd = sub_df.close.loc[rebal_idx - 5]
        pre_3_pd = sub_df.close.loc[rebal_idx - 3]
        pre_1_pd = sub_df.close.loc[rebal_idx - 1]
        post_3_pd = sub_df.close.loc[rebal_idx + 3]
        post_5_pd = sub_df.close.loc[rebal_idx + 5]
        post_10_pd = sub_df.close.loc[rebal_idx + 10]
        post_20_pd = sub_df.close.loc[rebal_idx + 20]
        
        start_idx = rebal_idx - 20
        
        lst_stock_price = stock_price[start_idx:rebal_idx]
        lst_index_close = index_close[start_idx:rebal_idx]
        lst_joint = []
        lst_joint.append(lst_stock_price)
        lst_joint.append(lst_index_close)
        
        cov = np.cov(lst_joint)[0][1]
        var = np.var(lst_joint)
        beta = cov/var
        corr = np.corrcoef(lst_joint)[0][1]
        
        
        target_rebal_prices.append({
            'Name' : sub_df.security_name.values[0],
            'ISIN' : isin,
            'post_date' : rebal_date,
            'pre_twenty_pd' : pre_20_pd,
            'pre_five_pd' : pre_5_pd,
            'pre_three_pd' : pre_3_pd,
            'pre_one_pd' : pre_1_pd,
            'post_three_pd' : post_3_pd,
            'post_five_pd' : post_5_pd,
            'post_ten_pd' : post_10_pd,
            'post_twenty_pd' : post_20_pd,
            'Cov': cov,
            'Var': var,
            'Beta': cov,
            'Corr': corr,
            
        })
    

In [None]:
lse_historical_prices.to_csv(
            '/Users/abigail/Desktop/SMU/QF603/output/error_find_10y.csv',
            index = False)

In [None]:
len(target_isins)

In [None]:
len(rebal_dates)

In [15]:
full_ftse_data.trade_date

0    2023-08-09
1    2023-08-10
2    2023-08-11
3    2023-08-14
4    2023-08-15
        ...    
51   2013-04-22
52   2013-04-23
53   2013-04-24
54   2013-04-25
55   2013-04-26
Name: trade_date, Length: 32690, dtype: datetime64[ns]

In [16]:
index_close_data["Date"][0]

Timestamp('2012-02-17 00:00:00')

In [22]:
index_close_data

Unnamed: 0,Date,Close,Net,%Chg,Open,Low,High,Volume,Turnover - GBP,Flow
0,2012-02-17,5905.07,19.69,0.33%,5885.38,5885.38,5923.62,1101910955,405068.00,-404602.00
1,2012-02-20,5945.25,40.18,0.68%,5905.07,5905.07,5956.33,723669610,270963.00,-133639.00
2,2012-02-21,5928.20,-17.05,-0.29%,5945.25,5916.58,5948.84,846269111,326004.00,-459643.00
3,2012-02-22,5916.55,-11.65,-0.20%,5928.20,5894.60,5937.96,872088943,327016.00,-786659.00
4,2012-02-23,5937.89,21.34,0.36%,5916.55,5900.50,5952.47,1071356849,340406.00,-446253.00
...,...,...,...,...,...,...,...,...,...,...
2939,2023-10-09,7492.21,-2.37,-0.03%,7494.58,7473.19,7540.57,737059812,362378.90,135391242.47
2940,2023-10-10,7628.21,136.00,1.82%,7492.21,7492.21,7637.41,702687119,389471.63,135780714.10
2941,2023-10-11,7620.03,-8.18,-0.11%,7628.21,7608.67,7651.98,623846517,365213.16,135415500.94
2942,2023-10-12,7644.78,24.75,0.32%,7620.03,7620.03,7687.91,657764425,331299.89,135746800.83


In [24]:
len(trade_date)

32690

In [26]:
trade_date[4290]

Timestamp('2022-04-18 00:00:00')

In [29]:
index_close_data.index[(index_close_data["Date"] == trade_date[4290])].empty

True

In [36]:
(trade_date[4290] - timedelta(days = look_back)).strftime('%d/%m/%Y')

'17/04/2022'

In [33]:
look_back = 1

In [None]:
date = tra