## FIN523 Group Project
- Project Name: Global Core Asset Price Analysis

- Programme: FIN523 Quantitative Method for Finance

- Name: DING Yangyang

- Date: 2024/12/02

### Multiple Linear Regression
### Timespan 5: 2000-01-01 to 2023-12-31 (all)

### Step 0: Import Libraries

In [1]:
# Check that the required modules are importable (this cell does not install anything)
required_modules = [
    "numpy", "pandas", "loguru", "matplotlib", "scipy", "tqdm",
    "statsmodels", "sklearn", "yfinance", "tushare", "fredapi"
]

for module in required_modules:
    try:
        __import__(module)
        print(f"Module {module} is installed.")
    except ImportError:
        print(f"Module {module} is NOT installed. Please install it.")

Module numpy is installed.
Module pandas is installed.
Module loguru is installed.
Module matplotlib is installed.
Module scipy is installed.
Module tqdm is installed.
Module statsmodels is installed.
Module sklearn is installed.
Module yfinance is installed.
Module tushare is installed.
Module fredapi is installed.


In [2]:
import os
from codebase.const import TimeSpansForAnalysis, Tickers
from codebase.analyzer import FinancialDataAnalyzer

### Step 1: Data Preprocessing

In [3]:
# Select the time span for analysis (this notebook uses Timespan 5)
time_span = TimeSpansForAnalysis.SPAN_2000_2023  # Timespan 5: 2000-01-01 to 2023-12-31

os.makedirs("output", exist_ok=True)
# The enum value has the form "start|end"
start_date, end_date = time_span.value.split('|')
analyser = FinancialDataAnalyzer(start_date=start_date, end_date=end_date)
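
The `split('|')` call above implies that each `TimeSpansForAnalysis` member stores its range as a single `start|end` string. A minimal sketch of that pattern follows. `ExampleTimeSpans` and `parse_span` are hypothetical stand-ins, and the real definition in `codebase.const` may differ.

```python
from datetime import date
from enum import Enum


class ExampleTimeSpans(Enum):
    """Hypothetical stand-in for codebase.const.TimeSpansForAnalysis."""
    SPAN_2000_2023 = "2000-01-01|2023-12-31"


def parse_span(span: Enum) -> tuple[date, date]:
    """Split a 'start|end' enum value into two validated dates."""
    start_text, end_text = span.value.split("|")
    start = date.fromisoformat(start_text)
    end = date.fromisoformat(end_text)
    if start > end:
        raise ValueError(f"start {start} is after end {end}")
    return start, end


start, end = parse_span(ExampleTimeSpans.SPAN_2000_2023)
print(start, end)
```

Parsing into `date` objects up front catches a malformed or reversed range before any data is downloaded or read from cache.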

In [4]:
# Load the data
analyser.load_data(enable_cache=True)

2024-12-01 01:32:38.517 | INFO     | codebase.dataloader:load_or_get_data:52 - Loading [标普500] data from cache: data/2000-01-01_2023-12-31/标普500.csv
2024-12-01 01:32:38.521 | INFO     | codebase.dataloader:load_or_get_data:52 - Loading [纳斯达克] data from cache: data/2000-01-01_2023-12-31/纳斯达克.csv
2024-12-01 01:32:38.531 | INFO     | codebase.dataloader:load_or_get_data:52 - Loading [道琼斯] data from cache: data/2000-01-01_2023-12-31/道琼斯.csv
2024-12-01 01:32:38.534 | INFO     | codebase.dataloader:load_or_get_data:52 - Loading [万得全A] data from cache: data/2000-01-01_2023-12-31/万得全A.csv
2024-12-01 01:32:38.536 | INFO     | codebase.dataloader:load_or_get_data:52 - Loading [沪深300] data from cache: data/2000-01-01_2023-12-31/沪深300.csv

### Step 2: Calculate the Change Ratio of Each Factor

In [5]:
change_ratio_df = analyser.calculate_change_ratio()
change_ratio_df

| Factor | 2000-01-01 - 2023-12-31 |
| --- | --- |
| S&P500(%) | 227.773819 |
| NASDAQ(%) | 263.369763 |
| DOWJONES(%) | 231.846856 |
| WIND A(%) | 342.778581 |
| HS300 Index(%) | 160.631542 |
| ChiNext Index(%) | -48.543394 |
| Hang Seng Index(%) | -1.855193 |
| MSCI Developed Markets Index(%) | 122.838017 |
| MSCI Emerging Markets Index(%) | 106.307686 |
| US Dollar Index(%) | 1.107564 |
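
A change ratio of this kind is usually the percentage change between the first and last observation in the window. The sketch below assumes that definition. The implementation of `calculate_change_ratio` is not shown here, and the helper name and prices are illustrative only.

```python
def change_ratio_percent(first_value: float, last_value: float) -> float:
    """Percentage change from the first to the last observation."""
    if first_value == 0:
        raise ValueError("first_value must be non-zero")
    return (last_value / first_value - 1.0) * 100.0


# Illustrative prices only, not the project's data.
gain = change_ratio_percent(100.0, 327.773819)
loss = change_ratio_percent(200.0, 100.0)
print(f"gain: {gain:.2f}%, loss: {loss:.2f}%")
```

Under this definition a negative value, such as the ChiNext and Hang Seng rows, means the series ended the window below where it started.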


### Step 3: Calculate the Statistics for Each Factor

In [6]:
statistic_df = analyser.calculate_statistics()
covariance_df = analyser.calculate_covariance()  # separate name so statistic_df is not overwritten
correlation_df = analyser.calculate_correlation()
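
Covariance and correlation are closely linked: the Pearson correlation is the covariance divided by the product of the two standard deviations. The sketch below illustrates this with NumPy on synthetic series. It is not the analyser's implementation, and the data are made up for the example.

```python
import numpy as np

# Synthetic series for illustration only.
rng = np.random.default_rng(0)
x = rng.normal(size=500).cumsum()                 # random-walk "price"
y = 2.0 * x + rng.normal(scale=0.5, size=500)     # closely related series

data = np.vstack([x, y])      # one row per factor
cov = np.cov(data)            # sample covariance matrix (ddof=1)
corr = np.corrcoef(data)      # Pearson correlation matrix

# correlation_ij = covariance_ij / (std_i * std_j)
std = np.sqrt(np.diag(cov))
assert np.allclose(corr, cov / np.outer(std, std))
print(corr.round(4))
```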

### Step 4: Regression Analysis for High-correlation Factors

In [7]:
high_correlation_df = analyser.select_high_correlation(0.9)
high_correlation_df.shape

(10, 1)
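
The ten rows returned here correspond to the ten factor pairs analysed in the regression tables of this step. One way such a filter could work is sketched below. It assumes the threshold is applied to the absolute Pearson correlation, which would explain why a negatively related pair such as US Dollar Index|Euro to US Dollar is selected. The function name and data are hypothetical.

```python
import numpy as np


def high_correlation_pairs(names, data, threshold=0.9):
    """Return (name_a, name_b, r) for each pair with |r| >= threshold."""
    corr = np.corrcoef(data)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):   # upper triangle: each pair once
            if abs(corr[i, j]) >= threshold:
                pairs.append((names[i], names[j], float(corr[i, j])))
    return pairs


t = np.arange(200, dtype=float)
data = np.vstack([
    t,               # "A": linear trend
    3.0 * t + 5.0,   # "B": perfectly correlated with A
    -t,              # "C": perfectly anti-correlated with A
    np.sin(t),       # "D": oscillation, nearly uncorrelated with the trend
])
pairs = high_correlation_pairs(["A", "B", "C", "D"], data)
for a, b, r in pairs:
    print(f"{a}|{b}: r = {r:+.3f}")
```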

In [8]:
linear_regression_df = analyser.overall_linear_regression_analysis(plot=False) # plot=True to plot the regression
linear_regression_df

| Factor pair | Model | R^2 | p-value | t-value |
| --- | --- | --- | --- | --- |
| S&P500\|NASDAQ | Linear Regression | 0.982646 | 0.0 | 584.563556 |
| S&P500\|DOWJONES | Linear Regression | 0.98695 | 0.0 | 675.594263 |
| S&P500\|MSCI Developed Markets Index | Linear Regression | 0.97351 | 0.0 | 470.903986 |
| NASDAQ\|DOWJONES | Linear Regression | 0.957385 | 0.0 | 368.214608 |
| NASDAQ\|MSCI Developed Markets Index | Linear Regression | 0.946572 | 0.0 | 326.959134 |
| DOWJONES\|MSCI Developed Markets Index | Linear Regression | 0.965652 | 0.0 | 411.873301 |
| US Dollar Index\|Euro to US Dollar | Linear Regression | 0.961423 | 0.0 | -388.685512 |
| German 10-Year Treasury Yield\|Japanese 10-Year Treasury Yield | Linear Regression | 0.850873 | 0.0 | 40.395975 |
| German 10-Year Treasury Yield\|British 10-Year Treasury Yield | Linear Regression | 0.953796 | 0.0 | 76.83678 |
| Japanese 10-Year Treasury Yield\|British 10-Year Treasury Yield | Linear Regression | 0.810804 | 0.0 | 35.009471 |
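
The R^2 and t-value columns can be reproduced for any pair with a one-regressor least-squares fit. The sketch below uses NumPy on synthetic data. It is an illustration of the standard formulas, not the code behind `overall_linear_regression_analysis`, and `simple_ols` is a hypothetical helper.

```python
import numpy as np


def simple_ols(x, y):
    """Fit y = intercept + slope * x and report R^2 and the slope's t-value."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    x_mean, y_mean = x.mean(), y.mean()
    sxx = np.sum((x - x_mean) ** 2)
    slope = np.sum((x - x_mean) * (y - y_mean)) / sxx
    intercept = y_mean - slope * x_mean
    residuals = y - (intercept + slope * x)
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y_mean) ** 2)
    r_squared = 1.0 - ss_res / ss_tot
    # Standard error of the slope uses n - 2 residual degrees of freedom.
    se_slope = np.sqrt(ss_res / (n - 2) / sxx)
    return {
        "slope": float(slope),
        "intercept": float(intercept),
        "r_squared": float(r_squared),
        "t_value": float(slope / se_slope),
    }


# Synthetic data for illustration only.
rng = np.random.default_rng(42)
x = np.linspace(0.0, 10.0, 100)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)
fit = simple_ols(x, y)
print(fit)
```

With a single regressor, the two columns are tied together by t^2 = (n - 2) * R^2 / (1 - R^2). Taking n = 6037 from the OLS summary in this notebook, the S&P500|NASDAQ row gives a t-value of about 584.6, in line with the table. The sign of t follows the sign of the slope, which is why the US Dollar Index|Euro to US Dollar row is negative.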


In [9]:
polynomial_regression_df = analyser.overall_polynomial_regression_analysis(plot=False) # plot=True to plot the regression

                            OLS Regression Results                            
Dep. Variable:                      y   R-squared:                       0.986
Model:                            OLS   Adj. R-squared:                  0.986
Method:                 Least Squares   F-statistic:                 2.104e+05
Date:                Sun, 01 Dec 2024   Prob (F-statistic):               0.00
Time:                        01:32:39   Log-Likelihood:                -45549.
No. Observations:                6037   AIC:                         9.110e+04
Df Residuals:                    6034   BIC:                         9.112e+04
Df Model:                           2                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const      -1038.9862     31.980    -32.489      0.0

In [10]:
polynomial_regression_df

| Factor pair | Model | R^2 | p-value (F-test) | F-statistic |
| --- | --- | --- | --- | --- |
| S&P500\|NASDAQ | Polynomial Regression | 0.985862 | 0.0 | 210383.292268 |
| S&P500\|DOWJONES | Polynomial Regression | 0.991324 | 0.0 | 344726.568823 |
| S&P500\|MSCI Developed Markets Index | Polynomial Regression | 0.977378 | 0.0 | 130327.301324 |
| NASDAQ\|DOWJONES | Polynomial Regression | 0.973169 | 0.0 | 109425.937704 |
| NASDAQ\|MSCI Developed Markets Index | Polynomial Regression | 0.95575 | 0.0 | 65153.615454 |
| DOWJONES\|MSCI Developed Markets Index | Polynomial Regression | 0.965667 | 0.0 | 84842.727428 |
| US Dollar Index\|Euro to US Dollar | Polynomial Regression | 0.967836 | 0.0 | 91188.612629 |
| German 10-Year Treasury Yield\|Japanese 10-Year Treasury Yield | Polynomial Regression | 0.804703 | 0.0 | 492.388112 |
| German 10-Year Treasury Yield\|British 10-Year Treasury Yield | Polynomial Regression | 0.940399 | 0.0 | 1948.612018 |
| Japanese 10-Year Treasury Yield\|British 10-Year Treasury Yield | Polynomial Regression | 0.786894 | 0.0 | 481.870197 |
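
The OLS summary in this step reports `Df Model: 2`, which is consistent with a degree-2 polynomial (regressors x and x^2), although the degree used by `overall_polynomial_regression_analysis` is not shown. Under that assumption, the sketch below fits a quadratic with `numpy.polyfit` on synthetic data and derives R^2 and the overall F-statistic. `quadratic_fit` is a hypothetical helper.

```python
import numpy as np


def quadratic_fit(x, y):
    """Fit y = a*x^2 + b*x + c and return (coeffs, R^2, F-statistic)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    k = 2  # regressors excluding the constant: x and x^2
    coeffs = np.polyfit(x, y, deg=k)      # highest power first: [a, b, c]
    fitted = np.polyval(coeffs, x)
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r_squared = 1.0 - ss_res / ss_tot
    # Overall F-test: explained variance per regressor over residual variance.
    f_statistic = (r_squared / k) / ((1.0 - r_squared) / (n - k - 1))
    return coeffs, float(r_squared), float(f_statistic)


# Synthetic data for illustration only.
rng = np.random.default_rng(7)
x = np.linspace(0.0, 10.0, 200)
y = 0.5 * x**2 - 1.0 * x + 3.0 + rng.normal(scale=0.5, size=x.size)
coeffs, r_squared, f_statistic = quadratic_fit(x, y)
print(coeffs.round(3), round(r_squared, 4), round(f_statistic, 1))
```

Because a quadratic nests the straight line, its in-sample R^2 can never be lower than that of the linear fit on the same data, so a small gain in R^2 alone is weak evidence of curvature.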


### Step 5: Multi-factor Regression Analysis

In [11]:
multi_reg_df = analyser.overall_linear_regression_multi_factor_analysis(plot=False) # plot=True to plot the regression

In [12]:
multi_reg_df.sort_values(by='R^2', ascending=False)

| Dependent variable |  | Intercept | R^2 | MSE | Coefficient (Slope) US_INDPRO | t-value US_INDPRO | p-value US_INDPRO | Coefficient (Slope) US_DGORDER | t-value US_DGORDER | p-value US_DGORDER | Coefficient (Slope) US_RSAFS | ... | p-value US_MTSDS133FMS | Coefficient (Slope) US_HOUST | t-value US_HOUST | p-value US_HOUST | Coefficient (Slope) US_CSUSHPISA | t-value US_CSUSHPISA | p-value US_CSUSHPISA | Coefficient (Slope) FED_FUNDS_RATE | t-value FED_FUNDS_RATE | p-value FED_FUNDS_RATE |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DOWJONES | 0 | 80788.053014 | 0.982376 | 1266701.0 | -76.353003 | -0.893713 | 0.372746 | 0.01467398 | 1.31311 | 0.190926 | 0.07755632 | ... | 0.57855 | -1.500731 | -1.974428 | 0.04996247 | -26.083198 | -1.844716 | 0.06682918 | 428.126816 | 4.427853 | 1.703249e-05 |
| S&P500 | 0 | 20335.331473 | 0.981082 | 22392.14 | -37.935161 | -3.339668 | 0.001032 | 0.003198067 | 2.152435 | 0.032782 | 0.01077032 | ... | 0.408087 | -0.2350266 | -2.325652 | 0.02122495 | -1.859805 | -0.989294 | 0.3239337 | 80.880826 | 6.29152 | 2.602034e-09 |
| Gold Price | 0 | -6231.245639 | 0.97246 | 8049.004 | 3.374023 | 0.551895 | 0.581676 | 0.0006365462 | 0.754515 | 0.451484 | 0.004418993 | ... | 0.279109 | 0.05709995 | 1.021513 | 0.3083239 | -12.304105 | -11.689543 | 4.440864e-24 | 17.6475 | 2.450816 | 0.01516772 |
| NASDAQ | 0 | 84560.798337 | 0.965814 | 540213.6 | -225.746246 | -4.046197 | 7.9e-05 | 0.01495563 | 2.049332 | 0.041975 | 0.0521111 | ... | 0.616249 | -1.33668 | -2.6929 | 0.007796913 | -17.885086 | -1.93693 | 0.05442157 | 305.118576 | 4.832187 | 3.013822e-06 |
| MSCI Developed Markets Index | 0 | 13354.926682 | 0.962965 | 14349.31 | -22.657771 | -2.74316 | 0.006684 | 0.003247621 | 2.871549 | 0.004561 | 0.008153477 | ... | 0.320099 | -0.3390104 | -4.50127 | 1.192434e-05 | 1.659886 | 1.166795 | 0.2447949 | 53.321756 | 5.536568 | 1.045332e-07 |
| Oil Price | 0 | -456.006154 | 0.956846 | 27.62395 | 1.114923 | 2.742947 | 0.006766 | 0.0001271543 | 2.355378 | 0.019687 | 7.348171e-05 | ... | 0.30223 | -0.006512168 | -1.81193 | 0.07182636 | 0.021853 | 0.324296 | 0.746127 | -2.324393 | -4.841021 | 2.96472e-06 |
| German 10-Year Treasury Yield | 0 | 1.621345 | 0.953848 | 0.1562547 | 0.001501 | 0.066875 | 0.94673 | 9.8532e-06 | 3.123997 | 0.001978 | -5.536842e-06 | ... | 0.618665 | -0.0007776993 | -3.940253 | 0.000103627 | 0.016957 | 4.409736 | 1.493867e-05 | 0.150465 | 5.700605 | 3.113594e-08 |
| Japanese 10-Year Treasury Yield | 0 | -3.671096 | 0.943449 | 0.02277807 | -0.022591 | -2.636379 | 0.008863 | 1.945952e-06 | 1.615933 | 0.107272 | 7.625015e-07 | ... | 0.679662 | -0.0002618675 | -3.474977 | 0.0005947091 | 0.00883 | 6.0141 | 5.834464e-09 | 0.061001 | 6.053123 | 4.715366e-09 |
| Chinese Yuan to US Dollar | 0 | 0.214265 | 0.93992 | 1.085924e-05 | -0.000111 | -0.435806 | 0.663554 | 4.264333e-08 | 1.300063 | 0.195415 | 4.028067e-08 | ... | 0.185743 | -9.450641e-06 | -4.331384 | 2.581345e-05 | -1.6e-05 | -0.36494 | 0.7156295 | -0.002995 | -7.056438 | 4.645761e-11 |
| British 10-Year Treasury Yield | 0 | -5.815409 | 0.9165 | 0.2132727 | 0.033695 | 1.285048 | 0.199873 | 8.309081e-06 | 2.254941 | 0.024935 | -1.18916e-05 | ... | 0.68831 | -0.0005718074 | -2.479768 | 0.01375489 | 0.028234 | 6.284617 | 1.306678e-09 | 0.097618 | 3.165648 | 0.001723992 |
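
Each row above is one regression of a dependent variable on the same set of macro factors. The sketch below shows how the intercept, slopes, R^2, MSE and per-factor t-values of such a fit can be computed with NumPy. It runs on synthetic data, `multi_factor_ols` is a hypothetical helper, and MSE is assumed to be the mean squared residual over all observations, since the analyser's definition is not shown.

```python
import numpy as np


def multi_factor_ols(X, y):
    """Ordinary least squares of y on several factors plus a constant."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])       # prepend intercept column
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    ss_res = float(residuals @ residuals)
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    # Coefficient covariance: sigma^2 * (X'X)^-1 with n - k - 1 degrees of freedom.
    sigma2 = ss_res / (n - k - 1)
    cov_beta = sigma2 * np.linalg.inv(design.T @ design)
    t_values = beta / np.sqrt(np.diag(cov_beta))
    return {
        "intercept": float(beta[0]),
        "coefficients": beta[1:],
        "t_values": t_values[1:],
        "r_squared": 1.0 - ss_res / ss_tot,
        "mse": ss_res / n,   # assumption: mean squared residual over n
    }


# Synthetic factors and response for illustration only.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 3))
y = 1.5 + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=300)
result = multi_factor_ols(X, y)
print(result["coefficients"].round(3), round(result["r_squared"], 4))
```

A two-sided p-value for each factor follows from its t-value and the Student t distribution with n - k - 1 degrees of freedom, for example via `scipy.stats.t.sf`.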
