**REGRESSION ANALYSIS**

In [None]:
# Import the libraries
import pandas as pd
import numpy as np
from scipy import stats
import statsmodels.api as sm
from scipy.stats import ttest_ind

In [None]:
# Mount the drive to access dataset
from google.colab import drive
drive.mount('/content/mount')

Mounted at /content/mount


In [None]:
# Load the dataset from the drive
Dataset = pd.read_excel('/content/mount/MyDrive/Python/data_inflation_stock_return.xlsx')
Dataset

Unnamed: 0,Year,inflation,sp500 Change,avg_sp500_percentage _change
0,1928,-1.7,0.3788,37.88
1,1929,0.0,-0.1191,-11.91
2,1930,-2.3,-0.2848,-28.48
3,1931,-9.0,-0.4707,-47.07
4,1932,-9.9,-0.1515,-15.15
...,...,...,...,...
91,2019,1.8,0.2888,28.88
92,2020,1.2,0.1626,16.26
93,2021,4.7,0.2689,26.89
94,2022,8.0,-0.1944,-19.44


In [None]:
# Describe the dataset
Dataset.describe()

Unnamed: 0,Year,inflation,sp500 Change,avg_sp500_percentage _change
count,96.0,96.0,96.0,96.0
mean,1975.5,3.09375,0.07864,7.863958
std,27.856777,3.759194,0.191506,19.150558
min,1928.0,-9.9,-0.4707,-47.07
25%,1951.75,1.5,-0.06015,-6.015
50%,1975.5,2.8,0.1109,11.09
75%,1999.25,4.325,0.221175,22.1175
max,2023.0,14.4,0.4659,46.59


**Performing Regression Analysis**

In [None]:
# Showing the Independent Variable
Dataset['inflation']

0    -1.7
1     0.0
2    -2.3
3    -9.0
4    -9.9
     ... 
91    1.8
92    1.2
93    4.7
94    8.0
95    4.1
Name: inflation, Length: 96, dtype: float64

In [None]:
# Showing the Dependent Variable
Dataset['sp500 Change']

0     0.3788
1    -0.1191
2    -0.2848
3    -0.4707
4    -0.1515
       ...  
91    0.2888
92    0.1626
93    0.2689
94   -0.1944
95    0.2423
Name: sp500 Change, Length: 96, dtype: float64

In [None]:
# Add a constant column (the intercept term) to the independent variable
X = sm.add_constant(Dataset['inflation'])

In [None]:
# Fit the regression model using Ordinary Least Squares regression
model = sm.OLS(Dataset['sp500 Change'], X).fit()

In [None]:
# Print the regression
print(model.summary())

                            OLS Regression Results                            
Dep. Variable:           sp500 Change   R-squared:                       0.001
Model:                            OLS   Adj. R-squared:                 -0.010
Method:                 Least Squares   F-statistic:                   0.06070
Date:                Tue, 12 Mar 2024   Prob (F-statistic):              0.806
Time:                        21:26:33   Log-Likelihood:                 22.988
No. Observations:                  96   AIC:                            -41.98
Df Residuals:                      94   BIC:                            -36.85
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.0826      0.025      3.242      0.0

**Interpretation and Conclusion**

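**Coefficient of Inflation**

The coefficient of inflation is -0.0013.
This is the estimated change in sp500 Change (the dependent variable) for a one-unit change in inflation (the independent variable). Inflation is the only regressor in this model, so no other factors are being held constant.
The negative sign points to a negative relationship between inflation and changes in the sp500 index, but the magnitude is very small: roughly 0.13 percentage points less return for each additional point of inflation.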
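
As a rough consistency check, the slope and intercept of a one-variable OLS fit can be recovered from summary statistics alone: slope = r * s_y / s_x and intercept = mean_y - slope * mean_x. The sketch below plugs in the rounded figures printed by Dataset.describe() and the correlation coefficient printed later in this notebook. The variable names are illustrative and are not part of the original analysis.

```python
# Figures copied from the notebook outputs (rounded as printed there).
r = -0.02540265479851227   # correlation between inflation and sp500 Change
s_x = 3.759194             # standard deviation of inflation
s_y = 0.191506             # standard deviation of sp500 Change
mean_x = 3.09375           # mean of inflation
mean_y = 0.07864           # mean of sp500 Change

# For simple linear regression the OLS estimates reduce to these formulas.
slope = r * s_y / s_x
intercept = mean_y - slope * mean_x

print(f"slope     = {slope:.4f}")
print(f"intercept = {intercept:.4f}")
```

Both values should round to the figures quoted in this notebook (a slope of -0.0013 and the const of 0.0826 in the summary table), which suggests the reported coefficient is consistent with the rest of the output.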


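**P-Value (P>|t|)**

The p-value associated with the coefficient of inflation is 0.806.
It is the probability of obtaining a coefficient estimate at least this far from zero if the null hypothesis (no linear relationship between inflation and sp500 changes) is true.
Since the p-value is greater than 0.05, **we fail to reject the null hypothesis**: this sample gives no evidence that inflation is a significant predictor of sp500 changes. The R-squared of 0.001 likewise shows that inflation explains almost none of the variation in sp500 changes.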
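
With a single regressor, the t-test on the slope and the overall F-test are the same test (F = t²), which is why the coefficient's p-value equals the Prob (F-statistic) of 0.806 in the summary. The short sketch below uses only numbers printed in the summary and the correlation reported later in the notebook; the variable names are illustrative.

```python
import math

# Values copied from the OLS summary printed in this notebook.
f_statistic = 0.06070                # F-statistic
df_resid = 94                        # Df Residuals
observed_r = -0.02540265479851227    # correlation printed later in the notebook

# In simple regression F = t^2; the sign of t follows the (negative) slope.
t_slope = -math.sqrt(f_statistic)

# With one regressor, R^2 = F / (F + df_resid), which should agree with r^2.
r_squared_from_f = f_statistic / (f_statistic + df_resid)

print(f"t for inflation   = {t_slope:.3f}")
print(f"R^2 from F        = {r_squared_from_f:.6f}")
print(f"r^2 from corr     = {observed_r ** 2:.6f}")
```

The two R² values agree to within rounding, and both match the 0.001 shown in the summary after rounding to three decimals.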

**Monte Carlo Simulation approach to test hypothesis**

In [None]:
# Display the dataset again
Dataset

Unnamed: 0,Year,inflation,sp500 Change,avg_sp500_percentage _change
0,1928,-1.7,0.3788,37.88
1,1929,0.0,-0.1191,-11.91
2,1930,-2.3,-0.2848,-28.48
3,1931,-9.0,-0.4707,-47.07
4,1932,-9.9,-0.1515,-15.15
...,...,...,...,...
91,2019,1.8,0.2888,28.88
92,2020,1.2,0.1626,16.26
93,2021,4.7,0.2689,26.89
94,2022,8.0,-0.1944,-19.44


In [None]:
# Deciding the number of simulations
num_simulations = 500

In [None]:
# Initializing an empty list to store corr-coefficients
corr_coefficients = []

**Performing Monte Carlo Simulation**

In [None]:
for _ in range(num_simulations):
    # Generate simulated data by resampling whole rows with replacement (bootstrap);
    # each sampled year keeps its own inflation and sp500 change together
    simulated_data = Dataset.sample(frac=1, replace=True)

    # Calculate the correlation coefficient between inflation and sp500 change
    corr_coefficient = simulated_data["inflation"].corr(simulated_data["sp500 Change"])

    # Append the correlation coefficient to the list
    corr_coefficients.append(corr_coefficient)

# Convert the list to a numpy array
corr_coefficients = np.array(corr_coefficients)

In [None]:
# Calculate the p-value as the proportion of simulated correlation coefficients
# greater than or equal to the observed correlation coefficient
observed_corr_coeff = Dataset["inflation"].corr(Dataset["sp500 Change"])
p_value = np.mean(corr_coefficients >= observed_corr_coeff)

# Output the results
print("Observed Correlation Coefficient:", observed_corr_coeff)
print("P-value:", p_value)

Observed Correlation Coefficient: -0.02540265479851227
P-value: 0.51


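**Interpretation and Conclusion**

The observed correlation coefficient is -0.0254 and the p-value is 0.51, which is far above 0.05.

We therefore **fail to reject the null hypothesis**: the simulation gives no evidence of a relationship between high inflation and lower stock prices. One caveat applies. The loop resamples whole rows with replacement, so each year's inflation stays paired with its own sp500 change and the simulated coefficients scatter around the observed value rather than around zero. A proportion near 0.5 is therefore expected from this procedure regardless of how strong the relationship is.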
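
A permutation test is one common way to simulate the null hypothesis directly: shuffling one column breaks any link between the two variables, so the shuffled correlations show what "no relationship" looks like. The sketch below is a swapped-in alternative, not the method used in this notebook. The function name permutation_corr_pvalue and the demo_ arrays are hypothetical, and the demo data is synthetic (drawn with roughly the means and standard deviations from Dataset.describe()), so its output says nothing about the real dataset.

```python
import numpy as np

def permutation_corr_pvalue(x, y, num_permutations=5000, seed=0):
    """Two-sided permutation p-value for the Pearson correlation of x and y."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    observed = np.corrcoef(x, y)[0, 1]

    permuted = np.empty(num_permutations)
    for i in range(num_permutations):
        # Shuffling y destroys any pairing with x, which simulates the null.
        permuted[i] = np.corrcoef(x, rng.permutation(y))[0, 1]

    # Share of shuffled correlations at least as extreme as the observed one.
    # The +1 terms count the observed arrangement itself and avoid p = 0.
    extreme = np.sum(np.abs(permuted) >= abs(observed))
    p_value = (extreme + 1) / (num_permutations + 1)
    return observed, p_value

# Synthetic stand-in data (an assumption; not the notebook's dataset).
demo_rng = np.random.default_rng(42)
demo_inflation = demo_rng.normal(3.0, 3.8, size=96)
demo_returns = demo_rng.normal(0.08, 0.19, size=96)

demo_corr, demo_p = permutation_corr_pvalue(demo_inflation, demo_returns)
print("Correlation:", demo_corr)
print("Permutation p-value:", demo_p)
```

To try it on the notebook's data, one would pass Dataset["inflation"] and Dataset["sp500 Change"] in place of the synthetic arrays.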

**Performing a T-Test**

In [None]:
positive_change = Dataset[Dataset['sp500 Change'] > 0]['inflation']
negative_change = Dataset[Dataset['sp500 Change'] < 0]['inflation']

t_statistic, p_value = ttest_ind(positive_change, negative_change)

print("T-Statistic:", t_statistic)
print("P-Value:", p_value)

T-Statistic: 0.22368566175204802
P-Value: 0.8234978663382407


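**Interpretation and Conclusion**

The t-test compares the mean inflation in years when the sp500 rose with the mean inflation in years when it fell. The p-value is the probability of seeing a difference in means at least this large if the two groups have the same true mean.

In this case the p-value is 0.8235, which is higher than the conventional 0.05 threshold, so we **fail to reject the null hypothesis**.

We conclude that there is no statistically significant difference in mean inflation between years with positive and years with negative sp500 changes.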
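
To make the statistic less of a black box, here is a minimal sketch of the pooled-variance (equal variance) two-sample t-statistic, which is what ttest_ind computes with its default settings. The helper pooled_t_statistic and the demo lists are hypothetical and use made-up numbers, not the notebook's data.

```python
import math
import statistics

def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees of freedom) for an equal-variance two-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    # statistics.variance is the sample variance (divides by n - 1).
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)

    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, df

# Made-up example values (an assumption, for illustration only).
demo_up_years = [1.0, 2.0, 3.0, 4.0, 5.0]
demo_down_years = [2.0, 3.0, 4.0, 5.0, 6.0]

demo_t, demo_df = pooled_t_statistic(demo_up_years, demo_down_years)
print("t-statistic:", demo_t)
print("degrees of freedom:", demo_df)
```

A t-statistic close to zero, such as the 0.2237 reported above, means the gap between the two group means is small relative to its standard error.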


