## 1. Importing libraries

In [1]:
import pandas as pd
import numpy as np
import statsmodels.api as sm 
from scipy import stats

## 2. Loading the data

In [2]:
df = pd.read_csv('ginsberg.csv')
df.head()

|   | SubjectID | Simplicity | Fatalism | Depression | Adj_Simplicity | Adj_Fatalism | Adj_Depression |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 1 | 0.92983 | 0.35589 | 0.5987 | 0.75934 | 0.10673 | 0.41865 |
| 1 | 2 | 0.91097 | 1.18439 | 0.72787 | 0.72717 | 0.99915 | 0.51688 |
| 2 | 3 | 0.53366 | -0.05837 | 0.53411 | 0.62176 | 0.03811 | 0.70699 |
| 3 | 4 | 0.74118 | 0.35589 | 0.56641 | 0.83522 | 0.42218 | 0.65639 |
| 4 | 5 | 0.53366 | 0.77014 | 0.50182 | 0.47697 | 0.81423 | 0.53518 |


## 3. Assigning Independent Variables (features) and Dependent Variable (target)

In [3]:
X = df[['Simplicity', 'Fatalism']]
Y = df['Depression']

## 4. Adding a constant (intercept term) to the model

In [4]:
X = sm.add_constant(X)

## 5. Performing and displaying the Multiple Linear Regression on ['Depression']

In [5]:
model = sm.OLS(Y, X).fit()
model.summary()

| Item | Value | Item | Value |
| --- | --- | --- | --- |
| Dep. Variable: | Depression | R-squared: | 0.519 |
| Model: | OLS | Adj. R-squared: | 0.507 |
| Method: | Least Squares | F-statistic: | 42.58 |
| Date: | Fri, 16 Aug 2024 | Prob (F-statistic): | 2.84e-13 |
| Time: | 23:40:24 | Log-Likelihood: | -29.024 |
| No. Observations: | 82 | AIC: | 64.05 |
| Df Residuals: | 79 | BIC: | 71.27 |
| Df Model: | 2 | | |
| Covariance Type: | nonrobust | | |

|   | coef | std err | t | P>\|t\| | [0.025 | 0.975] |
| --- | --- | --- | --- | --- | --- | --- |
| const | 0.2027 | 0.095 | 2.140 | 0.035 | 0.014 | 0.391 |
| Simplicity | 0.3795 | 0.101 | 3.771 | 0.000 | 0.179 | 0.580 |
| Fatalism | 0.4178 | 0.101 | 4.151 | 0.000 | 0.217 | 0.618 |

| Item | Value | Item | Value |
| --- | --- | --- | --- |
| Omnibus: | 11.218 | Durbin-Watson: | 1.179 |
| Prob(Omnibus): | 0.004 | Jarque-Bera (JB): | 11.425 |
| Skew: | 0.857 | Prob(JB): | 0.0033 |
| Kurtosis: | 3.636 | Cond. No. | 6.0 |


In [6]:
model.pvalues

const         0.035455
Simplicity    0.000312
Fatalism      0.000083
dtype: float64
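
As a cross-check, these p-values can be recovered from the t statistics in the summary table. The sketch below assumes the two-sided t-test that the OLS summary reports, with 79 residual degrees of freedom. Because the t values are copied in their rounded form, the results should match `model.pvalues` only approximately.

```python
from scipy import stats

# t statistics and residual degrees of freedom copied from the summary table
t_values = {"const": 2.140, "Simplicity": 3.771, "Fatalism": 4.151}
df_resid = 79

# Two-sided p-value: probability of a |t| at least this large under the null
p_values = {name: 2 * stats.t.sf(abs(t), df_resid) for name, t in t_values.items()}

for name, p in p_values.items():
    print(f"{name}: {p:.6f}")
```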

## 6. Interpreting Results

### A. Describing the effect and significance level of each Independent Variable

#### ['Simplicity']

- Effect: The model estimates a coefficient of 0.3795. Holding ['Fatalism'] constant, a one-unit increase in ['Simplicity'] is associated with an expected 0.3795-unit increase in ['Depression'].

- Significance level: The model gives a p-value of 0.000312. Since this is below the 0.05 significance level, the association between ['Simplicity'] and ['Depression'] is statistically significant.

#### ['Fatalism']

- Effect: The model estimates a coefficient of 0.4178. Holding ['Simplicity'] constant, a one-unit increase in ['Fatalism'] is associated with an expected 0.4178-unit increase in ['Depression'].

- Significance level: The model gives a p-value of 0.000083. Since this is below the 0.05 significance level, the association between ['Fatalism'] and ['Depression'] is statistically significant.
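
The "holding the other variable constant" reading can be checked directly from the fitted equation. A minimal sketch, with the coefficients copied from the summary table; the helper `predict_depression` and the input scores are hypothetical and chosen only for illustration:

```python
# Coefficients copied from the regression summary
coef = {"const": 0.2027, "Simplicity": 0.3795, "Fatalism": 0.4178}

def predict_depression(simplicity, fatalism):
    """Predicted Depression from the fitted linear equation (hypothetical helper)."""
    return (coef["const"]
            + coef["Simplicity"] * simplicity
            + coef["Fatalism"] * fatalism)

# Raise Simplicity by one unit while Fatalism stays fixed at 0.5
baseline = predict_depression(simplicity=0.5, fatalism=0.5)
bumped = predict_depression(simplicity=1.5, fatalism=0.5)

difference = bumped - baseline
print(round(difference, 4))  # 0.3795, the Simplicity coefficient
```

The difference does not depend on the baseline scores chosen, because the model is linear with no interaction term.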

### B. Interpretation of R-squared

- The model gives an R-squared of 0.519, meaning 51.9% of the variance in ['Depression'] is explained by the independent variables ['Simplicity', 'Fatalism']. This indicates a moderate fit: the model is useful, but nearly half of the variance remains unexplained, so there is still considerable room for improvement.
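
R-squared never decreases when predictors are added, which is why the summary also reports an adjusted value that penalizes model size. A minimal sketch of that adjustment, using the standard formula 1 - (1 - R²)(n - 1)/(n - k - 1) with the inputs copied from the summary table:

```python
# Values copied from the regression summary
r_squared = 0.519
n_obs = 82        # No. Observations
n_predictors = 2  # Df Model (Simplicity and Fatalism)

# Adjusted R-squared shrinks R-squared according to the number of predictors
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)
print(round(adj_r_squared, 3))  # 0.507
```

The result agrees with the `Adj. R-squared` entry in the summary, and the small gap from 0.519 suggests the two predictors are not inflating the fit.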