Simple Python code is used to demonstrate each of the following tasks on a small dataset.

## Task 1

Perform a series of correlations on the above (fictitious) data.

First, I wrote the data from the graph to **Examdataset1.csv**.

In [2]:
import pandas as pd
import scipy.stats as stats

# Load the dataset
df = pd.read_csv("Examdataset1.csv")
df

Unnamed: 0,Memory(Negative Memory Bias),Anxiety,Depression,Self-Esteem
0,5,20,0,16
1,5,21,0,15
2,6,24,0,19
3,6,32,1,18
4,7,32,1,17
5,7,21,1,18
6,7,45,3,16
7,8,45,3,10
8,9,31,5,15
9,9,22,8,15


In [5]:
# Calculate the Pearson correlation coefficients between each pair of variables
correlation_matrix = df.corr()
correlation_matrix

Unnamed: 0,Memory(Negative Memory Bias),Anxiety,Depression,Self-Esteem
Memory(Negative Memory Bias),1.0,0.631711,0.925598,-0.786
Anxiety,0.631711,1.0,0.624016,-0.678946
Depression,0.925598,0.624016,1.0,-0.746602
Self-Esteem,-0.786,-0.678946,-0.746602,1.0


- **Memory (Negative Memory Bias) and Anxiety**: 0.6317
- **Memory (Negative Memory Bias) and Depression**: 0.9256
- **Memory (Negative Memory Bias) and Self-Esteem**: -0.7860
- **Anxiety and Depression**: 0.6240
- **Anxiety and Self-Esteem**: -0.6789
- **Depression and Self-Esteem**: -0.7466

These coefficients range from -1 to 1, where 1 indicates a perfect positive correlation, -1 indicates a perfect negative correlation, and 0 indicates no correlation. The values show the strength and direction of the linear relationship between the variables. For example, a high negative memory bias is strongly positively correlated with depression and strongly negatively correlated with self-esteem.
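`df.corr()` reports only the coefficients, so a significance test for each pair can be useful alongside it. The sketch below is one possible way to get both with `scipy.stats.pearsonr`. The helper name `pairwise_pearson` and the shortened column name `Memory` are hypothetical, and the sample frame holds only the ten rows displayed above, so its output need not match the matrix computed from Examdataset1.csv.

```python
from itertools import combinations

import pandas as pd
from scipy import stats


def pairwise_pearson(frame):
    """Return Pearson r and the two-sided p-value for every pair of columns."""
    rows = []
    for a, b in combinations(frame.columns, 2):
        r, p = stats.pearsonr(frame[a], frame[b])
        rows.append({"var_1": a, "var_2": b, "r": r, "p_value": p})
    return pd.DataFrame(rows)


# Only the ten rows displayed above (an illustrative subset).
sample = pd.DataFrame({
    "Memory": [5, 5, 6, 6, 7, 7, 7, 8, 9, 9],
    "Anxiety": [20, 21, 24, 32, 32, 21, 45, 45, 31, 22],
    "Depression": [0, 0, 0, 1, 1, 1, 3, 3, 5, 8],
    "Self-Esteem": [16, 15, 19, 18, 17, 18, 16, 10, 15, 15],
})

result = pairwise_pearson(sample)
print(result.round(4))
```

Each row of `result` pairs a coefficient with its p-value, which helps separate strong correlations from ones that could plausibly arise by chance in a small sample.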

## Task 2

Use multiple regression to examine the contribution of each independent variable to the prediction of Memory Bias. Also report how much of the variance is accounted for by the regression equation.

In [7]:
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Preparing the data for multiple regression
# Independent variables: Anxiety, Depression, Self-Esteem
# Dependent variable: Memory (Negative Memory Bias)
X = df[['Anxiety', 'Depression', 'Self-Esteem']]
y = df['Memory(Negative Memory Bias)']

# Create a linear regression model
model = LinearRegression()

# Fit the model
model.fit(X, y)

# Coefficients and intercept
coefficients = model.coef_
intercept = model.intercept_

# Predictions and R-squared value
y_pred = model.predict(X)
r_squared = r2_score(y, y_pred)

coefficients, intercept, r_squared


(array([ 0.0061383 ,  0.4383072 , -0.18240165]),
 8.723821631330324,
 0.8772391478014961)

**Coefficient for Anxiety**: 0.0061  
**Coefficient for Depression**: 0.4383  
**Coefficient for Self-Esteem**: −0.1824  
**Intercept**: 8.7238  

The R^2 value of the regression equation is 0.8772. This indicates that approximately 87.72% of the variance in Memory Bias is accounted for by the regression equation. This R^2 value suggests that the model explains a large portion of the variability in Memory Bias.
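For reference, R^2 is one minus the ratio of the residual sum of squares to the total sum of squares, and the adjusted version penalizes the number of predictors. The helpers below are a minimal sketch (the function names are hypothetical). The sample size of 20 is taken from the statsmodels summary in Task 4.

```python
import numpy as np


def r_squared(y_true, y_hat):
    """R^2 = 1 - SS_residual / SS_total."""
    y_true = np.asarray(y_true, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y_true - y_hat) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_obs, n_predictors):
    """Adjust R^2 for the number of predictors in the model."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# R^2 reported above, with 20 observations and 3 predictors.
adj = adjusted_r_squared(0.8772391478014961, n_obs=20, n_predictors=3)
print(round(adj, 3))  # 0.854
```

With three predictors and 20 observations the adjustment is small, which suggests the high R^2 is not merely a product of the number of predictors.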

## Task 3

Based on the above analysis, what would be the predicted value of Memory Bias for a person with an Anxiety score of 44, a Depression score of 13 and a Self-Esteem score of 12?

In [8]:
# Predicting the value of Memory Bias for specified scores
anxiety_score = 44
depression_score = 13
self_esteem_score = 12

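# Creating a one-row DataFrame with the same column names used to fit the model
# (passing a plain list makes recent scikit-learn versions warn about missing feature names)
input_features = pd.DataFrame(
    [[anxiety_score, depression_score, self_esteem_score]],
    columns=X.columns,
)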

# Predicting Memory Bias
predicted_memory_bias = model.predict(input_features)
predicted_memory_bias[0]



12.503080579929751

Based on the multiple regression analysis, the predicted value of Memory Bias for a person with an Anxiety score of 44, a Depression score of 13, and a Self-Esteem score of 12 is approximately 12.50.
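The same prediction can be checked by hand from the regression equation: intercept plus each coefficient times its score. The snippet below is a small sketch that copies the intercept and coefficients from the Task 2 output, so the result can differ from `model.predict` in the last decimal places because of rounding.

```python
import numpy as np

# Values copied from the fitted model output in Task 2.
intercept = 8.723821631330324
coefficients = np.array([0.0061383, 0.4383072, -0.18240165])  # Anxiety, Depression, Self-Esteem

scores = np.array([44, 13, 12])  # Anxiety, Depression, Self-Esteem

# Memory Bias = intercept + b1*Anxiety + b2*Depression + b3*Self-Esteem
manual_prediction = intercept + coefficients @ scores
print(round(manual_prediction, 2))  # 12.5
```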

## Task 4

Using the same data, perform a multiple regression to determine the best predictor of Memory Bias.

In [10]:
import statsmodels.api as sm

# Adding a constant to the independent variables
X_with_constant = sm.add_constant(X)

# Create a model using statsmodels for detailed statistics
model_sm = sm.OLS(y, X_with_constant)

# Fit the model
results = model_sm.fit()

# Get the summary of the regression
model_summary = results.summary()
model_summary


0,1,2,3
Dep. Variable:,Memory(Negative Memory Bias),R-squared:,0.877
Model:,OLS,Adj. R-squared:,0.854
Method:,Least Squares,F-statistic:,38.11
Date:,"Thu, 07 Dec 2023",Prob (F-statistic):,1.63e-07
Time:,02:24:08,Log-Likelihood:,-38.112
No. Observations:,20,AIC:,84.22
Df Residuals:,16,BIC:,88.21
Df Model:,3,,
Covariance Type:,nonrobust,,

0,1,2,3,4,5,6
,coef,std err,t,P>|t|,[0.025,0.975]
const,8.7238,3.027,2.882,0.011,2.307,15.141
Anxiety,0.0061,0.046,0.135,0.894,-0.090,0.103
Depression,0.4383,0.078,5.612,0.000,0.273,0.604
Self-Esteem,-0.1824,0.127,-1.431,0.172,-0.453,0.088

0,1,2,3
Omnibus:,14.465,Durbin-Watson:,2.292
Prob(Omnibus):,0.001,Jarque-Bera (JB):,14.318
Skew:,-1.405,Prob(JB):,0.000778
Kurtosis:,6.048,Cond. No.,324.0


- **Depression** has the most significant impact on Memory Bias, with a coefficient of 0.4383 and a very low p-value (p < 0.001). This suggests that Depression is a strong and statistically significant predictor of Memory Bias.
- **Self-Esteem** shows a coefficient of −0.1824, but its p-value is 0.172, which is above the common significance level of 0.05. This implies that while Self-Esteem contributes to the model, its effect is not statistically significant at the 5% level.
- **Anxiety** has a coefficient of 0.0061 with a p-value of 0.894, indicating that Anxiety is not a statistically significant predictor of Memory Bias in this model.
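Raw coefficients depend on each variable's scale, so comparing their sizes directly can mislead. One common complement to the p-values is to compare standardized (beta) coefficients, obtained by z-scoring every variable before fitting. The sketch below uses NumPy least squares on made-up demonstration data. `standardized_betas` is a hypothetical helper, not part of the analysis above.

```python
import numpy as np


def standardized_betas(X, y):
    """Fit OLS on z-scored predictors and outcome; return the beta weights."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    yz = (y - y.mean()) / y.std(ddof=1)
    # No intercept column is needed because every variable is centered.
    betas, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    return betas


# Made-up data: the outcome depends on the first column only.
X_demo = np.array([[1.0, 4.0], [2.0, 1.0], [3.0, 3.0], [4.0, 2.0], [5.0, 6.0]])
y_demo = 3.0 * X_demo[:, 0] + 2.0

betas = standardized_betas(X_demo, y_demo)
print(betas.round(3))
```

Applied to the notebook's data, this would amount to calling `standardized_betas(X, y)` with the `X` and `y` defined in Task 2. The predictor with the largest absolute beta is the strongest one on a common scale.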

## Task 5

Using the same data, perform a multiple regression to test the idea that Anxiety is the salient predictor of Memory Bias. Enter Anxiety on the first step, and Depression and Self-Esteem on the second.

In [12]:
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# First step: Regression with only Anxiety
X_anxiety = df[['Anxiety']]
model_anxiety = LinearRegression()
model_anxiety.fit(X_anxiety, y)
y_pred_anxiety = model_anxiety.predict(X_anxiety)
r_squared_anxiety = r2_score(y, y_pred_anxiety)

# Second step: Regression with Anxiety, Depression, and Self-Esteem
model_full = LinearRegression()
model_full.fit(X, y)
y_pred_full = model_full.predict(X)
r_squared_full = r2_score(y, y_pred_full)

r_squared_anxiety, r_squared_full

(0.39905858271905825, 0.8772391478014961)

**First Step (Only Anxiety)**:
- R^2 value for the model with only Anxiety as the predictor is 0.3991. This indicates that Anxiety alone accounts for approximately 39.91% of the variance in Memory Bias.  
**Second Step (Anxiety, Depression, and Self-Esteem)**:
- When Depression and Self-Esteem are added to the model (along with Anxiety), the R^2 value increases to 0.8772. This means that the combined model accounts for approximately 87.72% of the variance in Memory Bias.

The substantial increase in the R^2 value from the first to the second step suggests that while Anxiety contributes to the model, it is not the most salient predictor. The inclusion of Depression and Self-Esteem greatly enhances the model's explanatory power, indicating that these variables, particularly Depression as seen in the previous task, play a more significant role in predicting Memory Bias.
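Whether the jump in R^2 between the two steps is statistically significant can be checked with an F-test on the R^2 change, the usual test in hierarchical regression. The function below is a sketch under the assumption of 20 observations (the count shown in the Task 4 summary). `r2_change_f_test` is a hypothetical name.

```python
from scipy import stats


def r2_change_f_test(r2_reduced, r2_full, n_obs, k_reduced, k_full):
    """F-test for the increase in R^2 when predictors are added to a nested model."""
    df_num = k_full - k_reduced          # number of predictors added
    df_den = n_obs - k_full - 1          # residual degrees of freedom of the full model
    f_stat = ((r2_full - r2_reduced) / df_num) / ((1.0 - r2_full) / df_den)
    p_value = stats.f.sf(f_stat, df_num, df_den)
    return f_stat, p_value


f_stat, p_value = r2_change_f_test(
    r2_reduced=0.39905858271905825,  # step 1: Anxiety only
    r2_full=0.8772391478014961,      # step 2: Anxiety, Depression, Self-Esteem
    n_obs=20,
    k_reduced=1,
    k_full=3,
)
print(round(f_stat, 2), p_value)
```

With these inputs the F statistic is about 31.2 on (2, 16) degrees of freedom, with a p-value well below 0.001, so the gain from adding Depression and Self-Esteem is statistically significant.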