In [26]:
import numpy as np
import pandas as pd

In [27]:
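def correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    x_mean = np.mean(x)
    y_mean = np.mean(y)
    # np.std defaults to the population standard deviation (ddof=0),
    # which matches dividing the summed products by n below.
    x_std = np.std(x)
    y_std = np.std(y)
    n = len(x)
    return np.sum((x - x_mean) * (y - y_mean)) / (n * x_std * y_std)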
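
The hand-written formula can be sanity-checked against NumPy's built-in `np.corrcoef`. The sketch below restates the same population-based formula and compares the two on a small pair of arrays; `x_demo` and `y_demo` are made-up values used only for illustration, not part of the notebook's data.

```python
import numpy as np

def correlation(x, y):
    # Same population-based Pearson formula as in the cell above
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.sum((x - x.mean()) * (y - y.mean())) / (len(x) * x.std() * y.std())

# Hypothetical illustration data (not from the notebook)
x_demo = [1, 2, 3, 4, 5]
y_demo = [2, 1, 4, 3, 5]

manual = correlation(x_demo, y_demo)
reference = np.corrcoef(x_demo, y_demo)[0, 1]  # off-diagonal entry of the 2x2 matrix

print("manual:   ", manual)
print("reference:", reference)
print("agree:    ", np.isclose(manual, reference))
```

The two agree because the normalisation (dividing by n or by n - 1) cancels between the covariance and the standard deviations, as long as it is applied consistently.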

# Load the data

In [28]:
data = {
    "Hours Studied": [2, 3, 4, 5, 6, 7],
    "Hours Watching TV": [4, 3, 2, 1, 0, 0],
    "Outdoor Activity Time": [2, 4, 6, 8, 10, 12],
    "Hours Listening to Music": [2, 3, 4, 1, 5, 0],
    "Water Consumed": [5, 6, 5, 6, 4, 5],
    "Test Score": [65, 70, 75, 80, 85, 90]
}

df = pd.DataFrame(data)
df

,Hours Studied,Hours Watching TV,Outdoor Activity Time,Hours Listening to Music,Water Consumed,Test Score
0,2,4,2,2,5,65
1,3,3,4,3,6,70
2,4,2,6,4,5,75
3,5,1,8,1,6,80
4,6,0,10,5,4,85
5,7,0,12,0,5,90


# Correlation coefficient with Test Scores

In [29]:
corr_coeff = [correlation(df[col], df["Test Score"]) for col in df.columns]
corr_coeff_df = pd.DataFrame(corr_coeff, index=df.columns, columns=["TestScore Correlation"]).drop("Test Score")
corr_coeff_df

,TestScore Correlation
Hours Studied,1.0
Hours Watching TV,-0.981981
Outdoor Activity Time,1.0
Hours Listening to Music,-0.2
Water Consumed,-0.355036
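
As a cross-check, pandas can produce the same column of coefficients directly. This is a minimal sketch that rebuilds the table from the values shown above and uses `DataFrame.corr()`, whose default method is Pearson; the variable name `builtin` is introduced here only for illustration.

```python
import pandas as pd

# Same values as the table above
df = pd.DataFrame({
    "Hours Studied": [2, 3, 4, 5, 6, 7],
    "Hours Watching TV": [4, 3, 2, 1, 0, 0],
    "Outdoor Activity Time": [2, 4, 6, 8, 10, 12],
    "Hours Listening to Music": [2, 3, 4, 1, 5, 0],
    "Water Consumed": [5, 6, 5, 6, 4, 5],
    "Test Score": [65, 70, 75, 80, 85, 90],
})

# Full Pearson correlation matrix, then keep only the "Test Score" column
builtin = df.corr()["Test Score"].drop("Test Score")
print(builtin.round(6))
```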


In [30]:
pos, neg, nocor = [], [], []
for index, row in corr_coeff_df.iterrows():
    if row["TestScore Correlation"] > 0.7:
        pos.append(index)
    elif row["TestScore Correlation"] < -0.7:
        neg.append(index)
    else:
        nocor.append(index)
print('\nPositive correlation:', pos)
print('\nNegative correlation:', neg)
print('\nNo Significant correlation:', nocor)


Positive correlation: ['Hours Studied', 'Outdoor Activity Time']

Negative correlation: ['Hours Watching TV']

No Significant correlation: ['Hours Listening to Music', 'Water Consumed']
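
The 0.7 cutoff above is a rule of thumb about the strength of a correlation, not a statistical significance test. One conventional alternative, sketched here as an assumption rather than as the notebook's method, is the t-test for a Pearson coefficient: t = r * sqrt((n - 2) / (1 - r^2)), compared against the two-sided 5% critical value of Student's t with n - 2 = 4 degrees of freedom (2.776). The two perfectly correlated columns are left out because 1 - r^2 is zero for them and the statistic is not finite.

```python
import numpy as np

n = 6            # number of students in the table
t_crit = 2.776   # two-sided 5% critical value, Student's t, 4 degrees of freedom

scores = np.array([65, 70, 75, 80, 85, 90], dtype=float)
columns = {
    "Hours Watching TV": [4, 3, 2, 1, 0, 0],
    "Hours Listening to Music": [2, 3, 4, 1, 5, 0],
    "Water Consumed": [5, 6, 5, 6, 4, 5],
}

significant = {}
for name, values in columns.items():
    r = np.corrcoef(values, scores)[0, 1]
    t = r * np.sqrt((n - 2) / (1 - r**2))   # t-statistic for H0: no linear correlation
    significant[name] = bool(abs(t) > t_crit)
    print(f"{name}: r = {r:.3f}, t = {t:.2f}, significant at 5%: {significant[name]}")
```

With only six rows the test has little power, so a coefficient has to be large in magnitude before it clears the critical value.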


## Hours Studied

A correlation coefficient of 1.0 between "Hours Studied" and test scores indicates a perfect positive linear relationship between these two variables.

- In this table, each additional hour of study corresponds to exactly 5 more points on the test, so scores rise at a constant rate with study time.
- In terms of academic achievement, this association is consistent with the idea that more study time leads to higher scores, but a correlation computed from six students cannot by itself establish that studying causes the improvement.
- The coefficient also says nothing about how effectively the time was managed; it only measures the association between study hours and test scores.
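
A perfect correlation means the points fall exactly on a straight line. The sketch below fits that line with `np.polyfit` on the two columns from the table, to make the slope and intercept explicit; it is an illustration added here, not a step from the original analysis.

```python
import numpy as np

hours = np.array([2, 3, 4, 5, 6, 7], dtype=float)
scores = np.array([65, 70, 75, 80, 85, 90], dtype=float)

# Degree-1 least-squares fit: scores = slope * hours + intercept
slope, intercept = np.polyfit(hours, scores, 1)
residuals = scores - (slope * hours + intercept)

print(f"score = {intercept:.1f} + {slope:.1f} * hours")
print("largest absolute residual:", np.abs(residuals).max())
```

Residuals at the level of floating-point rounding confirm that every row lies on the fitted line.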

In [31]:
print("Correlation coefficient:", corr_coeff_df.loc["Hours Studied", "TestScore Correlation"])

Correlation coefficient: 1.0


## Hours Watching TV

A correlation coefficient of -0.98 between "Hours Watching TV" and test scores indicates a very strong negative linear relationship between these two variables.
- As the number of hours spent watching TV increases, test scores fall almost in lockstep.
- In terms of academic achievement, heavy TV-watching is strongly associated with lower test scores in this sample. That is consistent with screen time crowding out studying, although the correlation alone does not show that TV causes the lower scores.
- From a time management perspective, the students who watched more TV are also the ones who studied fewer hours, which points to the value of balancing leisure activities like TV-watching with dedicated study time.

In [32]:
print("Correlation coefficient:", corr_coeff_df.loc["Hours Watching TV", "TestScore Correlation"])

Correlation coefficient: -0.9819805060619659


## Hours Listening to Music

A correlation of -0.2 between "Hours Listening to Music" and test scores indicates a weak negative linear relationship: more music listening is only slightly associated with lower test scores.
- A coefficient this small, computed from six students, gives little evidence that music affects concentration or study habits either way; any real effect likely depends on individual preferences and the type of task.
- Students should consider how music affects their own focus and adapt their study strategies accordingly.
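
One way to see how fragile a weak coefficient is in a six-row sample is to drop one student at a time and recompute it. This leave-one-out check is an added illustration, not part of the original analysis; the list name `loo` is introduced here for convenience.

```python
import numpy as np

music = np.array([2, 3, 4, 1, 5, 0], dtype=float)
scores = np.array([65, 70, 75, 80, 85, 90], dtype=float)

loo = []  # correlation with each student removed in turn
for i in range(len(music)):
    keep = np.arange(len(music)) != i          # boolean mask that excludes row i
    r = np.corrcoef(music[keep], scores[keep])[0, 1]
    loo.append(r)
    print(f"without student {i}: r = {r:.3f}")

print(f"range: {min(loo):.3f} to {max(loo):.3f}")
```

If the sign of the coefficient changes depending on which single row is removed, the full-sample value should not be read as evidence of a real trend.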

In [33]:
print("Correlation coefficient:", corr_coeff_df.loc["Hours Listening to Music", "TestScore Correlation"])

Correlation coefficient: -0.2


## Water Consumed

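A correlation of -0.35 between "Water Consumed" and test scores indicates a weak-to-moderate negative linear relationship, below the 0.7 cutoff used above.
- With water intake varying only between 4 and 6 across six students, a value of this size could easily be noise. One speculative explanation is that heavy water intake interrupts study sessions, but the data cannot confirm it.
- While hydration is crucial for cognitive function, managing water consumption may help avoid distractions during studying and tests.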

In [34]:
print("Correlation coefficient:", corr_coeff_df.loc["Water Consumed", "TestScore Correlation"])

Correlation coefficient: -0.35503580124836315


## Outdoor Activity Time

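A correlation of 1.0 between "Outdoor Activity Time" and test scores indicates a perfect positive linear relationship.
- In this table, outdoor activity time is an exact linear function of hours studied (2 * hours - 2), so the data cannot separate the contribution of outdoor activity from that of studying.
- Outdoor activities may still support academic performance by enhancing cognitive function and memory and by reducing stress, and regular outdoor engagement may lead to better learning and test-taking outcomes, but this dataset alone does not demonstrate that.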
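
Outdoor time can also be compared against study hours directly. The sketch below re-enters the two columns from the table and checks whether one is an exact linear function of the other; it is an added illustration of why two predictors can share the same correlation with the test score.

```python
import numpy as np

hours = np.array([2, 3, 4, 5, 6, 7])
outdoor = np.array([2, 4, 6, 8, 10, 12])

# Is outdoor time exactly 2 * hours - 2 in every row?
exact_linear = np.array_equal(outdoor, 2 * hours - 2)
r_between = np.corrcoef(hours, outdoor)[0, 1]

print("outdoor == 2 * hours - 2 for all rows:", exact_linear)
print("correlation between the two predictors:", r_between)
```

When two predictors are perfectly correlated with each other, they necessarily have the same correlation with any third variable, so the table gives no way to attribute the scores to one rather than the other.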

In [35]:
print("Correlation coefficient:", corr_coeff_df.loc["Outdoor Activity Time", "TestScore Correlation"])

Correlation coefficient: 1.0
