In [1]:
### Q1: Pearson Correlation Coefficient for Study Time and Exam Scores
r'''
Pearson Correlation Coefficient Formula:
\( r = \frac{n \sum XY - \sum X \sum Y}{\sqrt{\left( n \sum X^2 - (\sum X)^2 \right) \cdot \left( n \sum Y^2 - (\sum Y)^2 \right)}} \)
Example Calculation:
Let's assume we have the following data for study time (in hours) and exam scores:

| Study Time (hours) | Exam Scores |
|--------------------|-------------|
| 2                  | 70          |
| 3                  | 75          |
| 4                  | 80          |
| 5                  | 85          |
| 6                  | 90          |

We can calculate the Pearson correlation coefficient using Python:
'''
import numpy as np
import pandas as pd

# Sample data
data = pd.DataFrame({
    'study_time': [2, 3, 4, 5, 6],
    'exam_scores': [70, 75, 80, 85, 90]
})

# Calculate Pearson correlation coefficient
pearson_corr = data.corr(method='pearson')

print("Pearson Correlation Coefficient:")
print(pearson_corr)



Pearson Correlation Coefficient:
             study_time  exam_scores
study_time          1.0          1.0
exam_scores         1.0          1.0
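As a cross-check, the Pearson formula can be evaluated term by term with NumPy on the same five data points. This is a minimal sketch; the names `x`, `y`, and `r_manual` are introduced here for illustration and are not part of the original cell.

```python
import numpy as np

# Same sample data as in the cell above
x = np.array([2, 3, 4, 5, 6])        # study time (hours)
y = np.array([70, 75, 80, 85, 90])   # exam scores
n = len(x)

# Numerator: n * sum(xy) - sum(x) * sum(y)
numerator = n * np.sum(x * y) - np.sum(x) * np.sum(y)

# Denominator: square root of the product of the two bracketed terms
denominator = np.sqrt(
    (n * np.sum(x**2) - np.sum(x)**2) * (n * np.sum(y**2) - np.sum(y)**2)
)

r_manual = numerator / denominator
print(r_manual)  # 1.0: the scores rise by exactly 5 points per extra hour
```

The result agrees with the off-diagonal entries of the matrix returned by `data.corr(method='pearson')`.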


In [2]:

### Q2: Spearman's Rank Correlation for Sleep and Job Satisfaction
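r'''
Spearman's Rank Correlation Formula:
\( r_s = 1 - \frac{6 \sum d_i^2}{n (n^2 - 1)} \)
where \( d_i \) is the difference between the ranks of the two variables for observation \( i \) and \( n \) is the number of observations. The formula assumes there are no tied ranks.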

Example Calculation:
Let's assume we have the following data for sleep hours and job satisfaction:

| Sleep Hours | Job Satisfaction |
|-------------|------------------|
| 6           | 5                |
| 7           | 6                |
| 8           | 7                |
| 5           | 4                |
| 9           | 8                |

We can calculate the Spearman's rank correlation using Python:
'''
# Sample data
data = pd.DataFrame({
    'sleep_hours': [6, 7, 8, 5, 9],
    'job_satisfaction': [5, 6, 7, 4, 8]
})

# Calculate Spearman's rank correlation
spearman_corr = data.corr(method='spearman')

print("Spearman's Rank Correlation:")
print(spearman_corr)







Spearman's Rank Correlation:
                  sleep_hours  job_satisfaction
sleep_hours               1.0               1.0
job_satisfaction          1.0               1.0
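The rank-difference formula can also be worked through by hand on the same data. The sketch below assumes no tied values, which holds for this sample; the names `rank_sleep`, `rank_job`, and `r_s_manual` are illustrative only.

```python
import pandas as pd

# Same sample data as in the cell above
sleep_hours = pd.Series([6, 7, 8, 5, 9])
job_satisfaction = pd.Series([5, 6, 7, 4, 8])
n = len(sleep_hours)

# Rank each variable (1 = smallest value); there are no ties in this data
rank_sleep = sleep_hours.rank()
rank_job = job_satisfaction.rank()

# d_i: difference between the paired ranks
d = rank_sleep - rank_job

# r_s = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)), valid when there are no ties
r_s_manual = 1 - 6 * (d**2).sum() / (n * (n**2 - 1))
print(r_s_manual)  # 1.0: both variables are ordered identically
```

Every rank difference is zero here, so the coefficient is exactly 1, matching the matrix printed by `data.corr(method='spearman')`.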


In [4]:
### Q3: Comparison of Pearson and Spearman Correlation for Exercise and BMI

import numpy as np
import pandas as pd
from scipy.stats import spearmanr

# Generate synthetic data
np.random.seed(0)
exercise_hours = np.random.randint(0, 10, 50)
bmi = 30 - 0.5 * exercise_hours + np.random.normal(0, 2, 50)

# Create a DataFrame
exercise_bmi_data = pd.DataFrame({
    'exercise_hours': exercise_hours,
    'bmi': bmi
})

# Calculate Pearson correlation coefficient
pearson_corr_ex_bmi = exercise_bmi_data.corr(method='pearson')

# Calculate Spearman's rank correlation
spearman_corr_ex_bmi, _ = spearmanr(exercise_bmi_data['exercise_hours'], exercise_bmi_data['bmi'])

print("Pearson Correlation Coefficient for Exercise and BMI:")
print(pearson_corr_ex_bmi)
print("\nSpearman's Rank Correlation for Exercise and BMI:")
print(spearman_corr_ex_bmi)

# Cross-check: DataFrame.corr returns the full correlation matrix for each
# method, whereas spearmanr above returned a single coefficient
spearman_matrix_ex_bmi = exercise_bmi_data.corr(method='spearman')

print("Pearson Correlation Coefficient for Exercise and BMI:")
print(pearson_corr_ex_bmi)
print("\nSpearman's Rank Correlation for Exercise and BMI:")
print(spearman_matrix_ex_bmi)




Pearson Correlation Coefficient for Exercise and BMI:
                exercise_hours       bmi
exercise_hours        1.000000 -0.646522
bmi                  -0.646522  1.000000

Spearman's Rank Correlation for Exercise and BMI:
-0.6545382946621912
Pearson Correlation Coefficient for Exercise and BMI:
                exercise_hours       bmi
exercise_hours        1.000000 -0.646522
bmi                  -0.646522  1.000000

Spearman's Rank Correlation for Exercise and BMI:
                exercise_hours       bmi
exercise_hours        1.000000 -0.654538
bmi                  -0.654538  1.000000
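One way to see why the two coefficients are so close: Spearman's coefficient is the Pearson coefficient computed on ranks rather than raw values. The sketch below regenerates the same synthetic data and checks this equivalence; the names `df`, `pearson_on_ranks`, and `spearman_direct` are introduced for illustration.

```python
import numpy as np
import pandas as pd

# Regenerate the synthetic data exactly as in the cell above
np.random.seed(0)
exercise_hours = np.random.randint(0, 10, 50)
bmi = 30 - 0.5 * exercise_hours + np.random.normal(0, 2, 50)
df = pd.DataFrame({'exercise_hours': exercise_hours, 'bmi': bmi})

# Replace each value by its rank (ties get the average rank), then apply Pearson
pearson_on_ranks = df.rank().corr(method='pearson')

# Spearman computed directly by pandas
spearman_direct = df.corr(method='spearman')

# The two matrices should agree up to floating-point error
print(np.allclose(pearson_on_ranks, spearman_direct))
```

Because the simulated relationship is linear with added noise, ranking the values changes little, so Pearson and Spearman give similar results on this data.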


In [None]:
### Q4: Pearson Correlation for TV Watching and Physical Activity

# Generate synthetic data
np.random.seed(1)
tv_watching_hours = np.random.randint(0, 10, 50)
physical_activity = 10 - tv_watching_hours + np.random.normal(0, 2, 50)

# Create a DataFrame
tv_activity_data = pd.DataFrame({
    'tv_watching_hours': tv_watching_hours,
    'physical_activity': physical_activity
})

# Calculate Pearson correlation coefficient
pearson_corr_tv_act = tv_activity_data.corr(method='pearson')

print("Pearson Correlation Coefficient for TV Watching and Physical Activity:")
print(pearson_corr_tv_act)
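The correlation matrix alone does not say whether the relationship is statistically distinguishable from zero. As a hedged sketch, `scipy.stats.pearsonr` returns both the coefficient and a two-sided p-value; the data is regenerated with the same seed as above, and the names `r` and `p_value` are illustrative.

```python
import numpy as np
from scipy.stats import pearsonr

# Regenerate the synthetic data exactly as in the cell above
np.random.seed(1)
tv_watching_hours = np.random.randint(0, 10, 50)
physical_activity = 10 - tv_watching_hours + np.random.normal(0, 2, 50)

# pearsonr returns the coefficient and a two-sided p-value
r, p_value = pearsonr(tv_watching_hours, physical_activity)

print(f"r = {r:.3f}, p-value = {p_value:.3g}")
```

A negative `r` is expected here because the data was generated with physical activity decreasing as TV watching increases.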




In [None]:
### Q5: Relationship Between Age and Soft Drink Preference

# Soft drink preference is a categorical variable, so a correlation coefficient
# is generally not appropriate here. Instead, we can compare how preferences are
# distributed across age groups or run a chi-square test of independence.
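A chi-square test of independence could be sketched as follows. The counts, age groups, and drink names below are hypothetical, invented purely for illustration, since the original question provides no data; `scipy.stats.chi2_contingency` is applied to the contingency table.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical survey counts, invented purely for illustration:
# rows are age groups, columns are preferred soft drinks
observed = pd.DataFrame(
    {
        'Cola': [30, 25, 15],
        'Lemonade': [20, 25, 20],
        'Juice': [10, 20, 35],
    },
    index=['18-30', '31-50', '51+'],
)

# Returns the test statistic, p-value, degrees of freedom,
# and the counts expected under independence
chi2, p_value, dof, expected = chi2_contingency(observed)

print(f"chi2 = {chi2:.2f}, dof = {dof}, p-value = {p_value:.4f}")
```

A small p-value would suggest that preference and age group are not independent; with real data, age would first need to be binned into groups as done here.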



In [None]:
### Q6: Pearson Correlation for Sales Calls and Sales Made

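# Example calculation:
# The data has the following layout (values not shown):
#
# | Sales Calls | Sales Made |
# |-------------|------------|
# | ...         | ...        |
#
# This cell assumes the data has already been loaded into a `sales_data`
# DataFrame; without it, the code below raises a NameError.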

# Calculate Pearson correlation coefficient
pearson_corr_sales = sales_data.corr(method='pearson')

print("Pearson Correlation Coefficient for Sales Calls and Sales Made:")
print(pearson_corr_sales)