## Q1. Pearson Correlation Coefficient
The Pearson correlation coefficient between the time students spend studying for an exam and their final exam scores is given by:

\[ r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \sum (Y - \bar{Y})^2}} \]

Where:
- \( X \) is the amount of time spent studying
- \( Y \) is the final exam score
- \( \bar{X} \) and \( \bar{Y} \) are the means of \( X \) and \( Y \)

The Pearson correlation coefficient \( r \) lies between -1 and 1. Its sign gives the direction of the linear relationship between the two variables and its magnitude gives the strength, with 0 indicating no linear relationship.
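As a minimal sketch of the formula, the snippet below computes \( r \) term by term and compares it with `scipy.stats.pearsonr`. The study hours and exam scores are hypothetical values chosen only for illustration, and `pearson_r` is a helper name introduced here.

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical data, for illustration only (not from the assignment)
hours_studied = np.array([2.0, 4.0, 5.0, 7.0, 8.0, 10.0])
exam_scores = np.array([55.0, 62.0, 70.0, 74.0, 83.0, 90.0])


def pearson_r(x, y):
    """Pearson r computed directly from the deviations about the means."""
    dx = x - x.mean()
    dy = y - y.mean()
    return np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2))


r_manual = pearson_r(hours_studied, exam_scores)
r_scipy, p_value = pearsonr(hours_studied, exam_scores)

print("r (formula):", r_manual)
print("r (scipy):  ", r_scipy)
```

The two values should agree up to floating-point rounding.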

## Q2. Spearman's Rank Correlation
Spearman's rank correlation is calculated from the ranks of the data rather than the raw values. When there are no tied ranks, the coefficient (ρ) is:

\[ \rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)} \]

Where:
- \( d_i \) is the difference between the ranks of corresponding values of \( X \) and \( Y \)
- \( n \) is the number of observations
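The rank-difference formula can be sketched as follows, assuming hypothetical data with no tied values. `spearman_rho_no_ties` is a helper name introduced here. When ties are present, `scipy.stats.spearmanr` is the safer choice because it computes the Pearson correlation of the ranks instead.

```python
import numpy as np
from scipy.stats import rankdata, spearmanr

# Hypothetical data with no tied values, for illustration only
x = np.array([2.0, 4.0, 5.0, 7.0, 8.0, 10.0])
y = np.array([55.0, 70.0, 62.0, 74.0, 90.0, 83.0])


def spearman_rho_no_ties(x, y):
    """Spearman's rho from rank differences; valid only without ties."""
    d = rankdata(x) - rankdata(y)  # d_i: difference between paired ranks
    n = len(x)
    return 1 - 6 * np.sum(d ** 2) / (n * (n ** 2 - 1))


rho_manual = spearman_rho_no_ties(x, y)
rho_scipy, p_value = spearmanr(x, y)

print("rho (formula):", rho_manual)
print("rho (scipy):  ", rho_scipy)
```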

## Q3. Relationship between Hours of Exercise and BMI
To examine the relationship between the number of hours of exercise per week and BMI:
1. Calculate the Pearson correlation coefficient.
2. Calculate the Spearman's rank correlation coefficient.
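3. Compare the two: similar values suggest a roughly linear relationship, while a Spearman coefficient noticeably larger in magnitude than the Pearson coefficient suggests a relationship that is monotonic but not linear.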
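The comparison in these steps can be sketched with hypothetical data in which BMI falls quickly at low exercise levels and then levels off. The values are generated from an assumed exponential curve purely for illustration; they are not measurements.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

# Hypothetical data: a strictly decreasing but nonlinear relationship
exercise_hours = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 8.0, 10.0])
bmi = 22.0 + 10.0 * np.exp(-0.5 * exercise_hours)

pearson_corr, _ = pearsonr(exercise_hours, bmi)
spearman_corr, _ = spearmanr(exercise_hours, bmi)

print("Pearson: ", pearson_corr)
print("Spearman:", spearman_corr)
```

Because the assumed curve is strictly decreasing, the Spearman coefficient comes out at -1 up to rounding, while the Pearson coefficient is weaker in magnitude because the points do not lie on a straight line.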

## Q4. Television Watching and Physical Activity
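Calculate the Pearson correlation coefficient between the number of hours individuals spend watching television per day and their level of physical activity using the formula from Q1. This assumes physical activity is measured on a numeric scale (for example, minutes per day); if it is recorded as ordered categories such as low, medium, and high, Spearman's rank correlation from Q2 is the better fit.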

## Q5. Relationship between Age and Soft Drink Preference
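This question pairs a categorical variable (soft drink preference) with a numerical variable (age). The three brands have no natural order, so any numeric coding is arbitrary and the resulting Pearson coefficient depends on which code each brand receives. Treat it as an illustration rather than a measure of association; a one-way ANOVA comparing mean age across brands is a more conventional choice for this kind of data. The approach used here is:
1. Encode the soft drink preferences numerically (e.g., Coke = 1, Pepsi = 2, Mountain Dew = 3).
2. Calculate the Pearson correlation coefficient between the encoded values and the ages.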

Given data:
| Age (Years) | Soft drink Preference |
|-------------|-----------------------|
| 25          | Coke                  |
| 42          | Pepsi                 |
| 37          | Mountain Dew          |
| 19          | Coke                  |
| 31          | Pepsi                 |
| 28          | Coke                  |

    from scipy.stats import pearsonr

    # Given data
    ages = [25, 42, 37, 19, 31, 28]
    # Encoding: Coke = 1, Pepsi = 2, Mountain Dew = 3
    soft_drink_preferences_encoded = [1, 2, 3, 1, 2, 1]

    # Calculate Pearson correlation coefficient
    pearson_corr, _ = pearsonr(ages, soft_drink_preferences_encoded)
    print('Pearson correlation coefficient:', pearson_corr)

Output:

    Pearson correlation coefficient: 0.7587035441865057
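Because the brand codes used above are arbitrary labels, it can help to check how much the coefficient depends on them. The sketch below uses the data from the table and loops over all six ways of assigning 1, 2, 3 to the three brands. The loop and the names `brands`, `preferences`, and `results` are introduced here for illustration.

```python
from itertools import permutations

from scipy.stats import pearsonr

# Data from the table above
ages = [25, 42, 37, 19, 31, 28]
preferences = ["Coke", "Pepsi", "Mountain Dew", "Coke", "Pepsi", "Coke"]
brands = ["Coke", "Pepsi", "Mountain Dew"]

# Pearson r for every possible assignment of the codes 1, 2, 3 to the brands
results = {}
for codes in permutations([1, 2, 3]):
    mapping = dict(zip(brands, codes))
    encoded = [mapping[p] for p in preferences]
    r, _ = pearsonr(ages, encoded)
    results[codes] = r
    print(mapping, "r =", round(r, 4))
```

The assignment Coke = 1, Pepsi = 2, Mountain Dew = 3 reproduces the value printed above, and reversing the codes flips its sign, which shows that the coefficient reflects the coding choice as much as the data.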


## Q6. Sales Calls and Sales Made
To calculate the Pearson correlation coefficient between the number of sales calls made per day and the number of sales made per week, use the formula from Q1.

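Given data (placeholders only; replace with the company's actual 30 observations). In the example lists used in the code below, sales per week is exactly ten times the calls per day, so the coefficient of 1.0 is a property of the placeholder data, not a finding: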
| Sales Calls per Day | Sales per Week |
|---------------------|----------------|
| X1                  | Y1             |
| X2                  | Y2             |
| ...                 | ...            |
| X30                 | Y30            |

    from scipy.stats import pearsonr

    # Example data (replace these lists with the actual data)
    sales_calls_per_day = [20, 30, 25, 40, 35, 50, 45, 60, 55, 70, 65, 80, 75, 90, 85, 100, 95, 110, 105, 120, 115, 130, 125, 140, 135, 150, 145, 160, 155, 170]
    sales_per_week = [200, 300, 250, 400, 350, 500, 450, 600, 550, 700, 650, 800, 750, 900, 850, 1000, 950, 1100, 1050, 1200, 1150, 1300, 1250, 1400, 1350, 1500, 1450, 1600, 1550, 1700]

    # Calculate Pearson correlation coefficient
    pearson_corr, _ = pearsonr(sales_calls_per_day, sales_per_week)
    print('Pearson correlation coefficient:', pearson_corr)

Output:

    Pearson correlation coefficient: 1.0
