Q1: Pearson Correlation Coefficient for Study Time and Exam Scores
The Pearson correlation coefficient measures the strength and direction of the linear relationship between two continuous variables. The formula for the Pearson correlation coefficient ($r$) is:
$$r = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum (X_i - \bar{X})^2 \sum (Y_i - \bar{Y})^2}}$$
Where:
•	$X_i$ is the $i$-th value of the first variable (e.g., study time),
•	$Y_i$ is the $i$-th value of the second variable (e.g., exam scores),
•	$\bar{X}$ and $\bar{Y}$ are the means of $X$ and $Y$.
To calculate the Pearson correlation coefficient, you need paired observations of study time and exam scores. Once you have the value of $r$, interpret it as follows:
•	$r = 1$: Perfect positive linear relationship
•	$r = -1$: Perfect negative linear relationship
•	$r = 0$: No linear relationship
Interpretation:
•	A positive correlation (e.g., $r = 0.7$) would suggest that as study time increases, exam scores tend to increase.
•	A negative correlation (e.g., $r = -0.5$) would suggest that as study time increases, exam scores tend to decrease.
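As a minimal sketch of how the formula for $r$ translates into code, the block below computes the numerator and denominator term by term in plain Python. The function name `pearson_r` and the study-time data are hypothetical, chosen only for illustration; they are not part of the original exercise.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient computed directly from the formula for r."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Numerator: sum of products of deviations from the means
    numerator = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # Denominator: square root of the product of the summed squared deviations
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    # Note: r is undefined (division by zero) if either variable is constant
    return numerator / math.sqrt(ss_x * ss_y)

# Hypothetical data, for illustration only
study_hours = [1, 2, 3, 4, 5]
exam_scores = [52, 58, 65, 70, 78]

r = pearson_r(study_hours, exam_scores)
print(f"r = {r:.3f}")
```

With this made-up data the scores rise almost in a straight line with study time, so the printed value is close to 1.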
________________________________________
Q2: Spearman's Rank Correlation for Sleep and Job Satisfaction
Spearman's rank correlation measures the strength and direction of the monotonic relationship between two variables. It is calculated from the ranks of the values rather than the values themselves, which makes it useful when the relationship is non-linear but monotonic. When there are no tied ranks, the formula for Spearman's rank correlation ($\rho$) is:
$$\rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$$
Where:
•	$d_i$ is the difference between the two ranks of the $i$-th observation,
•	$n$ is the number of observations.
Interpretation:
•	$\rho = 1$: Perfect positive monotonic relationship
•	$\rho = -1$: Perfect negative monotonic relationship
•	$\rho = 0$: No monotonic relationship
Example: If $\rho = 0.8$ for sleep and job satisfaction, it indicates a strong positive monotonic relationship: as sleep increases, job satisfaction tends to increase consistently. If $\rho = -0.3$, it indicates a weak negative monotonic relationship: job satisfaction tends to decrease slightly as sleep increases, but the association is weak and inconsistent.
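The rank-difference formula can be sketched in a few lines of plain Python. The helper names `ranks` and `spearman_rho` and the sleep and job-satisfaction values are hypothetical, and the sketch assumes there are no tied values, which is the case the formula for $\rho$ covers.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    if len(set(values)) != len(values):
        raise ValueError("this sketch does not handle tied values")
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(x, y):
    """Spearman's rho from the rank-difference formula (no ties)."""
    n = len(x)
    rank_x, rank_y = ranks(x), ranks(y)
    # d_i is the difference between the two ranks of observation i
    sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))

# Hypothetical data, for illustration only
sleep_hours = [5, 6, 7, 8, 9, 4.5]
job_satisfaction = [3, 5, 6, 9, 8, 2]

rho = spearman_rho(sleep_hours, job_satisfaction)
print(f"rho = {rho:.3f}")
```

In this made-up sample only two observations swap rank order, so $\sum d_i^2 = 2$ and $\rho = 1 - 12/210 \approx 0.943$.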
________________________________________
Q3: Comparing Pearson and Spearman Correlations for Exercise and BMI
Suppose you have collected data on hours of exercise per week and body mass index (BMI) for 50 participants. The Pearson correlation will measure the linear relationship between the two variables, while the Spearman correlation will measure the monotonic relationship.
Steps:
1.	Calculate Pearson correlation using the Pearson formula, which captures the linearity of the relationship.
2.	Calculate Spearman's rank correlation to assess whether a monotonic (but possibly non-linear) relationship exists.
Interpretation:
•	If the Pearson correlation is close to 0 while the Spearman correlation is clearly larger in magnitude (e.g., $\rho = 0.6$), it suggests that the relationship is monotonic but not well described by a straight line.
•	If both correlations are similar (e.g., Pearson $r = -0.7$ and Spearman $\rho = -0.75$), it suggests that the relationship between exercise and BMI is both monotonic and approximately linear.
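To illustrate how the two coefficients can differ, the sketch below applies both to a relationship that is monotonic but curved. It relies on the fact that Spearman's $\rho$ equals the Pearson correlation of the ranks. The data are hypothetical (10 made-up points rather than the 50 participants described above), and the helper names are illustrative.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient from the deviation formula."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    numerator = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return numerator / math.sqrt(ss_x * ss_y)

def ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(x, y):
    # Spearman's rho is the Pearson correlation of the ranks
    return pearson_r(ranks(x), ranks(y))

# Hypothetical data: BMI falls steeply at first, then levels off
exercise_hours = list(range(1, 11))
bmi = [22 + 10 / h for h in exercise_hours]

r = pearson_r(exercise_hours, bmi)
rho = spearman_rho(exercise_hours, bmi)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

Because BMI decreases at every step in this made-up data, Spearman's $\rho$ is $-1$, while Pearson's $r$ is weaker (roughly $-0.8$) because the points do not fall on a straight line.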
________________________________________
Q4: Pearson Correlation for TV Time and Physical Activity
To examine the relationship between TV time (hours per day) and physical activity (measured on some scale), the Pearson correlation coefficient can be calculated using the formula for $r$ from Q1.
Steps:
1.	Collect data on TV time and physical activity for 50 participants.
2.	Apply the Pearson formula to the data.
Interpretation:
•	If $r = -0.85$, this would suggest a strong negative linear relationship, meaning that as TV time increases, physical activity tends to decrease markedly.
•	If $r = 0.1$, it would suggest a very weak or negligible linear relationship between TV time and physical activity.
________________________________________
Q5: Correlation Between Age and Soft Drink Preference
Soft drink preference is a categorical variable, and correlation coefficients are defined for numerical data, so the preference must be encoded as numbers before its correlation with Age can be calculated. One potential way to proceed is:
1.	Assign a numerical value to each brand of soft drink (e.g., Coke = 1, Pepsi = 2, Mountain Dew = 3).
2.	Calculate the correlation coefficient between Age and the encoded Soft Drink Preference.
Because the brands have no natural order, any such encoding is arbitrary, so the resulting coefficient should be interpreted with caution.
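To see how much the coefficient in step 2 can depend on the codes chosen in step 1, the sketch below computes Pearson's $r$ for the survey data under two encodings. The first is the encoding suggested above; the second (`encoding_b`) is a hypothetical relabelling added here purely for comparison, and `pearson_r` is an illustrative helper rather than a library function.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient from the deviation formula."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    numerator = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return numerator / math.sqrt(ss_x * ss_y)

# Survey data (age and soft drink preference)
ages = [25, 42, 37, 19, 31, 28]
preferences = ['Coke', 'Pepsi', 'Mountain Dew', 'Coke', 'Pepsi', 'Coke']

# Encoding from step 1
encoding_a = {'Coke': 1, 'Pepsi': 2, 'Mountain Dew': 3}
# Hypothetical alternative: same brands, different arbitrary codes
encoding_b = {'Coke': 2, 'Pepsi': 3, 'Mountain Dew': 1}

r_a = pearson_r(ages, [encoding_a[p] for p in preferences])
r_b = pearson_r(ages, [encoding_b[p] for p in preferences])
print(f"Encoding A: r = {r_a:.3f}")
print(f"Encoding B: r = {r_b:.3f}")
```

The two encodings give noticeably different coefficients (about 0.76 versus about 0.18) for the same survey responses, which suggests that a method designed for nominal data may be a better fit for this question.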
________________________________________
Q6: Pearson Correlation for Sales Calls and Weekly Sales
1.	Calculate the Pearson correlation coefficient between Number of Sales Calls per Day and Number of Sales per Week for a sample of 30 sales representatives.
2.	Use Python's scipy.stats.pearsonr or pandas' DataFrame.corr to calculate the correlation coefficient.
Here’s a quick guide on how you can write the code for both:
Example Code
# Q5: Soft drink preferences by age
import pandas as pd
from scipy.stats import pearsonr

# Survey data (age and soft drink preference)
data = {'Age': [25, 42, 37, 19, 31, 28],
        'SoftDrinkPreference': ['Coke', 'Pepsi', 'Mountain Dew', 'Coke', 'Pepsi', 'Coke']}

# Convert to DataFrame
df = pd.DataFrame(data)

# Encode the soft drink preferences (Coke=1, Pepsi=2, Mountain Dew=3).
# The brands are nominal categories, so this numeric order is arbitrary.
df['SoftDrinkPreference_encoded'] = df['SoftDrinkPreference'].map({'Coke': 1, 'Pepsi': 2, 'Mountain Dew': 3})

# Calculate Pearson correlation between Age and Soft Drink Preference
correlation, p_value = pearsonr(df['Age'], df['SoftDrinkPreference_encoded'])

print(f"Pearson Correlation Coefficient: {correlation}, P-value: {p_value}")

# Q6: Sales Calls and Sales per Week correlation
# Hypothetical data
sales_data = {'SalesCallsPerDay': [15, 20, 18, 10, 25, 30, 22, 18, 26, 19, 27, 23, 16, 24, 17, 22, 30, 28, 14, 21, 29, 23, 18, 19, 25, 28, 15, 20, 17, 22],
              'SalesPerWeek': [50, 70, 65, 45, 85, 100, 75, 65, 90, 70, 95, 80, 60, 88, 66, 76, 99, 94, 56, 78, 98, 82, 66, 70, 87, 93, 58, 73, 64, 79]}

# Convert to DataFrame
sales_df = pd.DataFrame(sales_data)

# Calculate Pearson correlation
correlation_sales, p_value_sales = pearsonr(sales_df['SalesCallsPerDay'], sales_df['SalesPerWeek'])

print(f"Pearson Correlation Coefficient for Sales Data: {correlation_sales}, P-value: {p_value_sales}")

