Let's address each question step by step.

### Q1. Pearson Correlation Coefficient

**Scenario**: You have collected data on the amount of time students spend studying for an exam and their final exam scores.

**Solution**:

1. **Formula**: The Pearson correlation coefficient (r) is calculated as:

\[ r = \frac{n(\sum xy) - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}} \]

Where:
- \( n \) is the number of paired observations \( (x, y) \)
- \( x \) and \( y \) are the individual values of the two variables, and each sum runs over all \( n \) pairs

2. **Example Calculation**: Suppose you have the following data:
   - Hours spent studying: \( x = [2, 3, 4, 5, 6] \)
   - Exam scores: \( y = [60, 65, 70, 75, 80] \)

Let's calculate the Pearson correlation coefficient.

```python
import numpy as np
from scipy.stats import pearsonr

# Data
study_hours = np.array([2, 3, 4, 5, 6])
exam_scores = np.array([60, 65, 70, 75, 80])

# Pearson correlation coefficient
pearson_corr, _ = pearsonr(study_hours, exam_scores)
pearson_corr
```
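
As a cross-check on `pearsonr`, the formula from step 1 can be evaluated term by term. This is a minimal sketch on the same example data; the variable names (`numerator`, `denominator`, `r_manual`) are illustrative and not part of the original notebook.

```python
import numpy as np

# Same example data as above, as floats
x = np.array([2, 3, 4, 5, 6], dtype=float)
y = np.array([60, 65, 70, 75, 80], dtype=float)
n = len(x)

# Numerator: n * sum(xy) - sum(x) * sum(y)
numerator = n * np.sum(x * y) - np.sum(x) * np.sum(y)

# Denominator: sqrt([n * sum(x^2) - sum(x)^2] * [n * sum(y^2) - sum(y)^2])
denominator = np.sqrt(
    (n * np.sum(x**2) - np.sum(x) ** 2)
    * (n * np.sum(y**2) - np.sum(y) ** 2)
)

r_manual = numerator / denominator
print(r_manual)  # 1.0 for this data (numerator = 250, denominator = 250)
```

The hand calculation and `pearsonr` should agree up to floating-point rounding.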

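**Interpretation**:
- If \( r \) is close to 1, it indicates a strong positive linear relationship.
- If \( r \) is close to -1, it indicates a strong negative linear relationship.
- If \( r \) is close to 0, it indicates little or no linear relationship (a non-linear relationship may still exist).
- For the example data every score equals \( 50 + 5x \), so the points lie exactly on a line and \( r = 1 \) (up to floating-point rounding).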

### Q2. Spearman's Rank Correlation

**Scenario**: You have collected data on the amount of sleep individuals get each night and their overall job satisfaction level on a scale of 1 to 10.

**Solution**:

1. **Formula**: The Spearman rank correlation coefficient (ρ) is calculated as:

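\[ \rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)} \]

Where:
- \( d_i \) is the difference between the two ranks of the \( i \)-th observation
- \( n \) is the number of paired observations

This shortcut formula is exact only when there are no tied ranks. With ties, \( \rho \) is the Pearson correlation computed on the (average) ranks, which is what `spearmanr` does.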

2. **Example Calculation**: Suppose you have the following data:
   - Hours of sleep: \( x = [7, 6, 5, 4, 8] \)
   - Job satisfaction: \( y = [9, 7, 6, 4, 10] \)

Let's calculate the Spearman rank correlation coefficient.

```python
import numpy as np
from scipy.stats import spearmanr

# Data
sleep_hours = np.array([7, 6, 5, 4, 8])
job_satisfaction = np.array([9, 7, 6, 4, 10])

# Spearman rank correlation coefficient
spearman_corr, _ = spearmanr(sleep_hours, job_satisfaction)
spearman_corr
```
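
To see where the coefficient comes from, the rank-difference formula from step 1 can be applied directly. This is a minimal sketch on the same example data; it uses `scipy.stats.rankdata` for the ranking, and the names `rank_x`, `rank_y`, and `rho_manual` are illustrative. The shortcut formula is valid here because neither variable contains ties.

```python
import numpy as np
from scipy.stats import rankdata

sleep_hours = np.array([7, 6, 5, 4, 8])
job_satisfaction = np.array([9, 7, 6, 4, 10])

# Rank each variable separately (1 = smallest value)
rank_x = rankdata(sleep_hours)
rank_y = rankdata(job_satisfaction)

# Differences between paired ranks
d = rank_x - rank_y
n = len(d)

# Shortcut formula: 1 - 6 * sum(d^2) / (n * (n^2 - 1))
rho_manual = 1 - 6 * np.sum(d**2) / (n * (n**2 - 1))

print(rank_x)      # [4. 3. 2. 1. 5.]
print(rank_y)      # [4. 3. 2. 1. 5.]
print(rho_manual)  # 1.0
```

Both variables receive identical ranks, so every \( d_i \) is 0 and the formula returns 1.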

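**Interpretation**:
- If \( \rho \) is close to 1, it indicates a strong positive monotonic relationship.
- If \( \rho \) is close to -1, it indicates a strong negative monotonic relationship.
- If \( \rho \) is close to 0, it indicates little or no monotonic relationship.
- For the example data the two variables have identical rank orders, so \( \rho = 1 \): whoever sleeps more also reports higher job satisfaction.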

### Q3. Pearson vs. Spearman Correlation

**Scenario**: Examine the relationship between the number of hours of exercise per week and BMI in a sample of 50 participants.

**Solution**:

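1. **Generate Data**: No measurements are given, so let's simulate 50 participants. Note that `exercise_hours` and `bmi` are drawn independently here, so both coefficients should land near 0; substitute real measurements to get a meaningful comparison.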

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

np.random.seed(0)
exercise_hours = np.random.uniform(1, 10, 50)
bmi = np.random.uniform(18, 35, 50)

# Pearson correlation coefficient
pearson_corr, _ = pearsonr(exercise_hours, bmi)

# Spearman rank correlation coefficient
spearman_corr, _ = spearmanr(exercise_hours, bmi)

pearson_corr, spearman_corr
```

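**Interpretation**:
- Pearson measures the strength of a linear relationship, while Spearman measures the strength of a monotonic relationship (one that consistently rises or falls, not necessarily along a straight line).
- If the two coefficients are similar, the relationship is roughly linear or absent. If Spearman is clearly larger in magnitude than Pearson, the relationship is likely monotonic but non-linear, or Pearson is being distorted by outliers.
- With the independently simulated data above, both values are expected to be close to 0, and any difference between them reflects sampling noise only.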
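
The difference between the two coefficients is easiest to see on data that is monotonic but not linear. The sketch below uses a made-up exponential relationship (not exercise or BMI data) purely for illustration.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

# Illustrative data: y always increases with x, but not along a straight line
x = np.arange(1, 11)
y = np.exp(x)

r, _ = pearsonr(x, y)
rho, _ = spearmanr(x, y)

print(f"Pearson r    = {r:.3f}")    # noticeably below 1
print(f"Spearman rho = {rho:.3f}")  # 1.000, because the rank orders match exactly
```

Spearman reports a perfect relationship because the ordering is preserved, while Pearson is pulled down by the curvature.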

### Q4. Pearson Correlation Coefficient (TV Watching and Physical Activity)

**Scenario**: Examine the relationship between the number of hours spent watching television per day and the level of physical activity.

**Solution**:

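1. **Generate Data**: No measurements are given, so let's simulate 50 participants. The two variables are drawn independently, so the code demonstrates the calculation only and does not model a real relationship.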

```python
import numpy as np
from scipy.stats import pearsonr

np.random.seed(1)
tv_hours = np.random.uniform(1, 8, 50)
physical_activity = np.random.uniform(1, 10, 50)

# Pearson correlation coefficient
pearson_corr, _ = pearsonr(tv_hours, physical_activity)
pearson_corr
```

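**Interpretation**:
- A negative \( r \) would indicate that more TV watching is associated with less physical activity, and a positive \( r \) the opposite.
- Because the simulated variables are independent, \( r \) here should be close to 0. With real survey data, check the p-value as well as the sign and size of \( r \) before drawing a conclusion.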
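
`pearsonr` also returns a p-value, which the snippet above discards. The following sketch shows one way to read both numbers together on the same simulated data; the 0.05 threshold (`alpha`) is an assumed convention, not something specified in the question.

```python
import numpy as np
from scipy.stats import pearsonr

np.random.seed(1)
tv_hours = np.random.uniform(1, 8, 50)
physical_activity = np.random.uniform(1, 10, 50)

# Keep the p-value instead of discarding it
r, p_value = pearsonr(tv_hours, physical_activity)

alpha = 0.05  # assumed significance level
direction = "negative" if r < 0 else "positive"

if p_value < alpha:
    print(f"r = {r:.3f} ({direction}), p = {p_value:.3f}: statistically significant")
else:
    print(f"r = {r:.3f} ({direction}), p = {p_value:.3f}: not statistically significant")
```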

### Q5. Pearson Correlation Coefficient (Age and Soft Drink Preference)

**Scenario**: Survey results show the relationship between age and preference for a particular brand of soft drink.

**Solution**:

1. **Generate Data**: Let's create some sample data for analysis.

```python
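import numpy as np
from scipy.stats import pearsonr

np.random.seed(2)
age = np.random.randint(18, 65, 50)                  # ages 18-64 (upper bound is exclusive)
soft_drink_preference = np.random.randint(1, 5, 50)  # preference score 1-4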

# Pearson correlation coefficient
pearson_corr, _ = pearsonr(age, soft_drink_preference)
pearson_corr
```

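**Interpretation**:
- A positive \( r \) would mean that preference scores tend to rise with age; a negative \( r \) would mean they tend to fall.
- The preference score is an ordinal code (1 to 4), and Pearson treats the gaps between codes as equal, so read the result with caution.
- Because age and preference are simulated independently here, \( r \) should be close to 0.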
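
Since the preference score is ordinal, a rank-based coefficient is a reasonable alternative to Pearson. The sketch below computes Spearman's coefficient on the same simulated data; choosing Spearman here is a suggestion, not something the question requires.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

np.random.seed(2)
age = np.random.randint(18, 65, 50)                  # ages 18-64
soft_drink_preference = np.random.randint(1, 5, 50)  # ordinal score 1-4

r, _ = pearsonr(age, soft_drink_preference)
# spearmanr ranks both variables first and assigns average ranks to ties
rho, _ = spearmanr(age, soft_drink_preference)

print(f"Pearson r    = {r:.3f}")
print(f"Spearman rho = {rho:.3f}")
```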

### Q6. Pearson Correlation Coefficient (Sales Calls and Sales Made)

**Scenario**: Examine the relationship between the number of sales calls made per day and the number of sales made per week.

**Solution**:

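1. **Generate Data**: No measurements are given, so let's simulate 30 sales representatives. The two variables are drawn independently, so the code demonstrates the calculation only.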

```python
import numpy as np
from scipy.stats import pearsonr

np.random.seed(3)
sales_calls_per_day = np.random.uniform(5, 20, 30)
sales_per_week = np.random.uniform(1, 10, 30)

# Pearson correlation coefficient
pearson_corr, _ = pearsonr(sales_calls_per_day, sales_per_week)
pearson_corr
```

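**Interpretation**:
- A positive \( r \) would indicate that representatives who make more calls per day tend to close more sales per week.
- Correlation alone does not show that more calls cause more sales; other factors (territory, experience, lead quality) could drive both.
- Because the simulated variables are independent, \( r \) here should be close to 0.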
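
For contrast, here is a sketch of what the result looks like when sales really do depend on calls. The relationship used (about 0.5 sales per week for each daily call, plus random noise) is entirely hypothetical and chosen only to produce a visible positive correlation.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(3)

# Hypothetical data for 30 representatives
sales_calls_per_day = rng.uniform(5, 20, 30)
# Assumed relationship: weekly sales grow with daily calls, plus noise
noise = rng.normal(0, 1, 30)
sales_per_week = 0.5 * sales_calls_per_day + noise

r, p_value = pearsonr(sales_calls_per_day, sales_per_week)
print(f"r = {r:.3f}, p = {p_value:.4f}")
```

With a built-in dependence like this, \( r \) comes out strongly positive, unlike the independent simulation above.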

By using the above code snippets and explanations, you can compute the Pearson and Spearman correlation coefficients and interpret the relationships in various scenarios.