**Q1. What are the three measures of central tendency?**

The three measures of central tendency are:
1. **Mean:** The average of a set of values.
2. **Median:** The middle value in a sorted dataset.
3. **Mode:** The most frequently occurring value.

**Q2. What is the difference between the mean, median, and mode? How are they used to measure the central tendency of a dataset?**

- **Mean:** Calculated by summing all values and dividing by the number of values. It is sensitive to extreme values.
  
- **Median:** The middle value when the data is sorted. Less affected by extreme values, making it useful for skewed distributions.
  
- **Mode:** The most frequently occurring value. It is the only one of the three that applies to categorical (nominal) data, and a dataset can have more than one mode.

These measures provide different perspectives on the central tendency of a dataset, allowing statisticians to choose an appropriate measure based on the characteristics of the data.
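As a rough illustration of how the three measures can disagree, the sketch below uses a small made-up salary list (the `salaries` values are hypothetical, not from any real dataset) with one extreme value, and computes each measure with Python's standard `statistics` module.

```python
import statistics

# Hypothetical salaries (in thousands) with one extreme value at the end
salaries = [30, 32, 32, 35, 38, 40, 250]

mean_salary = statistics.mean(salaries)      # pulled upward by 250
median_salary = statistics.median(salaries)  # middle value of the sorted list
mode_salary = statistics.mode(salaries)      # most frequent value

print(f"Mean: {mean_salary:.2f}")   # Mean: 65.29
print(f"Median: {median_salary}")   # Median: 35
print(f"Mode: {mode_salary}")       # Mode: 32
```

Here the single extreme value moves the mean well above most of the data, while the median and mode stay close to the typical values.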

**Q3. Compute the three measures of central tendency for the given height data: [178,177,176,177,178.2,178,175,179,180,175,178.9,176.2,177,172.5,178,176.5]**

```python
import statistics

import numpy as np

height_data = [178, 177, 176, 177, 178.2, 178, 175, 179, 180, 175, 178.9, 176.2, 177, 172.5, 178, 176.5]

mean_height = np.mean(height_data)
median_height = np.median(height_data)
# 177 and 178 each appear three times, so the data is bimodal.
# multimode returns every value tied for the highest count,
# instead of picking one of them arbitrarily.
modes_height = statistics.multimode(height_data)

print(f"Mean: {mean_height:.2f}")
print(f"Median: {median_height}")
print(f"Mode(s): {modes_height}")
```

**Q4. Find the standard deviation for the given data: [178,177,176,177,178.2,178,175,179,180,175,178.9,176.2,177,172.5,178,176.5]**

```python
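import numpy as np

height_data = [178, 177, 176, 177, 178.2, 178, 175, 179, 180, 175, 178.9, 176.2, 177, 172.5, 178, 176.5]

# np.std defaults to the population formula (ddof=0, divide by n).
std_deviation = np.std(height_data)
# Pass ddof=1 for the sample standard deviation (divide by n - 1).
sample_std_deviation = np.std(height_data, ddof=1)

print(f"Population Standard Deviation: {std_deviation:.2f}")
print(f"Sample Standard Deviation: {sample_std_deviation:.2f}")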
```

**Q5. How are measures of dispersion such as range, variance, and standard deviation used to describe the spread of a dataset? Provide an example.**

- **Range:** The difference between the maximum and minimum values. It gives a quick sense of the spread but depends only on the two extreme values.

- **Variance:** Measures the average squared deviation from the mean. Higher variance indicates greater spread.

- **Standard Deviation:** The square root of the variance. It is expressed in the same units as the data, which makes it easier to interpret than the variance.

Example:
```python
import numpy as np

data = [10, 15, 20, 25, 30]
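range_data = np.ptp(data)  # Range: max - min
variance_data = np.var(data)  # Population variance (ddof=0); pass ddof=1 for sample variance
std_deviation_data = np.std(data)  # Population standard deviation (square root of the variance above)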

print(f"Range: {range_data}")
print(f"Variance: {variance_data}")
print(f"Standard Deviation: {std_deviation_data:.2f}")
```

**Q6. What is a Venn diagram?**

A **Venn diagram** is a visual representation of the relationships between sets. It consists of overlapping circles, each representing a set, and the overlap represents the intersection of the sets. Venn diagrams are commonly used to illustrate logical relationships between different groups.

**Q7. For the two given sets A = {2,3,4,5,6,7} and B = {0,2,6,8,10}, find: (i) A ∩ B (ii) A ∪ B**

(i) **A ∩ B (Intersection):** The common elements in both sets.
   \[ A ∩ B = \{2, 6\} \]

(ii) **A ∪ B (Union):** The combined elements of both sets, without duplicates.
   \[ A ∪ B = \{0, 2, 3, 4, 5, 6, 7, 8, 10\} \]
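
The same results can be checked with Python's built-in `set` type. This is a minimal sketch; the variable names `intersection` and `union` are chosen here only for illustration.

```python
A = {2, 3, 4, 5, 6, 7}
B = {0, 2, 6, 8, 10}

intersection = A & B  # elements present in both sets
union = A | B         # elements present in either set, no duplicates

# Sets are unordered, so sort before printing for a stable display
print(sorted(intersection))  # [2, 6]
print(sorted(union))         # [0, 2, 3, 4, 5, 6, 7, 8, 10]
```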

**Q8. What do you understand about skewness in data?**

**Skewness** measures the asymmetry of a distribution. If the data has a longer tail on one side than the other, it is considered skewed. Skewness can be positive (right-skewed, longer right tail), negative (left-skewed, longer left tail), or zero (symmetric).
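
One common way to quantify this is the moment-based (Fisher-Pearson) coefficient, the third central moment divided by the second central moment raised to the power 1.5. The sketch below implements that formula with NumPy; the helper name `sample_skewness` and the three small datasets are made up for illustration.

```python
import numpy as np

def sample_skewness(values):
    """Moment-based skewness: m3 / m2**1.5, using population moments."""
    x = np.asarray(values, dtype=float)
    deviations = x - x.mean()
    m2 = np.mean(deviations ** 2)  # second central moment (population variance)
    m3 = np.mean(deviations ** 3)  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical datasets chosen to show each sign of skewness
right_skewed = [1, 2, 2, 3, 3, 3, 4, 15]
symmetric = [1, 2, 3, 4, 5]
left_skewed = [-15, -4, -3, -3, -3, -2, -2, -1]

for name, values in [("right-skewed", right_skewed),
                     ("symmetric", symmetric),
                     ("left-skewed", left_skewed)]:
    print(f"{name}: skewness = {sample_skewness(values):.2f}")
```

The sign of the result matches the direction of the longer tail: positive for the first dataset, zero for the second, and negative for the third.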

**Q9. If data is right-skewed, what will be the position of the median with respect to the mean?**

In a right-skewed distribution (positively skewed), the tail on the right side is longer, and the median is typically less than the mean. The mean is influenced by the long tail on the right, pulling it in that direction.
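
A quick simulation can make this concrete. The sketch below draws from an exponential distribution, which is right-skewed; the seed, scale, and sample size are arbitrary choices for illustration.

```python
import numpy as np

# Exponential data has a long right tail (arbitrary seed and size)
rng = np.random.default_rng(seed=0)
sample = rng.exponential(scale=1.0, size=10_000)

sample_mean = sample.mean()
sample_median = np.median(sample)

print(f"Mean:   {sample_mean:.3f}")
print(f"Median: {sample_median:.3f}")
print(f"Median < Mean: {sample_median < sample_mean}")
```

For an exponential distribution with scale 1, the theoretical mean is 1 and the theoretical median is ln 2 (about 0.693), so the sample values should land near those figures.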

**Q10. Explain the difference between covariance and correlation. How are these measures used in statistical analysis?**

- **Covariance:** Measures the degree to which two variables change together. Its sign indicates the direction of the relationship, but its magnitude depends on the units of the variables, so it is hard to compare across datasets.

- **Correlation:** Standardized measure of the strength and direction of the linear relationship between two variables. It ranges from -1 to 1, where -1 indicates a perfect negative linear relationship, 1 indicates a perfect positive linear relationship, and 0 indicates no linear relationship.

Both covariance and correlation are used to analyze relationships between variables, but correlation is easier to interpret because it is unit-free and bounded.
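
As a small worked example, the sketch below computes both measures with NumPy for a made-up dataset (the `hours_studied` and `exam_score` values are hypothetical). Note that `np.cov` uses the sample formula (divides by n - 1) by default.

```python
import numpy as np

# Hypothetical paired observations
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 58, 61, 70, 74]

# Both functions return a 2x2 matrix; the off-diagonal entry
# is the value for the pair (hours_studied, exam_score).
covariance = np.cov(hours_studied, exam_score)[0, 1]
correlation = np.corrcoef(hours_studied, exam_score)[0, 1]

print(f"Covariance: {covariance:.1f}")    # Covariance: 14.0
print(f"Correlation: {correlation:.2f}")  # Correlation: 0.99
```

The covariance of 14 only says the variables move in the same direction; the correlation near 1 additionally says the relationship is almost perfectly linear.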

**Q11. What is the formula for calculating the sample mean? Provide an example calculation for a dataset.**

The formula for calculating the sample mean (\(\bar{x}\)) is:

\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \]

Example:
```python
data = [10, 15, 20, 25, 30]
mean_data = sum(data) / len(data)
print(f"Sample Mean: {mean_data}")
```

**Q12. For normally distributed data, what is the relationship between its measures of central tendency?**

For a normal distribution, the mean, median, and mode are all equal and sit at the center of the distribution. The distribution is symmetric about its mean, so half of the probability lies on each side, which makes the mean the median as well. The density also peaks at that same point, which makes it the mode.
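
This can be checked approximately on simulated data. The sketch below draws a large normal sample (the seed, mean of 170, and standard deviation of 10 are arbitrary assumptions) and estimates the mode from the tallest histogram bin, which is only a rough estimate for continuous data.

```python
import numpy as np

# Simulated normal data; parameters are arbitrary illustration values
rng = np.random.default_rng(seed=42)
sample = rng.normal(loc=170.0, scale=10.0, size=100_000)

sample_mean = sample.mean()
sample_median = np.median(sample)

# Rough mode estimate: midpoint of the histogram bin with the most points
counts, bin_edges = np.histogram(sample, bins=50)
peak = np.argmax(counts)
approx_mode = (bin_edges[peak] + bin_edges[peak + 1]) / 2

print(f"Mean:   {sample_mean:.2f}")
print(f"Median: {sample_median:.2f}")
print(f"Mode (approx.): {approx_mode:.2f}")
```

In a finite sample the three values will not match exactly, but they should all fall close to the center of the distribution.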

**Q13. How is covariance different from correlation?**

- **Covariance:** Measures the degree to which two variables change together. It is not scaled, so the values can range from negative infinity to positive infinity.

- **Correlation:** Standardized measure of the strength and direction of the linear relationship between two variables. It ranges from -1 to 1, where -1 indicates a perfect negative linear relationship, 1 indicates a perfect positive linear relationship, and 0 indicates no linear relationship.

Covariance provides the direction of the relationship, while correlation provides both the direction and the strength.
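
The effect of scaling can be seen by changing the units of one variable. In the sketch below (the height and weight values are hypothetical), converting heights from meters to centimeters multiplies the covariance by 100 but leaves the correlation unchanged.

```python
import numpy as np

# Hypothetical paired measurements
height_m = np.array([1.60, 1.65, 1.70, 1.75, 1.80])
weight_kg = np.array([55, 60, 66, 72, 80])
height_cm = height_m * 100  # same heights, different unit

cov_m = np.cov(height_m, weight_kg)[0, 1]
cov_cm = np.cov(height_cm, weight_kg)[0, 1]
corr_m = np.corrcoef(height_m, weight_kg)[0, 1]
corr_cm = np.corrcoef(height_cm, weight_kg)[0, 1]

print(f"Covariance (m):  {cov_m:.4f}")
print(f"Covariance (cm): {cov_cm:.4f}")
print(f"Correlation (m):  {corr_m:.4f}")
print(f"Correlation (cm): {corr_cm:.4f}")
```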

**Q14. How do outliers affect measures of central tendency and dispersion? Provide an example.**

- **Central Tendency:** Outliers can significantly affect the mean, pulling it towards their extreme values. The median is less affected as it represents the middle value.

- **Dispersion:** Outliers can greatly influence measures like range and standard deviation, making them larger than they would be without outliers.

Example:
```python
import numpy as np

data_with_outlier = [10, 15, 20, 25, 100]
mean_data_with_outlier = np.mean(data_with_outlier)
std_deviation_data_with_outlier = np.std(data_with_outlier)

print(f"Mean (with outlier): {mean_data_with_outlier:.2f}")
print(f"Standard Deviation (with outlier): {std_deviation_data_with_outlier:.2f}")
```

In this example, the outlier (100) significantly affects the mean and standard deviation. Removing or handling outliers may provide a more representative summary of the data.
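
To see the size of the effect, the sketch below compares the same statistics with and without the outlier and adds the median for contrast. The helper `summarize` and the list `data_without_outlier` are introduced here only for illustration.

```python
import numpy as np

data_with_outlier = [10, 15, 20, 25, 100]
data_without_outlier = [10, 15, 20, 25]  # same data with the outlier dropped

def summarize(values):
    """Return mean, median, and population standard deviation."""
    return np.mean(values), np.median(values), np.std(values)

for label, values in [("with outlier", data_with_outlier),
                      ("without outlier", data_without_outlier)]:
    mean, median, std = summarize(values)
    print(f"{label}: mean={mean:.2f}, median={median:.2f}, std={std:.2f}")
```

The mean drops from 34 to 17.5 and the standard deviation shrinks sharply once the outlier is removed, while the median moves only from 20 to 17.5.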