### Q1. What Are the Three Measures of Central Tendency?

The three measures of central tendency are:
1. **Mean**
2. **Median**
3. **Mode**

### Q2. What Is the Difference Between the Mean, Median, and Mode? How Are They Used to Measure the Central Tendency of a Dataset?

- **Mean:**
  - **Definition:** The mean is the average of all data points in a dataset.
  - **Calculation:** Sum of all data points divided by the number of data points.
  - **Usage:** Represents the central value when data is symmetrically distributed without outliers.
  - **Example:** The mean height of a group of people.

- **Median:**
  - **Definition:** The median is the middle value when the data points are arranged in ascending order.
  - **Calculation:** If the number of data points is odd, it is the middle value; if even, it is the average of the two middle values.
  - **Usage:** Useful for skewed distributions or when there are outliers, as it is far less sensitive to extreme values than the mean.
  - **Example:** The median income in a population, which gives a better idea of the typical income when the distribution is skewed.

- **Mode:**
  - **Definition:** The mode is the value that occurs most frequently in a dataset.
  - **Calculation:** Identify the most frequent value(s) in the dataset.
  - **Usage:** Represents the most common value and can be used for categorical data.
  - **Example:** The mode of shoe sizes sold in a store, indicating the most popular size.
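
As a small illustration of the three definitions above, the sketch below uses Python's standard `statistics` module on a made-up list of shoe sizes. The data is hypothetical, chosen only so that each measure is easy to verify by hand.

```python
import statistics

# Hypothetical shoe sizes sold in one day (illustrative data only)
shoe_sizes = [7, 8, 8, 9, 9, 9, 10, 11]

mean_size = statistics.mean(shoe_sizes)        # sum / count = 71 / 8
median_size = statistics.median(shoe_sizes)    # average of the two middle values (9 and 9)
mode_sizes = statistics.multimode(shoe_sizes)  # all most-frequent values (Python 3.8+)

print(mean_size)    # 8.875
print(median_size)  # 9.0
print(mode_sizes)   # [9]
```

`multimode` is used instead of `mode` so that a dataset with several equally frequent values reports all of them.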

### Q3. Calculate the Three Measures of Central Tendency for the Given Height Data

Height data: \([178, 177, 176, 177, 178.2, 178, 175, 179, 180, 175, 178.9, 176.2, 177, 172.5, 178, 176.5]\)

#### Mean
\[ \text{Mean} = \frac{\sum \text{Height}}{n} = \frac{178 + 177 + 176 + 177 + 178.2 + 178 + 175 + 179 + 180 + 175 + 178.9 + 176.2 + 177 + 172.5 + 178 + 176.5}{16} \]

\[ \text{Mean} = \frac{2832.3}{16} \approx 177.02 \]

#### Median
Ordering the data: \([172.5, 175, 175, 176, 176.2, 176.5, 177, 177, 177, 178, 178, 178, 178.2, 178.9, 179, 180]\)

Since there are 16 data points, the median is the average of the 8th and 9th values:
\[ \text{Median} = \frac{177 + 177}{2} = 177 \]

#### Mode
The most frequently occurring values in the dataset:
\[ \text{Mode} = 177 \text{ and } 178 \] (each occurs 3 times, so the data is bimodal)
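
Hand calculations like these are easy to cross-check with Python's standard `statistics` module. The sketch below recomputes all three measures for the height data. It uses `multimode` (Python 3.8+) rather than `mode` so that every value tied for most frequent is reported.

```python
import statistics

heights = [178, 177, 176, 177, 178.2, 178, 175, 179, 180, 175,
           178.9, 176.2, 177, 172.5, 178, 176.5]

mean_height = statistics.mean(heights)
median_height = statistics.median(heights)
mode_heights = sorted(statistics.multimode(heights))  # every value tied for most frequent

print(f"Mean:   {mean_height:.2f}")
print(f"Median: {median_height}")
print(f"Modes:  {mode_heights}")
```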

### Q4. Find the Standard Deviation for the Given Data

Height data: \([178, 177, 176, 177, 178.2, 178, 175, 179, 180, 175, 178.9, 176.2, 177, 172.5, 178, 176.5]\)

#### Steps to Calculate Standard Deviation:

1. Calculate the mean:
   \[ \text{Mean} \approx 177.02 \]

2. Calculate each data point's deviation from the mean, square it, and sum all squared deviations:
   \[ \sum (x_i - \text{mean})^2 \]

3. Divide by the number of data points (for population standard deviation) or by \( n-1 \) (for sample standard deviation) and take the square root.

Let's compute this in Python:

```python
import numpy as np

# Height data from the question
data = [178, 177, 176, 177, 178.2, 178, 175, 179, 180, 175, 178.9, 176.2, 177, 172.5, 178, 176.5]

mean = np.mean(data)
std_dev = np.std(data, ddof=1)  # ddof=1 divides by n-1 (sample standard deviation)

print(f"Mean: {mean:.2f}")
print(f"Standard Deviation: {std_dev:.4f}")
```

#### Python Calculation Output
```plaintext
Mean: 177.02
Standard Deviation: 1.8472
```
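
For comparison, the three steps listed above can also be written out by hand, which makes the difference between dividing by \(n\) and by \(n-1\) explicit. This is a minimal sketch using only the standard library; the variable names are illustrative.

```python
import math

data = [178, 177, 176, 177, 178.2, 178, 175, 179, 180, 175,
        178.9, 176.2, 177, 172.5, 178, 176.5]
n = len(data)

# Step 1: mean
mean = sum(data) / n

# Step 2: sum of squared deviations from the mean
sum_sq_dev = sum((x - mean) ** 2 for x in data)

# Step 3: divide by n (population) or n - 1 (sample), then take the square root
population_std = math.sqrt(sum_sq_dev / n)
sample_std = math.sqrt(sum_sq_dev / (n - 1))

print(f"Population standard deviation: {population_std:.4f}")
print(f"Sample standard deviation:     {sample_std:.4f}")
```

The sample value is always slightly larger than the population value, because the same sum of squares is divided by a smaller number.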

### Q5. How Are Measures of Dispersion Such as Range, Variance, and Standard Deviation Used to Describe the Spread of a Dataset? Provide an Example.

- **Range:**
  - **Definition:** The difference between the maximum and minimum values in the dataset.
  - **Usage:** Provides a basic measure of the spread of the data.
  - **Example:** The range of test scores in a class, which might be from 55 to 95, giving a range of 40.

- **Variance:**
  - **Definition:** The average of the squared deviations from the mean (dividing by \(n\) for a population, or by \(n-1\) for a sample).
  - **Usage:** Provides a measure of how spread out the data points are around the mean.
  - **Example:** The variance in monthly rainfall in a region, which helps understand the variability in rainfall.

- **Standard Deviation:**
  - **Definition:** The square root of the variance.
  - **Usage:** Indicates the typical distance of data points from the mean (strictly, the root-mean-square deviation), expressed in the same units as the data.
  - **Example:** The standard deviation of employee salaries in a company, which helps understand the typical salary variation from the average salary.
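
To make these three measures concrete, the sketch below computes them for a small set of test scores running from 55 to 95, as in the range example above. The individual scores are made up for illustration. The population functions (`pvariance`, `pstdev`) are used here; `variance` and `stdev` would give the sample versions.

```python
import statistics

# Hypothetical test scores (illustrative data only)
scores = [55, 70, 75, 80, 95]

score_range = max(scores) - min(scores)  # 95 - 55
variance = statistics.pvariance(scores)  # mean of squared deviations from the mean
std_dev = statistics.pstdev(scores)      # square root of the variance

print(score_range)         # 40
print(f"{variance:.1f}")   # 170.0
print(f"{std_dev:.2f}")    # 13.04
```

The standard deviation (about 13 points) is in the same units as the scores, which is why it is usually easier to interpret than the variance (170 squared points).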

### Q6. What Is a Venn Diagram?

- **Definition:** A Venn diagram is a graphical representation of mathematical or logical sets depicted as circles or ellipses. These circles overlap to show all possible logical relationships between the sets.
- **Usage:** Used to illustrate the similarities, differences, and relationships between different groups or sets.
- **Example:** A Venn diagram can show the relationship between different groups of people, such as those who like football, those who like basketball, and those who like both.
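
The regions of a two-circle Venn diagram correspond directly to set operations. As a rough sketch of the football/basketball example, using Python's built-in `set` type (the names are hypothetical):

```python
# Hypothetical survey responses (illustrative names only)
football = {"Asha", "Ben", "Chloe", "Dev"}
basketball = {"Chloe", "Dev", "Elena"}

both = football & basketball             # overlap of the two circles
only_football = football - basketball    # football circle, excluding the overlap
only_basketball = basketball - football  # basketball circle, excluding the overlap
either = football | basketball           # everything inside at least one circle

print(sorted(both))             # ['Chloe', 'Dev']
print(sorted(only_football))    # ['Asha', 'Ben']
print(sorted(only_basketball))  # ['Elena']
print(len(either))              # 5
```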

### Q10. Explain the Difference Between Covariance and Correlation. How Are These Measures Used in Statistical Analysis?

- **Covariance:**
  - **Definition:** Covariance is a measure of the degree to which two variables change together. If the variables tend to show similar behavior (i.e., both increase or decrease together), the covariance is positive. If one variable tends to increase when the other decreases, the covariance is negative.
  - **Formula:** 
    \[
    \text{Cov}(X, Y) = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{n-1}
    \]
  - **Usage:** Covariance indicates the direction of the linear relationship between variables, but its magnitude depends on the variables' units, so it does not directly convey the strength of the relationship.

- **Correlation:**
  - **Definition:** Correlation is a standardized measure of the relationship between two variables, providing both the direction and the strength of the relationship. It is the covariance of the two variables divided by the product of their standard deviations.
  - **Formula:** 
    \[
    \text{Corr}(X, Y) = \frac{\text{Cov}(X, Y)}{\sigma_X \sigma_Y}
    \]
  - **Range:** -1 to 1, where 1 indicates a perfect positive linear relationship, -1 indicates a perfect negative linear relationship, and 0 indicates no linear relationship.
  - **Usage:** Correlation is used to determine the strength and direction of the linear relationship between two variables. Because it is unit-free, it can be compared across datasets in a way that covariance cannot.
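
The scale dependence of covariance, and the lack of it in correlation, can be seen in a short NumPy sketch. The data below is made up (hours studied versus a quiz score), and converting hours to minutes is just one illustrative change of units.

```python
import numpy as np

# Hypothetical data: hours studied and quiz score (illustrative only)
hours = np.array([1, 2, 3, 4, 5])
score = np.array([2, 4, 5, 4, 5])

# np.cov returns the 2x2 covariance matrix (n-1 denominator by default);
# the off-diagonal entry is Cov(X, Y).
cov_hours = np.cov(hours, score)[0, 1]
corr_hours = np.corrcoef(hours, score)[0, 1]

# Re-express the same study time in minutes: covariance scales, correlation does not.
minutes = hours * 60
cov_minutes = np.cov(minutes, score)[0, 1]
corr_minutes = np.corrcoef(minutes, score)[0, 1]

print(f"{cov_hours:.2f}, {corr_hours:.4f}")      # 1.50, 0.7746
print(f"{cov_minutes:.2f}, {corr_minutes:.4f}")  # 90.00, 0.7746
```

The covariance grows by a factor of 60 purely because of the unit change, while the correlation stays the same.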

### Q11. What Is the Formula for Calculating the Sample Mean? Provide an Example Calculation for a Dataset.

- **Formula for Sample Mean:**
  \[
  \bar{X} = \frac{\sum X_i}{n}
  \]
  Where \(\bar{X}\) is the sample mean, \(X_i\) are the individual data points, and \(n\) is the number of data points.

- **Example Calculation:**
  Dataset: \([4, 8, 6, 5, 3]\)
  \[
  \bar{X} = \frac{4 + 8 + 6 + 5 + 3}{5} = \frac{26}{5} = 5.2
  \]
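
The same calculation as a minimal Python sketch, using only built-ins:

```python
data = [4, 8, 6, 5, 3]

# Sample mean: sum of the data points divided by how many there are
sample_mean = sum(data) / len(data)

print(sample_mean)  # 5.2
```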

### Q12. For Normally Distributed Data, What Is the Relationship Between Its Measures of Central Tendency?

For a normal distribution, the mean, median, and mode are all equal and lie at the center of the distribution, because the distribution is symmetric and has a single peak.
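
One way to see this is with `statistics.NormalDist` from Python's standard library (3.8+), which exposes the mean, median, and mode of a normal distribution as properties. The parameters below (mean 177, standard deviation 2) are arbitrary illustrative values. A finite random sample will only match approximately, so the sketch also draws a seeded sample to show that the sample mean and median land close together.

```python
import statistics
from statistics import NormalDist

# Arbitrary illustrative parameters
dist = NormalDist(mu=177, sigma=2)

# For the theoretical distribution all three coincide
print(dist.mean, dist.median, dist.mode)  # 177.0 177.0 177.0

# For a random sample they agree only approximately
sample = dist.samples(10_000, seed=42)
sample_mean = statistics.mean(sample)
sample_median = statistics.median(sample)
print(f"{sample_mean:.2f} {sample_median:.2f}")
```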

### Q13. How Is Covariance Different from Correlation?

- **Covariance:**
  - Measures the degree to which two variables change together.
  - Provides direction (positive or negative relationship) but not the strength.
  - Scale-dependent, meaning its value depends on the units of the variables.

- **Correlation:**
  - Measures both the direction and strength of the linear relationship between two variables.
  - Standardized measure, ranging from -1 to 1, making it unit-free and easier to interpret.
  - Provides a more complete picture of the relationship between variables.

### Q14. How Do Outliers Affect Measures of Central Tendency and Dispersion? Provide an Example.

- **Effect on Central Tendency:**
  - **Mean:** Highly affected by outliers because it takes into account all data points. A single extreme value can significantly shift the mean.
  - **Median:** Less affected by outliers because it depends only on the order of the values, not on how extreme the largest or smallest ones are.
  - **Mode:** Generally not affected by outliers, since it only considers the most frequent value and an extreme value is rarely the most frequent one.

- **Effect on Dispersion:**
  - **Range:** Greatly affected by outliers as it is the difference between the maximum and minimum values.
  - **Variance and Standard Deviation:** Highly affected by outliers since they are based on the squared deviations from the mean.

- **Example:**
  Dataset without outlier: \([10, 12, 13, 15, 17]\)
  - Mean: \(\frac{10 + 12 + 13 + 15 + 17}{5} = 13.4\)
  - Median: 13
  - Standard Deviation (sample, \(n-1\)): \(\approx 2.70\)

  Dataset with outlier: \([10, 12, 13, 15, 17, 100]\)
  - Mean: \(\frac{10 + 12 + 13 + 15 + 17 + 100}{6} \approx 27.83\)
  - Median: 14
  - Standard Deviation (sample, \(n-1\)): \(\approx 35.44\)

  In the second dataset, the outlier (100) significantly increases the mean and standard deviation, showing how sensitive these measures are to extreme values. The median, however, changes only slightly.
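
The comparison above can be reproduced with Python's standard `statistics` module. This sketch uses `stdev`, the sample standard deviation (dividing by \(n-1\)); `pstdev` would give somewhat smaller values but the same qualitative picture. The helper name `summarize` is illustrative.

```python
import statistics

def summarize(data):
    """Return (mean, median, sample standard deviation) for a list of numbers."""
    return (statistics.mean(data),
            statistics.median(data),
            statistics.stdev(data))  # sample standard deviation (n - 1)

without_outlier = [10, 12, 13, 15, 17]
with_outlier = without_outlier + [100]

summary_without = summarize(without_outlier)
summary_with = summarize(with_outlier)

for label, (mean, median, std) in [("without outlier", summary_without),
                                   ("with outlier", summary_with)]:
    print(f"{label}: mean={mean:.2f}, median={median}, std={std:.2f}")
```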