#### Q1: What are the Probability Mass Function (PMF) and Probability Density Function (PDF)? Explain with an example.

#### Probability Mass Function (PMF) and Probability Density Function (PDF)

Probability Mass Function (PMF) and Probability Density Function (PDF) are concepts used in probability theory and statistics to describe the distribution of random variables. The choice between PMF and PDF depends on whether the random variable is discrete or continuous.

##### 1. Probability Mass Function (PMF)

**Definition:** The PMF is used for discrete random variables. It gives the probability that a discrete random variable is exactly equal to some value.

**Properties:**
- The sum of all probabilities in a PMF is 1.
- The PMF can be represented as a function that maps each value of a discrete random variable to a probability between 0 and 1.

**Example:** Suppose we have a fair six-sided die. Let \( X \) be a random variable representing the result of a single die roll. The PMF of \( X \) can be defined as:

$$
P(X = x) = 
\begin{cases} 
      \frac{1}{6}, & \text{if } x = 1, 2, 3, 4, 5, 6 \\
      0, & \text{otherwise}
\end{cases}
$$

This means that each possible outcome (1 through 6) has an equal probability of 1/6.
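The die PMF above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original answer; the function name `die_pmf` is an assumption chosen for the example, and `Fraction` is used so the probabilities stay exact.

```python
from fractions import Fraction

def die_pmf(x):
    """PMF of a fair six-sided die: 1/6 on the faces 1..6, 0 elsewhere."""
    return Fraction(1, 6) if x in {1, 2, 3, 4, 5, 6} else Fraction(0)

# The probabilities over the support sum to exactly 1.
total = sum(die_pmf(x) for x in range(1, 7))

print(die_pmf(3))  # 1/6
print(die_pmf(7))  # 0
print(total)       # 1
```

The check on `total` mirrors the first PMF property: the probabilities over all possible values add up to 1.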

##### 2. Probability Density Function (PDF)

**Definition:** The PDF is used for continuous random variables. It describes the relative likelihood for the random variable to take on a given value.

**Properties:**
- The total area under the PDF curve is 1.
- Unlike the PMF, the value of the PDF at a specific point is not a probability; instead, it represents the probability density.
- The probability of the random variable falling within a particular interval \([a, b]\) is given by the area under the PDF curve from \( a \) to \( b \):

$$
P(a \leq X \leq b) = \int_{a}^{b} f(x) \, dx
$$

**Example:** Consider a continuous random variable \( X \) that is normally distributed with a mean of 0 and a standard deviation of 1 (standard normal distribution). The PDF of \( X \) is given by:

$$
f(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}}
$$

This function describes the probability density of X across all real numbers. The probability that X falls within an interval (e.g., between -1 and 1) is found by calculating the area under the curve between those two points.
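As a rough check of the "area under the curve" idea, the sketch below integrates the standard normal PDF between -1 and 1 with a simple midpoint rule and compares the result with the CDF from Python's `statistics.NormalDist`. The helper names `std_normal_pdf` and `integrate` are illustrative assumptions, not part of the original answer.

```python
import math
from statistics import NormalDist

def std_normal_pdf(x):
    """Density of the standard normal distribution at x."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def integrate(f, a, b, steps=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

# Area under the PDF between -1 and 1, computed two ways.
area = integrate(std_normal_pdf, -1.0, 1.0)
exact = NormalDist(0, 1).cdf(1.0) - NormalDist(0, 1).cdf(-1.0)

print(round(area, 4), round(exact, 4))  # both about 0.6827
```

Note that `std_normal_pdf(0)` is about 0.3989, which is a density and not a probability; only the integral over an interval is a probability.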

##### Summary

- **PMF** is for discrete variables and provides the probability of exact outcomes.
- **PDF** is for continuous variables and gives a probability density; probabilities are obtained by integrating the density over an interval.

#### Q2: What is the Cumulative Distribution Function (CDF)? Explain with an example. Why is the CDF used?

The **Cumulative Distribution Function (CDF)** of a random variable \( X \) is a function that gives the probability that \( X \) is less than or equal to a particular value \( x \). It is mathematically defined as:

$$
F_X(x) = P(X \leq x)
$$


##### Example:

Consider a fair six-sided die. Let the random variable \( X \) represent the outcome of a roll. The CDF for this discrete random variable is calculated by summing the probabilities of all outcomes less than or equal to a specific value.

For instance, if \( x = 3 \):

$$
F_X(3) = P(X \leq 3) = P(X = 1) + P(X = 2) + P(X = 3)
$$

Since each outcome has a probability of \( \frac{1}{6} \):

$$
F_X(3) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{1}{2}
$$
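The same summation can be written as a short Python sketch. The function name `die_cdf` is an assumption made for illustration; it counts the faces at or below `x` and divides by 6.

```python
from fractions import Fraction

def die_cdf(x):
    """CDF of a fair six-sided die: P(X <= x)."""
    faces_at_or_below = sum(1 for face in range(1, 7) if face <= x)
    return Fraction(faces_at_or_below, 6)

print(die_cdf(3))    # 1/2
print(die_cdf(0.5))  # 0
print(die_cdf(6))    # 1
```

The CDF of a discrete variable is a step function: it stays flat between faces, so `die_cdf(3.7)` equals `die_cdf(3)`.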

##### Why CDF is Used:

The CDF is useful because it provides a complete description of the probability distribution of a random variable. It helps in:

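1. **Calculating Probabilities**: Makes it easy to compute the probability that a random variable falls within a certain range:

   $$
   P(a < X \leq b) = F_X(b) - F_X(a)
   $$

   For a continuous random variable, single points have probability zero, so the same difference also equals \( P(a \leq X \leq b) \).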

2. **Understanding Distribution**: Offers insights into the shape and spread of the distribution.

3. **Defining Percentiles**: Assists in determining percentiles or quantiles of a dataset.
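The first and third uses listed above can be illustrated with Python's standard library. This is a sketch assuming a standard normal variable (a continuous example, so the endpoints of the interval do not matter); the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist(mu=0, sigma=1)

# Interval probability as a difference of two CDF values.
p_between = Z.cdf(2.0) - Z.cdf(-2.0)

# The inverse CDF maps a probability back to a quantile (percentile).
p95 = Z.inv_cdf(0.95)

print(round(p_between, 4))  # about 0.9545
print(round(p95, 3))        # about 1.645
```

Here `inv_cdf(0.95)` answers the question "below which value do 95% of outcomes fall?", which is the 95th percentile.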


#### Q3: What are some examples of situations where the normal distribution might be used as a model? Explain how the parameters of the normal distribution relate to the shape of the distribution.
##### Examples of Situations Where the Normal Distribution Might Be Used:

1. **Height of People**: Human heights in a given population tend to follow a normal distribution. Heights vary around a mean value, with most people being close to the average and fewer people being at the extremes.

2. **Test Scores**: Scores on many standardized tests, such as IQ tests or academic exams, often follow a normal distribution. Most scores cluster around the mean, with fewer individuals scoring very high or very low.

3. **Measurement Errors**: In scientific experiments, errors in measurements often follow a normal distribution. These errors tend to cluster around zero, with larger deviations being less common.

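4. **Stock Returns**: Daily returns of stock prices are often modeled with a normal distribution as a first approximation, assuming returns are symmetrically distributed around an average value. Observed returns typically have heavier tails than the normal distribution predicts.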

5. **Blood Pressure**: Blood pressure readings in a population typically follow a normal distribution, with most readings near the average and fewer readings at the extremes.

##### Parameters of the Normal Distribution and Their Relation to the Shape:

The normal distribution is characterized by two parameters: the mean ($\mu$) and the standard deviation ($\sigma$).

1. **Mean ($\mu$)**:
   - The mean of the normal distribution determines the center of the distribution. It is the value around which the data points are symmetrically distributed.
   - In the probability density function (PDF), the mean shifts the peak of the bell curve along the x-axis.
   
   The PDF is given by:
   $$
   f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2\sigma^2}}
   $$

2. **Standard Deviation ($\sigma$)**:
   - The standard deviation measures the spread or dispersion of the data around the mean. A larger standard deviation results in a wider and flatter bell curve, while a smaller standard deviation results in a narrower and taller curve.
   - In the PDF, the standard deviation affects the width of the bell curve, indicating how concentrated the data is around the mean.
   

##### Summary:

- The **mean** determines the center of the normal distribution.
- The **standard deviation** controls the spread of the distribution.
- Together, these parameters define the shape of the bell curve of the normal distribution.
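A small sketch of how the two parameters change the curve, using `statistics.NormalDist` from the Python standard library. The three example distributions are assumptions chosen for illustration. The height of the peak is the density at the mean, \( \frac{1}{\sigma \sqrt{2\pi}} \), so doubling \( \sigma \) halves the peak.

```python
from statistics import NormalDist

narrow = NormalDist(mu=0, sigma=1)
wide = NormalDist(mu=0, sigma=2)
shifted = NormalDist(mu=3, sigma=1)

# Peak height is the density evaluated at the mean.
print(round(narrow.pdf(0), 4))   # about 0.3989
print(round(wide.pdf(0), 4))     # about 0.1995, half as tall and twice as wide
print(round(shifted.pdf(3), 4))  # same height as narrow, but centred at 3
```

Changing \( \mu \) only slides the curve along the x-axis, while changing \( \sigma \) trades height for width so the total area stays 1.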


#### Q4: Explain the importance of Normal Distribution. Give a few real-life examples of Normal Distribution.
##### Importance of Normal Distribution

The normal distribution, also known as the Gaussian distribution, is crucial in statistics and probability for several reasons:

1. **Foundation of Statistical Inference**: Many statistical methods and tests are based on the assumption that data follows a normal distribution. This includes hypothesis testing, confidence intervals, and regression analysis.

2. **Central Limit Theorem**: The normal distribution is important because of the Central Limit Theorem (CLT), which states that the suitably standardized sum (or average) of a large number of independent and identically distributed random variables with finite variance approaches a normal distribution, regardless of the shape of the original distribution. This theorem justifies the use of the normal distribution in many practical applications.

3. **Simplicity and Mathematical Properties**: The normal distribution has well-defined mathematical properties that simplify analysis and modeling. Its symmetry and the fact that it is fully described by just two parameters (mean and standard deviation) make it easier to work with compared to other distributions.

4. **Natural Phenomena**: Many natural phenomena and measurement errors follow a normal distribution, making it a useful model for analyzing and predicting real-world data.

##### Real-Life Examples of Normal Distribution

1. **Height of People**: Human heights in a given population tend to follow a normal distribution. Most individuals have heights around the average, with fewer people being extremely tall or short.

2. **Test Scores**: Scores on standardized tests, such as IQ tests or academic exams, often approximate a normal distribution. Most students score near the average, with fewer students scoring very high or very low.

3. **Measurement Errors**: In scientific experiments and engineering, errors in measurements often follow a normal distribution. Measurement errors are typically small and cluster around zero, with larger errors being less common.

4. **Blood Pressure**: Blood pressure readings in a population typically follow a normal distribution, with most readings near the average and fewer readings at the extremes.

5. **Stock Returns**: Daily returns on stock prices are often modeled with a normal distribution as a first approximation, assuming that the returns are symmetrically distributed around the average return. This assumption simplifies financial modeling, although observed returns typically have heavier tails than the normal distribution predicts.


#### Q5: What is the Bernoulli Distribution? Give an example. What is the difference between the Bernoulli Distribution and the Binomial Distribution?
##### Bernoulli Distribution

The **Bernoulli Distribution** is a discrete probability distribution of a random variable which takes the value 1 with probability \( p \) and the value 0 with probability \( 1 - p \). It is the simplest form of a discrete distribution and represents a single trial of a binary experiment.

Mathematically, the probability mass function (PMF) of the Bernoulli distribution is:

$$
P(X = x) = p^x (1 - p)^{1 - x}
$$

where \( x \) can be 0 or 1, and \( p \) is the probability of success (i.e., \( X = 1 \)).

##### Example

Consider a single coin toss where the coin is fair. Let \( X \) be a random variable that represents the outcome of the toss:

- \( X = 1 \) if the coin lands heads (success), with probability \( p = 0.5 \).
- \( X = 0 \) if the coin lands tails (failure), with probability \( 1 - p = 0.5 \).

In this case, the outcome of the coin toss follows a Bernoulli distribution with \( p = 0.5 \).

##### Difference Between Bernoulli Distribution and Binomial Distribution

- **Bernoulli Distribution**:
  - Models a single binary trial.
  - Described by a single parameter \( p \), which represents the probability of success.
  - The random variable can only take values 0 or 1.

- **Binomial Distribution**:
  - Models the number of successes in \( n \) independent Bernoulli trials.
  - Described by two parameters: \( n \) (the number of trials) and \( p \) (the probability of success in each trial).
  - The random variable can take integer values from 0 to \( n \), representing the number of successes in \( n \) trials.

Mathematically, the probability mass function of the Binomial distribution is:

$$
P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}
$$

where \( k \) is the number of successes in \( n \) trials.

In summary, the Bernoulli distribution is a special case of the Binomial distribution with \( n = 1 \).
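Both PMFs translate directly into Python. The sketch below is illustrative only; the function names `bernoulli_pmf` and `binomial_pmf` are assumptions, and `math.comb` supplies the binomial coefficient.

```python
from math import comb

def bernoulli_pmf(x, p):
    """P(X = x) for a Bernoulli(p) variable; nonzero only for x in {0, 1}."""
    if x not in (0, 1):
        return 0.0
    return p**x * (1 - p) ** (1 - x)

def binomial_pmf(k, n, p):
    """P(X = k) for a Binomial(n, p) variable."""
    if not 0 <= k <= n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# With n = 1 the binomial PMF reduces to the Bernoulli PMF.
print(bernoulli_pmf(1, 0.5), binomial_pmf(1, 1, 0.5))  # 0.5 0.5

# Probability of exactly 2 heads in 4 fair coin tosses.
print(binomial_pmf(2, 4, 0.5))  # 0.375
```

The first print line checks the special case stated above: a single trial (`n = 1`) gives the same probabilities under both formulas.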


#### Q6: Consider a dataset with a mean of 50 and a standard deviation of 10. If we assume that the dataset is normally distributed, what is the probability that a randomly selected observation will be greater than 60? Use the appropriate formula and show your calculations.

Given:
- Mean ($\mu$) = 50
- Standard deviation ($\sigma$) = 10

We need to find the probability that a randomly selected observation is greater than 60.

1. **Calculate the Z-score**:
   The Z-score formula is:
   $$
   Z = \frac{X - \mu}{\sigma}
   $$
   where \(X\) is the value of interest (60).

   Substituting the values:
   $$
   Z = \frac{60 - 50}{10} = 1
   $$

2. **Find the Probability**:
   The probability of an observation being greater than 60 is the complement of the cumulative probability up to a Z-score of 1.

   Using the standard normal distribution table or a calculator, the cumulative probability for \( Z = 1 \) is approximately 0.8413. Therefore, the probability of being greater than 60 is:
   $$
   P(X > 60) = 1 - P(Z \leq 1) = 1 - 0.8413 = 0.1587
   $$

   So, the probability that a randomly selected observation will be greater than 60 is approximately 0.1587 or 15.87%.
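The table lookup can be cross-checked in Python with `statistics.NormalDist`. This is a sketch of the same calculation, with variable names chosen for illustration.

```python
from statistics import NormalDist

mu, sigma, x = 50, 10, 60

# Step 1: Z-score of the value of interest.
z = (x - mu) / sigma

# Step 2: upper-tail probability as the complement of the CDF.
p_greater = 1 - NormalDist(mu, sigma).cdf(x)

print(z, round(p_greater, 4))  # 1.0 0.1587
```

Evaluating the CDF of the standard normal at `z` gives the same cumulative probability, since standardizing does not change the probability.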

#### Q7: Explain uniform Distribution with an example.
##### Uniform Distribution

The **Uniform Distribution** is a type of probability distribution in which all outcomes are equally likely. The continuous uniform distribution is defined over an interval \([a, b]\), where \(a\) and \(b\) are the lower and upper bounds, respectively.

**Example**:
Suppose we roll a fair six-sided die. Each face (1 through 6) is equally likely, so the die follows a discrete uniform distribution.

In a continuous example, if a random variable \( X \) is uniformly distributed between 2 and 5, its PDF is:
$$
f(x) = \frac{1}{b - a} \text{ for } a \leq x \leq b, \quad f(x) = 0 \text{ otherwise}
$$
where \( a = 2 \) and \( b = 5 \). Thus, for \( 2 \leq x \leq 5 \):
$$
f(x) = \frac{1}{5 - 2} = \frac{1}{3}
$$
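A short sketch of the continuous case on the interval from 2 to 5. The function names `uniform_pdf` and `uniform_cdf` are assumptions for illustration. Because the density is constant, the probability of any sub-interval is just its length divided by \( b - a \).

```python
def uniform_pdf(x, a=2.0, b=5.0):
    """Density of the continuous uniform distribution on [a, b]."""
    return 1 / (b - a) if a <= x <= b else 0.0

def uniform_cdf(x, a=2.0, b=5.0):
    """P(X <= x) for the continuous uniform distribution on [a, b]."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)

# Probability that X falls between 3 and 4: an interval of length 1 out of 3.
p_3_to_4 = uniform_cdf(4) - uniform_cdf(3)

print(round(uniform_pdf(3.5), 4))  # about 0.3333
print(round(p_3_to_4, 4))          # about 0.3333
```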

#### Q8: What is the z score? State the importance of the z score.
##### Z-Score

The **Z-score** measures how many standard deviations an observation or data point is from the mean of the distribution. It is calculated as:
$$
Z = \frac{X - \mu}{\sigma}
$$
where $X$ is the observation, $\mu$ is the mean, and $\sigma$ is the standard deviation.

**Importance of the Z-Score**:
- **Standardization**: Allows for comparison of scores from different distributions.
- **Probability Calculation**: Helps in finding probabilities and percentiles in a normal distribution.
- **Outlier Detection**: Identifies data points that are significantly different from the mean.
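The standardization point can be illustrated with a small sketch. The exam figures below are hypothetical, invented only for this example, and the function name `z_score` is likewise an assumption.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

# Hypothetical exam results, invented for illustration.
exam_a = z_score(85, mu=70, sigma=10)  # raw score 85
exam_b = z_score(75, mu=60, sigma=5)   # raw score 75

print(exam_a, exam_b)  # 1.5 3.0
```

Although the raw score on exam A is higher, the score on exam B is further above its own mean in standard-deviation units, which is the comparison the Z-score makes possible.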

#### Q9: What is Central Limit Theorem? State the significance of the Central Limit Theorem.
##### Central Limit Theorem (CLT)

The **Central Limit Theorem (CLT)** states that, as the sample size grows, the distribution of the sample mean of independent, identically distributed random variables with finite variance approaches a normal distribution, regardless of the original distribution of the variables.

**Significance of the CLT**:
- **Applicability**: Enables the use of normal distribution techniques even when the underlying data distribution is not normal.
- **Practical Use**: Facilitates the construction of confidence intervals and hypothesis testing.
- **Data Analysis**: Assures that sample means will be approximately normally distributed, making statistical inference more reliable.
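A small simulation can make the theorem concrete. The sketch below draws repeated samples from an exponential distribution (strongly skewed, with mean 1 and standard deviation 1) and looks at the spread of the sample means. The sample size of 50, the 5,000 repetitions, and the fixed seed are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def sample_mean(n):
    """Mean of n draws from an exponential distribution with mean 1."""
    return statistics.fmean(random.expovariate(1.0) for _ in range(n))

n = 50
means = [sample_mean(n) for _ in range(5_000)]

center = statistics.fmean(means)  # expected to be close to 1
spread = statistics.stdev(means)  # expected to be close to 1 / sqrt(50), about 0.14

print(round(center, 2), round(spread, 2))
```

Plotting a histogram of `means` would show a roughly bell-shaped curve even though the individual draws are far from normal.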

#### Q10: State the assumptions of the Central Limit Theorem.
##### Assumptions of the Central Limit Theorem
The assumptions of the Central Limit Theorem are:

1. **Independence**: The random variables must be independent of each other.
2. **Identical Distribution**: The random variables should be identically distributed, with a finite mean and variance.
3. **Sample Size**: The sample size should be sufficiently large. Although there's no strict rule, a common guideline is $n \geq 30$; strongly skewed or heavy-tailed distributions may need larger samples.

Under these assumptions, the distribution of the sample mean is approximately normal, which is what justifies normal-based statistical inference.
