Q1: What are the Probability Mass Function (PMF) and Probability Density Function (PDF)? Explain with
an example.

**Probability Mass Function (PMF):**

The Probability Mass Function (PMF) is a function that gives the probability of a discrete random variable taking on a specific value. For a discrete random variable \(X\), the PMF is denoted as \(P(X = x)\), representing the probability that \(X\) is equal to a particular value \(x\). The PMF satisfies the following properties:

1. **Non-Negativity:** \(P(X = x) \geq 0\) for all \(x\).
2. **Summation:** \(\sum P(X = x) = 1\) over all possible values of \(X\).

**Example:**
Consider a fair six-sided die. The PMF for the outcome of rolling the die is as follows:

\[ P(X = 1) = P(X = 2) = P(X = 3) = P(X = 4) = P(X = 5) = P(X = 6) = \frac{1}{6} \]

This PMF indicates that each face of the die has an equal probability of \(\frac{1}{6}\).
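
As a minimal sketch, the die PMF above can be written as a lookup table and checked against the two PMF properties. The names `die_pmf` and `pmf` are illustrative choices, not part of any library.

```python
from fractions import Fraction

# PMF of a fair six-sided die: each face maps to probability 1/6.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def pmf(x):
    """Return P(X = x); values outside 1..6 have probability 0."""
    return die_pmf.get(x, Fraction(0))

# Property 1: every probability is non-negative.
all_non_negative = all(p >= 0 for p in die_pmf.values())

# Property 2: the probabilities sum to exactly 1.
total = sum(die_pmf.values())

print(pmf(3))            # 1/6
print(pmf(7))            # 0
print(all_non_negative)  # True
print(total)             # 1
```

Exact fractions are used here so that the sum is exactly 1 rather than a floating-point approximation.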

**Probability Density Function (PDF):**

The Probability Density Function (PDF) describes how probability is distributed over the values of a continuous random variable. For a continuous random variable \(X\), the PDF is denoted as \(f(x)\) and gives the probability density at the point \(x\). The density is not itself a probability: \(P(X = x) = 0\) for any single point, and \(f(x)\) may exceed 1. The integral of the PDF over a range gives the probability that \(X\) falls within that range, \(P(a \leq X \leq b) = \int_a^b f(x) \,dx\). The PDF satisfies the following properties:

1. **Non-Negativity:** \(f(x) \geq 0\) for all \(x\).
2. **Area under the Curve:** \(\int_{-\infty}^{\infty} f(x) \,dx = 1\).

**Example:**
Consider a standard normal distribution with mean (\(\mu\)) of 0 and standard deviation (\(\sigma\)) of 1. The PDF for this distribution is given by the bell-shaped curve described by the formula:

\[ f(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}} \]

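This PDF gives the relative likelihood of values near \(x\): the density is highest at \(x = 0\) and falls off symmetrically on either side. Probabilities are obtained by integrating it; for example, \(P(-1 \leq X \leq 1) \approx 0.6827\).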
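
The following sketch evaluates the standard normal PDF and integrates it numerically to illustrate the area-under-the-curve property. The helper `integrate` is a simple midpoint rule written for this example, and the integration limits of -10 and 10 are an assumed stand-in for the infinite range.

```python
import math

def standard_normal_pdf(x):
    """Density of the standard normal distribution at x."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def integrate(f, a, b, steps=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

# Density at the peak of the bell curve (a density, not a probability).
peak = standard_normal_pdf(0.0)

# The total area under the curve should be very close to 1.
area = integrate(standard_normal_pdf, -10, 10)

# Probability of falling within one standard deviation of the mean.
prob_within_one = integrate(standard_normal_pdf, -1, 1)

print(round(peak, 4))             # 0.3989
print(round(area, 4))             # 1.0
print(round(prob_within_one, 4))  # 0.6827
```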

In summary, the PMF is associated with discrete random variables, providing probabilities for specific outcomes, while the PDF is associated with continuous random variables, describing the density of probabilities across a range of values.

Q2: What is the Cumulative Distribution Function (CDF)? Explain with an example. Why is the CDF used?

**Cumulative Distribution Function (CDF):**

The Cumulative Distribution Function (CDF), sometimes loosely called the cumulative density function, gives the probability that a random variable \(X\) takes on a value less than or equal to a specified point \(x\). For both discrete and continuous random variables, it is denoted as \(F(x) = P(X \leq x)\), where:

- For a discrete random variable: \(F(x) = \sum_{t \leq x} P(X = t)\), the sum of the PMF over all values up to \(x\).
- For a continuous random variable: \(F(x) = \int_{-\infty}^{x} f(t) \,dt\), where \(f(t)\) is the Probability Density Function (PDF).

**Properties of CDF:**

1. **Non-Decreasing:** \(F(x_1) \leq F(x_2)\) if \(x_1 \leq x_2\).
2. **Limits:** As \(x\) approaches \(-\infty\), \(F(x)\) approaches 0. As \(x\) approaches \(+\infty\), \(F(x)\) approaches 1.
3. **Right-Continuous:** \(F(x)\) is right-continuous, meaning that the CDF is continuous from the right at each point \(x\).

**Example:**

Consider a fair six-sided die. The CDF for the outcome of rolling the die is as follows:

\[ F(x) = \begin{cases} 
0 & \text{if } x < 1 \\
\frac{1}{6} & \text{if } 1 \leq x < 2 \\
\frac{2}{6} & \text{if } 2 \leq x < 3 \\
\frac{3}{6} & \text{if } 3 \leq x < 4 \\
\frac{4}{6} & \text{if } 4 \leq x < 5 \\
\frac{5}{6} & \text{if } 5 \leq x < 6 \\
1 & \text{if } x \geq 6 \\
\end{cases} \]

In this example, \(F(x)\) gives the probability that the outcome of rolling the die is less than or equal to \(x\).
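
The piecewise definition above can be sketched as a small step function. The name `die_cdf` is an illustrative choice, and the function accepts any real number because the CDF is defined for every \(x\), not only for the six faces.

```python
import math
from fractions import Fraction

def die_cdf(x):
    """F(x) = P(X <= x) for a fair six-sided die."""
    if x < 1:
        return Fraction(0)
    # Count the faces that are <= x, capped at 6 for large x.
    faces_at_or_below = min(math.floor(x), 6)
    return Fraction(faces_at_or_below, 6)

print(die_cdf(0.5))  # 0
print(die_cdf(2))    # 1/3
print(die_cdf(3.7))  # 1/2
print(die_cdf(10))   # 1
```

The value at 3.7 equals the value at 3, which shows the flat steps between faces and the jump at each face.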

**Why CDF is Used:**

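1. **Probability Calculation:** The CDF provides an easy way to calculate the probability that a random variable falls within a specified range. For example, \(P(a < X \leq b) = F(b) - F(a)\). For a continuous random variable the endpoints carry zero probability, so this also equals \(P(a \leq X \leq b)\).

2. **Quantile Calculation:** The CDF is used to find percentiles or quantiles. For a probability \(p\) between 0 and 1, the \(p\)-quantile (the \(100p\)-th percentile) is the value \(x\) for which \(F(x) = p\); when no such value exists, as can happen for discrete variables, it is taken as the smallest \(x\) with \(F(x) \geq p\).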

3. **Comparison of Distributions:** CDFs are useful for comparing different probability distributions and understanding the distribution of a random variable.

4. **Statistical Testing:** CDFs are employed in statistical hypothesis testing, goodness-of-fit tests, and other statistical analyses.
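
The first two uses in the list above can be sketched with `statistics.NormalDist` from the Python standard library (available from Python 3.8). The standard normal distribution is an assumed example here; any other distribution with a known CDF would work the same way.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0, sigma=1)

# Probability calculation: P(-1 < X <= 1) = F(1) - F(-1).
prob_between = std_normal.cdf(1) - std_normal.cdf(-1)

# Quantile calculation: the 95th percentile is the x with F(x) = 0.95.
q95 = std_normal.inv_cdf(0.95)

print(round(prob_between, 4))  # 0.6827
print(round(q95, 3))           # 1.645
```

Here `inv_cdf` is the inverse of the CDF, so feeding `q95` back into `cdf` returns 0.95 up to floating-point error.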

In summary, the Cumulative Distribution Function (CDF) provides a cumulative measure of the probability distribution, making it a valuable tool for understanding the behavior of random variables and facilitating various statistical calculations.

Q3: What are some examples of situations where the normal distribution might be used as a model?
Explain how the parameters of the normal distribution relate to the shape of the distribution.

The normal distribution, also known as the Gaussian distribution or bell curve, is widely used to model various phenomena in diverse fields due to its mathematical tractability and its occurrence in many natural processes. Here are some examples of situations where the normal distribution might be used as a model:

1. **Height of Individuals:**
   - The distribution of human heights is often modeled using a normal distribution. The mean (\(\mu\)) represents the average height, and the standard deviation (\(\sigma\)) characterizes the variability around the mean.

2. **IQ Scores:**
   - IQ scores are often assumed to follow a normal distribution with a mean of 100 and a standard deviation of 15.

3. **Measurement Errors:**
   - Measurement errors in instruments, such as the errors in length or weight measurements, can be modeled using a normal distribution.

4. **Financial Returns:**
   - Stock returns and financial variables are often assumed to be normally distributed in financial modeling.

5. **Physical Characteristics:**
   - Characteristics such as body temperature, blood pressure, and heart rate in a healthy population are often modeled using a normal distribution.

6. **Test Scores:**
   - The scores on standardized tests, such as SAT or GRE, are often assumed to follow a normal distribution.

### Parameters and Shape of the Normal Distribution:

The normal distribution is characterized by two parameters: the mean (\(\mu\)) and the standard deviation (\(\sigma\)). These parameters influence the shape of the distribution:

1. **Mean (\(\mu\)):**
   - The mean is the central location of the distribution. It determines the location of the peak of the bell curve.
   - Shifting the mean to the right or left moves the entire distribution horizontally along the x-axis.

2. **Standard Deviation (\(\sigma\)):**
   - The standard deviation measures the spread or variability of the distribution.
   - A larger standard deviation leads to a wider and flatter distribution, while a smaller standard deviation results in a narrower and taller distribution.

Together, the mean and standard deviation define the shape and scale of the normal distribution. The 68-95-99.7 rule states that approximately 68% of the data falls within one standard deviation of the mean, 95% within two standard deviations, and 99.7% within three standard deviations.
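
As a rough check of the 68-95-99.7 rule and of how \(\sigma\) changes the height of the curve, the sketch below uses `statistics.NormalDist`. The helper `within_k_sigma` and the values \(\mu = 170\), \(\sigma = 10\) are hypothetical choices made only for illustration.

```python
from statistics import NormalDist

def within_k_sigma(mu, sigma, k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

# The rule holds for any mean and standard deviation, not only the standard normal.
for k in (1, 2, 3):
    print(k, round(within_k_sigma(170, 10, k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973

# A larger standard deviation gives a wider, flatter curve: the peak density drops.
narrow_peak = NormalDist(0, 1).pdf(0)
wide_peak = NormalDist(0, 2).pdf(0)
print(narrow_peak > wide_peak)  # True
```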

In summary, the normal distribution is a versatile and commonly used model due to its mathematical properties, and its parameters (\(\mu\) and \(\sigma\)) play a crucial role in determining the characteristics of the distribution, such as its central tendency and variability.

Q4: Explain the importance of Normal Distribution. Give a few real-life examples of Normal
Distribution.

**Importance of Normal Distribution:**

The normal distribution holds significant importance in statistics, probability theory, and various scientific disciplines. Here are some reasons why the normal distribution is crucial:

1. **Statistical Inference:**
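   - Many statistical methods and tests are based on the assumption of normality. For example, t-tests and ANOVA assume that the data within each group are approximately normally distributed, and inference in linear regression commonly assumes normally distributed errors (residuals) rather than normally distributed data.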

2. **Central Limit Theorem:**
   - The Central Limit Theorem states that the suitably standardized sum (or average) of a large number of independent and identically distributed random variables with finite variance tends toward a normal distribution, regardless of the shape of their original distribution. This theorem is fundamental in statistical theory and practice.

3. **Parameter Estimation:**
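   - Maximum Likelihood Estimation (MLE), a commonly used method for estimating parameters of statistical models, has a simple closed form when the data is normally distributed: the estimate of \(\mu\) is the sample mean, and the estimate of \(\sigma^2\) is the average squared deviation from that mean.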

4. **Quality Control:**
   - In manufacturing and quality control, deviations from a normal distribution in measurements may indicate issues with the production process. Normal distributions are often used to model the variability in product dimensions.

5. **Risk Management and Finance:**
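   - In finance, the assumption of normality is frequently used in modeling returns. Concepts like Value at Risk (VaR) and option pricing often rely on it; the Black-Scholes model, for instance, assumes normally distributed log returns, which makes prices log-normal. Real returns tend to have heavier tails than the normal distribution, so the assumption is an approximation.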

6. **Biological and Psychological Traits:**
   - Many biological and psychological traits, such as height, weight, IQ scores, and blood pressure in a healthy population, are distributed approximately normally.

7. **Natural Phenomena:**
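   - Various natural phenomena, such as measurement errors and environmental noise, often exhibit characteristics of a normal distribution because they arise as the sum of many small independent effects. Extreme events are a notable exception and are usually modeled with heavy-tailed or extreme-value distributions instead.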
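
The Central Limit Theorem from the list above can be illustrated with a small simulation. This is a sketch under assumed settings: the exponential source distribution, the sample size of 50, the 5000 repetitions, and the fixed seed are all choices made for this example. The source distribution is strongly skewed, yet the sample means cluster roughly like a normal distribution.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

SAMPLE_SIZE = 50    # observations averaged in each sample
NUM_SAMPLES = 5000  # number of sample means to collect

def sample_mean(n):
    """Mean of n draws from an exponential distribution with rate 1 (mean 1, std 1)."""
    return statistics.fmean(random.expovariate(1.0) for _ in range(n))

means = [sample_mean(SAMPLE_SIZE) for _ in range(NUM_SAMPLES)]

center = statistics.fmean(means)  # theory: close to 1
spread = statistics.stdev(means)  # theory: close to 1 / sqrt(50), about 0.141

# For a normal distribution, about 68% of values lie within one
# standard deviation of the mean; the sample means should come close.
within_one = sum(abs(m - center) <= spread for m in means) / NUM_SAMPLES

print(round(center, 2), round(spread, 2), round(within_one, 2))
```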

**Real-Life Examples of Normal Distribution:**

1. **IQ Scores:**
   - IQ scores are designed to follow a normal distribution with a mean of 100 and a standard deviation of 15.

2. **Height of Individuals:**
   - The height of a large population, when measured, often follows a normal distribution.

3. **Body Temperature:**
   - Normal body temperature in healthy individuals is approximately normally distributed with a mean around 98.6°F (37°C).

4. **Scores on Standardized Tests:**
   - Scores on standardized tests, such as SAT or GRE, are often assumed to be normally distributed.

5. **Stock Returns:**
   - Daily or monthly stock returns are often modeled using a normal distribution in financial analysis.

6. **Blood Pressure:**
   - Blood pressure values in a healthy population are often approximately normally distributed.

7. **Errors in Measurement:**
   - Measurement errors in scientific experiments or instrument readings are often modeled as normally distributed.

The normal distribution's ubiquity in various aspects of life makes it a valuable tool for understanding and modeling random phenomena, simplifying statistical analyses, and making predictions in a wide range of fields.

Q5: What is Bernoulli Distribution? Give an Example. What is the difference between Bernoulli
Distribution and Binomial Distribution?

**Bernoulli Distribution:**

The Bernoulli distribution is a discrete probability distribution representing a random variable that can take on one of two possible outcomes, typically labeled as success (coded as 1) and failure (coded as 0). It is named after Jacob Bernoulli, a Swiss mathematician.

The probability mass function (PMF) of the Bernoulli distribution is given by:

\[ P(X = x) = \begin{cases} 
p & \text{if } x = 1 \\
1 - p & \text{if } x = 0 \\
0 & \text{otherwise}
\end{cases} \]

where \(p\) is the probability of success.

**Example:**

Consider a single toss of a fair coin. Let \(X\) be a random variable representing the outcome of the toss. If we define success as getting heads (coded as 1) and failure as getting tails (coded as 0), then \(X\) follows a Bernoulli distribution with \(p = 0.5\) (the probability of getting heads).

**Difference between Bernoulli Distribution and Binomial Distribution:**

1. **Number of Trials:**
   - **Bernoulli Distribution:** Represents a single trial with two possible outcomes (success or failure).
   - **Binomial Distribution:** Represents the number of successes in a fixed number (\(n\)) of independent Bernoulli trials.

2. **Random Variable:**
   - **Bernoulli Distribution:** Involves a single binary random variable (\(X\)).
   - **Binomial Distribution:** Involves the sum of binary random variables (\(X_1, X_2, ..., X_n\)) representing the number of successes in \(n\) trials.

3. **Probability Mass Function (PMF):**
   - **Bernoulli Distribution:** \(P(X = x) = p^x (1 - p)^{1 - x}\) for \(x = 0, 1\).
   - **Binomial Distribution:** \(P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}\) for \(k = 0, 1, ..., n\).

4. **Parameters:**
   - **Bernoulli Distribution:** Parameterized by \(p\) (probability of success).
   - **Binomial Distribution:** Parameterized by \(n\) (number of trials) and \(p\) (probability of success).

5. **Mean and Variance:**
   - **Bernoulli Distribution:** Mean (\(\mu\)) is \(p\), and Variance (\(\sigma^2\)) is \(p(1 - p)\).
   - **Binomial Distribution:** Mean (\(\mu\)) is \(np\), and Variance (\(\sigma^2\)) is \(np(1 - p)\).

In summary, the Bernoulli distribution is a special case of the binomial distribution where the number of trials (\(n\)) is 1. The binomial distribution generalizes the Bernoulli distribution to multiple independent trials, allowing the modeling of the number of successes in a sequence of binary outcomes.
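
The two PMFs and the special-case relationship can be sketched as follows. The function names are illustrative, and \(n = 10\), \(p = 0.5\) (ten fair coin tosses) are assumed example values.

```python
import math

def bernoulli_pmf(x, p):
    """P(X = x) for a single trial with success probability p."""
    if x not in (0, 1):
        return 0.0
    return p ** x * (1 - p) ** (1 - x)

def binomial_pmf(k, n, p):
    """P(X = k): probability of k successes in n independent Bernoulli trials."""
    if not 0 <= k <= n:
        return 0.0
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 10, 0.5

# With n = 1 the binomial PMF reduces to the Bernoulli PMF.
print(binomial_pmf(1, 1, p) == bernoulli_pmf(1, p))  # True

# Probability of exactly 5 heads in 10 fair coin tosses.
print(binomial_pmf(5, n, p))  # 0.24609375

# Mean and variance computed from the PMF match np and np(1 - p).
mean = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))
variance = sum((k - mean) ** 2 * binomial_pmf(k, n, p) for k in range(n + 1))
print(mean, variance)  # 5.0 2.5
```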

Q6: Consider a dataset with a mean of 50 and a standard deviation of 10. If we assume that the dataset
is normally distributed, what is the probability that a randomly selected observation will be greater
than 60? Use the appropriate formula and show your calculations.

To find the probability that a randomly selected observation from a normally distributed dataset will be greater than 60, we can use the Z-score formula and then look up the corresponding probability in the standard normal distribution table.

The Z-score is calculated using the formula:

\[ Z = \frac{{X - \mu}}{{\sigma}} \]

where:
- \( X \) is the value for which we want to find the probability (in this case, 60),
- \( \mu \) is the mean of the dataset (given as 50),
- \( \sigma \) is the standard deviation of the dataset (given as 10).

So, for \( X = 60 \):

\[ Z = \frac{{60 - 50}}{{10}} = 1 \]

Now, we look up the probability corresponding to a Z-score of 1 in the standard normal distribution table.

The probability that a randomly selected observation will be greater than 60 is given by:

\[ P(X > 60) = P(Z > 1) \]

Using a standard normal distribution table or calculator, \( P(Z \leq 1) \approx 0.8413 \), so \( P(Z > 1) = 1 - P(Z \leq 1) \approx 1 - 0.8413 = 0.1587 \).

Therefore, the probability that a randomly selected observation will be greater than 60 is approximately 0.1587, or 15.87%.
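
The same calculation can be checked in code instead of a table. This sketch uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma, x = 50, 10, 60

# Step 1: convert the observation to a Z-score.
z = (x - mu) / sigma

# Step 2: P(Z > z) = 1 - P(Z <= z), using the standard normal CDF.
p_greater = 1 - NormalDist().cdf(z)

# Cross-check: the same probability without standardizing first.
p_direct = 1 - NormalDist(mu, sigma).cdf(x)

print(z)                    # 1.0
print(round(p_greater, 4))  # 0.1587
print(round(p_direct, 4))   # 0.1587
```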

Q7: Explain uniform Distribution with an example.

**Uniform Distribution:**

The uniform distribution is a probability distribution where all values within a specified range are equally likely to occur, and the probability density function (PDF) is constant over that range. In other words, each value in the range has the same likelihood of being observed.

**Probability Density Function (PDF) of Uniform Distribution:**

For a continuous uniform distribution over the interval \([a, b]\), the PDF is given by:

\[ f(x) = \begin{cases} 
\frac{1}{b - a} & \text{if } a \leq x \leq b \\
0 & \text{otherwise}
\end{cases} \]

This means that the probability of any subinterval within \([a, b]\) is proportional to the length of that subinterval.

**Example:**

Consider a six-sided fair die. The outcomes (1, 2, 3, 4, 5, 6) follow a discrete uniform distribution because each face has an equal probability of \(\frac{1}{6}\) of being rolled. Here, the range is from 1 to 6, and the probability mass function (rather than a density) is constant over these six values. This is the discrete counterpart of the continuous uniform distribution defined above.

**Characteristics of Uniform Distribution:**

1. **Constant Probability:**
   - The probability density is constant across the range, so no value within it is more likely than any other.

2. **Rectangular Shape:**
   - The PDF forms a rectangle over the specified range, indicating equal likelihood for all values within that range.

3. **Equal Intervals:**
   - The distribution is defined over a continuous interval, and each subinterval of the same length within the range has an equal probability.

4. **Cumulative Distribution Function (CDF):**
   - The cumulative distribution function increases linearly over the range, from 0 at \(a\) to 1 at \(b\): \(F(x) = \frac{x - a}{b - a}\) for \(a \leq x \leq b\).

**Example (continuous):**

Consider a continuous uniform distribution over the interval \([2, 8]\), so the lower bound is \(a = 2\) and the upper bound is \(b = 8\). The probability density function is:

\[ f(x) = \frac{1}{8 - 2} = \frac{1}{6} \text{ for } 2 \leq x \leq 8 \]

In this case, any value within the interval \([2, 8]\) has an equal probability density of \(\frac{1}{6}\). The shape of the distribution is rectangular, indicating uniformity.
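
A minimal sketch of this example follows, with hypothetical helper names `uniform_pdf` and `uniform_cdf`. It also computes the probability of the subinterval \([3, 5]\), an assumed example, to show that probability is proportional to interval length.

```python
def uniform_pdf(x, a, b):
    """Density of the continuous uniform distribution on [a, b]."""
    return 1 / (b - a) if a <= x <= b else 0.0

def uniform_cdf(x, a, b):
    """P(X <= x) for the continuous uniform distribution on [a, b]."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)

a, b = 2, 8

print(uniform_pdf(5, a, b))  # constant density 1/6 inside the interval
print(uniform_pdf(9, a, b))  # 0.0 outside the interval

# P(3 <= X <= 5) = F(5) - F(3) = (5 - 3) / (8 - 2) = 1/3
prob_3_to_5 = uniform_cdf(5, a, b) - uniform_cdf(3, a, b)
print(round(prob_3_to_5, 4))  # 0.3333
```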

Q8: What is the z score? State the importance of the z score.

**Z-Score:**

The Z-score, also known as the standard score or z-value, is a measure of how many standard deviations a particular data point is from the mean of a distribution. It is calculated using the formula:

\[ Z = \frac{{X - \mu}}{{\sigma}} \]

where:
- \( Z \) is the Z-score,
- \( X \) is the individual data point,
- \( \mu \) is the mean of the distribution,
- \( \sigma \) is the standard deviation of the distribution.

The Z-score indicates whether a data point is below, equal to, or above the mean of the distribution and provides a standardized way to compare different observations across different scales.
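
As a small illustration of comparing observations across different scales, the sketch below converts two exam scores to Z-scores. The function name `z_score` and all of the numbers (scores, means, standard deviations) are hypothetical values chosen for the example.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Hypothetical exams graded on different scales.
math_z = z_score(85, mu=70, sigma=10)    # 1.5 standard deviations above the mean
physics_z = z_score(60, mu=50, sigma=5)  # 2.0 standard deviations above the mean

print(math_z, physics_z)   # 1.5 2.0
print(physics_z > math_z)  # True
```

Although 85 is the higher raw score, the physics result is stronger relative to its own distribution, which is the kind of comparison the Z-score makes possible.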

**Importance of Z-Score:**

1. **Standardization:**
   - Z-scores standardize data, allowing comparisons between different datasets with different units and scales. This is particularly useful in fields like statistics and data analysis.

2. **Outlier Detection:**
   - Z-scores help identify outliers. Observations whose Z-scores are far from zero (a common rule of thumb is \(|Z| > 3\)) may be considered outliers, indicating unusual behavior.

3. **Probability Calculation:**
   - Z-scores are used to calculate probabilities for normally distributed data. Converting a value \(X\) to its Z-score maps it onto the standard normal distribution (mean 0, standard deviation 1), where probabilities can be read from a standard normal table or computed from the CDF.

4. **Normal Distribution Analysis:**
   - In a normal distribution, Z-scores are crucial for understanding where a data point lies relative to the mean and how common or unusual it is within the distribution.

5. **Quality Control:**
   - Z-scores are used in quality control processes to identify data points that fall outside an acceptable range, suggesting potential issues.

6. **Data Transformation:**
   - Z-scores are used in data transformation techniques to normalize data, making it suitable for certain statistical analyses.

7. **Comparison of Scores:**
   - Z-scores allow for the comparison of scores from different distributions. By converting scores to Z-scores, analysts can assess relative performance or characteristics.

8. **Grading and Assessment:**
   - Z-scores are often used in educational settings to standardize scores on tests and assessments, providing a common metric for comparison.

In summary, the Z-score is a valuable statistical tool that standardizes data, facilitates comparisons, and provides insights into the relative position of data points within a distribution. It is widely used in various fields for analysis, quality control, and decision-making.

Q9: What is Central Limit Theorem? State the significance of the Central Limit Theorem.

Q10: State the assumptions of the Central Limit Theorem.