Q1: **Probability Mass Function (PMF) and Probability Density Function (PDF):**
   - **Probability Mass Function (PMF):** The PMF is used to describe the probability distribution of a discrete random variable. It assigns probabilities to each possible outcome or value of the random variable. For each value, the PMF gives the probability of that value occurring. Example: Rolling a fair six-sided die, the PMF would assign a probability of 1/6 to each of the six possible outcomes.

   - **Probability Density Function (PDF):** The PDF is used for continuous random variables and describes how probability is distributed over a range of values. The value of the PDF at a point is a density, not a probability: the probability of any single exact value is zero, and the area under the PDF curve between two points gives the probability that the random variable falls within that range. Example: The normal distribution, which is continuous, has a bell-shaped PDF, and the area under that curve over an interval gives the probability of observing a value in that interval.
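
The two examples above can be sketched in Python with the standard library. This is a minimal illustration, not a definitive implementation: the names `die_pmf` and `std_normal` are made up here, and the normal parameters (mean 0, standard deviation 1) are an assumption chosen for simplicity.

```python
from fractions import Fraction
from statistics import NormalDist

# PMF of a fair six-sided die: each face gets probability 1/6.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}
total_probability = sum(die_pmf.values())  # a valid PMF sums to 1

# PDF of a normal distribution (mean 0, sd 1 assumed for illustration).
std_normal = NormalDist(mu=0.0, sigma=1.0)
density_at_zero = std_normal.pdf(0.0)  # a density value, not a probability

# The probability of landing in [-1, 1] is the area under the PDF,
# computed here as a difference of CDF values.
prob_within_one_sd = std_normal.cdf(1.0) - std_normal.cdf(-1.0)

print(die_pmf[3], total_probability)
print(round(density_at_zero, 4), round(prob_within_one_sd, 4))
```

The PMF values are probabilities in their own right, while the PDF only yields a probability once it is integrated over an interval.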

Q2: **Cumulative Distribution Function (CDF):**
   - The Cumulative Distribution Function (CDF), sometimes loosely called the cumulative density function, gives the probability that a random variable takes a value less than or equal to a specific value: F(x) = P(X ≤ x). In other words, it accumulates probability as you move from left to right along the distribution. The CDF fully describes the distribution and is particularly useful for calculating probabilities and percentiles.

   - Example: Consider a fair six-sided die. The CDF of this die would start at 0 for values less than 1, then increase to 1/6 for 1, 2/6 for 2, and so on, until it reaches 1 for values greater than or equal to 6. It tells you the probability of rolling a value less than or equal to the value you're interested in.

   - The CDF is used because it simplifies many probability calculations. For instance, the probability that a value falls in the range (a, b] is the CDF at the upper bound minus the CDF at the lower bound: P(a < X ≤ b) = F(b) - F(a).
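
As a rough sketch of the die example, the CDF can be written as a step function. The function name `die_cdf` is hypothetical. Note that the difference of two CDF values gives the probability of the half-open range, P(a < X ≤ b).

```python
from fractions import Fraction
from math import floor

def die_cdf(x):
    """Return P(X <= x) for a fair six-sided die."""
    if x < 1:
        return Fraction(0)
    if x >= 6:
        return Fraction(1)
    # Between 1 and 6 the CDF jumps by 1/6 at each integer.
    return Fraction(floor(x), 6)

# P(2 < X <= 5) = F(5) - F(2), i.e. the chance of rolling 3, 4, or 5.
prob_3_to_5 = die_cdf(5) - die_cdf(2)

print(die_cdf(0.5), die_cdf(1), die_cdf(2.7), die_cdf(6), prob_3_to_5)
```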

Q3: **Examples of situations where the normal distribution might be used as a model:**
   - IQ Scores: IQ scores often follow a normal distribution, with a mean of 100 and a standard deviation of 15.
   - Height of Individuals: In a large population, the height of individuals often approximates a normal distribution.
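   - Measurement Errors: In many scientific measurements, random errors are approximately normally distributed.
   - Stock Prices: Daily stock returns are often modeled as normally distributed, although observed returns tend to have heavier tails than the normal distribution predicts.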

The parameters of the normal distribution relate to the shape of the distribution as follows:
   - The mean (μ) determines the center or peak of the distribution.
   - The standard deviation (σ) controls the spread or width of the distribution. A larger σ results in a wider distribution.
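
The effect of the two parameters can be checked numerically. The sketch below uses the IQ parameters from above (mean 100, standard deviation 15). The second and third distributions, with standard deviation 30 and mean 110, are hypothetical comparisons added only for illustration.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)
wider = NormalDist(mu=100, sigma=30)    # hypothetical: same center, larger spread
shifted = NormalDist(mu=110, sigma=15)  # hypothetical: same spread, moved center

# Doubling sigma halves the height of the peak, so the curve is flatter and wider.
peak_iq = iq.pdf(100)
peak_wider = wider.pdf(100)

# Changing mu only slides the curve: the peak height is unchanged.
peak_shifted = shifted.pdf(110)

# The share of values within one standard deviation of the mean is the same
# for every normal distribution (about 68%).
within_iq = iq.cdf(115) - iq.cdf(85)
within_wider = wider.cdf(130) - wider.cdf(70)

print(round(peak_iq, 5), round(peak_wider, 5), round(peak_shifted, 5))
print(round(within_iq, 4), round(within_wider, 4))
```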

Q4: **Importance of Normal Distribution:**
   - The normal distribution is crucial in statistics because of the Central Limit Theorem, which states that the mean of a large number of independent, identically distributed random variables with finite variance is approximately normally distributed, regardless of the shape of the original population distribution.
   - It is widely used in hypothesis testing, confidence intervals, and regression analysis.
   - Many real-life phenomena are approximately normal, making it a convenient model for various applications.

Real-life examples of the normal distribution:
   - Heights of individuals in a population.
   - IQ scores in a large population.
   - Errors in scientific measurements.
   - Some financial quantities, such as daily stock returns, as a first approximation.

Q5: **Bernoulli Distribution and Binomial Distribution:**
   - **Bernoulli Distribution:** The Bernoulli distribution models a random experiment with two possible outcomes: success (usually denoted as 1) and failure (usually denoted as 0). It is characterized by a single parameter, p, which represents the probability of success. For example, a single toss of a fair coin is a Bernoulli trial with p = 0.5, where heads (1) is success and tails (0) is failure.

   - **Binomial Distribution:** The Binomial distribution models the number of successes (k) in a fixed number of independent Bernoulli trials (n), where each trial has the same probability of success (p). It has two parameters, n and p. For example, the number of heads when flipping a coin 10 times (n=10, p=0.5) follows a binomial distribution.

The key difference is that the Bernoulli distribution models a single trial, while the Binomial distribution models multiple trials with the same probability of success.
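
The relationship between the two distributions can be sketched with the standard binomial formula P(X = k) = C(n, k) p^k (1 - p)^(n - k). The function names `bernoulli_pmf` and `binomial_pmf` are illustrative. The coin-flip parameters (n = 10, p = 0.5) come from the example above.

```python
from math import comb

def bernoulli_pmf(k, p):
    """P(X = k) for a single trial; k must be 0 (failure) or 1 (success)."""
    if k not in (0, 1):
        raise ValueError("a Bernoulli outcome is 0 or 1")
    return p if k == 1 else 1 - p

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.5
prob_five_heads = binomial_pmf(5, n, p)                   # C(10, 5) / 2**10
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))  # all outcomes together

# With n = 1 the binomial distribution reduces to the Bernoulli distribution.
same_as_bernoulli = binomial_pmf(1, 1, p) == bernoulli_pmf(1, p)

print(round(prob_five_heads, 4), total, same_as_bernoulli)
```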

Q6. To calculate the probability that a randomly selected observation from a normally distributed dataset with a mean of 50 and a standard deviation of 10 will be greater than 60, you can use the Z-score formula and then find the corresponding probability from a standard normal distribution table.

The Z-score (standard score) for a value x in a normal distribution with mean μ and standard deviation σ is calculated as:

Z = (x - μ) / σ

In this case, x = 60, μ = 50, and σ = 10:

Z = (60 - 50) / 10 = 1

Next, find the probability associated with Z = 1 using a Z-table or a calculator. Most standard normal tables list the cumulative probability P(Z ≤ 1) ≈ 0.8413, so the upper-tail probability is P(Z > 1) = 1 - 0.8413 ≈ 0.1587.

So, the probability that a randomly selected observation from the dataset will be greater than 60 is approximately 0.1587, or 15.87%.
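
The same calculation can be reproduced without a table. This sketch assumes Python 3.8 or later, where `statistics.NormalDist` is available in the standard library.

```python
from statistics import NormalDist

mu, sigma, x = 50, 10, 60

# Step 1: standardize the value.
z = (x - mu) / sigma

# Step 2: upper-tail probability from the standard normal CDF.
p_greater = 1 - NormalDist().cdf(z)

# Cross-check: the same probability without standardizing first.
p_direct = 1 - NormalDist(mu, sigma).cdf(x)

print(z, round(p_greater, 4), round(p_direct, 4))
```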

Q7. **Uniform Distribution:**
The uniform distribution is a probability distribution in which all outcomes in its range are equally likely. A discrete uniform distribution assigns the same probability to each of a finite set of values, while a continuous uniform distribution has a constant probability density function (PDF) over a specified interval [a, b].

**Example:** Rolling a fair six-sided die is an example of a discrete uniform distribution. Each of the six faces has an equal probability of 1/6 of occurring.

**Example:** A continuous uniform distribution describes a random variable that can take any value between a and b, with every sub-interval of the same length equally likely. For instance, if a bus arrives at a stop exactly every 10 minutes and a passenger shows up at a random time, the passenger's waiting time is uniformly distributed between 0 and 10 minutes.
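
For a continuous uniform distribution on [a, b], the density is the constant 1 / (b - a) and the CDF rises linearly from 0 to 1. The sketch below assumes a hypothetical waiting time that is uniform on [0, 10] minutes. The function names are illustrative.

```python
def uniform_pdf(x, a, b):
    """Density of the continuous uniform distribution on [a, b]."""
    return 1 / (b - a) if a <= x <= b else 0.0

def uniform_cdf(x, a, b):
    """P(X <= x) for the continuous uniform distribution on [a, b]."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

a, b = 0.0, 10.0  # assumed interval, in minutes

density = uniform_pdf(4.0, a, b)  # the same value everywhere inside [a, b]

# Probability of waiting between 3 and 7 minutes: interval length / (b - a).
p_wait_3_to_7 = uniform_cdf(7.0, a, b) - uniform_cdf(3.0, a, b)

print(density, round(p_wait_3_to_7, 4))
```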

Q8. **Z-Score:**
A Z-score (standard score) is a measure of how many standard deviations a data point is from the mean of a dataset. It is calculated using the formula:
Z = (X - μ) / σ

Where:
- Z is the Z-score.
- X is the individual data point.
- μ is the mean of the dataset.
- σ is the standard deviation of the dataset.

**Importance of Z-Score:**
1. It standardizes data, allowing you to compare values from different datasets with different units or scales.
2. It helps identify and interpret outliers by quantifying how far a data point is from the mean.
3. Z-scores are used in hypothesis testing and confidence intervals in statistics.
4. They are essential in quality control and process monitoring to detect deviations from expected norms.
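
To illustrate the first point, the sketch below compares scores from two tests with different scales. All numbers (the two scores, means, and standard deviations) are hypothetical values chosen for the example.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

# Hypothetical test A: scored out of 100, mean 70, standard deviation 10.
z_a = z_score(85, mu=70, sigma=10)

# Hypothetical test B: scored out of 800, mean 500, standard deviation 100.
z_b = z_score(620, mu=500, sigma=100)

# The raw scores are not comparable, but the Z-scores are:
# the test A result is further above its mean in standard-deviation units.
better = "A" if z_a > z_b else "B"

print(z_a, z_b, better)
```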

Q9. **Central Limit Theorem (CLT):**
The Central Limit Theorem is a fundamental concept in statistics. It states that the sampling distribution of the sample mean (or sum) approaches a normal distribution as the sample size increases, regardless of the shape of the population distribution, provided the population has finite variance. In other words, if you repeatedly draw random samples of a sufficiently large size n from a population and calculate the mean of each sample, the distribution of those sample means will be approximately normal, with mean μ and standard deviation σ / √n.

**Significance of the Central Limit Theorem:**
1. It allows for the use of normal distribution-based statistical methods even when the population is not normally distributed, making it applicable to a wide range of real-world situations.
2. It forms the basis for inferential statistics, hypothesis testing, and the construction of confidence intervals.
3. It quantifies how precisely a sample mean estimates the population mean, through the standard error σ / √n.
4. It simplifies complex problems by approximating them with the normal distribution.
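
A small simulation can make the theorem concrete. The sketch below is one possible illustration under stated assumptions: the population is exponential with rate 1 (mean 1, standard deviation 1, strongly skewed), the sample size is 50, and 5000 samples are drawn with a fixed seed. None of these choices come from the text above.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable

def sample_mean(n):
    """Mean of n draws from an exponential population with rate 1."""
    return sum(random.expovariate(1.0) for _ in range(n)) / n

n = 50
means = [sample_mean(n) for _ in range(5000)]

center = mean(means)   # should be close to the population mean, 1
spread = stdev(means)  # should be close to sigma / sqrt(n), about 0.141

# For a normal distribution, about 68% of values lie within one
# standard error of the center.
standard_error = 1 / n**0.5
within_one_se = sum(abs(m - 1.0) <= standard_error for m in means) / len(means)

print(round(center, 3), round(spread, 3), round(within_one_se, 3))
```

Even though individual draws are far from normal, the sample means cluster around 1 with a spread near σ / √n, as the theorem predicts.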

Q10. **Assumptions of the Central Limit Theorem:**
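1. **Independence:** Observations within a sample must be independent of each other. When sampling without replacement, a common rule of thumb is to keep the sample below about 10% of the population.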
2. **Sample Size:** As a rule of thumb, a sufficiently large sample size (typically n > 30) is required for the CLT to apply, although the exact size may vary depending on the population distribution.
3. **Random Sampling:** Samples should be selected randomly from the population.
4. **Population Distribution:** The population does not have to be normally distributed, but it must have a finite mean and variance. The approximation is best when the population is not extremely skewed or heavy-tailed; for strongly skewed populations, larger sample sizes are needed before the distribution of the sample mean is close to normal.
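
The interaction between sample size and skewness can be explored with a simulation. The sketch below is an illustration under assumptions, not a proof: it uses an exponential population with rate 1, sample sizes 5, 30, and 200, 5000 repetitions per size, and a fixed seed. A skewness near 0 indicates a roughly symmetric, normal-looking sampling distribution.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

def skewness(values):
    """Sample skewness: third central moment divided by variance**1.5."""
    n = len(values)
    m = sum(values) / n
    variance = sum((v - m) ** 2 for v in values) / n
    third_moment = sum((v - m) ** 3 for v in values) / n
    return third_moment / variance**1.5

def sample_means(n, repetitions=5000):
    """Means of repeated samples of size n from an exponential population."""
    return [
        sum(random.expovariate(1.0) for _ in range(n)) / n
        for _ in range(repetitions)
    ]

# Skewness of the sampling distribution for each sample size.
skew_by_n = {n: skewness(sample_means(n)) for n in (5, 30, 200)}

for n, skew in skew_by_n.items():
    print(n, round(skew, 2))
```

In this setup the skewness of the sample means shrinks as n grows, which is why a skewed population calls for a larger sample before the normal approximation is trusted.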