Q1. What are the Probability Mass Function (PMF) and Probability Density Function (PDF)? Explain with
an example.

The PMF is a function that describes the probability distribution of a discrete random variable. It gives the probability of each possible value, so its values are never negative and they sum to 1 over all possible values.

For instance, consider a random variable X representing the outcome of a fair six-sided die roll. The PMF of X can be represented as:

PMF(X=x) = 1/6 for x ∈ {1,2,3,4,5,6}
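
As a minimal sketch of the die example, the PMF can be stored as a dictionary and checked against the two properties of a PMF. The names `die_pmf` and `p_even` are illustrative choices, not part of the original answer.

```python
from fractions import Fraction

# PMF of a fair six-sided die: each face has probability 1/6.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

# A valid PMF is non-negative and sums to exactly 1.
total_probability = sum(die_pmf.values())

# The probability of an event is the sum of the PMF over its outcomes,
# for example the event "the roll is even" = {2, 4, 6}.
p_even = sum(p for face, p in die_pmf.items() if face % 2 == 0)

print(total_probability)  # 1
print(p_even)             # 1/2
```

Exact fractions are used here only to avoid floating-point rounding in the sums.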

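The PDF is a function that describes the probability distribution of a continuous random variable. Its value at a point is a probability density, not a probability: probabilities are obtained by integrating the PDF over an interval, and the probability of any single exact value is 0.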

For instance, consider a random variable Y representing the time it takes for a customer to complete a transaction at a store. If Y is modeled with an exponential distribution whose mean is 20 (in whatever time unit is used), the PDF of Y is:

PDF(y) = (1/20) e^(-y/20) for y ≥ 0

Here, the density decreases exponentially as y increases, so short transaction times are more likely than long ones. The area under the PDF curve between two values gives the probability that Y falls in that range, and the total area under the curve is always equal to 1.
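
The relationship between density and probability can be checked numerically. The sketch below uses the exponential density from the example above and integrates it with a simple midpoint rule; the interval 10 to 30 and the helper names `pdf`, `cdf` and `integrate` are assumptions made for illustration.

```python
import math

MEAN_TIME = 20.0  # mean of the exponential model in the example above

def pdf(y):
    """Density (1/20) * exp(-y/20) for y >= 0, and 0 otherwise."""
    return math.exp(-y / MEAN_TIME) / MEAN_TIME if y >= 0 else 0.0

def cdf(y):
    """Closed-form integral of the density from 0 to y."""
    return 1.0 - math.exp(-y / MEAN_TIME) if y >= 0 else 0.0

def integrate(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

# P(10 <= Y <= 30): area under the density between 10 and 30.
p_interval_numeric = integrate(pdf, 10, 30)
p_interval_exact = cdf(30) - cdf(10)

# Total area: the tail beyond 400 is about e^-20, so [0, 400] is enough.
total_area = integrate(pdf, 0, 400)

print(round(p_interval_exact, 4))  # 0.3834
print(round(p_interval_numeric, 4))
print(round(total_area, 6))
```

Note that `pdf(0)` is 0.05, which is a density and not the probability that Y equals 0.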

Q2. What is Cumulative Distribution Function (CDF)? Explain with an example. Why is CDF used?

The CDF gives the probability that a random variable takes on a value less than or equal to a given value: CDF(x) = P(X ≤ x).

For a continuous random variable, the CDF is obtained by integrating the PDF from negative infinity to the given value. For a discrete random variable, the CDF is obtained by adding up the PMF values for all values less than or equal to the given value.

For example, let X be the number of heads obtained when flipping a fair coin three times. The PMF of X is:

PMF(X = 0) = 1/8
PMF(X = 1) = 3/8
PMF(X = 2) = 3/8
PMF(X = 3) = 1/8

To find the CDF at x = 2, we add up the probabilities of X taking on the values 0, 1, or 2:

CDF(2) = P(X ≤ 2) = PMF(X = 0) + PMF(X = 1) + PMF(X = 2) = 1/8 + 3/8 + 3/8 = 7/8

The CDF at x = 2 gives the probability that the number of heads obtained is less than or equal to 2 when flipping a coin three times.
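
The same running total can be sketched in a few lines. The PMF values are computed from the binomial coefficient C(3, k) / 2^3, which reproduces the table above; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import accumulate
from math import comb

N_FLIPS = 3

# PMF of the number of heads in three fair coin flips: C(3, k) / 2^3.
pmf = [Fraction(comb(N_FLIPS, k), 2 ** N_FLIPS) for k in range(N_FLIPS + 1)]

# The CDF at x is the running total of the PMF up to and including x.
cdf = list(accumulate(pmf))

for k, (p, c) in enumerate(zip(pmf, cdf)):
    print(f"x = {k}: PMF = {p}, CDF = {c}")

print(cdf[2])  # 7/8
```

The last CDF value is 1, because every possible outcome is less than or equal to 3.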

The CDF is used in many statistical applications, such as hypothesis testing (p-values are tail probabilities read from a CDF), confidence interval estimation, and simulation. It makes probability calculations direct: the probability that a value falls in a range is a difference of two CDF values, P(a < X ≤ b) = CDF(b) - CDF(a). Inverting the CDF gives quantiles such as the median and other percentiles, and the same inverse can be used to generate random samples from the distribution.
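
A short sketch of three common uses of a CDF follows: an interval probability, a quantile, and simulation by inverting the CDF. It assumes, purely for illustration, an exponential distribution with mean 20, whose CDF has a simple closed form; the function names are hypothetical helpers.

```python
import math
import random

MEAN_TIME = 20.0  # assumed mean of an exponential model, for illustration only

def exponential_cdf(y):
    """P(Y <= y) for the exponential model."""
    return 1.0 - math.exp(-y / MEAN_TIME) if y >= 0 else 0.0

def exponential_inverse_cdf(u):
    """Value y with CDF(y) = u, valid for 0 <= u < 1."""
    return -MEAN_TIME * math.log(1.0 - u)

# 1. Interval probability as a difference of two CDF values.
p_between = exponential_cdf(30) - exponential_cdf(10)

# 2. Quantile: the median is the value where the CDF equals 0.5.
median = exponential_inverse_cdf(0.5)

# 3. Simulation: feed uniform random numbers through the inverse CDF.
rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [exponential_inverse_cdf(rng.random()) for _ in range(50_000)]
sample_mean = sum(samples) / len(samples)

print(round(p_between, 4))
print(round(median, 3))
print(round(sample_mean, 2))  # should be close to 20
```

The simulated mean lands near the model mean of 20, which is a quick sanity check that the inverse-CDF samples follow the intended distribution.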

Q3. What are some examples of situations where the normal distribution might be used as a model?
Explain how the parameters of the normal distribution relate to the shape of the distribution.

Some examples of situations where the normal distribution might be used as a model:
1. Exam scores: In a large class of students, exam scores may be approximately normally distributed.
2. Medical measurements: Some medical measurements, such as blood pressure or heart rate, may follow a normal distribution.
3. Financial returns: The returns of financial investments may be modeled using the normal distribution, assuming they are independent and identically distributed.

The normal distribution is characterized by two parameters: the mean (μ) and the standard deviation (σ). The mean determines the center of the distribution, while the standard deviation controls the spread of the distribution.

If μ=0 and σ=1, the normal distribution is known as the standard normal distribution. Like every normal distribution, it has a symmetric bell-shaped curve with its highest point at the mean, which here is zero. For any normal distribution, approximately 68% of the data fall within one standard deviation of the mean, 95% within two standard deviations, and 99.7% within three standard deviations.

If μ is greater than zero, the distribution will be shifted to the right, and if it is less than zero, it will be shifted to the left. If σ is small, the distribution will be narrow and peaked, while if σ is large, the distribution will be wide and flattened.
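
The effect of the two parameters can be checked with Python's built-in `statistics.NormalDist`. In this sketch the second distribution (μ = 50, σ = 10) is a hypothetical choice; `prob_within` is an illustrative helper.

```python
from statistics import NormalDist

def prob_within(dist, k):
    """P(mu - k*sigma <= X <= mu + k*sigma) for a normal distribution."""
    low = dist.mean - k * dist.stdev
    high = dist.mean + k * dist.stdev
    return dist.cdf(high) - dist.cdf(low)

standard = NormalDist(mu=0, sigma=1)
shifted_wide = NormalDist(mu=50, sigma=10)  # hypothetical parameters

# The 68-95-99.7 rule holds for both, whatever mu and sigma are.
for k in (1, 2, 3):
    print(k, round(prob_within(standard, k), 4), round(prob_within(shifted_wide, k), 4))

# The peak sits at the mean; a larger sigma gives a lower, flatter peak.
peak_standard = standard.pdf(standard.mean)
peak_wide = shifted_wide.pdf(shifted_wide.mean)
print(peak_standard > peak_wide)  # True
```

Changing μ moves the curve along the axis without altering its shape, while changing σ rescales its width and height.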

Q4. Explain the importance of Normal Distribution. Give a few real-life examples of Normal
Distribution.

Some reasons why the normal distribution is important:

1. It is a commonly occurring distribution: The normal distribution is found in many natural phenomena, such as the heights and weights of individuals, test scores, and errors in measurements. Thus, it is important to understand and model these phenomena using the normal distribution.

2. It simplifies statistical analysis: The normal distribution is mathematically tractable and has many analytical properties, making it easy to analyze statistically. This simplifies many statistical analyses, such as hypothesis testing and parameter estimation.

3. It forms the basis of many statistical techniques: Many statistical techniques, such as linear regression and ANOVA, assume the normal distribution for the errors or residuals. Therefore, understanding the normal distribution is critical for understanding and applying these techniques.

A few real-world examples:

Heights of adults: The heights of adults are approximately normally distributed, with a mean of around 5'7" and a standard deviation of around 2.5".

IQ scores: IQ scores are approximately normally distributed, with a mean of 100 and a standard deviation of 15.

Blood pressure: Blood pressure readings in a population can be modeled by a normal distribution, with mean and standard deviation varying based on the population.

Stock prices: Stock prices are often assumed to follow a log-normal distribution, which is a related distribution to the normal distribution.

Test scores: The scores on many standardized tests, such as the SAT or GRE, are approximately normally distributed, with a mean and standard deviation varying by the specific test.

Q5. What is Bernoulli Distribution? Give an Example. What is the difference between Bernoulli
Distribution and Binomial Distribution?

Bernoulli distribution is a probability distribution that models the probability of a binary event, where the outcome can be either a success or a failure.

For example, suppose a coin is flipped once and we are interested in the probability of getting heads. We can model this using the Bernoulli distribution, where the outcome is heads (success) with probability p = 0.5 and tails (failure) with probability 1-p = 0.5.

The difference is that the Bernoulli distribution models the outcome of a single binary trial, while the binomial distribution models the number of successes in a fixed number n of independent trials that all have the same success probability p. The Bernoulli distribution is therefore a special case of the binomial distribution, with n=1.
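
The relationship can be sketched directly from the two PMF formulas. The function names and the choice of 10 flips in the last step are illustrative assumptions.

```python
from math import comb

def bernoulli_pmf(outcome, p):
    """P(X = outcome) for one trial; outcome is 1 (success) or 0 (failure)."""
    return p if outcome == 1 else 1 - p

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p = 0.5  # fair coin, heads counts as success

# With n = 1 the binomial PMF reduces to the Bernoulli PMF.
for outcome in (0, 1):
    print(outcome, bernoulli_pmf(outcome, p), binomial_pmf(outcome, 1, p))

# With n = 10 the binomial describes the number of heads in 10 flips.
pmf_10 = [binomial_pmf(k, 10, p) for k in range(11)]
print(pmf_10[5])  # probability of exactly 5 heads
```

A binomial count can also be seen as the sum of n independent Bernoulli outcomes, which is why the two distributions share the parameter p.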


Q6. Consider a dataset with a mean of 50 and a standard deviation of 10. If we assume that the dataset
is normally distributed, what is the probability that a randomly selected observation will be greater
than 60? Use the appropriate formula and show your calculations.

To find the probability that a randomly selected observation is greater than 60, first convert 60 to a z-score.

Z = (x - mean) / std
x = 60
mean = 50
std = 10

Z = (60 - 50) / 10 = 1

From the Z table, the cumulative probability for Z = 1 is P(Z ≤ 1) = 0.84134. This is the probability that an observation is less than or equal to 60, but we need the probability that it is greater than 60.

Therefore, the required probability is 1 - 0.84134 = 0.15866

In [5]:
# The python code to calculate the probability that a randomly selected observation will be greater than 60
import scipy.stats as stats

mean = 50
std = 10
x = 60

z = (x - mean) / std
prob = 1 - stats.norm.cdf(z)

print("The probability of a randomly selected observation being greater than 60 is:", prob)

The probability of a randomly selected observation being greater than 60 is: 0.15865525393145707


Q7. Explain uniform Distribution with an example.

The uniform distribution is a probability distribution in which all possible outcomes are equally likely. It can be discrete (a finite set of equally likely values) or continuous (a constant density over an interval).

For example, consider rolling a fair die. The distribution of the outcome is a discrete uniform distribution, because all six possible outcomes (1, 2, 3, 4, 5, 6) are equally likely: each has a probability of 1/6.
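
A quick simulation illustrates the idea: over many rolls, each face should appear in roughly one sixth of them. The number of rolls and the seed are arbitrary choices for this sketch.

```python
import random
from collections import Counter

rng = random.Random(42)  # fixed seed so the run is repeatable
N_ROLLS = 60_000

# Simulate rolls of a fair die; randint includes both endpoints.
rolls = [rng.randint(1, 6) for _ in range(N_ROLLS)]
counts = Counter(rolls)

# Relative frequency of each face, to compare with the theoretical 1/6.
frequencies = {face: counts[face] / N_ROLLS for face in range(1, 7)}

for face, freq in frequencies.items():
    print(face, round(freq, 4))
print("theoretical:", round(1 / 6, 4))
```

The observed frequencies will not be exactly equal, but they get closer to 1/6 as the number of rolls grows.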

Q8. What is the z score? State the importance of the z score.

The z-score is a statistical measure that indicates how many standard deviations a data point is away from the mean of a dataset: z = (x - mean) / std. A positive z-score means the point is above the mean, and a negative one means it is below. The importance of the z-score lies in its ability to standardize data, making it easier to compare and interpret values from different datasets that have different units or scales. For normally distributed data, it also lets us look up probabilities in a standard normal (Z) table. In addition, the z-score can be used to identify outliers in a dataset. Data points whose z-score is far from zero (a common cutoff is an absolute value above 3) can be considered outliers, which may indicate an error in the data or an unusual data point that requires further investigation.
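
As a minimal sketch, z-scores can be computed with the standard library and used to flag unusual points. The data below are hypothetical, and the cutoff of 3 is a common convention rather than a fixed rule.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Return (value - mean) / standard deviation for every value."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    return [(v - mu) / sigma for v in values]

# Hypothetical measurements; the last value is deliberately extreme.
data = [48, 50, 52, 49, 51, 50, 47, 53, 50, 49,
        51, 50, 52, 48, 50, 51, 49, 50, 50, 95]

scores = z_scores(data)

# Flag points more than 3 standard deviations away from the mean.
THRESHOLD = 3.0
outliers = [v for v, z in zip(data, scores) if abs(z) > THRESHOLD]

print(outliers)  # [95]
```

In very small datasets a single extreme point inflates the standard deviation so much that no z-score can pass the cutoff, so this rule should be applied with care.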

Q9. What is Central Limit Theorem? State the significance of the Central Limit Theorem.

The Central Limit Theorem (CLT) states that if you take sufficiently large random samples from a population with a finite variance, the distribution of the sample mean will be approximately normal and centered at the population mean, regardless of the shape of the population distribution.

In other words, the CLT tells us that if we take multiple samples of the same size from a population and calculate the mean of each sample, the distribution of these sample means will be approximately normal, even if the population distribution is not normal.

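The CLT allows us to make inferences about a population from a sample without knowing the population's distribution. Because the sample mean is approximately normal with mean μ and standard deviation σ/√n (the standard error), we can build confidence intervals and run hypothesis tests for the population mean. This is particularly useful when studying large populations that are difficult or expensive to measure in their entirety.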

Consider election polling as an example. Polls use a sample of voters to estimate the share of people who support a specific candidate. You may have seen these results reported with confidence intervals (margins of error) on news channels. The CLT underlies this calculation.
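
The theorem can be illustrated with a small simulation. This sketch assumes a strongly skewed population (exponential with mean 20, so its standard deviation is also 20) and a sample size of 40; both are arbitrary illustrative choices.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

POP_MEAN = 20.0
POP_STD = 20.0      # for an exponential distribution, std equals the mean
SAMPLE_SIZE = 40
N_SAMPLES = 5_000

def draw_sample_mean():
    """Draw one sample from the skewed population and return its mean."""
    sample = [rng.expovariate(1 / POP_MEAN) for _ in range(SAMPLE_SIZE)]
    return statistics.fmean(sample)

sample_means = [draw_sample_mean() for _ in range(N_SAMPLES)]

mean_of_means = statistics.fmean(sample_means)
std_of_means = statistics.pstdev(sample_means)
standard_error = POP_STD / SAMPLE_SIZE ** 0.5  # sigma / sqrt(n)

# For a normal distribution about 68% of values lie within one
# standard deviation of the center.
within_one_se = sum(
    abs(m - POP_MEAN) <= standard_error for m in sample_means
) / N_SAMPLES

print(round(mean_of_means, 2), round(std_of_means, 2), round(standard_error, 2))
print(round(within_one_se, 3))
```

Even though individual observations are far from normal, the sample means cluster around 20 with a spread close to σ/√n, and roughly 68% of them fall within one standard error.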

Q10. State the assumptions of the Central Limit Theorem.

1. The sample size should be sufficiently large: The CLT assumes that the sample size is large enough for the sample mean to be approximately normally distributed, regardless of the underlying distribution of the population. A sample size of at least 30 is a common rule of thumb, although strongly skewed or heavy-tailed populations may need larger samples.

2. The data must be independent and identically distributed (i.i.d): The CLT assumes that the data is sampled randomly and each observation is independent of the others, meaning that the outcome of one observation does not influence the outcome of another. Additionally, the CLT assumes that each observation is drawn from the same probability distribution.

3. The population should have a finite variance: The CLT assumes that the population from which the sample is drawn has a finite variance, which is a measure of how spread out the data is around the mean. If the population variance is infinite, as for the Cauchy distribution, the classical CLT does not apply.

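4. Outliers should be minimized: This is a practical consideration rather than a formal assumption of the theorem. Significant outliers or extreme values can distort the distribution of the sample mean and slow its convergence toward normality, so larger samples may be needed when they are present.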