Q1: What are the Probability Mass Function (PMF) and Probability Density Function (PDF)? Explain with
an example.

The Probability Mass Function (PMF) and Probability Density Function (PDF) are two fundamental concepts in probability theory 
and statistics.

The PMF is used to describe the probability distribution of a discrete random variable. It gives the probability that a random
variable X takes on a specific value x, and is denoted by P(X=x). The PMF is defined for each possible value of x, and the sum 
of all probabilities over all possible values of x is equal to 1.

For example, consider a fair six-sided die. The PMF of the random variable X, which represents the value of the die, is given 
by:

P(X=1) = 1/6
P(X=2) = 1/6
P(X=3) = 1/6
P(X=4) = 1/6
P(X=5) = 1/6
P(X=6) = 1/6

The PMF tells us that the probability of rolling a 1 is 1/6, the probability of rolling a 2 is 1/6, and so on. The sum of all 
probabilities is equal to 1.
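
As a minimal sketch, the die PMF above can be written as a lookup table in Python. The name die_pmf is made up for this illustration; exact fractions are used so the sum comes out to exactly 1.

```python
from fractions import Fraction

# PMF of a fair six-sided die: each face 1..6 has probability 1/6.
# (die_pmf is an illustrative name, not part of any library.)
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

# A valid PMF is non-negative everywhere and sums to exactly 1.
total = sum(die_pmf.values())

print(die_pmf[3])   # 1/6
print(total)        # 1
```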

The PDF is used to describe the probability distribution of a continuous random variable. It gives the probability density at 
each point in the range of possible values of the random variable. Unlike the PMF, the PDF does not give the probability of a
specific value (for a continuous variable, P(X=x) = 0 for every x). Instead, the probability that X falls within an interval
[a, b] is the area under the PDF between a and b.

For example, consider the normal distribution with mean μ and standard deviation σ. The PDF of the random variable X, which 
represents a measurement from this distribution, is given by:

f(x) = (1 / (σ * sqrt(2 * π))) * exp(-((x - μ)^2) / (2 * σ^2))

The PDF tells us the probability density at each value of x. For example, the PDF tells us that the probability density is
highest around the mean μ, and decreases as we move away from the mean in either direction. The total area under the curve of 
the PDF is equal to 1.
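
The formula above can be checked numerically. The following is a rough sketch, assuming a standard normal (μ = 0, σ = 1) as the default; the helper names normal_pdf and integrate are invented for this example, and the integral is approximated with a simple midpoint rule rather than computed exactly.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)

def integrate(f, lo, hi, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width

peak = normal_pdf(0.0)                            # density at the mean, about 0.3989
area = integrate(normal_pdf, -8.0, 8.0)           # total area, very close to 1
within_one_sd = integrate(normal_pdf, -1.0, 1.0)  # P(-1 <= X <= 1), about 0.6827

print(round(peak, 4), round(area, 4), round(within_one_sd, 4))
```

Note that the density at a point (about 0.3989 at the mean) is not a probability; only areas under the curve are.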

Q2: What is the Cumulative Distribution Function (CDF)? Explain with an example. Why is the CDF used?

The Cumulative Distribution Function (CDF) is a function that gives the probability that a random variable X takes on a value
less than or equal to x, for any value of x. It is denoted by F(x), and is defined for both discrete and continuous random variables.

For a discrete random variable X with PMF p(x), the CDF is defined as:

F(x) = P(X ≤ x) = ∑ p(i), for all i such that i ≤ x

For a continuous random variable X with PDF f(x), the CDF is defined as:

F(x) = P(X ≤ x) = ∫ f(t) dt, for t from -∞ to x

The CDF is useful because it provides a complete description of the probability distribution of a random variable. It gives the
probability of X taking on any value less than or equal to x, and can be used to calculate probabilities for specific intervals
of values.

For example, consider a coin flip experiment where we are interested in the number of heads that occur in two flips. Let X be
the number of heads. The PMF of X is:

P(X=0) = 1/4
P(X=1) = 1/2
P(X=2) = 1/4

The CDF of X is:

F(x) = 0, if x < 0
F(x) = P(X=0) = 1/4, if 0 ≤ x < 1
F(x) = P(X=0) + P(X=1) = 3/4, if 1 ≤ x < 2
F(x) = 1, if x ≥ 2

The CDF tells us that the probability that X is less than or equal to 0 is 1/4, the probability that X is less than or equal to
1 is 3/4, and the probability that X is less than or equal to 2 is 1. We can use the CDF to calculate probabilities for 
specific intervals of values, such as the probability that X is greater than 1 and at most 2, which is
P(1 < X ≤ 2) = F(2) - F(1) = 1 - 3/4 = 1/4. This equals P(X=2), since 2 is the only possible value in that interval.
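
The discrete definition F(x) = ∑ p(i) can be sketched directly in Python for the two-flip example. The names pmf and cdf are chosen for this illustration only.

```python
from fractions import Fraction

# PMF of X = number of heads in two fair coin flips.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def cdf(x):
    """F(x) = P(X <= x): sum the PMF over all outcomes not exceeding x."""
    return sum(p for value, p in pmf.items() if value <= x)

print(cdf(0), cdf(1), cdf(2))   # 1/4 3/4 1
print(cdf(2) - cdf(1))          # P(1 < X <= 2) = P(X = 2) = 1/4
```

Because the CDF of a discrete variable is a step function, cdf(1.5) returns the same value as cdf(1).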

In summary, the CDF is used to describe the probability distribution of a random variable, and provides a way to calculate
probabilities for specific intervals of values. It is a fundamental concept in probability theory and statistics, and is widely
used in a variety of applications.

Q3: What are some examples of situations where the normal distribution might be used as a model?
Explain how the parameters of the normal distribution relate to the shape of the distribution.

The normal distribution, also known as the Gaussian distribution, is a probability distribution that is commonly used as a model
for many natural and social phenomena. Some examples of situations where the normal distribution might be used as a model 
include:

Physical measurements such as height, weight, and blood pressure
IQ scores and other standardized test scores
Financial returns, such as daily stock returns (as a rough approximation)
Errors in measurements and experimental data
Natural phenomena such as wind speeds or rainfall (these are often skewed, so the normal model is only approximate)

The normal distribution is characterized by two parameters: the mean (μ) and the standard deviation (σ). The mean determines 
the center of the distribution, while the standard deviation determines the spread or width of the distribution.

If μ is increased, the entire distribution shifts to the right, and if μ is decreased, the distribution shifts to the left.
The standard deviation, σ, controls the spread of the distribution. A smaller value of σ indicates a narrower and taller 
distribution, while a larger value of σ indicates a wider and flatter distribution.
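
The effect of each parameter can be seen by evaluating the normal density at its peak. This is a small sketch using the PDF formula from Q1; the (μ, σ) pairs are arbitrary values picked only for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Hypothetical (mu, sigma) pairs chosen only to illustrate the effect of each parameter.
for mu, sigma in [(0, 1), (2, 1), (0, 2)]:
    peak = normal_pdf(mu, mu, sigma)   # the density is highest at x = mu
    print(f"mu={mu}, sigma={sigma}: peak at x={mu}, height {peak:.4f}")
```

Changing μ from 0 to 2 moves the peak without changing its height (0.3989), while doubling σ halves the peak height (0.1995), which is the "wider and flatter" behavior described above.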

The normal distribution is symmetric around its mean, meaning that the left and right tails of the distribution are mirror 
images of each other. This is true regardless of the values of μ and σ. The total area under the normal distribution curve is
equal to 1, which means that the random variable takes some value on the real line with probability 1.

In summary, the normal distribution is a versatile and commonly used probability distribution that can be used to model a wide
variety of natural and social phenomena. The parameters of the normal distribution, namely the mean and standard deviation, 
determine the center and spread of the distribution, respectively.

Q4: Explain the importance of Normal Distribution. Give a few real-life examples of Normal
Distribution.

The normal distribution, also known as the Gaussian distribution, is an important concept in probability theory and statistics.
It is widely used because many natural and social phenomena approximately follow this distribution. It is important because it
allows us to model and analyze real-world data in a variety of fields, including physics, engineering, social sciences, finance,
and more.

One of the key advantages of the normal distribution is that it is completely specified by its mean and standard deviation.
This allows us to easily calculate probabilities and make predictions about the data. Additionally, the central limit theorem
states that the standardized sum (or mean) of a large number of independent, identically distributed random variables with
finite variance tends to follow a normal distribution. This property makes the normal distribution a useful tool in many areas
of research and analysis.

Here are a few real-life examples of normal distribution:

Height: In human populations, adult height approximately follows a normal distribution, with the majority of people clustered
around the mean height, and fewer people at the extreme ends of the distribution.

IQ Scores: IQ scores are normalized to have a mean of 100 and a standard deviation of 15, and follow a normal distribution. This
allows us to compare individual scores to the population mean and predict the likelihood of certain scores.

Stock Prices: The daily returns of stock prices are often modeled using a normal distribution. This allows investors to estimate
the probability of a particular return, and manage their investment portfolios accordingly.

Test Scores: Standardized test scores, such as the SAT or GRE, are scaled so that scores are approximately normally distributed.
For example, SAT section scores were historically scaled to have a mean of about 500 and a standard deviation of about 100. This
allows educators and employers to compare individual scores to the population and make decisions about admissions or hiring.

Weight: In animal populations, weight often follows a normal distribution, with the majority of animals clustered around the
mean weight, and fewer animals at the extreme ends of the distribution.

In summary, the normal distribution is an important tool in probability theory and statistics. It allows us to model and analyze
real-world data in a variety of fields, and make predictions about the data based on well-defined mean and standard deviation. 
Its versatility and flexibility make it an invaluable concept in many areas of research and analysis.

Q5: What is the Bernoulli Distribution? Give an example. What is the difference between the Bernoulli
Distribution and the Binomial Distribution?

The Bernoulli distribution is a probability distribution that describes the outcomes of a single experiment that can result in 
one of two possible outcomes, often called a success or a failure. It is named after the Swiss mathematician Jacob Bernoulli, 
who first introduced the concept in the late 1600s.

The Bernoulli distribution is characterized by a single parameter, usually denoted as p, which represents the probability of a
success. The probability of a failure is simply (1 - p).

An example of the Bernoulli distribution could be a coin flip, where a "success" could be defined as the coin landing on heads,
and a "failure" as the coin landing on tails. If the probability of the coin landing on heads is 0.5, then the Bernoulli
distribution for this experiment would have p = 0.5.

The Bernoulli distribution is a special case (n = 1) of the binomial distribution, which describes the probability distribution
of the number of successes in a fixed number of independent Bernoulli trials. In other words, while the Bernoulli distribution
describes the outcome of a single experiment, the binomial distribution describes the number of successes in a series of
independent experiments with the same probability of success.

The binomial distribution is characterized by two parameters: n, the number of trials, and p, the probability of success. The
probability of getting exactly k successes in n trials is given by the binomial probability formula:

P(X = k) = (n choose k) * p^k * (1 - p)^(n - k), for k = 0, 1, ..., n

where (n choose k) is the binomial coefficient, which represents the number of ways to choose k items out of a set of n.
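
The binomial formula above translates almost line for line into Python using math.comb for the binomial coefficient. This is a sketch only: the function name binomial_pmf and the values n = 10, p = 0.5 are assumptions made for the example.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k): probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 10, 0.5   # hypothetical example: 10 flips of a fair coin
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]

print(binomial_pmf(3, n, p))   # 120 / 1024 = 0.1171875
print(sum(probs))              # the probabilities over k = 0..n sum to 1

# With n = 1 the binomial reduces to the Bernoulli distribution:
# P(X = 1) = p and P(X = 0) = 1 - p.
bernoulli_success = binomial_pmf(1, 1, 0.3)
bernoulli_failure = binomial_pmf(0, 1, 0.3)
```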

To summarize, the Bernoulli distribution describes the probability distribution of a single experiment with two possible 
outcomes, while the binomial distribution describes the probability distribution of a series of independent Bernoulli trials 
with a fixed probability of success.

Q6: Consider a dataset with a mean of 50 and a standard deviation of 10. If we assume that the dataset
is normally distributed, what is the probability that a randomly selected observation will be greater
than 60? Use the appropriate formula and show your calculations.

To calculate the probability that a randomly selected observation will be greater than 60, we first standardize the value using the z-score formula:

Z = (X - μ) / σ

where Z is the standard normal random variable, X is the observation we are interested in, μ is the mean of the distribution, and σ is the standard deviation of the distribution.

In this case, we can convert the observation of 60 to a standard normal random variable by plugging in the values:

Z = (60 - 50) / 10 = 1

Next, we can use a standard normal distribution table or calculator to find the probability that Z is greater than 1. This probability corresponds to the area under the standard normal distribution curve to the right of Z = 1.

A standard normal table gives the cumulative probability P(Z ≤ 1) ≈ 0.8413, so the probability of Z being greater than 1 is P(Z > 1) = 1 - P(Z ≤ 1) ≈ 1 - 0.8413 = 0.1587.

Therefore, the probability that a randomly selected observation from this dataset will be greater than 60 is approximately 0.1587, or 15.87%.
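
The same calculation can be reproduced without a table. The sketch below assumes Python 3.8 or later, where the standard library provides statistics.NormalDist; the variable names are chosen for this example.

```python
from statistics import NormalDist

mu, sigma, x = 50, 10, 60

# Step 1: standardize the observation.
z = (x - mu) / sigma

# Step 2: the upper-tail probability is 1 minus the standard normal CDF at z.
p_greater = 1 - NormalDist().cdf(z)

# Cross-check: the same probability computed on the original scale.
p_direct = 1 - NormalDist(mu=mu, sigma=sigma).cdf(x)

print(z)                     # 1.0
print(round(p_greater, 4))   # 0.1587
```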

Q7: Explain uniform Distribution with an example.
    
The uniform distribution describes a situation where all possible outcomes of an experiment are equally likely. In its continuous form, it has a constant probability density function over a specified interval [a, b], so intervals of equal length within [a, b] have equal probability.

An example of a uniform distribution could be the height of a basketball player. If we assume that the height of a basketball player is uniformly distributed between 6 feet and 7 feet, then heights are spread evenly over this interval. This means that a height between 6'2" and 6'3" is exactly as likely as a height between 6'8" and 6'9". The probability of any single exact height is 0, because the variable is continuous.

The probability density function of a uniform distribution is given by:

f(x) = 1 / (b - a), for a <= x <= b, and f(x) = 0 otherwise

where a and b are the lower and upper bounds of the distribution, respectively.

For example, if we assume that the height of a basketball player is uniformly distributed between 6 feet and 7 feet, then the probability density function would be:

f(x) = 1 / (7 - 6) = 1

for 6 <= x <= 7, and f(x) = 0 for x < 6 or x > 7.
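
As a short sketch of the basketball example, the density and the probability of a height range can be computed as follows. The helper names uniform_pdf and uniform_prob are invented for this illustration; the probability of an interval is simply its overlap with [a, b] divided by (b - a).

```python
def uniform_pdf(x, a, b):
    """Density of a continuous uniform distribution on [a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def uniform_prob(lo, hi, a, b):
    """P(lo <= X <= hi) for X uniform on [a, b]: overlap length divided by (b - a)."""
    overlap = max(0.0, min(hi, b) - max(lo, a))
    return overlap / (b - a)

a, b = 6.0, 7.0   # heights in feet, as in the example above

print(uniform_pdf(6.5, a, b))          # 1.0
print(uniform_pdf(7.5, a, b))          # 0.0
print(uniform_prob(6.0, 6.25, a, b))   # 0.25
```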

The uniform distribution has several useful properties, including its simplicity and ease of use in modeling situations where all outcomes are equally likely. However, it may not always be an appropriate model for real-world situations, as many phenomena do not exhibit this type of symmetry or uniformity.

Q8: What is the z score? State the importance of the z score.
    
A z-score, also known as a standard score, is a statistical measure that tells us how many standard deviations a particular observation or data point is from the mean of the dataset. It is calculated by subtracting the mean of the dataset from the observation, and then dividing by the standard deviation.

The formula for calculating the z-score is:

z = (x - μ) / σ

where z is the z-score, x is the observed value, μ is the mean of the dataset, and σ is the standard deviation of the dataset.
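
A minimal sketch of the formula applied to a whole dataset is shown below. The data values are made up for illustration, and the population standard deviation (statistics.pstdev) is assumed; with a sample standard deviation the numbers would differ slightly.

```python
from statistics import mean, pstdev

data = [40, 45, 50, 55, 60]   # hypothetical observations

mu = mean(data)        # 50
sigma = pstdev(data)   # population standard deviation, about 7.07

# z-score of each observation: distance from the mean in standard deviations.
z_scores = [(x - mu) / sigma for x in data]

print([round(z, 3) for z in z_scores])
```

After standardizing, the z-scores have mean 0 and standard deviation 1, which is what makes values from different scales comparable.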

The importance of the z-score lies in its ability to standardize data and make it easier to compare observations from different datasets. By converting raw data to z-scores, we can compare observations that may be measured in different units or have different scales of measurement.

Additionally, the z-score is important in hypothesis testing and statistical inference. For example, in hypothesis testing, we can use the z-score to calculate the probability of observing a particular value or more extreme values assuming the null hypothesis is true. This can help us determine whether a particular result is statistically significant or simply due to chance.

Overall, the z-score is a powerful statistical tool that helps us standardize data, compare observations from different datasets, and make statistical inferences.

Q9: What is Central Limit Theorem? State the significance of the Central Limit Theorem.
    
The Central Limit Theorem (CLT) is a fundamental theorem in probability and statistics. It states that, for a population with finite variance, the sampling distribution of the sample mean approaches a normal distribution as the sample size increases, regardless of the shape of the original population distribution. In other words, if we take many independent random samples of size n from a population with mean μ and standard deviation σ and calculate the mean of each sample, the distribution of those sample means will be approximately normal with mean μ and standard deviation σ / sqrt(n), even if the population distribution is not normal.
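
The theorem can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population is taken to be exponential with rate 1 (a strongly skewed distribution with mean 1 and standard deviation 1), and the sample size, number of trials, and seed are arbitrary choices for the example.

```python
import random
from statistics import mean, pstdev

rng = random.Random(0)   # fixed seed so the run is repeatable

def sample_mean(n):
    """Mean of n draws from an exponential population with rate 1 (mean 1, sd 1)."""
    return mean(rng.expovariate(1.0) for _ in range(n))

n, trials = 50, 5000     # hypothetical sample size and number of repeated samples
means = [sample_mean(n) for _ in range(trials)]

center = mean(means)     # should be close to the population mean, 1
spread = pstdev(means)   # should be close to 1 / sqrt(50), roughly 0.14

# For a normal shape, roughly 68% of values lie within one standard deviation of the center.
within_one = sum(abs(m - center) <= spread for m in means) / trials

print(round(center, 3), round(spread, 3), round(within_one, 3))
```

Even though individual exponential draws are far from normal, the sample means cluster symmetrically around 1 with a spread close to σ / sqrt(n).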

The significance of the Central Limit Theorem is that it allows us to make statistical inferences about a population based on a sample of data, which is the core task of inferential statistics. By using the CLT, we can estimate the population mean from a sample and quantify the uncertainty of that estimate, for example with a confidence interval, without knowing the shape of the population distribution.

Another important application of the Central Limit Theorem is in hypothesis testing. In hypothesis testing, we often want to know whether a sample mean is significantly different from a hypothesized population mean. By using the CLT, we can approximate the distribution of the sample means and calculate the probability of observing a sample mean as extreme or more extreme than the one we have observed, assuming that the null hypothesis is true. This allows us to make informed decisions about whether to reject or fail to reject the null hypothesis.

Overall, the Central Limit Theorem is an important concept in statistics that allows us to make accurate inferences about a population based on a sample of data. It is widely used in many fields, including business, engineering, and social sciences, to make data-driven decisions and draw meaningful conclusions from data.

Q10: State the assumptions of the Central Limit Theorem.
    
The Central Limit Theorem (CLT) has a few assumptions that must be met in order for it to hold. These assumptions are:

Random Sampling: The sample must be selected randomly from the population of interest.

Sample Size: The sample size should be sufficiently large. A common rule of thumb is at least 30 observations, but strongly skewed or heavy-tailed populations may need more, while nearly symmetric populations may need fewer.

Independence: The observations in the sample should be independent of each other. This means that the value of one observation should not be influenced by the value of any other observation in the sample.

Finite Variance: The population from which the sample is drawn should have a finite variance. This means that the population standard deviation should not be infinite.

If these assumptions are met, then the Central Limit Theorem holds, and we can use the normal distribution to make statistical inferences about the population mean based on a sample of data. If these assumptions are not met, then we may need to use alternative methods, such as nonparametric tests, to make inferences about the population.