# Q1: What are the Probability Mass Function (PMF) and Probability Density Function (PDF)? Explain with an example.

### Probability Mass Function (PMF):
The Probability Mass Function (PMF) is used for discrete random variables and gives the probability that the variable takes each specific value. These probabilities are non-negative and sum to 1 over all possible outcomes.
### Example of PMF:
Consider rolling a fair six-sided die. The PMF of this scenario would give us the probability of getting each face (1, 2, 3, 4, 5, or 6) after rolling the die. Since each outcome is equally likely, the PMF for this situation would be: PMF(x) = 1/6 for x ∈ {1, 2, 3, 4, 5, 6} and PMF(x) = 0 for other values of x.
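
The die PMF described above can be sketched in a few lines of Python. The function name `die_pmf` and the `sides` parameter are illustrative choices, not part of the original answer; exact fractions are used so the probabilities sum to exactly 1.

```python
from fractions import Fraction

def die_pmf(x, sides=6):
    """PMF of a fair die: 1/sides for integer faces 1..sides, 0 otherwise."""
    if isinstance(x, int) and 1 <= x <= sides:
        return Fraction(1, sides)
    return Fraction(0)

# A valid PMF must sum to 1 over all possible outcomes.
total = sum(die_pmf(face) for face in range(1, 7))

print(die_pmf(3))  # 1/6
print(die_pmf(7))  # 0
print(total)       # 1
```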

### Probability Density Function (PDF):
The Probability Density Function (PDF) is used for continuous random variables and describes the relative likelihood of the variable near each value. The probability that the variable falls within a specific range is the area under the PDF over that range. Unlike a PMF, the PDF does not give the probability at a specific point: for a continuous variable, the probability of any single exact value is 0.
### Example of PDF:
A common example is the standard normal distribution (mean = 0 and standard deviation = 1). The PDF of the standard normal distribution is the bell-shaped curve, also known as the Gaussian curve. It provides the probability density for different values within the range of the distribution. The PDF can be denoted as f(x), and its specific formula is: f(x) = (1 / √(2π)) * e^((-x^2) / 2), where 'e' is the base of the natural logarithm.
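
As a rough illustration, the formula above can be evaluated directly and compared with Python's standard-library `statistics.NormalDist`. The helper name `std_normal_pdf` is an assumption made for this sketch. The second quantity shows that a probability comes from the area under the curve (here obtained through the CDF), not from the density value itself.

```python
import math
from statistics import NormalDist

def std_normal_pdf(x):
    """Density of the standard normal distribution at x."""
    return math.exp(-x**2 / 2) / math.sqrt(2 * math.pi)

Z = NormalDist(mu=0, sigma=1)

peak = std_normal_pdf(0)              # height of the curve at the mean
within_one_sd = Z.cdf(1) - Z.cdf(-1)  # area under the curve on [-1, 1]

print(round(peak, 4))           # 0.3989
print(round(within_one_sd, 4))  # 0.6827
```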

# Q2: What is the Cumulative Distribution Function (CDF)? Explain with an example. Why is the CDF used?

- The Cumulative Distribution Function (CDF) gives the probability that a random variable takes a value less than or equal to a given point: F(x) = P(X ≤ x). For discrete random variables, the CDF is the sum of the probabilities of all values up to and including that point, and for continuous random variables, it is the integral of the probability density function from -∞ up to that point.

### Example of CDF:
Let's consider a fair six-sided die again. The CDF at a specific point 'x' will give us the probability of rolling a value less than or equal to 'x'. For example, the CDF at x = 3 would be 1/2, as there is a 50% chance of rolling a number less than or equal to 3 on the die.
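
A minimal sketch of the die CDF, assuming a hypothetical helper `die_cdf` that counts how many faces are less than or equal to `x`. It also accepts non-integer inputs, because a CDF is defined for every real number.

```python
import math
from fractions import Fraction

def die_cdf(x, sides=6):
    """P(X <= x) for a fair die with faces 1..sides."""
    k = math.floor(x)          # largest integer not exceeding x
    k = max(0, min(sides, k))  # number of faces that are <= x
    return Fraction(k, sides)

print(die_cdf(3))    # 1/2
print(die_cdf(3.7))  # 1/2
print(die_cdf(0))    # 0
print(die_cdf(6))    # 1
```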

### Why is the CDF used?
The CDF is essential because it describes how probability accumulates across the range of a random variable. It is used to compute the probability of an interval, P(a < X ≤ b) = F(b) - F(a), to find percentiles by inverting it, and to make probabilistic statements about the data.
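
To make these uses concrete, here is a small sketch with the standard normal distribution from Python's `statistics` module. The chosen interval and percentile are arbitrary examples, not values from the original answer.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Interval probability from the CDF: P(0 < Z <= 2) = F(2) - F(0)
p_interval = Z.cdf(2) - Z.cdf(0)

# Percentiles come from the inverse CDF (quantile function)
median = Z.inv_cdf(0.5)
p97_5 = Z.inv_cdf(0.975)

print(round(p_interval, 4))  # 0.4772
print(round(p97_5, 2))       # 1.96
```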

# Q3: What are some examples of situations where the normal distribution might be used as a model? Explain how the parameters of the normal distribution relate to the shape of the distribution.


- The normal distribution is commonly used as a model for continuous random variables in various fields due to its widespread occurrence in nature and real-world phenomena. Some examples of situations where the normal distribution might be used as a model include:

- Heights of Individuals: The heights of adult humans tend to follow a roughly normal distribution, with the majority of people clustered around the mean height.

- IQ Scores: Intelligence quotient (IQ) scores are often modeled using a normal distribution, where the mean IQ is typically set to 100 and the standard deviation is 15.

- Measurement Errors: Errors in measurements, such as the length of an object or the time taken to perform a task, often follow a normal distribution.

- Test Scores: Test scores in standardized exams, like SAT or GRE, are often modeled using a normal distribution.
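
As a quick illustration of using the normal distribution as a model, the IQ example above (mean 100, standard deviation 15) can be queried with `statistics.NormalDist`. The cut-offs 130, 85 and 115 are arbitrary values chosen for this sketch.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)

p_above_130 = 1 - iq.cdf(130)           # two standard deviations above the mean
p_85_to_115 = iq.cdf(115) - iq.cdf(85)  # within one standard deviation

print(round(p_above_130, 4))  # 0.0228
print(round(p_85_to_115, 4))  # 0.6827
```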

### Parameters of the normal distribution relate to the shape of the distribution as follows:

- Mean (μ): The mean determines the center of the distribution. It is the value around which the data is symmetrically distributed.
- Standard Deviation (σ): The standard deviation controls the spread or dispersion of the data points around the mean. A larger standard deviation leads to a wider distribution, while a smaller standard deviation results in a narrower distribution.
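- Variance (σ^2): The variance is the square of the standard deviation and represents the average squared distance of data points from the mean. It is not a separate shape parameter: specifying σ^2 is equivalent to specifying σ, so a normal distribution is fully determined by μ together with either σ or σ^2.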
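
The effect of each parameter can be checked numerically. In this sketch the three distributions (`narrow`, `wide`, `shifted`) are hypothetical examples: doubling σ halves the peak height, while changing μ only moves the peak.

```python
from statistics import NormalDist

narrow = NormalDist(mu=0, sigma=1)
wide = NormalDist(mu=0, sigma=2)     # same center, larger spread
shifted = NormalDist(mu=3, sigma=1)  # same spread, different center

# Peak height is 1 / (sigma * sqrt(2 * pi)), reached at x = mu.
print(round(narrow.pdf(0), 4))   # 0.3989
print(round(wide.pdf(0), 4))     # 0.1995
print(round(shifted.pdf(3), 4))  # 0.3989
print(wide.variance)             # 4.0
```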

# Q4: Explain the importance of Normal Distribution. Give a few real-life examples of Normal Distribution. 

- The normal distribution is important in statistics and many scientific fields for several reasons:

- Central Limit Theorem: The sum or average of a large number of independent and identically distributed random variables with finite variance is approximately normally distributed, regardless of the shape of the original distribution. This property is essential for many statistical inference techniques.

- Data Modeling: Many real-world phenomena, such as measurements, errors, IQ scores, and physical attributes, can be effectively modeled by the normal distribution.

- Statistical Analysis: The normal distribution simplifies statistical calculations and makes it easier to derive probabilistic statements and confidence intervals.

### Real-life examples of Normal Distribution:

- The distribution of heights of adult males in a population.
- The distribution of exam scores in a class where the test is well-designed and the scores are roughly symmetrically distributed around the mean.
- The distribution of weights of products manufactured in a factory, assuming the production process is well-controlled and stable.
- The distribution of IQ scores in a large population.

# Q5: What is Bernoulli Distribution? Give an Example. What is the difference between Bernoulli Distribution and Binomial Distribution?

### Bernoulli Distribution:
- The Bernoulli distribution models a single experiment or trial with two possible outcomes: success (coded as 1) and failure (coded as 0). It is used for a discrete random variable representing a binary event, where the probability of success (p) remains constant for each trial.

### Example of Bernoulli Distribution:
- A coin flip experiment is a classic example of a Bernoulli distribution, where the outcome can be either "heads" (success) or "tails" (failure).

### Difference between Bernoulli Distribution and Binomial Distribution:

### Number of Trials:

- Bernoulli Distribution: Represents a single trial with two outcomes (success or failure).
- Binomial Distribution: Represents the number of successes in a fixed number of independent Bernoulli trials.
### Probability of Success:

- Bernoulli Distribution: Has a single parameter, the probability of success (p) on its one trial.
- Binomial Distribution: Has two parameters, the number of trials (n) and the probability of success (p), which is the same for every trial.
### Nature:

- Bernoulli Distribution: Describes the outcome of a single experiment.
- Binomial Distribution: Describes the number of successes in multiple, independent experiments.
### Probability Mass Function (PMF):

- Bernoulli Distribution: PMF is given by P(X = x) = p^x * (1 - p)^(1-x) for x ∈ {0, 1}.
- Binomial Distribution: PMF is given by P(X = k) = C(n, k) * p^k * (1 - p)^(n-k) for k ∈ {0, 1, ..., n}, where C(n, k) is the binomial coefficient.
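
The two PMFs above translate directly into Python. The function names `bernoulli_pmf` and `binomial_pmf` are illustrative; `math.comb` supplies the binomial coefficient C(n, k). With n = 1 the binomial PMF reduces to the Bernoulli PMF.

```python
from math import comb

def bernoulli_pmf(x, p):
    """P(X = x) for a single trial with success probability p."""
    if x not in (0, 1):
        return 0.0
    return p**x * (1 - p)**(1 - x)

def binomial_pmf(k, n, p):
    """P(X = k): probability of k successes in n independent trials."""
    if not 0 <= k <= n:
        return 0.0
    return comb(n, k) * p**k * (1 - p)**(n - k)

p = 0.5  # fair coin, with heads counted as success
print(bernoulli_pmf(1, p))    # 0.5
print(binomial_pmf(2, 3, p))  # 0.375 (exactly two heads in three flips)
```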

# Q6: Consider a dataset with a mean of 50 and a standard deviation of 10. If we assume that the dataset is normally distributed, what is the probability that a randomly selected observation will be greater than 60? Use the appropriate formula and show your calculations.

### Probability of randomly selected observation being greater than 60:

To find the probability that a randomly selected observation will be greater than 60, we can use the standard normal distribution (z-distribution) since the dataset is assumed to be normally distributed. First, we need to standardize the value 60 using the z-score formula, and then we can find the corresponding probability from the standard normal distribution table or using a statistical tool like Python.

The z-score formula is:

z = (X - μ) / σ

where:
- X = Value (in this case, 60)
- μ = Mean of the dataset (given as 50)
- σ = Standard deviation of the dataset (given as 10)

Calculations:

z = (60 - 50) / 10 = 1

Now, we find the probability of z > 1 from the standard normal distribution table or using Python. The table gives P(Z ≤ 1) ≈ 0.8413, so P(Z > 1) = 1 - 0.8413 = 0.1587. The probability that a randomly selected observation will be greater than 60 is therefore approximately 0.1587 or 15.87%.
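
The same calculation can be reproduced in Python with the standard-library `statistics.NormalDist`. This is one possible way to do it, shown both through the z-score and directly on the original scale; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma, x = 50, 10, 60

# Route 1: standardize, then use the standard normal CDF.
z = (x - mu) / sigma
p_via_z = 1 - NormalDist().cdf(z)

# Route 2: use the N(50, 10) distribution directly.
p_direct = 1 - NormalDist(mu, sigma).cdf(x)

print(z)                   # 1.0
print(round(p_via_z, 4))   # 0.1587
print(round(p_direct, 4))  # 0.1587
```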




# Q7: Explain uniform Distribution with an example.

### Uniform Distribution:

A uniform distribution is a probability distribution in which all possible outcomes are equally likely. It can be discrete, where each of a finite set of values has the same probability, or continuous, where the probability density is constant over a defined interval [a, b] and equal to 1 / (b - a).

### Example of Uniform Distribution:
Consider rolling a fair six-sided die. Each outcome (1, 2, 3, 4, 5, or 6) has an equal probability of 1/6. This is an example of a discrete uniform distribution. For a continuous uniform distribution, we can consider a random variable X representing the time it takes to complete a task, where X can take any value between 0 and 10 hours with an equal probability density of 1/10.
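
For the continuous case, a short sketch of the task-time example on the interval [0, 10] hours. The helper names `uniform_pdf` and `uniform_cdf` and the interval from 2 to 5 hours are assumptions made for illustration.

```python
def uniform_pdf(x, a=0.0, b=10.0):
    """Constant density 1 / (b - a) on [a, b], 0 elsewhere."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def uniform_cdf(x, a=0.0, b=10.0):
    """P(X <= x) for a continuous uniform variable on [a, b]."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

# Probability that the task takes between 2 and 5 hours
p_2_to_5 = uniform_cdf(5) - uniform_cdf(2)

print(uniform_pdf(3))      # 0.1
print(round(p_2_to_5, 4))  # 0.3
```

The probability of an interval depends only on its length: any 3-hour window inside [0, 10] has probability 3/10.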

# Q8: What is the z score? State the importance of the z score.

### Z-score and its Importance:

The Z-score (also known as the standard score) is a statistical measure that quantifies the number of standard deviations a data point is from the mean of a dataset. It is calculated using the formula:

Z = (X - μ) / σ

where:
- X = Data point
- μ = Mean of the dataset
- σ = Standard deviation of the dataset

### Importance of Z-score:
1. Standardization: Z-scores standardize the data, allowing for easier comparison and analysis of different datasets with varying units and scales.

2. Outlier Detection: Z-scores help identify outliers in a dataset. Data points whose absolute Z-score exceeds a chosen threshold (e.g., |Z| > 2 or |Z| > 3) are considered potential outliers.

3. Probability Calculation: Z-scores are used to find probabilities in a standard normal distribution table, allowing for easy calculation of probabilities for any given value in a normal distribution.
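
A minimal sketch of standardization and outlier flagging, using a small made-up dataset (the values in `data` are hypothetical) and the population standard deviation from the `statistics` module.

```python
from statistics import mean, pstdev

data = [48, 50, 52, 49, 51, 50, 47, 53, 50, 90]  # hypothetical values

mu = mean(data)       # 54
sigma = pstdev(data)  # population standard deviation

z_scores = [(x - mu) / sigma for x in data]

# Flag points more than 2 standard deviations from the mean.
outliers = [x for x, z in zip(data, z_scores) if abs(z) > 2]

print(outliers)  # [90]
```

In this dataset the value 90 has a Z-score of about 2.97, so it is flagged with the |Z| > 2 threshold but would not be with |Z| > 3. An extreme value inflates the standard deviation it is measured against, which is one reason the threshold is a judgment call.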

# Q9: What is Central Limit Theorem? State the significance of the Central Limit Theorem.

### Central Limit Theorem (CLT):

The Central Limit Theorem states that the sampling distribution of the sample mean approaches a normal distribution as the sample size increases, regardless of the shape of the population distribution, provided the observations are independent and the population has a finite variance. The theorem is a fundamental concept in statistics and has several significant implications.
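
A small simulation can make the theorem tangible. The sketch below assumes an Exponential(1) population (strongly right-skewed, with mean 1 and standard deviation 1) and draws repeated samples of size n; the population, the seed and the number of trials are arbitrary choices for illustration. As n grows, the sample means concentrate around 1 with a spread close to 1/√n.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable

def sample_means(n, trials=2000):
    """Means of `trials` independent samples of size n from Exponential(1)."""
    return [
        sum(random.expovariate(1.0) for _ in range(n)) / n
        for _ in range(trials)
    ]

results = {}
for n in (2, 10, 50):
    means = sample_means(n)
    results[n] = (mean(means), stdev(means))
    # Columns: n, average of the sample means, their spread, theoretical 1/sqrt(n)
    print(n, round(results[n][0], 3), round(results[n][1], 3), round(n ** -0.5, 3))
```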

### Significance of the Central Limit Theorem:

1. Large Sample Assumption: CLT allows statisticians to use the normal distribution to make inferences about population parameters based on sample statistics, even when the population distribution is unknown or not normally distributed.

2. Sampling Variability: It helps us understand the distribution of sample means and highlights that sample means from different samples may vary but tend to cluster around the population mean.

3. Hypothesis Testing and Confidence Intervals: CLT enables the use of parametric tests (e.g., t-tests, ANOVA) and the construction of confidence intervals for population parameters based on sample statistics.


# Q10: State the assumptions of the Central Limit Theorem.

### Assumptions of the Central Limit Theorem:

1. Random Sampling: The samples must be obtained through a random sampling process, meaning each element in the population has an equal chance of being selected.

2. Independence: The observations within each sample and between samples must be independent of each other.

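3. Sample Size: The sample size should be sufficiently large. While there is no strict threshold for what constitutes "sufficiently large," a commonly accepted guideline is that the sample size should be at least 30. Strongly skewed or heavy-tailed populations may need larger samples before the normal approximation is accurate.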

4. Finite Variance: The population from which the samples are drawn should have a finite variance. If the population variance is infinite (for example, a Cauchy distribution), the classical CLT does not apply.