## Q1

1. Probability Mass Function (PMF)

    1. The PMF is used for discrete random variables, which can take on a countable set of distinct values.
    2. It assigns a probability to each possible value of the random variable.
    3. The PMF must satisfy two properties: each probability must be non-negative, and the sum of all probabilities must be equal to 1.
    
2. Probability Density Function (PDF)

    1. The PDF is used for continuous random variables, which can take on an uncountably infinite set of values within a range.
    2. The area under the PDF curve over a specific interval represents the probability of the random variable falling within that interval.
    3. Like the PMF, the PDF must also satisfy two properties: it must be non-negative, and the total area under the curve must equal 1.
    

Example

1. Discrete Random Variable (PMF)

    1. Suppose you flip a fair coin three times, and you are interested in the number of heads (H) that occur. H can take the values 0, 1, 2, or 3.
    2. The PMF for this scenario is as follows:
        P(H = 0) = 1/8
        P(H = 1) = 3/8
        P(H = 2) = 3/8
        P(H = 3) = 1/8
    3. Each probability represents the likelihood of getting a specific number of heads in three coin flips.    


2. Continuous Random Variable (PDF)

    1. Suppose you measure the height of individuals in a population. Height is a continuous random variable, and it can take any value within a certain range (e.g., 150 cm to 200 cm).
    2. The PDF for height might be represented by a bell-shaped curve, like the normal distribution. This curve describes the probability density of finding individuals with different heights.
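Both examples above can be checked numerically. The following is a minimal sketch using only Python's standard library. The height parameters (mean 170 cm, standard deviation 10 cm) are illustrative assumptions, not values taken from the text.

```python
from math import comb
from statistics import NormalDist

# PMF: number of heads in three flips of a fair coin (binomial with n=3, p=0.5).
n, p = 3, 0.5
pmf = {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}
for k, prob in pmf.items():
    print(f"P(H = {k}) = {prob}")
print("Sum of all probabilities:", sum(pmf.values()))

# PDF: heights modelled as a normal distribution.
# mu and sigma are assumed values chosen only for illustration.
heights = NormalDist(mu=170, sigma=10)

# The density at a single point is not itself a probability.
print(f"Density at 170 cm: {heights.pdf(170):.4f}")

# The probability of an interval is the area under the curve over it.
p_160_180 = heights.cdf(180) - heights.cdf(160)
print(f"P(160 <= height <= 180) = {p_160_180:.4f}")
```

The PMF values can be read directly as probabilities, whereas the PDF only yields a probability once it is integrated over an interval, which is what the difference of two CDF values does here.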

## Q2

The Cumulative Distribution Function (CDF) is a fundamental concept in probability and statistics. It describes the probability distribution of a random variable by giving, for each value x, the probability that the random variable takes on a value less than or equal to x: F(x) = P(X ≤ x). In other words, the CDF accumulates the probabilities as you move along the possible values of the random variable, hence the name "cumulative."

Example: 
    Suppose you have a fair six-sided die (with faces numbered 1 to 6), and you are interested in the CDF for the outcome of a single roll. The CDF would look like this:

    F(1) = P(X ≤ 1) = 1/6 (since there's a 1/6 chance of rolling a 1 or less)
    F(2) = P(X ≤ 2) = 2/6 (a 2/6 chance of rolling a 2 or less)
    F(3) = P(X ≤ 3) = 3/6 (a 3/6 chance of rolling a 3 or less)
    F(4) = P(X ≤ 4) = 4/6 (a 4/6 chance of rolling a 4 or less)
    F(5) = P(X ≤ 5) = 5/6 (a 5/6 chance of rolling a 5 or less)
    F(6) = P(X ≤ 6) = 1 (a 100% chance of rolling a 6 or less)
    This cumulative probability distribution provides a clear picture of the likelihood of obtaining different outcomes when rolling the die. For example, if you want to know the probability of rolling a 3 or less, you can simply look at F(3), which is 3/6 or 50%.
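The table above is just a running sum of the die's PMF. A minimal sketch, assuming exact fractions are preferred over floats (so the values print in lowest terms, e.g. 2/6 appears as 1/3):

```python
from fractions import Fraction
from itertools import accumulate

faces = range(1, 7)
pmf = [Fraction(1, 6)] * 6  # each face of a fair die is equally likely

# The CDF at x is the sum of the PMF over all faces <= x.
cdf = dict(zip(faces, accumulate(pmf)))

for x, prob in cdf.items():
    print(f"F({x}) = P(X <= {x}) = {prob}")

# Interval probabilities follow from differences of CDF values.
p_3_to_5 = cdf[5] - cdf[2]  # P(2 < X <= 5), i.e. rolling a 3, 4, or 5
print("P(3 <= X <= 5) =", p_3_to_5)
```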
    
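The CDF is useful for several reasons:

    1. Visual representation: plotting F(x) shows at a glance how probability accumulates across the possible values.
    2. Cumulative information: interval probabilities follow directly from it, since P(a < X ≤ b) = F(b) - F(a).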



## Q3

The normal distribution, also known as the Gaussian distribution or the bell curve, is widely used to model various phenomena in science, engineering, and statistics. It is characterized by a symmetric, bell-shaped curve. Here are some examples of situations where the normal distribution might be used as a model:

    1. Height of Individuals.
    2. Test Scores.
    3. Stock Returns.
    
Parameters of the Normal Distribution and Their Relationship to the Shape:

The normal distribution is characterized by two parameters: the mean (μ) and the standard deviation (σ). These parameters play a crucial role in shaping the distribution:

    1. Mean: The mean represents the central location of the distribution. It is the point around which the data cluster, and it corresponds to the peak of the bell curve. Shifting the mean to the left or right will shift the entire distribution accordingly. A larger mean shifts the distribution to the right, and a smaller mean shifts it to the left.
    
    2. Standard Deviation: The standard deviation measures the spread or variability of the data. A larger standard deviation results in a wider, flatter bell curve, indicating greater variability in the data. Conversely, a smaller standard deviation results in a narrower, taller bell curve, indicating less variability.
    
    
The empirical rule (68-95-99.7 rule) illustrates the relationship between the standard deviation and the percentage of data within certain ranges in a normal distribution:

    1. Approximately 68% of the data falls within one standard deviation of the mean (μ ± σ).
    2. Approximately 95% of the data falls within two standard deviations of the mean (μ ± 2σ).
    3. Approximately 99.7% of the data falls within three standard deviations of the mean (μ ± 3σ).
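The percentages in the empirical rule, and the effect of σ on the height of the curve, can be reproduced with a short sketch. It uses `statistics.NormalDist` from the standard library. The mean of 50 and the three σ values are arbitrary choices for illustration.

```python
from statistics import NormalDist

# Empirical rule: probability within k standard deviations of the mean.
std_normal = NormalDist(mu=0, sigma=1)
coverage = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}
for k, prob in coverage.items():
    print(f"Within {k} standard deviation(s) of the mean: {prob:.4f}")

# Effect of sigma on the shape: same mean, different spreads.
# A larger sigma gives a lower peak, i.e. a wider and flatter curve.
peaks = {sigma: NormalDist(mu=50, sigma=sigma).pdf(50) for sigma in (5, 10, 20)}
for sigma, peak in peaks.items():
    print(f"sigma = {sigma:>2}: density at the mean = {peak:.4f}")
```

Because the total area under the curve is always 1, doubling σ halves the peak height.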

    

## Q4

The Normal Distribution, also known as the Gaussian distribution or the bell curve, holds significant importance in various fields of science, engineering, and statistics, chiefly for the following reason:

1. Central Limit Theorem: The central limit theorem states that the sampling distribution of the sample mean is approximately normal when the sample size is large enough, provided the observations are independent and the population has a finite variance. This holds whether the population itself is normal, Poisson, uniform, or follows another such distribution, and the approximation improves as the sample size grows.


Examples of Real-Life Situations Modeled by the Normal Distribution:

    1. Height of Individuals.
    2. IQ Scores.
    3. Stock Returns.

## Q5

The Bernoulli Distribution is a discrete probability distribution that models a random experiment with two possible outcomes: "success" with probability p and "failure" with probability q, where q = 1 - p. It is named after Swiss mathematician Jacob Bernoulli.

Example:

Consider a single toss of a biased coin, where "heads" (H) is considered a success, and "tails" (T) is considered a failure. If the probability of getting a "heads" is p = 0.7, then the Bernoulli Distribution for this experiment would be:

1. P(X = 1) = p = 0.7 (probability of getting a "heads")
2. P(X = 0) = q = 1 - p = 0.3 (probability of getting a "tails")

In this example, X represents the outcome of a single coin toss, and it follows a Bernoulli Distribution.

Differences between the Bernoulli and Binomial Distributions:

1. Number of Trials:

    1. Bernoulli Distribution: Describes a single trial or experiment with two possible outcomes (success or failure).
    2. Binomial Distribution: Describes the number of successes (k) in a fixed number of independent, identical Bernoulli trials (n).
    
2. Random Variable:

    1. Bernoulli Distribution: Models the outcome of a single trial and represents a single random variable (X).
    2. Binomial Distribution: Models the number of successes in multiple trials, and the random variable (X) can take on values from 0 to n.
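The relationship between the two distributions can be sketched in a few lines. The helper names `bernoulli_pmf` and `binomial_pmf` are hypothetical, written here for illustration. The value p = 0.7 comes from the coin example, while n = 10 and the random seed are assumptions.

```python
import random
from math import comb

def bernoulli_pmf(x, p):
    """P(X = x) for a single trial with success probability p."""
    if x not in (0, 1):
        raise ValueError("a Bernoulli outcome must be 0 or 1")
    return p if x == 1 else 1 - p

def binomial_pmf(k, n, p):
    """P(X = k): probability of k successes in n independent Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p = 0.7
print(f"Bernoulli: P(X=1) = {bernoulli_pmf(1, p):.1f}, P(X=0) = {bernoulli_pmf(0, p):.1f}")

# A binomial distribution with n = 1 is exactly a Bernoulli distribution.
print(f"Binomial with n=1: P(X=1) = {binomial_pmf(1, 1, p):.1f}")

# A binomial outcome is the sum of n independent Bernoulli outcomes.
n = 10
rng = random.Random(0)  # fixed seed so the run is repeatable
trials = [1 if rng.random() < p else 0 for _ in range(n)]
successes = sum(trials)
print(f"{n} Bernoulli trials: {trials} -> {successes} successes")
```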

## Q6

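    import scipy.stats as stat

    # Parameters of the normal distribution and the value of interest
    mean = 50
    std_dev = 10
    value = 60

    # Standardize the value, then take the upper-tail probability P(Z > z)
    z_score = (value - mean) / std_dev
    probability = 1 - stat.norm.cdf(z_score)
    print(f"The Probability X > {value} is approximately {probability:.4f}")

Output:

    The Probability X > 60 is approximately 0.1587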


## Q7

The Uniform Distribution, also known as the rectangular distribution, is a probability distribution in which all values within a specific range are equally likely to occur. In the continuous case its probability density is constant, f(x) = 1/(b - a) for a <= x <= b, so intervals of equal length inside the range have equal probability. The uniform distribution is characterized by two parameters: the minimum value (a) and the maximum value (b) of the range.

Example:

Consider a simple example of a continuous uniform distribution: the number of candies sold daily at a shop (treated here as a continuous quantity) is uniformly distributed with a minimum of 10 and a maximum of 40. What is the probability that daily sales fall between 15 and 30?

a = minimum value (10)

b = maximum value (40)

    Pr(15 <= x <= 30) = (30 - 15) * 1/(b - a)
                      = (30 - 15) * 1/30
                      = 0.5

That means 50% of the probability lies in the interval 15 <= x <= 30.
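The same result follows from the CDF of a continuous uniform distribution, F(x) = (x - a)/(b - a) for a <= x <= b. A minimal sketch, where `uniform_cdf` is a hypothetical helper written for this example:

```python
def uniform_cdf(x, a, b):
    """CDF of a continuous uniform distribution on [a, b]."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

a, b = 10, 40  # minimum and maximum daily sales from the example

# P(15 <= X <= 30) = F(30) - F(15)
prob = uniform_cdf(30, a, b) - uniform_cdf(15, a, b)
print(f"P(15 <= X <= 30) = {prob:.2f}")
```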

## Q8

The z-score, also known as the standard score, is a statistical measure that quantifies the number of standard deviations a data point is away from the mean (average).

Formula for the z-score:

    z = (x - mean) / std_dev
    
Key points about the z-score and its importance:

1. Standardization: The primary purpose of the z-score is to standardize data, transforming it into a common scale with a mean of 0 and a standard deviation of 1. This transformation makes it easier to compare and analyze data points from different datasets or variables.

2. Outlier Detection: Z-scores are commonly used to identify outliers in a dataset. Outliers are data points that are significantly different from the rest of the data. Typically, data points whose z-score has an absolute value above a chosen threshold (e.g., 2 or 3) are considered outliers.
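Both uses can be illustrated on a small made-up dataset. The values below and the threshold of 2 are assumptions chosen for the example. The population standard deviation (`pstdev`) is used so that the standardized data have a standard deviation of exactly 1.

```python
from statistics import mean, pstdev

# Hypothetical measurements; the last value is deliberately extreme.
data = [48, 52, 50, 47, 53, 49, 51, 50, 90]

mu = mean(data)
sigma = pstdev(data)

# Standardization: z = (x - mean) / std_dev for every data point.
z_scores = [(x - mu) / sigma for x in data]

# Outlier detection: flag points whose |z| exceeds the chosen threshold.
THRESHOLD = 2
outliers = [x for x, z in zip(data, z_scores) if abs(z) > THRESHOLD]

for x, z in zip(data, z_scores):
    print(f"x = {x:>2}  z = {z:+.2f}")
print("Outliers:", outliers)
```

In a sample this small a single extreme value inflates the standard deviation, so no point can reach a z-score of 3. This is one reason the threshold is a judgment call rather than a fixed rule.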

## Q9

The Central Limit Theorem (CLT) is a fundamental concept in statistics that describes the behavior of the sampling distribution of the sample mean (and, equivalently, of the sample sum) drawn from a population, largely regardless of the shape of the population distribution.

The central limit theorem states that the sampling distribution of the sample mean is approximately normal when the sample size is large enough, provided the observations are independent and the population has a finite variance. This holds whether the population itself is normal, Poisson, uniform, or follows another such distribution, and the approximation improves as the sample size grows.
 
Significance of the Central Limit Theorem: 

1. Normality Assumption: The CLT allows statisticians to make inferences about population parameters, such as the population mean, even when the population distribution is not known or is not normally distributed. It forms the basis for many statistical techniques, such as hypothesis testing and confidence interval estimation.

2. Sample Size Considerations: The CLT emphasizes the importance of sample size. As the sample size (n) increases, the sampling distribution of the sample mean becomes closer to normal and its standard error (σ/√n) decreases. This means that larger samples provide more precise estimates of the population mean.
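A small simulation makes both points concrete. This is a sketch under stated assumptions: the population is Exponential(1), a strongly skewed distribution with mean 1 and standard deviation 1, and the sample sizes, number of replicates, and seed are arbitrary choices.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(42)  # fixed seed so the run is repeatable
REPLICATES = 2000        # number of samples drawn for each sample size

def sample_mean(n):
    """Mean of n independent draws from an Exponential(1) population."""
    return mean([rng.expovariate(1.0) for _ in range(n)])

results = {}
for n in (5, 40):
    means = [sample_mean(n) for _ in range(REPLICATES)]
    se_theory = 1 / sqrt(n)  # sigma / sqrt(n) with sigma = 1
    # Share of sample means within one standard error of the true mean;
    # for a normal sampling distribution this is about 0.68.
    within_1se = sum(abs(m - 1) <= se_theory for m in means) / REPLICATES
    results[n] = (mean(means), stdev(means), within_1se)
    print(
        f"n = {n:>2}: mean of sample means = {results[n][0]:.3f}, "
        f"observed SE = {results[n][1]:.3f}, theoretical SE = {se_theory:.3f}, "
        f"share within 1 SE = {within_1se:.3f}"
    )
```

The observed spread of the sample means should track σ/√n, shrinking as n grows, even though the underlying population is far from normal.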

## Q10

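1. The sample size should be large enough (a common rule of thumb is n >= 30).
2. The data must be sampled at random, and the observations must be independent of one another.
3. The population must have a finite mean and a finite standard deviation; their values (known or estimated) are needed to state the mean (μ) and standard error (σ/√n) of the sampling distribution.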