Q1. What is the Probability density function?



Ans:
    In statistics, the Probability Density Function (PDF) is a function that describes the relative likelihood of a continuous
    random variable taking values near a given point. Because a continuous variable takes any single exact value with
    probability zero, the PDF gives a density rather than a probability; probabilities come from the area under the curve.

Mathematically, the PDF is typically denoted as f(x), where "x" is the variable of interest. For any given value of "x,"
the PDF returns the probability density at that point. It represents the rate of change of the cumulative distribution function (CDF) at that point.

The PDF must satisfy two properties:

The integral of the PDF over its entire range must equal 1.
∫ f(x) dx = 1

The PDF must be non-negative for all values of "x."
f(x) ≥ 0 for all x

The PDF can be used to calculate probabilities within a continuous distribution.
The probability of an event occurring within a specific range of values is obtained by integrating the PDF over that range.
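
As a rough illustration of the integration step, the sketch below approximates P(-1 ≤ X ≤ 1) for a standard normal variable by applying the trapezoidal rule to its PDF, then compares the result with the closed form built from `math.erf`. The function names, the interval, and the step count are assumptions chosen for this example.

```python
import math

def normal_pdf(x, mean=0.0, std_dev=1.0):
    """Closed-form PDF of a normal distribution."""
    z = (x - mean) / std_dev
    return math.exp(-0.5 * z * z) / (std_dev * math.sqrt(2.0 * math.pi))

def integrate_pdf(pdf, a, b, steps=10_000):
    """Approximate the integral of pdf over [a, b] with the trapezoidal rule."""
    width = (b - a) / steps
    total = 0.5 * (pdf(a) + pdf(b))
    for i in range(1, steps):
        total += pdf(a + i * width)
    return total * width

def normal_cdf(x, mean=0.0, std_dev=1.0):
    """CDF of a normal distribution expressed through the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std_dev * math.sqrt(2.0))))

prob_numeric = integrate_pdf(normal_pdf, -1.0, 1.0)
prob_exact = normal_cdf(1.0) - normal_cdf(-1.0)
print(f"Numerical integral: {prob_numeric:.6f}")
print(f"CDF difference:     {prob_exact:.6f}")
```

Both values should agree to several decimal places, at roughly 0.68.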

It's important to note that the PDF is applicable to continuous random variables,
where the variable can take on any value within a specified range. 
For discrete random variables, the probability mass function (PMF) is used instead.





Q2. What are the types of Probability distribution?

Ans:
    
      There are several types of probability distributions, each with its own characteristics and areas of application.
        Here are some of the commonly encountered probability distributions:

Uniform Distribution: The uniform distribution assigns equal probability to all outcomes within a specified range.
It is characterized by a constant PDF over the range of interest.

Normal Distribution: Also known as the Gaussian distribution, the normal distribution is one of the most important probability distributions.
It is symmetric, bell-shaped, and defined by its mean and standard deviation. Many natural phenomena tend to follow a normal distribution.

Binomial Distribution: The binomial distribution describes the number of successes in a fixed number of independent Bernoulli trials,
where each trial has two possible outcomes (usually referred to as success and failure). It is characterized by two parameters: 
    the number of trials and the probability of success in each trial.

Poisson Distribution: The Poisson distribution models the number of events occurring within a fixed interval of time or space,
given a known average rate of occurrence. It is often used to model rare events.

Exponential Distribution: The exponential distribution describes the time between events in a Poisson process,
where events occur continuously and independently at a constant average rate. It is commonly used to model waiting times and lifetimes.

Gamma Distribution: The gamma distribution is a family of continuous probability distributions that generalizes the exponential distribution.
It is often used to model waiting times, reliability, and skewed distributions.

Chi-Square Distribution: The chi-square distribution is commonly used in hypothesis testing and confidence interval estimation.
It arises in various statistical tests, such as the chi-square test for independence and the chi-square test of goodness of fit.

Student's t-Distribution: The t-distribution is used for inference and hypothesis testing when the population standard deviation
is unknown and must be estimated from the sample, which matters most when the sample size is small. It is similar in shape to the normal distribution but has heavier tails.

These are just a few examples of probability distributions. There are many more distributions, each with its own characteristics
and applications, depending on the specific problem or data at hand.
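
To make a few of these distributions concrete, the sketch below draws samples from four of the continuous ones using only Python's standard `random` module and compares each sample mean with the theoretical mean. The parameter values, the seed, and the number of draws are arbitrary assumptions for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so repeated runs give the same draws
N = 20_000      # number of draws per distribution (arbitrary choice)

samples = {
    "uniform(0, 1)": [random.uniform(0.0, 1.0) for _ in range(N)],
    "normal(0, 1)": [random.gauss(0.0, 1.0) for _ in range(N)],
    "exponential(rate=2)": [random.expovariate(2.0) for _ in range(N)],
    "gamma(shape=2, scale=1.5)": [random.gammavariate(2.0, 1.5) for _ in range(N)],
}

# Theoretical means: 0.5, 0, 1 / rate = 0.5, shape * scale = 3.0
sample_means = {name: statistics.fmean(draws) for name, draws in samples.items()}
for name, mean in sample_means.items():
    print(f"{name:28s} sample mean = {mean:.3f}")
```

Each sample mean should land close to its theoretical value, with the gap shrinking as N grows.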






Q3. Write a Python function to calculate the probability density function of a normal distribution with
given mean and standard deviation at a given point.



Ans:
    
       Use the scipy.stats module in Python to calculate the probability density function (PDF) of a normal distribution.
        Here's an example function that takes the mean, standard deviation, and a point, and returns the PDF at that point:


from scipy.stats import norm

def calculate_normal_pdf(mean, std_dev, point):
    """
    Calculates the probability density function (PDF) of a normal distribution at a given point.
    
    Args:
        mean (float): The mean of the normal distribution.
        std_dev (float): The standard deviation of the normal distribution.
        point (float): The point at which to evaluate the PDF.
    
    Returns:
        float: The PDF of the normal distribution at the given point.
    """
    pdf = norm.pdf(point, mean, std_dev)
    return pdf


You can use this function by providing the mean, standard deviation, and the point at which you want to calculate the PDF. 
Here's an example usage:


mean = 0
std_dev = 1
point = 1.5

pdf = calculate_normal_pdf(mean, std_dev, point)
print(f"The PDF at point {point} is: {pdf}")

This will output the PDF of the normal distribution with mean 0 and standard deviation 1 at the point 1.5.
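
As a cross-check that does not depend on SciPy, the same density can be evaluated directly from the normal PDF formula, f(x) = exp(-(x - μ)^2 / (2σ^2)) / (σ√(2π)). The function name below is hypothetical, and the input check is an added assumption about how invalid standard deviations should be handled.

```python
import math

def normal_pdf_closed_form(mean, std_dev, point):
    """Evaluate the normal PDF directly from its formula."""
    if std_dev <= 0:
        raise ValueError("std_dev must be positive")
    z = (point - mean) / std_dev
    return math.exp(-0.5 * z ** 2) / (std_dev * math.sqrt(2 * math.pi))

density = normal_pdf_closed_form(0, 1, 1.5)
print(f"Closed-form PDF at 1.5: {density:.6f}")
```

For mean 0, standard deviation 1, and point 1.5, this gives approximately 0.1295.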






Q4. What are the properties of Binomial distribution? Give two examples of events where binomial
distribution can be applied


Ans:  
       
        The binomial distribution is a discrete probability distribution that models the number of successes in a fixed number
        of independent Bernoulli trials, where each trial has the same probability of success. Here are some properties of the binomial distribution:

1. Fixed Number of Trials: The binomial distribution represents the number of successes in a fixed number of trials. 
The number of trials is denoted by "n."

2. Independent Trials: Each trial is assumed to be independent, meaning the outcome of one trial does not affect the outcome of another.

3. Constant Probability: The probability of success (denoted by "p") remains constant for each trial. The probability of failure
(denoted by "q") is equal to 1 - p.

4. Discrete Distribution: The binomial distribution is a discrete distribution, meaning it assigns probabilities to a
discrete set of possible outcomes (the integers 0, 1, ..., n).

5. Probability Mass Function (PMF): The probability mass function of the binomial distribution gives the probability
of observing a specific number of successes (k) in "n" trials. It can be calculated using the formula:

   P(X = k) = C(n, k) * p^k * q^(n-k)

   where C(n, k) represents the number of combinations of "n" items taken "k" at a time.

Two examples where the binomial distribution can be applied are:

1. Coin Flipping: Consider flipping a fair coin 10 times and counting the number of heads. 
Here, each flip of the coin is an independent Bernoulli trial with a probability of success (getting a head) equal to 0.5.
The binomial distribution can be used to calculate the probability of getting a certain number of heads (k) out of the 10 flips.

2. Quality Control: Suppose a factory produces a large number of items, and each item has a certain probability of being defective.
We can use the binomial distribution to calculate the probability of obtaining a specific number of defective items in a sample of fixed size.
For example, we could determine the probability of finding exactly 2 defective items in a random sample of 20 items produced by the factory.
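
The PMF formula and the two examples above can be sketched with the standard library alone. The function name is hypothetical, and the defect probability of 0.05 is an assumed value, since the text does not specify one.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) = C(n, k) * p^k * (1 - p)^(n - k)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# Coin flipping: exactly 5 heads in 10 fair flips
prob_five_heads = binomial_pmf(5, 10, 0.5)

# Quality control: exactly 2 defects in 20 items,
# with an ASSUMED defect probability of 0.05 per item
prob_two_defects = binomial_pmf(2, 20, 0.05)

# The PMF over k = 0..n should sum to 1
total = sum(binomial_pmf(k, 10, 0.5) for k in range(11))

print(f"P(5 heads in 10 flips)   = {prob_five_heads:.4f}")
print(f"P(2 defects in 20 items) = {prob_two_defects:.4f}")
print(f"Sum of PMF over k        = {total:.4f}")
```

With these inputs, the coin example gives 252/1024 ≈ 0.2461 and the quality-control example gives roughly 0.189.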







Q5. Generate a random sample of size 1000 from a binomial distribution with probability of success 0.4
and plot a histogram of the results using matplotlib.


Ans:
    
    
        Use the `numpy` and `matplotlib.pyplot` libraries in Python to generate a random sample from a binomial distribution 
        and plot a histogram of the results. 
        Here's an example code snippet that demonstrates this:

        
import numpy as np
import matplotlib.pyplot as plt

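# Parameters for the binomial distribution
n = 1000  # Number of trials per draw (not given in the question; an assumed value)
p = 0.4  # Probability of success
sample_size = 1000  # Number of draws in the random sample

# Generate random sample from the binomial distribution
sample = np.random.binomial(n, p, size=sample_size)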

# Plotting the histogram
plt.hist(sample, bins=20, edgecolor='black')
plt.xlabel('Number of Successes')
plt.ylabel('Frequency')
plt.title('Histogram of Binomial Distribution')
plt.show()


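In this example, we generate a random sample of size 1000 from a binomial distribution with a probability of success of 0.4.
Because the question does not fix the number of trials, the code sets n = 1000 trials per draw, so each sample value is a count of successes between 0 and 1000.
The `np.random.binomial` function from NumPy is used to generate the sample, and the resulting values are stored in the `sample` variable.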

We then use `plt.hist` from Matplotlib to create a histogram of the sample. The `bins` parameter determines
the number of bins in the histogram, and `edgecolor='black'` adds a black border to the bars for better visibility. 
Finally, we add labels and a title to the plot using the `plt.xlabel`, `plt.ylabel`, and `plt.title` functions.

When you run this code, it will display a histogram showing the distribution of the random sample from the binomial distribution.







Q6. Write a Python function to calculate the cumulative distribution function of a Poisson distribution
with given mean at a given point.



Ans:   
    
    
        The following Python function calculates the cumulative distribution function (CDF) of a Poisson distribution at a given point, given the mean:


import math

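def poisson_cdf(mean, k):
    """Return P(X <= k) for a Poisson distribution; k must be a non-negative integer."""
    cdf = 0.0
    # Sum the PMF e^(-mean) * mean^i / i! for i = 0, 1, ..., k
    for i in range(k + 1):
        cdf += (math.exp(-mean) * (mean ** i)) / math.factorial(i)
    return cdf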


The poisson_cdf function takes two parameters: mean, which represents the mean of the Poisson distribution,
and k, which is the point at which you want to calculate the CDF.

The function iterates from 0 to k (inclusive) and calculates the probability mass function (PMF) for each value
using the formula (math.exp(-mean) * (mean ** i)) / math.factorial(i). It then accumulates the PMF values to calculate the CDF.

Here's an example of how you can use the function:


mean = 2.5
point = 4
cdf = poisson_cdf(mean, point)
print(f"The cumulative distribution function at {point} is {cdf:.4f}")

Output:


The cumulative distribution function at 4 is 0.8912

This example calculates the CDF of a Poisson distribution with a mean of 2.5 at the point 4, and prints the result rounded to four decimal places.
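
For larger values of k, computing `mean ** i` and `math.factorial(i)` separately produces very large intermediate numbers. One possible alternative, sketched below under a hypothetical function name, builds each PMF term from the previous one using PMF(i) = PMF(i - 1) * mean / i.

```python
import math

def poisson_cdf_iterative(mean, k):
    """P(X <= k) for a Poisson distribution, built term by term."""
    if k < 0:
        return 0.0
    term = math.exp(-mean)  # PMF at i = 0
    cdf = term
    for i in range(1, k + 1):
        term *= mean / i    # PMF(i) = PMF(i - 1) * mean / i
        cdf += term
    return cdf

cdf_value = poisson_cdf_iterative(2.5, 4)
print(f"P(X <= 4) for mean 2.5: {cdf_value:.4f}")
```

For a mean of 2.5 and k = 4, this variant returns approximately 0.8912.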






Q7. How is the Binomial distribution different from the Poisson distribution?



Ans:
    
      The Binomial distribution and the Poisson distribution are both probability distributions used to model discrete random variables.
        While they share some similarities, they differ in terms of the underlying assumptions and the types of events they model.

    Definition:

Binomial Distribution: The binomial distribution describes the number of successes in a fixed number of independent Bernoulli trials,
where each trial has two possible outcomes: success or failure. It is characterized by two parameters: the number of trials (n)
and the probability of success (p).
Poisson Distribution: The Poisson distribution describes the number of events that occur in a fixed interval of time or space. 
It is used when the events occur randomly and independently, and the average rate of occurrence (λ) is known.

  Assumptions:

Binomial Distribution: The binomial distribution assumes a fixed number of trials, with each trial being independent and having
the same probability of success. The two outcomes of each trial (success and failure) are mutually exclusive.
Poisson Distribution: The Poisson distribution assumes that events occur randomly and independently, and the rate of occurrence remains
constant over time or space.

  Number of Possible Outcomes:

Binomial Distribution: Each trial has two possible outcomes (success or failure), so the count of successes takes one of
the n + 1 values 0, 1, ..., n.
Poisson Distribution: The count of events has no upper bound, so the Poisson distribution can take any non-negative
integer value (0, 1, 2, ...).

  Parameters:

Binomial Distribution: The binomial distribution is characterized by two parameters: the number of trials (n) and the probability of success (p).
Poisson Distribution: The Poisson distribution is characterized by a single parameter, the average rate of occurrence (λ).
   Shape:

Binomial Distribution: The binomial distribution is discrete, taking integer values for the number of successes. It is symmetric
only when p = 0.5; it is skewed to the right when p < 0.5 and to the left when p > 0.5. For large n it becomes approximately bell-shaped.
Poisson Distribution: The Poisson distribution is also discrete, but its shape depends on the rate parameter λ. For small values of λ,
it is skewed to the right, while for large values of λ, it becomes more symmetric and bell-shaped.

   In summary, the main difference between the Binomial distribution and the Poisson distribution lies in the nature of the events 
they model and the assumptions they make. The binomial distribution applies to a fixed number of trials with two possible outcomes,
while the Poisson distribution is used to model random and independent events with a known average rate of occurrence.







Q8. Generate a random sample of size 1000 from a Poisson distribution with mean 5 and calculate the
sample mean and variance.



Ans:
    
    
      To generate a random sample of size 1000 from a Poisson distribution with a mean of 5,
        you can use NumPy's `np.random.poisson` function and then compute the mean and variance of the result.
        Here's an example in Python:


import numpy as np

# Set the seed for reproducibility (optional)
np.random.seed(42)

# Generate random sample
sample = np.random.poisson(lam=5, size=1000)

# Calculate sample mean and (unbiased) sample variance
sample_mean = np.mean(sample)
sample_variance = np.var(sample, ddof=1)  # ddof=1 divides by n - 1 instead of n

print("Sample Mean:", sample_mean)
print("Sample Variance:", sample_variance)


Running the above code generates a random sample of size 1000 from a Poisson distribution with a mean of 5
and prints the sample mean and variance. Because the mean and variance of a Poisson distribution both equal λ,
both printed values should be close to 5.






Q9. How are the mean and variance related in the Binomial distribution and the Poisson distribution?



Ans:
        In both the Binomial distribution and the Poisson distribution, the mean and variance are related,
        but the nature of their relationship differs.

Binomial Distribution:
In a Binomial distribution with parameters n (number of trials) and p (probability of success), the mean (μ) is given by the product
of n and p (μ = np), and the variance (σ^2) is given by the product of n, p, and (1-p) (σ^2 = np(1-p)).

Because σ^2 = np(1-p) = μ(1-p) and 0 < p < 1, the variance of a Binomial distribution is always smaller than its mean.
Increasing n increases both the mean and the variance. Increasing p always increases the mean, but the variance np(1-p) peaks at p = 0.5
and shrinks as p moves toward 0 or 1. When p = 0.5, the distribution is symmetric and the variance is exactly half the mean (μ = n/2, σ^2 = n/4).
For values of p away from 0.5, the distribution becomes skewed.

Poisson Distribution:
In a Poisson distribution with a mean (λ), both the mean and variance are equal and given by λ. Therefore, in a Poisson distribution, 
the mean and variance are always the same.

The Poisson distribution arises as a limiting case of the Binomial distribution,
where the number of trials approaches infinity and the probability of success approaches zero, while their product stays constant (λ = np).
In this limit, the Binomial variance np(1-p) approaches np = λ, which is why the Poisson mean and variance are equal.

To summarize:

In the Binomial distribution, the variance np(1-p) is always smaller than the mean np (for 0 < p < 1).
In the Poisson distribution, the mean and variance are always equal.
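
The link between the two distributions can be checked numerically. The sketch below holds the product np fixed at an illustrative value of 3, lets n grow, and reports the Binomial mean, the Binomial variance, and the largest gap between the Binomial and Poisson PMFs over k = 0..10. The function names and the chosen values of n are assumptions for this example.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for a Binomial(n, p) variable."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson(lam) variable."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam = 3.0  # target mean, kept fixed while n grows (illustrative value)
results = []
for n in (10, 100, 10_000):
    p = lam / n
    mean = n * p
    variance = n * p * (1 - p)
    # Largest PMF gap between Binomial(n, p) and Poisson(lam) over k = 0..10
    max_gap = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(11))
    results.append((n, variance, max_gap))
    print(f"n={n:6d}  mean={mean:.3f}  variance={variance:.3f}  max PMF gap={max_gap:.5f}")
```

The variance stays below the mean for every n but moves toward it as n grows, while the gap between the two PMFs shrinks.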






Q10. In a normal distribution, where does the least frequent data appear relative to the mean?


Ans:   
        
        In a normal distribution, the data is symmetrically distributed around the mean. 
        The least frequent data, therefore, appears in the tails of the distribution, farthest away from the mean.

Specifically, in a standard normal distribution (with a mean of 0 and a standard deviation of 1), the least frequent data appears
in the tails beyond a certain number of standard deviations from the mean. This is because the normal distribution follows the empirical rule
(also known as the 68-95-99.7 rule), which states that approximately 68% of the data falls within one standard deviation of the mean,
about 95% falls within two standard deviations, and nearly 99.7% falls within three standard deviations.

Consequently, the data in the tails, beyond three standard deviations from the mean, is the least frequent.
In a standard normal distribution, this corresponds to the data values that are less than -3 or greater than +3. 
These extreme values in the tails occur with a much lower frequency compared to the values closer to the mean.

It's important to note that in a normal distribution with a different mean and standard deviation,
the position of the least frequent data would still be in the tails, but the exact cutoff points would depend on 
the specific mean and standard deviation of the distribution.
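
The empirical rule quoted above can be reproduced with the standard library. This sketch computes the two-sided tail probability P(|Z| > k) of a standard normal variable with `math.erfc`; the function name is hypothetical.

```python
import math

def two_sided_tail(k):
    """P(|Z| > k) for a standard normal variable Z."""
    return math.erfc(k / math.sqrt(2.0))

for k in (1, 2, 3):
    inside = 1.0 - two_sided_tail(k)
    print(f"within {k} sd: {inside:.4f}   beyond {k} sd: {two_sided_tail(k):.4f}")
```

Only about 0.27% of the probability lies more than three standard deviations from the mean, which is why values in that region are the least frequent.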