In [1]:
import scipy
import numpy as np

### Binomial Distribution

![](http://www.stat.yale.edu/Courses/1997-98/101/binpdf.gif)

### Binomial Distribution: When to Use?

- **Fixed Trials (n):** Experiment repeats a set number of times.  
- **Two Outcomes:** Success (1) or Failure (0).  
- **Constant Probability (p):** Same for all trials.  
- **Independent Trials:** One trial doesn't affect another.  

### Examples  
- Coin Tossing, Manufacturing Defects, Medical Trials, Customer Surveys.  

### SciPy Implementation  

- **n = 10** → Number of trials  
- **p = 0.5** → Probability of success in each trial  
- **k = 3** → Number of successes  
- **binom.pmf(k, n, p)** → Computes P(X = k)  
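
As a quick sanity check on the parameters listed above, the sketch below compares `binom.pmf` with the closed-form expression C(n, k) · p^k · (1 − p)^(n − k). The variable names `manual_pmf` and `scipy_pmf` are illustrative only.

```python
from math import comb
from scipy.stats import binom

n, p, k = 10, 0.5, 3  # trials, success probability, successes (values from the list above)

# Closed-form binomial PMF: C(n, k) * p^k * (1 - p)^(n - k)
manual_pmf = comb(n, k) * p**k * (1 - p)**(n - k)

# Same quantity from SciPy; the argument order is (k, n, p)
scipy_pmf = binom.pmf(k, n, p)

print(manual_pmf)  # 0.1171875
print(scipy_pmf)   # agrees with the manual value up to floating-point error
```

The two values should agree up to rounding, which is a handy way to confirm the argument order before moving on to `binom.cdf`.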


In [2]:
from scipy.stats import binom

In [3]:
binom.pmf(k=19,n=25,p=0.65)

0.090777998593228

In [23]:
binom.pmf(19,25,0.65)

0.090777998593228

In [24]:
binom.cdf(2,20,0.06)

0.8850275957378549

In [25]:
binom.pmf(10,20,0.4)

0.11714155053639005

### Poisson Distribution

### Poisson Distribution: When to Use?

- **Counts Events in Fixed Interval:** Used for counting occurrences over time, area, or space.  
- **Independent Events:** One occurrence does not affect another.  
- **Constant Rate (λ):** The average number of occurrences per interval is fixed.  
- **No Upper Limit:** Theoretically, any number of occurrences can happen.  

### Examples  
- Number of customer arrivals per hour at a store.  
- Number of emails received in an hour.  
- Number of defects in a sheet of metal.  

### SciPy Implementation  

- **λ (lambda) = 4** → Average occurrences per interval  
- **k = 2** → Number of actual occurrences  
- **poisson.pmf(k, λ)** → Computes P(X = k)  
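
The parameters above can be checked against the Poisson formula P(X = k) = e^(−λ) · λ^k / k!. This is a minimal sketch; `lam`, `manual_pmf`, and `scipy_pmf` are names chosen for illustration.

```python
from math import exp, factorial
from scipy.stats import poisson

lam, k = 4, 2  # average rate and observed count (values from the list above)

# Closed-form Poisson PMF: e^(-lam) * lam^k / k!
manual_pmf = exp(-lam) * lam**k / factorial(k)

# Same quantity from SciPy; the argument order is (k, mu)
scipy_pmf = poisson.pmf(k, lam)

print(manual_pmf)  # roughly 0.1465
print(scipy_pmf)
```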


In [26]:
from scipy.stats import poisson

In [27]:
poisson.pmf(3,2)

0.18044704431548356

In [28]:
poisson.cdf(7,3.2)

0.9831701582510425

In [31]:
prob_gt_7=1-poisson.cdf(7,3.2)
print(prob_gt_7)

0.01682984174895752
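
The complement `1 - cdf` used above is also available directly as the survival function `sf`. The sketch below, using the same numbers, shows that the two routes agree; `via_complement` and `via_sf` are illustrative names.

```python
from scipy.stats import poisson

# P(X > 7) for a Poisson variable with mean 3.2, computed two ways
via_complement = 1 - poisson.cdf(7, 3.2)
via_sf = poisson.sf(7, 3.2)  # survival function: P(X > k)

print(via_complement, via_sf)
```

For probabilities far in the tail, `sf` is usually the safer choice because it avoids subtracting a number very close to 1.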


### Uniform Distribution

### Uniform Distribution: When to Use?

- **Equal Probability:** Every value in the range has an equal chance of occurring.  
- **Defined by a Range (a, b):** The distribution is uniform between two values.  
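- **Continuous or Discrete:** The idea applies to both types of data; `scipy.stats.uniform` is the continuous version, and `scipy.stats.randint` covers the discrete case.  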

### Examples  
- Rolling a fair die (discrete uniform).  
- Randomly selecting a number between 0 and 1 (continuous uniform).  
- Generating random timestamps within a day.  

### SciPy Implementation  

- **a = 2, b = 10** → Range of values  
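- **uniform.pdf(x, a, b-a)** → Evaluates the density at x (`loc = a`, `scale = b - a`), which is 1/(b − a) inside [a, b]. For a continuous variable P(X = x) is 0, so interval probabilities come from **uniform.cdf**.  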
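
For the range listed above, the sketch below separates the density from an actual probability. It assumes the continuous case and uses a "frozen" distribution object; the evaluation points 4, 5, and 6 are picked only for illustration.

```python
from scipy.stats import uniform

a, b = 2, 10  # range from the list above

# SciPy parameterizes the uniform by loc (left edge) and scale (width)
dist = uniform(loc=a, scale=b - a)

density = dist.pdf(5)                    # constant height 1 / (b - a) inside [a, b]
prob_4_to_6 = dist.cdf(6) - dist.cdf(4)  # P(4 <= X <= 6) = (6 - 4) / (b - a)

print(density)      # 0.125
print(prob_4_to_6)  # 0.25
```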


In [33]:
from scipy.stats import uniform

In [34]:
u=np.arange(27,40,1)
u

array([27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39])

In [36]:
uniform.mean(loc=27,scale=12)

33.0

In [38]:
uniform.cdf(np.arange(30,36,1),loc=27,scale=12)

array([0.25      , 0.33333333, 0.41666667, 0.5       , 0.58333333,
       0.66666667])

In [40]:
prob=0.66666667-0.25  # P(30 <= X <= 35) = cdf(35) - cdf(30), using the rounded values printed above
prob

0.41666667

In [41]:
uniform.mean(loc=200,scale=982)

691.0

In [43]:
uniform.std(loc=200,scale=982)

283.4789821721062
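
The mean and standard deviation above follow from the closed forms (a + b)/2 and (b − a)/√12, with a = `loc` and b = `loc + scale`. A small sketch to confirm this, with illustrative variable names:

```python
from math import sqrt
from scipy.stats import uniform

loc, scale = 200, 982        # same parameters as the cells above
a, b = loc, loc + scale      # the distribution lives on [a, b]

manual_mean = (a + b) / 2          # midpoint of the interval
manual_std = (b - a) / sqrt(12)    # standard deviation of a continuous uniform

print(manual_mean, uniform.mean(loc=loc, scale=scale))
print(manual_std, uniform.std(loc=loc, scale=scale))
```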

### Normal Distribution

### Normal Distribution: When to Use?

- **Bell-Shaped Curve:** Most values cluster around the mean.  
- **Defined by Mean (μ) and Standard Deviation (σ):**  
  - μ (mean): Center of the distribution.  
  - σ (std deviation): Controls spread.  
- **Symmetric:** Equal probability on both sides of the mean.  

### Examples  
- Heights of people in a population.  
- Test scores in a large class.  
- Measurement errors in experiments.  

### SciPy Implementation  

- **μ = 50, σ = 10** → Mean and standard deviation  
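- **norm.pdf(x, μ, σ)** → Evaluates the density at x (not a probability; P(X = x) = 0 for a continuous variable). Use **norm.cdf(x, μ, σ)** for P(X ≤ x).  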
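
With the parameters listed above, the sketch below contrasts the density with an interval probability and shows that standardizing to a z-score first gives the same result. The interval 40 to 60 (one standard deviation either side of the mean) is chosen only for illustration.

```python
from scipy.stats import norm

mu, sigma = 50, 10  # mean and standard deviation from the list above

# Height of the bell curve at the mean; this is a density, not a probability
density_at_mean = norm.pdf(mu, mu, sigma)

# Probability of falling within one standard deviation of the mean
prob_within_1sd = norm.cdf(60, mu, sigma) - norm.cdf(40, mu, sigma)

# Standardizing with z = (x - mu) / sigma and using the standard normal
z_low, z_high = (40 - mu) / sigma, (60 - mu) / sigma
prob_via_z = norm.cdf(z_high) - norm.cdf(z_low)

print(density_at_mean)
print(prob_within_1sd, prob_via_z)  # both about 0.6827
```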


In [44]:
from scipy.stats import norm

In [45]:
val,m,s=68,65.5,2.5

In [46]:
norm.cdf(val,m,s)

0.8413447460685429

In [48]:
cdf_gt_val=1-norm.cdf(val,m,s)
cdf_gt_val

0.15865525393145707

In [50]:
val_btw_val=(norm.cdf(val,m,s)-norm.cdf(63,m,s))
val_btw_val

0.6826894921370859

In [51]:
1-norm.cdf(700,494,100)
# P(X > 700): compute the CDF up to 700, then subtract it from the total probability 1

0.019699270409376912

In [52]:
norm.ppf(0.95)
# ppf is the inverse CDF: given the area to the left (0.95), it returns the z value

1.6448536269514722

In [54]:
norm.ppf(1-.6772)

-0.45988328292440145
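
Since `ppf` inverts `cdf`, a round trip should return the original area, and a z value can be mapped back to the original scale with x = μ + zσ. The sketch below reuses the mean 494 and standard deviation 100 from the earlier cell; the variable names are illustrative.

```python
from scipy.stats import norm

z_95 = norm.ppf(0.95)        # z value with 95% of the area to its left
round_trip = norm.cdf(z_95)  # feeding it back into cdf recovers 0.95

# The same quantile on the original scale, computed two ways
mu, sigma = 494, 100
x_95_direct = norm.ppf(0.95, mu, sigma)
x_95_from_z = mu + z_95 * sigma

print(round_trip)
print(x_95_direct, x_95_from_z)  # both about 658.5
```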

### Hypergeometric Distribution

### Hypergeometric Distribution: When to Use?

- **Sampling Without Replacement:** Used when selecting items from a finite population.  
- **Defined by Population and Sample Sizes:**  
  - **N** → Total population size.  
  - **K** → Total number of successes in the population.  
  - **n** → Sample size drawn.  
  - **k** → Number of observed successes in the sample.  
- **Dependence Between Trials:** Probability changes after each draw.  

### Examples  
- Drawing cards from a deck without replacement.  
- Selecting defective items from a batch.  
- Choosing students for a competition from different groups.  

### SciPy Implementation  

- **N = 20, K = 7, n = 5** → Population, success count, and sample size.  
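- **hypergeom.pmf(k, N, K, n)** → Computes P(X = k). SciPy's own documentation names the last three arguments `M`, `n`, `N` (population size, successes in the population, sample size), in the same positional order.  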
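
To make the argument order concrete, the sketch below checks `hypergeom.pmf` against the counting formula C(K, k) · C(N − K, n − k) / C(N, n). The value k = 2 is a hypothetical choice; the other numbers come from the list above.

```python
from math import comb
from scipy.stats import hypergeom

N_pop, K_succ, n_draw = 20, 7, 5  # population, successes in population, sample size
k = 2                             # hypothetical number of successes in the sample

# Counting argument: choose k successes and n - k failures, out of all possible samples
manual_pmf = comb(K_succ, k) * comb(N_pop - K_succ, n_draw - k) / comb(N_pop, n_draw)

# SciPy's positional order: (k, population size, successes in population, sample size)
scipy_pmf = hypergeom.pmf(k, N_pop, K_succ, n_draw)

print(manual_pmf)  # roughly 0.3874
print(scipy_pmf)
```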


In [55]:
from scipy.stats import hypergeom

In [56]:
hypergeom.sf(0,18,3,12)
# sf is the survival function: sf(k) = 1 - cdf(k) = P(X > k), so this is P(X >= 1)

0.9754901960784313

In [59]:
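hypergeom.cdf(1,18,5,11)

# Arguments in SciPy's positional order (k, M, n, N):
# 1  -> k: cumulative probability up to 1 success, P(X <= 1)
# 18 -> M: population size
# 5  -> n: number of success items in the population
# 11 -> N: sample size (number of draws)
# Swapping the roles of 5 and 11 gives the same value, because the distribution is symmetric in n and N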

0.04738562091503268

### Exponential Distribution

### Exponential Distribution: When to Use?

- **Models Time Between Events:** Used for waiting times in a Poisson process.  
- **Defined by Rate Parameter (λ):**  
  - **λ (lambda)** → Average number of events per unit time.  
  - Mean = **1/λ**, Variance = **1/λ²**.  
- **Memoryless Property:** The remaining waiting time does not depend on how long you have already waited: P(X > s + t | X > s) = P(X > t).  

### Examples  
- Time between arrivals at a bus stop.  
- Time until a radioactive particle decays.  
- Duration between customer service calls.  

### SciPy Implementation  

- **λ = 0.5** → Rate parameter (mean waiting time = 1/λ).  
- **expon.pdf(x, scale=1/λ)** → Evaluates the density f(x) = λe^(−λx), which is not a probability; use **expon.cdf(x, scale=1/λ)** for P(X ≤ x).  
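
The sketch below uses the rate from the list above to check the CDF against 1 − e^(−λx) and to illustrate the memoryless property numerically. The values of `x`, `s`, and `t` are hypothetical, and the rate is passed through the `scale` keyword because SciPy's second positional argument is `loc`.

```python
from math import exp
from scipy.stats import expon

lam = 0.5                    # rate parameter from the list above
dist = expon(scale=1 / lam)  # SciPy uses scale = 1 / lam; loc stays at its default 0

x = 3.0                      # hypothetical waiting time
manual_cdf = 1 - exp(-lam * x)   # closed-form CDF of the exponential
scipy_cdf = dist.cdf(x)

# Memoryless property: P(X > s + t | X > s) equals P(X > t)
s, t = 2.0, 3.0              # hypothetical elapsed and additional waiting times
conditional = dist.sf(s + t) / dist.sf(s)
unconditional = dist.sf(t)

print(manual_cdf, scipy_cdf)
print(conditional, unconditional)
print(dist.mean())           # mean waiting time, 1 / lam
```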


In [61]:
from scipy.stats import expon

In [63]:
round(expon.cdf(0.75,scale=1/1.38),4)
# 1/1.38 must be passed as scale=; a bare second positional argument is loc (a shift), not scale

0.6448