## Report 1. 

This is a statistics and probability analysis notebook focusing on common probability distributions (binomial, Poisson, hypergeometric, normal etc.). The notebook contains educational exercises for calculating probabilities and statistical measures.

## Key Patterns and Conventions

### Probability Distributions - cheat sheet

A cheat sheet covering the most popular probability distributions is available [here](https://github.com/reza-bagheri/probability_distributions/blob/main/prob_dist_cheatsheet.pdf); use it to solve the tasks below.

### Statistical Libraries

- Use `scipy.stats` for probability distributions and calculations
- Import specific distributions: `from scipy.stats import binom, poisson, hypergeom, norm`
- Use `numpy` for numerical operations and array handling

### Exercise Structure

Each exercise follows this pattern:
1. **Problem statement** in markdown cells describing real-world scenarios
2. **Empty code cells** with commented placeholders for each sub-problem
3. **Parameters clearly defined** (n, p, λ, μ, σ) in the problem text

### Problem-Solving Approach
1. **Identify the distribution type** from the problem context
2. **Extract parameters** from the problem statement
3. **Translate probability notation** (P(X ≤ 3), P(X > 4)) to appropriate scipy methods
4. **Verify results** make sense (probabilities between 0 and 1)
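
The translation in step 3 can be sketched as follows for an integer-valued random variable. The parameters `n = 12`, `p = 0.5`, and `k = 4` are illustrative assumptions and are not taken from any exercise.

```python
from scipy.stats import binom

# Illustrative parameters (an assumption, not taken from an exercise)
n, p = 12, 0.5
dist = binom(n, p)  # "frozen" distribution with fixed parameters

k = 4
p_le = dist.cdf(k)      # P(X <= k)
p_lt = dist.cdf(k - 1)  # P(X < k)  = P(X <= k - 1) for integer-valued X
p_gt = dist.sf(k)       # P(X > k)  = 1 - P(X <= k)
p_ge = dist.sf(k - 1)   # P(X >= k) = P(X > k - 1)
p_eq = dist.pmf(k)      # P(X = k)

# Step 4: every result must be a valid probability
for value in (p_le, p_lt, p_gt, p_ge, p_eq):
    assert 0.0 <= value <= 1.0

print(p_le, p_lt, p_gt, p_ge, p_eq)
```

The strict inequalities are where off-by-one mistakes tend to appear: for a discrete variable, `cdf(k)` includes `k` while `sf(k)` excludes it.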

---

In [8]:
import scipy.stats as stats
from scipy.stats import binom, poisson, norm, hypergeom
import numpy as np
import random

## Exercise 1.

As a sports analyst, you want to calculate some probabilities for a basketball player who plays shooting guard.

n = 10 attempts and
p = 0.7, the probability of scoring a three-pointer on each attempt

Calculate the following probabilities: P(X ≤ 3), P(X < 3), P(X > 4) and P(X = 7).

In [7]:
#P(X ≤ 3)
less_equal_3 = binom.cdf(k=3,
                         n=10,
                         p=0.7)
print(less_equal_3)

#P(X < 3)
less_than_3 = binom.cdf(k=2,
                        n=10,
                        p=0.7)
print(less_than_3)

#P(X > 4) = 1 - P(X ≤ 4)
greater_than_4 = 1 - binom.cdf(k=4, n=10, p=0.7)
print(greater_than_4)

#P(X = 7)
equal_7 = binom.pmf(k=7, n=10, p=0.7)
print(equal_7)


0.010592078400000007
0.0015903864000000015
0.9526510126
0.26682793200000005



## Exercise 2.

In a large, fully automated production plant, items are pushed onto a side band at random points in time, from which they are automatically fed to a control unit. The plant is set up so that, on average, 1.6 items per minute are sent to the control unit. Let the random variable X denote the number of items pushed to the side band in 1 minute.

a) What is the probability that more than 5 items arrive at the control unit in a given minute?

b) What is the probability that fewer than 8 items arrive at the control unit within a 5-minute period?

In [11]:
# a)
more_than_5 = 1 - poisson.cdf(k=5, mu=1.6)
print(more_than_5)

#b)
less_than_8 = poisson.cdf(k=7, mu=5*1.6)
print(less_than_8)

0.006040291111581331
0.45296080948699446
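
Part b) relies on the fact that counts from a Poisson process over a window of length t are again Poisson, with mean equal to rate × t. A minimal sketch of that scaling is shown here; the helper name `poisson_for_window` is hypothetical and introduced only for illustration.

```python
from scipy.stats import poisson

rate_per_minute = 1.6  # average items per minute, from the problem statement


def poisson_for_window(rate, minutes):
    """Return the frozen Poisson distribution of counts in a window of `minutes`."""
    return poisson(mu=rate * minutes)


one_minute = poisson_for_window(rate_per_minute, 1)
five_minutes = poisson_for_window(rate_per_minute, 5)

p_more_than_5 = one_minute.sf(5)      # P(X > 5) in one minute
p_fewer_than_8 = five_minutes.cdf(7)  # P(X < 8) = P(X <= 7) in five minutes

print(p_more_than_5, p_fewer_than_8)
```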


## Exercise 3.

In the manufacture of car engine cylinders, it is known that there are 5 defective cylinders in every batch of 100 cylinders produced. From a production batch of 100 cylinders, 6 cylinders are selected at random for analysis.

What is the probability that the sample contains exactly 2 defective cylinders?

In [None]:
# M: total number of objects (cylinders in the batch)
M = 100
# n: number of type 1 objects (defective cylinders)
n = 5
# N: number of draws (we pick 6 at random)
N = 6
# k: number of type 1 objects we want in the sample
k = 2

hypergeom_prob = hypergeom.pmf(k, M, n, N)
print(hypergeom_prob)

0.02670641827490134
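
As a cross-check, the same probability can be computed from the counting formula $P(X = k) = \binom{n}{k}\binom{M-n}{N-k} / \binom{M}{N}$. The sketch below uses descriptive variable names of its own rather than the notebook's `M`, `n`, `N`, `k`.

```python
from math import comb

from scipy.stats import hypergeom

total = 100    # cylinders in the batch
defective = 5  # defective cylinders in the batch
draws = 6      # cylinders sampled without replacement
wanted = 2     # defective cylinders we want in the sample

# Ways to pick the defective ones * ways to pick the good ones / all possible samples
by_hand = (
    comb(defective, wanted)
    * comb(total - defective, draws - wanted)
    / comb(total, draws)
)

# scipy's argument order is (k, M, n, N) = (wanted, total, defective, draws)
via_scipy = hypergeom.pmf(wanted, total, defective, draws)

print(by_hand, via_scipy)
```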


## Exercise 4.

A company that produces tires is using new technology to provide a safer driving experience. According to their claim, at a speed of 70 km/h the braking distance of these tires is normally distributed with a mean of 26.4 meters and a standard deviation (sigma) of 2.34 meters.

According to standards, the braking distance shouldn't exceed 29 meters at a speed of 70 km/h.

a) What is the probability of complying with the standards?

b) What is the probability of a braking distance between 24 and 26 meters?

In [17]:
mean = 26.4
std = 2.34

#a) The standard says the braking distance should not exceed 29 meters.
less_equal_29 = norm.cdf(29, loc=mean, scale=std)
print(less_equal_29)

#b)
less_equal_26 = norm.cdf(26, loc=mean, scale=std)
less_than_24 = norm.cdf(24, loc=mean, scale=std)
between_24_26 = less_equal_26 - less_than_24
print(between_24_26)

0.8667397370974947
0.27960499382023934
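
Part b) can also be solved by standardizing to z-scores, $z = (x - \mu) / \sigma$, and using the standard normal CDF. The sketch below shows that both routes agree; the variable names are chosen for illustration.

```python
from scipy.stats import norm

mean, sd = 26.4, 2.34  # from the problem statement

# Standardize the interval endpoints
z_low = (24 - mean) / sd
z_high = (26 - mean) / sd

# P(24 <= X <= 26) via the standard normal distribution
p_standardized = norm.cdf(z_high) - norm.cdf(z_low)

# Same probability using loc/scale directly
p_direct = norm.cdf(26, loc=mean, scale=sd) - norm.cdf(24, loc=mean, scale=sd)

print(z_low, z_high, p_standardized, p_direct)
```

For a continuous distribution, P(X < 24) and P(X ≤ 24) are equal, so the choice of strict or non-strict inequality at the endpoints does not change the result.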


---

### Visualization and Result Interpretation

**Visualizing Distributions:**


In [4]:
import matplotlib.pyplot as plt
import numpy as np
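
A minimal plotting sketch, assuming the parameters of Exercise 1 (binomial, n = 10, p = 0.7) and Exercise 4 (normal, mean 26.4, sigma 2.34) are reused for illustration. The figure layout, labels, and shading are choices made here, not part of the original notebook.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import binom, norm

# Discrete case: binomial PMF from Exercise 1
n, p = 10, 0.7
k_values = np.arange(0, n + 1)
pmf_values = binom.pmf(k_values, n, p)

# Continuous case: normal PDF from Exercise 4
mu, sigma = 26.4, 2.34
x_values = np.linspace(mu - 4 * sigma, mu + 4 * sigma, 200)
pdf_values = norm.pdf(x_values, loc=mu, scale=sigma)

fig, (ax_pmf, ax_pdf) = plt.subplots(1, 2, figsize=(10, 4))

ax_pmf.bar(k_values, pmf_values)
ax_pmf.set_xlabel("k (successful shots)")
ax_pmf.set_ylabel("P(X = k)")
ax_pmf.set_title("Binomial(n=10, p=0.7)")

ax_pdf.plot(x_values, pdf_values)
# Shade the region X <= 29, whose area is the probability from Exercise 4 a)
shade = x_values <= 29
ax_pdf.fill_between(x_values[shade], pdf_values[shade], alpha=0.3)
ax_pdf.set_xlabel("braking distance (m)")
ax_pdf.set_ylabel("density")
ax_pdf.set_title("Normal(mean=26.4, sigma=2.34)")

fig.tight_layout()
```

A bar chart suits the PMF because the variable only takes integer values, while the PDF is drawn as a curve whose shaded area represents a probability.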

---

## Advanced Exercises

### Exercise 5: Network Reliability 

**Problem:** A server farm has 20 independent servers, each with 95% uptime probability. The system needs at least 18 servers operational to function properly. Additionally, during peak hours, server requests follow a Poisson process with λ=4.2 requests/minute.

a) What's the probability the system functions properly?

b) Given the system is functional, what's the probability of receiving exactly 6 requests in the next minute?

c) What's the expected number of failed servers and its standard deviation?

In [28]:
#a)
#P(X >= 18)
prob = 1 - binom.cdf(k=17, n=20, p=0.95)
print(prob)

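#b)
# Assumption: request arrivals are independent of server state, so conditioning
# on a functional system does not change the Poisson probability.
k = 6
mu = 4.2
poisson_prob = poisson.pmf(k, mu)
print(poisson_prob)

#c)
# The number of failed servers is Binomial(n=20, p=q) with q = 1 - 0.95
q = 0.05
expected_failures = binom.mean(n=20, p=q)
std = binom.std(n=20, p=q)
print(expected_failures)
print(std)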

0.9245163262115035
0.11432110720443431
1.0
0.9746794344808963
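
The values in part c) follow from the closed-form binomial moments, $E[X] = nq$ and $\mathrm{SD}[X] = \sqrt{nq(1-q)}$. The sketch below checks them, and also shows that part a) can be phrased equivalently in terms of failures: at least 18 servers up is the same event as at most 2 servers failed. Variable names are illustrative.

```python
import numpy as np
from scipy.stats import binom

n_servers = 20
q_fail = 0.05  # failure probability = 1 - 0.95 uptime

# Closed-form moments of Binomial(n, q)
mean_formula = n_servers * q_fail
std_formula = np.sqrt(n_servers * q_fail * (1 - q_fail))

# Part a) from two equivalent viewpoints
p_via_failures = binom.cdf(2, n_servers, q_fail)      # P(failures <= 2)
p_via_uptime = binom.sf(17, n_servers, 1 - q_fail)    # P(up >= 18)

print(mean_formula, std_formula, p_via_failures, p_via_uptime)
```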



### Exercise 6: Inventory Management

**Problem:** A warehouse stocks items with demand following different patterns:
- Daily demand: Normal(μ=150, σ=25)
- Weekly restock arrives with 8% chance of delay
- When delayed, additional wait time is exponentially distributed with λ=0.5 days⁻¹

a) What inventory level ensures 95% probability of meeting daily demand?

b) What's the probability of stockout if starting inventory is 180 units?

c) What's the expected total wait time for restock (including potential delays)?

In [39]:
#a)
inventory_level = norm.ppf(0.95, loc=150, scale=25)
print(inventory_level)

#b) A stockout occurs when demand exceeds the inventory (180), so we need P(X > 180)
prob_stockout = 1 - norm.cdf(180, loc=150, scale=25)
print(prob_stockout)

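#c) expected extra wait = P(delay) * mean delay; an exponential delay with rate λ has mean 1/λ
prob_delay = 0.08
mean_delay = 1 / 0.5  # 1/λ = 2 days
expected_wait = prob_delay * mean_delay
print(expected_wait)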



191.1213406737868
0.11506967022170822
0.16


## c)

Probability of a delay: $P(D) = 0.08$

Mean duration of a delay (for an exponential distribution this is $1/\lambda$): $E[\text{Delay}] = 1 / 0.5 = 2$ days.

The expected total wait is the sum of (probability $\times$ time) over both cases: $E[W] = 0.08 \times 2 + 0.92 \times 0 = 0.16$ days.
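
The expectation for part c) can be cross-checked with a small Monte Carlo simulation. This is a sketch under the same reading as the code cell (no delay means zero extra wait); the seed and the number of trials are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed for reproducibility
n_trials = 200_000

prob_delay = 0.08
rate = 0.5  # λ in days⁻¹, so the mean delay is 1/λ = 2 days

# For each restock, decide whether it is delayed, then draw the delay length
delayed = rng.random(n_trials) < prob_delay
delay_lengths = rng.exponential(scale=1 / rate, size=n_trials)
extra_wait = np.where(delayed, delay_lengths, 0.0)

simulated = extra_wait.mean()
analytic = prob_delay * (1 / rate)

print(simulated, analytic)
```

Note that NumPy's `exponential` takes the scale (mean) 1/λ rather than the rate λ, the same convention as `scipy.stats.expon`.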


### Exercise 7: Clinical Trial Analysis 

**Problem:** A clinical trial has 500 participants: 300 receiving treatment, 200 receiving placebo. Treatment success rate is 75%, placebo success rate is 40%. Researchers randomly select 50 participants for detailed analysis.

a) What's the probability that exactly 30 participants in the sample received treatment?

b) Among those 30 treatment participants, what's the probability that at least 20 show positive results?

c) Use normal approximation to estimate the probability that total successes in the full trial exceed 280.

In [37]:
# a)
M = 500
n = 300
N = 50
k = 30
hypergeom_prob = hypergeom.pmf(k, M, n, N)
print(hypergeom_prob)

#b) at least 20, i.e. P(X >= 20) = P(X > 19)
positive_results = binom.sf(k=19, n=30, p=0.75) # > 19
print(positive_results)

#c) "exceed 280" means strictly greater than 280
p_mean = 300*0.75 + 200*0.4
print(p_mean)
p_std = np.sqrt(300*0.75*0.25 + 200*0.4*0.6)
print(p_std)
prob_c = 1 - norm.cdf(280, loc=p_mean, scale=p_std)
print(prob_c)

0.12074823636116276
0.8942718773073399
305.0
10.21028892833107
0.9928275882411031
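
To gauge the quality of the normal approximation in part c), the exact distribution of total successes can be obtained by convolving the two binomial PMFs. The sketch assumes, as the approximation already does, that all participants respond independently. The continuity-corrected variant is an addition here, not part of the original solution.

```python
import numpy as np
from scipy.stats import binom, norm

# PMFs of successes in each arm
treatment_pmf = binom.pmf(np.arange(301), 300, 0.75)
placebo_pmf = binom.pmf(np.arange(201), 200, 0.40)

# PMF of the sum of two independent counts; index s holds P(total = s)
total_pmf = np.convolve(treatment_pmf, placebo_pmf)

exact = total_pmf[281:].sum()  # P(total > 280)

mean_total = 300 * 0.75 + 200 * 0.40
std_total = np.sqrt(300 * 0.75 * 0.25 + 200 * 0.40 * 0.60)

approx_plain = norm.sf(280, loc=mean_total, scale=std_total)
# Continuity correction: P(total > 280) = P(total >= 281), approximated at 280.5
approx_corrected = norm.sf(280.5, loc=mean_total, scale=std_total)

print(exact, approx_plain, approx_corrected)
```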


---