In [43]:
import math
import numpy as np
from scipy import stats

A tire company claims that only 2% of their tires have manufacturing defects (theoretical probability).  
A regional quality manager inspects a batch of 500 tires and finds 18 defective tires.  

The manager concludes: "The theoretical defect rate is wrong. It's clearly 3.6% now. We should update our models."

What's the correct interpretation of this situation?
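
One way to weigh the manager's claim is to ask how unusual 18 defects in 500 tires would be if the 2% rate still held. The sketch below is an illustration and not part of the original exercise: it assumes defects occur independently and uses a one-sided binomial tail as the yardstick. All variable names are hypothetical.

```python
from scipy import stats

n = 500            # tires inspected in this batch
p_claimed = 0.02   # defect rate claimed by the company
defects = 18       # defective tires observed

# Sample (empirical) defect rate from this single batch
p_observed = defects / n

# Expected number of defects if the claimed rate holds
expected_defects = n * p_claimed

# P(X >= 18) under Binomial(n=500, p=0.02); sf(k) returns P(X > k)
p_tail = stats.binom.sf(defects - 1, n, p_claimed)

print(p_observed, expected_defects, p_tail)
```

The 3.6% figure is an estimate from one batch, so it carries sampling variability. The tail probability indicates how much of the gap that variability could plausibly explain before anyone revises the model.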

A system processes requests, and each request is either successful (1) with probability 0.97 or unsuccessful (0) with probability 0.03.  
Let X be the random variable representing success (1) or failure (0) for a request.

What is the expected value of $X^2$?

In [1]:
# E[X^2] = sum of x^2 * P(X = x); for a 0/1 variable this equals P(X = 1)
(1**2 * 0.97) + (0**2 * 0.03)

0.97

A YouTuber earns revenue based on the number of ads watched. Each ad generates a random amount between ₹0.50 and ₹3, following a Uniform Distribution X∼U(0.5,3). The YouTuber uploads 5 videos, and each video gets an average of 1,000 ad views.

Question 1: What is the expected revenue per ad?  
Question 2: What is the expected total revenue from all 5 videos?  

In [2]:
a = 0.5
b = 3

(a + b) / 2

1.75
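
Question 2 follows from linearity of expectation. The sketch below assumes each ad view produces one independent revenue draw and treats the 1,000 views per video as a fixed count; the variable names are illustrative.

```python
a, b = 0.5, 3                        # bounds of the Uniform(0.5, 3) revenue per ad
expected_per_ad = (a + b) / 2        # mean of a continuous uniform distribution

videos = 5
views_per_video = 1_000              # treated as a fixed count (assumption)

# Expected total = number of ad views * expected revenue per ad
expected_total = videos * views_per_video * expected_per_ad
print(expected_total)
```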

A food delivery app records delivery times as follows:

1. 50% of orders are delivered in 25 minutes
2. 30% take 35 minutes
3. 20% take 45 minutes

Question 1: What is the expected delivery time for one order?  
Question 2: If a customer places 3 independent orders, what is the expected total delivery time?

In [3]:
(25 * 0.5 + 35 * 0.3 + 45 * 0.2)

32.0

In [4]:
3 * 32

96

A user plays a Google Pay cashback scratch card where they can win ₹10, ₹20, ₹50, ₹100, or ₹500, each with an equal chance.

Question 1: What is the probability of winning ₹50? (Use PMF)  
Question 2: What is the expected cashback amount a user will receive? (Use Expectation) 

In [7]:
# Discrete uniform PMF: each of the 5 equally likely prizes has probability 1 / n
1 / 5

0.2

In [9]:
(1 / 5) * (10 + 20 + 50 + 100 + 500)

136.0

An Uber driver tracks ride durations that follow a normal distribution with:

1. Mean ride time μ=15 minutes
2. Standard deviation σ=3 minutes

Question 1: What is the probability that a randomly selected ride takes more than 20 minutes?  
Question 2: What is the expected ride duration and the variance?  
Question 3: What is the probability that a ride takes between 12 and 18 minutes?  

In [19]:
mu = 15
std = 3
# Find: P(x > 20)
# 1 - P(X < 20)
x = 20

In [20]:
z = (x - mu) / std
z

1.6666666666666667

In [21]:
# P(X > 20) = 1 - P(X < 20)
p_x_gt_20 = 1 - stats.norm.cdf(z)
p_x_gt_20.round(2).item()

0.05

In [22]:
x1, x2 = 12, 18
z1 = (x1 - mu) / std
z2 = (x2 - mu) / std

z1, z2

(-1.0, 1.0)

In [24]:
p_x1_x2 = stats.norm.cdf(z2) - stats.norm.cdf(z1)
p_x1_x2.round(2).item()

0.68
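
Question 2 can be read straight off the parameters: for a normal distribution the expected value is μ and the variance is σ². A minimal check using a frozen SciPy distribution (the name `ride_time` is illustrative):

```python
from scipy import stats

mu, std = 15, 3
ride_time = stats.norm(loc=mu, scale=std)   # frozen N(mu, std^2)

expected_duration = ride_time.mean()        # equals mu
variance = ride_time.var()                  # equals std ** 2

print(expected_duration, variance)
```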

An iPhone's battery charging time (0% to 100%) is uniformly distributed between 1 hour and 2.5 hours.

Question 1: What is the probability that the phone is fully charged in less than 1.5 hours?  
Question 2: What is the probability density function (PDF) value for any given charging time in this range?  

In [31]:
a = 1
b = 2.5
# scipy's uniform is defined on [loc, loc + scale]
loc = a
scale = b - a
# Find: P(x < 1.5)
x = 1.5

In [33]:
p_x_lt_1p5 = stats.uniform.cdf(x=x, loc=loc, scale=scale)
p_x_lt_1p5.round(4).item()

0.3333

In [34]:
# The PDF is constant on [a, b]: 1 / (b - a)
pdf_x_1p5 = stats.uniform.pdf(x=x, loc=loc, scale=scale)
pdf_x_1p5.round(4).item()

0.6667

A WhatsApp user receives an average of 5 messages per minute.  
The number of messages follows a Poisson distribution with λ = 5.

Question 1: What is the probability that the user receives exactly 7 messages in a minute?  
Question 2: What is the probability that the user receives at least 3 messages in a minute?  

In [35]:
# Average rate: 5 messages per 1-minute window, so λ = 5
mu = 5
# Find: P(x = 7)
x = 7

In [37]:
p_x_7 = stats.poisson.pmf(k=x, mu=mu)
p_x_7.round(4).item()

0.1044

In [38]:
# Find: P(x >= 3)
# 1 - P(x <= 2)
x = 2

In [39]:
p_x_ge_3 = 1 - stats.poisson.cdf(k=x, mu=mu)
p_x_ge_3.round(4).item()

0.8753

A Netflix user experiences buffering intervals that follow an Exponential distribution with  
rate $\lambda = \frac{1}{15}$ (mean time 15 minutes).

Question 1: What is the probability that the next buffering occurs within 10 minutes?  
Question 2: What is the expected time until the next buffering, and what is the variance?  

In [41]:
rate = 1 / 15
scale = 1 / rate
# Find: P(x < 10)
x = 10

In [42]:
p_x_lt_10 = stats.expon.cdf(x=x, scale=scale)
p_x_lt_10.round(4).item()

0.4866
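
For Question 2, an exponential distribution with rate λ has mean 1/λ and variance 1/λ². A short sketch confirming this with SciPy, which parameterizes the distribution by `scale = 1 / rate` (the name `buffering_gap` is illustrative):

```python
from scipy import stats

rate = 1 / 15
buffering_gap = stats.expon(scale=1 / rate)   # frozen Exponential(rate)

expected_wait = buffering_gap.mean()          # 1 / rate, in minutes
variance = buffering_gap.var()                # 1 / rate ** 2, in minutes squared

print(expected_wait, variance)
```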

A social media analytics firm is measuring the average watch time (in minutes) of videos.  
The standard deviation in watch time is 12 minutes.

They collect three random samples:

Group A: 36 users  
Group B: 144 users  
Group C: 36 users (same values as Group A, but watch time is measured in seconds, not minutes)

Which of the following statements are TRUE about the Standard Error (SE) of each group?

A. SE of Group B is half that of Group A  
B. SE of Group C is 60 * SE of Group A  
C. SE of Group A = 2 minutes  
D. SE of Group B = 2 minutes  
E. SE of Group C = 120 seconds  
F. Group C has same SE as A, just in different units  
G. SE is independent of units used

In [52]:
std = 12
n1 = 36
n2 = 144
n3 = 36

In [56]:
ga_se = std / math.sqrt(n1)
ga_se

2.0

In [57]:
gb_se = std / math.sqrt(n2)
gb_se

1.0

In [58]:
gc_se = (std * 60) / math.sqrt(n3)
gc_se

120.0

In [59]:
60 * ga_se

120.0

Correct: A, C, E, F  
Wrong: B, D, G
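
Statement F can be checked by converting Group C's SE back to minutes. The helper `standard_error` below is a hypothetical name introduced for this sketch; it applies the same formula as the cells above, SE = σ / √n.

```python
import math

def standard_error(std, n):
    """Standard error of the mean for a known standard deviation."""
    return std / math.sqrt(n)

se_a_minutes = standard_error(12, 36)        # Group A, measured in minutes
se_c_seconds = standard_error(12 * 60, 36)   # Group C, same data in seconds

# Converting seconds back to minutes recovers Group A's SE
se_c_minutes = se_c_seconds / 60

print(se_a_minutes, se_c_seconds, se_c_minutes)
```

The numeric value of the SE scales with the unit of measurement, while the underlying quantity stays the same.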

In [82]:
import math
from statistics import mean, stdev
from scipy import stats

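def compute_ci(n, weights, confidence):
    """
    Print the point estimate, margin of error, and z-based confidence interval.

    input:
    n -> int value: the sample size (number of teabags)
    weights -> float values: the weights of the teabags in grams
    confidence -> a float value: the confidence level, e.g. 0.95
    """
    s_mean = mean(weights)
    # Standard error of the mean: sample standard deviation / sqrt(n)
    s_std = stdev(weights) / math.sqrt(n)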

    alpha = 1 - confidence
    z = stats.norm.ppf(1 - (alpha / 2))
    moe = (z * s_std).round(2).item()

    x1, x2 = stats.norm.interval(confidence, loc=s_mean, scale=s_std)
    x1, x2 = x1.round(2).item(), x2.round(2).item()

    print(f"Point Estimate (mean): {round(s_mean, 2)}g")
    print(f"Margin of Error: {moe}g")
    print(f"Confidence Interval: ({x1}g, {x2}g)")

In [83]:
round((1 - 0.95) / 2, 4), round((1 - 0.95) / 2, 4)

(0.025, 0.025)

In [84]:
compute_ci(
    10,
    [2.1, 2.2, 2.3, 2.0, 2.2, 2.4, 2.1, 2.3, 2.2, 2.0],
    0.95,
)

Point Estimate (mean): 2.18g
Margin of Error: 0.08g
Confidence Interval: (2.1g, 2.26g)
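
With only 10 teabags and a standard deviation estimated from the sample, a Student's t interval is a common alternative to the z interval used above. The sketch below is a swapped-in technique for comparison, not the method used in `compute_ci`; it reuses the same data and assumes the weights are roughly normally distributed.

```python
import math
from statistics import mean, stdev
from scipy import stats

weights = [2.1, 2.2, 2.3, 2.0, 2.2, 2.4, 2.1, 2.3, 2.2, 2.0]
confidence = 0.95

n = len(weights)
s_mean = mean(weights)
s_err = stdev(weights) / math.sqrt(n)   # standard error of the mean

# z-based interval, as in compute_ci
z_low, z_high = stats.norm.interval(confidence, loc=s_mean, scale=s_err)

# t-based interval with n - 1 degrees of freedom
t_low, t_high = stats.t.interval(confidence, n - 1, loc=s_mean, scale=s_err)

print((z_low, z_high), (t_low, t_high))
```

The t interval is wider because it accounts for the extra uncertainty in estimating the standard deviation from a small sample.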


In [60]:
from statistics import mean, stdev

In [61]:
mean?

Signature: mean(data)
Docstring:
Return the sample arithmetic mean of data.

>>> mean([1, 2, 3, 4, 4])
2.8

>>> from fractions import Fraction as F
>>> mean([F(3, 7), F(1, 21), F(5, 3), F(1, 3)])
Fraction(13, 21)

>>> from decimal import Decimal as D
>>> mean([D("0.5"), D("0.75"), D("0.625"), D("0.375")])
Decimal('0.5625')

If ``data`` is empty, StatisticsError will be raised.
File:      c:\users\dheem\appdata\local\programs\python\python312\lib\statistics.py
Type:      function