**Probability** is a measure of the uncertainty associated with the occurrence of an event in a random experiment.

**Principle of relative frequency**: if an experiment is repeated $n$ times under the same conditions and the event $A$ occurs $k$ times, then its probability can be estimated as:

$$P(A) \approx \frac {k}{n}$$

<p style="text-align:center;"><i>
The probability of event $A$ is approximately equal to $k$ divided by $n$.
</i></p>

This definition is the basis of the frequentist interpretation of probability, according to
which the probability of an event is the limit of its relative frequency as the number of
trials grows without bound. The **law of large numbers** justifies this convergence.
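
As a rough illustration of this convergence, the sketch below simulates a fair coin with Python's standard `random` module and prints the relative frequency $k/n$ for increasing $n$. The helper `relative_frequency`, the fixed seed, and the chosen trial counts are assumptions made for this example only.

```python
import random

def relative_frequency(n_trials, p_event=0.5, seed=0):
    """Estimate P(A) as k/n from n_trials simulated independent trials."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    # k counts how many trials produced the event A
    k = sum(rng.random() < p_event for _ in range(n_trials))
    return k / n_trials

for n in (10, 100, 10_000, 100_000):
    print(n, relative_frequency(n))
```

The estimates for small $n$ can be far from 0.5, while those for large $n$ should settle close to it.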

We can define the **complementary event $\bar{A}$** of an event $A$ as the set of outcomes that do not
belong to $A$. For example, in the toss of a coin, if $A$ represents heads, its complement
$\bar{A}$ represents tails.

$$P(\bar{A}) = 1 - P(A)$$

<p style="text-align:center;"><i>
The probability of the complement of $A$ is equal to 1 minus the probability of $A$.
</i></p>

This property is useful when the complement is easier to handle than the event itself: we compute
$P(\bar{A})$ and subtract it from 1 instead of determining $P(A)$ directly.
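
A small worked case may help here. The sketch below (a made-up example, not one of the exercises) computes the probability of rolling at least one six in four rolls of a fair die, first through the complement and then by brute-force enumeration, using exact fractions.

```python
from fractions import Fraction
from itertools import product

rolls = 4  # illustrative choice

# Complement route: P(at least one six) = 1 - P(no six in any roll)
p_no_six = Fraction(5, 6) ** rolls
p_at_least_one_six = 1 - p_no_six

# Direct route: enumerate all 6**4 outcomes and count those containing a six
outcomes = list(product(range(1, 7), repeat=rolls))
p_direct = Fraction(sum(6 in outcome for outcome in outcomes), len(outcomes))

print(p_at_least_one_six, p_direct)
```

Both routes give the same fraction, but the complement route needs a single line of arithmetic.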

The **conditional probability** is the probability of an event $A$ given that another event $B$ has already occurred. For $P(B) > 0$, it is defined as:

$$P(A|B) = \frac{P(A \cap B)} {P (B)}$$

<p style="text-align:center;"><i>
The probability of $A$ given $B$ is equal to the probability of the intersection of $A$ and $B$, divided by the probability of $B$.
</i></p>

The numerator $P(A \cap B)$ is the probability of the intersection of $A$ and $B$, that is, the probability that both events occur. Dividing by $P(B)$ restricts attention to the outcomes in which $B$ occurs. Conditional probability is fundamental in many areas
of statistics and probability, for example in Bayes' Theorem.
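
To make the definition concrete, here is a minimal sketch with a fair six-sided die, where $A$ is "the roll is even" and $B$ is "the roll is greater than 3". The events and the helper `prob` are assumptions chosen for illustration.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}
A = {x for x in sample_space if x % 2 == 0}  # even roll: {2, 4, 6}
B = {x for x in sample_space if x > 3}       # roll above 3: {4, 5, 6}

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

# P(A|B) = P(A ∩ B) / P(B)
p_a_given_b = prob(A & B) / prob(B)
print(p_a_given_b)
```

Knowing that $B$ occurred raises the probability of an even roll from $1/2$ to $2/3$, because two of the three outcomes in $B$ are even.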

**Exercise 1. Analysis of a Retail Website's Visits** An online store recorded a total of 50,000 visits to its website in January. During the same period, 1,200 of these visits resulted in purchases. Based on this historical data, what is the estimated probability that a randomly chosen future visitor to the store's website completes a purchase (*conversion probability*)? Give your answer as a percentage.

In [3]:
# P(A) = k/n

visits = 50_000
purchases = 1_200

conversion = purchases / visits
print(conversion * 100, "%")

2.4 %


In [39]:
import statsmodels.api as sm

visits = 50_000
purchases = 1_200

conversion_rate = purchases / visits
conf_int = sm.stats.proportion_confint(purchases, visits, method='wilson')

print(f"Conversion rate: {conversion_rate * 100:.2f}%")
print(f"95% Confidence interval: [{conf_int[0] * 100:.2f}%, {conf_int[1] * 100:.2f}%]")


Conversion rate: 2.40%
95% Confidence interval: [2.27%, 2.54%]


*Answer* The probability of a randomly chosen visitor making a purchase on this store's website is 2.4%.
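
The confidence interval reported above comes from the Wilson score method. As a cross-check, the sketch below recomputes it by hand with the standard library only; the function name `wilson_interval` is an assumption for this example, and the formula is the usual textbook Wilson interval rather than a copy of the statsmodels implementation.

```python
import math
from statistics import NormalDist

def wilson_interval(successes, trials, confidence=0.95):
    """Wilson score interval for a binomial proportion (hand-rolled sketch)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    p_hat = successes / trials
    denom = 1 + z**2 / trials
    center = (p_hat + z**2 / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)) / denom
    )
    return center - half_width, center + half_width

low, high = wilson_interval(1_200, 50_000)
print(f"95% Confidence interval: [{low * 100:.2f}%, {high * 100:.2f}%]")
```

With 50,000 visits the interval is narrow, so the 2.4% point estimate is fairly precise under the assumption that visits behave like independent trials.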

**Exercise 2. Fault Analysis in a Production Line** An electronics component factory recorded a total of 240 working days over the last 12 months. During this period, there were 36 days in which at least one failure was found in the production process. Based on this historical data, what is the probability that on a randomly chosen working day in the coming months, at least one failure will be recorded in the production line? Present your answer as a simple fraction.

In [6]:
# P(A) = k/n
from fractions import Fraction

failures = 36
days = 240

# Build the fraction from the integer counts so the result is exact
# and automatically reduced to lowest terms.
fraction_value = Fraction(failures, days)

print(fraction_value)

3/20


**Exercise 3. Analysis of a Production Process** In a manufacturing company that produces electronic gadgets, it has been observed that out of 10,000 units produced in a month, 300 units are defective. The company decides to improve its production processes to reduce the number of defects and wants to evaluate the probability of achieving better results in the coming month. Assuming the defect rate remains unchanged, what is the probability that at least one gadget in a new batch of 500 units is defective?

In [9]:
# P(A) = k/n

gadgets = 10_000
defects = 300

probability = defects / gadgets
print(probability)

0.03


Approximately 3% of the products will be defective, so about 15 of the 500 new products are expected to be defective. The question is how likely it is that at least one of these defective products shows up in the batch.

The proposed problem is a classic application of the complement rule: we first calculate the probability of the complementary event (no unit is defective) and then subtract it from 1. The probability of a defective unit is $P(D) = \frac {300}{10000} = 0.03$. Consequently, the probability that a unit is NOT defective is $P(ND) = 1 - P(D) = 0.97$. Assuming that units are defective independently of one another, the probability that ALL 500 units are NOT defective is:
$$P(500 \text{ non-defective}) = 0.97^{500}$$
Now, we calculate the probability of at least one defective gadget, which is the
complement:
$$P(\text{at least one defective}) = 1 - 0.97 ^{500}$$
This kind of calculation lets the company estimate batch-level risk from the per-unit
defect rate without having to test every unit, and it shows how strongly the chance of
shipping at least one defective gadget depends on the batch size.

In [12]:
# Probability that all n units are non-defective: (1 - p_defect) ** n.
# "At least one defective" is the complement of that event,
# so its probability is 1 - P(all non-defective).
# `probability` is the per-unit defect rate (0.03) computed in the previous cell.

new_batch = 500
all_non_defective = (1 - probability) ** new_batch

one_defective = 1 - all_non_defective
print(one_defective)

0.9999997568539979
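
The same complement can also be evaluated in log space. This is only a sketch using `math.log1p` and `math.expm1` from the standard library; the variable names are assumptions, and for this exercise the result should agree with the direct computation above. The log-space form mainly helps when the per-unit rate is tiny and the answer itself is close to zero, where `1 - (1 - p) ** n` loses precision.

```python
import math

p_defect = 0.03   # per-unit defect rate from the exercise
batch_size = 500  # units in the new batch

# log P(all non-defective) = n * log(1 - p)
log_all_non_defective = batch_size * math.log1p(-p_defect)

# P(at least one defective) = 1 - exp(log P(all non-defective))
p_at_least_one = -math.expm1(log_all_non_defective)
print(p_at_least_one)
```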


**Exercise 4. Customer Preference Analysis in a Clothing Store** A clothing company wants to analyze the behavior of its customers. In the past six months, they noticed that 600 customers purchased a product immediately after trying on a specific item in the fitting room, while 1800 customers tried it on but did not purchase it. Based on these data, the company wants to know the probability that a customer, after trying on an item in the fitting room, decides to purchase it. Present your answer as a decimal value rounded to two decimal places.

In [13]:
# n = A + ~A
# P(A) = k/n

purchased = 600
not_purchased = 1800

total_customers = purchased + not_purchased
probability = purchased / total_customers
print(round(probability, 2))  # the exercise asks for two decimal places

0.25


In [40]:
from scipy.stats import binom

p = 0.80
k = 18
target_prob = 0.95

n = k  # Start at least from k
while True:
    cdf = binom.cdf(k - 1, n, p)  # P(X <= k - 1), i.e. fewer than k successes
    if 1 - cdf > target_prob:
        break
    n += 1

print(f"Minimum invitations to send: {n}")

Minimum invitations to send: 27


In [26]:
16/18

0.8888888888888888