### Quiz 1
A marketing team is running two parallel campaigns:

- Campaign A promotes a new feature update

- Campaign B promotes a discount offer

From a small test group of $6$ users:

- Users who engaged with Campaign A: `{'Alice', 'Bob', 'Charlie', 'David'}`

- Users who engaged with Campaign B: `{'Charlie', 'Eve', 'Frank', 'Bob'}`

The probability of engaging with at least one campaign follows from the addition rule:

$$
P(A \cup B) = P(A) + P(B) - P(A \cap B)
$$

Where:

- $P(A \cup B)$ is the probability of engaging with at least one campaign

- $P(A)$ and $P(B)$ are the probabilities of engaging with each campaign individually

- $P(A \cap B)$ is the probability of engaging with both campaigns

Write a Python function `probability_union(set1, set2, total_population)` to calculate the probability that a randomly selected user from this test group engaged with at least one of the campaigns.

In [3]:
# Users who engaged with each campaign
likes_a = {'Alice', 'Bob', 'Charlie', 'David'}
likes_b = {'Charlie', 'Eve', 'Frank', 'Bob'}

In [None]:
print(len(likes_a.union(likes_b))/6)
print(len(likes_a)/6)
print(len(likes_b)/6)
print(len(likes_a.intersection(likes_b))/6)



1.0
0.6666666666666666
0.6666666666666666
0.3333333333333333
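
The cell above prints the four probabilities directly; the function the quiz asks for could be sketched as follows. This is a minimal sketch, assuming `total_population` is the size of the test group and that both sets contain only members of that group. It applies the addition rule to counts rather than to the individual probabilities, so the result is not affected by floating-point rounding.

```python
def probability_union(set1, set2, total_population):
    """P(A ∪ B) via the addition rule, computed on counts."""
    if total_population <= 0:
        raise ValueError("total_population must be positive")
    # |A ∪ B| = |A| + |B| - |A ∩ B|
    union_count = len(set1) + len(set2) - len(set1 & set2)
    return union_count / total_population


likes_a = {'Alice', 'Bob', 'Charlie', 'David'}
likes_b = {'Charlie', 'Eve', 'Frank', 'Bob'}

p_union = probability_union(likes_a, likes_b, 6)
print(p_union)  # 1.0
```

The result matches `len(likes_a.union(likes_b)) / 6`, because all six users in the test group engaged with at least one campaign.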


### Quiz 2
A weather analytics company reports the following forecast for a city:

- There is a $40$% chance of rain tomorrow.

- If it rains, the chance of thunder is $60$%.

- If it doesn't rain, the chance of thunder is just $5$% (e.g., due to dry thunderstorms).

Write a Python function `weather_thunder_probability(p_rain, p_thunder_given_rain, p_thunder_given_no_rain)` that calculates:

1. The probability of both rain and thunder using the multiplication rule:

$$
P(A \cap B) = P(A) \times P(B \mid A)
$$

Where:

- $P(A \cap B)$: Probability of both A and B occurring
- $P(A)$: Probability of A
- $P(B \mid A)$: Probability of B given A

2. The probability of thunder overall, whether or not it rains, using the law of total probability:

$$
P(B) = P(A) \times P(B \mid A) + (1 - P(A)) \times P(B \mid A')
$$

Where:

- $P(B)$: Overall probability of B
- $P(A)$: Probability of A
- $1 - P(A)$: Probability of A not occurring (i.e., $P(A')$)
- $P(B \mid A)$: Probability of B given A
- $P(B \mid A')$: Probability of B given not A

Return both results rounded to two decimal places.

In [9]:

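def weather_thunder_probability(p_rain, p_thunder_given_rain, p_thunder_given_no_rain):
    # Multiplication rule: P(rain and thunder) = P(rain) * P(thunder | rain)
    p_rain_and_thunder = p_rain * p_thunder_given_rain
    # Complement: P(no rain) = 1 - P(rain)
    p_no_rain = 1 - p_rain
    p_no_rain_and_thunder = p_no_rain * p_thunder_given_no_rain
    # Law of total probability: add the rain and no-rain cases
    p_thunder = p_rain_and_thunder + p_no_rain_and_thunder
    # Return both results, rounded to two decimal places
    return round(p_rain_and_thunder, 2), round(p_thunder, 2)

p_rain = 0.4
p_thunder_given_rain = 0.6
p_thunder_given_no_rain = 0.05

print(weather_thunder_probability(p_rain, p_thunder_given_rain, p_thunder_given_no_rain))

(0.24, 0.27)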
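
As a hand check, take A as rain and B as thunder and substitute the forecast values ($P(A) = 0.4$, $P(B \mid A) = 0.6$, $P(B \mid A') = 0.05$) into the two formulas:

$$
P(A \cap B) = 0.4 \times 0.6 = 0.24
$$

$$
P(B) = 0.4 \times 0.6 + (1 - 0.4) \times 0.05 = 0.24 + 0.03 = 0.27
$$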


### Quiz 3
You are working with a customer service analytics team that reviews user interactions logged as tuples of the form `(issue_type, channel)`. The company wants to analyse event relationships from this dataset:

```
interactions = [
    ('Billing', 'Email'), ('Technical', 'Phone'),
    ('Billing', 'Phone'), ('General', 'Chat'),
    ('Technical', 'Chat'), ('General', 'Phone'),
    ('Billing', 'Chat'), ('Technical', 'Email'),
    ('Billing', 'Phone'), ('General', 'Email')
]
```
The team is investigating whether there’s a meaningful relationship between:

- Users who report Billing issues

- Users who contact via the Phone channel

Write a function `check_billing_phone_overlap(data)` that:

- Determines whether any record matches both Billing issue and Phone channel

- Compares the actual overlap with the expected overlap (product of individual proportions)

- Returns whether:

  - There is any overlap (i.e., the events are not disjoint). Two events are disjoint when:
$$
P(A \cap B) = 0
$$

This means:
- The occurrence of one event **excludes** the occurrence of the other.
- The events have **no overlap** in the sample space.

  - The overlap is what you’d expect if they were unrelated (i.e., independent). Two events are independent when:
$$
P(A \cap B) = P(A) \times P(B)
$$

This means:
- The probability of both events occurring equals the product of their individual probabilities.
- The events can **still overlap**, but their co-occurrence is purely by chance, not influence.

In [11]:
# Example interaction: (issue_type, channel)
interactions = [
    ('Billing', 'Email'), ('Technical', 'Phone'),
    ('Billing', 'Phone'), ('General', 'Chat'),
    ('Technical', 'Chat'), ('General', 'Phone'),
    ('Billing', 'Chat'), ('Technical', 'Email'),
    ('Billing', 'Phone'), ('General', 'Email')
]

In [14]:
def check_billing_phone_overlap(data):
    total = len(data)
    # Counts for each event and for their intersection
    billing_count = sum(1 for issue, _ in data if issue == 'Billing')
    phone_count = sum(1 for _, channel in data if channel == 'Phone')
    overlap_count = sum(
        1 for issue, channel in data
        if issue == 'Billing' and channel == 'Phone'
    )

    # Probabilities
    p_billing = billing_count / total          # P(A)
    p_phone = phone_count / total              # P(B)
    p_actual_overlap = overlap_count / total   # P(A ∩ B)
    p_expected_overlap = p_billing * p_phone   # P(A)P(B)

    return {
        "actual_overlap_count": overlap_count,
        "actual_overlap_probability": p_actual_overlap,
        "expected_overlap_probability": p_expected_overlap,
        "is_disjoint": overlap_count == 0,
        # Compare with a small tolerance to absorb floating-point error
        "is_independent_like": abs(p_actual_overlap - p_expected_overlap) < 1e-9
    }

result = check_billing_phone_overlap(interactions)
print(result)

{'actual_overlap_count': 2, 'actual_overlap_probability': 0.2, 'expected_overlap_probability': 0.16000000000000003, 'is_disjoint': False, 'is_independent_like': False}
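
The expected overlap prints as `0.16000000000000003` because of floating-point rounding, which is why the comparison above needs a tolerance. One alternative, sketched here with the standard-library `fractions.Fraction`, keeps the proportions exact so the independence check becomes a plain equality. The helper name `overlap_as_fractions` is hypothetical, and the sketch assumes a non-empty dataset.

```python
from fractions import Fraction

interactions = [
    ('Billing', 'Email'), ('Technical', 'Phone'),
    ('Billing', 'Phone'), ('General', 'Chat'),
    ('Technical', 'Chat'), ('General', 'Phone'),
    ('Billing', 'Chat'), ('Technical', 'Email'),
    ('Billing', 'Phone'), ('General', 'Email')
]


def overlap_as_fractions(data):
    """Return (P(A ∩ B), P(A) * P(B)) as exact fractions for Billing and Phone."""
    total = len(data)  # assumed to be non-zero
    billing = sum(1 for issue, _ in data if issue == 'Billing')
    phone = sum(1 for _, channel in data if channel == 'Phone')
    both = sum(
        1 for issue, channel in data
        if issue == 'Billing' and channel == 'Phone'
    )
    actual = Fraction(both, total)
    expected = Fraction(billing, total) * Fraction(phone, total)
    return actual, expected


actual, expected = overlap_as_fractions(interactions)
print(actual, expected, actual == expected)  # 1/5 4/25 False
```

With exact arithmetic the observed overlap (1/5) is slightly larger than the overlap expected under independence (4/25), which agrees with the `is_independent_like: False` result above.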


### Quiz 4
A retail company is analysing its transaction data to understand the distribution of sales across different regions. Each transaction is stored as a tuple in the format `(region, product_category)`.

From a dataset of 10 recent transactions, the company wants to calculate the marginal probability that a randomly selected transaction came from the North region.

$$
P(A) = \frac{\text{Number of favorable outcomes}}{\text{Total number of outcomes}}
$$

Where:

- $P(A)$: Probability of a transaction being from the "North" region

- Favorable outcomes: Transactions from "North"

- Total outcomes: All transactions in the dataset

Write a Python function `marginal_probability(data, condition_type, condition_value)` to calculate the marginal probability based on a given condition (e.g., region = `'North'`).

**Hint:**

Marginal probability is the probability of a single event occurring, irrespective of other variables.

In [15]:
transactions = [
    ('North', 'Electronics'), ('South', 'Clothing'),
    ('North', 'Clothing'), ('East', 'Electronics'),
    ('South', 'Electronics'), ('West', 'Clothing'),
    ('North', 'Clothing'), ('West', 'Electronics'),
    ('South', 'Electronics'), ('East', 'Clothing')
]

In [None]:
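def marginal_probability(data, condition_type, condition_value):
    # Guard against an empty dataset to avoid division by zero
    if len(data) == 0:
        return 0.0

    # Tuples are (region, product_category): 'region' selects position 0,
    # any other condition_type selects position 1
    index = 0 if condition_type == 'region' else 1

    # Count the records whose selected field matches the condition
    count = sum(1 for item in data if item[index] == condition_value)

    return count / len(data)


condition_type = 'region'
condition_value = 'North'
print(marginal_probability(transactions, condition_type, condition_value))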


0.3
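
Because a marginal probability ignores the other variable, the same counting idea extends to every region at once. The sketch below uses `collections.Counter` to build the whole marginal distribution; the helper name `marginal_distribution` and its `index` parameter are illustrative assumptions, not part of the quiz specification.

```python
from collections import Counter

transactions = [
    ('North', 'Electronics'), ('South', 'Clothing'),
    ('North', 'Clothing'), ('East', 'Electronics'),
    ('South', 'Electronics'), ('West', 'Clothing'),
    ('North', 'Clothing'), ('West', 'Electronics'),
    ('South', 'Electronics'), ('East', 'Clothing')
]


def marginal_distribution(data, index):
    """Map each value found at position `index` to its share of the records."""
    counts = Counter(item[index] for item in data)
    total = len(data)
    # An empty dataset yields an empty dict, so no division by zero occurs
    return {value: count / total for value, count in counts.items()}


region_dist = marginal_distribution(transactions, 0)
print(region_dist)  # {'North': 0.3, 'South': 0.3, 'East': 0.2, 'West': 0.2}
```

The four marginal probabilities sum to 1, which is a quick sanity check on any marginal distribution.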


### Quiz 5

A hospital's data team is analysing patient visits to understand how different age groups use various services. Each patient record is stored as a tuple: `(age_group, visit_type)`

They want to calculate the joint probability that a randomly selected patient is both a Senior and visited the Emergency department.

$$
P(A \cap B) = \frac{\text{Number of outcomes where both A and B occur}}{\text{Total number of outcomes}}
$$

Where:

- $P(A \cap B)$: Probability that both conditions (Senior and Emergency) are satisfied

- **Numerator**: Number of such records

- **Denominator**: Total number of records

Write a Python function `joint_probability(data, value1, value2)` to compute the joint probability from the dataset of `(age_group, visit_type)`.



In [18]:
# Hospital visit records: (age_group, visit_type)
hospital_data = [
    ('Senior', 'Emergency'), ('Adult', 'Routine'),
    ('Adult', 'Emergency'), ('Child', 'Routine'),
    ('Senior', 'Routine'), ('Adult', 'Emergency'),
    ('Child', 'Emergency'), ('Senior', 'Emergency')
]

In [20]:
def joint_probability(data, value1, value2):
    # Guard against an empty dataset to avoid division by zero
    if len(data) == 0:
        return 0.0

    # Count the records where both conditions hold
    count = sum(
        1 for age_group, visit_type in data
        if age_group == value1 and visit_type == value2
    )

    return count / len(data)

print(round(joint_probability(hospital_data, 'Senior', 'Emergency'), 2))


0.25
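
To see how joint and marginal probabilities relate, the sketch below tabulates the joint probability of every `(age_group, visit_type)` pair and then sums the Senior entries to recover $P(\text{Senior})$. The helper name `joint_probability_table` is hypothetical and is introduced only for illustration.

```python
from collections import Counter

hospital_data = [
    ('Senior', 'Emergency'), ('Adult', 'Routine'),
    ('Adult', 'Emergency'), ('Child', 'Routine'),
    ('Senior', 'Routine'), ('Adult', 'Emergency'),
    ('Child', 'Emergency'), ('Senior', 'Emergency')
]


def joint_probability_table(data):
    """Map each (age_group, visit_type) pair to its joint probability."""
    counts = Counter(data)
    total = len(data)
    return {pair: count / total for pair, count in counts.items()}


table = joint_probability_table(hospital_data)

# Summing the joint probabilities over visit types gives the marginal P(Senior)
p_senior = sum(p for (age_group, _), p in table.items() if age_group == 'Senior')

print(table[('Senior', 'Emergency')])  # 0.25
print(p_senior)                        # 0.375
```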


### Quiz 6
An academic analytics team is studying student performance based on gender. The dataset stores each student record as a tuple `(gender, passed)`, where `passed = True` means the student passed and `False` means the student failed.

The team wants to compute the probability that a student passed given that the student is female.

$$
P(B \mid A) = \frac{P(A \cap B)}{P(A)}
$$
Where:

- $P(B \mid A)$: Probability of event B occurring given that event A has occurred

- $P(A \cap B)$: Joint probability of both A and B occurring

- $P(A)$: Marginal probability of event A

Write a Python function `conditional_probability(data, given_key, given_value, target_key, target_value)` that calculates conditional probabilities for any given condition.

Use it to compute the probability that a student passed given that the student is female.

In [22]:
# Student data: (gender, passed)
student_data = [
    ('Male', True), ('Female', False), ('Female', True),
    ('Male', True), ('Female', True), ('Male', False),
    ('Female', True), ('Male', True)
]

In [23]:
# Probability that a student passed given that the student is female
def conditional_probability(data, given_key, given_value, target_key, target_value):
    key_map = { 'gender': 0, 'passed': 1 }

    given_index = key_map[given_key]
    target_index = key_map[target_key]

    count_given = sum(1 for item in data if item[given_index] == given_value)

    count_joint = sum(1 for item in data 
                      if item[given_index] == given_value and item[target_index] == target_value)
    
    if count_given == 0:
        return 0.0
    
    return count_joint / count_given

prob = conditional_probability(student_data,
                               given_key='gender', given_value='Female',
                               target_key='passed', target_value=True)

print(prob)




0.75
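
The function above divides counts directly. As a cross-check, the sketch below evaluates the formula $P(B \mid A) = P(A \cap B) / P(A)$ from the two probabilities, with A as "student is female" and B as "student passed". The variable names are illustrative.

```python
student_data = [
    ('Male', True), ('Female', False), ('Female', True),
    ('Male', True), ('Female', True), ('Male', False),
    ('Female', True), ('Male', True)
]

total = len(student_data)

# P(A): marginal probability that a student is female
p_female = sum(1 for gender, _ in student_data if gender == 'Female') / total

# P(A ∩ B): joint probability that a student is female and passed
p_female_and_passed = sum(
    1 for gender, passed in student_data
    if gender == 'Female' and passed
) / total

# P(B | A) = P(A ∩ B) / P(A)
p_passed_given_female = p_female_and_passed / p_female

print(p_female, p_female_and_passed, p_passed_given_female)  # 0.5 0.375 0.75
```

Both routes give 0.75 because the total number of records cancels out of the ratio.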


### Quiz 7
A quality control inspector at a manufacturing plant selects two items from a batch of 100 products for inspection. The batch contains:

- 10 defective items

- 90 non-defective items

The inspector draws one item at random, does not replace it, and then draws a second item. The team wants to compute the probability that the first item is defective and the second is non-defective.

$$
P(A \cap B) = P(A) \times P(B \mid A)
$$

Where:

- $P(A)$: Probability that the first item is defective

- $P(B \mid A)$: Probability that the second item is non-defective, given the first was defective

- $P(A \cap B)$: Probability that the first item is defective and the second is non-defective

Write a function `compound_event_probability()` to calculate the compound probability using the given conditions (without replacement).

In [24]:
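def compound_event_probability(total_items=100, defective_items=10):
    # P(A): the first item drawn is defective
    P_A = defective_items / total_items
    # The first item is not replaced, so one fewer item remains
    remaining_total = total_items - 1

    # All non-defective items are still in the batch after a defective first draw
    remaining_non_defective = total_items - defective_items
    # P(B | A): the second item is non-defective given the first was defective
    P_B_given_A = remaining_non_defective / remaining_total
    return P_A * P_B_given_A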

print(compound_event_probability())


0.09090909090909091
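
The multiplication rule can be checked by brute force: enumerate every ordered pair of distinct items and count the pairs in which the first is defective and the second is not. The sketch below does this with `itertools.permutations` and returns an exact `fractions.Fraction`. The function name `enumerate_defective_then_good` is hypothetical, and which item positions are treated as defective is arbitrary.

```python
from fractions import Fraction
from itertools import permutations


def enumerate_defective_then_good(total_items=100, defective_items=10):
    """Exact P(first defective, second non-defective) by enumerating ordered draws."""
    # True marks a defective item; the first `defective_items` positions are defective
    batch = [True] * defective_items + [False] * (total_items - defective_items)

    favorable = 0
    outcomes = 0
    # Each ordered pair of distinct indices is one equally likely draw without replacement
    for first, second in permutations(range(total_items), 2):
        outcomes += 1
        if batch[first] and not batch[second]:
            favorable += 1

    return Fraction(favorable, outcomes)


exact = enumerate_defective_then_good()
print(exact)  # 1/11
```

Of the 9900 ordered pairs, 900 are favorable, so the exact probability is 1/11, which agrees with the decimal value printed above.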


### Quiz 8
A healthcare team is analysing responses from a health behavior survey. Each entry includes a person’s gender and whether they are a smoker.

From the data, they want to understand patterns in smoking behavior and answer the following:

- What proportion of the survey participants are smokers?

- What proportion of the participants are female and also smokers?

- Among female participants, what proportion are smokers?

1. Probability of a Specific Condition

$$
P(A) = \frac{\text{Number of individuals with A}}{\text{Total number of individuals}}
$$

2. Probability of Two Conditions Happening Together

$$
P(A \cap B) = \frac{\text{Number of individuals with both A and B}}{\text{Total number of individuals}}
$$

3. Probability of One Condition Among a Group

$$
P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{\text{Number with both A and B}}{\text{Number with A}}
$$

Write three Python functions and round the results to $2$ decimal places:

- `get_marginal(data, key, value)`

- `get_joint(data, key1, val1, key2, val2)`

- `get_conditional(data, given_key, given_val, target_key, target_val)`

In [25]:
# Survey dataset
survey_data = [
    {'gender': 'Male', 'smoker': True},
    {'gender': 'Female', 'smoker': False},
    {'gender': 'Female', 'smoker': True},
    {'gender': 'Male', 'smoker': True},
    {'gender': 'Female', 'smoker': True}
]

In [27]:
def get_marginal(data, key, value):
    count_favorable = sum(1 for rec in data if rec.get(key)==value)
    if len(data) == 0:
        return 0.0
    return count_favorable/len(data) 

def get_joint(data, key1, val1, key2, val2):
    count_joint = sum(1 for rec in data if rec.get(key1)==val1 and rec.get(key2)==val2)

    if len(data) == 0:
        return 0.0
    
    return count_joint/len(data) 

def get_conditional(data, given_key, given_val, target_key, target_val):    
    p_given = get_marginal(data, given_key, given_val)

    p_joint = get_joint(data, given_key, given_val, target_key, target_val)

    if p_given == 0.0:
        return 0.0
    
    return p_joint/p_given


# - What proportion of the survey participants are smokers?
print(round(get_marginal(survey_data,'smoker',True),2))

# - What proportion of the participants are female and also smokers?
print(round(get_joint(survey_data,'gender', 'Female', 'smoker',True),2))

# - Among female participants, what proportion are smokers?
print(round(get_conditional(survey_data,'gender', 'Female', 'smoker',True),2))


    


0.8
0.4
0.67
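
As a hand check on the three printed values, take A as "participant is female" and B as "participant is a smoker". The survey has 5 participants, of whom 4 are smokers, 3 are female, and 2 are both female and smokers:

$$
P(B) = \frac{4}{5} = 0.8
$$

$$
P(A \cap B) = \frac{2}{5} = 0.4
$$

$$
P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{2/5}{3/5} = \frac{2}{3} \approx 0.67
$$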
