## Exercise 1
In Orange County, 51% of the adults are males. (It doesn't take too much advanced
mathematics to deduce that the other 49% are females.) One adult is randomly selected
for a survey involving credit card usage.

- **(a)** Find the probability that the selected person is a male.

- **(b)** It is later learned that the selected survey subject was smoking a cigar. Also, 9.5% of males smoke cigars, whereas 1.7% of females smoke cigars (based on data from the Substance Abuse and Mental Health Services Administration). Use this additional information to find the probability that the cigar-smoking respondent is a male.

Use the following notation:

M = male <br>
F = female <br>
C = cigar smoker<br>
NC = not a cigar smoker<br>


In [1]:
# Given probabilities
P_M = 0.51   # Probability of being male
P_F = 0.49   # Probability of being female
P_C_given_M = 0.095  # Probability of smoking cigars given male
P_C_given_F = 0.017  # Probability of smoking cigars given female

# Total probability of smoking cigars
P_C = (P_C_given_M * P_M) + (P_C_given_F * P_F)

# Probability of being male given smoking cigars (using Bayes' Theorem)
P_M_given_C = (P_C_given_M * P_M) / P_C

# Print the result
print(f"The probability that the cigar-smoking respondent is male is approximately {P_M_given_C:.3f} or {P_M_given_C * 100:.1f}%")

The probability that the cigar-smoking respondent is male is approximately 0.853 or 85.3%
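
As a cross-check on the computed value, the same calculation can be worked by hand. Part (a) needs no conditioning: P(M) = 0.51. For part (b), Bayes' theorem with the notation M, F, C defined above gives, as a sketch of the arithmetic:

$$
P(M \mid C) = \frac{P(C \mid M)\,P(M)}{P(C \mid M)\,P(M) + P(C \mid F)\,P(F)}
= \frac{0.095 \times 0.51}{0.095 \times 0.51 + 0.017 \times 0.49}
= \frac{0.04845}{0.05678} \approx 0.853
$$

Learning that the respondent smokes cigars raises the probability of "male" from 0.51 to about 0.853.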


## Exercise 2

A diagnostic test has a probability 0.95 of giving a positive result when applied to a person suffering
from a certain disease, and a probability 0.10 of giving a (false) positive when applied to a non-sufferer. It is
estimated that 0.5% of the population are sufferers. Suppose that the test is now administered to a person about
whom we have no relevant information relating to the disease (apart from the fact that he/she comes from this
population). 

Calculate the following probabilities:
- **(a)** that the test result will be positive;
- **(b)** that, given a positive result, the person is a sufferer;
- **(c)** that, given a negative result, the person is a non-sufferer;
- **(d)** that the person will be misclassified.

Use the following notation:

T = test positive <br>
NT = test negative<br>
S = sufferer<br>
NS = non-sufferer<br>
M = misclassified<br>

Solve it by two approaches:
1. Arithmetically
2. By simulation

In [2]:
import numpy as np

# Parameters
P_T_given_S = 0.95
P_T_given_NS = 0.10
P_S = 0.005
P_NS = 1 - P_S
num_simulations = 100000

# Simulate the population
population = np.random.choice(['S', 'NS'], size=num_simulations, p=[P_S, P_NS])

# Simulate test results
test_results = []
for person in population:
    if person == 'S':
        test_results.append(np.random.choice(['T', 'NT'], p=[P_T_given_S, 1 - P_T_given_S]))
    else:
        test_results.append(np.random.choice(['T', 'NT'], p=[P_T_given_NS, 1 - P_T_given_NS]))

# Calculate probabilities
test_results = np.array(test_results)  # convert once; population is already a NumPy array
P_T_simulated = np.mean(test_results == 'T')
P_T_given_S_simulated = np.mean(test_results[population == 'S'] == 'T')
# Bayes' theorem using the simulated P(T|S) and P(T) together with the known prior P_S
P_S_given_T_simulated = P_T_given_S_simulated * P_S / P_T_simulated
# Fraction of non-sufferers among negative tests: index population (not test_results) by the NT mask
P_NS_given_NT_simulated = np.mean(population[test_results == 'NT'] == 'NS')
# Misclassified = false negatives + false positives
P_misclassified_simulated = np.mean((population == 'S') & (test_results == 'NT')) + \
                            np.mean((population == 'NS') & (test_results == 'T'))

# Print the results
print(f"Simulated Probability of Positive Test: {P_T_simulated:.4f}")
print(f"Simulated Probability of Sufferer given Positive Test: {P_S_given_T_simulated:.4f}")
print(f"Simulated Probability of Non-Sufferer given Negative Test: {P_NS_given_NT_simulated:.4f}")
print(f"Simulated Probability of Misclassification: {P_misclassified_simulated:.4f}")

Simulated Probability of Positive Test: 0.1032
Simulated Probability of Sufferer given Positive Test: 0.0463
Simulated Probability of Non-Sufferer given Negative Test: 0.9997
Simulated Probability of Misclassification: 0.0984
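
The exercise also asks for an arithmetic solution. The following is a minimal sketch of that approach, using the law of total probability and Bayes' theorem with the notation T, NT, S, NS, M from the problem statement. The function name `exact_probabilities` and its dictionary keys are hypothetical and chosen only for illustration.

```python
def exact_probabilities(p_s=0.005, p_t_given_s=0.95, p_t_given_ns=0.10):
    """Closed-form answers to parts (a)-(d); the name and keys are illustrative."""
    p_ns = 1 - p_s                       # P(NS)
    p_nt_given_s = 1 - p_t_given_s       # P(NT | S), false-negative rate
    p_nt_given_ns = 1 - p_t_given_ns     # P(NT | NS), true-negative rate

    # (a) Law of total probability: P(T) = P(T|S)P(S) + P(T|NS)P(NS)
    p_t = p_t_given_s * p_s + p_t_given_ns * p_ns

    # (b) Bayes' theorem: P(S|T) = P(T|S)P(S) / P(T)
    p_s_given_t = p_t_given_s * p_s / p_t

    # (c) Bayes' theorem: P(NS|NT) = P(NT|NS)P(NS) / P(NT)
    p_ns_given_nt = p_nt_given_ns * p_ns / (1 - p_t)

    # (d) Misclassified = false negative or false positive
    p_m = p_nt_given_s * p_s + p_t_given_ns * p_ns

    return {"P_T": p_t, "P_S_given_T": p_s_given_t,
            "P_NS_given_NT": p_ns_given_nt, "P_M": p_m}


for name, value in exact_probabilities().items():
    print(f"{name}: {value:.5f}")
```

Worked by hand, these are P(T) = 0.10425, P(S|T) = 0.00475 / 0.10425 ≈ 0.0456, P(NS|NT) = 0.8955 / 0.89575 ≈ 0.9997, and P(M) = 0.09975. A correct simulation should land close to these values, up to sampling noise. Sufferers make up only 0.5% of the population, so estimates conditioned on S rest on a few hundred simulated cases and fluctuate more than the others.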
