## Probability for Machine Learning

> Notes based on a blog post by Jason Brownlee: https://machinelearningmastery.com/bayes-theorem-for-machine-learning/

---

### Basic concepts
<img src="https://i.imgur.com/l3GDwZR.jpeg" width="600" height="700"/>
<img src="https://i.imgur.com/eX5ELeX.jpeg" width="600" height="700"/>


### Worked Example for Calculating Bayes Theorem


Scenario: Consider a human population in which each person either has cancer or does not (Cancer is True or False), and a medical test that returns a positive or negative result for detecting cancer (Test is Positive or Negative), e.g. a mammogram for detecting breast cancer.

<img src="https://i.imgur.com/MZ7lHIA.jpeg" width="600" height="800"/>
<img src="https://i.imgur.com/QdgcEA7.jpeg" width="700" height="400"/>


The calculation suggests that if a patient tests positive with this test, there is only about a 0.34% chance that they actually have cancer. The base rate is so low that false positives far outnumber true positives.
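One way to sanity-check this result is to count people instead of multiplying probabilities. The sketch below assumes a hypothetical cohort of one million people (the cohort size is an illustration, not part of the original scenario) and applies the scenario's figures: a 0.02% base rate, 85% sensitivity and 95% specificity.

```python
# Natural-frequency check of the worked example.
population = 1_000_000   # hypothetical cohort size (assumption, for illustration)
base_rate = 0.0002       # P(cancer) = 0.02%
sensitivity = 0.85       # P(+ve | cancer)
specificity = 0.95       # P(-ve | not cancer)

with_cancer = population * base_rate
without_cancer = population - with_cancer

# People who test positive, split by whether they actually have cancer
true_positives = with_cancer * sensitivity
false_positives = without_cancer * (1 - specificity)

# Fraction of positive tests that belong to people with cancer
p_cancer_given_positive = true_positives / (true_positives + false_positives)

print(f"{true_positives:.0f} true positives vs {false_positives:.0f} false positives")
print(f"P(cancer | +ve) = {p_cancer_given_positive:.3%}")
```

Under these assumptions, roughly 170 true positives sit alongside about 49,990 false positives, which is why a positive result still leaves the probability of cancer well below 1%.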

The example also shows that calculating a conditional probability requires enough information to fill in every term of Bayes Theorem.

For example, if we already have all the values used in Bayes Theorem, we can plug them in directly.

This is rarely the case; typically we have to calculate the missing pieces first and then plug them in, as we did here. In our scenario we were given three pieces of information: the base rate (P(cancer)), the sensitivity (true positive rate, P(+ve | cancer)), and the specificity (true negative rate, P(-ve | not cancer)). From these we derived the false positive rate, P(+ve | not cancer) = 1 - specificity, and the overall probability of a positive test, P(+ve).

- Sensitivity: 85% of people with cancer will get a positive test result.
- Base Rate: 0.02% of people have cancer.
- Specificity: 95% of people without cancer will get a negative test result.
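
As a sketch of how these three figures combine (restating the calculation from the numbers listed above, on the assumption that the images show the same steps), the missing terms are derived first and then substituted into Bayes Theorem:

$$
\begin{aligned}
P(\text{+ve} \mid \text{not cancer}) &= 1 - P(\text{-ve} \mid \text{not cancer}) = 1 - 0.95 = 0.05 \\
P(\text{not cancer}) &= 1 - P(\text{cancer}) = 1 - 0.0002 = 0.9998 \\
P(\text{+ve}) &= P(\text{+ve} \mid \text{cancer})\,P(\text{cancer}) + P(\text{+ve} \mid \text{not cancer})\,P(\text{not cancer}) \\
&= 0.85 \times 0.0002 + 0.05 \times 0.9998 = 0.05016 \\
P(\text{cancer} \mid \text{+ve}) &= \frac{P(\text{+ve} \mid \text{cancer})\,P(\text{cancer})}{P(\text{+ve})} = \frac{0.00017}{0.05016} \approx 0.00339 = 0.339\%
\end{aligned}
$$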


### Bayes Theorem in Python

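```python
# A: patient has cancer; B: test is positive. We need P(A|B).

def bayes_theorem(p_a, p_b_given_a, p_not_b_given_not_a):
    """
    Return P(A|B) from the base rate, sensitivity and specificity.

    Inputs:
    p_a: base rate, P(A)
    p_b_given_a: true positive rate (sensitivity), P(B|A)
    p_not_b_given_not_a: true negative rate (specificity), P(not B|not A)
    """
    # Complements: P(not A) and the false positive rate P(B|not A)
    p_not_a = 1 - p_a
    p_b_given_not_a = 1 - p_not_b_given_not_a

    # Law of total probability: P(B)
    p_b = p_b_given_a * p_a + p_b_given_not_a * p_not_a

    # Bayes Theorem: P(A|B) = P(B|A) * P(A) / P(B)
    p_a_given_b = (p_b_given_a * p_a) / p_b

    return p_a_given_b

p_a = 0.0002                 # base rate: 0.02%
p_b_given_a = 0.85           # sensitivity: 85%
p_not_b_given_not_a = 0.95   # specificity: 95%

result = bayes_theorem(p_a, p_b_given_a, p_not_b_given_not_a)
print('P(A|B) = %.3f%%' % (result * 100))
# Output: P(A|B) = 0.339%
```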
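
To illustrate how strongly this result depends on the base rate, the following sketch repeats the same calculation for a few base rates. The helper name `posterior` and every base rate other than 0.02% are assumptions chosen for illustration; they are not taken from the original scenario.

```python
def posterior(base_rate, sensitivity, specificity):
    """P(cancer | +ve) via Bayes Theorem (hypothetical helper name)."""
    false_positive_rate = 1 - specificity
    # Law of total probability: P(+ve)
    p_positive = sensitivity * base_rate + false_positive_rate * (1 - base_rate)
    return sensitivity * base_rate / p_positive

# The first base rate is the scenario's; the others are illustrative assumptions.
for base_rate in (0.0002, 0.01, 0.1, 0.5):
    p = posterior(base_rate, sensitivity=0.85, specificity=0.95)
    print(f"base rate {base_rate:>7.2%} -> P(cancer | +ve) = {p:.3%}")
```

With the test's accuracy held fixed, the posterior rises from about 0.34% at the scenario's base rate to roughly 94% when half the population has the condition, so a positive result is only as informative as the prior allows.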

