# Probability refresher

Suggested readings before class:<br>
[Math is fun:Probability](https://www.mathsisfun.com/data/probability.html)


Probability is all about the **chances of an event occurring**, or how likely an event is to occur, within a set of events.

If you really think about it, you've probably been thinking about probability all of your life, for example if you've ever wondered about:


> -  The chances of it raining today
> -  The chances of winning the lottery
> -  The chances of getting hired at Google


To really make sense of **the chances** of an event occurring, we need to look at a bit of math through **probability**.

<br>
In math, probability is modeled by the expression:<br><br>
$P(A) = \frac{\text{count of } A}{\text{sample space}}$<br><br>

 <br><br>

$P(A)$ is the probability of the event $A$ occurring in a set of observed events (the sample space)<br>
count of $A$ is the number of times the event $A$ was present in the whole set<br>
sample space is the total number of observed events


To read this number: the closer it is to 0, the less likely the event is to occur, with a value of 0 meaning it didn't happen at all. The closer it is to 1, the more likely the event is to occur, with a value of 1 meaning it happened in every single case.
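As a small illustration of the expression and the 0-to-1 scale, here is a minimal sketch. The `rained` list is made-up data for this example, not real weather observations.

```python
# Hypothetical observations: True means it rained that day (made-up data).
rained = [True, False, False, True, False, False, False, True, False, False]

count_of_a = sum(rained)     # number of days on which the event occurred
sample_space = len(rained)   # total number of observed days

p_rain = count_of_a / sample_space
print(p_rain)  # 0.3

# The two ends of the scale
p_never = 0 / sample_space              # event never occurred
p_always = sample_space / sample_space  # event occurred every time
print(p_never, p_always)  # 0.0 1.0
```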


<img src="https://www2.southeastern.edu/Academics/Faculty/dgurney/Math241/StatTopics/PrbScl4.jpg" />


You'll often see this represented in data sets in a number of formats. Here are some examples:<br><br>

| Hired|
|------|
| false|
| true |
| true |
| false|

<br>
 or
<br>

| Won Lottery |
| ----|
| yes |
| no  |
| no  |
| no  |
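<br>
Either format can be turned into a probability by counting the matching rows. The sketch below copies the two example columns above into plain Python lists; `probability_of` is a hypothetical helper written for this illustration.

```python
# The two example columns from the tables above, written as Python lists.
hired = [False, True, True, False]
won_lottery = ["yes", "no", "no", "no"]


def probability_of(values, target):
    """Share of entries in `values` that equal `target` (hypothetical helper)."""
    matches = sum(1 for v in values if v == target)
    return matches / len(values)


p_hired = probability_of(hired, True)
p_won = probability_of(won_lottery, "yes")
print(p_hired)  # 0.5
print(p_won)    # 0.25
```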


Let's look at a short example by actually examining some hiring numbers at Google<br><br>
[Click here to read an article about the hiring stats at Google](https://qz.com/285001/heres-why-you-only-have-a-0-2-chance-of-getting-hired-at-google/)<br>
> __Google gets around 3 million applications a year now, according to HR head Laszlo Bock, and hires 7,000 .... making it far more selective than institutions like Harvard, Yale, and Stanford.__<br>


<br><br>



So we have 3 million **observed events** (in this case, the event is submitting an application).<br>
And in 7,000 of those 3 million applications, a hiring occurred.
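<br>
As a quick check by hand, plugging these two numbers into the probability expression gives the following (the label "hired" for the event is only chosen here for illustration):<br><br>
$P(\text{hired}) = \frac{7000}{3000000} = \frac{7}{3000} \approx 0.0023$<br><br>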


Let's model that with some code!




In [5]:
import random
# total number of applicants to Google
num_of_applicants = 3000000

# here we have the number of people that actually got hired out of that group
total_hires = 7000


def probability(event_count, sample_space):
    # P(A) = count of A / sample space
    p = event_count / sample_space
    return p

prob_hired = probability(total_hires, num_of_applicants)
display(prob_hired)

0.0023333333333333335

^ This number can be turned into more human-readable forms, such as a percentage, by multiplying it by 100.

In [12]:
def percentage(prob):
    # convert a probability (0 to 1) into a percentage string
    pct = prob * 100
    return '{}% chance of occurrence'.format(pct)

percentage(prob_hired)

'0.23333333333333336% chance of occurrence'
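
The long run of decimals comes from floating-point arithmetic. One possible way to get a tidier string is Python's built-in `%` format spec, which multiplies by 100 and rounds in one step. This is a sketch of an alternative, not a replacement for the function above; `label` is a name introduced just for this example.

```python
prob_hired = 7000 / 3000000  # same numbers as the Google example

# ".2%" multiplies by 100, rounds to two decimal places, and appends "%".
label = '{:.2%} chance of occurrence'.format(prob_hired)
print(label)  # 0.23% chance of occurrence
```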

To get the fractional form of the probability, divide both the count of $A$ and the sample space by the count of $A$, then (if needed) round the denominator.

In [23]:
def fraction_probability(event_count, sample_space):
    # dividing both parts by event_count leaves 1 as the numerator
    denominator = round(sample_space / event_count)
    numerator = int(event_count / event_count)
    return '{}/{} chance of occurrence'.format(numerator, denominator)

fraction_probability(total_hires, num_of_applicants)

'1/429 chance of occurrence'
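
Because the denominator is rounded, the "1 in 429" form is an approximation. If the exact fraction is wanted, the standard library's `fractions.Fraction` reduces it automatically. The sketch below is one way to compare the two forms; the variable names `exact` and `approx` are introduced only for this example.

```python
from fractions import Fraction

# Exact probability, reduced to lowest terms automatically.
exact = Fraction(7000, 3000000)
print(exact)  # 7/3000

# The rounded "1 in N" form: N is the reciprocal rounded to the nearest integer.
approx = Fraction(1, round(1 / exact))
print(approx)  # 1/429
```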

# Conditional Probability

Conditional probability takes this a bit further in that it gets more descriptive