# Chapter 1 - Probability

In [1]:
import pandas as pd

gss = pd.read_csv('gss_bayes.csv', index_col=0)
gss.head()

caseid,year,age,sex,polviews,partyid,indus10
1,1974,21.0,1,4.0,2.0,4970.0
2,1974,41.0,1,5.0,0.0,9160.0
5,1974,58.0,2,6.0,1.0,2670.0
6,1974,30.0,1,5.0,4.0,6870.0
7,1974,48.0,1,5.0,4.0,7860.0


In [7]:
banker = gss['indus10'] == 6870
banker.head()

caseid
1    False
2    False
5    False
6     True
7    False
Name: indus10, dtype: bool

In [3]:
banker.sum()

728

In [4]:
banker.mean()

0.014769730168391155

In [5]:
def prob(A):
    """Compute the probability of proposition A, given as a boolean Series."""
    # The mean of a boolean Series is the fraction of True values.
    return A.mean()

This lets us compute the probability that a respondent is a banker:

In [8]:
prob(banker)

0.014769730168391155
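
Why `mean` works here: pandas treats `True` as 1 and `False` as 0, so the mean of a boolean Series is the fraction of `True` values. A minimal sketch on a made-up five-row Series (the values and the `toy_` names are illustrative, not taken from the GSS data):

```python
import pandas as pd

# Hypothetical toy data: industry codes for five made-up respondents.
toy_indus = pd.Series([4970, 9160, 2670, 6870, 6870])

# Comparing against a code yields a boolean Series.
toy_banker = toy_indus == 6870

# sum() counts the True values; mean() is that count divided by the length.
count = toy_banker.sum()
fraction = toy_banker.mean()

print(count)     # 2
print(fraction)  # 0.4
```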

In [10]:
female = gss['sex'] == 2
prob(female)

0.5378575776019476

We can also look at the `polviews` column:

Value | View
------|------
1 | Extremely liberal
2 | Liberal
3 | Slightly liberal
4 | Moderate
5 | Slightly conservative
6 | Conservative
7 | Extremely conservative

In [15]:
liberal = gss['polviews'] <= 3
prob(liberal)

0.27374721038750255

We can also look at party affiliation via the `partyid` column:

Value | Coding
----|----
0 | Strong Democrat
1 | Not strong Democrat
2 | Independent, near Democrat
3 | Independent
4 | Independent, near Republican
5 | Not strong Republican
6 | Strong Republican
7 | Other party

In [17]:
democrat = gss['partyid'] <= 1
prob(democrat)

0.3662609048488537

## Conjunction

- $P(A \& B)$
- If we have two boolean Series, use `&` to compute their conjunction.


In [19]:
prob(banker)
prob(democrat)
prob(banker & democrat)

0.004686548995739501

In [20]:
prob(democrat & banker)

0.004686548995739501
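
To see what `&` does element by element, here is a small sketch on two hand-made boolean Series (`a` and `b` are hypothetical stand-ins for propositions such as `banker` and `democrat`):

```python
import pandas as pd

# Hypothetical toy propositions for four made-up respondents.
a = pd.Series([True, True, False, False])
b = pd.Series([True, False, True, False])

# The conjunction is True only where both a and b are True.
both = a & b
same_either_way = (a & b).equals(b & a)

print(both.tolist())    # [True, False, False, False]
print(both.mean())      # 0.25
print(same_either_way)  # True: conjunction is commutative
```

One thing to watch when writing the comparisons inline: `&` binds more tightly than `==`, so each comparison needs its own parentheses, as in `(gss['sex'] == 2) & (gss['indus10'] == 6870)`.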

## Conditional Probability

- $P(A | B)$
- "Of all the people who are B, what is the fraction that are A?"
- Calculate it with bracket (`[]`) indexing: `A[B]` keeps only the rows of `A` where `B` is true.

To calculate $P(\text{Democrat} | \text{liberal})$:


In [22]:
selected = democrat[liberal]
prob(selected)

0.5206403320240125

In [24]:
selected = female[banker]
prob(selected)

0.7706043956043956

In [30]:
def conditional(proposition, given):
    return prob(proposition[given])

In [31]:
conditional(liberal, given=female)

0.27581004111500884
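
The bracket step is the part that does the conditioning: it throws away every row where `given` is false before the mean is taken. A minimal sketch with made-up data (the `toy_` Series are assumptions for illustration; `prob` and `conditional` are repeated so the snippet runs on its own):

```python
import pandas as pd

def prob(A):
    """Fraction of True values in a boolean Series."""
    return A.mean()

def conditional(proposition, given):
    """Probability of `proposition` among rows where `given` is True."""
    return prob(proposition[given])

# Hypothetical toy propositions for five made-up respondents.
toy_liberal = pd.Series([True, False, True, False, True])
toy_female = pd.Series([True, True, True, False, False])

# Boolean indexing keeps only the rows where toy_female is True.
subset = toy_liberal[toy_female]
result = conditional(toy_liberal, given=toy_female)

print(subset.tolist())  # [True, False, True]
print(result)           # 2 of the 3 selected rows are True, about 0.667
```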

## Laws of Probability

1. $P(A|B) = \frac{P(A \& B)}{P(B)} \iff P(A \& B) = P(B)P(A|B)$

Bayes's theorem follows because conjunction is commutative:
$$
P(A \& B) = P(B \& A) \\
P(A)P(B|A) = P(B)P(A|B) \\
P(A|B) = \frac{P(A)P(B|A)}{P(B)}
$$

In [32]:
prob(female & banker) / prob(banker)

0.7706043956043956

In [33]:
conditional(female, given=banker)

0.7706043956043956
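
The same identities can be checked on any pair of boolean Series. The sketch below uses two hand-made Series (`A` and `B` are hypothetical) and compares the direct conditional probability with the conjunction form and with Bayes's theorem, using `math.isclose` to allow for floating-point rounding:

```python
import math
import pandas as pd

def prob(A):
    """Fraction of True values in a boolean Series."""
    return A.mean()

def conditional(proposition, given):
    """Probability of `proposition` among rows where `given` is True."""
    return prob(proposition[given])

# Hypothetical toy propositions for eight made-up respondents.
A = pd.Series([True, True, False, True, False, False, True, False])
B = pd.Series([True, False, True, True, False, True, True, False])

direct = conditional(A, given=B)                         # P(A|B)
via_conjunction = prob(A & B) / prob(B)                  # P(A & B) / P(B)
via_bayes = prob(A) * conditional(B, given=A) / prob(B)  # P(A) P(B|A) / P(B)

print(math.isclose(direct, via_conjunction))  # True
print(math.isclose(direct, via_bayes))        # True
```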

## Exercises

### 1-1 Use the tools of the chapter to solve the following:
Compute:
1. The probability that Linda is a female banker
2. The probability that she is a liberal female banker
3. The probability that she is a liberal female banker and a Democrat

In [34]:
# Solution
print(prob(female & banker))
print(prob(liberal & female & banker))
print(prob(liberal & female & banker & democrat))

0.011381618989653074
0.002556299452221546
0.0012375735443294787
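
The three results shrink at each step, which is what conjunction should do: adding a condition can only keep or remove rows, never add them. A small sketch of that property on made-up Series (`a`, `b`, and `c` are hypothetical):

```python
import pandas as pd

# Hypothetical toy propositions for six made-up respondents.
a = pd.Series([True, True, True, False, False, True])
b = pd.Series([True, False, True, True, False, True])
c = pd.Series([False, False, True, True, True, True])

p_a = a.mean()            # 4 of 6 rows
p_ab = (a & b).mean()     # 3 of 6 rows
p_abc = (a & b & c).mean()  # 2 of 6 rows

# Each extra condition leaves the probability the same or smaller.
print(p_a >= p_ab >= p_abc)  # True
```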


### 1-2 Use `conditional` to compute the following:
- What is the probability that a respondent is liberal, given that they are a Democrat?
- What is the probability that a respondent is a Democrat, given that they are liberal?

In [35]:
# Solution
print(conditional(liberal, given=democrat))
print(conditional(democrat, given=liberal))

0.3891320002215698
0.5206403320240125
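
The two answers differ because conditional probability is not commutative, but Bayes's theorem ties them together. As a consistency check, plugging in the values of `prob(liberal)`, `prob(democrat)`, and `conditional(democrat, given=liberal)` computed in this chapter, rounded to four decimal places:

$$
P(\text{liberal} | \text{Democrat}) = \frac{P(\text{liberal}) P(\text{Democrat} | \text{liberal})}{P(\text{Democrat})} \approx \frac{0.2737 \times 0.5206}{0.3663} \approx 0.389
$$

This agrees with the first printed result up to rounding.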


### 1-3 Famous quote:
> If you are not a liberal at 25, you have no heart. If you are not a conservative at age 35, you have no brain.

Let's test this proposition (kind of).
- young: under 30
- old: over 65
- conservative: conservative, slightly conservative, extremely conservative

Use `prob` and `conditional` to calculate the following:
- $P(\text{young} \& \text{liberal})$
- $P(\text{liberal} | \text{young})$
- $P(\text{old} \& \text{conservative})$
- $P(\text{old} | \text{conservative})$

In [36]:
young = gss['age'] < 30
old = gss['age'] > 65
conservative = gss['polviews'] >= 5

print("Young liberal:", prob(young & liberal))
print("Young person is liberal:", conditional(liberal, given=young))
print("Old conservative:", prob(old & conservative))
print("Old given conservative:", conditional(old, given=conservative))

Young liberal: 0.06579427875836884
Young person is liberal: 0.338517745302714
Old conservative: 0.062264150943396226
Old given conservative: 0.1820932716269135
