# Master Probability in Data Science : Part 1: Basic Terms

https://www.youtube.com/watch?v=DUT4WEUngt0&t=10s

## Terminology

### 1. Random Experiment
An experiment is called a random experiment if it satisfies the following two conditions:
- It has more than one possible outcome.
- It is not possible to predict the outcome in advance.


Rolling a die is a classic example of a random experiment because the result cannot be precisely determined beforehand and is subject to chance.

### 2. Trial
Trial refers to a single execution of a random experiment. Each trial produces an
outcome.


If you roll the die once, it's considered one trial. If you roll the die multiple times, each individual roll is a separate trial.

### 3. Outcome
Outcome refers to a single possible result of a trial.

When rolling a die, an outcome could be any of the numbers from 1 to 6 that appears face up after the die is rolled.

### 4. Sample Space
The sample space of a random experiment is the set of all possible outcomes that can occur.
Each random experiment has exactly one sample space.


For rolling a standard six-sided die, the sample space would be {1, 2, 3, 4, 5, 6}, representing all the possible numbers that could come up.



### 5. Event

An event is a specific set of outcomes from a random experiment or process. Essentially, it's a
subset of the sample space. An event can include a single outcome, or it can include
multiple outcomes. One random experiment can have multiple events.


In the context of rolling a die, examples of events could be:

- Getting an even number: {2, 4, 6}
- Getting an odd number: {1, 3, 5}
- Getting a number greater than 3: {4, 5, 6}
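
Because an event is just a subset of the sample space, the three events above can be written as Python sets. This is only a small sketch; the variable names are made up for illustration.

```python
# Sample space for one roll of a six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}

# Each event is a subset of the sample space.
even = {n for n in sample_space if n % 2 == 0}
odd = {n for n in sample_space if n % 2 == 1}
greater_than_3 = {n for n in sample_space if n > 3}

# Sanity check: every event is contained in the sample space.
for event in (even, odd, greater_than_3):
    assert event <= sample_space

print(sorted(even), sorted(odd), sorted(greater_than_3))
```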

![Screenshot%202023-09-01%20025138.png](attachment:Screenshot%202023-09-01%20025138.png)



## Types of Events

__1. Simple Event :__ Also known as an elementary event, a simple event is an event that consists
of exactly one outcome.

For example: 

When rolling a fair six-sided die, getting a 3 is a simple event.


But rolling a die and getting an odd number is not a simple event, because it contains three outcomes: {1, 3, 5}.
<br></br>

__2. Compound Event :__ A compound event consists of two or more simple events.


For example: 


when rolling a die, the event "rolling an odd number" is a compound event
because it consists of three simple events: rolling a 1, rolling a 3, or rolling a 5.
<br></br>

__3. Independent Events :__ Two events are independent if the occurrence of one event does not
affect the probability of the occurrence of the other event.


For example: 

if you flip a coin and roll a die, the outcome of the coin flip does not affect the outcome of the die roll.
<br></br>

__4. Dependent Events :__ Events are dependent if the occurrence of one event does affect the
probability of the occurrence of the other event.

For example: 

if you draw two cards from a deck without replacement, the outcome of the first draw affects the outcome of the second draw because there are fewer cards left in the
deck.
<br></br>

__5. Mutually Exclusive Events :__ Two events are mutually exclusive (or disjoint) if they cannot
both occur at the same time.

For example: 

when rolling a die, the events "roll a 2" and "roll a 4" are mutually exclusive because a single roll of the die cannot result in both a 2 and a 4.
<br></br>

__6. Exhaustive Events :__ A set of events is exhaustive if at least one of the events must occur
when the experiment is performed.

For example:

when rolling a die, the events "roll an even number" and "roll an odd number" are exhaustive because one or the other must occur on any roll.
<br></br>

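__7. Impossible Event and Certain Event :__

Impossible event : an event that can never occur (probability 0), e.g. getting a "7" when rolling a six-sided die.


Certain event : an event that always occurs (probability 1), e.g. getting heads or tails when tossing a coin.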

## What is Probability?

In simplest terms, probability is a measure of the likelihood that a particular event will occur. It
is a fundamental concept in statistics and is used to make predictions and informed decisions
in a wide range of disciplines, including science, engineering, medicine, economics, and social
sciences.


Probability is usually expressed as a number between 0 and 1, inclusive:

- A probability of 0 means that an event will not happen.
- A probability of 1 means that an event will certainly happen.
- A probability of 0.5 means that an event will happen half the time (or that it is as likely to happen as not to happen).

### Empirical Probability Vs Theoretical Probability

#### Empirical Probability:

Empirical probability, also known as experimental probability, is a probability measure that is
based on observed data, rather than theoretical assumptions. It's calculated as the ratio of the
number of times a particular event occurs to the total number of trials.

> __Q-1. Suppose that, in our 100 tosses, we get heads 55 times and tails 45 times. What is the empirical probability of getting a head?__



The empirical probability of getting a head is calculated by dividing the number of times you actually observed heads (in this case, 55 times) by the total number of tosses (100 tosses in this scenario). 

Empirical Probability of Getting a Head = (Number of Heads) / (Total Number of Tosses)


Empirical Probability of Getting a Head = 55 / 100 = 0.55


So, the empirical probability of getting a head based on your 100 tosses is 0.55, which is equivalent to 55%.

> __Q-2. Let's say you have a bag with 50 marbles. Out of these 50 marbles, 20 are red, 15 are blue, and 15 are green. You start to draw marbles one at a time, replacing the marble back into the bag after each draw.__

>__After 200 draws, you find that you've drawn a red marble 80 times, a blue marble 70 times, and a green
marble 50 times. What is the empirical probability of getting a red marble?__


The empirical probability of getting a red marble is calculated by dividing the number of times you observed a red marble (in this case, 80 times) by the total number of draws (200 draws in this scenario).

Empirical Probability of Getting a Red Marble = (Number of Red Marbles Drawn) / (Total Number of Draws)

Empirical Probability of Getting a Red Marble = 80 / 200 =  0.4


So, the empirical probability of getting a red marble based on your 200 draws with replacement is 0.4, which is equivalent to 40%.
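
Both questions use the same ratio, so it can be wrapped in a small helper. The function name `empirical_probability` is an assumption made for this sketch, not something from the video.

```python
def empirical_probability(event_count, total_trials):
    """Relative frequency: times the event was observed / number of trials."""
    if total_trials <= 0:
        raise ValueError("total_trials must be positive")
    return event_count / total_trials

# Q-1: 55 heads in 100 tosses.
p_heads = empirical_probability(55, 100)

# Q-2: observed colours over 200 draws with replacement.
draws = {"red": 80, "blue": 70, "green": 50}
total_draws = sum(draws.values())
p_by_colour = {
    colour: empirical_probability(count, total_draws)
    for colour, count in draws.items()
}

print(p_heads)
print(p_by_colour)
```

Note that the empirical probability depends only on what was observed in the draws, not on how many marbles of each colour are in the bag.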

#### Theoretical Probability

Theoretical (or classical) probability is used when each outcome in a sample space is equally
likely to occur. If we denote an event of interest as Event A, we calculate the theoretical
probability of that event as:


$$\text{Theoretical Probability of Event A} = \frac{\text{Number of Favourable Outcomes (that is, outcomes in
Event A)}} {\text{Total Number of Outcomes in the Sample Space}}$$


a. Consider a scenario of tossing a fair coin 3 times. Find the probability of getting exactly 2 heads.


b. Consider a scenario of rolling 2 dice. What is the probability of getting a sum of 7?
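
Both questions can be answered by listing the sample space and counting the favourable outcomes. A minimal sketch, assuming every outcome is equally likely (fair coin, fair dice); the helper name `theoretical_probability` is hypothetical.

```python
from fractions import Fraction
from itertools import product

def theoretical_probability(sample_space, is_favourable):
    """Favourable outcomes / total outcomes, for equally likely outcomes."""
    outcomes = list(sample_space)
    favourable = [o for o in outcomes if is_favourable(o)]
    return Fraction(len(favourable), len(outcomes))

# a. Three tosses of a fair coin: 2**3 = 8 outcomes such as ('H', 'T', 'H').
coin_tosses = product("HT", repeat=3)
p_two_heads = theoretical_probability(coin_tosses, lambda o: o.count("H") == 2)

# b. Two dice: 6 * 6 = 36 outcomes such as (3, 4).
dice_rolls = product(range(1, 7), repeat=2)
p_sum_7 = theoretical_probability(dice_rolls, lambda o: sum(o) == 7)

print(p_two_heads)  # 3/8
print(p_sum_7)      # 1/6
```

For (a) the favourable outcomes are HHT, HTH and THH; for (b) they are (1,6), (2,5), (3,4), (4,3), (5,2) and (6,1).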

### NOTE : As the number of trials grows, the empirical probability gets very close to the theoretical probability

For a fair coin, the theoretical probability of getting heads is 0.5 (50%) and the theoretical probability of getting tails is also 0.5 (50%).




For example, in the first 10 flips, you might get heads 6 times and tails 4 times, resulting in an empirical probability of 0.6 for heads and 0.4 for tails.

When flipping a fair coin many times (e.g., 1,000 or 10,000 flips), the empirical probability of getting heads or tails will become very close to the theoretical probability of 0.5 for each outcome. This convergence occurs due to the Law of Large Numbers.
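
The convergence can be seen in a small simulation. This is a sketch only: the seed and the list of sample sizes are arbitrary choices, and the exact numbers printed depend on the random number generator.

```python
import random

def empirical_heads(num_flips, rng):
    """Flip a simulated fair coin num_flips times; return the fraction of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(num_flips))
    return heads / num_flips

rng = random.Random(0)  # fixed seed so the run is repeatable
results = {n: empirical_heads(n, rng) for n in (10, 100, 1_000, 10_000, 100_000)}

for n, p in results.items():
    print(f"{n:>7} flips: empirical P(heads) = {p:.4f}")
```

Small runs can be far from 0.5, while the longer runs should land closer and closer to it.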

## Random Variable

##### A random variable is not a variable in the usual sense; it is a function.  

##### Definition of a function: when you pass an input to a function, it applies some logic to that input and returns an output

In the context of probability theory, a random variable is a function that maps the
outcomes of a random process (known as the sample space) to a set of real numbers.


__Input :__ The input to the function is an outcome from the sample space of a random
process.


__Output :__ The output of the function is a real number that we assign to each possible
outcome.



The transformation from input to output in the function of a random variable is
determined by how we choose to define the random variable.
And the choice of how to define a random variable often depends on the specific aspects
of the random process (or event) that you're interested in studying.

### Difference between Sample Space and Random Variable using example:

**Sample Space:**

When rolling two dice, each die has 6 sides numbered 1 through 6. The sample space represents all possible outcomes when considering both dice together. It consists of all possible pairs of outcomes.

Sample Space for Rolling Two Dice:
```
{ (1,1), (1,2), (1,3), (1,4), (1,5), (1,6),
  (2,1), (2,2), (2,3), (2,4), (2,5), (2,6),
  (3,1), (3,2), (3,3), (3,4), (3,5), (3,6),
  (4,1), (4,2), (4,3), (4,4), (4,5), (4,6),
  (5,1), (5,2), (5,3), (5,4), (5,5), (5,6),
  (6,1), (6,2), (6,3), (6,4), (6,5), (6,6) }
```
This is the complete set of all possible pairs of outcomes when rolling two dice.

**Random Variable:**
A random variable, on the other hand, is a function that assigns a numerical value to each outcome in the sample space of a random experiment. It maps the outcomes of a random experiment to real numbers, and it can take on different values based on the outcome of the experiment.


Now, let's define a random variable, X, as the sum of the numbers rolled on the two dice. X takes on values based on the outcomes in the sample space.

Random Variable X:
```
X(1,1) = 1 + 1 = 2
X(1,2) = 1 + 2 = 3
X(1,3) = 1 + 3 = 4
...
X(6,4) = 6 + 4 = 10
X(6,5) = 6 + 5 = 11
X(6,6) = 6 + 6 = 12
```
So, the random variable X assigns a numerical value (the sum of the two dice) to each possible outcome in the sample space. For example, if we roll the dice and get (3,4), then X would be 3 + 4 = 7.

$$\text{Possible values of } X = \{2,3,4,5,6,7,8,9,10,11,12\}$$


In summary, the sample space contains all possible pairs of outcomes when rolling two dice, and the random variable X associates a value (the sum of the numbers) with each pair in the sample space.
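
Since a random variable is a function, it can be written as one. The sketch below defines `X` as a plain Python function on the two-dice sample space and then counts how often each value occurs, assuming fair dice so that all 36 pairs are equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space: all 36 ordered pairs (die 1, die 2).
sample_space = list(product(range(1, 7), repeat=2))

def X(outcome):
    """Random variable: maps an outcome (pair of faces) to the sum of the faces."""
    return outcome[0] + outcome[1]

# Count how many outcomes map to each value, then turn counts into probabilities.
counts = Counter(X(o) for o in sample_space)
pmf = {value: Fraction(n, len(sample_space)) for value, n in sorted(counts.items())}

for value, p in pmf.items():
    print(value, p)
```

The keys of `pmf` are the possible values of X (2 through 12), and the probabilities add up to 1.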

### Types of Random variables

1. **Discrete Random Variable:** A random variable that takes on a countable number of distinct values, often associated with discrete events, such as the number of heads in coin flips.
    
    eg: __Number of Red Marbles in a Bag :__ If you randomly draw marbles from a bag with red and blue marbles, the number of red marbles you get is a discrete random variable. It can only take on specific values like 0, 1, 2, and so on, depending on the outcome of each draw.



2. **Continuous Random Variable:** A random variable that can take any value within a certain range, typically associated with continuous or smoothly varying phenomena, like measuring time or temperature.

    eg : __Temperature Measurement :__ When measuring the temperature outside, the recorded temperature is a continuous random variable. It can take any value within a certain range (e.g., between 20°C and 25°C), including fractions of degrees (e.g., 23.5°C).

### Probability Distribution of a Random Variable (discrete)

A probability distribution is a list of all of the possible outcomes of a random variable along
with their corresponding probability values.

___a. Probability Distribution of tossing a coin:___

![Screenshot%202023-09-02%20015450.png](attachment:Screenshot%202023-09-02%20015450.png)


##### b. Rolling a dice:

![Screenshot%202023-09-02%20015802.png](attachment:Screenshot%202023-09-02%20015802.png)

##### c. Rolling 2 dice:




![main-qimg-01117f27ed34b04c40c2ee4f3cba7fbd-pjlq.jpeg](attachment:main-qimg-01117f27ed34b04c40c2ee4f3cba7fbd-pjlq.jpeg)

![Screenshot%202023-09-02%20020416.png](attachment:Screenshot%202023-09-02%20020416.png)


### Probability Distribution Function

A probability distribution function is a mathematical function that describes the likelihood of different outcomes or values occurring for a random variable in a probability experiment. 
$$y=f(x)$$

It assigns a probability to each possible value (discrete case) or a probability density to each point in a range of values (continuous case), showing how likely each outcome is. 



The function must satisfy two conditions: 


- it must be non-negative for all possible values, and 
- the sum (discrete case) or integral (continuous case) over all possible values must equal 1, indicating that one of the possible outcomes will occur.

#### PDF (probability density function) - for continuous random variables 


#### PMF (probability mass function) - for discrete random variables

### Mean of a Random Variable

The mean of a random variable, often called the __expected value,__ is essentially the average
outcome of a random process that is repeated many times. More technically, it's a weighted
average of the possible outcomes of the random variable, where each outcome is weighted by
its probability of occurrence.

$$\text{Expected value of random variable} = \sum(\text{the possible outcome values} \times \text{probability of getting that outcome})$$

$$E[X]=\sum_{i=1}^nX_i\;P(X_i)$$

where

$P(X_i)$ = probability of getting that outcome

$X_i$ is the outcome value

##### probability of each outcome of rolling dice:

![Screenshot%202023-09-02%20023050.png](attachment:Screenshot%202023-09-02%20023050.png)

Mean of X from repeated trials (sample mean):

- perform multiple trials
- find the average value over those trials


How to get the expected value:
- multiply each possible value by its probability of occurring, then sum the products



$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

Where:
- $\bar{x}$ is the sample mean, which approaches the expected value $E(X)$ as $n$ grows.
- $n$ is the number of trials (in this case, 10,000 rolls).
- $x_i$ is the value obtained in the $i$-th roll.

Since each of the six faces has a probability of 1/6, the expected value of a single roll is:

$E(X) = \frac{1}{6} \left(1 + 2 + 3 + 4 + 5 + 6\right) = \frac{21}{6} = 3.5$

Now, for 10,000 rolls, the expected total of all the rolls is:

$E(X)_{\text{total}} = \text{number of rolls} \times E(X)_{\text{single roll}} = 10{,}000 \times 3.5 = 35{,}000$

Dividing this expected total by the 10,000 rolls gives 3.5 again. So, when rolling a fair six-sided die 10,000 times, the average outcome of all the rolls should be approximately 3.5.

In [1]:
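import random
import numpy as np

# Simulate 10,000 rolls of a fair six-sided die
# (random.randint includes both endpoints, so the faces are 1 to 6).
outcome = []
for i in range(10000):
    outcome.append(random.randint(1,6))
    
# The sample mean should be close to the expected value of 3.5.
np.array(outcome).mean()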

3.4918

##### Consider a simple random variable X representing the number of heads when tossing a fair coin twice. We'll calculate its mean.

**Example: Calculating the Mean of a Discrete Random Variable**

Suppose you're interested in the random variable X, which represents the number of heads when tossing a fair coin twice.

- X can take on values: 0 (no heads), 1 (one head), or 2 (two heads).

To calculate the mean (expected value) of X, you can use the following formula:

$$E(X) = \sum_{i} x_i P(X=x_i)$$

Where:
- $E(X)$ is the mean of X.
- $x_i$ represents each possible value of X.
- $P(X=x_i)$ is the probability that X takes on the value $x_i$.

Now, calculate the mean step by step:

1. Calculate the probability of each possible outcome of X:

   - $P(X=0)$: Probability of getting 0 heads = Probability of getting tails on both tosses = $ (1/2) \times (1/2) = 1/4$
   - $P(X=1)$: Probability of getting 1 head = Probability of getting (head, tail) or (tail, head) = $ (1/2) \times (1/2) + (1/2) \times (1/2) = 1/2$
   - $P(X=2)$: Probability of getting 2 heads = Probability of getting heads on both tosses = $ (1/2) \times (1/2) = 1/4$

2. Calculate the mean using the formula:

   $E(X) = (0 \times 1/4) + (1 \times 1/2) + (2 \times 1/4)$

3. Perform the calculations:

   $E(X) = (0) + (1/2) + (1/2) = 1$

So, the mean (expected value) of the random variable X is 1. This means that, on average, you can expect to get 1 head when tossing a fair coin twice.
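
The same steps can be sketched in code: build the probability of each value of X by enumerating the four equally likely outcomes, then apply the weighted-sum formula. The helper name `expected_value` is an assumption for this example.

```python
from fractions import Fraction
from itertools import product

def expected_value(pmf):
    """E(X) = sum of value * probability over all possible values."""
    return sum(x * p for x, p in pmf.items())

# Two tosses of a fair coin: HH, HT, TH, TT, each with probability 1/4.
outcomes = list(product("HT", repeat=2))
pmf_heads = {}
for outcome in outcomes:
    heads = outcome.count("H")  # value of X for this outcome
    pmf_heads[heads] = pmf_heads.get(heads, Fraction(0)) + Fraction(1, len(outcomes))

mean_heads = expected_value(pmf_heads)

# The same helper gives the mean of a single fair die roll.
pmf_die = {face: Fraction(1, 6) for face in range(1, 7)}
mean_die = expected_value(pmf_die)

print(pmf_heads)
print(mean_heads)  # 1
print(mean_die)    # 7/2
```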

### Variance of a Random Variable

The variance of a random variable is a statistical measurement that describes how far, on average,
its values fall from the mean (expected value).

$$\text{variance}=\text{mean of } (\text{each outcome} - \text{expected value})^2$$



$$\text{Var}(X) = \text{Expected value of}\; (X - E[X])^2\;\;\text{i.e. mean of}\;(X - E[X])^2$$ 

$$\text{Var}(X) = E\big[(X - E[X])^2\big]$$

$$OR$$

$$\text{Var}(X) = E[X^2]- (E[X])^2$$

#### NOTE: These formulas apply to both continuous and discrete random variables

##### variance for rolling a dice:

The variance of a single roll of a fair six-sided die can be calculated from the same probabilities used for the mean. 

First, we know that the mean (expected value) for a single roll of a fair six-sided die is $E(X) = 3.5$.

To find the variance, you need to calculate $E(X^2)$, which is the expected value of the square of the outcome for a single roll:

$E(X^2) = \frac{1}{6}(1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2) = \frac{1}{6}(91) = \frac{91}{6}$

Now, you can calculate the variance using the formula:

$$\text{Var}(X) = E[X^2] - (E[X])^2$$

$\text{Var}(X) = \frac{91}{6} - \left(3.5\right)^2 = \frac{91}{6} - 12.25$

$\text{Var}(X) = \frac{91}{6} - \frac{49}{4} = \frac{182}{12} - \frac{147}{12}$

$\text{Var}(X) = \frac{35}{12} \approx 2.92$

So, the variance of a single roll of a fair six-sided die is $\frac{35}{12} \approx 2.92$. 

Variance is always non-negative, because it is an average of squared deviations. A negative result therefore signals an arithmetic slip; a common one here is writing $3.5^2 = 12.25$ as $\frac{49}{3}$ instead of $\frac{49}{4}$. 
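
As a cross-check, the variance of a single fair die roll can be computed with exact fractions using both formulas, $E[(X - E[X])^2]$ and $E[X^2] - (E[X])^2$. This is a sketch; the helper name `expectation` is made up for the example.

```python
from fractions import Fraction

# Probability of each face of a fair six-sided die.
pmf_die = {face: Fraction(1, 6) for face in range(1, 7)}

def expectation(pmf, g=lambda x: x):
    """E[g(X)] = sum of g(value) * probability."""
    return sum(g(x) * p for x, p in pmf.items())

mean = expectation(pmf_die)                                     # E[X]
var_definition = expectation(pmf_die, lambda x: (x - mean) ** 2)  # E[(X - E[X])^2]
var_shortcut = expectation(pmf_die, lambda x: x ** 2) - mean ** 2  # E[X^2] - (E[X])^2

print(mean)            # 7/2
print(var_definition)  # 35/12
print(var_shortcut)    # 35/12
```

Both formulas give 35/12, which is about 2.92 and, as expected for a variance, not negative.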

---
---

# Part 2: Joint | Marginal | Conditional Probability and Bayes' Theorem

https://www.youtube.com/watch?v=ndHDsvqmbuI&t=47s

## Types of Probability:

- Joint Probability


- Marginal Probability


- Conditional Probability

When working with random samples and probability theory, there are several different types of probabilities and concepts to consider. Here are some of the key types of probabilities with respect to random samples:


1. **Sample Space Probability:** This is the probability associated with individual outcomes in the sample space, representing the likelihood of each possible result occurring.


2. **Marginal Probability:** Marginal probabilities refer to the probabilities of specific events occurring within one variable of a joint probability distribution. For example, the probability of an event happening for one variable in a contingency table.


3. **Joint Probability:** Joint probability represents the probability of two or more events occurring simultaneously. It deals with the likelihood of the intersection of events.


4. **Conditional Probability:** Conditional probability is the probability of one event occurring given that another event has already occurred. It's denoted as P(A | B), meaning the probability of event A happening given that event B has occurred.


5. **Independence:** Events are considered independent if the occurrence (or non-occurrence) of one event does not affect the probability of the other event(s). Independence is a crucial concept when dealing with random samples.


6. **Mutually Exclusive Events:** Mutually exclusive events are events that cannot happen at the same time. If one event occurs, the other(s) cannot. The probability of the union of mutually exclusive events is the sum of their individual probabilities.


7. **Complementary Probability:** Complementary probability refers to the probability of an event not happening (the complement of an event). For example, if P(A) is the probability of event A happening, then P(A') is the probability of event A not happening.


8. **Total Probability:** Total probability is used when you have several disjoint events (events that don't overlap), and it calculates the probability of an event occurring over all possible disjoint scenarios.


9. **Bayes' Theorem:** Bayes' theorem is used to update the probability for an event based on new evidence or information. It's especially useful in conditional probability problems, such as Bayesian inference.


10. **Sampling Distributions:** Sampling distributions describe the distribution of statistics (e.g., sample means or sample proportions) calculated from random samples. These distributions help us make inferences about population parameters.


11. **Central Limit Theorem:** The central limit theorem states that, under certain conditions, the distribution of the sample means of a large enough random sample will be approximately normally distributed, regardless of the population's distribution.

These various types of probabilities and concepts are fundamental in probability theory and statistics, allowing us to analyze and make inferences from random samples and data.

## 1. Joint Probability

![Screenshot%202023-09-03%20015535.png](attachment:Screenshot%202023-09-03%20015535.png)

![Screenshot%202023-09-03%20015632.png](attachment:Screenshot%202023-09-03%20015632.png)

In [1]:
import pandas as pd
import numpy as np

In [3]:
df = pd.read_csv(r"D:\PandasNumpy\datasets\datasets-session-22\titanic.csv")
df.shape

(891, 12)

In [4]:
df.head()

|  | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.25 |  | S |
| 1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Th... | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C |
| 2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.0 | 0 | 0 | STON/O2. 3101282 | 7.925 |  | S |
| 3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.0 | 1 | 0 | 113803 | 53.1 | C123 | S |
| 4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 373450 | 8.05 |  | S |


In [7]:
df['PassengerId'].count()

891

##### Contingency table:

In [5]:
pd.crosstab(df['Survived'], df['Pclass'])

| Survived / Pclass | 1 | 2 | 3 |
|---|---|---|---|
| 0 | 80 | 97 | 372 |
| 1 | 136 | 87 | 119 |


#### So the probability of X being Pclass = 1 and Y = 0 (did not survive):

$$P(X=1, Y=0) = \frac{\text{count in that cell}}{\text{total passengers}}$$


$$\frac{80}{891}=0.089787$$
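
The same division can be applied to every cell at once. The sketch below hard-codes the six counts from the contingency table above instead of reading the CSV, so it runs without the dataset; the dictionary layout with `(survived, pclass)` keys is just one possible choice.

```python
# Counts copied from the Survived x Pclass contingency table.
# Keys are (survived, pclass).
counts = {
    (0, 1): 80, (0, 2): 97, (0, 3): 372,
    (1, 1): 136, (1, 2): 87, (1, 3): 119,
}

total = sum(counts.values())  # total passengers

# Joint probability of each cell = cell count / total passengers.
joint = {cell: n / total for cell, n in counts.items()}

for (survived, pclass), p in joint.items():
    print(f"P(Y={survived}, X={pclass}) = {p:.6f}")
```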

### Joint probabilities: the probability of each (X, Y) combination occurring together, one per table cell

In [9]:
# (pd.crosstab(df['Survived'], df['Pclass']))/891
pd.crosstab(df['Survived'], df['Pclass'],normalize='all')

| Survived / Pclass | 1 | 2 | 3 |
|---|---|---|---|
| 0 | 0.089787 | 0.108866 | 0.417508 |
| 1 | 0.152637 | 0.097643 | 0.133558 |


### Joint Probability Distribution: The entire table

Probabilities of all combinations

In [10]:
pd.crosstab(df['Survived'], df['Pclass'],normalize='all')

| Survived / Pclass | 1 | 2 | 3 |
|---|---|---|---|
| 0 | 0.089787 | 0.108866 | 0.417508 |
| 1 | 0.152637 | 0.097643 | 0.133558 |


## 2. Marginal Probability/ Unconditional probability / Simple

![image.png](attachment:image.png)

In [11]:
pd.crosstab(df['Survived'], df['Pclass'],margins=True)

| Survived / Pclass | 1 | 2 | 3 | All |
|---|---|---|---|---|
| 0 | 80 | 97 | 372 | 549 |
| 1 | 136 | 87 | 119 | 342 |
| All | 216 | 184 | 491 | 891 |


### Marginal Probability:

In [15]:
pd.crosstab(df['Survived'], df['Pclass'],margins=True,normalize=True)

| Survived / Pclass | 1 | 2 | 3 | All |
|---|---|---|---|---|
| 0 | 0.089787 | 0.108866 | 0.417508 | 0.616162 |
| 1 | 0.152637 | 0.097643 | 0.133558 | 0.383838 |
| All | 0.242424 | 0.20651 | 0.551066 | 1.0 |


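### Marginal Probability Distribution: the `All` row and the `All` column
The `All` row gives the distribution of Pclass on its own, and the `All` column gives the distribution of Survived on its own.

### NOTE : The probabilities in each distribution sum to 1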
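
A marginal probability is obtained by summing the joint counts over the other variable. The sketch below reuses the counts from the contingency table (hard-coded, so no CSV is needed); the helper name `marginal` is hypothetical.

```python
# Counts copied from the Survived x Pclass contingency table.
# Keys are (survived, pclass).
counts = {
    (0, 1): 80, (0, 2): 97, (0, 3): 372,
    (1, 1): 136, (1, 2): 87, (1, 3): 119,
}
total = sum(counts.values())

def marginal(axis):
    """Sum counts over the other variable (axis 0 = Survived, axis 1 = Pclass)."""
    sums = {}
    for key, n in counts.items():
        sums[key[axis]] = sums.get(key[axis], 0) + n
    return {value: n / total for value, n in sums.items()}

p_survived = marginal(0)  # matches the "All" column
p_pclass = marginal(1)    # matches the "All" row

print(p_survived)
print(p_pclass)
```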

## 3. Conditional Probability

![image.png](attachment:image.png)

$$P(A|B)\longrightarrow\; \text{Probability of A given B}$$


$$P(B|A) \longrightarrow\;\text{Probability of B given A}$$

### NOTE : reduce the sample space according to B and find the outcomes which satisfies A in it.

#### eg : 

![image.png](attachment:image.png)

- Sample space : 8 outcomes $\rightarrow${HHH, HHT, HTT, TTT, TTH, THH, THT, HTH}


- Event A - At least 2 Heads


- Event B - At least 1 Head


- __Final Sample Space = 7  outcomes $\rightarrow$ {HHH, HHT, HTT, TTH, THH, THT, HTH}__


- At least 2 heads within the final sample space: {HHH, HHT, THH, HTH}, so P(A|B)= $\frac{4}{7}$
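
The "reduce the sample space" idea translates directly into set operations: keep only the outcomes in B, then count how many of them are also in A. A minimal sketch for the three-coin example, assuming a fair coin:

```python
from fractions import Fraction
from itertools import product

# All 8 outcomes of three tosses, written as strings such as "HHT".
sample_space = ["".join(t) for t in product("HT", repeat=3)]

A = {o for o in sample_space if o.count("H") >= 2}  # at least 2 heads
B = {o for o in sample_space if o.count("H") >= 1}  # at least 1 head

# B is the reduced sample space; A & B are the outcomes of A inside it.
p_a_given_b = Fraction(len(A & B), len(B))

print(sorted(B))
print(p_a_given_b)  # 4/7
```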

#### eg 2 : 

![Screenshot%202023-09-03%20030545.png](attachment:Screenshot%202023-09-03%20030545.png)

![Screenshot%202023-09-03%20150935.png](attachment:Screenshot%202023-09-03%20150935.png)

- Sample Space = 36

            {(1,1),(1,2),(1,3),(1,4),(1,5),(1,6),
             (2,1),(2,2),(2,3),(2,4),(2,5),(2,6),
             (3,1),(3,2),(3,3),(3,4),(3,5),(3,6),
             (4,1),(4,2),(4,3),(4,4),(4,5),(4,6),
             (5,1),(5,2),(5,3),(5,4),(5,5),(5,6),
             (6,1),(6,2),(6,3),(6,4),(6,5),(6,6)}



- Event A -  sum =7


- Event B - die 1 is odd number = {1,3,5}


- Final sample space = 18

                {(1,1),(1,2),(1,3),(1,4),(1,5),(1,6),
                (3,1),(3,2),(3,3),(3,4),(3,5),(3,6),
                (5,1),(5,2),(5,3),(5,4),(5,5),(5,6)}



- $A \cap B$ (outcomes in the final sample space with sum = 7) = {(1,6),(3,4),(5,2)}



- P(A|B) = $\frac{3}{18}=\frac{1}{6}$

#### eg : 3

![image.png](attachment:image.png)

![Screenshot%202023-09-03%20150935.png](attachment:Screenshot%202023-09-03%20150935.png)

- sample space = 36

            {(1,1),(1,2),(1,3),(1,4),(1,5),(1,6),
             (2,1),(2,2),(2,3),(2,4),(2,5),(2,6),
             (3,1),(3,2),(3,3),(3,4),(3,5),(3,6),
             (4,1),(4,2),(4,3),(4,4),(4,5),(4,6),
             (5,1),(5,2),(5,3),(5,4),(5,5),(5,6),
             (6,1),(6,2),(6,3),(6,4),(6,5),(6,6)}

 
- Event A $\rightarrow$  die 1 = 2


- Event B $\rightarrow$ die 1 + die 2 $\leq$ 5


- Final sample space = 10

                {(1,1),(1,2),(1,3),(1,4),
                (2,1),(2,2),(2,3),
                (3,1),(3,2),
                (4,1)}
                
                
- $A \cap B$ (outcomes in the final sample space with die 1 = 2) = {(2,1),(2,2),(2,3)}


- P(A|B) = $\frac{3}{10}$
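
Both two-dice examples can be checked with one small helper that filters the 36 outcomes. This is a sketch assuming fair dice; the function name `conditional` is made up for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered pairs (die 1, die 2).
sample_space = set(product(range(1, 7), repeat=2))

def conditional(event_a, event_b):
    """P(A|B) = |A and B| / |B| when all outcomes are equally likely."""
    a = {o for o in sample_space if event_a(o)}
    b = {o for o in sample_space if event_b(o)}
    return Fraction(len(a & b), len(b))

# eg 2: A = sum is 7, B = die 1 is odd.
p_eg2 = conditional(lambda o: sum(o) == 7, lambda o: o[0] % 2 == 1)

# eg 3: A = die 1 shows 2, B = sum is at most 5.
p_eg3 = conditional(lambda o: o[0] == 2, lambda o: sum(o) <= 5)

print(p_eg2)  # 1/6
print(p_eg3)  # 3/10
```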

### Formula for conditional Probability :

#### conditional probability of A given B :

$$P(A|B) = \frac{P(A\cap B)}{P(B)}$$

where:

- $P(A\cap B) \longrightarrow\;$ __Joint Probability of A and B__


- P(B) $\longrightarrow\;$ __Marginal Probability of B__

### eg : 

![Screenshot%202023-09-03%20160546.png](attachment:Screenshot%202023-09-03%20160546.png)

> __1. P(Y= 0 | X = 3) : <br></br> 
the conditional probability that a passenger did not survive (Y=0) given that their passenger class (X) is equal to 3. In other words, it's the probability of a passenger not surviving if they were in the third class.__

$$P(Y=0|X=3)=\frac{P(Y=0 \cap X=3)}{P(X=3)}=\frac{0.4175}{0.4175+0.1336}= \frac{0.4175}{0.5511}\approx 0.76$$

> __2. P(Y= 0 | X = 2)  <br></br> 
the conditional probability that a passenger did not survive (Y=0) given that their passenger class (X) is equal to 2. In other words, it's the probability of a passenger not surviving if they were in the second class.__


$$P(Y=0|X=2)=\frac{P(Y=0 \cap X=2)}{P(X=2)}=\frac{0.1089}{0.1089+0.0976}= \frac{0.1089}{0.2065}\approx 0.53$$



> __3. P(Y= 0 | X = 1)  <br></br> 
the conditional probability that a passenger did not survive (Y=0) given that their passenger class (X) is equal to 1. In other words, it's the probability of a passenger not surviving if they were in the first class.__


$$P(Y=0|X=1)=\frac{P(Y=0 \cap X=1)}{P(X=1)}=\frac{0.089787}{0.089787+0.152637}= \frac{0.089787}{0.242424}\approx 0.37$$
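
The three calculations follow the same pattern, joint probability divided by marginal probability, so they can be sketched as one function. The counts are hard-coded from the contingency table earlier in these notes; the function name is an assumption for this example.

```python
# Counts copied from the Survived x Pclass contingency table.
# Keys are (survived, pclass).
counts = {
    (0, 1): 80, (0, 2): 97, (0, 3): 372,
    (1, 1): 136, (1, 2): 87, (1, 3): 119,
}
total = sum(counts.values())

def p_not_survived_given(pclass):
    """P(Y=0 | X=pclass) = P(Y=0, X=pclass) / P(X=pclass)."""
    joint = counts[(0, pclass)] / total                          # P(Y=0, X=pclass)
    marginal = sum(counts[(y, pclass)] for y in (0, 1)) / total  # P(X=pclass)
    return joint / marginal

conditional = {pclass: p_not_survived_given(pclass) for pclass in (1, 2, 3)}

for pclass, p in conditional.items():
    print(f"P(Y=0 | X={pclass}) = {p:.6f}")
```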



### Joint + Marginal Probabilities:

In [32]:
pd.crosstab(df['Survived'],df['Pclass'],normalize=True,margins=True)

| Survived / Pclass | 1 | 2 | 3 | All |
|---|---|---|---|---|
| 0 | 0.089787 | 0.108866 | 0.417508 | 0.616162 |
| 1 | 0.152637 | 0.097643 | 0.133558 | 0.383838 |
| All | 0.242424 | 0.20651 | 0.551066 | 1.0 |


### Conditional Probabilities:

In [24]:
pd.crosstab(df['Survived'],df['Pclass'],normalize='columns',margins=True)

| Survived / Pclass | 1 | 2 | 3 | All |
|---|---|---|---|---|
| 0 | 0.37037 | 0.527174 | 0.757637 | 0.616162 |
| 1 | 0.62963 | 0.472826 | 0.242363 | 0.383838 |


#### eg : probability that a passenger was in class 3 given that they did not survive : P(X=3 | Y=0)

In [36]:
pd.crosstab(df['Survived'],df['Pclass'],normalize='index',margins=True)

| Survived / Pclass | 1 | 2 | 3 |
|---|---|---|---|
| 0 | 0.145719 | 0.176685 | 0.677596 |
| 1 | 0.397661 | 0.254386 | 0.347953 |
| All | 0.242424 | 0.20651 | 0.551066 |


### Intuition behind Conditional Probability Formula:


$$P(A|B) = \frac{P(A\cap B)}{P(B)}$$

![Screenshot%202023-09-03%20173329.png](attachment:Screenshot%202023-09-03%20173329.png)

### DEPENDENT & INDEPENDENT EVENTS
* <b>Independent Events : </b> The probability of one event is not affected by whether the other event occurs. eg: flipping a coin and rolling a die

>If two events are **Independent** then the probability of their intersection i.e. Probability(A and B) is the product of their individual probability.  $$\longrightarrow P(A\cap B) = P(A)* P(B)$$ 

![Screenshot%202023-09-03%20174806.png](attachment:Screenshot%202023-09-03%20174806.png)
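
The product rule for independent events can be checked by enumerating a combined experiment: one flip of a fair coin together with one roll of a fair die. A sketch with hypothetical helper names:

```python
from fractions import Fraction
from itertools import product

# Combined sample space: 2 coin faces x 6 die faces = 12 equally likely outcomes.
sample_space = set(product("HT", range(1, 7)))

def prob(event):
    """Probability of an event given as a true/false test on an outcome."""
    favourable = sum(1 for o in sample_space if event(o))
    return Fraction(favourable, len(sample_space))

def is_heads(outcome):
    return outcome[0] == "H"

def is_six(outcome):
    return outcome[1] == 6

p_heads = prob(is_heads)
p_six = prob(is_six)
p_both = prob(lambda o: is_heads(o) and is_six(o))

# Independence: P(A and B) equals P(A) * P(B).
independent = p_both == p_heads * p_six

print(p_heads, p_six, p_both, independent)
```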

* <b>Dependent Events : </b> The probability of one event changes depending on whether the other event has occurred. 

eg: 

- Taking two marbles, one after the other, out of a bag of 5 different marbles without putting the first one back.


- Drawing cards without replacement.
<br></br>

### MUTUALLY EXCLUSIVE SETS 

Mutually exclusive sets are sets that have no elements in common. Graphically, their circles never intersect, i.e. **they cannot both occur at the same time.**

___eg: tossing a coin - heads and tails cannot come up at the same time___
$$P(A\cap B)=0$$

Mutually exclusive sets have the empty set as their intersection. Conversely, if the intersection of two sets is the empty set, they are mutually exclusive; a collection of more than two sets is mutually exclusive when this holds for every pair.

About their union: if some events are mutually exclusive, the probability of their union is simply the sum of their individual probabilities.
$$P(A \cup B) = P(A)+P(B)$$
<br></br>

**Example for mutually exclusive, but not complements:**

A : Winning a Game.
<br>B : Drawing a Game.</br>

Because you can not simultaneously win and draw the same game. However, you can also lose this game, so the two are not complements.


## Bayes' Theorem


Bayes' theorem states that the conditional probability of A given B equals the conditional probability of B given A, multiplied by the probability of A and divided by the probability of B.

$$P(A|B)= \frac{P(B|A)*P(A)}{P(B)}$$

It is crucial because it allows us to find a relationship between the different conditional probabilities of two events.
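
As a numeric check, Bayes' theorem can be applied to the Titanic counts used earlier, taking A as "passenger was in class 3" and B as "passenger did not survive". The counts below are hard-coded from the contingency table in these notes, and exact fractions are used so the two sides can be compared without rounding.

```python
from fractions import Fraction

# Counts per passenger class, copied from the contingency table.
not_survived = {1: 80, 2: 97, 3: 372}
survived = {1: 136, 2: 87, 3: 119}
total = sum(not_survived.values()) + sum(survived.values())

class_3 = not_survived[3] + survived[3]

p_b_given_a = Fraction(not_survived[3], class_3)     # P(Y=0 | X=3)
p_a = Fraction(class_3, total)                       # P(X=3)
p_b = Fraction(sum(not_survived.values()), total)    # P(Y=0)

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b

# Direct calculation from the counts, for comparison.
direct = Fraction(not_survived[3], sum(not_survived.values()))

print(float(p_a_given_b), float(direct))
```

Both routes give the same value, which matches the 0.677596 entry of the table normalized by index.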



![image.png](attachment:image.png)

### Mathematical Proof : 

![Screenshot%202023-09-03%20180249.png](attachment:Screenshot%202023-09-03%20180249.png)

#### question:

![Screenshot%202023-09-03%20180640.png](attachment:Screenshot%202023-09-03%20180640.png)

#### If a new person is male, what is the chance of survival?

![Screenshot%202023-09-03%20180931.png](attachment:Screenshot%202023-09-03%20180931.png)

![Screenshot%202023-09-03%20181031.png](attachment:Screenshot%202023-09-03%20181031.png)

---
---