# Probability

## Demo 1: Types of Probabilities

This demo shows how to compute three types of probability (marginal, joint, and conditional) from a dataset of Marvel characters.

#### Step 1: Import libraries

In [3]:
import numpy as np
import pandas as pd

#### Step 2: Read the marveldata.csv file and print the first five rows of the dataset

In [4]:
df = pd.read_csv('marveldata.csv')
df.head()

Unnamed: 0,page_id,name,urlslug,ID,ALIGN,EYE,HAIR,SEX,GSM,ALIVE,APPEARANCES,FIRST APPEARANCE,Year
0,1678,Spider-Man (Peter Parker),\/Spider-Man_(Peter_Parker),Secret Identity,Good Characters,Hazel Eyes,Brown Hair,Male Characters,,Living Characters,4043.0,Aug-62,1962.0
1,7139,Captain America (Steven Rogers),\/Captain_America_(Steven_Rogers),Public Identity,Good Characters,Blue Eyes,White Hair,Male Characters,,Living Characters,3360.0,Mar-41,1941.0
2,64786,"Wolverine (James \""Logan\"" Howlett)",\/Wolverine_(James_%22Logan%22_Howlett),Public Identity,Neutral Characters,Blue Eyes,Black Hair,Male Characters,,Living Characters,3061.0,Oct-74,1974.0
3,1868,"Iron Man (Anthony \""Tony\"" Stark)",\/Iron_Man_(Anthony_%22Tony%22_Stark),Public Identity,Good Characters,Blue Eyes,Black Hair,Male Characters,,Living Characters,2961.0,Mar-63,1963.0
4,2460,Thor (Thor Odinson),\/Thor_(Thor_Odinson),No Dual Identity,Good Characters,Blue Eyes,Blond Hair,Male Characters,,Living Characters,2258.0,Nov-50,1950.0


#### Step 3: Calculate the total number of characters

In [5]:
x = len(df['name'])
x

16376

#### Step 4: Display the SEX, EYE, and HAIR columns

In [6]:
marvel = df[['SEX','EYE','HAIR']]
marvel

Unnamed: 0,SEX,EYE,HAIR
0,Male Characters,Hazel Eyes,Brown Hair
1,Male Characters,Blue Eyes,White Hair
2,Male Characters,Blue Eyes,Black Hair
3,Male Characters,Blue Eyes,Black Hair
4,Male Characters,Blue Eyes,Blond Hair
...,...,...,...
16371,Male Characters,Green Eyes,No Hair
16372,Male Characters,Blue Eyes,Bald
16373,Male Characters,Black Eyes,Bald
16374,Male Characters,,


#### Step 5: Calculate the number of characters of each sex

In [7]:
char = df.groupby(['SEX']).count()['name']
print(char)

SEX
Agender Characters           45
Female Characters          3837
Genderfluid Characters        2
Male Characters           11638
Name: name, dtype: int64


# Marginal Probability

### Calculate the probability of a character being male

#### Step 6: Divide the count of male characters by the total number of characters

In [8]:
#Probability of character being male
char['Male Characters']/x

0.7106741573033708
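
As a complementary sketch, pandas can return every marginal probability of a column in one call with `value_counts(normalize=True)`. The small `toy` DataFrame below is made-up illustration data, not the Marvel dataset. It also shows how the choice of denominator matters when a column contains missing values: Step 6 divides by the full row count `x`, which corresponds to `dropna=False` here.

```python
import pandas as pd

# Hypothetical toy data standing in for the SEX column (values are made up).
toy = pd.DataFrame({
    "SEX": ["Male Characters", "Male Characters", "Female Characters",
            "Male Characters", None],
})

# normalize=True divides each count by the number of non-missing rows.
marginal_known = toy["SEX"].value_counts(normalize=True)

# dropna=False keeps missing values as their own category, so the
# denominator becomes the full row count (the convention used in Step 6).
marginal_all = toy["SEX"].value_counts(normalize=True, dropna=False)

p_male_known = marginal_known["Male Characters"]  # 3 of the 4 known values
p_male_all = marginal_all["Male Characters"]      # 3 of all 5 rows
print(p_male_known, p_male_all)
```

Which denominator is appropriate depends on whether rows with an unknown value should count as part of the population.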

# Joint Probability

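### Calculate the probability of a character being female and having red hair

Assume that the two events (being female, having red hair) are independent of each other. Under that assumption, the joint probability is the product of the two marginal probabilities: P(female and red hair) = P(female) × P(red hair).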

#### Step 7: Calculate the probability of a character being female

In [9]:
female=char['Female Characters']/x
female

0.23430630190522717

#### Step 8: Count the characters by hair color

In [10]:
hair=df.groupby(['HAIR']).count()['name']
print(hair)

HAIR
Auburn Hair                78
Bald                      838
Black Hair               3755
Blond Hair               1582
Blue Hair                  56
Bronze Hair                 1
Brown Hair               2339
Dyed Hair                   1
Gold Hair                   8
Green Hair                117
Grey Hair                 531
Light Brown Hair            6
Magenta Hair                5
No Hair                  1176
Orange Hair                43
Orange-brown Hair           3
Pink Hair                  31
Purple Hair                47
Red Hair                  620
Reddish Blond Hair          6
Silver Hair                16
Strawberry Blond Hair      47
Variable Hair              32
White Hair                754
Yellow Hair                20
Name: name, dtype: int64


#### Step 9: Calculate the probability of a character having red hair

In [11]:
red_hair=hair['Red Hair']/x
print(red_hair)

0.03786028334147533


#### Step 10: Calculate the probability of a character being female and having red hair, i.e. the joint probability

In [12]:
#Probability of a character being female and having red hair (assumes independence)
print(female*red_hair)

0.008870902978825162


#### Step 11: Express the joint probability as a percentage

In [13]:
print(female*red_hair*100)

0.8870902978825161
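
The product in Steps 7 to 11 is only exact if the two events really are independent. The sketch below, on a made-up `toy` DataFrame (not the Marvel data), shows one way to check this: compute the joint probability directly from the rows where both conditions hold, and compare it with the product of the marginals. `pd.crosstab` with `normalize="all"` gives the full table of empirical joint probabilities.

```python
import pandas as pd

# Hypothetical toy data; the values are chosen so the events are NOT independent.
toy = pd.DataFrame({
    "SEX":  ["Female Characters", "Female Characters", "Male Characters",
             "Male Characters", "Male Characters", "Female Characters",
             "Male Characters", "Male Characters"],
    "HAIR": ["Red Hair", "Red Hair", "Black Hair",
             "Black Hair", "Red Hair", "Black Hair",
             "Black Hair", "Black Hair"],
})
n = len(toy)

# Marginal probabilities and their product (the independence assumption).
p_female = (toy["SEX"] == "Female Characters").sum() / n
p_red = (toy["HAIR"] == "Red Hair").sum() / n
product_estimate = p_female * p_red

# Empirical joint probability: rows where both conditions hold.
both = (toy["SEX"] == "Female Characters") & (toy["HAIR"] == "Red Hair")
empirical_joint = both.sum() / n

# The same number read from a table of joint probabilities.
joint_table = pd.crosstab(toy["SEX"], toy["HAIR"], normalize="all")
from_table = joint_table.loc["Female Characters", "Red Hair"]

print(product_estimate, empirical_joint, from_table)
```

When the two numbers differ noticeably, as they do in this toy data, the independence assumption does not hold and the empirical joint probability is the one to report.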


# Conditional Probability

### Given that a character is female, calculate the probability that the character has green eyes

#### Step 12: Count the characters with green eyes

In [14]:
#Count of green eyes (filter on the EYE column only)
green_eyes = df[df.EYE=='Green Eyes'].EYE.count()
green_eyes

613

#### Step 13: Count the female characters

In [15]:
female_char = df[df.SEX=='Female Characters'].SEX.count()
female_char

3837

#### Step 14: Count the female characters with green eyes

In [16]:
female_green_count = df[(df.SEX=='Female Characters') & (df.EYE=='Green Eyes')].SEX.count()
female_green_count

268

#### Step 15: Calculate the joint probability of a character being female and having green eyes

In [17]:
prob_female_green = female_green_count/x
prob_female_green

0.016365412799218368

#### Step 16: Calculate the probability of a character being female

In [18]:
prob_female = female_char/x
prob_female

0.23430630190522717

#### Step 17: Calculate the conditional probability of green eyes given that the character is female

In [19]:
prob_female_green/prob_female

0.06984623403700807
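
Steps 15 to 17 apply the definition P(green eyes | female) = P(female and green eyes) / P(female). Because both probabilities share the same denominator, the total cancels and the result equals the ratio of the two counts. The sketch below checks this using the counts printed in the cells above; the variable names are illustrative.

```python
# Counts taken from the outputs of Steps 3, 13, and 14 above.
total_characters = 16376
female_count = 3837
female_green_count = 268

# Definition: P(A | B) = P(A and B) / P(B)
p_female = female_count / total_characters
p_female_and_green = female_green_count / total_characters
p_green_given_female = p_female_and_green / p_female

# Shortcut: the total cancels, leaving a ratio of counts.
direct_ratio = female_green_count / female_count

print(p_green_given_female, direct_ratio)
```

Both expressions give about 0.0698, so roughly 7% of female characters in the dataset are recorded as having green eyes.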

#### Conclusion: This code demonstrates how to compute marginal, joint, and conditional probabilities

# Odds

### Calculate the odds in favour of living characters

#### Step 1: Calculate the count of living characters

In [20]:
# This statement returns a Series indexed by ALIVE status; count() skips rows where ID is missing
living_characters = df.groupby(['ALIVE']).count()['ID']
#Getting count of living characters by label rather than by position
l = living_characters['Living Characters']
print(l)

9490


#### Step 2: Calculate the probability of living characters

In [21]:
prob_liv = l/x
prob_liv

0.5795065950170982

#### Step 3: Calculate the probability of a character not being living (the complement)

In [22]:
y = 1-prob_liv
y

0.4204934049829018

#### Step 4: Calculate odds in favour of living characters

In [23]:
odds = prob_liv/y
odds

1.3781585826314262
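
The odds calculation above can be wrapped in small helper functions. This is a minimal sketch: `odds_in_favour` and `probability_from_odds` are hypothetical names introduced here, and the counts are the ones printed in the cells above. It also shows that the odds equal the ratio of favourable to unfavourable counts.

```python
def odds_in_favour(p: float) -> float:
    """Convert a probability p into odds in favour, p / (1 - p)."""
    if not 0 <= p < 1:
        raise ValueError("p must be in the range [0, 1)")
    return p / (1 - p)


def probability_from_odds(odds: float) -> float:
    """Convert odds in favour back into a probability."""
    if odds < 0:
        raise ValueError("odds must be non-negative")
    return odds / (1 + odds)


# Counts taken from the outputs above.
living = 9490
total = 16376

p_living = living / total
odds = odds_in_favour(p_living)

# Equivalent form: favourable count divided by unfavourable count.
odds_from_counts = living / (total - living)

print(odds, odds_from_counts, probability_from_odds(odds))
```

Odds of about 1.38 mean that, under these counts, a character is roughly 1.38 times as likely to be living as not.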