# Basic Probability

Probability is the branch of mathematics that measures how likely an event is to occur. A probability value ranges from 0 to 1.
- 0 (zero) indicates that an event cannot occur.
- 1 (one) indicates that an event is certain to occur.
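
As a minimal sketch of this definition, a probability can be estimated as the number of outcomes in which the event occurs divided by the total number of outcomes. The helper name `empirical_probability` and the die example are illustrative assumptions, not part of the notebook's data.

```python
def empirical_probability(outcomes, event):
    """Return the fraction of outcomes for which event(outcome) is True."""
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("need at least one outcome")
    favourable = sum(1 for o in outcomes if event(o))
    return favourable / len(outcomes)


# Hypothetical example: the six faces of a fair die
faces = [1, 2, 3, 4, 5, 6]
p_even = empirical_probability(faces, lambda x: x % 2 == 0)   # 3 of 6 faces
p_seven = empirical_probability(faces, lambda x: x == 7)      # impossible event
p_any = empirical_probability(faces, lambda x: 1 <= x <= 6)   # certain event
print(p_even, p_seven, p_any)
```

The impossible event lands at 0 and the certain event at 1, matching the two bullet points above.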

In [1]:
import pandas as pd
import numpy as np
import scipy.stats as stats


df = pd.read_csv("matches.csv")
df.head(5)

Unnamed: 0,id,season,city,date,team1,team2,toss_winner,toss_decision,result,dl_applied,winner,win_by_runs,win_by_wickets,player_of_match,venue,umpire1,umpire2,umpire3
0,1,2017,Hyderabad,5/4/2017,Sunrisers Hyderabad,Royal Challengers Bangalore,Royal Challengers Bangalore,field,normal,0,Sunrisers Hyderabad,35,0,Yuvraj Singh,"Rajiv Gandhi International Stadium, Uppal",AY Dandekar,NJ Llong,
1,2,2017,Pune,6/4/2017,Mumbai Indians,Rising Pune Supergiant,Rising Pune Supergiant,field,normal,0,Rising Pune Supergiant,0,7,SPD Smith,Maharashtra Cricket Association Stadium,A Nand Kishore,S Ravi,
2,3,2017,Rajkot,7/4/2017,Gujarat Lions,Kolkata Knight Riders,Kolkata Knight Riders,field,normal,0,Kolkata Knight Riders,0,10,CA Lynn,Saurashtra Cricket Association Stadium,Nitin Menon,CK Nandan,
3,4,2017,Indore,8/4/2017,Rising Pune Supergiant,Kings XI Punjab,Kings XI Punjab,field,normal,0,Kings XI Punjab,0,6,GJ Maxwell,Holkar Cricket Stadium,AK Chaudhary,C Shamshuddin,
4,5,2017,Bangalore,8/4/2017,Royal Challengers Bangalore,Delhi Daredevils,Royal Challengers Bangalore,bat,normal,0,Royal Challengers Bangalore,15,0,KM Jadhav,M Chinnaswamy Stadium,,,


In [2]:
# checking the columns present in the data
df.columns

Index(['id', 'season', 'city', 'date', 'team1', 'team2', 'toss_winner',
       'toss_decision', 'result', 'dl_applied', 'winner', 'win_by_runs',
       'win_by_wickets', 'player_of_match', 'venue', 'umpire1', 'umpire2',
       'umpire3'],
      dtype='object')

### Calculating the probability of a team winning a match

In [3]:
# Total number of matches
total_matches = len(df)
print(f"Total Matches: {total_matches}")

# Number of matches won by Mumbai Indians
team_wins = len(df[df['winner'] == 'Mumbai Indians'])
print(f"Total Wins: {team_wins}")

# Empirical probability = wins / total matches
probability = team_wins / total_matches
print('Probability of Mumbai Indians winning a match:{0: .2f}%'.format(probability*100))

Total Matches: 636
Total Wins: 92
Probability of Mumbai Indians winning a match: 14.47%
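
The same ratio can be computed for every team at once. The sketch below uses a small hypothetical DataFrame in place of `matches.csv`, and the name `toy` is an assumption for illustration. It divides by `len(toy)` as the cell above does, so a row with a missing winner still counts toward the denominator.

```python
import pandas as pd

# Hypothetical toy data standing in for matches.csv (one row has no winner)
toy = pd.DataFrame({
    "winner": [
        "Mumbai Indians",
        "Chennai Super Kings",
        "Mumbai Indians",
        None,
        "Kolkata Knight Riders",
    ]
})

# Wins per team divided by the total number of matches
win_prob = toy["winner"].value_counts() / len(toy)
print(win_prob)
```

Because `value_counts()` skips missing values by default, the resulting probabilities sum to less than 1 whenever some matches have no recorded winner.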


### Calculating the probability distribution of toss results

In [4]:
# Number of tosses won by each team
toss_counts = df['toss_winner'].value_counts()

# Total number of matches
total_matches = len(df)

# Share of all tosses won by each team, as a percentage
toss_probability = (toss_counts / total_matches)*100
toss_probability = toss_probability.round(2)
print("Probability distribution of toss results:")
print(toss_probability)

Probability distribution of toss results:
toss_winner
Mumbai Indians                 13.36
Kolkata Knight Riders          12.26
Delhi Daredevils               11.32
Royal Challengers Bangalore    11.01
Kings XI Punjab                10.69
Chennai Super Kings            10.38
Rajasthan Royals                9.91
Deccan Chargers                 6.76
Sunrisers Hyderabad             5.50
Pune Warriors                   3.14
Gujarat Lions                   2.36
Kochi Tuskers Kerala            1.26
Rising Pune Supergiants         1.10
Rising Pune Supergiant          0.94
Name: count, dtype: float64
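
The output lists both "Rising Pune Supergiants" and "Rising Pune Supergiant". If these two labels are meant to denote the same team (an assumption about the data that the notebook does not state), they could be merged before counting. The sketch below uses hypothetical toy rows and `value_counts(normalize=True)`, which returns proportions directly.

```python
import pandas as pd

# Hypothetical toy data; only the column name matches the notebook
toy = pd.DataFrame({
    "toss_winner": [
        "Rising Pune Supergiants",
        "Rising Pune Supergiant",
        "Mumbai Indians",
        "Mumbai Indians",
    ]
})

# Assumption: both spellings refer to one team, so map one onto the other
cleaned = toy["toss_winner"].replace(
    {"Rising Pune Supergiants": "Rising Pune Supergiant"}
)

# normalize=True returns proportions that sum to 1 over non-missing rows
toss_prob = cleaned.value_counts(normalize=True)
print(toss_prob)
```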


### Probability of a specific outcome in the toss (e.g., winning the toss and choosing to bat)

In [5]:
# Total number of tosses (one per match)
total_tosses = len(df)
# Matches in which Chennai Super Kings won the toss and chose to bat
batting_choice = len(df[(df['toss_winner'] == 'Chennai Super Kings') & (df['toss_decision'] == 'bat')])

# Joint probability over all matches, not conditional on winning the toss
probability = batting_choice / total_tosses
print('Probability of Chennai Super Kings winning the toss and choosing to bat:{0: .2f}%'.format(probability*100))

Probability of Chennai Super Kings winning the toss and choosing to bat: 6.92%
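
The figure above is a joint probability over all matches. A related question is how often Chennai Super Kings choose to bat given that they have won the toss, which is the joint probability divided by the probability of winning the toss. Dividing the two percentages already reported (6.92 / 10.38) gives roughly 67%. The sketch below shows the calculation on hypothetical toy rows; only the column names come from the notebook.

```python
import pandas as pd

# Hypothetical toy data standing in for matches.csv
toy = pd.DataFrame({
    "toss_winner": [
        "Chennai Super Kings",
        "Chennai Super Kings",
        "Chennai Super Kings",
        "Mumbai Indians",
    ],
    "toss_decision": ["bat", "bat", "field", "bat"],
})

won_toss = toy["toss_winner"] == "Chennai Super Kings"
chose_bat = toy["toss_decision"] == "bat"

# The mean of a boolean Series is the fraction of True values
p_joint = (won_toss & chose_bat).mean()   # P(wins toss and bats)
p_toss = won_toss.mean()                  # P(wins toss)
p_cond = p_joint / p_toss                 # P(bats | wins toss)
print(p_joint, p_toss, p_cond)
```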


### Probability of the toss winner choosing to field and winning the match

In [6]:
# Total number of matches
total_matches = len(df)
# Matches in which the toss winner chose to field and went on to win
toss_field = len(df[(df['toss_decision'] == 'field') & (df['winner'] == df['toss_winner'])])

# Joint probability over all matches, not conditional on the decision to field
probability = toss_field / total_matches
print('Probability of the toss winner choosing to field and winning the match:{0: .2f}%'.format(probability*100))

Probability of the toss winner choosing to field and winning the match: 31.60%
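
Because the denominator above is all matches, the result is a joint probability. To compare decisions directly, one option is to condition on the toss decision and compute the toss winner's win rate within each group. The sketch below does this with `groupby` on hypothetical toy rows. The team labels "A", "B", "C" and the helper column `toss_winner_won` are assumptions for illustration.

```python
import pandas as pd

# Hypothetical toy data; only the column names match the notebook
toy = pd.DataFrame({
    "toss_winner":   ["A", "B", "A", "C", "B"],
    "toss_decision": ["field", "field", "bat", "field", "bat"],
    "winner":        ["A", "A", "A", "C", "C"],
})

# True where the team that won the toss also won the match
toy["toss_winner_won"] = toy["winner"] == toy["toss_winner"]

# Win rate of the toss winner within each toss decision:
# P(toss winner wins | decision)
by_decision = toy.groupby("toss_decision")["toss_winner_won"].mean()
print(by_decision)
```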


### Probability of a specific event occurring in a match (e.g., a player being named player of the match)

In [9]:
total_matches = len(df)
# The dataset has no per-player scores, so centuries cannot be counted here;
# this counts the matches in which MS Dhoni was named player of the match
potm_matches = len(df[df['player_of_match'] == 'MS Dhoni'])

probability = potm_matches / total_matches
print('Probability of MS Dhoni being the player of the match:{0: .2f}%'.format(probability*100))

Probability of MS Dhoni being the player of the match: 2.04%
