### Probability

Probability is the `chance` that something happens.  
When you flip a fair `coin`, there is a probability of 0.5 (a 50% chance) that it lands on heads.  

It's `like` asking, "What are the chances that this will happen?"  
Probability is a number `between` 0 and 1, where 0 means "no way" and 1 means "definitely happening".  

In [18]:
# Coin flip events
events = ['head', 'tail']

# Output probabilities: each equally likely event has probability 1/len(events)
print('Head =', 1/len(events))
print('Tail =', 1/len(events))

Head = 0.5
Tail = 0.5
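
One way to sanity-check the 0.5 figure is to simulate many flips and count how often heads comes up. The following is a minimal sketch, assuming a fair coin modelled with Python's `random` module; the names `n_flips`, `flips`, and `estimate`, the seed, and the sample size are illustrative choices and not part of the original notebook.

```python
import random

# Assumption: a fair coin, modelled as a uniform choice between two events
events = ['head', 'tail']

random.seed(0)      # fixed seed so the run is repeatable
n_flips = 10_000    # illustrative sample size

# Simulate the flips and measure the share that landed on heads
flips = [random.choice(events) for _ in range(n_flips)]
estimate = flips.count('head') / n_flips

print('Estimated P(head) =', estimate)
```

The estimate should land close to 0.5, and it tends to get closer as `n_flips` grows.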


### Probability Distribution

Now, imagine you're not just flipping a coin but `rolling` a die.  
There are more `outcomes` (1 through 6), each with its own probability.  

A probability distribution is a `list` of all these probabilities, which together add up to 1.  
It's like a `map` with all the possible outcomes and how likely they are.

A probability gives you the `chance` of a single event.  
A probability distribution tells the `story` of all possible outcomes.  
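
For the die example above, the distribution could be written out as follows. This is a small sketch that assumes a fair six-sided die; `die_distribution` is an illustrative name, and `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction

# Assumption: a fair six-sided die, so every face is equally likely
faces = [1, 2, 3, 4, 5, 6]
die_distribution = {face: Fraction(1, len(faces)) for face in faces}

# A single probability: the chance of one event
print('P(roll a 3) =', die_distribution[3])

# The whole distribution: every outcome with its probability
for face, p in die_distribution.items():
    print(face, p)

# The probabilities of all possible outcomes add up to 1
print('Total =', sum(die_distribution.values()))
```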

In [15]:
import pandas as pd
from icecream import ic

# Dataset
A = ['apple']*1 + ['orange']*2 + ['banana']*2
B = ['apple']*5 + ['orange']*2 + ['banana']*0

ic(A)
ic(B)

# Probability (by HAND)
PA_hand = [{'apple': 1/5}, {'orange': 2/5}, {'banana': 2/5}] 
PB_hand = [{'apple': 5/7}, {'orange': 2/7}]

display('\n')

ic(PA_hand)
ic(PB_hand)

# Probability (with PANDAS)
PA_pandas = pd.Series(A).value_counts(normalize=True)
PB_pandas = pd.Series(B).value_counts(normalize=True)



ic| A: ['apple', 'orange', 'orange', 'banana', 'banana']
ic| B: ['apple', 'apple', 'apple', 'apple', 'apple', 'orange', 'orange']


'\n'

ic| PA_hand: [{'apple': 0.2}, {'orange': 0.4}, {'banana': 0.4}]
ic| PB_hand: [{'apple': 0.7142857142857143}, {'orange': 0.2857142857142857}]
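
The hand-written values can also be kept in a single dictionary per dataset and checked against counts taken from the data. The sketch below uses only the standard library; the helper name `distribution` is hypothetical and not part of the notebook.

```python
from collections import Counter
from math import isclose

def distribution(values):
    """Return {value: relative frequency} for a list of values."""
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in counts.items()}

A = ['apple']*1 + ['orange']*2 + ['banana']*2
B = ['apple']*5 + ['orange']*2 + ['banana']*0

PA = distribution(A)
PB = distribution(B)

# Compare with the probabilities worked out by hand
PA_hand = {'apple': 1/5, 'orange': 2/5, 'banana': 2/5}
PB_hand = {'apple': 5/7, 'orange': 2/7}
assert all(isclose(PA[k], PA_hand[k]) for k in PA_hand)
assert all(isclose(PB[k], PB_hand[k]) for k in PB_hand)

# Outcomes that never occur (banana in B) are absent, which means probability 0
print(PA)
print(PB)
print(PB.get('banana', 0.0))
```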


### Entropy

Entropy is a measure of how `disordered` a collection is.  
The more `impure` the feature is, the higher the entropy.  

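Here, the probability distribution is the relative `frequency` of each unique value.  
It turns out that a `logarithm` is well suited to computing entropy: for equally likely states, entropy is the base-2 logarithm of the number of states, and in general it is `H = -sum(p * log2(p))` over the probabilities `p`.  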
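
As a worked illustration, here is the Shannon entropy formula applied by hand to two small collections, A (1 apple, 2 oranges, 2 bananas) and B (5 apples, 2 oranges). The symbols $H$ and $p_i$ are notation assumed here, and the results are rounded to three decimals.

```latex
H = -\sum_{i} p_i \log_2 p_i

H(A) = -\left(\tfrac{1}{5}\log_2\tfrac{1}{5} + \tfrac{2}{5}\log_2\tfrac{2}{5} + \tfrac{2}{5}\log_2\tfrac{2}{5}\right) \approx 1.522

H(B) = -\left(\tfrac{5}{7}\log_2\tfrac{5}{7} + \tfrac{2}{7}\log_2\tfrac{2}{7}\right) \approx 0.863
```

A is spread more evenly across its three values, so its entropy is higher than that of B, which is dominated by apples.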

In [None]:
import pandas as pd
import numpy as np
from icecream import ic

# Set the initial training data
A = ['apple']*1 + ['orange']*2 + ['banana']*2
B = ['apple']*5 + ['orange']*2 + ['banana']*0

# Probability
PA_pandas = pd.Series(A).value_counts(normalize=True)
PB_pandas = pd.Series(B).value_counts(normalize=True)
ic(PA_pandas)
ic(PB_pandas)

# Entropy (Shannon model)
P1 = PA_pandas.values
P2 = PB_pandas.values
H1 = -1 * np.sum(P1 * np.log2(P1))
H2 = -1 * np.sum(P2 * np.log2(P2))


ic(H1)
ic(H2)

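# Check the claim instead of only printing it
assert H1 > H2, "Expected A to have higher entropy than B"
print("A entropy > B entropy: there is more disorder in A than in B")
print("Assertion passed")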
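
The same calculation can be wrapped in a small reusable function. This is a sketch using only the standard library rather than pandas and NumPy; `shannon_entropy` is a hypothetical helper name, not something defined in the notebook.

```python
from collections import Counter
from math import log2

def shannon_entropy(values):
    """Shannon entropy (in bits) of the relative frequencies in `values`."""
    total = len(values)
    probabilities = [count / total for count in Counter(values).values()]
    # Values that never occur are absent from the Counter, so log2(0) is never evaluated
    return -sum(p * log2(p) for p in probabilities)

A = ['apple']*1 + ['orange']*2 + ['banana']*2
B = ['apple']*5 + ['orange']*2

H_A = shannon_entropy(A)
H_B = shannon_entropy(B)
print('H(A) =', round(H_A, 3))
print('H(B) =', round(H_B, 3))

# Two reference points: a pure collection has entropy 0,
# and four equally likely values need log2(4) = 2 bits
H_pure = shannon_entropy(['apple'] * 4)
H_uniform = shannon_entropy(['a', 'b', 'c', 'd'])
```

Working from the raw values rather than a precomputed distribution keeps the helper usable on any list of hashable items.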