<a href="https://colab.research.google.com/github/luisitobarcito/info-theory-for-ml/blob/master/Refresher.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Simulating the apples and oranges example from the probability refresher
Here we simulate the procedure of randomly picking a fruit from one of the two colored boxes used in our review of basic probability.

In [0]:
import numpy as np
from tabulate import tabulate

fruits = np.asarray(['apple', 'orange'])
boxes = np.asarray(['red', 'blue'])

Pb = np.asarray([0.4, 0.6])
n_fruit = {'red': np.asarray([2, 6]), 'blue':np.asarray([3, 1])}

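# create table with conditional probabilities P(F|B): rows are fruits, columns are boxes
# (the np.float alias was removed in recent NumPy releases; the builtin float is equivalent)
Pfgb = np.zeros((fruits.size, boxes.size), dtype=float)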
for j, box in enumerate(boxes):
  Pfgb[:, j] = n_fruit[box] / np.sum(n_fruit[box])

print(tabulate(np.concatenate((fruits[:,None], Pfgb), 1), headers=boxes))

          red    blue
------  -----  ------
apple    0.25    0.75
orange   0.75    0.25


# Check conditional probabilities sum to one
We need to confirm that $\sum_F P(F|B) = 1$ for each box $B$. This can be accomplished by summing the elements in each column of the array ```Pfgb```.



In [0]:
print(Pfgb.sum(axis=0))

[1. 1.]
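Floating-point sums are not always exactly one, so a tolerant comparison can be a safer check than reading the printed values. A minimal sketch, assuming the table printed above is typed in by hand; the names `col_sums` and `cond_ok` are illustrative and not part of the original notebook:

```python
import numpy as np

# Conditional probabilities P(F|B) as printed above:
# rows are fruits (apple, orange), columns are boxes (red, blue)
Pfgb = np.array([[0.25, 0.75],
                 [0.75, 0.25]])

# Each column (one per box) should sum to one, up to rounding error
col_sums = Pfgb.sum(axis=0)
cond_ok = np.allclose(col_sums, 1.0)
print(cond_ok)
```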


# Compute joint probability from the product rule
We can compute the joint probability as $P(F,B) = P(F|B)P(B)$

In [0]:
Pfb = Pfgb*Pb[None, :] # broadcasting multiplication
print(tabulate(np.concatenate((fruits[:,None], Pfb), 1), headers=boxes))


          red    blue
------  -----  ------
apple     0.1    0.45
orange    0.3    0.15
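To make the broadcasting step explicit, the sketch below compares it with a plain loop that scales each column of $P(F|B)$ by the matching entry of $P(B)$. The arrays are re-entered from the values above so the cell runs on its own; `Pfb_broadcast` and `Pfb_loop` are illustrative names:

```python
import numpy as np

Pb = np.array([0.4, 0.6])            # P(B) for red, blue
Pfgb = np.array([[0.25, 0.75],       # P(F|B): rows apple, orange
                 [0.75, 0.25]])      # columns red, blue

# Broadcasting: Pb[None, :] has shape (1, 2) and is stretched across the rows
Pfb_broadcast = Pfgb * Pb[None, :]

# Equivalent explicit loop over boxes
Pfb_loop = np.zeros_like(Pfgb)
for j in range(Pb.size):
    Pfb_loop[:, j] = Pfgb[:, j] * Pb[j]

print(np.allclose(Pfb_broadcast, Pfb_loop))
```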


# Check joint probabilities all sum to one
We need to confirm that $\sum_{F, B}P(F,B) = 1$. This can be accomplished by summing all the elements of the array ```Pfb```.


In [0]:
print(Pfb.sum())

1.0


# Compute marginal $P(F)$
This can be accomplished by summing the elements in each row of the array ```Pfb```, that is, $P(F) = \sum_B P(F,B)$.



In [0]:
Pf = Pfb.sum(axis=1)
print(tabulate([Pf.tolist()], headers=fruits))

  apple    orange
-------  --------
   0.55      0.45
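The same sum rule applied along the other axis should recover the box prior $P(B)$, which gives a quick consistency check on the joint table. A small sketch, assuming the joint values printed earlier; `Pb_check` is an illustrative name:

```python
import numpy as np

# Joint probabilities P(F,B) as printed earlier:
# rows are fruits (apple, orange), columns are boxes (red, blue)
Pfb = np.array([[0.10, 0.45],
                [0.30, 0.15]])

Pf = Pfb.sum(axis=1)        # sum over boxes  -> P(F)
Pb_check = Pfb.sum(axis=0)  # sum over fruits -> P(B)

print(Pf)
print(Pb_check)
```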


# Create a function to pick a fruit
Recall that we first pick one of the boxes with probability $P(B)$ and then pick one of the fruits from that box with probability $P(F|B)$.

In [0]:
def pickFruit():
  # pick a box index with probability P(B)
  b = np.random.choice(len(boxes), p=Pb)
  # pick a fruit index with probability P(F|B) for the chosen box
  f = np.random.choice(len(fruits), p=Pfgb[:, b])
  return fruits[f].item(), boxes[b].item()

print(pickFruit())

('orange', 'blue')


# Run for $N$ trials and record fruit, box pairs
We can run the ```pickFruit``` function $N$ times and accumulate the outcome counts in a dictionary keyed by (fruit, box) pairs.

In [0]:
N = 1000
Nfb = {}
for t in range(N):
  fb = pickFruit()
  # start the count at zero the first time a (fruit, box) pair is seen
  Nfb[fb] = Nfb.get(fb, 0) + 1
print(Nfb)

{('orange', 'blue'): 148, ('apple', 'blue'): 436, ('orange', 'red'): 296, ('apple', 'red'): 120}
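Dividing the counts by the number of trials gives empirical estimates of the joint probabilities, which should approach $P(F,B)$ as the number of trials grows. The sketch below is self-contained and makes a few assumptions that are not in the original notebook: it uses `np.random.default_rng` with an arbitrary seed for reproducibility, a larger trial count of 10000, and stores the counts in an array rather than a dictionary.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility

fruits = ['apple', 'orange']
boxes = ['red', 'blue']
Pb = np.array([0.4, 0.6])            # P(B)
Pfgb = np.array([[0.25, 0.75],       # P(F|B): rows are fruits
                 [0.75, 0.25]])      # columns are boxes
Pfb = Pfgb * Pb[None, :]             # exact joint P(F,B)

N = 10000                            # assumed trial count for this sketch
counts = np.zeros_like(Pfb)
for _ in range(N):
    b = rng.choice(len(boxes), p=Pb)            # pick a box with P(B)
    f = rng.choice(len(fruits), p=Pfgb[:, b])   # pick a fruit with P(F|B)
    counts[f, b] += 1

Pfb_hat = counts / N                 # relative frequencies estimate P(F,B)
max_abs_err = np.abs(Pfb_hat - Pfb).max()
print(Pfb_hat)
print(max_abs_err)
```

The counts from the run above tell the same story: 120/1000, 436/1000, 296/1000 and 148/1000 are close to the exact values 0.1, 0.45, 0.3 and 0.15.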
