## Playing with Probabilities and Python


### The birthday coincidence

([Birthday paradox, Spanish Wikipedia](https://es.wikipedia.org/wiki/Paradoja_del_cumplea%C3%B1os))

What is the probability that, in a group of randomly chosen people, at least two of them were born on the same day of the year? How many people are needed for that probability to exceed 50%?

Computing that probability directly is awkward, so we compute the probability that no two birthdays coincide instead, assuming the birthdays are independent (so the individual probabilities can be multiplied). The probability of at least one coincidence is then 1 minus that value.

Leaving February 29 out of the calculation and assuming the remaining 365 possible birthdays are equally likely, we will answer both questions.
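
Written out, the complement argument reads as follows for a group of $n$ people. This is a sketch under the assumptions already stated (365 equally likely birthdays, independent people); the symbol $n$ is introduced here only for illustration:

$$P(\text{no match among } n) = \prod_{i=0}^{n-1} \frac{365-i}{365}, \qquad P(\text{at least one match}) = 1 - \prod_{i=0}^{n-1} \frac{365-i}{365}$$

For example, with $n = 3$ the probability of a match is $1 - \frac{365}{365}\cdot\frac{364}{365}\cdot\frac{363}{365} \approx 0.0082$.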

In [2]:
# Example, scenario 2: the birthday coincidence

prob = 1.0
asistentes = 50

# Probability of at least one shared birthday among 50 attendees

for i in range(asistentes):
    prob = prob * (365-i)/365

print("Probability that at least two share a birthday is {0:.2f}"
      .format(1 - prob))

# Number of attendees needed for the probability of a
# coincidence to exceed 50%

asistentes = 0
prob = 1
while prob > 0.5:
    # use the running attendee count, not the stale loop variable i
    prob = prob * (365-asistentes)/365
    asistentes += 1

print("For the probability to exceed 50% we need {0} attendees".format(asistentes))

Probability that at least two share a birthday is 0.97
For the probability to exceed 50% we need 23 attendees
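
The analytic product can be cross-checked by simulation. The sketch below is not part of the original exercise: the function name `simulate_birthday_match`, the number of trials and the fixed seed are all assumptions chosen for illustration. It draws random birthdays and counts how often a group contains a repeat.

```python
import random

def simulate_birthday_match(group_size, trials=10_000, seed=0):
    """Estimate P(at least two people share a birthday) by simulation."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        birthdays = [rng.randrange(365) for _ in range(group_size)]
        # A set drops duplicates, so a shorter set means a shared birthday
        if len(set(birthdays)) < group_size:
            hits += 1
    return hits / trials

for group_size in (22, 23, 50):
    print(group_size, round(simulate_birthday_match(group_size), 3))
```

The estimates fluctuate with the seed and the number of trials, but they should hover around the values given by the exact product.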


## Random variables. Let's roll a die

We will work with discrete variables. In this case we simulate a die with the `random` module, which is part of Python's standard library:

In [None]:
# import the random module; you can use dir(random) to explore what it offers

import random


In [None]:
# use help() to read the documentation for the randint method

help(random.randint)

Help on method randint in module random:

randint(a, b) method of random.Random instance
    Return random integer in range [a, b], including both end points.



In [None]:
# use randint() to simulate a die and make one roll

dado = random.randint(1,6)
print(dado)


4


In [None]:
# now make 20 rolls and collect them in a list

tiradas = []
for t in range(20):
  dado = random.randint(1,6)
  tiradas.append(dado)

print(tiradas)

[1, 6, 5, 4, 4, 3, 4, 6, 2, 5, 2, 6, 5, 2, 6, 6, 3, 6, 2, 6]


In [None]:
# Let's compute the mean of the rolls

mu = sum(tiradas)/ len(tiradas)
print(mu)


4.2


In [None]:
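# Now compute the median (for an even count, average the two middle values)

TiradasOrdenada = sorted(tiradas)
mitad = len(tiradas)//2
if len(tiradas) % 2 == 0:
  mediana = (TiradasOrdenada[mitad-1] + TiradasOrdenada[mitad]) / 2
else:
  mediana = TiradasOrdenada[mitad]
print(mediana)

4.5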


In [None]:
# Compute the mode of the rolls

moda = {1:0, 2:0, 3:0, 4:0, 5:0, 6:0}
for t in tiradas:
  if t in moda:
    moda[t] +=1

print(moda)
print(f'The mode is {max(moda, key=moda.get)}')


{1: 1, 2: 4, 3: 2, 4: 3, 5: 3, 6: 7}
The mode is 6


In [None]:
# Mode, using a list of counters indexed by face value

def printmode(a):
  max_element = max(a)

  t = max_element +1
  cont= [0] * t

  # count how many times each value appears
  for i in range(len(a)):
    cont[a[i]] += 1

  # scan the counts only after the counting loop has finished
  mode = 0
  k = cont[0]

  for i in range(1, t):
    if cont[i] > k:
      k = cont[i]
      mode = i
  return mode

moda = printmode(tiradas)
print(tiradas)
print(moda)


[1, 6, 5, 4, 4, 3, 4, 6, 2, 5, 2, 6, 5, 2, 6, 6, 3, 6, 2, 6]
6


In [None]:
# mode: return every face tied for the highest count

def obtain_mode(nlist, N=6):
    repetitions = []
    for i in range(1, N+1):
        i_rep = 0
        for element in nlist:
            if element == i:
                i_rep += 1
        repetitions.append(i_rep)
    result = []
    for i, element in enumerate(repetitions):
        if element == max(repetitions):
            result.append(i + 1)
    return result

obtain_mode(tiradas)

[6]

In [None]:
# mode: count how often each face appears

conteo = []

for x in range(1, 7):
  conteo.append(tiradas.count(x))
print(conteo)

[1, 4, 2, 3, 3, 7]
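
The hand-written versions above can be compared against the standard library. This is a minimal sketch, assuming Python 3.8 or later (for `statistics.multimode`) and reusing the 20 rolls printed earlier; the variable names are chosen here for illustration.

```python
import statistics
from collections import Counter

# Same 20 rolls as in the cells above
tiradas = [1, 6, 5, 4, 4, 3, 4, 6, 2, 5, 2, 6, 5, 2, 6, 6, 3, 6, 2, 6]

media = statistics.mean(tiradas)
mediana = statistics.median(tiradas)   # averages the two middle values for even-length data
modas = statistics.multimode(tiradas)  # list with every most-frequent value
conteo = Counter(tiradas)              # face value -> number of occurrences

print(media, mediana, modas)
print(sorted(conteo.items()))
```

`multimode` returns a list because several faces can be tied for the highest count, which is the same design choice as `obtain_mode` above.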


## Watching the number of sixes evolve as we roll more

Now let's see how the number of sixes we obtain evolves when rolling the die 10000 times. We will build a list in which each element is the number of occurrences of the number 6 divided by the number of rolls.

Create a list called ``frecuencia_seis`` that stores these values.


In [None]:
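# your code here

import random
numeros_series = 500
frecuencia_seis = []
# one independent series per point: 1 roll, 2 rolls, ..., numeros_series-1 rolls
for lanzamientos in range(1,numeros_series):
  seises =[]   # every roll of this series (not only the sixes)
  for i in range(lanzamientos):
    seises.append(random.randint(1,6))
  numeros_seis = seises.count(6)
  # relative frequency of six in this series
  frecuencia_seis.append(numeros_seis/lanzamientos)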

### Let's try to do it graphically
What value should the numbers in the list frecuencia_seis converge to?
Review the law of large numbers for the coin, and apply a bit of logic to this case.


In [None]:
import matplotlib.pyplot as plt
x = range(1,numeros_series)
plt.figure(figsize=(20,10))
plt.plot(x,frecuencia_seis)
plt.axhline(1/6, color='r')
plt.xlabel("number of rolls")
plt.ylabel("estimated probability of a six")
plt.show()
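
The cells above draw a fresh batch of rolls for every point. A complementary view, sketched here as an assumption rather than as the intended solution, follows one long sequence of 10000 rolls and records the running frequency of sixes after each roll. The function name `running_six_frequency` and the fixed seed are hypothetical choices.

```python
import random

def running_six_frequency(n_rolls=10_000, seed=0):
    """Relative frequency of sixes after each roll of one long sequence."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    sixes = 0
    frequencies = []
    for roll_number in range(1, n_rolls + 1):
        if rng.randint(1, 6) == 6:
            sixes += 1
        frequencies.append(sixes / roll_number)
    return frequencies

frecuencias = running_six_frequency()
for n in (10, 100, 1_000, 10_000):
    print(n, round(frecuencias[n - 1], 4))
```

Because each point reuses all previous rolls, this curve is smoother than the one built from independent series, and it should settle near 1/6 as the number of rolls grows.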

# Solving the Monty Hall problem

This puzzle is best known as the [Monty Hall](https://es.wikipedia.org/wiki/Problema_de_Monty_Hall) problem.
First, try to simulate the Monty Hall problem in Python to see how often the contestant wins and how often they lose. Run, for example, 10000 simulations in which the contestant always switches doors. Then compare with 10000 simulations in which the contestant never switches.
What are the results?


### Monty Hall without Bayes - Simulation

In [None]:

# let's try to solve the Monty Hall problem by simulating how the player would play
# Simulate 10000 games
# You can fix the strategy in advance: run 10000 simulations in which the player always switches doors
# and you can also run 10000 in which the player never switches


[link text](https://drive.google.com/open?id=1MNZwRlFMwolKK17hzuDcpXck5X11XEqX)

In [None]:
# Version 1: the player picks at random between the two doors left closed

import random

no_cambia = 0
cambia = 0
n = 1000000

for j in range(n):
  puertas = [1,2,3]
  coche = random.choice(puertas)
  primera_opcion = random.choice(puertas)

  puertas.remove(random.choice([p for p in puertas if (p != coche) and (p != primera_opcion)]))
  
  segunda_opcion = random.choice(puertas)

  if primera_opcion == segunda_opcion == coche:
    no_cambia +=1
  elif (primera_opcion != segunda_opcion) and (segunda_opcion == coche):
    cambia +=1

# Normalise over the winning games only: share of wins obtained
# by staying versus by switching
cambia_n = cambia/n
no_cambia_n = no_cambia/n
probab_cambia = cambia_n/(cambia_n + no_cambia_n)
probab_no_cambia = no_cambia_n/(cambia_n + no_cambia_n)

print(f'If the player does not switch, the probability of winning is {probab_no_cambia}, with {no_cambia} wins out of {n}')
print(f'If the player switches, the probability of winning is {probab_cambia}, with {cambia} wins out of {n}')



If the player does not switch, the probability of winning is 0.3339348158343269, with 166926 wins out of 1000000
If the player switches, the probability of winning is 0.6660651841656732, with 332950 wins out of 1000000


In [None]:
# Version 2

opciones = []
ganar_sincambiarpuerta = 0
ganar_cambiopuerta = 0
perder = 0
n = 1000000
for i in range(n):
    puertas = [1,2,3]
    puerta_coche = random.choice(puertas) 
    election1 = random.choice(puertas)
    opciones = [i for i in puertas if (i == puerta_coche or i == election1)]
    
    if len(opciones) == 1:
        puertas.remove(opciones[0])
        opciones.append(random.choice(puertas))
        
    apuesta = random.choice(opciones)
    if apuesta == puerta_coche:   
        if (apuesta == election1):
            ganar_sincambiarpuerta += 1
        elif (apuesta != election1):
            ganar_cambiopuerta += 1
    else:
        perder+=1
        
# Counting wins (both ways) and losses:

p_ganar_sincambiarpuerta = ganar_sincambiarpuerta/n
p_ganar_cambiopuerta = ganar_cambiopuerta/n
p_perder = perder/n

# Counting only the games that were won:

ganar_p_ganar_sincambiarpuerta = p_ganar_sincambiarpuerta/(p_ganar_sincambiarpuerta+p_ganar_cambiopuerta)
ganar_p_ganar_cambiopuerta = p_ganar_cambiopuerta/(p_ganar_sincambiarpuerta+p_ganar_cambiopuerta)
print('win without switching doors:\n',p_ganar_sincambiarpuerta, '\n win by switching doors:\n', p_ganar_cambiopuerta,' \n p_perder:\n',p_perder )
print('Only counting wins, without switching doors:\n',ganar_p_ganar_sincambiarpuerta)
print('Only counting wins, switching doors:\n',ganar_p_ganar_cambiopuerta)

win without switching doors:
 0.166783 
 win by switching doors:
 0.33379  
 p_perder:
 0.499427
Only counting wins, without switching doors:
 0.33318417094010266
Only counting wins, switching doors:
 0.6668158290598974


In [None]:
# MONTY HALL WITH CLASSES

import numpy as np

def elegir_puerta():
    """
    Pick a door. Returns 1, 2 or 3 at random.
    """
    return np.random.randint(1,4)
class MontyHall:
    """
    Class that models the Monty Hall problem.
    """
    def __init__(self):
        """
        Create an instance of the problem.
        """
        # Pick the winning door at random.
        self.puerta_ganadora = elegir_puerta()
        # variables for the chosen door and the discarded door
        self.puerta_elegida = None
        self.puerta_descartada = None
    def selecciona_puerta(self):
        """
        Select the contestant's door at random.
        """
        self.puerta_elegida = elegir_puerta()
    def descarta_puerta(self):
        """
        With this method the host discards one of the doors.
        """
        # pick a door at random.
        d = elegir_puerta()
        # If it is the winning door or the contestant's door, pick again.
        while d == self.puerta_ganadora or d == self.puerta_elegida:
            d = elegir_puerta()
        # Store the value in puerta_descartada.
        self.puerta_descartada = d
    def cambiar_puerta(self):
        """
        Switch the contestant's door once a door has been eliminated.
        """
        # 1+2+3=6. Only one door is left to choose.
        self.puerta_elegida = 6 - self.puerta_elegida - self.puerta_descartada
    def gana_concursante(self):
        """
        Determine whether the contestant wins.
        Returns True on a win, False on a loss.
        """
        return self.puerta_elegida == self.puerta_ganadora
    def jugar(self, cambiar=True):
        """
        Once the instance has been created, play the game.
        'cambiar' determines whether the contestant switches their choice.
        """
        # The contestant picks a door.
        self.selecciona_puerta()
        # The host eliminates a door.
        self.descarta_puerta()
        # The contestant switches their choice.
        if cambiar:
            self.cambiar_puerta()
        # Determine whether the contestant has won.
        return self.gana_concursante()


# Now we play the game. First we keep our initial choice.
# We run the experiment 10,000 times.
gana, pierde = 0, 0
for i in range(10000):
    # Create an instance of the problem.
    s2 = MontyHall()
    # play the game without switching doors.
    if s2.jugar(cambiar=False):
        # True means the contestant wins.
        gana += 1
    else:
        # False means the contestant loses.
        pierde += 1
# let's look at the contestant's win frequency.
porc_gana = 100.0 * gana / (gana + pierde)
print("\n10,000 games without switching doors:")
print("  wins: {0:} games".format(gana))
print("  losses: {0:} games".format(pierde))
print("  win rate: {0:.2f} percent of games won".format(porc_gana))


# Now we play the game always switching the initial choice.
# We run the experiment 10,000 times.
gana, pierde = 0, 0
for i in range(10000):
    # Create an instance of the problem.
    s2 = MontyHall()
    # play the game switching doors.
    if s2.jugar(cambiar=True):
        # True means the contestant wins.
        gana += 1
    else:
        # False means the contestant loses.
        pierde += 1
# let's look at the contestant's win frequency.
porc_gana = 100.0 * gana / (gana + pierde)
print("\n10,000 games switching doors:")
print("  wins: {0:} games".format(gana))
print("  losses: {0:} games".format(pierde))
print("  win rate: {0:.2f} percent of games won".format(porc_gana))
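
The simulations can be checked against an exact enumeration. The sketch below is an addition, not part of the original exercise: the function name `exact_win_probability` is hypothetical, and it assumes that the host picks uniformly at random whenever two doors can be opened, which mirrors the `random.choice` call in Version 1.

```python
from fractions import Fraction
from itertools import product

def exact_win_probability(cambiar):
    """Exact chance of winning for a fixed strategy (True = always switch)."""
    puertas = (1, 2, 3)
    total = Fraction(0)
    # The 9 (car, first pick) pairs are equally likely
    for coche, eleccion in product(puertas, repeat=2):
        # The host may open any door that hides no car and was not picked
        abribles = [p for p in puertas if p != coche and p != eleccion]
        for abierta in abribles:
            if cambiar:
                # switch to the only door that is neither picked nor opened
                final = next(p for p in puertas if p not in (eleccion, abierta))
            else:
                final = eleccion
            if final == coche:
                total += Fraction(1, 9) / len(abribles)
    return total

print(exact_win_probability(cambiar=False), exact_win_probability(cambiar=True))
# prints: 1/3 2/3
```

Using `Fraction` keeps the arithmetic exact, so the result can be compared directly with the simulated frequencies.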

# Building a classifier based on Bayes' theorem

This problem is taken from Chris Albon's website, which reproduces an example you can find on Wikipedia. Try to reproduce it and understand it.

Naive Bayes is a simple classifier known for doing well when only a small number of observations is available. In this tutorial we will create a Gaussian naive Bayes classifier from scratch and use it to predict the class of a previously unseen data point. This tutorial is based on an example on Wikipedia's [naive bayes classifier page](https://en.wikipedia.org/wiki/Naive_Bayes_classifier); I have implemented it in Python and tweaked some notation to improve the explanation.

## Preliminaries

In [3]:
import pandas as pd
import numpy as np

## Create Data

Our dataset contains data on eight individuals. We will use the dataset to construct a classifier that takes in the height, weight, and foot size of an individual and outputs a prediction for their gender.

In [4]:
# Create an empty dataframe
data = pd.DataFrame()

# Create our target variable
data['Gender'] = ['male','male','male','male','female','female','female','female']

# Create our feature variables
data['Height'] = [6,5.92,5.58,5.92,5,5.5,5.42,5.75]
data['Weight'] = [180,190,170,165,100,150,130,150]
data['Foot_Size'] = [12,11,12,10,6,8,7,9]

# View the data
data

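|   | Gender | Height | Weight | Foot_Size |
|---|--------|--------|--------|-----------|
| 0 | male   | 6.0    | 180    | 12        |
| 1 | male   | 5.92   | 190    | 11        |
| 2 | male   | 5.58   | 170    | 12        |
| 3 | male   | 5.92   | 165    | 10        |
| 4 | female | 5.0    | 100    | 6         |
| 5 | female | 5.5    | 150    | 8         |
| 6 | female | 5.42   | 130    | 7         |
| 7 | female | 5.75   | 150    | 9         |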


The dataset above is used to construct our classifier. Below we will create a new person for whom we know their feature values but not their gender. Our goal is to predict their gender.

In [5]:
# Create an empty dataframe
person = pd.DataFrame()

# Create some feature values for this single row
person['Height'] = [6]
person['Weight'] = [130]
person['Foot_Size'] = [8]

# View the data 
person

|   | Height | Weight | Foot_Size |
|---|--------|--------|-----------|
| 0 | 6      | 130    | 8         |


## Bayes Theorem

Bayes theorem is a famous equation that allows us to make predictions based on data. Here is the classic version of the Bayes theorem:

$$\displaystyle P(A\mid B)={\frac {P(B\mid A)\,P(A)}{P(B)}}$$

This might be too abstract, so let us replace some of the variables to make it more concrete. In a bayes classifier, we are interested in finding out the class (e.g. male or female, spam or ham) of an observation _given_ the data:

$$p(\text{class} \mid \mathbf {\text{data}} )={\frac {p(\mathbf {\text{data}} \mid \text{class}) * p(\text{class})}{p(\mathbf {\text{data}} )}}$$

where: 

- $\text{class}$ is a particular class (e.g. male)
- $\mathbf {\text{data}}$ is an observation's data
- $p(\text{class} \mid \mathbf {\text{data}} )$ is called the posterior
- $p(\mathbf {\text{data}} \mid \text{class})$ is called the likelihood
- $p(\text{class})$ is called the prior
- $p(\mathbf {\text{data}} )$ is called the marginal probability

In a bayes classifier, we calculate the posterior (technically we only calculate the numerator of the posterior, but ignore that for now) for every class for each observation. Then we classify the observation based on the class with the largest posterior value. In our example, we have one observation to predict and two possible classes (male and female), therefore we will calculate two posteriors: one for male and one for female.

$$p(\text{person is male} \mid \mathbf {\text{person's data}} )={\frac {p(\mathbf {\text{person's data}} \mid \text{person is male}) * p(\text{person is male})}{p(\mathbf {\text{person's data}} )}}$$

$$p(\text{person is female} \mid \mathbf {\text{person's data}} )={\frac {p(\mathbf {\text{person's data}} \mid \text{person is female}) * p(\text{person is female})}{p(\mathbf {\text{person's data}} )}}$$

[link text](https://drive.google.com/open?id=1gpXv63so_yZaBpPOuUQ7ihw2h2896dPk)

## Gaussian Naive Bayes Classifier

A gaussian naive bayes is probably the most popular type of bayes classifier. To explain what the name means, let us look at what the bayes equations look like when we apply our two classes (male and female) and three feature variables (height, weight, and foot size):

$${\displaystyle {\text{posterior (male)}}={\frac {P({\text{male}})\,p({\text{height}}\mid{\text{male}})\,p({\text{weight}}\mid{\text{male}})\,p({\text{foot size}}\mid{\text{male}})}{\text{marginal probability}}}}$$

$${\displaystyle {\text{posterior (female)}}={\frac {P({\text{female}})\,p({\text{height}}\mid{\text{female}})\,p({\text{weight}}\mid{\text{female}})\,p({\text{foot size}}\mid{\text{female}})}{\text{marginal probability}}}}$$

Now let us unpack the top equation a bit:

- $P({\text{male}})$ is the prior probability. It is, as you can see, simply the probability that an observation is male. This is just the number of males in the dataset divided by the total number of people in the dataset.
- $p({\text{height}}\mid{\text{male}})\,p({\text{weight}}\mid{\text{male}})\,p({\text{foot size}}\mid{\text{male}})$ is the likelihood. Notice that we have unpacked $\mathbf {\text{person's data}}$ so it is now every feature in the dataset. The "gaussian" and "naive" come from two assumptions present in this likelihood:
    1. If you look at each term in the likelihood you will notice that we assume each feature is independent of the others within a class. That is, foot size is independent of weight or height, etc. This is obviously not true, and is a "naive" assumption, hence the name "naive bayes."
    2. Second, we assume that the values of the features (e.g. the height of women, the weight of women) are normally (gaussian) distributed. This means that, for example, $p(\text{height}\mid\text{female})$ is calculated by inputting the required parameters into the probability density function of the normal distribution:

$$ 
p(\text{height}\mid\text{female})=\frac{1}{\sqrt{2\pi\text{variance of female height in the data}}}\,e^{ -\frac{(\text{observation's height}-\text{average height of females in the data})^2}{2\text{variance of female height in the data}} }
$$

- $\text{marginal probability}$ is probably one of the most confusing parts of bayesian approaches. In toy examples (including ours) it is entirely possible to calculate the marginal probability. However, in many real-world cases, it is either extremely difficult or impossible to find the value of the marginal probability (explaining why is beyond the scope of this tutorial). This is not as much of a problem for our classifier as you might think. Why? Because we don't care what the true posterior value is, we only care which class has the highest posterior value. And because the marginal probability is the same for all classes, we can 1) ignore the denominator, 2) calculate only the posterior's numerator for each class, and 3) pick the largest numerator. That is, we can ignore the posterior's denominator and make a prediction solely on the relative values of the posterior's numerator.

Okay! Theory over. Now let us start calculating all the different parts of the bayes equations.

## Calculate Priors

Priors can be either constants or probability distributions. In our example, this is simply the probability of being a gender. Calculating this is simple:

In [6]:
# Number of males
n_male = data['Gender'][data['Gender'] == 'male'].count()

# Number of females
n_female = data['Gender'][data['Gender'] == 'female'].count()

# Total rows
total_ppl = data['Gender'].count()

In [7]:
# Number of males divided by the total rows
P_male = n_male/total_ppl

# Number of females divided by the total rows
P_female = n_female/total_ppl

## Calculate Likelihood

Remember that each term (e.g. $p(\text{height}\mid\text{female})$) in our likelihood is assumed to be a normal pdf. For example:

$$ 
p(\text{height}\mid\text{female})=\frac{1}{\sqrt{2\pi\text{variance of female height in the data}}}\,e^{ -\frac{(\text{observation's height}-\text{average height of females in the data})^2}{2\text{variance of female height in the data}} }
$$

This means that for each class (e.g. female) and feature (e.g. height) combination we need to calculate the variance and mean value from the data. Pandas makes this easy:

In [8]:
# Group the data by gender and calculate the means of each feature
data_means = data.groupby('Gender').mean()

# View the values
data_means

| Gender | Height | Weight | Foot_Size |
|--------|--------|--------|-----------|
| female | 5.4175 | 132.5  | 7.5       |
| male   | 5.855  | 176.25 | 11.25     |


In [9]:
# Group the data by gender and calculate the variance of each feature
data_variance = data.groupby('Gender').var()

# View the values
data_variance

| Gender | Height   | Weight     | Foot_Size |
|--------|----------|------------|-----------|
| female | 0.097225 | 558.333333 | 1.666667  |
| male   | 0.035033 | 122.916667 | 0.916667  |


Now we can create all the variables we need. The code below might look complex but all we are doing is creating a variable out of each cell in both of the tables above.

In [10]:
# Means for male (.loc[row label, column] reads one cell of the table)
male_height_mean = data_means.loc['male', 'Height']
print(male_height_mean)
male_weight_mean = data_means.loc['male', 'Weight']
male_footsize_mean = data_means.loc['male', 'Foot_Size']

# Variance for male
male_height_variance = data_variance.loc['male', 'Height']
male_weight_variance = data_variance.loc['male', 'Weight']
male_footsize_variance = data_variance.loc['male', 'Foot_Size']

# Means for female
female_height_mean = data_means.loc['female', 'Height']
female_weight_mean = data_means.loc['female', 'Weight']
female_footsize_mean = data_means.loc['female', 'Foot_Size']

# Variance for female
female_height_variance = data_variance.loc['female', 'Height']
female_weight_variance = data_variance.loc['female', 'Weight']
female_footsize_variance = data_variance.loc['female', 'Foot_Size']

5.855


Finally, we need to create a function to calculate the probability density of each of the terms of the likelihood (e.g. $p(\text{height}\mid\text{female})$).

In [11]:
# Create a function that calculates p(x | y):
def p_x_given_y(x, mean_y, variance_y):

    # Input the arguments into a probability density function
    p = 1/(np.sqrt(2*np.pi*variance_y)) * np.exp((-(x-mean_y)**2)/(2*variance_y))
    
    # return p
    return p
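
As a sanity check, the density formula can be compared with the standard library's `statistics.NormalDist`. This is a sketch under assumptions: it re-declares `p_x_given_y` with `math` instead of NumPy so the snippet runs on its own, and it copies the female height mean and variance from the tables above.

```python
import math
from statistics import NormalDist

def p_x_given_y(x, mean_y, variance_y):
    """Normal density, same formula as the function above but using math."""
    return 1 / math.sqrt(2 * math.pi * variance_y) * math.exp(-(x - mean_y) ** 2 / (2 * variance_y))

# Female height mean and variance, copied from the tables above
mean_h, var_h = 5.4175, 0.097225

ours = p_x_given_y(6, mean_h, var_h)
# NormalDist expects the standard deviation, not the variance
reference = NormalDist(mu=mean_h, sigma=math.sqrt(var_h)).pdf(6)
print(ours, reference)
```

Note that the result is a density, not a probability, so values above 1 are possible when the variance is small.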

## Apply Bayes Classifier To New Data Point

Alright! Our bayes classifier is ready. Remember that since we can ignore the marginal probability (the denominator), what we are actually calculating is this:

$${\displaystyle {\text{numerator of the posterior}}={P({\text{female}})\,p({\text{height}}\mid{\text{female}})\,p({\text{weight}}\mid{\text{female}})\,p({\text{foot size}}\mid{\text{female}})}{}}$$

To do this, we just need to plug in the values of the unclassified person (height = 6), the variables of the dataset (e.g. mean of female height), and the function (`p_x_given_y`) we made above:

In [12]:
# Numerator of the posterior if the unclassified observation is a male
P_male * \
p_x_given_y(person['Height'][0], male_height_mean, male_height_variance) * \
p_x_given_y(person['Weight'][0], male_weight_mean, male_weight_variance) * \
p_x_given_y(person['Foot_Size'][0], male_footsize_mean, male_footsize_variance)

6.197071843878078e-09

In [13]:
# Numerator of the posterior if the unclassified observation is a female
P_female * \
p_x_given_y(person['Height'][0], female_height_mean, female_height_variance) * \
p_x_given_y(person['Weight'][0], female_weight_mean, female_weight_variance) * \
p_x_given_y(person['Foot_Size'][0], female_footsize_mean, female_footsize_variance)

0.0005377909183630018

Because the numerator of the posterior for female is greater than the one for male, we predict that the person is female.
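
In this toy example the marginal probability is simply the sum of the two numerators, so the actual posteriors can be recovered by normalising. The sketch below is an addition that redoes the whole calculation with only the standard library. The names `samples`, `normal_pdf`, `numerators` and `posteriors` are hypothetical, and the data is copied from the table at the start of the tutorial.

```python
import math
from statistics import mean, variance

# Training data copied from the table above: (height, weight, foot size)
samples = {
    'male':   [(6, 180, 12), (5.92, 190, 11), (5.58, 170, 12), (5.92, 165, 10)],
    'female': [(5, 100, 6), (5.5, 150, 8), (5.42, 130, 7), (5.75, 150, 9)],
}
person = (6, 130, 8)

def normal_pdf(x, mu, var):
    """Density of a normal distribution with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

total_rows = sum(len(rows) for rows in samples.values())
numerators = {}
for label, rows in samples.items():
    value = len(rows) / total_rows  # prior for this class
    # zip(*rows) turns the rows into one column per feature
    for x, column in zip(person, zip(*rows)):
        # statistics.variance is the sample variance (n - 1), like pandas .var()
        value *= normal_pdf(x, mean(column), variance(column))
    numerators[label] = value

# In this two-class example the marginal is the sum of the numerators
marginal = sum(numerators.values())
posteriors = {label: value / marginal for label, value in numerators.items()}
prediction = max(posteriors, key=posteriors.get)
print(numerators)
print(posteriors)
print(prediction)
```

Normalising does not change which class wins; it only rescales the numerators so that they add up to 1.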