# ARTIFICIAL INTELLIGENCE (INF371, 2018-1)

### Lab 4: Bayesian Networks

This lab covers building a Bayesian network and running inference on it. The pgmpy library is used to make the task easier.

At the end of the notebook you must answer the questions posed.


## Bayesian network for prescribing medication to geriatric patients
The goal of this lab is to build a Bayesian network like the one shown below, which probabilistically models the prescription of medication to geriatric patients based on their medical background and current symptoms. The following random variables, among others, are involved in this domain:


<b>EP_Insuficiencia_Cardiaca</b>: Background (disease history) of the geriatric patients (50% = Present, 50% = Absent)

<b>MR_metropolol</b>: Medication to prescribe (4.25% = Prescribe normal dose, 10.4% = Prescribe low dose, 85.4% = Do not prescribe)

<b>SA_angina_de_pecho</b>: Current symptoms (51.1% = Present, 48.9% = Absent)

<b>SA_presion_arterial</b>: Current symptoms (15.0% = High, 45.3% = Normal, 39.7% = Low)

<b>EP_Asma_Bronquial</b>: Background (disease history) of the geriatric patients (50% = Present, 50% = Absent)

<b>EP_Diabetes</b>: Background (disease history) of the geriatric patients (50% = Present, 50% = Absent)

<b>EP_Gota</b>: Background (disease history) of the geriatric patients (50% = Present, 50% = Absent)

<b>MP_ibuprofeno</b>: Background on the medications the geriatric patients already take (50% = Prescribed, 50% = Not prescribed)






## Building the structure of the BN

In [5]:
from pgmpy.models import BayesianModel
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination
import pandas as pd
import numpy as np

In [6]:
bn = BayesianModel([('EP_Insuficiencia_Cardiaca', 'MR_metropolol'),              
                    ('EP_Asma_Bronquial', 'MR_metropolol'), 
                    ('EP_Diabetes', 'MR_metropolol'), 
                    ('EP_Diabetes', 'MR_hidroclorothiazida'), 
                    ('MP_ibuprofeno', 'MR_hidroclorothiazida'), 
                    ('EP_Gota', 'MR_hidroclorothiazida'), 
                    ('MR_metropolol', 'SA_angina_de_pecho'),
                    ('MR_metropolol', 'SA_danho_rinhones'),
                    ('MR_metropolol', 'SA_presion_arterial'),
                    ('MR_hidroclorothiazida', 'SA_danho_rinhones'),
                    ('MR_hidroclorothiazida', 'SA_presion_arterial'),
                    ('MR_amlodipino', 'SA_angina_de_pecho'),
                    ('MR_amlodipino', 'SA_danho_rinhones'),
                    ('MR_amlodipino', 'SA_presion_arterial'),
                    ('MR_losartan', 'SA_danho_rinhones'),
                    ('MR_losartan', 'SA_presion_arterial'),
                    ('MR_enalapril', 'SA_danho_rinhones'),
                    ('MR_enalapril', 'SA_presion_arterial'),
                    ('MR_furosemida', 'SA_presion_arterial'),
                    ('MR_furosemida', 'SA_danho_rinhones'),
                    ('MR_prazosin', 'SA_presion_arterial'),
                    ('MR_prazosin', 'SA_danho_rinhones'),
                    ('MR_verapamilo', 'SA_presion_arterial'),
                    ('MR_verapamilo', 'SA_danho_rinhones'),
                    ('MR_captopril', 'SA_presion_arterial'),
                    ('MR_captopril', 'SA_danho_rinhones'),
                    ('MR_propanolol', 'SA_presion_arterial'),
                    ('MR_propanolol', 'SA_danho_rinhones'),
                    ('MR_carvedilol', 'SA_presion_arterial'),
                    ('MR_carvedilol', 'SA_danho_rinhones')])

## Defining the model parameters (CPTs)
In pgmpy, the columns of a CPT correspond to the combinations of parent (evidence) values, and the rows correspond to the states of the variable itself.
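To make the column layout concrete, the sketch below lists the parent combinations in the order in which, as far as we know, pgmpy reads the columns of `values`: the last variable in `evidence` changes fastest. The helper `cpt_column_labels` is hypothetical and uses only the standard library, so it is worth confirming the order against your installed pgmpy version by printing a CPD.

```python
from itertools import product

def cpt_column_labels(evidence, evidence_card):
    """Return one label per CPT column, last evidence variable changing fastest.

    Hypothetical helper for illustration; it is not part of pgmpy.
    """
    ranges = [range(card) for card in evidence_card]
    return [
        ", ".join(f"{name}={state}" for name, state in zip(evidence, combo))
        for combo in product(*ranges)
    ]

# Parents of MR_metropolol, in the order used in its CPT
labels = cpt_column_labels(
    ['EP_Asma_Bronquial', 'EP_Insuficiencia_Cardiaca', 'EP_Diabetes'],
    [2, 2, 2],
)
for column, label in enumerate(labels):
    print(column, label)
```

Each printed line names the parent states that one column of the CPT is conditioned on, which helps when filling in long rows of probabilities by hand.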

In [7]:
# Background (disease history) of the geriatric patients

cpt_Insuficiencia_Cardiaca = TabularCPD(variable='EP_Insuficiencia_Cardiaca', variable_card=2,
                      values=[[0.5], [0.5]])
cpt_Asma_Bronquial = TabularCPD(variable='EP_Asma_Bronquial', variable_card=2,
                      values=[[0.5], [0.5]])
cpt_Diabetes= TabularCPD(variable='EP_Diabetes', variable_card=2,
                      values=[[0.5], [0.5]])
cpt_Ibuprofeno= TabularCPD(variable='MP_ibuprofeno', variable_card=2,
                      values=[[0.5], [0.5]])
cpt_Gota= TabularCPD(variable='EP_Gota', variable_card=2,
                      values=[[0.5], [0.5]])

In [8]:
# Medications to prescribe

cpt_metoprolol = TabularCPD(variable='MR_metropolol', variable_card=3,
                      values=[[0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.34], 
                              [0.01,0.01,0.01,0.01,0.01,0.01,0.50,0.33], 
                              [0.99,0.99,0.99,0.99,0.99,0.99,0.50,0.33]],
                      evidence=['EP_Asma_Bronquial', 'EP_Insuficiencia_Cardiaca', 'EP_Diabetes'], 
                      evidence_card=[2,2,2])

cpt_hidroclorothiazida = TabularCPD(variable='MR_hidroclorothiazida', variable_card=3,
                      values=[[0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.34], 
                              [0.01,0.01,0.99,0.99,0.01,0.01,0.01,0.33], 
                              [0.99,0.99,0.01,0.01,0.99,0.99,0.99,0.33]],
                      evidence=['EP_Diabetes', 'MP_ibuprofeno', 'EP_Gota'], 
                      evidence_card=[2,2,2])

cpt_amlodipino = TabularCPD(variable='MR_amlodipino', variable_card=2,
                      values=[[0.5], [0.5]])
cpt_losartan = TabularCPD(variable='MR_losartan', variable_card=2,
                      values=[[0.5], [0.5]])
cpt_enalapril= TabularCPD(variable='MR_enalapril', variable_card=2,
                      values=[[0.5], [0.5]])
cpt_furosemida = TabularCPD(variable='MR_furosemida', variable_card=2,
                      values=[[0.5], [0.5]])
cpt_prazosin= TabularCPD(variable='MR_prazosin', variable_card=2,
                      values=[[0.5], [0.5]])
cpt_verapamilo= TabularCPD(variable='MR_verapamilo', variable_card=2,
                      values=[[0.5], [0.5]])
cpt_captopril= TabularCPD(variable='MR_captopril', variable_card=2,
                      values=[[0.5], [0.5]])
cpt_propanolol= TabularCPD(variable='MR_propanolol', variable_card=2,
                      values=[[0.5], [0.5]])
cpt_carvedilol = TabularCPD(variable='MR_carvedilol', variable_card=2,
                      values=[[0.5], [0.5]])



In [89]:
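# states: number of states of the variable
# index: position of the variable in the array (not used yet)
# columns: total number of columns (used here as the number of rows generated)
# iBefore: index of the preceding variable (not used yet)
def createColumn(states, index, columns, iBefore):
    # np.random.randint excludes `high`, so pass `states` to draw values in 0..states-1
    return pd.DataFrame(np.random.randint(low=0, high=states, size=(columns, 1)))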
    

Let's look at the general case; the remaining cases follow the same pattern.
To build each column we need to know:
- The number of states of the variable
- Whether there is a preceding variable: if there is, we need to know how many cases it produced; if there is not, it does not matter


In [52]:
my_df= []
numfilas = 4806
for i in range(0,numfilas):
    d = {
        'val0' : 1,
        'val1' : 2
    }
    my_df.append(d)

my_df = pd.DataFrame(my_df)
my_df.head()


Unnamed: 0,val0,val1
0,1,2
1,1,2
2,1,2
3,1,2
4,1,2


In [119]:
def createDataframe(arrMedicamentos):
    campos = len(arrMedicamentos)
    filas = 1
    cantEstados = 0
    col = 'med'
    arrBase     = []    # Base (cumulative product of cardinalities) of each medication

    # Compute the base for each medication position
    for j in range(0, campos):
        if(j == 0):
            cantEstados = arrMedicamentos[j] ** (j+1)
        else:
            cantEstados = (arrBase[j-1] * arrMedicamentos[j])
        arrBase.append(cantEstados)

    # Compute the total number of rows
    for i in arrMedicamentos:
        filas  *=i

    my_final_list=[]
    # Build the big truth table
    for i in range(0, campos):
        maxEstado = arrMedicamentos[i] -1 # If there are 2 states, 1 is the maximum (1,0)
        cantVecesPorEstado = arrBase[i]/arrMedicamentos[i] # consecutive repetitions of each state (e.g. 4 of each)
        contadorInterno = 1
        my_med_list=[]
        for j in range (0, 20):  # NOTE: only the first 20 combinations are generated, not all `filas`
            d = {
                col + str(i): maxEstado
            }
            # Check whether it is time to restart from the highest state
            if(contadorInterno == arrBase[i]):
                maxEstado = arrMedicamentos[i] -1
                contadorInterno = 1
            else:
                # Check whether it is time to move down to a lower state
                if(contadorInterno == cantVecesPorEstado):
                    maxEstado -=1
                contadorInterno +=1
            my_med_list.append(d)
        my_final_list.append(my_med_list)
        #print(my_med_list)

    #print(my_final_list)
    df = pd.DataFrame(my_final_list)
        #print('sali '+ str(i))
    return df

In [121]:
# List the medications from left to right
df = createDataframe([2,2,2,2,2,2,2,2,2,3,3])
df

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19
0,{'med0': 1},{'med0': 0},{'med0': 1},{'med0': 0},{'med0': 1},{'med0': 0},{'med0': 1},{'med0': 0},{'med0': 1},{'med0': 0},{'med0': 1},{'med0': 0},{'med0': 1},{'med0': 0},{'med0': 1},{'med0': 0},{'med0': 1},{'med0': 0},{'med0': 1},{'med0': 0}
1,{'med1': 1},{'med1': 1},{'med1': 0},{'med1': 0},{'med1': 1},{'med1': 1},{'med1': 0},{'med1': 0},{'med1': 1},{'med1': 1},{'med1': 0},{'med1': 0},{'med1': 1},{'med1': 1},{'med1': 0},{'med1': 0},{'med1': 1},{'med1': 1},{'med1': 0},{'med1': 0}
2,{'med2': 1},{'med2': 1},{'med2': 1},{'med2': 1},{'med2': 0},{'med2': 0},{'med2': 0},{'med2': 0},{'med2': 1},{'med2': 1},{'med2': 1},{'med2': 1},{'med2': 0},{'med2': 0},{'med2': 0},{'med2': 0},{'med2': 1},{'med2': 1},{'med2': 1},{'med2': 1}
3,{'med3': 1},{'med3': 1},{'med3': 1},{'med3': 1},{'med3': 1},{'med3': 1},{'med3': 1},{'med3': 1},{'med3': 0},{'med3': 0},{'med3': 0},{'med3': 0},{'med3': 0},{'med3': 0},{'med3': 0},{'med3': 0},{'med3': 1},{'med3': 1},{'med3': 1},{'med3': 1}
4,{'med4': 1},{'med4': 1},{'med4': 1},{'med4': 1},{'med4': 1},{'med4': 1},{'med4': 1},{'med4': 1},{'med4': 1},{'med4': 1},{'med4': 1},{'med4': 1},{'med4': 1},{'med4': 1},{'med4': 1},{'med4': 1},{'med4': 0},{'med4': 0},{'med4': 0},{'med4': 0}
5,{'med5': 1},{'med5': 1},{'med5': 1},{'med5': 1},{'med5': 1},{'med5': 1},{'med5': 1},{'med5': 1},{'med5': 1},{'med5': 1},{'med5': 1},{'med5': 1},{'med5': 1},{'med5': 1},{'med5': 1},{'med5': 1},{'med5': 1},{'med5': 1},{'med5': 1},{'med5': 1}
6,{'med6': 1},{'med6': 1},{'med6': 1},{'med6': 1},{'med6': 1},{'med6': 1},{'med6': 1},{'med6': 1},{'med6': 1},{'med6': 1},{'med6': 1},{'med6': 1},{'med6': 1},{'med6': 1},{'med6': 1},{'med6': 1},{'med6': 1},{'med6': 1},{'med6': 1},{'med6': 1}
7,{'med7': 1},{'med7': 1},{'med7': 1},{'med7': 1},{'med7': 1},{'med7': 1},{'med7': 1},{'med7': 1},{'med7': 1},{'med7': 1},{'med7': 1},{'med7': 1},{'med7': 1},{'med7': 1},{'med7': 1},{'med7': 1},{'med7': 1},{'med7': 1},{'med7': 1},{'med7': 1}
8,{'med8': 1},{'med8': 1},{'med8': 1},{'med8': 1},{'med8': 1},{'med8': 1},{'med8': 1},{'med8': 1},{'med8': 1},{'med8': 1},{'med8': 1},{'med8': 1},{'med8': 1},{'med8': 1},{'med8': 1},{'med8': 1},{'med8': 1},{'med8': 1},{'med8': 1},{'med8': 1}
9,{'med9': 2},{'med9': 2},{'med9': 2},{'med9': 2},{'med9': 2},{'med9': 2},{'med9': 2},{'med9': 2},{'med9': 2},{'med9': 2},{'med9': 2},{'med9': 2},{'med9': 2},{'med9': 2},{'med9': 2},{'med9': 2},{'med9': 2},{'med9': 2},{'med9': 2},{'med9': 2}
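
The table above holds one dictionary per cell and only the first 20 combinations. As an alternative sketch (not the approach used in the cell above), `itertools.product` from the standard library can enumerate every state combination directly. The helper name `truth_table` is hypothetical, and its ordering differs from `createDataframe`: here the last variable changes fastest and states count up from 0.

```python
from itertools import product

def truth_table(cards, prefix='med'):
    """Enumerate every state combination for variables with the given cardinalities.

    Returns (column_names, rows); the last variable changes fastest.
    Hypothetical helper, standard library only.
    """
    names = [f"{prefix}{i}" for i in range(len(cards))]
    rows = list(product(*(range(card) for card in cards)))
    return names, rows

# Small example: two binary variables and one three-state variable
names, rows = truth_table([2, 2, 3])
print(names)
for row in rows[:4]:
    print(row)
print(len(rows), "rows in total")
```

For the list used above, `[2,2,2,2,2,2,2,2,2,3,3]`, the same call would yield 2^9 · 3^2 = 4608 rows. If a DataFrame is needed, `pd.DataFrame(rows, columns=names)` gives one column per medication.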


In [7]:
import pandas as pd
import numpy as np
df2 = pd.DataFrame(np.random.randint(low=0, high=10, size=(5, 5)),
                   columns=['a', 'b', 'c', 'd', 'e'])
df2

Unnamed: 0,a,b,c,d,e
0,8,2,3,9,1
1,7,8,7,9,1
2,2,2,9,1,3
3,9,3,5,5,7
4,8,0,3,5,8


In [None]:
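# Start from the back and move forward, covering both cases of the first medication
# In all of the first cases the blood pressure will be very low, because the patient is taking 11 medications.
# In fact, taking 3 or more is already a lot, so those cases get the lowest values
# NOTE: unfinished draft: the last two rows of `values` are still empty, so this cell does not run as written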

cpt_presion_arterial = TabularCPD(variable='SA_presion_arterial', variable_card=2,
                      values=[[0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.05,0.05,0.05,0.05,
                               0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.30,0.35,0.30,0.10,0.10,
                               0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.05,
                               0.05,0.05,0.05,0.05,0.05,0.00,0.00,0.00,0.45,0.45,0.45,0.25,0.25,
                               0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.15,0.15,0.15,0.05,
                               0.05,0.05,0.05,0.15,0.30,0.05,0.05,0.30,0.15,0.15,0.34],
                              [],
                              []],
                      evidence=['MR_metropolol','MR_hidroclorothiazida', 'MR_amlodipino',
                                'MR_losartan', 'MR_enalapril','MR_furosemida',
                                'MR_prazosin','MR_verapamilo','MR_captopril',
                                'MR_propanolol','MR_carvedilol'],
                      evidence_card=[3,2,2,2,3,2,2,2,2,2,2])

In [1]:
# Resulting symptoms
cpt_angina_pecho = TabularCPD(variable='SA_angina_de_pecho', variable_card=2,
                      values=[[0.20,0.30,0.30,0.30,0.40,0.70], 
                              [0.80,0.70,0.70,0.70,0.60,0.30]],
                      evidence=['MR_metropolol', 'MR_amlodipino'], 
                      evidence_card=[3,2])
cpt_danho_rinhones = TabularCPD(variable='SA_danho_rinhones', variable_card=2,
                      values=[[0.90,0.85,0.85,0.85,0.80,0.80,0.85,0.75,0.75,0.85,0.80,0.80,0.85,
                               0.75,0.75,0.80,0.65,0.65,0.80,0.65,0.65,0.80,0.65,0.65,0.75,0.60,
                               0.60,0.60,0.55,0.10,0.85,0.80,0.70,0.65,0.75,0.65,0.60,0.70,0.60,
                               0.60,0.65,0.55,0.55,0.65,0.60,0.10,0.85,0.75,0.70,0.75,0.65,0.45,
                               0.65,0.25,0.05,0.60,0.15,0.10,0.15,0.10,0.05,0.10,0.05,0.01], 
                              [0.10,0.15,0.15,0.15,0.20,0.20,0.15,0.25,0.25,0.15,0.20,0.20,0.15,
                               0.25,0.25,0.20,0.35,0.35,0.20,0.35,0.35,0.20,0.35,0.35,0.25,0.40,
                               0.40,0.40,0.45,0.90,0.15,0.20,0.30,0.35,0.25,0.35,0.40,0.30,0.40,
                               0.40,0.35,0.45,0.45,0.35,0.40,0.90,0.15,0.25,0.30,0.25,0.35,0.55,
                               0.35,0.75,0.95,0.40,0.85,0.90,0.85,0.90,0.95,0.90,0.95,0.99]],
                      evidence=['MR_enalapril','MR_losartan','MR_amlodipino',
                                'MR_metropolol','MR_hidroclorothiazida'],
                      evidence_card=[2,2,2,3,3])
cpt_presion_arterial = TabularCPD(variable='SA_presion_arterial', variable_card=2,
                      values=[[0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.05,0.05,0.05,0.05,
                               0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.30,0.35,0.30,0.10,0.10,
                               0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.05,
                               0.05,0.05,0.05,0.05,0.05,0.00,0.00,0.00,0.45,0.45,0.45,0.25,0.25,
                               0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.15,0.15,0.15,0.05,
                               0.05,0.05,0.05,0.15,0.30,0.05,0.05,0.30,0.15,0.15,0.34],
                              [],
                              []],
                      evidence=['MR_metropolol', 'MR_amlodipino','MR_losartan',
                               'MR_enalapril','MR_hidroclorothiazida'],
                      evidence_card=[3,2,2,2,3])

NameError: name 'TabularCPD' is not defined

## Associating the CPTs with the model structure


In [5]:
bn.add_cpds(cpt_LastResults, cpt_LastResultsOpponent, cpt_PsychologicalState, cpt_PerformancePlayers, cpt_FinalResult)

In [18]:
bn.add_cpds(cpt_hidroclorothiazida, cpt_metoprolol, cpt_angina_pecho)

## Checking the consistency of the Bayesian network
Checks that all CPTs are valid for the model

In [19]:
bn.check_model()

ValueError: No CPD associated with MR_enalapril
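
The error above appears because only three CPDs were added to the model, so nodes such as `MR_enalapril` have none. Before calling `check_model()`, it can help to validate each table by hand. The sketch below is a hypothetical, pgmpy-independent helper (`check_cpt`) that tests two conditions we understand `check_model()` to rely on: the number of columns equals the product of the parent cardinalities, and every column sums to 1.

```python
from math import prod, isclose

def check_cpt(values, variable_card, evidence_card=()):
    """Return a list of problems found in a CPT given as a list of rows.

    Hypothetical helper for illustration; it does not call pgmpy.
    """
    problems = []
    expected_cols = prod(evidence_card)  # prod(()) == 1 for a root node
    if len(values) != variable_card:
        problems.append(f"expected {variable_card} rows, got {len(values)}")
    for r, row in enumerate(values):
        if len(row) != expected_cols:
            problems.append(f"row {r}: expected {expected_cols} columns, got {len(row)}")
    if not problems:
        # Only check the sums once the shape is known to be right
        for c in range(expected_cols):
            total = sum(row[c] for row in values)
            if not isclose(total, 1.0, abs_tol=1e-6):
                problems.append(f"column {c} sums to {total:.2f}, not 1")
    return problems

# CPT of SA_angina_de_pecho as defined earlier: 3 x 2 = 6 parent combinations
angina = [[0.20, 0.30, 0.30, 0.30, 0.40, 0.70],
          [0.80, 0.70, 0.70, 0.70, 0.60, 0.30]]
print(check_cpt(angina, variable_card=2, evidence_card=[3, 2]))

# A deliberately broken table: one row missing and too few columns
print(check_cpt([[0.1, 0.2]], variable_card=2, evidence_card=[3, 2]))
```

An empty list means the table passed both checks; any other result lists what to fix before adding the CPD to the model.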

## Running inference on the BN with the VariableElimination algorithm


We will query the BN using the **VariableElimination** algorithm: what is the distribution of "B" (Burglary) given that "J" and "M" have been observed to be true (John and Mary call)?

In [7]:
from pgmpy.inference import VariableElimination

In [8]:
bn_inference = VariableElimination(bn)

In [10]:
q = bn_inference.query(variables=['LastResults'], evidence={'FinalResult':0})
print(q['LastResults'])

╒═══════════════╤════════════════════╕
│ LastResults   │   phi(LastResults) │
╞═══════════════╪════════════════════╡
│ LastResults_0 │             0.3917 │
├───────────────┼────────────────────┤
│ LastResults_1 │             0.6083 │
╘═══════════════╧════════════════════╛
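
To see what such a query computes, here is a minimal sketch of posterior inference by enumeration on a two-node network A -> B. The function `posterior` is hypothetical and the probabilities are made up for illustration; they are not taken from the lab network. The idea is to multiply the prior by the likelihood of the observed evidence and then normalise.

```python
def posterior(prior_a, cpt_b_given_a, observed_b):
    """P(A | B = observed_b) for a two-node network A -> B, by enumeration.

    prior_a[a] is P(A=a); cpt_b_given_a[b][a] is P(B=b | A=a), so rows are
    states of B and columns are states of A, matching the CPT layout above.
    """
    # Joint probability P(A=a, B=observed_b) for every state a
    joint = [prior_a[a] * cpt_b_given_a[observed_b][a] for a in range(len(prior_a))]
    evidence_prob = sum(joint)  # P(B = observed_b)
    return [p / evidence_prob for p in joint]

# Made-up numbers, for illustration only
prior_a = [0.5, 0.5]
cpt_b_given_a = [[0.8, 0.2],   # P(B=0 | A=0), P(B=0 | A=1)
                 [0.2, 0.8]]   # P(B=1 | A=0), P(B=1 | A=1)

result = posterior(prior_a, cpt_b_given_a, observed_b=0)
print(result)
```

Variable elimination arrives at the same kind of answer on larger networks, but sums out hidden variables one at a time instead of enumerating the full joint distribution.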


In [11]:
bn_inference.state_names()

TypeError: 'NoneType' object is not callable

## Getting the conditional independencies implied by the BN

In [1]:
bn.get_independencies()

NameError: name 'bn' is not defined
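
The cell above failed only because `bn` was not defined in that session. To give an idea of the statements involved, the sketch below derives the local Markov independencies of a small fragment of the network by hand: each node is independent of its non-descendants given its parents. The helper `local_independencies` is hypothetical and uses only the standard library; pgmpy's own result may list more statements than these.

```python
def local_independencies(edges):
    """List (node, non_descendants, parents) triples from the local Markov property.

    Hypothetical helper for illustration; it does not call pgmpy.
    """
    nodes = sorted({n for edge in edges for n in edge})
    children = {n: set() for n in nodes}
    parents = {n: set() for n in nodes}
    for parent, child in edges:
        children[parent].add(child)
        parents[child].add(parent)

    def descendants(node):
        # Depth-first walk over the children of `node`
        seen, stack = set(), list(children[node])
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(children[current])
        return seen

    result = []
    for node in nodes:
        others = set(nodes) - {node} - descendants(node) - parents[node]
        if others:
            result.append((node, sorted(others), sorted(parents[node])))
    return result

# A small fragment of the lab network
edges = [('EP_Diabetes', 'MR_metropolol'),
         ('EP_Asma_Bronquial', 'MR_metropolol'),
         ('MR_metropolol', 'SA_angina_de_pecho')]

statements = local_independencies(edges)
for node, independent_of, given in statements:
    print(f"{node} _|_ {independent_of} | {given}")
```

In this fragment, `SA_angina_de_pecho` is independent of both disease antecedents once `MR_metropolol` is known, which is the kind of statement the network structure encodes.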