## Bayesian Network for Predicting the Most Suitable Drug Combination in Geriatric Patients
The goal of this project is to build a Bayesian network that probabilistically models the domain of drug prescription, predicting suitable medications for various pathologies according to the patient's condition.
Some of the random variables involved in the domain are listed below.

## Variables for the geriatric patient's history

<b>EP_Insuficiencia_Cardiaca</b>: Prior disease history of the geriatric patient (Present, Absent; prior probability 50% each)

<b>SA_angina_de_pecho</b>: Current symptoms (Present, Absent)

<b>SA_presion_arterial</b>: Current symptoms (High, Normal, Low)

<b>EP_Asma_Bronquial</b>: Prior disease history of the geriatric patient (Present, Absent)

<b>EP_Diabetes</b>: Prior disease history of the geriatric patient (Present, Absent)

<b>EP_Gota</b>: Prior disease history of the geriatric patient (Present, Absent)

<b>MP_ibuprofeno</b>: History of the drugs the geriatric patient already takes (Prescribed, Not prescribed)

## Variables for the candidate drugs to prescribe

<b>MR_metropolol</b>: Metoprolol (Prescribe normally, Prescribe under diabetes condition, Do not prescribe)

<b>MR_hidroclorothiazida</b>: Hydrochlorothiazide (Prescribe normally, Prescribe under diabetes condition, Do not prescribe)

<b>MR_amlodipino</b>: Amlodipine (Prescribe, Do not prescribe)

<b>MR_losartan</b>: Losartan (Prescribe, Do not prescribe)

<b>MR_captopril</b>: Captopril (Prescribe, Do not prescribe)

<b>MR_enalapril</b>: Enalapril (Prescribe, Do not prescribe)

<b>MR_furosemida</b>: Furosemide (Prescribe, Do not prescribe)

<b>MR_prazosin</b>: Prazosin (Prescribe, Do not prescribe)

<b>MR_verapamilo</b>: Verapamil (Prescribe, Do not prescribe)

<b>MR_propanolol</b>: Propranolol (Prescribe, Do not prescribe)

<b>MR_carvedilol</b>: Carvedilol (Prescribe, Do not prescribe)


## Variables for the resulting symptoms and side effects

<b>SA_angina_de_pecho</b>: Angina pectoris (Present, Absent)

<b>SA_presion_arterial</b>: Blood pressure (High, Normal, Low)

<b>SA_danho_rinhones</b>: Kidney damage (Present, Absent)


## Creating the BN structure

In [2]:
from pgmpy.models import BayesianModel
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination
import pandas as pd
import numpy as np

In [3]:
bn = BayesianModel([('EP_Insuficiencia_Cardiaca', 'MR_metropolol'),              
                    ('EP_Asma_Bronquial', 'MR_metropolol'), 
                    ('EP_Diabetes', 'MR_metropolol'), 
                    ('EP_Diabetes', 'MR_hidroclorothiazida'), 
                    ('MP_ibuprofeno', 'MR_hidroclorothiazida'), 
                    ('EP_Gota', 'MR_hidroclorothiazida'), 
                    ('MR_metropolol', 'SA_angina_de_pecho'),
                    ('MR_metropolol', 'SA_danho_rinhones'),
                    ('MR_metropolol', 'SA_presion_arterial'),
                    ('MR_hidroclorothiazida', 'SA_danho_rinhones'),
                    ('MR_hidroclorothiazida', 'SA_presion_arterial'),
                    ('MR_amlodipino', 'SA_angina_de_pecho'),
                    ('MR_amlodipino', 'SA_danho_rinhones'),
                    ('MR_amlodipino', 'SA_presion_arterial'),
                    ('MR_losartan', 'SA_danho_rinhones'),
                    ('MR_losartan', 'SA_presion_arterial'),
                    ('MR_enalapril', 'SA_danho_rinhones'),
                    ('MR_enalapril', 'SA_presion_arterial'),
                    ('MR_furosemida', 'SA_presion_arterial'),
                    ('MR_furosemida', 'SA_danho_rinhones'),
                    ('MR_prazosin', 'SA_presion_arterial'),
                    ('MR_prazosin', 'SA_danho_rinhones'),
                    ('MR_verapamilo', 'SA_presion_arterial'),
                    ('MR_verapamilo', 'SA_danho_rinhones'),
                    ('MR_captopril', 'SA_presion_arterial'),
                    ('MR_captopril', 'SA_danho_rinhones'),
                    ('MR_propanolol', 'SA_presion_arterial'),
                    ('MR_propanolol', 'SA_danho_rinhones'),
                    ('MR_carvedilol', 'SA_presion_arterial'),
                    ('MR_carvedilol', 'SA_danho_rinhones')])
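
The edge list above fixes how large each conditional probability table must be: one column per combination of parent states. The following is a minimal plain-Python sketch of that count, assuming the state counts given in the variable lists (three states for `MR_metropolol`, `MR_hidroclorothiazida` and `SA_presion_arterial`, two for everything else). The names `PARENTS`, `THREE_STATE`, `cardinality` and `cpd_columns` are illustrative and not part of the notebook or of pgmpy.

```python
from math import prod

# Parent lists transcribed by hand from the edge list above (child -> parents).
ALL_DRUGS = ['MR_metropolol', 'MR_hidroclorothiazida', 'MR_amlodipino', 'MR_losartan',
             'MR_enalapril', 'MR_furosemida', 'MR_prazosin', 'MR_verapamilo',
             'MR_captopril', 'MR_propanolol', 'MR_carvedilol']
PARENTS = {
    'MR_metropolol': ['EP_Insuficiencia_Cardiaca', 'EP_Asma_Bronquial', 'EP_Diabetes'],
    'MR_hidroclorothiazida': ['EP_Diabetes', 'MP_ibuprofeno', 'EP_Gota'],
    'SA_angina_de_pecho': ['MR_metropolol', 'MR_amlodipino'],
    'SA_danho_rinhones': ALL_DRUGS,
    'SA_presion_arterial': ALL_DRUGS,
}

# Assumption: these variables have three states, every other variable is binary.
THREE_STATE = {'MR_metropolol', 'MR_hidroclorothiazida', 'SA_presion_arterial'}


def cardinality(var):
    """Number of states of a variable under the assumption above."""
    return 3 if var in THREE_STATE else 2


def cpd_columns(child):
    """Columns needed by a tabular CPD for `child`: one per parent-state combination."""
    return prod(cardinality(p) for p in PARENTS.get(child, []))


for node in PARENTS:
    print(node, 'parents:', len(PARENTS[node]), 'columns:', cpd_columns(node))
```

Under these assumptions the two symptom nodes with eleven parents each need 3 * 3 * 2**9 = 4608 columns, which is why the notebook goes on to generate the parent combinations programmatically.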


# Building the truth table of drug combinations that act as evidence for a symptom



Base: the number of rows after which a column's pattern of states repeats

For example:

-[ 1 1 0 0 ]               -> base 4

-[ 1 1 1 1 0 0 0 0]        -> base 8

-[ 2 2 1 1 0 0]            -> base 6

Generate an array holding the base of each column

In [6]:
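def generateArrayBase(campos, arrMedicamentos):
    # arrBase[j] is the cumulative product of the state counts of columns 0..j,
    # i.e. the number of rows after which the pattern of column j repeats
    arrBase = []
    cantEstados = 0
    for j in range(0, campos):
        if j == 0:
            cantEstados = arrMedicamentos[j]
        else:
            cantEstados = arrBase[j-1] * arrMedicamentos[j]
        arrBase.append(cantEstados)
    return arrBase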
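
As a cross-check, the bases are simply the running product of the state counts, so the same list can be obtained from the standard library. This is a sketch rather than a replacement for the function above; the name `bases` is illustrative.

```python
from itertools import accumulate
from operator import mul


def bases(cards):
    """Running product of the state counts: the repeat period of each column."""
    return list(accumulate(cards, mul))


print(bases([2, 2, 3]))  # → [2, 4, 12]
```

The last entry of the list is also the total number of rows in the truth table.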

Generate the truth-table column for a given variable according to its base

In [7]:
def generateArrayState(i,filas, maxEstado, cantVecesPorEstado,arrBase,arrMedicamentos):
    medlist = []
    contEstado = 1
    contadorInterno = 1
    for j in range (0, filas):                 
        medlist.append(maxEstado)
        # Check whether the pattern is complete and must restart from the highest state
        if(contadorInterno == arrBase[i]):
            maxEstado = arrMedicamentos[i] -1
            contadorInterno = 1
            contEstado = 1
        else:
            # Check whether it is time to move down to the next lower state
            if(contEstado == cantVecesPorEstado):
                maxEstado -=1
                contEstado =1
            else:
                contEstado +=1
            contadorInterno +=1
    return medlist

In [36]:
def lenLista(arrMedicamentos):    
    filas = 1 
    for i in arrMedicamentos:
        filas  *=i 
    return filas

In [44]:
def createCombinationDataframe(arrEstadosMedicamentos,arrHeader):
    campos = len(arrEstadosMedicamentos)
    dicc = {}
    i = 0   
    # Get the base for each drug position
    arrBase = generateArrayBase(campos, arrEstadosMedicamentos)
    
    # Get the number of rows
    filas   = lenLista(arrEstadosMedicamentos)
         
    # Build the full truth table, one column per drug
    for key in arrHeader:  
        maxEstado = arrEstadosMedicamentos[i] -1 
        # Integer division: each state repeats a whole number of times within one base
        cantVecesPorEstado = arrBase[i] // arrEstadosMedicamentos[i]
        medlist = generateArrayState(i,filas, maxEstado,cantVecesPorEstado,arrBase,arrEstadosMedicamentos)
        dicc[key] = medlist
        i+=1   
    df = pd.DataFrame(dicc)
    return df

In [45]:
# List the drugs from left to right
df_ANT_presion_arterial = createCombinationDataframe([2,2,2,2,2,3,2,3,2,2,2],
                    ['MR_amlodipino','MR_captopril' ,'MR_carvedilol','MR_enalapril' ,
                     'MR_furosemida','MR_hidroclorothiazida','MR_losartan','MR_metropolol',
                     'MR_prazosin','MR_propanolol','MR_verapamilo']) 

df_ANT_presion_arterial.head(10)

Unnamed: 0,MR_amlodipino,MR_captopril,MR_carvedilol,MR_enalapril,MR_furosemida,MR_hidroclorothiazida,MR_losartan,MR_metropolol,MR_prazosin,MR_propanolol,MR_verapamilo
0,1,1,1,1,1,2,1,2,1,1,1
1,0,1,1,1,1,2,1,2,1,1,1
2,1,0,1,1,1,2,1,2,1,1,1
3,0,0,1,1,1,2,1,2,1,1,1
4,1,1,0,1,1,2,1,2,1,1,1
5,0,1,0,1,1,2,1,2,1,1,1
6,1,0,0,1,1,2,1,2,1,1,1
7,0,0,0,1,1,2,1,2,1,1,1
8,1,1,1,0,1,2,1,2,1,1,1
9,0,1,1,0,1,2,1,2,1,1,1


In [46]:
df_ANT_presion_arterial.tail(10)

Unnamed: 0,MR_amlodipino,MR_captopril,MR_carvedilol,MR_enalapril,MR_furosemida,MR_hidroclorothiazida,MR_losartan,MR_metropolol,MR_prazosin,MR_propanolol,MR_verapamilo
4598,1,0,0,1,0,0,0,0,0,0,0
4599,0,0,0,1,0,0,0,0,0,0,0
4600,1,1,1,0,0,0,0,0,0,0,0
4601,0,1,1,0,0,0,0,0,0,0,0
4602,1,0,1,0,0,0,0,0,0,0,0
4603,0,0,1,0,0,0,0,0,0,0,0
4604,1,1,0,0,0,0,0,0,0,0,0
4605,0,1,0,0,0,0,0,0,0,0,0
4606,1,0,0,0,0,0,0,0,0,0,0
4607,0,0,0,0,0,0,0,0,0,0,0
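
The same enumeration can be reproduced with `itertools.product`, which is a convenient way to cross-check the table. The sketch below assumes the ordering visible in the output above (states listed from highest to lowest, first column changing fastest). It returns plain tuples instead of a DataFrame, and `combination_rows` is an illustrative name.

```python
from itertools import product


def combination_rows(cards):
    """All state combinations, highest state first, first column varying fastest."""
    # product() varies its LAST argument fastest, so feed the columns in reverse
    # order and flip each resulting tuple back.
    ranges = [range(card - 1, -1, -1) for card in reversed(cards)]
    return [tuple(reversed(combo)) for combo in product(*ranges)]


rows = combination_rows([2, 2, 2, 2, 2, 3, 2, 3, 2, 2, 2])
print(len(rows))   # → 4608
print(rows[0])     # → (1, 1, 1, 1, 1, 2, 1, 2, 1, 1, 1)
print(rows[-1])    # → (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
```

If a DataFrame is needed, the rows can be passed to `pd.DataFrame(rows, columns=...)` with the same header list used above.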


# Setting the probabilities

<b>Combinations of 4 or more drugs:</b> The probability that the symptom is present is very high

<b>Combinations of 2 to 3 drugs:</b> The probability that the symptom is absent varies with the effectiveness of the drugs and the patient's condition

<b>A single drug:</b> The probability that the symptom is absent varies with the effectiveness of the drug and the patient's condition
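
The three rules can be sketched as a classification of each truth-table row by the number of drugs prescribed. This is only an illustration: it assumes that any state greater than 0 means the drug is prescribed, and the regime labels and the `threshold` parameter are hypothetical names. No probabilities are assigned here, since those are to be provided by the physician.

```python
def count_prescribed(row):
    """Count the drugs in a row (a dict drug -> state) whose state is > 0.

    Assumption: state 0 means 'not prescribed', any other state means prescribed.
    """
    return sum(1 for state in row.values() if state > 0)


def regime(row, threshold=3):
    """Map a row to one of the regimes described above (labels are illustrative)."""
    n = count_prescribed(row)
    if n > threshold:
        return 'four_or_more'   # symptom presence very likely
    if n >= 2:
        return 'two_to_three'   # depends on drug effectiveness and patient
    if n == 1:
        return 'single'         # depends on drug effectiveness and patient
    return 'none'               # no drug prescribed


example = {'MR_amlodipino': 1, 'MR_captopril': 0, 'MR_metropolol': 2}
print(regime(example))  # → two_to_three
```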

In [111]:
def createProbabilisticDataframe(arrEstadosMedicamentos, listEstados, listProbabilidades):
    # One row per parent combination, each initialised with the same distribution.
    # DataFrame.append was removed in pandas 2.0, so all rows are built in one step.
    filas = lenLista(arrEstadosMedicamentos)
    df = pd.DataFrame([listProbabilidades] * filas, columns=listEstados)
    return df
        

In [114]:
df = createProbabilisticDataframe([2,2,2,2,2,3,2,3,2,2,2],['Alta','Normal','Baja'],[0.33,0.34,0.33])
df.head(10)

Unnamed: 0,Alta,Normal,Baja
0,0.33,0.34,0.33
1,0.33,0.34,0.33
2,0.33,0.34,0.33
3,0.33,0.34,0.33
4,0.33,0.34,0.33
5,0.33,0.34,0.33
6,0.33,0.34,0.33
7,0.33,0.34,0.33
8,0.33,0.34,0.33
9,0.33,0.34,0.33


In [None]:

def setDataframe(df, arrMedicamentos, estadosVariable):

    # Iterate over the DataFrame rows; iterrows() yields (index, row) pairs
    for idx, fila in df.iterrows():
        cont = 0
        for key in arrMedicamentos:
            if fila[key] > 0:
                cont += 1
        if cont > 3:  # The threshold (3) will be set by the physician of each specialty
            # TODO: insert into the probability DataFrame (left unfinished in the original)
            pass

In [19]:
df_presion_arterial['Alta']=df_presion_arterial.sum(axis=1)
df_presion_arterial['Media']=df_presion_arterial.sum(axis=1)
df_presion_arterial['Baja']=df_presion_arterial.sum(axis=1)
df_presion_arterial.head(5)

Unnamed: 0,MR_amlodipino,MR_captopril,MR_carvedilol,MR_enalapril,MR_furosemida,MR_hidroclorothiazida,MR_losartan,MR_metropolol,MR_prazosin,MR_propanolol,MR_verapamilo,Alta,Media,Baja
0,1,1,1,1,1,2,1,2,1,1,1,13,26,52
1,0,1,1,1,1,2,1,2,1,1,1,12,24,48
2,1,0,1,1,1,2,1,2,1,1,1,12,24,48
3,0,0,1,1,1,2,1,2,1,1,1,11,22,44
4,1,1,0,1,1,2,1,2,1,1,1,12,24,48


## Defining the model parameters (CPTs)
In pgmpy, the columns correspond to the value combinations of the parents (evidence) and the rows correspond to the states of the variable.
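
To locate a given parent combination in such a table, the column index can be computed as a mixed-radix number. The sketch below assumes pgmpy's usual column ordering, in which the last evidence variable varies fastest. The helper `column_index` is illustrative and not a pgmpy function, and the numbers are the first row of the `SA_angina_de_pecho` table defined later in this notebook.

```python
def column_index(parent_states, evidence_card):
    """Column of a tabular CPD for the given parent states.

    Assumption: the first evidence variable varies slowest, the last fastest.
    """
    idx = 0
    for state, card in zip(parent_states, evidence_card):
        if not 0 <= state < card:
            raise ValueError(f"state {state} out of range for cardinality {card}")
        idx = idx * card + state
    return idx


# First row of the angina table: P(SA_angina_de_pecho = state 0 | parents),
# with evidence=['MR_metropolol', 'MR_amlodipino'] and evidence_card=[3, 2].
row_state0 = [0.20, 0.30, 0.30, 0.30, 0.40, 0.70]

col = column_index([2, 1], [3, 2])
print(col, row_state0[col])  # → 5 0.7
```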

In [4]:
# History of the geriatric patients

cpt_Insuficiencia_Cardiaca = TabularCPD(variable='EP_Insuficiencia_Cardiaca', variable_card=2,
                      values=[[0.5], [0.5]])
cpt_Asma_Bronquial = TabularCPD(variable='EP_Asma_Bronquial', variable_card=2,
                      values=[[0.5], [0.5]])
cpt_Diabetes= TabularCPD(variable='EP_Diabetes', variable_card=2,
                      values=[[0.5], [0.5]])
cpt_Ibuprofeno= TabularCPD(variable='MP_ibuprofeno', variable_card=2,
                      values=[[0.5], [0.5]])
cpt_Gota= TabularCPD(variable='EP_Gota', variable_card=2,
                      values=[[0.5], [0.5]])

In [None]:
# The columns are filled from back to front, starting with both cases of the first drug
# In all the first cases the blood pressure will be very low, since 11 drugs are being taken.
# Taking 3 or more is already a lot, so those cases get the lowest values
# variable_card is 3 because SA_presion_arterial has three states (High, Normal, Low)

cpt_presion_arterial = TabularCPD(variable='SA_presion_arterial', variable_card=3,
                      values=[[0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.05,0.05,0.05,0.05,
                               0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.30,0.35,0.30,0.10,0.10,
                               0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.05,
                               0.05,0.05,0.05,0.05,0.05,0.00,0.00,0.00,0.45,0.45,0.45,0.25,0.25,
                               0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.15,0.15,0.15,0.05,
                               0.05,0.05,0.05,0.15,0.30,0.05,0.05,0.30,0.15,0.15,0.34],
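                              [],   # NOTE: rows 2 and 3 were left empty in the original
                              []],  # with 11 parents each row needs 3*3*2**9 = 4608 columns
                      evidence=['MR_metropolol','MR_hidroclorothiazida', 'MR_amlodipino',
                                'MR_losartan', 'MR_enalapril','MR_furosemida',
                                'MR_prazosin','MR_verapamilo','MR_captopril',
                                'MR_propanolol','MR_carvedilol'],
                      # Cardinalities in the same order as `evidence`: the first two drugs have 3 states
                      evidence_card=[3,3,2,2,2,2,2,2,2,2,2])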

In [5]:
# Drugs to prescribe

cpt_metoprolol = TabularCPD(variable='MR_metropolol', variable_card=3,
                      values=[[0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.34], 
                              [0.01,0.01,0.01,0.01,0.01,0.01,0.50,0.33], 
                              [0.99,0.99,0.99,0.99,0.99,0.99,0.50,0.33]],
                      evidence=['EP_Asma_Bronquial', 'EP_Insuficiencia_Cardiaca', 'EP_Diabetes'], 
                      evidence_card=[2,2,2])

cpt_hidroclorothiazida = TabularCPD(variable='MR_hidroclorothiazida', variable_card=3,
                      values=[[0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.34], 
                              [0.01,0.01,0.99,0.99,0.01,0.01,0.01,0.33], 
                              [0.99,0.99,0.01,0.01,0.99,0.99,0.99,0.33]],
                      evidence=['EP_Diabetes', 'MP_ibuprofeno', 'EP_Gota'], 
                      evidence_card=[2,2,2])

cpt_amlodipino = TabularCPD(variable='MR_amlodipino', variable_card=2,
                      values=[[0.5], [0.5]])
cpt_losartan = TabularCPD(variable='MR_losartan', variable_card=2,
                      values=[[0.5], [0.5]])
cpt_enalapril= TabularCPD(variable='MR_enalapril', variable_card=2,
                      values=[[0.5], [0.5]])
cpt_furosemida = TabularCPD(variable='MR_furosemida', variable_card=2,
                      values=[[0.5], [0.5]])
cpt_prazosin= TabularCPD(variable='MR_prazosin', variable_card=2,
                      values=[[0.5], [0.5]])
cpt_verapamilo= TabularCPD(variable='MR_verapamilo', variable_card=2,
                      values=[[0.5], [0.5]])
cpt_captopril= TabularCPD(variable='MR_captopril', variable_card=2,
                      values=[[0.5], [0.5]])
cpt_propanolol= TabularCPD(variable='MR_propanolol', variable_card=2,
                      values=[[0.5], [0.5]])
cpt_carvedilol= TabularCPD(variable='MR_carvedilol', variable_card=2,
                      values=[[0.5], [0.5]])



In [1]:
# Resulting symptoms
cpt_angina_pecho = TabularCPD(variable='SA_angina_de_pecho', variable_card=2,
                      values=[[0.20,0.30,0.30,0.30,0.40,0.70], 
                              [0.80,0.70,0.70,0.70,0.60,0.30]],
                      evidence=['MR_metropolol', 'MR_amlodipino'], 
                      evidence_card=[3,2])
cpt_danho_rinhones = TabularCPD(variable='SA_danho_rinhones', variable_card=2,
                      values=[[0.90,0.85,0.85,0.85,0.80,0.80,0.85,0.75,0.75,0.85,0.80,0.80,0.85,
                               0.75,0.75,0.80,0.65,0.65,0.80,0.65,0.65,0.80,0.65,0.65,0.75,0.60,
                               0.60,0.60,0.55,0.10,0.85,0.80,0.70,0.65,0.75,0.65,0.60,0.70,0.60,
                               0.60,0.65,0.55,0.55,0.65,0.60,0.10,0.85,0.75,0.70,0.75,0.65,0.45,
                               0.65,0.25,0.05,0.60,0.15,0.10,0.15,0.10,0.05,0.10,0.05,0.01], 
                              [0.10,0.15,0.15,0.15,0.20,0.20,0.15,0.25,0.25,0.15,0.20,0.20,0.15,
                               0.25,0.25,0.20,0.35,0.35,0.20,0.35,0.35,0.20,0.35,0.35,0.25,0.40,
                               0.40,0.40,0.45,0.90,0.15,0.20,0.30,0.35,0.25,0.35,0.40,0.30,0.40,
                               0.40,0.35,0.45,0.45,0.35,0.40,0.90,0.15,0.25,0.30,0.25,0.35,0.55,
                               0.35,0.75,0.95,0.40,0.85,0.90,0.85,0.90,0.95,0.90,0.95,0.99]],
                      evidence=['MR_enalapril','MR_losartan','MR_amlodipino',
                                'MR_metropolol','MR_hidroclorothiazida'],
                      evidence_card=[2,2,2,3,3])
cpt_presion_arterial = TabularCPD(variable='SA_presion_arterial', variable_card=3,  # three states: High, Normal, Low
                      values=[[0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.05,0.05,0.05,0.05,
                               0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.30,0.35,0.30,0.10,0.10,
                               0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.05,
                               0.05,0.05,0.05,0.05,0.05,0.00,0.00,0.00,0.45,0.45,0.45,0.25,0.25,
                               0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.10,0.15,0.15,0.15,0.05,
                               0.05,0.05,0.05,0.15,0.30,0.05,0.05,0.30,0.15,0.15,0.34],
                              [],
                              []],
                      evidence=['MR_metropolol', 'MR_amlodipino','MR_losartan',
                               'MR_enalapril','MR_hidroclorothiazida'],
                      evidence_card=[3,2,2,2,3])

NameError: name 'TabularCPD' is not defined

## Associating the CPTs with the model structure


In [5]:
bn.add_cpds(cpt_LastResults, cpt_LastResultsOpponent, cpt_PsychologicalState, cpt_PerformancePlayers, cpt_FinalResult)

In [18]:
bn.add_cpds(cpt_hidroclorothiazida, cpt_metoprolol, cpt_angina_pecho)

## Checking the consistency of the Bayesian network
Checks that all CPTs are valid for the model

In [19]:
bn.check_model()

ValueError: No CPD associated with MR_enalapril
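
Before handing a table to pgmpy, its shape and normalisation can also be checked by hand. The following plain-Python sketch verifies that a table has one row per state, one column per parent combination, and columns that sum to 1. The function `check_cpd` is an illustrative helper, not part of pgmpy, and it is exercised here on the `SA_angina_de_pecho` values from this notebook.

```python
from math import isclose, prod


def check_cpd(values, variable_card, evidence_card):
    """Return a list of problems found in a tabular CPD given as a list of rows."""
    problems = []
    n_cols = prod(evidence_card)  # prod([]) == 1 for a variable without parents
    if len(values) != variable_card:
        problems.append(f"expected {variable_card} rows, got {len(values)}")
    for r, row in enumerate(values):
        if len(row) != n_cols:
            problems.append(f"row {r}: expected {n_cols} columns, got {len(row)}")
    if not problems:
        for c in range(n_cols):
            total = sum(row[c] for row in values)
            if not isclose(total, 1.0, abs_tol=1e-6):
                problems.append(f"column {c} sums to {total:.2f}")
    return problems


angina = [[0.20, 0.30, 0.30, 0.30, 0.40, 0.70],
          [0.80, 0.70, 0.70, 0.70, 0.60, 0.30]]
print(check_cpd(angina, variable_card=2, evidence_card=[3, 2]))  # → []
```

An empty list means no problem was found; a table with empty rows or a wrong cardinality produces one message per issue.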

## Running inference on the BN with the VariableElimination algorithm


We will query the BN with the **VariableElimination** algorithm: what is the distribution of "B" (Burglary) given that "J" and "M" have been observed to be true (John and Mary call)?

In [7]:
from pgmpy.inference import VariableElimination

In [8]:
bn_inference = VariableElimination(bn)

In [10]:
q = bn_inference.query(variables=['LastResults'], evidence={'FinalResult':0})
print(q['LastResults'])

╒═══════════════╤════════════════════╕
│ LastResults   │   phi(LastResults) │
╞═══════════════╪════════════════════╡
│ LastResults_0 │             0.3917 │
├───────────────┼────────────────────┤
│ LastResults_1 │             0.6083 │
╘═══════════════╧════════════════════╛
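
To make concrete what such a posterior query computes, here is a hand calculation on this notebook's own numbers, written as a plain-Python sketch. It applies Bayes' rule to obtain P(MR_amlodipino | SA_angina_de_pecho = 0, MR_metropolol = m) from the first row of `cpt_angina_pecho` and the uniform prior of `cpt_amlodipino`. It assumes that the last evidence variable varies fastest in the table columns; `posterior_amlodipino` is an illustrative name.

```python
# First row of cpt_angina_pecho: P(SA_angina_de_pecho = 0 | MR_metropolol, MR_amlodipino),
# columns ordered with MR_amlodipino varying fastest (assumption on the column layout).
ANGINA_ROW0 = [0.20, 0.30, 0.30, 0.30, 0.40, 0.70]
# Prior of MR_amlodipino taken from cpt_amlodipino.
PRIOR_AMLODIPINO = [0.5, 0.5]


def posterior_amlodipino(metropolol_state):
    """P(MR_amlodipino | SA_angina_de_pecho = 0, MR_metropolol = metropolol_state)."""
    # Unnormalised joint: prior times likelihood for each MR_amlodipino state
    joint = [PRIOR_AMLODIPINO[a] * ANGINA_ROW0[metropolol_state * 2 + a]
             for a in (0, 1)]
    z = sum(joint)  # normalising constant
    return [j / z for j in joint]


print(posterior_amlodipino(0))  # approximately [0.4, 0.6]
```

The other children of `MR_amlodipino` are unobserved and sum out to 1, so they do not affect this particular posterior. Variable elimination performs the same kind of multiply, sum out and normalise steps, only organised to scale to the whole network.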


In [11]:
bn_inference.state_names()

TypeError: 'NoneType' object is not callable

## Obtaining the conditional independencies implied by the BN

In [1]:
bn.get_independencies()

NameError: name 'bn' is not defined