# Bayesian Network for the Diagnosis of Psychiatric Diseases

Collecting data for my Bayesian network project has been a very challenging task. I first attempted to obtain medical records by calling and visiting several medical institutions in the metro-Atlanta area, but I soon found out that the release of (anonymous) medical records to the general public is illegal, and telling people that you're a student at the Georgia Institute of Technology will only get you so much.

So, because I refuse to put this project to bed for lack of resources, I have chosen to generate my own data to represent patients with psychiatric disorders. The data that I am generating will be random and will in no way be an accurate representation of the real-world population. However, the purpose of this project is to build a BBN (Beautiful Bayesian Network) that is capable of handling real-world data when it is blessed enough to receive some.

I am focusing on 4 main psychiatric diseases: Manic Depressive Psychosis (Bipolar Disorder), Depressive Disorder, Mixed Dementia, and Schizophrenia. Through my research I have identified a number of causes and effects associated with these diseases.

The data that I will be generating will be simple: true or false (1 or 0) values that represent gender and whether the patient is experiencing the listed cause/effect, and numerical values that represent how many of something a patient has, e.g. the number of parents with the disease.

After generating this data, I will go through it and calculate the probabilities for the various diseases, causes, and effects, and use these statistics to train my Bayesian network model.
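As a rough illustration of what "calculating the probabilities" could look like, the sketch below estimates a prior over diagnoses and the conditional probability of each symptom given a diagnosis. The `toy` table and its eight rows are made up for this example; only the column names are borrowed from the project. It is one possible approach under those assumptions, not necessarily the calculation used later.

```python
import pandas as pd

# Hypothetical toy data: 8 patients, one diagnosis column and two binary symptom columns.
toy = pd.DataFrame({
    "Patient Diagnosis": ["Schizophrenia", "Schizophrenia",
                          "Depressive Disorder", "Depressive Disorder", "Depressive Disorder",
                          "Mixed Dementia", "Mixed Dementia", "Mixed Dementia"],
    "Disorientation": [1, 0, 0, 0, 1, 1, 1, 0],
    "Taedium Vitae":  [0, 0, 1, 1, 1, 0, 1, 0],
})

# Prior P(diagnosis): the relative frequency of each diagnosis.
prior = toy["Patient Diagnosis"].value_counts(normalize=True)

# Conditional P(symptom = 1 | diagnosis): the mean of a 0/1 column within each group.
cpt = toy.groupby("Patient Diagnosis")[["Disorientation", "Taedium Vitae"]].mean()

print(prior)
print(cpt)
```

Because every symptom column holds only 0s and 1s, the group mean is the fraction of patients in that group who have the symptom.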

In [33]:
import pandas as pd
import random
from numpy.random import randint
import numpy as np

In [34]:
#Create a table that will hold all of the data
#A list keeps the column order fixed (a set is unordered and may be rejected by newer pandas versions)

data = pd.DataFrame(columns = ["Unwanted Incident", "Inquietude or Anxiety", "Disorientation", "Behavior Disorders",
                               "Recent Birth", "Chronicle Depression, Intercurrenced", "Toxins in Working Environment",
                               "Mania/Hallucinosis", "Social Recession/ Impulse Reduction", "Social Life Detorioration",
                               "Personality/Emotional Life Deterioration", "Grimaces, Mannerisms, Puerility",
                               "Abusive use of HBP's, Sedatives, Contraception Pills", "Genetic Influence",
                               "Wariness/Memory Reduction", "Elevated Stress Level", "Scarcity of Phosphate and B12",
                               "Taedium Vitae"])

In [35]:
data 

Unnamed: 0,Unwanted Incident,Inquietude or Anxiety,Disorientation,Behavior Disorders,Recent Birth,"Chronicle Depression, Intercurrenced",Toxins in Working Environment,Mania/Hallucinosis,Social Recession/ Impulse Reduction,Social Life Detorioration,Personality/Emotional Life Deterioration,"Grimaces, Mannerisms, Puerility","Abusive use of HBP's, Sedatives, Contraception Pills",Genetic Influence,Wariness/Memory Reduction,Elevated Stress Level,Scarcity of Phosphate and B12,Taedium Vitae


In [36]:
print(len(data.columns))

18


In [37]:
#Create a loop that will generate the data for the table and append it row by row
#The loop runs 1,000 times, being that I want to generate 1,000 lines of data
#   Each call to randint(2, size=18) draws 18 random 0/1 values, one for each of the 18 categories in a single row,
#   so no inner loop or temporary list is needed

for i in range(1000):
    data.loc[i] = list(randint(2, size=18))
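
Appending one row at a time works, but the same kind of table can also be filled in a single call. The sketch below is an alternative under a few assumptions: it uses NumPy's `default_rng` generator with an arbitrary seed so the random table is reproducible, and a shortened, hypothetical `columns` list stands in for the 18 column names.

```python
import numpy as np
import pandas as pd

# Hypothetical short column list standing in for the 18 columns above.
columns = ["Elevated Stress Level", "Recent Birth", "Disorientation"]

# A seeded generator makes the random table reproducible between runs.
rng = np.random.default_rng(seed=0)

# integers(0, 2) draws from {0, 1}; one call fills the whole 1000 x len(columns) table.
values = rng.integers(0, 2, size=(1000, len(columns)))
sketch = pd.DataFrame(values, columns=columns)

print(sketch.shape)
print(sketch.head())
```

Building the frame from a 2-D array also gives the columns an integer dtype, which row-by-row assignment into an empty frame may not.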
 


In [38]:
#admire the data
data

Unnamed: 0,Unwanted Incident,Inquietude or Anxiety,Disorientation,Behavior Disorders,Recent Birth,"Chronicle Depression, Intercurrenced",Toxins in Working Environment,Mania/Hallucinosis,Social Recession/ Impulse Reduction,Social Life Detorioration,Personality/Emotional Life Deterioration,"Grimaces, Mannerisms, Puerility","Abusive use of HBP's, Sedatives, Contraception Pills",Genetic Influence,Wariness/Memory Reduction,Elevated Stress Level,Scarcity of Phosphate and B12,Taedium Vitae
0,1,0,0,1,1,1,1,0,0,1,1,0,0,1,0,1,0,1
1,0,0,1,1,1,0,1,1,0,0,0,0,0,1,1,1,1,1
2,1,0,0,0,0,0,1,1,1,0,0,1,0,0,0,0,1,1
3,1,0,1,0,1,1,1,1,0,1,0,1,1,1,1,1,0,0
4,0,0,1,1,0,1,1,0,1,0,0,0,1,0,0,1,0,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
995,0,0,1,0,1,1,1,0,0,0,1,1,0,0,0,1,1,1
996,0,1,1,1,1,1,0,1,1,1,0,1,0,1,1,1,1,0
997,1,0,0,1,1,0,1,0,1,1,1,0,0,0,1,0,1,0
998,1,0,0,1,0,1,0,1,1,0,0,1,1,1,1,0,0,0


In [39]:
# Now I need to generate the diseases that the patients have
# simple array 
names = np.array(['Manic Depressive Psychosis', 'Depressive Disorder', 'Mixed Dementia', 'Schizophrenia']) 

random.choice(names)

'Depressive Disorder'

In [40]:
disorder_list = []
for i in range(1000):
    disorder_list.append(random.choice(names))

disorder_list

['Schizophrenia',
 'Depressive Disorder',
 'Mixed Dementia',
 'Mixed Dementia',
 'Depressive Disorder',
 'Schizophrenia',
 'Manic Depressive Psychosis',
 'Schizophrenia',
 'Manic Depressive Psychosis',
 'Manic Depressive Psychosis',
 'Manic Depressive Psychosis',
 'Schizophrenia',
 'Depressive Disorder',
 'Depressive Disorder',
 'Schizophrenia',
 'Schizophrenia',
 'Mixed Dementia',
 'Schizophrenia',
 'Depressive Disorder',
 'Schizophrenia',
 'Depressive Disorder',
 'Manic Depressive Psychosis',
 'Schizophrenia',
 'Mixed Dementia',
 'Manic Depressive Psychosis',
 'Mixed Dementia',
 'Schizophrenia',
 'Schizophrenia',
 'Manic Depressive Psychosis',
 'Mixed Dementia',
 'Manic Depressive Psychosis',
 'Depressive Disorder',
 'Schizophrenia',
 'Manic Depressive Psychosis',
 'Schizophrenia',
 'Manic Depressive Psychosis',
 'Mixed Dementia',
 'Schizophrenia',
 'Depressive Disorder',
 'Depressive Disorder',
 'Manic Depressive Psychosis',
 'Depressive Disorder',
 'Depressive Disorder',
 'Schizophrenia',
 ...]

In [41]:
#Add the disorder column to the dataframe
data.insert(loc=0, column='Patient Diagnosis', value=disorder_list)
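
The diagnosis column can also be drawn without an explicit loop. This is a minimal sketch, assuming a seeded NumPy generator and a one-column stand-in `table` in place of the full symptom table; the same pattern would apply to any other categorical column.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)

names = ['Manic Depressive Psychosis', 'Depressive Disorder', 'Mixed Dementia', 'Schizophrenia']
n_patients = 1000

# One call draws a diagnosis for every patient, uniformly at random.
diagnoses = rng.choice(names, size=n_patients)

# Hypothetical stand-in for the symptom table built earlier.
table = pd.DataFrame({"Disorientation": rng.integers(0, 2, size=n_patients)})
table.insert(loc=0, column="Patient Diagnosis", value=diagnoses)

print(table["Patient Diagnosis"].value_counts())
```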

In [42]:
data

Unnamed: 0,Patient Diagnosis,Unwanted Incident,Inquietude or Anxiety,Disorientation,Behavior Disorders,Recent Birth,"Chronicle Depression, Intercurrenced",Toxins in Working Environment,Mania/Hallucinosis,Social Recession/ Impulse Reduction,Social Life Detorioration,Personality/Emotional Life Deterioration,"Grimaces, Mannerisms, Puerility","Abusive use of HBP's, Sedatives, Contraception Pills",Genetic Influence,Wariness/Memory Reduction,Elevated Stress Level,Scarcity of Phosphate and B12,Taedium Vitae
0,Schizophrenia,1,0,0,1,1,1,1,0,0,1,1,0,0,1,0,1,0,1
1,Depressive Disorder,0,0,1,1,1,0,1,1,0,0,0,0,0,1,1,1,1,1
2,Mixed Dementia,1,0,0,0,0,0,1,1,1,0,0,1,0,0,0,0,1,1
3,Mixed Dementia,1,0,1,0,1,1,1,1,0,1,0,1,1,1,1,1,0,0
4,Depressive Disorder,0,0,1,1,0,1,1,0,1,0,0,0,1,0,0,1,0,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
995,Manic Depressive Psychosis,0,0,1,0,1,1,1,0,0,0,1,1,0,0,0,1,1,1
996,Depressive Disorder,0,1,1,1,1,1,0,1,1,1,0,1,0,1,1,1,1,0
997,Schizophrenia,1,0,0,1,1,0,1,0,1,1,1,0,0,0,1,0,1,0
998,Manic Depressive Psychosis,1,0,0,1,0,1,0,1,1,0,0,1,1,1,1,0,0,0


In [43]:
#Generate male/female column and add it to dataframe

gender = np.array(['Male', 'Female']) 

gender_list = []
for i in range(1000):
    gender_list.append(random.choice(gender))

gender_list

['Female',
 'Female',
 'Female',
 'Male',
 'Male',
 'Female',
 'Male',
 'Female',
 'Female',
 'Male',
 'Male',
 'Male',
 'Male',
 'Male',
 'Female',
 'Female',
 'Male',
 'Female',
 'Female',
 'Female',
 'Male',
 'Female',
 'Male',
 'Male',
 'Male',
 'Female',
 'Male',
 'Female',
 'Female',
 'Female',
 'Female',
 'Female',
 'Female',
 'Female',
 'Female',
 'Female',
 'Male',
 'Female',
 'Male',
 'Male',
 'Female',
 'Female',
 'Female',
 'Female',
 'Female',
 'Female',
 'Male',
 'Male',
 'Male',
 'Male',
 'Female',
 'Male',
 'Female',
 'Male',
 'Male',
 'Male',
 'Male',
 'Male',
 'Female',
 'Male',
 'Female',
 'Female',
 'Female',
 'Male',
 'Female',
 'Male',
 'Male',
 'Male',
 'Male',
 'Female',
 'Female',
 'Female',
 'Female',
 'Male',
 'Female',
 'Male',
 'Female',
 'Female',
 'Male',
 'Male',
 'Male',
 'Female',
 'Female',
 'Male',
 'Female',
 'Male',
 'Male',
 'Female',
 'Male',
 'Female',
 'Male',
 'Male',
 'Male',
 'Female',
 'Male',
 'Female',
 'Male',
 'Female',
 'Female',
 'Female',
 ...]

In [44]:
data.insert(loc=1, column='Gender', value=gender_list)
data

Unnamed: 0,Patient Diagnosis,Gender,Unwanted Incident,Inquietude or Anxiety,Disorientation,Behavior Disorders,Recent Birth,"Chronicle Depression, Intercurrenced",Toxins in Working Environment,Mania/Hallucinosis,Social Recession/ Impulse Reduction,Social Life Detorioration,Personality/Emotional Life Deterioration,"Grimaces, Mannerisms, Puerility","Abusive use of HBP's, Sedatives, Contraception Pills",Genetic Influence,Wariness/Memory Reduction,Elevated Stress Level,Scarcity of Phosphate and B12,Taedium Vitae
0,Schizophrenia,Female,1,0,0,1,1,1,1,0,0,1,1,0,0,1,0,1,0,1
1,Depressive Disorder,Female,0,0,1,1,1,0,1,1,0,0,0,0,0,1,1,1,1,1
2,Mixed Dementia,Female,1,0,0,0,0,0,1,1,1,0,0,1,0,0,0,0,1,1
3,Mixed Dementia,Male,1,0,1,0,1,1,1,1,0,1,0,1,1,1,1,1,0,0
4,Depressive Disorder,Male,0,0,1,1,0,1,1,0,1,0,0,0,1,0,0,1,0,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
995,Manic Depressive Psychosis,Female,0,0,1,0,1,1,1,0,0,0,1,1,0,0,0,1,1,1
996,Depressive Disorder,Male,0,1,1,1,1,1,0,1,1,1,0,1,0,1,1,1,1,0
997,Schizophrenia,Male,1,0,0,1,1,0,1,0,1,1,1,0,0,0,1,0,1,0
998,Manic Depressive Psychosis,Male,1,0,0,1,0,1,0,1,1,0,0,1,1,1,1,0,0,0


In [45]:
schiz_patients = data.loc[data['Patient Diagnosis'] == 'Schizophrenia']
bipolar_patients = data.loc[data['Patient Diagnosis'] == 'Manic Depressive Psychosis']
dementia_patients = data.loc[data['Patient Diagnosis'] == 'Mixed Dementia']
depressive_patients = data.loc[data['Patient Diagnosis'] == 'Depressive Disorder']
data['Patient Diagnosis'].value_counts()

Manic Depressive Psychosis    265
Depressive Disorder           252
Mixed Dementia                243
Schizophrenia                 240
Name: Patient Diagnosis, dtype: int64
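
Counts like these can be broken down by a second column in one step. The sketch below uses `pd.crosstab` on a small made-up `toy` table (six hypothetical rows with the same two column names) to get both the raw counts and the share of each gender within each diagnosis.

```python
import pandas as pd

# Hypothetical toy rows with the same two column names as the table above.
toy = pd.DataFrame({
    "Patient Diagnosis": ["Schizophrenia", "Schizophrenia", "Depressive Disorder",
                          "Depressive Disorder", "Depressive Disorder", "Mixed Dementia"],
    "Gender": ["Male", "Female", "Male", "Male", "Female", "Female"],
})

# Raw counts of each (diagnosis, gender) pair.
counts = pd.crosstab(toy["Patient Diagnosis"], toy["Gender"])

# normalize="index" divides each row by its total, giving P(gender | diagnosis).
p_gender_given_dx = pd.crosstab(toy["Patient Diagnosis"], toy["Gender"], normalize="index")

print(counts)
print(p_gender_given_dx)
```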

In [142]:
depressive_patients[depressive_patients["Gender"] == "Male"]

Unnamed: 0,Patient Diagnosis,Gender,Unwanted Incident,Inquietude or Anxiety,Disorientation,Behavior Disorders,Recent Birth,"Chronicle Depression, Intercurrenced",Toxins in Working Environment,Mania/Hallucinosis,Social Recession/ Impulse Reduction,Social Life Detorioration,Personality/Emotional Life Deterioration,"Grimaces, Mannerisms, Puerility","Abusive use of HBP's, Sedatives, Contraception Pills",Genetic Influence,Wariness/Memory Reduction,Elevated Stress Level,Scarcity of Phosphate and B12,Taedium Vitae
4,Depressive Disorder,Male,0,0,1,1,0,1,1,0,1,0,0,0,1,0,0,1,0,1
12,Depressive Disorder,Male,0,1,0,1,0,1,1,0,1,0,1,0,1,1,0,1,1,0
13,Depressive Disorder,Male,1,0,0,0,0,0,0,0,0,1,0,1,0,1,1,1,0,1
20,Depressive Disorder,Male,0,0,1,1,0,1,0,0,0,0,0,0,0,0,1,1,0,1
38,Depressive Disorder,Male,1,1,1,0,0,0,0,1,0,1,0,0,0,0,1,0,0,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
963,Depressive Disorder,Male,1,1,1,0,0,1,1,1,0,1,1,1,0,0,1,1,1,0
967,Depressive Disorder,Male,1,0,1,1,0,1,1,1,1,1,1,0,0,1,1,0,0,0
968,Depressive Disorder,Male,1,1,1,0,0,0,1,0,1,1,0,1,0,1,1,1,0,1
994,Depressive Disorder,Male,1,0,1,1,0,1,1,0,1,0,0,0,1,0,0,1,0,1


# Making My Data More Accurate

 1. Randomize the number of patients with each disorder.
 2. Create a table for each disorder with columns that represent only the causes and effects that are directly related to that disorder in the network.
 3. Using the random number that was assigned to each disorder, randomize values from that sample for the parent nodes of each disorder.
 4. From there, if any of those parent nodes have parent nodes of their own, use the randomized number for the child nodes to randomize values for those nodes.
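
One common way to make generated values for connected nodes depend on each other is ancestral (forward) sampling: draw each parent node first, then draw each child with a probability that depends on the parent's value. This is a swapped-in technique for illustration, not necessarily the procedure the steps above describe. The edge from "Elevated Stress Level" to "Inquietude or Anxiety" and every probability below are hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)
n = 265  # hypothetical number of patients for one disorder

# Hypothetical probabilities, chosen only for illustration (not clinical values).
p_stress = 0.6                              # P(Elevated Stress Level = 1)
p_anxiety_given_stress = {1: 0.8, 0: 0.3}   # P(Inquietude or Anxiety = 1 | stress)

# Step 1: sample the parent node.
stress = rng.binomial(1, p_stress, size=n)

# Step 2: sample the child, using a different probability depending on the parent's value.
p_child = np.where(stress == 1, p_anxiety_given_stress[1], p_anxiety_given_stress[0])
anxiety = rng.binomial(1, p_child)

sample = pd.DataFrame({"Elevated Stress Level": stress, "Inquietude or Anxiety": anxiety})

# Fraction of patients with anxiety, split by whether stress is present.
print(sample.groupby("Elevated Stress Level")["Inquietude or Anxiety"].mean())
```

Data generated this way carries a dependency between the two columns, which independent coin flips per column do not.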

In [148]:
#Lists keep the column order fixed (a set is unordered and may be rejected by newer pandas versions)
manicDepressivePsychosis = pd.DataFrame(columns = ["Elevated Stress Level", "Scarcity of Phosphate and B12",
                                                   "Chronicle Depression, Intercurrenced"])
mixedDementia = pd.DataFrame(columns = ["Disorientation", "Behavior Disorders", "Toxins in Working Environment",
                                        "Mania/Hallucinosis", "Social Recession/Impulse Reduction",
                                        "Scarcity of Phosphate and B12", "Elevated Stress Level",
                                        "Wariness/Memory Reduction", "Personality/Emotional Life Deterioration"])
Schizophrenia = pd.DataFrame(columns = ["Mother with Virosis During Pregnancy"])
depression = pd.DataFrame(columns = ["Unwanted Incident", "Inquietude or Anxiety", "Recent Birth",
                                     "Abuse use of HBP's, Sedatives, Contraception Pills",
                                     "Taedium Vitae (Suicidal Thoughts)", "Scarcity of Phosphate and B12",
                                     "Elevated Stress Level", "Inheritance"])

In [201]:
#265 = number of Manic Depressive Psychosis patients in the value_counts() output above
for i in range(265):
    manicDepressivePsychosis.loc[i] = list(randint(2, size=3))

In [202]:
print("Chronicle Depression, Intercurrenced", len(manicDepressivePsychosis[manicDepressivePsychosis["Chronicle Depression, Intercurrenced"] == 1]))
print("Scarcity of Phosphate and B12", len(manicDepressivePsychosis[manicDepressivePsychosis["Scarcity of Phosphate and B12"] == 1]))
print("Elevated Stress Level", len(manicDepressivePsychosis[manicDepressivePsychosis["Elevated Stress Level"] == 1]))
manicDepressivePsychosis.head()

Chronicle Depression, Intercurrenced 138
Scarcity of Phosphate and B12 126
Elevated Stress Level 119


Unnamed: 0,Elevated Stress Level,Scarcity of Phosphate and B12,"Chronicle Depression, Intercurrenced"
0,1,0,1
1,0,0,0
2,0,1,0
3,1,1,0
4,1,0,0


In [211]:
#243 = number of Mixed Dementia patients in the value_counts() output above
for i in range(243):
    mixedDementia.loc[i] = list(randint(2, size=9))

In [212]:
print("Disorientation", len(mixedDementia[mixedDementia["Disorientation"] ==1]))
print("Behavior Disorders", len(mixedDementia[mixedDementia["Behavior Disorders"] ==1]))
print("Toxins in Working Environment", len(mixedDementia[mixedDementia["Toxins in Working Environment"] ==1]))
print("Mania/Hallucinosis", len(mixedDementia[mixedDementia["Mania/Hallucinosis"] ==1]))
print("Social Recession/Impulse Reduction", len(mixedDementia[mixedDementia["Social Recession/Impulse Reduction"] ==1]))
print("Scarcity of Phosphate and B12", len(mixedDementia[mixedDementia["Scarcity of Phosphate and B12"] ==1]))
print("Elevated Stress Level", len(mixedDementia[mixedDementia["Elevated Stress Level"] ==1]))
print("Wariness/Memory Reduction", len(mixedDementia[mixedDementia["Wariness/Memory Reduction"] ==1]))
print("Personality/Emotional Life Deterioration", len(mixedDementia[mixedDementia["Personality/Emotional Life Deterioration"] ==1]))
mixedDementia.head()

Disorientation 130
Behavior Disorders 111
Toxins in Working Environment 114
Mania/Hallucinosis 123
Social Recession/Impulse Reduction 108
Scarcity of Phosphate and B12 129
Elevated Stress Level 102
Wariness/Memory Reduction 119
Personality/Emotional Life Deterioration 127


Unnamed: 0,Disorientation,Behavior Disorders,Toxins in Working Environment,Mania/Hallucinosis,Social Recession/Impulse Reduction,Scarcity of Phosphate and B12,Elevated Stress Level,Wariness/Memory Reduction,Personality/Emotional Life Deterioration
0,1,0,0,1,0,0,1,1,0
1,1,0,0,1,0,1,1,1,0
2,1,0,0,1,1,0,1,0,1
3,0,0,0,0,1,0,1,1,0
4,1,1,1,0,1,1,0,1,0
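
The per-column counts printed above can also be obtained for every column at once. The sketch below uses a small `toy` table (five illustrative rows and two of the column names) to show that summing a 0/1 column counts its 1s. If the real tables hold their values with `object` dtype, converting with `astype(int)` first may be needed.

```python
import pandas as pd

# Toy table: five illustrative rows for two of the columns used above.
toy = pd.DataFrame({
    "Disorientation":     [1, 1, 1, 0, 1],
    "Behavior Disorders": [0, 0, 0, 0, 1],
})

# Summing a 0/1 column counts its 1s, so one call covers every column.
counts = toy.sum()

# The mean of a 0/1 column is the fraction of rows where the value is 1.
fractions = toy.mean()

print(counts)
print(fractions)
```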


In [157]:
#240 = number of Schizophrenia patients in the value_counts() output above
for i in range(240):
    Schizophrenia.loc[i] = list(randint(2, size=1))

In [180]:
print("Mother with Virosis", len(Schizophrenia[Schizophrenia["Mother with Virosis During Pregnancy"] == 1]))
Schizophrenia.head()

Mother with Virosis 132


Unnamed: 0,Mother with Virosis During Pregnancy
0,0
1,0
2,1
3,1
4,1


In [213]:
#252 = number of Depressive Disorder patients in the value_counts() output above
for i in range(252):
    depression.loc[i] = list(randint(2, size=8))

In [214]:
print("Unwanted Incident", len(depression[depression['Unwanted Incident'] == 1]))
print("Inquietude or Anxiety", len(depression[depression['Inquietude or Anxiety'] == 1]))
print("Recent Birth", len(depression[depression['Recent Birth'] == 1]))
print("Drug Abuse", len(depression[depression["Abuse use of HBP's, Sedatives, Contraception Pills"] == 1]))
print("Taedium Vitae", len(depression[depression['Taedium Vitae (Suicidal Thoughts)'] == 1]))
print("Scarcity of Phosphate and B12", len(depression[depression['Scarcity of Phosphate and B12'] == 1]))
print("Elevated Stress Level", len(depression[depression['Elevated Stress Level'] == 1]))
print("Inheritance", len(depression[depression['Inheritance'] == 1]))
depression.head()

Unwanted Incident 128
Inquietude or Anxiety 141
Recent Birth 129
Drug Abuse 136
Taedium Vitae 133
Scarcity of Phosphate and B12 119
Elevated Stress Level 126
Inheritance 116


Unnamed: 0,Unwanted Incident,Inquietude or Anxiety,Recent Birth,"Abuse use of HBP's, Sedatives, Contraception Pills",Taedium Vitae (Suicidal Thoughts),Scarcity of Phosphate and B12,Elevated Stress Level,Inheritance
0,1,1,1,1,0,0,1,1
1,1,0,1,1,1,0,1,1
2,0,0,0,1,1,0,1,1
3,1,1,0,1,1,1,0,1
4,1,1,0,0,1,1,1,1
