## Capstone project

Group 1 members
* William Itotia
* Firdosa Mohamed
* Esther Munene
* Frank Oyugi

## Overview

The healthcare industry is rapidly evolving with the integration of technology, aiming to provide better accessibility, efficiency, and personalized care. This project focuses on developing a comprehensive recommendation system that predicts diseases, offers detailed descriptions, suggests precautions, and recommends medications based on the symptoms input by users. By leveraging machine learning and extensive medical data, the system aims to empower individuals with timely and accurate medical advice, reducing the need for immediate hospital visits and improving overall health outcomes.

## Problem Statement

Many individuals face challenges in accessing timely and accurate medical advice due to various factors such as geographical barriers, busy schedules, and overcrowded healthcare facilities. These challenges often lead to delayed diagnosis and treatment, potentially worsening health conditions. There is a need for a solution that can provide immediate, reliable, and personalized medical recommendations based on symptoms, thereby improving accessibility to healthcare and reducing the strain on medical facilities.

## Objectives

* To gather a repository of detailed descriptions for a wide range of diseases, including causes, symptoms, and treatment options.
* To develop and train a machine learning model on extensive medical data to predict possible diseases based on the input symptoms.
* To provide suggestions for precautions and preventive measures tailored to the predicted diseases.
* To recommend appropriate medications based on the predicted disease.
* To integrate a user-friendly interface for individuals to input their symptoms.

## Data Understanding

Importing necessary libraries

In [1]:
import os
import pickle
import warnings

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    classification_report,
    confusion_matrix,
    hamming_loss,
    multilabel_confusion_matrix,
)
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import (
    LabelEncoder,
    MultiLabelBinarizer,
    OneHotEncoder,
    StandardScaler,
)
from sklearn.svm import SVC

# Optional deep-learning imports (currently unused)
# from keras.models import Model, Sequential
# from keras.layers import Dense, Input
# from keras.callbacks import EarlyStopping
# from tensorflow.keras.layers import BatchNormalization, Dropout, LeakyReLU
# from tensorflow.keras.regularizers import l2
# from tensorflow.keras.optimizers import Adam

warnings.filterwarnings('ignore')
pd.set_option('display.max_columns', None)

In [2]:
#pip install tensorflow --timeout=120

In [3]:
#pip install keras

Load the data set

In [4]:
# Forward slashes avoid accidental escape sequences and work on every platform
symptoms_df = pd.read_csv("Dataset/symtoms_df.csv")
precautions_df = pd.read_csv("Dataset/precautions_df.csv")
descriptions_df = pd.read_csv("Dataset/description.csv")
medications_df = pd.read_csv("Dataset/medications.csv")
training_df = pd.read_csv("Dataset/Training.csv")
dataset = pd.read_csv("Dataset/dataset.csv")

In [5]:
symptoms_df.shape

(4920, 6)

In [6]:
precautions_df.head()

| Unnamed: 0.1 | Unnamed: 0 | Disease | Precaution_1 | Precaution_2 | Precaution_3 | Precaution_4 |
| --- | --- | --- | --- | --- | --- | --- |
| 0 | 0 | Drug Reaction | stop irritation | consult nearest hospital | stop taking drug | follow up |
| 1 | 1 | Malaria | Consult nearest hospital | avoid oily food | avoid non veg food | keep mosquitos out |
| 2 | 2 | Allergy | apply calamine | cover area with bandage | | use ice to compress itching |
| 3 | 3 | Hypothyroidism | reduce stress | exercise | eat healthy | get proper sleep |
| 4 | 4 | Psoriasis | wash hands with warm soapy water | stop bleeding using pressure | consult doctor | salt baths |
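
Columns named `Unnamed: 0` (and `Unnamed: 0.1`) typically appear when a DataFrame was written to CSV together with its index and then read back. A minimal sketch of dropping them, using a small in-memory CSV; the sample rows and the `drop_unnamed_columns` helper are illustrative assumptions, not part of the project code:

```python
import io

import pandas as pd


def drop_unnamed_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df without the 'Unnamed: N' columns pandas creates for header-less columns."""
    keep = ~df.columns.str.startswith('Unnamed')
    return df.loc[:, keep].copy()


# A tiny CSV whose first column has no header, as happens when an index is saved
csv_text = (
    ",Disease,Precaution_1\n"
    "0,Malaria,Consult nearest hospital\n"
    "1,Allergy,apply calamine\n"
)
raw = pd.read_csv(io.StringIO(csv_text))
cleaned = drop_unnamed_columns(raw)

print(list(raw.columns))
print(list(cleaned.columns))
```

Passing `index_col=0` to `pd.read_csv` is another option when exactly one saved index column is present.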


In [7]:
descriptions_df.head()

| Unnamed: 0 | Disease | Description |
| --- | --- | --- |
| 0 | Fungal infection | Fungal infection is a common skin condition ca... |
| 1 | Allergy | Allergy is an immune system reaction to a subs... |
| 2 | GERD | GERD (Gastroesophageal Reflux Disease) is a di... |
| 3 | Chronic cholestasis | Chronic cholestasis is a condition where bile ... |
| 4 | Drug Reaction | Drug Reaction occurs when the body reacts adve... |


In [8]:
medications_df.head()

| Unnamed: 0 | Disease | Medication |
| --- | --- | --- |
| 0 | Fungal infection | ['Antifungal Cream', 'Fluconazole', 'Terbinafi... |
| 1 | Allergy | ['Antihistamines', 'Decongestants', 'Epinephri... |
| 2 | GERD | ['Proton Pump Inhibitors (PPIs)', 'H2 Blockers... |
| 3 | Chronic cholestasis | ['Ursodeoxycholic acid', 'Cholestyramine', 'Me... |
| 4 | Drug Reaction | ['Antihistamines', 'Epinephrine', 'Corticoster... |


In [9]:
training_df.head()

Unnamed: 0,itching,skin_rash,nodal_skin_eruptions,continuous_sneezing,shivering,chills,joint_pain,stomach_pain,acidity,ulcers_on_tongue,muscle_wasting,vomiting,burning_micturition,spotting_ urination,fatigue,weight_gain,anxiety,cold_hands_and_feets,mood_swings,weight_loss,restlessness,lethargy,patches_in_throat,irregular_sugar_level,cough,high_fever,sunken_eyes,breathlessness,sweating,dehydration,indigestion,headache,yellowish_skin,dark_urine,nausea,loss_of_appetite,pain_behind_the_eyes,back_pain,constipation,abdominal_pain,diarrhoea,mild_fever,yellow_urine,yellowing_of_eyes,acute_liver_failure,fluid_overload,swelling_of_stomach,swelled_lymph_nodes,malaise,blurred_and_distorted_vision,phlegm,throat_irritation,redness_of_eyes,sinus_pressure,runny_nose,congestion,chest_pain,weakness_in_limbs,fast_heart_rate,pain_during_bowel_movements,pain_in_anal_region,bloody_stool,irritation_in_anus,neck_pain,dizziness,cramps,bruising,obesity,swollen_legs,swollen_blood_vessels,puffy_face_and_eyes,enlarged_thyroid,brittle_nails,swollen_extremeties,excessive_hunger,extra_marital_contacts,drying_and_tingling_lips,slurred_speech,knee_pain,hip_joint_pain,muscle_weakness,stiff_neck,swelling_joints,movement_stiffness,spinning_movements,loss_of_balance,unsteadiness,weakness_of_one_body_side,loss_of_smell,bladder_discomfort,foul_smell_of urine,continuous_feel_of_urine,passage_of_gases,internal_itching,toxic_look_(typhos),depression,irritability,muscle_pain,altered_sensorium,red_spots_over_body,belly_pain,abnormal_menstruation,dischromic 
_patches,watering_from_eyes,increased_appetite,polyuria,family_history,mucoid_sputum,rusty_sputum,lack_of_concentration,visual_disturbances,receiving_blood_transfusion,receiving_unsterile_injections,coma,stomach_bleeding,distention_of_abdomen,history_of_alcohol_consumption,fluid_overload.1,blood_in_sputum,prominent_veins_on_calf,palpitations,painful_walking,pus_filled_pimples,blackheads,scurring,skin_peeling,silver_like_dusting,small_dents_in_nails,inflammatory_nails,blister,red_sore_around_nose,yellow_crust_ooze,prognosis
0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Fungal infection
1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Fungal infection
2,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Fungal infection
3,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Fungal infection
4,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Fungal infection


In [10]:
training_df['prognosis'].unique()

array(['Fungal infection', 'Allergy', 'GERD', 'Chronic cholestasis',
       'Drug Reaction', 'Peptic ulcer diseae', 'AIDS', 'Diabetes ',
       'Gastroenteritis', 'Bronchial Asthma', 'Hypertension ', 'Migraine',
       'Cervical spondylosis', 'Paralysis (brain hemorrhage)', 'Jaundice',
       'Malaria', 'Chicken pox', 'Dengue', 'Typhoid', 'hepatitis A',
       'Hepatitis B', 'Hepatitis C', 'Hepatitis D', 'Hepatitis E',
       'Alcoholic hepatitis', 'Tuberculosis', 'Common Cold', 'Pneumonia',
       'Dimorphic hemmorhoids(piles)', 'Heart attack', 'Varicose veins',
       'Hypothyroidism', 'Hyperthyroidism', 'Hypoglycemia',
       'Osteoarthristis', 'Arthritis',
       '(vertigo) Paroymsal  Positional Vertigo', 'Acne',
       'Urinary tract infection', 'Psoriasis', 'Impetigo'], dtype=object)

In [11]:
len(training_df['prognosis'].unique())

41
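
Several of the labels listed above carry stray whitespace (for example `'Diabetes '`, `'Hypertension '`, and the double space in the vertigo label), which can make exact-match lookups against the other tables fail. A minimal sketch of normalising such labels on a toy Series; whether the other CSV files use the same spellings is not verified here, so the same cleaning would need to be applied to every table that is joined on the disease name:

```python
import pandas as pd

# Three labels copied from the prognosis output, plus one clean label
toy_labels = pd.Series(
    ['Diabetes ', 'Hypertension ', '(vertigo) Paroymsal  Positional Vertigo', 'Migraine'],
    name='prognosis',
)

# Trim the ends, then collapse internal runs of whitespace to a single space
clean_labels = toy_labels.str.strip().str.replace(r'\s+', ' ', regex=True)

print(clean_labels.tolist())
```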

In [12]:
X = training_df.drop('prognosis' ,axis=1)
y = training_df['prognosis']

In [13]:
X.head()

Unnamed: 0,itching,skin_rash,nodal_skin_eruptions,continuous_sneezing,shivering,chills,joint_pain,stomach_pain,acidity,ulcers_on_tongue,muscle_wasting,vomiting,burning_micturition,spotting_ urination,fatigue,weight_gain,anxiety,cold_hands_and_feets,mood_swings,weight_loss,restlessness,lethargy,patches_in_throat,irregular_sugar_level,cough,high_fever,sunken_eyes,breathlessness,sweating,dehydration,indigestion,headache,yellowish_skin,dark_urine,nausea,loss_of_appetite,pain_behind_the_eyes,back_pain,constipation,abdominal_pain,diarrhoea,mild_fever,yellow_urine,yellowing_of_eyes,acute_liver_failure,fluid_overload,swelling_of_stomach,swelled_lymph_nodes,malaise,blurred_and_distorted_vision,phlegm,throat_irritation,redness_of_eyes,sinus_pressure,runny_nose,congestion,chest_pain,weakness_in_limbs,fast_heart_rate,pain_during_bowel_movements,pain_in_anal_region,bloody_stool,irritation_in_anus,neck_pain,dizziness,cramps,bruising,obesity,swollen_legs,swollen_blood_vessels,puffy_face_and_eyes,enlarged_thyroid,brittle_nails,swollen_extremeties,excessive_hunger,extra_marital_contacts,drying_and_tingling_lips,slurred_speech,knee_pain,hip_joint_pain,muscle_weakness,stiff_neck,swelling_joints,movement_stiffness,spinning_movements,loss_of_balance,unsteadiness,weakness_of_one_body_side,loss_of_smell,bladder_discomfort,foul_smell_of urine,continuous_feel_of_urine,passage_of_gases,internal_itching,toxic_look_(typhos),depression,irritability,muscle_pain,altered_sensorium,red_spots_over_body,belly_pain,abnormal_menstruation,dischromic 
_patches,watering_from_eyes,increased_appetite,polyuria,family_history,mucoid_sputum,rusty_sputum,lack_of_concentration,visual_disturbances,receiving_blood_transfusion,receiving_unsterile_injections,coma,stomach_bleeding,distention_of_abdomen,history_of_alcohol_consumption,fluid_overload.1,blood_in_sputum,prominent_veins_on_calf,palpitations,painful_walking,pus_filled_pimples,blackheads,scurring,skin_peeling,silver_like_dusting,small_dents_in_nails,inflammatory_nails,blister,red_sore_around_nose,yellow_crust_ooze
0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [14]:
y.head()

0    Fungal infection
1    Fungal infection
2    Fungal infection
3    Fungal infection
4    Fungal infection
Name: prognosis, dtype: object

In [15]:
le = LabelEncoder()
le.fit(y)
Y = le.transform(y)

In [16]:
Y

array([15, 15, 15, ..., 38, 35, 27])
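
`LabelEncoder` assigns integer codes in the sorted order of the class names, which is why `'Fungal infection'` is encoded as 15 rather than 0. A small sketch, on a toy label list, of recovering the code-to-name mapping from the fitted encoder instead of writing it out by hand (the `toy_` names are illustrative):

```python
from sklearn.preprocessing import LabelEncoder

toy_labels = ['Fungal infection', 'Allergy', 'GERD', 'Allergy']
toy_encoder = LabelEncoder().fit(toy_labels)

# Codes follow the sorted class order: Allergy -> 0, Fungal infection -> 1, GERD -> 2
codes = toy_encoder.transform(toy_labels)

# Build the integer -> name dictionary directly from the fitted encoder
code_to_name = {code: str(name) for code, name in enumerate(toy_encoder.classes_)}

print(codes.tolist())
print(code_to_name)
print(toy_encoder.inverse_transform([2])[0])
```

`inverse_transform` does the same lookup without an explicit dictionary.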

In [17]:
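# Note: train_test_split returns (X_train, X_test, y_train, y_test) in that order.
# With the names unpacked as below, X_train/y_train hold the 20% portion (984 rows)
# and X_test/y_test hold the 80% portion (3936 rows), so the models in this notebook
# are fitted on 984 rows and evaluated on 3936 rows.
X_test, X_train, y_test, y_train = train_test_split(X, Y, test_size=0.2, random_state=42)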

In [18]:
X_test.shape,X_train.shape,y_test.shape,y_train.shape

((3936, 132), (984, 132), (3936,), (984,))
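
`train_test_split` returns its outputs in the order `X_train, X_test, y_train, y_test`, so with `test_size=0.2` the first and third return values are the 80% portions. A minimal sketch on synthetic data; the `toy_` names and the `stratify` argument are additions for illustration:

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
toy_X = rng.integers(0, 2, size=(100, 5))   # 100 samples, 5 binary features
toy_y = np.repeat(np.arange(4), 25)         # 4 classes, 25 samples each

# Conventional unpacking order: train features, test features, train labels, test labels
toy_X_train, toy_X_test, toy_y_train, toy_y_test = train_test_split(
    toy_X, toy_y, test_size=0.2, random_state=42, stratify=toy_y
)

print(toy_X_train.shape, toy_X_test.shape)
print(np.bincount(toy_y_test))  # stratify keeps the class balance in the test portion
```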

After further inspection of all the datasets, the most important ones are `dataset` and the medications dataset. These are the two datasets we will use to build the medicine recommendation system.

The next step is to clean these two datasets.


In [19]:
# Train several candidate models and compare their accuracy,
# using a dictionary that maps a display name to each estimator.
models = {
    'SVC': SVC(kernel='linear'),
    'Random Forest': RandomForestClassifier(random_state=42, n_estimators=100),
    'KNeighbors': KNeighborsClassifier(n_neighbors=5),
    'Gradient Boosting': GradientBoostingClassifier(random_state=42, n_estimators=100),
    'MultinomialNB': MultinomialNB(),
    'Logistic Regression': LogisticRegression()
}
for model_name, model in models.items():
    # Train the model
    model.fit(X_train, y_train)
    # Predict on the evaluation split
    predictions = model.predict(X_test)
    # Calculate accuracy
    accuracy = accuracy_score(y_test, predictions)
    # Calculate the confusion matrix
    cm = confusion_matrix(y_test, predictions)
    # Print the results
    print(f"{model_name} accuracy : {accuracy}")
    print(f"{model_name} confusion matrix :")
    print(np.array2string(cm, separator=', '))

SVC accuracy : 1.0
SVC confusion matrix :
[[102,   0,   0, ...,   0,   0,   0],
 [  0,  90,   0, ...,   0,   0,   0],
 [  0,   0,  96, ...,   0,   0,   0],
 ...,
 [  0,   0,   0, ...,  94,   0,   0],
 [  0,   0,   0, ...,   0,  98,   0],
 [  0,   0,   0, ...,   0,   0,  86]]
Random Forest accuracy : 1.0
Random Forest confusion matrix :
[[102,   0,   0, ...,   0,   0,   0],
 [  0,  90,   0, ...,   0,   0,   0],
 [  0,   0,  96, ...,   0,   0,   0],
 ...,
 [  0,   0,   0, ...,  94,   0,   0],
 [  0,   0,   0, ...,   0,  98,   0],
 [  0,   0,   0, ...,   0,   0,  86]]
KNeighbors accuracy : 1.0
KNeighbors confusion matrix :
[[102,   0,   0, ...,   0,   0,   0],
 [  0,  90,   0, ...,   0,   0,   0],
 [  0,   0,  96, ...,   0,   0,   0],
 ...,
 [  0,   0,   0, ...,  94,   0,   0],
 [  0,   0,   0, ...,   0,  98,   0],
 [  0,   0,   0, ...,   0,   0,  86]]
Gradient Boosting accuracy : 0.9801829268292683
Gradient Boosting confusion matrix :
[[102,   0,   0, ...,   0,   0,   0],
 [  0,  90,   0
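
Accuracy of exactly 1.0 across several unrelated model families can be a sign that identical symptom vectors appear on both sides of the split. One way to check this is sketched below on toy frames, with a hypothetical helper `share_of_test_rows_seen_in_train` that is not part of the notebook:

```python
import pandas as pd


def share_of_test_rows_seen_in_train(train: pd.DataFrame, test: pd.DataFrame) -> float:
    """Fraction of test rows whose feature values also occur as a row in train."""
    train_rows = set(map(tuple, train.to_numpy().tolist()))
    seen = [tuple(row) in train_rows for row in test.to_numpy().tolist()]
    return sum(seen) / len(seen)


toy_train = pd.DataFrame([[1, 0], [0, 1]], columns=['itching', 'skin_rash'])
toy_test = pd.DataFrame([[1, 0], [1, 1]], columns=['itching', 'skin_rash'])

# One of the two test rows, [1, 0], is also a training row
print(share_of_test_rows_seen_in_train(toy_train, toy_test))
```

A value close to 1.0 on the real split would mean the evaluation mostly measures memorisation of repeated rows rather than generalisation.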

In [20]:
svc = SVC(kernel='linear')
svc.fit(X_train,y_train)
y_pred = svc.predict(X_test)
acc = accuracy_score(y_test, y_pred)
acc

1.0
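
A single split gives one accuracy number, while k-fold cross-validation gives a spread of scores. A sketch using synthetic data from `make_classification` as a stand-in for the symptom matrix; the data, the fold count, and the `toy_` names are assumptions for illustration:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

# Synthetic 3-class problem standing in for the real features and labels
toy_X, toy_y = make_classification(
    n_samples=200, n_features=10, n_informative=5, n_classes=3, random_state=42
)

# Five stratified folds, shuffled with a fixed seed for repeatability
folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(SVC(kernel='linear'), toy_X, toy_y, cv=folds, scoring='accuracy')

print(f"mean accuracy: {scores.mean():.3f}, std: {scores.std():.3f}")
```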

In [21]:
with open('disease_prediction_model.pkl', 'wb') as file:
    pickle.dump(svc, file)


In [22]:
with open('disease_prediction_model.pkl', 'rb') as file:
    svc = pickle.load(file)
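
The pickle file above contains only the classifier, so the integer-to-disease mapping has to be recreated separately at load time. A sketch of saving the model and its `LabelEncoder` together in one file; the toy data, the `bundle` dictionary layout, and the file name are assumptions:

```python
import os
import pickle
import tempfile

from sklearn.preprocessing import LabelEncoder
from sklearn.svm import SVC

# Toy training data: the first feature alone separates the two classes
toy_X = [[1, 0], [0, 1], [1, 1], [0, 0]]
toy_labels = ['Allergy', 'GERD', 'Allergy', 'GERD']

toy_encoder = LabelEncoder().fit(toy_labels)
toy_model = SVC(kernel='linear').fit(toy_X, toy_encoder.transform(toy_labels))

# Store both objects under one key each so they are always loaded as a pair
bundle = {'model': toy_model, 'encoder': toy_encoder}
bundle_path = os.path.join(tempfile.mkdtemp(), 'toy_bundle.pkl')
with open(bundle_path, 'wb') as file:
    pickle.dump(bundle, file)

with open(bundle_path, 'rb') as file:
    restored = pickle.load(file)

# Predict an integer code, then translate it back to the disease name
code = restored['model'].predict([[1, 0]])
predicted_name = restored['encoder'].inverse_transform(code)[0]
print(predicted_name)
```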

In [23]:
# Reshape one sample into a 2D array of shape (1, n_features), as predict() expects
X_test.iloc[0].values.reshape(1,-1)

array([[1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1,
        0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
        0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]],
      dtype=int64)

In [24]:
# Predict on a single reshaped sample and compare the result with its true label

# Test 1:
print('Model Predictions :',svc.predict(X_test.iloc[20].values.reshape(1,-1)))
print('Actual Labels :', y_test[20])

Model Predictions : [16]
Actual Labels : 16


In [25]:
def helper(dis):
    """Return the description, precautions and medications for the disease `dis`."""
    # Look up each table by the disease name passed in as `dis`
    desc = descriptions_df[descriptions_df['Disease'] == dis]['Description']
    desc = " ".join(desc)

    pre = precautions_df[precautions_df['Disease'] == dis][['Precaution_1', 'Precaution_2', 'Precaution_3', 'Precaution_4']]
    pre = [row for row in pre.values]

    med = medications_df[medications_df['Disease'] == dis]['Medication']
    med = [m for m in med.values]

    return desc, pre, med

symptoms_dict = {'itching': 0, 'skin_rash': 1, 'nodal_skin_eruptions': 2, 'continuous_sneezing': 3, 'shivering': 4, 'chills': 5, 'joint_pain': 6, 'stomach_pain': 7, 'acidity': 8, 'ulcers_on_tongue': 9, 'muscle_wasting': 10, 'vomiting': 11, 'burning_micturition': 12, 'spotting_ urination': 13, 'fatigue': 14, 'weight_gain': 15, 'anxiety': 16, 'cold_hands_and_feets': 17, 'mood_swings': 18, 'weight_loss': 19, 'restlessness': 20, 'lethargy': 21, 'patches_in_throat': 22, 'irregular_sugar_level': 23, 'cough': 24, 'high_fever': 25, 'sunken_eyes': 26, 'breathlessness': 27, 'sweating': 28, 'dehydration': 29, 'indigestion': 30, 'headache': 31, 'yellowish_skin': 32, 'dark_urine': 33, 'nausea': 34, 'loss_of_appetite': 35, 'pain_behind_the_eyes': 36, 'back_pain': 37, 'constipation': 38, 'abdominal_pain': 39, 'diarrhoea': 40, 'mild_fever': 41, 'yellow_urine': 42, 'yellowing_of_eyes': 43, 'acute_liver_failure': 44, 'fluid_overload': 45, 'swelling_of_stomach': 46, 'swelled_lymph_nodes': 47, 'malaise': 48, 'blurred_and_distorted_vision': 49, 'phlegm': 50, 'throat_irritation': 51, 'redness_of_eyes': 52, 'sinus_pressure': 53, 'runny_nose': 54, 'congestion': 55, 'chest_pain': 56, 'weakness_in_limbs': 57, 'fast_heart_rate': 58, 'pain_during_bowel_movements': 59, 'pain_in_anal_region': 60, 'bloody_stool': 61, 'irritation_in_anus': 62, 'neck_pain': 63, 'dizziness': 64, 'cramps': 65, 'bruising': 66, 'obesity': 67, 'swollen_legs': 68, 'swollen_blood_vessels': 69, 'puffy_face_and_eyes': 70, 'enlarged_thyroid': 71, 'brittle_nails': 72, 'swollen_extremeties': 73, 'excessive_hunger': 74, 'extra_marital_contacts': 75, 'drying_and_tingling_lips': 76, 'slurred_speech': 77, 'knee_pain': 78, 'hip_joint_pain': 79, 'muscle_weakness': 80, 'stiff_neck': 81, 'swelling_joints': 82, 'movement_stiffness': 83, 'spinning_movements': 84, 'loss_of_balance': 85, 'unsteadiness': 86, 'weakness_of_one_body_side': 87, 'loss_of_smell': 88, 'bladder_discomfort': 89, 'foul_smell_of urine': 90, 
'continuous_feel_of_urine': 91, 'passage_of_gases': 92, 'internal_itching': 93, 'toxic_look_(typhos)': 94, 'depression': 95, 'irritability': 96, 'muscle_pain': 97, 'altered_sensorium': 98, 'red_spots_over_body': 99, 'belly_pain': 100, 'abnormal_menstruation': 101, 'dischromic _patches': 102, 'watering_from_eyes': 103, 'increased_appetite': 104, 'polyuria': 105, 'family_history': 106, 'mucoid_sputum': 107, 'rusty_sputum': 108, 'lack_of_concentration': 109, 'visual_disturbances': 110, 'receiving_blood_transfusion': 111, 'receiving_unsterile_injections': 112, 'coma': 113, 'stomach_bleeding': 114, 'distention_of_abdomen': 115, 'history_of_alcohol_consumption': 116, 'fluid_overload.1': 117, 'blood_in_sputum': 118, 'prominent_veins_on_calf': 119, 'palpitations': 120, 'painful_walking': 121, 'pus_filled_pimples': 122, 'blackheads': 123, 'scurring': 124, 'skin_peeling': 125, 'silver_like_dusting': 126, 'small_dents_in_nails': 127, 'inflammatory_nails': 128, 'blister': 129, 'red_sore_around_nose': 130, 'yellow_crust_ooze': 131}
diseases_list = {15: 'Fungal infection', 4: 'Allergy', 16: 'GERD', 9: 'Chronic cholestasis', 14: 'Drug Reaction', 33: 'Peptic ulcer diseae', 1: 'AIDS', 12: 'Diabetes ', 17: 'Gastroenteritis', 6: 'Bronchial Asthma', 23: 'Hypertension ', 30: 'Migraine', 7: 'Cervical spondylosis', 32: 'Paralysis (brain hemorrhage)', 28: 'Jaundice', 29: 'Malaria', 8: 'Chicken pox', 11: 'Dengue', 37: 'Typhoid', 40: 'hepatitis A', 19: 'Hepatitis B', 20: 'Hepatitis C', 21: 'Hepatitis D', 22: 'Hepatitis E', 3: 'Alcoholic hepatitis', 36: 'Tuberculosis', 10: 'Common Cold', 34: 'Pneumonia', 13: 'Dimorphic hemmorhoids(piles)', 18: 'Heart attack', 39: 'Varicose veins', 26: 'Hypothyroidism', 24: 'Hyperthyroidism', 25: 'Hypoglycemia', 31: 'Osteoarthristis', 5: 'Arthritis', 0: '(vertigo) Paroymsal  Positional Vertigo', 2: 'Acne', 38: 'Urinary tract infection', 35: 'Psoriasis', 27: 'Impetigo'}

# Model Prediction function
def given_predicted_value(patient_symptoms):
    # Check for invalid symptoms
    invalid_symptoms = [symptom for symptom in patient_symptoms if symptom not in symptoms_dict]
    if invalid_symptoms:
        raise ValueError(f"Invalid symptom(s) entered: {', '.join(invalid_symptoms)}. Please enter valid symptoms.")

    # Create the input vector for the model
    input_vector = np.zeros(len(symptoms_dict))
    for item in patient_symptoms:
        input_vector[symptoms_dict[item]] = 1

    # Predict the disease
    return diseases_list[svc.predict([input_vector])[0]]
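
The prediction function above only accepts symptom names that match the dictionary keys exactly, and a few of those keys contain stray spaces (for example `'spotting_ urination'`). A sketch of a more forgiving lookup that normalises both the user input and the keys; `normalize_symptom`, `lookup_symptom`, and the three-entry toy dictionary are hypothetical names for illustration:

```python
import re

# Three keys copied from symptoms_dict, including one that contains a stray space
toy_symptoms_dict = {'itching': 0, 'skin_rash': 1, 'spotting_ urination': 13}


def normalize_symptom(text: str) -> str:
    """Lower-case the text and collapse runs of spaces/underscores into one underscore."""
    return re.sub(r'[\s_]+', '_', text.strip().lower())


# Index keyed on the normalised spelling of every known symptom
normalized_index = {normalize_symptom(name): idx for name, idx in toy_symptoms_dict.items()}


def lookup_symptom(text: str) -> int:
    """Return the feature index for a symptom, tolerating case and spacing differences."""
    key = normalize_symptom(text)
    if key not in normalized_index:
        raise ValueError(f"Unknown symptom: {text!r}")
    return normalized_index[key]


print(lookup_symptom('Skin Rash'))
print(lookup_symptom(' spotting urination '))
```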



In [26]:
try:
    symptoms = input("Enter your symptoms: ")
    user_symptoms = [s.strip() for s in symptoms.split(',')]
    user_symptoms = [symptom.strip("[]' ") for symptom in user_symptoms]
    
    predicted_disease = given_predicted_value(user_symptoms)
    
    desc, pre, med = helper(predicted_disease)
    
    print("=================predicted disease============")
    print(predicted_disease)
    print("=================description==================")
    print(desc)
    print("=================precautions==================")
    for i, p in enumerate(pre[0], 1):
        print(f"{i}: {p}")
    
    print("=================medications==================")
    for i, m in enumerate(med, 1):
        print(f"{i}: {m}")
except ValueError as e:
    print(e)

Enter your symptoms:  itching


Fungal infection
Fungal infection is a common skin condition caused by fungi.
1: bath twice
2: use detol or neem in bathing water
3: keep infected area dry
4: use clean cloths
1: ['Antifungal Cream', 'Fluconazole', 'Terbinafine', 'Clotrimazole', 'Ketoconazole']
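
In the output above the medications print as a single numbered item because each `Medication` cell is a string that merely looks like a Python list. A sketch of parsing such a string with `ast.literal_eval` so that each medication can be listed on its own line (the sample string is copied from the output above):

```python
import ast

raw_cell = "['Antifungal Cream', 'Fluconazole', 'Terbinafine', 'Clotrimazole', 'Ketoconazole']"

# literal_eval safely evaluates Python literals (lists, strings, numbers) and nothing else
medications = ast.literal_eval(raw_cell)

for i, name in enumerate(medications, 1):
    print(f"{i}: {name}")
```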


#### Data Cleaning and Preprocessing

In [27]:
# Count the distinct diseases in this dataset
len(dataset['Disease'].unique())

41

The `dataset` DataFrame covers 41 distinct diseases.

In [28]:
#Check for missing values
dataset.isna().sum()

Disease          0
Symptom_1        0
Symptom_2        0
Symptom_3        0
Symptom_4      348
Symptom_5     1206
Symptom_6     1986
Symptom_7     2652
Symptom_8     2976
Symptom_9     3228
Symptom_10    3408
Symptom_11    3726
Symptom_12    4176
Symptom_13    4416
Symptom_14    4614
Symptom_15    4680
Symptom_16    4728
Symptom_17    4848
dtype: int64

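We will not drop the rows with missing values: diseases have different numbers of symptoms, so the trailing `Symptom_N` columns are legitimately empty for many rows. Instead, the gaps are filled with the placeholder string `'None'`.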

In [29]:
dataset.fillna('None', inplace=True)
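
Besides missing values, the symptom strings in this table can carry leading spaces: the one-hot column names later in the notebook read `Symptom_1_ acidity` but `Symptom_1_itching`. A sketch of trimming the strings before filling the gaps, on a toy frame with assumed values:

```python
import pandas as pd

toy_dataset = pd.DataFrame({
    'Disease': ['Fungal infection', 'Allergy'],
    'Symptom_1': ['itching', ' continuous_sneezing'],
    'Symptom_2': [' skin_rash', None],
})

symptom_cols = [c for c in toy_dataset.columns if c.startswith('Symptom_')]

cleaned = toy_dataset.copy()
# Strip surrounding whitespace column by column; missing entries stay missing
cleaned[symptom_cols] = cleaned[symptom_cols].apply(lambda col: col.str.strip())
# Then fill the remaining gaps with the same placeholder the notebook uses
cleaned[symptom_cols] = cleaned[symptom_cols].fillna('None')

print(cleaned)
```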

In [30]:
dataset.shape

(4920, 18)

In [31]:
#symptoms_dict = {symptom: idx for idx, symptom in enumerate(X_4.columns)}

In [32]:
print(symptoms_dict)


{'itching': 0, 'skin_rash': 1, 'nodal_skin_eruptions': 2, 'continuous_sneezing': 3, 'shivering': 4, 'chills': 5, 'joint_pain': 6, 'stomach_pain': 7, 'acidity': 8, 'ulcers_on_tongue': 9, 'muscle_wasting': 10, 'vomiting': 11, 'burning_micturition': 12, 'spotting_ urination': 13, 'fatigue': 14, 'weight_gain': 15, 'anxiety': 16, 'cold_hands_and_feets': 17, 'mood_swings': 18, 'weight_loss': 19, 'restlessness': 20, 'lethargy': 21, 'patches_in_throat': 22, 'irregular_sugar_level': 23, 'cough': 24, 'high_fever': 25, 'sunken_eyes': 26, 'breathlessness': 27, 'sweating': 28, 'dehydration': 29, 'indigestion': 30, 'headache': 31, 'yellowish_skin': 32, 'dark_urine': 33, 'nausea': 34, 'loss_of_appetite': 35, 'pain_behind_the_eyes': 36, 'back_pain': 37, 'constipation': 38, 'abdominal_pain': 39, 'diarrhoea': 40, 'mild_fever': 41, 'yellow_urine': 42, 'yellowing_of_eyes': 43, 'acute_liver_failure': 44, 'fluid_overload': 45, 'swelling_of_stomach': 46, 'swelled_lymph_nodes': 47, 'malaise': 48, 'blurred_and

In [33]:
print(len(symptoms_dict))

132
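
A hard-coded `symptoms_dict` has to stay in step with the training columns by hand. A sketch of deriving the same kind of mapping from a frame's column order; the toy frame is illustrative, and using the notebook's feature frame `X` in its place is an assumption, since the commented-out line refers to an `X_4` that is not defined here:

```python
import pandas as pd

# Empty frame with three of the symptom columns, in training order
toy_features = pd.DataFrame(columns=['itching', 'skin_rash', 'nodal_skin_eruptions'])

# Map each column name to its position in the feature vector
toy_symptoms_dict = {symptom: idx for idx, symptom in enumerate(toy_features.columns)}

print(toy_symptoms_dict)
```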


In [35]:
# scikit-learn 1.2+ uses `sparse_output`; the older `sparse` argument was removed in 1.4
encoder = OneHotEncoder(sparse_output=False)
encoded_features = encoder.fit_transform(dataset)

# Create a DataFrame with one-hot encoded features
encoded_df = pd.DataFrame(encoded_features, columns=encoder.get_feature_names_out())

In [36]:
encoded_df

Unnamed: 0,Disease_(vertigo) Paroymsal Positional Vertigo,Disease_AIDS,Disease_Acne,Disease_Alcoholic hepatitis,Disease_Allergy,Disease_Arthritis,Disease_Bronchial Asthma,Disease_Cervical spondylosis,Disease_Chicken pox,Disease_Chronic cholestasis,Disease_Common Cold,Disease_Dengue,Disease_Diabetes,Disease_Dimorphic hemmorhoids(piles),Disease_Drug Reaction,Disease_Fungal infection,Disease_GERD,Disease_Gastroenteritis,Disease_Heart attack,Disease_Hepatitis B,Disease_Hepatitis C,Disease_Hepatitis D,Disease_Hepatitis E,Disease_Hypertension,Disease_Hyperthyroidism,Disease_Hypoglycemia,Disease_Hypothyroidism,Disease_Impetigo,Disease_Jaundice,Disease_Malaria,Disease_Migraine,Disease_Osteoarthristis,Disease_Paralysis (brain hemorrhage),Disease_Peptic ulcer diseae,Disease_Pneumonia,Disease_Psoriasis,Disease_Tuberculosis,Disease_Typhoid,Disease_Urinary tract infection,Disease_Varicose veins,Disease_hepatitis A,Symptom_1_ acidity,Symptom_1_ back_pain,Symptom_1_ bladder_discomfort,Symptom_1_ breathlessness,Symptom_1_ burning_micturition,Symptom_1_ chest_pain,Symptom_1_ chills,Symptom_1_ constipation,Symptom_1_ continuous_sneezing,Symptom_1_ cough,Symptom_1_ cramps,Symptom_1_ fatigue,Symptom_1_ headache,Symptom_1_ high_fever,Symptom_1_ indigestion,Symptom_1_ joint_pain,Symptom_1_ mood_swings,Symptom_1_ muscle_wasting,Symptom_1_ muscle_weakness,Symptom_1_ neck_pain,Symptom_1_ pain_during_bowel_movements,Symptom_1_ patches_in_throat,Symptom_1_ pus_filled_pimples,Symptom_1_ shivering,Symptom_1_ skin_rash,Symptom_1_ stiff_neck,Symptom_1_ stomach_pain,Symptom_1_ sunken_eyes,Symptom_1_ vomiting,Symptom_1_ weakness_in_limbs,Symptom_1_ weight_gain,Symptom_1_ weight_loss,Symptom_1_ yellowish_skin,Symptom_1_itching,Symptom_2_ abdominal_pain,Symptom_2_ acidity,Symptom_2_ anxiety,Symptom_2_ blackheads,Symptom_2_ bladder_discomfort,Symptom_2_ blister,Symptom_2_ breathlessness,Symptom_2_ bruising,Symptom_2_ chest_pain,Symptom_2_ chills,Symptom_2_ cold_hands_and_feets,Symptom_2_ 
cough,Symptom_2_ cramps,Symptom_2_ dehydration,Symptom_2_ dizziness,Symptom_2_ fatigue,Symptom_2_ foul_smell_of urine,Symptom_2_ headache,Symptom_2_ high_fever,Symptom_2_ indigestion,Symptom_2_ joint_pain,Symptom_2_ knee_pain,Symptom_2_ lethargy,Symptom_2_ loss_of_appetite,Symptom_2_ mood_swings,Symptom_2_ nausea,Symptom_2_ neck_pain,Symptom_2_ nodal_skin_eruptions,Symptom_2_ pain_during_bowel_movements,Symptom_2_ pain_in_anal_region,Symptom_2_ patches_in_throat,Symptom_2_ pus_filled_pimples,Symptom_2_ restlessness,Symptom_2_ shivering,Symptom_2_ skin_peeling,Symptom_2_ skin_rash,Symptom_2_ stiff_neck,Symptom_2_ stomach_pain,Symptom_2_ sunken_eyes,Symptom_2_ sweating,Symptom_2_ swelling_joints,Symptom_2_ ulcers_on_tongue,Symptom_2_ vomiting,Symptom_2_ weakness_in_limbs,Symptom_2_ weakness_of_one_body_side,Symptom_2_ weight_gain,Symptom_2_ weight_loss,Symptom_2_ yellowish_skin,Symptom_3_ abdominal_pain,Symptom_3_ altered_sensorium,Symptom_3_ anxiety,Symptom_3_ blackheads,Symptom_3_ blister,Symptom_3_ bloody_stool,Symptom_3_ blurred_and_distorted_vision,Symptom_3_ breathlessness,Symptom_3_ bruising,Symptom_3_ burning_micturition,Symptom_3_ chest_pain,Symptom_3_ chills,Symptom_3_ cold_hands_and_feets,Symptom_3_ continuous_feel_of_urine,Symptom_3_ cough,Symptom_3_ dark_urine,Symptom_3_ dehydration,Symptom_3_ diarrhoea,Symptom_3_ dischromic _patches,Symptom_3_ dizziness,Symptom_3_ extra_marital_contacts,Symptom_3_ fatigue,Symptom_3_ foul_smell_of urine,Symptom_3_ headache,Symptom_3_ high_fever,Symptom_3_ hip_joint_pain,Symptom_3_ joint_pain,Symptom_3_ knee_pain,Symptom_3_ lethargy,Symptom_3_ loss_of_appetite,Symptom_3_ loss_of_balance,Symptom_3_ mood_swings,Symptom_3_ movement_stiffness,Symptom_3_ nausea,Symptom_3_ neck_pain,Symptom_3_ nodal_skin_eruptions,Symptom_3_ obesity,Symptom_3_ pain_in_anal_region,Symptom_3_ red_sore_around_nose,Symptom_3_ restlessness,Symptom_3_ scurring,Symptom_3_ silver_like_dusting,Symptom_3_ skin_peeling,Symptom_3_ 
spinning_movements,Symptom_3_ stomach_pain,Symptom_3_ sweating,Symptom_3_ swelling_joints,Symptom_3_ swelling_of_stomach,Symptom_3_ ulcers_on_tongue,Symptom_3_ vomiting,Symptom_3_ watering_from_eyes,Symptom_3_ weakness_of_one_body_side,Symptom_3_ weight_loss,Symptom_3_ yellowish_skin,Symptom_4_ abdominal_pain,Symptom_4_ altered_sensorium,Symptom_4_ bloody_stool,Symptom_4_ blurred_and_distorted_vision,Symptom_4_ breathlessness,Symptom_4_ burning_micturition,Symptom_4_ chest_pain,Symptom_4_ continuous_feel_of_urine,Symptom_4_ cough,Symptom_4_ dark_urine,Symptom_4_ diarrhoea,Symptom_4_ dischromic _patches,Symptom_4_ distention_of_abdomen,Symptom_4_ dizziness,Symptom_4_ excessive_hunger,Symptom_4_ extra_marital_contacts,Symptom_4_ family_history,Symptom_4_ fatigue,Symptom_4_ headache,Symptom_4_ high_fever,Symptom_4_ hip_joint_pain,Symptom_4_ irregular_sugar_level,Symptom_4_ irritation_in_anus,Symptom_4_ lack_of_concentration,Symptom_4_ lethargy,Symptom_4_ loss_of_appetite,Symptom_4_ loss_of_balance,Symptom_4_ mood_swings,Symptom_4_ movement_stiffness,Symptom_4_ nausea,Symptom_4_ obesity,Symptom_4_ painful_walking,Symptom_4_ passage_of_gases,Symptom_4_ red_sore_around_nose,Symptom_4_ restlessness,Symptom_4_ scurring,Symptom_4_ silver_like_dusting,Symptom_4_ small_dents_in_nails,Symptom_4_ spinning_movements,Symptom_4_ spotting_ urination,Symptom_4_ sweating,Symptom_4_ swelling_joints,Symptom_4_ swelling_of_stomach,Symptom_4_ swollen_legs,Symptom_4_ vomiting,Symptom_4_ watering_from_eyes,Symptom_4_ weight_loss,Symptom_4_ yellow_crust_ooze,Symptom_4_ yellowing_of_eyes,Symptom_4_ yellowish_skin,Symptom_4_None,Symptom_5_ abdominal_pain,Symptom_5_ blurred_and_distorted_vision,Symptom_5_ breathlessness,Symptom_5_ chest_pain,Symptom_5_ cough,Symptom_5_ dark_urine,Symptom_5_ diarrhoea,Symptom_5_ distention_of_abdomen,Symptom_5_ dizziness,Symptom_5_ excessive_hunger,Symptom_5_ family_history,Symptom_5_ fatigue,Symptom_5_ headache,Symptom_5_ high_fever,Symptom_5_ 
history_of_alcohol_consumption,Symptom_5_ inflammatory_nails,Symptom_5_ internal_itching,Symptom_5_ irregular_sugar_level,Symptom_5_ irritation_in_anus,Symptom_5_ lack_of_concentration,Symptom_5_ lethargy,Symptom_5_ loss_of_appetite,Symptom_5_ loss_of_balance,Symptom_5_ mucoid_sputum,Symptom_5_ nausea,Symptom_5_ painful_walking,Symptom_5_ passage_of_gases,Symptom_5_ small_dents_in_nails,Symptom_5_ spotting_ urination,Symptom_5_ stiff_neck,Symptom_5_ sweating,Symptom_5_ swelling_joints,Symptom_5_ swollen_blood_vessels,Symptom_5_ swollen_legs,Symptom_5_ unsteadiness,Symptom_5_ yellow_crust_ooze,Symptom_5_ yellowing_of_eyes,Symptom_5_ yellowish_skin,Symptom_5_None,Symptom_6_ abdominal_pain,Symptom_6_ blurred_and_distorted_vision,Symptom_6_ breathlessness,Symptom_6_ chest_pain,Symptom_6_ constipation,Symptom_6_ dark_urine,Symptom_6_ depression,Symptom_6_ diarrhoea,Symptom_6_ dizziness,Symptom_6_ family_history,Symptom_6_ fast_heart_rate,Symptom_6_ fluid_overload,Symptom_6_ headache,Symptom_6_ high_fever,Symptom_6_ history_of_alcohol_consumption,Symptom_6_ inflammatory_nails,Symptom_6_ internal_itching,Symptom_6_ loss_of_appetite,Symptom_6_ malaise,Symptom_6_ mucoid_sputum,Symptom_6_ nausea,Symptom_6_ obesity,Symptom_6_ painful_walking,Symptom_6_ prominent_veins_on_calf,Symptom_6_ puffy_face_and_eyes,Symptom_6_ stiff_neck,Symptom_6_ sweating,Symptom_6_ swelled_lymph_nodes,Symptom_6_ swollen_blood_vessels,Symptom_6_ unsteadiness,Symptom_6_ yellowing_of_eyes,Symptom_6_ yellowish_skin,Symptom_6_None,Symptom_7_ abdominal_pain,Symptom_7_ blurred_and_distorted_vision,Symptom_7_ breathlessness,Symptom_7_ constipation,Symptom_7_ dark_urine,Symptom_7_ depression,Symptom_7_ diarrhoea,Symptom_7_ enlarged_thyroid,Symptom_7_ excessive_hunger,Symptom_7_ fast_heart_rate,Symptom_7_ fluid_overload,Symptom_7_ headache,Symptom_7_ irritability,Symptom_7_ loss_of_appetite,Symptom_7_ malaise,Symptom_7_ mild_fever,Symptom_7_ muscle_pain,Symptom_7_ nausea,Symptom_7_ obesity,Symptom_7_ 
phlegm,Symptom_7_ prominent_veins_on_calf,Symptom_7_ puffy_face_and_eyes,Symptom_7_ sweating,Symptom_7_ swelled_lymph_nodes,Symptom_7_ yellow_urine,Symptom_7_ yellowing_of_eyes,Symptom_7_None,Symptom_8_ abdominal_pain,Symptom_8_ brittle_nails,Symptom_8_ chest_pain,Symptom_8_ diarrhoea,Symptom_8_ drying_and_tingling_lips,Symptom_8_ enlarged_thyroid,Symptom_8_ excessive_hunger,Symptom_8_ increased_appetite,Symptom_8_ irritability,Symptom_8_ loss_of_appetite,Symptom_8_ malaise,Symptom_8_ mild_fever,Symptom_8_ muscle_pain,Symptom_8_ muscle_weakness,Symptom_8_ nausea,Symptom_8_ phlegm,Symptom_8_ sweating,Symptom_8_ swelled_lymph_nodes,Symptom_8_ visual_disturbances,Symptom_8_ yellow_urine,Symptom_8_ yellowing_of_eyes,Symptom_8_None,Symptom_9_ abdominal_pain,Symptom_9_ brittle_nails,Symptom_9_ chest_pain,Symptom_9_ diarrhoea,Symptom_9_ drying_and_tingling_lips,Symptom_9_ fast_heart_rate,Symptom_9_ increased_appetite,Symptom_9_ irritability,Symptom_9_ loss_of_appetite,Symptom_9_ malaise,Symptom_9_ mild_fever,Symptom_9_ muscle_weakness,Symptom_9_ pain_behind_the_eyes,Symptom_9_ phlegm,Symptom_9_ polyuria,Symptom_9_ slurred_speech,Symptom_9_ swelled_lymph_nodes,Symptom_9_ swollen_extremeties,Symptom_9_ throat_irritation,Symptom_9_ toxic_look_(typhos),Symptom_9_ visual_disturbances,Symptom_9_ yellowing_of_eyes,Symptom_9_None,Symptom_10_ abnormal_menstruation,Symptom_10_ acute_liver_failure,Symptom_10_ back_pain,Symptom_10_ belly_pain,Symptom_10_ depression,Symptom_10_ fast_heart_rate,Symptom_10_ irritability,Symptom_10_ malaise,Symptom_10_ mild_fever,Symptom_10_ muscle_pain,Symptom_10_ pain_behind_the_eyes,Symptom_10_ polyuria,Symptom_10_ receiving_blood_transfusion,Symptom_10_ red_spots_over_body,Symptom_10_ redness_of_eyes,Symptom_10_ rusty_sputum,Symptom_10_ slurred_speech,Symptom_10_ swollen_extremeties,Symptom_10_ throat_irritation,Symptom_10_ toxic_look_(typhos),Symptom_10_ yellowing_of_eyes,Symptom_10_None,Symptom_11_ abnormal_menstruation,Symptom_11_ 
acute_liver_failure,Symptom_11_ back_pain,Symptom_11_ belly_pain,Symptom_11_ coma,Symptom_11_ depression,Symptom_11_ irritability,Symptom_11_ malaise,Symptom_11_ muscle_pain,Symptom_11_ palpitations,Symptom_11_ receiving_blood_transfusion,Symptom_11_ receiving_unsterile_injections,Symptom_11_ red_spots_over_body,Symptom_11_ redness_of_eyes,Symptom_11_ rusty_sputum,Symptom_11_ sinus_pressure,Symptom_11_ swelled_lymph_nodes,Symptom_11_ yellowing_of_eyes,Symptom_11_None,Symptom_12_ abnormal_menstruation,Symptom_12_ coma,Symptom_12_ irritability,Symptom_12_ malaise,Symptom_12_ muscle_pain,Symptom_12_ palpitations,Symptom_12_ receiving_unsterile_injections,Symptom_12_ runny_nose,Symptom_12_ sinus_pressure,Symptom_12_ stomach_bleeding,Symptom_12_ swelled_lymph_nodes,Symptom_12_None,Symptom_13_ abnormal_menstruation,Symptom_13_ congestion,Symptom_13_ malaise,Symptom_13_ muscle_pain,Symptom_13_ phlegm,Symptom_13_ red_spots_over_body,Symptom_13_ runny_nose,Symptom_13_ stomach_bleeding,Symptom_13_None,Symptom_14_ chest_pain,Symptom_14_ congestion,Symptom_14_ phlegm,Symptom_14_ red_spots_over_body,Symptom_14_None,Symptom_15_ blood_in_sputum,Symptom_15_ chest_pain,Symptom_15_ loss_of_smell,Symptom_15_None,Symptom_16_ blood_in_sputum,Symptom_16_ loss_of_smell,Symptom_16_ muscle_pain,Symptom_16_None,Symptom_17_ muscle_pain,Symptom_17_None
0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,1.0
1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,1.0
2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,1.0
3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,1.0
4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,1.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4915,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,1.0
4916,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,1.0
4917,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,1.0
4918,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,1.0


In [67]:
# Take the 'Disease' column from the original dataset
disease_column = dataset['Disease']

# Add the Disease column back to the encoded DataFrame
encoded_df['Disease'] = disease_column

# Melt the DataFrame to create a long format
melted_df = encoded_df.melt(id_vars=['Disease'], var_name='Symptom', value_name='Presence')

# Keep only rows where the symptom is present (Presence == 1)
melted_df = melted_df[melted_df['Presence'] == 1].drop(columns='Presence')

# Reset the index for better readability (optional)
melted_df.reset_index(drop=True, inplace=True)

# Display the resulting DataFrame
print(melted_df.head())


                                   Disease  \
0  (vertigo) Paroymsal  Positional Vertigo   
1  (vertigo) Paroymsal  Positional Vertigo   
2  (vertigo) Paroymsal  Positional Vertigo   
3  (vertigo) Paroymsal  Positional Vertigo   
4  (vertigo) Paroymsal  Positional Vertigo   

                                           Symptom  
0  Disease_(vertigo) Paroymsal  Positional Vertigo  
1  Disease_(vertigo) Paroymsal  Positional Vertigo  
2  Disease_(vertigo) Paroymsal  Positional Vertigo  
3  Disease_(vertigo) Paroymsal  Positional Vertigo  
4  Disease_(vertigo) Paroymsal  Positional Vertigo  
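The `Symptom` column in the output above holds `Disease_...` labels rather than symptom names. This suggests that `encoded_df` also contains one-hot `Disease_` columns, which get melted along with the symptom columns. A minimal sketch of one way to avoid this, on a hypothetical toy frame (the column names and values below are illustrative assumptions, not the real data), is to melt only the columns that start with `Symptom_`:

```python
import pandas as pd

# Hypothetical stand-in for encoded_df: one-hot columns for both diseases and symptoms
toy_encoded = pd.DataFrame({
    "Disease_Acne": [1.0, 0.0],
    "Disease_AIDS": [0.0, 1.0],
    "Symptom_1_ skin_rash": [1.0, 0.0],
    "Symptom_1_ muscle_wasting": [0.0, 1.0],
})
toy_encoded["Disease"] = ["Acne", "AIDS"]

# Melt only the symptom indicator columns, leaving the Disease_ columns out
symptom_cols = [c for c in toy_encoded.columns if c.startswith("Symptom_")]
toy_melted = toy_encoded.melt(
    id_vars=["Disease"],
    value_vars=symptom_cols,
    var_name="Symptom",
    value_name="Presence",
)

# Keep only the (disease, symptom) pairs where the symptom is present
toy_melted = (
    toy_melted[toy_melted["Presence"] == 1]
    .drop(columns="Presence")
    .reset_index(drop=True)
)
print(toy_melted)
```

With this filter, every value in `Symptom` is a symptom indicator name, so the column can be used as a feature source rather than repeating the target.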


In [64]:
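# One-hot encode the symptom columns as a dense array
# (scikit-learn 1.2 renamed the `sparse` argument to `sparse_output`)
encoder = OneHotEncoder(sparse_output=False)
encoded_features = encoder.fit_transform(symptoms_df)

# Create a DataFrame with one-hot encoded features
encoded_df_2 = pd.DataFrame(encoded_features, columns=encoder.get_feature_names_out())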

In [66]:
# Melt the DataFrame to convert it from wide to long format
melted = encoded_df.melt(id_vars='Disease', value_vars=['Symptom_1', 'Symptom_2', 'Symptom_3', 'Symptom_4', 'Symptom_5','Symptom_6', 'Symptom_7', 'Symptom_8', 'Symptom_9', 'Symptom_10','Symptom_11', 'Symptom_12', 'Symptom_13', 'Symptom_14', 'Symptom_15', 'Symptom_16', 'Symptom_17'],
                   var_name='Symptom_Type', value_name='Symptom').dropna()

# One-hot encode the symptoms
one_hot = pd.get_dummies(melted['Symptom'])

# Combine the one-hot encoded symptoms with the diseases
encoded_data = pd.concat([melted['Disease'], one_hot], axis=1)

# Aggregate by Disease
final_data = encoded_data.groupby('Disease').max().reset_index()

print(final_data)

KeyError: "The following 'value_vars' are not present in the DataFrame: ['Symptom_1', 'Symptom_10', 'Symptom_11', 'Symptom_12', 'Symptom_13', 'Symptom_14', 'Symptom_15', 'Symptom_16', 'Symptom_17', 'Symptom_2', 'Symptom_3', 'Symptom_4', 'Symptom_5', 'Symptom_6', 'Symptom_7', 'Symptom_8', 'Symptom_9']"

In [None]:
# Melt the DataFrame to convert it from wide to long format
melted = dataset.melt(id_vars='Disease', value_vars=['Symptom_1', 'Symptom_2', 'Symptom_3', 'Symptom_4', 'Symptom_5','Symptom_6', 'Symptom_7', 'Symptom_8', 'Symptom_9', 'Symptom_10','Symptom_11', 'Symptom_12', 'Symptom_13', 'Symptom_14', 'Symptom_15', 'Symptom_16', 'Symptom_17'],
                   var_name='Symptom_Type', value_name='Symptom').dropna()

# One-hot encode the symptoms
one_hot = pd.get_dummies(melted['Symptom'])

# Combine the one-hot encoded symptoms with the diseases
encoded_data = pd.concat([melted['Disease'], one_hot], axis=1)

# Aggregate by Disease
final_data = encoded_data.groupby('Disease').max().reset_index()

print(final_data)
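The melt, one-hot encode, and aggregate steps above are easier to follow on a small example. The sketch below uses a hypothetical three-row frame with only two symptom columns (the values are made up for illustration) and applies the same sequence of calls:

```python
import pandas as pd

# Hypothetical toy data: two records for Acne, one for Allergy
toy = pd.DataFrame({
    "Disease": ["Acne", "Acne", "Allergy"],
    "Symptom_1": [" skin_rash", " skin_rash", " chills"],
    "Symptom_2": [" blackheads", None, " shivering"],
})

# Wide to long: one row per (record, symptom slot); empty slots are dropped
long_form = toy.melt(
    id_vars="Disease",
    value_vars=["Symptom_1", "Symptom_2"],
    var_name="Symptom_Type",
    value_name="Symptom",
).dropna()

# One indicator column per distinct symptom
indicators = pd.get_dummies(long_form["Symptom"])

# A disease gets a 1 for a symptom if any of its records reported that symptom
per_disease = (
    pd.concat([long_form["Disease"], indicators], axis=1)
    .groupby("Disease")
    .max()
    .astype(int)
    .reset_index()
)
print(per_disease)
```

Note that `groupby('Disease').max()` collapses all records of a disease into a single row, so the result has exactly one row per disease.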

In [56]:
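# final_data is already a DataFrame, so work on a copy of it
df = final_data.copy()

# Convert the True/False symptom indicators to 1/0
indicator_cols = df.columns.drop('Disease')
df[indicator_cols] = df[indicator_cols].astype(int)

# Display the DataFrame
print(df)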

                                    Disease   abdominal_pain  \
0   (vertigo) Paroymsal  Positional Vertigo                0   
1                                      AIDS                0   
2                                      Acne                0   
3                       Alcoholic hepatitis                1   
4                                   Allergy                0   
5                                 Arthritis                0   
6                          Bronchial Asthma                0   
7                      Cervical spondylosis                0   
8                               Chicken pox                0   
9                       Chronic cholestasis                1   
10                              Common Cold                0   
11                                   Dengue                0   
12                                Diabetes                 0   
13             Dimorphic hemmorhoids(piles)                0   
14                            Drug React

In [57]:
df['Disease'].unique()

array(['(vertigo) Paroymsal  Positional Vertigo', 'AIDS', 'Acne',
       'Alcoholic hepatitis', 'Allergy', 'Arthritis', 'Bronchial Asthma',
       'Cervical spondylosis', 'Chicken pox', 'Chronic cholestasis',
       'Common Cold', 'Dengue', 'Diabetes ',
       'Dimorphic hemmorhoids(piles)', 'Drug Reaction',
       'Fungal infection', 'GERD', 'Gastroenteritis', 'Heart attack',
       'Hepatitis B', 'Hepatitis C', 'Hepatitis D', 'Hepatitis E',
       'Hypertension ', 'Hyperthyroidism', 'Hypoglycemia',
       'Hypothyroidism', 'Impetigo', 'Jaundice', 'Malaria', 'Migraine',
       'Osteoarthristis', 'Paralysis (brain hemorrhage)',
       'Peptic ulcer diseae', 'Pneumonia', 'Psoriasis', 'Tuberculosis',
       'Typhoid', 'Urinary tract infection', 'Varicose veins',
       'hepatitis A'], dtype=object)

In [58]:
len(df['Disease'].unique())

41

In [68]:
X = melted_df.drop('Disease' ,axis=1)
y = melted_df['Disease']

In [69]:
X.head()

Unnamed: 0,Symptom
0,Disease_(vertigo) Paroymsal Positional Vertigo
1,Disease_(vertigo) Paroymsal Positional Vertigo
2,Disease_(vertigo) Paroymsal Positional Vertigo
3,Disease_(vertigo) Paroymsal Positional Vertigo
4,Disease_(vertigo) Paroymsal Positional Vertigo


In [70]:
y.head()

0    (vertigo) Paroymsal  Positional Vertigo
1    (vertigo) Paroymsal  Positional Vertigo
2    (vertigo) Paroymsal  Positional Vertigo
3    (vertigo) Paroymsal  Positional Vertigo
4    (vertigo) Paroymsal  Positional Vertigo
Name: Disease, dtype: object

In [71]:
X_test.shape,X_train.shape,y_test.shape,y_train.shape

((32, 132), (9, 132), (32,), (9,))

In [63]:
# Train several candidate classifiers and compare their accuracy.
# The models are stored in a dictionary so they can be evaluated in one loop.
models = {
    'SVC': SVC(kernel='linear'),
    'Random Forest': RandomForestClassifier(random_state=42, n_estimators=100),
    'KNeighbors': KNeighborsClassifier(n_neighbors=5),
    'Gradient Boosting': GradientBoostingClassifier(random_state=42, n_estimators=100),
    'MultinomialNB': MultinomialNB(),
    'Logistic Regression': LogisticRegression()
}
for model_name, model in models.items():
    # Train the model
    model.fit(X_train, y_train)
    # Predict on the test set
    predictions = model.predict(X_test)
    # Calculate accuracy
    accuracy = accuracy_score(y_test, predictions)
    # Calculate the confusion matrix
    cm = confusion_matrix(y_test, predictions)
    # Print results
    print(f"{model_name} accuracy : {accuracy}")
    print(f"{model_name} confusion matrix :")
    print(np.array2string(cm, separator=', '))

SVC accuracy : 0.0
SVC confusion matrix :
[[0, 0, 0, ..., 0, 0, 0],
 [0, 0, 0, ..., 0, 0, 0],
 [0, 0, 0, ..., 0, 0, 0],
 ...,
 [0, 0, 0, ..., 0, 0, 0],
 [0, 0, 0, ..., 0, 0, 0],
 [0, 0, 0, ..., 0, 0, 0]]
Random Forest accuracy : 0.0
Random Forest confusion matrix :
[[0, 0, 0, ..., 0, 0, 0],
 [0, 0, 0, ..., 0, 0, 0],
 [0, 0, 0, ..., 0, 0, 0],
 ...,
 [0, 0, 0, ..., 0, 0, 0],
 [0, 0, 0, ..., 0, 0, 0],
 [0, 0, 0, ..., 0, 0, 0]]
KNeighbors accuracy : 0.0
KNeighbors confusion matrix :
[[0, 0, 0, ..., 0, 0, 0],
 [0, 0, 0, ..., 0, 0, 0],
 [0, 0, 0, ..., 0, 0, 0],
 ...,
 [0, 0, 0, ..., 0, 0, 0],
 [0, 0, 0, ..., 0, 0, 0],
 [0, 0, 0, ..., 0, 0, 0]]
Gradient Boosting accuracy : 0.0
Gradient Boosting confusion matrix :
[[0, 0, 0, ..., 0, 0, 0],
 [0, 0, 0, ..., 0, 0, 0],
 [0, 0, 0, ..., 0, 0, 0],
 ...,
 [0, 0, 0, ..., 0, 0, 0],
 [0, 0, 0, ..., 0, 0, 0],
 [0, 0, 0, ..., 0, 0, 0]]
MultinomialNB accuracy : 0.0
MultinomialNB confusion matrix :
[[0, 0, 0, ..., 0, 0, 0],
 [0, 0, 0, ..., 0, 0, 0],
 [0, 0, 
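An accuracy of 0.0 for every model is consistent with the shapes reported above: the feature table has 41 rows, one per disease, so any disease placed in the test split has no example in the training split and cannot be predicted. A minimal sketch of the alternative, assuming a table with several records per disease (the toy records below are made up for illustration), is to keep one row per record and use a stratified split so that every disease appears on both sides:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical toy records: several rows per disease, binary symptom indicators
records = pd.DataFrame({
    "Disease":    ["Acne"] * 5 + ["Allergy"] * 5,
    "skin_rash":  [1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    "blackheads": [1, 0, 1, 1, 1, 0, 0, 0, 0, 0],
    "chills":     [0, 0, 0, 0, 0, 1, 1, 0, 1, 1],
    "shivering":  [0, 0, 0, 0, 0, 1, 0, 1, 1, 1],
})
X_toy = records.drop(columns="Disease")
y_toy = records["Disease"]

# stratify=y_toy keeps the class proportions equal in both splits
X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.4, stratify=y_toy, random_state=42
)

clf = RandomForestClassifier(n_estimators=50, random_state=42).fit(X_tr, y_tr)
toy_accuracy = accuracy_score(y_te, clf.predict(X_te))
print(sorted(y_te.unique()), toy_accuracy)
```

The same idea would apply to the project data if the per-record symptom table, rather than the per-disease aggregate, were used as the model input.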

In [34]:
medications.head()

NameError: name 'medications' is not defined

In [50]:
le = LabelEncoder()
le.fit(y)
Y = le.transform(y)

In [51]:
Y

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
       34, 35, 36, 37, 38, 39, 40])
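Because `Y` holds integer codes, a model trained on it predicts integers too. A short sketch of mapping codes back to disease names with `LabelEncoder.inverse_transform`, using a hypothetical three-label example:

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical labels; the real notebook fits the encoder on the Disease column
toy_labels = ["Malaria", "Acne", "Malaria"]

le_toy = LabelEncoder()
codes = le_toy.fit_transform(toy_labels)

# classes_ is sorted, so each code is the position of the label in that array
names = le_toy.inverse_transform(codes)
print(list(le_toy.classes_), list(codes), list(names))
```

Keeping the fitted encoder around is what makes it possible to show a disease name, rather than a number, to the user.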

In [52]:
# train_test_split returns (X_train, X_test, y_train, y_test), in that order
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2, random_state=42)

In [None]:
#Check for missing values
medications.isna().sum()

There are no missing values in this small dataset.

In [None]:
# Count duplicate rows
medications.duplicated().sum()

This small dataset also appears to have no duplicate rows.

In [None]:
medications.shape


The medication dataset contains two columns: one holds the disease (prognosis) and the other holds the medications for that disease. It contains 41 rows, one for each of the 41 diseases with its respective medications.

Now that both datasets have been cleaned, we combine them so they are easier to work with.

In [None]:
combined_dataset = pd.merge(dataset, medications, on='Disease', how='left')

combined_dataset
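The disease names listed earlier include entries with trailing spaces, such as `'Diabetes '` and `'Hypertension '`. If the medications file spells those names differently, a left merge silently leaves their medication empty. The sketch below, on hypothetical toy frames with placeholder medication names, strips whitespace from the key and uses the `validate` and `indicator` options of `pd.merge` to check the result:

```python
import pandas as pd

# Hypothetical toy frames; medication names are placeholders
toy_symptoms = pd.DataFrame({
    "Disease": ["Diabetes ", "Acne"],
    "Symptom_1": [" fatigue", " skin_rash"],
})
toy_medications = pd.DataFrame({
    "Disease": ["Diabetes", "Acne"],
    "Medication": ["drug_a", "drug_b"],
})

# Normalise the join key on both sides before merging
for frame in (toy_symptoms, toy_medications):
    frame["Disease"] = frame["Disease"].str.strip()

toy_combined = pd.merge(
    toy_symptoms,
    toy_medications,
    on="Disease",
    how="left",
    validate="many_to_one",  # raises if a disease appears twice in toy_medications
    indicator=True,          # adds a _merge column showing where each row matched
)

# Rows marked left_only found no medication entry
unmatched = toy_combined[toy_combined["_merge"] == "left_only"]
print(len(unmatched))
```

Whether the real files need this cleanup is an assumption; checking the `_merge` column is a quick way to find out.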

## Data Exploration


## EDA

Before starting EDA, we one-hot encode the dataset. This makes it easier both to run machine learning algorithms on the data and to plot it.

In [None]:

# Combine the 17 symptom columns ('Symptom_1' to 'Symptom_17') into one list per row
symptom_cols = [f'Symptom_{i}' for i in range(1, 18)]
combined_dataset['Symptoms'] = combined_dataset[symptom_cols].values.tolist()

# Remove missing entries (NaN, empty strings, or the literal 'None') from each list.
# NaN is truthy in Python, so it has to be tested explicitly with pd.notna.
combined_dataset['Symptoms'] = combined_dataset['Symptoms'].apply(
    lambda x: [i for i in x if pd.notna(i) and i and i != 'None'])

# Explode the 'Symptoms' column so that each symptom gets its own row
df_exploded = combined_dataset.explode('Symptoms')

# One-hot encode the 'Symptoms' column
symptoms_onehot = pd.get_dummies(df_exploded['Symptoms'], prefix='Symptom')

# Concatenate the one-hot encoded columns back to the original dataframe
df_encoded = pd.concat([df_exploded, symptoms_onehot], axis=1)

# Drop the original 'Symptoms' column as it's now encoded
df_encoded = df_encoded.drop(columns=['Symptoms'])

# Store the indicators as integers (1/0) to ensure binary format
df_encoded[symptoms_onehot.columns] = df_encoded[symptoms_onehot.columns].astype(int)

# Final DataFrame
df_final = df_encoded

# Display the first few rows of the final DataFrame
df_final.head()
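
Because `explode` creates one row per symptom, each row of the encoded frame has a single symptom column set to 1. If one row per original record is preferred, the one-hot rows can be collapsed back by grouping on the original index. This is a minimal sketch on toy data; the frame `toy` and its values are illustrative assumptions.

```python
import pandas as pd

# Toy frame with a list of symptoms per record (hypothetical values)
toy = pd.DataFrame({
    "Disease": ["Malaria", "Acne"],
    "Symptoms": [["chills", "vomiting"], ["skin_rash"]],
})

# explode keeps the original index, so rows from the same record share an index label
exploded = toy.explode("Symptoms")
onehot = pd.get_dummies(exploded["Symptoms"], prefix="Symptom").astype(int)

# Taking the max per original index gives one multi-hot row per record
multi_hot = onehot.groupby(level=0).max()
toy_encoded = toy[["Disease"]].join(multi_hot)
print(toy_encoded)
```

With this layout, a record with several symptoms is a single row with several 1s rather than several rows with a single 1 each.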


In [None]:
df_final.shape

The next step is to turn the medication column into labels using multi-label encoding.

In [None]:

#Split the 'Medication' column into multiple columns
df_medications = df_final['Medication'].str.get_dummies(sep=',')

#Concatenate the new medication columns with the original dataframe
df_final2 = pd.concat([df_final, df_medications], axis=1)

#Drop the original 'Medication' column if no longer needed
df_final2.drop('Medication', axis=1, inplace=True)


df_final2.head()


In [None]:
df_final2.shape
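
Splitting on commas works on the raw text of each cell. If the medication values are stored as stringified Python lists (an assumption about the data format), the resulting column names keep the brackets, quotes, and leading spaces. One alternative sketch parses each cell into a real list first and then binarizes it with scikit-learn's `MultiLabelBinarizer`. The toy values below are made up.

```python
import ast

import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer

# Toy medication column stored as stringified lists (hypothetical values)
toy_med = pd.Series([
    "['Antiviral drugs', 'IV fluids']",
    "['Antibiotics']",
    "['Antiviral drugs']",
])

# Turn each string into a real Python list, e.g. "['a', 'b']" -> ['a', 'b']
parsed = toy_med.apply(ast.literal_eval)

# One clean 0/1 column per distinct medication
mlb = MultiLabelBinarizer()
med_labels = pd.DataFrame(
    mlb.fit_transform(parsed), columns=mlb.classes_, index=toy_med.index
)
print(med_labels)
```

The fitted `mlb` can later map predicted 0/1 rows back to medication names with `mlb.inverse_transform`.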

In [None]:

# Based on the available columns, adjust this list
columns_of_interest = [
    'Symptom_itching',
    'Symptom_ weight_loss'
    # Add other valid symptoms based on the output of df_final.columns
]

# Filter out columns that are not in the DataFrame
valid_columns = [col for col in columns_of_interest if col in df_final.columns]

if valid_columns:
    # Select only the valid columns
    df_selected = df_final[valid_columns]

    # Calculate the frequency of each symptom by summing the columns
    symptom_frequency = df_selected.sum()

    # Plot the frequency of symptoms
    plt.figure(figsize=(12, 8))
    symptom_frequency.plot(kind='bar', color='skyblue')
    plt.title('Frequency of Selected Symptoms')
    plt.xlabel('Symptoms')
    plt.ylabel('Frequency')
    plt.xticks(rotation=45, ha='right')
    plt.tight_layout()
    plt.show()
else:
    print("No valid columns found for plotting.")


In [None]:

# Columns of interest
columns_of_interest = ['Symptom_itching',
    'Symptom_ weakness_in_limbs']

# Create a bar plot for each column of interest
for column in columns_of_interest:
    if column in df_final.columns:
        # Calculate value counts
        value_counts = df_final[column].value_counts()
        
        # Plot the value counts
        plt.figure(figsize=(10, 6))
        value_counts.plot(kind='bar', color='skyblue')
        plt.title(f'Value Counts for {column}')
        plt.xlabel('Categories')
        plt.ylabel('Frequency')
        plt.xticks(rotation=45)
        plt.show()
    else:
        print(f"Column '{column}' does not exist in the DataFrame.")



## Modelling

Now that the data preprocessing is done, we can begin modelling using our df_final2 dataset. The first model we will build is a Random Forest Classifier wrapped in a MultiOutputClassifier.

In [None]:
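# Step 1: Define features (X) and target (y)
# Note: X keeps the 'Disease' column, and each disease maps to exactly one medication entry
# after the merge, so the targets can be read almost directly from the features
X = df_final2.drop(columns=df_medications.columns)  # Symptom and disease columns as features
y = df_medications  # Multi-label encoded medication columns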

#Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
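
Since the exploded frame contains several rows per original record, a random row-level split can place rows from the same record in both the training and the test set. Below is a hedged sketch of a group-aware split with scikit-learn's `GroupShuffleSplit`, using the repeated index as the group label. The toy data and the names `toy_X`, `toy_y`, `X_tr`, and `X_te` are illustrative assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import GroupShuffleSplit

# Toy frame whose index repeats, as it does after DataFrame.explode
toy_X = pd.DataFrame({"f": np.arange(10)}, index=[0, 0, 1, 1, 2, 2, 3, 3, 4, 4])
toy_y = pd.Series(np.arange(10) % 2, index=toy_X.index)

# Split by group so that all rows of one original record stay on the same side
splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=42)
train_idx, test_idx = next(splitter.split(toy_X, toy_y, groups=toy_X.index))

X_tr, X_te = toy_X.iloc[train_idx], toy_X.iloc[test_idx]
y_tr, y_te = toy_y.iloc[train_idx], toy_y.iloc[test_idx]

# The intersection of record ids is empty: no record contributes rows to both sides
print(set(X_tr.index) & set(X_te.index))
```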


In [None]:
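from sklearn.multioutput import MultiOutputClassifier
from sklearn.preprocessing import LabelEncoder

model = MultiOutputClassifier(RandomForestClassifier(random_state=42))

# Work on copies so the encoding does not trigger SettingWithCopyWarning
X_train = X_train.copy()
X_test = X_test.copy()

# Label-encode each categorical (object) column with its own encoder,
# fitted on the training data only
label_encoders = {}
for column in X_train.columns:
    if X_train[column].dtype == 'object':  # Check if the column is categorical
        encoder = LabelEncoder()
        X_train[column] = encoder.fit_transform(X_train[column])
        # transform raises a ValueError if X_test holds a value not seen in training
        X_test[column] = encoder.transform(X_test[column])
        label_encoders[column] = encoder

# Train the model
model.fit(X_train, y_train)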

In [None]:
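from sklearn.metrics import classification_report, hamming_loss

# Make predictions
y_pred = model.predict(X_test)

# Evaluate the model (for multi-label targets, accuracy_score is subset accuracy:
# a row counts as correct only if every medication label matches)
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Hamming Loss:", hamming_loss(y_test, y_pred))
print("Classification Report:\n", classification_report(y_test, y_pred, target_names=y.columns))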

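The results above tell us the following:

Accuracy:
An accuracy of 0.993 indicates that the model predicted the full set of medication labels correctly for nearly all instances in the test set.

Hamming Loss:
A Hamming loss of 0.0 means that, at the reported precision, almost none of the individual labels were predicted incorrectly. Because the accuracy is slightly below 1, a few labels must have been wrong, so the exact value is small but not zero.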

Classification Report:
The report includes precision, recall, and f1-score for each label, along with support (the number of occurrences of each label in the test data).

Precision:
Precision of 1.00 across all classes suggests that when the model predicted a certain medication, it was correct every time.

Recall:
A recall of 1.00 means the model identified all relevant instances for each medication class correctly.

F1-Score:
F1-score, being the harmonic mean of precision and recall, is also 1.00, which means the model performs perfectly in terms of both precision and recall.

Support:
This indicates the number of samples for each class. For instance, 'Antiviral drugs' has 63 instances in the test data, and the model correctly predicted all of them.
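
For multi-label targets, `accuracy_score` reports subset accuracy (a row counts as correct only if every label matches), while `hamming_loss` is the fraction of individual labels that are wrong. A small worked example on made-up labels shows how the two differ; the arrays `y_true_toy` and `y_pred_toy` are illustrative only.

```python
import numpy as np
from sklearn.metrics import accuracy_score, hamming_loss

# Four samples, three medication labels each (hypothetical values)
y_true_toy = np.array([
    [1, 0, 0],
    [0, 1, 1],
    [1, 1, 0],
    [0, 0, 1],
])

# Copy the truth and flip a single label in the last row
y_pred_toy = y_true_toy.copy()
y_pred_toy[3, 0] = 1

subset_acc = accuracy_score(y_true_toy, y_pred_toy)  # 3 of 4 rows fully correct
ham = hamming_loss(y_true_toy, y_pred_toy)           # 1 of 12 labels wrong
print(subset_acc, ham)
```

One wrong label costs a full row in subset accuracy but only one twelfth in Hamming loss, so the two metrics can look quite different on the same predictions.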

In [None]:

# Use the one-hot encoded symptom columns as features
# (the raw 'Symptom_1' to 'Symptom_17' text columns cannot be cast to floats later)
symptom_columns = list(symptoms_onehot.columns)
X2 = df_final2[symptom_columns]  # Features

# Extract the target columns
y_disease = df_final2['Disease']  # Target for disease
medication_columns = list(df_medications.columns)  # All multi-label medication columns
y_medication = df_final2[medication_columns]  # Targets for medications

# Encode the disease names as integers; the medication columns are already 0/1
label_encoder = LabelEncoder()
y_disease_encoded = label_encoder.fit_transform(y_disease)
y_medication_encoded = y_medication.astype('float32')

# Split the data into training and testing sets
X2_train, X2_test, y_disease_train, y_disease_test, y_medication_train, y_medication_test = train_test_split(
    X2, y_disease_encoded, y_medication_encoded, test_size=0.2, random_state=42)


In [None]:


from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model

# Number of distinct diseases (one softmax unit per class)
num_diseases = len(np.unique(y_disease_encoded))

# Input layer
inputs = Input(shape=(X2_train.shape[1],))

# Hidden layers
x = Dense(64, activation='tanh')(inputs)
x = Dense(5, activation='relu')(x)

# Output layer for disease prediction: one unit per disease
disease_output = Dense(num_diseases, activation='softmax', name='disease_output')(x)

# Output layer for medication prediction
medication_output = Dense(len(medication_columns), activation='sigmoid', name='medication_output')(x)

# Combine the inputs and outputs into a Model
model = Model(inputs=inputs, outputs=[disease_output, medication_output])

# Compile the model; the disease labels are integer-encoded, so use the sparse loss
model.compile(
    optimizer='adam',
    loss={'disease_output': 'sparse_categorical_crossentropy', 'medication_output': 'binary_crossentropy'},
    metrics={'disease_output': 'accuracy', 'medication_output': 'accuracy'}
)

# Print the model summary
model.summary()
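
Softmax normalizes across the units of a layer, so a layer with a single unit always outputs 1.0 regardless of its input. A multi-class head therefore needs one unit per class. A quick NumPy illustration follows; the `softmax` helper is written for this example and is not a Keras function.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# A single-unit layer: every sample gets probability 1.0, whatever the logit
one_unit = softmax(np.array([[-3.0], [0.5], [7.0]]))

# Three units: the probabilities differ and sum to 1 across the classes
three_units = softmax(np.array([[2.0, 1.0, 0.1]]))

print(one_unit.ravel(), three_units.round(3))
```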


In [None]:
# Convert X2_train to a NumPy array of floats
X2_train = np.array(X2_train).astype('float32')
X2_test = np.array(X2_test).astype('float32')


In [None]:
# Early stopping callback to prevent overfitting
early_stopping = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)

# Fit the model with the training data
history = model.fit(
    X2_train, 
    [y_disease_train, y_medication_train], 
    epochs=25, 
    batch_size=32, 
    validation_split=0.2, 
    callbacks=[early_stopping]
)
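
The `patience=3` setting means training stops once the validation loss has failed to improve for three consecutive epochs, and `restore_best_weights=True` rolls the model back to the best epoch. Here is a simplified pure-Python sketch of that patience logic. It is an illustration under stated assumptions, not Keras's actual implementation; the function name and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return (stop_epoch, best_epoch), both zero-based.

    stop_epoch is None if the loss list ends before patience runs out.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # Improvement: remember this epoch and reset the counter
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            # No improvement: count it and stop once patience is used up
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch

# Made-up validation losses: the best value is at epoch 2, then three epochs without improvement
stop_epoch, best_epoch = early_stopping_epoch(
    [0.9, 0.7, 0.6, 0.65, 0.61, 0.62, 0.5], patience=3
)
print(stop_epoch, best_epoch)
```

In this toy run the later loss of 0.5 is never reached, which shows the trade-off of a small patience value.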




In [None]:
# Evaluate the model on the test data
results = model.evaluate(X2_test, [y_disease_test, y_medication_test])

# Print the evaluation results
print(f"Total Loss: {results[0]}")
print(f"Disease Prediction Loss: {results[1]}")
print(f"Medication Prediction Loss: {results[2]}")
if len(results) > 3:  # Check if accuracy metrics are included
    print(f"Disease Prediction Accuracy: {results[3]}")
    print(f"Medication Prediction Accuracy: {results[4]}")


In [None]:
print(history.history.keys())


In [None]:

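# Plot loss for both outputs
plt.figure(figsize=(12, 6))

plt.subplot(1, 2, 1)
plt.plot(history.history['loss'], label='Total Loss')
plt.plot(history.history['val_loss'], label='Val Total Loss')
# Per-output losses are logged under '<output_name>_loss'
if 'disease_output_loss' in history.history:
    plt.plot(history.history['disease_output_loss'], label='Disease Loss')
if 'medication_output_loss' in history.history:
    plt.plot(history.history['medication_output_loss'], label='Medication Loss')
plt.title('Loss over Epochs')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()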

# Plot accuracy for both outputs, if accuracy was included as a metric
plt.subplot(1, 2, 2)
if 'disease_output_accuracy' in history.history:
    plt.plot(history.history['disease_output_accuracy'], label='Disease Accuracy')
    plt.plot(history.history['val_disease_output_accuracy'], label='Val Disease Accuracy')
if 'medication_output_accuracy' in history.history:
    plt.plot(history.history['medication_output_accuracy'], label='Medication Accuracy')
    plt.plot(history.history['val_medication_output_accuracy'], label='Val Medication Accuracy')
plt.title('Accuracy over Epochs')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()

plt.show()



## Model Selection

Based on the evaluations above, the model we should implement is the Random Forest Classifier. It achieved near-perfect accuracy, precision, recall, and F1-score on the test set, which makes it the preferred model.