In this notebook, we perform medical diagnosis for disease spread patterns using logistic regression.

We use two datasets: one for predicting the disease from its symptoms, and one listing the precautionary measures available for the predicted disease.

Import the Libraries

In [1]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

Load the Datasets

In [2]:
symptoms_dataset = pd.read_csv('disease_symptoms.csv')
precaution_dataset = pd.read_csv('disease_precaution.csv')

In [3]:
symptoms_dataset.head()

Unnamed: 0,Disease,Symptom_1,Symptom_2,Symptom_3,Symptom_4,Symptom_5,Symptom_6,Symptom_7,Symptom_8,Symptom_9,Symptom_10,Symptom_11,Symptom_12,Symptom_13,Symptom_14,Symptom_15,Symptom_16,Symptom_17
0,Fungal infection,itching,skin_rash,nodal_skin_eruptions,dischromic _patches,,,,,,,,,,,,,
1,Allergy,continuous_sneezing,shivering,chills,watering_from_eyes,,,,,,,,,,,,,
2,GERD,stomach_pain,acidity,ulcers_on_tongue,vomiting,cough,chest_pain,,,,,,,,,,,
3,Chronic cholestasis,itching,vomiting,yellowish_skin,nausea,loss_of_appetite,abdominal_pain,yellowing_of_eyes,,,,,,,,,,
4,Drug Reaction,itching,skin_rash,stomach_pain,burning_micturition,spotting_ urination,,,,,,,,,,,,


In [4]:
precaution_dataset.head()

Unnamed: 0,Disease,Precaution_1,Precaution_2,Precaution_3,Precaution_4
0,Drug Reaction,stop irritation,consult nearest hospital,stop taking drug,follow up
1,Malaria,Consult nearest hospital,avoid oily food,avoid non veg food,keep mosquitos out
2,Allergy,apply calamine,cover area with bandage,,use ice to compress itching
3,Hypothyroidism,reduce stress,exercise,eat healthy,get proper sleep
4,Psoriasis,wash hands with warm soapy water,stop bleeding using pressure,consult doctor,salt baths


Select Features X and Target y

In [5]:
X = symptoms_dataset[['Symptom_1','Symptom_2','Symptom_3','Symptom_4']]
y = symptoms_dataset['Disease']

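One-hot encode X and convert y to a one-dimensional NumPy array.

In [6]:
X_encoded = pd.get_dummies(X)
# y is already one-dimensional; to_numpy() replaces Series.ravel(), which newer pandas versions deprecate.
y = y.to_numpy()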
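
To make the encoding step concrete, here is a minimal sketch of what pd.get_dummies produces for positional symptom columns. The toy frame and its values are assumptions for illustration and are not read from the real CSV. Note that the column name is used as a prefix, so the same symptom appearing in Symptom_1 and in Symptom_2 would become two separate features.

```python
import pandas as pd

# Toy stand-in for the symptom columns (illustrative values, not the real dataset).
toy = pd.DataFrame({
    "Symptom_1": ["itching", "continuous_sneezing"],
    "Symptom_2": ["skin_rash", "shivering"],
})

# Each (column, value) pair becomes one indicator column named "<column>_<value>".
toy_encoded = pd.get_dummies(toy)
print(list(toy_encoded.columns))
print(toy_encoded.shape)
```

Every row has exactly one active indicator per original column, so a row with two symptom columns has two active indicators.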

Create Training and Testing Sets

In [7]:
X_train,X_test,y_train,y_test = train_test_split(X_encoded,y,test_size = 0.2, random_state = 42)
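
The held-out split is typically used to estimate how well a fitted model generalizes. The sketch below shows one way this could look, using a small toy dataset rather than the real CSV files. The names toy, clf, and test_accuracy, and the max_iter value, are assumptions made for this example.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Toy data: 10 rows per disease (illustrative values only).
toy = pd.DataFrame({
    "Symptom_1": ["itching"] * 10 + ["continuous_sneezing"] * 10,
    "Symptom_2": ["skin_rash"] * 10 + ["shivering"] * 10,
    "Disease": ["Fungal infection"] * 10 + ["Allergy"] * 10,
})
X_toy = pd.get_dummies(toy[["Symptom_1", "Symptom_2"]])
y_toy = toy["Disease"].to_numpy()

# Same split settings as the notebook: 20% held out, fixed seed for reproducibility.
X_tr, X_te, y_tr, y_te = train_test_split(X_toy, y_toy, test_size=0.2, random_state=42)

# Fit on the training rows only, then score on the rows the model has not seen.
clf = LogisticRegression(max_iter=1000)  # max_iter is an assumption, not from the notebook
clf.fit(X_tr, y_tr)
test_accuracy = clf.score(X_te, y_te)
print(len(X_tr), len(X_te), test_accuracy)
```

With the real data, the same call pattern would be model.score(X_test, y_test) once the model has been fitted.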

Fit the model on the training data, predict the disease for a new set of symptoms, and list the precautions.

In [8]:
# Fit the classifier on the training split.
model = LogisticRegression()
model.fit(X_train, y_train)

# Encode the new sample, then align it to the training columns;
# one-hot columns that the sample does not contain are filled with 0.
new_symptoms = [["itching", "skin_rash", "nodal_skin_eruptions"]]
new_symptoms_df = pd.DataFrame(new_symptoms, columns=["Symptom_1", "Symptom_2", "Symptom_3"])
new_symptoms_encoded = pd.get_dummies(new_symptoms_df)
new_symptoms_encoded = new_symptoms_encoded.reindex(columns=X_encoded.columns, fill_value=0).astype(int)

predicted_diseases = model.predict(new_symptoms_encoded)
print("Predicted Diseases:", predicted_diseases)

# Filter the precaution table once, then print each precaution column.
dis = predicted_diseases[0]
matches = precaution_dataset[precaution_dataset['Disease'] == dis]
for col in ['Precaution_1', 'Precaution_2', 'Precaution_3', 'Precaution_4']:
  print(matches[col])

Predicted Diseases: ['Fungal infection']
27    bath twice
Name: Precaution_1, dtype: object
27    use detol or neem in bathing water
Name: Precaution_2, dtype: object
27    keep infected area dry
Name: Precaution_3, dtype: object
27    use clean cloths
Name: Precaution_4, dtype: object
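
The prediction step depends on the new sample having exactly the same one-hot columns, in the same order, as the training data. The sketch below illustrates that alignment on a toy dataset using DataFrame.reindex. The toy rows, the names train, clf, and new_encoded, and the max_iter value are assumptions for illustration, not part of the notebook.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Toy training data (illustrative values only).
train = pd.DataFrame({
    "Symptom_1": ["itching", "itching", "continuous_sneezing", "continuous_sneezing"],
    "Symptom_2": ["skin_rash", "skin_rash", "shivering", "shivering"],
    "Symptom_3": ["nodal_skin_eruptions", "nodal_skin_eruptions", "chills", "chills"],
    "Disease": ["Fungal infection", "Fungal infection", "Allergy", "Allergy"],
})
X_toy = pd.get_dummies(train[["Symptom_1", "Symptom_2", "Symptom_3"]]).astype(int)
y_toy = train["Disease"].to_numpy()

clf = LogisticRegression(max_iter=1000)  # max_iter is an assumption for this sketch
clf.fit(X_toy, y_toy)

# A new sample with only two symptoms: encoding it alone yields just two columns.
new = pd.DataFrame([["itching", "skin_rash"]], columns=["Symptom_1", "Symptom_2"])
new_encoded = pd.get_dummies(new)

# Align to the training layout: missing indicator columns are added and set to 0,
# and the column order is forced to match what the model was fitted on.
new_encoded = new_encoded.reindex(columns=X_toy.columns, fill_value=0).astype(int)

print(list(new_encoded.columns) == list(X_toy.columns))
print(clf.predict(new_encoded))
```

A symptom value that never appeared in training would be dropped by the reindex, so the model would silently ignore it; checking for such values before predicting may be worthwhile.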
