# **Title: स्वास्थ्य Saathi - Disease Prediction & Recommendations with Machine Learning (Random Forest)**

## **Description:**

Welcome to स्वास्थ्य Saathi, a platform that helps you understand and manage your health. Using machine learning, the system analyzes the symptoms you enter and predicts the most likely disease.


~ I am Tushar, and I am excited to start this amazing project.

## **Importing Libraries and Models**

In [36]:
import pandas as pd
import numpy as np
import pickle
import os
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import accuracy_score, confusion_matrix

## **Load and Prepare Dataset**

In [37]:
dataset = pd.read_csv('datasets/Training.csv')
X = dataset.drop('prognosis', axis=1)
y = dataset['prognosis']

# Encode target labels
le = LabelEncoder()
y_encoded = le.fit_transform(y)

# Train-Test Split
X_train, X_test, y_train, y_test = train_test_split(X, y_encoded, test_size=0.3, random_state=20)
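
To make the encoding step concrete, here is a minimal sketch of what `LabelEncoder` does to string labels. The disease names in `toy_labels` are placeholders chosen for illustration and are not read from `Training.csv`.

```python
from sklearn.preprocessing import LabelEncoder

# Toy labels for illustration only; the real labels come from dataset['prognosis'].
toy_labels = ["Malaria", "Acne", "Malaria", "Dengue"]

toy_le = LabelEncoder()
toy_codes = toy_le.fit_transform(toy_labels)

# classes_ holds the unique labels in sorted order; each code is an index into it.
print(list(toy_le.classes_))   # ['Acne', 'Dengue', 'Malaria']
print(toy_codes.tolist())      # [2, 0, 2, 1]

# inverse_transform maps integer codes back to the original names.
decoded = toy_le.inverse_transform(toy_codes).tolist()
print(decoded)                 # ['Malaria', 'Acne', 'Malaria', 'Dengue']
```

Because the codes are only indices into `classes_`, the same fitted encoder has to be kept alongside the model so that predictions can be turned back into disease names.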

## **Train and Evaluate Models**

In [29]:
models = {
    'SVC': SVC(kernel='linear'),
    'RandomForest': RandomForestClassifier(n_estimators=100, random_state=42),
    'GradientBoosting': GradientBoostingClassifier(n_estimators=100, random_state=42),
    'KNeighbors': KNeighborsClassifier(n_neighbors=5),
    'MultinomialNB': MultinomialNB()
}

best_model = None
best_accuracy = 0

## **Model Evaluation**

In [40]:
print("========== Model Evaluation ==========")
for name, model in models.items():
    model.fit(X_train, y_train)
    preds = model.predict(X_test)
    acc = accuracy_score(y_test, preds)
    cm = confusion_matrix(y_test, preds)
    print(f"{name} Accuracy: {acc:.4f}")
    print(f"{name} Confusion Matrix:\n{cm}\n{'='*40}")
    if acc > best_accuracy:
        best_accuracy = acc
        best_model = model

SVC Accuracy: 1.0000
SVC Confusion Matrix:
[[40  0  0 ...  0  0  0]
 [ 0 43  0 ...  0  0  0]
 [ 0  0 28 ...  0  0  0]
 ...
 [ 0  0  0 ... 34  0  0]
 [ 0  0  0 ...  0 41  0]
 [ 0  0  0 ...  0  0 31]]
RandomForest Accuracy: 1.0000
RandomForest Confusion Matrix:
[[40  0  0 ...  0  0  0]
 [ 0 43  0 ...  0  0  0]
 [ 0  0 28 ...  0  0  0]
 ...
 [ 0  0  0 ... 34  0  0]
 [ 0  0  0 ...  0 41  0]
 [ 0  0  0 ...  0  0 31]]
GradientBoosting Accuracy: 1.0000
GradientBoosting Confusion Matrix:
[[40  0  0 ...  0  0  0]
 [ 0 43  0 ...  0  0  0]
 [ 0  0 28 ...  0  0  0]
 ...
 [ 0  0  0 ... 34  0  0]
 [ 0  0  0 ...  0 41  0]
 [ 0  0  0 ...  0  0 31]]
KNeighbors Accuracy: 1.0000
KNeighbors Confusion Matrix:
[[40  0  0 ...  0  0  0]
 [ 0 43  0 ...  0  0  0]
 [ 0  0 28 ...  0  0  0]
 ...
 [ 0  0  0 ... 34  0  0]
 [ 0  0  0 ...  0 41  0]
 [ 0  0  0 ...  0  0 31]]
MultinomialNB Accuracy: 1.0000
MultinomialNB Confusion Matrix:
[[40  0  0 ...  0  0  0]
 [ 0 43  0 ...  0  0  0]
 [ 0  0 28 ...  0  0  0]
 ...
 [ 
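
Every model above reports an accuracy of 1.0000 on a single 70/30 split. One possible sanity check for a result like this is to count duplicate feature rows, which can land on both sides of a split, and to score with stratified k-fold cross-validation. The sketch below runs on synthetic data as a stand-in: `X_demo`, `y_demo`, and the three binary patterns are hypothetical, and nothing here is computed from `Training.csv`.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for a symptom matrix: 3 classes, each with one fixed binary pattern.
patterns = np.array([
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 1],
])
y_demo = np.repeat(np.arange(3), 20)      # 20 rows per class
X_demo = pd.DataFrame(patterns[y_demo])   # every row in a class is identical

# duplicated() flags each row that repeats an earlier row.
n_duplicates = int(X_demo.duplicated().sum())
print(f"Duplicate feature rows: {n_duplicates} of {len(X_demo)}")

# Stratified 5-fold cross-validation keeps class proportions equal in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(
    RandomForestClassifier(n_estimators=20, random_state=0),
    X_demo, y_demo, cv=cv,
)
print(f"Fold accuracies: {scores}")
```

When most rows are duplicates, as in this synthetic case, a perfect score mostly shows that the model has memorized a few patterns, so the duplicate count is worth reading alongside the accuracy.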

## **Save Best Model - *Random Forest***

In [41]:
final_model = models['RandomForest']  # or best_model
final_model.fit(X, y_encoded)  # Train on full data

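# Create the output directory if it does not exist yet
os.makedirs('models', exist_ok=True)

# Use context managers so the files are flushed and closed after writing
with open('models/random.pkl', 'wb') as f:
    pickle.dump(final_model, f)
with open('models/label_encoder.pkl', 'wb') as f:
    pickle.dump(le, f)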
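
A quick way to confirm that a pickled model survives the save and load cycle is to compare predictions before and after. The following is a minimal sketch on toy data; `demo_model.pkl`, `X_demo`, and `y_demo` are hypothetical names, and the file is written to a temporary directory rather than to `models/`.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Toy training data for illustration only.
X_demo = np.array([[1, 0], [0, 1], [1, 0], [0, 1]])
y_demo = np.array([0, 1, 0, 1])
clf = RandomForestClassifier(n_estimators=10, random_state=42).fit(X_demo, y_demo)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo_model.pkl")
    # Write the fitted model, then read it back from disk.
    with open(path, "wb") as f:
        pickle.dump(clf, f)
    with open(path, "rb") as f:
        restored = pickle.load(f)

# The restored model should reproduce the original predictions exactly.
same = np.array_equal(clf.predict(X_demo), restored.predict(X_demo))
print(same)
```

Pickle files should only be loaded from trusted sources, and it is safest to load them with the same scikit-learn version that wrote them.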

## **Load Recommendation Datasets**

In [42]:
sym_des = pd.read_csv("datasets/symtoms_df.csv")
precautions = pd.read_csv("datasets/precautions_df.csv")
workout = pd.read_csv("datasets/workout_df.csv")
description = pd.read_csv("datasets/description.csv")
medications = pd.read_csv("datasets/medications.csv")
diets = pd.read_csv("datasets/diets.csv")

## **Load Trained Model and Encoder**

In [33]:
with open('models/random.pkl', 'rb') as f:
    model = pickle.load(f)
with open('models/label_encoder.pkl', 'rb') as f:
    le = pickle.load(f)

## **Prediction and Recommendation**

In [None]:
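# Map each symptom column to its position in the feature vector,
# and each encoded class index back to its disease name.
symptoms_dict = {symptom: idx for idx, symptom in enumerate(X.columns)}
diseases_list = {i: disease for i, disease in enumerate(le.classes_)}

def get_predicted_value(patient_symptoms):
    """Return the predicted disease name for a list of symptom names."""
    input_vector = np.zeros(len(symptoms_dict))
    for symptom in patient_symptoms:
        # Symptoms that are not column names in the training data are silently ignored.
        if symptom in symptoms_dict:
            input_vector[symptoms_dict[symptom]] = 1
    # Reuse the training column names so sklearn does not warn about missing feature names.
    input_df = pd.DataFrame([input_vector], columns=list(symptoms_dict))
    prediction = model.predict(input_df)[0]
    return diseases_list[prediction]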

def get_recommendations(disease):
    """Look up description, precautions, medications, diet and workouts for a disease."""
    desc = " ".join(description[description['Disease'] == disease]['Description'].values)
    pre = precautions[precautions['Disease'] == disease][['Precaution_1', 'Precaution_2', 'Precaution_3', 'Precaution_4']].values.flatten()
    med = medications[medications['Disease'] == disease]['Medication'].values
    diet = diets[diets['Disease'] == disease]['Diet'].values
    # Note: workout_df uses lowercase column names ('disease', 'workout').
    wrk = workout[workout['disease'] == disease]['workout'].values
    return desc, pre, med, diet, wrk
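
Since `get_predicted_value` drops any symptom that does not exactly match a column name, it can help to normalize the input and report what was not recognized. The sketch below is one way to do that. `encode_symptoms` and `demo_columns` are hypothetical, and the lowercase, underscore-separated naming is an assumption about the column names rather than something confirmed from `Training.csv`.

```python
import numpy as np

# Hypothetical column names standing in for X.columns.
demo_columns = ["itching", "skin_rash", "high_fever"]
demo_index = {name: i for i, name in enumerate(demo_columns)}

def encode_symptoms(raw_symptoms, index):
    """Return (multi-hot vector, list of inputs that matched no column)."""
    vector = np.zeros(len(index), dtype=int)
    unknown = []
    for raw in raw_symptoms:
        # Assumed naming scheme: lowercase, with spaces replaced by underscores.
        key = raw.strip().lower().replace(" ", "_")
        if key in index:
            vector[index[key]] = 1
        elif key:
            unknown.append(raw.strip())
    return vector, unknown

vec, unknown = encode_symptoms(["Itching", " skin rash", "headache"], demo_index)
print(vec.tolist())   # [1, 1, 0]
print(unknown)        # ['headache']
```

Returning the unrecognized inputs lets the caller warn the user instead of predicting from a partially empty vector.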

## **Testing**

In [35]:
if __name__ == "__main__":
    print("\nWelcome to Swasthya Saathi!")
    symptoms_input = input("Enter your symptoms : ")
    user_symptoms = [s.strip() for s in symptoms_input.split(',')]

    predicted_disease = get_predicted_value(user_symptoms)
    desc, pre, med, diet, wrk = get_recommendations(predicted_disease)

    print("\n================= Predicted Disease =================")
    print(predicted_disease)
    print("\n================= Description =======================")
    print(desc)
    print("\n================= Precautions =======================")
    for i, p in enumerate(pre, 1):
        print(f"{i}. {p}")
    print("\n================= Medications =======================")
    for i, m in enumerate(med, 1):
        print(f"{i}. {m}")
    print("\n================= Workouts ==========================")
    for i, w in enumerate(wrk, 1):
        print(f"{i}. {w}")
    print("\n================= Diet Suggestions ==================")
    for i, d in enumerate(diet, 1):
        print(f"{i}. {d}")



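Welcome to Swasthya Saathi!
Enter your symptoms :  itching

================= Predicted Disease =================
Fungal infection

================= Description =======================
Fungal infection is a common skin condition caused by fungi.

================= Precautions =======================
1. bath twice
2. use detol or neem in bathing water
3. keep infected area dry
4. use clean cloths

================= Medications =======================
1. ['Antifungal Cream', 'Fluconazole', 'Terbinafine', 'Clotrimazole', 'Ketoconazole']

================= Workouts ==========================
1. Avoid sugary foods
2. Consume probiotics
3. Increase intake of garlic
4. Include yogurt in diet
5. Limit processed foods
6. Stay hydrated
7. Consume green tea
8. Eat foods rich in zinc
9. Include turmeric in diet
10. Eat fruits and vegetables

================= Diet Suggestions ==================
1. ['Antifungal Diet', 'Probiotics', 'Garlic', 'Coconut oil', 'Turmeric']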
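
In the sample output, the medication and diet entries each print as one bracketed item, which suggests that those CSV cells hold a list written out as a string. Assuming that is the case, the sketch below parses such a cell with `ast.literal_eval` so each entry can be numbered separately. `parse_list_cell` is a hypothetical helper, not part of the original notebook.

```python
import ast

def parse_list_cell(cell):
    """Parse a cell such as "['A', 'B']" into a list; otherwise wrap it in a one-item list."""
    try:
        value = ast.literal_eval(cell)
    except (ValueError, SyntaxError):
        # Not a Python literal (for example, plain text): keep the cell as a single item.
        return [cell]
    return list(value) if isinstance(value, (list, tuple)) else [cell]

items = parse_list_cell("['Antifungal Cream', 'Fluconazole', 'Terbinafine']")
for i, item in enumerate(items, 1):
    print(f"{i}. {item}")
```

`ast.literal_eval` only evaluates literals, so unlike `eval` it does not execute arbitrary code found in the CSV.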
