## 🏥 Personalized Medical Recommendation System with Machine Learning 🚀

Welcome to the Personalized Medical Recommendation System, a state-of-the-art platform designed to assist users in understanding and managing their health. We harness the power of machine learning to predict potential diseases based on user symptoms. Let's walk through the steps of building this system!

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


### 🔧 Step 1: Installation and Setup
First, install all required packages using the requirements.txt file.

In [2]:
import os
import time
import numpy as np
import pandas as pd

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score, confusion_matrix
import pickle

import tensorflow as tf
from tensorflow.keras import layers

### 📊 Step 2: Data Loading and Preprocessing
We load the medical dataset and prepare it for model training.

>- X: Contains the input features (symptoms).
>- y: Contains the target labels (disease prognosis).

In [3]:
# Load dataset
dataset = pd.read_csv('/content/drive/MyDrive/Health Care /Data /Training.csv')

# Preprocess dataset
X = dataset.drop('prognosis', axis=1)
y = dataset['prognosis']

In [4]:
# Encode prognosis labels
le = LabelEncoder()
le.fit(y)
Y = le.transform(y)

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.3, random_state=42)

### 🤖 Step 3: Defining and Training Classifiers

In [5]:
# Define classifier models
models = {
    'SVC': SVC(kernel='linear'),
    'RandomForest': RandomForestClassifier(n_estimators=100, random_state=42),
}

### 🧠 Step 4: GAN Model Building

- Generator:
Generates synthetic medical data based on random noise.
- Discriminator:
Discriminates between real and synthetic data.

In [6]:
# ============ Step 1: Build GAN Model ============
input_dim = X_train.shape[1]

# Define Generator model
def build_generator():
    model = tf.keras.Sequential()
    model.add(layers.Dense(128, activation='relu', input_dim=100))  # Latent space of 100
    model.add(layers.Dense(256, activation='relu'))
    model.add(layers.Dense(input_dim, activation='sigmoid'))  # Output the same number of features as input
    return model

# Define Discriminator model
def build_discriminator():
    model = tf.keras.Sequential()
    model.add(layers.Dense(256, activation='relu', input_dim=input_dim))
    model.add(layers.Dense(128, activation='relu'))
    model.add(layers.Dense(1, activation='sigmoid'))  # Binary classification: real or fake
    return model

# Build the GAN
generator = build_generator()
discriminator = build_discriminator()

# Compile the Discriminator
discriminator.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Build and compile GAN model
discriminator.trainable = False  # Fix the discriminator during GAN training
gan_input = layers.Input(shape=(100,))
generated_data = generator(gan_input)
gan_output = discriminator(generated_data)
gan = tf.keras.Model(gan_input, gan_output)

# Compile GAN model
gan.compile(optimizer='adam', loss='binary_crossentropy')

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)


### 🏋️‍♂️ Step 5: Training the GAN
We train the GAN by alternating between the generator and discriminator to improve synthetic data quality. The discriminator is fixed during the generator training.

In [7]:
# ============ Step 2: Train GAN Model ============
def train_gan(epochs=100, batch_size=64):
    for epoch in range(epochs):
        # Generate random noise for the generator
        noise = np.random.normal(0, 1, (batch_size, 100))

        # Generate synthetic data from the noise
        generated_data = generator.predict(noise)

        # Get a random batch of real data
        # Use .iloc to select rows by their integer index
        random_indices = np.random.randint(0, X_train.shape[0], batch_size)
        real_data = X_train.iloc[random_indices].values  # Convert to NumPy array

        # Labels for real and fake data
        real_labels = np.ones((batch_size, 1))
        fake_labels = np.zeros((batch_size, 1))

        # Train the discriminator on real and fake data
        d_loss_real = discriminator.train_on_batch(real_data, real_labels)
        d_loss_fake = discriminator.train_on_batch(generated_data, fake_labels)

        # Train the generator via GAN
        noise = np.random.normal(0, 1, (batch_size, 100))
        g_loss = gan.train_on_batch(noise, np.ones((batch_size, 1)))  # Generator wants discriminator to output 1 (real)

        # Print progress
        if epoch % 100 == 0:
            print(f"Epoch {epoch}: D Loss Real={d_loss_real}, D Loss Fake={d_loss_fake}, G Loss={g_loss}")

# Train the GAN
train_gan(epochs=100, batch_size=64)

[1m2/2[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 7ms/step  




Epoch 0: D Loss Real=[array(0.7492083, dtype=float32), array(0.171875, dtype=float32)], D Loss Fake=[array(0.671913, dtype=float32), array(0.5859375, dtype=float32)], G Loss=[array(0.671913, dtype=float32), array(0.671913, dtype=float32), array(0.5859375, dtype=float32)]
[1m2/2[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 8ms/step 




[1m2/2[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 6ms/step 
[1m2/2[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 13ms/step
[1m2/2[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 7ms/step 
[1m2/2[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 6ms/step 
[1m2/2[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 7ms/step 
[1m2/2[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 7ms/step 
[1m2/2[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 6ms/step  
[1m2/2[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 8ms/step 
[1m2/2[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 8ms/step 
[1m2/2[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 8ms/step 
[1m2/2[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 7ms/step 
[1m2/2[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 8ms/step 
[1m2/2[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 3ms/step 
[1m2/2[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 4

### 💡 Step 6: Data Augmentation and Model Training

In [8]:
# ============ Step 3: Generate Synthetic Data ============
# Generate synthetic data after training the GAN
def generate_synthetic_data(num_samples=5000):
    noise = np.random.normal(0, 1, (num_samples, 100))
    synthetic_data = generator.predict(noise)
    return synthetic_data

# Generate synthetic data and augment the training set
synthetic_data = generate_synthetic_data(5000)
augmented_X_train = np.vstack([X_train, synthetic_data])

# For labels we randomly choose labels from y_train
augmented_y_train = np.hstack([y_train, np.random.choice(y_train, 5000)])

[1m157/157[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 4ms/step


In [9]:
# ============ Step 4: Train Classifiers on Augmented Data ============

# Dictionary to store accuracies for each model
model_accuracies = {}

for model_name, model in models.items():
    # Train model on augmented data
    model.fit(augmented_X_train, augmented_y_train)

    # Test model on real test data
    predictions = model.predict(X_test)

    # Calculate accuracy
    accuracy = accuracy_score(y_test, predictions)

    # Save accuracy to dictionary
    model_accuracies[model_name] = accuracy

    print(f"{model_name} Accuracy with augmented data: {accuracy}")

    # Calculate confusion matrix
    cm = confusion_matrix(y_test, predictions)
    print(f"{model_name} Confusion Matrix with augmented data:")
    print(np.array2string(cm, separator=', '))

    print("\n" + "="*40 + "\n")



SVC Accuracy with augmented data: 1.0
SVC Confusion Matrix with augmented data:
[[32,  0,  0, ...,  0,  0,  0],
 [ 0, 39,  0, ...,  0,  0,  0],
 [ 0,  0, 41, ...,  0,  0,  0],
 ...,
 [ 0,  0,  0, ..., 36,  0,  0],
 [ 0,  0,  0, ...,  0, 37,  0],
 [ 0,  0,  0, ...,  0,  0, 39]]


RandomForest Accuracy with augmented data: 1.0
RandomForest Confusion Matrix with augmented data:
[[32,  0,  0, ...,  0,  0,  0],
 [ 0, 39,  0, ...,  0,  0,  0],
 [ 0,  0, 41, ...,  0,  0,  0],
 ...,
 [ 0,  0,  0, ..., 36,  0,  0],
 [ 0,  0,  0, ...,  0, 37,  0],
 [ 0,  0,  0, ...,  0,  0, 39]]






In [10]:
# ============ Step 5: Single Prediction ============
# Selecting SVC for single prediction
svc = SVC(kernel='linear')
svc.fit(augmented_X_train, augmented_y_train)

# Save the model
pickle.dump(svc, open('svc.pkl', 'wb'))

# Load the model
svc = pickle.load(open('svc.pkl', 'rb'))

In [12]:
# Initialize the main model using the pre-configured parameters for the SVC from the 'models' dictionary
models['SVC']
main_model = SVC(**models['SVC'].get_params())


### 🔬 Step 7: Personalized Medical Recommendation System

We use additional datasets to provide users with personalized health recommendations based on predicted diseases, such as:
- Description of the disease
- Precautions to follow
- Medications and diets
- Workout routines

In [15]:
# ============ Step 7: Recommendation System and Prediction ============

# Load additional databases
sym_des = pd.read_csv("/content/drive/MyDrive/Health Care /Data /symtoms_df.csv")
precautions = pd.read_csv("/content/drive/MyDrive/Health Care /Data /precautions_df.csv")
workout = pd.read_csv("/content/drive/MyDrive/Health Care /Data /workout_df.csv")
description = pd.read_csv("/content/drive/MyDrive/Health Care /Data /description.csv")
medications = pd.read_csv('/content/drive/MyDrive/Health Care /Data /medications.csv')
diets = pd.read_csv("/content/drive/MyDrive/Health Care /Data /diets.csv")

# Helper function for recommendations
def helper(dis):
    desc = description[description['Disease'] == dis]['Description']
    desc = " ".join([w for w in desc])

    pre = precautions[precautions['Disease'] == dis][['Precaution_1', 'Precaution_2', 'Precaution_3', 'Precaution_4']]
    pre = [col for col in pre.values]

    med = medications[medications['Disease'] == dis]['Medication']
    med = [med for med in med.values]

    die = diets[diets['Disease'] == dis]['Diet']
    die = [die for die in die.values]

    wrkout = workout[workout['disease'] == dis]['workout']

    return desc, pre, med, die, wrkout

# Mapping between (symptoms and index values) and (disease index and disease name)
symptoms_dict = {'itching': 0, 'skin_rash': 1, 'nodal_skin_eruptions': 2, 'continuous_sneezing': 3, 'shivering': 4, 'chills': 5, 'joint_pain': 6, 'stomach_pain': 7, 'acidity': 8, 'ulcers_on_tongue': 9, 'muscle_wasting': 10, 'vomiting': 11, 'burning_micturition': 12, 'spotting_ urination': 13, 'fatigue': 14, 'weight_gain': 15, 'anxiety': 16, 'cold_hands_and_feets': 17, 'mood_swings': 18, 'weight_loss': 19, 'restlessness': 20, 'lethargy': 21, 'patches_in_throat': 22, 'irregular_sugar_level': 23, 'cough': 24, 'high_fever': 25, 'sunken_eyes': 26, 'breathlessness': 27, 'sweating': 28, 'dehydration': 29, 'indigestion': 30, 'headache': 31, 'yellowish_skin': 32, 'dark_urine': 33, 'nausea': 34, 'loss_of_appetite': 35, 'pain_behind_the_eyes': 36, 'back_pain': 37, 'constipation': 38, 'abdominal_pain': 39, 'diarrhoea': 40, 'mild_fever': 41, 'yellow_urine': 42, 'yellowing_of_eyes': 43, 'acute_liver_failure': 44, 'fluid_overload': 45, 'swelling_of_stomach': 46, 'swelled_lymph_nodes': 47, 'malaise': 48, 'blurred_and_distorted_vision': 49, 'phlegm': 50, 'throat_irritation': 51, 'redness_of_eyes': 52, 'sinus_pressure': 53, 'runny_nose': 54, 'congestion': 55, 'chest_pain': 56, 'weakness_in_limbs': 57, 'fast_heart_rate': 58, 'pain_during_bowel_movements': 59, 'pain_in_anal_region': 60, 'bloody_stool': 61, 'irritation_in_anus': 62, 'neck_pain': 63, 'dizziness': 64, 'cramps': 65, 'bruising': 66, 'obesity': 67, 'swollen_legs': 68, 'swollen_blood_vessels': 69, 'puffy_face_and_eyes': 70, 'enlarged_thyroid': 71, 'brittle_nails': 72, 'swollen_extremeties': 73, 'excessive_hunger': 74, 'extra_marital_contacts': 75, 'drying_and_tingling_lips': 76, 'slurred_speech': 77, 'knee_pain': 78, 'hip_joint_pain': 79, 'muscle_weakness': 80, 'stiff_neck': 81, 'swelling_joints': 82, 'movement_stiffness': 83, 'spinning_movements': 84, 'loss_of_balance': 85, 'unsteadiness': 86, 'weakness_of_one_body_side': 87, 'loss_of_smell': 88, 'bladder_discomfort': 89, 'foul_smell_of urine': 90, 'continuous_feel_of_urine': 91, 'passage_of_gases': 92, 'internal_itching': 93, 'toxic_look_(typhos)': 94, 'depression': 95, 'irritability': 96, 'muscle_pain': 97, 'altered_sensorium': 98, 'red_spots_over_body': 99, 'belly_pain': 100, 'abnormal_menstruation': 101, 'dischromic _patches': 102, 'watering_from_eyes': 103, 'increased_appetite': 104, 'polyuria': 105, 'family_history': 106, 'mucoid_sputum': 107, 'rusty_sputum': 108, 'lack_of_concentration': 109, 'visual_disturbances': 110, 'receiving_blood_transfusion': 111, 'receiving_unsterile_injections': 112, 'coma': 113, 'stomach_bleeding': 114, 'distention_of_abdomen': 115, 'history_of_alcohol_consumption': 116, 'fluid_overload.1': 117, 'blood_in_sputum': 118, 'prominent_veins_on_calf': 119, 'palpitations': 120, 'painful_walking': 121, 'pus_filled_pimples': 122, 'blackheads': 123, 'scurring': 124, 'skin_peeling': 125, 'silver_like_dusting': 126, 'small_dents_in_nails': 127, 'inflammatory_nails': 128, 'blister': 129, 'red_sore_around_nose': 130, 'yellow_crust_ooze': 131}
diseases_list = {15: 'Fungal infection', 4: 'Allergy', 16: 'GERD', 9: 'Chronic cholestasis', 14: 'Drug Reaction', 33: 'Peptic ulcer diseae', 1: 'AIDS', 12: 'Diabetes ', 17: 'Gastroenteritis', 6: 'Bronchial Asthma', 23: 'Hypertension ', 30: 'Migraine', 7: 'Cervical spondylosis', 32: 'Paralysis (brain hemorrhage)', 28: 'Jaundice', 29: 'Malaria', 8: 'Chicken pox', 11: 'Dengue', 37: 'Typhoid', 40: 'hepatitis A', 19: 'Hepatitis B', 20: 'Hepatitis C', 21: 'Hepatitis D', 22: 'Hepatitis E', 3: 'Alcoholic hepatitis', 36: 'Tuberculosis', 10: 'Common Cold', 34: 'Pneumonia', 13: 'Dimorphic hemmorhoids(piles)', 18: 'Heart attack', 39: 'Varicose veins', 26: 'Hypothyroidism', 24: 'Hyperthyroidism', 25: 'Hypoglycemia', 31: 'Osteoarthristis', 5: 'Arthritis', 0: '(vertigo) Paroymsal  Positional Vertigo', 2: 'Acne', 38: 'Urinary tract infection', 35: 'Psoriasis', 27: 'Impetigo'}

# Model Prediction function
def get_predicted_value(patient_symptoms):
    input_vector = np.zeros(len(symptoms_dict))
    for item in patient_symptoms:
        input_vector[symptoms_dict[item]] = 1
    return diseases_list[svc.predict([input_vector])[0]]

In [16]:
# Test 1: Get user input for symptoms
symptoms = input("Enter your symptoms (comma-separated): ")
user_symptoms = [s.strip() for s in symptoms.split(',')]
user_symptoms = [symptom.strip("[]' ") for symptom in user_symptoms]
predicted_disease = get_predicted_value(user_symptoms)

desc, pre, med, die, wrkout = helper(predicted_disease)

# Print recommendation results
print("================= Predicted Disease =============")
print(predicted_disease)
print("================= Description ==================")
print(desc)
print("================= Precautions ==================")
i = 1
for p_i in pre[0]:
    print(i, ": ", p_i)
    i += 1

print("================= Medications ==================")
for m_i in med:
    print(i, ": ", m_i)
    i += 1

print("================= Workout ==================")
for w_i in wrkout:
    print(i, ": ", w_i)
    i += 1

print("================= Diets ==================")
for d_i in die:
    print(i, ": ", d_i)
    i += 1

Enter your symptoms (comma-separated): itching,skin_rash
Fungal infection
Fungal infection is a common skin condition caused by fungi.
1 :  bath twice
2 :  use detol or neem in bathing water
3 :  keep infected area dry
4 :  use clean cloths
5 :  ['Antifungal Cream', 'Fluconazole', 'Terbinafine', 'Clotrimazole', 'Ketoconazole']
6 :  Avoid sugary foods
7 :  Consume probiotics
8 :  Increase intake of garlic
9 :  Include yogurt in diet
10 :  Limit processed foods
11 :  Stay hydrated
12 :  Consume green tea
13 :  Eat foods rich in zinc
14 :  Include turmeric in diet
15 :  Eat fruits and vegetables
16 :  ['Antifungal Diet', 'Probiotics', 'Garlic', 'Coconut oil', 'Turmeric']


## 🔍 Conclusion
Our Personalized Medical Recommendation System uses cutting-edge GANs for data augmentation and multiple classifiers to predict diseases. This system enhances health management by offering personalized recommendations, bringing AI into healthcare!