# Multi Disease Prediction System

This notebook combines five disease prediction models:
1. **Kidney Disease (CKD)**
2. **Liver Disease**
3. **PCOS (Polycystic Ovary Syndrome)**
4. **Breast Cancer**
5. **Heart Disease**

Users can select any disease and provide their information to get a prediction.


In [1]:
# Install required packages
%pip install gradio scikit-learn pandas numpy -q


Note: you may need to restart the kernel to use updated packages.


In [2]:
# Import all necessary libraries
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
import warnings
warnings.filterwarnings('ignore')

import gradio as gr
print("Libraries imported successfully!")


Libraries imported successfully!


## 1. Load and Prepare Kidney Disease Dataset


In [3]:
# Load Kidney Disease Dataset
kidney_df = pd.read_csv('kidney_dataset.csv')

# Preprocess Kidney data
kidney_df_processed = kidney_df.copy()
kidney_df_processed['Medication'] = kidney_df_processed['Medication'].fillna('No Medication')
le_medication = LabelEncoder()
kidney_df_processed['Medication_encoded'] = le_medication.fit_transform(kidney_df_processed['Medication'])

# Prepare features and target
X_kidney = kidney_df_processed.drop(['CKD_Status', 'Medication'], axis=1)
y_kidney = kidney_df_processed['CKD_Status']

# Split and scale
X_train_kidney, X_test_kidney, y_train_kidney, y_test_kidney = train_test_split(
    X_kidney, y_kidney, test_size=0.2, random_state=42, stratify=y_kidney
)

# Store feature names for later
kidney_features = X_kidney.columns.tolist()
print(f"Kidney Disease - Features: {len(kidney_features)}")
print(f"Kidney Disease - Training samples: {len(X_train_kidney)}")
print(f"Kidney Disease - Features: {kidney_features}")


Kidney Disease - Features: 10
Kidney Disease - Training samples: 4000
Kidney Disease - Features: ['Creatinine', 'BUN', 'GFR', 'Urine_Output', 'Diabetes', 'Hypertension', 'Age', 'Protein_in_Urine', 'Water_Intake', 'Medication_encoded']
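
`LabelEncoder` assigns integer codes in sorted order of the unique labels it sees during `fit`, and raises an error for any label it did not see. That is why missing medications are filled with `'No Medication'` before fitting. The sketch below is a plain-Python stand-in that mimics this behaviour; `SimpleLabelEncoder` and the medication name `"Diuretic"` are hypothetical and only serve as an illustration.

```python
class SimpleLabelEncoder:
    """Minimal stand-in for sklearn's LabelEncoder (illustration only)."""

    def fit(self, labels):
        # Codes follow the sorted order of the unique labels
        self.classes_ = sorted(set(labels))
        self._index = {label: i for i, label in enumerate(self.classes_)}
        return self

    def transform(self, labels):
        try:
            return [self._index[label] for label in labels]
        except KeyError as exc:
            # Labels not seen during fit cannot be encoded
            raise ValueError(f"unseen label: {exc.args[0]!r}") from None


# Hypothetical medication column with one missing entry
meds = ["ACE Inhibitor", None, "Diuretic", "ACE Inhibitor"]
filled = [m if m is not None else "No Medication" for m in meds]

encoder = SimpleLabelEncoder().fit(filled)
print(encoder.classes_)
print(encoder.transform(["No Medication"]))
```

Because the codes depend on the sorted label set, the same fitted encoder has to be reused at prediction time, which is what `kidney_medication_encoder` is kept for later in the notebook.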


## 2. Load and Prepare Liver Disease Dataset


In [4]:
# Load Liver Disease Dataset
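try:
    liver_df = pd.read_csv('Liver Patient Dataset (LPD)_train.csv', encoding='latin-1')
except UnicodeDecodeError:
    # Fall back to UTF-8 and replace undecodable bytes (encoding_errors needs pandas >= 1.3)
    liver_df = pd.read_csv('Liver Patient Dataset (LPD)_train.csv', encoding='utf-8', encoding_errors='replace')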

# Clean column names
liver_df.columns = liver_df.columns.str.strip()
liver_df.columns = liver_df.columns.str.replace(r'[^\w\s]', '', regex=True)
liver_df.columns = liver_df.columns.str.replace(r'\s+', '_', regex=True)

# Assign proper column names
if len(liver_df.columns) == 11:
    liver_df.columns = ['Age', 'Gender', 'Total_Bilirubin', 'Direct_Bilirubin', 
                       'Alkaline_Phosphotase', 'SGPT', 'SGOT', 'Total_Protiens', 
                       'Albumin', 'A_G_Ratio', 'Result']

# Preprocess Liver data
liver_df_processed = liver_df.copy()

# Handle missing values
numerical_cols = liver_df_processed.select_dtypes(include=[np.number]).columns
for col in numerical_cols:
    if col != 'Result':
        # Assign back instead of using a chained inplace fillna, which pandas 2.x deprecates
        liver_df_processed[col] = liver_df_processed[col].fillna(liver_df_processed[col].median())

liver_df_processed['Gender'] = liver_df_processed['Gender'].fillna(liver_df_processed['Gender'].mode()[0])

# Encode Gender
le_gender = LabelEncoder()
liver_df_processed['Gender_Encoded'] = le_gender.fit_transform(liver_df_processed['Gender'])

# Convert Result: 1 = Disease (1), 2 = No Disease (0)
liver_df_processed['Target'] = liver_df_processed['Result'].map({1: 1, 2: 0})

# Prepare features
liver_feature_columns = ['Age', 'Gender_Encoded', 'Total_Bilirubin', 'Direct_Bilirubin', 
                        'Alkaline_Phosphotase', 'SGPT', 'SGOT', 'Total_Protiens', 
                        'Albumin', 'A_G_Ratio']
X_liver = liver_df_processed[liver_feature_columns]
y_liver = liver_df_processed['Target']

# Split
X_train_liver, X_test_liver, y_train_liver, y_test_liver = train_test_split(
    X_liver, y_liver, test_size=0.2, random_state=42, stratify=y_liver
)

# Store feature names (with readable names for UI)
liver_features = liver_feature_columns.copy()
liver_features[1] = 'Gender'  # Replace encoded with original name for UI
print(f"Liver Disease - Features: {len(liver_features)}")
print(f"Liver Disease - Training samples: {len(X_train_liver)}")
print(f"Liver Disease - Features: {liver_features}")


Liver Disease - Features: 10
Liver Disease - Training samples: 24552
Liver Disease - Features: ['Age', 'Gender', 'Total_Bilirubin', 'Direct_Bilirubin', 'Alkaline_Phosphotase', 'SGPT', 'SGOT', 'Total_Protiens', 'Albumin', 'A_G_Ratio']
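
A note on the encoding fallback used when loading the liver CSV: `latin-1` maps every possible byte to a character, so decoding with it never raises a decode error, and a fallback placed after it is unlikely to run for encoding problems. The sketch below shows the alternative order (strict UTF-8 first, then `latin-1`) on raw bytes. The helper name `decode_with_fallback` is hypothetical and is not part of the notebook.

```python
def decode_with_fallback(raw: bytes, encodings=("utf-8", "latin-1")):
    """Return (text, encoding_used), trying each encoding in order."""
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    # Last resort: decode with the first encoding and mark bad bytes
    return raw.decode(encodings[0], errors="replace"), encodings[0]


sample = "café".encode("latin-1")  # b'caf\xe9' is not valid UTF-8
text, used = decode_with_fallback(sample)
print(text, used)
```

The same ordering idea would apply to `pd.read_csv` by trying `encoding='utf-8'` before `encoding='latin-1'`, assuming the file's real encoding is one of the two.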


## 3. Load and Prepare PCOS Dataset


In [5]:
# Load PCOS Dataset
pcos_df = pd.read_csv('PCOS_extended_dataset.csv')

# Preprocess PCOS data
pcos_df_processed = pcos_df.copy()

# Drop ID columns
cols_to_drop = ['Sl. No', 'Patient File No.']
pcos_df_processed.drop(columns=[c for c in cols_to_drop if c in pcos_df_processed.columns], inplace=True)

# Clean column names
pcos_df_processed.columns = pcos_df_processed.columns.str.strip()

# Convert to numeric
for col in pcos_df_processed.columns:
    pcos_df_processed[col] = pd.to_numeric(pcos_df_processed[col], errors='coerce')

# Handle missing values
pcos_df_processed.fillna(pcos_df_processed.median(), inplace=True)

# Prepare features and target
target_col = 'PCOS (Y/N)'
X_pcos = pcos_df_processed.drop(columns=[target_col])
y_pcos = pcos_df_processed[target_col]

# Split
X_train_pcos, X_test_pcos, y_train_pcos, y_test_pcos = train_test_split(
    X_pcos, y_pcos, test_size=0.2, random_state=42, stratify=y_pcos
)

# Store feature names
pcos_features = X_pcos.columns.tolist()
print(f"PCOS - Features: {len(pcos_features)}")
print(f"PCOS - Training samples: {len(X_train_pcos)}")
print(f"PCOS - Top 10 Features: {pcos_features[:10]}")


PCOS - Features: 41
PCOS - Training samples: 1600
PCOS - Top 10 Features: ['Age (yrs)', 'Weight (Kg)', 'Height(Cm)', 'BMI', 'Blood Group', 'Pulse rate(bpm)', 'RR (breaths/min)', 'Hb(g/dl)', 'Cycle(R/I)', 'Cycle length(days)']


## 4. Load and Prepare Breast Cancer Dataset


In [6]:
# Load Breast Cancer Dataset
breast_df = pd.read_csv('breast-cancer.csv')

# Preprocess Breast Cancer data
breast_df_processed = breast_df.copy()

# Drop ID column
if 'id' in breast_df_processed.columns:
    breast_df_processed.drop(columns=['id'], inplace=True)

# Handle missing values and duplicates
breast_df_processed.dropna(axis=1, how='all', inplace=True)
if breast_df_processed.duplicated().sum() > 0:
    breast_df_processed.drop_duplicates(inplace=True)
breast_df_processed.fillna(breast_df_processed.mean(numeric_only=True), inplace=True)

# Encode target (M=1, B=0)
le_diagnosis = LabelEncoder()
breast_df_processed['diagnosis'] = le_diagnosis.fit_transform(breast_df_processed['diagnosis'])

# Prepare features and target
X_breast = breast_df_processed.drop(columns=['diagnosis'])
y_breast = breast_df_processed['diagnosis']

# Split
X_train_breast, X_test_breast, y_train_breast, y_test_breast = train_test_split(
    X_breast, y_breast, test_size=0.2, random_state=42, stratify=y_breast
)

# Store feature names
breast_features = X_breast.columns.tolist()
print(f"Breast Cancer - Features: {len(breast_features)}")
print(f"Breast Cancer - Training samples: {len(X_train_breast)}")
print(f"Breast Cancer - Top 10 Features: {breast_features[:10]}")


Breast Cancer - Features: 30
Breast Cancer - Training samples: 455
Breast Cancer - Top 10 Features: ['radius_mean', 'texture_mean', 'perimeter_mean', 'area_mean', 'smoothness_mean', 'compactness_mean', 'concavity_mean', 'concave points_mean', 'symmetry_mean', 'fractal_dimension_mean']


## 5. Load and Prepare Heart Disease Dataset


In [7]:
# Load Heart Disease Dataset
heart_df = pd.read_csv('synthetic_heart_disease_dataset.csv')

# Preprocess Heart Disease data
heart_df_processed = heart_df.copy()

# Handle missing values
heart_df_processed['Alcohol_Intake'] = heart_df_processed['Alcohol_Intake'].fillna('None')

# Drop rows with any remaining NaN values
heart_df_processed.dropna(inplace=True)

# Encode categorical variables using the same mappings as in the original notebook
mappings = {
    'Stress_Level': {'Low': 0, 'Medium': 1, 'High': 2},
    'Diet': {'Unhealthy': 0, 'Average': 1, 'Healthy': 2},
    'Alcohol_Intake': {'None': 0, 'Low': 1, 'Moderate': 2, 'High': 3},
    'Physical_Activity': {'Sedentary': 0, 'Moderate': 1, 'Active': 2},
    'Smoking': {'Never': 0, 'Former': 1, 'Current': 2},
    'Gender': {'Male': 0, 'Female': 1}
}

# Apply mappings
for col, mapping in mappings.items():
    if col in heart_df_processed.columns:
        heart_df_processed[col] = heart_df_processed[col].map(mapping)
        # Drop rows with unmapped values
        heart_df_processed.dropna(subset=[col], inplace=True)

# Prepare features and target
X_heart = heart_df_processed.drop(columns=['Heart_Disease'])
y_heart = heart_df_processed['Heart_Disease']

# Align y with X after dropping NaNs
y_heart = y_heart.loc[X_heart.index]

# Split
X_train_heart, X_test_heart, y_train_heart, y_test_heart = train_test_split(
    X_heart, y_heart, test_size=0.2, random_state=42, stratify=y_heart
)

# Store feature names and encoders for later use
heart_features = X_heart.columns.tolist()
heart_encoders = mappings  # Store mappings for UI

print(f"Heart Disease - Features: {len(heart_features)}")
print(f"Heart Disease - Training samples: {len(X_train_heart)}")
print(f"Heart Disease - Features: {heart_features[:10]}...")


Heart Disease - Features: 20
Heart Disease - Training samples: 40000
Heart Disease - Features: ['Age', 'Gender', 'Weight', 'Height', 'BMI', 'Smoking', 'Alcohol_Intake', 'Physical_Activity', 'Diet', 'Stress_Level']...
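
In the cell above, `Series.map` turns any value that is missing from a mapping into `NaN`, and those rows are then dropped without a message. A stricter variant fails loudly on unexpected categories instead. The sketch below illustrates that idea on a plain dictionary row; `encode_row` is a hypothetical helper, and only two of the notebook's mappings are repeated here.

```python
# Subset of the ordinal mappings from the heart-disease cell
mappings = {
    "Stress_Level": {"Low": 0, "Medium": 1, "High": 2},
    "Smoking": {"Never": 0, "Former": 1, "Current": 2},
}


def encode_row(row: dict, mappings: dict) -> dict:
    """Return a copy of row with mapped columns replaced by integer codes."""
    encoded = dict(row)
    for col, mapping in mappings.items():
        if col not in row:
            continue  # column not present in this row
        value = row[col]
        if value not in mapping:
            raise ValueError(
                f"Unexpected value {value!r} for {col!r}; "
                f"expected one of {sorted(mapping)}"
            )
        encoded[col] = mapping[value]
    return encoded


print(encode_row({"Age": 50, "Stress_Level": "High", "Smoking": "Never"}, mappings))
```

Whether to drop or reject unmapped rows is a design choice: dropping keeps the pipeline running on messy data, while raising makes data-quality problems visible early.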


## 5. Train Models for Each Disease


In [8]:
# Train Kidney Disease Model (Random Forest)
print("Training Kidney Disease Model...")
model_kidney = RandomForestClassifier(n_estimators=100, random_state=42, n_jobs=-1)
model_kidney.fit(X_train_kidney, y_train_kidney)
# A scaler is fitted here but not used: the Random Forest above is trained and
# evaluated on unscaled features, so predict_kidney must pass unscaled inputs too.
scaler_kidney = StandardScaler()
X_train_kidney_scaled = scaler_kidney.fit_transform(X_train_kidney)
kidney_accuracy = model_kidney.score(X_test_kidney, y_test_kidney)
print(f"Kidney Disease Model Accuracy: {kidney_accuracy:.4f}")

# Store medication encoder for later use
kidney_medication_encoder = le_medication


Training Kidney Disease Model...
Kidney Disease Model Accuracy: 1.0000


In [9]:
# Train Liver Disease Model (Random Forest)
print("Training Liver Disease Model...")
model_liver = RandomForestClassifier(n_estimators=100, random_state=42, n_jobs=-1)
model_liver.fit(X_train_liver, y_train_liver)
# A scaler is fitted here but not used: the model is trained and evaluated on unscaled features.
scaler_liver = StandardScaler()
X_train_liver_scaled = scaler_liver.fit_transform(X_train_liver)
liver_accuracy = model_liver.score(X_test_liver, y_test_liver)
print(f"Liver Disease Model Accuracy: {liver_accuracy:.4f}")

# Store gender encoder for later use
liver_gender_encoder = le_gender


Training Liver Disease Model...
Liver Disease Model Accuracy: 0.9979


In [10]:
# Train PCOS Model (Random Forest)
print("Training PCOS Model...")
scaler_pcos = StandardScaler()
X_train_pcos_scaled = scaler_pcos.fit_transform(X_train_pcos)
model_pcos = RandomForestClassifier(n_estimators=100, random_state=42, n_jobs=-1)
model_pcos.fit(X_train_pcos_scaled, y_train_pcos)
X_test_pcos_scaled = scaler_pcos.transform(X_test_pcos)
pcos_accuracy = model_pcos.score(X_test_pcos_scaled, y_test_pcos)
print(f"PCOS Model Accuracy: {pcos_accuracy:.4f}")


Training PCOS Model...
PCOS Model Accuracy: 0.9975


In [11]:
# Train Breast Cancer Model (SVM - best performing)
print("Training Breast Cancer Model...")
scaler_breast = StandardScaler()
X_train_breast_scaled = scaler_breast.fit_transform(X_train_breast)
model_breast = SVC(probability=True, random_state=42)
model_breast.fit(X_train_breast_scaled, y_train_breast)
X_test_breast_scaled = scaler_breast.transform(X_test_breast)
breast_accuracy = model_breast.score(X_test_breast_scaled, y_test_breast)
print(f"Breast Cancer Model Accuracy: {breast_accuracy:.4f}")

# Train Heart Disease Model (Random Forest - best performing)
print("Training Heart Disease Model...")
scaler_heart = StandardScaler()
X_train_heart_scaled = scaler_heart.fit_transform(X_train_heart)
model_heart = RandomForestClassifier(n_estimators=100, random_state=42, n_jobs=-1)
model_heart.fit(X_train_heart_scaled, y_train_heart)
X_test_heart_scaled = scaler_heart.transform(X_test_heart)
heart_accuracy = model_heart.score(X_test_heart_scaled, y_test_heart)
print(f"Heart Disease Model Accuracy: {heart_accuracy:.4f}")
print("\nAll models trained successfully!")


Training Breast Cancer Model...
Breast Cancer Model Accuracy: 0.9737
Training Heart Disease Model...
Heart Disease Model Accuracy: 1.0000

All models trained successfully!
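
The PCOS, breast cancer, and heart disease cells fit each `StandardScaler` on the training split and then reuse the fitted statistics for the test split and for new inputs. The sketch below reproduces that pattern in plain Python for a single column, as a rough stand-in for `fit_transform` and `transform`. The function names `fit_standardizer` and `standardize` and the sample numbers are hypothetical.

```python
from statistics import fmean, pstdev


def fit_standardizer(train_column):
    """Compute mean and population standard deviation from training data only."""
    mean = fmean(train_column)
    std = pstdev(train_column) or 1.0  # constant column: avoid division by zero
    return mean, std


def standardize(column, mean, std):
    """Apply previously fitted statistics to any column (train, test, or new input)."""
    return [(x - mean) / std for x in column]


train = [1.0, 2.0, 3.0, 4.0, 5.0]
test = [3.0, 6.0]

mean, std = fit_standardizer(train)   # statistics come from the training split
print(standardize(train, mean, std))
print(standardize(test, mean, std))   # test values reuse the training statistics
```

Reusing the training statistics keeps information from the test split out of preprocessing, and it means a model trained on scaled features must also receive scaled inputs at prediction time.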


## 6. Create Prediction Functions


In [12]:
# Prediction function for Kidney Disease
def predict_kidney(creatinine, bun, gfr, urine_output, diabetes, hypertension, 
                   age, protein_in_urine, water_intake, medication):
    """Predict Kidney Disease"""
    try:
        # Encode medication
        medication_encoded = kidney_medication_encoder.transform([medication])[0]
        
        # Create input array
        input_data = np.array([[
            float(creatinine), float(bun), float(gfr), float(urine_output),
            int(diabetes), int(hypertension), float(age), float(protein_in_urine),
            float(water_intake), medication_encoded
        ]])
        
        # Make prediction
        prediction = model_kidney.predict(input_data)[0]
        probability = model_kidney.predict_proba(input_data)[0]
        
        if prediction == 1:
            result = f"⚠️ **Kidney Disease Detected**\n\n"
            result += f"Probability of Disease: {probability[1]*100:.2f}%\n"
            result += f"Probability of No Disease: {probability[0]*100:.2f}%\n\n"
            result += "**Recommendation:** Please consult with a healthcare professional immediately."
        else:
            result = f"✅ **No Kidney Disease Detected**\n\n"
            result += f"Probability of No Disease: {probability[0]*100:.2f}%\n"
            result += f"Probability of Disease: {probability[1]*100:.2f}%\n\n"
            result += "**Recommendation:** Continue maintaining a healthy lifestyle."
        
        return result
    except Exception as e:
        return f"Error: {str(e)}"


In [13]:
# Prediction function for Heart Disease
def predict_heart(age, gender, weight, height, bmi, smoking, alcohol_intake, 
                  physical_activity, diet, stress_level, hypertension, diabetes,
                  hyperlipidemia, family_history, previous_heart_attack,
                  systolic_bp, diastolic_bp, heart_rate, blood_sugar_fasting, cholesterol_total):
    """Predict Heart Disease"""
    try:
        # Encode categorical variables using stored mappings
        gender_encoded = heart_encoders['Gender'][gender]
        smoking_encoded = heart_encoders['Smoking'][smoking]
        alcohol_encoded = heart_encoders['Alcohol_Intake'][alcohol_intake]
        activity_encoded = heart_encoders['Physical_Activity'][physical_activity]
        diet_encoded = heart_encoders['Diet'][diet]
        stress_encoded = heart_encoders['Stress_Level'][stress_level]
        
        # Create input array in the correct feature order
        input_data = np.array([[
            float(age), gender_encoded, float(weight), float(height), float(bmi),
            smoking_encoded, alcohol_encoded, activity_encoded, diet_encoded, stress_encoded,
            int(hypertension), int(diabetes), int(hyperlipidemia), int(family_history),
            int(previous_heart_attack), float(systolic_bp), float(diastolic_bp),
            float(heart_rate), float(blood_sugar_fasting), float(cholesterol_total)
        ]])
        
        # Scale the input
        input_data_scaled = scaler_heart.transform(input_data)
        
        # Make prediction
        prediction = model_heart.predict(input_data_scaled)[0]
        probability = model_heart.predict_proba(input_data_scaled)[0]
        
        if prediction == 1:
            result = f"⚠️ **Heart Disease Detected**\n\n"
            result += f"Probability of Heart Disease: {probability[1]*100:.2f}%\n"
            result += f"Probability of No Heart Disease: {probability[0]*100:.2f}%\n\n"
            result += "**Recommendation:** Please consult with a cardiologist immediately for further evaluation and treatment."
        else:
            result = f"✅ **No Heart Disease Detected**\n\n"
            result += f"Probability of No Heart Disease: {probability[0]*100:.2f}%\n"
            result += f"Probability of Heart Disease: {probability[1]*100:.2f}%\n\n"
            result += "**Recommendation:** Continue maintaining a healthy lifestyle with regular exercise and a balanced diet."
        
        return result
    except Exception as e:
        return f"Error: {str(e)}"


In [14]:
# Prediction function for Liver Disease
def predict_liver(age, gender, total_bilirubin, direct_bilirubin, alkaline_phosphotase,
                  sgpt, sgot, total_protiens, albumin, a_g_ratio):
    """Predict Liver Disease"""
    try:
        # Encode gender
        gender_encoded = liver_gender_encoder.transform([gender])[0]
        
        # Create input array
        input_data = np.array([[
            float(age), gender_encoded, float(total_bilirubin), float(direct_bilirubin),
            float(alkaline_phosphotase), float(sgpt), float(sgot), float(total_protiens),
            float(albumin), float(a_g_ratio)
        ]])
        
        # Make prediction
        prediction = model_liver.predict(input_data)[0]
        probability = model_liver.predict_proba(input_data)[0]
        
        if prediction == 1:
            result = f"⚠️ **Liver Disease Detected**\n\n"
            result += f"Probability of Disease: {probability[1]*100:.2f}%\n"
            result += f"Probability of No Disease: {probability[0]*100:.2f}%\n\n"
            result += "**Recommendation:** Please consult with a healthcare professional immediately."
        else:
            result = f"✅ **No Liver Disease Detected**\n\n"
            result += f"Probability of No Disease: {probability[0]*100:.2f}%\n"
            result += f"Probability of Disease: {probability[1]*100:.2f}%\n\n"
            result += "**Recommendation:** Continue maintaining a healthy lifestyle."
        
        return result
    except Exception as e:
        return f"Error: {str(e)}"


In [15]:
# Prediction function for PCOS
def predict_pcos(*args):
    """Predict PCOS - Takes all features as arguments"""
    try:
        # Convert arguments to numpy array
        input_data = np.array([args], dtype=float)
        
        # Scale the input
        input_data_scaled = scaler_pcos.transform(input_data)
        
        # Make prediction
        prediction = model_pcos.predict(input_data_scaled)[0]
        probability = model_pcos.predict_proba(input_data_scaled)[0]
        
        if prediction == 1:
            result = f"⚠️ **PCOS Detected**\n\n"
            result += f"Probability of PCOS: {probability[1]*100:.2f}%\n"
            result += f"Probability of No PCOS: {probability[0]*100:.2f}%\n\n"
            result += "**Recommendation:** Please consult with a healthcare professional for proper diagnosis and treatment."
        else:
            result = f"✅ **No PCOS Detected**\n\n"
            result += f"Probability of No PCOS: {probability[0]*100:.2f}%\n"
            result += f"Probability of PCOS: {probability[1]*100:.2f}%\n\n"
            result += "**Recommendation:** Continue maintaining a healthy lifestyle."
        
        return result
    except Exception as e:
        return f"Error: {str(e)}"


In [16]:
# Prediction function for Breast Cancer
def predict_breast(*args):
    """Predict Breast Cancer - Takes all features as arguments"""
    try:
        # Convert arguments to numpy array
        input_data = np.array([args], dtype=float)
        
        # Scale the input
        input_data_scaled = scaler_breast.transform(input_data)
        
        # Make prediction
        prediction = model_breast.predict(input_data_scaled)[0]
        probability = model_breast.predict_proba(input_data_scaled)[0]
        
        # Decode prediction (0=Benign, 1=Malignant)
        if prediction == 1:
            result = f"⚠️ **Malignant (Cancer Detected)**\n\n"
            result += f"Probability of Malignancy: {probability[1]*100:.2f}%\n"
            result += f"Probability of Benign: {probability[0]*100:.2f}%\n\n"
            result += "**Recommendation:** Please consult with an oncologist immediately for further evaluation."
        else:
            result = f"✅ **Benign (No Cancer Detected)**\n\n"
            result += f"Probability of Benign: {probability[0]*100:.2f}%\n"
            result += f"Probability of Malignancy: {probability[1]*100:.2f}%\n\n"
            result += "**Recommendation:** Continue regular check-ups and maintain a healthy lifestyle."
        
        return result
    except Exception as e:
        return f"Error: {str(e)}"
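
The five prediction functions above build their result text the same way. A shared formatter is one way to keep the wording consistent. The sketch below is a hypothetical helper (`format_result` is not part of the notebook) and assumes `probability` is ordered as `[P(class 0), P(class 1)]`, which is the order `predict_proba` uses for labels 0 and 1.

```python
def format_result(condition, prediction, probability,
                  advice_if_positive, advice_if_negative):
    """Build the Markdown summary for a binary prediction."""
    p_negative = probability[0] * 100
    p_positive = probability[1] * 100
    if prediction == 1:
        header = f"⚠️ **{condition} Detected**"
        first = f"Probability of {condition}: {p_positive:.2f}%"
        second = f"Probability of No {condition}: {p_negative:.2f}%"
        advice = advice_if_positive
    else:
        header = f"✅ **No {condition} Detected**"
        first = f"Probability of No {condition}: {p_negative:.2f}%"
        second = f"Probability of {condition}: {p_positive:.2f}%"
        advice = advice_if_negative
    return f"{header}\n\n{first}\n{second}\n\n**Recommendation:** {advice}"


print(format_result(
    "PCOS", 0, [0.9, 0.1],
    "Please consult with a healthcare professional.",
    "Continue maintaining a healthy lifestyle.",
))
```

Each `predict_*` function could then return `format_result(...)` with its own condition name and advice strings, leaving only the input preparation specific to each disease.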


## 7. Create Web Interface with Gradio


In [17]:
# Heart Disease inputs with better organization and info
heart_inputs = [
    gr.Number(label="Age (years)", value=50.0, minimum=30, maximum=80,
             info="Your current age"),
    gr.Dropdown(label="Gender", choices=["Male", "Female"], value="Male"),
    gr.Number(label="Weight (kg)", value=75.0, minimum=50, maximum=120,
             info="Your body weight in kilograms"),
    gr.Number(label="Height (cm)", value=170.0, minimum=150, maximum=200,
             info="Your height in centimeters"),
    gr.Number(label="BMI (Body Mass Index)", value=25.0, minimum=18, maximum=40,
             info="Calculated as Weight(kg) / Height(m)²"),
    gr.Dropdown(label="Smoking Status", choices=["Never", "Former", "Current"], value="Never",
               info="Your smoking history"),
    gr.Dropdown(label="Alcohol Intake", choices=["None", "Low", "Moderate", "High"], value="None",
               info="Your alcohol consumption level"),
    gr.Dropdown(label="Physical Activity", choices=["Sedentary", "Moderate", "Active"], value="Moderate",
               info="Your level of physical activity"),
    gr.Dropdown(label="Diet", choices=["Unhealthy", "Average", "Healthy"], value="Average",
               info="Your diet quality"),
    gr.Dropdown(label="Stress Level", choices=["Low", "Medium", "High"], value="Medium",
               info="Your stress level"),
    gr.Radio(label="Hypertension", choices=["No", "Yes"], value="No",
            info="Do you have high blood pressure?"),
    gr.Radio(label="Diabetes", choices=["No", "Yes"], value="No",
            info="Do you have diabetes?"),
    gr.Radio(label="Hyperlipidemia", choices=["No", "Yes"], value="No",
            info="Do you have high cholesterol?"),
    gr.Radio(label="Family History of Heart Disease", choices=["No", "Yes"], value="No",
            info="Do you have a family history of heart disease?"),
    gr.Radio(label="Previous Heart Attack", choices=["No", "Yes"], value="No",
            info="Have you had a previous heart attack?"),
    gr.Number(label="Systolic Blood Pressure (mmHg)", value=120.0, minimum=90, maximum=200,
             info="Normal: <120 mmHg"),
    gr.Number(label="Diastolic Blood Pressure (mmHg)", value=80.0, minimum=60, maximum=120,
             info="Normal: <80 mmHg"),
    gr.Number(label="Heart Rate (bpm)", value=72.0, minimum=60, maximum=120,
             info="Resting heart rate in beats per minute"),
    gr.Number(label="Fasting Blood Sugar (mg/dL)", value=100.0, minimum=70, maximum=200,
             info="Normal: 70-100 mg/dL"),
    gr.Number(label="Total Cholesterol (mg/dL)", value=200.0, minimum=150, maximum=300,
             info="Normal: <200 mg/dL")
]

# Helper function to convert Yes/No to 0/1 for heart prediction
def convert_heart_inputs(age, gender, weight, height, bmi, smoking, alcohol_intake,
                         physical_activity, diet, stress_level, hypertension, diabetes,
                         hyperlipidemia, family_history, previous_heart_attack,
                         systolic_bp, diastolic_bp, heart_rate, blood_sugar_fasting, cholesterol_total):
    """Convert inputs for heart prediction"""
    hypertension_val = 1 if hypertension == "Yes" else 0
    diabetes_val = 1 if diabetes == "Yes" else 0
    hyperlipidemia_val = 1 if hyperlipidemia == "Yes" else 0
    family_history_val = 1 if family_history == "Yes" else 0
    previous_heart_attack_val = 1 if previous_heart_attack == "Yes" else 0
    
    return predict_heart(age, gender, weight, height, bmi, smoking, alcohol_intake,
                         physical_activity, diet, stress_level, hypertension_val, diabetes_val,
                         hyperlipidemia_val, family_history_val, previous_heart_attack_val,
                         systolic_bp, diastolic_bp, heart_rate, blood_sugar_fasting, cholesterol_total)

heart_interface = gr.Interface(
    fn=convert_heart_inputs,
    inputs=heart_inputs,
    outputs=gr.Markdown(label="Prediction Result"),
    title="❤️ Heart Disease Prediction",
    description="""
    **Enter your medical information and lifestyle factors to check for Heart Disease.**
    
    This tool analyzes various risk factors including demographics, lifestyle choices, medical history,
    and vital signs to assess your risk of heart disease. **Please note:** This is a screening tool 
    and should not replace professional medical consultation with a cardiologist.
    """,
    theme=gr.themes.Soft(),
    examples=[
        [50.0, "Male", 75.0, 170.0, 25.0, "Never", "None", "Moderate", "Healthy", "Low",
         "No", "No", "No", "No", "No", 120.0, 80.0, 72.0, 100.0, 200.0],
        [65.0, "Male", 90.0, 175.0, 29.0, "Former", "Moderate", "Sedentary", "Unhealthy", "High",
         "Yes", "Yes", "Yes", "Yes", "No", 150.0, 95.0, 85.0, 140.0, 250.0]
    ]
)


In [18]:
# Create Gradio Interface with improved UI

# Define input components for Kidney Disease with better labels, info, and validation
kidney_inputs_list = [
    gr.Number(label="Creatinine (mg/dL)", value=1.0, 
             info="Normal range: 0.6-1.2 mg/dL", minimum=0.1, maximum=10.0),
    gr.Number(label="BUN - Blood Urea Nitrogen (mg/dL)", value=15.0,
              info="Normal range: 7-20 mg/dL", minimum=1.0, maximum=200.0),
    gr.Number(label="GFR - Glomerular Filtration Rate (mL/min/1.73m²)", value=90.0,
              info="Normal: >90, CKD Stage 1-5: <90", minimum=0.0, maximum=200.0),
    gr.Number(label="Daily Urine Output (mL)", value=1500.0,
              info="Average: 800-2000 mL/day", minimum=0.0, maximum=10000.0),
    gr.Radio(label="Diabetes", choices=["No", "Yes"], value="No",
             info="Do you have diabetes?"),
    gr.Radio(label="Hypertension", choices=["No", "Yes"], value="No",
             info="Do you have high blood pressure?"),
    gr.Number(label="Age (years)", value=45.0, minimum=0, maximum=120),
    gr.Number(label="Protein in Urine (mg/day)", value=100.0,
              info="Normal: <150 mg/day", minimum=0.0, maximum=5000.0),
    gr.Number(label="Daily Water Intake (Liters)", value=2.0,
              info="Recommended: 2-3 L/day", minimum=0.0, maximum=20.0),
    gr.Dropdown(label="Current Medication", 
               choices=list(kidney_medication_encoder.classes_), 
               value="No Medication",
               info="Select your current medication type")
]

# Define input components for Liver Disease with better labels and info
liver_inputs_list = [
    gr.Number(label="Age (years)", value=45.0, minimum=0, maximum=120),
    gr.Dropdown(label="Gender", choices=list(liver_gender_encoder.classes_), 
               value="Male"),
    gr.Number(label="Total Bilirubin (mg/dL)", value=1.0,
              info="Normal: 0.1-1.2 mg/dL", minimum=0.0, maximum=50.0),
    gr.Number(label="Direct Bilirubin (mg/dL)", value=0.3,
              info="Normal: 0.0-0.3 mg/dL", minimum=0.0, maximum=30.0),
    gr.Number(label="Alkaline Phosphatase (U/L)", value=100.0,
              info="Normal: 44-147 U/L", minimum=0.0, maximum=2000.0),
    gr.Number(label="SGPT/ALT (U/L)", value=30.0,
              info="Normal: 7-56 U/L", minimum=0.0, maximum=5000.0),
    gr.Number(label="SGOT/AST (U/L)", value=30.0,
              info="Normal: 10-40 U/L", minimum=0.0, maximum=5000.0),
    gr.Number(label="Total Proteins (g/dL)", value=7.0,
              info="Normal: 6.0-8.3 g/dL", minimum=0.0, maximum=20.0),
    gr.Number(label="Albumin (g/dL)", value=4.0,
              info="Normal: 3.5-5.0 g/dL", minimum=0.0, maximum=10.0),
    gr.Number(label="A/G Ratio (Albumin/Globulin)", value=1.0,
              info="Normal: 1.0-2.2", minimum=0.0, maximum=5.0)
]

print("Input components defined with improved UI!")


Input components defined with improved UI!


In [19]:
# Helper function to convert Yes/No to 0/1 for kidney prediction
def convert_kidney_inputs(creatinine, bun, gfr, urine_output, diabetes, hypertension, 
                          age, protein_urine, water_intake, medication):
    """Convert inputs for kidney prediction"""
    diabetes_val = 1 if diabetes == "Yes" else 0
    hypertension_val = 1 if hypertension == "Yes" else 0
    return predict_kidney(creatinine, bun, gfr, urine_output, diabetes_val, hypertension_val,
                         age, protein_urine, water_intake, medication)

# Create improved interfaces for each disease with better UI
kidney_interface = gr.Interface(
    fn=convert_kidney_inputs,
    inputs=kidney_inputs_list,
    outputs=gr.Markdown(label="Prediction Result"),
    title="🔬 Chronic Kidney Disease (CKD) Prediction",
    description="""
    **Enter your medical test results and information to check for Chronic Kidney Disease.**
    
    This tool uses machine learning to analyze your kidney function test results and provide 
    a prediction. **Please note:** This is a screening tool and should not replace professional 
    medical consultation.
    """,
    theme=gr.themes.Soft(),
    examples=[
        [1.0, 15.0, 90.0, 1500.0, "No", "No", 45.0, 100.0, 2.0, "No Medication"],
        [2.5, 40.0, 45.0, 800.0, "Yes", "Yes", 60.0, 500.0, 1.5, "ACE Inhibitor"]
    ]
)

liver_interface = gr.Interface(
    fn=predict_liver,
    inputs=liver_inputs_list,
    outputs=gr.Markdown(label="Prediction Result"),
    title="🫀 Liver Disease Prediction",
    description="""
    **Enter your liver function test results to check for Liver Disease.**
    
    This tool analyzes your liver function test (LFT) results including bilirubin levels, 
    liver enzymes, and protein levels. **Please note:** This is a screening tool and should 
    not replace professional medical consultation.
    """,
    theme=gr.themes.Soft(),
    examples=[
        [45.0, "Male", 1.0, 0.3, 100.0, 30.0, 30.0, 7.0, 4.0, 1.0],
        [55.0, "Female", 3.5, 2.0, 200.0, 80.0, 95.0, 6.5, 3.2, 0.85]
    ]
)

print("Individual interfaces created!")


Individual interfaces created!


In [20]:
# Create a comprehensive interface with tabs for each disease
# We'll create a simplified version that's more user-friendly

# PCOS simplified inputs with better organization and info
pcos_simplified_inputs = [
    gr.Number(label="Age (years)", value=30.0, minimum=0, maximum=120,
             info="Your current age"),
    gr.Number(label="Weight (Kg)", value=60.0, minimum=20.0, maximum=200.0,
             info="Your body weight in kilograms"),
    gr.Number(label="Height (Cm)", value=160.0, minimum=100.0, maximum=250.0,
             info="Your height in centimeters"),
    gr.Number(label="BMI (Body Mass Index)", value=23.5, minimum=10.0, maximum=50.0,
             info="Calculated as Weight(kg) / Height(m)². Normal: 18.5-24.9"),
    gr.Number(label="Follicle Count (Right Ovary)", value=5.0, minimum=0, maximum=30,
             info="Number of follicles in right ovary"),
    gr.Number(label="Follicle Count (Left Ovary)", value=5.0, minimum=0, maximum=30,
             info="Number of follicles in left ovary"),
    gr.Number(label="LH - Luteinizing Hormone (mIU/mL)", value=5.0, minimum=0.0, maximum=200.0,
             info="Normal: 1.1-11.6 mIU/mL (follicular phase)"),
    gr.Number(label="FSH - Follicle Stimulating Hormone (mIU/mL)", value=6.0, minimum=0.0, maximum=200.0,
             info="Normal: 2.8-11.3 mIU/mL (follicular phase)"),
    gr.Number(label="AMH - Anti-Müllerian Hormone (ng/mL)", value=5.0, minimum=0.0, maximum=50.0,
             info="Normal: 1.0-4.0 ng/mL. PCOS often shows higher values"),
    gr.Number(label="Waist:Hip Ratio", value=0.85, minimum=0.5, maximum=2.0,
             info="Waist circumference divided by hip circumference. Normal: <0.85"),
]

# Function to predict PCOS with simplified inputs
def predict_pcos_simplified(age, weight, height, bmi, follicle_r, follicle_l, 
                            lh, fsh, amh, waist_hip_ratio):
    """Predict PCOS with simplified inputs"""
    try:
        # Create a full feature array with median values for other features
        input_array = []
        median_values = X_train_pcos.median().values
        
        # Map simplified inputs to full feature array (normalize feature names for matching)
        feature_map = {
            'age (yrs)': age,
            'weight (kg)': weight,
            'height(cm)': height,
            'bmi': bmi,
            'follicle no. (r)': follicle_r,
            'follicle no. (l)': follicle_l,
            'lh(miu/ml)': lh,
            'fsh(miu/ml)': fsh,
            'amh(ng/ml)': amh,
            'waist:hip ratio': waist_hip_ratio,
        }
        
        for i, feature in enumerate(pcos_features):
            # Normalize feature name for comparison (lowercase, strip spaces)
            feature_normalized = feature.lower().strip()
            if feature_normalized in feature_map:
                input_array.append(feature_map[feature_normalized])
            else:
                input_array.append(median_values[i])
        
        # Convert to numpy array and scale
        input_data = np.array([input_array], dtype=float)
        input_data_scaled = scaler_pcos.transform(input_data)
        
        # Make prediction
        prediction = model_pcos.predict(input_data_scaled)[0]
        probability = model_pcos.predict_proba(input_data_scaled)[0]
        
        if prediction == 1:
            result = "⚠️ **PCOS Detected**\n\n"
            result += f"Probability of PCOS: {probability[1]*100:.2f}%\n"
            result += f"Probability of No PCOS: {probability[0]*100:.2f}%\n\n"
            result += "**Recommendation:** Please consult with a healthcare professional for proper diagnosis and treatment."
        else:
            result = "✅ **No PCOS Detected**\n\n"
            result += f"Probability of No PCOS: {probability[0]*100:.2f}%\n"
            result += f"Probability of PCOS: {probability[1]*100:.2f}%\n\n"
            result += "**Recommendation:** Continue maintaining a healthy lifestyle."
        
        return result
    except Exception as e:
        return f"Error: {str(e)}"

pcos_interface = gr.Interface(
    fn=predict_pcos_simplified,
    inputs=pcos_simplified_inputs,
    outputs=gr.Markdown(label="Prediction Result"),
    title="🌸 PCOS (Polycystic Ovary Syndrome) Prediction",
    description="""
    **Enter your medical information to check for Polycystic Ovary Syndrome (PCOS).**
    
    PCOS is a hormonal disorder common among women of reproductive age. This tool analyzes 
    key indicators including hormone levels, physical measurements, and ultrasound findings.
    
    **Please note:** This is a screening tool and should not replace professional medical 
    consultation with a gynecologist or endocrinologist.
    """,
    theme=gr.themes.Soft(),
    examples=[
        [30.0, 60.0, 160.0, 23.5, 5.0, 5.0, 5.0, 6.0, 5.0, 0.85],
        [28.0, 75.0, 165.0, 27.5, 12.0, 10.0, 25.0, 5.5, 9.0, 0.95]
    ]
)
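
The name-based fill used in `predict_pcos_simplified` can be sketched as a standalone helper. This is a minimal illustration rather than the notebook's actual code: `build_feature_row`, the column names, and the median values below are all hypothetical. Unlike the function above, this version raises an error when a supplied key matches no column, so a misspelled feature name is not silently replaced by a median.

```python
def build_feature_row(feature_names, defaults, user_values):
    """Return one model-ready row: user values where given, defaults elsewhere.

    feature_names: column names in training order
    defaults:      fallback value per column (e.g. training medians), same order
    user_values:   dict keyed by normalized (lowercase, stripped) column name
    """
    if len(feature_names) != len(defaults):
        raise ValueError("feature_names and defaults must have the same length")

    # Same normalization as the notebook: lowercase and strip outer whitespace
    normalized = [name.lower().strip() for name in feature_names]

    # Surface keys that match no column instead of silently ignoring them
    unknown = set(user_values) - set(normalized)
    if unknown:
        raise KeyError(f"No matching feature for: {sorted(unknown)}")

    return [float(user_values.get(name, default))
            for name, default in zip(normalized, defaults)]


# Hypothetical column names and medians, for illustration only
columns = [" Age (yrs)", "BMI", "Pulse rate(bpm) ", "AMH(ng/mL)"]
medians = [31.0, 24.2, 72.0, 3.7]

row = build_feature_row(columns, medians, {"age (yrs)": 28, "amh(ng/ml)": 9.0})
print(row)  # [28.0, 24.2, 72.0, 9.0]
```

Failing on unknown keys is a design choice: a prediction built mostly from medians looks plausible even when none of the user's inputs reached the model.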


In [21]:
# Breast Cancer simplified inputs with better labels and info
breast_simplified_inputs = [
    gr.Number(label="Radius Mean (mm)", value=15.0, minimum=1.0, maximum=50.0,
             info="Mean of distances from center to points on the perimeter"),
    gr.Number(label="Texture Mean", value=20.0, minimum=1.0, maximum=50.0,
             info="Standard deviation of gray-scale values"),
    gr.Number(label="Perimeter Mean (mm)", value=100.0, minimum=10.0, maximum=300.0,
             info="Mean size of the core tumor perimeter"),
    gr.Number(label="Area Mean (mm²)", value=1000.0, minimum=50.0, maximum=5000.0,
             info="Mean area of the core tumor"),
    gr.Number(label="Smoothness Mean", value=0.1, minimum=0.0, maximum=1.0,
             info="Local variation in radius lengths (0-1 scale)"),
    gr.Number(label="Compactness Mean", value=0.15, minimum=0.0, maximum=1.0,
             info="Perimeter² / area - 1.0 (0-1 scale)"),
    gr.Number(label="Concavity Mean", value=0.15, minimum=0.0, maximum=1.0,
             info="Severity of concave portions of the contour (0-1 scale)"),
    gr.Number(label="Concave Points Mean", value=0.1, minimum=0.0, maximum=1.0,
             info="Number of concave portions of the contour (0-1 scale)"),
    gr.Number(label="Symmetry Mean", value=0.2, minimum=0.0, maximum=1.0,
             info="Symmetry of the tumor (0-1 scale)"),
    gr.Number(label="Fractal Dimension Mean", value=0.06, minimum=0.0, maximum=1.0,
             info="Coastline approximation - 1.0 (0-1 scale)"),
]

# Function to predict Breast Cancer with simplified inputs
def predict_breast_simplified(radius_mean, texture_mean, perimeter_mean, area_mean,
                              smoothness_mean, compactness_mean, concavity_mean,
                              concave_points_mean, symmetry_mean, fractal_dimension_mean):
    """Predict Breast Cancer with simplified inputs"""
    try:
        # Training-set medians stand in for the features this form does not collect
        median_values = X_train_breast.median().values
        
        # Map simplified inputs to the full feature array.
        # Assumes the first 10 columns of breast_features are the "mean" features, in this order.
        input_array = [
            radius_mean, texture_mean, perimeter_mean, area_mean,
            smoothness_mean, compactness_mean, concavity_mean,
            concave_points_mean, symmetry_mean, fractal_dimension_mean
        ]
        
        # Fill remaining features with median values
        for i in range(10, len(breast_features)):
            input_array.append(median_values[i])
        
        # Convert to numpy array and scale
        input_data = np.array([input_array], dtype=float)
        input_data_scaled = scaler_breast.transform(input_data)
        
        # Make prediction
        prediction = model_breast.predict(input_data_scaled)[0]
        probability = model_breast.predict_proba(input_data_scaled)[0]
        
        if prediction == 1:
            result = "⚠️ **Malignant (Cancer Detected)**\n\n"
            result += f"Probability of Malignancy: {probability[1]*100:.2f}%\n"
            result += f"Probability of Benign: {probability[0]*100:.2f}%\n\n"
            result += "**Recommendation:** Please consult with an oncologist immediately for further evaluation."
        else:
            result = "✅ **Benign (No Cancer Detected)**\n\n"
            result += f"Probability of Benign: {probability[0]*100:.2f}%\n"
            result += f"Probability of Malignancy: {probability[1]*100:.2f}%\n\n"
            result += "**Recommendation:** Continue regular check-ups and maintain a healthy lifestyle."
        
        return result
    except Exception as e:
        return f"Error: {str(e)}"

breast_interface = gr.Interface(
    fn=predict_breast_simplified,
    inputs=breast_simplified_inputs,
    outputs=gr.Markdown(label="Prediction Result"),
    title="🎗️ Breast Cancer Prediction",
    description="""
    **Enter breast cancer diagnostic features from imaging/tissue analysis.**
    
    This tool analyzes features extracted from digitized images of fine needle aspirate (FNA) 
    of breast masses. The features describe characteristics of the cell nuclei present in the image.
    
    **Important:** These values are typically extracted from medical imaging software and 
    pathology reports. **This tool is for educational/screening purposes only and should 
    not replace professional medical diagnosis.**
    """,
    theme=gr.themes.Soft(),
    examples=[
        [12.0, 18.0, 78.0, 500.0, 0.085, 0.12, 0.08, 0.05, 0.18, 0.055],
        [18.0, 22.0, 120.0, 1200.0, 0.12, 0.25, 0.30, 0.15, 0.25, 0.075]
    ]
)
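
`predict_breast_simplified` places the ten user inputs by position, so it only works if the first ten training columns are the "mean" features in the same order. A small guard along these lines could check that before the interface is built. This is a sketch under assumptions: `check_leading_features` is a hypothetical helper, and the expected names follow the commonly distributed CSV of the Wisconsin diagnostic dataset, which may differ from the column names in this notebook's file.

```python
# Assumed column names (commonly used CSV headers); adjust to the actual dataset
EXPECTED_MEAN_FEATURES = [
    "radius_mean", "texture_mean", "perimeter_mean", "area_mean",
    "smoothness_mean", "compactness_mean", "concavity_mean",
    "concave points_mean", "symmetry_mean", "fractal_dimension_mean",
]


def check_leading_features(feature_names, expected):
    """Raise ValueError unless the first len(expected) columns equal `expected`, in order."""
    leading = [name.strip().lower() for name in feature_names[:len(expected)]]
    if leading != expected:
        # Report each position where the actual column differs from the expected one
        mismatches = [
            f"position {i}: got {got!r}, expected {want!r}"
            for i, (got, want) in enumerate(zip(leading, expected))
            if got != want
        ]
        detail = "; ".join(mismatches) or "too few columns"
        raise ValueError(f"Positional input mapping is unsafe ({detail})")


# Illustrative column lists: one in the expected order, one with two columns swapped
columns = EXPECTED_MEAN_FEATURES + ["radius_se", "texture_se"]
check_leading_features(columns, EXPECTED_MEAN_FEATURES)  # passes silently

swapped = ["texture_mean", "radius_mean"] + columns[2:]
try:
    check_leading_features(swapped, EXPECTED_MEAN_FEATURES)
except ValueError as err:
    print(err)
```

In the notebook, the call would presumably be `check_leading_features(breast_features, EXPECTED_MEAN_FEATURES)`, run once after the features are prepared.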


In [22]:
# Create the main interface with tabs for all diseases (including Heart Disease)
multi_disease_interface = gr.TabbedInterface(
    [kidney_interface, liver_interface, pcos_interface, breast_interface, heart_interface],
    ["🔬 Kidney Disease", "🫀 Liver Disease", "🌸 PCOS", "🎗️ Breast Cancer", "❤️ Heart Disease"]
)

print("Main interface created with improved UI! (Now includes 5 diseases)")


Main interface created with improved UI! (Now includes 5 diseases)


## 8. Launch the Web Interface

The interface is served locally at 127.0.0.1. Open the URL printed below in your browser to use it.


In [23]:
# Launch the interface
# This will start a local server and provide a URL
# Gradio will automatically find an available port (starts from 7860)
# The URL will be displayed in the output - open it in your browser

# Note: If you get a port error, try:
# 1. Close any previous Gradio instances
# 2. Restart the kernel to close all running servers
# 3. Or manually close processes using port 7860

# Launch without specifying port - Gradio will find an available port automatically
multi_disease_interface.launch(
    share=False, 
    server_name="127.0.0.1"
)


* Running on local URL:  http://127.0.0.1:7860
* To create a public link, set `share=True` in `launch()`.
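
If `launch()` reports a port error, a quick check with the standard library can tell whether port 7860 is already taken. The helpers below are a hypothetical sketch and not part of Gradio, which already searches for a free port when none is specified. They only test whether a TCP bind succeeds at that moment, so another process could still claim the port between the check and the launch.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can bind to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use": something else holds the port
            return False
    return True


def first_free_port(start=7860, attempts=20, host="127.0.0.1"):
    """Return the first bindable port in [start, start + attempts)."""
    for port in range(start, start + attempts):
        if port_is_free(port, host):
            return port
    raise RuntimeError(f"No free port found in {start}-{start + attempts - 1}")


# Possible use with the interface above (server_port is a launch() parameter):
# multi_disease_interface.launch(
#     share=False,
#     server_name="127.0.0.1",
#     server_port=first_free_port(),
# )
```

If `port_is_free(7860)` returns `False` after a kernel restart, a process outside the notebook is probably holding the port.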


