# Title: Personalized Medical Recommendation System with Machine Learning
Description:
This notebook builds the disease-prediction component of a Personalized Medical Recommendation System, a platform intended to help users understand and manage their health. It trains machine learning classifiers on a table of binary symptom indicators and predicts the most likely disease (the `prognosis` column) from the symptoms a user reports.

# Load dataset and tools

In [None]:
import pandas as pd

In [3]:
dataset=pd.read_csv('Training.csv')

In [4]:
dataset

Unnamed: 0,itching,skin_rash,nodal_skin_eruptions,continuous_sneezing,shivering,chills,joint_pain,stomach_pain,acidity,ulcers_on_tongue,...,blackheads,scurring,skin_peeling,silver_like_dusting,small_dents_in_nails,inflammatory_nails,blister,red_sore_around_nose,yellow_crust_ooze,prognosis
0,1,1,1,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,Fungal infection
1,0,1,1,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,Fungal infection
2,1,0,1,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,Fungal infection
3,1,1,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,Fungal infection
4,1,1,1,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,Fungal infection
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4915,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,(vertigo) Paroymsal Positional Vertigo
4916,0,1,0,0,0,0,0,0,0,0,...,1,1,0,0,0,0,0,0,0,Acne
4917,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,Urinary tract infection
4918,0,1,0,0,0,0,1,0,0,0,...,0,0,1,1,1,1,0,0,0,Psoriasis


In [5]:
dataset.shape

(4920, 133)

# Train/test split


In [15]:
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
X = dataset.drop('prognosis', axis=1)
y = dataset['prognosis']

# Encode the prognosis labels as integers
le = LabelEncoder()
le.fit(y)
Y = le.transform(y)

X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.3, random_state=20)
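
The encoding and split steps can be illustrated on a small toy example. This is a minimal sketch, not the notebook's actual data: the label list, the random 4-column feature matrix, and the use of `stratify` are assumptions added for illustration. It shows how `LabelEncoder` numbers the classes in sorted order, how `inverse_transform` recovers the disease names, and how a stratified split keeps class proportions equal across the two sets.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Toy labels standing in for the 'prognosis' column (illustrative only)
labels = np.array(["Acne", "Psoriasis", "Acne", "Fungal infection",
                   "Psoriasis", "Fungal infection", "Acne", "Psoriasis",
                   "Fungal infection", "Acne", "Psoriasis", "Fungal infection"])

# Toy binary symptom matrix: 12 rows, 4 hypothetical symptom columns
rng = np.random.default_rng(0)
features = rng.integers(0, 2, size=(len(labels), 4))

le = LabelEncoder()
encoded = le.fit_transform(labels)  # classes are sorted, then numbered from 0
print(dict(zip(le.classes_, le.transform(le.classes_))))

# stratify=encoded keeps the class proportions the same in both splits
X_tr, X_te, y_tr, y_te = train_test_split(
    features, encoded, test_size=0.25, random_state=20, stratify=encoded)

# inverse_transform maps integer codes back to disease names
decoded = le.inverse_transform(y_te)
print(decoded)
```

Whether stratification matters for `Training.csv` depends on how balanced its classes are, which `dataset['prognosis'].value_counts()` would show.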

# Training top models

In [18]:

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, confusion_matrix
from scipy.sparse import issparse

# Load dataset
dataset = pd.read_csv('Training.csv')

# Ensure all columns except 'prognosis' are numerical
X = dataset.drop('prognosis', axis=1).astype(float)  
y = dataset['prognosis']

# Encode target labels
le = LabelEncoder()
y_encoded = le.fit_transform(y)

# Split data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y_encoded, test_size=0.3, random_state=42)

# Convert sparse matrices to dense if necessary
if issparse(X_train):
    X_train = X_train.toarray()
if issparse(X_test):
    X_test = X_test.toarray()

# Convert X_train and X_test to numpy arrays and ensure correct data type
X_train = np.array(X_train, dtype=np.float64)
X_test = np.array(X_test, dtype=np.float64)

# Define models
models = {
    'SVC': SVC(kernel='linear'),
    'RandomForest': RandomForestClassifier(n_estimators=100, random_state=42),
    'GradientBoosting': GradientBoostingClassifier(n_estimators=100, random_state=42),
    'KNeighbors': KNeighborsClassifier(n_neighbors=5),
    'GaussianNB': GaussianNB()
}

# Train, test, and evaluate models
for model_name, model in models.items():
    try:
        # Train model
        model.fit(X_train, y_train)

        # Test model
        predictions = model.predict(X_test)

        # Calculate accuracy
        accuracy = accuracy_score(y_test, predictions)
        print(f"{model_name} Accuracy: {accuracy:.4f}")

        # Print confusion matrix
        cm = confusion_matrix(y_test, predictions)
        print(f"{model_name} Confusion Matrix:")
        print(np.array2string(cm, separator=', '))

    except Exception as e:
        print(f"Error in {model_name}: {e}")

    print("\n" + "="*40 + "\n")

SVC Accuracy: 1.0000
SVC Confusion Matrix:
[[32,  0,  0, ...,  0,  0,  0],
 [ 0, 39,  0, ...,  0,  0,  0],
 [ 0,  0, 41, ...,  0,  0,  0],
 ...,
 [ 0,  0,  0, ..., 36,  0,  0],
 [ 0,  0,  0, ...,  0, 37,  0],
 [ 0,  0,  0, ...,  0,  0, 39]]


RandomForest Accuracy: 1.0000
RandomForest Confusion Matrix:
[[32,  0,  0, ...,  0,  0,  0],
 [ 0, 39,  0, ...,  0,  0,  0],
 [ 0,  0, 41, ...,  0,  0,  0],
 ...,
 [ 0,  0,  0, ..., 36,  0,  0],
 [ 0,  0,  0, ...,  0, 37,  0],
 [ 0,  0,  0, ...,  0,  0, 39]]


GradientBoosting Accuracy: 1.0000
GradientBoosting Confusion Matrix:
[[32,  0,  0, ...,  0,  0,  0],
 [ 0, 39,  0, ...,  0,  0,  0],
 [ 0,  0, 41, ...,  0,  0,  0],
 ...,
 [ 0,  0,  0, ..., 36,  0,  0],
 [ 0,  0,  0, ...,  0, 37,  0],
 [ 0,  0,  0, ...,  0,  0, 39]]


KNeighbors Accuracy: 1.0000
KNeighbors Confusion Matrix:
[[32,  0,  0, ...,  0,  0,  0],
 [ 0, 39,  0, ...,  0,  0,  0],
 [ 0,  0, 41, ...,  0,  0,  0],
 ...,
 [ 0,  0,  0, ..., 36,  0,  0],
 [ 0,  0,  0, ...,  0, 37,  0],
 [ 0
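
Every model whose output appears above reports an accuracy of 1.0000 on the held-out 30%. A perfect score across such different classifiers is worth a sanity check before trusting it. One possibility to rule out is that identical rows occur in both the training and the test split, so the models are partly scored on examples they have already seen. Nothing in the output confirms that this is the case here. The sketch below uses a small hypothetical frame (not `Training.csv`) to show how such a check could look.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy frame standing in for Training.csv (hypothetical values)
toy = pd.DataFrame({
    "itching":   [1, 1, 0, 0, 1, 1, 0, 0],
    "skin_rash": [1, 1, 0, 0, 1, 1, 1, 0],
    "prognosis": ["A", "A", "B", "B", "A", "A", "C", "B"],
})

# How many rows are exact copies of an earlier row?
n_duplicates = int(toy.duplicated().sum())
print("duplicate rows:", n_duplicates)

train_df, test_df = train_test_split(toy, test_size=0.5, random_state=42)

# Count test rows whose full symptom + label pattern also occurs in training
train_rows = set(map(tuple, train_df.drop_duplicates().to_numpy()))
leaked = sum(tuple(row) in train_rows for row in test_df.to_numpy())
print("test rows also present in training:", leaked, "of", len(test_df))
```

If the same check on the real data shows heavy overlap, dropping duplicates before splitting (`dataset.drop_duplicates()`) or evaluating on a separately collected test file would give a more informative estimate.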

# Single prediction

In [20]:
import pickle

# Train SVC model
svc = SVC(kernel='linear')
svc.fit(X_train, y_train)

# Predict on test data
ypred = svc.predict(X_test)
print("SVC Accuracy:", accuracy_score(y_test, ypred))

# Save the model using pickle (the context manager closes the file handle)
with open('svc.pkl', 'wb') as f:
    pickle.dump(svc, f)

# Load the saved model
with open('svc.pkl', 'rb') as f:
    svc = pickle.load(f)

# Test case 1: the printed values are the integer codes produced by LabelEncoder
print("Predicted Disease:", svc.predict(X_test[0].reshape(1, -1)))
print("Actual Disease:", y_test[0])

# Test case 2
print("Predicted Disease:", svc.predict(X_test[100].reshape(1, -1)))
print("Actual Disease:", y_test[100])

SVC Accuracy: 1.0
Predicted Disease: [2]
Actual Disease: 2
Predicted Disease: [23]
Actual Disease: 23
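
The values printed above (`[2]`, `23`) are the integer codes assigned by `LabelEncoder`, not disease names. For a user-facing prediction, the symptom list has to be turned into a 0/1 row in the training column order, and the predicted code has to be mapped back to a name. The following is a minimal sketch of that round trip on toy data. The four symptom columns, the six training rows, and the helper names `symptoms_to_vector` and `predict_disease` are hypothetical and chosen only for illustration.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder
from sklearn.svm import SVC

# Hypothetical symptom columns and training rows (stand-ins for Training.csv)
symptom_columns = ["itching", "skin_rash", "continuous_sneezing", "chills"]
X_toy = np.array([
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
], dtype=np.float64)
y_toy = ["Fungal infection"] * 3 + ["Common Cold"] * 3

le = LabelEncoder()
clf = SVC(kernel="linear").fit(X_toy, le.fit_transform(y_toy))


def symptoms_to_vector(symptoms, columns):
    """Build a 0/1 row with the same column order used for training."""
    unknown = set(symptoms) - set(columns)
    if unknown:
        raise ValueError(f"Unknown symptoms: {sorted(unknown)}")
    return np.array([[1.0 if c in symptoms else 0.0 for c in columns]])


def predict_disease(symptoms):
    """Return the disease name (not the integer code) for a list of symptoms."""
    code = clf.predict(symptoms_to_vector(symptoms, symptom_columns))
    return le.inverse_transform(code)[0]


print(predict_disease(["itching", "skin_rash"]))
```

In the notebook itself, the same idea would use the fitted `le`, the loaded `svc`, and the column list from `dataset.drop('prognosis', axis=1).columns`. Rejecting unknown symptom names, as the sketch does, avoids silently predicting from an all-zero row.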
