<h3>Training and Saving a Treatment Prediction Model Using Logistic Regression</h3>

In [1]:
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.multioutput import MultiOutputClassifier
import joblib

# 1. Load CSV data
df = pd.read_csv('treatment.csv')

# 2. Name the four precaution columns used as prediction targets
precaution_cols = ['Precaution_1', 'Precaution_2', 'Precaution_3', 'Precaution_4']

# 3. Combine the precaution columns into a single '$'-separated text column
#    (kept for reference; assumes no missing values, since str.join needs strings)
df['Precautions'] = df[precaution_cols].apply(lambda row: '$'.join(row), axis=1)

# 4. Use the four precaution columns directly as labels; splitting
#    'Precautions' on '$' again would only reproduce them
precautions_split = df[precaution_cols].copy()

# 5. Define features (X) and labels (y)
X = df['Disease']      # Disease names
y = precautions_split  # One label column per precaution

# 6. Vectorize the disease names using an enhanced TF-IDF vectorizer
vectorizer = TfidfVectorizer(
    max_features=10000,        # Increased vocabulary size
    ngram_range=(1, 3),        # Include unigrams, bigrams, and trigrams
    stop_words='english',      # Remove common stopwords
    sublinear_tf=True          # Sublinear term frequency scaling
)

X_tfidf = vectorizer.fit_transform(X)

# 7. Logistic Regression model for treatment prediction
# MultiOutputClassifier fits one LogisticRegression per precaution column
# (multi-output classification: each column is its own multiclass target)
model = LogisticRegression(max_iter=1000)
multi_model = MultiOutputClassifier(model, n_jobs=-1)

# Train the model
multi_model.fit(X_tfidf, y)

# 8. Save the model and vectorizer for future use
joblib.dump(multi_model, 'treatment_prediction_model.joblib')
joblib.dump(vectorizer, 'vectorizer.joblib')

print("Model and vectorizer saved successfully!")


Model and vectorizer saved successfully!
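
The cell above saves the vectorizer and the classifier as two separate files, so they must be loaded together and kept in sync by hand. One option, sketched below on a tiny made-up dataset, is to wrap both steps in a scikit-learn `Pipeline` and save a single artifact. The toy disease names, the precaution strings, and the file name `treatment_pipeline.joblib` are illustrative assumptions, not the contents of `treatment.csv`.

```python
import os
import tempfile

import joblib
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import make_pipeline

# Toy stand-in for treatment.csv (hypothetical rows, for illustration only)
toy = pd.DataFrame({
    'Disease': ['Gastroenteritis', 'Common Cold', 'Migraine'],
    'Precaution_1': ['stop eating solid food', 'drink warm fluids', 'rest in a dark room'],
    'Precaution_2': ['sip water', 'rest', 'avoid bright light'],
})

# One object holds both the text vectorizer and the multi-output classifier
pipeline = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 3), sublinear_tf=True),
    MultiOutputClassifier(LogisticRegression(max_iter=1000)),
)
pipeline.fit(toy['Disease'], toy[['Precaution_1', 'Precaution_2']])

# Save and reload a single file (written to a temporary directory here)
artifact_path = os.path.join(tempfile.mkdtemp(), 'treatment_pipeline.joblib')
joblib.dump(pipeline, artifact_path)
reloaded = joblib.load(artifact_path)

# Raw strings go straight in; the pipeline applies TF-IDF internally
prediction = reloaded.predict(['Migraine'])
print(prediction.shape)  # (n_samples, n_outputs)
```

With this layout the test cell would only need one `joblib.load` call and could pass the raw disease string to `predict` without a separate `transform` step.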


<h3>Testing the Model</h3>

In [25]:
import joblib

# Disease name to test; replace with any disease
dis = "Gastroenteritis"

# Load the saved model and vectorizer
loaded_model = joblib.load('treatment_prediction_model.joblib')
loaded_vectorizer = joblib.load('vectorizer.joblib')

# Use the disease name as the input string
sample_disease = dis

# Transform the input disease using the loaded vectorizer
sample_disease_tfidf = loaded_vectorizer.transform([sample_disease])

# Predict the treatment using the loaded model
predicted_precaution = loaded_model.predict(sample_disease_tfidf)

# predict returns a 2D array of shape (n_samples, n_outputs); take the row for the single input
predicted_precautions = predicted_precaution[0]

# Output the result with each precaution on a new line
print(f"Predicted treatment for '{sample_disease}':")
for i, precaution in enumerate(predicted_precautions, start=1):
    print(f" {i}. {precaution.strip()}")


Predicted treatment for 'Gastroenteritis':
 1. Stop eating solid food for a while to give your digestive system time to rest and heal. This will allow your body to focus on recovery without the added strain of processing food.
 2. Try taking small sips of water to stay hydrated. It's important to maintain hydration, but drinking small amounts at a time will help avoid overwhelming your system
 3. Rest to allow your body the time it needs to heal. Avoid physical exertion, and make sure to get plenty of sleep to support recovery.
 4. Ease back into eating gradually. Start with light, easy-to-digest foods like broths or soups, and slowly reintroduce solid food as your body tolerates it.
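
A disease name whose words never appeared in training is transformed into an all-zero TF-IDF row, yet `predict` still returns a set of precautions, which can be misleading. A minimal sketch of a guard follows. The helper name `predict_precautions` and the toy training rows are hypothetical, and returning `None` for unrecognised input is only one possible policy.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier


def predict_precautions(disease, vectorizer, model):
    """Return predicted precautions, or None if no token of `disease`
    is in the vectorizer's vocabulary (hypothetical helper)."""
    features = vectorizer.transform([disease])
    if features.nnz == 0:  # all-zero row: nothing was recognised
        return None
    return [p.strip() for p in model.predict(features)[0]]


# Toy training data (illustrative, not taken from treatment.csv)
diseases = ['Gastroenteritis', 'Common Cold', 'Migraine']
labels = np.array([
    ['stop eating solid food', 'sip water'],
    ['drink warm fluids', 'rest'],
    ['rest in a dark room', 'avoid bright light'],
])

vectorizer = TfidfVectorizer()
model = MultiOutputClassifier(LogisticRegression(max_iter=1000))
model.fit(vectorizer.fit_transform(diseases), labels)

known = predict_precautions('gastroenteritis', vectorizer, model)
unknown = predict_precautions('Xyzzyitis', vectorizer, model)
print(known)
print(unknown)
```

The same check could be applied to `sample_disease_tfidf` in the test cell before calling `loaded_model.predict`.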
