This healthcare chatbot uses machine learning algorithms to predict possible diseases based on the symptoms entered by the user. It is built using a combination of data preprocessing, decision tree classification, and natural language processing techniques. The model is trained using medical symptom datasets and provides relevant information like disease prognosis, precautions, severity levels, and descriptions.
The chatbot maps the symptoms a user enters to a disease category with a decision tree classifier (DecisionTreeClassifier from sklearn). Alongside the prediction, it provides:
- Disease description
- Severity levels
- Precautions
The chatbot also uses text-to-speech (TTS) technology to read out the predictions and information to the user.
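As a rough illustration of the text-to-speech step, the sketch below builds one spoken message from a prediction and passes it to pyttsx3. The helper names build_announcement and speak, and the speech rate of 150, are assumptions made for this example and are not taken from the project's script.

```python
def build_announcement(disease, description, precautions):
    """Join the prediction details into a single string suitable for speech."""
    parts = [f"You may have {disease}.", description]
    if precautions:
        steps = "; ".join(p for p in precautions if p)
        parts.append(f"Take the following measures: {steps}.")
    # Drop empty pieces so the spoken text has no awkward pauses.
    return " ".join(part.strip() for part in parts if part and part.strip())


def speak(text, rate=150):
    """Read text aloud. Needs pyttsx3 and a working audio backend."""
    import pyttsx3  # imported lazily so build_announcement works without audio

    engine = pyttsx3.init()
    engine.setProperty("rate", rate)  # approximate words per minute
    engine.say(text)
    engine.runAndWait()  # blocks until the queued speech has finished
```

Calling speak(build_announcement("flu", "Flu is a viral infection.", ["Rest and hydrate"])) would read the whole message in one pass.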
- Python 3.x
- Pandas for data manipulation
- Scikit-learn for machine learning models
- pyttsx3 for text-to-speech functionality
You can install the required packages via pip:
pip install pandas scikit-learn pyttsx3
- Training.csv: A dataset containing training data for the machine learning model (symptoms and diseases).
- Testing.csv: A dataset for testing the model.
- symptom_Description.csv: Contains descriptions of different symptoms.
- Symptom_severity.csv: A mapping of symptoms to their severity levels.
- symptom_precaution.csv: Contains precautionary measures for various diseases.
Ensure these CSV files are located on your local machine and referenced correctly in the code (in the getSeverityDict(), getDescription(), and getprecautionDict() functions).
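For orientation, here is a minimal sketch of how such lookup files could be loaded with the csv module. It assumes Symptom_severity.csv holds two columns (symptom, integer severity) and symptom_precaution.csv holds a disease name followed by its precautions. Both layouts, and the function names load_severity and load_precautions, are assumptions for illustration and may differ from getSeverityDict() and getprecautionDict() in the script.

```python
import csv


def load_severity(path):
    """Return {symptom: severity} from a two-column CSV file."""
    severity = {}
    with open(path, newline="", encoding="utf-8") as handle:
        for row in csv.reader(handle):
            if len(row) < 2:
                continue  # skip blank or incomplete rows
            name, weight = row[0].strip(), row[1].strip()
            try:
                severity[name] = int(weight)
            except ValueError:
                continue  # skip a header row or a non-numeric severity
    return severity


def load_precautions(path):
    """Return {disease: [precaution, ...]}, assuming the disease is in column 1."""
    precautions = {}
    with open(path, newline="", encoding="utf-8") as handle:
        for row in csv.reader(handle):
            if not row or not row[0].strip():
                continue
            # Keep only the non-empty precaution cells after the disease name.
            precautions[row[0].strip()] = [c.strip() for c in row[1:] if c.strip()]
    return precautions
```

Passing the file path as an argument keeps the location in one place, which makes it easier to verify that each file is referenced correctly.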
The following libraries are used:
- re: For pattern matching in user input.
- pandas: For loading and processing the CSV data files.
- pyttsx3: For text-to-speech conversion.
- sklearn: For machine learning algorithms like decision trees and support vector machines (SVM).
- numpy: For array manipulation.
- csv: For reading CSV files.
To use the chatbot:
- Run the script.
- The chatbot will prompt you to enter your name and start a conversation.
- Enter the symptoms you're experiencing (e.g., "fever", "cough").
- Based on your input, the bot will use a decision tree to predict the possible disease and provide suggestions.
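The last step above can be sketched roughly as follows, assuming each feature column in Training.csv is a symptom name holding 0 or 1. The helpers encode_symptoms and predict_disease are hypothetical names for this example; clf stands for any fitted classifier with a predict method, such as the DecisionTreeClassifier used by the project.

```python
def encode_symptoms(feature_names, symptoms):
    """Build a 0/1 vector with 1 wherever a reported symptom matches a column."""
    # Normalise user text such as "High fever" to the assumed column style "high_fever".
    reported = {s.strip().lower().replace(" ", "_") for s in symptoms}
    return [1 if name.strip().lower() in reported else 0 for name in feature_names]


def predict_disease(clf, feature_names, symptoms):
    """Encode the symptoms and return the classifier's single predicted label."""
    vector = encode_symptoms(feature_names, symptoms)
    # predict expects a 2D input: one row per sample.
    return clf.predict([vector])[0]
```

With the training code from this README, feature_names would be list(cols) and clf the fitted decision tree.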
Enter the symptom you are experiencing -> cough
searches related to input:
0) cough
Select the one you meant (0 - 0): 0
Okay. From how many days? : 5
Are you experiencing any of the following symptoms?
fever? : yes
headache? : no
You may have flu
Flu is a viral infection that attacks your respiratory system.
Take following measures:
1) Rest and hydrate
2) Avoid close contact with others
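The "searches related to input" step in the session above could work along these lines: the user's text is matched against the known symptom names with re, and the hits are shown as a numbered menu. This is a minimal sketch; match_symptoms and format_choices are hypothetical names, and substring matching is an assumption about how the script narrows the choices.

```python
import re


def match_symptoms(known_symptoms, user_input):
    """Return the known symptoms whose name contains the user's text."""
    query = user_input.strip().lower().replace(" ", "_")
    if not query:
        return []
    # Escape the query so characters like "(" are treated literally.
    pattern = re.compile(re.escape(query))
    return [name for name in known_symptoms if pattern.search(name.lower())]


def format_choices(matches):
    """Format matches as the numbered menu shown to the user."""
    return [f"{i}) {name}" for i, name in enumerate(matches)]
```

For an input of "fever" and a symptom list containing high_fever and mild_fever, the user would be offered both and asked to pick one by index.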
The model is trained using a DecisionTreeClassifier and an SVC (Support Vector Classifier). The code below shows how the decision tree model is trained:
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score
training = pd.read_csv('Training.csv')
cols = training.columns[:-1]
X = training[cols]
y = training['prognosis']
# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
# Decision Tree Classifier
clf = DecisionTreeClassifier()
clf.fit(X_train, y_train)
# 3-fold cross-validation on the held-out split
scores = cross_val_score(clf, X_test, y_test, cv=3)
print(f"Mean cross-validation score: {scores.mean()}")
- Symptom-Based Disease Prediction: Input symptoms and get the predicted disease based on trained decision tree models.
- Severity Calculation: Determines the severity of symptoms and recommends consultation if needed.
- Precautionary Measures: Provides precautionary measures for the diagnosed disease.
- Text-to-Speech: Converts the disease prediction and precautionary recommendations to speech for an interactive experience.
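To make the severity calculation concrete, here is one possible scoring rule: sum the severity of the reported symptoms, scale by the number of days, and compare against a cut-off. The formula, the default threshold of 13, and the name needs_consultation are all assumptions for this sketch; the README does not specify how the script computes severity.

```python
def needs_consultation(symptoms, days, severity, threshold=13):
    """Return True when the weighted symptom score exceeds the threshold.

    severity maps symptom name to an integer weight; unknown symptoms count as 0.
    The scoring formula and threshold are illustrative assumptions.
    """
    total = sum(severity.get(symptom, 0) for symptom in symptoms)
    # The +1 keeps the denominator non-zero when no symptoms are reported.
    score = total * days / (len(symptoms) + 1)
    return score > threshold
```

With severities of 4 for cough and 7 for fever, five days of both symptoms gives a score of about 18.3 and triggers the recommendation, while a single day gives about 3.7 and does not.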
Ensure that you have the following data files in your local directory:
- Training.csv: The training dataset for the classifier.
- Testing.csv: The dataset for model testing.
- symptom_Description.csv: Descriptions of the symptoms.
- Symptom_severity.csv: A dictionary of symptoms and their severity levels.
- symptom_precaution.csv: The precautions for each disease.
The paths to these files need to be correctly referenced in the script for the chatbot to function properly.
This project is licensed under the MIT License - see the LICENSE file for details.