We build a system that identifies a patient's condition using both Natural Language Processing (NLP) and Machine Learning (ML). Classifying patients automatically reduces the time and effort expended by doctors and helps determine the type of patient at an early stage.
The dataset is collected from the UCI Machine Learning Repository.
* Statistical analysis of the data
- Stop words
- Lemmatization
- Split the dataset
  - Split the dataset into an 80% training set and a 20% test set.
- Creating features and target variable
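
The preprocessing steps listed above can be sketched in plain Python. This is only an illustration under stated assumptions: the stop-word list, the function names `remove_stop_words` and `split_dataset`, the random seed, and the sample reviews are all hypothetical and do not come from the project. Lemmatization is left out because it needs a dictionary resource that a short standalone example cannot carry.

```python
import random

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "the", "is", "i", "and", "of", "to", "for", "my"}


def remove_stop_words(text):
    """Lowercase the text, split on whitespace, and drop stop words."""
    return [tok for tok in text.lower().split() if tok not in STOP_WORDS]


def split_dataset(samples, labels, test_ratio=0.2, seed=42):
    """Shuffle the indices, then cut them into training and test portions."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(samples) * test_ratio))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    x_train = [samples[i] for i in train_idx]
    x_test = [samples[i] for i in test_idx]
    y_train = [labels[i] for i in train_idx]
    y_test = [labels[i] for i in test_idx]
    return x_train, x_test, y_train, y_test


# Made-up reviews (features) and conditions (target variable), for illustration only.
reviews = [
    "The pain is gone",
    "I sleep better now",
    "My skin cleared up",
    "The headache returned",
    "Mood improved a lot",
    "Skin is still red",
    "I feel less anxious",
    "Pain relief was fast",
    "The rash is fading",
    "Sleep is still poor",
]
conditions = [
    "Pain", "Insomnia", "Acne", "Pain", "Depression",
    "Acne", "Depression", "Pain", "Acne", "Insomnia",
]

tokens = [remove_stop_words(review) for review in reviews]
X_train, X_test, y_train, y_test = split_dataset(tokens, conditions)
print(len(X_train), len(X_test))
```

Shuffling indices rather than the samples themselves keeps each review paired with its label after the split.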
TF-IDF
- TF-IDF is a very popular feature extraction technique. Text needs to be converted into a vector or matrix before it is fed to the machine learning model.
Bag of Words
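
To make the text-to-vector step concrete, the sketch below first builds bag-of-words count vectors and then reweights them with TF-IDF. It assumes the plain formula tf × log(N / df); libraries often use smoothed or normalized variants, so the numbers will not necessarily match theirs. The function names `bag_of_words` and `tf_idf` and the sample documents are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(docs):
    """Return the sorted vocabulary and one raw-count vector per document."""
    vocab = sorted({tok for doc in docs for tok in doc})
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Weight each count by term frequency times inverse document frequency."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: how many documents contain each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    # Unsmoothed idf (an assumption of this sketch): log(N / df).
    idf = [math.log(n_docs / d) for d in df]
    weighted = []
    for row, doc in zip(counts, docs):
        total = len(doc) or 1  # avoid division by zero for empty documents
        weighted.append([(c / total) * w for c, w in zip(row, idf)])
    return vocab, weighted


# Made-up tokenized documents, for illustration only.
docs = [
    ["headache", "pain", "relief"],
    ["pain", "nausea"],
    ["pain", "sleep", "sleep"],
]
vocab, counts = bag_of_words(docs)
_, weights = tf_idf(docs)
print(vocab)
print(counts)
print(weights)
```

A term that occurs in every document, such as "pain" here, gets a weight of zero under this formula, which is the reason TF-IDF is often preferred over raw counts for classification.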
Naive Bayes
Passive Aggressive Classifier
Confusion Matrix
Accuracy
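
The two evaluation measures named above can be computed as in the following minimal sketch. The function names, the label set, and the true and predicted labels are hypothetical and are not results from this project; rows of the matrix are true classes and columns are predicted classes.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count predictions per (true label, predicted label) pair."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[index[true_label]][index[predicted_label]] += 1
    return matrix


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Made-up labels, for illustration only.
labels = ["Acne", "Depression"]
y_true = ["Acne", "Acne", "Depression", "Depression", "Depression"]
y_pred = ["Acne", "Depression", "Depression", "Depression", "Acne"]

matrix = confusion_matrix(y_true, y_pred, labels)
score = accuracy(y_true, y_pred)
print(matrix)
print(score)
```

The diagonal of the matrix holds the correct predictions, so accuracy equals the diagonal sum divided by the total number of samples.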