This project uses several machine learning algorithms and data analysis techniques to predict diabetes types. The dataset was obtained from Apollo hospital. The work covered data preprocessing, exploratory data analysis (EDA), feature selection, model training, and hyperparameter tuning. The final model was deployed on the Streamlit cloud platform, and the web app is available at https://gunaxprofessional-end-to-end-diabetes-mell-streamlit-app-p243sr.streamlit.app/
- Data Preprocessing: The data was cleaned, and missing values were imputed using different techniques.
- Exploratory Data Analysis: The EDA was performed using the Plotly library. The relationships between variables were explored using various visualizations.
- Feature Selection: Different feature selection techniques were used to identify the most important features for model training.
- Machine Learning Algorithms: Several machine learning algorithms were compared to identify the best one for predicting diabetes types, including Logistic Regression, K-Nearest Neighbors, Decision Tree, Random Forest, and XGBoost.
- Hyperparameter Tuning: GridSearchCV was used to tune the hyperparameters of the selected machine learning algorithm.
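
The steps listed above can be sketched end to end with scikit-learn. This is a minimal illustration under stated assumptions, not the project's actual code: the data is synthetic (generated with `make_classification`, since the hospital dataset is not included here), median imputation and `SelectKBest` stand in for the unspecified imputation and feature selection techniques, and `RandomForestClassifier` is swapped in for XGBoost to avoid an extra dependency. The parameter grid is also hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline

# Synthetic stand-in for the real dataset (assumption: 3 diabetes-type classes).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, n_classes=3, random_state=0
)

# Inject roughly 5% missing values so the imputation step has work to do.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Preprocessing, feature selection, and the model chained in one pipeline,
# so each cross-validation fold fits the imputer and selector on its own data.
pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("select", SelectKBest(score_func=f_classif)),
    ("model", RandomForestClassifier(random_state=0)),
])

# Hypothetical grid; the project's real search space is not documented.
param_grid = {
    "select__k": [5, 8],
    "model__n_estimators": [50, 100],
    "model__max_depth": [None, 5],
}

search = GridSearchCV(pipeline, param_grid, cv=3, scoring="accuracy")
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)
print("Best parameters:", best_params)
print("Held-out accuracy:", round(test_accuracy, 3))
```

Putting the imputer and selector inside the pipeline keeps the held-out fold of each cross-validation split unseen during preprocessing, which avoids optimistic accuracy estimates. To mirror the project more closely, the `"model"` step could be replaced with `xgboost.XGBClassifier` and a grid suited to it.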
The XGBoost classifier (XGBClassifier) gave the best results, with an accuracy of 98.24%.
These results suggest that machine learning algorithms can predict diabetes types accurately on this dataset, and that such models could help support the diagnosis and treatment of diabetes patients.