Contains TheGradientBoost November cohort Group 2 project. The project predicts whether an individual will take up a health insurance policy, given factors such as age group, type of place of residence, highest educational level, and wealth index, using machine learning classification models.
The dataset used in this project is the Individual Recode section of the 2018 Nigeria Demographic and Health Survey (NDHS) dataset.
Streamlit app built with pointers from this repo and this repo.
Group Members:
Teaching Assistant:
Checklist:
- Problem statement
- Review similar research papers for methodology
- Load and clean dataset
- Exploratory data analysis
- Model building and evaluation
- Handle class imbalance using oversampling and undersampling
- Model comparison
- Hyperparameter optimization
- Create app with Streamlit
- Deploy app with Heroku
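
The class-imbalance step in the checklist can be illustrated with a minimal sketch of random oversampling and random undersampling. This is not the project's actual code: the toy rows, the feature meanings, and the helper names `random_oversample` and `random_undersample` are hypothetical, and only the Python standard library is used so the example runs on its own. Libraries such as imbalanced-learn offer comparable resamplers (`RandomOverSampler`, `RandomUnderSampler`), but whether the project uses them is not stated here.

```python
import random
from collections import Counter


def _group_by_class(rows, labels):
    """Group feature rows by their class label, preserving first-seen label order."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return groups


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)
    groups = _group_by_class(rows, labels)
    target = max(len(group) for group in groups.values())
    out_rows, out_labels = [], []
    for label, group in groups.items():
        # Sample with replacement to fill the gap up to the target size.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows.extend(group + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Keep a random subset of each class so every class has as many rows
    as the smallest class."""
    rng = random.Random(seed)
    groups = _group_by_class(rows, labels)
    target = min(len(group) for group in groups.values())
    out_rows, out_labels = [], []
    for label, group in groups.items():
        # Sample without replacement down to the target size.
        out_rows.extend(rng.sample(group, target))
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Hypothetical toy data: (age_group, wealth_index) pairs, label 1 = has insurance.
rows = [(1, 2), (2, 1), (3, 3), (1, 1), (2, 2), (3, 1), (1, 3), (2, 3), (3, 5), (2, 5)]
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

over_rows, over_labels = random_oversample(rows, labels)
under_rows, under_labels = random_undersample(rows, labels)

print("original:    ", Counter(labels))
print("oversampled: ", Counter(over_labels))
print("undersampled:", Counter(under_labels))
```

Resampling of this kind is normally applied to the training split only, after the train/test split, so that duplicated rows do not leak into the evaluation data.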