kurdiakaran863/Tanzania_waterpoint_classification

Tanzania_waterpoint_classification:

Business Problem:

Tanzania is a developing country with a large population of over 57,000,000, and it struggles to provide clean water. Many water points are already established in the country, but not all of them work, and some are completely non-functional.

Scope:

To create a model that helps predict which waterpoints are functional and which are non-functional. This would help the Government identify the areas it should focus on in order to provide clean water to its population.

Data:

The data was acquired from https://www.drivendata.org/competitions/7/pump-it-up-data-mining-the-water-table/page/23/
Rows = 59400, Columns = 40

Files:

The repository has the following files:
Data: Contains the main data that the cleaning was done on.
Data Cleaning: This notebook contains the data cleaning process.
EDA: This notebook contains the Exploratory Data Analysis on the cleaned data.
Base_Model: This notebook contains the code that evaluates model performance on the train and test sets.

Project Details:

Data Modelling:

  • All the data was processed with OneHotEncoding and then standardised through an sklearn pipeline
  • GridSearchCV was then used for cross validation
  • The classification models used in this project are:
    • Logistic Regression
    • KNN Classifier
    • RandomForest Classifier
    • XGBoost Classifier
  • The best performing model was the RandomForest Classifier, with an accuracy score of 92%
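The modelling steps listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the data is synthetic, the column names (`quantity`, `extraction_type`, `gps_height`), the toy label rule, and the parameter grid are all assumptions, and one-hot encoding only the categorical columns while scaling the numeric one is just one common way to arrange the preprocessing.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data; column names are hypothetical, not the real schema.
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "quantity": rng.choice(["enough", "insufficient", "dry"], size=n),
    "extraction_type": rng.choice(["gravity", "handpump", "other"], size=n),
    "gps_height": rng.normal(1000, 300, size=n),
})
# Toy label rule (an assumption): dry waterpoints are non functional.
y = np.where(X["quantity"] == "dry", "non functional", "functional")

categorical = ["quantity", "extraction_type"]
numeric = ["gps_height"]

# One-hot encode the categorical columns and standardise the numeric one.
preprocess = ColumnTransformer([
    ("onehot", OneHotEncoder(handle_unknown="ignore"), categorical),
    ("scale", StandardScaler(), numeric),
])

# Preprocessing and the classifier live in a single sklearn pipeline.
pipe = Pipeline([
    ("preprocess", preprocess),
    ("model", RandomForestClassifier(random_state=0)),
])

# Small illustrative grid; the values are assumptions.
param_grid = {
    "model__n_estimators": [50, 100],
    "model__max_depth": [None, 5],
}

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# 3-fold cross-validated grid search, scored on accuracy.
search = GridSearchCV(pipe, param_grid, cv=3, scoring="accuracy")
search.fit(X_train, y_train)

# Accuracy of the refitted best pipeline on the held-out test split.
test_accuracy = search.score(X_test, y_test)
print(search.best_params_, test_accuracy)
```

Keeping the encoder and scaler inside the pipeline means they are refitted on each cross-validation fold, so no information from the validation fold leaks into preprocessing. Swapping `RandomForestClassifier` for another estimator, such as `LogisticRegression` or `KNeighborsClassifier`, only requires changing the `"model"` step and the grid keys.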

- Model metrics:

[Screenshot: model metrics]

- Feature Engineering:

[Screenshots: feature engineering]

Summary:

- 58% of all waterpoints in Tanzania are functional.
- The model is 92% accurate.
- The Government of Tanzania can use features such as the quantity of water, the extraction type of the waterpoint, its height, and the payments being made to prioritise repairs on waterpoints that are non-functional.
- The Government can also use this model to analyse future improvements.
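To put the two figures above side by side, the short calculation below compares the reported accuracy with a majority-class baseline. It is a sketch under two assumptions that the source does not confirm: that the target is treated as two classes (functional vs. non functional), and that the 58% share also describes the data the accuracy was measured on.

```python
# Figures taken from the summary above.
functional_share = 0.58   # share of waterpoints that are functional
model_accuracy = 0.92     # reported accuracy of the best model

# Assumption: two classes, so always predicting the larger class
# is correct for max(p, 1 - p) of the waterpoints.
baseline_accuracy = max(functional_share, 1 - functional_share)

# Improvement of the model over that naive baseline.
lift = model_accuracy - baseline_accuracy

print(f"baseline={baseline_accuracy:.2f}, lift={lift:.2f}")
```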
