This repo contains my files for competing in the Driven Data competition Pump It Up: Data Mining the Water Table. I tried several different models to classify wells as functional, not functional, or functional but in need of repair. I had the most success with the Random Forest classifier, using GridSearch to tune the parameters for the best…

roweyerboat/Pump_It_Up_MLProject

Pump It Up: Driven Data Competition

This repo contains the notebooks used to Obtain, Scrub, Explore, Model, and iNterpret (OSEMN) the data from the Pump It Up competition hosted by Driven Data. The task is to build a machine learning model that predicts whether a well in Tanzania is functional, not functional, or functional but in need of repair.
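As a rough illustration of the approach named in the repo description (a Random Forest classifier tuned with a grid search), here is a minimal sketch using scikit-learn. It is not the code from the notebooks: the data is synthetic, and the parameter grid, split sizes, and scoring metric are assumptions chosen only to keep the example small.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the well data (assumption): three classes mimic the
# ternary target of functional / not functional / functional but needs repair.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, n_classes=3, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Hypothetical parameter grid; the values tuned in the notebooks may differ.
param_grid = {"n_estimators": [50, 100], "max_depth": [5, None]}

# Try every grid combination with 3-fold cross-validation on the training set.
search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X_train, y_train)

best_params = search.best_params_
# score() uses the best estimator, refit on the full training set.
test_accuracy = search.score(X_test, y_test)
print(best_params, round(test_accuracy, 3))
```

On the real data, the categorical columns would need to be encoded before fitting, and the grid would normally cover more parameters than the two shown here.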

Files

This repo contains:

  • CSV files of the data before and after processing

  • the test set from the competition, used to submit an entry

  • a notebook that contains all the cleaning and exploratory analysis

  • a folder of notebooks that build models for the ternary classification

  • a folder containing my work on the problem as a binary classification: functional or needs repair

  • an executive summary presentation showcasing my final models and my recommendations for those looking to invest in repairing wells
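The binary framing mentioned in the list above can be sketched as a simple relabeling of the three-class target. The label strings and the choice to fold "non functional" into "needs repair" are assumptions made for illustration; the notebooks may group the classes differently.

```python
# Assumed ternary label strings and an assumed grouping into two classes.
TERNARY_TO_BINARY = {
    "functional": "functional",
    "functional needs repair": "needs repair",
    "non functional": "needs repair",
}


def to_binary(labels):
    """Map each ternary label to its binary counterpart."""
    return [TERNARY_TO_BINARY[label] for label in labels]


sample = ["functional", "non functional", "functional needs repair"]
binary = to_binary(sample)
print(binary)
```

Using an explicit dictionary means an unexpected label raises a KeyError instead of being silently mislabeled.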

Updates

I've been working with this data in Tableau and digging a bit deeper into how best to classify the wells.

[Image: map of Tanzania with the wells]

This is one of the visualizations I created to show the areas affected by wells that need repair and the size of the population impacted.

Blogs

  • Blog about competing for the first time

  • Blog about Tableau visualization

Contact Info

LinkedIn
