poojaram/dengue-prediction

a4-starter

This project is divided into three main parts: Data Preparation, Feature Selection, and Modeling and Validation.

Data Preparation: We begin by explaining how we handled missing data values, and then describe the hypothesis on which we based our modeling.

Feature Selection: We try several feature selection methods and provide visualizations that justify reducing the number of features, both by removing redundant features and by creating features that are relevant.

Modeling and Validation: We used four main models: Poisson Regression, Negative Binomial Regression, Random Forest, and Gradient Boosting (XGBoost). We ran many iterations of these models to decide on the best parameters, and many of these code chunks are commented out because they take about 15-20 minutes to run. Unfortunately, the algorithms we chose do not have visualizations that are significant enough to show, so we have not included any. Instead, we provide an in-depth analysis of why we chose each model and why we think the Random Forest approach is the best of the four. After deciding on the best approach, we show how it behaves on the test dataset and compare the predicted values to the actual values.
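As a rough illustration of the feature selection step described above, removing redundant features can be sketched as a simple correlation filter. This is only a minimal sketch under stated assumptions, not the method used in the report: the README does not say which selection methods were applied, and the column names, sample values, and the 0.95 threshold below are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Assumes neither sequence is constant (a zero variance would divide by zero).
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def drop_redundant(features, threshold=0.95):
    """Keep a feature only if it is not highly correlated with one already kept.

    `threshold` is an illustrative cutoff on the absolute correlation.
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical weekly readings; names and values are made up for illustration.
features = {
    "temp_c": [25.0, 26.5, 27.1, 28.3, 29.0, 27.8],
    "temp_k": [298.15, 299.65, 300.25, 301.45, 302.15, 300.95],  # temp_c + 273.15
    "rain_mm": [12.0, 0.0, 30.5, 4.2, 18.9, 7.3],
}

# temp_k duplicates the information in temp_c, so only one of the two is kept.
selected = drop_redundant(features)
print(selected)
```

The filter is greedy and order-dependent: the first feature of a correlated pair is the one that is kept. That keeps the sketch short, but a real analysis might prefer to keep whichever feature correlates better with the target.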

All the source code can be found on the gh-pages branch.

Link to the report: https://info370a-w19.github.io/a4-poojaram/

About

DrivenData's Predicting Disease Spread competition.
