This repository contains the data mining and prediction code used for the DengAI: Predicting Disease Spread competition.
The group name is DVios-UOMCSE13.
- pandas
- statsmodels
- Filling missing values with the latest value, attribute mean, or class mean
- Linear interpolation
- Negative Binomial
- Random forest
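
The missing-value strategies listed above (latest value, attribute mean, class mean, linear interpolation) can be sketched with pandas roughly as follows. This is a minimal illustration, not the repository's actual code: the column names `city` and `temp` are hypothetical, and treating the city as the "class" for the class mean is an assumption.

```python
import numpy as np
import pandas as pd

# Toy data; column names are hypothetical, not taken from the repository.
df = pd.DataFrame({
    "city": ["a", "a", "a", "b", "b", "b"],
    "temp": [25.0, np.nan, 27.0, 30.0, np.nan, 34.0],
})

# Latest value: carry the last observed value forward.
latest = df["temp"].ffill()

# Attribute mean: replace gaps with the mean of the whole column.
attr_mean = df["temp"].fillna(df["temp"].mean())

# Class mean: replace gaps with the mean of the row's group
# (assumption: the group is the city).
class_mean = df["temp"].fillna(df.groupby("city")["temp"].transform("mean"))

# Linear interpolation between the neighbouring observed values.
linear = df["temp"].interpolate(method="linear")
```

Note that `ffill` and `interpolate` applied to the whole column can carry values across city boundaries; applying them per group avoids that.
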
Use 2 separate models for the 2 cities, each with the 4 features that have the highest correlation.
At first, 2/3 of the data set was used for model building and 1/3 for model testing. Using the complete data set for model building improved the accuracy.
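
A 2/3 to 1/3 split could look like the sketch below. Whether the original split was chronological or random is not stated, so the chronological cut here is an assumption, and `split_two_thirds` and the column names are hypothetical.

```python
import pandas as pd

def split_two_thirds(df):
    """Return the first 2/3 of the rows for building and the rest for testing."""
    cut = (2 * len(df)) // 3
    return df.iloc[:cut], df.iloc[cut:]

# Toy weekly series; values are made up for illustration.
df = pd.DataFrame({
    "week": range(9),
    "cases": [3, 5, 4, 6, 8, 7, 9, 12, 10],
})
train, test = split_two_thirds(df)
```

Once the held-out third has been used to compare models, the chosen model can be refit on all rows before predicting the competition test set.
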
Updating the model again using the best set of predicted values improved the accuracy.
Reducing the features to the top 3 increased the accuracy of the predicted data.
Time shifting the data
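
Time shifting can be illustrated with lagged features in pandas. The sketch below assumes a lag of 2 rows applied per city; the lag size, the column names, and the per-city grouping are assumptions, not details taken from the repository.

```python
import pandas as pd

# Toy data for two hypothetical cities, ordered by week within each city.
df = pd.DataFrame({
    "city": ["a"] * 4 + ["b"] * 4,
    "week": [1, 2, 3, 4] * 2,
    "temp": [25.0, 26.0, 27.0, 28.0, 30.0, 31.0, 32.0, 33.0],
})

lag = 2  # assumed lag, in rows (weeks)

# Shift within each city so values never leak from one city into the other.
df["temp_lag2"] = df.groupby("city")["temp"].shift(lag)

# The first `lag` rows of each city have no lagged value and are dropped.
shifted = df.dropna(subset=["temp_lag2"])
```
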