This repository contains the data mining and prediction code used for the DengAI: Predicting Disease Spread competition.
Group name: DVios-UOMCSE13

Libraries:
- pandas
- statsmodels

Preprocessing:
- Filling missing values with the latest value, the attribute mean, or the class mean
- Linear interpolation

Models:
- Negative Binomial
- Random forest
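
The missing-value strategies listed above can be sketched with pandas as follows. This is a minimal illustration, not the repository's actual code: the toy data and the column names `city` and `temp` are hypothetical, and treating `city` as the "class" for the class mean is an assumption.

```python
import numpy as np
import pandas as pd

# Toy data; column names are hypothetical stand-ins for the competition features.
df = pd.DataFrame({
    "city": ["sj", "sj", "sj", "iq", "iq", "iq"],
    "temp": [25.0, np.nan, 27.0, 30.0, np.nan, 34.0],
})

# 1. Latest value: carry the last observed value forward.
latest = df["temp"].ffill()

# 2. Attribute mean: replace gaps with the mean of the whole column.
attr_mean = df["temp"].fillna(df["temp"].mean())

# 3. Class mean: replace gaps with the mean of the row's group
#    (assumption: the group is the city).
class_mean = df["temp"].fillna(df.groupby("city")["temp"].transform("mean"))

# 4. Linear interpolation between the neighbouring observations.
linear = df["temp"].interpolate(method="linear")
```

Note that `interpolate` here runs over the whole column in row order, so on real data it would probably make sense to interpolate each city separately.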

- Use 2 separate models, one per city, each with the 4 features that have the highest correlation.
- At first, 2/3 of the data set was used for model building and 1/3 for model testing. Using the complete data set for model building improved the accuracy.
- Updating the model again using the best set of predicted values improved the accuracy.
- Reducing the features to the top 3 increased the accuracy of the predicted data.
- Time shifting the data
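
Two of the steps above, picking the most correlated features and time shifting, could look roughly like this in pandas. It is a sketch under assumptions: the helper names `top_correlated` and `add_lag`, the toy columns, and the choice to rank by absolute Pearson correlation with the case count are all hypothetical, and the direction and size of the shift used in the competition are not stated here.

```python
import pandas as pd

def top_correlated(df, target, k):
    """Return the k feature names with the largest absolute
    Pearson correlation to the target column."""
    corr = df.drop(columns=[target]).corrwith(df[target]).abs()
    return corr.sort_values(ascending=False).head(k).index.tolist()

def add_lag(df, column, weeks):
    """Return a copy with `column` shifted down by `weeks` rows,
    assuming one row per week in time order."""
    out = df.copy()
    out[f"{column}_lag{weeks}"] = out[column].shift(weeks)
    return out

# Toy data for a single city; values are made up for illustration.
data = pd.DataFrame({
    "humidity": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "temp":     [2.0, 1.0, 4.0, 3.0, 6.0, 5.0],
    "noise":    [3.0, 1.0, 2.0, 3.0, 1.0, 2.0],
    "cases":    [2.0, 4.0, 6.0, 8.0, 10.0, 12.0],
})

selected = top_correlated(data, "cases", 2)  # keep the 2 strongest features
lagged = add_lag(data, "humidity", 1)        # last week's humidity on this week's row
```

With one model per city, the same two helpers would be applied to each city's rows separately. The first `weeks` rows of a lagged column are missing and need to be dropped or filled before fitting.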