Applying Dynamic Time Warping for boosting the performance of Static and Incremental Machine Learning in Forecasting COVID-19 Cases

Modelling the COVID-19 virus evolution using staic and incremental machine learning methods with and without similarity(DTW, ED).

Published paper link

https://doi.org/10.1111/exsy.13237

Experiments

Experiment 1: Single-Country training

This experiment trains the supervised ML models with a single-country and predicts the cases for the same single country with which the model is trained

Mean performance of the incremental and static methods

Metric	RMSE	MAE	Time (Sec)
Gradient Boosting	1,821	1,635	0.147
Decision Tree	1,976	1,713	0.005
Random Forest	2,187	2,031	0.197
LSTM	3,311	3,053	20.525
Bayesian Ridge	7,165	5,934	0.011
Hoeffding Adaptive Trees*	15,653	11,502	0.146
Hoeffding Trees*	19,412	13,548	0.115
Adaptive Random Forest*	23,923	17,336	2.909
Linear SVR	37,518	31,370	0.02
Passive Aggressive Regressor*	111,570	91,809	0.002

Experiment 2: Multi-Country training

This second experiment predicts over the same 50 countries at the same eight points as Experiment I, but this time we train the model with 50 countries rather than training it with a single country as in Experiment I

Mean performance of the incremental and static methods

Metric	RMSE	MAE	Time (Sec)
Gradient Boosting	7,500	3,052	6.86
Bayesian Ridge	7,671	2,943	0.04
Random Forest	7,739	3,197	3.11
LSTM	8,272	3,089	138.57
Decision Tree	8,764	3,663	0.46
Linear SVR	12,824	4,577	1.91
Hoeffding Adaptive Trees*	13,946	4,803	21.23
Adaptive Random Forest*	14,151	4,599	163.51
Hoeffding Trees*	17,381	5,496	7.26
Passive Aggressive Regressor*	130,604	40,041	0.005

Experiment 3: Multi-Countries Training by Similarity

We use time similarity measures (ED and DTW) to calculate the nine more similar countries to the predicted one for the training set of this predicted country at each milestone. And then, we create a dataset to train the models with the data from those nine countries plus the one that is predicted. Thus, training occurs over a total of ten countries where COVID-19 impacts similarly at each milestone. Predictions and evaluation of the models occur over all the milestones and countries as in Experiments I and II.

N	Algorithm/RMSE	SC	MC	ED	DTW	Mean
1	Gradient Boosting	1,822	7,500	1,908	1,896	3,282
2	Random Forest	2,188	7,739	1,811	1,800	3,385
3	Decision Tree	1,977	8,764	2,171	2,155	3,767
4	LSTM	3,311	8,272	2,277	2,253	4,028
5	Bayesian Ridge	7,166	7,671	1,839	1,845	4,630
6	Hoeffding Adaptive Trees*	15,654	13,946	2,594	2,827	8,755
7	Hoeffding Trees*	19,412	17,381	2,424	2,894	10,528
8	Adaptive Random Forest*	23,923	14,151	3,120	3,102	11,074
9	Linear SVR	37,518	12,824	4,448	4,405	14,799
10	Passive Aggressive Regressor*	111,570	130,604	35,319	33,204	77,674

Steps to run

Before running the code make sure the following empty structure for running_results directory is created in results:

`results`
`├───final_result_files`
`├───final_result_plots`
`│   ├───barplot`
`│   └───boxplots`
`└───running_results`
    `└───content`
        `├───csv_files`
        `│   ├───processed`
        `│   └───processed_null`
        `├───Plots`
        `│   ├───barplot`
        `│   └───boxplots`
        `└───Result`
            `├───exp1`
            `│   ├───runtime`
            `│   ├───summary`
            `│   └───united_dataframe`
            `│       ├───incremental`
            `│       └───static`
            `├───exp2`
            `│   ├───runtime`
            `│   ├───summary`
            `│   └───united_dataframe`
            `│       ├───incremental`
            `│       └───static`
            `└───exp3`
                `├───runtime`
                `├───summary`
                `└───united_dataframe`
                    `└───incremental`
            		`└───static`

Once the above folder structure is ready, make sure all the requirements are installed as mentioned in requirement.txt.
Run the preprocess.py file to create the data frames and files required for both experiments.
Make sure that skmultiflow folder is present in source code as we have implemented few functionalities that were not a part of skmultiflow package originally.
Run the exp1.py, exp2.py and exp3.py files to generate results.

Name		Name	Last commit message	Last commit date
Latest commit History 74 Commits
data		data
notebooks		notebooks
results		results
src		src
.gitignore		.gitignore
README.md		README.md
config.yaml		config.yaml
requirement.txt		requirement.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

data

data

notebooks

notebooks

results

results

src

src

.gitignore

.gitignore

README.md

README.md

config.yaml

config.yaml

requirement.txt

requirement.txt

Repository files navigation

Applying Dynamic Time Warping for boosting the performance of Static and Incremental Machine Learning in Forecasting COVID-19 Cases

Published paper link

Experiments

Steps to run

About

Releases

Packages

Contributors 2

Languages

ankitk2109/Covid_Evolution_Using_Incremental_ML

Folders and files

Latest commit

History

Repository files navigation

Applying Dynamic Time Warping for boosting the performance of Static and Incremental Machine Learning in Forecasting COVID-19 Cases

Published paper link

Experiments

Steps to run

About

Resources

Stars

Watchers

Forks

Languages