.
├── dataset
│   ├── ecg_mitbih_test.csv
│   ├── imputed_data
│   │   └── ecg_mitbih_test_imputed.csv
│   ├── decomposed_data
│   │   ├── ecg_mitbih_test_imputed.csv
│   │   └── trend_decomposed.csv
│   └── synchronized_data
│       └── synchronized_dtw.csv
├── imputation.py
├── seasonal_trend_decomposition.py
├── synchronization.py
└── README.md
Impute the missing values in a dataset and save the result.
- Simple Imputation with `mean`, `median`, `most_frequent`, or `constant` value [description]
# Sample Usage
python imputation.py --data_path='./dataset/ecg_mitbih_test.csv' \
--option='simple' \
--strategy='mean' \
--output_path='./dataset/imputed_data/ecg_mitbih_test_imputed.csv'
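The four strategy names above match the `strategy` values of scikit-learn's `SimpleImputer`, so a minimal sketch of what `--option='simple'` might do could look like the following. That `imputation.py` is built on scikit-learn is an assumption, and `simple_impute` is a hypothetical helper, not a function from this repository.

```python
import numpy as np
from sklearn.impute import SimpleImputer


def simple_impute(values, strategy="mean", fill_value=None):
    """Fill NaNs column by column (hypothetical helper, not from imputation.py)."""
    # strategy is one of "mean", "median", "most_frequent", "constant";
    # fill_value is only used when strategy == "constant".
    imputer = SimpleImputer(strategy=strategy, fill_value=fill_value)
    return imputer.fit_transform(values)


# Tiny made-up table with one missing value per column.
data = np.array([[1.0, 10.0],
                 [np.nan, 20.0],
                 [3.0, np.nan]])
filled = simple_impute(data, strategy="mean")
print(filled)
```

Each missing cell is replaced by a statistic of its own column, so columns never influence one another.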
- KNN Imputation [description]
# Sample Usage
python imputation.py --data_path='./dataset/ecg_mitbih_test.csv' \
--option='knn' \
--n_neighbors=5 \
--output_path='./dataset/imputed_data/ecg_mitbih_test_imputed.csv'
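For KNN imputation, a rough sketch using scikit-learn's `KNNImputer` is shown below. Whether `imputation.py` uses this class is an assumption; `knn_impute` is a hypothetical helper, and its `n_neighbors` argument is meant to mirror the `--n_neighbors` flag.

```python
import numpy as np
from sklearn.impute import KNNImputer


def knn_impute(values, n_neighbors=5):
    """Fill each NaN with the mean of that column over the nearest rows
    (hypothetical helper, not from imputation.py)."""
    # Distances are computed only over the features both rows have observed.
    imputer = KNNImputer(n_neighbors=n_neighbors)
    return imputer.fit_transform(values)


# Made-up data: the second column is twice the first, with one value missing.
data = np.array([[1.0, 2.0],
                 [2.0, np.nan],
                 [3.0, 6.0],
                 [4.0, 8.0]])
filled = knn_impute(data, n_neighbors=2)
print(filled)
```

Unlike simple imputation, the fill value depends on rows that look similar in the other columns.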
- MICE Imputation [description]
# Sample Usage
python imputation.py --data_path='./dataset/ecg_mitbih_test.csv' \
--option='mice' \
--strategy='mean' \
--output_path='./dataset/imputed_data/ecg_mitbih_test_imputed.csv'
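One way to approximate MICE in Python is scikit-learn's `IterativeImputer`, which its documentation describes as inspired by the R MICE package. The sketch below assumes that `--strategy` corresponds to the imputer's `initial_strategy` (the fill used before the iterative rounds start). That mapping, and the helper name `mice_impute`, are assumptions rather than facts about `imputation.py`.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables the import below)
from sklearn.impute import IterativeImputer


def mice_impute(values, initial_strategy="mean", max_iter=10, seed=0):
    """Model each column from the others and iterate (hypothetical helper)."""
    imputer = IterativeImputer(
        initial_strategy=initial_strategy,  # assumed to mirror --strategy
        max_iter=max_iter,
        random_state=seed,
    )
    return imputer.fit_transform(values)


# Made-up correlated columns with two missing cells.
rng = np.random.default_rng(0)
x = rng.normal(size=(50, 1))
data = np.hstack([x, 2.0 * x + 1.0, -x])
data[3, 1] = np.nan
data[7, 2] = np.nan

filled = mice_impute(data)
print(filled[3, 1], filled[7, 2])
```

Observed cells are left as they are; only the missing cells are predicted from the other columns.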
Add the `--test_module` argument to the command line to test the module.
If the `--test_module` argument is given, `imputation.py` automatically adds random NAs to the dataset and then imputes the missing values.
# Sample Usage
python imputation.py --data_path='./dataset/ecg_mitbih_test.csv' \
--option='simple' \
--strategy='mean' \
--output_path='./dataset/imputed_data/ecg_mitbih_test_imputed.csv' \
--test_module
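The idea behind `--test_module` can be sketched as follows: hide a random subset of known values, impute them, and compare against the originals. The helper name `add_random_nans`, the `fraction` and `seed` parameters, and the RMSE check are all illustrative assumptions; how `imputation.py` actually picks and scores the cells is not documented here.

```python
import numpy as np


def add_random_nans(values, fraction=0.1, seed=0):
    """Return a float copy of `values` with a random fraction of cells set to NaN
    (hypothetical helper, not from imputation.py)."""
    rng = np.random.default_rng(seed)
    corrupted = np.array(values, dtype=float)  # np.array copies, input stays intact
    n_missing = int(round(fraction * corrupted.size))
    flat_idx = rng.choice(corrupted.size, size=n_missing, replace=False)
    corrupted.flat[flat_idx] = np.nan
    return corrupted


clean = np.arange(20, dtype=float).reshape(5, 4)
corrupted = add_random_nans(clean, fraction=0.25)
mask = np.isnan(corrupted)  # the cells that were hidden

# Stand-in imputation (global mean), scored only on the hidden cells.
imputed = np.where(mask, np.nanmean(corrupted), corrupted)
rmse = float(np.sqrt(np.mean((imputed[mask] - clean[mask]) ** 2)))
print(mask.sum(), rmse)
```

Because the true values of the hidden cells are known, different imputation options can be compared on the same mask.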
- STL Decomposition [description]
- Auto Arima [description]
# Sample Usage
python seasonal_trend_decomposition.py --data_path='./dataset/machine_temperature_system_failure.csv' \
--seasonal_output_path='./dataset/decomposed_data/seasonal_decomposed.csv' \
--trend_output_path='./dataset/decomposed_data/trend_decomposed.csv'
- DTW [description]
- soft-DTW [description]
# Sample Usage
python synchronization.py --data_path='./dataset/power_voltage.csv' \
--dtw_output_path='./dataset/synchronized_data/synchronized_dtw.csv' \
--plot_output_path='./dataset/synchronized_data' \
--option='dtw' \
--distance=2
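For readers unfamiliar with DTW, here is a minimal pure-Python sketch of the classic dynamic-programming formulation. It is not the code in `synchronization.py`: the function name `dtw` is hypothetical, the local cost is a plain absolute difference, and the meaning of the `--distance` flag is not modelled because it is not described here.

```python
import math


def dtw(a, b):
    """Classic DTW between two 1-D sequences.

    Returns (total_cost, path), where path is a list of (i, j) index pairs
    matching a[i] to b[j]. Hypothetical helper for illustration only.
    """
    n, m = len(a), len(b)
    # acc[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            acc[i][j] = cost + min(acc[i - 1][j],      # step in a only
                                   acc[i][j - 1],      # step in b only
                                   acc[i - 1][j - 1])  # step in both

    # Walk back from the end, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [(acc[i - 1][j - 1], i - 1, j - 1),
                      (acc[i - 1][j], i - 1, j),
                      (acc[i][j - 1], i, j - 1)]
        _, i, j = min(candidates)
    path.reverse()
    return acc[n][m], path


a = [0, 1, 2, 3]
b = [0, 0, 1, 2, 3]  # same shape as a, but starting one step late
cost, path = dtw(a, b)
print(cost, path)
```

The returned path is what allows two series to be synchronized: each pair says which sample of one series lines up with which sample of the other.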