fom-big-data-bike-path-quality/fom-big-data-bike-path-quality-analytics

Bike Path Quality (Altiplano)

PyTorch based model that learns from bike activity time series data

About The Project

tbd

Built With

Installation

Initialize the submodules of this repository by running the following commands.

git submodule init
git submodule update

Install the dependencies this project needs to run.

python -m pip install --upgrade pip
pip install flake8 pytest
pip install pandas
pip install matplotlib
pip install scikit-learn
pip install torch
pip install tqdm
pip install seaborn
pip install telegram-send
pip install gcloud
pip install google-api-core
pip install google-api-tools
pip install google-auth
pip install google-cloud-core
pip install google-cloud-storage
pip install torchviz

Usage

Run this command to start the main script.

python main.py [OPTION]...

-h, --help                                         show this help
-c, --clean                                        clean intermediate results before start
-q, --quiet                                        do not log outputs
-t, --transient                                    do not store results
-d, --dry-run                                      only run a limited training to make sure syntax is correct

--skip-data-understanding                          skip data understanding
--skip-validation                                  skip validation

-s, --slice-width <slice-width>                    number of measurements per slice
-w, --window-step <window-step>                    step size used for sliding window data splitter
--down-sampling-factor <down-sampling-factor>      factor by which target classes are capped in comparison to smallest class
-m, --model <model>                                name of the model to use for training
-f, --k-folds <k-folds>                            number of k-folds

-k, --k-nearest-neighbors <k-nearest-neighbors>    number of nearest neighbors to consider in kNN approach
--dtw-subsample-step <dtw-subsample-step>          subsample steps for DTW
--dtw-max-warping-window <dtw-max-warping-window>  max warping window for DTW

-e, --epochs <epochs>                              number of epochs
-p, --patience <patience>                          number of epochs to wait for improvements before finishing training
-l, --learning-rate <learning-rate>                learning rate
--dropout <dropout>                                dropout percentage
--lstm-hidden-dimension <lstm-hidden-dimension>    hidden dimensions in LSTM
--lstm-layer-dimension <lstm-layer-dimension>      layer dimensions in LSTM

Examples:
  python main.py -c -m knn-dtw -k 10 --dtw-subsample-step=1 --dtw-max-warping-window=500
  python main.py -c -m lstm -s 500 -w 500 --lstm-hidden-dimension 128 --lstm-layer-dimension 3
  python main.py -c -m cnn -s 500 -w 500
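
As a rough illustration of what the kNN-DTW options control, the sketch below computes a dynamic time warping distance limited to a warping window and uses it for a majority vote among the k nearest training series. It is not taken from this repository: the function names, the absolute-difference cost on one-dimensional series, the class labels, and the reading of the subsample step as "keep every n-th measurement" are all assumptions.

```python
import math
from collections import Counter


def dtw_distance(a, b, max_warping_window=None, subsample_step=1):
    """DTW distance between two 1-D series (hypothetical helper)."""
    # Assumed meaning of the subsample step: keep every n-th measurement.
    a = a[::subsample_step]
    b = b[::subsample_step]
    n, m = len(a), len(b)
    # The window must be at least |n - m|, otherwise the end cell is unreachable.
    if max_warping_window is None:
        window = max(n, m)
    else:
        window = max(max_warping_window, abs(n - m))

    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        # Only cells within the warping window around the diagonal are filled.
        for j in range(max(1, i - window), min(m, i + window) + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def knn_predict(train_series, train_labels, query, k, **dtw_options):
    """Majority vote among the k training series closest to the query."""
    distances = sorted(
        ((dtw_distance(series, query, **dtw_options), label)
         for series, label in zip(train_series, train_labels)),
        key=lambda pair: pair[0],
    )
    nearest = [label for _, label in distances[:k]]
    return Counter(nearest).most_common(1)[0][0]


# Made-up toy data; "smooth" and "rough" are hypothetical class labels.
train_series = [[0, 0, 0, 0], [0, 1, 0, 1], [5, 5, 5, 5], [5, 6, 5, 6]]
train_labels = ["smooth", "smooth", "rough", "rough"]
prediction = knn_predict(train_series, train_labels, [5, 5, 6, 5], k=3,
                         max_warping_window=2)
```

A smaller warping window restricts how far the two series may be shifted against each other and reduces the number of cells to fill, which is why it is commonly used to speed up DTW on long slices.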

Roadmap

See the open issues for a list of proposed features (and known issues).

Metrics

Configuration 1

  • slice width 500
  • step size 500
  • overlap 0%
Confusion matrices: kNN-DTW with k=1, kNN-DTW with k=9, LSTM, CNN

Model            Matthews Correlation Coefficient
kNN-DTW (k=1)    0.39
kNN-DTW (k=9)    0.39
LSTM             0.31
CNN              0.58
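
The overlap value follows from the slice width and the step size as overlap = 1 - step size / slice width. The sketch below illustrates this with a minimal sliding-window splitter. The function names are hypothetical and not taken from this repository, and dropping a trailing partial slice is an assumption.

```python
def sliding_window(measurements, slice_width, window_step):
    """Split a sequence into fixed-width slices, advancing by window_step."""
    slices = []
    # Stop once a full slice no longer fits (assumption: partial slices are dropped).
    for start in range(0, len(measurements) - slice_width + 1, window_step):
        slices.append(measurements[start:start + slice_width])
    return slices


def overlap_ratio(slice_width, window_step):
    """Fraction of a slice that is shared with the following slice."""
    return max(0.0, 1 - window_step / slice_width)


samples = list(range(1500))  # stand-in for 1500 consecutive measurements
slices = sliding_window(samples, slice_width=500, window_step=250)
```

With a slice width of 500, step sizes of 500, 375 and 250 give overlap ratios of 0.0, 0.25 and 0.5. Smaller steps produce more slices from the same recording, at the cost of slices that share measurements.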

Configuration 2

  • slice width 500
  • step size 375
  • overlap 25%
Confusion matrices: kNN-DTW with k=1, kNN-DTW with k=9, LSTM, CNN

Model            Matthews Correlation Coefficient
kNN-DTW (k=1)    0.36
kNN-DTW (k=9)    0.40
LSTM             0.31
CNN              0.53

Configuration 3

  • slice width 500
  • step size 250
  • overlap 50%
Confusion matrices: kNN-DTW with k=1, kNN-DTW with k=9, LSTM, CNN

Model            Matthews Correlation Coefficient
kNN-DTW (k=1)    0.36
kNN-DTW (k=9)    0.37
LSTM             0.32
CNN              0.59
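
For readers unfamiliar with the metric, the Matthews Correlation Coefficient can be computed directly from a confusion matrix. The sketch below uses the standard multi-class formula in plain Python. The function name and the example matrix are made up for illustration and are not results of this project, and returning 0.0 for a zero denominator is a convention chosen here.

```python
import math


def matthews_corrcoef_from_confusion(confusion):
    """Multi-class MCC; rows are true classes, columns are predicted classes."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))
    true_counts = [sum(confusion[k]) for k in range(n)]
    pred_counts = [sum(confusion[i][k] for i in range(n)) for k in range(n)]

    numerator = correct * total - sum(t * p for t, p in zip(true_counts, pred_counts))
    denominator = math.sqrt(
        (total ** 2 - sum(p * p for p in pred_counts))
        * (total ** 2 - sum(t * t for t in true_counts))
    )
    # Convention chosen here: report 0.0 when the denominator vanishes.
    return numerator / denominator if denominator else 0.0


# Made-up 2x2 example: 7 of 10 samples classified correctly.
example = [[4, 1],
           [2, 3]]
mcc = matthews_corrcoef_from_confusion(example)
```

A value of 1 means perfect agreement between predictions and labels, 0 is what chance-level predictions would yield, and -1 means complete disagreement. Unlike plain accuracy, the coefficient stays informative when class sizes are imbalanced.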

Contributing

Because this project is part of an ongoing Master's thesis, contributions are not possible for now.

License

Distributed under the GPLv3 License. See LICENSE.md for more information.

Contact

Florian Schwanz - florian.schwanz@gmail.com

Acknowledgements

Icon made by Freepik from www.flaticon.com