Sensor Data Compression


This project analyzes wearable physiological sensor data using 6 feature extraction techniques and 3 feature selection methods. The reduced data is then evaluated for classification performance with 4 machine learning algorithms: KNN, Decision Trees, SVC, and Random Forest. The results show compression rates of up to 99.25% with an accuracy loss of 6.7%. The goal is to demonstrate that these techniques can achieve high data compression while largely preserving classification accuracy.

Check out the video presentation here!
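
The workflow described in the introduction (reduce the feature set, then compare classification accuracy before and after) can be sketched roughly as follows. This is a minimal illustration, not the project's code: it assumes scikit-learn is installed, uses PCA as a stand-in for one of the feature extraction techniques (the README does not name them), and runs on synthetic data rather than the sensor dataset. The helper `knn_accuracy` is hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the sensor feature matrix (NOT the project's dataset).
X, y = make_classification(
    n_samples=600, n_features=200, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)


def knn_accuracy(train, test):
    """Fit a KNN classifier on `train` and return its accuracy on `test`."""
    model = KNeighborsClassifier(n_neighbors=5)
    model.fit(train, y_train)
    return model.score(test, y_test)


# Baseline: classify using all 200 original features.
baseline_acc = knn_accuracy(X_train, X_test)

# Reduce 200 features to 5 components; PCA is fitted on the training split only.
pca = PCA(n_components=5, random_state=0)
X_train_red = pca.fit_transform(X_train)
X_test_red = pca.transform(X_test)
reduced_acc = knn_accuracy(X_train_red, X_test_red)

# Space saving, measured in stored values per sample.
space_saving = 1 - X_train_red.shape[1] / X_train.shape[1]

print(f"baseline accuracy: {baseline_acc:.3f}")
print(f"reduced accuracy:  {reduced_acc:.3f}")
print(f"space saving:      {space_saving:.2%}")
```

Fitting the reducer on the training split only keeps information from the test split out of the learned projection, so the accuracy comparison stays fair.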

Project Organization

├── LICENSE
├── Makefile                <- Makefile with commands like `make data` or `make train`
├── README.md               <- The top-level README for developers using this project.
├── readme-assets           <- Resources used in README.md.
├── data
│   ├── processed           <- The final, canonical data sets for modeling.
│   └── raw                 <- The original, immutable data dump.
│
├── references              <- Data dictionaries, manuals, and all other explanatory materials.
│
├── reports                 <- Generated analysis as HTML, PDF, LaTeX, etc.
│   ├── figures             <- Generated graphics and figures to be used in reporting
│   └── presentation        <- Presentation for reporting experimental findings
│
├── requirements.txt        <- The requirements file for reproducing the analysis environment, e.g.
│                              generated with `pip freeze > requirements.txt`
│
└── src                     <- Source code for use in this project.
    ├── __init__.py         <- Makes src a Python module
    │
    ├── data                <- Scripts to download or generate data
    │   └── make_dataset.py
    │
    ├── calculations        <- Scripts to calculate statistics
    │   └── calculate.py
    │
    ├── features            <- Scripts to turn raw data into features for modeling
    │   └── build_features.py
    │
    ├── models              <- Scripts to train models and evaluate models
    │   └── train_and_evaluate_model.py              
    │
    └── visualization       <- Scripts to create exploratory and results oriented visualizations
        └── visualize.py

Prerequisites

Before you begin, ensure you have met the following requirements:

  • You have a Linux/Mac/Windows machine.
  • You have installed a Python distribution.
  • You have installed pip.
  • You have installed make.

Setup

  1. Clone the repository.
    git clone https://github.com/himalayasharma/data-compression-using-dimensionality-reduction.git

  2. Change into the project directory.
  3. Create a virtual environment.
    make create_environment
  4. Activate the virtual environment.
  5. Download and install all required packages.
    make requirements
  6. Download and process the physiological sensor dataset.
    make data
  7. Build the new set of features after dimensionality reduction.
    make build_features
  8. Calculate the required statistics (compression ratio, space saving, etc.).
    make calculate
  9. Train and evaluate the models.
    make train_and_evaluate
  10. Generate plots.
    make plot
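
For reference, the two statistics named in step 8 can be computed as in the sketch below, which follows the definitions in the Wikipedia article listed under References (compression ratio = uncompressed size / compressed size; space saving = 1 - compressed size / uncompressed size). The function names and the example sizes are hypothetical and are not taken from `calculate.py`.

```python
def compression_ratio(uncompressed_size, compressed_size):
    """Return uncompressed size divided by compressed size."""
    if uncompressed_size <= 0 or compressed_size <= 0:
        raise ValueError("sizes must be positive")
    return uncompressed_size / compressed_size


def space_saving(uncompressed_size, compressed_size):
    """Return the fraction of space saved relative to the uncompressed size."""
    if uncompressed_size <= 0 or compressed_size <= 0:
        raise ValueError("sizes must be positive")
    return 1 - compressed_size / uncompressed_size


# Hypothetical sizes (values stored per sample), chosen only for illustration.
original_values = 4000
reduced_values = 30

ratio = compression_ratio(original_values, reduced_values)
saving = space_saving(original_values, reduced_values)

print(f"compression ratio: {ratio:.2f}:1")
print(f"space saving:      {saving:.2%}")
```

If the 99.25% figure quoted in the introduction is read as space saving, it corresponds to a compression ratio of about 133:1, since 1 / (1 - 0.9925) ≈ 133.3.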

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated. If you have a suggestion that would make this project better, please fork the repo and create a pull request. Don't forget to give the project a star!

  1. Fork this repository.
  2. Create a branch: git checkout -b <branch_name>
  3. Make your changes and commit them: git commit -m '<commit_message>'
  4. Push the branch to your fork: git push origin <branch_name>
  5. Create the pull request.

Alternatively, see the GitHub documentation on creating a pull request.

License

Distributed under the MIT License. See LICENSE for more information.

Acknowledgements

References

  • Mohino-Herranz I, Gil-Pita R, Rosa-Zurera M, Seoane F. Activity Recognition Using Wearable Physiological Measurements: Selection of Features from a Comprehensive Literature Study. Sensors (Basel). 2019 Dec 13;19(24):5524. doi: 10.3390/s19245524. PMID: 31847261; PMCID: PMC6960825.
  • Compression ratio. (2022, June 2). In Wikipedia. https://en.wikipedia.org/wiki/Compression_ratio
