burnpiro/wod-usage-predictor

Wrocław Open Dataset - WRM usage predictor

An ML project to predict usage of WRM (Wrocławski Rower Miejski, the Wrocław city bike system).

Getting Started

The input data file is compressed with 7z because a standard zip archive was still over 130 MB (about 1/7 of the original file size):

Download data/model_input.7z

Prerequisites

Packages:

tensorflow>=2.1
scikit-learn
seaborn>=0.10.0
pandas>=1.0.1
notebook>=6.0.3
matplotlib>=3.2.0
jupyter-core>=4.6.3
xlrd>=1.2.0

Installing

The best way to install the dependencies and avoid unnecessary problems is to set up an Anaconda environment and run the following command inside it:

pip install -r requirements.txt

after that you should be able to run:

jupyter notebook

to open one of the notebooks.
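If a notebook fails on import, it can help to confirm that the environment actually satisfies the minimum versions listed under Prerequisites. The following is a minimal sketch, not part of the repository: the helper names (`as_tuple`, `meets_minimum`, `check_environment`) are hypothetical, the version comparison is deliberately simple (numeric components only), and it assumes Python 3.8+ for `importlib.metadata`.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum versions copied from the Prerequisites list.
# scikit-learn has no minimum there, so "0" only checks that it is installed.
MINIMUMS = {
    "tensorflow": "2.1",
    "scikit-learn": "0",
    "seaborn": "0.10.0",
    "pandas": "1.0.1",
    "notebook": "6.0.3",
    "matplotlib": "3.2.0",
    "jupyter-core": "4.6.3",
    "xlrd": "1.2.0",
}


def as_tuple(ver):
    """Turn '1.0.1' into (1, 0, 1); stops at the first non-numeric component."""
    parts = []
    for piece in ver.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two dotted versions, padding the shorter one with zeros."""
    a, b = as_tuple(installed), as_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment(minimums=MINIMUMS):
    """Return {package: problem} for every package that is missing or too old."""
    problems = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if not meets_minimum(installed, minimum):
            problems[name] = f"{installed} < {minimum}"
    return problems


if __name__ == "__main__":
    for name, problem in check_environment().items():
        print(f"{name}: {problem}")
```

An empty result from `check_environment()` means every listed package is present at or above its minimum version.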

Data Processing

For a description of the weather data, see Files description (This Link).

For each day we extract data in the following format:

[
    'date',
    'time',
    'totalSnow_cm',
    'sunrise',
    'sunset',
    'tempC',
    'FeelsLikeC',
    'HeatIndexC',
    'windspeedKmph',
    'weatherCode',
    'precipMM',
    'humidity',
    'visibility',
    'pressure',
    'cloudcover'
]

Most of the data comes directly from the source except:

  • time - converted from a string such as 1300 to a number of minutes since midnight (13*60)
  • sunrise - time converted from hh:mm AM to minutes since midnight
  • sunset - time converted from hh:mm PM to minutes since midnight
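The three conversions above can be sketched as follows. This is an illustration only, not the code in weather_parser.py: the function names are hypothetical, and it assumes the time code is hours followed by two digits of minutes (so 1300 means 13:00) and that sunrise and sunset arrive as 12-hour clock strings with an AM/PM suffix.

```python
from datetime import datetime


def time_code_to_minutes(code):
    """Convert a time code such as '1300' to minutes since midnight."""
    # Assumption: the last two digits are minutes, the rest are hours.
    hours, minutes = divmod(int(code), 100)
    return hours * 60 + minutes


def clock_to_minutes(text):
    """Convert a 12-hour clock string such as '06:45 AM' to minutes since midnight."""
    parsed = datetime.strptime(text.strip(), "%I:%M %p")
    return parsed.hour * 60 + parsed.minute


print(time_code_to_minutes("1300"))  # 13 * 60 = 780
print(clock_to_minutes("06:45 AM"))  # 6 * 60 + 45 = 405
```

Storing all three fields as minutes since midnight puts them on one numeric scale, so they can be compared directly or fed to a model without further parsing.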

To run data processing:

Process weather data

python weather_parser.py

Generate per-year WRM data and the list of places in the data/bike_data directory

python data_parser.py

The generated dataset has the following format:

    'bike_number',
    'start_time',
    'end_time',
    'rental_place',
    'return_place',
    'year',
    'week_day',
    'totalSnow_cm',
    'sunrise',
    'sunset',
    'tempC',
    'FeelsLikeC',
    'HeatIndexC',
    'windspeedKmph',
    'weatherCode',
    'precipMM',
    'humidity',
    'visibility',
    'pressure',
    'cloudcover'

WARNING!!! Data processing takes a while to run.
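To illustrate how one row of the generated dataset could be assembled, here is a minimal sketch that attaches the derived and weather columns to a single rental record. It is not the logic of data_parser.py. The function name, the `start_time` format, the lookup key (date plus hour expressed in minutes) and the week_day convention (0 = Monday) are all assumptions made for the example.

```python
from datetime import datetime

# Weather columns in the order they appear in the generated dataset.
WEATHER_FIELDS = [
    "totalSnow_cm", "sunrise", "sunset", "tempC", "FeelsLikeC", "HeatIndexC",
    "windspeedKmph", "weatherCode", "precipMM", "humidity", "visibility",
    "pressure", "cloudcover",
]


def enrich_rental(rental, weather_by_slot):
    """Return a copy of `rental` extended with year, week_day and weather columns.

    `weather_by_slot` maps (ISO date, minutes since midnight of the full hour)
    to a dict of weather values. This keying is an assumption for the sketch.
    """
    # Assumption: start_time looks like '2019-06-03 13:25:00'.
    start = datetime.strptime(rental["start_time"], "%Y-%m-%d %H:%M:%S")
    weather = weather_by_slot[(start.date().isoformat(), start.hour * 60)]

    row = dict(rental)
    row["year"] = start.year
    row["week_day"] = start.weekday()  # Assumption: 0 = Monday, 6 = Sunday
    for field in WEATHER_FIELDS:
        row[field] = weather[field]
    return row
```

Building the lookup once and reusing it for every rental avoids rescanning the weather data per row, which matters given how long the full processing run takes.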

Built With

  • Tensorflow - An end-to-end open source machine learning platform for everyone
  • Pandas - Open source data analysis and manipulation tool
  • WOD - Wrocław Open Dataset

Versioning

We use SemVer for versioning. For the available versions, see https://github.com/burnpiro/wod-usage-predictor/tags.

Authors

See also the list of contributors who participated in this project.

License

This project is licensed under the MIT License; see the LICENSE.md file for details.
