guidbem/MDA-Project-Noise-Leuven

MDA-Project: Noise in Leuven

Main Files and Folders Description:

  • create_venv.ps1 : PowerShell script to create a virtual environment using the requirements.txt file
  • requirements.txt : Contains the required packages for the project
  • install_config_awscli.sh : Script to configure the AWS CLI in the Docker container image
  • dockerfile : Dockerfile used to build the container image
  • .dockerignore : Lists the repository files that are not copied into the container image
  • .gitignore : Lists the local files that are not pushed to the GitHub repository
  • pull_and_merge_data.py : Python script that downloads all the available data and merges it by source as far as possible. Meteo data, noise events and noise percentiles are each merged into a single file covering the whole year; the noise levels data is too large to handle in a single file, so it is aggregated by month
  • prepare_data_analysis.py : Python script to generate required data for the analysis and app
  • prepare_data_model.py : Python script to generate required data for the model development
  • generate_best_model_data.py : Python script to generate required model data for the app
  • app.py : Python script that contains the main structure and body of the Dash app
  • utils/ : Contains methods and classes used by the other scripts for the analysis and model development
  • assets/ : Contains images and CSS styles used in the Dash app
  • pages/ : Contains the Python scripts for the pages in the Dash app
  • model_notebooks/ : Contains Jupyter notebooks used for the development and tuning of the prediction models
  • analysis_notebooks/ : Contains Jupyter notebooks used to develop the data analysis for the project
  • .github/ : Contains the workflows folder that holds the YAML file for the GitHub Actions pipeline

How to start the project locally:

After cloning the repository:

  • Run the create_venv.ps1 file in the terminal to create the virtual environment (on macOS or Linux, create the venv manually and run pip install -r requirements.txt).
  • Run the full pull_and_merge_data.py script to obtain ready-to-use parquet files with the project data (this takes a while, as more than 18GB of data has to be downloaded). This must be run before the steps below.
  • Run prepare_data_analysis.py to obtain the data required for the analysis and the app (it performs large data manipulations, so it can also take some time).
  • Run prepare_data_model.py to obtain the data required to train, test and evaluate the noise event prediction model.
  • Run the generate_best_model_data.py script to generate and store the figures for the app and the best model in a .pkl file.
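
The run order of the steps above can be sketched as a small Python driver. This is an illustrative helper and not part of the repository: the names `pipeline_commands` and `run_pipeline` are hypothetical, and the sketch assumes that each script takes no command-line arguments and is launched from the repository root with the virtual environment active.

```python
import subprocess
import sys

# Script names are taken from the steps above. The order matters because
# pull_and_merge_data.py must be run before the data preparation steps.
PIPELINE_SCRIPTS = [
    "pull_and_merge_data.py",
    "prepare_data_analysis.py",
    "prepare_data_model.py",
    "generate_best_model_data.py",
]


def pipeline_commands(python_exe=sys.executable):
    """Return one command (as an argument list) per script, in run order."""
    return [[python_exe, script] for script in PIPELINE_SCRIPTS]


def run_pipeline(runner=subprocess.run):
    """Run each script in order and stop at the first failure.

    Assumption: the scripts need no command-line arguments.
    """
    for command in pipeline_commands():
        # check=True raises CalledProcessError if a script exits non-zero,
        # so later steps never run on incomplete data.
        runner(command, check=True)
```

Calling `run_pipeline()` from the repository root would then execute the four scripts one after another. The `runner` parameter exists only so the ordering can be checked without actually downloading the data.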

Observations for the Evaluators:

  • The data generation scripts can take a long time because of the size of the raw data files. If the evaluator prefers not to run them, a zip file containing all the data they would generate will be provided in the OneDrive folder. If you use the zip file, please extract it into the working directory of the cloned repository (i.e. the assets and data folders must be in the same directory as the app.py script).
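
After extracting the zip file, the folder layout described above can be checked with a few lines of Python. This is a minimal sketch and not part of the repository: the helper name `missing_app_folders` is hypothetical, and it only verifies that the assets and data folders exist, not that their contents are complete.

```python
from pathlib import Path

# Folder names taken from the note above; both must sit next to app.py.
REQUIRED_DIRS = ("assets", "data")


def missing_app_folders(repo_root="."):
    """Return the required folders that are absent from repo_root."""
    root = Path(repo_root)
    return [name for name in REQUIRED_DIRS if not (root / name).is_dir()]
```

Running `missing_app_folders()` from the directory that contains app.py returns an empty list when both folders are in place.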
