AtmoSeer

About

This project provides a pipeline to build rainfall forecast models. The pipeline can be configured with different meteorological data sources.

Install

In the root directory of this repository, run the following command (conda must be installed on your system):

./setup.sh

Project pipeline

The project pipeline is defined as a sequence of three steps: (1) data retrieval, (2) data preprocessing and (3) model training. These steps are implemented as Python scripts in the ./src directory.

Data retrieval

All datasets retrieved and/or generated by the scripts will be stored in the ./data folder.

retrieve_ws_cor.py: This script retrieves observations from a user-provided weather station.
retrieve_ws_inmet.py: This script retrieves observations from a user-provided weather station.
retrieve_as.py: This script retrieves atmospheric sounding data.
retrieve_ERA5.py: This script retrieves numerical simulation data from the ERA5 portal.

Script gen_sounding_indices.py

This script generates atmospheric instability indices from the data retrieved by the script retrieve_as.py. Data from the SBGL sounding station (located at the Galeão Airport, Rio de Janeiro, Brazil) are used to calculate the indices, producing a new dataset. The SBGL sounding station produces two probes per day (at 00:00 and 12:00 UTC), and the new dataset contains one entry per probe. Each entry holds the values of the instability indices computed for that probe. The following instability indices are computed:

CAPE (Convective Available Potential Energy)
CIN (Convective Inhibition)
Lifted index
K index
Total Totals
Showalter index
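
To illustrate what two of these indices measure, the K index and Total Totals can be computed directly from temperatures and dew points at the 850, 700 and 500 hPa levels. The sketch below uses the standard textbook definitions. The function names and the sample values are hypothetical, and this is not necessarily how gen_sounding_indices.py computes them.

```python
def k_index(t850, td850, t700, td700, t500):
    """K index (deg C): (T850 - T500) + Td850 - (T700 - Td700)."""
    return (t850 - t500) + td850 - (t700 - td700)


def total_totals(t850, td850, t500):
    """Total Totals (deg C): vertical totals plus cross totals."""
    vertical_totals = t850 - t500
    cross_totals = td850 - t500
    return vertical_totals + cross_totals


# Hypothetical probe values in deg C, made up for illustration only.
probe = {"t850": 18.0, "td850": 15.0, "t700": 8.0, "td700": 4.0, "t500": -8.0}

k = k_index(**probe)
tt = total_totals(probe["t850"], probe["td850"], probe["t500"])
print(f"K index = {k:.1f}, Total Totals = {tt:.1f}")
```

Both indices grow when the lower troposphere is warm and moist relative to the 500 hPa level, which is why they are commonly used as thunderstorm-potential indicators.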

Preprocessing

The preprocessing scripts perform several operations on the original datasets, such as creating new variables or aggregating data, that can improve model training and its final results.
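
As a rough illustration of these two kinds of operation, the sketch below derives new variables (wind components from speed and direction) and aggregates sub-hourly precipitation into hourly totals. The function names, the record layout and the sample values are assumptions made for this example; the actual preprocessing scripts may work differently.

```python
import math
from collections import defaultdict
from datetime import datetime


def wind_components(speed, direction_deg):
    """Convert wind speed and meteorological direction (degrees the wind
    blows FROM) into eastward (u) and northward (v) components."""
    rad = math.radians(direction_deg)
    return -speed * math.sin(rad), -speed * math.cos(rad)


def hourly_precipitation(records):
    """Sum precipitation per hour from (timestamp, mm) pairs."""
    totals = defaultdict(float)
    for timestamp, mm in records:
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        totals[hour] += mm
    return dict(sorted(totals.items()))


# Hypothetical 15-minute observations, made up for illustration only.
records = [
    (datetime(2023, 1, 1, 0, 0), 0.2),
    (datetime(2023, 1, 1, 0, 15), 0.4),
    (datetime(2023, 1, 1, 0, 45), 1.0),
    (datetime(2023, 1, 1, 1, 30), 2.5),
]
hourly = hourly_precipitation(records)
u, v = wind_components(speed=5.0, direction_deg=270.0)  # westerly wind
```

Splitting the wind into u and v avoids the discontinuity between 359 and 0 degrees, which can otherwise confuse a model that treats direction as an ordinary number.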

Dataset building

These scripts build the train, validation and test datasets from the time series produced in the previous steps. These datasets are the input to the model training step.
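
One common way to turn a time series into train, validation and test datasets is to cut it into sliding windows and then split the windows in chronological order. The sketch below shows that idea only. The function names, the window length and the 70/15/15 split are assumptions and are not taken from the project's scripts.

```python
def sliding_windows(series, window, horizon=1):
    """Build (inputs, target) pairs: `window` past values predict the value
    `horizon` steps after the end of the window."""
    pairs = []
    for start in range(len(series) - window - horizon + 1):
        end = start + window
        pairs.append((series[start:end], series[end + horizon - 1]))
    return pairs


def chronological_split(samples, train_frac=0.7, val_frac=0.15):
    """Split samples in time order, without shuffling, so validation and
    test targets come strictly after the training targets."""
    n = len(samples)
    train_end = int(n * train_frac)
    val_end = train_end + int(n * val_frac)
    return samples[:train_end], samples[train_end:val_end], samples[val_end:]


# Hypothetical series: 24 consecutive observations.
series = [float(i) for i in range(24)]
samples = sliding_windows(series, window=3)
train, val, test = chronological_split(samples)
print(len(train), len(val), len(test))
```

Splitting by time rather than at random keeps future observations out of the training targets. Windows next to a split boundary still share some input values with the neighbouring split, so a gap between splits may be worth adding in practice.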

Model training and evaluation

The model generation script trains the model, evaluates it on the test dataset and exports the results obtained.
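
The evaluate-and-export part of this step can be sketched as follows. The metrics (mean absolute error and root mean squared error), the function name and the sample values are assumptions chosen for illustration; this README does not state which metrics the project reports or in which format results are exported.

```python
import json
import math


def evaluate(y_true, y_pred):
    """Return mean absolute error and root mean squared error."""
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return {"mae": mae, "rmse": rmse}


# Hypothetical observed and predicted rainfall (mm/h) on a test set.
observed = [0.0, 1.0, 4.0, 0.0]
predicted = [0.0, 2.0, 2.0, 1.0]

results = evaluate(observed, predicted)
report = json.dumps(results, indent=2)  # could be written to a file instead
print(report)
```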

r2t

Notebooks

There are several Jupyter Notebooks in the notebooks directory. They were used for initial experiments and exploratory data analysis. These notebooks are not guaranteed to run correctly because the code has since been refactored.
