Train/test model with blurred data #74

Merged
merged 14 commits into from
May 21, 2019

Conversation

nbren12
Owner

@nbren12 nbren12 commented May 21, 2019

No description provided.

nbren12 added 14 commits May 16, 2019 17:35
The pre-processing pipeline was too confusing, and was scattered across several
snakemake rules.

This commit combines these into one script called uwnet/data/preprocess.py.
There is no longer a `step` dimension in the training data.
The data blurred with a radius of xxx will be stored at

data/processed/training/sigmaxxx.nc

The unblurred data will be stored at

data/processed/training/noBlur.nc
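
As a rough illustration of the naming convention above, a helper along the following lines could map a blur radius to its output file. The function name `training_data_path` and the choice to pass the radius as an already-formatted string are assumptions made for this sketch; the commit does not say how the radius is formatted in the filename.

```python
from pathlib import Path
from typing import Optional

# Directory named in the commit message.
TRAINING_DIR = Path("data/processed/training")


def training_data_path(sigma: Optional[str] = None) -> Path:
    """Return the training file for a given blur radius.

    `sigma` is the radius exactly as it should appear in the filename
    (hypothetical convention). `None` selects the unblurred data.
    """
    if sigma is None:
        return TRAINING_DIR / "noBlur.nc"
    return TRAINING_DIR / f"sigma{sigma}.nc"
```

Passing the radius through as a string avoids guessing at number formatting, which the source leaves unspecified.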
This commit makes it easier to identify rules related to
pre-processing
Training doesn't work in this commit.
Previously it was hard to debug errors with the input data.
The new pre-processed data has a time varying layer_mass dimension,
which broke the metrics calculation.
It now runs on olympus using the `sam_path` specified in the configuration
file.
One is for fast debugging.
One is for the blurred data.
model_run_path needs to be set
@nbren12 nbren12 merged commit 5571450 into master May 21, 2019
@nbren12 nbren12 deleted the feature/blur-inputs branch May 21, 2019 06:57
nbren12 added a commit that referenced this pull request Nov 13, 2020
To simulate the effect of coarse-resolution data, we can test/train the NN on blurred training data.
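
To make the idea of blurring concrete, here is a minimal one-dimensional sketch. It assumes the blur is a Gaussian smoothing whose width is a standard deviation `sigma`; the `sigmaxxx.nc` filenames suggest this, but the source does not confirm it. The function names, the 3-sigma truncation, and the edge handling are all illustrative choices, not the repository's implementation.

```python
import math
from typing import List, Sequence


def gaussian_kernel(sigma: float, truncate: float = 3.0) -> List[float]:
    """Normalized Gaussian weights, truncated at `truncate` standard deviations."""
    radius = max(1, int(truncate * sigma + 0.5))
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def gaussian_blur_1d(values: Sequence[float], sigma: float) -> List[float]:
    """Smooth `values` with a Gaussian kernel, repeating the edge values.

    The edge treatment is an assumption; a periodic domain would wrap instead.
    """
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(values)
    blurred = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            # Clamp the sample index so points past the boundary reuse the edge value.
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * values[j]
        blurred.append(acc)
    return blurred
```

A larger `sigma` spreads each point over more neighbours, which removes fine-scale structure in roughly the way a coarser grid would.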

Changes needed:

* Add script for blurring the data

* Refactor and improve SAM-based pre-processing scripts.

The pre-processing pipeline was too confusing, and was scattered across several snakemake rules. Now these are combined into one script: `uwnet/data/preprocess.py`.

* Automate NN training and SAM simulation with snakemake

These steps had to be executed manually because Sacred generated the folder names automatically. Now the model training and SAM runs are named based on the filename of the json file used to train them.

* Improve training messages

Previously it was hard to debug errors with the input data. Also use the agg backend for plots, so that training does not die on olympus.
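
The run-naming scheme described in the list above (naming training and SAM runs after the json file used to train them) could be sketched as follows. The helper names and the directory layout under `root` are hypothetical; only the idea of deriving the name from the json filename comes from the description.

```python
from pathlib import Path
from typing import Dict


def run_name(config_file: str) -> str:
    """Derive a run name from the json config's filename, without its extension."""
    return Path(config_file).stem


def output_dirs(config_file: str, root: str = "runs") -> Dict[str, Path]:
    """Hypothetical layout: one folder for the trained model, one for the SAM run."""
    name = run_name(config_file)
    return {
        "model": Path(root) / "models" / name,
        "sam": Path(root) / "sam" / name,
    }
```

Because the output paths are a pure function of the config filename, a workflow tool such as snakemake can work out which outputs belong to which config before anything runs, which is not possible when a tool picks the folder name at run time.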