
adjustments to be able to run workflows in dev branch #218

Merged · 33 commits merged into develop-one-steps from dev-feature/train-ml on Apr 9, 2020

Conversation

AnnaKwa (Contributor) commented Apr 6, 2020

The land-sea mask variable now has NaN values instead of 0s. This PR converts those to 0 in the training data pipeline. (This is now fixed in the zarr write step: #223.)

Fix references to old variable names in training data, model training, offline diagnostics, and prognostic run workflows.
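
As an illustration of the NaN-to-zero conversion described above, a minimal sketch assuming xarray; the helper name and the mask variable name are hypothetical, not the PR's actual code:

import xarray as xr

def fill_land_sea_mask_nans(ds: xr.Dataset, mask_name: str = "land_sea_mask") -> xr.Dataset:
    # Hypothetical helper: replace NaNs in the land-sea mask with 0 so the
    # training data pipeline sees a purely numeric mask.
    return ds.assign({mask_name: ds[mask_name].fillna(0.0)})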

@AnnaKwa AnnaKwa requested a review from brianhenn April 6, 2020 23:40
@AnnaKwa AnnaKwa changed the title adjustments to be able to train ML model adjustments to be able to run workflows Apr 8, 2020
@AnnaKwa AnnaKwa changed the title adjustments to be able to run workflows adjustments to be able to run workflows in dev branch Apr 8, 2020
brianhenn (Contributor) left a comment

Hey, this looks good. I skimmed it and haven't run it yet, but thanks for the refactor and for removing the hardcoded values.

rest = [dim for dim in ds[[var_name]].dims if dim not in first_dims]
xpose_dims = first_dims + rest
new_ds = ds[[var_name]].copy().transpose(*xpose_dims)

for grid_var in coord_vars:
    new_ds = new_ds.assign_coords(coords={grid_var: ds[grid_var]})

return new_ds.drop(
brianhenn (Contributor) commented:
OK, I checked and this is not actually necessary for plotting `open_restarts` datasets, but it is nice to remove the extraneous coords in that case, since the point of `mappable_var` is to reset the coords to the grid vars. So I would suggest the following:

Suggested change (replacing the `return new_ds.drop(` line):

for coord in [coord_y_center, coord_x_center, coord_y_outer, coord_x_outer]:
    if coord in new_ds.coords:
        new_ds = new_ds.drop(coord)
return new_ds

Anna Kwa and others added 2 commits April 9, 2020 09:05
Co-Authored-By: brianhenn <brianhenn@gmail.com>
@AnnaKwa AnnaKwa merged commit 1db14ba into develop-one-steps Apr 9, 2020
@AnnaKwa AnnaKwa deleted the dev-feature/train-ml branch April 9, 2020 18:42
nbren12 added a commit that referenced this pull request Apr 13, 2020
* Feature/one step save baseline (#193)

This adds several features to the one-step pipeline

- big zarr: everything is stored as one zarr file (see the sketch after this list)
- saves physics outputs
- some refactoring of the job submission.
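
A rough sketch of the "one big zarr" idea using xarray; the file names, concat dimension, and output path below are placeholders, not the workflow's actual names:

import xarray as xr

# Placeholder inputs standing in for the per-timestep datasets produced by the one-step runs.
step_datasets = [xr.open_dataset(path) for path in ["20160801.001500.nc", "20160801.003000.nc"]]

# Concatenate along an initialization-time dimension and write a single zarr store.
big = xr.concat(step_datasets, dim="initial_time")
big.to_zarr("one_step_big.zarr", mode="w", consolidated=True)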

Sample output: https://gist.github.com/nbren12/84536018dafef01ba5eac0354869fb67

* save lat/lon grid variables from sfc_dt_atmos (#204)

* Feature/use onestep zarr train data (#207)

Use the big zarr from the one-step workflow as input to the create training data pipeline.

* One-step sfc variables time alignment (#214)

This makes the diagnostic variables appended to the big zarr have the appropriate step and forecast_time dimensions, just as the variables extracted by the wrapper do.

* One step zarr fill values (#223)

This accomplishes two things: 1) it prevents true model 0s from being cast to NaNs in the one-step big zarr output, and 2) it initializes the big zarr arrays with NaNs via `full`, so that if they are not filled in (due to a failed timestep or another reason) the gap is more apparent than with `empty`, which produces arbitrary output.
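
A minimal sketch of the fill-value idea, assuming the zarr v2 Python API; the shape, chunks, store path, and variable name are placeholders:

import numpy as np
import zarr

# Initialize the target array with NaNs so unwritten regions (for example from a
# failed timestep) stand out, instead of zarr.empty's arbitrary contents.
store = zarr.DirectoryStore("one_step_big.zarr")
arr = zarr.full(
    shape=(96, 6, 48, 48),
    chunks=(1, 6, 48, 48),
    dtype="f4",
    fill_value=np.nan,
    store=store,
    path="air_temperature",
)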

* adjustments to be able to run workflows in dev branch (#218)

Remove references to hard-coded dims and data variables (and imports from `vcm.cubedsphere.constants`) and replace them with arguments.
Coords and dims can now be provided as arguments to `mappable_var`.
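
As a hypothetical reconstruction (based on the review excerpt above, not the merged code), the change amounts to a signature like this, with the default coordinate names purely illustrative:

import xarray as xr

def mappable_var(ds: xr.Dataset, var_name: str, coord_vars=("lat", "latb", "lon", "lonb")) -> xr.Dataset:
    # coord_vars is now an argument instead of a hard-coded module constant,
    # so the helper works with datasets that use different naming conventions.
    new_ds = ds[[var_name]].copy()
    for grid_var in coord_vars:
        if grid_var in ds:
            new_ds = new_ds.assign_coords({grid_var: ds[grid_var]})
    return new_ds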

* One steps start index (#231)

Allows the one-step jobs to start at a specified index in the timestep list, for testing or to skip spin-up timesteps.
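
Conceptually this is just a slice of the timestep list; the function and argument names here are assumptions:

def select_timesteps(timestep_list, start_index=0):
    # Skip spin-up timesteps, or shorten a test run, by starting partway in.
    return timestep_list[start_index:]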

* Dev fix/integration tests (#234)

* change order of required args so output is last

* fix arg for onestep input to be dir containing big zarr

* update end to end integration test ymls

* prognostic run adjustments

* Improved fv3 logging (#225)

This PR introduces several improvements to the logging capability of our prognostic run image

- include upstream changes to disable output capturing in `fv3config.fv3run`
- Add a `capture_fv3gfs_func` function. When called, this captures the raw fv3gfs output and re-emits it as DEBUG-level logging statements that can be filtered more easily (see the sketch after this list).
- Refactor `runtime` to `external/runtime/runtime`. This was easy since it did not depend on any other module in fv3net (except, implicitly, the code in `fv3net.regression`, which is imported when loading the sklearn model with pickle).
- Update fv3config to master.
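
The capture-and-re-emit pattern, sketched at the Python level only; the real `capture_fv3gfs_func` presumably also handles output from the compiled model, so this is a simplified illustration:

import contextlib
import io
import logging

logger = logging.getLogger("fv3gfs")

def call_with_captured_output(func, *args, **kwargs):
    # Illustrative only: run a chatty function, capture whatever it prints,
    # and re-emit each line as a DEBUG record that log handlers can filter.
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        result = func(*args, **kwargs)
    for line in buffer.getvalue().splitlines():
        logger.debug(line)
    return result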

* manually merge in the refactor from master while keeping new names from develop (#237)

* lint

* remove logging from testing

* Dev fix/arg order (#238)

* update history

* fix positional args

* fix function args

* update history

* linting

Co-authored-by: Anna Kwa <annak@vulcan.com>
Co-authored-by: brianhenn <brianhenn@gmail.com>