Refactor stacked climate dataset #103

prakhar6sharma · 2023-05-18T01:43:36Z

Issues fixed by this PR:

StackedClimateDataset is hard-coded to work only for Downscaling because its return signature differs from that of ERA5 and ClimateDataset. Specifically, Downscaling assumes that the raw_data it receives is a list of two items, one for input and one for output. This limits us to a single dataset for input and a single dataset for output.
Imports under the data/ folder were all absolute imports.
Forecasting's create_copy() allows illegal values for some of its attributes.

Solution implemented:

Every ClimateDataset has a new attribute called name. All variables are now indexed by f"{dataset_name}:{variable}".
StackedClimateDataset now allows recursively stacking arbitrary ClimateDatasets while keeping the same return signature. Specifically, if it has two child datasets named "child1" and "child2" and its own name is "parent", the children are renamed to "parent:child1" and "parent:child2".
Because of the above change, the in_vars, out_vars, and constants of the Task must be given as the dataset name, followed by a colon, followed by the variable name. Example: ["2m_temperature", "geopotential"] becomes ["era5:2m_temperature", "era5:geopotential"].
Imports under the data/ folder now use relative paths.
Imports are alphabetized.
Implemented create_copy() for Forecasting.
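
As a rough illustration of the naming scheme described above, here is a minimal sketch. It is not the code from this PR: ToyDataset, ToyStackedDataset, rename(), keys(), and qualify() are hypothetical names invented for the example, and the real classes carry data rather than just variable names. The sketch only shows how a parent name could be prepended to its children recursively and how keys of the form "dataset_name:variable" fall out of that.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyDataset:
    """Hypothetical leaf dataset: a name plus the variables it provides."""
    name: str
    variables: List[str]

    def rename(self, prefix: str) -> None:
        # Prepend the parent's name, e.g. "child1" -> "parent:child1".
        self.name = f"{prefix}:{self.name}"

    def keys(self) -> List[str]:
        # Every variable is indexed by "<dataset_name>:<variable>".
        return [f"{self.name}:{var}" for var in self.variables]


@dataclass
class ToyStackedDataset:
    """Hypothetical stacked dataset; children may themselves be stacked."""
    name: str
    children: list

    def __post_init__(self) -> None:
        # On construction, every child is renamed under this dataset's name.
        for child in self.children:
            child.rename(self.name)

    def rename(self, prefix: str) -> None:
        # Renaming a stacked dataset also re-prefixes all of its descendants.
        self.name = f"{prefix}:{self.name}"
        for child in self.children:
            child.rename(prefix)

    def keys(self) -> List[str]:
        # Same flat return shape as a leaf dataset, however deep the nesting.
        return [key for child in self.children for key in child.keys()]


def qualify(dataset_name: str, variables: List[str]) -> List[str]:
    """Turn bare variable names into "<dataset_name>:<variable>" entries."""
    return [f"{dataset_name}:{var}" for var in variables]
```

Under these assumptions, stacking ToyDataset("child1", ...) and ToyDataset("child2", ...) inside ToyStackedDataset("parent", ...) leaves the children named "parent:child1" and "parent:child2", and qualify("era5", ["2m_temperature", "geopotential"]) produces the new-style in_vars list from the example above.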

prakhar6sharma · 2023-05-18T02:19:13Z

To Do:

Update the notebooks.

jasonjewik

I'm 50-50 on this, so you could convince me either way, but why does name need to be a parameter for ClimateDatasetArgs? For example, if I use ERA5Args, it just makes sense that the name is "era5". A compromise solution could be that the default name is "era5", but we still allow the user to specify a different name.

Otherwise, good PR.

prakhar6sharma · 2023-05-18T20:05:38Z

Why name needs to be a parameter for ClimateDatasetArgs: Suppose we are doing downscaling. In the typical setup that we support, both the high_res and low_res data come from ERA5, so keeping the name as "era5" for both would lead to ambiguity. Right now, every ClimateDatasetArgs class has its own default name; for ERA5Args, that happens to be "era5".
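
To make the ambiguity concrete, here is a small sketch under stated assumptions. ToyERA5Args, qualified_keys(), and find_collisions() are hypothetical stand-ins, not the library's actual classes or functions, and the names "lr" and "hr" are made up for illustration. It shows an args class with a default name that the caller can override, and how two datasets that share a name would yield identical keys.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class ToyERA5Args:
    """Hypothetical args class with a default, overridable dataset name."""
    variables: List[str]
    name: str = "era5"  # default name; callers may pass a different one


def qualified_keys(args_list: List[ToyERA5Args]) -> List[str]:
    """Build "<name>:<variable>" keys for every dataset in the list."""
    return [f"{args.name}:{var}" for args in args_list for var in args.variables]


def find_collisions(args_list: List[ToyERA5Args]) -> List[str]:
    """Return the keys that appear more than once, sorted alphabetically."""
    counts = Counter(qualified_keys(args_list))
    return sorted(key for key, n in counts.items() if n > 1)


# Both datasets keep the default name, so their keys are indistinguishable.
same_name = [ToyERA5Args(["2m_temperature"]), ToyERA5Args(["2m_temperature"])]

# Overriding the name (hypothetical "lr" and "hr") removes the ambiguity.
distinct = [
    ToyERA5Args(["2m_temperature"], name="lr"),
    ToyERA5Args(["2m_temperature"], name="hr"),
]
```

In this sketch, find_collisions(same_name) reports "era5:2m_temperature" as a duplicate, while find_collisions(distinct) returns an empty list, which is the case for letting the user set name while still defaulting it to "era5".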

prakhar6sharma requested a review from jasonjewik as a code owner May 18, 2023 01:43

prakhar6sharma mentioned this pull request May 18, 2023

Current ViT implementation works with timm 0.6.12 and not with 0.9.2 #102

Closed

prakhar6sharma added 3 commits May 17, 2023 20:27

Implemented data source name as the part of the keys

8621937

fixed formatting

3f0b0a1

Updated tests to reflect the new changes

c278d11

prakhar6sharma force-pushed the refactor_stacked_climate_dataset branch from 09beb4a to c278d11 May 18, 2023 03:28

prakhar6sharma added 2 commits May 17, 2023 21:06

Updated notebooks

b1e8c1a

Alphabetize imports

0784457

jasonjewik requested changes May 18, 2023

Changed default names of ERA5Args and StackedClimateDatasetArgs

80be42c

jasonjewik approved these changes May 18, 2023

jasonjewik merged commit 95dc85c into main May 18, 2023
6 checks passed

jasonjewik deleted the refactor_stacked_climate_dataset branch May 18, 2023 20:15
