Reduce disk space on surface datasets by using the urban lookup table #633
Comments
I went over this with @olyson. We agree it would be good to do. It will save disk space and also make it easier for Keith to do urban updates: he won't have to create new surface datasets in order to try out new urban dataset parameters, he can just point to the new dataset. So this change is more flexible for Keith and his work, and also saves maybe 100 GBytes of data each time we create new surface datasets at all the different standard resolutions. It also makes the surface dataset easier to read and understand, and bundles all the urban fields into one file (so it's more obvious, for example, that this data isn't needed for paleo work).

The surface dataset will still have the region_id and the density index, so mksurfdata_map will read those two from the raw urban dataset. Then CLM at initialization will take the density index of the landunit and the region ID to populate all the urban parameters.

We could separate the urban dataset into two files, one used by mksurfdata_map and one by CLM. But we figure we might as well keep them on the same raw data file in the beginning. Both CLM and mksurfdata_map will point to the same file and just use different bits of information from it. Keeping it on the same file also allows the current tools to continue working with it.

Keith will look into whether he has time to work on this. If not, we don't know when it will happen, but it will be on the list of priorities. The change to mksurfdata_map is simple -- just removing things -- so @ekluzek will do that. The part in CTSM is more involved, but it's mainly moving the code from mksurfdata_map into CTSM itself, so in principle it isn't hard.
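The initialization-time lookup described above -- taking a landunit's region ID and density index and using them to populate the urban parameters from the raw urban file -- could be sketched roughly like this. This is only an illustration; all names (fill_urban_params, lookup, canyon_hwr) are hypothetical, not actual CTSM variables or routines:

```fortran
! Hypothetical sketch: the surface dataset supplies only region_id and
! density_index per urban landunit; every other urban parameter is read
! from the raw urban lookup table, dimensioned (region, density class).
! Names are illustrative, not actual CTSM code.
subroutine fill_urban_params(region_id, density_index, lookup, canyon_hwr)
  integer, intent(in)  :: region_id(:)      ! per-landunit urban region ID
  integer, intent(in)  :: density_index(:)  ! per-landunit density class
  real(8), intent(in)  :: lookup(:,:)       ! table values: (region, density)
  real(8), intent(out) :: canyon_hwr(:)     ! one of the 34 urban fields
  integer :: l

  do l = 1, size(region_id)
     ! Each urban field becomes a simple two-index table lookup
     canyon_hwr(l) = lookup(region_id(l), density_index(l))
  end do
end subroutine fill_urban_params
```

The same two indices would drive the lookup for each of the 34 urban fields, which is what makes swapping in an updated urban dataset a file change rather than a surface-dataset regeneration.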
I will create a branch to start this. Should I branch off master or the clm release branch?
Branch off of master. This will just go on master and not the release branch.
I have something on a branch that is working. It runs for a global test case and passes this test (including bit-for-bit with ctsm1.0.dev031):

ERP_Ld3.f09_g17.I1850Clm50BgcCropCru.cheyenne_intel.clm-ciso

Basically, it replaces the UrbanInput code in UrbanParamsMod.F90 that reads the urban parameters from the surface dataset with a call to mkurbanpar, the urban lookup table routine in tools/mksurfdata_map/src. But to do this, I had to add in the following modules from tools/mksurfdata_map/src because of dependencies:

Not sure if that is desirable. It might be possible to extract exactly what is needed from these four routines and put it in other existing appropriate modules.

Another issue is that there may be a more elegant way to execute the following code, which is used to fill the urbinp arrays:

  do p = 1, size(params_scalar)

For example, I'm wondering if there are any upper- to lower-case conversion functions in Fortran.
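On the case-conversion question: standard Fortran has no intrinsic upper- or lower-case conversion function, so codes typically write a small helper by hand. A minimal sketch, assuming ASCII character handling (the name to_lower and the demo string are illustrative, not CTSM code):

```fortran
! Hypothetical helper, not part of CTSM: lower-cases ASCII letters by
! shifting their character codes; all other characters pass through.
program demo_to_lower
  implicit none
  write(*, '(a)') to_lower('CANYON_HWR')   ! writes: canyon_hwr
contains
  pure function to_lower(str) result(lowered)
    character(len=*), intent(in) :: str
    character(len=len(str))      :: lowered
    integer :: i, ic
    do i = 1, len(str)
       ic = iachar(str(i:i))
       if (ic >= iachar('A') .and. ic <= iachar('Z')) then
          ! 'a' - 'A' = 32 in the ASCII collating sequence
          lowered(i:i) = achar(ic + 32)
       else
          lowered(i:i) = str(i:i)
       end if
    end do
  end function to_lower
end program demo_to_lower
```

A helper like this could normalize the parameter names before the loop over params_scalar, avoiding duplicated upper/lower-case branches.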
Erik and I had a discussion about whether this new feature should be required before bringing in the new set of urban datasets described in PR#591. In the interest of supporting the urban user community and providing them with the new datasets by default in a timely manner, we decided to table this new feature until after the new urban datasets are brought in. https://github.com/olyson/THESISUrbanPropertiesTool would have to be reworked to produce a separate table of urban properties consistent with the above approach.
@olyson and @briandobbins, this is part of what I was talking about in email. From the comments above we see that @olyson got started on this, but the THESISUrbanPropertiesTool would need to be updated, which was too big of a lift. From that it's obvious this couldn't be done for CESM3.
There are 34 urban fields on the surface dataset that are output in double precision at the resolution of the output grid. We could save space, and make it easier to bring in updates to the urban fields, by keeping just the region ID on the surface dataset and referencing the lookup table.
Roughly 25% of the surface dataset is made up of such urban fields. We could save about 50 GBytes per new set of surface datasets made. The entire raw dataset (with region ID at 0.05x0.05 degree resolution) is only about one GByte.
@olyson @dlawrenncar