# grower function

Sometimes you have data defined on different domains that you need to compare.

The most common example is a mask. Let's say you have three-dimensional (time/lat/lon) data but are only interested in the land or ocean area.

To illustrate this point, let's construct a dataset made of **surface** temperature over land (variable `ts`) and **2-meter air temperature** over ocean (variable `tas`). This new dataset will be closer to actual observations.

First, let's retrieve one year's worth of data for each variable:

In [1]:
import cdms2
ipsl_tas_file = cdms2.open("/global/cscratch1/sd/cmip6/CMIP6/CMIP/IPSL/IPSL-CM6A-LR/historical/r1i1p1f1/Amon/tas/gr/v20180803/tas_Amon_IPSL-CM6A-LR_historical_r1i1p1f1_gr_185001-201412.nc")
ipsl_ts_file = cdms2.open("/global/cscratch1/sd/cmip6/CMIP6/CMIP/IPSL/IPSL-CM6A-LR/historical/r1i1p1f1/Amon/ts/gr/v20180803/ts_Amon_IPSL-CM6A-LR_historical_r1i1p1f1_gr_185001-201412.nc")
tas = ipsl_tas_file('tas', time=slice(0,12)) # First year
ts = ipsl_ts_file('ts', time=slice(0,12)) # First year
print(ts.shape)

(12, 143, 144)


Now we need a land/sea mask, which we take from the piControl experiment:

In [2]:
ipsl_ldsea_file = cdms2.open("/global/cscratch1/sd/cmip6/CMIP6/CMIP/IPSL/IPSL-CM6A-LR/piControl/r1i1p1f1/fx/sftlf/gr/v20181123/sftlf_fx_IPSL-CM6A-LR_piControl_r1i1p1f1_gr.nc")
sftlf = ipsl_ldsea_file("sftlf")
print(sftlf.shape)

(143, 144)


We need to add the time dimension to `sftlf` so that we can mask appropriately. For this we use the `grower` function:

In [3]:
import genutil
tas, sftlf_grown = genutil.grower(tas, sftlf)

print(sftlf_grown.shape)  # Now has time

(12, 143, 144)


`grower` takes two variables and, for each one, duplicates its data along any dimension it is missing relative to the other. It then returns the two *grown* variables, which now share the same dimensions and shape.
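
For readers more familiar with NumPy, the effect on shapes is close to broadcasting. The sketch below is only an analogy on synthetic arrays, not a description of how `genutil` works internally. The array names and sizes are made up for illustration, and plain NumPy arrays carry no axis metadata, so here the dimensions are matched by position rather than by axis.

```python
import numpy as np

# Synthetic stand-ins (hypothetical sizes, not the IPSL grid)
data = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)  # (time, lat, lon)
mask2d = np.arange(3 * 4, dtype=float).reshape(3, 4)       # (lat, lon)

# Repeat the 2D field along a new leading "time" axis so the shapes match.
# broadcast_to returns a read-only view; call .copy() for a writable array.
mask_grown = np.broadcast_to(mask2d, data.shape)

print(mask_grown.shape)
```

Every time step of `mask_grown` holds the same 2D field, which is what makes an element-wise comparison with `data` possible.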

***IMPORTANT***

The dimension order of the new variables is: all the dimensions of the first variable passed, **THEN** any dimensions that were not present in the first variable but were in the second.

This means that the order in which you pass the variables to `grower` matters. For example, if we switch them in the above example we get:

In [4]:
sftlf_grown2, tas2 = genutil.grower(sftlf, tas)

print(sftlf_grown2.shape)  # Time added as the last dimension
print(tas2.shape)  # Reordered: time is now last

(143, 144, 12)
(143, 144, 12)


Notice that time is now **LAST** for **BOTH** variables, because the dimensions of the first variable passed were lat/lon.
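
The ordering rule can be sketched in a few lines of plain Python. The helper `grown_order` below is hypothetical (it is not part of `genutil`) and works on axis labels only, using the `t`/`y`/`x` letters for time/lat/lon; it illustrates the rule as stated above, nothing more.

```python
def grown_order(first_axes, second_axes):
    """Return the axis order described above: every axis of the first
    variable, then any axis found only in the second variable."""
    extra = [axis for axis in second_axes if axis not in first_axes]
    return list(first_axes) + extra


print(grown_order("tyx", "yx"))  # ['t', 'y', 'x']
print(grown_order("yx", "tyx"))  # ['y', 'x', 't']
```

The two calls mirror the two `grower` calls above: with `tas` first, time stays in front; with `sftlf` first, time ends up last.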

***NOTE***

The variables do not need to have any dimension in common.

In [5]:
time_serie = genutil.averager(tas, axis = 'xy')
print(time_serie.shape, sftlf.shape)
v1, v2 = genutil.grower(time_serie, sftlf)
print(v1.shape, v2.shape)

(12,) (143, 144)
(12, 143, 144) (12, 143, 144)
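
As a rough NumPy analogy for this no-shared-dimension case (synthetic arrays with hypothetical sizes, and positional axes instead of named ones), each array first gets length-1 placeholder axes for the dimensions it lacks, and both are then broadcast to the common shape:

```python
import numpy as np

series = np.arange(3.0)   # (time,)
field = np.ones((4, 5))   # (lat, lon)

# Add placeholder axes, then expand both arrays to (time, lat, lon)
v1, v2 = np.broadcast_arrays(series[:, None, None], field[None, :, :])

print(v1.shape, v2.shape)
```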


Let's go back to our example. Now that our land/sea mask and our data have matching shapes, we can easily produce our combined dataset:

In [6]:
import MV2
land = MV2.greater(sftlf_grown, 50.)
# Use ts over land, tas otherwise (ocean)
combined = MV2.where(land, ts, tas)
print(combined.shape, combined.getOrder())

(12, 143, 144) tyx
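
To see what the selection does cell by cell, here is a small NumPy sketch of the same logic on synthetic values. The numbers are invented for illustration, and unlike `MV2.where` plain NumPy keeps no axis metadata, so there is no `getOrder()` equivalent.

```python
import numpy as np

# Synthetic stand-ins: 2 time steps on a 2x2 grid (hypothetical values)
ts = np.full((2, 2, 2), 300.0)    # surface temperature
tas = np.full((2, 2, 2), 280.0)   # 2-meter air temperature
sftlf = np.array([[100.0, 0.0],
                  [60.0, 40.0]])  # land fraction in percent

# Threshold the land fraction, then grow the boolean mask along time
land = np.broadcast_to(sftlf > 50.0, ts.shape)

# Use ts where the cell is mostly land, tas elsewhere
combined = np.where(land, ts, tas)

print(combined[0])
```

Cells with a land fraction above 50% take their value from `ts`, and all other cells take it from `tas`, at every time step.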
