# Masking land or ocean values

Some variables do not make sense to plot over the ocean. For instance, primary productivity is zero over the ocean in the models we work with, so instead of showing a value of 0 at ocean gridpoints we can mask those points out altogether and not show them. The opposite case also occurs: sea surface temperature (SST) fields sometimes have values on land simply because the model needs values everywhere, and the model itself masks them out later. In that case we should mask out the land values because they are not meaningful.
Let's do this with obrero. First import the module:

In [1]:
# small hack to be able to import module without install
import os
import sys
sys.path.append(os.getcwd() + '/../')

import obrero

Now read some data. To know where land and ocean are, we generally use a "land binary mask": an array with a value of 0 at ocean gridpoints and 1 at land gridpoints (hence "binary"). We need to get this mask from somewhere, and there is one in our sample data kit. The data and the mask must share the same spatial grid (same latitude and longitude), though they can have different time axes, since the land mask usually does not vary in time. Let's assume we are interested in studying precipitation over the ocean and over land separately, so let's read precipitation and convert its units:

In [3]:
# read data
fname = 'data/ctl.nc'
lname = 'data/lsm.nc'

# read as data arrays
da = obrero.read_nc(fname, 'pr')
lm = obrero.read_nc(lname, 'lsm')

# convert units
da.convert_units('mm day-1')
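
Since the data and the mask must share the same grid, it can be worth checking this before masking. The following is a minimal sketch using plain NumPy, not part of obrero: the helper `same_grid()` and the toy coordinate arrays are hypothetical and only illustrate the idea. With real data one would pass the latitude and longitude coordinates of both arrays instead.

```python
import numpy as np


def same_grid(lat_a, lon_a, lat_b, lon_b):
    """Hypothetical helper: True if two grids share latitude and longitude."""
    lat_a, lon_a = np.asarray(lat_a), np.asarray(lon_a)
    lat_b, lon_b = np.asarray(lat_b), np.asarray(lon_b)

    # different sizes can never match
    if lat_a.shape != lat_b.shape or lon_a.shape != lon_b.shape:
        return False

    # compare coordinate values with a floating point tolerance
    return bool(np.allclose(lat_a, lat_b) and np.allclose(lon_a, lon_b))


# toy coordinates (assumed values, only for illustration)
lat = np.linspace(-85.0, 85.0, 32)
lon = np.arange(0.0, 360.0, 5.625)

print(same_grid(lat, lon, lat, lon))        # identical grids
print(same_grid(lat, lon, lat[::2], lon))   # coarser latitude axis
```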

Now we use the function `mask_land_ocean()` in obrero to keep values on either land or ocean. The choice is made with a boolean flag, `ocean`, passed as an optional third argument. By default the function masks out ocean values, keeping only land:

In [5]:
# only land
daland = obrero.mask_land_ocean(da, lm)
daland

# only ocean
daocean = obrero.mask_land_ocean(da, lm, ocean=True)
daocean

<xarray.DataArray 'pr' (time: 72, latitude: 32, longitude: 64)>
array([[[0.70923 , 0.777796, ..., 0.540069, 0.629775],
        [0.974657, 1.099068, ..., 0.706435, 0.826919],
        ...,
        [     nan,      nan, ...,      nan,      nan],
        [     nan,      nan, ...,      nan,      nan]],

       [[0.186415, 0.181961, ..., 0.169736, 0.178863],
        [1.689129, 1.527837, ..., 1.74003 , 1.793405],
        ...,
        [     nan,      nan, ...,      nan,      nan],
        [     nan,      nan, ...,      nan,      nan]],

       ...,

       [[0.555498, 0.599123, ..., 0.43743 , 0.504916],
        [1.976995, 1.71042 , ..., 2.458799, 2.274632],
        ...,
        [     nan,      nan, ...,      nan,      nan],
        [     nan,      nan, ...,      nan,      nan]],

       [[0.625771, 0.617749, ..., 0.601025, 0.618503],
        [1.539145, 1.30133 , ..., 1.79069 , 1.720224],
        ...,
        [     nan,      nan, ...,      nan,      nan],
        [     nan,      nan, ...,      n

Masked gridpoints are filled with `numpy.nan`, so the NaN values appear in opposite places in the two arrays: where one has data, the other has NaN.
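
To make the idea concrete, here is a minimal sketch of this kind of masking with plain NumPy. It is not obrero's implementation: the function `mask_by_land_sea()` and the toy arrays are hypothetical, and the sketch assumes a mask with 1 on land and 0 on ocean, as described above. The two-dimensional mask is broadcast across the time axis of the data.

```python
import numpy as np


def mask_by_land_sea(values, land_mask, ocean=False):
    """Hypothetical stand-in: keep land values by default, ocean if requested."""
    values = np.asarray(values, dtype=float)
    land = np.asarray(land_mask) == 1

    # choose which gridpoints to keep
    keep = ~land if ocean else land

    # the (lat, lon) mask broadcasts against (time, lat, lon) data
    return np.where(keep, values, np.nan)


# toy data: 2 time steps on a 2 x 3 grid
pr = np.arange(12, dtype=float).reshape(2, 2, 3)
lsm = np.array([[1, 0, 0],
                [0, 1, 1]])

pr_land = mask_by_land_sea(pr, lsm)
pr_ocean = mask_by_land_sea(pr, lsm, ocean=True)

# each gridpoint should be NaN in exactly one of the two arrays
complementary = np.isnan(pr_land) ^ np.isnan(pr_ocean)
print(complementary.all())
```

Using `numpy.nan` as the fill value requires a floating point array, which is why the sketch converts the input to `float` first.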