# Masking Data


---

## Learning Objectives 


- Provide an overview of masking data in Xarray
- Mask data using the `.where()` method

## Prerequisites


| Concepts | Importance | Notes |
| --- | --- | --- |
| [Understanding of xarray core data structures](./01-xarray-fundamentals.ipynb) | Necessary | |
| [Familiarity with NumPy](https://numpy.org/doc/stable/reference/arrays.indexing.html) | Helpful | |

- **Time to learn**: *10 minutes*



---

## Overview

Using `xr.where()` or the `.where()` method, elements of an Xarray Dataset or DataArray can be replaced or masked depending on whether they satisfy a given condition (or multiple conditions). To demonstrate this, we are going to use the `.where()` method on the `tos` DataArray.
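As a rough illustration of the two entry points mentioned above, the sketch below applies the same condition through the `.where()` method and through the top-level `xr.where()` function. The array `toy` and its values are hypothetical stand-ins, not the `tos` data used in this notebook.

```python
import numpy as np
import xarray as xr

# Hypothetical toy data standing in for a sea surface temperature field
toy = xr.DataArray(
    np.array([[-1.5, 0.5], [12.0, 28.0]]),
    dims=("lat", "lon"),
    coords={"lat": [-60.0, 0.0], "lon": [10.0, 20.0]},
)

# Method form: keep values where the condition is True, NaN elsewhere
method_result = toy.where(toy > 0)

# Function form: choose between two explicit alternatives
function_result = xr.where(toy > 0, toy, np.nan)
```

In this sketch both forms give the same values. The function form always needs both alternatives spelled out, while the method form falls back to a missing value when no replacement is given.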

## Imports


In [None]:
import matplotlib.pyplot as plt
import xarray as xr

Open the sea surface temperature dataset:

In [None]:
ds = xr.open_dataset(
    "./data/tos_Omon_CESM2_historical_r11i1p1f1_gr_200001-201412.nc", engine="netcdf4"
)
ds

## Using `where` with one condition

In [None]:
sample = ...
sample

Before applying `.where()`, let's look at the documentation:

In [None]:
sample.where?

- As the documentation points out, the conditional expression in `.where()` can be:

    - a DataArray
    - a Dataset
    - a function

- Unlike `.isel()` and `.sel()`, which change the shape of the returned results, `.where()` preserves the shape of the original data. It accomplishes this by returning values from the original DataArray or Dataset wherever the `condition` is `True`, and filling in missing values (`NaN` by default) wherever the `condition` is `False`.
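The difference in shape behavior can be seen in a small sketch. The array `toy` below is a hypothetical example rather than the notebook's `sample`; it also shows a function being passed as the condition.

```python
import numpy as np
import xarray as xr

# Hypothetical 3 x 4 grid with values 0.0 through 11.0
toy = xr.DataArray(
    np.arange(12.0).reshape(3, 4),
    dims=("lat", "lon"),
    coords={"lat": [-10.0, 0.0, 10.0], "lon": [0.0, 90.0, 180.0, 270.0]},
)

# Indexing selects a subset, so the shape changes
subset = toy.isel(lat=0)

# Masking keeps every grid cell and marks the rejected ones as NaN
masked = toy.where(toy >= 4)

# The condition can also be a function that receives the object itself
masked_callable = toy.where(lambda x: x >= 4)

print(subset.shape, masked.shape)
```

Here `subset` loses the `lat` dimension, while `masked` keeps the full grid and only the first row becomes missing.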


For demonstration purposes, let's use `.where()` to mask locations with temperature values greater than `0`:

In [None]:
masked_sample = sample...
masked_sample

Let's plot both our original sample, and the masked sample:

In [None]:
fig, axes = plt.subplots(ncols=2, figsize=(19, 6))
sample.plot(ax=axes[0], robust=True)
masked_sample.plot(ax=axes[1], robust=True);

## Using `where` with multiple conditions

`.where()` accepts multiple conditions. To do this, we need to make sure each conditional expression is enclosed in `()`. To combine conditions, we use the bitwise and (`&`) operator and/or the bitwise or (`|`) operator. Let's use `.where()` to mask locations with temperature values less than 25 or greater than 30:
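A minimal sketch of combining conditions, using a hypothetical one-dimensional array `toy` in place of the notebook's data:

```python
import numpy as np
import xarray as xr

# Hypothetical temperatures in degrees Celsius
toy = xr.DataArray(np.array([20.0, 26.0, 29.0, 31.0]), dims="x")

# Keep values strictly between 25 and 30; everything else becomes NaN.
# The parentheses matter because & binds more tightly than > and <.
between = toy.where((toy > 25) & (toy < 30))

# The complementary mask, built with the bitwise or operator
outside = toy.where((toy < 25) | (toy > 30))
```

In this sketch `between` keeps only 26 and 29, and `outside` keeps only 20 and 31.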

We can use coordinates to apply a mask as well. Below, we use the `latitude` and `longitude` coordinates to mask the [Niño 3.4 region](https://www.ncdc.noaa.gov/teleconnections/enso/indicators/sst/):

![](https://www.ncdc.noaa.gov/monitoring-content/teleconnections/nino-regions.gif)
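One way such a coordinate-based mask might look is sketched below. It assumes a hypothetical grid whose coordinates are named `lat` and `lon`, with longitudes on a 0 to 360 scale, so that the Niño 3.4 box (5°S to 5°N, 170°W to 120°W) corresponds to longitudes 190 to 240. The names and grid spacing are illustrative and may differ from the dataset opened above.

```python
import numpy as np
import xarray as xr

# Hypothetical coarse grid covering part of the tropical Pacific
lat = np.arange(-20.0, 21.0, 5.0)
lon = np.arange(150.0, 271.0, 10.0)
toy = xr.DataArray(
    np.ones((lat.size, lon.size)),
    dims=("lat", "lon"),
    coords={"lat": lat, "lon": lon},
)

# Each comparison is one-dimensional; combining them broadcasts to (lat, lon)
in_region = (
    (toy.lat >= -5) & (toy.lat <= 5) & (toy.lon >= 190) & (toy.lon <= 240)
)

# Values outside the box become NaN; the grid shape is unchanged
nino34 = toy.where(in_region)
```

Because the condition is built from coordinates rather than data values, the same `in_region` mask could be reused for any variable that shares this grid.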



## Using `where` with a custom fill value

`.where()` can take a second argument, which, if supplied, is used as the fill value for the masked region. Below, we fill masked regions with a constant `0`:
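A minimal sketch of the fill-value behavior, again using a hypothetical array `toy` rather than the notebook's data:

```python
import numpy as np
import xarray as xr

# Hypothetical temperatures, one of them below zero
toy = xr.DataArray(np.array([-2.0, 5.0, 27.0]), dims="x")

# Second positional argument: value used where the condition is False
filled = toy.where(toy > 0, 0)

# The same call with the argument named explicitly
filled_kw = toy.where(toy > 0, other=0)
```

With a fill value supplied, the result contains no missing values: the rejected element is replaced by `0` instead of `NaN`.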

---

In [None]:
%load_ext watermark
%watermark --time --python --updated --iversion

## Resources and References

- [Xarray Documentation - Masking with `where()`](https://xarray.pydata.org/en/stable/user-guide/indexing.html#masking-with-where)

<div class="admonition alert alert-success">
    <p class="title" style="font-weight:bold">Previous: <a href="./04-computation.ipynb">Computation</a></p>
    <p class="title" style="font-weight:bold">Next: <a href="./06-end-to-end-example.ipynb">End-to-End example: Computing Niño 3.4 Index</a></p>
</div>