# **Tutorial 8: Masking with One Condition**

**Week 1, Day 1, Introduction to the Climate System**

**Content creators:** Sloane Garelick, Julia Kent

**Content reviewers:** Danika Gupta, Younkap Nina Duplex 

**Content editors:** Agustina Pesce

**Production editors:** TBD

**Our 2023 Sponsors:** TBD



### **Code and Data Sources**

Code and data for this tutorial are based on existing content from [Project Pythia](https://foundations.projectpythia.org/core/xarray/computation-masking.html).

## **Tutorial 8 Objectives**

One useful tool for assessing climate data is masking, which allows you to filter elements of a dataset according to a specific condition and create a "masked array" in which the elements that do not fulfill the condition are hidden. This tool is helpful if you wish to, for example, look only at data greater or less than a certain value, or from a specific temporal or spatial range. For instance, when analyzing a map of global precipitation, we could mask regions where mean annual precipitation is above or below a specific value or range of values in order to assess wet and dry regions.

In this tutorial we will learn how to mask data with one condition, and will apply this to our map of global SST to assess the impacts of the ice-albedo feedback on polar SST.

## Imports


In [None]:
# !pip install matplotlib
# !pip install numpy
# !pip install xarray
# !pip install pythia_datasets
# !pip install pandas

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr
from pythia_datasets import DATASETS
import pandas as pd

## Masking Data


Using the `xr.where()` function or the `.where()` method, elements of an xarray Dataset or xarray DataArray can be kept or masked depending on whether they satisfy a given condition (or several conditions). To demonstrate this, we are going to use the `.where()` method on the `tos` DataArray that we've been using in the past few tutorials.
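As a minimal sketch of the difference between the two forms, the example below uses a small, made-up 2x3 array (the values and the name `toy_sst` are invented for illustration and are not part of the CESM2 dataset). The method form keeps values where the condition holds and fills the rest with NaN, while the function form picks between two explicit alternatives.

```python
import numpy as np
import xarray as xr

# Hypothetical 2x3 temperature field (values in °C), invented for illustration
da = xr.DataArray(
    np.array([[-1.5, 0.5, 2.0], [-0.2, 3.0, -1.8]]),
    dims=("lat", "lon"),
    name="toy_sst",
)

# Method form: keep values where the condition is True, NaN elsewhere
kept_cold = da.where(da < 0)

# Function form: choose explicitly between two alternatives
# (1 where the value is below zero, 0 everywhere else)
labels = xr.where(da < 0, 1, 0)

print(kept_cold.values)
print(labels.values)
```

The method form is usually the more convenient choice when the goal is simply to hide values, which is how it is used in the rest of this tutorial.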

Let's load the same data that we used in the previous tutorial (monthly SST data from CESM2):

In [None]:
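filepath = DATASETS.fetch('CESM2_sst_data.nc')
# Skip automatic time decoding; the time coordinate is replaced manually below
ds = xr.open_dataset(filepath, decode_times=False)
# 180 evenly spaced timestamps between the first and last dates
new_time = pd.date_range(start='2000-01-15', end='2014-12-15', periods=180)
ds = ds.assign(time=new_time)
ds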

### Using `where` with one condition

Let's say we want to analyze SST just from the last time in the dataset (2014-12-15). We can isolate this time using `.isel()`:

In [None]:
sample = ds.tos.isel(time=-1)
sample

Now that we have our DataArray for the desired time, we can use another method, `.where()`, to filter elements according to a condition. The condition passed to `.where()` can be a DataArray, a Dataset or a function. Indexing methods on xarray objects generally return a subset of the original data. However, it is sometimes useful to return an object with the same shape as the original data, but with some elements masked. Unlike `.isel()` and `.sel()`, which can change the shape of the returned results, `.where()` preserves the shape of the original data. It does this by returning the values from the original DataArray or Dataset wherever the `condition` is `True` and filling in a replacement value (by default `NaN`) wherever the `condition` is `False`. Additional information can be found in the [`.where()` documentation](http://xarray.pydata.org/en/stable/generated/xarray.DataArray.where.html).
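The shape behaviour described above can be checked on a small example. This is only a sketch on a made-up 3x4 grid (the array `toy` and its coordinates are invented for illustration); it also shows the `other` and `drop` arguments of `.where()`, which this tutorial does not otherwise use.

```python
import numpy as np
import xarray as xr

# Hypothetical 3x4 grid with values from -4 to 7, invented for illustration
toy = xr.DataArray(
    np.arange(12.0).reshape(3, 4) - 4.0,
    dims=("lat", "lon"),
    coords={"lat": [-60.0, 0.0, 60.0], "lon": [0.0, 90.0, 180.0, 270.0]},
)

subset = toy.isel(lat=0)                 # indexing: the lat dimension is gone
masked = toy.where(toy < 0)              # masking: same shape, NaN where False
filled = toy.where(toy < 0, other=0.0)   # fill with 0.0 instead of NaN
trimmed = toy.where(toy < 0, drop=True)  # drop rows/columns where the condition is never True

print(subset.shape, masked.shape, trimmed.shape)
```

Keeping the original shape is what makes it possible to plot the masked field on the same map as the unmasked one.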

Let's use `.where()` to mask locations with temperature values of `0` or greater, keeping only the values below `0`:

In [None]:
masked_sample = sample.where(sample < 0.0)
masked_sample

Let's plot both our original sample, and the masked sample:

In [None]:
fig, axes = plt.subplots(ncols=2, figsize=(19, 6))
sample.plot(ax=axes[0])
masked_sample.plot(ax=axes[1]);

Notice how only the areas where SST is below 0 °C are shown; the other areas are white because they are now NaN values. Now let's assess how polar SST has changed over the time period recorded by this dataset. Keep in mind that the two snapshots fall in different months (December and March), so seasonal differences will also appear in the comparison. To do so, we can run the same code but focus on the time 2000-03-15:

In [None]:
sample_2 = ds.tos.isel(time=2)  # 2000-03-15
masked_sample_2 = sample_2.where(sample_2 < 0.0)
fig, axes = plt.subplots(ncols=2, figsize=(19, 6))
masked_sample.plot(ax=axes[0])     # left: last time step (2014-12-15)
masked_sample_2.plot(ax=axes[1]);  # right: 2000-03-15

- What is similar and different between the SST maps from these two time periods?
- How do the SST values compare?
- How does the masked area compare? Is there less area masked (i.e., more areas warmer than 0 °C) during one time period than the other?
- How might changes in the ice-albedo feedback be playing a role in what you observe?
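
One possible way to put numbers on the masked-area question is to count the unmasked cells, or to compute an area-weighted fraction. The sketch below is an illustration on made-up data, not part of the original tutorial: the helper names (`toy_field`, `cold_cell_count`, `cold_area_fraction`) are hypothetical, the two snapshots are invented, and the `cos(lat)` weighting assumes a regular latitude-longitude grid.

```python
import numpy as np
import xarray as xr

lat = np.array([-75.0, -45.0, 0.0, 45.0, 75.0])
lon = np.array([0.0, 120.0, 240.0])


def toy_field(south_polar, north_polar):
    """Hypothetical SST snapshot: 15 °C everywhere except the two polar rows."""
    data = np.full((lat.size, lon.size), 15.0)
    data[0, :] = south_polar
    data[-1, :] = north_polar
    return xr.DataArray(data, dims=("lat", "lon"), coords={"lat": lat, "lon": lon})


def cold_cell_count(sst):
    """Number of grid cells that survive the mask sst < 0."""
    return int(sst.where(sst < 0.0).count())


def cold_area_fraction(sst):
    """Fraction of the grid below 0, weighted by cos(lat) (assumed regular grid)."""
    weights = np.cos(np.deg2rad(sst.lat))
    is_cold = (sst < 0.0).astype(float)
    return float(is_cold.weighted(weights).mean())


snapshot_a = toy_field(-1.5, -1.0)  # both polar rows below 0
snapshot_b = toy_field(-1.5, 1.0)   # only the southern row below 0

for name, snap in [("a", snapshot_a), ("b", snapshot_b)]:
    print(name, cold_cell_count(snap), round(cold_area_fraction(snap), 3))
```

With the real data, the same two helpers could be applied to `sample` and `sample_2`, assuming the latitude coordinate is named `lat`. The weighted fraction is smaller than the raw cell fraction because polar cells cover less area than equatorial ones.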