```{eval-rst}
.. currentmodule:: xarray.indexes
```

# Intervals with `CFIntervalIndex`

```{seealso}
Learn more at the [Climate and Forecast Conventions on bounds variables](https://cfconventions.org/Data/cf-conventions/cf-conventions-1.12/cf-conventions.html#cell-boundaries).
```

```{warning}
This index is [in development](https://github.com/pydata/xarray/pull/10296) and is not yet available in a released version of Xarray.
```

## Highlights

1. {py:class}`CFIntervalIndex` models intervals using _two_ arrays: a 2D array that holds the interval bounds, and a 1D array that holds the "central values". This format for recording "cell bounds" is recommended by the [CF conventions](https://cfconventions.org/Data/cf-conventions/cf-conventions-1.12/cf-conventions.html#cell-boundaries).
1. It overrides Xarray's default coordinate propagation rules so that both the representative points and the bounds coordinate arrays are included when extracting DataArrays from a Dataset.
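
To make the two-array layout concrete, here is a minimal NumPy-only sketch. The names `centers`, `half_width`, and `bounds` are illustrative and the four timestamps are made up. It assumes the CF layout in which the bounds array carries one extra trailing dimension of size 2.

```python
import numpy as np

# Hypothetical toy data: four 6-hourly cells on a single day.
centers = np.array(
    ["2013-01-01T00:00", "2013-01-01T06:00", "2013-01-01T12:00", "2013-01-01T18:00"],
    dtype="datetime64[ns]",
)
half_width = np.timedelta64(3, "h")

# CF-style bounds: one row per cell, columns are (lower, upper).
bounds = np.stack([centers - half_width, centers + half_width], axis=-1)

print(centers.shape)  # (4,)
print(bounds.shape)   # (4, 2)
```

In this toy case the upper bound of each cell equals the lower bound of the next, so the cells tile the time axis without gaps.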

## Example

We will load the standard air temperature tutorial dataset and add a new `time_bounds` variable to it.

In [1]:
%xmode minimal

import numpy as np
import pandas as pd
import xarray as xr

xr.set_options(display_expand_indexes=True, display_expand_attrs=False)
pd.set_option('display.max_seq_items', 10)

orig = xr.tutorial.open_dataset("air_temperature")
orig

Exception reporting mode: Minimal


To emphasize the difference from {doc}`pdinterval`, we will:

1. Add a `time_bounds` variable, assuming that the data represent averages over 6 hour periods centered at 00h, 06h, 12h, 18h; and
2. Add arbitrary offsets to the time coordinate so the "central" values are _not_ the midpoints of the intervals.

In [2]:
left = orig.time - pd.Timedelta("3h")
right = orig.time + pd.Timedelta("3h")
time_bounds = xr.concat([left, right], dim="bounds")
time_bounds

In [3]:
time = orig.time.data
newtime = time + np.random.random(time.size) * pd.Timedelta("1h")
newtime

array(['2013-01-01T00:41:27.002772747', '2013-01-01T06:16:05.568508910',
       '2013-01-01T12:01:04.182052355', ...,
       '2014-12-31T06:19:07.545307236', '2014-12-31T12:14:55.020523888',
       '2014-12-31T18:21:54.571544094'],
      shape=(2920,), dtype='datetime64[ns]')
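
Because the offsets lie in [0, 1 h) and each cell extends 3 h on either side of its original center, the shifted values should still fall inside their cells. The following is a minimal check on synthetic timestamps rather than on the dataset itself. The names `centers`, `offsets`, `shifted`, and `inside` are illustrative, and the seeded generator is an addition made only so the sketch is reproducible.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seeded so the offsets are reproducible

# Hypothetical stand-in for a 6-hourly time axis (8 steps, 2 days).
centers = np.datetime64("2013-01-01T00:00", "ns") + np.arange(8) * np.timedelta64(6, "h")
left = centers - np.timedelta64(3, "h")
right = centers + np.timedelta64(3, "h")

# Random offsets in [0, 1h), drawn as whole nanoseconds.
offsets = rng.integers(0, 3600 * 10**9, size=centers.size).astype("timedelta64[ns]")
shifted = centers + offsets

# Every shifted "central" value should still lie in its left-closed cell.
inside = (left <= shifted) & (shifted < right)
print(inside.all())
```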

Now assign these new arrays to the Dataset.

In [6]:
orig.coords["time"] = orig.time.copy(data=newtime)
orig.coords["time_bounds"] = time_bounds.assign_coords(time=newtime)
orig.time.attrs["bounds"] = "time_bounds"  # add the attribute for CF compliance
orig

### Assigning

We will drop the existing PandasIndex along `"time"` and add a new {py:class}`CFIntervalIndex`.

In [7]:
from xarray.indexes import CFIntervalIndex

indexed = orig.drop_indexes("time").set_xindex(
    ["time", "time_bounds"], CFIntervalIndex, closed="left"
)
indexed

### Coordinate variable propagation

A classic issue with Xarray is that bounds variables are not propagated by default.

Note that the `"time_bounds"` variable is dropped when pulling out the `"air"` DataArray, so the interval extents are no longer attached to the data.

In [8]:
orig["air"]

```{margin}
{py:class}`CFIntervalIndex` overrides Xarray's default coordinate propagation rules using {py:func}`xarray.Index.should_add_coord_to_array`. By default, when extracting a DataArray from a Dataset, Xarray will only include those coordinate variables whose dimensions are a subset of the dimensions of the extracted DataArray.
```

{py:class}`CFIntervalIndex` overrides that rule and propagates the `"time_bounds"` variable, because the index cannot be carried over to the DataArray correctly without it.

In [9]:
indexed["air"]
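
The default rule described above (keep a coordinate only if its dimensions are a subset of the DataArray's dimensions) can be sketched in plain Python. The helper `default_keeps_coord` is hypothetical and is not part of Xarray's API. The dimension names assume that `air` has dimensions `("time", "lat", "lon")` and that `time_bounds` has dimensions `("bounds", "time")`, as produced by the `xr.concat` call earlier.

```python
def default_keeps_coord(coord_dims, array_dims):
    """Return True if every dimension of the coordinate is also a dimension of the array."""
    return set(coord_dims) <= set(array_dims)


air_dims = ("time", "lat", "lon")

print(default_keeps_coord(("time",), air_dims))           # True
print(default_keeps_coord(("bounds", "time"), air_dims))  # False
```

The extra `"bounds"` dimension is what causes `time_bounds` to be dropped under the default rule, which is why an index-level override is needed to keep it.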

### Indexing

Index the intervals by label using the `"time"` coordinate:

In [10]:
indexed.sel(time=["2013-01-02 10:00", "2013-01-04 10:00"])
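
As a rough illustration of what a left-closed interval lookup means, the sketch below uses `pandas.IntervalIndex` directly on made-up 6-hourly cells. It assumes the lookup semantics match those of a pandas interval index. Whether {py:class}`CFIntervalIndex` performs the lookup this way internally is not stated here, and the names `cells`, `labels`, and `positions` are illustrative.

```python
import pandas as pd

# Hypothetical 6-hourly cell centers for a single day.
centers = pd.to_datetime(
    ["2013-01-02 00:00", "2013-01-02 06:00", "2013-01-02 12:00", "2013-01-02 18:00"]
)
half = pd.Timedelta("3h")

# Left-closed cells: [center - 3h, center + 3h).
cells = pd.IntervalIndex.from_arrays(centers - half, centers + half, closed="left")

# Look up which cell contains each label.
labels = pd.to_datetime(["2013-01-02 10:00", "2013-01-02 15:00"])
positions = cells.get_indexer(labels)
print(positions)
```

The second label sits exactly on a shared boundary. With `closed="left"` it belongs to the cell that starts at 15:00, not the one that ends there.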