# Direct Coords Approach

Instead of an `IntervalIndex`, we can encode the metadata as a coordinate that shares the `time` dimension.

## Setup

In [None]:
import numpy as np
import xarray as xr

## Creating the Dataset

We create a dataset where `word` is a coordinate on the `time` dimension.

In [None]:
# Create sample data
T = 1000
C = 2
times = np.linspace(0, 120, T)
data = np.random.rand(C, T)

# Define word boundaries
breaks = np.array([0, 333, 666, 1000])

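# Create word labels for each time point.
# Use an explicit dtype wide enough for the longest label: np.array(["red"] * T)
# would infer <U3 and silently truncate "green" to "gre" and "blue" to "blu".
words = np.full(T, "red", dtype="<U5")
words[breaks[0] : breaks[1]] = "red"
words[breaks[1] : breaks[2]] = "green"
words[breaks[2] :] = "blue"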

# Create Dataset with word as a coord on the time dimension
ds = xr.Dataset(
    {"data": (("C", "time"), data)}, coords={"time": times, "word": ("time", words)}
).set_xindex("word")
ds

## What Works

### Time slicing:

In [None]:
ds.sel(time=slice(0.15, 15.5))

### Selection by word label:

In [None]:
ds.sel(word="red")

## Limitations

### 1. Annoying to construct

The natural representation of the metadata is often a table of `onset, duration, word` rows. To build the dense coordinate, we have to expand that table manually into one label per time point.
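
As a rough illustration of that expansion, the sketch below turns a small event table into one label per time point with `np.searchsorted`. The `onsets`, `durations`, and `labels` arrays are hypothetical values chosen to match the 0 to 120 grid used above, and time points not covered by any event are assumed to get an empty label.

```python
import numpy as np

# Hypothetical event table: one entry per word (onset and duration in time units)
onsets = np.array([0.0, 40.0, 80.0])
durations = np.array([40.0, 40.0, 40.0])
labels = np.array(["red", "green", "blue"])

# Measurement grid, matching the dataset above
times = np.linspace(0, 120, 1000)

# For each time point, index of the last event whose onset is <= that time
idx = np.searchsorted(onsets, times, side="right") - 1
safe_idx = np.clip(idx, 0, None)  # guard against times before the first onset

# A time point belongs to an event only if it falls in [onset, onset + duration)
inside = (idx >= 0) & (times < (onsets + durations)[safe_idx])

# Dense label array: one word per time point, "" where no event applies
dense_words = np.where(inside, labels[safe_idx], "")
```

The result can be passed as `("time", dense_words)` when building the dataset. Note that the half-open interval convention here is a choice made for the sketch; the final grid point at 120 falls outside the last event and gets an empty label.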

### 2. No `isel` by word

Since `word` is on the `time` dimension, there's no `word` dimension to index into:

In [None]:
# This doesn't work - word is not a dimension
try:
    ds.isel(word=0)
except ValueError as e:
    print(f"Error: {e}")
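
One possible workaround, sketched here under the assumption that "the first word" means the first label to appear along `time`, is to build a boolean mask from the `word` values and pass it to `isel` on the `time` dimension. The dataset is rebuilt so the snippet is self-contained, and the names `ordered_words` and `first_word` are introduced only for illustration.

```python
import numpy as np
import xarray as xr

# Rebuild a dataset shaped like the one above
T = 1000
times = np.linspace(0, 120, T)
words = np.repeat(np.array(["red", "green", "blue"]), [333, 333, 334])
ds = xr.Dataset(
    {"data": (("C", "time"), np.random.rand(2, T))},
    coords={"time": times, "word": ("time", words)},
)

# Distinct labels in order of first appearance
# (np.unique alone would return them sorted alphabetically)
_, first_pos = np.unique(ds["word"].values, return_index=True)
ordered_words = ds["word"].values[np.sort(first_pos)]

# Emulate "isel(word=0)" with a boolean mask along the time dimension
mask = ds["word"].values == ordered_words[0]
first_word = ds.isel(time=mask)
```

This only works cleanly when each label occurs in a single contiguous run. If a word repeats later in the recording, the mask selects every occurrence, not just the first.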

### 3. Interval info is obscured

Important questions become hard to answer:
- What was the total duration of the 3rd word?
- What are the exact interval boundaries?
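
To make the extra work concrete, here is a minimal sketch of how those answers could be recovered from the dense labels: find where the label changes, then read the grid times at the start and end of each run. The variable names are illustrative, and the boundaries it returns are only as precise as the sampling grid.

```python
import numpy as np

# Same grid and labels as the dataset above
T = 1000
times = np.linspace(0, 120, T)
breaks = np.array([0, 333, 666, 1000])
words = np.repeat(np.array(["red", "green", "blue"]), np.diff(breaks))

# Positions where the label differs from its predecessor start a new run
change = np.flatnonzero(words[1:] != words[:-1]) + 1
starts = np.concatenate(([0], change))
stops = np.concatenate((change, [T]))  # exclusive end index of each run

# Interval boundaries as seen on the grid: first and last sample of each run
run_labels = words[starts]
onsets = times[starts]
offsets = times[stops - 1]
durations = offsets - onsets

# Duration of the 3rd word, measured between its first and last samples
third_duration = durations[2]
```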

### 4. Constrained to measurement time points

If metadata events happen at times not in your measurement grid, you lose that precision. For example, if you sample monthly but an event happened mid-month, you can't represent that exactly.
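
The effect can be shown on the grid used in this notebook. In the sketch below, `true_onset` is a hypothetical event time that does not coincide with any sample; the dense encoding can only mark the first sample at or after it, so the onset is shifted by up to one grid step.

```python
import numpy as np

# Measurement grid from the dataset above; spacing is 120 / 999, about 0.12
times = np.linspace(0, 120, 1000)
step = times[1] - times[0]

# Hypothetical event onset that falls between two samples
true_onset = 40.05

# First sample at or after the true onset: the earliest point we can label
first_idx = np.searchsorted(times, true_onset, side="left")
snapped_onset = times[first_idx]

# How far the represented onset is from the true one
error = snapped_onset - true_onset
print(f"true onset: {true_onset}, represented onset: {snapped_onset:.4f}")
print(f"error: {error:.4f} (grid step: {step:.4f})")
```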

### 5. Can't drop coord as index when it becomes scalar

When a selection reduces the coordinate to a single value, you can't easily drop it from being an index:

In [None]:
# Selecting a single word returns all matching time points
result = ds.sel(word="red")
print(f"Result has {len(result.time)} time points")
print(f"word coord: {result.word.values}")

# But word is still an index on the time dimension
# If you want to drop it, you can't easily do so when it's scalar
# result.drop_indexes('word')  # This can cause issues

## Advantages

Despite the limitations, this approach:
- Clearly shows that word spans time
- Uses only standard xarray features
- Is simple to understand