# Slicing

Data in a [Variable](../generated/scipp.Variable.rst#scipp.Variable) or [Dataset](../generated/scipp.Dataset.rst#scipp.Dataset) can be indexed in a similar manner to NumPy and xarray.
The dimension to be sliced is specified by its label; in contrast to NumPy, a dimension cannot be selected by its position.
Positional indexing with an integer or an integer range is done using `__getitem__` and `__setitem__`, with a dimension label as the first argument.
This works for variables, datasets, and items of a dataset.
In all cases a *view* is returned, i.e., just like when slicing a [numpy.ndarray](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.html#numpy.ndarray) no copy is performed.
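
To make the view semantics concrete, here is a minimal NumPy-only sketch of the behavior that the paragraph above compares against. It illustrates `numpy.ndarray` slicing only; the array and variable names are chosen for illustration.

```python
import numpy as np

a = np.zeros((2, 3))
row = a[1, :]      # basic slicing returns a view, not a copy
row[:] = 7.0       # writing through the view ...
print(a)           # ... modifies the second row of the original array
print(np.shares_memory(a, row))
```

Because no data is copied, a slice is cheap to create, but any modification made through it is visible in the object it was taken from.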

## Basic slicing

Consider the following variable:

In [None]:
import numpy as np
import scipp as sc
from scipp.plot import plot

var = sc.Variable(
    dims=['z', 'y', 'x'],
    values=np.random.rand(2, 3, 4),
    variances=np.random.rand(2, 3, 4))
sc.show(var)

As when slicing a `numpy.ndarray`, the dimension `'x'` is removed since no range is specified:

In [None]:
s = var['x', 1]
sc.show(s)
print(s.dims, s.shape)

When a range is specified, the dimension is kept, even if it has extent 1:

In [None]:
s = var['x', 1:3]
sc.show(s)
print(s.dims, s.shape)

s = var['x', 1:2]
sc.show(s)
print(s.dims, s.shape)
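
For comparison, the same distinction between an integer index and a range can be seen in plain NumPy. In this sketch the axes of the array stand in for `'z'`, `'y'`, and `'x'`, and the axis has to be addressed by position rather than by label.

```python
import numpy as np

a = np.random.rand(2, 3, 4)    # axes play the role of 'z', 'y', 'x'

print(a[:, :, 1].shape)        # integer index: the last axis is dropped
print(a[:, :, 1:3].shape)      # range: the axis is kept with extent 2
print(a[:, :, 1:2].shape)      # range of length 1: the axis is kept with extent 1
```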

Slicing can be chained arbitrarily:

In [None]:
s = var['x', 1:4]['y', 2]['x', 1]
sc.show(s)
print(s.dims, s.shape)
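
When chaining, each positional index refers to the result of the previous slice, not to the original object. The NumPy sketch below mirrors the chain above step by step, under the assumption that Scipp's positional indices are likewise relative to the current view.

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)   # axes play the role of 'z', 'y', 'x'

step1 = a[:, :, 1:4]     # like ['x', 1:4]: keeps 'x' with extent 3
step2 = step1[:, 2, :]   # like ['y', 2]:   drops 'y'
step3 = step2[:, 1]      # like ['x', 1]:   index 1 of the range 1:4, i.e. original x index 2

print(step3.shape)
print(np.array_equal(step3, a[:, 2, 2]))
```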

Slicing for datasets works in the same way, but some additional rules apply:

In [None]:
d = sc.Dataset(
    {'a': sc.Variable(dims=['x', 'y'], values=np.random.rand(2, 3)),
     'b': sc.Variable(dims=['y', 'x'], values=np.random.rand(3, 2)),
     'c': sc.Variable(dims=['x'], values=np.random.rand(2)),
     '0d-data': sc.Variable(1.0)},
    coords={
        'x': sc.Variable(['x'], values=np.arange(2.0), unit=sc.units.m),
        'y': sc.Variable(['y'], values=np.arange(3.0), unit=sc.units.m),
        'aux_x': sc.Variable(['x'], values=np.arange(2.0), unit=sc.units.m),
        'aux_y': sc.Variable(['y'], values=np.arange(3.0), unit=sc.units.m)})
sc.show(d)

As when slicing a variable, the sliced dimension is removed when slicing without a range and kept when slicing with a range.

When slicing a dataset, a number of other things happen as well:

- Any data item that does not depend on the sliced dimension is removed.
- Slicing **without range**:
  - The *coordinates* for the sliced dimension are *removed*.
- Slicing **with a range**:
  - The *coordinates* for the sliced dimension are *kept*.


This is an important aspect, and it is worth taking the time to think through the mechanism.
Consider the following example, contrasting slicing with and without range:

- We slice dimension `'x'`, so the data item `'0d-data'`, which does not depend on dimension `'x'`, is not visible in the slice views.
- In the second case (without range), the coord for dimension `'x'` is also not part of the slice view.

Make sure to inspect the `dims` and `shape` of all variables (data and coordinates) of the resulting slice views (note that the tooltip shown when moving the mouse over the name also contains this information):

In [None]:
# Range of length 1
sc.show(d['x', 1:2])
d['x', 1:2]

In [None]:
# No range
sc.show(d['x', 1])
d['x', 1]
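
The dataset slicing rules listed earlier can be summarized in a small toy model. This is only a sketch built on plain dictionaries and NumPy arrays, not Scipp's implementation: `slice_entry` and `slice_toy_dataset` are hypothetical helpers, each entry is a `(dims, array)` pair, and only the dimension coordinates `'x'` and `'y'` are modeled (auxiliary coordinates are left out).

```python
import numpy as np

def slice_entry(dims, array, dim, index):
    """Slice one (dims, array) pair along the labeled dimension."""
    axis = dims.index(dim)
    key = [slice(None)] * array.ndim
    key[axis] = index
    if isinstance(index, slice):
        new_dims = list(dims)                      # range: dimension is kept
    else:
        new_dims = dims[:axis] + dims[axis + 1:]   # integer: dimension is dropped
    return new_dims, array[tuple(key)]

def slice_toy_dataset(items, coords, dim, index):
    """Apply the documented rules to a dict-based stand-in for a dataset."""
    # Data items that do not depend on `dim` are not part of the result.
    new_items = {name: slice_entry(d, a, dim, index)
                 for name, (d, a) in items.items() if dim in d}
    new_coords = {}
    for name, (d, a) in coords.items():
        if dim not in d:
            new_coords[name] = (d, a)                          # unrelated coord: unchanged
        elif isinstance(index, slice):
            new_coords[name] = slice_entry(d, a, dim, index)   # range: coord kept and sliced
        # integer index: the coord for the sliced dimension is dropped
    return new_items, new_coords

items = {
    'a': (['x', 'y'], np.random.rand(2, 3)),
    'c': (['x'], np.random.rand(2)),
    '0d-data': ([], np.array(1.0)),
}
coords = {'x': (['x'], np.arange(2.0)), 'y': (['y'], np.arange(3.0))}

ranged_items, ranged_coords = slice_toy_dataset(items, coords, 'x', slice(1, 2))
single_items, single_coords = slice_toy_dataset(items, coords, 'x', 1)
print(sorted(ranged_items), sorted(ranged_coords))
print(sorted(single_items), sorted(single_coords))
```

In both cases `'0d-data'` is absent from the result, while the `'x'` entry in the coordinates survives only when a range is used.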

Slicing a data item of a dataset should not bring any surprises.
Essentially this behaves like slicing a dataset with just a single data item:

In [None]:
sc.show(d['a']['x', 1:2])

Slicing and item access can be done in arbitrary order with identical results:

In [None]:
# Print both comparisons; a notebook cell only displays its last expression
print(d['x', 1:2]['a'] == d['a']['x', 1:2])
print(d['x', 1:2]['a'].coords['x'] == d.coords['x']['x', 1:2])

## Slicing tools

Scipp provides a couple of tools that can be used to split a multi-dimensional Dataset or DataArray into a `dict` of slices.

### scipp.slices

The first slices a Scipp object along a given dimension:

In [None]:
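N = 40  # extent of dimension 'x'
M = 3   # extent of dimension 'y'
L = 2   # extent of dimension 'z'
x = np.arange(N).astype(np.float64)
b = 0.5 * N
# Random background with a strong cosine-shaped signal in one (z, y) channel
a = 4.0 * np.random.random([L, M, N])
a[1, 1, :] = np.abs(10.0 * np.cos((x - b) * 2.0 / b))
v = 0.1 * np.random.random([L, M, N])  # variances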
d1 = sc.Dataset()
d1.coords['x'] = sc.Variable(['x'], values=x, unit=sc.units.us)
d1.coords['y'] = sc.Variable(['y'],
                             values=np.arange(M).astype(np.float64),
                             unit=sc.units.m)
d1.coords['z'] = sc.Variable(['z'],
                             values=np.arange(L).astype(np.float64),
                             unit=sc.units.m)
d1['a'] = sc.Variable(['z', 'y', 'x'], values=a, variances=v)
d1

In [None]:
sliced = sc.slices(d1['a'], dim='z')
plot(sliced)
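
Conceptually, the result is one entry per index along the chosen dimension. The NumPy sketch below imitates this for an array with the same shape as `d1['a']`; the key format used here is a hypothetical choice for illustration, and the keys of the `dict` returned by Scipp may differ.

```python
import numpy as np

a = np.random.rand(2, 3, 40)   # axes play the role of 'z', 'y', 'x'

# One 2D slice per 'z' index; the key labels are an assumption of this sketch
z_slices = {f'z:{i}': a[i] for i in range(a.shape[0])}

for key, value in z_slices.items():
    print(key, value.shape)
```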

### scipp.collapse

The second tool slices the input object down until only the supplied `keep` dimension is left, returning a `dict` of 1D slices. This is often useful if, for instance, most detector pixels contain noise but one specific channel contains a strong signal.

In [None]:
collapsed = sc.collapse(d1['a'], keep='x')
plot(collapsed)
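
The idea of collapsing can be sketched in NumPy by iterating over every index combination of the dimensions that are not kept. As before, this is an illustration under assumptions rather than Scipp's implementation, and the key format is hypothetical.

```python
import numpy as np

a = np.random.rand(2, 3, 40)   # axes play the role of 'z', 'y', 'x'

# Keep the last axis ('x') and emit one 1D slice per (z, y) combination
collapsed_sketch = {
    f'z:{iz}-y:{iy}': a[iz, iy, :]
    for iz, iy in np.ndindex(a.shape[:-1])
}

print(len(collapsed_sketch))
print({value.shape for value in collapsed_sketch.values()})
```

With extents 2 and 3 for the collapsed dimensions, this yields six 1D slices, each of which can be inspected or plotted on its own.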