# Operations with variables and datasets

Operations "align" data items based on their dimension labels:

In [1]:
import numpy as np
import scipp as sc
from scipp import Dim

a = sc.Variable(values=np.random.rand(2, 4),
                variances=np.random.rand(2, 4),
                dims=[Dim.X, Dim.Y],
                unit=sc.units.m)
b = sc.Variable(values=np.random.rand(4, 2),
                variances=np.random.rand(4, 2),
                dims=[Dim.Y, Dim.X],
                unit=sc.units.s)
a/b

    <scipp.Variable>          double    [m s^-1]         (Dim.X, Dim.Y)  [1.512420, 0.896560, 0.517993, 1.116080, 0.660887, 0.798604, 1.407452, 0.763347]  [17.513852, 3.289358, 3.147823, 2.777012, 38.676690, 2.262908, 7.942264, 2.248208]
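To make the label-based alignment concrete, the following sketch reproduces it with plain numpy. The helper `align_to` and the string labels `'x'` and `'y'` are hypothetical and only serve as illustration; this is not how scipp implements the matching internally.

```python
import numpy as np

def align_to(values, dims, target_dims):
    """Hypothetical helper: reorder the axes of `values` from `dims` to `target_dims`."""
    # For each target label, look up the axis it occupies in the input.
    axes = [dims.index(d) for d in target_dims]
    return np.transpose(values, axes)

a_vals = np.arange(8.0).reshape(2, 4) + 1.0   # labelled ('x', 'y')
b_vals = np.arange(8.0).reshape(4, 2) + 1.0   # labelled ('y', 'x')

# Bring b into the ('x', 'y') order before dividing, as a label-aware division would.
b_aligned = align_to(b_vals, ['y', 'x'], ['x', 'y'])
ratio = a_vals / b_aligned
print(ratio.shape)
```

Without labels, the caller has to remember which operand needs transposing; with labels, the axis order can be derived from the operands themselves.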

Note how operations with variables correctly propagate uncertainties (the variances), in contrast to a naive implementation using numpy:

In [2]:
result = a/b
result.values

array([[1.51242046, 0.89655973, 0.51799305, 1.11607958],
       [0.66088695, 0.79860444, 1.40745185, 0.76334714]])

In [3]:
a.values/np.transpose(b.values)

array([[1.51242046, 0.89655973, 0.51799305, 1.11607958],
       [0.66088695, 0.79860444, 1.40745185, 0.76334714]])

In [4]:
result.variances

array([[17.51385166,  3.2893584 ,  3.14782314,  2.77701245],
       [38.67668987,  2.26290811,  7.9422638 ,  2.24820827]])

In [5]:
a.variances/np.transpose(b.variances)

array([[1.03542978, 2.03259694, 3.07205974, 0.07766485],
       [2.40929699, 0.52900654, 0.75848209, 0.81175937]])

The implementation assumes uncorrelated data and otherwise follows the standard formulae given in, e.g., [Wikipedia: Propagation of uncertainty](https://en.wikipedia.org/wiki/Propagation_of_uncertainty#Example_formulae).
See also [Propagation of uncertainties](error-propagation.rst) for the concrete equations used for error propagation.
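As a rough illustration of why dividing the variances directly gives different numbers, the sketch below applies the standard first-order formula for the quotient of two uncorrelated quantities, `var(a/b) ≈ (a/b)^2 * (var(a)/a^2 + var(b)/b^2)`. The function name `divide_with_variance` is hypothetical, and the formula is assumed from the general theory rather than taken from the library source; the page linked above documents the equations actually used.

```python
def divide_with_variance(a, var_a, b, var_b):
    """Hypothetical helper: return (a/b, variance of a/b) for uncorrelated a and b."""
    value = a / b
    # First-order propagation: the relative variances add for a quotient.
    variance = value**2 * (var_a / a**2 + var_b / b**2)
    return value, variance

value, variance = divide_with_variance(2.0, 0.1, 4.0, 0.2)

# Dividing the variances is not a valid propagation rule; shown for contrast only.
naive = 0.1 / 0.2
print(value, variance, naive)
```

Because the function uses only arithmetic operators, it also works elementwise on numpy arrays of matching shape.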

Missing dimensions in the operands are automatically broadcast:

In [6]:
a.values

array([[0.53490047, 0.50462015, 0.24760308, 0.69791789],
       [0.04053041, 0.56151394, 0.80772139, 0.5671464 ]])

In [7]:
a -= a[Dim.X, 1]
a.values

array([[ 0.49437006, -0.05689379, -0.56011831,  0.13077149],
       [ 0.        ,  0.        ,  0.        ,  0.        ]])

Both operands may be broadcast, creating an output with the combination of input dimensions:

In [12]:
sc.show(a[Dim.X, 1])
sc.show(a[Dim.Y, 1])
sc.show(a[Dim.X, 1] + a[Dim.Y, 1])

Note that in-place operations such as `+=` never change the shape of the left-hand side.
That is, only the right-hand-side operand can be broadcast, and the operation fails if a broadcast of the left-hand side would be required.
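numpy follows an analogous rule for its positional broadcasting, which gives a quick way to see the asymmetry. This sketch uses only numpy, not scipp, and the array shapes are arbitrary choices for illustration.

```python
import numpy as np

row = np.zeros(4)          # shape (4,)
table = np.ones((2, 4))    # shape (2, 4)

# The right-hand side is broadcast; the left-hand side keeps its shape.
table += row

# The reverse would require growing `row` to shape (2, 4),
# which an in-place operation cannot do.
try:
    row += table
    failed = False
except ValueError as e:
    failed = True
    print(e)

# The non-in-place form allocates a new output with the combined shape.
combined = row + table
print(combined.shape)
```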

Units are required to be compatible. Adding a length to a time, for example, fails:

In [8]:
try:
    a + b
except Exception as e:
    print(str(e))

Expected m to be equal to s.


Data items are paired based on their names when applying operations to datasets.
How a name mismatch is handled depends on the kind of operation:

- In-place operations such as `+=` accept a right-hand-side operand that lacks items present in the left-hand side.
  If the right-hand side contains items that are not in the left-hand side, the operation fails.
- Non-in-place operations such as `+` return a new dataset with items from the intersection of the inputs.
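The two rules can be modelled with plain Python dictionaries standing in for datasets. The helpers `iadd` and `add` are hypothetical names for this sketch and are not part of scipp; the error text simply mimics the message shown further down.

```python
def iadd(lhs, rhs):
    """Hypothetical model of `lhs += rhs`: every item of rhs must exist in lhs."""
    missing = [name for name in rhs if name not in lhs]
    if missing:
        # Check before modifying anything, so a failure leaves lhs untouched.
        raise KeyError(f"Could not find data with name {missing[0]}.")
    for name, value in rhs.items():
        lhs[name] += value
    return lhs

def add(lhs, rhs):
    """Hypothetical model of `lhs + rhs`: keep only items present in both inputs."""
    return {name: lhs[name] + rhs[name] for name in lhs if name in rhs}

d1 = {'a': 1.0, 'b': 2.0, 'c': 3.0}
d2 = {'a': 10.0, 'b': 20.0}

iadd(d1, d2)                 # accepted: d2 merely lacks 'c'
print(sorted(add(d1, d2)))   # names common to both inputs

try:
    iadd(d2, d1)             # rejected: d2 has no item named 'c'
except KeyError as e:
    print(e)
```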

Coords and labels are compared in operations with datasets (or items of datasets).
Operations fail if there is any mismatch in coord or label values.
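A minimal model of this check, with coords held as plain lists keyed by dimension name. The helper `check_coords` is hypothetical, and comparing only the coords that both operands carry is an assumption of this sketch; it merely illustrates why slicing one operand to a different range makes an operation fail.

```python
def check_coords(lhs_coords, rhs_coords):
    """Hypothetical check: raise if a coord present in both operands differs."""
    for dim in lhs_coords.keys() & rhs_coords.keys():
        if lhs_coords[dim] != rhs_coords[dim]:
            raise ValueError("Expected coords to match.")

full = {'x': [0.0, 1.0], 'y': [0.0, 1.0, 2.0]}
# A range slice x[1:2] keeps the x coord, but with different values.
sliced = {'x': full['x'][1:2], 'y': full['y']}
# A point slice x[1] drops the x dimension and, in this model, its coord.
point = {'y': full['y']}

check_coords(full, dict(full))   # identical coords: passes
check_coords(full, point)        # no shared x coord to compare: passes

try:
    check_coords(full, sliced)
    mismatch = False
except ValueError:
    mismatch = True
print(mismatch)
```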

In [None]:
d1 = sc.Dataset(
    {'a': sc.Variable(dims=[Dim.X, Dim.Y], values=np.random.rand(2, 3)),
     'b': sc.Variable(dims=[Dim.Y, Dim.X], values=np.random.rand(3, 2)),
     'c': sc.Variable(dims=[Dim.X], values=np.random.rand(2)),
     'd': sc.Variable(1.0)},
    coords={
        Dim.X: sc.Variable([Dim.X], values=np.arange(2.0), unit=sc.units.m),
        Dim.Y: sc.Variable([Dim.Y], values=np.arange(3.0), unit=sc.units.m)})
d2 = sc.Dataset(
    {'a': sc.Variable(dims=[Dim.X, Dim.Y], values=np.random.rand(2, 3)),
     'b': sc.Variable(dims=[Dim.Y, Dim.X], values=np.random.rand(3, 2))},
    coords={
        Dim.X: sc.Variable([Dim.X], values=np.arange(2.0), unit=sc.units.m),
        Dim.Y: sc.Variable([Dim.Y], values=np.arange(3.0), unit=sc.units.m)})

In [21]:
d1 += d2

In [19]:
try:
    d2 += d1
except Exception as e:
    print(str(e))

Could not find data with name c.


In [23]:
d3 = d1 + d2
for name, _ in d3:
    print(name)

a
b


In [20]:
d1['a'] -= d1['b']  # transposing
d1['a'] -= d1[Dim.X, 1]['b']  # broadcasting
try:
    d1['a'] -= d1[Dim.X, 1:2]['b']  # fails due to coordinate mismatch
except Exception as e:
    print(str(e))

Expected coords to match.
