# Einstein Operations

[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/chaobrain/saiunit/blob/master/docs/mathematical_functions/einstein_operation.ipynb)
[![Open in Kaggle](https://kaggle.com/static/images/open-in-kaggle.svg)](https://kaggle.com/kernels/welcome?src=https://github.com/chaobrain/saiunit/blob/master/docs/mathematical_functions/einstein_operation.ipynb)

We don't write 
```python
y = x.transpose(0, 2, 3, 1)
```
We write comprehensible code
```python
y = u.math.einrearrange(x, 'b c h w -> b h w c')
```


## What's in this tutorial?

- fundamentals: reordering, composition and decomposition of axes
- operations: `einrearrange`, `einreduce`, `einrepeat`
- how much you can do with a single operation!

In [None]:
import numpy

import saiunit as u

## Load a batch of images to play with

Please download [the data](./test_images.npy).

In [None]:
ims = numpy.load('./test_images.npy', allow_pickle=False)
# There are 6 images of shape 96x96 with 3 color channels, packed into a single tensor
print(ims.shape, ims.dtype)

In [None]:
# shape of the first image: a 3d array (height, width, channels)
ims[0].shape

In [None]:
# second image in a batch
ims[1].shape

In [None]:
# einrearrange, as its name suggests, rearranges elements
# below we swap height and width,
# in other words, we transpose the first two axes (dimensions)
u.math.einrearrange(ims[0], 'h w c -> w h c').shape
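
For comparison, here is a minimal NumPy-only sketch of what the pattern `'h w c -> w h c'` describes. The small array `img` and its sizes are hypothetical stand-ins for one image, not the tutorial data.

```python
import numpy as np

# Synthetic stand-in for one image: height 4, width 5, 3 channels (hypothetical sizes).
img = np.arange(4 * 5 * 3).reshape(4, 5, 3)

# 'h w c -> w h c' swaps the first two axes; in NumPy that is a transpose.
swapped = np.transpose(img, (1, 0, 2))

print(swapped.shape)  # (5, 4, 3)
# Element (h, w, c) of the input lands at (w, h, c) of the output.
assert swapped[2, 1, 0] == img[1, 2, 0]
```

The pattern string carries the same information as the tuple `(1, 0, 2)`, but names the axes instead of numbering them.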

## Composition of axes
Transposition is very common and useful, but let's move on to the other capabilities of these einops-style operations.

In [None]:
# the pattern seamlessly composes batch and height into a new height dimension
# collapsing to a 3d tensor stacks all images vertically into one tall image
u.math.einrearrange(ims, 'b h w c -> (b h) w c').shape

In [None]:
# or compose a new dimension of batch and width
u.math.einrearrange(ims, 'b h w c -> h (b w) c').shape

In [None]:
# resulting dimensions are computed very simply
# length of newly composed axis is a product of components
# [6, 96, 96, 3] -> [96, (6 * 96), 3]
u.math.einrearrange(ims, 'b h w c -> h (b w) c').shape
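
As a rough NumPy-only sketch of the two compositions above: merging axes that are already adjacent is a plain reshape, while merging `b` with `w` needs a transpose first. The array `batch` and its small sizes are hypothetical stand-ins for `ims`.

```python
import numpy as np

b, h, w, c = 6, 4, 5, 3  # small hypothetical sizes instead of 6 x 96 x 96 x 3
batch = np.arange(b * h * w * c).reshape(b, h, w, c)

# 'b h w c -> (b h) w c': b and h are already adjacent, so a reshape is enough.
stacked_rows = batch.reshape(b * h, w, c)

# 'b h w c -> h (b w) c': bring b next to w first, then merge the pair.
side_by_side = batch.transpose(1, 0, 2, 3).reshape(h, b * w, c)

print(stacked_rows.shape)  # (24, 5, 3)
print(side_by_side.shape)  # (4, 30, 3)
```

The length of each composed axis is the product of its components: 6 * 4 = 24 and 6 * 5 = 30 here.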

In [None]:
# we can compose more than two axes. 
# let's flatten 4d array into 1d, resulting array has as many elements as the original
u.math.einrearrange(ims, 'b h w c -> (b h w c)').shape

## Decomposition of axis

In [None]:
# decomposition is the inverse process: represent an axis as a combination of new axes
# several decompositions are possible, so we pass b1=2 to split 6 into b1=2 and b2=3
u.math.einrearrange(ims, '(b1 b2) h w c -> b1 b2 h w c ', b1=2).shape

In [None]:
# finally, combine composition and decomposition:
u.math.einrearrange(ims, '(b1 b2) h w c -> (b1 h) (b2 w) c ', b1=2).shape
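
The decomposition and the combined pattern above can be sketched in plain NumPy as a reshape, a transpose, and another reshape. This is only an illustration on a hypothetical small array `batch`; `b2` is derived from the axis length just as the pattern infers it.

```python
import numpy as np

b, h, w, c = 6, 4, 5, 3  # small hypothetical sizes
batch = np.arange(b * h * w * c).reshape(b, h, w, c)

b1 = 2
b2 = b // b1  # inferred from the axis length: 6 = 2 * 3

# '(b1 b2) h w c -> b1 b2 h w c'
grid = batch.reshape(b1, b2, h, w, c)
# The image with batch index i sits at grid[i // b2, i % b2].
assert np.array_equal(grid[1, 0], batch[3])

# '(b1 b2) h w c -> (b1 h) (b2 w) c': a 2 x 3 grid of images.
tiled = grid.transpose(0, 2, 1, 3, 4).reshape(b1 * h, b2 * w, c)

print(grid.shape)   # (2, 3, 4, 5, 3)
print(tiled.shape)  # (8, 15, 3)
```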

In [None]:
# slightly different composition: b1 is merged with width, b2 with height
# ... so letters are ordered by w then by h
u.math.einrearrange(ims, '(b1 b2) h w c -> (b2 h) (b1 w) c ', b1=2).shape

In [None]:
# move part of the width dimension to height.
# call it width-to-height: image width shrinks by a factor of 2 and height doubles,
# but the pixels themselves are unchanged.
# Can you write the reverse operation (height-to-width)?
u.math.einrearrange(ims, 'b h (w w2) c -> (h w2) (b w) c', w2=2).shape

## Order of axes matters

In [None]:
# compare with the next example
u.math.einrearrange(ims, 'b h w c -> h (b w) c').shape

In [None]:
# order of axes in composition is different
# the rule is the same as for digits in a number: the leftmost digit is the most significant,
# while neighboring elements differ in the rightmost axis.

# you can also think of this as lexicographic sort
u.math.einrearrange(ims, 'b h w c -> h (w b) c').shape
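
To make the "leftmost axis is most significant" rule concrete, here is a tiny NumPy-only sketch with hypothetical values: two one-row "images" whose pixel values encode which image they came from.

```python
import numpy as np

# Two "images", each a single row of three pixels; axes are 'b w'.
batch = np.array([[10, 11, 12],   # image 0
                  [20, 21, 22]])  # image 1

# 'b w -> (b w)': b is the slow (most significant) axis, w the fast one.
b_major = batch.reshape(-1)
# 'b w -> (w b)': w is slow and b is fast, so the images interleave.
w_major = batch.T.reshape(-1)

print(b_major)  # [10 11 12 20 21 22]
print(w_major)  # [10 20 11 21 12 22]
```

With `(b w)` the images sit side by side; with `(w b)` their columns alternate.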

In [None]:
# what if b1 and b2 are reordered before composing to width?
u.math.einrearrange(ims, '(b1 b2) h w c -> h (b1 b2 w) c ', b1=2).shape 

In [None]:
u.math.einrearrange(ims, '(b1 b2) h w c -> h (b2 b1 w) c ', b1=2).shape 

## Meet `einreduce`

In einops-land you don't need to guess what happened
```python
x.mean(-1)
```
Because you write what the operation does
```python
u.math.einreduce(x, 'b h w c -> b h w', 'mean')
```

If an axis is not present in the output, you guessed it: that axis was reduced.

In [None]:
# average over batch
u.math.einreduce(ims, 'b h w c -> h w c', 'mean').shape

In [None]:
# the previous is identical to familiar:
ims.mean(axis=0).shape
# but is so much more readable

In [None]:
# example of reducing several axes
# besides mean, there are also min, max, sum, prod
u.math.einreduce(ims, 'b h w c -> h w', 'min').shape

In [None]:
# this is mean-pooling with 2x2 kernel
# image is split into 2x2 patches, each patch is averaged
u.math.einreduce(ims, 'b (h h2) (w w2) c -> h (b w) c', 'mean', h2=2, w2=2).shape
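
The pooling pattern can be read as "split each spatial axis into (coarse, fine), then average over the fine parts". A minimal NumPy sketch of that reading, on a hypothetical small array `batch`:

```python
import numpy as np

b, H, W, c = 2, 4, 6, 3  # small hypothetical sizes
batch = np.arange(b * H * W * c, dtype=float).reshape(b, H, W, c)
h2 = w2 = 2
h, w = H // h2, W // w2

# 'b (h h2) (w w2) c -> b h w c' with 'mean': average over the h2 and w2 axes.
patches = batch.reshape(b, h, h2, w, w2, c)
pooled = patches.mean(axis=(2, 4))

# Final step of the pattern, '... -> h (b w) c': place the images side by side.
side_by_side = pooled.transpose(1, 0, 2, 3).reshape(h, b * w, c)

print(pooled.shape)        # (2, 2, 3, 3)
print(side_by_side.shape)  # (2, 6, 3)
```

Swapping `mean` for `max` in the second step gives the max-pooling variant.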

In [None]:
# max-pooling is similar
# result is not as smooth as for mean-pooling
u.math.einreduce(ims, 'b (h h2) (w w2) c -> h (b w) c', 'max', h2=2, w2=2).shape

In [None]:
# yet another example. Can you compute result shape?
u.math.einreduce(ims, '(b1 b2) h w c -> (b2 h) (b1 w)', 'mean', b1=2).shape
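
One way to check your answer is to replay the pattern in NumPy. The sketch below uses a zero-filled placeholder with the shape stated earlier (6 images of 96x96 with 3 channels), so only the shape arithmetic is meaningful.

```python
import numpy as np

b1, b2, h, w, c = 2, 3, 96, 96, 3
batch = np.zeros((b1 * b2, h, w, c))  # placeholder with the tutorial's shape

# '(b1 b2) h w c -> (b2 h) (b1 w)' with 'mean':
# c is missing on the right-hand side, so it is averaged out.
split = batch.reshape(b1, b2, h, w, c).mean(axis=4)           # b1 b2 h w
result = split.transpose(1, 2, 0, 3).reshape(b2 * h, b1 * w)  # (b2 h) (b1 w)

print(result.shape)  # (288, 192)
```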

## Stack and concatenate

In [None]:
# rearrange can also take care of lists of arrays with the same shape
x = list(ims)
print(type(x), 'with', len(x), 'tensors of shape', x[0].shape)
# that's how we can stack inputs
# "list axis" becomes first ("b" in this case), and we left it there
res = u.math.einrearrange(x, 'b h w c -> b h w c')

[r.shape for r in res]

In [None]:
# but new axis can appear in the other place:
u.math.einrearrange(x, 'b h w c -> h w c b').shape

In [None]:
# that's equivalent to numpy stacking, but written more explicitly
numpy.array_equal(u.math.einrearrange(x, 'b h w c -> h w c b'), numpy.stack(x, axis=3))

In [None]:
# ... or we can concatenate along axes
u.math.einrearrange(x, 'b h w c -> h (b w) c').shape

In [None]:
# which is equivalent to concatenation
numpy.array_equal(u.math.einrearrange(x, 'b h w c -> h (b w) c'), numpy.concatenate(x, axis=1))

## Addition or removal of axes

You can write `1` to create a new axis of length 1. Similarly, you can remove such an axis.

There is also a synonym, `()`, that you can use. It is a composition of zero axes, so it also has unit length.

In [None]:
x = u.math.einrearrange(ims, 'b h w c -> b 1 h w 1 c') # functionality of numpy.expand_dims
print(x.shape)
print(u.math.einrearrange(x, 'b 1 h w 1 c -> b h w c').shape) # functionality of numpy.squeeze

In [None]:
# compute max in each image individually, then show a difference 
x = u.math.einreduce(ims, 'b h w c -> b () () c', 'max') - ims
u.math.einrearrange(x, 'b h w c -> h (b w) c').shape
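
In NumPy terms, reducing to `()` axes corresponds to `keepdims=True`: the reduced axes stay in place with length 1, so the result broadcasts against the original. A small sketch with hypothetical random data:

```python
import numpy as np

rng = np.random.default_rng(0)
batch = rng.random((6, 4, 5, 3))  # hypothetical stand-in for ims

# 'b h w c -> b () () c' with 'max': one maximum per image and per channel.
per_image_max = batch.max(axis=(1, 2), keepdims=True)
print(per_image_max.shape)  # (6, 1, 1, 3)

# The unit axes broadcast, so the subtraction works elementwise.
diff = per_image_max - batch
print(diff.shape)  # (6, 4, 5, 3)
assert (diff >= 0).all()
```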

## Repeating elements

The third operation we introduce is `einrepeat`.

In [None]:
# repeat along a new axis. New axis can be placed anywhere
u.math.einrepeat(ims[0], 'h w c -> h new_axis w c', new_axis=5).shape

In [None]:
# shortcut
u.math.einrepeat(ims[0], 'h w c -> h 5 w c').shape

In [None]:
# repeat along w (existing axis)
u.math.einrepeat(ims[0], 'h w c -> h (repeat w) c', repeat=3).shape

In [None]:
# repeat along two existing axes
u.math.einrepeat(ims[0], 'h w c -> (2 h) (2 w) c').shape

In [None]:
# order of axes matters as usual: you can repeat each element (pixel) 3 times
# by changing the order in parentheses
u.math.einrepeat(ims[0], 'h w c -> h (w repeat) c', repeat=3).shape

Note: the `einrepeat` operation covers the functionality of `numpy.repeat` and `numpy.tile`, and more.
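
To see how the two orderings map onto NumPy, here is a small sketch on a hypothetical 2x3 single-channel array: placing the repeat axis first behaves like `numpy.tile`, placing it last behaves like `numpy.repeat`.

```python
import numpy as np

img = np.arange(2 * 3 * 1).reshape(2, 3, 1)  # h=2, w=3, c=1 (hypothetical sizes)

# 'h w c -> h (repeat w) c', repeat=2: whole rows are tiled.
tiled = np.tile(img, (1, 2, 1))
# 'h w c -> h (w repeat) c', repeat=2: each pixel is duplicated in place.
stretched = np.repeat(img, 2, axis=1)

print(tiled[0, :, 0])      # [0 1 2 0 1 2]
print(stretched[0, :, 0])  # [0 0 1 1 2 2]
```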

## Reduce ⇆ repeat

Reduce and repeat are opposites of each other: the first reduces the number of elements, the second increases it.

In the following example each image is repeated first, then we reduce over the new axis to get back the original tensor. Notice that the operation patterns are the "reverse" of each other.

In [None]:
repeated = u.math.einrepeat(ims, 'b h w c -> b h new_axis w c', new_axis=2)
reduced = u.math.einreduce(repeated, 'b h new_axis w c -> b h w c', 'min')


assert u.math.allclose(ims, reduced)
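
The same round trip can be sketched in plain NumPy, assuming a hypothetical random array in place of `ims`. Because every copy along the new axis is identical, `min` (or `max`, or `mean`) over that axis recovers the input.

```python
import numpy as np

rng = np.random.default_rng(0)
batch = rng.random((6, 4, 5, 3))  # hypothetical stand-in for ims

# 'b h w c -> b h new_axis w c', new_axis=2
repeated = np.repeat(batch[:, :, None, :, :], 2, axis=2)
# 'b h new_axis w c -> b h w c' with 'min'
reduced = repeated.min(axis=2)

print(repeated.shape)  # (6, 4, 2, 5, 3)
assert np.array_equal(batch, reduced)
```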

## Fancy examples in random order

(a.k.a. mad designer gallery)

In [None]:
# interweaving pixels of different pictures
# all letters are observable
u.math.einrearrange(ims, '(b1 b2) h w c -> (h b1) (w b2) c ', b1=2).shape

In [None]:
# interweaving along vertical for couples of images
u.math.einrearrange(ims, '(b1 b2) h w c -> (h b1) (b2 w) c', b1=2).shape

In [None]:
# interweaving lines for couples of images
# exercise: achieve the same result without einops in your favourite framework
u.math.einreduce(ims, '(b1 b2) h w c -> h (b2 w) c', 'max', b1=2).shape

In [None]:
# color can be also composed into dimension
# ... while image is downsampled
u.math.einreduce(ims, 'b (h 2) (w 2) c -> (c h) (b w)', 'mean').shape

In [None]:
# disproportionate resize
u.math.einreduce(ims, 'b (h 4) (w 3) c -> (h) (b w)', 'mean').shape

In [None]:
# split each image in two halves, compute mean of the two
u.math.einreduce(ims, 'b (h1 h2) w c -> h2 (b w)', 'mean', h1=2).shape

In [None]:
# split in small patches and transpose each patch
u.math.einrearrange(ims, 'b (h1 h2) (w1 w2) c -> (h1 w2) (b w1 h2) c', h2=8, w2=8).shape

In [None]:
# stop me someone!
u.math.einrearrange(ims, 'b (h1 h2 h3) (w1 w2 w3) c -> (h1 w2 h3) (b w1 h2 w3) c', h2=2, w2=2, w3=2, h3=2).shape

In [None]:
u.math.einrearrange(ims, '(b1 b2) (h1 h2) (w1 w2) c -> (h1 b1 h2) (w1 b2 w2) c', h1=3, w1=3, b2=3).shape

In [None]:
# patterns can be arbitrarily complicated
u.math.einreduce(ims, '(b1 b2) (h1 h2 h3) (w1 w2 w3) c -> (h1 w1 h3) (b1 w2 h2 w3 b2) c', 'mean', 
       h2=2, w1=2, w3=2, h3=2, b2=2).shape

In [None]:
# subtract background in each image individually and normalize
# pay attention to (): this is a composition of zero axes, a dummy axis with 1 element.
im2 = u.math.einreduce(ims, 'b h w c -> b () () c', 'max') - ims
im2 /= u.math.einreduce(im2, 'b h w c -> b () () c', 'max')
u.math.einrearrange(im2, 'b h w c -> h (b w) c').shape

In [None]:
# pixelate: first downscale by averaging, then upscale back using the same pattern
averaged = u.math.einreduce(ims, 'b (h h2) (w w2) c -> b h w c', 'mean', h2=6, w2=8)
u.math.einrepeat(averaged, 'b h w c -> (h h2) (b w w2) c', h2=6, w2=8).shape
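
A rough NumPy-only sketch of the pixelate step, under the assumption of a small hypothetical array whose height and width are divisible by `h2=6` and `w2=8`: average over each block, then broadcast the block value back over the block.

```python
import numpy as np

rng = np.random.default_rng(0)
b, H, W, c = 2, 12, 16, 3  # hypothetical sizes, divisible by h2 and w2
batch = rng.random((b, H, W, c))
h2, w2 = 6, 8
h, w = H // h2, W // w2

# 'b (h h2) (w w2) c -> b h w c' with 'mean'
averaged = batch.reshape(b, h, h2, w, w2, c).mean(axis=(2, 4))

# 'b h w c -> (h h2) (b w w2) c': add unit axes, broadcast them to h2 and w2,
# then reorder and merge the axes as the pattern says.
blocks = np.broadcast_to(averaged[:, :, None, :, None, :], (b, h, h2, w, w2, c))
pixelated = blocks.transpose(1, 2, 0, 3, 4, 5).reshape(h * h2, b * w * w2, c)

print(averaged.shape)   # (2, 2, 2, 3)
print(pixelated.shape)  # (12, 32, 3)
```

Every 6x8 block of `pixelated` holds a single color, the mean of the corresponding input block.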

In [None]:
u.math.einrearrange(ims, 'b h w c -> w (b h) c').shape

In [None]:
# let's bring the color dimension into the horizontal axis
# at the same time each image's width is subsampled by 3x (w2=3 is moved to the vertical axis)
u.math.einreduce(ims, 'b (h h2) (w w2) c -> (h w2) (b w c)', 'mean', h2=3, w2=3).shape

## Summary

- `einrearrange` doesn't change the number of elements and covers several numpy functions (like `transpose`, `reshape`, `stack`, `concatenate`, `squeeze` and `expand_dims`)
- `einreduce` combines the same reordering syntax with reductions (`mean`, `min`, `max`, `sum`, `prod`, and any others)
- `einrepeat` additionally covers repeating and tiling
- composition and decomposition of axes are a cornerstone; they can and should be used together
