In Einstein summation notation, a repeated index marks the index that is summed over. An implied sum therefore needs a repeated index, so:
$$
\sum_{i=1}^n a_ix_i = a_1x_1 + a_2x_2 + ... + a_nx_n \equiv a_ix_i
$$

is valid. In contrast, $a_{ij}x_k$ has no repeated index and so implies no sum, whilst $a_{ij}x_j$ sums over $j$:
$$
a_{ij}x_j \equiv a_{i1}x_1 + a_{i2}x_2 + ... + a_{in}x_n
$$
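
The three cases above can be checked numerically. This is a minimal sketch using `np.einsum` rather than TensorFlow; the arrays `a`, `x`, `M` and `v` are made-up values chosen only for illustration.

```python
import numpy as np

a = np.array([1, 2, 3])
x = np.array([4, 5, 6])

# a_i x_i: the repeated index i is summed away, leaving a scalar
s = np.einsum('i,i->', a, x)   # 1*4 + 2*5 + 3*6

M = np.array([[1, 4],
              [2, 3]])
v = np.array([5, 6])

# a_ij x_j: j is repeated (summed), i is free, so the result is indexed by i
w = np.einsum('ij,j->i', M, v)

# a_ij x_k: no repeated index, so nothing is summed and all three indices survive
outer = np.einsum('ij,k->ijk', M, v)

print(s, w, outer.shape)
```

The number of free (non-repeated) indices gives the rank of the result: zero for `s`, one for `w`, three for `outer`.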

Double sums  
Summation over both $i$ and $j$, since both indices are repeated:
$$
a_{ij}x_iy_j \equiv \sum_i\sum_j a_{ij}x_iy_j
$$
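
As a rough check of the double sum, the sketch below compares `np.einsum` against explicit nested loops. The values of `a`, `x` and `y` are assumptions picked for the example, not taken from the notebook.

```python
import numpy as np

a = np.array([[1, 4],
              [2, 3]])
x = np.array([1, 2])
y = np.array([3, 4])

# Both i and j are repeated, so both are summed and the result is a scalar
double_sum = np.einsum('ij,i,j->', a, x, y)

# The same quantity written as explicit nested sums over i and j
explicit = sum(a[i, j] * x[i] * y[j] for i in range(2) for j in range(2))

print(double_sum, explicit)
```

The same scalar can also be written as the bilinear form `x @ a @ y`.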

In [4]:
import tensorflow as tf

tf.config.list_physical_devices()

[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]

In [30]:
k = 2
batch_size, m, n = None, 4, 2
init = tf.random.uniform(shape=(m, n), minval=0, maxval=16, dtype=tf.int32)
A = tf.Variable(init)
A = tf.expand_dims(A, 0)
A

<tf.Tensor: shape=(1, 4, 2), dtype=int32, numpy=
array([[[14,  4],
        [ 4, 12],
        [ 9, 13],
        [ 0, 13]]], dtype=int32)>

In [31]:
init = tf.random.uniform(shape=(n, k), minval=0, maxval=16, dtype=tf.int32)
B = tf.Variable(init)
B

<tf.Variable 'Variable:0' shape=(2, 2) dtype=int32, numpy=
array([[3, 9],
       [5, 1]], dtype=int32)>

In [33]:
C = tf.einsum('ijk,kl->ijl', A, B)
C

<tf.Tensor: shape=(1, 4, 2), dtype=int32, numpy=
array([[[ 62, 130],
        [ 72,  48],
        [ 92,  94],
        [ 65,  13]]], dtype=int32)>

In [32]:
tf.matmul(A, B)

<tf.Tensor: shape=(1, 4, 2), dtype=int32, numpy=
array([[[ 62, 130],
        [ 72,  48],
        [ 92,  94],
        [ 65,  13]]], dtype=int32)>

In [34]:
# tf.matmul(A, B) matches the einsum result above: 'ijk,kl->ijl' is a batched matrix product
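
The comparison can be made explicit rather than read off the printed outputs. The sketch below is an assumption-laden stand-in: it uses NumPy instead of TensorFlow and hard-codes the values of `A` and `B` printed above, since the originals were drawn at random.

```python
import numpy as np

# Values copied from the printed outputs above (the originals were random)
A = np.array([[[14, 4],
               [4, 12],
               [9, 13],
               [0, 13]]])      # shape (1, 4, 2)
B = np.array([[3, 9],
              [5, 1]])         # shape (2, 2)

via_einsum = np.einsum('ijk,kl->ijl', A, B)
via_matmul = np.matmul(A, B)   # B is broadcast across the leading batch axis

print(np.array_equal(via_einsum, via_matmul))
```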

In [45]:
import numpy as np

A = np.matrix('''
    1 4;
    2 3
''')

B = np.matrix('''
    5 7;
    6 8
''')

C = A @ B
C

matrix([[29, 39],
        [28, 38]])

Each element $C_{ik}$ of $C$ is:
$$
C_{ik} = \sum_jA_{ij}B_{jk}
$$

For example, $C_{01} = 39$, since

$$
C_{01} = \sum_jA_{0j}B_{j1} = \underbrace{(1\times 7)}_{j=0} + \underbrace{(4\times 8)}_{j=1} = 7 + 32 = 39
$$
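
The same term-by-term calculation, written out in plain Python with nested lists in place of `np.matrix` (a sketch for illustration only):

```python
A = [[1, 4],
     [2, 3]]
B = [[5, 7],
     [6, 8]]

i, k = 0, 1
# One product per value of the summed index j: [A[0][0]*B[0][1], A[0][1]*B[1][1]]
terms = [A[i][j] * B[j][k] for j in range(2)]
c_01 = sum(terms)

print(terms, c_01)
```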

In [48]:
A = tf.convert_to_tensor(A)
B = tf.convert_to_tensor(B)

A, B

(<tf.Tensor: shape=(2, 2), dtype=int64, numpy=
 array([[1, 4],
        [2, 3]])>,
 <tf.Tensor: shape=(2, 2), dtype=int64, numpy=
 array([[5, 7],
        [6, 8]])>)

In [52]:
# equivalent to A @ B or tf.matmul(A, B)
tf.einsum('ij,jk->ik', A, B)

<tf.Tensor: shape=(2, 2), dtype=int64, numpy=
array([[29, 39],
       [28, 38]])>

In [67]:
# applying to batch case
A = tf.Variable([
    [[1, 2],
     [3, 4]],
    [[3, 5],
     [2, 9]],
])

B = tf.Variable(
    [[2], [1]]
)
A.shape, B.shape

(TensorShape([2, 2, 2]), TensorShape([2, 1]))

In [68]:
tf.matmul(A, B)

<tf.Tensor: shape=(2, 2, 1), dtype=int32, numpy=
array([[[ 4],
        [10]],

       [[11],
        [13]]], dtype=int32)>

For the $(i,j,l)$-th element of $C$, sum over the shared index $k$, which is the last axis of A and the first axis of B:

```
output[i,j,l] = sum_k A[i,j,k] * B[k, l]
```
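
The pseudocode above translates directly into nested loops. The sketch below uses plain Python lists with the same values as the batch example; the size names `I`, `J`, `K`, `L` are introduced here only for readability.

```python
A = [[[1, 2], [3, 4]],
     [[3, 5], [2, 9]]]   # shape (2, 2, 2), indexed by i, j, k
B = [[2], [1]]           # shape (2, 1), indexed by k, l

I, J, K, L = 2, 2, 2, 1

# output[i][j][l] = sum over k of A[i][j][k] * B[k][l]
output = [[[sum(A[i][j][k] * B[k][l] for k in range(K))
            for l in range(L)]
           for j in range(J)]
          for i in range(I)]

print(output)
```

Only `k` appears in both operands and is absent from the output, which is why it is the one index inside `sum`.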

In [70]:
# C[i, j, l] = sum over k of A[i, j, k] * B[k, l]
C = tf.einsum('ijk,kl->ijl', A, B)
C

<tf.Tensor: shape=(2, 2, 1), dtype=int32, numpy=
array([[[ 4],
        [10]],

       [[11],
        [13]]], dtype=int32)>