# Sparseloop Tutorial - 03 - Convolution

This notebook contains a series of examples of a DNN **convolution** computation. The **fibertree** emulator is used to illustrate the impact of a set of optimizations to exploit sparsity. The basic computation for a 1-D convolution with one input channel and one output channel is represented by the Einsum:

$$ O_{q} = I_{(q+s)} \times F_{s} $$

Note that while the output is possibly a sparse rank-1 tensor, in this notebook the output is assumed to be in an uncompressed format and is directly referenced by **coordinate** since **position** == **coordinate** in an uncompressed format.
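As a point of reference, the Einsum above can be sketched in plain Python without the fibertree emulator. This is only an illustrative sketch: the helper name `conv1d`, the dict representation of the sparse filter, and the sample values are assumptions, not part of the tutorial's code. The output is a plain list, so position and coordinate coincide as described above.

```python
def conv1d(i_vals, f_sparse, S):
    """Compute O_q = sum over s of I_(q+s) * F_s.

    i_vals   -- dense list of input activations (length W)
    f_sparse -- dict {s: weight} holding only the non-empty filter weights
    S        -- shape (width) of the filter rank
    """
    W = len(i_vals)
    Q = W - S + 1
    o_vals = [0] * Q  # uncompressed output: position == coordinate

    for q in range(Q):
        # Only the non-empty filter coordinates are visited
        for s, f_val in sorted(f_sparse.items()):
            o_vals[q] += i_vals[q + s] * f_val

    return o_vals


i_vals = [1, 2, 3, 4, 5, 6]
f_sparse = {0: 2, 2: 1}  # dense view: F = [2, 0, 1]
o_vals = conv1d(i_vals, f_sparse, S=3)
print(o_vals)  # [5, 8, 11, 14]
```

The empty filter coordinate `s == 1` is never visited, which is the saving the later cells illustrate with the emulator.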

First, load the required libraries:

In [None]:
# Run boilerplate code to set up environment

%run ./prelude.py --style=tree --animation=movie

## Configure convolution input tensors

The following cell sets up the control sliders to specify the attributes of the 1-D `I` (input activations) and `F` (filter weight) input tensors. Those attributes include their **shape**, which specifies the allowable range of **coordinates** of elements of the tensor and their **density**.
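One way to read **shape** and **density** is sketched below in plain Python. The helper `make_sparse_rank1` is hypothetical and is not how `TensorMaker` is implemented; it assumes density is the probability that any given coordinate is non-empty.

```python
import random


def make_sparse_rank1(shape, density, seed=0):
    """Return a {coordinate: value} dict for a rank-1 tensor.

    shape   -- coordinates are drawn from range(shape)
    density -- assumed probability that a coordinate is non-empty
    seed    -- makes the sketch reproducible
    """
    rng = random.Random(seed)
    return {c: rng.randint(1, 9) for c in range(shape) if rng.random() < density}


f_sparse = make_sparse_rank1(shape=3, density=0.5, seed=1)
i_dense = make_sparse_rank1(shape=12, density=1.0, seed=1)

print(sorted(f_sparse))   # some subset of [0, 1, 2]
print(len(i_dense))       # 12, since density 1.0 fills every coordinate
```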

The empty 1-D `O` (output activations) tensors are declared when used in later cells.

The rank names use the following convention:

- `W` - The width of the input activation tensor (`I`)
- `S` - The width of the filter weight tensor (`F`)
- `Q` - The width of the output activation tensor (`O`)
- `M` - The number of output channels in `F` and `O` (used in later cells).
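The widths above are related: the later cells compute `Q = W - S + 1`. A small sketch of that relationship follows, assuming no padding and a stride of one (which is what the Einsum's `q+s` index implies); the helper name `output_width` is hypothetical.

```python
def output_width(W, S):
    """Number of valid output coordinates for a 1-D convolution
    with no padding and a stride of one (assumed here)."""
    if S > W:
        raise ValueError("filter is wider than the input")
    return W - S + 1


W, S = 12, 3
Q = output_width(W, S)
print(Q)  # 10

# The largest input coordinate touched is (Q-1) + (S-1), i.e. the last one in I
last_w = (Q - 1) + (S - 1)
print(last_w == W - 1)  # True
```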


In [None]:
#
# Set default problem instance attributes (i.e., the shape of the tensors)
#
W = 12
S = 3


#
# Create controls to configure the `I` and `F` tensors
#
tm3 = TensorMaker("sparseloop-convolution-1", autoload=True)

tm3.addTensor("F", rank_ids=["S"], shape=[S], density=0.5, color="green")
tm3.addTensor("I", rank_ids=["W"], shape=[W], density=1.0, color="blue")

tm3.displayControls()


## Create and display convolution tensors

In [None]:
F_S = tm3.makeTensor("F")
I_W = tm3.makeTensor("I")

displayTensor(F_S)
displayTensor(I_W)


# Simple 1-D Convolution

$$ O_{q} = I_{(q+s)} \times F_{s} $$

In [None]:
#
# Create input convolution tensors
#
S = getShape(tm3, "S")
W = getShape(tm3, "W")
Q = W-S+1

F_S = tm3.makeTensor("F")
I_W = tm3.makeTensor("I")

O_Q = Tensor(name="O", rank_ids=["Q"], shape=[Q])
uncompressTensor(O_Q)

#
# Display Tensors
#
print("Problem Instance:")
print(f"S: {S}")
print(f"W: {W}")
print(f"Q: {Q}")
print("")

print("Filter weight tensor F")
displayTensor(F_S)
print("Input activation tensor I")
displayTensor(I_W)
print("Output activation tensor O (initial)")
displayTensor(O_Q)

#
# Get root of tensors
#
i_w = I_W.getRoot()
f_s = F_S.getRoot()
o_q = O_Q.getRoot()

#
# Animation bookkeeping
#
canvas = createCanvas(F_S, I_W, O_Q)

cycle = 0

#
# Traverse all `Q` coordinates of the output tensor
#
for q in range(Q):
    #
    # Traverse the non-empty coordinates of the filter weights
    #
    for s, f_val in f_s:
        #
        # Compute and fetch the required input activation coordinate
        #
        w = q + s
        i_val = i_w.getPayload(w)
        
        #
        # Compute value to contribute to partial output sum
        #
        o_q[q] += i_val * f_val
        
        #
        # Animation bookkeeping
        #
        canvas.addActivity((s,), (w,), (q,), spacetime=(0,cycle))
        cycle += 1
            
#
# Display results
#
print("Output activation tensor O (final)")
displayTensor(O_Q)

displayCanvas(canvas)

# Convolution - 1-D - with output channels

$$ O_{m,q} = I_{(q+s)} \times F_{m,s} $$
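The Einsum above adds an output-channel rank `M` to the filter and the output while the input is shared across channels. A plain-Python sketch under assumed representations (nested dicts for the sparse filter, a list of lists for the uncompressed output; the name `conv1d_multi` is hypothetical):

```python
def conv1d_multi(i_vals, f_sparse, M, S):
    """Compute O_(m,q) = sum over s of I_(q+s) * F_(m,s).

    i_vals   -- dense list of input activations (length W)
    f_sparse -- dict {m: {s: weight}} with only non-empty entries
    M, S     -- shapes of the output-channel and filter-width ranks
    """
    Q = len(i_vals) - S + 1
    o_vals = [[0] * Q for _ in range(M)]  # uncompressed M x Q output

    for m, f_s in sorted(f_sparse.items()):       # non-empty channels only
        for q in range(Q):
            for s, f_val in sorted(f_s.items()):  # non-empty weights only
                o_vals[m][q] += i_vals[q + s] * f_val

    return o_vals


i_vals = [1, 2, 3, 4, 5]
f_sparse = {0: {0: 1, 1: 1}, 2: {1: 3}}  # channel 1 has no weights
o_vals = conv1d_multi(i_vals, f_sparse, M=3, S=2)
print(o_vals)  # [[3, 5, 7, 9], [0, 0, 0, 0], [6, 9, 12, 15]]
```

An entirely empty output channel (here `m == 1`) is skipped at the outermost loop, so its row of the output stays at zero.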

## Configure convolution tensors

In [None]:
#
# Set default problem instance attributes (i.e., the shape of the tensors)
#
M = 4
W = 12
S = 4

#
# Create controls to configure the `I` and `F` tensors
#
tm4 = TensorMaker("sparseloop-convolution-2", autoload=True)

tm4.addTensor("F", rank_ids=["M", "S"], shape=[M, S], density=0.5, color="green")
tm4.addTensor("I", rank_ids=["W"], shape=[W], density=1.0, color="blue")

tm4.displayControls()


## Create and display convolution tensors

In [None]:
F_MS = tm4.makeTensor("F")
I_W = tm4.makeTensor("I")

displayTensor(F_MS)
displayTensor(I_W)


## Convolution with multiple output channels

In [None]:
#
# Create input convolution tensors
#
M = getShape(tm4, "M")
S = getShape(tm4, "S")
W = getShape(tm4, "W")
Q = W - S + 1

F_MS = tm4.makeTensor("F")
I_W = tm4.makeTensor("I")

O_MQ = Tensor(name="O", rank_ids=["M", "Q"], shape=[M, Q])
uncompressTensor(O_MQ)

#
# Display Tensors
#
print("Problem Instance:")
print(f"M: {M}")
print(f"S: {S}")
print(f"W: {W}")
print(f"Q: {Q}")
print("")

print("Filter weight tensor F")
displayTensor(F_MS)
print("Input activation tensor I")
displayTensor(I_W)
print("Output activation tensor O (initial)")
displayTensor(O_MQ)

#
# Get root of tensors
#
i_w = I_W.getRoot()
f_m = F_MS.getRoot()
o_m = O_MQ.getRoot()

#
# Animation bookkeeping
#
canvas = createCanvas(F_MS, I_W, O_MQ)

cycle = 0

#
# Traverse filter weight output channels
#
for m, f_s in f_m:    
    #
    # Traverse all `Q` coordinates of the output tensor
    #
    for q in range(Q):
        #
        # Traverse all non-empty filter weights for this output channel
        #
        for s, f_val in f_s:
            #
            # Compute and fetch the required input activation coordinate
            #
            w = q + s
            i_val = i_w.getPayload(w)

            #
            # Compute value to contribute to partial output sum
            #
            o_m[m][q] += i_val * f_val

            #
            # Animation bookkeeping
            #
            canvas.addActivity((m,s), (w,), (m,q), spacetime=(0,cycle))
            cycle += 1

#
# Display results
#
print("Output activation tensor O (final)")
displayTensor(O_MQ)

displayCanvas(canvas)

## Convolution - spatial output channels
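The cell below splits the `M` rank into groups of `split` channels, so that the channels within a group (`M0`) run side by side in space while the groups (`M1`) run one after another in time. A minimal sketch of that mapping follows. It iterates densely over every coordinate and uses `Q*S` steps per group, both of which are simplifying assumptions; the cell below visits only non-empty weights and uses `W*S`. The name `schedule` is hypothetical.

```python
def schedule(M, Q, S, split):
    """Map each (m, q, s) to a (space, time) stamp.

    Channels are grouped uniformly: m1 = m // split selects the group,
    m0 = m % split selects the lane within the group.
    """
    M0 = split
    M1 = (M + split - 1) // split  # number of groups, rounded up
    plan = {}
    for m in range(M):
        m1, m0 = divmod(m, M0)
        for q in range(Q):
            for s in range(S):
                space = m0                       # lane within the group
                time = m1 * Q * S + q * S + s    # groups run back to back
                plan[(m, q, s)] = (space, time)
    return M1, plan


M1, plan = schedule(M=4, Q=2, S=2, split=2)
print(M1)               # 2
print(plan[(0, 0, 0)])  # (0, 0)
print(plan[(1, 0, 0)])  # (1, 0): same time step, different lane
print(plan[(2, 0, 0)])  # (0, 4): next group starts after Q*S steps
```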

In [None]:
#
# Create input convolution tensors
#

split = 2

F_MS = tm4.makeTensor("F")
F_M1M0S = F_MS.splitUniform(split)
F_M1M0S = F_M1M0S.updateCoords(lambda n, c, p: n)

I_W = tm4.makeTensor("I")

W = I_W.getShape("W")
S = F_MS.getShape("S")
Q = W-S+1

M = F_MS.getShape("M")
M0 = split
M1 = (M+split-1)//split

O_MQ = Tensor(name="O", rank_ids=["M", "Q"], shape=[M, Q])
uncompressTensor(O_MQ)
O_M1M0Q = O_MQ.splitUniform(split)

#
# Display Tensors
#
print("Problem Instance:")
print(f"M1: {M1}")
print(f"M0: {M0}")
print(f"W:  {W}")
print(f"Q:  {Q}")
print(f"S:  {S}")

displayTensor(F_M1M0S)
displayTensor(I_W)
displayTensor(O_M1M0Q)

#
# Get root of tensors
#
i_w = I_W.getRoot()
f_m1 = F_M1M0S.getRoot()
o_m1 = O_M1M0Q.getRoot()

#
# Animation bookkeeping
#
canvas = createCanvas(F_M1M0S, I_W, O_M1M0Q)

#
# Traverse group of filter weight output channels
#
for m1, f_m0 in f_m1:
    #
    # Traverse filter weight output channels in this group
    #
    for m0, f_s in f_m0:        
        #
        # Traverse all output locations for this output channel
        #
        for q in range(Q):
            #
            # Traverse all non-empty filter weights for this output channel
            #
            for s, f_val in f_s:
                #
                # Compute and fetch the required input activation coordinate
                #
                w = q + s
                i_val = i_w.getPayload(w)

                #
                # Compute value to contribute to partial output sum
                #
                o_m1[m1][m0%M0][q] += i_val * f_val
                #
                # Animation bookkeeping
                #
                spacestamp = m0
                
                # TBD: Time should be counted in position not coordinates!
                timestamp = m1*W*S + q*S + s 
                
                canvas.addActivity((m1,m0,s), (w,), (m1,m0,q),
                                   spacetime=(spacestamp, timestamp))


displayTensor(O_M1M0Q)
displayCanvas(canvas)


## Testing area

For running alternative algorithms