# Exploring Cartesian Product

This notebook explores running Cartesian product computations for the following Einsum expression:

$$ Z_{m,n} = A_m \times B_n $$
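
As a point of reference, the expression can be sketched in plain Python without the fibertree library. The sketch assumes each sparse vector is a `dict` mapping a coordinate to its nonzero value; the function name `cartesian_product` is hypothetical and is not part of fibertree.

```python
def cartesian_product(a, b):
    """Return Z[m, n] = A[m] * B[n] for sparse vectors stored as dicts.

    Assumption: `a` and `b` map coordinate -> nonzero value.
    """
    z = {}
    for m, a_val in sorted(a.items()):
        for n, b_val in sorted(b.items()):
            z[(m, n)] = a_val * b_val
    return z


A = {0: 2, 3: 5}
B = {1: 7, 4: 3}
Z = cartesian_product(A, B)
```

Every nonzero of `A` pairs with every nonzero of `B`, so `Z` holds `len(A) * len(B)` entries.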

First, import the required libraries.

In [None]:
# Begin - startup boilerplate code

import pkgutil

if 'fibertree_bootstrap' not in [pkg.name for pkg in pkgutil.iter_modules()]:
  !python3 -m pip  install git+https://github.com/Fibertree-project/fibertree-bootstrap --quiet

# End - startup boilerplate code

from fibertree_bootstrap import *
fibertree_bootstrap(style="tree", animation='movie')


## Cartesian Product

The following cell contains a utility function to create the tensors.

In [None]:
#
# Function to create tensor inputs
#
M = 9
N = 9

tm = TensorMaker()

tm.addTensor("A", rank_ids=["M"], shape=[M], density=0.5, interval=9, color="blue")
tm.addTensor("B", rank_ids=["N"], shape=[N], density=0.5, interval=90, color="green")
tm.displayControls()

def create_tensors():
    a = tm.makeTensor("A")
    b = tm.makeTensor("B")
    
    z = Tensor(rank_ids=["M", "N"], shape=[a.getShape("M"), b.getShape("N")])
    
    return (z, a, b)

## Display Tensors

In [None]:
(Z_MN, A_M, B_N) = create_tensors()

displayTensor(A_M)
displayTensor(B_N)
displayTensor(Z_MN)

## Simple Cartesian product - with ordered outputs

The following cell shows two nested loops doing concordant traversal of the input tensors and also doing a concordant traversal of the output tensor using the populate operation (`<<`).

In [None]:
# Cartesian product with assignment to z_ref

(Z_MN, A_M, B_N) = create_tensors()

a_m = A_M.getRoot()
b_n = B_N.getRoot()
z_m = Z_MN.getRoot()

canvas = createCanvas(A_M, B_N, Z_MN)

for m, (z_n, a_val) in z_m << a_m:
    for n, (z_ref, b_val) in z_n << b_n:
        z_ref += a_val * b_val
        
        canvas.addActivity((m,), (n,), (m,n))
        
displayCanvas(canvas)
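
The role of concordant traversal can be sketched in plain Python. The sketch assumes a fiber is modeled as a list of `(coordinate, payload)` pairs sorted by coordinate; this is a stand-in for illustration, not the fibertree API. Because both inputs are visited in increasing coordinate order, the output can be built by appending alone and still ends up sorted.

```python
# Assumption: a fiber is a list of (coordinate, payload) pairs sorted by
# coordinate. This is a plain-Python stand-in, not the fibertree API.
a_m = [(1, 2), (4, 3), (7, 5)]
b_n = [(0, 10), (5, 20)]

z_m = []                      # output fiber for rank M
for m, a_val in a_m:
    z_n = []                  # output fiber for rank N under coordinate m
    for n, b_val in b_n:
        # Appending keeps z_n sorted because n arrives in increasing order.
        z_n.append((n, a_val * b_val))
    z_m.append((m, z_n))


def is_sorted(fiber):
    """Check that the coordinates of a fiber are in increasing order."""
    coords = [c for c, _ in fiber]
    return coords == sorted(coords)
```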

## Cartesian product with outputs updated based on coordinates

This example illustrates updating outputs based on coordinates using the `Fiber.getPayloadRef(<coord>)` method. This corresponds to doing a scatter write. Whether the write is into an uncompressed or a compressed space depends on the representation.

Note that this procedure isn't really necessary here because the outputs are generated in a concordant order.

In [None]:
# Cartesian product with assignment to z payload

(Z_MN, A_M, B_N) = create_tensors()

a_m = A_M.getRoot()
b_n = B_N.getRoot()
z_m = Z_MN.getRoot()

canvas = createCanvas(A_M, B_N, Z_MN)

for m, (a_val) in a_m:
    for n, (b_val) in b_n:
        p = z_m.getPayloadRef(m, n)
        p += a_val * b_val
        
        canvas.addActivity((m,), (n,), (m, n))

displayCanvas(canvas)

### Changing computation order

Note that by using `Fiber.getPayloadRef()` we don't need to maintain a concordant traversal order for the writes. The following sequence, with the `for` loops reversed, therefore works just fine, but it exhibits discordant traversal of the outputs.

In [None]:
# Cartesian product with assignment to z payload (loop order reversed)

(Z_MN, A_M, B_N) = create_tensors()

a_m = A_M.getRoot()
b_n = B_N.getRoot()
z_m = Z_MN.getRoot()

canvas = createCanvas(A_M, B_N, Z_MN)

for n, (b_val) in b_n:
    for m, (a_val) in a_m:
        p = z_m.getPayloadRef(m, n)
        p += a_val * b_val
        
        canvas.addActivity((m,), (n,), (m, n))

displayCanvas(canvas)
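
The effect of swapping the loops can be checked with a small plain-Python sketch. It assumes the output is a nested `dict` written by coordinate, which stands in for the scatter write; the helper `run` is hypothetical. Both loop orders produce the same tensor, but only the `m`-outer order writes the output in sorted `(m, n)` order.

```python
# Assumption: sparse vectors are dicts of coordinate -> value, and the
# output is a nested dict written by coordinate (a stand-in for scatter).
A = {1: 2, 4: 3}
B = {0: 10, 5: 20}


def run(outer_is_m):
    """Compute Z and record the order in which (m, n) outputs are written."""
    z = {}
    order = []
    if outer_is_m:
        pairs = [(m, n) for m in sorted(A) for n in sorted(B)]
    else:
        pairs = [(m, n) for n in sorted(B) for m in sorted(A)]
    for m, n in pairs:
        row = z.setdefault(m, {})          # find or create the row for m
        row[n] = row.get(n, 0) + A[m] * B[n]
        order.append((m, n))
    return z, order


z_mn, order_mn = run(outer_is_m=True)
z_nm, order_nm = run(outer_is_m=False)
```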

## Cartesian product with position split inputs

This example splits the input vectors uniformly in position space (into groups of 2) and illustrates the processing sequence for a 2x2 parallel Cartesian product.

### Generate inputs

In [None]:
# Get inputs

(Z_MN, A_M, B_N) = create_tensors()

a_m = A_M.getRoot()
b_n = B_N.getRoot()
z_m = Z_MN.getRoot()

### Split the input vectors

In [None]:
# Split each input vector into groups of 2 positions

a_m1 = a_m.splitEqual(2)
print("Split a")
displayTensor(a_m1)

b_n1 = b_n.splitEqual(2)
print("Split b")
displayTensor(b_n1)
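
A uniform split in position space can be sketched in plain Python as follows. The sketch assumes a fiber is a sorted list of `(coordinate, payload)` pairs, and it labels each group by the first coordinate it contains; that labeling, and the helper name `split_equal`, are assumptions made for illustration rather than a statement of the library's behavior.

```python
def split_equal(fiber, size):
    """Split a sorted list of (coord, payload) pairs into groups of `size`.

    Groups are formed by position, so every group except possibly the
    last holds exactly `size` elements regardless of coordinate gaps.
    """
    groups = []
    for start in range(0, len(fiber), size):
        chunk = fiber[start:start + size]
        # Assumption: label the group with its first coordinate.
        groups.append((chunk[0][0], chunk))
    return groups


a_m = [(0, 1), (2, 4), (3, 6), (7, 2), (8, 9)]
a_m1 = split_equal(a_m, 2)
```

Note that the original coordinates survive inside each group, which is what later allows the outputs to be written by coordinate.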

### Process the split vectors

In the animation below, one can see that each of the two values read from each input tensor is used by two PEs, and that four distinct outputs are generated (each in a distinct PE).

Note: Be sure to rerun the tensor creation cells above before running the cell below.

In [None]:
canvas = createCanvas(A_M, B_N, Z_MN)

cycle = 0

for m1, (a_m0) in a_m1:
    print(f"Process a_m0: {a_m0}")
    
    for n1,(b_n0) in b_n1:
        print(f"Process b_n0: {b_n0}")
        
        #
        # The following two loops can be run in parallel
        # and the `enumerate()` method allows us to identify
        # the currently active PE number for the display
        #
        for pe_m, (m0, a_val) in enumerate(a_m0):
            for pe_n, (n0, b_val) in enumerate(b_n0):
                # Note: m0 and n0 are the original coordinates
                p = z_m.getPayloadRef(m0, n0)
                p += a_val * b_val
                
                canvas.addActivity((m0,), (n0,), (m0,n0),
                                   spacetime=((pe_m,pe_n), cycle))

        cycle += 1

displayTensor(Z_MN)
displayCanvas(canvas)
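
The mapping of work onto PEs and cycles can also be sketched without the display machinery. This plain-Python sketch assumes fibers are sorted lists of `(coordinate, payload)` pairs and uses a hypothetical `split` helper; it records which PE handles which output in which cycle.

```python
def split(fiber, size):
    """Split a list of (coord, payload) pairs into position groups."""
    return [fiber[i:i + size] for i in range(0, len(fiber), size)]


a_m = [(0, 1), (2, 4), (3, 6)]
b_n = [(1, 5), (4, 7)]

schedule = []   # entries: ((pe_m, pe_n), cycle, (m0, n0))
z = {}
cycle = 0

for a_m0 in split(a_m, 2):
    for b_n0 in split(b_n, 2):
        # The two inner loops model a 2x2 PE array working in one cycle.
        for pe_m, (m0, a_val) in enumerate(a_m0):
            for pe_n, (n0, b_val) in enumerate(b_n0):
                z[(m0, n0)] = z.get((m0, n0), 0) + a_val * b_val
                schedule.append(((pe_m, pe_n), cycle, (m0, n0)))
        cycle += 1
```

In this sketch each PE is used at most once per cycle, and the last cycle leaves two PEs idle because the final group of `a_m` holds a single element.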

## Testing area

For running alternative algorithms