# Tool for manipulating tensors using the hierarchical fiber abstraction

## Introduction

The following cells provide an introduction to the Python operators on tensors (and other objects) in the fibertree abstraction. More background on this abstraction for representing tensors can be found in sections 8.2 and 8.3 of the book [Efficient Processing of Deep Neural Networks](https://doi.org/10.2200/S01004ED1V01Y202004CAC050).

First, we run standard boilerplate code to import the required libraries and create dropdown lists for selecting the display style and type of animation.

Note this boilerplate code will install the fibertree package, if needed, and will run in a variety of environments, including Google Colab. 

In [None]:
# Begin - startup boilerplate code

import pkgutil

if 'fibertree_bootstrap' not in [pkg.name for pkg in pkgutil.iter_modules()]:
  !python3 -m pip  install git+https://github.com/Fibertree-project/fibertree-bootstrap --quiet

# End - startup boilerplate code

from fibertree_bootstrap import *
fibertree_bootstrap(style="tree", animation='movie')

## Naming conventions

To make reading fiber tree-based code a bit easier, we try to use a consistent variable naming convention in this and other notebooks as follows:

### Rank shapes

The **shape** of a rank (and of all the fibers in that rank) is the maximum number of coordinates in each fiber. A variable holding the **shape** of all the fibers in a rank is usually named with a single uppercase character corresponding to the name of the rank. For example:

```python
# Fibers in ranks "M" and "K" have shapes 4 and 6, respectively

M = 4
K = 6
```

### Tensors

**Tensors** are identified either by a single lowercase letter or by a single lowercase letter, an underscore (\_), and the names of the ranks in the tensor. The rank names are listed in order from top to bottom of the fiber tree. For example:

```python
# Two tensors with ranks named "M" and "K"

a = Tensor(rank_ids=["M", "K"])
a_MK = Tensor(rank_ids=["M", "K"])

```

### Fibers

**Fibers** that are extracted from the ranks of a tensor are named with the lowercase name of the tensor, an underscore (\_), and a lowercase letter matching the name of the fiber's rank. For example:

```python
# Get the root fiber from a tensor

a_m = a_MK.getRoot()

```

Note how the naming of the variable holding the root fiber of a tensor follows from the name of the tensor.


### Coordinates

When accessing elements of a fiber, one can use the **Fiber.getPayload()** method to access a payload by **coordinate**. Coordinates are named with a lowercase letter corresponding to the name of the fiber's rank. An example, assuming the coordinates in a fiber lie in the half-open range from 0 to the rank shape and using **Fiber.getPayload()**, is:

```python

# Get the payloads (which happen to be fibers) at each coordinate in the a_m fiber

for m in range(M):
    a_k = a_m.getPayload(m)
    ...
```

Note how the name of the coordinate follows from the name of the fiber.

This convention for coordinate and payload fiber names will also be used when iterating through a fiber in the cells below. 


### Tensor values

The values at the bottom of a tensor's fiber tree, i.e., the leaf values, are terminal values. For Python-specific reasons, we need to distinguish between values that are only used computationally as an input (the right-hand side of an assignment) and values that are used as an output (the left-hand side of an assignment or update). So we use a lowercase letter corresponding to the name of the tensor followed by either \_val or \_ref to indicate those two cases. Such values are generated by the **Fiber.getPayload{,Ref}()** methods and also by the fiber mutation/insertion operator (<<). For example:

```python

# Get the value and a reference to a fiber element's payload at the lowest rank of a tensor

k = 0 

a_val = a_k.getPayload(k)
a_ref = a_k.getPayloadRef(k)
```

Note: This distinction tends to be needed only at the leaf payloads of a fiber tree, since variables holding a fiber tend to behave properly as either a value or a reference.
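
The Python behavior behind this distinction can be sketched without the fibertree package. Numbers are immutable, so `+=` on a plain value only rebinds the local name, whereas `+=` on a mutable wrapper updates the stored object. The `Ref` class and the dict-based fibers below are hypothetical stand-ins for illustration, not the fibertree classes.

```python
class Ref:
    """Hypothetical mutable wrapper around a leaf value (not the fibertree class)."""

    def __init__(self, value=0):
        self.value = value

    def __iadd__(self, other):
        # Update the wrapped value in place and return the same object
        self.value += other
        return self


# A fiber modelled as a plain dict: coordinate -> payload
fiber_of_values = {0: 5}
fiber_of_refs = {0: Ref(5)}

a_val = fiber_of_values[0]   # the number itself
a_val += 1                   # rebinds a_val only; the dict is unchanged

a_ref = fiber_of_refs[0]     # a handle to the stored object
a_ref += 1                   # mutates the object held in the dict

print(fiber_of_values[0])        # 5
print(fiber_of_refs[0].value)    # 6
```

This is why a name ending in \_val should only appear on the right-hand side of an assignment, while a name ending in \_ref may be updated.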


## Creating a tensor

Following is an example of reading in a tensor from a file in YAML format.


In [None]:
# Display an example tensor

filename = datafileName("draw-a.yaml")

print("YAML representation of a tensor\n")
with open(filename) as f:
    for line in f:
        print(line.rstrip('\n'))

## Create and display a tensor from a YAML file

In [None]:
a = Tensor.fromYAMLfile(filename)

print("Fiber-tree picture of a tensor")
displayTensor(a)

## Print output for fibers in the tensor

In [None]:
# Get the root fiber out of the tensor
a_m = a.getRoot()

print("Formatted printout of fiber\n")
print(f"{a_m}\n\n")

print("Formatted printout of fiber (with newlines)\n")
print(f"{a_m:n}\n\n")

print("Formatted printout of fiber (with newlines and no ellipsis)\n")
print(f"{a_m:n*}\n\n")

print("Formatted printout of fiber (with explicit coordinate and payload format)\n")
print(f"{a_m:(02d,03d)n*}\n\n")


## Create a tensor from an uncompressed array

One can also create a tensor from a set of nested lists.
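
To illustrate what such a conversion involves, here is a plain-Python sketch that turns nested lists into nested coordinate-to-payload dicts. It assumes that zeros are treated as empty and that sub-fibers with no elements are omitted. The `compress` helper is hypothetical and is not part of the fibertree API.

```python
def compress(data):
    """Hypothetical helper: nested lists -> nested {coordinate: payload} dicts,
    omitting zero values and empty sub-fibers."""
    fiber = {}
    for coord, item in enumerate(data):
        payload = compress(item) if isinstance(item, list) else item
        # Skip zeros and empty sub-fibers so only occupied coordinates remain
        if payload:
            fiber[coord] = payload
    return fiber


b_data = [[0, 0, 0, 60, 70, 0, 0, 0],
          [0, 0, 0, 0, 70, 80, 0, 0],
          [0, 0, 0, 0, 0, 0, 0, 0],
          [0, 0, 0, 0, 0, 90, 100, 0]]

b_x = compress(b_data)
print(b_x)
```

In this sketch the all-zero row at coordinate 2 does not appear in the result at all.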

In [None]:
b_data = [[0, 0, 0, 60, 70, 0, 0, 0],
          [0, 0, 0, 0, 70, 80, 0, 0],
          [0, 0, 0, 0, 0, 0, 0, 0],
          [0, 0, 0, 0, 0, 90, 100, 0]]

b = Tensor.fromUncompressed(["X", "Y"], b_data)

displayTensor(b)

## Create a random tensor

One can also create a random tensor.

In [None]:
c = Tensor.fromRandom(rank_ids=["X", "Y"],     # required
                      shape=[4,6],             # required
                      density=[1.0, 0.4],      # required
                      name="C",                # optional, default=""
                      interval=100,            # optional, default=10
                      color="red",             # optional, default="red"
                      seed=100)                # optional, default=None

displayTensor(c)

## Create a user configured random tensor

One can also create a random tensor using user-specified controls. This is done in two steps: a configuration step in the cell below and an instantiation step in the following cell.

In [None]:
#
# Instantiate the tensor factory
#
tm = TensorMaker()

#
# Define the templates for two tensors
#
d = tm.addTensor(name="D",                     # required
                 rank_ids=["X", "Y"],          # required
                 shape=[4,6],                  # required
                 density=0.4,                  # optional, default=0.2
                 interval=9,                   # optional, default=5
                 color="green",                # optional, default="red"
                 seed=100)                     # optional, default=10

e = tm.addTensor(name="E",                     # required
                 rank_ids=["Y", "X"],          # required
                 shape=[6,4],                  # required
                 density=0.4,                  # optional, default=0.2
                 interval=5,                   # optional, default=5
                 color="blue",                 # optional, default="red"
                 seed=200)                     # optional, default=10

#
# Display the controls to configure the tensors
#
tm.displayControls()

In [None]:
#
# Instantiate the named tensors defined above
#
d = tm.makeTensor("D")
e = tm.makeTensor("E")

#
# Display the tensors
#
displayTensor(d)
displayTensor(e)

## Traverse a tensor

The fibers in a tensor (starting with the root fiber) can be iterated over using a for loop. Each iteration returns the coordinate and payload of one element in the fiber. If the payload is itself a fiber, then that fiber can be iterated over in turn.
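
The shape of such a nested traversal can be sketched with plain dicts standing in for fibers. This is an illustrative model only, assuming elements are visited in increasing coordinate order; the data values are made up.

```python
# A rank-2 "tensor" modelled as nested dicts: m -> (k -> value)
a_m = {0: {0: 1, 2: 3}, 2: {1: 4}}

visited = []
for m, a_k in sorted(a_m.items()):
    # The payload at rank M is itself a fiber, so iterate over it in turn
    for k, a_val in sorted(a_k.items()):
        visited.append((m, k, a_val))

print(visited)
```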

In [None]:
# Traverse a tensor

a = Tensor.fromYAMLfile(datafileName("matrix-a.yaml"))

displayTensor(a)

## Animating a traversal

The codebase provides some utility functions to animate the accesses to a tensor. The notebook [fibertree animation](./fibertree-animation) has more details on animating a computation.

In [None]:

canvas = createCanvas(a)

a_m = a.getRoot()

for m, (a_k) in a_m:
    print(f"({m}, {a_k})")
    for k, (a_val) in a_k:
        print(f"Processing: ({k}, {a_val})")
        canvas.addActivity((m,k))

displayCanvas(canvas)

# Element-wise update (an empty) tensor, i.e., copy

To iteratively update the values in a fiber, one can use the mutation/insertion binary operator (<<). When given two fibers, for example "z << a", the operator returns a fiber that has every coordinate in "a" and, for each of those coordinates, a payload that is a tuple containing a reference to the payload in "z" and the payload in "a". Note that if "z" does not already have an element at a coordinate that exists in "a", then an element with a default value (typically zero) is inserted at that coordinate.

An in-depth exploration of the ```<<``` operator can be found at [lessthan-lessthan-operator](./lessthan-lessthan-operator.ipynb).
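
As a rough mental model of these semantics, the following plain-Python sketch mimics "z << a" on dict-based fibers. The `Ref` class and the `populate` function are hypothetical names introduced for illustration; they are not the fibertree implementation.

```python
class Ref:
    """Hypothetical mutable payload (stand-in for a fibertree payload object)."""

    def __init__(self, value=0):
        self.value = value

    def __iadd__(self, other):
        self.value += other
        return self


def populate(z, a, default=0):
    """Hypothetical model of "z << a" on dict-based fibers."""
    for coord in sorted(a):
        # Insert a default payload into z when the coordinate is missing
        if coord not in z:
            z[coord] = Ref(default)
        yield coord, (z[coord], a[coord])


a_m = {1: 10, 4: 40}
z_m = {}

for m, (z_ref, a_val) in populate(z_m, a_m):
    z_ref += a_val   # updates the payload stored in z_m

result = {m: ref.value for m, ref in z_m.items()}
print(result)
```

Starting from an empty `z_m`, the loop ends with a copy of `a_m`, which matches the "copy" use of the operator.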

In [None]:
# Element-wise update a tensor

a = Tensor.fromYAMLfile(datafileName("elementwise-a.yaml"))
z = Tensor(rank_ids=["M"])

a_m = a.getRoot()
z_m = z.getRoot()

print("Z << A Fiber")

canvas = createCanvas(a, z)

for m, (z_ref, a_val) in z_m << a_m:
    print(f"Processing: ({m}, ({z_ref}, {a_val}))")
    
    z_ref += a_val
    canvas.addActivity((m,), (m,))

displayCanvas(canvas)

# Intersection

One can intersect the contents of two fibers using the **and** (&) operator. That operator takes two fibers as operands and returns a fiber that has an element for each coordinate that appears in **both** input fibers and a payload that consists of a tuple of the corresponding payloads from the two input fibers.
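
The coordinate-matching rule can be sketched on plain dicts. The `intersect` function below is a hypothetical model of the operator's result, not the fibertree implementation, and the data values are made up.

```python
def intersect(a, b):
    """Hypothetical model of "a & b" on dict-based fibers."""
    # Keep only coordinates present in both inputs and pair their payloads
    return {c: (a[c], b[c]) for c in sorted(a.keys() & b.keys())}


a_m = {0: 1, 2: 3, 5: 6}
b_m = {2: 10, 3: 20, 5: 30}

z_m = intersect(a_m, b_m)
print(z_m)
```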

In [None]:
# Fiber intersection

a = Tensor.fromYAMLfile(datafileName("elementwise-a.yaml"))
b = Tensor.fromYAMLfile(datafileName("elementwise-b.yaml"))

a_m = a.getRoot()
b_m = b.getRoot()


z_m = a_m & b_m

print("Fiber a_m")
displayTensor(a_m)

print("Fiber b_m")
displayTensor(b_m)

print("Fiber a_m & b_m")
displayTensor(z_m)

## Elementwise multiply

Elementwise multiply uses intersection to work on only those elements of the input fibers that each have the same coordinate.

In [None]:
# Element-wise multiply

a = Tensor.fromYAMLfile(datafileName("elementwise-a.yaml"))
b = Tensor.fromYAMLfile(datafileName("elementwise-b.yaml"))
z = Tensor(rank_ids=["M"])

a_m = a.getRoot()
b_m = b.getRoot()
z_m = z.getRoot()

print("Z << (A & B) Fiber")

canvas = createCanvas(a, b, z)

for m, (z_ref, (a_val, b_val)) in z_m << (a_m & b_m):
    print(f"Processing: ({m}, ({z_ref}, ({a_val}, {b_val})))")

    z_ref += a_val * b_val
    canvas.addActivity((m,), (m,), (m,))

displayCanvas(canvas, width="75%")

## Dot-product

Here is a dot product of two tensors, computed row by row.
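
The loop structure of such a row-wise dot product can be sketched with nested dicts in place of fibers. The `intersect` generator is a hypothetical stand-in for the & operator and the data is made up. Unlike the << operator, this sketch only creates an output entry once a product is actually accumulated.

```python
def intersect(a, b):
    """Hypothetical model of "a & b": yield (coord, (a_payload, b_payload))."""
    for coord in sorted(a.keys() & b.keys()):
        yield coord, (a[coord], b[coord])


# Two rank-2 "tensors" modelled as nested dicts: m -> (k -> value)
a_m = {0: {0: 1, 2: 3}, 1: {1: 2}}
b_m = {0: {2: 4, 3: 5}, 1: {0: 7}, 2: {0: 1}}

z_m = {}
for m, (a_k, b_k) in intersect(a_m, b_m):
    # Only coordinates present in both rows contribute to the sum
    for k, (a_val, b_val) in intersect(a_k, b_k):
        z_m[m] = z_m.get(m, 0) + a_val * b_val

print(z_m)
```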

In [None]:
# Dot product
#
# To perform a dot-product we need a "row" for an output.
# So we represent the vectors as 2-D tensors
#


a = Tensor.fromYAMLfile(datafileName("dot-product-a.yaml"))
b = Tensor.fromYAMLfile(datafileName("dot-product-b.yaml"))
z = Tensor(rank_ids=["M"])

a_m = a.getRoot()
b_m = b.getRoot()
z_m = z.getRoot()

canvas = createCanvas(a, b, z)

for m, (z_ref, (a_k, b_k)) in z_m << (a_m & b_m):
    for k, (a_val, b_val) in a_k & b_k:
        print(f"Processing: [{k} -> ({z_ref}, ({a_val}, {b_val}))]")

        z_ref += a_val * b_val
        canvas.addActivity((m,k), (m, k), (m,))


displayCanvas(canvas)

# Union

One can union the contents of two fibers using the **or** (|) operator. That operator takes two fibers as operands and returns a fiber that has an element for each coordinate that appears in **either** input fiber and a payload that consists of a triple containing a mask (indicating whether the rest of the triple contains a payload from only-A, only-B, or both-A-and-B) and the corresponding payloads from the two input fibers. If a fiber doesn't have a particular coordinate, the default value (typically zero) is used.

For a deeper dive into the union operator, see [union-operator](./union-operator.ipynb).
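
The triple described above can be modelled on plain dicts as follows. The `union` function and the mask strings "A", "B", and "AB" are assumptions made for this sketch; the actual mask encoding used by the package may differ.

```python
def union(a, b, default=0):
    """Hypothetical model of "a | b" on dict-based fibers."""
    result = {}
    for coord in sorted(a.keys() | b.keys()):
        # Record which input(s) actually hold this coordinate
        if coord in a and coord in b:
            mask = "AB"
        elif coord in a:
            mask = "A"
        else:
            mask = "B"
        # Missing payloads are filled with the default value
        result[coord] = (mask, a.get(coord, default), b.get(coord, default))
    return result


a_m = {0: 1, 2: 3}
b_m = {2: 10, 3: 20}

z_m = union(a_m, b_m)
print(z_m)
```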

In [None]:
# Fiber union

a = Tensor.fromYAMLfile(datafileName("elementwise-a.yaml"))
b = Tensor.fromYAMLfile(datafileName("elementwise-b.yaml"))


a_m = a.getRoot()
b_m = b.getRoot()

z_m = a_m | b_m

print("Fiber a_m")
displayTensor(a_m)

print("Fiber b_m")
displayTensor(b_m)

print("Fiber a_m | b_m")
displayTensor(z_m)

## Elementwise addition

Elementwise addition uses the union operator. A more sophisticated version could look at the mask to see if an addition is actually needed.


In [None]:
#
# Do a sum of sums of the rows of two matrices
#

a = Tensor.fromYAMLfile(datafileName("dot-product-a.yaml"))
b = Tensor.fromYAMLfile(datafileName("dot-product-b.yaml"))

z = Tensor(rank_ids=["M"])

a_m = a.getRoot()
b_m = b.getRoot()
z_m = z.getRoot()

canvas = createCanvas(a, b, z)

for m, (z_ref, (mask_k, a_k, b_k)) in z_m << (a_m | b_m):
    for k, (ab_mask, a_val, b_val) in a_k | b_k:
        print(f"Processing: [{k} -> ({z_ref}, ({ab_mask}, {a_val}, {b_val}))]")

        z_ref += a_val + b_val
        canvas.addActivity((m, k), (m, k), (m,))


displayCanvas(canvas)

## Other binary operators

Other binary operators on fibers include **difference** (-) and **exclusive or** (^). Note that **exclusive or** returns a mask, like **or**.
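
A plain-Python sketch of how these two operators might behave on dict-based fibers follows. It assumes that difference keeps the elements of the first fiber whose coordinates are absent from the second, and that exclusive or keeps coordinates present in exactly one fiber. The function names and mask strings are hypothetical and not taken from the fibertree package.

```python
def difference(a, b):
    """Hypothetical model of "a - b": a's elements at coordinates absent from b."""
    return {c: a[c] for c in sorted(a.keys() - b.keys())}


def xor(a, b, default=0):
    """Hypothetical model of "a ^ b": coordinates present in exactly one fiber."""
    result = {}
    for c in sorted(a.keys() ^ b.keys()):
        mask = "A" if c in a else "B"
        result[c] = (mask, a.get(c, default), b.get(c, default))
    return result


a_m = {0: 1, 2: 3, 5: 6}
b_m = {2: 10, 3: 20}

z_m = difference(a_m, b_m)
z2_m = xor(a_m, b_m)

print(z_m)
print(z2_m)
```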

In [None]:
a = Tensor.fromYAMLfile(datafileName("elementwise-a.yaml"))
b = Tensor.fromYAMLfile(datafileName("elementwise-b.yaml"))


a_m = a.getRoot()
b_m = b.getRoot()

z_m = a_m - b_m
z2_m = a_m ^ b_m 

print("Fiber a_m")
displayTensor(a_m)

print("Fiber b_m")
displayTensor(b_m)

print("Fiber a_m - b_m")
displayTensor(z_m)

print("Fiber a_m ^ b_m")
displayTensor(z2_m)

## Reduce vector to a rank zero tensor

A final example illustrates reducing the elements of a fiber into a rank-0 tensor, which is created by constructing a tensor with an empty list of rank_ids.

Note: we currently need to specify a constant highlight coordinate for the rank-0 tensor...

In [None]:
a = Tensor.fromYAMLfile(datafileName("elementwise-a.yaml"))
z = Tensor(rank_ids=[])

a_m = a.getRoot()
z_ref = z.getRoot()

canvas = createCanvas(a, z)

for m, (a_val) in a_m:
    z_ref += a_val
    canvas.addActivity((m,), (0,))

displayCanvas(canvas)

## Testing area

For running alternative algorithms