# Swap ranks using update batching

This notebook explores the concept of swapping a tensor's ranks using update batching.

See [update batching example](../graphs/update-batching.ipynb) for a simple example of update batching for a graph algorithm.

First, include some libraries.

In [None]:
# Begin - startup boilerplate code

import pkgutil

if 'fibertree_bootstrap' not in [pkg.name for pkg in pkgutil.iter_modules()]:
  !python3 -m pip install git+https://github.com/Fibertree-project/fibertree-bootstrap --quiet

# End - startup boilerplate code


from fibertree_bootstrap import *
fibertree_bootstrap(style="tree", animation="movie")

## Create a rank-2 tensor

Create a random rank-2 tensor (__a__) 

In [None]:
R1 = 10
R0 = R1

a = Tensor.fromRandom(["R1", "R0"], [R1, R0], (1.0, 0.28), 9, seed=100)
a.setColor("green")
a.setName("A")

displayTensor(a)


## One-pass transpose

Perform the transpose in one pass: do a concordant traversal of __a__ and, for each element, use __getPayloadRef()__ to insert or update a fiber in the top rank of __a_swapped__ at coordinate __r0__, then append the value (from __a__) to that fiber at coordinate __r1__.

Observations:

- Traversal of __a__ is concordant.

- Update traversal of the top rank (__R0__) of __a_swapped__ is discordant.

- Traversal of the bottom rank (__R1__) of __a_swapped__ is always an append, but to many different fibers, each of unknown ultimate size.

- No fiber in the lower rank of __a_swapped__ is known to be complete until the traversal of __a__ has finished.
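
These observations can be illustrated outside of fibertree with a minimal plain-Python sketch. It assumes a dict-of-dicts stand-in for the tensor (`a_dict`, `swapped`, and `top_rank_updates` are hypothetical names, not part of the fibertree API) and records the order in which top-rank coordinates of the swapped tensor are touched.

```python
# Plain-Python stand-in for a rank-2 tensor: {r1: {r0: value}}.
# This is an illustrative assumption, not the fibertree representation.
a_dict = {
    0: {1: 10, 4: 11},
    2: {0: 12, 4: 13},
    3: {1: 14, 3: 15},
}

swapped = {}            # will hold {r0: {r1: value}}
top_rank_updates = []   # order in which top-rank coordinates are touched

for r1 in sorted(a_dict):                    # concordant traversal of a
    for r0 in sorted(a_dict[r1]):
        fiber = swapped.setdefault(r0, {})   # insert/update at r0
        fiber[r1] = a_dict[r1][r0]           # r1 only grows, so this appends
        top_rank_updates.append(r0)

print(top_rank_updates)   # [1, 4, 0, 4, 1, 3]
```

The printed sequence is not monotonic, which is what makes the top-rank updates discordant, while each lower-rank fiber receives its __r1__ coordinates in increasing order.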


In [None]:
a_swapped = Tensor(rank_ids=["R0", "R1"])
a_swapped.setColor("blue")
a_swapped.setName("A_swapped")

a_r1 = a.getRoot()
a_swapped_r0 = a_swapped.getRoot()

canvas = createCanvas(a, a_swapped)

for r1, a_r0 in a_r1:
    for r0, a_val in a_r0:
        a_swapped_r1 = a_swapped_r0.getPayloadRef(r0)
        a_swapped_r1.append(r1, a_val)
        
        canvas.addFrame((r1, r0), (r0, r1))
        
displayTensor(a_swapped)
displayCanvas(canvas)
        

## Step 1 - Update batch sequence

Do a (concordant) traversal of the rank-2 tensor (__a__), and for each element of the tensor log the coordinates (__r1__ and __r0__) and the value (__a_val__) as a tuple in a new rank-2 tensor (__bins__). When logging, select a bin id (the top rank coordinate of __bins__) by partitioning the bottom rank (__r0__) coordinates of __a__ (e.g., divide the bottom rank coordinate of __a__ by 2, using integer division).

Observations:

- Insert/updates to the top rank of __bins__ are discordant, but touch a much smaller number of coordinates than in the one-pass implementation.

- Additions to the fibers in the lower rank of __bins__ are always at the end of the fibers, so each of those fibers can be streamed to a larger storage array (although the ultimate size of each fiber is not known statically).
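
As a rough sketch of the logging step under simplifying assumptions, the following plain-Python version keeps the log as a dict of lists rather than a fibertree tensor. The names `a_elems`, `bins_log`, and `coords_per_bin` are hypothetical and chosen only for this illustration.

```python
coords_per_bin = 2

# Elements of a in concordant order, as (r1, r0, value) triples.
# The data is made up for illustration.
a_elems = [(0, 1, 10), (0, 4, 11), (2, 0, 12),
           (2, 4, 13), (3, 1, 14), (3, 3, 15)]

bins_log = {}   # {bin id: [(n, (r0, r1, value)), ...]}

for n, (r1, r0, val) in enumerate(a_elems, start=1):
    b = r0 // coords_per_bin                     # bin id from r0
    bins_log.setdefault(b, []).append((n, (r0, r1, val)))

# Fewer top-rank coordinates are touched than distinct r0 values.
distinct_r0 = {r0 for _, r0, _ in a_elems}
print(len(bins_log), len(distinct_r0))   # 3 4
```

Within each bin the sequence number `n` only increases, so every write to a bin is an append.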


TBD: Maybe consider using the tuple (__r1__, __r0__) as the coordinate for the lower rank of __bins__.

In [None]:
coordinates_per_bin = 2

bins = Tensor(rank_ids=["B", "N"])
bins.setColor("purple").setName("bins")

a_r1 = a.getRoot()
bins_b = bins.getRoot()

canvas = createCanvas(a, bins)
n = 0

for r1, a_r0 in a_r1:
    for r0, a_val in a_r0:
        n += 1
        
        b = r0//coordinates_per_bin
        bins_n = bins_b.getPayloadRef(b)
        
        bins_n.append(n, (r0, r1, a_val))
        canvas.addFrame([(r1, r0)], [(b, n)])
        
displayTensor(bins)
displayCanvas(canvas)


## Step 2 - Replay the log

Do a concordant traversal of all the elements of the log tensor (__bins__) and create the swapped tensor (__a_swapped__) by adding fibers (and elements) while processing each bin (top rank coordinate of __bins__). Specifically, for each payload (__r0__, __r1__, and __a_val__), insert or update a fiber in the new tensor (__a_swapped__) at the swapped top rank coordinate (__r0__) and append to that fiber the value (__a_val__) at the lower rank coordinate (__r1__).

Observations:

- During each time interval, i.e., the time spent working on a bin, the range of coordinates for insert/updates of the upper rank of the swapped tensor (__a_swapped__) is small, which allows __getPayloadRef()__ to be efficient (and amenable to shortcuts, see the cells below).

- All updates to the coordinates in the new tensor (__a_swapped__) associated with a bin are done when the bin is done, so that part of the new tensor can be streamed out to another storage level immediately.

- The operations on the fibers in the lower rank of the new tensor (__a_swapped__) are always appends. However, it could still be challenging to directly create a compressed representation, e.g., a payload/coordlist, for the multiple fibers created from a single bin, because they are created in parallel and the ultimate size of each fiber is unknown. At least they are more bounded in size, so they can be created in a smaller storage unit and concatenated when streamed to a larger storage unit.
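
The replay can also be sketched in plain Python, assuming the log is held as a dict of lists keyed by bin id (a stand-in, not the fibertree `bins` tensor). The names `bins_log`, `swapped`, and `r0_touched_per_bin` are hypothetical. The sketch also records which __r0__ coordinates each bin touches, to show that they stay inside that bin's coordinate range.

```python
coords_per_bin = 2

# Hypothetical log: {bin id: [(n, (r0, r1, value)), ...]}
bins_log = {
    0: [(1, (1, 0, 10)), (3, (0, 2, 12)), (5, (1, 3, 14))],
    1: [(6, (3, 3, 15))],
    2: [(2, (4, 0, 11)), (4, (4, 2, 13))],
}

swapped = {}              # {r0: {r1: value}}
r0_touched_per_bin = {}   # {bin id: set of r0 coordinates updated}

for b in sorted(bins_log):                       # concordant traversal of bins
    for n, (r0, r1, val) in bins_log[b]:
        swapped.setdefault(r0, {})[r1] = val     # insert/update r0, append r1
        r0_touched_per_bin.setdefault(b, set()).add(r0)

# Each bin only touches r0 values in [b*coords_per_bin, (b+1)*coords_per_bin).
in_range = all(
    b * coords_per_bin <= r0 < (b + 1) * coords_per_bin
    for b, touched in r0_touched_per_bin.items()
    for r0 in touched
)
print(in_range)   # True
```

Because later bins never revisit an earlier bin's __r0__ range, the fibers produced by a finished bin are final.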

In [None]:
a_swapped = Tensor(rank_ids=["R0", "R1"])
a_swapped.setColor("blue")
a_swapped.setName("A_swapped")

bins_b = bins.getRoot()
a_swapped_r0 = a_swapped.getRoot()

canvas = createCanvas(bins, a_swapped)

for b, bins_n in bins_b:
    for n, (r0, r1, a_val) in bins_n:
        a_swapped_r1 = a_swapped_r0.getPayloadRef(r0)
        a_swapped_r1.append(r1, a_val)
        canvas.addFrame((b, n), (r0, r1))
        
        
displayTensor(a_swapped)
displayCanvas(canvas)
        

In [None]:
# Check

a_swapped.getRoot() == a.swapRanks().getRoot()

## Step 2 - Replay the log - with shortcuts

Given the nice pattern of the values returned by the __getPayloadRef()__ method call exhibited by the above dataflow, one can optimize the search for the payload associated with the desired coordinate in the __getPayloadRef()__ call by using the "start_pos" shortcut. The following cell displays a control to enable or disable the use of the shortcut for the following log replay dataflow.
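
To give a feel for why a starting-position hint helps, here is a minimal sketch that models the top rank as a sorted Python list searched linearly. It is an assumption made for illustration, not a description of how fibertree implements __getPayloadRef()__; `find_or_insert` and `replay_top_rank` are hypothetical helpers. The hint is only valid when no later lookup needs a coordinate positioned before it, which holds here because bins are visited in increasing order.

```python
def find_or_insert(coords, coord, start_pos=0):
    """Scan a sorted list from start_pos; insert coord if it is absent.

    Returns (position of coord, number of positions skipped).
    """
    pos = start_pos
    while pos < len(coords) and coords[pos] < coord:
        pos += 1
    if pos == len(coords) or coords[pos] != coord:
        coords.insert(pos, coord)
    return pos, pos - start_pos


def replay_top_rank(r0_per_bin, use_shortcut):
    """Replay top-rank lookups bin by bin and total the search distance."""
    coords, total_distance, next_start_pos = [], 0, 0
    for r0_seq in r0_per_bin:
        start_pos = next_start_pos          # fixed for the whole bin
        for r0 in r0_seq:
            pos, distance = find_or_insert(coords, r0, start_pos)
            total_distance += distance
            if use_shortcut:
                next_start_pos = max(next_start_pos, pos)
    return coords, total_distance


# r0 coordinates in the order they appear in each bin (made-up data).
r0_per_bin = [[1, 0, 1], [3], [4, 4]]
coords_fast, dist_fast = replay_top_rank(r0_per_bin, use_shortcut=True)
coords_slow, dist_slow = replay_top_rank(r0_per_bin, use_shortcut=False)
print(dist_fast, dist_slow)   # 4 9
```

Both runs build the same coordinate list; only the total search distance differs.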

In [None]:
createEnableControl("Use shortcut")

In [None]:
a_swapped = Tensor(rank_ids=["R0", "R1"])
a_swapped.setColor("blue")
a_swapped.setName("A_swapped")

bins_b = bins.getRoot()
a_swapped_r0 = a_swapped.getRoot()

canvas = createCanvas(bins, a_swapped)

next_start_pos = 0

for b, bins_n in bins_b:
    start_pos = next_start_pos
    
    for n, (r0, r1, a_val) in bins_n:
        a_swapped_r1 = a_swapped_r0.getPayloadRef(r0, start_pos=start_pos)
        if enable["Use shortcut"]:
            next_start_pos = max(next_start_pos, a_swapped_r0.getSavedPos())
                
        a_swapped_r1.append(r1, a_val)
        canvas.addFrame((b, n), (r0, r1))
        
(n, distance) = a_swapped_r0.getSavedPosStats()
print(f"Average search distance = {distance/n:4.2f}")
      
displayTensor(a_swapped)
displayCanvas(canvas)


## Testing area

For running alternative algorithms