# What is `mandala`?
`mandala` is a library that makes it radically simpler to manage the data of
complex computational projects, e.g. in machine learning and data science. It
lets you save the inputs and outputs of Python functions, so that 

- calling a function on the same inputs again directly looks up the saved
outputs;
- the saved inputs/outputs across all functions form a connected whole (a kind
of [graph database](https://en.wikipedia.org/wiki/Graph_database)) that can be
queried and manipulated in high-level ways;
- additional features (loading values from storage only when needed,
transparent integration with Python's data structures, ...) make it a
practical tool for complex workflows.

This tutorial walks through a minimal example that introduces the main features
of `mandala` in the simplest possible scenario.

# Create a storage and functions to save
You can save the inputs and outputs of any Python function (that is, *memoize*
the function) by applying the `@op` decorator to it. A `Storage` instance stores
all calls to `@op`-decorated functions within a project.

In [1]:
from mandala._next.imports import *

### create a storage for results
storage = Storage()

### create the functions whose results we want to save
@op # memoization decorator
def inc(x):
    print('Hi from inc!')
    return x + 1 

@op
def add(x, y):
    print('Hi from add!')
    return x + y

# Run functions in a `with storage:` block to save their results
The `@op` decorator activates only when the function is called inside a `with
storage:` block:

In [2]:
with storage:
    x = inc(20)
    y = add(21, x)
    print(y) # a *reference* to the result of calling `add(21, inc(20))`

Hi from inc!
Hi from add!
AtomRef(42, hid='4da...', cid='d92...')
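
One way to picture a decorator that only activates inside a `with` block is a context manager that registers itself as the active storage. The sketch below is an illustration under that assumption, not `mandala`'s implementation; `MiniStorage`, `mini_op`, and `toy_inc` are hypothetical names, and nested blocks are not handled.

```python
from functools import wraps

class MiniStorage:
    """Hypothetical stand-in for a storage; not mandala's `Storage`."""
    active = None  # the storage whose `with` block is currently open

    def __init__(self):
        self.saved = {}

    def __enter__(self):
        MiniStorage.active = self
        return self

    def __exit__(self, exc_type, exc, tb):
        MiniStorage.active = None
        return False  # do not swallow exceptions

def mini_op(func):
    @wraps(func)
    def wrapper(*args):
        current = MiniStorage.active
        if current is None:               # outside any block: plain call
            return func(*args)
        key = (func.__name__, args)       # assumes hashable arguments
        if key not in current.saved:
            current.saved[key] = func(*args)
        return current.saved[key]
    return wrapper

runs = []  # inputs for which the function body actually ran

@mini_op
def toy_inc(x):
    runs.append(x)
    return x + 1

mini = MiniStorage()
toy_inc(20)          # outside the block: runs, nothing is saved
with mini:
    toy_inc(20)      # inside the block: runs and is saved
    toy_inc(20)      # inside the block: looked up, does not run
```

Here the body runs twice in total: once outside the block and once for the first call inside it.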


# Retrace saved code to load results and add new calls

**Running a second time will not re-execute the calls**, only retrieve
*references*, or `Ref`s, to the results from the `storage` at each function call
along the way:

In [3]:
with storage:
    x = inc(20) # `inc` won't be executed again, instead a reference to the previous result will be returned as `x`
    y = add(21, x) # the combination of (21, x) is recognized as a previously computed result and `add` won't be executed again
    print(y) # now `y` is only a *reference* to the value 42, and the object (42) itself is not loaded from the storage

AtomRef(hid='4da...', cid='d92...', in_memory=False)
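
The difference between a reference and its value can be sketched with a small lazy-loading class. This is an illustration only: `LazyRef`, `load`, `backend`, and the ID `'cid-of-42'` are hypothetical, and the sketch assumes that an ID is enough to fetch the value from storage on demand.

```python
from dataclasses import dataclass

_NOT_LOADED = object()  # sentinel: distinguishes "not loaded" from a stored None

@dataclass
class LazyRef:
    """Hypothetical reference: an ID plus an optionally loaded value."""
    cid: str                        # assumed to identify the stored content
    value: object = _NOT_LOADED

    @property
    def in_memory(self):
        return self.value is not _NOT_LOADED

def load(ref, backend):
    """Fetch the value from `backend` only if it is not already in memory."""
    if not ref.in_memory:
        ref.value = backend[ref.cid]
    return ref.value

backend = {'cid-of-42': 42}         # hypothetical persistent store
y_ref = LazyRef(cid='cid-of-42')    # a reference without its value
was_loaded_before = y_ref.in_memory
value = load(y_ref, backend)        # the value is fetched only now
```

Passing such references between calls is cheap, since the underlying object is read from storage only when something asks for it.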


Similarly, **adding more calls will not re-execute the parts that have
already been executed**:

In [4]:
with storage:
    for a in [10, 20, 30]: # when a is 20, we don't re-execute `inc` and `add`
        x = inc(a)
        y = add(21, x)

Hi from inc!
Hi from add!
Hi from inc!
Hi from add!


# Querying the storage with computation frames
The `storage` keeps track of how all the `@op`-decorated functions compose with
each other. You can explore this interconnected web of calls using
`ComputationFrame` objects, which generalize dataframes to operate over
computational graphs. 

For example, we can create a computation frame that holds all the calls to the
function `inc`:

In [5]:
cf = storage.cf(inc); cf

ComputationFrame with 2 variable(s) (6 unique refs), 1 operation(s) (3 unique calls)
Computational graph:
    output_0 = inc(x=x)

We see the computational graph represented by this computation frame, the number
of **variables** and **operations** in the graph, as well as the number of
**references** to values of the variables, and **calls** to the operations. 
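
The relationship between variables, references, operations, and calls can be sketched with plain records. The `Call` tuple, the string refs, and `frame_for` below are hypothetical and only mimic the counts reported above; they are not `mandala`'s data model.

```python
from collections import namedtuple

# Hypothetical record of one saved call; refs are plain strings here.
Call = namedtuple('Call', ['op', 'inputs', 'outputs'])

calls = [
    Call('inc', {'x': 'ref_10'}, {'output_0': 'ref_11'}),
    Call('inc', {'x': 'ref_20'}, {'output_0': 'ref_21'}),
    Call('inc', {'x': 'ref_30'}, {'output_0': 'ref_31'}),
    Call('add', {'x': 'ref_const_21', 'y': 'ref_11'}, {'output_0': 'ref_32'}),
]

def frame_for(op_name):
    """Select the calls to `op_name` and group their refs by variable name."""
    selected = [c for c in calls if c.op == op_name]
    variables = {}
    for call in selected:
        for name, ref in {**call.inputs, **call.outputs}.items():
            variables.setdefault(name, set()).add(ref)
    return selected, variables

inc_calls, inc_vars = frame_for('inc')
n_refs = sum(len(refs) for refs in inc_vars.values())
```

For `inc` this yields three calls and two variables (`x` and `output_0`) holding six refs in total, which matches the summary line printed by the computation frame.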

We can make this more interesting by **expanding** the computation frame to
include all calls that use `output_0` in the graph:

In [6]:
cf = cf.expand_forward('output_0'); cf

ComputationFrame with 3 variable(s) (9 unique refs), 2 operation(s) (6 unique calls)
Computational graph:
    output_0 = inc(x=x)
    output_0_0 = add(y=output_0)

We see that the calls to `add` show up. For subsequent analysis, it is often
more useful to convert the computation frame into a dataframe:

In [7]:
cf.get_df()

Extracting tuples from the computation graph:
output_0 = inc(x=x)
output_0_0 = add(y=output_0)...


|   | x | inc | output_0 | add | output_0_0 |
|---|---|---|---|---|---|
| 0 | 10 | Call(inc, cid='380...', hid='afd...') | 11 | Call(add, cid='ecc...', hid='172...') | 32 |
| 1 | 30 | Call(inc, cid='039...', hid='12e...') | 31 | Call(add, cid='19b...', hid='302...') | 52 |
| 2 | 20 | Call(inc, cid='692...', hid='e8e...') | 21 | Call(add, cid='b84...', hid='7d9...') | 42 |


The resulting table contains one row for each computation in the storage that
follows the given computational graph. Each row holds both the values of the
variables in the graph and the calls connecting them. Computations can also
follow the computational graph only **partially**, but more on that later.
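
Conceptually, each row comes from joining calls along the edges of the graph. The sketch below reproduces the value columns of the table with a plain join; the pair lists are hypothetical, and matching on values is a simplification, since the table above connects calls through references.

```python
# Hypothetical saved calls, as (input value, output value) pairs per operation.
inc_calls = [(10, 11), (20, 21), (30, 31)]   # x -> output_0
add_calls = [(11, 32), (21, 42), (31, 52)]   # y -> output_0_0 (the constant x=21 is omitted)

rows = [
    {'x': x, 'output_0': out, 'output_0_0': total}
    for x, out in inc_calls
    for y, total in add_calls
    if y == out                               # the variable shared by both calls
]
```

Each `inc` call matches exactly one `add` call here, so `rows` has three entries, one per complete computation.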

# Deleting calls
Finally, the computation frame object can be used to declaratively delete the
calls captured by it:

In [8]:
cf.delete_calls() # delete all calls made to `inc` and `add` captured by the computation frame (in this case, all calls)
# check that the calls have been deleted
cf = storage.cf(inc); cf

ComputationFrame with 0 variable(s) (0 unique refs), 1 operation(s) (0 unique calls)
Computational graph:
     = inc()
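
In terms of a simple key-value picture of saved calls, deletion removes the entries for the selected operations, so the next call with those inputs runs again. The `saved` dictionary and `delete_calls` helper below are hypothetical and only illustrate the effect.

```python
# Hypothetical store: (operation name, *inputs) -> saved output.
saved = {('inc', 10): 11, ('inc', 20): 21, ('add', 21, 11): 32}

def delete_calls(store, op_names):
    """Remove every saved call whose operation is in `op_names`; return the count."""
    doomed = [key for key in store if key[0] in op_names]
    for key in doomed:
        del store[key]
    return len(doomed)

removed = delete_calls(saved, {'inc', 'add'})
```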

# Conclusion and next steps

This tutorial covered the basic usage of `mandala`'s main features: memoizing
functions, retracing saved code, querying the storage with computation frames,
and deleting calls. The next tutorials dive deeper into more practical use cases.