# Rewriting Symbolic Expressions in Bartiq

As quantum algorithms grow in complexity, so do their symbolic resource expressions. For a state-of-the-art algorithm like double factorization, the resource expressions can be almost impossible to parse because of the sheer number of terms and symbols.

`bartiq` includes a set of utilities, known as **rewriters**, for manipulating symbolic expressions into forms that are simpler and easier to analyze. This functionality lives in the `analysis` submodule, and backend-specific rewriters can be imported directly.

In [1]:
from bartiq.analysis import sympy_rewriter

Here we give a brief demo of rewriters, applied to a state-of-the-art algorithm from the literature.

Double factorization (DF) is an important subroutine in quantum computations of chemistry. We will not explore the construction of the algorithm nor the intricacies of the circuit itself, but we invite the reader to read the following papers if they are interested:
- [Quantum computing enhanced computational catalysis](https://arxiv.org/abs/2007.14460)
- [Even more efficient quantum computations of chemistry through tensor hypercontraction](https://arxiv.org/abs/2011.03494)

Instead we will load in a pre-built `bartiq` `CompiledRoutine` object representing this algorithm.

In [2]:
from bartiq import CompiledRoutine, sympy_backend
import yaml

with open("../data/double_factorization_compiled.yaml", "r") as f:
    compiled_routine = CompiledRoutine.from_qref(yaml.safe_load(f), sympy_backend)

The `compiled_routine` variable is an object containing quantum resource estimates (along with other information) for the double factorization algorithm. To see all the resource costs we can access the `resource_values` property, but since these expressions are particularly nasty we won't for now.

As an example, consider fault-tolerant arbitrary angle rotations. To implement these fault tolerantly they must be decomposed into a sequence of fixed-angle gates, and often these decompositions contain many $T$-gates. As such the number of rotations in an algorithm may be a value we seek to minimize, or optimize.

To see how many arbitrary angle rotations we need to synthesize for double factorization, we can easily get the expression:

In [25]:
compiled_routine.resource_values["rotations"]

(N_spatial - 1)*Max(0, Max(0, b_givens - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas)) - Heaviside(-Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas) + 1.5, 0)*Heaviside(Max(0, b_mas - Max(b_as, b_givens, b_mas)) + Max(0, b_givens - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas)) + Max(b_as, b_givens, b_mas) - 1.5, 0) - Heaviside(-Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas) + 2.5, 0)*Heaviside(Max(0, b_mas - Max(b_as, b_givens, b_mas)) + Max(0, b_givens - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas)) + Max(b_as, b_givens, b_mas) - 2.5, 0)) + (N_spatial - 1)*Max(0, Max(0, b_givens - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(0, b_givens - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas)) - Max(0, b_mas - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(0, b_givens - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b

But it is not easy to understand!


First, a brief run down of the symbols we're seeing here:

- $N_{spatial}$: The number of spatial orbitals.
- $b_{as}$: The number of bits of precision for the alias sampling subroutine.
- $b_{mas}$: The number of bits of precision for the multiplexed alias sampling subroutine.
- $b_{givens}$: The number of bits of precision to represent each Givens rotation in the basis rotated number operator subroutine.

With rewriters we can apply assumptions on these symbols to hopefully tidy up this expression and make it more palatable! 

To get started, we load the expression into the rewriter object.

In [26]:
rotations = sympy_rewriter(compiled_routine.resource_values["rotations"])
rotations

SympyExpressionRewriter(expression=(N_spatial - 1)*Max(0, -Heaviside(-Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas) + 1.5, 0)*Heaviside(Max(0, b_mas - Max(b_as, b_givens, b_mas)) + Max(0, b_givens - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas)) + Max(b_as, b_givens, b_mas) - 1.5, 0) - Heaviside(-Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas) + 2.5, 0)*Heaviside(Max(0, b_mas - Max(b_as, b_givens, b_mas)) + Max(0, b_givens - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas)) + Max(b_as, b_givens, b_mas) - 2.5, 0) + Max(0, b_givens - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas))) + (N_spatial - 1)*Max(0, -Heaviside(-Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(0, b_givens - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas)) - Max(0, b_mas - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(0, b_givens - Max(0, b_mas - Max(b_as, b_givens, b

The rewriter stored in `rotations` is actually a `dataclass`; in an interactive environment it displays a KaTeX-friendly expression, so we can see the effect of each method call immediately.

Notice that the term $\max(b_{as}, b_{givens}, b_{mas})$ occurs a number of times in this expression. For simplicity we can replace this with a new variable: $B:=\max(b_{as}, b_{givens}, b_{mas})$.

In [27]:
rotations = rotations.substitute("max(b_as, b_givens, b_mas)", "B")
rotations

SympyExpressionRewriter(expression=(N_spatial - 1)*Max(0, -Heaviside(-B - Max(0, -B + b_mas) + 1.5, 0)*Heaviside(B + Max(0, -B + b_mas) + Max(0, -B + b_givens - Max(0, -B + b_mas)) - 1.5, 0) - Heaviside(-B - Max(0, -B + b_mas) + 2.5, 0)*Heaviside(B + Max(0, -B + b_mas) + Max(0, -B + b_givens - Max(0, -B + b_mas)) - 2.5, 0) + Max(0, -B + b_givens - Max(0, -B + b_mas))) + (N_spatial - 1)*Max(0, -Heaviside(-B - Max(0, -B + b_mas) - Max(0, -B + b_givens - Max(0, -B + b_mas)) - Max(0, -B + b_mas - Max(0, -B + b_mas) - Max(0, -B + b_givens - Max(0, -B + b_mas))) + 1.5, 0)*Heaviside(B + Max(0, -B + b_mas) + Max(0, -B + b_givens - Max(0, -B + b_mas)) + Max(0, -B + b_mas - Max(0, -B + b_mas) - Max(0, -B + b_givens - Max(0, -B + b_mas))) + Max(0, -B + b_givens - Max(0, -B + b_mas) - Max(0, -B + b_givens - Max(0, -B + b_mas)) - Max(0, -B + b_mas - Max(0, -B + b_mas) - Max(0, -B + b_givens - Max(0, -B + b_mas)))) - 1.5, 0) - Heaviside(-B - Max(0, -B + b_mas) - Max(0, -B + b_givens - Max(0, -B + b_

The `substitute` method allows us to perform one-to-one substitutions (as well as more complex substitutions, but more on that later) and we can track what substitutions we have made with the `substitutions` property:

In [6]:
rotations.substitutions

(Substitution(expr='max(b_as, b_givens, b_mas)', replacement='B', backend='SympyBackend'),)
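For intuition, a one-to-one substitution of this kind behaves much like `subs` in plain SymPy. The sketch below works on a small hand-written fragment with the same shape as terms in the rotations expression. It is an illustration only, not bartiq's implementation, and the names `largest`, `fragment` and `replaced` are made up for the example.

```python
from sympy import Max, symbols

b_as, b_givens, b_mas, B = symbols("b_as b_givens b_mas B")

# A small fragment shaped like the terms in the rotations expression.
largest = Max(b_as, b_givens, b_mas)
fragment = Max(0, b_mas - largest) + largest

# Replace every occurrence of the three-argument Max with the single symbol B.
replaced = fragment.subs(largest, B)
print(replaced)
```

After the substitution, `b_as` no longer appears in the fragment at all, which is part of why the expression becomes easier to read.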

We know that each of the $b_{x}$ variables is positive, and since these are _bits of precision_ they typically take values between 7 and 15. For simplicity we will say each of them is greater than 5, and we can apply that information straight away with the `assume` method:

In [28]:
rotations = rotations.assume("b_as > 5").assume("b_mas>5").assume("b_givens>5").assume("B > 5")
rotations

SympyExpressionRewriter(expression=B + (N_spatial - 1)*Max(0, -B + b_givens - Max(0, -B + b_mas)) + (N_spatial - 1)*Max(0, -B + b_givens - Max(0, -B + b_mas) - Max(0, -B + b_givens - Max(0, -B + b_mas)) - Max(0, -B + b_mas - Max(0, -B + b_mas) - Max(0, -B + b_givens - Max(0, -B + b_mas)))) + Max(0, -B + b_mas) + Max(0, -B + b_mas - Max(0, -B + b_mas) - Max(0, -B + b_givens - Max(0, -B + b_mas))) + 1, _original_expression=(N_spatial - 1)*Max(0, -Heaviside(-Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas) + 1.5, 0)*Heaviside(Max(0, b_mas - Max(b_as, b_givens, b_mas)) + Max(0, b_givens - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas)) + Max(b_as, b_givens, b_mas) - 1.5, 0) - Heaviside(-Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas) + 2.5, 0)*Heaviside(Max(0, b_mas - Max(b_as, b_givens, b_mas)) + Max(0, b_givens - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas)) + Max(b_as, b_givens, b_mas) - 2

Note we also had to provide an assumption on our `B` symbol - assumptions are not inherited across linked symbols!
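To see why the `Heaviside` factors disappear once the symbols are bounded from below, here is a rough illustration in plain SymPy. An inequality such as `B > 5` is emulated by writing `B = 5 + d` for a strictly positive symbol `d`. That shift is an assumption made for this sketch and is not necessarily how `assume` works internally.

```python
from sympy import Heaviside, Max, Symbol

# Emulate "B > 5" by shifting a strictly positive symbol (illustrative only).
d = Symbol("d", positive=True)
B = 5 + d

# The argument -B + 1.5 is provably negative, so this step function vanishes.
low_step = Heaviside(-B + 1.5, 0)
# The argument B - 1.5 is provably positive, so this step function is one.
high_step = Heaviside(B - 1.5, 0)
# A Max against a provably negative argument collapses to the other argument.
clipped = Max(0, -B + 2.5)

print(low_step, high_step, clipped)
```

Without any information about the sign of their arguments, SymPy has to leave such terms unevaluated, which is what makes the raw expression so long.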

Since $B$ is the maximum of $b_{as}$, $b_{mas}$ and $b_{givens}$, we can get rid of the `max(0, -B + b_mas)` and `max(0, -B + b_givens)` terms with more assumptions.

Note that the assumptions are **not** written as `b_mas <= B` or `b_givens <= B`. Defining a relationship between two symbols is not possible in SymPy, so each assumption is phrased as a bound on a single expression instead.

In [29]:
rotations = rotations.assume("b_mas - B <= 0").assume("b_givens - B <= 0")
rotations

SympyExpressionRewriter(expression=B + 1, _original_expression=(N_spatial - 1)*Max(0, -Heaviside(-Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas) + 1.5, 0)*Heaviside(Max(0, b_mas - Max(b_as, b_givens, b_mas)) + Max(0, b_givens - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas)) + Max(b_as, b_givens, b_mas) - 1.5, 0) - Heaviside(-Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas) + 2.5, 0)*Heaviside(Max(0, b_mas - Max(b_as, b_givens, b_mas)) + Max(0, b_givens - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas)) + Max(b_as, b_givens, b_mas) - 2.5, 0) + Max(0, b_givens - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas))) + (N_spatial - 1)*Max(0, -Heaviside(-Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(0, b_givens - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas)) - Max(0, b_mas - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(0, b_givens - Max(0, b

Nice! The number of rotations we need to synthesize is simply the largest of the bits of precision, plus one!
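As a quick numerical sanity check on the reasoning (not on bartiq itself), the brute-force sketch below confirms that the two `Max(0, ...)` terms removed by the last assumptions are zero whenever `B` is the largest of the three precisions. The helper name `clipped_terms` and the tested range of 6 to 15 bits are choices made for this illustration.

```python
from itertools import product


def clipped_terms(b_as: int, b_givens: int, b_mas: int) -> tuple:
    """Return the two Max(0, ...) terms that the final assumptions remove."""
    B = max(b_as, b_givens, b_mas)
    mas_term = max(0, b_mas - B)
    givens_term = max(0, b_givens - B - mas_term)
    return mas_term, givens_term


# Try every combination of precisions in the assumed range b > 5.
all_zero = all(
    clipped_terms(b_as, b_givens, b_mas) == (0, 0)
    for b_as, b_givens, b_mas in product(range(6, 16), repeat=3)
)
print(all_zero)
```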

There is also no need to keep reassigning the variable at every step. Rewriters support method chaining, so we could do the whole thing in one cell:

In [None]:
rotations = sympy_rewriter(compiled_routine.resource_values["rotations"])
rotations = (
    rotations.substitute("max(b_as, b_givens, b_mas)", "B")
    .assume("b_as > 5")
    .assume("b_mas > 5")
    .assume("b_givens > 5")
    .assume("B > 5")
    .assume("b_mas - B <= 0")
    .assume("b_givens - B <= 0")
)
rotations

SympyExpressionRewriter(expression=B + 1, _original_expression=(N_spatial - 1)*Max(0, -Heaviside(-Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas) + 1.5, 0)*Heaviside(Max(0, b_mas - Max(b_as, b_givens, b_mas)) + Max(0, b_givens - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas)) + Max(b_as, b_givens, b_mas) - 1.5, 0) - Heaviside(-Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas) + 2.5, 0)*Heaviside(Max(0, b_mas - Max(b_as, b_givens, b_mas)) + Max(0, b_givens - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas)) + Max(b_as, b_givens, b_mas) - 2.5, 0) + Max(0, b_givens - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas))) + (N_spatial - 1)*Max(0, -Heaviside(-Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(0, b_givens - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas)) - Max(0, b_mas - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(0, b_givens - Max(0, b

## Rewriter Cheat Sheet

We've already seen how to substitute one symbol or expression for another with `substitute`, and how to apply assumptions to symbols and expressions with `assume`. Here we give a quick rundown of the most important and most powerful rewriter methods.

#### Wildcard substitutions

SymPy implements something called `Wild` symbols: symbols that match anything. We can use these directly in the `substitute` method. Prefacing a symbol with `$` marks it as wild and allows for pattern matching in expressions:

In [10]:
sympy_rewriter("log(x + 2) + log(y + 4)").substitute("log($x + $y)", "f(x, y)")

SympyExpressionRewriter(expression=f(2, x) + f(4, y), _original_expression=log(x + 2) + log(y + 4), backend=<bartiq.symbolics.sympy_backend.SympyBackend object at 0x10b0122a0>, linked_symbols={}, _previous=(Substitution(expr='log($x + $y)', replacement='f(x, y)', backend='SympyBackend'), SympyExpressionRewriter(expression=log(x + 2) + log(y + 4), _original_expression=log(x + 2) + log(y + 4), backend=<bartiq.symbolics.sympy_backend.SympyBackend object at 0x10b0122a0>, linked_symbols={}, _previous=(Initial(), None))))

If a symbol is marked as wild in the first argument to `substitute` and then referenced in the second argument, whatever it matched is used in its place. If any other symbol, new or existing, is referenced in the second argument, it is inserted as-is. If an existing symbol is used as a wild symbol, the matched pattern takes precedence.

In [11]:
# Replace a wild pattern with a new symbol
sympy_rewriter("f(x) + f(y) + z").substitute("f($x)", "t")

SympyExpressionRewriter(expression=2*t + z, _original_expression=z + f(x) + f(y), backend=<bartiq.symbolics.sympy_backend.SympyBackend object at 0x10b0122a0>, linked_symbols={}, _previous=(Substitution(expr='f($x)', replacement='t', backend='SympyBackend'), SympyExpressionRewriter(expression=z + f(x) + f(y), _original_expression=z + f(x) + f(y), backend=<bartiq.symbolics.sympy_backend.SympyBackend object at 0x10b0122a0>, linked_symbols={}, _previous=(Initial(), None))))

In [12]:
# Replace a wild pattern with an existing symbol
sympy_rewriter("f(x) + f(y) + z").substitute("f($x)", "z")

SympyExpressionRewriter(expression=3*z, _original_expression=z + f(x) + f(y), backend=<bartiq.symbolics.sympy_backend.SympyBackend object at 0x10b0122a0>, linked_symbols={}, _previous=(Substitution(expr='f($x)', replacement='z', backend='SympyBackend'), SympyExpressionRewriter(expression=z + f(x) + f(y), _original_expression=z + f(x) + f(y), backend=<bartiq.symbolics.sympy_backend.SympyBackend object at 0x10b0122a0>, linked_symbols={}, _previous=(Initial(), None))))

In [13]:
# Using an existing symbol as a wild symbol
sympy_rewriter("f(x) + f(y) + z").substitute("f($y)", "y")

SympyExpressionRewriter(expression=x + y + z, _original_expression=z + f(x) + f(y), backend=<bartiq.symbolics.sympy_backend.SympyBackend object at 0x10b0122a0>, linked_symbols={}, _previous=(Substitution(expr='f($y)', replacement='y', backend='SympyBackend'), SympyExpressionRewriter(expression=z + f(x) + f(y), _original_expression=z + f(x) + f(y), backend=<bartiq.symbolics.sympy_backend.SympyBackend object at 0x10b0122a0>, linked_symbols={}, _previous=(Initial(), None))))
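For readers curious about the underlying mechanism, the same three substitutions can be approximated in plain SymPy with `Wild` and `Expr.replace`. This is a sketch of the SymPy feature the section refers to. That the `$` syntax maps onto it in roughly this way is an assumption, and the variable names are invented for the example.

```python
from sympy import Function, Wild, symbols

x, y, z, t = symbols("x y z t")
f = Function("f")
w = Wild("w")  # plays the role of a `$`-prefixed symbol

expr = f(x) + f(y) + z

# Pattern replaced by a new symbol: every f(...) becomes t.
to_new_symbol = expr.replace(f(w), t)
# Pattern replaced by an existing symbol: every f(...) becomes z.
to_existing_symbol = expr.replace(f(w), z)
# Replacement refers to the wild itself: f(arg) becomes arg.
unwrapped = expr.replace(f(w), w)

print(to_new_symbol, to_existing_symbol, unwrapped)
```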

The full documentation for `substitute` can be seen [here](https://psiq.github.io/bartiq/latest/concepts/rewriters/#substitutions).

#### Other useful methods

- `expand()`: Expand all the brackets in the expression.
- `simplify()`: Call the built-in SymPy `simplify` functionality. Use with care!

These two methods, along with `substitute` and `assume`, are the only ones that rewrite an expression. They return a new instance of the rewriter dataclass, and thus allow for method chaining. 

The remaining methods are useful for gathering information about the expression.

- `focus(symbols: str | Iterable[str])`: Return only those terms in the expression that contain certain `symbols`. This only hides the remaining terms, it does not delete them.
- `all_functions_and_arguments()`: Return all functions and their arguments in the expression, including nested functions.
- `list_arguments_of_function(function_name: str)`: Return all the arguments of a given function. If the function takes multiple arguments, they are returned as a tuple in the order they appear.
- `history()`: View a time-ordered list of all instructions applied to this instance of the rewriter.
- `evaluate_expression(assignments: dict[str, int | float], functions_map: dict[str, callable])`: Evaluate the expression for a specific data point.
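
Returning a fresh, immutable instance from each rewriting method, while keeping a reference to the previous one, is what makes both method chaining and a time-ordered history possible. The toy class below is a hypothetical, string-based sketch of that pattern only. `ToyRewriter` and its fields are invented for illustration and do not mirror bartiq's actual implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ToyRewriter:
    """Hypothetical immutable rewriter that works on plain strings."""

    expression: str
    instruction: str = "Initial()"
    previous: Optional["ToyRewriter"] = None

    def substitute(self, old: str, new: str) -> "ToyRewriter":
        # Return a new instance instead of mutating, so calls can be chained.
        return ToyRewriter(
            expression=self.expression.replace(old, new),
            instruction=f"Substitution({old!r}, {new!r})",
            previous=self,
        )

    def history(self) -> List[str]:
        # Walk back through the chain, then reverse into time order.
        steps = []
        node: Optional["ToyRewriter"] = self
        while node is not None:
            steps.append(node.instruction)
            node = node.previous
        return steps[::-1]


toy = ToyRewriter("max(a, b) + c").substitute("max(a, b)", "B").substitute("c", "1")
print(toy.expression)
print(toy.history())
```

Because every step keeps its predecessor, nothing is lost when an expression is rewritten, which matches the `_previous` and `_original_expression` fields visible in the outputs above.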

The full documentation can be seen [here](https://psiq.github.io/bartiq/latest/concepts/rewriters/).

# Challenges

Assume that all of our parameters are positive, real values greater than 10.

Show that the number of $T$-gates in this algorithm is **exactly**

$$T\mathrm{-gates} = 24\min\left(M_r, \lceil \log_2(R)\rceil\right) + 1$$

In [14]:
t_gates = sympy_rewriter(compiled_routine.resource_values["t_gates"])
t_gates

SympyExpressionRewriter(expression=(N_spatial - 1)*Heaviside(-Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas) + 2.5, 0)*Heaviside(Max(0, b_mas - Max(b_as, b_givens, b_mas)) + Max(0, b_givens - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas)) + Max(b_as, b_givens, b_mas) - 2.5, 0) + (N_spatial - 1)*Heaviside(-Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(0, b_givens - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas)) - Max(0, b_mas - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(0, b_givens - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas) + 2.5, 0)*Heaviside(Max(0, b_mas - Max(b_as, b_givens, b_mas)) + Max(0, b_givens - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas)) + Max(0, b_mas - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(0, b_givens - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_g

<div class="alert alert-block alert-info"> 
<b>PITFALL</b> 

The above expression contains a `1*mod(...)` expression. There is a difference between how SymPy _displays_ a function, and how it stores that function internally. 

Typing `rewriter.substitute("a*mod(b)", "c")` will raise a `TypeError` from SymPy - because `Mod` takes two arguments in the SymPy library. 

Therefore, the proper way to substitute would be by typing `rewriter.substitute("mod(a, b)", "c")` 
</div>
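
The two-argument requirement can be checked in plain SymPy as well. The sketch below relies only on standard SymPy behaviour. The flag name `single_argument_ok` is made up for the example.

```python
from sympy import Mod, symbols

a, b = symbols("a b")

# Mod always takes (dividend, divisor); this is the form SymPy stores.
remainder = Mod(a, b)
print(remainder.args)

# With concrete numbers it evaluates immediately.
seven_mod_three = Mod(7, 3)
print(seven_mod_three)

# Calling it with a single argument is rejected by SymPy.
try:
    Mod(b)
    single_argument_ok = True
except TypeError:
    single_argument_ok = False

print(single_argument_ok)
```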

------

As a bonus challenge, and assuming that the variable $\lambda = 1$ everywhere, show that the number of Toffoli gates is
$$ \mathrm{Toffs} \approx 4\max(b_{as}, b_{mas}, b_{givens})\left(N_{spatial} - 1\right) + 4N_{spatial} + b_{as} + 4\lceil \log_2(R)\rceil + 3\lceil \log_2(R + 1)\rceil + 7$$

Assuming the function $\lambda = 1$ everywhere is an aggressive assumption - making this expression only valid for a restricted set of parameters.

**HINT**: Since `lambda` is a reserved keyword in Python, the SymPy library defines its symbol with the alias `lamda`.
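
A small plain-SymPy sketch of the hint follows. The expression `toy_expression` is made up for illustration and is not the Toffoli count. It only shows how fixing `lamda` to 1 removes a `(lamda - 1)` factor.

```python
import keyword

from sympy import Symbol

# `lambda` is a reserved keyword in Python, hence the alternative spelling.
print(keyword.iskeyword("lambda"))

lamda = Symbol("lamda")
x = Symbol("x")

# A made-up expression containing a (lamda - 1) factor.
toy_expression = x * (lamda - 1) + x
at_one = toy_expression.subs(lamda, 1)
print(at_one)
```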

In [24]:
toffs = sympy_rewriter(compiled_routine.resource_values["toffs"])
toffs

SympyExpressionRewriter(expression=4*N_spatial + 2*b_givens*(-1 + (b_givens*(N_spatial - 1)*(lamda - 1) + b_givens*(N_spatial - 1))/(b_givens*(N_spatial - 1)))*(N_spatial - 1) + (-2 + 2*(b_mas + (b_mas + ceiling(log2(R)))*(lamda - 1) + ceiling(log2(R)))/(b_mas + ceiling(log2(R))))*(b_mas + ceiling(log2(R))) + (-1 + (2*b_as + 2*ceiling(log2(R + 1)))/(b_as + ceiling(log2(R + 1))))*(b_as + ceiling(log2(R + 1))) + (2*N_spatial - 2)*(Max(0, b_mas - Max(b_as, b_givens, b_mas)) + Max(0, b_givens - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas)) + Max(b_as, b_givens, b_mas)) + (2*N_spatial - 2)*(Max(0, b_mas - Max(b_as, b_givens, b_mas)) + Max(0, b_givens - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas)) + Max(0, b_mas - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(0, b_givens - Max(0, b_mas - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas)) - Max(b_as, b_givens, b_mas)) + Max(0, b_givens - Max(0, b_mas - Max(b_as, b_givens, b_mas)

<div class="alert alert-block alert-info"> 
<b>INFO</b> 

The function `ntz` in the above expression stands for 'number of trailing zeroes', and is defined as the number of consecutive zeroes at the end of the _binary representation_ of the number.

For example: `ntz(2)=ntz(0b10) = 1`, `ntz(5)=ntz(0b101)=0`, `ntz(256)=ntz(0b100000000)=8`. 

In physical systems none of our symbols will be larger than ~1000, and so the lowest value this function can take is `0` and the largest value it can take is `9`. 
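
The definition can be checked with a few lines of plain Python. This is a minimal sketch that assumes positive integer inputs and is independent of how bartiq evaluates `ntz`.

```python
def ntz(n: int) -> int:
    """Count the trailing zero bits of a positive integer."""
    if n <= 0:
        raise ValueError("ntz is only defined here for positive integers")
    # n & -n isolates the lowest set bit; its bit length minus one is the count.
    return (n & -n).bit_length() - 1


print(ntz(2), ntz(5), ntz(256))

# Largest value taken for any integer up to 1000 (reached at 512 = 2**9).
largest_below_1000 = max(ntz(n) for n in range(1, 1001))
print(largest_below_1000)
```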