## What is PyTensor?

A library to define, manipulate, and compile computational graphs.

### Let's break it apart
A library to (1.) define, (2.) manipulate, and (3.) compile (0.) computational graphs.


#### (0.) Computational graph

Any program implies a computational graph [citation needed]. In PyTensor we mostly focus on static array-based (i.e., NumPy-like) programs with some branching and looping primitives. Having said that, PyTensor can easily be extended to represent arbitrary types and operations, although its usefulness quickly vanishes as you venture out of its area of focus.

#### (1.) Definition 
In PyTensor, you define a computational graph explicitly: you start with placeholder input variables and/or constants, and compose operators that create intermediate placeholder output variables, which can in turn be used as inputs to further operators. It's made to look almost like NumPy code (to reduce the learning barrier and avoid too many design decisions), but it's not!

In [1]:
import numpy as np
import pytensor
import pytensor.tensor as pt

# NumPy
x = np.array([0, 1, np.e])
y = np.log(1 + x)
y

array([0.        , 0.69314718, 1.31326169])

In [2]:
# PyTensor
x = pt.tensor(shape=(3,), dtype="float64")  # placeholder
y = pt.log(1 + x)  # placeholder

In [3]:
y.dprint()

Log [id A]
 └─ Add [id B]
    ├─ ExpandDims{axis=0} [id C]
    │  └─ 1 [id D]
    └─ <Vector(float64, shape=(3,))> [id E]



In [4]:
y, type(y)

(Log.0, pytensor.tensor.variable.TensorVariable)

In [5]:
y.type, type(y.type)

(TensorType(float64, shape=(3,)), pytensor.tensor.type.TensorType)

In [6]:
y.owner, type(y.owner)

(Log(Add.0), pytensor.graph.basic.Apply)

In [7]:
y.owner.op, type(y.owner.op)

(Elemwise(scalar_op=log,inplace_pattern=<frozendict {}>),
 pytensor.tensor.elemwise.Elemwise)

In [8]:
y.owner.outputs, y.owner.outputs == [y]

([Log.0], True)

In [9]:
y.owner.inputs

[Add.0]

In [10]:
y.owner.inputs[0].dprint()  # And the story begins again

Add [id A]
 ├─ ExpandDims{axis=0} [id B]
 │  └─ 1 [id C]
 └─ <Vector(float64, shape=(3,))> [id D]



For those curious: this kind of graph is a bipartite, directed, acyclic graph composed of interconnected Variable -> Apply -> Variable nodes.

Apply nodes connect input variables to output variables, via a specific operator. Variables have a type and can have an owner (the Apply node that creates them) or not (if they are root placeholder variables).

Schematically: 

![](https://pytensor.readthedocs.io/en/latest/_images/apply.png)
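To make the Variable -> Apply -> Variable structure concrete, here is a minimal pure-Python sketch. The names `ToyVariable`, `ToyApply`, `apply_op` and `walk` are hypothetical and invented for illustration; this is not how PyTensor implements its graph classes, only a toy with the same `owner` / `inputs` / `outputs` relationships shown above.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class ToyVariable:
    """A placeholder for a value. `owner` is the node that creates it (None for roots)."""
    name: str
    owner: "ToyApply | None" = None


@dataclass(eq=False)
class ToyApply:
    """Connects input variables to output variables via an operator (here just a label)."""
    op: str
    inputs: list
    outputs: list = field(default_factory=list)


def apply_op(op, *inputs):
    """Create a ToyApply node and return its single output variable."""
    node = ToyApply(op=op, inputs=list(inputs))
    out = ToyVariable(name=f"{op}.0", owner=node)
    node.outputs.append(out)
    return out


def walk(var, depth=0, lines=None):
    """Depth-first walk from an output variable back to the root variables."""
    lines = [] if lines is None else lines
    label = var.owner.op if var.owner is not None else var.name
    lines.append(" " * depth + label)
    if var.owner is not None:
        for inp in var.owner.inputs:
            walk(inp, depth + 1, lines)
    return lines


# Toy version of y = log(1 + x)
x = ToyVariable("x")
one = ToyVariable("1")
y = apply_op("Log", apply_op("Add", one, x))
print("\n".join(walk(y)))
```

The walk alternates between variables and the nodes that own them, which is the same recursion that `y.owner.inputs[0].dprint()` hinted at.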

### (2.) Manipulation

PyTensor puts a strong focus on manipulating (and hacking) the computational graph at the Python level.

In [11]:
from pytensor.graph import rewrite_graph

In [12]:
y.dprint()

Log [id A]
 └─ Add [id B]
    ├─ ExpandDims{axis=0} [id C]
    │  └─ 1 [id D]
    └─ <Vector(float64, shape=(3,))> [id E]



#### Rewrites

You can rewrite graphs with different goals in mind, such as making them numerically more stable:

In [13]:
stable_y = rewrite_graph(y, include=("stabilize",))
stable_y.dprint()

Log1p [id A]
 └─ <Vector(float64, shape=(3,))> [id B]


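Why bother replacing `Log(Add(1, x))` with `Log1p(x)`? A small NumPy sketch (plain NumPy, not PyTensor) shows what goes wrong for tiny inputs: `1 + x` rounds to exactly `1.0` in float64 before the logarithm is ever taken.

```python
import numpy as np

x = np.array([1e-20, 1e-10, 1.0])

# Naive form: the addition loses the small x before log sees it
naive = np.log(1 + x)

# Stable form: log1p never materializes 1 + x
stable = np.log1p(x)

print("naive :", naive)
print("stable:", stable)
```

For the first entry the naive version returns exactly zero, while `log1p` keeps the value at roughly `x` itself.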

#### Differentiation

You can differentiate it:

In [14]:
from pytensor.gradient import grad
grad_y = grad(stable_y.sum(), wrt=x)
grad_y.dprint()

True_div [id A]
 ├─ Second [id B]
 │  ├─ Log1p [id C]
 │  │  └─ <Vector(float64, shape=(3,))> [id D]
 │  └─ ExpandDims{axis=0} [id E]
 │     └─ Second [id F]
 │        ├─ Sum{axes=None} [id G]
 │        │  └─ Log1p [id C]
 │        │     └─ ···
 │        └─ 1.0 [id H]
 └─ Add [id I]
    ├─ ExpandDims{axis=0} [id J]
    │  └─ 1 [id K]
    └─ <Vector(float64, shape=(3,))> [id D]



The cryptic `Second` means: keep the second input, after broadcasting it to the common shape with the first.
It is the same as `np.broadcast_arrays(x, y)[1]`.
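As a quick NumPy illustration of that semantics (the helper name `second` is made up here and is not a NumPy or PyTensor function): the values of the first argument are ignored, only its shape matters.

```python
import numpy as np


def second(a, b):
    """Return `b` broadcast to the common shape of `a` and `b`; the values of `a` are unused."""
    return np.broadcast_arrays(a, b)[1]


template = np.zeros((2, 3))  # only the shape of this array matters
fill = np.array(1.0)         # scalar to be spread over that shape

out = second(template, fill)
print(out.shape)
```

In the gradient graph above, this is how a scalar `1.0` gets expanded into an array of ones shaped like `Log1p(x)`.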

You can simplify / canonicalize equivalent graphs:

In [15]:
rewrite_graph(grad_y, include=("canonicalize",)).dprint()

True_div [id A]
 ├─ [1.] [id B]
 └─ Add [id C]
    ├─ [1.] [id B]
    └─ <Vector(float64, shape=(3,))> [id D]



Or specialize for faster computation:

In [16]:
rewrite_graph(grad_y, include=("canonicalize", "specialize")).dprint()

Reciprocal [id A]
 └─ Add [id B]
    ├─ [1.] [id C]
    └─ <Vector(float64, shape=(3,))> [id D]


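The specialized gradient graph reads as `1 / (1 + x)`, which is what you would expect for the derivative of `sum(log1p(x))`. A rough sanity check with central finite differences in plain NumPy (the function names below are illustrative, not part of PyTensor):

```python
import numpy as np


def f(x):
    # The scalar cost that was differentiated: sum(log1p(x))
    return np.log1p(x).sum()


def analytic_grad(x):
    # What the Reciprocal(Add(1, x)) graph computes
    return 1.0 / (1.0 + x)


def finite_difference_grad(fn, x, eps=1e-6):
    # Central differences, one coordinate at a time
    grad = np.empty_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        grad[i] = (fn(x + step) - fn(x - step)) / (2 * eps)
    return grad


x_val = np.array([0.0, 1.0, np.e])
print(analytic_grad(x_val))
print(finite_difference_grad(f, x_val))
```

The two printed vectors should agree to several decimal places; the finite-difference one is only an approximation.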

#### Vectorization

In [17]:
y.dprint(print_type=True)

Log [id A] <Vector(float64, shape=(3,))>
 └─ Add [id B] <Vector(float64, shape=(3,))>
    ├─ [1] [id C] <Vector(int8, shape=(1,))>
    └─ <Vector(float64, shape=(3,))> [id D] <Vector(float64, shape=(3,))>



In [18]:
from pytensor.graph.replace import vectorize_graph

new_x = pt.matrix("new_x", shape=(2, 3))
new_y = vectorize_graph(y, replace={x: new_x})
new_y.dprint(print_type=True)

Log [id A] <Matrix(float64, shape=(2, 3))>
 └─ Add [id B] <Matrix(float64, shape=(2, 3))>
    ├─ ExpandDims{axis=0} [id C] <Matrix(int8, shape=(1, 1))>
    │  └─ [1] [id D] <Vector(int8, shape=(1,))>
    └─ new_x [id E] <Matrix(float64, shape=(2, 3))>



#### Scalarization

In [19]:
only_one_entry_of_y = new_y[0, 1]
only_one_entry_of_y.dprint(print_type=True)

Subtensor{i, j} [id A] <Scalar(float64, shape=())>
 ├─ Log [id B] <Matrix(float64, shape=(2, 3))>
 │  └─ Add [id C] <Matrix(float64, shape=(2, 3))>
 │     ├─ ExpandDims{axis=0} [id D] <Matrix(int8, shape=(1, 1))>
 │     │  └─ [1] [id E] <Vector(int8, shape=(1,))>
 │     └─ new_x [id F] <Matrix(float64, shape=(2, 3))>
 ├─ 0 [id G] <int64>
 └─ 1 [id H] <int64>



In [20]:
rewrite_graph(only_one_entry_of_y).dprint(print_type=True)

Log [id A] <Scalar(float64, shape=())>
 └─ Add [id B] <Scalar(float64, shape=())>
    ├─ 1.0 [id C] <Scalar(float64, shape=())>
    └─ Subtensor{i, j} [id D] <Scalar(float64, shape=())>
       ├─ new_x [id E] <Matrix(float64, shape=(2, 3))>
       ├─ 0 [id F] <int64>
       └─ 1 [id G] <int64>


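Notice how the rewrite moved the indexing below the elementwise operations: instead of computing all six entries and keeping one, the graph now indexes `new_x` first and only computes a single logarithm. A NumPy sketch of the same equivalence (the variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
new_x_val = rng.uniform(size=(2, 3))

# Original graph: elementwise work on the full (2, 3) matrix, then pick one entry
full_then_index = np.log(1 + new_x_val)[0, 1]

# Rewritten graph: pick the entry first, then do scalar work only
index_then_compute = np.log(1 + new_x_val[0, 1])

print(full_then_index, index_then_compute)
```

Both give the same number because elementwise operations treat each entry independently, so indexing commutes with them.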

#### Integration

Okay, it can't do everything (but maybe you can extend it to?)

#### Graph surgery

In [21]:
from pytensor.graph.replace import graph_replace

In [22]:
y.dprint()

Log [id A]
 └─ Add [id B]
    ├─ [1] [id C]
    └─ <Vector(float64, shape=(3,))> [id D]



You can truncate a graph easily:

In [23]:
x_plus_1 = pt.vector("x_plus_1", shape=(3,))
new_y = graph_replace(y, replace={y.owner.inputs[0]: x_plus_1})
new_y.dprint()

Log [id A]
 └─ x_plus_1 [id B]



Or extend it:

In [24]:
log_x = pt.vector("log_x", shape=(3,))
new_y = graph_replace(y, replace={x: pt.exp(log_x)})
new_y.dprint()

Log [id A]
 └─ Add [id B]
    ├─ [1] [id C]
    └─ Exp [id D]
       └─ log_x [id E]



And manipulations are composable. You can rewrite, differentiate, vectorize, ... every graph you get back: 

In [25]:
rewrite_graph(new_y, include=("stabilize",)).dprint()

Scalar_softplus [id A]
 └─ log_x [id B]



No idea why it's not just Softplus (PR welcome), but I promise it's a stable computational graph!
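To see what "stable" buys you here, compare a direct NumPy translation of `log(1 + exp(log_x))` with one common stable formulation of softplus. This is a sketch in plain NumPy; the formula below is a standard identity and an assumption on my part, not necessarily the exact expression PyTensor uses internally.

```python
import numpy as np


def naive_softplus(a):
    # Direct translation of log(1 + exp(a)); exp overflows for large a
    with np.errstate(over="ignore"):
        return np.log(1 + np.exp(a))


def stable_softplus(a):
    # Identity: log(1 + exp(a)) = max(a, 0) + log1p(exp(-|a|))
    return np.maximum(a, 0) + np.log1p(np.exp(-np.abs(a)))


log_x = np.array([-40.0, 0.0, 800.0])
naive = naive_softplus(log_x)
stable = stable_softplus(log_x)
print("naive :", naive)
print("stable:", stable)
```

The naive version collapses to zero for very negative inputs and overflows to infinity for large ones, while the stable version stays finite and nonzero at both ends.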

### (3.) Compilation

All this is fun and dandy, but only useful if we actually use it to compute stuff!

PyTensor provides a critical non-composable graph operation: `function`, which converts a PyTensor graph into a callable Python object that takes concrete inputs and returns concrete outputs.

By default it runs an extensive database of rewrites to try and optimize the computational graph, and then compiles to C (technically a mix of C and Python if not all operations have a C implementation). See https://pytensor.readthedocs.io/en/latest/extending/pipeline.html for a bit more detail.

As with anything remotely useful in Python, when it comes to work you want to [STAY OUT OF PYTHON](https://www.youtube.com/watch?v=vVUnCXKuNOg) as much as possible.

In [26]:
x = pt.vector("x", shape=(None,))
z = pt.exp(pt.sin(x))
out = pt.cos((z[None, :] @ z[:, None]).squeeze())
out.dprint()

Cos [id A]
 └─ DropDims{axes=[0, 1]} [id B]
    └─ Blockwise{dot, (m,k),(k,n)->(m,n)} [id C]
       ├─ ExpandDims{axis=0} [id D]
       │  └─ Exp [id E]
       │     └─ Sin [id F]
       │        └─ x [id G]
       └─ ExpandDims{axis=1} [id H]
          └─ Exp [id E]
             └─ ···



In [27]:
y_fn = pytensor.function([x], out)

In [28]:
type(y_fn)

pytensor.compile.function.types.Function

In [29]:
y_fn(np.random.randn(3))

array(0.24843424)

In [30]:
y_fn(np.random.randn(5))

array(-0.99915347)

In [31]:
y_fn.dprint(print_destroy_map=True)  # Some memory aliasing optimizations

Cos [id A] d={0: [0]} 5
 └─ DropDims{axis=0} [id B] 4
    └─ CGemv{inplace} [id C] d={0: [0]} 3
       ├─ AllocEmpty{dtype='float64'} [id D] 1
       │  └─ 1 [id E]
       ├─ 1.0 [id F]
       ├─ ExpandDims{axis=0} [id G] 2
       │  └─ Composite{exp(sin(i0))} [id H] 0
       │     └─ x [id I]
       ├─ Composite{exp(sin(i0))} [id H] 0
       │  └─ ···
       └─ 0.0 [id J]

Inner graphs:

Composite{exp(sin(i0))} [id H]
 ← exp [id K] 'o0'
    └─ sin [id L]
       └─ i0 [id M]


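Two things happened in the compiled graph: `exp(sin(x))` was fused into a single `Composite` elementwise operation, and the row-times-column matrix product was replaced by a cheaper BLAS matrix-vector call. The NumPy sketch below only checks the underlying math (it does not reproduce the BLAS call itself): a `(1, n) @ (n, 1)` product is just the inner product of `z` with itself. The function names are illustrative.

```python
import numpy as np


def reference(x):
    # Literal translation of the graph as it was defined
    z = np.exp(np.sin(x))
    return np.cos((z[None, :] @ z[:, None]).squeeze())


def simplified(x):
    # Same result with a vector inner product instead of a matrix product
    z = np.exp(np.sin(x))
    return np.cos(z @ z)


rng = np.random.default_rng(42)
x_val = rng.normal(size=5)
print(reference(x_val), simplified(x_val))
```

A reference function like this is also a handy way to test that a compiled PyTensor function returns what you expect.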

PyTensor can also delegate compilation to other libraries in town, namely Numba, JAX, and PyTorch (the latter is still under active development).

In [32]:
y_numba_fn = pytensor.function([x], out, mode="NUMBA")  # Numba does its own BLAS optimizations, so we don't have to!
y_numba_fn.dprint(print_destroy_map=True)

Cos [id A] d={0: [0]} 5
 └─ DropDims{axes=[0, 1]} [id B] 4
    └─ dot [id C] 3
       ├─ ExpandDims{axis=0} [id D] 2
       │  └─ Composite{exp(sin(i0))} [id E] 0
       │     └─ x [id F]
       └─ ExpandDims{axis=1} [id G] 1
          └─ Composite{exp(sin(i0))} [id E] 0
             └─ ···

Inner graphs:

Composite{exp(sin(i0))} [id E]
 ← exp [id H] 'o0'
    └─ sin [id I]
       └─ i0 [id J]



In [33]:
y_numba_fn(np.random.randn(3))  # The first call is slow because JIT compilation actually happens here

array(0.65956483)

In [34]:
y_numba_fn(np.random.randn(5))

array(0.99906762)

## Taking a step back

### What is PyTensor again?

For a more coherent introduction of PyTensor and its design principles see: https://pytensor.readthedocs.io/en/latest/introduction.

### How does it compare with alternative frameworks?

* Graph is built explicitly with placeholder inputs (common source of confusion for users)
* It is focused on array (tensor) operations (dense and sparse). It tries to look almost like NumPy / SciPy (until the abstraction breaks).
  * There is narrow / hidden support for other types like scalars, lists, slices, random Generators, strings, None (although easy to extend)
* Functional design (there is no variable mutation when defining graphs)
* Strong focus on hackability / graph manipulation
* It's completely ours! (And nobody else uses it)
* Evolved from:
  1. Theano, which strongly inspired TensorFlow 1.x and JAX. Many concepts stood the test of time. Others have aged and provide some drag.
  2. Aesara, which cleaned up the codebase, added alternative backends (Numba and JAX) and proved there's some interest out there in a library like this.

### Why are we using it?

A mix of inertia/technical debt and Stockholm syndrome of course. 

More seriously, Theano strongly influenced the design (and unique strengths) of PyMC. The graph-based approach turned out to be perfect for the Bayesian workflow, where you can reuse the same program specification for very distinct goals: ancestral random sampling (prior predictive), truncated ancestral random sampling (posterior predictive), probability transformation and differentiation (inference and optimization), explicit graph manipulation (causal inference), and so on...

Many of the stabilization optimizations that PyTensor can do are very relevant for Bayesian inference, and it's great that users don't have to worry (as much) about them. For instance, passing `logit_p=logits` or `p=pm.math.invlogit(logits)` yields exactly the same graph (and stabilization)! You may have known about log1p, but did you know about log1pexp (softplus above) and log1mexp?
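If log1mexp is new to you, here is a rough standard-library sketch of the idea. It assumes the convention `log1mexp(x) = log(1 - exp(x))` for negative `x` (some libraries define it with the opposite sign on the argument) and uses the usual trick of switching formulas at `-log(2)`. The function names are illustrative and this is not PyTensor's or PyMC's implementation.

```python
import math


def naive_log1mexp(x):
    # Direct translation of log(1 - exp(x))
    return math.log(1 - math.exp(x))


def stable_log1mexp(x):
    # Assumed convention: x < 0, result is log(1 - exp(x))
    if x >= 0:
        raise ValueError("x must be negative")
    if x > -math.log(2):
        return math.log(-math.expm1(x))  # accurate when exp(x) is close to 1
    return math.log1p(-math.exp(x))      # accurate when exp(x) is close to 0


# Very negative x: the naive version rounds 1 - exp(x) to exactly 1
print(naive_log1mexp(-50.0), stable_log1mexp(-50.0))

# x very close to 0: the naive version ends up taking log(0)
try:
    naive_log1mexp(-1e-20)
except ValueError as err:
    print("naive version failed:", err)
print(stable_log1mexp(-1e-20))
```

Both failure modes show up all the time when working with log-probabilities, which is why having the graph rewritten for you is so convenient.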

On the other hand, the laziness / abstraction level makes it easier to interoperate with other popular Python libraries, like numpyro / blackjax (and more generally the JAX ecosystem), Numba, and even libraries in different languages, like nutpie and BART in Rust. All this would have been more limited if PyMC were built on a more eager/specialized computational framework.

### What's the catch?

A mix of inertia/technical debt and Stockholm syndrome, this time for real.

It's only us out here, and so:
1. It takes effort to keep up with the times (e.g., implementing an xarray-like dims-based abstraction on top of PyTensor).
2. It takes effort to fix bugs and improve the user experience (e.g., do you find the super long error messages that PyTensor outputs useful? Ever? If not, why don't you try and make them better ;))
3. There are fewer resources and less community help out there for learning the tool. If it feels like you are forced to use some obscure library, it's hard not to hate it! There's a reason kidnapping someone is not at the top of seduction handbooks!

We hope this workshop helps in these regards! For the project to succeed in the long term, though, we will need your help with it!