# L-GATr Quickstart
# [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/heidelberg-hepml/lgatr/blob/main/examples/demo_lgatr.ipynb)

In this tutorial, we give a quick introduction to using LGATr. LGATr is a Lorentz-equivariant transformer for applications in high-energy physics and other domains where Lorentz symmetry is relevant.

`LGATr` is built on geometric algebra representations. The idea is to unify scalars, vectors, and certain higher-order objects (bivectors, axial vectors, pseudoscalars) into so-called multivectors. Concretely, the input data is embedded into multivectors, then processed by the architecture while maintaining this form, and finally the relevant data is extracted from the output multivector. For the Lorentz group, a multivector is a 16-dimensional object of the form $(s, v^0, v^1, v^2, v^3, \dots )$ with a scalar $s$ and a vector $v^\mu$. We add a separate chain of scalar channels to formally allow for a smooth transition to non-equivariant transformers, which would only have scalar channels.
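
As a quick sanity check on the number 16, the components of a multivector can be counted grade by grade. The sketch below assumes the standard counting for a geometric algebra over a 4-dimensional space, where grade $k$ has $\binom{4}{k}$ components; the variable names are illustrative and not part of the `lgatr` package.

```python
from math import comb

# Grades of a multivector over 4-dimensional spacetime, in order of grade 0..4
grade_names = ["scalar", "vector", "bivector", "axial vector", "pseudoscalar"]

# Grade k has "4 choose k" independent components
grade_dims = [comb(4, k) for k in range(5)]

for name, dim in zip(grade_names, grade_dims):
    print(f"{name}: {dim}")

print(sum(grade_dims))  # 16
```

The scalar and vector grades account for the first five entries $(s, v^0, v^1, v^2, v^3)$; the remaining eleven belong to the higher grades.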

In [None]:
# install the lgatr package
%pip install lgatr

After importing the required modules, we construct an `LGATr` encoder module. The `attention` and `mlp` dicts hold the hyperparameters of the attention and MLP layers; see `SelfAttentionConfig` and `MLPConfig` in the docs for the available options. Both are passed to the `LGATr` module, together with the number of incoming, outgoing and hidden multivector and scalar channels and the number of `LGATr` blocks.

In [None]:
# construct LGATr module
from lgatr import LGATr

attention = dict(num_heads=2)
mlp = dict()
lgatr = LGATr(
    in_mv_channels=1,
    out_mv_channels=1,
    hidden_mv_channels=8,
    in_s_channels=0,
    out_s_channels=0,
    hidden_s_channels=16,
    attention=attention,
    mlp=mlp,
    num_blocks=2,
)

We now test `LGATr` on toy data, e.g. a batch of LHC events. We create particles with fixed mass and Gaussian noise as three-momentum. The resulting four-momenta have shape `p.shape = (128, 20, 1, 4)`: batch size 128, 20 particles per event, 1 four-momentum per particle, and 4 numbers for the four-momentum. More generally, `LGATr` operates on objects of shape `(batch_size, num_particles, num_channels, 16)`, while normal transformers operate on `(batch_size, num_particles, num_channels)`, without the extra 'multivector' dimension.

In [3]:
# generate toy data
import torch
p3 = torch.randn(128, 20, 1, 3)
mass = 1
E = (mass**2 + (p3**2).sum(dim=-1, keepdim=True))**0.5
p = torch.cat((E, p3), dim=-1)
print(p.shape) # torch.Size([128, 20, 1, 4])

torch.Size([128, 20, 1, 4])
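
To confirm that the toy particles really have the fixed mass, one can evaluate the mass-shell relation $m^2 = E^2 - |\vec p|^2$ on the generated four-momenta. The following is a minimal sketch that repeats the construction above; the seed and the tolerance are arbitrary choices made for this illustration.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only for reproducibility

# Same construction as the toy data above
p3 = torch.randn(128, 20, 1, 3)
mass = 1.0
E = (mass**2 + (p3**2).sum(dim=-1, keepdim=True)) ** 0.5
p = torch.cat((E, p3), dim=-1)

# Squared invariant mass: E^2 minus the squared three-momentum
m2 = p[..., 0] ** 2 - (p[..., 1:] ** 2).sum(dim=-1)
print(m2.shape)  # torch.Size([128, 20, 1])

# Compare against mass^2 with a loose tolerance for float32 rounding
on_shell = torch.allclose(m2, torch.full_like(m2, mass**2), atol=1e-3)
print(on_shell)
```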


We have to embed the four-momenta into multivectors to process them with `LGATr`. The `lgatr` package provides functions for this in `lgatr/interface`; the most commonly needed ones are `embed_vector`, `embed_scalar`, `extract_vector` and `extract_scalar`. For instance, `embed_vector` puts the four-momentum at indices 1 to 4 (zero-based) of the multivector and sets all other entries to zero.

In [4]:
from lgatr.interface import embed_vector, extract_scalar
multivector = embed_vector(p)
print(multivector.shape) # torch.Size([128, 20, 1, 16])

torch.Size([128, 20, 1, 16])
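
To make the layout concrete, the embedding described above can be mimicked in plain PyTorch. This is only a sketch of the described behaviour, not the package's actual implementation; the `*_sketch` functions are hypothetical stand-ins written for this illustration.

```python
import torch


def embed_vector_sketch(vector: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in: place a four-vector at indices 1..4, zeros elsewhere."""
    mv = torch.zeros(*vector.shape[:-1], 16, dtype=vector.dtype, device=vector.device)
    mv[..., 1:5] = vector
    return mv


def extract_vector_sketch(mv: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in: read the vector components back out."""
    return mv[..., 1:5]


def extract_scalar_sketch(mv: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in: read the scalar component, keeping the last axis."""
    return mv[..., 0:1]


p_demo = torch.randn(128, 20, 1, 4)
mv_demo = embed_vector_sketch(p_demo)
print(mv_demo.shape)  # torch.Size([128, 20, 1, 16])
print(extract_scalar_sketch(mv_demo).shape)  # torch.Size([128, 20, 1, 1])
```

Embedding followed by extraction returns the original four-momenta, and every entry outside indices 1 to 4 stays zero.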


We can now process the multivector with the `LGATr` architecture! It returns another multivector, from which we can extract the component that we want -- for instance the scalar component for a jet tagging or amplitude regression application. Depending on the task, we can also include or extract additional scalar channels.

In [5]:
output_mv, output_s = lgatr(multivectors=multivector, scalars=None)
out = extract_scalar(output_mv)
print(out.shape) # torch.Size([128, 20, 1, 1])

torch.Size([128, 20, 1, 1])
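
Since the extracted scalar is meant to be Lorentz-invariant, a natural sanity check is to boost the inputs and compare the outputs. The sketch below sets up such a check for any function of four-momenta; `boost_x`, `max_deviation` and `minkowski_norm2` are hypothetical helpers written for this illustration, and the demo uses the squared invariant mass as a stand-in for the model.

```python
import math

import torch


def boost_x(rapidity: float) -> torch.Tensor:
    """4x4 Lorentz boost along the x-axis acting on (E, px, py, pz)."""
    L = torch.eye(4, dtype=torch.float64)
    L[0, 0] = L[1, 1] = math.cosh(rapidity)
    L[0, 1] = L[1, 0] = math.sinh(rapidity)
    return L


def max_deviation(fn, p: torch.Tensor, L: torch.Tensor) -> float:
    """Largest absolute change of fn(p) when p is transformed by L."""
    p_boosted = torch.einsum("ij,...j->...i", L.to(p.dtype), p)
    return (fn(p_boosted) - fn(p)).abs().max().item()


def minkowski_norm2(p: torch.Tensor) -> torch.Tensor:
    """Stand-in invariant function: E^2 minus the squared three-momentum."""
    return p[..., 0:1] ** 2 - (p[..., 1:] ** 2).sum(dim=-1, keepdim=True)


# Toy four-momenta in double precision to keep rounding errors small
p3_demo = torch.randn(128, 20, 1, 3, dtype=torch.float64)
E_demo = (1.0 + (p3_demo**2).sum(dim=-1, keepdim=True)) ** 0.5
p_demo = torch.cat((E_demo, p3_demo), dim=-1)

deviation = max_deviation(minkowski_norm2, p_demo, boost_x(0.5))
print(deviation < 1e-9)
```

To apply the same check to the model, pass a function such as `lambda q: extract_scalar(lgatr(multivectors=embed_vector(q), scalars=None)[0])` with `float32` inputs. The deviation should then be small but nonzero because of floating-point rounding; how small depends on the boost and the precision.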


That's it! You're now ready to build your own `LGATr` model.