# How To Transpile a Pytorch Model

In this tutorial we learn the following topics:

* How to create a Pytorch model from scratch
* How to train the model
* How to export this model to ONNX
* How to transpile the model to Cairo using Giza CLI!
* How to run inference on the transpiled model

## Creating a Pytorch Model

In this section we will create a simple Pytorch model using the MNIST dataset. The MNIST dataset is a dataset of handwritten digits. The dataset consists of 60,000 training images and 10,000 test images. The images are grayscale, 28x28 pixels, and centered to reduce preprocessing and get started quicker. You can read more about the dataset [here](https://en.wikipedia.org/wiki/MNIST_database).

The first step is to install the libraries that we are going to use:

```bash
pip install -r requirements.txt
```

Or:

```bash
pip install giza-cli==0.3.0 onnx==1.14.1 torch==2.1.0 torchvision==0.16.0
```

We will use the libraries for the following purposes:

* `giza-cli` is used to transpile the model to Cairo
* `torch` is used to create the model and train it
* `onnx` is used to export the model to ONNX
* `torchvision` is used to load the MNIST dataset

Now we can import our dependencies and configure basic settings:


In [39]:
import torch
import torchvision
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

learning_rate = 0.01
momentum = 0.5
log_interval = 10

random_seed = 1
torch.backends.cudnn.enabled = False
torch.manual_seed(random_seed)

To download the dataset we will use the `torchvision` library to download the dataset and create a `DataLoader` object that we can use to iterate over the dataset.

Some actions that we need to perform on the dataset are:

* Resize the images to 14x14 pixels, as we will be using `Linear` layers and we need to reduce the number of parameters
* Flatten the images to a vector of 196 elements

In [40]:
train_loader = torch.utils.data.DataLoader(
  torchvision.datasets.MNIST('/tmp', train=True, download=True,
                             transform=torchvision.transforms.Compose([
                               torchvision.transforms.ToTensor(),
                               torchvision.transforms.Resize((14,14)),
                                torchvision.transforms.Lambda(lambda x: torch.flatten(x)),
                             ])), shuffle=True)

test_loader = torch.utils.data.DataLoader(
  torchvision.datasets.MNIST('/tmp', train=False, download=True,
                             transform=torchvision.transforms.Compose([
                               torchvision.transforms.ToTensor(),
                               torchvision.transforms.Resize((14,14)),
                                torchvision.transforms.Lambda(lambda x: torch.flatten(x)),
                             ])), shuffle=True)

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz


100.0%

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to /tmp/MNIST/raw/train-images-idx3-ubyte.gz



100.0%

Extracting /tmp/MNIST/raw/train-images-idx3-ubyte.gz to /tmp/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to /tmp/MNIST/raw/train-labels-idx1-ubyte.gz
Extracting /tmp/MNIST/raw/train-labels-idx1-ubyte.gz to /tmp/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz



100.0%
100.0%


Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to /tmp/MNIST/raw/t10k-images-idx3-ubyte.gz
Extracting /tmp/MNIST/raw/t10k-images-idx3-ubyte.gz to /tmp/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to /tmp/MNIST/raw/t10k-labels-idx1-ubyte.gz
Extracting /tmp/MNIST/raw/t10k-labels-idx1-ubyte.gz to /tmp/MNIST/raw



Now lets see an example of the data that we have just downloaded:

In [41]:
examples = enumerate(test_loader)
batch_idx, (example_data, example_targets) = next(examples)
print(f"example_data.shape: {example_data.shape}")
print(f"example_targets.shape: {example_targets.shape}")

example_data.shape: torch.Size([1, 196])
example_targets.shape: torch.Size([1])




## How To Train The Model

Now its time to train the model, for this we are going to define a basic neural network with 2 hidden layers and 1 output layer.

We will follow the usual way of training a model in Pytorch by creating a `torch.nn.Module` class and defining the `forward` method. The `forward` method is the method that will be called when we pass an input to the model.

In [47]:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(196, 10)
        self.fc2 = nn.Linear(10, 10)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)

Now lets instantiate the model and define the optimizer:

In [48]:
network = Net()
optimizer = optim.SGD(network.parameters(), lr=learning_rate,
                      momentum=momentum)

Now we are going to create a training loop that will train the model for the desired number of epochs. We will feed the data into the network and calculate the loss. Then we will use the loss to calculate the gradients and update the weights of the network.

In [49]:
train_losses = []
train_counter = []

def train(epoch):
  network.train()
  for batch_idx, (data, target) in enumerate(train_loader):
    optimizer.zero_grad()
    output = network(data)
    loss = F.nll_loss(output, target)
    loss.backward()
    optimizer.step()
    if batch_idx % log_interval == 0:
      print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
        epoch, batch_idx * len(data), len(train_loader.dataset),
        100. * batch_idx / len(train_loader), loss.item()))
      train_losses.append(loss.item())
      train_counter.append(
        (batch_idx*64) + ((epoch-1)*len(train_loader.dataset)))

Time to start training the model! We have choosen 10 epochs, but you can increase/decrease the number as you wish.

In [50]:
train(10)





Lets perform a simple prediction to see how the model performs:

In [63]:
network.eval()

with torch.no_grad():
    pred = network(example_data)
print(f"Prediction: {pred.argmax()}")
print(f"Real Value: {example_targets.item()}")


Prediction: 1
Real Value: 1


Now we have our trained model and we can export it to ONNX. ONNX is an open format built to represent machine learning models. ONNX defines a common set of operators - the building blocks of machine learning and deep learning models - and a common file format to enable AI developers to use models with a variety of frameworks, tools, runtimes, and compilers. If you want to know more about it you can read the [documentation](https://onnx.ai/).

## How To Export The ONNX Model Using PyTorch

1. Ensure that your model is in evaluation mode. This can be done by calling `model.eval()`.
2. Generate a dummy input that matches the input size that your model expects. This can be done using `torch.randn()`. In this case we just use the example data.
3. Call `torch.onnx.export()`, passing in your model, the dummy input, and the desired output file name.

The reason we export our PyTorch model to ONNX is to increase interoperability. ONNX is a platform-agnostic format for machine learning models, meaning it can be used with various machine learning and deep learning frameworks. This allows developers to train a model in one framework (in this case, PyTorch) and then use the model in another framework for inference, in our case we will use the model in **Cairo**.

In [65]:
torch.onnx.export(network, example_data, "mnist_pytorch.onnx")

Now our model is in the ONNX format, we can visually check the output using [Netron](https://github.com/lutzroeder/netron), it will allow us to check the final architecture and the operators used by the network.

![neural_network](img/mnist_pytorch_onnx.png)

## Transpile The Model Using Giza CLI!

We are now ready to transpile the model to Cairo using Giza CLI. Giza CLI is a command line tool that allows you to transpile ONNX models to Cairo. You can read more about in the [docs](https://cli.gizatech.xyz/).

The first step to start using Giza CLI is to create a user in the platform. You can do this by running the following command:

```console
❯ giza users create
Enter your username 😎: # YOUR USERNAME GOES HERE
Enter your password 🥷 : # YOUR PASSWORD GOES HERE
Confirm your password 👉🏻 : 
Enter your email 📧: # YOUR EMAIL GOES HERE
[giza][2023-10-12 12:04:06.072] Creating user in Giza ✅ 
[giza][2023-10-12 12:04:13.875] User created ✅. Check for a verification email 📧
```

You will be prompted to add tour username, password and email. Finally you will need to verify your email address by clicking on the link that you will receive in your inbox.

![email](img/email.png)

Once we click the link we will be redirected to a verification endpoint and we will see a message saying that our email has been verified. Now we are ready to start using Giza CLI!

Lets start by login into the platform:

```console
❯ giza users login 
Enter your username 😎: # YOUR USERNAME GOES HERE
Enter your password 🥷 : # YOUR PASSWORD GOES HERE
[giza][2023-10-12 12:09:51.843] Log into Giza
[giza][2023-10-12 12:09:52.622] Credentials written to: {HOME DIRECTORY}/.giza/.credentials.json
[giza][2023-10-12 12:09:52.624] Successfully logged into Giza ✅
```

We should be ready to start using Giza's capabilities, we can easily check by running the following command:

```console
❯ giza users me
[giza][2023-10-12 12:11:37.153] Retrieving information about me!
{
  "username": "YOUR USERNAME GOES HERE",
  "email": "YOUR EMAIL GOES HERE",
  "is_active": true
}
```

Now we are ready to transpile our model to Cairo! We want to help you jumpstart your journey into ZKML by helping you to create this amazing models, we asbtract you from the tedious process of instrospecting the model and getting the ifnromation needed to use it in Cairo, thats why we build the transpilation process, to ease this and improve the iteration time from creating a model to using it in Cairo! 

Lets check how we can do it:

```console
❯ giza transpile mnist_pytorch.onnx --output-path mnist_cairo
[giza][2023-10-12 12:15:30.624] No model id provided, checking if model exists ✅ 
[giza][2023-10-12 12:15:30.625] Model name is: mnist_pytorch
[giza][2023-10-12 12:15:30.956] Model Created with id -> 1! ✅
[giza][2023-10-12 12:15:31.520] Sending model for transpilation ✅ 
[giza][2023-10-12 12:15:42.592] Transpilation recieved! ✅
[giza][2023-10-12 12:15:42.601] Transpilation saved at: mnist_cairo
```

To explain a bit what is happening here, we are calling the `giza transpile` command and passing the path to the ONNX model that we want to transpile. We are also passing the `--output-path` flag to specify the path where we want to save the transpiled model. If we don't pass this flag the model will be saved in the current directory under `cairo_model`.

We can see that we make reference to a `model` this is because in Giza we organize the transpilations under models and versions:

* A `model` is a collection of versions of the same model, we we are iterating over the model and improving it we can create different versions of the same model and keep track of the changes.
* A `version` is a reference to the transpiled model, each new transpilation will be referenced as a new version of the model.

We handle the creation for you, and if the model has the same name we will re-use the model and create a new version under it. If you want to know more about the model and version concept you can read the [docs](https://cli.gizatech.xyz/).

Also, if you ever have a question about the commands that you can run you can always run `giza --help` or `giza COMMAND --help` to get more information about the command and the flags that you can use.

```console
❯ giza --help
```

In [33]:
from math import floor


for val in example_data[0]:
    fp_val = floor(float(val)*(2**16))
    sign = "false" if fp_val > 0 else "true"
    print(f"FixedTrait::<FP16x16>::new({abs(fp_val)}, {sign})")

FixedTrait::<FP16x16>::new(27802, true)
FixedTrait::<FP16x16>::new(27802, true)
FixedTrait::<FP16x16>::new(27802, true)
FixedTrait::<FP16x16>::new(27802, true)
FixedTrait::<FP16x16>::new(27802, true)
FixedTrait::<FP16x16>::new(27802, true)
FixedTrait::<FP16x16>::new(27802, true)
FixedTrait::<FP16x16>::new(27802, true)
FixedTrait::<FP16x16>::new(27802, true)
FixedTrait::<FP16x16>::new(27802, true)
FixedTrait::<FP16x16>::new(27802, true)
FixedTrait::<FP16x16>::new(27802, true)
FixedTrait::<FP16x16>::new(27802, true)
FixedTrait::<FP16x16>::new(27802, true)
FixedTrait::<FP16x16>::new(27802, true)
FixedTrait::<FP16x16>::new(27802, true)
FixedTrait::<FP16x16>::new(27802, true)
FixedTrait::<FP16x16>::new(27802, true)
FixedTrait::<FP16x16>::new(27802, true)
FixedTrait::<FP16x16>::new(27802, true)
FixedTrait::<FP16x16>::new(27802, true)
FixedTrait::<FP16x16>::new(27802, true)
FixedTrait::<FP16x16>::new(27802, true)
FixedTrait::<FP16x16>::new(27802, true)
FixedTrait::<FP16x16>::new(27802, true)


In [36]:
result = network(example_data[0])

  return F.log_softmax(x)


In [38]:
result

tensor([-1.9416e+01, -1.7760e+01, -1.0474e+01, -1.3184e+01, -1.3526e+01,
        -1.1725e+01, -2.0627e+01, -4.1484e-05, -1.6292e+01, -1.3251e+01],
       grad_fn=<LogSoftmaxBackward0>)