# `torch` vs `sytorch` 

This tutorial compares the training API of `torch` with the repair API of
`sytorch`, step by step. We use a simple pointwise repair specification:
`N(x) == x**2` for the input point `x = 2`.

In [1]:
import warnings; warnings.filterwarnings("ignore")

## Modules
#### `torch`: define a module

In [2]:
import torch
torch_model = torch.nn.Sequential(
    torch.nn.Linear(1,3),
    torch.nn.ReLU(),
    torch.nn.Linear(3, 1)
)
torch_model

Sequential(
  (0): Linear(in_features=1, out_features=3, bias=True)
  (1): ReLU()
  (2): Linear(in_features=3, out_features=1, bias=True)
)
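To make the structure of this small network concrete, the following sketch lists the shape of every learnable tensor. It rebuilds the same architecture under the local name `model` (a name chosen here for illustration) and uses only standard `torch.nn.Module` methods.

```python
import torch

# Same architecture as `torch_model` above: 1 -> 3 -> ReLU -> 1.
model = torch.nn.Sequential(
    torch.nn.Linear(1, 3),
    torch.nn.ReLU(),
    torch.nn.Linear(3, 1),
)

# Map each parameter's qualified name to its shape.
shapes = {name: tuple(p.shape) for name, p in model.named_parameters()}
for name, shape in shapes.items():
    print(name, shape)

# Total number of scalar parameters: 3 + 3 (first layer) + 3 + 1 (last layer).
num_params = sum(p.numel() for p in model.parameters())
print(num_params)
```

`torch.nn.Linear` stores its weight as `(out_features, in_features)`, which is why the last layer's weight has shape `(1, 3)`, the same shape as the gradient printed later in this tutorial.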

#### `sytorch`: define a module

In [3]:
# sytorch: define a module. `sytorch` modules are subclasses of `torch.nn.Module`
import sytorch as st
sytorch_model = st.nn.Sequential(
    st.nn.Linear(1,3),
    st.nn.ReLU(),
    st.nn.Linear(3, 1)
)

# Alternatively, convert an existing torch module to a sytorch module.
sytorch_model = st.nn.from_torch(torch_model)
sytorch_model

Sequential(
  (0): Linear(in_features=1, out_features=3, bias=True)
  (1): ReLU()
  (2): Linear(in_features=3, out_features=1, bias=True)
)

## Optimizer
#### `torch`: define an optimizer and attach it to a model

In [4]:
# torch: define an SGD optimizer for parameters of `torch_model`.
torch_optimizer = torch.optim.SGD(torch_model.parameters(), 1e-4)

# torch: fine-grained control of trainable parameters (here, only the last layer).
torch_model.requires_grad_(False)
torch_model[-1].requires_grad_()

Linear(in_features=3, out_features=1, bias=True)
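The effect of the two `requires_grad_` calls can be checked directly. The sketch below rebuilds the network under the illustrative name `model`, freezes everything except the last layer, and inspects which parameters receive a gradient after a backward pass.

```python
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(1, 3),
    torch.nn.ReLU(),
    torch.nn.Linear(3, 1),
)

# Freeze every parameter, then re-enable gradients for the last layer only.
model.requires_grad_(False)
model[-1].requires_grad_()

trainable = {name: p.requires_grad for name, p in model.named_parameters()}
print(trainable)

# After a backward pass, frozen parameters still have no gradient.
x = torch.as_tensor([[2.0]])
model(x).backward()
first_layer_grad = model[0].weight.grad   # stays None: the layer is frozen
last_layer_grad = model[-1].weight.grad   # a (1, 3) tensor
print(first_layer_grad, last_layer_grad.shape)
```

Because `torch.optim.SGD.step()` skips parameters whose `.grad` is `None`, an optimizer constructed over all parameters will still only update the last layer here.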

#### `sytorch`: define an optimizer and attach it to a model

In [5]:
# sytorch: define a Gurobi optimizer
sytorch_optimizer = st.GurobiSolver()

# sytorch: Attach the `sytorch_model` to the Gurobi optimizer.
sytorch_model.to(sytorch_optimizer)

# sytorch: fine-grained control of editable parameters (here, only the last layer).
sytorch_model[-1].requires_symbolic_()

Restricted license - for non-production use only - expires 2024-10-28
Set parameter Crossover to value 0
Set parameter Method to value 2
Set parameter Threads to value 8
Set parameter Presolve to value 1


Linear(in_features=3, out_features=1, bias=True)

## Execution modes

In [6]:
x = torch.as_tensor([[2.]])

# torch: toggle training mode, which affects the behavior of layers such as Dropout and BatchNorm.
torch_model.train()

# torch: zero gradients
torch_model.zero_grad()

# torch: forward execution
y = torch_model(x)

# torch: backward execution, compute the gradient of the output `y` with respect to
#        all the parameters that require gradients (here, the last layer).
y.backward()

# torch: print the gradients
torch_model[-1].weight.grad

tensor([[0.7090, 0.0000, 0.6592]])
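The printed gradient has a direct interpretation: since `y = h @ W.T + b`, where `h` is the output of the ReLU layer, the gradient of `y` with respect to the last layer's weight `W` is `h` itself. A zero entry therefore corresponds to a hidden unit whose ReLU is inactive at `x`. The sketch below checks this on a freshly initialized network; `model` and `hidden` are illustrative names, and the random weights differ from those in the run above.

```python
import torch

torch.manual_seed(0)
model = torch.nn.Sequential(
    torch.nn.Linear(1, 3),
    torch.nn.ReLU(),
    torch.nn.Linear(3, 1),
)
x = torch.as_tensor([[2.0]])

model.zero_grad()
y = model(x)
y.backward()  # allowed without arguments because `y` has exactly one element

# Hidden activations after the ReLU, computed without tracking gradients.
with torch.no_grad():
    hidden = model[1](model[0](x))  # shape (1, 3)

print(model[-1].weight.grad)  # d y / d W
print(hidden)                 # the same values
print(model[-1].bias.grad)    # d y / d b, always 1
```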

In [7]:
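# torch: `torch.no_grad()` disables gradient computation. The code below is left
#        commented out because `y.backward()` fails for an output computed under `no_grad`.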
# with torch.no_grad():
#     torch_model.train()
#     torch_model.zero_grad()
#     y = torch_model(x)
#     y.backward()
#     torch_model[-1].weight.grad
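The behavior described in the commented-out cell can be observed safely by catching the exception. This sketch uses a fresh network named `model` (an illustrative name) and records whether `backward()` failed.

```python
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(1, 3),
    torch.nn.ReLU(),
    torch.nn.Linear(3, 1),
)
x = torch.as_tensor([[2.0]])

# Inside `no_grad`, the forward pass does not build an autograd graph.
with torch.no_grad():
    y = model(x)

print(y.requires_grad, y.grad_fn)

# Without a graph there is nothing to backpropagate through.
try:
    y.backward()
    backward_failed = False
except RuntimeError as err:
    backward_failed = True
    print("backward() failed:", err)
```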

#### `sytorch`: `repair` mode

In [8]:
# Note: `x` is redefined here as a random point; the cells below use this value, not x = 2.
x = torch.randn(1,1)

# sytorch: `.repair()` enables the symbolic execution mode.
sytorch_model.repair()

sy = sytorch_model(x)

print(sy)
sytorch_optimizer.print()


[[<gurobi.Var C5>]]
Minimize
  0.0
Subject To
  R0: C0 = 0
  R1: 0.9498160481452942 C1 + C4 + -1.0 C5 = 0
Bounds
  C0 free
  C1 free
  C2 free
  C3 free
  C4 free
  C5 free
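The constraints printed above are linear, which is what lets a solver such as Gurobi handle them. A plausible explanation, sketched below in plain `torch`, is that with the first layer fixed, the network output at a given `x` is an affine function of the last layer's weight and bias. How `sytorch` maps the variables `C0` to `C5` onto specific parameters is not shown in this tutorial, so the sketch does not attempt to reproduce that encoding; all names in it are illustrative.

```python
import torch

torch.manual_seed(0)
model = torch.nn.Sequential(
    torch.nn.Linear(1, 3),
    torch.nn.ReLU(),
    torch.nn.Linear(3, 1),
)
x = torch.randn(1, 1)

with torch.no_grad():
    # With the first layer fixed, the hidden activations are constants.
    h = model[1](model[0](x))              # shape (1, 3)
    W, b = model[2].weight, model[2].bias  # shapes (1, 3) and (1,)

    # The output is a weighted sum of constants plus a bias: affine in (W, b).
    manual = (h * W).sum(dim=1, keepdim=True) + b
    direct = model(x)

    # Affinity check: changing (W, b) by (dW, db) shifts the output by h @ dW.T + db.
    dW, db = torch.randn_like(W), torch.randn_like(b)
    shifted = torch.nn.functional.linear(h, W + dW, b + db)
    predicted = direct + h @ dW.T + db

print(manual, direct)
print(shifted, predicted)
```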


In [9]:
# sytorch: `.repair(False)` disables the symbolic execution mode.
sytorch_model.repair(False)
print(sytorch_model(x))

tensor([[-0.7021]], grad_fn=<AddmmBackward0>)


In [10]:
# sytorch: `st.no_symbolic()` creates a context that disables the symbolic execution mode.
with st.no_symbolic():
    print(sytorch_model(x))

tensor([[-0.7021]], grad_fn=<AddmmBackward0>)


## Optimization
Pointwise specification: $N(x) = x^2$

In [11]:
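# torch: clear the gradients left over from the earlier `y.backward()` call,
#        then compute the MSE loss against the target x**2 and backpropagate.
torch_model.zero_grad()
criterion = torch.nn.MSELoss()
y = torch_model(x)
loss = criterion(y, x**2)
loss.backward()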
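For a single point, the MSE loss reduces to `(y - x**2)**2`, so its gradient with respect to the last layer's weight is `2 * (y - x**2) * h`, where `h` is the hidden activation vector. The sketch below verifies this by hand on a fresh network; `model`, `target`, `manual_loss`, and `manual_grad` are illustrative names.

```python
import torch

torch.manual_seed(0)
model = torch.nn.Sequential(
    torch.nn.Linear(1, 3),
    torch.nn.ReLU(),
    torch.nn.Linear(3, 1),
)
x = torch.as_tensor([[2.0]])
target = x ** 2

model.zero_grad()
y = model(x)
loss = torch.nn.MSELoss()(y, target)
loss.backward()

with torch.no_grad():
    # Loss and last-layer gradients recomputed from the chain rule.
    manual_loss = ((y - target) ** 2).mean()
    h = model[1](model[0](x))             # hidden activations, shape (1, 3)
    manual_grad = 2 * (y - target) * h    # d loss / d W, shape (1, 3)
    manual_bias_grad = (2 * (y - target)).reshape(1)

print(loss.item(), manual_loss.item())
print(model[-1].weight.grad, manual_grad)
```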

In [12]:
# sytorch: add a constraint: N(x) >= 0
sytorch_optimizer.add_constraints(sy >= 0)

# sytorch: add the optimization objective: minimize N(x) - x**2
sytorch_optimizer.minimize(sy - x**2)

## Model update

In [13]:
# torch: apply one SGD step using the gradients computed above.
torch_optimizer.step()
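With no momentum or weight decay, one call to `step()` applies the plain SGD rule `w <- w - lr * grad`. The sketch below checks this for the last layer's weight, using the same learning rate as above on a fresh network; the names `model`, `optimizer`, `before`, and `expected` are illustrative.

```python
import torch

torch.manual_seed(0)
model = torch.nn.Sequential(
    torch.nn.Linear(1, 3),
    torch.nn.ReLU(),
    torch.nn.Linear(3, 1),
)
lr = 1e-4
optimizer = torch.optim.SGD(model.parameters(), lr)

x = torch.as_tensor([[2.0]])
optimizer.zero_grad()
loss = torch.nn.MSELoss()(model(x), x ** 2)
loss.backward()

# Snapshot the weight and its gradient before the update.
before = model[-1].weight.detach().clone()
grad = model[-1].weight.grad.detach().clone()

optimizer.step()

after = model[-1].weight.detach()
expected = before - lr * grad  # plain SGD update rule
print(after, expected)
```

With a learning rate this small, a single step moves the parameters only slightly, which is why the output of `torch_model` barely changes after one update.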

In [14]:
# sytorch: solve the optimization problem, then update the model with the solution.
sytorch_optimizer.solve()
sytorch_model.update_()

Gurobi Optimizer version 10.0.1 build v10.0.1rc0 (linux64)

CPU model: Intel(R) Core(TM) i7-1065G7 CPU @ 1.30GHz, instruction set [SSE2|AVX|AVX2|AVX512]
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads

Optimize a model with 4 rows, 7 columns and 7 nonzeros
Model fingerprint: 0x1f5f0100
Coefficient statistics:
  Matrix range     [9e-01, 1e+00]
  Objective range  [1e+00, 1e+00]
  Bounds range     [0e+00, 0e+00]
  RHS range        [2e-05, 2e-05]
Presolve removed 4 rows and 7 columns
Presolve time: 0.04s
Presolve: All rows and columns removed

Barrier solved model in 0 iterations and 0.04 seconds (0.00 work units)
Optimal objective -2.33790561e-05


Sequential(
  (0): Linear(in_features=1, out_features=3, bias=True)
  (1): ReLU()
  (2): Linear(in_features=3, out_features=1, bias=True)
)

## Evaluate Optimization Result

In [15]:
# torch: after one step of SGD
with torch.no_grad():
    print(f"torch_model(x) == {torch_model(x).item():.2f} < 0")
    print(f"torch_model(x) - x**2 == {(torch_model(x) - x**2).item():.2f}")

torch_model(x) == -0.70 < 0
torch_model(x) - x**2 == -0.70


In [16]:
# sytorch: after solving and updating the model
with torch.no_grad(), st.no_symbolic():
    print(f"sytorch_model(x) == {sytorch_model(x).item():.2f} >= 0")
    print(f"sytorch_model(x) - x**2 == {(sytorch_model(x) - x**2).item():.2f}")

sytorch_model(x) == 0.00 >= 0
sytorch_model(x) - x**2 == -0.00
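These numbers are consistent with a simple reading of the problem posed to the solver: minimize `s - c` subject to `s >= 0`, where `s` stands for `N(x)` and `c = x**2` is a constant. The minimum is reached at `s = 0` with objective `-c`. The sketch below works this out in plain Python, assuming the last layer is expressive enough to reach any output value at `x` (an assumption, not something the tutorial states), and takes `c` from the `Optimal objective` line of the Gurobi log.

```python
import math

# x**2, read off the Gurobi log above as the negated optimal objective.
c = 2.33790561e-05

# minimize s - c subject to s >= 0: the objective increases with s,
# so the smallest feasible s is optimal.
s_star = 0.0
objective = s_star - c

# Magnitude of the random input point implied by c = x**2.
abs_x = math.sqrt(c)

print(f"N(x) = {s_star:.2f}, objective = {objective:.8e}, |x| = {abs_x:.4f}")
```

This also explains why `sytorch_model(x) - x**2` prints as `-0.00`: the difference is a small negative number, not zero. Note that this objective pushes `N(x)` down to the constraint boundary; it does not by itself enforce `N(x) == x**2`.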


## Array/Tensor APIs

In [17]:
# torch:
a = torch.randn(3,4)
b = torch.randn(4,2)
(a @ b + 1).argmax(-1)

tensor([1, 0, 1])
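The shapes involved in this expression can be traced step by step. The sketch below repeats the computation with a fixed seed (so the values differ from the output above) and uses the illustrative names `scores` and `winners`.

```python
import torch

torch.manual_seed(0)
a = torch.randn(3, 4)
b = torch.randn(4, 2)

# (3, 4) @ (4, 2) -> (3, 2); the scalar 1 is broadcast to every entry.
scores = a @ b + 1

# argmax over the last dimension picks, for each row, the index of the larger column.
winners = scores.argmax(-1)

print(scores.shape, winners.shape)
print(winners)
```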

In [18]:
# sytorch: create a NumPy array of real-valued symbolic variables
a = sytorch_optimizer.reals((3,4))
b = torch.randn(4,2)
(a @ b + 1).argmax(-1)

ArgMaxEncoder([[<gurobi.LinExpr: 1.0 + -0.6095733046531677 C7 + -1.1738548278808594 C8 + -1.1657414436340332 C9 + -1.9638018608093262 C10>,
                <gurobi.LinExpr: 1.0 + 0.6273224949836731 C7 + -1.8662066459655762 C8 + -1.3796387910842896 C9 + 1.6790162324905396 C10>],
               [<gurobi.LinExpr: 1.0 + -0.6095733046531677 C11 + -1.1738548278808594 C12 + -1.1657414436340332 C13 + -1.9638018608093262 C14>,
                <gurobi.LinExpr: 1.0 + 0.6273224949836731 C11 + -1.8662066459655762 C12 + -1.3796387910842896 C13 + 1.6790162324905396 C14>],
               [<gurobi.LinExpr: 1.0 + -0.6095733046531677 C15 + -1.1738548278808594 C16 + -1.1657414436340332 C17 + -1.9638018608093262 C18>,
                <gurobi.LinExpr: 1.0 + 0.6273224949836731 C15 + -1.8662066459655762 C16 + -1.3796387910842896 C17 + 1.6790162324905396 C18>]],
              dtype=object)