# A Quick Tutorial for Implicit Deep Learning

This tutorial introduces the **Implicit Deep Learning** (IDL) framework using the `idl` package in 3 main parts:

1. **A Simple Example**
    - Implicit Model
    - Implicit RNN
    - State-driven Implicit Model (SIM)
2. **Custom Activation for Implicit model**
3. **Implicit model as a layer**

## 1. A Simple Example

This section shows how to use the package. Each example is a short, self-contained script that builds a model, fits it to random data, and runs it on new inputs.

Before proceeding, please ensure you have installed the required packages by following the [installation](https://github.com/HoangP8/Implicit-Deep-Learning?tab=readme-ov-file#installation) instructions.

#### 1a. `ImplicitModel`

`ImplicitModel` is the most fundamental implicit model. For details on its parameters and the underlying intuition, please refer to the [documentation](link).

In this example, we demonstrate how to use the model for a simple regression task.

In [13]:
import torch
import torch.nn as nn
import torch.optim as optim
from idl import ImplicitModel

torch.manual_seed(0)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Random input and output data
x = torch.randn(5, 64).to(device)  # (batch_size=5, input_dim=64)
y = torch.randn(5, 10).to(device)  # (batch_size=5, output_dim=10)

# Initialize the model
model = ImplicitModel(input_dim=64,
                      output_dim=10, 
                      hidden_dim=128)
model.to(device)

# Define MSE loss function and optimizer
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

# Training loop
num_epochs = 10
for epoch in range(num_epochs):
    optimizer.zero_grad() 
    output = model(x)  # Forward pass
    loss = criterion(output, y)  # Compute MSE loss
    loss.backward() 
    optimizer.step()
    
    print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}")
        
# Inference step
model.eval()  
with torch.no_grad():  
    x_test = torch.randn(1, 64).to(device)
    y_pred = model(x_test)  
    print(f"Inference result: \n {y_pred}")

Epoch [1/10], Loss: 1.5919
Epoch [2/10], Loss: 1.0334
Epoch [3/10], Loss: 0.4830
Epoch [4/10], Loss: 0.1951
Epoch [5/10], Loss: 0.1479
Epoch [6/10], Loss: 0.1692
Epoch [7/10], Loss: 0.1399
Epoch [8/10], Loss: 0.0868
Epoch [9/10], Loss: 0.0465
Epoch [10/10], Loss: 0.0318
Inference result: 
 tensor([[-0.0525,  0.5056, -0.1804, -0.2234, -0.2438, -0.4717, -0.2398, -0.4559,
          0.0045, -0.1295]], device='cuda:0')


`ImplicitModel` has its forward and backward passes **fully packaged**, so training and inference work **as normal** with no additional modifications. You only need to define the model with the appropriate `input_dim`, `output_dim`, and `hidden_dim`, then use it like any other PyTorch model.
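
For intuition about what a packaged forward pass has to solve, here is a minimal sketch in plain Python. It assumes the standard implicit-model formulation `x = phi(A x + B u)`, `y = C x + D u` with `phi = ReLU`. The names `A`, `B`, `C`, `D`, and `forward` are illustrative and not part of the `idl` API. The naive fixed-point iteration below is only one way to solve the equation, and the package's own solver may differ.

```python
# Sketch only: fixed-point forward pass for x = relu(A x + B u), y = C x + D u.
# All names here are hypothetical; nothing is imported from the idl package.

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def relu(v):
    return [max(0.0, x) for x in v]

def inf_norm(M):
    """Infinity norm of a matrix: the largest absolute row sum."""
    return max(sum(abs(m) for m in row) for row in M)

def forward(A, B, C, D, u, tol=1e-10, max_iter=500):
    # Assumption of this sketch: ||A||_inf < 1. ReLU is 1-Lipschitz, so the
    # update is then a contraction and the iteration converges to a unique x.
    assert inf_norm(A) < 1.0, "sketch requires ||A||_inf < 1"
    Bu = matvec(B, u)
    x = [0.0] * len(A)
    for _ in range(max_iter):
        x_new = relu([a + b for a, b in zip(matvec(A, x), Bu)])
        converged = max(abs(p - q) for p, q in zip(x_new, x)) < tol
        x = x_new
        if converged:
            break
    y = [c + d for c, d in zip(matvec(C, x), matvec(D, u))]
    return x, y

A = [[0.2, -0.3], [0.1, 0.4]]  # largest absolute row sum is 0.5
B = [[1.0], [0.5]]
C = [[1.0, -1.0]]
D = [[0.0]]
x, y = forward(A, B, C, D, u=[1.0])
print("hidden state:", x)
print("output:", y)
```

The hidden state is defined by an equation rather than by a fixed stack of layers, which is why a norm condition on `A` matters: without it, the equation may have no solution or several.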

#### 1b. `ImplicitRNN`

`ImplicitRNN` uses an implicit layer to define recurrence within a standard RNN framework. For more details, please refer to the [documentation](link).

Its usage is very similar to `ImplicitModel`. Below, we provide an example where the model learns to predict a single output from an input sequence in a simple regression task.

In [14]:
import torch
import torch.nn as nn
import torch.optim as optim
from idl import ImplicitRNN

torch.manual_seed(0)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Random input and output sequence 
x = torch.randn(50, 20, 1).to(device)  # (batch_size=50, seq_len=20, input_dim=1)
y = torch.randn(50, 1).to(device)  # (batch_size=50, output_dim=1)

# Initialize the ImplicitRNN model
model = ImplicitRNN(input_dim=1, output_dim=1, hidden_dim=10, implicit_hidden_dim=10)
model.to(device)

# Define MSE loss function and optimizer
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

# Training loop
num_epochs = 10
for epoch in range(num_epochs):
    optimizer.zero_grad()
    output = model(x)  # Forward pass
    loss = criterion(output, y)  # Compute MSE loss
    loss.backward()
    optimizer.step()

    print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}")

# Inference step
model.eval()
with torch.no_grad():
    x_test = torch.randn(1, 20, 1).to(device)
    y_pred = model(x_test)
    print(f"Inference result: {y_pred}")

Epoch [1/10], Loss: 0.8179
Epoch [2/10], Loss: 0.8017
Epoch [3/10], Loss: 0.7861
Epoch [4/10], Loss: 0.7708
Epoch [5/10], Loss: 0.7557
Epoch [6/10], Loss: 0.7392
Epoch [7/10], Loss: 0.7199
Epoch [8/10], Loss: 0.6989
Epoch [9/10], Loss: 0.6883
Epoch [10/10], Loss: 0.6879
Inference result: tensor([[-0.5798]], device='cuda:0')


#### 1c. `SIM`

In [15]:
from torch.utils.data import Dataset

# Custom Dataset to mimic torchvision.datasets (with .data and .targets)
class TorchvisionDataset(Dataset):
    def __init__(self, x, y):
        self.data = x
        self.targets = y

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return self.data[idx], self.targets[idx]

In [16]:
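import torch
import torch.nn as nn  # needed for nn.CrossEntropyLoss below
import torch.optim as optim  # needed for optim.Adam below
from torch.utils.data import DataLoader
from idl.sim import SIM
from idl.sim.solvers import CVXSolver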

torch.manual_seed(0)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
batch_size = 5

# Random training data
x_train = torch.randn(100, 64).to(device) # (no_samples=100, input_dim=64)
y_train = torch.randint(0, 10, (100,)).to(device) # 100 random class labels (0-9)
train_loader = DataLoader(TorchvisionDataset(x_train, y_train), batch_size=batch_size, shuffle=True)

# Random test data
x_test = torch.randn(20, 64).to(device) # (no_samples=20, input_dim=64)
y_test = torch.randint(0, 10, (20,)).to(device) # 20 random class labels (0-9)
test_loader = DataLoader(TorchvisionDataset(x_test, y_test), batch_size=batch_size, shuffle=False)

# Explicit model 
explicit_model = torch.nn.Sequential(
    torch.nn.Linear(64, 128),
    torch.nn.ReLU(),
    torch.nn.Linear(128, 10)
).to(device)

# SIM model
sim = SIM(device=device)

# Define CrossEntropy loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(explicit_model.parameters(), lr=0.01)

# Train the SIM model
sim.train(solver=CVXSolver(), model=explicit_model, dataloader=train_loader)

# After training, evaluate the SIM model on test data
sim.evaluate(test_loader)

  0%|          | 0/4 [00:00<?, ?it/s]

100%|██████████| 4/4 [00:19<00:00,  4.90s/it]
100%|██████████| 1/1 [00:02<00:00,  2.61s/it]

Test accuruacy: 0.2





0.2

## 2. Custom Activation for Implicit model
The default activation of the Implicit model is ReLU. To use a different activation, subclass `ImplicitFunctionInf` and override its `phi` (the activation) and `dphi` (the derivative of the activation) static methods. The example below uses the SiLU activation.

In [17]:
# ImplicitFunctionInf: implicit function that ensures well-posedness of the Implicit model
from idl import ImplicitModel, ImplicitFunctionInf
import torch

class ImplicitFunctionInfSiLU(ImplicitFunctionInf):
    """
    An implicit function that uses the SiLU nonlinearity.
    """
    
    @staticmethod
    def phi(X):
        return X * torch.sigmoid(X)

    @staticmethod
    def dphi(X):
        grad = X.clone().detach()
        sigmoid = torch.sigmoid(grad)
        return sigmoid * (1 + grad * (1 - sigmoid))


# Initialize the model
model = ImplicitModel(input_dim=64,
                      output_dim=10, 
                      hidden_dim=128,
                      f=ImplicitFunctionInfSiLU)

# From here, train the model as usual
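
When supplying a hand-written `dphi`, it can help to check it against a numerical derivative before training. The sketch below does this for the SiLU formula in plain Python, without `idl` or PyTorch. The helper names `silu`, `dsilu`, and `central_difference` are illustrative and not part of the package.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def silu(z):
    # Same formula as phi above: z * sigmoid(z)
    return z * sigmoid(z)

def dsilu(z):
    # Same formula as dphi above: sigmoid(z) * (1 + z * (1 - sigmoid(z)))
    s = sigmoid(z)
    return s * (1.0 + z * (1.0 - s))

def central_difference(f, z, h=1e-5):
    # Second-order accurate numerical derivative of f at z
    return (f(z + h) - f(z - h)) / (2.0 * h)

points = [-3.0, -1.0, 0.0, 0.5, 2.0]
max_error = max(abs(dsilu(z) - central_difference(silu, z)) for z in points)
print(f"max |analytic - numeric| = {max_error:.2e}")
```

A large discrepancy here would point to a mistake in the derivative formula, which would otherwise surface only as poor or unstable training.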

## 3. Implicit model as a layer
`ImplicitModel` can be used as a layer inside a larger model and trained as part of the overall network. Training works as usual, as the example below shows.


In [18]:
import torch
import torch.nn as nn
import torch.optim as optim
from idl import ImplicitModel

torch.manual_seed(0)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Define a larger model that includes ImplicitModel as a layer
class MLPWithImplicit(nn.Module):
    def __init__(self, input_dim, hidden_dim, implicit_hidden_dim, output_dim):
        super().__init__()
        self.fc1 = nn.Linear(input_dim, hidden_dim)
        self.implicit_layer = ImplicitModel(input_dim=hidden_dim, output_dim=output_dim, hidden_dim=implicit_hidden_dim)
        self.activation = nn.ReLU()

    def forward(self, x):
        x = self.activation(self.fc1(x))
        x = self.implicit_layer(x)  # Pass through ImplicitModel
        return x


torch.manual_seed(42)

# Random input and output data
x = torch.randn(5, 64).to(device)  # (batch_size=5, input_dim=64)
y = torch.randn(5, 10).to(device)  # (batch_size=5, output_dim=10)

# Initialize the model
model = MLPWithImplicit(input_dim=64, hidden_dim=128, implicit_hidden_dim=64, output_dim=10)
model.to(device)

# Define MSE loss function and optimizer
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

# Training loop
num_epochs = 10
for epoch in range(num_epochs):
    optimizer.zero_grad() 
    output = model(x)  # Forward pass
    loss = criterion(output, y)  # Compute MSE loss
    loss.backward() 
    optimizer.step()
    
    print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}")
        
# Inference step
model.eval()  
with torch.no_grad():  
    x_test = torch.randn(1, 64).to(device)  
    y_pred = model(x_test)  
    print(f"Inference result: \n {y_pred}")

Epoch [1/10], Loss: 0.7000
Epoch [2/10], Loss: 0.3324
Epoch [3/10], Loss: 0.1701
Epoch [4/10], Loss: 0.0583
Epoch [5/10], Loss: 0.0516
Epoch [6/10], Loss: 0.0430
Epoch [7/10], Loss: 0.0322
Epoch [8/10], Loss: 0.0248
Epoch [9/10], Loss: 0.0216
Epoch [10/10], Loss: 0.0203
Inference result: 
 tensor([[-0.7536, -0.1237,  0.1082, -0.7727,  0.6030,  1.1026,  0.3494,  0.2714,
          0.3646,  0.4176]], device='cuda:0')
