# A "skeleton" model explaining hybrid QML in PennyLane

_This notebook explores the creation of a "skeleton" hybrid model in **PennyLane and PyTorch**_.

**By:** Jacob Cybulski ([website](https://jacobcybulski.com/))<br>
**Date:** September 26, 2025<br>
**Updates:** October 8, 2025<br>
**Aims:** To develop a small hybrid model consisting of dummy components, such as quantum circuits, functions and matrices acting on data.<br/>
**Prerequisites:** We assume familiarity with *QML* and *PennyLane* in *Python*<br>
**License:** 
This project is licensed under the [GPL-3.0](https://www.gnu.org/licenses/gpl-3.0.txt)<br>
**Changes:** All changes to this code must be listed at the bottom of this notebook

**Example:**<br>
>Let's say we want to create a differentiable and trainable function $f$, defined as a network of functions $(f2)$, quantum circuits $(c1, c2, c3)$ and matrices $(m1)$, acting on data $(d1, d2)$. For example, in the form:<br>
$$
\begin{align*}
f(d1, d2) = m1(c3(& && \textit{Join and transform}\\
                  &c1(d1), && \textit{Branch 1}\\
                  &c2(f2(d2)))) && \textit{Branch 2}
\end{align*}
$$
What kind of code glue would be required to achieve this?

## Libraries

In [1]:
import pennylane as qml
from pennylane import numpy as np
import torch
import torch.nn as nn

## Constants

In [2]:
### Data sizes
N_FEATURES_D1 = 3
N_FEATURES_D2 = 5

### Circuits I/O sizes
#   C1 and C2 each use as many wires as there are
#      data features on their input
#   C3 needs enough wires to take their combined output
N_WIRES_C1 = N_FEATURES_D1
N_WIRES_C2 = N_FEATURES_D2
N_WIRES_C3 = N_WIRES_C1 + N_WIRES_C2

### Devices for different circuits
dev_c1 = qml.device("default.qubit", wires=N_WIRES_C1)
dev_c2 = qml.device("default.qubit", wires=N_WIRES_C2)
dev_c3 = qml.device("default.qubit", wires=N_WIRES_C3)
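
The sizing rules in the comments above can be checked with plain shape bookkeeping before any circuit is built. The sketch below repeats the constants from this cell and follows a batch through $f(d1, d2) = m1(c3(c1(d1), c2(f2(d2))))$. `trace_shapes` is a hypothetical helper introduced for illustration, and the single output column of `m1` is the choice made later in this notebook.

```python
# Sizes repeated from the constants cell
N_FEATURES_D1 = 3
N_FEATURES_D2 = 5
N_WIRES_C1 = N_FEATURES_D1
N_WIRES_C2 = N_FEATURES_D2
N_WIRES_C3 = N_WIRES_C1 + N_WIRES_C2

def trace_shapes(batch_size):
    """Hypothetical helper: (rows, columns) after each stage of the model."""
    shapes = {}
    shapes["d1"] = (batch_size, N_FEATURES_D1)
    shapes["d2"] = (batch_size, N_FEATURES_D2)
    shapes["c1"] = (batch_size, N_WIRES_C1)  # one expectation value per wire
    shapes["f2"] = (batch_size, N_WIRES_C2)  # f2 must emit one value per c2 wire
    shapes["c2"] = (batch_size, N_WIRES_C2)
    # Joining the two branches column-wise gives the input of c3
    shapes["c3_in"] = (batch_size, shapes["c1"][1] + shapes["c2"][1])
    assert shapes["c3_in"][1] == N_WIRES_C3
    shapes["c3"] = (batch_size, N_WIRES_C3)
    shapes["m1"] = (batch_size, 1)           # single output column
    return shapes

for name, shape in trace_shapes(batch_size=10).items():
    print(f"{name:6s} {shape}")
```

If the branch sizes change, only the first two constants need editing; the assertion flags any mismatch between the joined branches and the width of `c3`.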

## Circuit definition
*The batch size is the number of examples passed to a circuit in one call.*

In [3]:
### c1: Acts on data d1
#   Processes N_WIRES_C1 features and outputs N_WIRES_C1 expectation values
@qml.qnode(dev_c1, interface="torch")
def C1(inputs, weights):
    qml.AngleEmbedding(inputs, wires=range(N_WIRES_C1))
    qml.BasicEntanglerLayers(weights, wires=range(N_WIRES_C1))
    return [qml.expval(qml.PauliZ(i)) for i in range(N_WIRES_C1)]

### c2: Acts on the output of f2(d2)
#   Same structure as c1
@qml.qnode(dev_c2, interface="torch")
def C2(inputs, weights):
    qml.AngleEmbedding(inputs, wires=range(N_WIRES_C2))
    qml.BasicEntanglerLayers(weights, wires=range(N_WIRES_C2))
    return [qml.expval(qml.PauliZ(i)) for i in range(N_WIRES_C2)]

### c3: Acts on the combined output of c1 and c2
#   Its input must have N_WIRES_C3 features
@qml.qnode(dev_c3, interface="torch")
def C3(inputs, weights):
    qml.AngleEmbedding(inputs, wires=range(N_WIRES_C3))
    qml.BasicEntanglerLayers(weights, wires=range(N_WIRES_C3))
    return [qml.expval(qml.PauliZ(i)) for i in range(N_WIRES_C3)]

## The entire hybrid model
*Here, instead of an explicit 2D matrix, m1 is a simple linear layer (nn.Linear). For a more complex matrix transform, we can pass the matrix at model initialisation and then apply it with torch.matmul() in the forward step.*

In [4]:
### Model of function f in Torch with PennyLane layers
class HybridF(nn.Module):
    def __init__(self):
        super().__init__()
        
        # f2: A small classical neural network (two linear layers with a ReLU)
        #     Input features must match d2 dimensions
        #     Output features must match c2 input
        self.f2 = nn.Sequential(
            nn.Linear(in_features=N_FEATURES_D2, out_features=2*N_FEATURES_D2),
            nn.ReLU(),
            nn.Linear(in_features=2*N_FEATURES_D2, out_features=N_WIRES_C2)
        )
        
        # m1: Final linear matrix transformation
        #     Input features must match c3 output
        self.m1 = nn.Linear(in_features=N_WIRES_C3, out_features=1)

        # c1, c2, c3: All quantum components are Torch layers
        #     Weight shapes are (n_layers, n_wires): one layer per circuit
        c1_weight_shapes = {"weights": (1, N_WIRES_C1)}
        c2_weight_shapes = {"weights": (1, N_WIRES_C2)}
        c3_weight_shapes = {"weights": (1, N_WIRES_C3)}
        self.c1_layer = qml.qnn.TorchLayer(C1, c1_weight_shapes)
        self.c2_layer = qml.qnn.TorchLayer(C2, c2_weight_shapes)
        self.c3_layer = qml.qnn.TorchLayer(C3, c3_weight_shapes)
        
    def forward(self, d1, d2):
        
        ### Branch 1
        c1_out = self.c1_layer(d1).float()
        
        ### Branch 2
        f2_out = self.f2(d2)
        c2_out = self.c2_layer(f2_out).float()
        
        ### Join outputs of c1 and c2 for c3
        c3_in = torch.cat([c1_out, c2_out], dim=1)
        c3_out = self.c3_layer(c3_in).float() 
        
        ### Final result through a simple classical linear transformation
        #   For more complex matrix transform, we can use torch.matmul()
        final_output = self.m1(c3_out) 
        
        return final_output
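
The remark about replacing `m1` with an explicit matrix could look roughly like the sketch below. `MatrixTransform` is a hypothetical module, not part of this notebook, and the averaging matrix and dummy batch are made up for illustration.

```python
import torch
import torch.nn as nn

class MatrixTransform(nn.Module):
    """Hypothetical stand-in for m1: applies a given matrix with torch.matmul."""
    def __init__(self, matrix):
        super().__init__()
        # A buffer travels with the module (device moves, state_dict) but is
        # not trained; wrap the matrix in nn.Parameter to make it trainable
        self.register_buffer("matrix", matrix)

    def forward(self, x):
        # x: (batch, in_features), matrix: (in_features, out_features)
        return torch.matmul(x, self.matrix)

n_in, n_out = 8, 1  # 8 matches N_WIRES_C3 in this notebook
m1 = MatrixTransform(torch.ones(n_in, n_out) / n_in)  # averages the inputs

# Dummy batch of 2 rows standing in for the output of c3
c3_out = torch.arange(16, dtype=torch.float32).reshape(2, n_in)
print(m1(c3_out))
```

Inside `HybridF`, such a module would take the place of `nn.Linear` in `self.m1`, with the matrix passed as an argument to `__init__`.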

## Model creation and training

In [5]:
### For reproducibility
seed = 2025
torch.manual_seed(seed)
torch.cuda.manual_seed(seed)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

### Instantiate the model
f = HybridF()

### Create dummy data
batch_size = 10
d1_data = torch.randn(batch_size, N_FEATURES_D1)
d2_data = torch.randn(batch_size, N_FEATURES_D2)
labels = torch.randn(batch_size, 1)

### Define loss function and optimizer
loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(f.parameters(), lr=0.1)

### Training loop
epochs = 20
print('\nStarting training\n')
for epoch in range(epochs):
    optimizer.zero_grad()
    predictions = f(d1_data, d2_data)
    loss = loss_fn(predictions, labels)
    loss.backward()
    optimizer.step()
    print(f'Epoch {epoch+1:2d}, Loss: {loss.item():0.4f}')

print('\nTraining complete')


Starting training

Epoch  1, Loss: 0.5069
Epoch  2, Loss: 0.4726
Epoch  3, Loss: 0.4664
Epoch  4, Loss: 0.4632
Epoch  5, Loss: 0.4344
Epoch  6, Loss: 0.4039
Epoch  7, Loss: 0.3662
Epoch  8, Loss: 0.3017
Epoch  9, Loss: 0.2199
Epoch 10, Loss: 0.1460
Epoch 11, Loss: 0.1919
Epoch 12, Loss: 0.0878
Epoch 13, Loss: 0.0938
Epoch 14, Loss: 0.0366
Epoch 15, Loss: 0.0571
Epoch 16, Loss: 0.0361
Epoch 17, Loss: 0.0417
Epoch 18, Loss: 0.0383
Epoch 19, Loss: 0.0434
Epoch 20, Loss: 0.0377

Training complete


## Model testing

In [6]:
### For reproducibility
seed = 2025
torch.manual_seed(seed)
torch.cuda.manual_seed(seed)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

### New data
test_batch_size = 5
d1_test = torch.randn(test_batch_size, N_FEATURES_D1)
d2_test = torch.randn(test_batch_size, N_FEATURES_D2)

### Apply the function; the result is converted to a NumPy array
f(d1_test, d2_test).detach().numpy()

array([[ 0.09331846],
       [ 0.46699953],
       [-0.5923619 ],
       [ 0.2560212 ],
       [-0.32537133]], dtype=float32)
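
Re-seeding at the top of the test cell is what makes the test data, and hence the output above, repeatable across runs. A minimal sketch of that behaviour, where the second seed value 2026 is an arbitrary assumption:

```python
import torch

torch.manual_seed(2025)
first = torch.randn(5, 3)

torch.manual_seed(2025)   # same seed: the generator restarts from the same state
second = torch.randn(5, 3)

torch.manual_seed(2026)   # hypothetical different seed
third = torch.randn(5, 3)

print(torch.equal(first, second))  # identical draws after re-seeding
print(torch.equal(first, third))   # a different seed gives different draws
```

Any call that consumes random numbers between seeding and sampling, such as creating a model, shifts the stream, so the order of operations within a cell matters for reproducibility.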

## Systems in use (Linux)

In [7]:
!pip list | grep -e pennylane -e torch

pennylane                 0.42.3
pennylane_lightning       0.42.0
torch                     2.8.0
torchaudio                2.8.0
torcheval                 0.0.7
torchmetrics              1.8.2
torchsummary              1.5.1
torchvision               0.23.0
