# TopoBench: Learnable Topology Lifting Tutorial

This tutorial demonstrates how TopoBench enables **learnable topology lifting**: learning and inferring graph structure during training rather than using a fixed, predefined topology.

## Overview

TopoBench provides a unified pipeline that decouples topology learning from the main model architecture. This allows you to:
- Learn graph structure from the data during training
- Integrate custom topology learning modules easily  
- Combine different lifting strategies with various backbone models


## Configuration Inspection

Let's start by examining the configuration for learnable topology lifting:


```python
import rootutils
rootutils.setup_root(__file__, indicator=".project-root", pythonpath=True)
import hydra
from omegaconf import OmegaConf

# Load the configurations
config_file = "run.yaml"
with hydra.initialize(
    version_base="1.3",
    config_path="../configs",
    job_name="run"
):
    print("Current config file:", config_file)
    configs = hydra.compose(
        config_name=config_file,
        overrides=["dataset=graph/cocitation_cora", "model=graph/gcn_dgm"],
        return_hydra_config=True,
    )

# First, look at the unresolved configuration (interpolations left as-is)
unresolved_config = OmegaConf.to_container(configs.model.feature_encoder, resolve=False)
print(OmegaConf.to_yaml(unresolved_config))
```


**Output:**
```yaml
_target_: topobench.nn.encoders.${model.feature_encoder.encoder_name}
encoder_name: DGMStructureFeatureEncoder
in_channels: ${infer_in_channels:${dataset},${oc.select:transforms,null}}
out_channels: 64
proj_dropout: 0.0
loss:
  _target_: topobench.loss.model.DGMLoss
  loss_weight: 10
```

```python
# Now let's resolve the configuration to see the final destinations
resolved_config = OmegaConf.to_container(configs.model.feature_encoder, resolve=True)
print("✅ Resolved Configuration:")
print(OmegaConf.to_yaml(resolved_config))
```

**Output:**
```yaml
_target_: topobench.nn.encoders.DGMStructureFeatureEncoder
encoder_name: DGMStructureFeatureEncoder
in_channels: [computed_based_on_dataset]
out_channels: 64
proj_dropout: 0.0
loss:
  _target_: topobench.loss.model.DGMLoss
  loss_weight: 10
```
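To make the difference between the two outputs concrete, here is a minimal, standard-library sketch of what resolving a `${...}` reference means. It is a toy stand-in, not OmegaConf's implementation: the helper names `lookup` and `resolve` are hypothetical, and it does not handle chained references or custom resolvers such as `infer_in_channels`.

```python
import re

# Matches simple references such as ${model.feature_encoder.encoder_name}
_REF = re.compile(r"\$\{([\w.]+)\}")


def lookup(root: dict, dotted_key: str):
    """Follow a dotted path like 'a.b.c' through nested dicts."""
    node = root
    for part in dotted_key.split("."):
        node = node[part]
    return node


def resolve(node, root: dict):
    """Recursively replace ${path} references with values from root."""
    if isinstance(node, dict):
        return {key: resolve(value, root) for key, value in node.items()}
    if isinstance(node, str):
        return _REF.sub(lambda m: str(lookup(root, m.group(1))), node)
    return node


config = {
    "model": {
        "feature_encoder": {
            "_target_": "topobench.nn.encoders.${model.feature_encoder.encoder_name}",
            "encoder_name": "DGMStructureFeatureEncoder",
            "out_channels": 64,
        }
    }
}

resolved = resolve(config, config)
print(resolved["model"]["feature_encoder"]["_target_"])
# → topobench.nn.encoders.DGMStructureFeatureEncoder
```

Changing `encoder_name` is therefore enough to point `_target_` at a different encoder class.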

Here we can see that the feature encoder has its own loss:

```yaml
loss:
  _target_: topobench.loss.model.DGMLoss
  loss_weight: 10
```
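The `_target_` key names the class to build, and the remaining keys become constructor arguments. The sketch below shows that idea using only the standard library. The function `instantiate_from_config` is hypothetical and much simpler than Hydra's `hydra.utils.instantiate`, and `types.SimpleNamespace` stands in for the real encoder and loss classes so the example runs without TopoBench installed.

```python
import importlib


def instantiate_from_config(config: dict):
    """Build the object named by `_target_`, passing the other keys as kwargs."""
    kwargs = {}
    for key, value in config.items():
        if key == "_target_":
            continue
        # Nested configs with their own `_target_` are built first
        if isinstance(value, dict) and "_target_" in value:
            value = instantiate_from_config(value)
        kwargs[key] = value

    # Split "package.module.ClassName" into module path and attribute name
    module_path, _, attr_name = config["_target_"].rpartition(".")
    target = getattr(importlib.import_module(module_path), attr_name)
    return target(**kwargs)


# Stand-in targets from the standard library (assumption for illustration)
encoder_cfg = {
    "_target_": "types.SimpleNamespace",
    "out_channels": 64,
    "proj_dropout": 0.0,
    "loss": {"_target_": "types.SimpleNamespace", "loss_weight": 10},
}

encoder = instantiate_from_config(encoder_cfg)
print(encoder.out_channels, encoder.loss.loss_weight)  # → 64 10
```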

Let's now take a step back and understand the TopoBench pipeline!

## 🏗️ The TopoBench Pipeline

TopoBench uses a unified pipeline for all models that separates topology learning from the main model computation:

```python
def model_step(self, batch: Data) -> dict:
    """Perform a single model step on a batch of data.

    Parameters
    ----------
    batch : torch_geometric.data.Data
        Batch object containing the batched data.

    Returns
    -------
    dict
        Dictionary containing the model output and the loss.
    """
    # Allow batch object to know the phase of the training
    batch["model_state"] = self.state_str

    # 🔍 Feature Encoder - This is where topology learning happens!
    batch = self.feature_encoder(batch)

    # 🧠 Domain model (your main architecture)
    model_out = self.forward(batch)

    # 📊 Readout
    if self.readout is not None:
        model_out = self.readout(model_out=model_out, batch=batch)

    # 📉 Loss computation (includes topology learning loss)
    model_out = self.process_outputs(model_out=model_out, batch=batch)
    model_out = self.loss(model_out=model_out, batch=batch)
    
    # 📈 Metrics
    self.evaluator.update(model_out)

    return model_out
```

### 🎯 Key Insight
The **feature encoder step** is where topology learning is decoupled from the main model. The graph structure is learned there first, and the backbone model then consumes it like any other input graph.
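The decoupling can be illustrated with a toy pipeline in plain Python. Everything here is a simplified assumption: the batch is a dict, features are scalars, and the "topology learner" is a fixed distance rule rather than a trained module. The point is only that the backbone reads whatever edges the encoder wrote and never needs to know how they were produced.

```python
def feature_encoder(batch: dict) -> dict:
    """Toy topology step: connect nodes whose features differ by < 1.0."""
    x = batch["x"]
    batch["edges"] = [
        (i, j)
        for i in range(len(x))
        for j in range(len(x))
        if i != j and abs(x[i] - x[j]) < 1.0
    ]
    return batch


def backbone(batch: dict) -> list:
    """Toy backbone: average each node with its neighbours in batch['edges']."""
    x = batch["x"]
    out = []
    for i in range(len(x)):
        neighbours = [x[j] for (src, j) in batch["edges"] if src == i]
        values = [x[i]] + neighbours
        out.append(sum(values) / len(values))
    return out


def model_step(batch: dict) -> list:
    batch = feature_encoder(batch)  # topology is decided here
    return backbone(batch)          # backbone only consumes batch["edges"]


# The original edges (0, 2) and (2, 0) are overwritten by the encoder
batch = {"x": [0.0, 0.5, 5.0], "edges": [(0, 2), (2, 0)]}
print(model_step(batch))  # → [0.25, 0.25, 5.0]
```

Swapping `feature_encoder` for a different rule changes the output without touching `backbone`, which mirrors how the pipeline keeps the two concerns separate.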

## 🔬 Deep Dive: DGM Structure Feature Encoder

Let's examine how the `DGMStructureFeatureEncoder` implements learnable topology lifting:

```python
class DGMStructureFeatureEncoder(AbstractFeatureEncoder):
    
    def forward(self, data: torch_geometric.data.Data) -> torch_geometric.data.Data:
        """Forward pass that learns and updates graph structure.

        The method applies BaseEncoders to features of selected dimensions
        and infers new graph topology using the Differentiable Graph Module (DGM).

        Parameters
        ----------
        data : torch_geometric.data.Data
            Input data object with x_{i} features for each dimension i.

        Returns
        -------
        torch_geometric.data.Data
            Output data object with learned topology and updated features.
        """
        # Ensure x_0 exists (node features)
        if not hasattr(data, "x_0"):
            data.x_0 = data.x

        # Process each topological dimension
        for i in self.dimensions:
            if hasattr(data, f"x_{i}") and hasattr(self, f"encoder_{i}"):
                batch = getattr(data, f"batch_{i}")
                
                # 🚀 The magic happens here: DGM inference
                x_, x_aux, edges_dgm, logprobs = getattr(self, f"encoder_{i}")(
                    data[f"x_{i}"], batch
                )
                
                # Update features and structure
                data[f"x_{i}"] = x_
                data[f"x_aux_{i}"] = x_aux
                
                # 🔥 Key: Replace original edges with learned edges!
                data["edges_index"] = edges_dgm
                data[f"logprobs_{i}"] = logprobs
                
        return data
```

### 🔑 Critical Operations

The encoder performs two crucial operations:

1. **Feature Learning**: Updates node features (`x_`) and auxiliary features (`x_aux`)
2. **Topology Learning**: **Replaces** the original edge structure with learned edges:
   ```python
   data["edges_index"] = edges_dgm      # New learned topology replaces the old one
   data[f"logprobs_{i}"] = logprobs     # Variables needed to compute the loss
   ```
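To give a feel for where learned edges and their log-probabilities can come from, here is a standard-library sketch of sampling k neighbours per node with the Gumbel top-k trick. It is an illustration under stated assumptions, not TopoBench's encoder: the function name `sample_knn_edges`, the squared-distance scores, and the fixed `temperature` (which would be a trainable parameter in a learnable module) are all hypothetical choices.

```python
import math
import random


def sample_knn_edges(features, k, temperature=1.0, seed=0):
    """Sample k neighbours per node; return (edge_index, logprobs).

    Edge scores are -temperature * squared distance. Gumbel noise is added
    to the scores and the top-k perturbed scores pick the neighbours.
    """
    rng = random.Random(seed)
    sources, targets, logprobs = [], [], []
    n = len(features)
    for i in range(n):
        scored = []
        for j in range(n):
            if i == j:
                continue  # no self-loops
            dist2 = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
            logit = -temperature * dist2
            u = rng.random() or 1e-12  # guard against log(0)
            gumbel = -math.log(-math.log(u))
            scored.append((logit + gumbel, logit, j))
        scored.sort(reverse=True)
        for _, logit, j in scored[:k]:
            sources.append(i)
            targets.append(j)
            logprobs.append(logit)  # unnormalised log-probability of the edge
    return [sources, targets], logprobs


# Two tight clusters: each node should pick its cluster mate
features = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
edge_index, logprobs = sample_knn_edges(features, k=1, temperature=10.0)
print(edge_index)  # → [[0, 1, 2, 3], [1, 0, 3, 2]]
```

With clusters this far apart the noise cannot change the outcome. With closer points or a lower temperature the sampled edges vary with the seed, which is what lets training explore alternative structures.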

## 🎯 Loss Integration

The learned topology is optimized through the loss function in the main pipeline:

```python
# Loss computation includes both task loss and topology learning loss
model_out = self.process_outputs(model_out=model_out, batch=batch)
model_out = self.loss(model_out=model_out, batch=batch)  # batch contains learned edges + logprobs
```

This enables end-to-end learning where the topology is optimized jointly with the main task objective.
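One way such a joint objective can look is a task loss plus a weighted, reward-style term over the edge log-probabilities: edges of correctly classified nodes are encouraged, edges of misclassified nodes are discouraged. The sketch below assumes this form for illustration only; whether `DGMLoss` computes exactly this is not confirmed here, and the function name `combined_loss` and its arguments are hypothetical.

```python
def combined_loss(task_loss, correct, logprobs_per_node, baseline_acc, loss_weight):
    """Add a reward-weighted log-probability term to the task loss.

    correct[i] is 1.0 if node i was classified correctly, else 0.0.
    logprobs_per_node[i] holds the log-probabilities of node i's sampled edges.
    """
    graph_loss = 0.0
    for is_correct, node_logprobs in zip(correct, logprobs_per_node):
        # Negative for correct nodes, positive for wrong ones
        reward = baseline_acc - is_correct
        graph_loss += reward * sum(node_logprobs)
    graph_loss /= len(correct)
    return task_loss + loss_weight * graph_loss


total = combined_loss(
    task_loss=0.7,
    correct=[1.0, 0.0],
    logprobs_per_node=[[-0.1, -0.2], [-0.4]],
    baseline_acc=0.5,
    loss_weight=10,
)
print(round(total, 4))  # → 0.45
```

Minimising this total raises the log-probability of edges attached to correct predictions and lowers it for the rest, while `loss_weight` (10 in the configuration shown earlier) sets how strongly the topology term counts against the task loss.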

## 🚀 Running the Example

Execute the learnable topology lifting pipeline:

```bash
python -m topobench dataset=graph/cocitation_cora model=graph/gcn_dgm
```

This command:
- Uses the **Cora citation network** dataset
- Applies **GCN backbone** with **DGM topology learning**
- Learns the graph structure during training

## 🛠️ Integrating Your Own Learnable Module

Want to add your custom topology learning approach? Follow these four steps:

### Step 1: Create Your Feature Encoder

Create `/topobench/nn/encoders/my_custom_encoder.py`:

```python
from topobench.nn.encoders.base import AbstractFeatureEncoder

class MyCustomTopologyEncoder(AbstractFeatureEncoder):
    def forward(self, data):
        # Your custom topology learning logic here.
        # `infer_topology` is a placeholder for a method you implement yourself.
        learned_edges = self.infer_topology(data.x_0)
        data["edges_index"] = learned_edges
        return data
```
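As a starting point for the `infer_topology` placeholder, the sketch below builds edges from cosine similarity using only the standard library. It is a fixed rule with no trainable parameters, so treat it as scaffolding: a learnable version would score node pairs with parameters updated by the loss. The `threshold` argument and the two-list edge layout (`[source_indices, target_indices]`) are assumptions for illustration.

```python
import math


def infer_topology(x, threshold=0.9):
    """Connect node pairs whose cosine similarity is at least `threshold`.

    Returns edges as [source_indices, target_indices].
    """
    # Fall back to 1.0 for all-zero rows to avoid division by zero
    norms = [math.sqrt(sum(v * v for v in row)) or 1.0 for row in x]
    sources, targets = [], []
    for i, row_i in enumerate(x):
        for j, row_j in enumerate(x):
            if i == j:
                continue  # no self-loops
            dot = sum(a * b for a, b in zip(row_i, row_j))
            if dot / (norms[i] * norms[j]) >= threshold:
                sources.append(i)
                targets.append(j)
    return [sources, targets]


x_0 = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
print(infer_topology(x_0))  # → [[0, 1], [1, 0]]
```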

### Step 2: Create Your Loss Function  

Create `/topobench/loss/model/my_custom_loss.py`:

```python
class MyCustomTopologyLoss:
    def __init__(self, loss_weight=1.0, regularization=0.01):
        self.loss_weight = loss_weight
        self.regularization = regularization

    def compute_topology_loss(self, batch):
        # Placeholder: implement your own topology objective here
        raise NotImplementedError

    def __call__(self, model_out, batch):
        # Add the weighted topology term to the existing task loss
        topology_loss = self.compute_topology_loss(batch)
        model_out["loss"] += self.loss_weight * topology_loss
        return model_out
```

### Step 3: Create Model Configuration

Create `/configs/model/graph/my_custom_model.yaml`:

```yaml
_target_: topobench.nn.encoders.MyCustomTopologyEncoder
encoder_name: MyCustomTopologyEncoder
in_channels:
  - ${infer_in_channels:${dataset},${oc.select:transforms,null}}
out_channels: 64
proj_dropout: 0.0
loss:
  _target_: topobench.loss.model.MyCustomTopologyLoss
  loss_weight: 5.0
  regularization: 0.01
```

### Step 4: Run Your Model

```bash
python -m topobench dataset=graph/your_dataset model=graph/my_custom_model
```

## 🎉 Summary

TopoBench's learnable topology lifting provides:

- **🔄 Unified Pipeline**: Consistent interface for all topology learning approaches
- **🎯 Decoupled Design**: Topology learning separated from backbone models  
- **🔧 Easy Integration**: Simple 4-step process to add custom modules
- **📈 End-to-End Learning**: Joint optimization of structure and task objectives

The key insight is that topology learning happens in the **feature encoder stage**, allowing any backbone model to benefit from learned graph structures without modification.

---

**Next Steps**: Try different datasets and backbone models to see whether, and by how much, learnable topology lifting improves performance on your tasks!