# SlotDeconv Tutorial

This notebook demonstrates how to use SlotDeconv for spatial transcriptomics deconvolution.

**Key features:**
- Slot-based reference learning from scRNA-seq
- Spatial dependency modeling via neighborhood consistency
- Automatic parameter selection based on data characteristics

In [1]:
import sys

# Make the SlotDeconv modules importable; adjust this path to your local checkout
sys.path.insert(0, "/project/zhiwei/hf78/Slotdecon/model")
from slot_model import SlotDeconv, DEFAULT_CONFIG
from slot_utility import load_data, align_genes, compute_metrics, print_metrics
from run_slot import run_slotdeconv
import torch

## Quick Start: One-line Execution

The easiest way to run SlotDeconv with optimized parameters:

In [None]:
# One-line execution with default configuration
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")  # detected for reference; not passed to run_slotdeconv below
data_dir = "/project/zhiwei/hf78/Slotdecon/data/spotifydata"  # adjust to your local data directory
result = run_slotdeconv(data_dir)

In [3]:
# Access results
pred = result['pred']                # Predicted proportions (with spatial refinement)
pred_nnls = result['pred_nnls']      # Predicted proportions (NNLS only)
metrics = result['metrics_spatial']  # Evaluation metrics for the spatial predictions

pred.head()

| Spot barcode | Astro | BCell | CA | DG | Endo | L2_3_IT_CTX | L4_5_IT_CTX | L4_IT_CTX | L5_6_IT_CTX | L5_6_NP_CTX | ... | Microglia | Neutrophil | Oligo | Pvalb | SUB | Sncg | Sst | Sst_Chodl | TCell | Vip |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AAACACCAATAACTGC-1 | 0.154998 | 0.045958 | 0.024607 | 0.007264 | 0.007754 | 0.053136 | 0.042549 | 0.01904017 | 0.058789 | 4.72168e-07 | ... | 0.018633 | 4.755727e-07 | 0.151664 | 0.071584 | 0.014204 | 0.014746 | 0.027234 | 0.022588 | 0.044983 | 4.870921e-07 |
| AAACAGAGCGACTCCT-1 | 0.191761 | 0.052963 | 0.011852 | 0.005853 | 0.025532 | 0.032567 | 0.02521 | 5.49861e-07 | 0.029004 | 5.379986e-07 | ... | 0.020138 | 5.551204e-07 | 0.211225 | 0.059521 | 0.00847 | 0.01865 | 0.032922 | 0.086423 | 0.049793 | 0.009484059 |
| AAACAGCTTTCAGAAG-1 | 0.151945 | 0.043956 | 0.021566 | 0.002775 | 0.016147 | 0.065962 | 0.06447 | 0.04483424 | 0.067603 | 4.970908e-07 | ... | 0.018095 | 5.107284e-07 | 0.143492 | 0.06086 | 0.011349 | 0.014641 | 0.022228 | 0.021338 | 0.024753 | 5.037155e-07 |
| AAACAGGGTCTATATT-1 | 0.133008 | 0.032813 | 0.026009 | 0.000182 | 0.009469 | 0.059219 | 0.045854 | 0.0158335 | 0.069444 | 5.09019e-07 | ... | 0.016908 | 5.200093e-07 | 0.18248 | 0.080749 | 0.0181 | 0.020384 | 0.039419 | 0.027541 | 0.027021 | 0.004323166 |
| AAACCGGGTAGGTACC-1 | 0.233118 | 0.052882 | 0.049479 | 0.014137 | 0.035462 | 0.049882 | 0.028792 | 6.778046e-07 | 0.040957 | 6.566639e-07 | ... | 0.029245 | 6.748666e-07 | 0.200845 | 0.048384 | 0.006265 | 0.010509 | 0.014361 | 0.018258 | 0.049616 | 6.891951e-07 |


## Step-by-Step Usage

For more control over the process:

In [4]:
# 1. Load data
dat = load_data(data_dir)
sc_count, st_count = align_genes(dat["sc_count"], dat["st_count"])

print(f"Genes: {sc_count.shape[0]}")
print(f"Cells: {sc_count.shape[1]}")
print(f"Spots: {st_count.shape[1]}")
print(f"Cell types: {len(dat['cell_types'])}")

Genes: 32285
Cells: 24627
Spots: 3243
Cell types: 27
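
The alignment step presumably restricts both matrices to the genes they share, so that rows refer to the same gene in each. The sketch below illustrates that idea with plain dictionaries. `align_by_shared_genes` is a hypothetical helper and not the `slot_utility.align_genes` implementation; it assumes genes are stored as rows keyed by gene name, and the toy counts are invented.

```python
def align_by_shared_genes(sc_rows, st_rows):
    """Keep only genes present in both inputs, in the order of sc_rows.

    sc_rows / st_rows map gene name -> list of counts (one row per gene).
    Hypothetical helper for illustration; not the slot_utility.align_genes code.
    """
    shared = [gene for gene in sc_rows if gene in st_rows]
    sc_aligned = {gene: sc_rows[gene] for gene in shared}
    st_aligned = {gene: st_rows[gene] for gene in shared}
    return sc_aligned, st_aligned


# Toy inputs: three genes in the scRNA-seq reference, three in the ST data
sc_toy = {"Gfap": [5, 0, 1], "Snap25": [0, 7, 2], "Mbp": [1, 1, 9]}
st_toy = {"Mbp": [4, 3], "Gfap": [2, 6], "Xist": [1, 0]}

sc_aligned, st_aligned = align_by_shared_genes(sc_toy, st_toy)
print(list(sc_aligned))  # shared genes, in scRNA-seq order
```

After alignment both structures have identical gene order, which is what allows the reference and the spot matrix to be compared row by row.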


In [5]:
# 2. Initialize model (use_default_config=True ensures reproducibility)
model = SlotDeconv(random_state=42, use_default_config=True)
# 3. Fit reference from scRNA-seq
model.fit(sc_count, dat["sc_meta"], dat["cell_types"])
# 4. Deconvolve ST data
pred = model.transform(st_count, dat["spatial"], use_spatial=True)

[SlotDeconv] Fitting: 27 types, 3000 genes, max 750 cells/type
[SlotDeconv] Config: λ_div=4.0, margin=0.1
[SlotDeconv] Training reference matrix...
[SlotDeconv] Reference matrix learned
[SlotDeconv] NNLS deconvolution...
[SlotDeconv] Spatial refinement (λ_sp=15.0, backend=dense)...
[SlotDeconv] Done
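
The log shows two stages: a non-negative least squares (NNLS) fit per spot, then a spatial refinement. The following is a simplified, dependency-free sketch of that two-stage idea, not SlotDeconv's code. It swaps in projected gradient descent for the NNLS solver and a single neighbor-averaging pass for the refinement, whereas the real refinement is an optimization governed by `lambda_sp`, `sp_epochs` and `sp_lr`. All function names and toy values are assumptions made for illustration.

```python
import math


def nnls_projected_gradient(ref, spot, steps=2000):
    """Approximately solve min ||ref @ x - spot||^2 subject to x >= 0.

    ref: genes x types signature matrix (list of rows); spot: expression vector.
    """
    n_genes, n_types = len(ref), len(ref[0])
    # For 0.5 * ||ref @ x - spot||^2 the gradient is ref^T (ref @ x - spot);
    # 1 / ||ref||_F^2 is a conservative step size that guarantees convergence.
    step = 1.0 / sum(v * v for row in ref for v in row)
    x = [0.0] * n_types
    for _ in range(steps):
        resid = [sum(ref[g][t] * x[t] for t in range(n_types)) - spot[g]
                 for g in range(n_genes)]
        grad = [sum(ref[g][t] * resid[g] for g in range(n_genes))
                for t in range(n_types)]
        # Gradient step followed by projection onto the non-negative orthant
        x = [max(0.0, xt - step * gt) for xt, gt in zip(x, grad)]
    return x


def normalize(row):
    """Scale non-negative weights so they sum to 1 (proportions)."""
    total = sum(row)
    return [v / total for v in row]


def knn_indices(coords, k):
    """Return, for each spot, the indices of its k nearest other spots."""
    neighbors = []
    for i, p in enumerate(coords):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(coords) if j != i)
        neighbors.append([j for _, j in dists[:k]])
    return neighbors


def smooth(props, neighbors, weight):
    """Blend each spot with the mean of its neighbors, then renormalize.

    weight=0 keeps props unchanged; weight=1 replaces them by the neighbor mean.
    """
    out = []
    for i, row in enumerate(props):
        nb = neighbors[i]
        mean_nb = [sum(props[j][c] for j in nb) / len(nb) for c in range(len(row))]
        out.append(normalize([(1 - weight) * x + weight * m
                              for x, m in zip(row, mean_nb)]))
    return out


# Toy data: 3 genes, 2 cell types, 4 spots (all values invented for illustration)
ref = [[10.0, 0.0], [0.0, 8.0], [2.0, 2.0]]
true_props = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.5, 0.5]]
coords = [(0, 0), (0, 1), (1, 0), (5, 5)]
spots = [[sum(r * p for r, p in zip(row, props)) for row in ref] for props in true_props]

nnls_props = [normalize(nnls_projected_gradient(ref, s)) for s in spots]
neighbors = knn_indices(coords, k=2)
refined = smooth(nnls_props, neighbors, weight=0.5)
```

In this toy setup the third spot disagrees with its two nearest neighbors, so smoothing pulls its estimate toward theirs. That is the intuition behind neighborhood consistency: nearby spots are expected to have similar composition.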


In [6]:
# 5. Evaluate (if ground truth available)
if 'true_props' in dat:
    true = dat["true_props"].loc[st_count.columns, dat["cell_types"]]
    metrics = compute_metrics(true, pred)
    print_metrics(metrics, "SlotDeconv")


  SlotDeconv
  RMSE:        0.0863
  JSD:         0.4212
  Corr(spot):  0.5533
  Corr(type):  0.1947
  Cosine:      0.6266
  AUPR:        0.5794
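
The metric names in this report are standard, but the exact definitions inside `compute_metrics` are not shown in this notebook. As a rough guide to what three of them measure, here is a minimal sketch of RMSE, Jensen-Shannon divergence and cosine similarity on proportion vectors. The logarithm base, the choice of divergence versus distance, and how values are averaged across spots are assumptions and may differ from the library.

```python
import math


def rmse(true, pred):
    """Root mean squared error over all spot x cell-type entries."""
    sq = [(t - p) ** 2 for t_row, p_row in zip(true, pred) for t, p in zip(t_row, p_row)]
    return math.sqrt(sum(sq) / len(sq))


def js_divergence(p, q):
    """Jensen-Shannon divergence with base-2 logs, so the value lies in [0, 1]."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x == 0 contribute nothing; y > 0 whenever x > 0 because y is the mixture
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def cosine(p, q):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    return dot / (math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q)))


# Two toy spots with two cell types: the first prediction is maximally wrong,
# the second is exact (values invented for illustration)
true_toy = [[1.0, 0.0], [0.5, 0.5]]
pred_toy = [[0.0, 1.0], [0.5, 0.5]]

rmse_value = rmse(true_toy, pred_toy)
jsd_per_spot = [js_divergence(t, p) for t, p in zip(true_toy, pred_toy)]
cos_per_spot = [cosine(t, p) for t, p in zip(true_toy, pred_toy)]
```

Lower is better for RMSE and JSD, while higher is better for the correlation, cosine and AUPR scores.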


## Default Configuration

The optimized parameters used by the default configuration:

In [7]:
print("default Configuration:")
for k, v in DEFAULT_CONFIG.items():
    print(f"  {k}: {v}")

default Configuration:
  n_genes: 3000
  max_cells_per_type: 750
  lambda_div: 4.0
  margin: 0.1
  lambda_sp: 15.0
  sp_epochs: 1500
  sp_lr: 0.01
  b_epochs: 2000
  b_lr: 0.001
  pow_w: 0.8
  knn: 15
  d_slot: 128
  dec_hidden: (256, 512)
  dec_dropout: 0.1


## Custom Parameters

To use custom parameters instead of the defaults:

In [12]:
# Option 1: Override specific parameters
model = SlotDeconv(use_default_config=True)  # Start with default config
model.fit(sc_count, dat["sc_meta"], dat["cell_types"],
          b_epochs=1500,
          n_genes=4000,      
          lambda_div=5.0)     
pred = model.transform(st_count, dat["spatial"], 
                       sp_epochs=2500,
                       lambda_sp=10.0) 

[SlotDeconv] Fitting: 27 types, 4000 genes, max 750 cells/type
[SlotDeconv] Config: λ_div=5.0, margin=0.1
[SlotDeconv] Training reference matrix...
[SlotDeconv] Reference matrix learned
[SlotDeconv] NNLS deconvolution...
[SlotDeconv] Spatial refinement (λ_sp=10.0, backend=dense)...
[SlotDeconv] Done


In [13]:
if 'true_props' in dat:
    true = dat["true_props"].loc[st_count.columns, dat["cell_types"]]
    metrics = compute_metrics(true, pred)
    print_metrics(metrics, "SlotDeconv")


  SlotDeconv
  RMSE:        0.0884
  JSD:         0.4464
  Corr(spot):  0.5292
  Corr(type):  0.1930
  Cosine:      0.6066
  AUPR:        0.5477


In [10]:
# Option 2: Use fully adaptive parameters (for new datasets)
model = SlotDeconv(use_default_config=False)  # Use adaptive parameters
model.fit(sc_count, dat["sc_meta"], dat["cell_types"])
pred = model.transform(st_count, dat["spatial"])

[SlotDeconv] Fitting: 27 types, 3600 genes, max 750 cells/type
[SlotDeconv] Config: λ_div=4.0, margin=0.1
[SlotDeconv] Training reference matrix...
[SlotDeconv] Reference matrix learned
[SlotDeconv] NNLS deconvolution...
[SlotDeconv] Spatial refinement (λ_sp=12.0, backend=dense)...
[SlotDeconv] Done


## Compare With/Without Spatial Modeling

In [11]:
# Without spatial (NNLS only)
pred_nnls = model.transform(st_count, dat["spatial"], use_spatial=False)

# With spatial dependency modeling
pred_spatial = model.transform(st_count, dat["spatial"], use_spatial=True)

if 'true_props' in dat:
    true = dat["true_props"].loc[st_count.columns, dat["cell_types"]]
    
    m_nnls = compute_metrics(true, pred_nnls)
    m_spatial = compute_metrics(true, pred_spatial)
    
    print_metrics(m_nnls, "Without Spatial")
    print_metrics(m_spatial, "With Spatial")
    
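    # Relative gain in per-spot correlation from spatial refinement
    improvement = (m_spatial['corr_spot'] - m_nnls['corr_spot']) / m_nnls['corr_spot'] * 100
    print(f"\nImprovement: {improvement:+.1f}%")  # signed format also handles a decrease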

[SlotDeconv] NNLS deconvolution...
[SlotDeconv] NNLS deconvolution...
[SlotDeconv] Spatial refinement (λ_sp=12.0, backend=dense)...
[SlotDeconv] Done

  Without Spatial
  RMSE:        0.0943
  JSD:         0.4832
  Corr(spot):  0.4292
  Corr(type):  0.1779
  Cosine:      0.5361
  AUPR:        0.5610

  With Spatial
  RMSE:        0.0876
  JSD:         0.4349
  Corr(spot):  0.5386
  Corr(type):  0.1998
  Cosine:      0.6143
  AUPR:        0.5605

Improvement: +25.5%
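
The reported improvement refers to Corr(spot) only. To see how every metric moved, the same relative-change formula can be applied to both result tables. The sketch below copies the numbers printed above; the `relative_change` helper and the "lower is better" labels for RMSE and JSD are illustrative assumptions about how to read the table.

```python
# Metric values copied from the two result tables above
without_spatial = {"RMSE": 0.0943, "JSD": 0.4832, "Corr(spot)": 0.4292,
                   "Corr(type)": 0.1779, "Cosine": 0.5361, "AUPR": 0.5610}
with_spatial = {"RMSE": 0.0876, "JSD": 0.4349, "Corr(spot)": 0.5386,
                "Corr(type)": 0.1998, "Cosine": 0.6143, "AUPR": 0.5605}

LOWER_IS_BETTER = {"RMSE", "JSD"}  # error-type metrics; the others are scores


def relative_change(before, after):
    """Percentage change from before to after, relative to |before|."""
    return (after - before) / abs(before) * 100


changes = {name: relative_change(without_spatial[name], with_spatial[name])
           for name in without_spatial}

for name, change in changes.items():
    improved = change < 0 if name in LOWER_IS_BETTER else change > 0
    print(f"{name:<11} {change:+6.1f}%  {'better' if improved else 'worse'}")
```

On these numbers the spatial step improves every metric except AUPR, which drops slightly from 0.5610 to 0.5605.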


## Save Results

In [12]:
# Save predictions
pred.to_csv("slotdeconv_predictions.csv")
print("Saved to slotdeconv_predictions.csv")

Saved to slotdeconv_predictions.csv
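
When a DataFrame with an unnamed index is written with `to_csv`, the spot barcodes end up in a first column whose header field is empty. pandas labels such a column `Unnamed: 0` if the file is read back without `index_col=0`. The standard-library sketch below parses a file of that shape and keys the rows by barcode. The two rows reuse values from the preview earlier in this notebook, and the in-memory text stands in for the saved file.

```python
import csv
import io

# Stand-in for the first lines of slotdeconv_predictions.csv (two cell types only)
csv_text = (
    ",Astro,BCell\n"
    "AAACACCAATAACTGC-1,0.154998,0.045958\n"
    "AAACAGAGCGACTCCT-1,0.191761,0.052963\n"
)

reader = csv.reader(io.StringIO(csv_text))
header = next(reader)
cell_types = header[1:]  # header[0] is the empty label of the index column

# Map each spot barcode to a {cell type: proportion} dictionary
props = {row[0]: dict(zip(cell_types, map(float, row[1:]))) for row in reader}
print(props["AAACACCAATAACTGC-1"]["Astro"])
```

With pandas, `pd.read_csv("slotdeconv_predictions.csv", index_col=0)` restores the barcodes as the index directly.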


## Command Line Usage

```bash
# Basic usage
python run_slot.py --data_dir ./data/spotifydata --output_dir ./results

# Without spatial refinement
python run_slot.py --data_dir ./data/spotifydata --no_spatial

# Custom seed
python run_slot.py --data_dir ./data/spotifydata --seed 123
```
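
The flags above could be parsed with `argparse` roughly as follows. This is a sketch only: the actual parser in `run_slot.py` is not shown in this notebook, so the default values (`./results`, seed 42) and the help strings are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the flags used in the commands above.

    Illustrative only; defaults are assumptions, not taken from run_slot.py.
    """
    parser = argparse.ArgumentParser(description="Run SlotDeconv on a data directory")
    parser.add_argument("--data_dir", required=True, help="Directory with input data")
    parser.add_argument("--output_dir", default="./results", help="Where to write results")
    parser.add_argument("--no_spatial", action="store_true",
                        help="Skip spatial refinement (NNLS only)")
    parser.add_argument("--seed", type=int, default=42, help="Random seed")
    return parser


# Parse an explicit argument list instead of sys.argv so the example runs anywhere
args = build_parser().parse_args(
    ["--data_dir", "./data/spotifydata", "--no_spatial", "--seed", "123"]
)
use_spatial = not args.no_spatial  # the flag is negative, so invert it
print(args.data_dir, args.output_dir, use_spatial, args.seed)
```

A negative flag such as `--no_spatial` keeps spatial refinement on by default, which matches the basic usage command where no flag is given.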