InfluencerFormer

Transformer-based instance segmentation with Influencer Loss — a drop-in replacement for Hungarian matching + dice loss.

Overview

InfluencerFormer takes the same architecture as MaskFormer / Mask2Former — a transformer decoder with learned queries that produce dense per-pixel mask predictions — but replaces the training loss.

Instead of:

Hungarian matching (O(N³) combinatorial assignment, non-differentiable, slow to converge)
Dice loss + binary cross-entropy on matched pairs

We use the Influencer Loss: continuous attractive/repulsive potentials where queries naturally "claim" instances without combinatorial matching. The queries become influencer points that attract their instance's pixels and repel other instances' pixels, all through fully differentiable operations on the dense class vectors.

Originally developed for particle physics track reconstruction, the Influencer Loss is here generalized to mainstream instance segmentation benchmarks (COCO, S3DIS/ScanNet).

What Changes (and What Doesn't)

Component	MaskFormer/Mask2Former	InfluencerFormer
Backbone	Swin / ResNet + pixel decoder	Same
Transformer decoder	Learned queries → dense mask logits	Same
Output format	Dense per-pixel class vectors per query	Same
Training loss	Hungarian matching + dice + BCE	Influencer Loss (continuous potentials)
Query–GT assignment	Bipartite matching (non-differentiable)	Attraction/repulsion (fully differentiable)
Convergence	~500 epochs (DETR) / ~50 epochs (Mask2Former)	Expected faster (no matching instability)

Project Structure

influencerformer/
├── data/     # Dataset download and loading (COCO, S3DIS)
├── losses/   # InfluencerLoss (condensation) + MaskInfluencerLoss (mask-matrix)
└── models/   # InfluencerCriterion — drop-in for OneFormer3D's InstanceCriterion

examples/
├── toy_pointcloud.py   # MLP on S3DIS with InfluencerLoss (condensation approach)
└── toy_image.py        # CNN on COCO with InfluencerLoss (condensation approach)

docs/
├── literature_review.md    # Survey of related work
├── first_pass.md           # Development log
└── integration_guide.md    # OneFormer3D integration (S3DIS / ScanNet)

configs/
├── s3dis/
│   ├── oneformer3d_1xb4_s3dis-area-5.py       # Baseline: Hungarian matching
│   └── influencerformer_1xb4_s3dis-area-5.py  # InfluencerFormer: Influencer Loss
└── scannet/
    ├── oneformer3d_1xb2_scannet.py             # Baseline: Hungarian matching
    └── influencerformer_1xb2_scannet.py        # InfluencerFormer: Influencer Loss

scripts/
├── train_s3dis.sh        # Training wrapper (auto-detects OneFormer3D)
├── eval_s3dis.sh         # Evaluation wrapper
└── register_criterion.py # Registry verification / integration helper

Quick Start

# Install
pip install -e ".[pointcloud]"

# Point cloud example (auto-downloads S3DIS ~2GB)
python examples/toy_pointcloud.py

# Image example (requires COCO val2017 download first)
python examples/toy_image.py

Why Replace Hungarian Matching?

Hungarian matching is the universal training loss for query-based segmentation (DETR, MaskFormer, Mask2Former, Mask3D, SPFormer). It has well-known problems:

Non-differentiable. Gradients don't flow through the assignment decision — the model can't learn which query should own which instance, only how to predict masks once assigned.
O(N³) complexity. Cubic in the number of queries, adding training cost.
Slow convergence. DETR needed ~500 epochs. Subsequent work (DN-DETR, DINO, Mask2Former) invested heavily in tricks to stabilize matching — denoising training, anchor initialization, masked attention. These are workarounds for a fundamental loss function problem.
Fixed query budget. The number of queries is a hyperparameter ceiling on instance count.

The Influencer Loss replaces all of this with continuous potentials: queries attract their instance's pixels and repel others, with formal optimality guarantees that the correct assignment is the unique global minimum.

See docs/literature_review.md for a comprehensive survey.

OneFormer3D Integration (3D Instance Segmentation)

InfluencerCriterion is a drop-in replacement for OneFormer3D's InstanceCriterion on S3DIS / ScanNet — no changes to OneFormer3D source required.

Install

# 1. PyTorch + mmdet3d stack (order matters)
pip install mmengine>=0.10.0 mmdet>=3.0.0
pip install mmcv>=2.0.0 -f https://download.openmmlab.com/mmcv/dist/cu118/torch2.0/index.html

# 2. OneFormer3D (not on PyPI)
git clone https://github.com/filaPro/oneformer3d && pip install -e oneformer3d/

# 3. spconv — pick the wheel matching your CUDA version
#    CUDA 11.3/11.6/11.8/12.x → spconv-cu113 / cu116 / cu118 / cu120
pip install spconv-cu118

# 4. This package
pip install -e ".[mmdet3d]"

# 5. Verify
python scripts/register_criterion.py --verify
# → PASS: InfluencerCriterion is registered in the MODELS registry.

Data

Run OneFormer3D's S3DIS data prep scripts (oneformer3d/data/s3dis/). They produce superpoint .pth files and per-area .pkl annotation files. Then set data_root in configs/s3dis/oneformer3d_1xb4_s3dis-area-5.py to point to your processed data directory.

Train and evaluate

# Single GPU
./scripts/train_s3dis.sh baseline            # Reproduce OneFormer3D (expected mAP 59.8)
./scripts/train_s3dis.sh influencerformer    # InfluencerFormer

# Multi-GPU (uses OneFormer3D's dist_train.sh automatically)
GPUS=4 ./scripts/train_s3dis.sh influencerformer

# Evaluate
./scripts/eval_s3dis.sh influencerformer work_dirs/influencerformer/epoch_512.pth
GPUS=4 ./scripts/eval_s3dis.sh influencerformer work_dirs/influencerformer/epoch_512.pth

# Resume (auto-resumes from latest.pth in work_dir)
./scripts/train_s3dis.sh influencerformer --resume

Key hyperparameter

bg_weight in configs/s3dis/influencerformer_1xb4_s3dis-area-5.py defaults to 0.1, not 1.0. S3DIS rooms are ~75% background superpoints — bg_weight=1.0 destabilises early training by pushing all mask logits negative before the attractive term can compensate. Increase gradually after ~50–100 epochs if background superpoints are still being activated.

Full details, hyperparameter guide, and troubleshooting: docs/integration_guide.md

Name		Name	Last commit message	Last commit date
Latest commit History 38 Commits
.claude		.claude
benchmarks		benchmarks
configs		configs
docs		docs
examples		examples
influencerformer		influencerformer
logs		logs
scripts		scripts
studies		studies
.gitignore		.gitignore
.gitmodules		.gitmodules
README.md		README.md
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

InfluencerFormer

Overview

What Changes (and What Doesn't)

Project Structure

Quick Start

Why Replace Hungarian Matching?

OneFormer3D Integration (3D Instance Segmentation)

Install

Data

Train and evaluate

Key hyperparameter

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

InfluencerFormer

Overview

What Changes (and What Doesn't)

Project Structure

Quick Start

Why Replace Hungarian Matching?

OneFormer3D Integration (3D Instance Segmentation)

Install

Data

Train and evaluate

Key hyperparameter

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages