InterFormer PyTorch Port

This repository is a paper-reading and reimplementation project in PyTorch, based on:

  • Paper: InterFormer: Effective Heterogeneous Interaction Learning for Click-Through Rate Prediction
  • arXiv: https://arxiv.org/abs/2411.09852

The repository follows the same engineering layout as OneTrans_Pytorch, but replaces the backbone with an InterFormer-style architecture that combines:

  • non-sequence interaction learning
  • sequence modeling with personalized FFN
  • cross-modal summarization between the two branches

What This Repository Contains

There are two layers in this codebase:

  1. The backbone port: the PyTorch implementation of the InterFormer building blocks.
  2. The training wrapper: dataset loading, feature tensorization, metrics, mixed precision, checkpointing, and CLI training support.

This repo is therefore a runnable local training scaffold around an InterFormer-style backbone, not just a shape demo.

Implementation Notes

The model design follows the paper-level architecture:

  • non-sequence feature preprocessing into per-field tokens
  • sequence preprocessing through a MaskNet-like block
  • Interaction Arch for behavior-aware non-sequence learning
  • Sequence Arch for context-aware sequence modeling
  • Cross Arch for low-dimensional information exchange
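
The Cross Arch exchange can be pictured as each branch compressing its state into a small summary before handing it to the other branch. A minimal sketch, assuming both branches share a token dimension; the class name and bottleneck size are illustrative, not the repo's actual API:

import torch
import torch.nn as nn

class CrossExchange(nn.Module):
    """Hypothetical Cross Arch: swap low-dimensional summaries between branches."""
    def __init__(self, dim: int, bottleneck: int = 32):
        super().__init__()
        # Compress each branch's summary before passing it to the other branch,
        # keeping the exchanged information low-dimensional.
        self.seq_to_nonseq = nn.Linear(dim, bottleneck)
        self.nonseq_to_seq = nn.Linear(dim, bottleneck)

    def forward(self, nonseq_summary, seq_summary):
        # Both summaries: [batch, dim]; outputs: [batch, bottleneck] each.
        return self.seq_to_nonseq(seq_summary), self.nonseq_to_seq(nonseq_summary)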

The default Interaction Arch follows the paper's reported benchmark setting:

  • DHEN
  • with DOT and DCN as the ensembled interaction modules
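
A minimal sketch of a DHEN-style layer that runs a DOT module and a DCN cross layer in parallel and fuses their outputs; the class names and the fusion projection below are assumptions for illustration, not the exact code in main_pytorch.py:

import torch
import torch.nn as nn

class DotInteraction(nn.Module):
    # Pairwise dot products between field tokens, flattened per example.
    def forward(self, x):                          # x: [batch, fields, dim]
        sims = torch.matmul(x, x.transpose(1, 2))  # [batch, fields, fields]
        return sims.flatten(1)                     # [batch, fields * fields]

class CrossLayer(nn.Module):
    # One DCN-style cross layer: x0 * W(x) + x on the flattened features.
    def __init__(self, dim):
        super().__init__()
        self.w = nn.Linear(dim, dim)
    def forward(self, x0, x):
        return x0 * self.w(x) + x

class DHENLayer(nn.Module):
    """Hypothetical DHEN layer: ensemble DOT and DCN, then project back."""
    def __init__(self, fields, dim):
        super().__init__()
        self.dot = DotInteraction()
        self.cross = CrossLayer(fields * dim)
        self.proj = nn.Linear(fields * fields + fields * dim, fields * dim)
    def forward(self, x):                          # x: [batch, fields, dim]
        flat = x.flatten(1)
        fused = torch.cat([self.dot(x), self.cross(flat, flat)], dim=-1)
        return self.proj(fused).view_as(x)         # back to [batch, fields, dim]

Stacking several such layers gives the hierarchical ensemble that DHEN refers to, e.g. y = DHENLayer(fields=8, dim=16)(torch.randn(4, 8, 16)).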

Because the runnable demo script reuses the local TAAC tensorization pipeline from OneTrans_Pytorch, the data interface is adapted to the available numeric tensors:

  • non-sequence features are treated as numeric fields and tokenized field-wise
  • sequence features are treated as per-step numeric channels and unified by SequenceMaskNet

This keeps the project runnable while preserving the main architecture of the paper.
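
As a concrete picture of the field-wise tokenization, here is a minimal sketch in the spirit of NumericFieldTokenizer; the value-scaled embedding scheme below is an assumption, not necessarily the repo's exact implementation:

import torch
import torch.nn as nn

class FieldTokenizer(nn.Module):
    """Hypothetical field-wise tokenizer: one learned embedding per numeric
    field, modulated by the field's scalar value."""
    def __init__(self, num_fields, dim):
        super().__init__()
        self.field_emb = nn.Parameter(torch.randn(num_fields, dim) * 0.02)
        self.field_bias = nn.Parameter(torch.zeros(num_fields, dim))

    def forward(self, x):  # x: [batch, num_fields]
        # Each scalar scales its field's embedding, yielding one token per field.
        return x.unsqueeze(-1) * self.field_emb + self.field_bias  # [batch, num_fields, dim]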

Repository Structure

main_pytorch.py

This file is the backbone reference implementation. It contains:

  • NumericFieldTokenizer
  • SequenceMaskNet
  • LinearCompressedEmbedding
  • SelfGating
  • DotProductInteraction
  • DCNInteraction
  • DHENInteraction
  • PersonalizedFFN
  • RotaryMultiHeadAttention
  • InterFormerBlock
  • InterFormerEncoder

This is the closest file to the core paper architecture.
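
Of these blocks, RotaryMultiHeadAttention builds on the most standard component; the rotary position embedding (RoPE) it relies on looks like the following generic sketch (the repo's exact implementation is not reproduced here):

import torch

def rotate_half(x):
    # Swap and negate the two halves of the last dimension.
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat([-x2, x1], dim=-1)

def apply_rope(q, k):
    # Standard RoPE applied to attention queries and keys of shape
    # [batch, heads, seq_len, head_dim].
    seq_len, head_dim = q.shape[-2], q.shape[-1]
    inv_freq = 1.0 / (10000 ** (torch.arange(0, head_dim, 2, device=q.device).float() / head_dim))
    positions = torch.arange(seq_len, device=q.device).float()
    freqs = torch.outer(positions, inv_freq)  # [seq_len, head_dim / 2]
    emb = torch.cat([freqs, freqs], dim=-1)   # [seq_len, head_dim]
    cos, sin = emb.cos(), emb.sin()
    return q * cos + rotate_half(q) * sin, k * cos + rotate_half(k) * sin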

models/

Task-level model wrappers live here.

Current file:

  • models/taac_interformer.py

This wraps the backbone into a trainable classifier:

  • tokenizes non-sequence features into field tokens
  • applies MaskNet-style sequence preprocessing
  • stacks InterFormer blocks
  • summarizes the final non-sequence and sequence states
  • applies a classification head
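
A minimal sketch of what that forward pass amounts to, assuming the encoder returns one summary per branch (the encoder interface and names below are assumptions):

import torch
import torch.nn as nn

class InterFormerClassifier(nn.Module):
    """Hypothetical task wrapper around an InterFormer-style encoder."""
    def __init__(self, encoder, summary_dim, num_classes=2):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Linear(2 * summary_dim, num_classes)

    def forward(self, non_seq, seq):
        # Assumed interface: encoder returns [batch, summary_dim] per branch.
        nonseq_summary, seq_summary = self.encoder(non_seq, seq)
        # Concatenate both branch summaries and classify.
        return self.head(torch.cat([nonseq_summary, seq_summary], dim=-1))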

utils/

Reusable non-model logic lives here.

  • utils/common.py: general helpers such as seed setup and split generation.
  • utils/metrics.py: accuracy and AUC computation.
  • utils/taac_data.py: dataset loading, schema handling, feature conversion, and tensor construction.

scripts/

Runnable entrypoints live here.

Current file:

  • scripts/run_taac2026_sample.py

This script is the main training entrypoint. It handles:

  • dataset download or local parquet reading
  • feature tensorization
  • model construction
  • AMP setup
  • training and validation loops
  • checkpoint save and resume
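
The AMP and checkpointing parts follow the standard PyTorch pattern; a minimal sketch (the batch keys and checkpoint layout are assumptions, not the script's exact code):

import torch

def train_amp(model, train_loader, optimizer, criterion, epochs, ckpt_path="checkpoint.pt"):
    # Mixed-precision training loop with per-epoch checkpointing.
    scaler = torch.cuda.amp.GradScaler()
    model.cuda().train()
    for epoch in range(epochs):
        for batch in train_loader:
            optimizer.zero_grad(set_to_none=True)
            with torch.autocast(device_type="cuda", dtype=torch.float16):
                logits = model(batch["non_seq"].cuda(), batch["seq"].cuda())
                loss = criterion(logits, batch["label"].cuda())
            scaler.scale(loss).backward()  # scale the loss to avoid fp16 underflow
            scaler.step(optimizer)         # unscales gradients, then steps
            scaler.update()
        torch.save({"model": model.state_dict(),
                    "optimizer": optimizer.state_dict(),
                    "epoch": epoch}, ckpt_path)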

Backbone Architecture

At a high level, the model flow is:

  1. Map non-sequence features into field tokens.
  2. Map sequence channels into dense sequence tokens through SequenceMaskNet.
  3. Compute an initial non-sequence summary and prepend it as CLS tokens.
  4. Apply the InterFormer block L times; within each block:
    • summarize non-sequence tokens
    • summarize sequence tokens through CLS, PMA, and recent tokens
    • update non-sequence tokens through the Interaction Arch
    • update sequence tokens through PFFN and rotary attention
  5. Recompute final non-sequence and sequence summaries.
  6. Flatten and concatenate both summaries for downstream prediction.
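
In code, one iteration of step 4 reduces to the pattern below. Mean pooling stands in for the paper's CLS/PMA summarizers, and both branches are assumed to share a token dimension; this is a sketch, not the repo's actual block:

import torch

def interformer_block(nonseq_tokens, seq_tokens, interaction_arch, seq_arch):
    # nonseq_tokens: [batch, fields, dim]; seq_tokens: [batch, seq_len, dim]
    # Steps 4a/4b: low-dimensional summaries of each branch.
    nonseq_summary = nonseq_tokens.mean(dim=1)  # [batch, dim]
    seq_summary = seq_tokens.mean(dim=1)        # [batch, dim]
    # Step 4c: update non-sequence tokens, conditioned on the sequence summary.
    nonseq_tokens = interaction_arch(nonseq_tokens + seq_summary.unsqueeze(1))
    # Step 4d: update sequence tokens, conditioned on the non-sequence summary.
    seq_tokens = seq_arch(seq_tokens + nonseq_summary.unsqueeze(1))
    return nonseq_tokens, seq_tokens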

Input Shapes

  • non-sequential input: [batch_size, non_seq_dim]
  • sequential input: [batch_size, seq_len, seq_feature_dim]
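
For example, with illustrative sizes:

import torch

batch_size, non_seq_dim = 32, 64
seq_len, seq_feature_dim = 50, 16
non_seq = torch.randn(batch_size, non_seq_dim)           # [batch_size, non_seq_dim]
seq = torch.randn(batch_size, seq_len, seq_feature_dim)  # [batch_size, seq_len, seq_feature_dim]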

Recommended Entry Points

Backbone sanity check

python main_pytorch.py

Training run

python scripts/run_taac2026_sample.py --epochs 5 --batch-size 32

Training with the default paper-style Interaction Arch

python scripts/run_taac2026_sample.py --epochs 5 --batch-size 32 --interaction-backbone dhen

Resume training

python scripts/run_taac2026_sample.py --epochs 10 --batch-size 32 --resume best_model_20260416_120000.pt --save-checkpoint

What This README Tries To Clarify

This repository is not an official release from the paper authors. It is:

  • a local PyTorch reimplementation of the published InterFormer architecture
  • adapted to the engineering conventions used in OneTrans_Pytorch
  • wrapped in a runnable training scaffold over the same local TAAC demo pipeline
