Release KERMT v2.0.0 — Contrastive KERMT · NVIDIA-BioNeMo/KERMT

This release introduces Contrastive KERMT, a graph-transformer foundation model for ADMET (absorption, distribution, metabolism, excretion, toxicity) property prediction. It extends the v1 KERMT architecture with new pretraining objectives that produce stronger downstream representations on multi-task ADMET benchmarks.

What's new in v2

Contrastive KERMT pretraining objective. Keeps the v1 graph-transformer encoder and its chemistry-specific vocabulary heads, and adds two new heads:
- Transformer-based SMILES reconstruction decoder.
- In-batch contrastive auxiliary classifier (cMIM).
All four objectives are jointly optimized under a single unified log-probability factorization. The decoder and contrastive head are used only during pretraining and are discarded before downstream fine-tuning, so the inference-time footprint matches v1.
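For readers unfamiliar with in-batch contrastive training, the general idea can be sketched as follows. This is a generic InfoNCE-style loss written in plain Python, not the cMIM classifier shipped in this repository: the function name `in_batch_contrastive_loss`, the cosine-similarity scoring, and the `temperature` parameter are all assumptions made for illustration. Each anchor embedding is scored against every candidate in the batch, and the loss rewards assigning the highest score to its own partner.

```python
import math


def in_batch_contrastive_loss(anchors, positives, temperature=0.1):
    """Mean InfoNCE-style loss over a batch (illustrative, not the KERMT cMIM head).

    Row i of `anchors` is expected to match row i of `positives`; every other
    row of `positives` in the batch serves as a negative for anchor i.
    """

    def normalize(vec):
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        return [x / norm for x in vec]

    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]

    total = 0.0
    for i, anchor in enumerate(a):
        # Temperature-scaled cosine similarity to every candidate in the batch.
        logits = [sum(x * y for x, y in zip(anchor, cand)) / temperature for cand in p]
        # Numerically stable log-sum-exp over the batch.
        peak = max(logits)
        log_z = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the matching index i as the target class.
        total += log_z - logits[i]
    return total / len(a)
```

When the pairs are correctly aligned the loss is close to zero, and it grows when anchors are more similar to other batch members than to their own partner.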
Agent skill suite. Eight SKILL.md-format skills under agent/skills/ for driving the full ADMET research lifecycle with LLM agents (Claude Code, Codex, Nemotron): environment setup, pretrain-from-scratch, continue-pretrain, add-cMIM-pretrain, fine-tune, embed, infer, and monitor. See agent/README.md for installation and use.
Training infrastructure. Adds mid-epoch resume, atomic checkpoint saves, configurable WandB integration, task-specific multi-task FFN heads with per-task dropout, and multi-worker data loaders.
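The atomic checkpoint saves mentioned above typically follow a write-then-rename pattern, which might look like the sketch below. It is a generic illustration, not the repository's code: `atomic_save` is a hypothetical helper, and `pickle` stands in for whatever serializer the training loop actually uses. The state is written to a temporary file in the same directory and then moved into place with `os.replace`, so a crash mid-write leaves the previous checkpoint intact.

```python
import os
import pickle
import tempfile


def atomic_save(state, path):
    """Write `state` to `path` so readers never observe a partial file.

    Hypothetical helper for illustration; pickle is an assumed serializer.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            pickle.dump(state, handle)
            handle.flush()
            os.fsync(handle.fileno())  # push bytes to disk before renaming
        os.replace(tmp_path, path)  # atomically swap in the new checkpoint
    except BaseException:
        # Clean up the partial temp file; the old checkpoint is untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Storing the epoch and step counters inside `state` is one way such a checkpoint could support resuming in the middle of an epoch.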

Pretrained weights

NGC Catalog: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/clara/resources/kermt-contrastive
Hugging Face: https://huggingface.co/nvidia/NV-KERMT-70M-v2

Both contain the same bundle: a .pt checkpoint (~282 MB) plus the three pretraining vocabulary files (pretrain_atom_vocab.json, pretrain_bond_vocab.json, pretrain_smiles_vocab.pkl). Load via the codebase in this repository.
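Before loading, it can help to confirm that a downloaded bundle is complete. The sketch below uses only the file names listed above; `check_bundle` and the directory argument are hypothetical, and the checkpoint's own file name is not assumed (any `.pt` file is accepted). It does not load the weights, which is done through the repository code.

```python
from pathlib import Path

# Vocabulary file names as listed in these release notes.
VOCAB_FILES = (
    "pretrain_atom_vocab.json",
    "pretrain_bond_vocab.json",
    "pretrain_smiles_vocab.pkl",
)


def check_bundle(bundle_dir):
    """Return (checkpoint file names, missing vocabulary file names).

    Hypothetical helper: it only inspects the directory and opens nothing.
    """
    root = Path(bundle_dir)
    checkpoints = sorted(p.name for p in root.glob("*.pt"))
    missing = [name for name in VOCAB_FILES if not (root / name).is_file()]
    return checkpoints, missing
```

A complete bundle should yield one checkpoint name and an empty `missing` list.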

License

Source code (this repository): Apache License, Version 2.0. See LICENSE.
Model weights (released on NGC and Hugging Face): NVIDIA Open Model License Agreement.

Companion materials

Manuscript / preprint: Xue et al., Probabilistic Contrastive Pretraining for Multi-task ADME Property Prediction. arXiv:2606.11508
