
fix: pass trust_remote_code=True to remaining AutoConfig.from_pretrained sites#2496

Open
bzantium wants to merge 1 commit into
NVIDIA-NeMo:mainfrom
bzantium:fix/autoconfig-trust-remote-code

Conversation

@bzantium

What does this PR do?

Three call sites in the codebase still invoked AutoConfig.from_pretrained without trust_remote_code=True. This crashes for any model that ships custom modeling code via auto_map in config.json: transformers tries to interactively prompt the user, the prompt raises EOFError inside a Ray worker (stdin is closed), and transformers then surfaces the underlying ValueError:

```
ValueError: The repository <path> contains custom code which must be executed
to correctly load the model. ... Please pass the argument
`trust_remote_code=True` to allow custom code to be run.
```

The rest of the codebase already passes trust_remote_code=True at every other from_pretrained site (flops_tracker.py, native_checkpoint.py, sglang_worker.py, vllm_worker.py, dtensor_policy_worker.py, automodel/setup.py, huggingface/common.py, megatron/setup.py, megatron/community_import.py, algorithms/utils.py), so this PR just restores consistency at the three remaining sites.
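The failure mode (an interactive prompt hitting a closed stdin) can be reproduced without Ray or transformers. A minimal sketch, where `prompt_for_trust` is a hypothetical stand-in for transformers' interactive prompt, not its real API:

```python
import io
import sys

def prompt_for_trust() -> bool:
    """Hypothetical simplification of transformers' interactive
    trust_remote_code prompt (the real prompt loops and validates input)."""
    answer = input("Do you wish to run the custom code? [y/N] ")
    return answer.strip().lower() == "y"

def run_in_worker() -> str:
    """Simulate a Ray worker, where stdin is closed and cannot be read."""
    old_stdin = sys.stdin
    sys.stdin = io.StringIO()  # empty stream: reading it hits EOF immediately
    try:
        prompt_for_trust()
        return "prompted"
    except EOFError:
        # transformers catches this case and surfaces the ValueError asking
        # the caller to pass trust_remote_code=True
        return "EOFError"
    finally:
        sys.stdin = old_stdin
```

Here `run_in_worker()` returns `"EOFError"`; in the real stack, transformers converts that EOFError into the ValueError quoted above.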

Patched sites:

| File | Function | Why it crashes |
| --- | --- | --- |
| `nemo_rl/algorithms/distillation.py` | `check_vocab_equality` | Probes student and teacher configs at setup. The crash is fatal because it happens before any model weights are loaded, so distillation is fully blocked for any custom-arch student or teacher. |
| `nemo_rl/models/generation/vllm/quantization/fp8.py` | `init_fp8` | Probes the model config to decide FP8 quantization plumbing for vLLM. |
| `nemo_rl/models/megatron/draft/utils.py` | `build_draft_model` | Probes the draft model config for Megatron speculative decoding. |
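At each of the three sites the change has the same one-line shape (variable names here are illustrative; the actual names at each call site may differ):

```diff
-config = AutoConfig.from_pretrained(model_name)
+config = AutoConfig.from_pretrained(model_name, trust_remote_code=True)
```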

The distillation.py site is the one I actually hit in practice (on-policy distillation with a custom-arch HF student that ships modeling_*.py and configuration_*.py via auto_map). The other two are the same precondition violation in lower-traffic paths and were caught while looking at the surrounding AutoConfig.from_pretrained usages.

Issues

None filed; this is a small consistency fix discovered while debugging a custom-arch distillation run.

Usage

No new API or configuration. After this patch, a distillation run with a custom-arch student or teacher, e.g.

```yaml
policy:
  model_name: /path/to/custom-arch-student   # config.json has auto_map.AutoModelForCausalLM
distillation:
  teacher:
    model_name: /path/to/custom-arch-teacher
```

proceeds past check_vocab_equality instead of dying at setup.

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

  • No new tests. This is a three-line consistency fix mirroring existing call sites; happy to add a unit test that mocks AutoConfig.from_pretrained and asserts the kwarg is forwarded if the reviewers prefer.
  • Verified locally by applying the same patch in a downstream fork and re-running the failing distillation recipe; check_vocab_equality passes and the driver proceeds normally.
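For reference, the mocked unit test mentioned above could look roughly like this sketch (`load_student_config` is a hypothetical stand-in for a patched call site such as `check_vocab_equality`, not a function in the repo):

```python
from unittest import mock

def load_student_config(model_name, from_pretrained):
    """Hypothetical stand-in for a patched call site; the real code calls
    transformers.AutoConfig.from_pretrained directly."""
    return from_pretrained(model_name, trust_remote_code=True)

def test_trust_remote_code_is_forwarded():
    fake = mock.Mock(return_value="dummy-config")
    cfg = load_student_config("/path/to/custom-arch-student", fake)
    # The kwarg must reach from_pretrained, mirroring the PR's one-line fix.
    fake.assert_called_once_with(
        "/path/to/custom-arch-student", trust_remote_code=True
    )
    assert cfg == "dummy-config"
```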

…ned call sites

Three call sites still invoked AutoConfig.from_pretrained without
trust_remote_code=True, which crashes for any model that ships custom
modeling code via auto_map in config.json. The rest of the codebase
already passes trust_remote_code=True at every from_pretrained site,
so this restores consistency.

* nemo_rl/algorithms/distillation.py: check_vocab_equality probes both
  student and teacher configs at setup; the failure is fatal because
  it happens before any model weights are loaded.
* nemo_rl/models/generation/vllm/quantization/fp8.py: init_fp8 probes
  the model config to decide FP8 plumbing.
* nemo_rl/models/megatron/draft/utils.py: build_draft_model probes the
  draft model config for speculative decoding.

Without the kwarg, transformers tries to interactively prompt
"Do you wish to run the custom code?" via input(), which raises
EOFError inside Ray workers (stdin is closed) and then surfaces the
proper ValueError asking for trust_remote_code=True.

Signed-off-by: Minho Ryu <ryumin93@gmail.com>
@bzantium bzantium requested review from a team as code owners May 15, 2026 02:00

copy-pr-bot Bot commented May 15, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@bzantium bzantium changed the title fix: pass trust_remote_code=True to remaining AutoConfig.from_pretrained call sites fix: pass trust_remote_code=True to remaining AutoConfig.from_pretrained sites May 15, 2026