Skip to content

Fix AttributeError in RAG generate() for missing config fields#46035

Merged
Rocketknight1 merged 1 commit into
huggingface:mainfrom
Sriniketh24:fix/rag-outdated-examples
May 18, 2026
Merged

Fix AttributeError in RAG generate() for missing config fields#46035
Rocketknight1 merged 1 commit into
huggingface:mainfrom
Sriniketh24:fix/rag-outdated-examples

Conversation

@Sriniketh24
Copy link
Copy Markdown
Contributor

Fixes #46015.

What this does

RagSequenceForGeneration.generate() crashes with AttributeError: 'RagConfig' object has no attribute 'num_return_sequences' because the config never defined num_beams or num_return_sequences.

Fix (2 lines in modeling_rag.py): Uses getattr with a default of 1 (matching the documented defaults in the generate docstring) instead of direct attribute access.

Adding these fields directly to RagConfig was not viable — it triggers the generation utils validation error ("You have modified the pretrained model configuration to control generation"), since generation params belong in generation_config, not the model config.

Also removes two stale references to examples/rag/use_own_knowledge_dataset.py in retrieval_rag.py docstrings, which no longer exists in the repo.

Coordination

Tests run

from transformers import AutoConfig
config = AutoConfig.from_pretrained("facebook/rag-token-nq")
# Previously: AttributeError: 'RagConfig' object has no attribute 'num_return_sequences'
# Now: resolves to default of 1 via getattr
assert getattr(config, "num_return_sequences", 1) == 1
ruff check src/transformers/models/rag/  # All checks passed
ruff format --check src/transformers/models/rag/  # Already formatted

Note: pytest tests/models/rag/test_modeling_rag.py segfaults on macOS ARM due to faiss-cpu — this is a pre-existing environment issue, not related to this change.

AI assistance (Claude Code) was used; all changes were reviewed and tested manually.

RagSequenceForGeneration.generate() crashes with AttributeError because
it accesses self.config.num_return_sequences and self.config.num_beams,
but RagConfig never defines these fields. Use getattr with a default of
1 (matching the documented defaults) instead of direct attribute access.

Also removes stale references to examples/rag/use_own_knowledge_dataset.py
which no longer exists in the repo.

Fixes huggingface#46015
@github-actions
Copy link
Copy Markdown
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: rag

Copy link
Copy Markdown
Member

@Rocketknight1 Rocketknight1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, looks good!

@Rocketknight1 Rocketknight1 enabled auto-merge May 18, 2026 15:35
@Rocketknight1 Rocketknight1 added this pull request to the merge queue May 18, 2026
@HuggingFaceDocBuilderDev
Copy link
Copy Markdown

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Merged via the queue into huggingface:main with commit 2991f8d May 18, 2026
23 checks passed
@someone282801
Copy link
Copy Markdown

someone282801 commented May 18, 2026

Hi, thank you for take look to one part of #46015 and fix 1 of possibly bunch of issues this could have.

The issue of not defined attributes is gone due to default usage of 1 in this case but this not fixes main points as reference to an up to date example instead of removing reference pointed in.

We are supposed to be in this state unless i am missing something as the new label on main issue seems to be (Good First Issue), i don't know what does it means but could be related to fix 1 of 3 things maybe ?

  1. First example provided crashes directly. - Not fixed, same crash happens + documentation wasn't updated taking in count that tokenizer.prepare_seq2seq_batch not exists.
  2. Second example based in custom dataset, referencing to a not defined path. - Can't be considered as resolved because not provided required information as up to date way to have a custom dataset with embeddings.
  3. Third customized script for reveal internal non existent properties from RAGConfig. - Resolved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Outdated examples about RAG

4 participants