Fix duplicated replies when HuggingFaceLocalGenerator uses multiple stop words by 18062706139fcz · Pull Request #11414 · deepset-ai/haystack

18062706139fcz · 2026-05-27T03:34:52Z

Summary

HuggingFaceLocalGenerator currently duplicates replies when multiple stop_words are configured. This PR updates the stop word post-processing so each stop word is removed sequentially from the existing replies instead of producing a cross-product of replies and stop words.

Problem

When stop_words contains more than one entry, the generator returns N×M replies instead of N, where N is the number of generated replies and M is the number of stop words.

Root Cause

The current implementation uses a list comprehension with two nested for clauses, one over replies and one over self.stop_words:

[reply.replace(stop_word, "").rstrip() for reply in replies for stop_word in self.stop_words]

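That creates one output entry per (reply, stop_word) pair, so N replies and M stop words yield N×M entries. Each entry also has only one stop word removed, so the other stop words remain in the text.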
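To make the cross-product concrete, here is a minimal standalone reproduction of that comprehension. The sample replies and stop words are invented for illustration and do not come from the PR or from Haystack; plain local variables stand in for the generator's attributes.

```python
# Hypothetical sample data, chosen only to illustrate the behaviour.
replies = ["Hello world STOP", "Goodbye END"]
stop_words = ["STOP", "END"]

# Same shape as the comprehension quoted above: the two "for" clauses are
# nested, so every reply is paired with every stop word.
buggy = [reply.replace(stop_word, "").rstrip() for reply in replies for stop_word in stop_words]

print(len(replies), len(buggy))  # 2 4
print(buggy)
# ['Hello world', 'Hello world STOP', 'Goodbye END', 'Goodbye']
```

Two replies and two stop words produce four entries, and half of them still contain a stop word because each entry had only one of the two removed.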

Fix

Apply stop words sequentially to the current reply list:

iterate over self.stop_words
rewrite replies in place for each stop word
preserve the original number of replies while still stripping all configured stop words
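The diff itself is not shown on this page, so the following is only a sketch of the sequential approach described in the list above. The function name strip_stop_words and the sample data are hypothetical; the actual change lives inside HuggingFaceLocalGenerator and may be written differently.

```python
def strip_stop_words(replies: list[str], stop_words: list[str]) -> list[str]:
    """Remove every stop word from every reply, keeping one output per reply.

    Hypothetical helper for illustration; not the code from the PR.
    """
    for stop_word in stop_words:
        # Rebuild the reply list once per stop word, so the length never changes.
        replies = [reply.replace(stop_word, "").rstrip() for reply in replies]
    return replies


# Hypothetical sample data.
cleaned = strip_stop_words(["Hello world STOP", "Goodbye END"], ["STOP", "END"])
print(cleaned)  # ['Hello world', 'Goodbye']
```

Because the outer loop runs over stop words and the comprehension only runs over replies, the result always has as many entries as the input, and each entry has passed through every stop word.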

Tests

Ran:

hatch run test:unit test/components/generators/test_hugging_face_local_generator.py

Result:

25 passed
3 deselected

Added a regression test that verifies multiple stop words do not duplicate replies.

vercel · 2026-05-27T03:34:57Z

Someone is attempting to deploy a commit to the deepset Team on Vercel.

A member of the Team first needs to authorize it.

CLAassistant · 2026-05-27T03:35:00Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

ryker seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

julian-risch · 2026-05-29T12:15:02Z

@18062706139fcz Thank you for opening this pull request. We are closing this PR as a duplicate of #11413.
If you would like to contribute with another PR, please make sure to agree to our CLA (see #11414 (comment)) and make sure that your commits are linked to your account: https://help.github.com/articles/why-are-my-commits-linked-to-the-wrong-user/#commits-are-not-linked-to-any-user

Fix HuggingFaceLocalGenerator stop word deduplication

5c69f0f

18062706139fcz requested a review from a team as a code owner May 27, 2026 03:34

18062706139fcz requested review from julian-risch and removed request for a team May 27, 2026 03:34

github-actions Bot added the topic:tests and type:documentation labels May 27, 2026

sachinn854 mentioned this pull request May 27, 2026

bug: HuggingFaceLocalGenerator returns N×M replies instead of N when stop_words has multiple entries #11409

Open

1 task

julian-risch closed this May 29, 2026

18062706139fcz wants to merge 1 commit into deepset-ai:main from 18062706139fcz:fix-11409-hf-local-generator-stop-words


3 participants
