Skip to content

docs: add note on filtering prompt-elicited inline tags (e.g. <thinking>) before TTS#976

Merged
markbackman merged 2 commits into
pipecat-ai:mainfrom
scttbnsn:docs/filter-custom-inline-tags
Jul 3, 2026
Merged

docs: add note on filtering prompt-elicited inline tags (e.g. <thinking>) before TTS#976
markbackman merged 2 commits into
pipecat-ai:mainfrom
scttbnsn:docs/filter-custom-inline-tags

Conversation

@scttbnsn

@scttbnsn scttbnsn commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

Summary

Follow-up to pipecat-ai/pipecat#4901. When a system prompt asks the LLM to reason inside inline <thinking>...</thinking> tags with extended thinking off, the reasoning streams back as plain text and gets spoken by TTS. @markbackman's investigation there concluded this belongs in docs rather than provider code: prefer native extended thinking, and strip deliberately-elicited inline tags at the text layer. This adds that note.

What changed

  • New "Removing Custom Inline Tags" section in pipecat/learn/text-to-speech.mdx with the PatternPairAggregator + MatchAction.REMOVE snippet from the issue, plus a Tip pointing at native extended thinking as the preferred path when the goal is genuine reasoning.
  • A short bullet in the Notes of api-reference/server/services/llm/anthropic.mdx cross-linking the new section, since that's where someone debugging spoken thinking text with Anthropic looks first.

Verification

  • Snippet imports and behavior checked against pipecat main: LLMTextProcessor(text_aggregator=...), add_pattern(..., action=MatchAction.REMOVE), and LLMThoughtTextFrame routing in AnthropicLLMService.
  • npx prettier clean on both files.

- 📝 docs(learn): add 'Removing Custom Inline Tags' section to text-to-speech page with PatternPairAggregator + MatchAction.REMOVE snippet and a Tip preferring native extended thinking
- 📝 docs(api-reference): cross-link the new section from the Anthropic service Notes

Addresses pipecat-ai/pipecat#4901

- **Prompt caching**: When `enable_prompt_caching` is enabled, Anthropic caches repeated context to reduce costs. Cache control markers are automatically added to the most recent user messages. This is most effective for conversations with large system prompts or long conversation histories.
- **Extended thinking**: Enabling thinking increases response quality for complex tasks but adds latency. When `type="enabled"`, you must provide a `budget_tokens` value (minimum 1024 with current models). Extended thinking is disabled by default.
- **Prompt-elicited `<thinking>` tags**: If your system prompt asks the model to reason inside inline tags rather than enabling extended thinking, that reasoning is ordinary text and will be spoken by TTS. Prefer the `thinking` parameter; for inline tags you deliberately keep, see [Removing Custom Inline Tags](/pipecat/learn/text-to-speech#removing-custom-inline-tags).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This note makes sense.

Rather than adding a new subsection to the learning guides, it might make sense to just point the developer directly to the PatternPairAggregator. In the PatternPairAggregator, we can add a new, generic section about removing tags, which we can link to. Something like this would do the trick:

### Removing Tagged Content

To drop content from the text stream entirely, register a pattern with `MatchAction.REMOVE`. The tags and everything between them are removed before reaching downstream processors — nothing is spoken by TTS and nothing lands in the conversation context. This is useful when your prompt elicits inline tags whose content is not meant for the user, such as reasoning tags (e.g., `<thinking>...</thinking>`) or annotations intended for other processors:

```python
from pipecat.processors.aggregators.llm_text_processor import LLMTextProcessor
from pipecat.utils.text.pattern_pair_aggregator import MatchAction, PatternPairAggregator

pattern_aggregator = PatternPairAggregator()
pattern_aggregator.add_pattern(
    type="thinking",
    start_pattern="<thinking>",
    end_pattern="</thinking>",
    action=MatchAction.REMOVE,
)

# Set the aggregator on an LLMTextProcessor
llm_text_processor = LLMTextProcessor(text_aggregator=pattern_aggregator)

# add the llm_text_processor to your pipeline after the llm and before the tts
# llm -> llm_text_processor -> tts

Because this filters the text stream itself, it works with any LLM provider and any custom inline tag.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in b117685. Used your text as a new "Removing Tagged Content" example on the PatternPairAggregator page, dropped the learn-guide section, and pointed the anthropic note at the new anchor.

Comment thread pipecat/learn/text-to-speech.mdx Outdated
# llm -> llm_text_processor -> tts
```

### Removing Custom Inline Tags

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the other comment, I think we'll want to remove this.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

- 🗑️ remove(learn): drop the new text-to-speech section per review
- 📝 docs(api-reference): add 'Removing Tagged Content' usage example to pattern-pair-aggregator
- 🔄 refactor(api-reference): point the Anthropic note at the new anchor

Review feedback from pipecat-ai#976

@markbackman markbackman left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks for taking care of this 🙇

@markbackman markbackman merged commit b4cb081 into pipecat-ai:main Jul 3, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants