Context summarization feature implementation #3621

markbackman · 2026-02-06T19:12:40Z

I think this is fine to start. We will probably want to iterate until we land on something optimized.

Original file line number	Diff line number	Diff line change
		@@ -0,0 +1 @@
		- Added new frames for context summarization: `LLMContextSummaryRequestFrame` and `LLMContextSummaryResultFrame`.

-Original file line number
+Diff line change
@@ -0,0 +1,5 @@
+    - Added context summarization feature to automatically compress conversation history when conversation length limits (by token or message count) are reached, enabling efficient long-running conversations.
+      - Configure via `enable_context_summarization=True` in `LLMAssistantAggregatorParams`
+      - Customize behavior with `LLMContextSummarizationConfig` (max tokens, thresholds, etc.)
+      - Automatically preserves incomplete function call sequences during summarization
+      - See new examples: `examples/foundational/54-context-summarization-openai.py` and `examples/foundational/54a-context-summarization-google.py`

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Context summarization feature implementation #3621

Uh oh!

Diff view

Diff view

There are no files selected for viewing

markbackman Feb 6, 2026

Uh oh!

Uh oh!

-Original file line number
+Diff line change
@@ -0,0 +1,307 @@
+    # Code Cleanup Skill
+    The **Code Cleanup Skill** reviews, refactors, and documents code changes in your current branch, ensuring alignment with **Pipecat’s architecture, coding standards, and example patterns**.
+    It focuses on **readability, correctness, performance, and consistency**, while avoiding breaking changes.
+    ---
+    ## Skill Overview
+    This skill analyzes all changes introduced in your branch and performs the following actions:
+. **Analyze Branch Changes**
+       - Review uncommitted changes and outgoing commits
+. **Refactor for Readability**
+       - Improve clarity, naming, structure, and modern Python usage
+. **Enhance Performance**
+       - Identify safe, conservative optimization opportunities
+. **Add Documentation**
+       - Apply Pipecat-style, Google-format docstrings
+. **Ensure Pattern Consistency**
+       - Match existing Pipecat services, pipelines, and examples
+. **Validate Examples**
+       - Ensure examples follow foundational patterns (e.g. `07-interruptible.py`)
+    ---
+    ## Usage
+    Invoke the skill using any of the following commands:
+    - “Clean up my branch code”
+    - “Refactor the changes in my branch”
+    - “Review and improve my branch code”
+    - `/cleanup`
+    ---
+    ## What This Skill Does
+    ### 1. Analyze Branch Changes
+    The skill retrieves all uncommitted changes and outgoing commits to understand:
+    - New files added
+    - Modified files
+    - Code additions and deletions
+    - Overall scope and intent of changes
+    ---
+    ### 2. Code Refactoring
+    #### Readability Improvements
+    - Replace tuples with named classes or dataclasses
+    - Improve variable, method, and class naming
+    - Extract complex logic into well-named helper methods
+    - Add missing type hints
+    - Simplify nested or complex conditionals
+    - Replace deprecated methods and features
+    - Normalize formatting to match Pipecat style
+    #### Performance Enhancements
+    - Identify inefficient loops or repeated work
+    - Suggest appropriate data structures
+    - Optimize async workflows and I/O
+    - Remove redundant operations
+    > Performance changes are conservative and non-breaking.
+    ---
+    ### 3. Documentation
+    Documentation follows **Google-style docstrings**, consistent with Pipecat conventions.
+    #### Class Documentation
+    ```python
+    class ExampleService:
+        """Brief one-line description.
+        Detailed explanation of the class purpose, responsibilities,
+        and important behaviors.
+        Supported features:
+        - Feature 1
+        - Feature 2
+        - Feature 3
+        """
+    ```
+    #### Method Documentation
+    ```python
+    def process_data(self, data: str, options: Optional[dict] = None) -> bool:
+        """Process incoming data with optional configuration.
+        Args:
+            data: The input data to process.
+            options: Optional configuration dictionary.
+        Returns:
+            True if processing succeeded, False otherwise.
+        Raises:
+            ValueError: If data is empty or invalid.
+        """
+    ```
+    #### Pydantic Model Parameters
+    ```python
+    class InputParams(BaseModel):
+        """Configuration parameters for the service.
+        Parameters:
+            timeout: Request timeout in seconds.
+            retry_count: Number of retry attempts.
+            enable_logging: Whether to enable debug logging.
+        """
+        timeout: Optional[float] = None
+        retry_count: int = 3
+        enable_logging: bool = False
+    ```
+    ---
+    ### 4. Pattern Consistency Checks
+    #### Service Classes
+    - Correct inheritance (`TTSService`, `STTService`, `LLMService`)
+    - Consistent constructor signatures
+    - Frame emission patterns
+    - Metrics support:
+      - `can_generate_metrics()`
+      - TTFB metrics
+      - Usage metrics
+    - Alignment with similar existing services
+    #### Examples
+    Validated against `examples/foundational/07-interruptible.py`:
+    - Proper `create_transport()` usage
+    - Correct pipeline structure
+    - Task setup and observers
+    - Event handler registration
+    - Runner and bot entrypoint consistency
+    ---
+    ### 5. Specific Implementation Patterns
+    #### Service Implementation
+    ```python
+    class ExampleTTSService(TTSService):
+        def __init__(self, *, api_key: Optional[str] = None, **kwargs):
+            super().__init__(**kwargs)
+            self._api_key = api_key or os.getenv("SERVICE_API_KEY")
+        def can_generate_metrics(self) -> bool:
+            return True
+        async def run_tts(self, text: str) -> AsyncGenerator[Frame, None]:
+            try:
+                await self.start_ttfb_metrics()
+                yield TTSStartedFrame()
+                # ... processing ...
+                yield TTSAudioRawFrame(...)
+            finally:
+                await self.stop_ttfb_metrics()
+    ```
+    ---
+    #### Example Structure Pattern
+    ```python
+    transport_params = {
+        "daily": lambda: DailyParams(...),
+        "twilio": lambda: FastAPIWebsocketParams(...),
+        "webrtc": lambda: TransportParams(...),
+    }
+    async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
+        stt = DeepgramSTTService(...)
+        tts = SomeTTSService(...)
+        llm = OpenAILLMService(...)
+        context = LLMContext(messages)
+        user_aggregator, assistant_aggregator = LLMContextAggregatorPair(...)
+        pipeline = Pipeline([...])
+        task = PipelineTask(pipeline, params=..., observers=[...])
+        @transport.event_handler("on_client_connected")
+        async def on_client_connected(transport, client):
+            await task.queue_frames([LLMRunFrame()])
+        runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
+        await runner.run(task)
+    async def bot(runner_args: RunnerArguments):
+        """Main bot entry point compatible with Pipecat Cloud."""
+        transport = await create_transport(runner_args, transport_params)
+        await run_bot(transport, runner_args)
+    ```
+    ---
+    ## Execution Flow
+. Fetch uncommitted and outgoing changes
+. Categorize files (services, examples, tests, utilities)
+. Analyze each file:
+       - Readability
+       - Performance
+       - Documentation
+       - Pattern consistency
+. Generate actionable recommendations
+. Apply Pipecat standards
+    ---
+    ## Examples
+    ### Before: Tuple Usage
+    ```python
+    def get_audio_info(self) -> Tuple[int, int]:
+        return (48000, 1)
+    ```
+    ### After: Named Class
+    ```python
+    class AudioInfo:
+        """Audio configuration information.
+        Parameters:
+            sample_rate: Sample rate in Hz.
+            num_channels: Number of audio channels.
+        """
+        sample_rate: int
+        num_channels: int
+    def get_audio_info(self) -> AudioInfo:
+        return AudioInfo(sample_rate=48000, num_channels=1)
+    ```
+    ---
+    ### Before: Missing Documentation
+    ```python
+    class NewTTSService(TTSService):
+        def __init__(self, api_key: str, voice: str):
+            self._api_key = api_key
+            self._voice = voice
+    ```
+    ### After: Fully Documented
+    ```python
+    class NewTTSService(TTSService):
+        """Text-to-speech service using NewProvider API.
+        Streams PCM audio and emits TTSAudioRawFrame frames compatible
+        with Pipecat transports.
+        Supported features:
+        - Text-to-speech synthesis
+        - Streaming PCM audio
+        - Voice customization
+        - TTFB metrics
+        """
+        def __init__(self, *, api_key: str, voice: str, **kwargs):
+            """Initialize the NewTTSService.
+            Args:
+                api_key: API key for authentication.
+                voice: Voice identifier to use.
+                **kwargs: Additional arguments passed to the parent service.
+            """
+            super().__init__(**kwargs)
+            self._api_key = api_key
+            self.set_voice(voice)
+    ```
+    ---
+    ## Notes
+    - Non-breaking improvements only
+    - Backward compatibility preserved
+    - Conservative performance changes
+    - Google-style docstrings
+    - Pattern checks follow recent Pipecat code

Context summarization feature implementation #3621

Uh oh!

Context summarization feature implementation #3621

Uh oh!

Uh oh!

Diff view

Diff view

There are no files selected for viewing

markbackman Feb 6, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!