42 changes: 42 additions & 0 deletions .github/workflows/test.yml
@@ -50,6 +50,7 @@ jobs:
integrations-cloudflare-oauth-proxy: ${{ steps.filter.outputs.integrations-cloudflare-oauth-proxy }}
integrations-lockfiles: ${{ steps.filter.outputs.integrations-lockfiles }}
integrations-openai-agents: ${{ steps.filter.outputs.integrations-openai-agents }}
integrations-pipecat: ${{ steps.filter.outputs.integrations-pipecat }}
dev: ${{ steps.filter.outputs.dev }}
ci: ${{ steps.filter.outputs.ci }}
# Secrets are available for internal PRs, pull_request_review, and workflow_dispatch.
@@ -136,6 +137,8 @@ jobs:
- 'scripts/check-integration-lockfiles.sh'
integrations-openai-agents:
- 'hindsight-integrations/openai-agents/**'
integrations-pipecat:
- 'hindsight-integrations/pipecat/**'
dev:
- 'hindsight-dev/**'
ci:
@@ -599,6 +602,44 @@ jobs:
working-directory: ./hindsight-integrations/paperclip
run: npm test

test-pipecat-integration:
needs: [detect-changes]
if: >-
github.event_name != 'pull_request_review' &&
(github.event_name == 'workflow_dispatch' ||
needs.detect-changes.outputs.integrations-pipecat == 'true' ||
needs.detect-changes.outputs.ci == 'true')
runs-on: ubuntu-latest

steps:
- uses: actions/checkout@v6
with:
ref: ${{ github.event.pull_request.head.sha || '' }}

- name: Install uv
uses: astral-sh/setup-uv@v7
with:
enable-cache: true
prune-cache: false

- name: Set up Python
uses: actions/setup-python@v6
with:
python-version-file: ".python-version"

- name: Build pipecat integration
working-directory: ./hindsight-integrations/pipecat
run: uv build

- name: Install dependencies
working-directory: ./hindsight-integrations/pipecat
run: uv sync --frozen

- name: Run tests
working-directory: ./hindsight-integrations/pipecat
run: uv run pytest tests -v


build-control-plane:
needs: [detect-changes]
if: >-
@@ -2908,6 +2949,7 @@ jobs:
- test-cloudflare-oauth-proxy-integration
- build-chat-integration
- test-paperclip-integration
- test-pipecat-integration
- build-control-plane
- build-docs
- test-rust-cli
128 changes: 128 additions & 0 deletions hindsight-docs/docs-integrations/pipecat.md
@@ -0,0 +1,128 @@
---
sidebar_position: 22
title: "Pipecat Persistent Memory with Hindsight | Integration"
description: "Add persistent long-term memory to Pipecat voice AI pipelines via Hindsight. A single FrameProcessor slots between the user aggregator and LLM to recall context before each turn and retain conversation content after."
---

# Pipecat

Persistent long-term memory for [Pipecat](https://github.com/pipecat-ai/pipecat) voice AI pipelines via [Hindsight](https://vectorize.io/hindsight). A single `FrameProcessor` slots between your user context aggregator and LLM service — recalling relevant memories before each turn and retaining conversation content after.

## Quick Start

```bash
# 1. Start Hindsight (self-hosted)
pip install hindsight-all
export HINDSIGHT_API_LLM_API_KEY=your-openai-key
hindsight-api

# 2. Install the integration
pip install hindsight-pipecat
```

```python
from pipecat.pipeline.pipeline import Pipeline
from hindsight_pipecat import HindsightMemoryService

memory = HindsightMemoryService(
bank_id="user-123",
hindsight_api_url="http://localhost:8888",
)

pipeline = Pipeline([
transport.input(),
stt_service,
user_aggregator,
memory, # ← add between user_aggregator and LLM
llm_service,
assistant_aggregator,
tts_service,
transport.output(),
])
```

Or with [Hindsight Cloud](https://ui.hindsight.vectorize.io/signup):

```python
memory = HindsightMemoryService(
bank_id="user-123",
hindsight_api_url="https://api.hindsight.vectorize.io",
api_key="hsk_your_token_here",
)
```

## How It Works

```
New turn starts
└─ OpenAILLMContextFrame arrives
├─ Retain previous complete turn (user+assistant) — fire-and-forget
└─ Recall relevant memories for current user query
└─ Inject as <hindsight_memories> system message
└─ Forward enriched context to LLM
```

On each `OpenAILLMContextFrame`:

1. **Retain** — any new complete user+assistant turn pairs are sent to Hindsight asynchronously (non-blocking)
2. **Recall** — the latest user message is used as the search query; results are injected as a system message before the LLM sees the context
3. **Forward** — the enriched context frame is pushed downstream

Memory accumulates across calls. By the third or fourth turn, recall starts surfacing useful context that the pipeline didn't have to re-establish.
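
To make the loop concrete, here is a minimal sketch of a processor implementing it (not the shipped `HindsightMemoryService`). The `/retain` and `/recall` endpoints, their payloads, the `httpx` dependency, and the context accessors (`get_messages()`, `add_message()`) are assumptions for illustration; the real integration also handles turn pairing, recall budgets, and error handling.

```python
import asyncio
from typing import Optional

import httpx  # assumed HTTP client for this sketch
from pipecat.frames.frames import Frame
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContextFrame
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor


class MemoryProcessorSketch(FrameProcessor):
    """Illustrative only: mirrors the retain -> recall -> inject loop described above."""

    def __init__(self, bank_id: str, api_url: str, api_key: Optional[str] = None):
        super().__init__()
        self._bank_id = bank_id
        self._api_url = api_url.rstrip("/")
        self._headers = {"Authorization": f"Bearer {api_key}"} if api_key else {}
        self._retained = 0  # messages already sent to Hindsight (simplified turn tracking)

    async def process_frame(self, frame: Frame, direction: FrameDirection):
        await super().process_frame(frame, direction)

        if isinstance(frame, OpenAILLMContextFrame):
            # Accessor names may vary by Pipecat version.
            messages = frame.context.get_messages()

            # 1. Retain: send messages completed since the last turn, fire-and-forget.
            new_messages = messages[self._retained:]
            if new_messages:
                asyncio.create_task(self._retain(new_messages))
                self._retained = len(messages)

            # 2. Recall: search with the latest user utterance, inject results as a system message.
            last_user = next((m["content"] for m in reversed(messages) if m["role"] == "user"), None)
            if last_user:
                memories = await self._recall(last_user)
                if memories:
                    frame.context.add_message({
                        "role": "system",
                        "content": "<hindsight_memories>\n" + "\n".join(memories) + "\n</hindsight_memories>",
                    })

        # 3. Forward: push the (possibly enriched) frame downstream to the LLM service.
        await self.push_frame(frame, direction)

    async def _retain(self, turns: list[dict]) -> None:
        # Hypothetical endpoint and payload; see the Hindsight API docs for the real shape.
        async with httpx.AsyncClient() as client:
            await client.post(f"{self._api_url}/banks/{self._bank_id}/retain",
                              json={"messages": turns}, headers=self._headers)

    async def _recall(self, query: str) -> list[str]:
        # Hypothetical endpoint and payload; see the Hindsight API docs for the real shape.
        async with httpx.AsyncClient() as client:
            resp = await client.post(f"{self._api_url}/banks/{self._bank_id}/recall",
                                     json={"query": query}, headers=self._headers)
            return [m["text"] for m in resp.json().get("memories", [])]
```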

## Configuration

```python
HindsightMemoryService(
bank_id="user-123", # Required: memory bank to use
hindsight_api_url="...", # Hindsight API URL
api_key="hsk_...", # API key (Hindsight Cloud)
recall_budget="mid", # "low", "mid", or "high"
recall_max_tokens=4096, # Max tokens for recall results
enable_recall=True, # Inject memories before LLM
enable_retain=True, # Store turns after each exchange
memory_prefix="Relevant memories from past conversations:\n",
)
```

### Global Configuration

```python
from hindsight_pipecat import configure

configure(
hindsight_api_url="http://localhost:8888",
api_key="hsk_...",
recall_budget="mid",
)

# Now create services without repeating connection details
memory = HindsightMemoryService(bank_id="user-123")
```

## Compatibility

Tested with Pipecat `v0.0.108`. The processor handles both the newer `LLMContextFrame` and the deprecated `OpenAILLMContextFrame`, so it remains compatible across recent Pipecat versions.
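
One plausible way to accept both frame types without pinning a single Pipecat release is a guarded import; the module path shown for `LLMContextFrame` is an assumption and may differ in your version.

```python
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContextFrame

try:
    # Newer Pipecat releases; the exact module path is an assumption.
    from pipecat.frames.frames import LLMContextFrame
    CONTEXT_FRAME_TYPES = (LLMContextFrame, OpenAILLMContextFrame)
except ImportError:
    CONTEXT_FRAME_TYPES = (OpenAILLMContextFrame,)

# Inside process_frame():
#     if isinstance(frame, CONTEXT_FRAME_TYPES):
#         ...recall/retain/inject as described above...
```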

## Manual Testing

The `examples/` directory includes an interactive text-based chat simulator for testing memory recall/retain without requiring Daily/Deepgram/Cartesia API keys:

```bash
python examples/interactive_chat.py --bank demo-user
```

The `examples/basic_pipeline.py` script shows the full voice pipeline with Daily + Deepgram + OpenAI + Cartesia.

## Prerequisites

A running Hindsight instance:

**Self-hosted:**
```bash
pip install hindsight-all
export HINDSIGHT_API_LLM_API_KEY=your-api-key
hindsight-api # starts on http://localhost:8888
```

**Hindsight Cloud:** [Sign up](https://ui.hindsight.vectorize.io/signup) — no self-hosting required.
10 changes: 10 additions & 0 deletions hindsight-docs/src/data/integrations.json
@@ -200,6 +200,16 @@
"link": "/sdks/integrations/openai-agents",
"icon": "/img/icons/openai-agents.svg"
},
{
"id": "pipecat",
"name": "Pipecat",
"description": "Persistent memory for Pipecat voice AI pipelines via a FrameProcessor that recalls context before each turn and retains conversation content after.",
"type": "official",
"by": "hindsight",
"category": "tool",
"link": "/sdks/integrations/pipecat",
"icon": "/img/icons/pipecat.png"
},
{
"id": "hindclaw",
"name": "HindClaw",
Binary file added hindsight-docs/static/img/icons/pipecat.png
11 changes: 11 additions & 0 deletions hindsight-integrations/pipecat/CHANGELOG.md
@@ -0,0 +1,11 @@
# Changelog

## 0.1.0 (2026-04-07)

- Initial release: `HindsightMemoryService` FrameProcessor for Pipecat voice AI pipelines
- Retain/recall/inject loop on each `LLMContextFrame`
- Fire-and-forget async retain after each complete user+assistant turn
- Memory injected as `<hindsight_memories>` system message before LLM call
- Supports both `LLMContextFrame` and deprecated `OpenAILLMContextFrame`
- Configurable recall budget (`low`, `mid`, `high`) and token limit
- Global `configure()` helper for shared connection settings
21 changes: 21 additions & 0 deletions hindsight-integrations/pipecat/LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2025 Vectorize AI, Inc.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
109 changes: 109 additions & 0 deletions hindsight-integrations/pipecat/README.md
@@ -0,0 +1,109 @@
# Hindsight Pipecat Integration

Persistent long-term memory for [Pipecat](https://github.com/pipecat-ai/pipecat) voice AI pipelines via [Hindsight](https://vectorize.io/hindsight). A single `FrameProcessor` slots between your user context aggregator and LLM service — recalling relevant memories before each turn and retaining conversation content after.

## Quick Start

```bash
pip install hindsight-pipecat
```

```python
from pipecat.pipeline.pipeline import Pipeline
from hindsight_pipecat import HindsightMemoryService

memory = HindsightMemoryService(
bank_id="user-123",
hindsight_api_url="http://localhost:8888",
)

pipeline = Pipeline([
transport.input(),
stt_service,
user_aggregator,
memory, # ← add between user_aggregator and LLM
llm_service,
assistant_aggregator,
tts_service,
transport.output(),
])
```

Or with [Hindsight Cloud](https://ui.hindsight.vectorize.io/signup):

```python
memory = HindsightMemoryService(
bank_id="user-123",
hindsight_api_url="https://api.hindsight.vectorize.io",
api_key="hsk_your_token_here",
)
```

## How It Works

```
New turn starts
└─ OpenAILLMContextFrame arrives
├─ Retain previous complete turn (user+assistant) — fire-and-forget
└─ Recall relevant memories for current user query
└─ Inject as <hindsight_memories> system message
└─ Forward enriched context to LLM
```

On each `OpenAILLMContextFrame`:

1. **Retain** — any new complete user+assistant turn pairs are sent to Hindsight asynchronously (non-blocking)
2. **Recall** — the latest user message is used as the search query; results are injected as a system message before the LLM sees the context
3. **Forward** — the enriched context frame is pushed downstream

Memory accumulates across calls. By the third or fourth turn, recall starts surfacing useful context that the pipeline didn't have to re-establish.
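
To make step 2 concrete, here is one plausible shape for the injected system message, combining the `<hindsight_memories>` tag with the default `memory_prefix` from the configuration section below; the exact formatting used by `HindsightMemoryService` may differ.

```python
def build_memory_message(
    memories: list[str],
    prefix: str = "Relevant memories from past conversations:\n",
) -> dict:
    """Assumed formatting: wrap recalled memories in a <hindsight_memories> system message."""
    body = prefix + "\n".join(f"- {m}" for m in memories)
    return {
        "role": "system",
        "content": f"<hindsight_memories>\n{body}\n</hindsight_memories>",
    }


# Example: injected ahead of the current user turn before the context reaches the LLM.
print(build_memory_message(["User's name is Dana.", "Prefers metric units."]))
```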

## Prerequisites

A running Hindsight instance:

**Self-hosted:**
```bash
pip install hindsight-all
export HINDSIGHT_API_LLM_API_KEY=your-api-key
hindsight-api # starts on http://localhost:8888
```

**Hindsight Cloud:** [Sign up](https://ui.hindsight.vectorize.io/signup) — no self-hosting required.

## Configuration

```python
HindsightMemoryService(
bank_id="user-123", # Required: memory bank to use
hindsight_api_url="...", # Hindsight API URL
api_key="hsk_...", # API key (Hindsight Cloud)
recall_budget="mid", # "low", "mid", or "high"
recall_max_tokens=4096, # Max tokens for recall results
enable_recall=True, # Inject memories before LLM
enable_retain=True, # Store turns after each exchange
memory_prefix="Relevant memories from past conversations:\n",
)
```

### Global configuration

```python
from hindsight_pipecat import configure

configure(
hindsight_api_url="http://localhost:8888",
api_key="hsk_...",
recall_budget="mid",
)

# Now create services without repeating connection details
memory = HindsightMemoryService(bank_id="user-123")
```

## Running Tests

```bash
pip install pytest pytest-asyncio
pytest tests/ -v
```