Commit aefb3fc

fix: improve async batch retain with large payloads (#366)
* fix: improve async batch retain with large payloads
* fix: improve async batch retain with large payloads
* api
* api
* api
* api
* api
* Clean up perf benchmark: keep only Python files
  - Remove README.md and PERFORMANCE_FINDINGS.md
  - Remove results/ JSON files (gitignored)
  - Remove test_data/ directory
  - Keep only __init__.py and retain_perf.py
* docs: explain automatic batch optimization for async retain
  - Add section explaining Hindsight automatically handles batch sizing
  - Users don't need to manually tune batch sizes with async mode
  - Hindsight splits large batches (>10k tokens) into optimized sub-batches
  - Include example showing best practices
* docs: remove emojis and code example from performance page
* fix: correct OperationDetails type to match API response
  - Change optional fields to use | null instead of ?
  - Fixes TypeScript compilation error in control plane build
* fix: use discriminated union for OperationDetails type
  - Support both success and error states properly
  - Fixes TypeScript error when setting error state
* fix: use unique document_ids in batch retain examples
  - Each item in a batch must have unique document_id
  - Update both Python and JavaScript examples
  - Fixes test-doc-examples CI failure
* chore: trigger CI
* fix: test mocking and duplicate document_ids in examples
  - Mock _get_pool() in test_async_retain_tags.py to avoid _initialized error
  - Set _initialized = True on mocked MemoryEngine instances
  - Fix duplicate document_ids in retain.py and retain.mjs examples
* fix: properly mock async pool/connection and fix more duplicate document_ids
  - Use AsyncMock for pool.acquire() to fix 'can't be used in await' error
  - Fix duplicate document_ids in retain-async examples (retain.py and retain.mjs)
  - Remove batch-level document_id parameter that caused duplicates
* ci: collect all doc example failures and show summary
  - Run all Python/Node.js/CLI examples regardless of individual failures
  - Collect failure list and display summary at the end
  - Show pass/fail count and list of failed files
  - Exit with failure only after running all examples
* refactor: extract doc example testing to standalone script
  - Create scripts/test-doc-examples.sh to run all examples
  - Collects logs of failed examples separately
  - Shows full error logs only for failures at the end
  - Clean summary with pass/fail counts
  - Proper exit codes
  - Replaces inline bash in CI workflow
* fix: doc examples - duplicate document_ids and error handling
  - retain.py: move document_id to item level to avoid duplicates
  - documents.mjs: add error handling for getDocument to show clear error message
* fix: update tests for duplicate document_id validation
  - test_async_retain_tags: verify operation structure instead of exact UUID
  - test_delete_bank: use unique document_ids (team-doc-1, team-doc-2)
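
Several of the fixes above come down to one rule stated in the message: each item in a batch must have a unique document_id. A minimal client-side sketch of that check follows. The helper name `find_duplicate_document_ids` and the item shape (a dict with `content` and `document_id` keys) are assumptions for illustration, not the validation code from this commit.

```python
from collections import Counter


def find_duplicate_document_ids(items: list[dict]) -> list[str]:
    """Return document_id values that occur more than once in a batch.

    Items without a document_id are skipped in this sketch.
    """
    counts = Counter(
        item["document_id"] for item in items if item.get("document_id")
    )
    return sorted(doc_id for doc_id, n in counts.items() if n > 1)


# Hypothetical batch: the ids mirror the ones named in the commit message.
batch = [
    {"content": "Alice joined the team.", "document_id": "team-doc-1"},
    {"content": "Bob joined the team.", "document_id": "team-doc-2"},
]
duplicates = find_duplicate_document_ids(batch)
```

Running a check like this before submitting a batch surfaces the problem locally instead of as a failed request.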
1 parent 2a47389 commit aefb3fc

32 files changed: +2307 -131 lines changed

.github/workflows/test.yml

Lines changed: 2 additions & 21 deletions
@@ -941,30 +941,11 @@ jobs:
             sleep 1
           done

-      - name: Run Python doc examples
-        working-directory: ./hindsight-clients/python
-        run: |
-          for f in ../../hindsight-docs/examples/api/*.py; do
-            echo "Running $f..."
-            uv run python "$f"
-          done
-
-      - name: Run Node.js doc examples
-        run: |
-          for f in hindsight-docs/examples/api/*.mjs; do
-            echo "Running $f..."
-            node "$f"
-          done
-
       - name: Configure CLI
         run: hindsight configure --api-url http://localhost:8888

-      - name: Run CLI doc examples
-        run: |
-          for f in hindsight-docs/examples/api/*.sh; do
-            echo "Running $f..."
-            bash "$f"
-          done
+      - name: Run all doc examples
+        run: ./scripts/test-doc-examples.sh

       - name: Show API server logs
         if: always()
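
The removed loops stopped at the first failing example, while the commit message says the replacement script runs everything and reports failures at the end. The contents of scripts/test-doc-examples.sh are not shown in this excerpt, so the following is only a sketch of that run-all-then-summarize pattern, written in Python for illustration. The function names and the stand-in commands are hypothetical.

```python
import subprocess
import sys


def run_all(commands: dict[str, list[str]]) -> list[str]:
    """Run every command, even after a failure, and return the names that failed."""
    failed = []
    for name, argv in commands.items():
        result = subprocess.run(argv, capture_output=True, text=True)
        if result.returncode != 0:
            failed.append(name)
    return failed


def summarize(total: int, failed: list[str]) -> str:
    """Build a pass/fail summary followed by the list of failed names."""
    lines = [f"{total - len(failed)} passed, {len(failed)} failed"]
    lines += [f"  FAILED: {name}" for name in failed]
    return "\n".join(lines)


# Stand-in commands; a real run would enumerate hindsight-docs/examples/api/.
commands = {
    "ok-example": [sys.executable, "-c", "print('hello')"],
    "broken-example": [sys.executable, "-c", "raise SystemExit(1)"],
}
failed = run_all(commands)
print(summarize(len(commands), failed))
exit_code = 1 if failed else 0  # a CI script would pass this to sys.exit()
```

Deferring the non-zero exit until every example has run means one CI run reports all broken examples rather than only the first.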

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -46,6 +46,7 @@ hindsight-docs/static/llms-full.txt
 hindsight-dev/benchmarks/locomo/results/
 hindsight-dev/benchmarks/longmemeval/results/
 hindsight-dev/benchmarks/consolidation/results/
+hindsight-dev/benchmarks/perf/results/
 benchmarks/results/
 hindsight-cli/target
 hindsight-clients/rust/target

CLAUDE.md

Lines changed: 7 additions & 0 deletions
@@ -57,8 +57,15 @@ cd hindsight-control-plane && npm run dev

 ### Benchmarks
 ```bash
+# Accuracy benchmarks
 ./scripts/benchmarks/run-longmemeval.sh
 ./scripts/benchmarks/run-locomo.sh
+
+# Performance benchmarks
+./scripts/benchmarks/run-consolidation.sh
+./scripts/benchmarks/run-retain-perf.sh --document <path>  # Requires API server running
+
+# Results viewer
 ./scripts/benchmarks/start-visualizer.sh  # View results at localhost:8001
 ```


Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
+"""Add GIN index on async_operations.result_metadata for parent_operation_id queries
+
+Revision ID: y0t1u2v3w4x5
+Revises: x9s0t1u2v3w4
+Create Date: 2026-02-13
+
+This migration adds a GIN index on the result_metadata JSONB column in the
+async_operations table to support efficient queries for child operations by
+parent_operation_id.
+
+The index enables fast lookups when querying for child operations:
+    SELECT * FROM async_operations
+    WHERE result_metadata::jsonb @> '{"parent_operation_id": "uuid"}'::jsonb
+"""
+
+from collections.abc import Sequence
+
+from alembic import context, op
+
+revision: str = "y0t1u2v3w4x5"
+down_revision: str | Sequence[str] | None = "x9s0t1u2v3w4"
+branch_labels: str | Sequence[str] | None = None
+depends_on: str | Sequence[str] | None = None
+
+
+def _get_schema_prefix() -> str:
+    """Get schema prefix for table names (required for multi-tenant support)."""
+    schema = context.config.get_main_option("target_schema")
+    return f'"{schema}".' if schema else ""
+
+
+def upgrade() -> None:
+    """Add GIN index on result_metadata for efficient parent_operation_id queries."""
+    schema = _get_schema_prefix()
+
+    # Add GIN index for JSONB containment queries (@> operator)
+    op.execute(f"""
+        CREATE INDEX idx_async_operations_result_metadata
+        ON {schema}async_operations
+        USING gin(result_metadata)
+    """)
+
+
+def downgrade() -> None:
+    """Remove GIN index on result_metadata."""
+    schema = _get_schema_prefix()
+
+    # Drop index
+    op.execute(f"DROP INDEX IF EXISTS {schema}idx_async_operations_result_metadata")
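
The migration docstring shows the containment query with a literal UUID. A caller would normally pass the parent id as a bound parameter instead. The sketch below builds that parameter with the standard library only. The `$1` placeholder style and the helper name `child_operations_param` are assumptions (the driver and query code are not shown in this excerpt).

```python
import json

# Same shape as the query in the migration docstring, with the JSON document
# supplied as a bound parameter. "$1" is an assumed placeholder style.
CHILD_OPERATIONS_SQL = (
    "SELECT * FROM async_operations "
    "WHERE result_metadata::jsonb @> $1::jsonb"
)


def child_operations_param(parent_operation_id: str) -> str:
    """Serialize the containment document matched by the @> operator."""
    return json.dumps({"parent_operation_id": parent_operation_id})


# Hypothetical parent id, for illustration only.
param = child_operations_param("11111111-2222-3333-4444-555555555555")
```

Because `@>` tests whether the stored document contains the given one, rows carrying extra keys in result_metadata still match, and this operator is one that a GIN index on a jsonb column can serve.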

hindsight-api/hindsight_api/api/http.py

Lines changed: 17 additions & 0 deletions
@@ -1357,6 +1357,16 @@ class CancelOperationResponse(BaseModel):
     operation_id: str


+class ChildOperationStatus(BaseModel):
+    """Status of a child operation (for batch operations)."""
+
+    operation_id: str
+    status: str
+    sub_batch_index: int | None = None
+    items_count: int | None = None
+    error_message: str | None = None
+
+
 class OperationStatusResponse(BaseModel):
     """Response model for getting a single operation status."""

@@ -1381,6 +1391,13 @@ class OperationStatusResponse(BaseModel):
     updated_at: str | None = None
     completed_at: str | None = None
     error_message: str | None = None
+    result_metadata: dict[str, Any] | None = Field(
+        default=None,
+        description="Internal metadata for debugging. Structure may change without notice. Not for production use.",
+    )
+    child_operations: list[ChildOperationStatus] | None = Field(
+        default=None, description="Child operations for batch operations (if applicable)"
+    )


 class AsyncOperationSubmitResponse(BaseModel):

hindsight-api/hindsight_api/config.py

Lines changed: 4 additions & 0 deletions
@@ -250,6 +250,7 @@ def normalize_config_dict(config: dict[str, Any]) -> dict[str, Any]:
 ENV_RETAIN_EXTRACT_CAUSAL_LINKS = "HINDSIGHT_API_RETAIN_EXTRACT_CAUSAL_LINKS"
 ENV_RETAIN_EXTRACTION_MODE = "HINDSIGHT_API_RETAIN_EXTRACTION_MODE"
 ENV_RETAIN_CUSTOM_INSTRUCTIONS = "HINDSIGHT_API_RETAIN_CUSTOM_INSTRUCTIONS"
+ENV_RETAIN_BATCH_TOKENS = "HINDSIGHT_API_RETAIN_BATCH_TOKENS"

 # Observations settings (consolidated knowledge from facts)
 ENV_ENABLE_OBSERVATIONS = "HINDSIGHT_API_ENABLE_OBSERVATIONS"
@@ -371,6 +372,7 @@ def normalize_config_dict(config: dict[str, Any]) -> dict[str, Any]:
 DEFAULT_RETAIN_EXTRACTION_MODE = "concise"  # Extraction mode: "concise", "verbose", or "custom"
 RETAIN_EXTRACTION_MODES = ("concise", "verbose", "custom")  # Allowed extraction modes
 DEFAULT_RETAIN_CUSTOM_INSTRUCTIONS = None  # Custom extraction guidelines (only used when mode="custom")
+DEFAULT_RETAIN_BATCH_TOKENS = 10_000  # Max tokens per sub-batch for async retain auto-splitting (~40KB of text)

 # Observations defaults (consolidated knowledge from facts)
 DEFAULT_ENABLE_OBSERVATIONS = True  # Observations enabled by default
@@ -590,6 +592,7 @@ class HindsightConfig:
     retain_extract_causal_links: bool
     retain_extraction_mode: str
     retain_custom_instructions: str | None
+    retain_batch_tokens: int

     # Observations settings (consolidated knowledge from facts)
     enable_observations: bool
@@ -939,6 +942,7 @@ def from_env(cls) -> "HindsightConfig":
                 os.getenv(ENV_RETAIN_EXTRACTION_MODE, DEFAULT_RETAIN_EXTRACTION_MODE)
             ),
             retain_custom_instructions=os.getenv(ENV_RETAIN_CUSTOM_INSTRUCTIONS) or DEFAULT_RETAIN_CUSTOM_INSTRUCTIONS,
+            retain_batch_tokens=int(os.getenv(ENV_RETAIN_BATCH_TOKENS, str(DEFAULT_RETAIN_BATCH_TOKENS))),
             # Observations settings (consolidated knowledge from facts)
             enable_observations=os.getenv(ENV_ENABLE_OBSERVATIONS, str(DEFAULT_ENABLE_OBSERVATIONS)).lower() == "true",
             consolidation_batch_size=int(