Skip to content

feat: batch query string array params to avoid URL length limits#274

Merged
skylarmb merged 4 commits into
federated-sdk-release-candidatefrom
feature/hhai-3963
Feb 26, 2026
Merged

feat: batch query string array params to avoid URL length limits#274
skylarmb merged 4 commits into
federated-sdk-release-candidatefrom
feature/hhai-3963

Conversation

@skylarmb
Copy link
Copy Markdown
Contributor

Summary

  • When list parameters (datapoint_ids, run_ids) exceed 100 items, the client now transparently splits requests into batches and merges results
  • Prevents hitting URL length limits (8-16 KB) and the backend's qs arrayLimit of 1000
  • Batching is invisible to callers — the API surface is unchanged

Changes

  • src/honeyhive/api/client.py — added QUERY_BATCH_SIZE, _chunk_list() helper, and batching logic in DatapointsAPI.list()/list_async() and ExperimentsAPI.list_runs()/list_runs_async()
  • tests/unit/test_api_query_batching.py — 14 unit tests covering chunk logic, passthrough for small lists, batching + merge for large lists, metric deduplication, and async variants

Test plan

  • pytest tests/unit/test_api_query_batching.py — 14/14 pass
  • Existing unit tests unaffected (test_api_events_fixes.py — 20/20 pass)

🔗 Related backend PR: https://github.com/honeyhiveai/hive-kube/pull/2430

✨ Created with Claude Code

When list parameters (datapoint_ids, run_ids) exceed 100 items, the
client now transparently splits requests into batches and merges the
results. This avoids hitting URL length limits (8-16 KB) and the
backend's qs arrayLimit of 1000.

Affected methods:
- DatapointsAPI.list() / list_async()
- ExperimentsAPI.list_runs() / list_runs_async()

✨ Created with Claude Code
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Feb 26, 2026

Claude finished @skylarmb's task in 2m 10s —— View job


Review complete. Posted 6 inline comments and a top-level summary.

  • Read changed files and gather context
  • Review src/honeyhive/api/client.py changes
  • Review tests/unit/test_api_query_batching.py
  • Post inline feedback on specific issues
  • Post top-level summary comment

Key findings:

  • Bug: page/limit kwargs are forwarded to each batch in _batched_list_runs, silently producing wrong results when callers pass pagination params
  • Bug: total_unfiltered in synthetic Pagination sums filtered totals instead of unfiltered totals
  • Perf: Async batching is sequential — asyncio.gather() would parallelize independent batch requests
  • Design: No error handling for partial batch failures (relevant to SDK's "never crash host" principle)
  • Nits: _chunk_list type hint too narrow, inline Pagination import should be at module level

@snyk-io
Copy link
Copy Markdown

snyk-io Bot commented Feb 26, 2026

Snyk checks have passed. No issues have been found so far.

Status Scanner Critical High Medium Low Total (0)
Open Source Security 0 0 0 0 0 issues
Licenses 0 0 0 0 0 issues

💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse.

Comment thread src/honeyhive/api/client.py
Comment thread src/honeyhive/api/client.py Outdated
Comment thread src/honeyhive/api/client.py Outdated
Comment thread src/honeyhive/api/client.py
Comment thread src/honeyhive/api/client.py Outdated
Comment thread src/honeyhive/api/client.py
@github-actions
Copy link
Copy Markdown
Contributor

📚 Documentation preview built — Download artifact

Review instructions & validation status

How to Review

  1. Download the artifact from the link above
  2. Extract the files
  3. Open index.html in your browser

Validation Status

  • API validation: ✅ Passed
  • Build process: ✅ Successful
  • Import tests: ✅ All imports working

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Feb 26, 2026

Review Summary

Good approach — transparently batching large query-string arrays is the right call for avoiding URL length limits. The implementation is clean, well-tested, and the API surface is unchanged. A few things worth addressing before merge:

Should fix:

  • page/limit forwarded to batched requests — When batching is active, the caller's page and limit params are passed to every batch call via **kwargs. This will silently produce wrong results (e.g., limit=10 caps each batch at 10 items). These should be stripped or overridden in the batched path.
  • total_unfiltered in synthetic Pagination — Currently set to the sum of pagination.total, which is the filtered count. Should sum pagination.total_unfiltered instead.

Worth considering:

  • Async batching is sequential_batched_list_runs_async and list_async use a sequential for loop with await. Since batches are independent, asyncio.gather() would parallelize the network calls for a meaningful speedup on large lists.
  • No error handling for partial batch failures — If batch 3 of 5 fails, the exception propagates and all successfully-fetched data is lost. Given the SDK's "never crash host application" principle, consider at minimum adding context to the exception.

Nits:

  • _chunk_list type hint is List[str] but works generically — consider List[Any] or a TypeVar.
  • Pagination import appears inline in two method bodies — could be a top-level import.

Tests look solid — good coverage of boundary conditions, passthrough, batching + merge, metric deduplication, and async variants.

- Parallelize async batch requests with asyncio.gather()
- Strip page/limit params from batched run requests
- Fix total_unfiltered aggregation to use per-batch values
- Add error context logging before re-raising batch failures
- Use generic TypeVar for _chunk_list type hint
- Move Pagination import to module level

✨ Created with Claude Code
@github-actions
Copy link
Copy Markdown
Contributor

📚 Documentation preview built — Download artifact

Review instructions & validation status

How to Review

  1. Download the artifact from the link above
  2. Extract the files
  3. Open index.html in your browser

Validation Status

  • API validation: ✅ Passed
  • Build process: ✅ Successful
  • Import tests: ✅ All imports working

@skylarmb
Copy link
Copy Markdown
Contributor Author

@claude review if feedback has been adequately addressed

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Feb 26, 2026

Claude finished @skylarmb's task in 1m 43s —— View job


All previous review feedback has been adequately addressed in 4068ce6.

  • Read current state of changed files
  • Check if page/limit kwargs bug was fixed
  • Check if total_unfiltered bug was fixed
  • Check if async batching was parallelized with asyncio.gather()
  • Check if error handling for partial batch failures was added
  • Check nits: _chunk_list type hint, Pagination import placement
  • Post summary of findings

Feedback status

Previous feedback Status Details
Bug: page/limit forwarded to batched requests ✅ Fixed Both _batched_list_runs and _batched_list_runs_async now kwargs.pop("page", None) / kwargs.pop("limit", None). New test test_page_and_limit_stripped_from_batches covers this.
Bug: total_unfiltered summing filtered totals ✅ Fixed Both sync and async paths now track and sum resp.pagination.total_unfiltered separately. Test assertion added.
Perf: Async batching sequential ✅ Fixed list_async and _batched_list_runs_async both use asyncio.gather() for parallel batch requests.
Design: No error handling for partial batch failures ✅ Addressed Sync methods wrap each batch call in try/except with logger.warning context before re-raising. Async paths using asyncio.gather() propagate naturally. Reasonable approach.
Nit: _chunk_list type hint too narrow ✅ Fixed Now uses TypeVar("T") with List[T] signature.
Nit: Pagination import inline ✅ Fixed Moved to module-level import block (line 57). No remaining inline imports.

One minor remaining nit: there's only one blank line between T = TypeVar("T") and def _chunk_list(...) (line 117-118) — Black expects two blank lines before top-level function definitions. Not a blocker.

LGTM — the fix commit cleanly addresses every point raised in the initial review.


@github-actions
Copy link
Copy Markdown
Contributor

📚 Documentation preview built — Download artifact

Review instructions & validation status

How to Review

  1. Download the artifact from the link above
  2. Extract the files
  3. Open index.html in your browser

Validation Status

  • API validation: ✅ Passed
  • Build process: ✅ Successful
  • Import tests: ✅ All imports working

@skylarmb skylarmb marked this pull request as ready for review February 26, 2026 01:03
@skylarmb skylarmb requested a review from a team February 26, 2026 01:03
@github-actions
Copy link
Copy Markdown
Contributor

📚 Documentation preview built — Download artifact

Review instructions & validation status

How to Review

  1. Download the artifact from the link above
  2. Extract the files
  3. Open index.html in your browser

Validation Status

  • API validation: ✅ Passed
  • Build process: ✅ Successful
  • Import tests: ✅ All imports working

@skylarmb skylarmb merged commit 6714ccc into federated-sdk-release-candidate Feb 26, 2026
17 checks passed
@skylarmb skylarmb deleted the feature/hhai-3963 branch February 26, 2026 16:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants