fix(chromadb): emit one result event per document across all queries by dvirski · Pull Request #4105 · traceloop/openllmetry

dvirski · 2026-05-11T14:40:29Z

Related to #1870

problem:
ChromaDB's query() returns results as a list-of-lists (one inner list per query embedding). The instrumentation only unzipped the outer list, producing one db.query.result event per query instead of one per result document. A second bug also indexed into each attribute value with [0], so string IDs were silently truncated to their first character.

fix:
Added an inner loop over each query's result list so every document gets its own span event. Removed the erroneous [0] indexing so attribute values are used as-is.
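To make the event-count difference concrete, here is a minimal sketch, not the code from wrapper.py. The `results` dict is a hand-written stand-in for what `collection.query()` returns, and both function names are hypothetical. It only contrasts iterating the outer list with iterating each inner list as well.

```python
# Hypothetical stand-in for the dict returned by collection.query():
# one inner list per query embedding, one entry per returned document.
results = {
    "ids": [["doc-a", "doc-b"], ["doc-c", "doc-d"]],
    "distances": [[0.1, 0.4], [0.2, 0.3]],
}


def events_outer_only(results):
    """Outer loop only: one event per query, inner lists left intact."""
    return [
        {"id": ids, "distance": distances}
        for ids, distances in zip(results["ids"], results["distances"])
    ]


def events_per_document(results):
    """Nested loop: one event per returned document (N queries x K results)."""
    events = []
    for ids, distances in zip(results["ids"], results["distances"]):
        for doc_id, distance in zip(ids, distances):
            events.append({"id": doc_id, "distance": distance})
    return events


outer = events_outer_only(results)
nested = events_per_document(results)
print(len(outer), len(nested))  # prints: 2 4
```

With two query embeddings and two results each, the outer-only loop yields 2 events while the nested loop yields 4, which matches the N×K count the tests in this PR assert.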

Summary by CodeRabbit

Bug Fixes
- Query telemetry now emits one event per returned document (N×K for multi-query), with event attributes included only when present and metadata serialized when it's a dictionary.
Tests
- Added and updated tests to assert per-result event counts and validate each event's attributes for single and multi-embedding queries.
Documentation
- Clarified the nested query-result structure and the resulting per-result event behavior.

coderabbitai · 2026-05-11T14:41:01Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 895b5342-0f53-405c-b98d-c038494b5aca

📥 Commits

Reviewing files that changed from the base of the PR and between 6ac4eda and 943e427.

📒 Files selected for processing (3)

packages/opentelemetry-instrumentation-chromadb/opentelemetry/instrumentation/chromadb/wrapper.py
packages/opentelemetry-instrumentation-chromadb/tests/test_query.py
packages/opentelemetry-instrumentation-chromadb/tests/test_query_results.py

🚧 Files skipped from review as they are similar to previous changes (3)

packages/opentelemetry-instrumentation-chromadb/tests/test_query.py
packages/opentelemetry-instrumentation-chromadb/tests/test_query_results.py
packages/opentelemetry-instrumentation-chromadb/opentelemetry/instrumentation/chromadb/wrapper.py

📝 Walkthrough

This PR refactors ChromaDB query result event emission to respect the library's nested structure: multiple queries each returning multiple results. The _add_query_result_events function now emits one db.query.result event per result document using nested iteration, and tests are updated/added to validate the N×K event counts and payloads.

Changes

Query Result Events Refactoring

| Layer / File(s) | Summary |
| --- | --- |
| Query Result Events Implementation `packages/opentelemetry-instrumentation-chromadb/opentelemetry/instrumentation/chromadb/wrapper.py` | `_add_query_result_events` is refactored to iterate queries then result items (nested `zip_longest`), emitting one `db.query.result` event per document. Event attributes are conditionally populated for non-`None` values; `metadata` is JSON-serialized only when it is a dict. A docstring documents the nested structure and resulting N×K event count. |
| Existing Test Assertion Update `packages/opentelemetry-instrumentation-chromadb/tests/test_query.py` | `test_chroma_query` now asserts two `DB_QUERY_RESULT` events when `n_results=2`, matching the new one-event-per-result behavior. |
| Comprehensive Query Result Tests `packages/opentelemetry-instrumentation-chromadb/tests/test_query_results.py` | New test module with a `collection` fixture and three test functions validating the new event emission: single query with 2 results produces 2 events; two queries with 2 results each produces 4 events; each event includes required attributes and correct IDs. |
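The attribute handling summarized for `_add_query_result_events` can be sketched as follows. This is an illustration under assumptions, not the merged implementation: the attribute names come from the test assertions in this PR, while `build_attributes`, `iter_result_attributes`, and the sample data are hypothetical. The sketch also assumes that a field not returned by the query arrives as `None`.

```python
import json
from itertools import zip_longest

# Attribute names as asserted in the PR's tests; everything else is hypothetical.
ATTRIBUTE_KEYS = {
    "id": "db.query.result.id",
    "distance": "db.query.result.distance",
    "document": "db.query.result.document",
    "metadata": "db.query.result.metadata",
}


def build_attributes(doc_id, distance, document, metadata):
    """Keep only non-None values; JSON-encode metadata when it is a dict."""
    values = {
        "id": doc_id,
        "distance": distance,
        "document": document,
        "metadata": metadata,
    }
    attributes = {}
    for field, value in values.items():
        if value is None:
            continue
        if field == "metadata" and isinstance(value, dict):
            value = json.dumps(value)
        attributes[ATTRIBUTE_KEYS[field]] = value
    return attributes


def iter_result_attributes(results):
    """Yield one attribute dict per returned document across all queries."""
    names = ("ids", "distances", "documents", "metadatas")
    columns = [results.get(name) or [] for name in names]
    # Outer zip_longest walks the queries; missing fields are padded with None.
    for per_query in zip_longest(*columns):
        inner = [column or [] for column in per_query]
        # Inner zip_longest walks the documents returned for one query.
        for item in zip_longest(*inner):
            yield build_attributes(*item)


sample = {
    "ids": [["doc-a", "doc-b"]],
    "distances": [[0.1, 0.4]],
    "documents": None,  # assumed shape when documents are not included
    "metadatas": [[{"source": "fileA"}, None]],
}
events = list(iter_result_attributes(sample))
```

Using `zip_longest` instead of `zip` means a missing column does not shorten the iteration; it simply contributes `None`, which the attribute builder then skips.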

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

galzilber
nina-kollman
max-deygin-traceloop
netanel-tl
doronkopit5

Poem

🐰 I hop through nested lists so neat,
One tiny event for each result I meet.
Metadata serialized when it's a dict,
Each chunk recorded, no piece is skipped.
Telemetry sings as the spans all click.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: fixing ChromaDB instrumentation to emit one result event per document across all queries, which directly matches the core functionality change in wrapper.py and test updates. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

CLAassistant · 2026-05-11T14:42:38Z

All committers have signed the CLA.

coderabbitai

🧹 Nitpick comments (1)

packages/opentelemetry-instrumentation-chromadb/tests/test_query_results.py (1)

54-76: ⚡ Quick win

Use multi-character IDs so this test actually exercises the [0] truncation fix.

This PR fixes two bugs: (1) event count and (2) the previous [0] indexing that truncated string IDs to a single character. With single-character IDs ("1", "2"), "1"[0] == "1", so the id-equality assertion below would still pass against the old buggy code — i.e., this test only protects against the count regression, not the truncation regression. Switching to multi-character IDs makes the assertion meaningfully cover the truncation fix.

♻️ Proposed change

 def test_chromadb_query_result_events_contain_correct_data(exporter, collection):
     """Each result event should contain id, distance, document and metadata."""
     collection.add(
-        ids=["1", "2"],
+        ids=["doc-id-1", "doc-id-2"],
         documents=["doc one", "doc two"],
         metadatas=[{"source": "fileA"}, {"source": "fileB"}],
         embeddings=[[1.0, 0.0], [0.0, 1.0]],
     )
     collection.query(query_embeddings=[[1.0, 0.0]], n_results=2)

     spans = exporter.get_finished_spans()
     query_span = next(s for s in spans if s.name == "chroma.query")
     result_events = [e for e in query_span.events if e.name == "db.query.result"]

     assert len(result_events) == 2
     for event in result_events:
         assert "db.query.result.id" in event.attributes
         assert "db.query.result.distance" in event.attributes
         assert "db.query.result.document" in event.attributes
         assert "db.query.result.metadata" in event.attributes

     ids_recorded = {e.attributes["db.query.result.id"] for e in result_events}
-    assert ids_recorded == {"1", "2"}
+    assert ids_recorded == {"doc-id-1", "doc-id-2"}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/opentelemetry-instrumentation-chromadb/tests/test_query_results.py`
around lines 54 - 76, The test
test_chromadb_query_result_events_contain_correct_data uses single-character IDs
("1","2") which doesn't exercise the previous string-truncation bug; update the
ids passed to collection.add (in the collection.add call within that test) to
multi-character IDs (e.g., "id1", "id2" or "10", "20") so the assertion that
ids_recorded == {"id1", "id2"} actually verifies the fix for the [0] truncation
as well as the event count.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3a5327b2-f1f8-4739-bfe9-d3cbe25a4a7c

📥 Commits

Reviewing files that changed from the base of the PR and between 6d3e696 and a08a54c.

⛔ Files ignored due to path filters (1)

packages/opentelemetry-instrumentation-chromadb/uv.lock is excluded by !**/*.lock

📒 Files selected for processing (3)

packages/opentelemetry-instrumentation-chromadb/opentelemetry/instrumentation/chromadb/wrapper.py
packages/opentelemetry-instrumentation-chromadb/tests/test_query.py
packages/opentelemetry-instrumentation-chromadb/tests/test_query_results.py

doronkopit5 · 2026-05-11T15:09:03Z

Please take this small comment and address it:

> packages/opentelemetry-instrumentation-chromadb/tests/test_query_results.py (1)
> 54-76: ⚡ Quick win
>
> Use multi-character IDs so this test actually exercises the [0] truncation fix.
>
> This PR fixes two bugs: (1) event count and (2) the previous [0] indexing that truncated string IDs to a single character. With single-character IDs ("1", "2"), "1"[0] == "1", so the id-equality assertion below would still pass against the old buggy code; that is, this test only protects against the count regression, not the truncation regression. Switching to multi-character IDs makes the assertion meaningfully cover the truncation fix.

dvirski · 2026-05-17T12:28:28Z

@doronkopit5

Updated the third test in test_query_results.py to use multi-character IDs ("doc-id-aaa", "doc-id-bbb") instead of single-character ones ("1", "2").

Why: the PR fixed an ids[0] truncation bug, but single-character IDs hid it: "1"[0] == "1", so the assertion passed even against the broken code. Multi-character IDs make ids[0] collapse to just "d", so the test now actually fails under the old bug and genuinely covers the truncation fix (not just the event-count fix).
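The reasoning above can be checked in isolation with a small sketch. The two helper functions are hypothetical and only simulate the old `[0]` indexing and the fixed behavior on a string ID; they are not taken from wrapper.py.

```python
def recorded_id_old(doc_id):
    # Simulates the removed `[0]` indexing applied to a string value.
    return doc_id[0]


def recorded_id_new(doc_id):
    # Simulates the fixed behavior: the value is used as-is.
    return doc_id


single_char_ids = ["1", "2"]
multi_char_ids = ["doc-id-aaa", "doc-id-bbb"]

# With single-character IDs the buggy path is indistinguishable from the fix.
old_single = {recorded_id_old(i) for i in single_char_ids}
# With multi-character IDs the buggy path collapses both IDs to "d".
old_multi = {recorded_id_old(i) for i in multi_char_ids}
new_multi = {recorded_id_new(i) for i in multi_char_ids}
```

Here `old_single` equals `{"1", "2"}`, so an equality assertion cannot detect the bug, whereas `old_multi` is `{"d"}` and no longer matches the expected set of IDs.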

dvirski requested review from doronkopit5, galzilber, max-deygin-traceloop, netanel-tl and nina-kollman May 11, 2026 14:40

coderabbitai Bot reviewed May 11, 2026

dvirski force-pushed the dr/fix(chromadb)-emit-one-result-event-per-document-across-all-queries- branch from a08a54c to 6ac4eda May 17, 2026 12:28

doronkopit5 approved these changes May 17, 2026

dvirski and others added 2 commits May 17, 2026 21:08

fix(chromadb): emit one result event per document across all queries

40ed473

CR Fix

943e427

dvirski force-pushed the dr/fix(chromadb)-emit-one-result-event-per-document-across-all-queries- branch from 6ac4eda to 943e427 May 17, 2026 18:08

dvirski merged commit 12bdd62 into main May 17, 2026
12 checks passed
