Skip to content

Fix TraceAdapterRegistry lookup errors in SpanToDFWRecordProcessor#831

Merged
rapids-bot[bot] merged 2 commits intoNVIDIA:developfrom
mpenn:mpenn_fix-dfw-typechecks
Sep 22, 2025
Merged

Fix TraceAdapterRegistry lookup errors in SpanToDFWRecordProcessor#831
rapids-bot[bot] merged 2 commits intoNVIDIA:developfrom
mpenn:mpenn_fix-dfw-typechecks

Conversation

@mpenn
Copy link
Contributor

@mpenn mpenn commented Sep 22, 2025

Description

This PR updates SpanToDFWRecordProcessor based on recent fixes to TypeIntrospectionMixin. The data flywheel type registry does not register the optional type (e.g. DFWRecordT | None) and processing chains fail to execute due to registry look failures. The following are done in this PR to resolve this bug:

  • Add extract_non_optional_type method to TypeIntrospectionMixin to extract the non-optional type(s)
  • Update SpanToDFWRecordProcess to leverage this method to extract the target type
  • Corresponding updates to tests

By Submitting this PR I confirm:

  • I am familiar with the Contributing Guidelines.
  • We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
    • Any contribution which contains commits that are not Signed-Off will not be accepted.
  • When the PR is ready for review, new or existing tests cover these changes.
  • When the PR is ready for review, the documentation is up to date with these changes.

Summary by CodeRabbit

  • Bug Fixes

    • More robust event processing: unsupported events are now cleanly ignored.
    • Prevents passing optional types to converters, reducing type-mismatch errors.
    • Allows no record to be produced when events don’t map, avoiding misleading output.
  • Refactor

    • Streamlined type handling to improve stability and consistency across processors.
  • Tests

    • Added comprehensive tests covering optional/union type scenarios to ensure reliable behavior.

…anges to `TypeIntrospectionMixin`

- Add method to `TypeIntrospectionMixin` to deal with union types
- Leverage this method in `SpanToDFWRecordProcessor.process`
- Add corresponding unit tests

Signed-off-by: Matthew Penn <mpenn@nvidia.com>
Signed-off-by: Matthew Penn <mpenn@nvidia.com>
@mpenn mpenn self-assigned this Sep 22, 2025
@mpenn mpenn added bug Something isn't working non-breaking Non-breaking change labels Sep 22, 2025
@coderabbitai
Copy link

coderabbitai bot commented Sep 22, 2025

Walkthrough

The processor’s generic output type now allows None. It unwraps Optional from output_type before invoking the converter for LLM_START events and explicitly returns None for unsupported events. A new TypeIntrospectionMixin method extract_non_optional_type is added, with comprehensive unit tests added/updated for both the processor and the mixin.

Changes

Cohort / File(s) Summary
Processor output typing and control flow
packages/nvidia_nat_data_flywheel/src/nat/plugins/data_flywheel/observability/processor/dfw_record_processor.py
Processor generic updated to DFWRecordT | None; on LLM_START, passes non-optional target type to converter using extract_non_optional_type; returns None for unsupported events.
Processor tests updated for non-optional target type
packages/nvidia_nat_data_flywheel/tests/observability/processor/test_dfw_record_processor.py
Tests updated to call and assert with extract_non_optional_type(processor.output_type) as to_type; aligns expectations with optional-aware processor output.
Type introspection utility added
src/nat/observability/mixin/type_introspection_mixin.py
Added extract_non_optional_type(self, type_obj) with caching to unwrap Optional[T]/Union[T, None]; imports types.
Mixin tests for type extraction
tests/nat/observability/mixin/test_type_introspection_mixin.py
New tests for extract_non_optional_type across unions, nested generics, and edge cases; note: duplicated test block present in diff.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Caller
  participant Proc as SpanToDFWRecordProcessor
  participant Mix as TypeIntrospectionMixin
  participant Conv as span_to_dfw_record Converter

  Caller->>Proc: process(span)
  alt Event == LLM_START
    Proc->>Mix: extract_non_optional_type(output_type)
    Mix-->>Proc: non_optional_type
    Proc->>Conv: span_to_dfw_record(span, to_type=non_optional_type)
    Conv-->>Proc: DFWRecord or None
    Proc-->>Caller: DFWRecord or None
  else Unsupported event
    Proc-->>Caller: None
  end

  note over Proc,Mix: Non-optional type ensures concrete converter target
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Title Check ✅ Passed The title accurately and concisely summarizes the primary change: it indicates a fix for registry lookup errors involving TraceAdapterRegistry within SpanToDFWRecordProcessor, uses imperative mood, and fits the ~72-character length guideline.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@mpenn mpenn marked this pull request as ready for review September 22, 2025 03:02
@mpenn mpenn requested a review from a team as a code owner September 22, 2025 03:02
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/nat/observability/mixin/type_introspection_mixin.py (1)

480-497: Avoid lru_cache on bound methods (Ruff B019); make static and (optionally) cache at module scope.

Using lru_cache on instance methods can retain self and leak. Also, tighten the return type.

Apply:

-    @lru_cache
-    def extract_non_optional_type(self, type_obj: type | types.UnionType) -> Any:
+    @staticmethod
+    def extract_non_optional_type(type_obj: type | types.UnionType) -> type[Any] | types.UnionType:
@@
-        decomposed = DecomposedType(type_obj)  # type: ignore[arg-type]
+        decomposed = DecomposedType(type_obj)  # type: ignore[arg-type]
         if decomposed.is_optional:
             return decomposed.get_optional_type().type
         return type_obj

If you still want caching, add this helper at module scope and delegate:

from functools import lru_cache

@lru_cache
def _extract_non_optional_type_cached(type_obj: object) -> object:
    decomposed = DecomposedType(type_obj)  # type: ignore[arg-type]
    return decomposed.get_optional_type().type if decomposed.is_optional else type_obj

And implement:

@staticmethod
def extract_non_optional_type(type_obj: type | types.UnionType) -> type[Any] | types.UnionType:
    return _extract_non_optional_type_cached(type_obj)
🧹 Nitpick comments (3)
tests/nat/observability/mixin/test_type_introspection_mixin.py (1)

562-668: Parametrize to reduce repetition.

You can fold several cases into a single parametrized test for brevity.

Example:

import pytest

@pytest.mark.parametrize(
    ("hint","expected"),
    [
        (int | None, int),
        (str | None, str),
        (dict[str, int] | None, dict[str, int]),
        (list[str] | None, list[str]),
        (None | str, str),
        (str | int, str | int),
        (str | int | None, str | int),
    ],
)
def test_extract_non_optional_parametrized(hint, expected):
    inst = ConcreteDirectClass()
    assert inst.extract_non_optional_type(hint) == expected
packages/nvidia_nat_data_flywheel/src/nat/plugins/data_flywheel/observability/processor/dfw_record_processor.py (2)

59-59: Remove redundant TypeIntrospectionMixin in bases.

Processor already inherits TypeIntrospectionMixin; duplicating it is unnecessary and can affect MRO.

Apply:

-class SpanToDFWRecordProcessor(Processor[Span, DFWRecordT | None], TypeIntrospectionMixin):
+class SpanToDFWRecordProcessor(Processor[Span, DFWRecordT | None]):

82-85: Cast target_type for pyright; ensures to_type is concrete.

Helps static checkers and documents intent.

Apply:

-                # Extract the concrete type from Optional[DFWRecordT] to avoid passing Optional to converters
-                target_type = self.extract_non_optional_type(self.output_type)
-                dfw_record = span_to_dfw_record(span=item, to_type=target_type, client_id=self._client_id)
+                # Extract the concrete type from Optional[...] to avoid passing Optional to converters
+                target_type = cast(type[BaseModel], self.extract_non_optional_type(self.output_type))
+                dfw_record = span_to_dfw_record(span=item, to_type=target_type, client_id=self._client_id)
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2430a71 and c01c128.

📒 Files selected for processing (4)
  • packages/nvidia_nat_data_flywheel/src/nat/plugins/data_flywheel/observability/processor/dfw_record_processor.py (2 hunks)
  • packages/nvidia_nat_data_flywheel/tests/observability/processor/test_dfw_record_processor.py (2 hunks)
  • src/nat/observability/mixin/type_introspection_mixin.py (2 hunks)
  • tests/nat/observability/mixin/test_type_introspection_mixin.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (12)
packages/*/tests/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Packages containing Python code must have tests under packages//tests

Files:

  • packages/nvidia_nat_data_flywheel/tests/observability/processor/test_dfw_record_processor.py
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

**/*.py: Follow PEP 8/20 style; format with yapf (column_limit=120) and use 4-space indentation; end files with a single newline
Run ruff (ruff check --fix) per pyproject.toml; fix warnings unless explicitly ignored; ruff is linter-only
Use snake_case for functions/variables, PascalCase for classes, and UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: preserve stack traces and avoid duplicate logging
When re-raising exceptions, use bare raise and log with logger.error(), not logger.exception()
When catching and not re-raising, log with logger.exception() to capture stack trace
Validate and sanitize all user input; prefer httpx with SSL verification and follow OWASP Top‑10
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile/mprof; cache with functools.lru_cache or external cache; leverage NumPy vectorization when beneficial

**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).

Files:

  • packages/nvidia_nat_data_flywheel/tests/observability/processor/test_dfw_record_processor.py
  • packages/nvidia_nat_data_flywheel/src/nat/plugins/data_flywheel/observability/processor/dfw_record_processor.py
  • tests/nat/observability/mixin/test_type_introspection_mixin.py
  • src/nat/observability/mixin/type_introspection_mixin.py
**/tests/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

**/tests/**/*.py: Test functions must use the test_ prefix and snake_case
Extract repeated test code into pytest fixtures; fixtures should set name=... in @pytest.fixture and functions named with fixture_ prefix
Mark expensive tests with @pytest.mark.slow or @pytest.mark.integration
Use pytest with pytest-asyncio for async code; mock external services with pytest_httpserver or unittest.mock

Files:

  • packages/nvidia_nat_data_flywheel/tests/observability/processor/test_dfw_record_processor.py
  • tests/nat/observability/mixin/test_type_introspection_mixin.py
**/*.{py,sh,md,yml,yaml,toml,ini,json,ipynb,txt,rst}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

**/*.{py,sh,md,yml,yaml,toml,ini,json,ipynb,txt,rst}: Every file must start with the standard SPDX Apache-2.0 header; keep copyright years up‑to‑date
All source files must include the SPDX Apache‑2.0 header; do not bypass CI header checks

Files:

  • packages/nvidia_nat_data_flywheel/tests/observability/processor/test_dfw_record_processor.py
  • packages/nvidia_nat_data_flywheel/src/nat/plugins/data_flywheel/observability/processor/dfw_record_processor.py
  • tests/nat/observability/mixin/test_type_introspection_mixin.py
  • src/nat/observability/mixin/type_introspection_mixin.py
**/*.{py,md}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Never hard‑code version numbers in code or docs; versions are derived by setuptools‑scm

Files:

  • packages/nvidia_nat_data_flywheel/tests/observability/processor/test_dfw_record_processor.py
  • packages/nvidia_nat_data_flywheel/src/nat/plugins/data_flywheel/observability/processor/dfw_record_processor.py
  • tests/nat/observability/mixin/test_type_introspection_mixin.py
  • src/nat/observability/mixin/type_introspection_mixin.py
**/*.{py,yaml,yml}

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.

Files:

  • packages/nvidia_nat_data_flywheel/tests/observability/processor/test_dfw_record_processor.py
  • packages/nvidia_nat_data_flywheel/src/nat/plugins/data_flywheel/observability/processor/dfw_record_processor.py
  • tests/nat/observability/mixin/test_type_introspection_mixin.py
  • src/nat/observability/mixin/type_introspection_mixin.py
**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions

  • Ensure the code follows best practices and coding standards. - For Python code, follow
    PEP 20 and
    PEP 8 for style guidelines.
  • Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values.
    Example:
    def my_function(param1: int, param2: str) -> bool:
        pass
  • For Python exception handling, ensure proper stack trace preservation:
    • When re-raising exceptions: use bare raise statements to maintain the original stack trace,
      and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.
    • When catching and logging exceptions without re-raising: always use logger.exception()
      to capture the full stack trace information.

Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any

words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be
spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

and should contain an Apache License 2.0 header comment at the top of each file.

  • Confirm that copyright years are up-to date whenever a file is changed.

Files:

  • packages/nvidia_nat_data_flywheel/tests/observability/processor/test_dfw_record_processor.py
  • packages/nvidia_nat_data_flywheel/src/nat/plugins/data_flywheel/observability/processor/dfw_record_processor.py
  • tests/nat/observability/mixin/test_type_introspection_mixin.py
  • src/nat/observability/mixin/type_introspection_mixin.py
packages/**/*

⚙️ CodeRabbit configuration file

packages/**/*: - This directory contains optional plugin packages for the toolkit, each should contain a pyproject.toml file. - The pyproject.toml file should declare a dependency on nvidia-nat or another package with a name starting
with nvidia-nat-. This dependency should be declared using ~=<version>, and the version should be a two
digit version (ex: ~=1.0).

  • Not all packages contain Python code, if they do they should also contain their own set of tests, in a
    tests/ directory at the same level as the pyproject.toml file.

Files:

  • packages/nvidia_nat_data_flywheel/tests/observability/processor/test_dfw_record_processor.py
  • packages/nvidia_nat_data_flywheel/src/nat/plugins/data_flywheel/observability/processor/dfw_record_processor.py
packages/*/src/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

packages/*/src/**/*.py: All importable Python code in packages must live under packages//src/
All public APIs in packaged code require Python 3.11+ type hints; prefer typing/collections.abc; use typing.Annotated when useful
Provide Google-style docstrings for public APIs in packages; first line concise with a period; use backticks for code entities

Files:

  • packages/nvidia_nat_data_flywheel/src/nat/plugins/data_flywheel/observability/processor/dfw_record_processor.py
tests/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Unit tests must live under tests/ and use configured markers (e2e, integration, etc.)

Files:

  • tests/nat/observability/mixin/test_type_introspection_mixin.py

⚙️ CodeRabbit configuration file

tests/**/*.py: - Ensure that tests are comprehensive, cover edge cases, and validate the functionality of the code. - Test functions should be named using the test_ prefix, using snake_case. - Any frequently repeated code should be extracted into pytest fixtures. - Pytest fixtures should define the name argument when applying the pytest.fixture decorator. The fixture
function being decorated should be named using the fixture_ prefix, using snake_case. Example:
@pytest.fixture(name="my_fixture")
def fixture_my_fixture():
pass

Files:

  • tests/nat/observability/mixin/test_type_introspection_mixin.py
src/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

src/**/*.py: All importable Python code must live under src/
All public APIs in src/ require Python 3.11+ type hints on parameters and return values; prefer typing/collections.abc abstractions; use typing.Annotated when useful
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise with a period; surround code entities with backticks

Files:

  • src/nat/observability/mixin/type_introspection_mixin.py
src/nat/**/*

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Core functionality under src/nat should prioritize backward compatibility when changed

Files:

  • src/nat/observability/mixin/type_introspection_mixin.py

⚙️ CodeRabbit configuration file

This directory contains the core functionality of the toolkit. Changes should prioritize backward compatibility.

Files:

  • src/nat/observability/mixin/type_introspection_mixin.py
🧬 Code graph analysis (4)
packages/nvidia_nat_data_flywheel/tests/observability/processor/test_dfw_record_processor.py (1)
src/nat/observability/mixin/type_introspection_mixin.py (2)
  • extract_non_optional_type (481-496)
  • output_type (314-320)
packages/nvidia_nat_data_flywheel/src/nat/plugins/data_flywheel/observability/processor/dfw_record_processor.py (3)
src/nat/observability/processor/processor.py (1)
  • Processor (27-74)
src/nat/observability/mixin/type_introspection_mixin.py (3)
  • TypeIntrospectionMixin (35-496)
  • extract_non_optional_type (481-496)
  • output_type (314-320)
packages/nvidia_nat_data_flywheel/src/nat/plugins/data_flywheel/observability/processor/trace_conversion/span_to_dfw_record.py (1)
  • span_to_dfw_record (99-119)
tests/nat/observability/mixin/test_type_introspection_mixin.py (1)
src/nat/observability/mixin/type_introspection_mixin.py (1)
  • extract_non_optional_type (481-496)
src/nat/observability/mixin/type_introspection_mixin.py (1)
src/nat/utils/type_utils.py (3)
  • DecomposedType (60-488)
  • is_optional (207-218)
  • get_optional_type (229-257)
🪛 Ruff (0.13.1)
src/nat/observability/mixin/type_introspection_mixin.py

480-480: Use of functools.lru_cache or functools.cache on methods can lead to memory leaks

(B019)

🔇 Additional comments (4)
tests/nat/observability/mixin/test_type_introspection_mixin.py (1)

562-668: Good addition: covers Optional/Union unwrapping thoroughly.

Scenarios look comprehensive and align with the registry lookup fix.

packages/nvidia_nat_data_flywheel/tests/observability/processor/test_dfw_record_processor.py (2)

182-184: Assertion updated to expect non-optional to_type.

Matches the new unwrapping behavior.


258-259: LGTM: verifies unwrapped to_type is passed through.

Good coverage for call arguments.

packages/nvidia_nat_data_flywheel/src/nat/plugins/data_flywheel/observability/processor/dfw_record_processor.py (1)

80-88: Sanity-check: ensure unwrapped output_type isn’t a multi-member Union.

If OutputT can be A | B | None, extract_non_optional_type(self.output_type) may yield a Union which TraceAdapterRegistry.convert likely rejects — add a runtime guard/assert that target_type is a single concrete (non-Union) type (or gate/raise) before calling span_to_dfw_record. Location: packages/nvidia_nat_data_flywheel/src/nat/plugins/data_flywheel/observability/processor/dfw_record_processor.py (lines 80–88). Repo search with the provided script returned no matches; manual verification required.

@mpenn mpenn changed the title Fix TraceAdapterRegistry errors in SpanToDFWRecordProcessor based on TypeIntrospectionMixin fixes Fix TraceAdapterRegistry lookup errors in SpanToDFWRecordProcessor Sep 22, 2025
@mpenn
Copy link
Contributor Author

mpenn commented Sep 22, 2025

/merge

@rapids-bot rapids-bot bot merged commit 31f8f13 into NVIDIA:develop Sep 22, 2025
17 checks passed
@LeBgDLaStreet
Copy link

Good wave

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working non-breaking Non-breaking change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants