RSPEED-2867: Extract RH identity context to shared utility#1548
Conversation
Move _get_rh_identity_context() and AUTH_DISABLED from rlsapi_v1.py to src/utils/rh_identity.py for reuse by the responses endpoint. Update rlsapi imports to use the shared module. Add unit tests. Signed-off-by: Major Hayden <major@redhat.com>
WalkthroughThe pull request refactors identity context extraction logic from Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
✨ Simplify code
Warning Review ran into problems🔥 ProblemsTimed out fetching pipeline failures after 30000ms Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/app/endpoints/rlsapi_v1.py`:
- Line 56: The file currently re-exports get_rh_identity_context as
_get_rh_identity_context (importing AUTH_DISABLED and get_rh_identity_context)
to preserve backward compatibility; once downstream consumers and tests are
migrated off the underscore alias, remove the shim by deleting the
_get_rh_identity_context alias and update any remaining callers to use
get_rh_identity_context directly (search for usages of _get_rh_identity_context
and replace them with get_rh_identity_context, and remove the alias
import/export).
In `@src/utils/rh_identity.py`:
- Around line 1-39: Add a short inline comment in get_rh_identity_context
explaining the naming mismatch: note that system_id is intentionally populated
from RHIdentityData.get_user_id() because the telemetry uses RH identity's
user_id to represent a system identifier (this clarifies why the function
returns org_id, system_id while calling get_user_id()); place the comment
adjacent to the system_id assignment to reference org_id, system_id, and
get_user_id for future maintainers.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: c5d9cc1e-ef25-4263-a53f-f6bf4eff4ddd
📒 Files selected for processing (4)
src/app/endpoints/rlsapi_v1.pysrc/utils/rh_identity.pytests/unit/utils/README.mdtests/unit/utils/test_rh_identity.py
📜 Review details
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Use absolute imports for internal modules:from authentication import get_auth_dependency
Import FastAPI dependencies with:from fastapi import APIRouter, HTTPException, Request, status, Depends
Import Llama Stack client with:from llama_stack_client import AsyncLlamaStackClient
Checkconstants.pyfor shared constants before defining new ones
All modules start with descriptive docstrings explaining purpose
Uselogger = get_logger(__name__)fromlog.pyfor module logging
Type aliases defined at module level for clarity
Use Final[type] as type hint for all constants
All functions require docstrings with brief descriptions
Complete type annotations for parameters and return types in functions
Usetyping_extensions.Selffor model validators in Pydantic models
Use modern union type syntaxstr | intinstead ofUnion[str, int]
UseOptional[Type]for optional type hints
Use snake_case with descriptive, action-oriented function names (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead
Useasync deffor I/O operations and external API calls
HandleAPIConnectionErrorfrom Llama Stack in error handling
Use standard log levels with clear purposes: debug, info, warning, error
All classes require descriptive docstrings explaining purpose
Use PascalCase for class names with standard suffixes: Configuration, Error/Exception, Resolver, Interface
Use ABC for abstract base classes with@abstractmethoddecorators
Use@model_validatorand@field_validatorfor Pydantic model validation
Complete type annotations for all class attributes; use specific types, notAny
Follow Google Python docstring conventions with Parameters, Returns, Raises, and Attributes sections
Files:
src/utils/rh_identity.pytests/unit/utils/test_rh_identity.pysrc/app/endpoints/rlsapi_v1.py
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Pydantic models extend
ConfigurationBasefor config,BaseModelfor data models
Files:
src/utils/rh_identity.pysrc/app/endpoints/rlsapi_v1.py
tests/{unit,integration}/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
tests/{unit,integration}/**/*.py: Use pytest for all unit and integration tests
Do not use unittest; pytest is the standard for this project
Usepytest-mockfor AsyncMock objects in tests
Use markerpytest.mark.asynciofor async tests
Unit tests require 60% coverage, integration tests 10%
Files:
tests/unit/utils/test_rh_identity.py
src/app/endpoints/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Use FastAPI
HTTPExceptionwith appropriate status codes for API endpoints
Files:
src/app/endpoints/rlsapi_v1.py
🧠 Learnings (1)
📚 Learning: 2026-04-06T20:18:07.852Z
Learnt from: major
Repo: lightspeed-core/lightspeed-stack PR: 1463
File: src/app/endpoints/rlsapi_v1.py:266-271
Timestamp: 2026-04-06T20:18:07.852Z
Learning: In the lightspeed-stack codebase, within `src/app/endpoints/` inference/MCP endpoints, treat `tools: Optional[list[Any]]` in MCP tool definitions as an intentional, consistent typing pattern (used across `query`, `responses`, `streaming_query`, `rlsapi_v1`). Do not raise or suggest this as a typing issue during code review; changing it in isolation could break endpoint typing consistency across the codebase.
Applied to files:
src/app/endpoints/rlsapi_v1.py
🪛 markdownlint-cli2 (0.22.0)
tests/unit/utils/README.md
[warning] 42-42: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
🔇 Additional comments (2)
tests/unit/utils/README.md (1)
42-44: LGTM!Entry follows the existing documentation pattern. The markdownlint MD022 hint appears to be a false positive — the surrounding entries use the same heading+description+blank-line structure.
tests/unit/utils/test_rh_identity.py (1)
1-69: LGTM!Good parametrized coverage: present/absent/empty-org/empty-user cases, plus
assert_called_onceverification for the getters. Usespytest+pytest-mockwith properMockerFixturetyping, consistent with project conventions. The use ofmocker.Mock(spec=[])to simulate astatewithoutrh_identity_datacorrectly exercises thegetattr(..., None)fallback path in the utility.
| extract_token_usage, | ||
| get_mcp_tools, | ||
| ) | ||
| from utils.rh_identity import AUTH_DISABLED, get_rh_identity_context |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
LGTM — clean extraction with a backward-compatible alias.
Import of AUTH_DISABLED and get_rh_identity_context from the shared utility, plus the _get_rh_identity_context alias, preserves existing test imports while enabling reuse from the responses endpoint. Call sites at lines 349 and 510 are correctly updated.
One small nit: once downstream consumers (and existing tests) are migrated off _get_rh_identity_context, consider removing the alias to avoid carrying a permanent shim.
Also applies to: 80-81, 349-349, 510-510
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/app/endpoints/rlsapi_v1.py` at line 56, The file currently re-exports
get_rh_identity_context as _get_rh_identity_context (importing AUTH_DISABLED and
get_rh_identity_context) to preserve backward compatibility; once downstream
consumers and tests are migrated off the underscore alias, remove the shim by
deleting the _get_rh_identity_context alias and update any remaining callers to
use get_rh_identity_context directly (search for usages of
_get_rh_identity_context and replace them with get_rh_identity_context, and
remove the alias import/export).
There was a problem hiding this comment.
Yep, the alias is intentional for this PR since it's part 1/3. I'll clean it up in part 3 once the responses endpoint is wired up and nothing references the underscore-prefixed name anymore.
There was a problem hiding this comment.
@major, makes sense — noted! The alias is a clean way to keep existing tests green across the three-part series. I'll hold off flagging it until part 3 lands.
✏️ Learnings added
Learnt from: major
Repo: lightspeed-core/lightspeed-stack PR: 1548
File: src/app/endpoints/rlsapi_v1.py:56-56
Timestamp: 2026-04-20T15:09:45.119Z
Learning: In `src/app/endpoints/rlsapi_v1.py`, the `_get_rh_identity_context = get_rh_identity_context` alias is a deliberate, temporary backward-compatibility shim introduced in PR `#1548` (part 1/3 of Splunk HEC telemetry work). It is planned for removal in part 3 once the responses endpoint is fully wired up and no tests/consumers reference the underscore-prefixed name. Do not flag this alias as unnecessary or dead code until part 3 is merged.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: Lifto
Repo: lightspeed-core/lightspeed-stack PR: 1524
File: src/app/endpoints/responses.py:501-529
Timestamp: 2026-04-16T19:08:31.451Z
Learning: In `src/app/endpoints/responses.py`, `_sanitize_response_dict(response_dict, configured_mcp_labels)` intentionally mutates `response_dict` in-place. The dict is always a fresh throwaway produced by `model_dump()` on the immediately preceding line (both in the streaming `async for` loop and in the non-streaming path); no other reference to it exists. The AGENTS.md guideline "avoid in-place parameter modification" applies to mutating a caller's long-lived/shared data structures, not to ephemeral serialization dicts. Do not flag this as an anti-pattern in future reviews.
Learnt from: major
Repo: lightspeed-core/lightspeed-stack PR: 1463
File: src/app/endpoints/rlsapi_v1.py:266-271
Timestamp: 2026-04-06T20:18:07.852Z
Learning: In the lightspeed-stack codebase, within `src/app/endpoints/` inference/MCP endpoints, treat `tools: Optional[list[Any]]` in MCP tool definitions as an intentional, consistent typing pattern (used across `query`, `responses`, `streaming_query`, `rlsapi_v1`). Do not raise or suggest this as a typing issue during code review; changing it in isolation could break endpoint typing consistency across the codebase.
| """Utility functions for extracting RH Identity context for telemetry. | ||
|
|
||
| This module provides functions to extract organization and system identifiers | ||
| from Red Hat Identity request state for telemetry and logging purposes. | ||
| """ | ||
|
|
||
| from typing import Final, Optional | ||
|
|
||
| from starlette.requests import Request | ||
|
|
||
| from authentication.rh_identity import RHIdentityData | ||
|
|
||
| # Default value when RH Identity auth is not configured | ||
| AUTH_DISABLED: Final[str] = "auth_disabled" | ||
|
|
||
|
|
||
| def get_rh_identity_context(request: Request) -> tuple[str, str]: | ||
| """Extract org_id and system_id from RH Identity request state. | ||
|
|
||
| When RH Identity authentication is configured, the auth dependency stores | ||
| the RHIdentityData object in request.state.rh_identity_data. This function | ||
| extracts the org_id and system_id for telemetry purposes. | ||
|
|
||
| Args: | ||
| request: The FastAPI request object. | ||
|
|
||
| Returns: | ||
| Tuple of (org_id, system_id). Returns ("auth_disabled", "auth_disabled") | ||
| when RH Identity auth is not configured or data is unavailable. | ||
| """ | ||
| rh_identity: Optional[RHIdentityData] = getattr( | ||
| request.state, "rh_identity_data", None | ||
| ) | ||
| if rh_identity is None: | ||
| return AUTH_DISABLED, AUTH_DISABLED | ||
|
|
||
| org_id = rh_identity.get_org_id() or AUTH_DISABLED | ||
| system_id = rh_identity.get_user_id() or AUTH_DISABLED | ||
| return org_id, system_id |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
LGTM — clean, well-documented utility.
- Module docstring,
Final[str]constant, complete type annotations, and Google-style docstring match the repo guidelines. - Using
getattr(request.state, "rh_identity_data", None)correctly handles the case where the attribute was never set by the auth dependency (Starlette'sStateraisesAttributeErroron missing keys). - The
or AUTH_DISABLEDfallback on lines 37-38 treats any falsy return (empty string,None) as "unavailable", which is consistent with the existing behavior inrlsapi_v1.pyand is validated by the new tests.
Minor nit: the parameter is org_id, system_id, but system_id is populated from get_user_id(). This naming mismatch is pre-existing (carried over from rlsapi_v1.py), but it's worth a comment or a follow-up to clarify whether the RH identity "user_id" field semantically represents a system identifier in this telemetry context.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/utils/rh_identity.py` around lines 1 - 39, Add a short inline comment in
get_rh_identity_context explaining the naming mismatch: note that system_id is
intentionally populated from RHIdentityData.get_user_id() because the telemetry
uses RH identity's user_id to represent a system identifier (this clarifies why
the function returns org_id, system_id while calling get_user_id()); place the
comment adjacent to the system_id assignment to reference org_id, system_id, and
get_user_id for future maintainers.
There was a problem hiding this comment.
Good catch on the naming mismatch. That's pre-existing from the original code - I carried it over as-is to keep this refactor a pure move with no behavior change. I'll file a follow-up to clarify the naming.
There was a problem hiding this comment.
@major, that makes complete sense — keeping this PR as a pure move is the right call. Happy to open a follow-up GitHub issue to track the naming clarification if you'd like; just say the word and I'll create it.
PR Reviewer Guide 🔍Here are some key observations to aid the review process:
|
Description
Moves
_get_rh_identity_context()andAUTH_DISABLEDfromrlsapi_v1.pytosrc/utils/rh_identity.pyfor reuse by the responses endpoint. No behavior change. Part 1/3 of Splunk HEC telemetry for/responses.Type of change
Tools used to create PR
Related Tickets & Documents
Checklist before requesting a review
Testing
New unit tests:
uv run pytest tests/unit/utils/test_rh_identity.py -v(5 cases covering present/absent/empty identity scenarios). All 2033+ existing tests pass, no regressions.