-
Notifications
You must be signed in to change notification settings - Fork 54
LCORE-395: config endpoint should not leak secrets #502
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughPydantic config fields for secrets were changed to SecretStr; code now calls get_secret_value() when constructing DB URLs and service clients. Tests were updated to treat secrets as SecretStr. Some endpoints and streaming handlers were modified to log or expose the unmasked configuration object (including API key). Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor Caller
participant Config as Pydantic Config
participant DB as EngineCreator/_create_postgres_engine
participant Client as ServiceClientBuilder
participant Logger as App Logger
Caller->>Config: load config (password: SecretStr, api_key: SecretStr)
Caller->>DB: request engine (uses config.postgres.password)
DB->>Config: config.postgres.password.get_secret_value()
DB-->>Caller: engine created with plaintext password in URL
Caller->>Client: build service client (uses config.llama_stack.api_key)
Client->>Config: config.llama_stack.api_key.get_secret_value()
Client-->>Caller: client constructed with plaintext api_key
Caller->>Logger: log configuration
Logger-->>Caller: (now logs unmasked config object, includes SecretStr values if not explicitly masked)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (4)
src/models/config.py (3)
38-38: Exclude tls_key_password from serialization — also hide from reprGood hardening. Consider additionally setting repr=False to avoid accidental log exposure, and (optional) renaming to tls_key_password_path for clarity.
- tls_key_password: Optional[FilePath] = Field(default=None, exclude=True) + tls_key_password: Optional[FilePath] = Field(default=None, exclude=True, repr=False)
83-83: DB password: prefer SecretStr and hide from reprExcluding solves the config-leak; using SecretStr adds defense-in-depth for logs/repr and downstream handling.
-from pydantic import ( +from pydantic import ( BaseModel, ConfigDict, Field, model_validator, FilePath, AnyHttpUrl, PositiveInt, + SecretStr, )- password: str = Field(exclude=True) + password: SecretStr = Field(exclude=True, repr=False)Note: callers need
.password.get_secret_value()where a str is required.
170-170: Llama API key: use SecretStr and hide from reprSame rationale as DB password.
- api_key: Optional[str] = Field(default=None, exclude=True) + api_key: Optional[SecretStr] = Field(default=None, exclude=True, repr=False)tests/unit/models/test_config.py (1)
540-551: Tests: add explicit “not present” assertions for secrets and fix outdated noteYou already validate via full-dict equality. Adding negative asserts makes test intent obvious. Also update the note that redaction is now in place.
- Please note that redaction process is not in place. + Secrets are redacted via Field(exclude=True).After loading content:
content = json.load(fin) # content should be loaded assert content is not None + # explicitly verify secrets are not serialized + assert "tls_key_password" not in content["service"]["tls_config"] + assert "api_key" not in content["llama_stack"] + assert "password" not in content["database"]["postgres"]
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
src/models/config.py(3 hunks)tests/unit/models/test_config.py(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/unit/models/test_config.py (1)
src/models/config.py (2)
DatabaseConfiguration(97-135)PostgreSQLDatabaseConfiguration(76-94)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build-pr
🔇 Additional comments (1)
tests/unit/models/test_config.py (1)
638-648: LGTM: expected DB JSON matches shape and omits passwordThe expected structure is comprehensive and correctly redacts the secret.
tisnik
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but please update PR title to contain Jira issue
Right now in assisted-chat we're blocking the config endpoint completely, even for employees, because it leaks secrets like: - database password - llama stack api key - tls key password This change makes sure those secrets are not included in the serialized configuration. This way we can open up the config endpoint to employees again (still not the public, because it contains things like system prompt, which we would like to keep private).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/models/config.py (1)
37-39: tls_key_password type is incorrect; should be a secret string, not a file path.A password shouldn't be modeled as
FilePath. This rejects valid configs (string password) and, if set to a path, would leak filesystem details via the /config endpoint. UseSecretStrto align with the PR goal.Apply:
class TLSConfiguration(ConfigurationBase): @@ - tls_key_password: Optional[FilePath] = None + tls_key_password: Optional[SecretStr] = NoneIf any consumers expect a raw string, unwrap with
.get_secret_value().
🧹 Nitpick comments (8)
src/models/config.py (1)
146-150: Avoid shared mutable defaults for nested models and lists.Use
Field(default_factory=...)to prevent instances from sharing the same default objects.Apply:
class ServiceConfiguration(ConfigurationBase): @@ - tls_config: TLSConfiguration = TLSConfiguration() - cors: CORSConfiguration = CORSConfiguration() + tls_config: TLSConfiguration = Field(default_factory=TLSConfiguration) + cors: CORSConfiguration = Field(default_factory=CORSConfiguration)class Configuration(ConfigurationBase): @@ - database: DatabaseConfiguration = DatabaseConfiguration() - mcp_servers: list[ModelContextProtocolServer] = [] - authentication: AuthenticationConfiguration = AuthenticationConfiguration() + database: DatabaseConfiguration = Field(default_factory=DatabaseConfiguration) + mcp_servers: list[ModelContextProtocolServer] = Field(default_factory=list) + authentication: AuthenticationConfiguration = Field(default_factory=AuthenticationConfiguration) authorization: Optional[AuthorizationConfiguration] = None customization: Optional[Customization] = None - inference: InferenceConfiguration = InferenceConfiguration() + inference: InferenceConfiguration = Field(default_factory=InferenceConfiguration)Also applies to: 461-467
src/client.py (1)
41-46: Minor: treat empty api_key as None.Defensively normalize empty strings to
Noneto avoid sending an empty header.Apply:
- api_key=( - llama_stack_config.api_key.get_secret_value() - if llama_stack_config.api_key is not None - else None - ), + api_key=(lambda v: (v.strip() or None))( + llama_stack_config.api_key.get_secret_value() + if llama_stack_config.api_key is not None else "" + ),src/app/endpoints/config.py (1)
39-41: Example response masks api_key; ensure runtime serialization also masks secrets.Given SecretStr usage, FastAPI/Pydantic should emit masked values in /config. Consider adding/keeping an integration test asserting redaction for all secret fields (api_key, DB password, TLS key password).
I can add a test hitting GET /config and asserting that secret fields match a redaction pattern (e.g., r'^*+$') and never equal the real values. Do you want me to draft it?
tests/integration/test_configuration.py (1)
61-61: Guard against None before unwrapping SecretStr.Be explicit to prevent accidental AttributeError if the fixture changes.
- assert ls_config.api_key.get_secret_value() == "xyzzy" + assert ls_config.api_key is not None + assert ls_config.api_key.get_secret_value() == "xyzzy"tests/unit/models/test_config.py (4)
540-551: Add a regression check that no cleartext secret appears in the raw dumpThe structure looks good. To harden this test against regressions, also assert the raw file doesn’t contain the actual secret value (e.g., "ls_password"), not just the redacted form.
Add after the equality assertion in this test:
with open(dump_file, "r", encoding="utf-8") as fin: raw = fin.read() assert "ls_password" not in raw
615-619: Avoid brittle equality on the exact redaction maskAsserting the exact "**********" ties you to the library’s redaction formatting. Prefer asserting “masked and not equal to the real value.”
Append these checks in the test (keep the deep-equality if you like, but these make it robust):
api_key_mask = content["llama_stack"]["api_key"] assert api_key_mask != "whatever" assert set(api_key_mask) == {"*"} # contains only asterisks
640-651: Same brittleness for Postgres password redactionMirror the robust checks for the DB password to avoid depending on the exact number of asterisks.
pwd_mask = content["database"]["postgres"]["password"] assert pwd_mask != "ls_password" assert set(pwd_mask) == {"*"}
515-516: Update stale note: redaction is now in placeThe comment says redaction isn’t in place, but the expectations assert redaction. Update to avoid confusion.
- Please note that redaction process is not in place. + Note: secrets are redacted in dumps; this test asserts the redaction markers.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (9)
src/app/database.py(1 hunks)src/app/endpoints/config.py(1 hunks)src/app/endpoints/query.py(1 hunks)src/app/endpoints/streaming_query.py(1 hunks)src/client.py(1 hunks)src/models/config.py(3 hunks)tests/integration/test_configuration.py(1 hunks)tests/unit/models/test_config.py(4 hunks)tests/unit/test_configuration.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (5)
src/app/endpoints/query.py (1)
src/configuration.py (2)
configuration(60-64)llama_stack_configuration(74-78)
src/app/database.py (1)
src/models/config.py (1)
config(130-136)
tests/unit/test_configuration.py (1)
src/configuration.py (1)
llama_stack_configuration(74-78)
src/app/endpoints/streaming_query.py (1)
src/configuration.py (2)
configuration(60-64)llama_stack_configuration(74-78)
tests/unit/models/test_config.py (1)
src/models/config.py (2)
DatabaseConfiguration(98-136)PostgreSQLDatabaseConfiguration(77-95)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build-pr
🔇 Additional comments (5)
src/models/config.py (2)
84-85: Good: Postgres password switched to SecretStr.This aligns with secret handling and prevents accidental exposure in serialization.
171-174: Good: Llama Stack api_key now SecretStr.Unifies secret treatment and keeps OpenAPI/example masking consistent.
src/client.py (1)
41-46: Correct: unwrap api_key with get_secret_value().This prevents passing a SecretStr object to the client and avoids accidental stringification leaks.
tests/unit/test_configuration.py (1)
86-86: Good: updated test to unwrap SecretStr.Accurately reflects the new secret-wrapped api_key.
tests/unit/models/test_config.py (1)
1005-1005: LGTM: SecretStr handling validated correctlySwitching to
get_secret_value()is the right assertion forSecretStr.
| f"postgresql://{config.user}:{config.password.get_secret_value()}@" | ||
| f"{config.host}:{config.port}/{config.db}" | ||
| f"?sslmode={config.ssl_mode}&gssencmode={config.gss_encmode}" | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Build the Postgres URL with SQLAlchemy URL.create to correctly escape credentials.
String interpolation will break with special characters in user/password and risks subtle connection bugs. Use URL.create and fix the options flag spacing.
Apply:
-from sqlalchemy import create_engine, text
+from sqlalchemy import create_engine, text
+from sqlalchemy.engine import URL
@@
- postgres_url = (
- f"postgresql://{config.user}:{config.password.get_secret_value()}@"
- f"{config.host}:{config.port}/{config.db}"
- f"?sslmode={config.ssl_mode}&gssencmode={config.gss_encmode}"
- )
+ postgres_url = URL.create(
+ drivername="postgresql",
+ username=config.user,
+ password=config.password.get_secret_value(),
+ host=config.host,
+ port=int(config.port),
+ database=config.db,
+ query={"sslmode": config.ssl_mode, "gssencmode": config.gss_encmode},
+ )
@@
- if is_custom_schema:
- connect_args["options"] = f"-csearch_path={config.namespace}"
+ if is_custom_schema:
+ connect_args["options"] = f"-c search_path={config.namespace}"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| f"postgresql://{config.user}:{config.password.get_secret_value()}@" | |
| f"{config.host}:{config.port}/{config.db}" | |
| f"?sslmode={config.ssl_mode}&gssencmode={config.gss_encmode}" | |
| ) | |
| ++ b/src/app/database.py | |
| @@ -1,3 +1,4 @@ | |
| from sqlalchemy import create_engine, text | |
| from sqlalchemy.engine import URL | |
| # … earlier code … | |
| - postgres_url = ( | |
| - f"postgresql://{config.user}:{config.password.get_secret_value()}@" | |
| - f"{config.host}:{config.port}/{config.db}" | |
| - f"?sslmode={config.ssl_mode}&gssencmode={config.gss_encmode}" | |
| postgres_url = URL.create( | |
| drivername="postgresql", | |
| username=config.user, | |
| password=config.password.get_secret_value(), | |
| host=config.host, | |
| port=int(config.port), | |
| database=config.db, | |
| query={ | |
| "sslmode": config.ssl_mode, | |
| "gssencmode": config.gss_encmode, | |
| }, | |
| ) | |
| - if is_custom_schema: | |
| if is_custom_schema: | |
| connect_args["options"] = f"-c search_path={config.namespace}" | |
| # … later code … |
🤖 Prompt for AI Agents
In src/app/database.py around lines 62 to 65, the Postgres connection string is
being built via naive f-string interpolation which will mis-handle special
characters in the user/password and has incorrect query option spacing; replace
the f-string construction with SQLAlchemy's URL.create to properly escape
credentials and assemble the components (drivername 'postgresql', username from
config.user, password from config.password.get_secret_value(), host, port,
database name) and pass the query options as a dict (e.g., {'sslmode':
config.ssl_mode, 'gssencmode': config.gss_encmode}) so the URL is correctly
encoded and the options flag spacing is fixed.
| # log Llama Stack configuration | ||
| logger.info("Llama stack config: %s", configuration.llama_stack_configuration) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Avoid logging secret-bearing config at INFO; sanitize and downgrade to DEBUG.
Even with SecretStr masking, logging the full config object at INFO on every request is risky and noisy. Emit only a sanitized JSON view and use DEBUG.
- # log Llama Stack configuration
- logger.info("Llama stack config: %s", configuration.llama_stack_configuration)
+ # log Llama Stack configuration (sanitized)
+ logger.debug(
+ "Llama stack config: %s",
+ configuration.llama_stack_configuration.model_dump(
+ mode="json", exclude={"api_key"}, exclude_none=True
+ ),
+ )📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # log Llama Stack configuration | |
| logger.info("Llama stack config: %s", configuration.llama_stack_configuration) | |
| # log Llama Stack configuration (sanitized) | |
| logger.debug( | |
| "Llama stack config: %s", | |
| configuration.llama_stack_configuration.model_dump( | |
| mode="json", exclude={"api_key"}, exclude_none=True | |
| ), | |
| ) |
🤖 Prompt for AI Agents
In src/app/endpoints/query.py around lines 179-181, the code currently logs the
entire Llama stack configuration at INFO which may expose secrets; change this
to log a sanitized representation at DEBUG instead: build a JSON-safe dict from
configuration that excludes or redacts secret-bearing fields (exclude fields
named like '*secret*' or instances of SecretStr/SecretBytes, or use pydantic's
.dict(exclude=...) if applicable), then logger.debug the sanitized JSON view and
remove or downgrade the existing logger.info call so no full config is logged at
INFO.
| # log Llama Stack configuration | ||
| logger.info("Llama stack config: %s", configuration.llama_stack_configuration) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Same logging concern: sanitize and use DEBUG.
Mirror the sanitized, lower-verbosity logging to avoid leaking sensitive config details.
- # log Llama Stack configuration
- logger.info("Llama stack config: %s", configuration.llama_stack_configuration)
+ # log Llama Stack configuration (sanitized)
+ logger.debug(
+ "Llama stack config: %s",
+ configuration.llama_stack_configuration.model_dump(
+ mode="json", exclude={"api_key"}, exclude_none=True
+ ),
+ )📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # log Llama Stack configuration | |
| logger.info("Llama stack config: %s", configuration.llama_stack_configuration) | |
| # log Llama Stack configuration (sanitized) | |
| logger.debug( | |
| "Llama stack config: %s", | |
| configuration.llama_stack_configuration.model_dump( | |
| mode="json", | |
| exclude={"api_key"}, | |
| exclude_none=True, | |
| ), | |
| ) |
tisnik
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Description
Right now in assisted-chat we're blocking the config endpoint completely, even for employees, because it leaks secrets like:
This change makes sure those secrets are not included in the serialized configuration. This way we can open up the config endpoint to employees again (still not the public, because it contains things like system prompt, which we would like to keep private).
Type of change
Related Tickets & Documents
LCORE-395
Checklist before requesting a review
Testing
Adjusted unit-tests and checked manually
Summary by CodeRabbit
New Features
Bug Fixes / Security
Tests