Strip userinfo from OpenSearch host URL before using it as task-log label #65509

Merged
potiuk merged 1 commit into apache:main from potiuk:strip-userinfo-from-os-host-url
Apr 19, 2026

@potiuk potiuk commented Apr 19, 2026

Follow-up to #65349 (thanks @Owen-CH-Leung for catching this). OpenSearch's task-log handler has the same credential leak that Elasticsearch had: _group_logs_by_host falls back to the raw [opensearch] host config value as the log-source label, so a host URL containing user:password@... appears as a dictionary key in task-log output.
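To make the leak concrete, here is a minimal sketch of the fallback pattern described above. It is not the provider's actual code: `group_logs_by_host`, the hit dictionaries, and the example host URL are hypothetical stand-ins chosen for illustration.

```python
from collections import defaultdict

# Hypothetical stand-in for the configured `[opensearch] host` value.
CONFIGURED_HOST = "https://user:password@opensearch.example.com:9200"


def group_logs_by_host(hits, default_host):
    """Group log hits by their `host` field, falling back to `default_host`.

    Illustrative only: the fallback uses the raw configured value, so any
    credentials embedded in it become a dictionary key (the log-source label).
    """
    grouped = defaultdict(list)
    for hit in hits:
        key = hit.get("host") or default_host
        grouped[key].append(hit)
    return dict(grouped)


hits = [{"message": "task started"}, {"host": "worker-1", "message": "done"}]
grouped = group_logs_by_host(hits, CONFIGURED_HOST)

# Labels that still carry the password from the configured URL.
leaked_labels = [label for label in grouped if "password" in label]
```

In this sketch the hit without a `host` field ends up under a label that contains the password, which is the behavior this PR removes.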

Applies the same _strip_userinfo helper used in providers/elasticsearch. The OpenSearch client itself still connects using the full unredacted URL, so authentication is unaffected. Both _group_logs_by_host sites (OpensearchTaskHandler and OpensearchRemoteLogIO) are patched.
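For readers who have not seen the Elasticsearch change, a helper of this kind could look roughly like the sketch below. This is an assumption about the general approach using only the standard library, not a copy of the provider's `_strip_userinfo`; the function name `strip_userinfo` and the example URLs are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_userinfo(url: str) -> str:
    """Return `url` without a `user:password@` prefix in its authority part.

    Sketch only. Values without a `//` authority (plain host names, empty
    strings, schemeless values) are returned unchanged.
    """
    try:
        parts = urlsplit(url)
    except ValueError:
        # Unparseable input: leave it alone rather than guess.
        return url
    if not parts.netloc or "@" not in parts.netloc:
        return url
    # Everything after the last "@" is host[:port]; userinfo comes before it.
    host_part = parts.netloc.rsplit("@", 1)[1]
    return urlunsplit(parts._replace(netloc=host_part))


label = strip_userinfo("https://user:password@opensearch.example.com:9200")
```

Only the label is derived from the stripped value; the client would still be constructed from the original URL, which is why authentication is unaffected.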

Also adds AGENTS.md to both providers/opensearch and providers/elasticsearch noting that the two providers are forks and that most task-log-handler fixes should be cross-applied, so this kind of cross-provider miss is easier to avoid next time.

Test plan

  • New test_strip_userinfo parametrized across 7 input URL shapes (with userinfo, without userinfo, username-only, non-URL, empty) — all pass
  • Full existing test_os_task_handler.py non-db suite continues to pass (42/42)

Changelog

Added a redaction note above the latest version header in providers/opensearch/docs/changelog.rst (mirrors the ES PR).

Was generative AI tooling used to co-author this PR?
  • Yes — Claude Opus 4.7 (1M context)

Generated-by: Claude Opus 4.7 (1M context) following the guidelines

…abel

Follow-up to apache#65349 — OpenSearch's `_group_logs_by_host` had the same
credential-leak as Elasticsearch: the raw `[opensearch] host` config
value (which commonly embeds `user:password@...`) was used as a
log-source dictionary key, exposing credentials in task logs. Apply the
same `_strip_userinfo` helper; the OpenSearch client still connects
with the full URL so auth is unaffected. Both `OpensearchTaskHandler`
and `OpensearchRemoteLogIO` sites are patched.

Also add `AGENTS.md` to both `providers/opensearch` and
`providers/elasticsearch` noting that the two providers are forks and
most task-log-handler fixes should be cross-applied.

@eladkal eladkal left a comment

Nice!

@potiuk potiuk merged commit 6a6b6ff into apache:main Apr 19, 2026
36 checks passed
@potiuk potiuk deleted the strip-userinfo-from-os-host-url branch April 19, 2026 18:43