add a license file #9
Merged
dkirov-dd added a commit that referenced this pull request Apr 29, 2026
… reliance

Replace the ``latest.json`` rolling pointer fetch with an S3 ``ListObjectsV2`` walk over ``targets/<project>/``: filter keys to PEP 440 stable versions and pick the maximum. The chosen version is then fetched through TUF as before, so the pointer file the client trusts is still cryptographically verified.

Why list S3 instead of parsing the signed targets metadata: once ``path_hash_prefixes`` delegations are in use, a client cannot tell from metadata alone which delegation signs the latest version of a given project. Listing the bucket sidesteps that; TUF still authoritatively verifies the chosen version's pointer.

The publisher counterpart in agent-integrations-tuf drops ``latest.json`` entirely; see DataDog/agent-integrations-tuf PR #9.

- ``_resolve_latest_version`` lists ``targets/<project>/`` via the S3 REST API (no boto3 dep), parses the XML response, follows the continuation-token pagination, and applies a PEP 440 stable filter
- ``get_pointer(project, version=None)`` resolves ``version`` itself before delegating to the TUF Updater
- 6 new offline tests cover max-version selection, pre-release/dev filtering, post-release support, the no-stable error, paginated listings, and non-pointer key skipping

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
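For readers following the version-selection step in the commit message above, here is a minimal sketch of picking the highest stable version from a list of bucket keys. It is not the code from the commit: `stable_key` and `resolve_latest` are hypothetical names, the key list is assumed to have been fetched and paginated already, and the regular expression is a deliberately simplified stand-in for full PEP 440 parsing (it accepts only `N(.N)*` with an optional `.postN` suffix and ignores epochs and alternate spellings).

```python
import re
from typing import Iterable, Optional, Tuple

# Simplified stand-in for PEP 440 parsing (assumption): final releases and
# ".postN" post-releases count as stable; anything else (rc, a, b, dev) is skipped.
STABLE_VERSION = re.compile(r"^(\d+(?:\.\d+)*)(?:\.post(\d+))?$")


def stable_key(version: str) -> Optional[Tuple[Tuple[int, ...], int]]:
    """Return a sortable key for a stable version, or None if it is not stable."""
    match = STABLE_VERSION.match(version)
    if match is None:
        return None
    release = [int(part) for part in match.group(1).split(".")]
    # Drop trailing zeros so that "1.0" and "1.0.0" compare as equal releases.
    while len(release) > 1 and release[-1] == 0:
        release.pop()
    # A plain release sorts before any of its post-releases.
    post = int(match.group(2)) if match.group(2) is not None else -1
    return tuple(release), post


def resolve_latest(keys: Iterable[str], project: str) -> str:
    """Pick the highest stable version among targets/<project>/<version>.json keys."""
    prefix = f"targets/{project}/"
    best = None
    for key in keys:
        # Skip other projects and non-pointer keys.
        if not (key.startswith(prefix) and key.endswith(".json")):
            continue
        version = key[len(prefix):-len(".json")]
        sort_key = stable_key(version)
        if sort_key is None:
            continue
        if best is None or sort_key > best[0]:
            best = (sort_key, version)
    if best is None:
        raise LookupError(f"no stable version found for {project!r}")
    return best[1]
```

The result of a helper like this is only a candidate version string. As the commit message notes, the pointer for that version is still fetched and verified through TUF, so an untrusted listing cannot substitute content.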
apurvagandhi pushed a commit to fidelity-contributions/datadog-integrations-core that referenced this pull request May 27, 2026
…DataDog#23144)

* feat(downloader): add TUFPointerDownloader for v2 pointer-file format

The new agent-integrations-tuf pipeline produces TUF targets as JSON pointer files (targets/<project>/<version>.json) rather than the old HTML simple index + in-toto approach. This commit adds:

- TUFPointerDownloader in download_v2.py: TUF-verifies the pointer file, then fetches and sha256-verifies the wheel from S3.
- DigestMismatch exception for sha256/length failures.
- --format v2 CLI flag: routes through TUFPointerDownloader. --unsafe-disable-verification carries forward; --type and --ignore-python-version are no-ops in v2 with a warning.
- 8 offline unit tests covering happy path, missing target, digest mismatch, length mismatch, and disable_verification mode.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(downloader): use --repository URL for wheel fetch, not pointer's baked value

The pointer file always contains the prod S3 repository URL. When validating staging, the caller passes --repository <staging-url> to point at the staging bucket; that URL should be used for both the TUF metadata fetch AND the wheel download, not just the metadata.

Adds a test that asserts the wheel is fetched from the caller-supplied URL even when the pointer contains a different (prod) repository value.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(downloader): resolve latest via S3 listing, drop latest.json reliance

Replace the ``latest.json`` rolling pointer fetch with an S3 ``ListObjectsV2`` walk over ``targets/<project>/``: filter keys to PEP 440 stable versions and pick the maximum. The chosen version is then fetched through TUF as before, so the pointer file the client trusts is still cryptographically verified.

Why list S3 instead of parsing the signed targets metadata: once ``path_hash_prefixes`` delegations are in use, a client cannot tell from metadata alone which delegation signs the latest version of a given project.
Listing the bucket sidesteps that; TUF still authoritatively verifies the chosen version's pointer.

The publisher counterpart in agent-integrations-tuf drops ``latest.json`` entirely; see DataDog/agent-integrations-tuf PR DataDog#9.

- ``_resolve_latest_version`` lists ``targets/<project>/`` via the S3 REST API (no boto3 dep), parses the XML response, follows the continuation-token pagination, and applies a PEP 440 stable filter
- ``get_pointer(project, version=None)`` resolves ``version`` itself before delegating to the TUF Updater
- 6 new offline tests cover max-version selection, pre-release/dev filtering, post-release support, the no-stable error, paginated listings, and non-pointer key skipping

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Revert "refactor(downloader): resolve latest via S3 listing, drop latest.json reliance"

This reverts commit 70688d8.

* feat(downloader): bundle 1.root.json; rename --format to --index; drop --root-json

- Bundle metadata/root_history/1.root.json from agent-integrations-tuf as a package resource; TUFPointerDownloader loads it via importlib.resources (no TOFU, no --root-json flag needed)
- Rename --format v2 to --index (boolean flag); v1 remains the default when --index is absent
- Remove trust_anchor parameter from TUFPointerDownloader.__init__
- Drop --format and --root-json from instantiate_downloader (v1 path)
- Register 1.root.json as a wheel artifact in pyproject.toml
- Update tests to match new interface

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* fix(downloader): rename --index to --v2

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* feat(downloader): default to v2 with v1 fallback; add prod URL constant

Without any flag the downloader now attempts v2 (against the prod S3 bucket) and falls back to v1 on any failure, so callers get the new format automatically without code changes.
Passing --v2 explicitly keeps the strict v2 path with no fallback (used by the pipeline's validate-staging step).

V2_REPOSITORY_URL is the prod bucket constant used for the default repository value in _download_v2(); callers can still override it with --repository.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* feat(downloader): resolve hash-prefixed targets via N.targets.json

The v2 TUF repository uses consistent-snapshot format: pointer files are stored as {sha256}.{version}.json on S3. Two changes to support this:

1. _make_updater now sets UpdaterConfig(prefix_targets_with_hash=True) so the TUF Updater resolves hash-prefixed paths automatically when calling download_target().
2. get_pointer() now parses N.targets.json (after Updater.refresh()) to enumerate available versions for the project. This replaces the removed latest.json: when version=None, _resolve_version() scans all <project>/<ver>.json entries in targets metadata and returns the highest stable PEP 440 version.

The disable_verification path fetches the metadata chain (timestamp → snapshot → targets) without verifying signatures to find the hash-prefixed URL, then fetches the pointer directly.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* feat(downloader): resolve latest via latest pointer target

* Move v2 TUF root metadata

* Simplify v2 downloader implementation

* feat(downloader): add MissingVersion and MalformedPointerError exceptions

Dedicated types replace the prior reuse of TargetNotFoundError for argument validation (which mislabeled the failure category) and the unchecked KeyError raised on a malformed pointer JSON.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(downloader): harden v2 wheel fetch and pointer handling

- Add explicit 60s timeout to urllib.request.urlopen so a stalled wheel fetch does not hang the Agent installer indefinitely.
- Validate required pointer JSON keys (digest, length, wheel_path) and raise the new MalformedPointerError instead of an opaque KeyError.
- Raise MissingVersion (a CLIError subclass) when --unsafe-disable-verification is set without --version, so the v1 fallback log reports the actual cause instead of "target not found".
- Extract _verify_content to drop the pointer-is-None sentinel and make the verified and direct-download branches structurally parallel.
- Add `from __future__ import annotations` so the PEP 604 unions stay compatible with the declared requires-python = ">=3.8".
- Move logging.basicConfig out of the constructor and into the CLI entry point (separate commit); the class no longer mutates the root logger.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(downloader): make v2/v1 fallback handle validation errors and --force

- Split _download_v2() into instantiate_v2_downloader() and run_v2_downloader() to mirror the v1 instantiate/run split and let the warning/validation branches be tested without patching sys.argv.
- Re-raise user-input errors (CLIError, MissingVersion) before the broad except so they propagate as-is instead of triggering a spurious v1 retry and a misleading "v2 download failed" log line.
- Add --force as a no-op compat stub on the v2 parser so v1-only callers do not trip parse_args -> SystemExit and silently skip the fallback.
- Hoist `import logging` to module top (was lazy-imported in the except block) and own the verbose-to-level + logging.basicConfig setup that used to live inside TUFPointerDownloader.__init__.
- Drop the meaningless `--v2 default=True` re-declaration; rename underscore-prefixed argparse dests to plain names.
- Note in the fallback block that v1 offline tests now traverse v2 first on every invocation.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(downloader): broaden v2 coverage and parametrize failure categories

- Parametrize _v2_failure_category across all five (exc, category) cases and add DownloadError / TimeoutError coverage that the categorizer already handles but previous tests never asserted.
- Replace direct calls to TUFPointerDownloader._target_path with a parametrized test that drives get_pointer and asserts on Updater.get_targetinfo so the behavior, not the private helper, is what's pinned.
- Add failure-mode tests for malformed pointer JSON (one per required key), urllib HTTPError/URLError mid-download, and wheel_path without a leading slash so the URL-composition contract is visible.
- Update test_direct_download_requires_explicit_version to expect MissingVersion now that argument-validation no longer reuses TargetNotFoundError.
- Move @pytest.mark.offline from each class to a module-level pytestmark; drop the leading-underscore prefix on module constants to match AGENTS.md style.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* style(downloader): sort test imports per project ruff config

Ruff in CI uses the root ../pyproject.toml which treats datadog_checks as first-party. Reorder the test imports to match.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* refactor(downloader): address PR DataDog#23144 review feedback

- exceptions.py: type-hint MalformedPointerError/DigestMismatch __init__; add LengthMismatch (split from the overloaded DigestMismatch).
- download_v2.py: drop underscore from WHEEL_FETCH_TIMEOUT_SECONDS and REQUIRED_POINTER_KEYS per AGENTS.md; validate wheel_path leading slash via MalformedPointerError; verify length first (cheap early-out) before the sha256 digest check.
- cli.py: add type hints on download(), _v2_parser(), instantiate_v2_downloader(), run_v2_downloader(); drop the unused _args parameter from run_v2_downloader; collapse the redundant (CLIError, MissingVersion) except clause to just CLIError.
- test_v2_downloader.py: assert MalformedPointerError when wheel_path lacks a leading slash; split TestLengthMismatch from TestDigestMismatch; cover instantiate_v2_downloader validation/warning branches and the cli.download() v2-then-v1 fallback orchestration; drop the inline Updater patch in TestDisableVerification in favour of the fixture.

* Fix v2 downloader blockers: narrow fallback, future import

Narrow the v1 fallback in download() to a tuple of network/lookup errors. Previously every non-CLIError exception triggered v1 retry, including DigestMismatch / LengthMismatch / MalformedPointerError, i.e. integrity failures the v2 path is meant to surface were silently masked. Now those propagate; only TargetNotFoundError, DownloadError, TimeoutError, and urllib.error.URLError fall back.

Add `from __future__ import annotations` to cli.py: the new module uses PEP 604 unions and PEP 585 subscripted generics at definition time, which crash on Python 3.8/3.9 (pyproject.toml declares requires-python = ">=3.8"). download_v2.py already had the import.

Add parametrized test pinning the new behavior: DigestMismatch, LengthMismatch, and MalformedPointerError propagate without invoking the v1 downloader.

Other review feedback (refactor download(), gate compat warnings on --v2, validate pointer field types, split download() into verified / direct, etc.) is deferred to a follow-up to keep this PR focused.

* Preserve v1 downloader fallback behavior

* Format v2 downloader tests

* Add v2 downloader reviewer test coverage

* Reuse v2 downloader test wheel name

* Restore unsafe v1 fallback regression test

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
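The pointer checks described in the commits above (required keys, leading slash on wheel_path, length before sha256) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in download_v2.py: the exception classes are bare stand-ins for the ones named in the commit messages, `validate_pointer` and `verify_content` are hypothetical helper names, and the pointer is assumed to carry `digest` as a hex-encoded sha256 string and `length` as a byte count.

```python
import hashlib

# Key names taken from the commit message; the tuple itself is an assumption.
REQUIRED_POINTER_KEYS = ("digest", "length", "wheel_path")


class MalformedPointerError(Exception):
    """Stand-in: the pointer JSON is missing a key or has a bad wheel_path."""


class LengthMismatch(Exception):
    """Stand-in: the downloaded content has an unexpected size."""


class DigestMismatch(Exception):
    """Stand-in: the downloaded content has an unexpected sha256 digest."""


def validate_pointer(pointer: dict) -> None:
    """Reject pointers that lack required keys or a leading slash in wheel_path."""
    missing = [key for key in REQUIRED_POINTER_KEYS if key not in pointer]
    if missing:
        raise MalformedPointerError(f"pointer is missing keys: {missing}")
    if not str(pointer["wheel_path"]).startswith("/"):
        raise MalformedPointerError("wheel_path must start with '/'")


def verify_content(content: bytes, pointer: dict) -> None:
    """Check length first (cheap early-out), then the sha256 digest."""
    validate_pointer(pointer)
    if len(content) != pointer["length"]:
        raise LengthMismatch(
            f"expected {pointer['length']} bytes, got {len(content)}"
        )
    actual = hashlib.sha256(content).hexdigest()
    if actual != pointer["digest"]:
        raise DigestMismatch(f"expected {pointer['digest']}, got {actual}")
```

Keeping the three failure types distinct is what lets a caller treat them as integrity failures that must propagate, separate from the network and lookup errors that are allowed to trigger the v1 fallback.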
No description provided.