Return resolved download links in downloader output #281

Merged
soimkim merged 4 commits into main from down on May 26, 2026
Contributor

@soimkim soimkim commented May 26, 2026

Description

  1. Return resolved download links in downloader output.
    Include the actual resolved download link in downloader output results, and use an empty link when the download flow fails.
  2. Debian package version detection now extracts and returns the matched version from package page headings for improved accuracy.
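
A minimal sketch of the first point, assuming the downloader emits a JSON object per run. Only the "link" field is named in this PR; the helper name `build_download_result` and the `success` and `message` keys are hypothetical and used purely for illustration.

```python
import json


def build_download_result(success: bool, resolved_link: str, message: str = "") -> str:
    """Hypothetical helper: serialize a download result as JSON.

    The resolved link is reported only when the download flow succeeded;
    on failure the "link" field is an empty string.
    """
    result = {
        "success": success,   # assumed key, not confirmed by the PR
        "message": message,   # assumed key, not confirmed by the PR
        "link": resolved_link if success else "",
    }
    return json.dumps(result)


if __name__ == "__main__":
    print(build_download_result(True, "https://example.org/pkg-1.0.tar.gz"))
    print(build_download_result(False, "https://example.org/pkg-1.0.tar.gz", "download failed"))
```

Blanking the link on failure keeps consumers from treating an unverified URL as the source that was actually fetched.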

@soimkim soimkim self-assigned this May 26, 2026
@soimkim soimkim added the chore [PR/Issue] Refactoring, maintenance the code label May 26, 2026
coderabbitai Bot commented May 26, 2026

Warning

Review limit reached

@soimkim, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 30 minutes and 14 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ad299966-4411-4ded-b598-2784a254cfde

📥 Commits

Reviewing files that changed from the base of the PR and between 526fef8 and 6e5188c.

📒 Files selected for processing (3)
  • src/fosslight_util/_get_downloadable_url.py
  • src/fosslight_util/download.py
  • tests/test_download_version_hint.py
📝 Walkthrough

The PR adds Debian package heading version extraction to URL resolution, returning version alongside resolved pool tarballs, and extends download tracking to capture final resolved URLs. These are propagated through download flows and included in CLI output JSON for download success or failure.

Changes

Debian version extraction and download link propagation

| Layer / File(s) | Summary |
| --- | --- |
| Debian package heading version extraction: src/fosslight_util/_get_downloadable_url.py | New regex pattern and helper function extract the Debian h1 heading version; _resolve_debian_package_page_to_pool_tarball() and _resolve_debian_search_to_source_tarball() now return (url, version) tuples; get_downloadable_url() propagates the extracted Debian version into its return tuple instead of an empty string. |
| Download link tracking and CLI output: src/fosslight_util/download.py | download_wget() now tracks the resolved URL from get_downloadable_url() and returns it as an additional tuple element; cli_download_and_extract() captures this resolved link across git clone, wget, and rubygems flows and includes it in the output JSON "link" field (or empty string when overall operation fails). |
| Debian version and link tests: tests/test_download_version_hint.py | Added test helpers and new cases validating Debian package heading version extraction, version preservation through search-to-tarball resolution, and CLI output JSON structure with resolved download links on both success and failure paths. |
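
The heading-version extraction summarized above could look roughly like the sketch below. The regex, the assumed heading shape (`<h1>Package: name (version)</h1>`), and the function names `extract_heading_version` and `resolve_with_version` are illustrative assumptions; the pattern and helpers actually merged in this PR are not shown in the conversation and may differ.

```python
import re
from typing import Tuple

# Assumption: the package page heading looks like
#   <h1>Package: bash (5.2.15-2+b2)</h1>
# The pattern actually merged in this PR may differ.
_DEBIAN_H1_VERSION_RE = re.compile(
    r"<h1[^>]*>\s*Package:\s*[^\s(<]+\s*\(([^)\s]+)[^)]*\)",
    re.IGNORECASE,
)


def extract_heading_version(page_html: str) -> str:
    """Return the version in the first matching h1 heading, or "" if none."""
    match = _DEBIAN_H1_VERSION_RE.search(page_html)
    return match.group(1) if match else ""


def resolve_with_version(page_html: str, tarball_url: str) -> Tuple[str, str]:
    """Pair an already-resolved tarball URL with the heading version.

    Mirrors the (url, version) tuple shape described in the summary;
    the real resolvers also locate the tarball URL themselves.
    """
    return tarball_url, extract_heading_version(page_html)
```

Returning an empty string when no heading matches keeps the tuple shape stable, so callers can fall back to their previous behaviour without a special case.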

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • fosslight/fosslight_util#270: Both PRs modify src/fosslight_util/_get_downloadable_url.py's Debian URL resolution logic—specifically the packages.debian.org/search/package-page-to-pool-tarball flow—so the main PR's updated return values (including matched Debian heading version) build directly on the retrieved PR's Debian search-to-stable-tarball resolver.
  • fosslight/fosslight_util#277: Both PRs modify src/fosslight_util/_get_downloadable_url.py's Debian URL resolution logic (searching/following Debian package pages to resolve the pool tarball), so the main PR's version-extraction/tuple propagation builds directly on the same Debian resolution workflow.

Suggested labels

enhancement

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 23.53% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'Return resolved download links in downloader output' directly and clearly summarizes the main objective of the pull request, which is to include resolved download links in the downloader output. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/fosslight_util/_get_downloadable_url.py`:
- Around line 133-152: The code currently drops the parsed Debian heading
version by returning "" at the end even when version_tokens existed and
package_version was parsed; update the final return so that when version_tokens
is truthy you propagate package_version (not ""), i.e., return
(_normalize_debian_pool_download_from_tarball_hrefs(source_links),
package_version) when package_version is present (or otherwise fall back to ""),
ensuring the branch that filters source_links and reaches the final return
preserves package_version from earlier parsing.

In `@src/fosslight_util/download.py`:
- Around line 209-211: The code assigns downloaded_link = link and then writes
it to JSON, which can leak credentials or signed tokens; update places that set
downloaded_link (the assignments where downloaded_link = link and the other
similar branches around success_git and the blocks at 225-226 and 244-249) to
pass link through a sanitizer function (e.g., sanitize_url) before assigning.
Implement sanitize_url to parse the URL (urllib.parse), strip userinfo
(username/password) and remove or redact sensitive query parameters like token,
access_token, sig, signature, auth, jwt, and any param matching
/(token|sig|signature|auth)/i, then return the cleaned URL; use this sanitized
value for all JSON output and logging instead of the raw link.
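
For reference, the sanitizer suggested in the second comment might be sketched as follows. The name `sanitize_url` comes from the comment itself; the body is one possible reading of it built on `urllib.parse`, not code from this PR, and the parameter pattern simply follows the list in the comment.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Query parameter names treated as sensitive, following the review comment.
_SENSITIVE_PARAM_RE = re.compile(r"(token|sig|signature|auth|jwt)", re.IGNORECASE)


def sanitize_url(url: str) -> str:
    """Strip userinfo and sensitive query parameters from a URL."""
    parts = urlsplit(url)
    # Drop "user:password@" by keeping only what follows the last "@".
    netloc = parts.netloc.rpartition("@")[2]
    # Keep only query parameters whose names do not look sensitive.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not _SENSITIVE_PARAM_RE.search(key)
    ]
    return urlunsplit((parts.scheme, netloc, parts.path, urlencode(kept), parts.fragment))


if __name__ == "__main__":
    print(sanitize_url("https://user:pw@example.com/a.tar.gz?token=abc&v=1"))
```

Note that this sketch removes sensitive parameters rather than redacting them, and that re-encoding the query may normalize percent-escapes in the remaining parameters.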
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2e677b60-fe51-455b-8343-5cdfbedfe568

📥 Commits

Reviewing files that changed from the base of the PR and between 141ee1d and 526fef8.

📒 Files selected for processing (3)
  • src/fosslight_util/_get_downloadable_url.py
  • src/fosslight_util/download.py
  • tests/test_download_version_hint.py

Comment thread src/fosslight_util/_get_downloadable_url.py Outdated
Comment thread src/fosslight_util/download.py
@soimkim soimkim merged commit 54c916d into main May 26, 2026
8 checks passed
@soimkim soimkim deleted the down branch May 26, 2026 23:45