feat: core support for base repository URLs with configurable branch and path #77

Merged
Kilo59 merged 16 commits into main from
17-let-me-declare-a-base-repo-url-as-an-upstream
Mar 8, 2026

Conversation

@Kilo59
Owner

@Kilo59 Kilo59 commented Mar 8, 2026

This PR implements full support for using base repository URLs (GitHub/GitLab) as upstream ruff-sync sources, including advanced configuration for custom branches and directory prefixes.

New Features

  • Repository URL Support: GitHub and GitLab repository URLs (e.g., https://github.com/org/repo) are now automatically resolved to their raw content counterparts.
  • Configurable Defaults: Added --branch and --path CLI options (and corresponding [tool.ruff-sync] configuration keys) to override the default main branch and root pyproject.toml path.
  • Advanced Configuration: The path configuration now allows specifying a directory prefix (parent path) where pyproject.toml is located.
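As a rough illustration of the resolution described above, the following sketch maps a base repository URL plus an optional branch and directory prefix to a raw pyproject.toml URL. The function name resolve_repo_url and its parameters are hypothetical and are not taken from ruff_sync.py; the sketch also uses the standard library rather than httpx, and it only handles two-segment org/repo paths (GitLab subgroups are not covered).

```python
from urllib.parse import urlsplit


def resolve_repo_url(url: str, branch: str = "main", path: str = "") -> str:
    """Hypothetical helper: map a base repo URL to a raw pyproject.toml URL."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) != 2:
        return url  # not a bare org/repo URL, so leave it untouched
    org, repo = segments

    # Treat `path` as a directory prefix that contains pyproject.toml.
    prefix = path.strip("/")
    target = f"{prefix}/pyproject.toml" if prefix else "pyproject.toml"

    if parts.hostname == "github.com":
        return f"https://raw.githubusercontent.com/{org}/{repo}/{branch}/{target}"
    if parts.hostname == "gitlab.com":
        return f"https://gitlab.com/{org}/{repo}/-/raw/{branch}/{target}"
    return url  # unknown host: pass through unchanged


print(resolve_repo_url("https://github.com/org/repo"))
print(resolve_repo_url("https://github.com/org/repo", branch="develop", path="config"))
```

With the defaults, a bare repository URL resolves to main/pyproject.toml; passing a branch and a path changes only those two URL components.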

Enhancements

  • Refactored Argument Resolution: Optimized ruff_sync.py by splitting complex logic into specialized helper functions, improving readability and maintainability.
  • Improved Type Safety: Added a Config TypedDict for robust handling of settings loaded from pyproject.toml.
  • Better Developer Experience: Added a .git-blame-ignore-revs file to maintain a clean git blame history.
  • Modernized Tooling: Updated all dogfooding scripts and internal documentation to use uv instead of poetry.
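To make the Config TypedDict idea concrete, here is a minimal sketch of how settings from a parsed pyproject.toml could be narrowed to a typed mapping. The key names (upstream, branch, path) and the helper extract_config are assumptions for illustration; the actual fields in ruff_sync.py may differ.

```python
from typing import Any, TypedDict, cast


class Config(TypedDict, total=False):
    """Assumed shape of the [tool.ruff-sync] table; every key is optional."""

    upstream: str
    branch: str
    path: str


def extract_config(pyproject: dict[str, Any]) -> Config:
    """Hypothetical helper: keep only the keys that Config declares."""
    section = pyproject.get("tool", {}).get("ruff-sync", {})
    known = {k: v for k, v in section.items() if k in Config.__annotations__}
    return cast(Config, known)


doc = {"tool": {"ruff-sync": {"upstream": "https://github.com/org/repo", "branch": "develop", "typo_key": 1}}}
print(extract_config(doc))
```

Filtering on the declared keys means an unexpected entry in the table is dropped instead of flowing into the rest of the program untyped.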

Documentation

  • Updated README.md with an "Advanced Configuration" section and refined examples to reflect current best practices.

Testing

  • Added comprehensive unit tests in tests/test_url_handling.py covering various URL patterns and configuration overrides.

Chores

  • Bumped project version to 0.0.3.dev1.

@Kilo59 Kilo59 linked an issue Mar 8, 2026 that may be closed by this pull request
@sourcery-ai

sourcery-ai Bot commented Mar 8, 2026

Reviewer's Guide

Adds support for using GitHub repository URLs (not just blob URLs) as Ruff config upstreams by enhancing GitHub URL → raw URL conversion, updates tooling/docs/scripts to prefer repo-style URLs and uv-based commands, bumps version to 0.0.3, and introduces targeted unit tests and blame-ignore config.

Sequence diagram for converting upstream GitHub URL to raw URL

sequenceDiagram
    actor User
    participant CLI as ruff_sync_CLI
    participant URLFunc as github_url_to_raw_url
    participant GitHub as github_com
    participant RawGH as raw_githubusercontent_com

    User->>CLI: run_ruff_sync_with_upstream_URL
    CLI->>URLFunc: github_url_to_raw_url(upstream_URL)

    alt Non_GitHub_URL
        URLFunc-->>CLI: return_original_URL
        CLI->>GitHub: GET_original_URL
        GitHub-->>CLI: pyproject_toml_content
    else GitHub_blob_URL
        URLFunc->>URLFunc: replace_github_com_with_raw_domain
        URLFunc->>URLFunc: replace_blob_segment_with_path
        URLFunc-->>CLI: return_raw_URL
        CLI->>RawGH: GET_raw_URL
        RawGH-->>CLI: pyproject_toml_content
    else GitHub_repo_URL
        URLFunc->>URLFunc: parse_path_components
        URLFunc->>URLFunc: build_raw_URL_org_repo_main_pyproject
        URLFunc-->>CLI: return_raw_URL
        CLI->>RawGH: GET_raw_URL
        RawGH-->>CLI: pyproject_toml_content
    else Other_GitHub_pattern
        URLFunc-->>CLI: return_original_URL
        CLI->>GitHub: GET_original_URL
        GitHub-->>CLI: pyproject_toml_content
    end

    CLI->>CLI: apply_sync_or_check_logic
    CLI-->>User: report_result

Flow diagram for github_url_to_raw_url decision logic

flowchart TD
    A[start_github_url_to_raw_url] --> B[log_initial_URL]
    B --> C{URL_contains_github_com}

    C -- No --> D[log_non_GitHub_return_as_is]
    D --> Z[end_return_original_URL]

    C -- Yes --> E{URL_path_contains_blob_segment}

    E -- Yes --> F[replace_github_com_with_raw_githubusercontent_com]
    F --> G[replace_blob_segment_with_slash_path]
    G --> H[log_blob_conversion]
    H --> I[return_raw_URL]
    I --> Z

    E -- No --> J[split_URL_path_into_non_empty_parts]
    J --> K{path_parts_length_equals_repo_parts_count}

    K -- Yes --> L[extract_org_and_repo]
    L --> M[build_raw_URL_org_repo_main_pyproject]
    M --> N[log_repo_conversion]
    N --> O[return_raw_URL]
    O --> Z

    K -- No --> P[log_unknown_GitHub_pattern_return_as_is]
    P --> Z
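The decision logic in the flow diagram can be sketched in Python roughly as follows. This is an approximation written from the diagram, not the code in ruff_sync.py: it takes a plain str instead of an httpx.URL, and the constant name REPO_PATH_PARTS and the function name github_url_to_raw_url_sketch are assumed.

```python
import logging
from urllib.parse import urlsplit

LOGGER = logging.getLogger(__name__)
REPO_PATH_PARTS = 2  # assumed constant: an org segment plus a repo segment


def github_url_to_raw_url_sketch(url: str) -> str:
    """Approximate the flowchart: blob URL, repo URL, or pass-through."""
    LOGGER.debug("Initial URL: %s", url)
    if "github.com" not in url:
        LOGGER.info("URL is not a GitHub URL, returning as is.")
        return url

    if "/blob/" in url:
        # Blob URL: swap the domain and drop the /blob/ segment.
        raw = url.replace("github.com", "raw.githubusercontent.com")
        raw = raw.replace("/blob/", "/")
        LOGGER.info("Converted blob URL to %s", raw)
        return raw

    path_parts = [p for p in urlsplit(url).path.split("/") if p]
    if len(path_parts) == REPO_PATH_PARTS:
        # Repo URL: assume the main branch and a root pyproject.toml.
        org, repo = path_parts
        raw = f"https://raw.githubusercontent.com/{org}/{repo}/main/pyproject.toml"
        LOGGER.info("Converted repo URL to %s", raw)
        return raw

    LOGGER.info("Unknown GitHub URL pattern, returning as is.")
    return url


print(github_url_to_raw_url_sketch("https://github.com/pydantic/pydantic"))
```

An already-raw URL falls into the first branch because "github.com" is not a substring of "raw.githubusercontent.com", so it is returned unchanged.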

File-Level Changes

Change: Enhance GitHub URL handling to support both blob and repository URLs for upstream resolution, with clearer logging and a small refactor in source path handling.
Details:
  • Extend github_url_to_raw_url to early-return non-GitHub URLs unchanged and log decision points.
  • Convert blob URLs by swapping github.com with raw.githubusercontent.com and /blob/ with /, using info-level logging.
  • Detect repository-style GitHub URLs via a constant path-part count and map them to main/pyproject.toml on raw.githubusercontent.com.
  • Add a fallback path for unknown GitHub URL patterns that returns the original URL with an info log.
  • Slightly refactor source path resolution in check and pull to use a single conditional expression.
  • Introduce a constant for expected GitHub repo path-part count and bump internal version to 0.0.3.
Files:
  • ruff_sync.py

Change: Align documentation and helper scripts with repository-style upstream URLs and uv-based execution.
Details:
  • Update README usage examples and configuration guidance to show repo URLs as the primary upstream form and describe expanded GitHub URL support.
  • Change internal agent/testing docs to run commands via uv instead of poetry (pytest, invoke, coverage).
  • Adjust dogfooding scripts to default to repository URLs instead of blob URLs and to invoke ruff_sync.py via uv.
Files:
  • README.md
  • .agents/workflows/add-test-case.md
  • .agents/TESTING.md
  • scripts/dogfood.sh
  • scripts/dogfood_check.sh

Change: Version and tooling metadata updates plus new tests and blame-ignore configuration.
Details:
  • Bump project version in pyproject.toml to 0.0.3 to match __version__.
  • Introduce uv.lock to capture uv environment locking (contents not shown in diff).
  • Add tests that parameterize github_url_to_raw_url behavior across blob, repo, non-GitHub, and already-raw URLs.
  • Add a .git-blame-ignore-revs file to manage ignored commits in git blame output.
Files:
  • pyproject.toml
  • uv.lock
  • tests/test_url_handling.py
  • .git-blame-ignore-revs

Assessment against linked issues

Issue Objective Addressed Explanation
#17 Allow GitHub repository base URLs (e.g. https://github.com/org/repo) to be used as upstreams, resolving them automatically to the corresponding main/pyproject.toml (or equivalent) for ruff-sync operations.
#17 Update README and related documentation to describe and exemplify using base GitHub repository URLs as upstreams instead of blob/main/pyproject.toml URLs.

@codecov

codecov Bot commented Mar 8, 2026

Codecov Report

❌ Patch coverage is 91.34615% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 92.83%. Comparing base (75f9e34) to head (37218fb).
⚠️ Report is 1 commits behind head on main.

Files with missing lines    Patch %    Lines
ruff_sync.py                91.34%     9 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #77      +/-   ##
==========================================
+ Coverage   92.06%   92.83%   +0.77%     
==========================================
  Files           1        1              
  Lines         252      321      +69     
==========================================
+ Hits          232      298      +66     
- Misses         20       23       +3     
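The project coverage percentages in the diff follow directly from the Hits and Lines rows. A small check, using only the numbers reported above (the helper name coverage_pct is just for illustration):

```python
def coverage_pct(hits: int, lines: int) -> float:
    """Coverage as a percentage of covered lines."""
    return 100 * hits / lines


base = coverage_pct(232, 252)  # main before the PR
head = coverage_pct(298, 321)  # PR head

print(f"base {base:.2f}%  head {head:.2f}%  delta {head - base:+.2f}%")
print(f"misses: {252 - 232} -> {321 - 298}")
```

The miss counts (lines minus hits) reproduce the 20 and 23 shown in the Misses row.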

@Kilo59 Kilo59 marked this pull request as ready for review March 8, 2026 19:14

@sourcery-ai sourcery-ai Bot left a comment

Hey - I've found 2 issues, and left some high level feedback:

  • In github_url_to_raw_url, consider using url.host/url.netloc to explicitly match github.com instead of a substring search on url_str, which would avoid false positives like https://notgithub.com/....
  • The new LOGGER.info calls for non-GitHub and unrecognized GitHub URLs may be noisy in normal use; consider downgrading these to debug or gating them behind a verbose flag so routine runs stay quiet.
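The first point can be illustrated with a short sketch of a host-based check. It uses urllib.parse from the standard library rather than httpx (where the equivalent attribute would be url.host), and the names GITHUB_HOSTS and is_github_url are assumptions for this example only.

```python
from urllib.parse import urlsplit

# Assumed allow-list; enterprise hosts would need to be added explicitly.
GITHUB_HOSTS = frozenset({"github.com", "www.github.com"})


def is_github_url(url: str) -> bool:
    """Match on the parsed host instead of a substring of the whole URL."""
    host = urlsplit(url).hostname or ""
    return host in GITHUB_HOSTS


for candidate in (
    "https://github.com/org/repo",
    "https://notgithub.com/org/repo",
    "https://example.com/mirror/github.com/org/repo",
):
    print(candidate, is_github_url(candidate))
```

A substring test would accept all three candidates, while the host comparison accepts only the first.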
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `github_url_to_raw_url`, consider using `url.host`/`url.netloc` to explicitly match `github.com` instead of a substring search on `url_str`, which would avoid false positives like `https://notgithub.com/...`.
- The new `LOGGER.info` calls for non-GitHub and unrecognized GitHub URLs may be noisy in normal use; consider downgrading these to `debug` or gating them behind a verbose flag so routine runs stay quiet.

## Individual Comments

### Comment 1
<location path="ruff_sync.py" line_range="194" />
<code_context>
+    LOGGER.debug(f"Initial URL: {url}")
     url_str = str(url)
-    if "github.com" in url_str and "/blob/" in url_str:
+    if "github.com" not in url_str:
+        LOGGER.info("URL is not a GitHub URL, returning as is.")
+        return url
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Checking for a GitHub URL by substring on the full string is brittle; using the URL host would be more robust.

`"github.com" in url_str` can both miss valid GitHub URLs (e.g., `www.github.com`, enterprise hosts) and wrongly match non-GitHub URLs where `github.com` appears only in the path or query. Since `url` is an `httpx.URL`, prefer checking `url.host` against an explicit set of allowed hosts (e.g., `{"github.com", "www.github.com"}` or enterprise domains) for a more accurate and extensible check.
</issue_to_address>

### Comment 2
<location path="tests/test_url_handling.py" line_range="21-13" />
<code_context>
+            "https://github.com/org/repo/blob/develop/config/ruff.toml",
+            "https://raw.githubusercontent.com/org/repo/develop/config/ruff.toml",
+        ),
+        # Repo URLs
+        (
+            "https://github.com/pydantic/pydantic",
+            "https://raw.githubusercontent.com/pydantic/pydantic/main/pyproject.toml",
</code_context>
<issue_to_address>
**suggestion (testing):** Add a test case for GitHub URLs that are neither blob nor plain repo (e.g. `/tree/`), to assert they are passed through unchanged.

To cover the branch that logs and returns non-blob GitHub URLs unchanged, please add a parametrized case like:

```python
(
    "https://github.com/org/repo/tree/main/subdir/pyproject.toml",
    "https://github.com/org/repo/tree/main/subdir/pyproject.toml",
),
```

This exercises the "unknown GitHub pattern" path and protects against future rewrites of such URLs.
</issue_to_address>

@Kilo59 Kilo59 self-assigned this Mar 8, 2026
@github-actions
Contributor

github-actions Bot commented Mar 8, 2026

╒════════════════════════════╤════════════════════════╤═════════════════╤═══════════════════════╤═══════════════╕
│ File                       │ Maintainability Index  │ Unique Operands │ Cyclomatic Complexity │ Lines of Code │
╞════════════════════════════╪════════════════════════╪═════════════════╪═══════════════════════╪═══════════════╡
│ ruff_sync.py               │ 38.2312 -> 35.4133     │ 12 -> 12        │ 77 -> 98              │ 530 -> 660    │
├────────────────────────────┼────────────────────────┼─────────────────┼───────────────────────┼───────────────┤
│ tests/test_url_handling.py │ - -> 84.61278146394264 │ - -> 1          │ - -> 2                │ - -> 113      │
╘════════════════════════════╧════════════════════════╧═════════════════╧═══════════════════════╧═══════════════╛

@Kilo59 Kilo59 changed the title from "feat: Use a base repo url as an upstream" to "feat: core support for base repository URLs with configurable branch and path" Mar 8, 2026
@Kilo59 Kilo59 merged commit 7a6395c into main Mar 8, 2026
12 of 13 checks passed
@Kilo59 Kilo59 deleted the 17-let-me-declare-a-base-repo-url-as-an-upstream branch March 8, 2026 20:14

Development

Successfully merging this pull request may close these issues.

Let me declare a base repo url as an upstream
