Reviewer's Guide
Adds support for using GitHub repository URLs (not just blob URLs) as Ruff config upstreams by enhancing GitHub URL → raw URL conversion, updates tooling/docs/scripts to prefer repo-style URLs and uv-based commands, bumps version to 0.0.3, and introduces targeted unit tests and blame-ignore config.
Sequence diagram for converting upstream GitHub URL to raw URL
sequenceDiagram
actor User
participant CLI as ruff_sync_CLI
participant URLFunc as github_url_to_raw_url
participant GitHub as github_com
participant RawGH as raw_githubusercontent_com
User->>CLI: run_ruff_sync_with_upstream_URL
CLI->>URLFunc: github_url_to_raw_url(upstream_URL)
alt Non_GitHub_URL
URLFunc-->>CLI: return_original_URL
CLI->>GitHub: GET_original_URL
GitHub-->>CLI: pyproject_toml_content
else GitHub_blob_URL
URLFunc->>URLFunc: replace_github_com_with_raw_domain
URLFunc->>URLFunc: replace_blob_segment_with_path
URLFunc-->>CLI: return_raw_URL
CLI->>RawGH: GET_raw_URL
RawGH-->>CLI: pyproject_toml_content
else GitHub_repo_URL
URLFunc->>URLFunc: parse_path_components
URLFunc->>URLFunc: build_raw_URL_org_repo_main_pyproject
URLFunc-->>CLI: return_raw_URL
CLI->>RawGH: GET_raw_URL
RawGH-->>CLI: pyproject_toml_content
else Other_GitHub_pattern
URLFunc-->>CLI: return_original_URL
CLI->>GitHub: GET_original_URL
GitHub-->>CLI: pyproject_toml_content
end
CLI->>CLI: apply_sync_or_check_logic
CLI-->>User: report_result
Flow diagram for github_url_to_raw_url decision logic
flowchart TD
A[start_github_url_to_raw_url] --> B[log_initial_URL]
B --> C{URL_contains_github_com}
C -- No --> D[log_non_GitHub_return_as_is]
D --> Z[end_return_original_URL]
C -- Yes --> E{URL_path_contains_blob_segment}
E -- Yes --> F[replace_github_com_with_raw_githubusercontent_com]
F --> G[replace_blob_segment_with_slash_path]
G --> H[log_blob_conversion]
H --> I[return_raw_URL]
I --> Z
E -- No --> J[split_URL_path_into_non_empty_parts]
J --> K{path_parts_length_equals_repo_parts_count}
K -- Yes --> L[extract_org_and_repo]
L --> M[build_raw_URL_org_repo_main_pyproject]
M --> N[log_repo_conversion]
N --> O[return_raw_URL]
O --> Z
K -- No --> P[log_unknown_GitHub_pattern_return_as_is]
P --> Z
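The decision logic in the flow diagram can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: it takes a plain string and uses the standard library's `urllib.parse` instead of `httpx.URL`, and the constant `REPO_PARTS_COUNT`, the log messages, and the exact replacement steps are assumptions inferred from the diagram's node names.

```python
import logging
from urllib.parse import urlsplit

LOGGER = logging.getLogger(__name__)

# Assumption: a plain repo URL has exactly two path parts, /<org>/<repo>.
REPO_PARTS_COUNT = 2


def github_url_to_raw_url(url: str) -> str:
    """Follow the flow diagram: blob and repo URLs become raw URLs."""
    LOGGER.debug("Initial URL: %s", url)

    # Substring check, as in the diagram (see the review comments for caveats).
    if "github.com" not in url:
        LOGGER.info("URL is not a GitHub URL, returning as is.")
        return url

    if "/blob/" in url:
        # Swap the domain, then drop the /blob/ segment.
        raw = url.replace("github.com", "raw.githubusercontent.com", 1)
        raw = raw.replace("/blob/", "/", 1)
        LOGGER.info("Converted blob URL to %s", raw)
        return raw

    # Split the path into its non-empty parts, e.g. ["org", "repo"].
    parts = [part for part in urlsplit(url).path.split("/") if part]
    if len(parts) == REPO_PARTS_COUNT:
        org, repo = parts
        raw = f"https://raw.githubusercontent.com/{org}/{repo}/main/pyproject.toml"
        LOGGER.info("Converted repo URL to %s", raw)
        return raw

    LOGGER.info("Unknown GitHub URL pattern, returning as is.")
    return url
```

Under these assumptions, `/tree/` URLs and other unrecognized GitHub paths fall through to the last branch and are returned unchanged.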
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #77      +/-   ##
==========================================
+ Coverage   92.06%   92.83%   +0.77%
==========================================
  Files           1        1
  Lines         252      321      +69
==========================================
+ Hits          232      298      +66
- Misses         20       23       +3
Hey - I've found 2 issues, and left some high-level feedback:
- In `github_url_to_raw_url`, consider using `url.host`/`url.netloc` to explicitly match `github.com` instead of a substring search on `url_str`, which would avoid false positives like `https://notgithub.com/...`.
- The new `LOGGER.info` calls for non-GitHub and unrecognized GitHub URLs may be noisy in normal use; consider downgrading these to `debug` or gating them behind a verbose flag so routine runs stay quiet.
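As a rough sketch of both suggestions, the host can be compared against an explicit allow-list and the routine message logged at `debug` level. The helper name `is_github_url` and the constant `GITHUB_HOSTS` are hypothetical, and the example uses the standard library's `urllib.parse` on a plain string rather than the `httpx.URL` object the PR works with.

```python
import logging
from urllib.parse import urlsplit

LOGGER = logging.getLogger(__name__)

# Hypothetical allow-list; enterprise hosts could be added here.
GITHUB_HOSTS = frozenset({"github.com", "www.github.com"})


def is_github_url(url: str) -> bool:
    """Return True only when the URL's host is an allowed GitHub host."""
    # urlsplit lower-cases the hostname and yields None when there is none.
    host = urlsplit(url).hostname
    if host not in GITHUB_HOSTS:
        # debug rather than info, so routine runs stay quiet.
        LOGGER.debug("Not a GitHub URL, returning as is: %s", url)
        return False
    return True
```

With this check, `https://notgithub.com/...` and URLs that mention `github.com` only in the path or query are no longer treated as GitHub URLs.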
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `github_url_to_raw_url`, consider using `url.host`/`url.netloc` to explicitly match `github.com` instead of a substring search on `url_str`, which would avoid false positives like `https://notgithub.com/...`.
- The new `LOGGER.info` calls for non-GitHub and unrecognized GitHub URLs may be noisy in normal use; consider downgrading these to `debug` or gating them behind a verbose flag so routine runs stay quiet.
## Individual Comments
### Comment 1
<location path="ruff_sync.py" line_range="194" />
<code_context>
+ LOGGER.debug(f"Initial URL: {url}")
url_str = str(url)
- if "github.com" in url_str and "/blob/" in url_str:
+ if "github.com" not in url_str:
+ LOGGER.info("URL is not a GitHub URL, returning as is.")
+ return url
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Checking for a GitHub URL by substring on the full string is brittle; using the URL host would be more robust.
`"github.com" in url_str` can both miss valid GitHub URLs (e.g., `www.github.com`, enterprise hosts) and wrongly match non-GitHub URLs where `github.com` appears only in the path or query. Since `url` is an `httpx.URL`, prefer checking `url.host` against an explicit set of allowed hosts (e.g., `{"github.com", "www.github.com"}` or enterprise domains) for a more accurate and extensible check.
</issue_to_address>
### Comment 2
<location path="tests/test_url_handling.py" line_range="21-13" />
<code_context>
+ "https://github.com/org/repo/blob/develop/config/ruff.toml",
+ "https://raw.githubusercontent.com/org/repo/develop/config/ruff.toml",
+ ),
+ # Repo URLs
+ (
+ "https://github.com/pydantic/pydantic",
+ "https://raw.githubusercontent.com/pydantic/pydantic/main/pyproject.toml",
</code_context>
<issue_to_address>
**suggestion (testing):** Add a test case for GitHub URLs that are neither blob nor plain repo (e.g. `/tree/`), to assert they are passed through unchanged.
To cover the branch that logs and returns non-blob GitHub URLs unchanged, please add a parametrized case like:
```python
(
"https://github.com/org/repo/tree/main/subdir/pyproject.toml",
"https://github.com/org/repo/tree/main/subdir/pyproject.toml",
),
```
This exercises the "unknown GitHub pattern" path and protects against future rewrites of such URLs.
</issue_to_address>
    "input_url, expected_url",
    [
        # Blob URLs
        (
suggestion (testing): Add a test case for GitHub URLs that are neither blob nor plain repo (e.g. `/tree/`), to assert they are passed through unchanged.
To cover the branch that logs and returns non-blob GitHub URLs unchanged, please add a parametrized case like:
    (
        "https://github.com/org/repo/tree/main/subdir/pyproject.toml",
        "https://github.com/org/repo/tree/main/subdir/pyproject.toml",
    ),
This exercises the "unknown GitHub pattern" path and protects against future rewrites of such URLs.
This PR implements full support for using base repository URLs (GitHub/GitLab) as upstream ruff-sync sources, including advanced configuration for custom branches and directory prefixes.
New Features
- Repository URLs (e.g. `https://github.com/org/repo`) are now automatically resolved to their raw content counterparts.
- `--branch` and `--path` CLI options (and corresponding `[tool.ruff-sync]` configuration keys) override the default `main` branch and root `pyproject.toml` path.
- The `path` configuration now allows specifying a directory prefix (parent path) where `pyproject.toml` is located.
Enhancements
- Refactored `ruff_sync.py` by splitting complex logic into specialized helper functions, improving readability and maintainability.
- `Config` `TypedDict` for robust handling of settings loaded from `pyproject.toml`.
- `.git-blame-ignore-revs` file to maintain a clean git blame history.
- Tooling and scripts use `uv` instead of `poetry`.
Documentation
- Updated `README.md` with an "Advanced Configuration" section and refined examples to reflect current best practices.
Testing
- `tests/test_url_handling.py` covering various URL patterns and configuration overrides.
Chores
- Version bumped to `0.0.3.dev1`.
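To illustrate how the branch and path overrides described above might combine into a raw URL, here is a minimal sketch. The function `build_raw_url` and its signature are hypothetical and not taken from the PR; it covers only the GitHub raw-content URL scheme and treats `path` as a directory prefix, as the description states.

```python
def build_raw_url(org: str, repo: str, branch: str = "main", path: str = "") -> str:
    """Build a raw.githubusercontent.com URL for a repo's pyproject.toml.

    `path` is a directory prefix (parent path), not the file itself.
    """
    # Normalize the prefix so "config", "config/" and "/config" behave alike.
    prefix = path.strip("/")
    file_path = f"{prefix}/pyproject.toml" if prefix else "pyproject.toml"
    return f"https://raw.githubusercontent.com/{org}/{repo}/{branch}/{file_path}"


# Defaults: main branch, pyproject.toml at the repository root.
default_url = build_raw_url("org", "repo")

# Overrides: a custom branch and a directory prefix.
custom_url = build_raw_url("org", "repo", branch="develop", path="config/")
```

Keeping the defaults in the signature means a bare repository URL still resolves to `main` and the root `pyproject.toml` when neither option is given.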