Merged
27 changes: 26 additions & 1 deletion .github/workflows/ci.yaml
@@ -64,10 +64,35 @@ jobs:
token: ${{ secrets.CODECOV_TOKEN }}
slug: Kilo59/ruff-sync

pre-publish:
name: Test package installation
if: github.event_name == 'push' && github.ref == 'refs/heads/main'
needs: [static-analysis, tests]
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v4

- name: Install uv
uses: astral-sh/setup-uv@v5

- name: Set up Python
run: uv python install 3.10

- name: Build package
run: uv build

- name: Install and test packaged program
run: |
uv tool install $(ls dist/*.whl)
ruff-sync --version
ruff-sync https://github.com/Kilo59/ruff-sync
ruff-sync check https://github.com/Kilo59/ruff-sync

publish:
name: Build and publish to PyPI
if: github.event_name == 'push' && github.ref == 'refs/heads/main'
needs: [static-analysis, tests]
needs: [pre-publish]
runs-on: ubuntu-latest
permissions:
# This permission is required for trusted publishing
23 changes: 13 additions & 10 deletions README.md
@@ -111,21 +111,20 @@ uv tool install git+https://github.com/Kilo59/ruff-sync
### Usage

```console
# Sync from a GitHub repository (defaults to main/pyproject.toml)
# Sync from a GitHub/GitLab repository (defaults to main/pyproject.toml)
ruff-sync https://github.com/my-org/standards

# Or a direct blob/file URL (auto-converts to raw)
ruff-sync https://github.com/my-org/standards/blob/main/pyproject.toml

# GitLab support (including nested projects)
ruff-sync https://gitlab.com/my-org/my-group/nested/standards
# Clone from any git repository (using SSH or HTTP, defaults to --depth 1)
# You can use the --branch flag to specify a branch (default: main)
ruff-sync git@github.com:my-org/standards.git
ruff-sync ssh://git@gitlab.com/my-org/standards.git

# Once configured in pyproject.toml (see Configuration), simply run:
# Or if configured in pyproject.toml (see Configuration), simply run:
ruff-sync

# Sync into a specific project directory
ruff-sync --source ./my-project

# Exclude specific sections from being overwritten using dotted paths
ruff-sync --exclude lint.per-file-ignores lint.ignore

@@ -142,8 +141,9 @@ Run `ruff-sync --help` for full details on all available options.

- **Format-preserving merges** — Uses [tomlkit](https://github.com/sdispater/tomlkit) under the hood, so your comments, whitespace, and TOML structure are preserved. No reformatting surprises.
- **GitHub & GitLab URL support** — Automatically converts GitHub/GitLab repository URLs or blob URLs to raw content URLs.
- **Git clone support** — If the URL starts with `git@` or uses the `ssh://`, `git://`, or `git+ssh://` schemes, `ruff-sync` will perform an efficient shallow clone (using `--filter=blob:none` and `--no-checkout`) to safely extract the configuration with minimal network traffic.
- **Selective exclusions** — Keep project-specific overrides (like `per-file-ignores` or `target-version`) from being clobbered by the upstream config.
- **Works with any host** — GitHub, GitLab, Bitbucket, or any raw URL that serves a `pyproject.toml`.
- **Works with any host** — GitHub, GitLab, Bitbucket, private SSH servers, or any raw URL that serves a `pyproject.toml`.
- **CI-ready `check` command** — Verify that your local config is in sync without modifying anything. Exits 1 if out of sync, making it perfect for pre-merge gates. ([See detailed logic](#detailed-check-logic))
- **Semantic mode** — Use `--semantic` to ignore cosmetic differences (comments, whitespace) and only fail on real value changes.

@@ -294,13 +294,16 @@ To see `ruff-sync` in action, you can "dogfood" it on this project's own config.
**Check if this project is in sync with its upstream:**

```console
./scripts/dogfood_check.sh
./scripts/check_dogfood.sh
```

**Or sync from a large upstream like Pydantic's config:**

```console
./scripts/dogfood.sh
# Using an HTTP URL
./scripts/pull_dogfood.sh
# Using a Git URL
./scripts/gitclone_dogfood.sh
```

This will download Pydantic's Ruff configuration and merge it into the local `pyproject.toml`. You can then use `git diff` to see how it merged the keys while preserving the existing structure and comments.
135 changes: 126 additions & 9 deletions ruff_sync.py
@@ -6,7 +6,9 @@
import logging
import pathlib
import re
import subprocess
import sys
import tempfile
from argparse import ArgumentParser, RawDescriptionHelpFormatter
from collections.abc import Iterable, Mapping
from functools import lru_cache
@@ -121,6 +123,8 @@ def _get_cli_parser() -> ArgumentParser:
"Examples:\n"
" ruff-sync pull https://github.com/org/repo/blob/main/pyproject.toml\n"
" ruff-sync check https://github.com/org/repo/blob/main/pyproject.toml\n"
" ruff-sync pull git@github.com:org/repo.git\n"
" ruff-sync pull ssh://git@gitlab.com/org/repo.git\n"
" ruff-sync check --semantic # ignore formatting-only differences\n\n"
"The upstream URL can also be set in [tool.ruff-sync] in pyproject.toml\n"
"so you can simply run: ruff-sync pull"
@@ -278,22 +282,30 @@ def _convert_gitlab_url(url: URL, branch: str = "main", path: str = "") -> URL:
return url


def resolve_raw_url(url: URL, branch: str = "main", path: str = "") -> URL:
"""Resolve a GitHub or GitLab URL to its raw content URL.
def is_git_url(url: URL) -> bool:
"""Return True if the URL should be treated as a git repository."""
return str(url).startswith("git@") or url.scheme in ("ssh", "git", "git+ssh")


def resolve_raw_url(url: URL, branch: str = "main", path: str | None = None) -> URL:
"""Convert a GitHub or GitLab repository/blob URL to a raw content URL.

Args:
url (URL): The URL to resolve.
branch (str): The default branch to use for repo URLs.
path (str): The directory prefix for pyproject.toml.
path (str | None): The directory prefix for pyproject.toml.

Returns:
URL: The resolved raw content URL, or the original URL if no conversion applies.
"""
# If it's a git URL, leave it alone; we'll handle it via git clone
if is_git_url(url):
return url
LOGGER.debug(f"Initial URL: {url}")
if url.host in _GITHUB_HOSTS:
return _convert_github_url(url, branch=branch, path=path)
return _convert_github_url(url, branch=branch, path=path or "")
if url.host in _GITLAB_HOSTS:
return _convert_gitlab_url(url, branch=branch, path=path)
return _convert_gitlab_url(url, branch=branch, path=path or "")
return url
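
The `is_git_url` check above can be tried in isolation. The sketch below is an approximation that uses only the standard library: it takes plain strings instead of the project's `URL` type, and the name `looks_like_git_url` is hypothetical.

```python
from urllib.parse import urlsplit

# Schemes that the diff routes to `git clone` instead of an HTTP download.
GIT_SCHEMES = ("ssh", "git", "git+ssh")


def looks_like_git_url(url: str) -> bool:
    """Approximate is_git_url for plain strings."""
    # scp-style remotes (git@host:org/repo.git) carry no scheme,
    # so they are matched by prefix rather than by parsing.
    return url.startswith("git@") or urlsplit(url).scheme in GIT_SCHEMES


for candidate in (
    "git@github.com:org/repo.git",
    "ssh://git@gitlab.com/org/repo.git",
    "git+ssh://git@example.com/org/repo.git",
    "https://github.com/org/repo",
):
    print(candidate, looks_like_git_url(candidate))
```

Note that `https://` URLs ending in `.git` are not treated as git URLs by this check, so they still go through the HTTP path.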


@@ -304,6 +316,107 @@ async def download(url: URL, client: httpx.AsyncClient) -> StringIO:
return StringIO(response.text)


async def fetch_upstream_config(
    url: URL, client: httpx.AsyncClient, branch: str, path: str | None
) -> StringIO:
    """Fetch the upstream pyproject.toml either via HTTP or git clone."""
    if is_git_url(url):
        LOGGER.info(f"Cloning {url} via git...")

        def _git_clone_and_read() -> str:
            """Clone the git repo into a temp directory and read pyproject.toml.

            Uses an efficient cloning strategy to minimize network traffic and disk space:
            - `--depth 1`: only fetches the tip of the requested branch
            - `--filter=blob:none`: avoids downloading any file contents (blobs) during the clone
            - `--no-checkout`: prevents git from populating the working tree

            After the metadata is cloned, we try `git restore` to explicitly download and place
            only the requested `pyproject.toml` file into the working tree. If `restore` fails
            (e.g. on older git versions), we fall back to a specific `git checkout`.
            """
            with tempfile.TemporaryDirectory() as temp_dir:
                # Use --no-checkout and --filter=blob:none to avoid downloading unnecessary files
                cmd = [
                    "git",
                    "clone",
                    "--depth",
                    "1",
                    "--filter=blob:none",
                    "--no-checkout",
                    "--branch",
                    branch,
                    str(url),
                    temp_dir,
                ]
                LOGGER.info(f"Running git command: {' '.join(cmd)}")
                try:
                    subprocess.run(  # noqa: S603
                        cmd,
                        check=True,
                        capture_output=True,
                        text=True,
                    )
                    target_path = (
                        pathlib.Path(path.strip("/")) / "pyproject.toml"
                        if path
                        else pathlib.Path("pyproject.toml")
                    )

                    # Restore just the pyproject.toml file
                    restore_cmd = [
                        "git",
                        "-C",
                        temp_dir,
                        "restore",
                        "--source",
                        branch,
                        str(target_path),
                    ]
                    LOGGER.info(f"Running git restore: {' '.join(restore_cmd)}")

                    try:
                        subprocess.run(  # noqa: S603
                            restore_cmd,
                            check=True,
                            capture_output=True,
                            text=True,
                        )
                    except subprocess.CalledProcessError:
                        LOGGER.info("git restore failed, falling back to git checkout")
                        checkout_cmd = [
                            "git",
                            "-C",
                            temp_dir,
                            "checkout",
                            branch,
                            "--",
                            str(target_path),
                        ]
                        LOGGER.info(f"Running git checkout: {' '.join(checkout_cmd)}")
                        subprocess.run(  # noqa: S603
                            checkout_cmd,
                            check=True,
                            capture_output=True,
                            text=True,
                        )
                except subprocess.CalledProcessError as e:
                    LOGGER.exception(f"Git operation failed: {e.stderr}")
                    raise

                full_target_path = pathlib.Path(temp_dir) / target_path
                if not full_target_path.exists():
                    raise FileNotFoundError(
                        f"Configuration file not found in repository at {target_path}"
                    )
                return full_target_path.read_text()

        content = await asyncio.to_thread(_git_clone_and_read)
        return StringIO(content)

    return await download(url, client)
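
The command construction and path handling in `fetch_upstream_config` can be checked without touching the network. The sketch below is not part of the PR: `build_clone_cmd` and `config_path` are hypothetical helper names, and `PurePosixPath` is used only so the output is the same on every platform.

```python
from __future__ import annotations

from pathlib import PurePosixPath


def build_clone_cmd(url: str, branch: str, dest: str) -> list[str]:
    """Build the shallow, blob-less, no-checkout clone command used in the diff."""
    return [
        "git",
        "clone",
        "--depth",
        "1",
        "--filter=blob:none",
        "--no-checkout",
        "--branch",
        branch,
        url,
        dest,
    ]


def config_path(path: str | None) -> PurePosixPath:
    """Resolve the in-repo location of pyproject.toml from an optional prefix."""
    if path:
        # Leading and trailing slashes are stripped, as in the diff.
        return PurePosixPath(path.strip("/")) / "pyproject.toml"
    return PurePosixPath("pyproject.toml")


# Hypothetical inputs for illustration only.
print(" ".join(build_clone_cmd("git@github.com:org/repo.git", "main", "/tmp/clone")))
print(config_path("/configs/"))
print(config_path(None))
```

Keeping these two steps as pure functions would make them unit-testable separately from the `subprocess` calls.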


@overload
def get_ruff_tool_table(
toml: str | TOMLDocument,
@@ -460,8 +573,10 @@ async def check(
source_doc = source_toml_file.read()

async with httpx.AsyncClient() as client:
file_buffer = await download(args.upstream, client)
LOGGER.info(f"Downloaded upstream file from {args.upstream}")
file_buffer = await fetch_upstream_config(
args.upstream, client, branch=args.branch, path=args.path
)
LOGGER.info(f"Loaded upstream file from {args.upstream}")

upstream_ruff_toml = get_ruff_tool_table(
file_buffer.read(), create_if_missing=False, exclude=args.exclude
@@ -525,8 +640,10 @@ async def pull(

# NOTE: there's no particular reason to use async here.
async with httpx.AsyncClient() as client:
file_buffer = await download(args.upstream, client)
LOGGER.info(f"Downloaded upstream file from {args.upstream}")
file_buffer = await fetch_upstream_config(
args.upstream, client, branch=args.branch, path=args.path
)
LOGGER.info(f"Loaded upstream file from {args.upstream}")

upstream_ruff_toml = get_ruff_tool_table(
file_buffer.read(), create_if_missing=False, exclude=args.exclude
File renamed without changes.
46 changes: 46 additions & 0 deletions scripts/gitclone_dogfood.sh
@@ -0,0 +1,46 @@
#!/usr/bin/env bash
set -euo pipefail

# Dogfooding script for ruff-sync using a git URL
#
# This script "dogfoods" ruff-sync by syncing this project's own pyproject.toml
# with a complex upstream configuration (defaults to Pydantic) using a git URL.
#
# Usage:
# ./scripts/gitclone_dogfood.sh [upstream_url]
#
# Default upstream:
# git@github.com:pydantic/pydantic.git

DEFAULT_UPSTREAM="git@github.com:pydantic/pydantic.git"
UPSTREAM=${1:-$DEFAULT_UPSTREAM}

echo "🐶 Dogfooding ruff-sync via git clone..."
echo "🔗 Upstream: $UPSTREAM"
echo "📂 Target: ./pyproject.toml"
echo ""

# Ensure we are in the project root
cd "$(dirname "$0")/.."

# Check if we have uncommitted changes in pyproject.toml
if ! git diff --quiet pyproject.toml; then
    echo "⚠️ Warning: You have uncommitted changes in pyproject.toml."
    read -p "Continue anyway? (y/N) " -n 1 -r
    echo ""
    if [[ ! $REPLY =~ ^[Yy]$ ]]; then
        echo "Aborting."
        exit 1
    fi
fi

# Run the tool via uv
uv run python ruff_sync.py "$UPSTREAM" -v

echo ""
echo "✨ Dogfooding run complete!"
echo "--------------------------------------------------"
echo "Next steps:"
echo "1. Inspect the changes: git diff pyproject.toml"
echo "2. Discard when done: git checkout pyproject.toml"
echo "--------------------------------------------------"
File renamed without changes.