Skip to content

Fix .gitattributes writer: quote paths with spaces, dedupe per-batch syncs, deterministic ordering#330

Merged
jamesadevine merged 3 commits intomainfrom
copilot/add-gitattributes-for-compiled-pipelines
Apr 28, 2026
Merged

Fix .gitattributes writer: quote paths with spaces, dedupe per-batch syncs, deterministic ordering#330
jamesadevine merged 3 commits intomainfrom
copilot/add-gitattributes-for-compiled-pipelines

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 28, 2026

Follow-up to the .lock.yml / .gitattributes work. The original goals — preserving pre-existing user-authored entries and emitting managed entries in a stable order so re-compiles don't churn the file — were already satisfied by the initial implementation. A subsequent /rust-review pass surfaced two real bugs and two doc/test improvements, all addressed here.

Already-satisfied properties

  • Preservationstrip_managed_block only removes lines between # BEGIN ado-aw managed and # END ado-aw managed; everything else is copied verbatim. Covered by preserves_user_managed_lines.
  • Determinism — entries are collected into a BTreeSet<String> before rendering, so output order is independent of detect_pipelines traversal order. Covered by entries_are_sorted_and_deduplicated, and the writer additionally short-circuits the disk write when the rendered content matches the existing file (idempotent_when_unchanged).

Changes from review feedback

  • Quote paths containing whitespace / " / #normalize_path previously emitted my agents/pipeline.lock.yml linguist-generated=true merge=ours, which git would parse as pattern my with three attributes. It now wraps such entries in double quotes (with embedded " escaped) so git treats them as a single pattern. New regression test quotes_paths_containing_spaces.
  • Avoid N+1 detect_pipelines calls during batch compilecompile_pipeline is now a thin wrapper over a private compile_pipeline_inner(... sync_gitattributes: bool). compile_all_pipelines passes false and performs a single .gitattributes sync at the end of the batch, reusing the already-detected pipeline list rather than re-scanning the tree once per pipeline.
  • Document managed-block placement — added a doc comment on render calling out that the block is always written at EOF; user lines outside it are preserved verbatim, but in-place block position is not.
  • Drop flaky mtime assertionidempotent_when_unchanged now asserts only content equality (mtime granularity is filesystem-dependent, e.g. 1s on macOS HFS+).

Smoke test

Pre-populated .gitattributes with three pipelines compiled in non-alphabetical order:

*.png binary
docs/* linguist-documentation
# BEGIN ado-aw managed (do not edit)
a/agent.lock.yml linguist-generated=true merge=ours
b/agent.lock.yml linguist-generated=true merge=ours
c/agent.lock.yml linguist-generated=true merge=ours
# END ado-aw managed

A subsequent ado-aw compile (auto-discover) produces a byte-identical file. A path containing a space (my agents/pipeline.md) compiles to a quoted entry:

"my agents/pipeline.lock.yml" linguist-generated=true merge=ours

cargo build, cargo test, and cargo clippy --all-targets --all-features are clean (no new warnings on changed files).

Copilot AI and others added 2 commits April 28, 2026 09:48
@jamesadevine
Copy link
Copy Markdown
Collaborator

/rust-review

1 similar comment
@jamesadevine
Copy link
Copy Markdown
Collaborator

/rust-review

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 28, 2026

Rust PR Reviewer completed successfully!

@github-actions
Copy link
Copy Markdown
Contributor

🔍 Rust PR Review

Summary: Looks good overall — clean implementation with good test coverage. Two issues worth addressing before merge.

Findings

🐛 Bugs / Logic Issues

  • src/compile/gitattributes.rs:62-65 — Paths with spaces produce malformed .gitattributes

    normalize_path does not quote paths that contain spaces. The .gitattributes format requires double-quoting any pattern that contains whitespace or special characters:

    "my agents/pipeline.lock.yml" linguist-generated=true merge=ours
    

    Currently the output would be my agents/pipeline.lock.yml linguist-generated=true merge=ours, which git would parse as pattern my with attributes agents/pipeline.lock.yml, linguist-generated=true, and merge=ours. Fix: quote any entry whose path contains a space (or " or #):

    fn normalize_path(p: &Path) -> String {
        let s = p.to_string_lossy().replace('\\', "/");
        let s = s.trim_start_matches("./");
        if s.contains(' ') || s.contains('"') || s.contains('#') {
            format!("\"{}\"", s.replace('"', "\\\""))
        } else {
            s.to_string()
        }
    }
  • src/compile/mod.rs:151detect_pipelines called N+1 times during compile_all_pipelines

    compile_all_pipelines already calls detect_pipelines once, then calls compile_pipeline for each result. Each compile_pipeline call triggers sync_gitattributes_for_output, which calls detect_pipelines again — so compiling N pipelines at once does N+1 directory scans. The intermediate writes are also unnecessary since only the final state matters.

    Consider either (a) passing a flag to suppress the per-pipeline gitattributes sync and doing one final sync after all pipelines are compiled, or (b) having compile_all_pipelines call gitattributes::update_gitattributes directly once at the end.

⚠️ Suggestions

  • src/compile/gitattributes.rs:125-130 — Managed block silently moves to end of file on first rewrite

    render always appends the managed block at the end of preserved. If a user placed the block in the middle of their .gitattributes (e.g., between user entries), the first recompile will move it to the end of the file. This isn't a correctness issue but it could surprise users. Worth calling out in a doc comment.

  • src/compile/gitattributes.rs:222-236idempotent_when_unchanged test uses a 20ms sleep for mtime comparison

    On FAT32 / older HFS+ filesystems (2-second mtime granularity) this test would be flaky. Linux is fine (nanosecond precision), but worth noting for anyone running on macOS HFS+ (modified() there is 1-second granularity). Consider checking mtime only on Linux via #[cfg(target_os = "linux")] or just asserting content equality without the mtime check.

✅ What Looks Good

  • BTreeSet for deterministic output ordering is the right choice — no surprises across runs or platforms.
  • The strip_managed_block truncation behavior for malformed files (BEGIN without END → strip to EOF) is sensible and documented.
  • Best-effort .gitattributes update with debug! fallback is the right approach — compilation shouldn't fail just because the working directory isn't a git repo.
  • Test coverage is thorough: preserves user lines, replaces stale blocks, removes block when pipelines list is empty, deduplicates, idempotency check. All the key invariants are exercised.

Generated by Rust PR Reviewer for issue #330 · ● 1.1M ·

@jamesadevine
Copy link
Copy Markdown
Collaborator

@copilot Address feedback and suggestions

Copy link
Copy Markdown
Contributor Author

Copilot AI commented Apr 28, 2026

@copilot Address feedback and suggestions

Addressed all four items from the rust-review in bda2350:

  • Paths with spacesnormalize_path now wraps any entry containing space/"/# in double quotes (with embedded " escaped). Covered by new test quotes_paths_containing_spaces.
  • N+1 detect_pipelines — split compile_pipeline_inner with a sync_gitattributes flag; compile_all_pipelines skips the per-pipeline sync and does one sync at the end using the already-detected list (no extra scans).
  • Managed block placement — added a doc comment on render calling out that the block always lands at EOF.
  • Flaky mtime assertionidempotent_when_unchanged now asserts only content equality.

cargo build, cargo test, and cargo clippy --all-targets --all-features are clean.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • dev.azure.com
    • Triggering command: /home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/ado_aw-9a1c412f8eace73c /home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/ado_aw-9a1c412f8eace73c /home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/init_tests-130137f0f8307a3a.2w/home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/ado_aw-9a1c412f8eace73c.0nol22zdg4eqz5rhom7za9nr8.02siqsu.rcgu.o /home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/init_tests-130137f0f8307a3a.31/home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/ado_aw-9a1c412f8eace73c.0pbh8z8uy8d1zhrqfpbb6rnhw.02siqsu.rcgu.o /home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/init_tests-130137f0f8307a3a.3b/home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/ado_aw-9a1c412f8eace73c.0ri11hlazzp6hqohp3gnmay41.02siqsu.rcgu.o /home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/init_tests-130137f0f8307a3a.3c/home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/ado_aw-9a1c412f8eace73c.0z6aoshunqe15kvf3tp3ctfvb.02siqsu.rcgu.o /home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/init_tests-130137f0f8307a3a.3f/home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/ado_aw-9a1c412f8eace73c.102y4n3t2bdsdx7qb6lwp9slu.02siqsu.rcgu.o 8.0djpn0j2od05flismna24xap2.1ftfufk.rcgu.o 8.0ellgqglwg582barz68fbig19.1ftfufk.rcgu.o 8.0jtexl9ue6uxxfy38p3o3tzal.1ftfufk.rcgu.o 8.0k7xenpu4k5g0j3z3nrimcs1u.1ftfufk.rcgu.o 8.0rzhlhiy5ioyhjwnh4jujsvud.1ftfufk.rcgu.o 8.0z88dmg27t9eu27686btnhjkb.1ftfufk.rcgu.o 8.12724a9qslsz175bs6929j7c7.1ftfufk.rcgu.o 8.13uhokv3qrij4556awk5b8io0.1ftfufk.rcgu.o 8.1cfsjbb5n00348k3jybh2f0cq.1ftfufk.rcgu.o 8.1gqdb1jetmpxof16v6xxc9f00.1ftfufk.rcgu.o 8.1kaqz472u2x7u4wrzxydofou9.1ftfufk.rcgu.o 8.1l6p24bqam7vjd2lhd8qwsx9u.1ftfufk.rcgu.o 8.1mc3s5m7i3go2c4ozf1hqb4q6.1ftfufk.rcgu.o 8.1ofye6g3w5oqac01kzpct9a0k.1ftfufk.rcgu.o (dns block)
  • https://api.github.com/graphql
    • Triggering command: /usr/bin/gh gh pr view --json number,reviews,comments (http block)

If you need me to access, download, or install something from one of these locations, you can either:

Copilot AI changed the title Confirm .gitattributes already preserves user content and emits deterministic order Fix .gitattributes writer: quote paths with spaces, dedupe per-batch syncs, deterministic ordering Apr 28, 2026
Copilot AI requested a review from jamesadevine April 28, 2026 10:36
@jamesadevine jamesadevine merged commit 13c0af1 into main Apr 28, 2026
2 checks passed
@jamesadevine jamesadevine deleted the copilot/add-gitattributes-for-compiled-pipelines branch April 28, 2026 12:31
jamesadevine added a commit that referenced this pull request Apr 28, 2026
This is an empty commit to trigger release-please to open a release PR
so that the changelog can be manually updated to include changes from
commits #330, #331, and #332 which were merged without Conventional
Commit prefixes and were silently dropped from the changelog.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants