Skip to content

Switch tarx to go-git's gitignore matcher#777

Merged
phinze merged 4 commits into
mainfrom
phinze/mir-1075-deploy-drops-directories-preserved-via-keep-gitignore
May 5, 2026
Merged

Switch tarx to go-git's gitignore matcher#777
phinze merged 4 commits into
mainfrom
phinze/mir-1075-deploy-drops-directories-preserved-via-keep-gitignore

Conversation

@phinze
Copy link
Copy Markdown
Contributor

@phinze phinze commented Apr 29, 2026

A user report came in about a vanilla bridgetown new app failing to boot after deploy with Errno::ENOENT - tmp/pids/server.pid. Bridgetown ships with tmp/pids/.keep plus a .gitignore that uses /tmp/* and !/tmp/pids/ to keep the pidfile directory tracked. When miren bundles source for deploy, the gitignore walker drops tmp/ from the tar entirely, so tmp/pids/ never makes it into the image, and Puma has nowhere to write its pidfile.

The walker was using shibumi/go-pathspec, a generic glob library whose GitIgnore function takes a single string path with no way to indicate it's a directory. Our isGitignored worked around that by checking both relPath and relPath + "/" and OR-ing the result, but the dual-form check breaks negations: the no-slash form matches /tmp/* and short-circuits before the trailing-slash negation !/tmp/pids/ ever gets a chance. Once the walker SkipDirs tmp/pids/, lazy directory emission drops tmp/ from the tar. Pathspec is a generic glob library, not a real gitignore implementation, and we'd been bending it into the wrong shape.

Swapping to go-git/v5/plumbing/format/gitignore, the parser go-git itself uses, gets us a proper Match(path []string, isDir bool) bool API. All 19 existing tarx tests pass unchanged, which is a strong signal the swap is behavior-preserving for everything except the bug we were after. TestMakeTarBridgetownTmpPids is the new regression guard. Binary delta is +20KB after dead-code elimination.

Closes MIR-1075.

A vanilla `bridgetown new` ships with `tmp/pids/.keep` plus a
`.gitignore` that uses `/tmp/*` and a `!/tmp/pids/` negation to keep
the pidfile directory tracked. Deploys would drop `tmp/` from the tar
entirely, so Puma had nowhere to write `tmp/pids/server.pid` at boot.

Root cause was `shibumi/go-pathspec`, which exposes a single-string
path API with no way to indicate the path is a directory. It silently
mishandles trailing-slash patterns (even basic positive matches like
`node_modules/` against directory paths). Our `isGitignored` used a
dual-form workaround (try both `relPath` and `relPath + "/"`) that
papered over simple cases but broke negation: the no-slash form
matches `/tmp/*` and short-circuits before the trailing-slash
negation `!/tmp/pids/` can apply. The walker then `SkipDir`s
`tmp/pids/`, exits `tmp/` having accepted zero files inside, and lazy
directory emission drops `tmp/` from the tar entirely.

`go-git/v5/plumbing/format/gitignore` is the parser go-git itself
uses, with a `Match(path []string, isDir bool) bool` API. Binary
delta is +20KB after dead-code elimination.

All 19 existing tarx tests pass with zero modifications.
`TestMakeTarBridgetownTmpPids` is the new regression guard.

End-to-end verified: `/app/tmp/pids/.keep` now lands in the deployed
image as `app:app`, and Puma boots without any `PIDFILE` workaround.

Fixes MIR-1075.
@phinze phinze requested a review from a team as a code owner April 29, 2026 15:16
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 29, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d6b06571-0d1a-4571-9a45-0b011ca6280d

📥 Commits

Reviewing files that changed from the base of the PR and between 5b5fec6 and f0180d7.

📒 Files selected for processing (2)
  • pkg/tarx/tarx.go
  • pkg/tarx/tarx_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/tarx/tarx.go

📝 Walkthrough

Walkthrough

The tarx package replaces go-pathspec with go-git’s gitignore.Pattern/Matcher, adding parseGitignoreFile, parseStringPatterns, and pathSegments to build scoped matchers and updating ValidatePattern to reject empty/whitespace and use gitignore.ParsePattern. ComputeManifest and tar creation now apply matcher segment logic and dynamically load nested .gitignore files. Tar streaming uses io.Pipe with explicit close ordering and CloseWithError to propagate errors. Two tests for nested and sibling .gitignore behavior were added. go.mod was updated to add go-git and adjust indirect requirements.


Review rate limit: 3/5 reviews remaining, refill in 12 minutes and 32 seconds.

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/tarx/tarx.go`:
- Around line 32-35: The file currently swallows all errors from
os.ReadFile(path) when loading .gitignore, which hides real read failures;
change the reader so it only treats a missing file (os.IsNotExist(err)) as
optional and returns patterns, nil in that case, but for other errors return the
error to the caller; update the .gitignore-reading function to return (patterns,
error) accordingly and propagate that error through ComputeManifest and
makeTarWithFilter instead of treating every failure as “no patterns”.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ac632f55-18d2-4ead-bbff-d280e9216905

📥 Commits

Reviewing files that changed from the base of the PR and between 47c2774 and 5e643e6.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (3)
  • go.mod
  • pkg/tarx/tarx.go
  • pkg/tarx/tarx_test.go

Comment thread pkg/tarx/tarx.go
parseGitignoreFile was returning nil on any read error, which made a
permission-denied or transient I/O failure on a .gitignore look
identical to the file not existing. That silently turned "couldn't
read your ignore rules" into "no patterns, ship everything." Only
treat os.IsNotExist as the optional case now and propagate everything
else up through ComputeManifest and makeTarWithFilter.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/tarx/tarx.go`:
- Around line 290-293: The goroutine that calls filepath.Walk currently ignores
the error returned by Walk, causing nested .gitignore parse errors (from
parseGitignoreFile) to be lost and the reader to see EOF; modify the goroutine
that uses os.Pipe to capture the error returned by filepath.Walk and propagate
it to the reader by writing the error through the pipe (or closing the writer
with the error) so callers of the tar generation observe the failure; locate the
Walk invocation and its callback, ensure any non-nil error from filepath.Walk is
forwarded (not swallowed) and that parseGitignoreFile errors returned from the
callback are returned by Walk and sent across the pipe to the reader side.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 77cdb787-b9c0-4b82-be0f-e8fa2186144c

📥 Commits

Reviewing files that changed from the base of the PR and between 5e643e6 and b896458.

📒 Files selected for processing (1)
  • pkg/tarx/tarx.go

Comment thread pkg/tarx/tarx.go
The goroutine in makeTarWithFilter discarded filepath.Walk's return
value, so any error from inside the walk (now including the gitignore
parse errors the previous commit started returning) reached the
reader as a silent EOF. Switched the pipe from os.Pipe to io.Pipe so
we have CloseWithError available, captured the walk error, and made
sure the tar and gzip trailers flush before we close.
Copy link
Copy Markdown
Contributor

@evanphx evanphx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All good, but I think we need to fix the cross-contanimation of the gitignore patterns from sibling directories.

Comment thread pkg/tarx/tarx.go Outdated
Comment thread pkg/tarx/tarx.go Outdated
Comment thread pkg/tarx/tarx.go
Comment thread pkg/tarx/tarx.go Outdated
We were calling gitignore.NewMatcher inside the walker callback for
both the includes and ignores, so we'd rebuild a matcher per file.
Hoist them out: the includes matcher never changes during the walk,
and the ignores matcher only needs rebuilding when a nested
.gitignore adds new patterns. Same shape applied in ComputeManifest
and makeTarWithFilter.

Also add TestMakeTarSiblingGitignoresDoNotCrossPollute to lock down
that nested .gitignores stay scoped to their own subtree. The
accumulating slice mechanic looks fishy at a glance (patterns from
subdirA's .gitignore stay in the slice when we walk subdirB), but
go-git's Pattern carries a domain set at parse time and Match
early-returns NoMatch if the path doesn't start with that domain.
The test confirms cross-pollution doesn't happen, and stripping the
domain makes the test fail with the expected symptom.
Copy link
Copy Markdown
Contributor

@evanphx evanphx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, the pattern domain thing makes sense. I had to go through it, basically it's clamping the pattern to paths that are inside the path to the gitignore.

@phinze phinze merged commit b9c9621 into main May 5, 2026
12 checks passed
@phinze phinze deleted the phinze/mir-1075-deploy-drops-directories-preserved-via-keep-gitignore branch May 5, 2026 16:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants