Skip to content

fix: remove binary detection, fix non-ASCII git paths, add FAQ#17

Merged
molon merged 14 commits intomainfrom
fix/preserve-new-files-on-refresh-reload
Apr 1, 2026
Merged

fix: remove binary detection, fix non-ASCII git paths, add FAQ#17
molon merged 14 commits intomainfrom
fix/preserve-new-files-on-refresh-reload

Conversation

@molon
Copy link
Copy Markdown
Owner

@molon molon commented Apr 1, 2026

Summary

  • Remove binary detection from collectUntrackedFiles: Was inconsistent with onDiskCreate which accepts binary files — caused .xlsx/.msg files dragged from Finder to disappear after refresh. Baseline tracking is the single source of truth, not file content.
  • Fix non-ASCII git paths: git ls-tree quotes non-ASCII paths by default (e.g. \343\203\241...), causing path mismatches with filesystem paths. Added -c core.quotepath=false globally to all git calls so paths are always raw UTF-8.
  • Revert onWillCreateFiles approach: onWillCreateFiles also fires for Finder drag-into-explorer, making it unreliable. Restored openDoc&&bufferMatch heuristic.
  • Docs: Updated CLAUDE.md with baseline-as-source-of-truth principle. Added FAQ for known Finder drag-in timing heuristic.

Test plan

  • Unit test: non-ASCII paths in listTrackedFiles and getBaseline
  • Integration test: refresh preserves binary files detected by FileWatcher
  • Integration test: refresh preserves new files (null baseline)
  • All 71 unit tests pass
  • All 67 integration tests pass

🤖 Generated with Claude Code

molon and others added 14 commits April 1, 2026 19:57
Files dragged into the workspace from Finder are detected as new files
with null baseline, but were lost on refresh (rebuildState) or VSCode
reload (load) because both methods only rebuilt state from git—and
null-baseline files are never stored in git.

Add collectUntrackedFiles() to scan the workspace for files not tracked
in hunkwise git, and call it from both load() and rebuildState() to
restore them as new files with null baseline.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Replace array concat with push(...spread) to avoid quadratic reallocation
- Add null-byte check to skip binary files (same heuristic as git)
- Add integration test verifying binary files are not picked up on refresh

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Use chunk-based Buffer read instead of loading entire file for null-byte check
- Update rebuildState log message to reflect both git and disk sources

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Suppress FileWatcher during binary file creation and refresh to
eliminate timing dependency between watcher create events and
the refresh command.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The null-byte binary detection in collectUntrackedFiles was inconsistent
with onDiskCreate which accepts binary files. This caused binary files
(e.g. .xlsx, .msg) dragged from Finder to disappear after refresh.

Remove the detection so both paths behave consistently — if onDiskCreate
can detect a file as new, refresh must preserve it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Document that file tracking is determined solely by whether a baseline
exists in hunkwise git, not by file content or binary detection. Fix
stale description that said baseline '' means new file (it's null).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the unreliable openDoc&&bufferMatch heuristic with explicit
onWillCreateFiles/onDidCreateFiles tracking. The old heuristic caused
race conditions: files dragged from Finder could be auto-opened by
VSCode, making them look like user-created files (silently snapshotted
as baseline instead of entering reviewing).

onWillCreateFiles only fires for VSCode API creates (explorer, applyEdit),
never for external creates (Finder, scripts, AI tools), making the
distinction reliable and deterministic.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
onWillCreateFiles fires for Finder drag-into-explorer too (VSCode
handles the copy), making it unreliable for distinguishing user creates
from external creates. The openDoc&&bufferMatch heuristic, while
imperfect (race-dependent), is still the better trade-off.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The file gets snapshotted as baseline when misdetected, so refresh
cannot recover it — baseline matches disk, no diff to show.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
git ls-tree quotes non-ASCII filenames by default (e.g.
\343\203\241\343\203\274\343\203\253), causing path mismatches with
filesystem paths. Add -c core.quotepath=false to all git calls so
paths are always output as raw UTF-8.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 1, 2026 16:42
@molon molon merged commit 4441176 into main Apr 1, 2026
2 checks passed
@molon molon deleted the fix/preserve-new-files-on-refresh-reload branch April 1, 2026 16:44
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates hunkwise’s refresh + git baseline plumbing to better support externally created files (including binary) and to correctly handle non-ASCII paths in the internal git repo, alongside documentation updates clarifying baseline semantics and known VS Code/Finder timing behavior.

Changes:

  • Remove binary-content heuristics from StateManager.collectUntrackedFiles() so refresh treats all on-disk untracked files consistently (baseline remains the source of truth).
  • Force raw UTF-8 path output from git by adding -c core.quotepath=false to all HunkwiseGit subprocess calls.
  • Add/adjust tests covering non-ASCII paths and refresh behavior for binary “new” files; add FAQ + update CLAUDE.md guidance.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
src/test/integration/filewatch.test.ts Updates integration coverage to ensure refresh preserves binary new files in reviewing state.
src/test/hunkwiseGit.test.ts Adds a unit test ensuring listTrackedFiles() / getBaseline() work with non-ASCII paths.
src/stateManager.ts Removes binary/unreadable detection from untracked-file collection during load/rebuild.
src/hunkwiseGit.ts Forces core.quotepath=false on all git calls to prevent escaped path output.
package.json Bumps extension version.
FAQ.md Adds documentation for Finder drag-in timing behavior.
CLAUDE.md Clarifies “baseline as single source of truth” and null/empty baseline semantics.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/stateManager.ts
await fd.close();
}
} catch { /* skip unreadable */ }
results.push(full);
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

collectUntrackedFiles() now unconditionally adds any Dirent.isFile() not in trackedSet. This also changes the previous behavior of skipping unreadable files (EACCES/EPERM): those files will now enter reviewing with baseline:null, but later code paths (e.g. accept/discard flows that read the file as UTF-8) can throw or behave incorrectly when the file can’t be read. Consider restoring a minimal readability/access check (without binary detection) so only readable files are treated as new, or ensure downstream consumers gracefully handle unreadable files.

Suggested change
results.push(full);
try {
await fs.promises.access(full, fs.constants.R_OK);
results.push(full);
} catch (err: any) {
// Skip unreadable files (e.g., permission errors) to avoid issues later
if (err && (err.code === 'EACCES' || err.code === 'EPERM')) {
continue;
}
// For other unexpected errors, also skip to be conservative
continue;
}

Copilot uses AI. Check for mistakes.
molon added a commit that referenced this pull request Apr 1, 2026
- Add readability check in collectUntrackedFiles to skip files with
  permission errors (EACCES/EPERM) that would fail downstream (#17)
- Normalize pendingUserDeletes paths to prevent NFC/NFD mismatches (#18)
- Normalize document fsPath in onDiskCreate/onDiskChange comparisons (#18)
- Normalize filePath in all StateManager public methods for consistent
  Map key access (#18)
- Gate NFC normalization to macOS only — on Linux NFC/NFD can be
  distinct filenames (#18)
- Rename shadowed `fs` variable to `childState` in renameFile (#18)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
molon added a commit that referenced this pull request Apr 1, 2026
* fix: address Copilot review issues from PR #17 and #18

- Add readability check in collectUntrackedFiles to skip files with
  permission errors (EACCES/EPERM) that would fail downstream (#17)
- Normalize pendingUserDeletes paths to prevent NFC/NFD mismatches (#18)
- Normalize document fsPath in onDiskCreate/onDiskChange comparisons (#18)
- Normalize filePath in all StateManager public methods for consistent
  Map key access (#18)
- Gate NFC normalization to macOS only — on Linux NFC/NFD can be
  distinct filenames (#18)
- Rename shadowed `fs` variable to `childState` in renameFile (#18)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update normalizePath docstring for platform-conditional behavior

The docstring still described unconditional NFC normalization after the
implementation was changed to macOS-only. Updated to document that the
function is a no-op on non-macOS platforms.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version to 0.0.29

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants