fix: remove binary detection, fix non-ASCII git paths, add FAQ#17
fix: remove binary detection, fix non-ASCII git paths, add FAQ#17
Conversation
Files dragged into the workspace from Finder are detected as new files with null baseline, but were lost on refresh (rebuildState) or VSCode reload (load) because both methods only rebuilt state from git—and null-baseline files are never stored in git. Add collectUntrackedFiles() to scan the workspace for files not tracked in hunkwise git, and call it from both load() and rebuildState() to restore them as new files with null baseline. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Replace array concat with push(...spread) to avoid quadratic reallocation - Add null-byte check to skip binary files (same heuristic as git) - Add integration test verifying binary files are not picked up on refresh Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Use chunk-based Buffer read instead of loading entire file for null-byte check - Update rebuildState log message to reflect both git and disk sources Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Suppress FileWatcher during binary file creation and refresh to eliminate timing dependency between watcher create events and the refresh command. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…s-on-refresh-reload
The null-byte binary detection in collectUntrackedFiles was inconsistent with onDiskCreate which accepts binary files. This caused binary files (e.g. .xlsx, .msg) dragged from Finder to disappear after refresh. Remove the detection so both paths behave consistently — if onDiskCreate can detect a file as new, refresh must preserve it. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Document that file tracking is determined solely by whether a baseline exists in hunkwise git, not by file content or binary detection. Fix stale description that said baseline '' means new file (it's null). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the unreliable openDoc&&bufferMatch heuristic with explicit onWillCreateFiles/onDidCreateFiles tracking. The old heuristic caused race conditions: files dragged from Finder could be auto-opened by VSCode, making them look like user-created files (silently snapshotted as baseline instead of entering reviewing). onWillCreateFiles only fires for VSCode API creates (explorer, applyEdit), never for external creates (Finder, scripts, AI tools), making the distinction reliable and deterministic. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
onWillCreateFiles fires for Finder drag-into-explorer too (VSCode handles the copy), making it unreliable for distinguishing user creates from external creates. The openDoc&&bufferMatch heuristic, while imperfect (race-dependent), is still the better trade-off. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The file gets snapshotted as baseline when misdetected, so refresh cannot recover it — baseline matches disk, no diff to show. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
git ls-tree quotes non-ASCII filenames by default (e.g. \343\203\241\343\203\274\343\203\253), causing path mismatches with filesystem paths. Add -c core.quotepath=false to all git calls so paths are always output as raw UTF-8. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR updates hunkwise’s refresh + git baseline plumbing to better support externally created files (including binary) and to correctly handle non-ASCII paths in the internal git repo, alongside documentation updates clarifying baseline semantics and known VS Code/Finder timing behavior.
Changes:
- Remove binary-content heuristics from
StateManager.collectUntrackedFiles()so refresh treats all on-disk untracked files consistently (baseline remains the source of truth). - Force raw UTF-8 path output from git by adding
-c core.quotepath=falseto allHunkwiseGitsubprocess calls. - Add/adjust tests covering non-ASCII paths and refresh behavior for binary “new” files; add FAQ + update CLAUDE.md guidance.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| src/test/integration/filewatch.test.ts | Updates integration coverage to ensure refresh preserves binary new files in reviewing state. |
| src/test/hunkwiseGit.test.ts | Adds a unit test ensuring listTrackedFiles() / getBaseline() work with non-ASCII paths. |
| src/stateManager.ts | Removes binary/unreadable detection from untracked-file collection during load/rebuild. |
| src/hunkwiseGit.ts | Forces core.quotepath=false on all git calls to prevent escaped path output. |
| package.json | Bumps extension version. |
| FAQ.md | Adds documentation for Finder drag-in timing behavior. |
| CLAUDE.md | Clarifies “baseline as single source of truth” and null/empty baseline semantics. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| await fd.close(); | ||
| } | ||
| } catch { /* skip unreadable */ } | ||
| results.push(full); |
There was a problem hiding this comment.
collectUntrackedFiles() now unconditionally adds any Dirent.isFile() not in trackedSet. This also changes the previous behavior of skipping unreadable files (EACCES/EPERM): those files will now enter reviewing with baseline:null, but later code paths (e.g. accept/discard flows that read the file as UTF-8) can throw or behave incorrectly when the file can’t be read. Consider restoring a minimal readability/access check (without binary detection) so only readable files are treated as new, or ensure downstream consumers gracefully handle unreadable files.
| results.push(full); | |
| try { | |
| await fs.promises.access(full, fs.constants.R_OK); | |
| results.push(full); | |
| } catch (err: any) { | |
| // Skip unreadable files (e.g., permission errors) to avoid issues later | |
| if (err && (err.code === 'EACCES' || err.code === 'EPERM')) { | |
| continue; | |
| } | |
| // For other unexpected errors, also skip to be conservative | |
| continue; | |
| } |
- Add readability check in collectUntrackedFiles to skip files with permission errors (EACCES/EPERM) that would fail downstream (#17) - Normalize pendingUserDeletes paths to prevent NFC/NFD mismatches (#18) - Normalize document fsPath in onDiskCreate/onDiskChange comparisons (#18) - Normalize filePath in all StateManager public methods for consistent Map key access (#18) - Gate NFC normalization to macOS only — on Linux NFC/NFD can be distinct filenames (#18) - Rename shadowed `fs` variable to `childState` in renameFile (#18) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address Copilot review issues from PR #17 and #18 - Add readability check in collectUntrackedFiles to skip files with permission errors (EACCES/EPERM) that would fail downstream (#17) - Normalize pendingUserDeletes paths to prevent NFC/NFD mismatches (#18) - Normalize document fsPath in onDiskCreate/onDiskChange comparisons (#18) - Normalize filePath in all StateManager public methods for consistent Map key access (#18) - Gate NFC normalization to macOS only — on Linux NFC/NFD can be distinct filenames (#18) - Rename shadowed `fs` variable to `childState` in renameFile (#18) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: update normalizePath docstring for platform-conditional behavior The docstring still described unconditional NFC normalization after the implementation was changed to macOS-only. Updated to document that the function is a no-op on non-macOS platforms. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: bump version to 0.0.29 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
collectUntrackedFiles: Was inconsistent withonDiskCreatewhich accepts binary files — caused .xlsx/.msg files dragged from Finder to disappear after refresh. Baseline tracking is the single source of truth, not file content.git ls-treequotes non-ASCII paths by default (e.g.\343\203\241...), causing path mismatches with filesystem paths. Added-c core.quotepath=falseglobally to all git calls so paths are always raw UTF-8.onWillCreateFilesapproach:onWillCreateFilesalso fires for Finder drag-into-explorer, making it unreliable. RestoredopenDoc&&bufferMatchheuristic.Test plan
listTrackedFilesandgetBaseline🤖 Generated with Claude Code