Skip to content

Handle symlinks and Unicode filenames in snapshot creation#1

Merged
Cod-e-Codes merged 4 commits into
Cod-e-Codes:masterfrom
bpmiranda3099:fix/handle-unicode-symlinks
May 20, 2026
Merged

Handle symlinks and Unicode filenames in snapshot creation#1
Cod-e-Codes merged 4 commits into
Cod-e-Codes:masterfrom
bpmiranda3099:fix/handle-unicode-symlinks

Conversation

@bpmiranda3099
Copy link
Copy Markdown
Contributor

Pull Request

Description

Fixes snapshot creation failure when encountering symlinks with Unicode characters in filenames. Previously, ignoregrets would fail trying to read symlinks as regular files, causing the snapshot to error out.

Error encountered:

Error: failed to add file to archive: ".venv-pytest/bin/\360\235\234\213thon": open ".venv-pytest/bin/\360\235\234\213thon": no such file or directory

Changes:

  • Changed os.Open() to os.Lstat() to detect symlinks without following them
  • Added special handling for symlinks in addFileToArchive()
  • Use tar.FileInfoHeader() for proper tar header generation
  • Store symlink target path instead of attempting to read symlink as a regular file

Checklist

  • Code compiles
  • Tests pass
  • Docs updated (not needed for this fix)

@bpmiranda3099 bpmiranda3099 marked this pull request as draft May 20, 2026 08:30
@bpmiranda3099 bpmiranda3099 marked this pull request as ready for review May 20, 2026 08:31
@Cod-e-Codes
Copy link
Copy Markdown
Owner

The Lstat fix is correct, but a few things need to be addressed before this is ready to merge:

Manifest value is wrong for symlinks
manifest.Files[path] = target stores the symlink target where a SHA256 hash is expected. The map is documented as path -> sha256 and verifyManifest compares values as checksums. For symlinks you could store sha256(target) or a sentinel prefix like symlink:<target> so downstream logic can distinguish them without silently corrupting comparisons.

tar.FileInfoHeader is called before the target is known
os.Readlink should be called first, then the result passed as the second argument to FileInfoHeader. That argument exists specifically for this case. Right now you're passing "" and patching hdr.Linkname after the fact, which works but inverts the intended API usage.

restoreFile does not handle symlinks
This is the bigger gap. restoreFile calls os.OpenFile and io.Copy for every entry regardless of type. A symlink entry in the archive will hit that path and either fail or restore garbage. You need a check for hdr.Typeflag == tar.TypeSymlink that calls os.Symlink(hdr.Linkname, hdr.Name) instead.

Test coverage is minimal
TestSymlinkHandling only asserts the key exists in the manifest map. It does not verify hdr.Typeflag, hdr.Linkname, or that the resulting archive is actually readable. A test that passes with a malformed tar entry is not providing meaningful coverage.

Happy to clarify any of this. The core fix is the right approach, just needs these gaps closed.

@Cod-e-Codes Cod-e-Codes merged commit 234d796 into Cod-e-Codes:master May 20, 2026
@bpmiranda3099
Copy link
Copy Markdown
Contributor Author

Thanks. Appreciate the detailed feedback. I've pushed updates addressing each one:

Manifest value: Using symlink:<target> prefix. No collision with SHA256 since hashes are always 64 hex chars.

tar.FileInfoHeader: os.Readlink is called first now, target passed as the second argument.

restoreFile: Added tar.TypeSymlink check that calls os.Symlink(hdr.Linkname, hdr.Name). Also switched the existence check from os.Stat to os.Lstat so existing symlinks are detected.

Tests: TestSymlinkHandling now checks the sentinel prefix, hdr.Typeflag, hdr.Linkname, that the archive is readable, and that a restored symlink is actually a symlink (via os.Lstat).

Let me know what you think.

@Cod-e-Codes
Copy link
Copy Markdown
Owner

Everything looks good. Thank you for your contribution.

@bpmiranda3099 bpmiranda3099 deleted the fix/handle-unicode-symlinks branch May 29, 2026 00:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants