-
Notifications
You must be signed in to change notification settings - Fork 55
Preserve Unix file permissions when caching attachedOutputs #392
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder why the permissions on machine X should be the same as those on the build from machine Y. This could lead to problems with cache portability since file metadata is not included in the file checksum - only the content of the file is. The cache doesn't track the permissions in checksums, and changing permissions doesn't affect caching. While I understand the relevance of timestamps, the significance of permissions is less clear to me. Perhaps permissions should only be synced when the build explicitly sets them on outputs, otherwise it becomes a replication of environment which is a slightly different concern.
|
To add a bit of context (since I opened #214 to begin with), I was using the The issue arose because some resource files (bash scripts) that had exec permissions, and the We did work around this by using |
When preservePermissions=true (default), Unix file permissions are stored in ZIP entry headers, making them part of the ZIP file's binary content. This ensures that hashing the ZIP file (for cache keys) includes permission information, providing proper cache invalidation when file permissions change. This addresses the architectural correctness concern raised in PR apache#392 review: permissions should be in the checksum if they're being preserved, similar to how Git includes file mode in tree hashes. Changes: - Added preservePermissions boolean parameter to CacheUtils.zip() and unzip() - Added preservePermissions config option to build-cache-config.mdo (default: true) - Added isPreservePermissions() method to CacheConfig interface and implementation - Updated CacheControllerImpl to pass preservePermissions config to zip/unzip - Made permission preservation conditional based on the config flag - Fixed permissionsToMode() to include file type bits (0100000 for regular files) - Added comprehensive tests: * testPermissionsAffectFileHashWhenEnabled() - Verifies different permissions produce different ZIP hashes when preservation is enabled * testPermissionsDoNotAffectHashWhenDisabled() - Verifies permissions are NOT preserved when the flag is disabled - Added JavaDoc explaining that permissions become part of cache checksum Behavior: - preservePermissions=true: Permissions stored in ZIP, affect cache key (default) - preservePermissions=false: Permissions not preserved, files use system umask This is analogous to Git's treatment of file mode in tree hashes, ensuring that metadata included in cache restoration is also part of cache key computation.
Fixes apache#214 ## Problem When using attachedOutputs, executable file permissions are lost during cache restoration. This is because the standard java.util.zip classes do not preserve Unix file permissions. ## Solution Switch from java.util.zip to Apache Commons Compress's ZipArchive classes which support Unix permissions via setUnixMode() and getUnixMode() methods. ## Changes 1. **Add Apache Commons Compress 1.28.0 dependency** (was already transitive test dependency, now direct compile dependency) 2. **Replace java.util.zip with Commons Compress in CacheUtils**: - ZipEntry → ZipArchiveEntry - ZipOutputStream → ZipArchiveOutputStream - ZipInputStream → ZipArchiveInputStream 3. **Preserve permissions during zip**: - Read POSIX permissions with Files.getPosixFilePermissions() - Convert to Unix mode integer - Store via zipEntry.setUnixMode() 4. **Restore permissions during unzip**: - Read Unix mode from zipEntry.getUnixMode() - Convert to POSIX permissions - Apply via Files.setPosixFilePermissions() 5. **Platform safety**: - Wrap permission operations in try-catch for UnsupportedOperationException - Gracefully handle non-POSIX filesystems (Windows, FAT32, etc.) ## Why Apache Commons Compress? Apache Commons Compress was already a transitive test dependency. Using it provides a clean, simple solution compared to alternatives: **With Commons Compress** (this PR): - 2 lines to preserve permissions: entry.setUnixMode() / entry.getUnixMode() - Well-tested, maintained by Apache - Handles all edge cases and platform differences - Same Apache 2.0 license **Without Commons Compress** (JDK-only approach): - Would require manually encoding Unix permissions in ZipEntry extra field - Complex binary format (InfoZIP extra field specification) - Error-prone and hard to maintain - Platform-specific quirks - No standard API - would need reflection or custom binary encoding ## Testing The changes preserve backward compatibility: - Files without Unix mode (mode=0) are unchanged - Non-POSIX systems gracefully skip permission operations - Existing zip files without permission data work as before Tested scenarios: - Executable shell scripts (rwxr-xr-x / 0755) - Read-only files (r--r--r-- / 0444) - Regular files (rw-r--r-- / 0644) - Windows filesystems (permissions skipped gracefully)
When preservePermissions=true (default), Unix file permissions are stored in ZIP entry headers, making them part of the ZIP file's binary content. This ensures that hashing the ZIP file (for cache keys) includes permission information, providing proper cache invalidation when file permissions change. This addresses the architectural correctness concern raised in PR apache#392 review: permissions should be in the checksum if they're being preserved, similar to how Git includes file mode in tree hashes. Changes: - Added preservePermissions boolean parameter to CacheUtils.zip() and unzip() - Added preservePermissions config option to build-cache-config.mdo (default: true) - Added isPreservePermissions() method to CacheConfig interface and implementation - Updated CacheControllerImpl to pass preservePermissions config to zip/unzip - Made permission preservation conditional based on the config flag - Fixed permissionsToMode() to include file type bits (0100000 for regular files) - Added comprehensive tests: * testPermissionsAffectFileHashWhenEnabled() - Verifies different permissions produce different ZIP hashes when preservation is enabled * testPermissionsDoNotAffectHashWhenDisabled() - Verifies permissions are NOT preserved when the flag is disabled - Added JavaDoc explaining that permissions become part of cache checksum Behavior: - preservePermissions=true: Permissions stored in ZIP, affect cache key (default) - preservePermissions=false: Permissions not preserved, files use system umask This is analogous to Git's treatment of file mode in tree hashes, ensuring that metadata included in cache restoration is also part of cache key computation. Optimize filesystem capability check to avoid repeated exceptions Addresses reviewer feedback: check once if filesystem supports POSIX file attributes instead of throwing and catching UnsupportedOperationException for every file during zip/unzip operations. Changes: - Added single filesystem capability check at the beginning of zip() and unzip() - Removed try-catch blocks from inside the file iteration loops - Improved performance on non-POSIX filesystems (e.g., Windows) Before: UnsupportedOperationException thrown and caught for every file After: Single supportedFileAttributeViews() check per zip/unzip operation This is more efficient and cleaner code while maintaining identical behavior.
3b7e662
to
76ed09c
Compare
When preservePermissions=true (default), Unix file permissions are stored in ZIP entry headers, making them part of the ZIP file's binary content. This ensures that hashing the ZIP file (for cache keys) includes permission information, providing proper cache invalidation when file permissions change. This addresses the architectural correctness concern raised in PR apache#392 review: permissions should be in the checksum if they're being preserved, similar to how Git includes file mode in tree hashes. Changes: - Added preservePermissions boolean parameter to CacheUtils.zip() and unzip() - Added preservePermissions config option to build-cache-config.mdo (default: true) - Added isPreservePermissions() method to CacheConfig interface and implementation - Updated CacheControllerImpl to pass preservePermissions config to zip/unzip - Made permission preservation conditional based on the config flag - Fixed permissionsToMode() to include file type bits (0100000 for regular files) - Added comprehensive tests: * testPermissionsAffectFileHashWhenEnabled() - Verifies different permissions produce different ZIP hashes when preservation is enabled * testPermissionsDoNotAffectHashWhenDisabled() - Verifies permissions are NOT preserved when the flag is disabled - Added JavaDoc explaining that permissions become part of cache checksum Behavior: - preservePermissions=true: Permissions stored in ZIP, affect cache key (default) - preservePermissions=false: Permissions not preserved, files use system umask This is analogous to Git's treatment of file mode in tree hashes, ensuring that metadata included in cache restoration is also part of cache key computation. Optimize filesystem capability check to avoid repeated exceptions Addresses reviewer feedback: check once if filesystem supports POSIX file attributes instead of throwing and catching UnsupportedOperationException for every file during zip/unzip operations. Changes: - Added single filesystem capability check at the beginning of zip() and unzip() - Removed try-catch blocks from inside the file iteration loops - Improved performance on non-POSIX filesystems (e.g., Windows) Before: UnsupportedOperationException thrown and caught for every file After: Single supportedFileAttributeViews() check per zip/unzip operation This is more efficient and cleaner code while maintaining identical behavior.
76ed09c
to
3d534ee
Compare
@AlexanderAshitkin Updated commit ready for your review |
When preservePermissions=true (default), Unix file permissions are stored in ZIP entry headers, making them part of the ZIP file's binary content. This ensures that hashing the ZIP file (for cache keys) includes permission information, providing proper cache invalidation when file permissions change. This addresses the architectural correctness concern raised in PR apache#392 review: permissions should be in the checksum if they're being preserved, similar to how Git includes file mode in tree hashes. Changes: - Added preservePermissions boolean parameter to CacheUtils.zip() and unzip() - Added preservePermissions config option to build-cache-config.mdo (default: true) - Added isPreservePermissions() method to CacheConfig interface and implementation - Updated CacheControllerImpl to pass preservePermissions config to zip/unzip - Made permission preservation conditional based on the config flag - Fixed permissionsToMode() to include file type bits (0100000 for regular files) - Added comprehensive tests: * testPermissionsAffectFileHashWhenEnabled() - Verifies different permissions produce different ZIP hashes when preservation is enabled * testPermissionsDoNotAffectHashWhenDisabled() - Verifies permissions are NOT preserved when the flag is disabled - Added JavaDoc explaining that permissions become part of cache checksum Behavior: - preservePermissions=true: Permissions stored in ZIP, affect cache key (default) - preservePermissions=false: Permissions not preserved, files use system umask This is analogous to Git's treatment of file mode in tree hashes, ensuring that metadata included in cache restoration is also part of cache key computation. Optimize filesystem capability check to avoid repeated exceptions Addresses reviewer feedback: check once if filesystem supports POSIX file attributes instead of throwing and catching UnsupportedOperationException for every file during zip/unzip operations. Changes: - Added single filesystem capability check at the beginning of zip() and unzip() - Removed try-catch blocks from inside the file iteration loops - Improved performance on non-POSIX filesystems (e.g., Windows) Before: UnsupportedOperationException thrown and caught for every file After: Single supportedFileAttributeViews() check per zip/unzip operation This is more efficient and cleaner code while maintaining identical behavior.
6c41ae5
to
63cf5a8
Compare
Fixes #214
Problem
When using
attachedOutputs
, executable file permissions are lost during cache restoration. This is because the standardjava.util.zip
classes do not preserve Unix file permissions.Solution
Switch from
java.util.zip
to Apache Commons Compress's ZipArchive classes which support Unix permissions viasetUnixMode()
andgetUnixMode()
methods.Changes
Add Apache Commons Compress 1.28.0 dependency (was already transitive test dependency, now direct compile dependency)
Replace java.util.zip with Commons Compress in CacheUtils:
ZipEntry
→ZipArchiveEntry
ZipOutputStream
→ZipArchiveOutputStream
ZipInputStream
→ZipArchiveInputStream
Preserve permissions during zip:
Files.getPosixFilePermissions()
0755
)zipEntry.setUnixMode()
Restore permissions during unzip:
zipEntry.getUnixMode()
Files.setPosixFilePermissions()
Platform safety:
UnsupportedOperationException
Why Apache Commons Compress?
Apache Commons Compress was already a transitive test dependency. Using it provides a clean, simple solution compared to alternatives:
With Commons Compress (this PR):
entry.setUnixMode()
/entry.getUnixMode()
Without Commons Compress (JDK-only approach):
ZipEntry
extra fieldThe standard
java.util.zip.ZipEntry
class provides no methods to access or modify external file attributes that store Unix permissions. Attempting this with JDK-only code would require:In contrast, Apache Commons Compress provides a battle-tested, well-documented API that handles all this complexity for us.
Testing
The changes preserve backward compatibility:
mode=0
) are unchangedTested scenarios:
rwxr-xr-x
/0755
)r--r--r--
/0444
)rw-r--r--
/0644
)Related Issues