Bit7z v4.1.0 #333
rikyoz
announced in
Announcements
Main Features and Improvements
- Nested archives support: the new `BitNestedArchiveReader` class lets you open and inspect archives that are embedded as items inside another archive, without extracting them to disk first. (#92)
- Native file I/O: file streams have been rewritten to use low-level Win32 APIs on Windows and POSIX I/O on Unix, replacing `std::fstream`. This dramatically improves the performance of both extraction and compression operations. (#319, #325)

Due to the choice of using `std::fstream` for cross-platform support, v4.0 introduced a severe performance regression.

For a quick comparison, I did some non-exhaustive benchmarks. The result is the following boxplot, which showcases the scale of that regression and how v4.1 resolves it. In this test, v4.0 extraction takes roughly six times as long as v3, while v4.1 brings it back down into the same range as v3:
Zooming in on just v3 and v4.1 (the v4.0 box is omitted here), a small performance gap remains:
Part of this residual gap is likely due to v4.1 performing additional security validation that v3 did not, so I anticipate a modest, permanent cost relative to v3 in exchange for safer behavior. I'll continue working to reduce, and where possible eliminate, the non-essential portion of this gap in the stable release.
Benchmark methodology
Each version was benchmarked by repeatedly extracting the same archive to the same destination, discarding warmup iterations. Boxes show the median (line) and interquartile range; the cross marks the mean; whiskers follow the Tukey 1.5×IQR convention. Measurements were collected on a single machine; absolute timings are environment-dependent and intended for relative comparison between versions, not as absolute throughput figures. The v4.0 regression and its resolution in v4.1 are large effects that are robust to measurement noise; the finer v3-vs-v4.1 gap is smaller and more sensitive to measurement conditions, so it should be read as indicative rather than exact.
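To make the box-plot conventions above concrete, here is a minimal sketch of how one box could be computed from a list of timings. It is not the script used for the benchmarks: `BoxStats`, `quantile`, and `summarize` are hypothetical names, and the linear-interpolation quantile is an assumption, since the notes do not say which quantile convention was used.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Summary of one box in a Tukey-style box plot (hypothetical helper type).
struct BoxStats {
    double mean;
    double q1;
    double median;
    double q3;
    double lowerWhisker; // smallest sample >= q1 - 1.5 * IQR
    double upperWhisker; // largest sample <= q3 + 1.5 * IQR
};

// Quantile by linear interpolation between neighbouring ranks.
// Assumption: one common convention among several; p must be in [0, 1].
double quantile( const std::vector< double >& sorted, double p ) {
    assert( !sorted.empty() );
    const double pos = p * static_cast< double >( sorted.size() - 1 );
    const auto lo = static_cast< std::size_t >( pos );
    const std::size_t hi = std::min( lo + 1, sorted.size() - 1 );
    return sorted[ lo ] + ( pos - static_cast< double >( lo ) ) * ( sorted[ hi ] - sorted[ lo ] );
}

// Drops the first `warmup` iterations, then summarizes the remaining timings.
BoxStats summarize( std::vector< double > timings, std::size_t warmup ) {
    assert( timings.size() > warmup );
    timings.erase( timings.begin(), timings.begin() + static_cast< std::ptrdiff_t >( warmup ) );
    std::sort( timings.begin(), timings.end() );

    BoxStats stats{};
    stats.mean = std::accumulate( timings.begin(), timings.end(), 0.0 )
                 / static_cast< double >( timings.size() );
    stats.q1 = quantile( timings, 0.25 );
    stats.median = quantile( timings, 0.5 );
    stats.q3 = quantile( timings, 0.75 );

    // Tukey fences: samples beyond them are treated as outliers.
    const double iqr = stats.q3 - stats.q1;
    const double lowFence = stats.q1 - 1.5 * iqr;
    const double highFence = stats.q3 + 1.5 * iqr;

    // Whiskers end at the most extreme samples that still lie within the fences.
    stats.lowerWhisker = *std::find_if( timings.begin(), timings.end(),
                                        [ lowFence ]( double t ) { return t >= lowFence; } );
    stats.upperWhisker = *std::find_if( timings.rbegin(), timings.rend(),
                                        [ highFence ]( double t ) { return t <= highFence; } );
    return stats;
}
```

With this convention, a single slow run pulls the mean up but leaves the median and the whiskers largely unchanged, which is why the plots show both the median line and the mean cross.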
- New extraction callbacks: `RenameCallback` (dynamically rename or skip items during extraction), `RawDataCallback` (stream raw bytes directly to user code without touching the filesystem), and `BufferCallback` (route each extracted file to a separate, independently chosen buffer) unlock extraction patterns that were previously impossible.

Example

```cpp
BitFileExtractor extractor{ lib, BitFormat::Zip };

// RenameCallback: skip .tmp files, extract everything else under a subfolder
extractor.extract( "archive.zip", "output/", []( uint32_t, const tstring& path ) -> tstring {
    if ( path.find( ".tmp" ) != tstring::npos ) {
        return {}; // empty string = skip this item
    }
    return "backup/" + path;
} );

// RawDataCallback: receive a file's raw bytes without writing to disk
std::vector<byte_t> rawData;
extractor.extractTo( "archive.zip", [&]( const byte_t* data, std::size_t size ) -> bool {
    rawData.insert( rawData.end(), data, data + size );
    return true; // return false to abort
}, /* index = */ 0 );

// BufferCallback: extract all files, each into its own buffer
std::map<tstring, buffer_t> files;
extractor.extract( "archive.zip", [&]( uint32_t, const tstring& path ) -> buffer_t& {
    return files[ path ];
} );
```

- Multi-volume LRU file-handle cache: when reading or creating large multi-volume archives, bit7z now keeps only the most recently used volume file handles open, preventing file descriptor exhaustion on archives with many volumes (#150, #161).
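The idea behind an LRU file-handle cache can be sketched in a few lines. This is a generic illustration of the technique, not bit7z's internal implementation: `VolumeHandle`, `VolumeHandleCache`, and the injected `Opener` are hypothetical names, and a real handle type would own an OS file descriptor and close it in its destructor.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <memory>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for an open volume file.
struct VolumeHandle {
    std::size_t volumeIndex;
};

// Keeps at most `capacity` volume handles open, evicting the least recently used one.
class VolumeHandleCache {
    public:
        using Opener = std::function< std::unique_ptr< VolumeHandle >( std::size_t ) >;

        VolumeHandleCache( std::size_t capacity, Opener opener )
            : mCapacity{ capacity }, mOpener{ std::move( opener ) } {
            assert( mCapacity > 0 );
        }

        // Returns the handle for the given volume, (re)opening it if it is not cached.
        VolumeHandle& get( std::size_t volumeIndex ) {
            const auto found = mLookup.find( volumeIndex );
            if ( found != mLookup.end() ) {
                // Cache hit: move the entry to the front (most recently used).
                mEntries.splice( mEntries.begin(), mEntries, found->second );
                return *found->second->second;
            }
            if ( mEntries.size() == mCapacity ) {
                // Cache full: drop the least recently used entry;
                // destroying its unique_ptr releases the handle.
                mLookup.erase( mEntries.back().first );
                mEntries.pop_back();
            }
            mEntries.emplace_front( volumeIndex, mOpener( volumeIndex ) );
            mLookup[ volumeIndex ] = mEntries.begin();
            return *mEntries.front().second;
        }

        std::size_t openCount() const { return mEntries.size(); }

        bool isOpen( std::size_t volumeIndex ) const { return mLookup.count( volumeIndex ) != 0; }

    private:
        using Entry = std::pair< std::size_t, std::unique_ptr< VolumeHandle > >;

        std::size_t mCapacity;
        Opener mOpener;
        std::list< Entry > mEntries; // front = most recently used
        std::unordered_map< std::size_t, std::list< Entry >::iterator > mLookup;
};
```

The list keeps entries in recency order and the map gives constant-time lookup, so the number of simultaneously open handles never exceeds the chosen capacity regardless of how many volumes the archive has.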
- Timestamp control: two complementary additions give full control over timestamps when creating archives. `BitAbstractArchiveCreator` gains `setStoreLastWriteTime()`, `setStoreCreationTime()`, and `setStoreLastAccessTime()` to control which timestamp types are stored globally (format support varies: 7z has the most complete support; TAR does not support creation/last-access timestamps). Additionally, `BitOutputArchive::addFile()` now returns a `BitInputItem&`, allowing per-item timestamp overrides via `setCreationTime()`, `setLastWriteTime()`, and `setLastAccessTime()`. (#184)
- Deferred library loading: `Bit7zLibraryLoader` allows constructing a loader without immediately loading the 7-zip DLL/SO, then loading (and unloading) it at any later point. Useful for plugin systems and applications that need to control when the native library is brought in.
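As an illustration of the deferred-loading pattern described above, here is a minimal generic sketch. It does not use or reproduce the `Bit7zLibraryLoader` API: `DeferredLibrary`, `OpenFn`, and `CloseFn` are hypothetical names, and the open/close functions are injected so the sketch stays platform-neutral (in practice they would wrap something like `dlopen`/`dlclose` or `LoadLibrary`/`FreeLibrary`).

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical loader: remembers the library path at construction time,
// but only touches the native library when load() is called.
class DeferredLibrary {
    public:
        using OpenFn = std::function< void*( const std::string& ) >;
        using CloseFn = std::function< void( void* ) >;

        DeferredLibrary( std::string path, OpenFn open, CloseFn close )
            : mPath{ std::move( path ) }, mOpen{ std::move( open ) }, mClose{ std::move( close ) } {
            assert( mOpen && mClose );
        }

        DeferredLibrary( const DeferredLibrary& ) = delete;
        DeferredLibrary& operator=( const DeferredLibrary& ) = delete;

        ~DeferredLibrary() { unload(); }

        // Loads the library if it is not loaded yet; throws if loading fails.
        void load() {
            if ( mHandle != nullptr ) {
                return;
            }
            mHandle = mOpen( mPath );
            if ( mHandle == nullptr ) {
                throw std::runtime_error( "cannot load library: " + mPath );
            }
        }

        // Unloads the library if it is currently loaded; it can be loaded again later.
        void unload() {
            if ( mHandle == nullptr ) {
                return;
            }
            mClose( mHandle );
            mHandle = nullptr;
        }

        bool isLoaded() const { return mHandle != nullptr; }

    private:
        std::string mPath;
        OpenFn mOpen;
        CloseFn mClose;
        void* mHandle = nullptr;
};
```

Separating construction from loading lets a host application create the loader during start-up and pay the cost of loading the native library only when an archive operation is actually needed.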
- 🆕 Self-extracting (SFX) archive support (since beta): bit7z correctly handles SFX executables. (#317, #326)
  - […] `BitFormat::Pe`, `BitFormat::Elf`, `BitFormat::Macho`) now succeeds on unsigned SFX executables, so listing its sections works where it previously failed with an opaque `S_FALSE`.
  - […] `BitFormat::Auto`, detection now scans for and prioritizes the embedded archive (7z, Rar, Zip, Cab, ...) over the executable wrapper, matching typical SFX behavior. If an embedded encrypted archive is found but cannot be opened, `OpenErrorEncrypted` is reported. The SFX scan is skipped when an explicit format is requested or `ArchiveStartOffset::FileStart` is used, so the executable wrapper remains directly accessible.
- 🆕 Root-folder extraction (since beta): extract the contents of an archive's single top-level folder without knowing its name in advance. (#331)
  - `BitInputArchive::rootFolder()` returns the single common top-level folder shared by every item, or an empty string when there is no common root.
  - `extractRootFolderContent()` / `extractRootFolderContentTo()` (on `BitInputArchive`, `BitArchiveReader`, and `BitExtractor`) strip that folder's prefix so its contents land directly in the output directory, reusing the already-open archive. Archives without a single root folder throw a `BitException`.

```cpp
BitArchiveReader reader{ lib, "node-v24.16.0-win-x64.zip", BitFormat::Zip };
reader.extractRootFolderContentTo( "C:/Program Files/NODE.JS" ); // drops the top "node-..." folder
```

- 🆕 Locked-file compression (since beta): the new `setStoreOpenFiles()` setting on `BitAbstractArchiveCreator` opens files with shared read/write access on Windows, mirroring 7-Zip's `-ssw` switch, so you can archive files another process holds open for writing instead of failing with "access denied". No effect on non-Windows platforms. (#329)

New Features
- `Bit7zLibrary::useLargePages()`: replaces the deprecated `setLargePageMode()`.
- `BitArchiveItem::rawPath()` / `nativeName()`: access item paths and names in their raw/native string forms.
- `BitArchiveReader`: nested archive constructors: open nested archives directly from a parent `BitInputArchive`.
- `BitArchiveReader`: `ArchiveStartOffset` constructors: open archives embedded in the middle of a file (e.g., self-extracting archives).
- `BitArchiveReader::archiveProperties()`: access archive-level format properties.
- `BitArchiveReader::itemsMatching()`: get items matching a wildcard pattern.
- `BitError::NoMatchingFile`: new error code for filter/regex extraction finding no matching item.
- `BitExtractor::extractFolder()`: extract a specific folder from an archive.
- `BitExtractor::extract()` with `RenameCallback`: rename items on the fly during extraction.
- `BitFileCompressor::compress(vector<pair<path, alias>>)`: compress files using explicit in-archive path aliases. (#313)
- `BitIndicesView`: lightweight, non-owning span of item indices; implicitly constructible from a single index, vector, array, or initializer list.
- `BitInputArchive::extractFolderTo()`: extract a single folder from an archive.
- `sevenzip_string` type alias and `to_native_string()` conversion functions.
- `EncryptionScope` enum (since beta): `setPassword()` now takes a strongly typed `EncryptionScope` instead of a `bool cryptHeaders`, making it explicit whether only data, or data and headers, are encrypted. The old `bool` overload is deprecated and forwards to the new one.
- `ArchiveStartOffset` for nested-archive subfiles (since beta): control how the start of a nested archive is located when opening a subfile that is itself an archive (e.g. `ArchiveStartOffset::None` to search for the header within the subfile's stream).

Improvements
- `BitItemsVector` now stores items by value instead of via `std::unique_ptr`, eliminating one heap allocation per file when indexing items for compression. The improvement scales with the number of files being compressed.
- `OpenError` category for richer archive-opening failure messages.
- `Win32Category` error category for correct `std::error_code` handling.
- `to_tstring()` is zero-copy when already a `tstring`. (#276)
- […] `password()`, and `itemProperties()` return `const` references; file path retrieval in extract callbacks is skipped when no `FileCallback` is set; regex extraction accepts pre-compiled `tregex` objects.
- […] `S_FALSE` without an error flag (e.g. the PE handler rejecting an executable with trailing data); bit7z now maps such results to `OpenError::IsNotArc` ("Invalid archive, or wrong format used.") instead of an opaque `HRESULT`.
- […] `BitError::NoMatchingItems` / `NoMatchingFile`) map to `ERROR_NOT_FOUND`, and the generic "no such file" fallback maps to `ERROR_FILE_NOT_FOUND` rather than the misleading `ERROR_PATH_NOT_FOUND`.
- `noexcept` correctness (since beta): non-throwing public functions are now marked `noexcept`.
- […] `retainDirectories()` is disabled, nested structures below the selected extraction folder are now flattened consistently for all `FolderPathPolicy` values.

Build & Packaging 🆕 (since beta)
- `find_package(bit7z)`: new install/export rules generate `bit7zConfig.cmake` and export targets so consumers can `find_package(bit7z)` and link `bit7z::bit7z` (or `bit7z::bit7z64`). Gated behind the new `BIT7Z_INSTALL` option (`ON` by default).
- `BIT7Z_USE_SYSTEM_DEPENDENCIES` option (`OFF` by default) resolves 7-Zip and ghc::filesystem via `find_package()` first, falling back to CPM.cmake downloads, improving integration with system package managers like vcpkg. (#96)
- `/WX` and `-Werror` were previously always on, breaking consumers and package managers on newer/different compilers. They are now gated behind the new `BIT7Z_WARNINGS_AS_ERRORS` option, defaulting to `BIT7Z_BUILD_TESTS` (`ON` for contributor/CI builds, `OFF` for plain consumers).
- `bit7z::fs` namespace clash: the public `bit7z::fs` namespace was replaced with a top-level `bit7zfs` alias. As a sibling of `bit7z`, it is never pulled in by `using namespace bit7z;`, so it can no longer clash with a user's own `fs` alias.

Bug Fixes
- […] `BIT7Z_USE_NATIVE_STRING=ON`.

Deprecated
- `Bit7zLibrary::setLargePageMode()` -> use `useLargePages()`.
- `setPassword(password, bool cryptHeaders)` -> use the `EncryptionScope` overload.
- `(index, path)` `RenameCallback` signature -> use the `BitArchiveItem`-based callback (`LegacyRenameCallback` is kept for backward compatibility and will be removed in the future).

Delivered Since the Beta
The beta's "Planned Before Stable Release" list has been addressed:
- […] `BitFileCompressor`, `BitArchiveWriter` (compression callbacks, buffer/stream start-offset constructors), `BitArchiveEditor` (rename/update/delete/add), `BitOutputArchive`, `BitInputArchive` (lookup by name, progress-callback abort), the `storeOpenFiles` option, and the SFX open paths.
- […] `EncryptionScope` enum, richer `RenameCallback` context, and improved error reporting (see above).
- […] `SymlinkPolicy` enum, corrected vcpkg usage instructions, a new SAST Tools section, and an updated bug-report template.
- […] `fs::path` work). The residual gap with v3, partly intrinsic to v4.1's extra security validation, is targeted for future releases.

Note
This release includes all improvements, patches and fixes introduced in the v4.0.x series up to and including v4.0.12, and in the v4.1.0-beta.
Full Changelog: v4.1.0-beta...v4.1.0
Binaries built using the default options with Clang 13, GCC 9, MinGW 8, MSVC 2015, 2017, 2019, and 2022 👇
This discussion was created from the release Bit7z v4.1.0.