Main Features and Improvements
-
Nested archives support: the new BitNestedArchiveReader class lets you open and inspect archives that are embedded as items inside another archive, without extracting them to disk first. (#92)

    // Extract the .tar inside a .tar.gz without writing it to disk
    Bit7zLibrary lib{ "7z.dll" };
    BitArchiveReader gz{ lib, "archive.tar.gz", BitFormat::GZip };
    BitNestedArchiveReader tar{ lib, gz, BitFormat::Tar };
    tar.extractTo( "output/" );
-
Native file I/O: file streams have been rewritten to use low-level Win32 APIs on Windows and POSIX I/O on Unix, replacing std::fstream. This dramatically improves the performance of both extraction and compression operations. (#319, #325)

v4.0 adopted std::fstream for cross-platform support, and that choice introduced a severe performance regression.

For a quick comparison, I ran some non-exhaustive benchmarks. The result is the following boxplot, which shows the scale of that regression and how v4.1 resolves it. In this test, v4.0 extraction takes roughly six times as long as v3, while v4.1 brings it back down into the same range as v3:
Zooming in on just v3 and v4.1 (the v4.0 box is omitted here), a small performance gap remains:
Part of this residual gap is likely due to v4.1 performing additional security validation that v3 did not, so I anticipate a modest, permanent cost relative to v3 in exchange for safer behavior. I'll continue working to reduce, and where possible eliminate, the non-essential portion of this gap in future releases.
Benchmark methodology
Each version was benchmarked by repeatedly extracting the same archive to the same destination, discarding warmup iterations. Boxes show the median (line) and interquartile range; the cross marks the mean; whiskers follow the Tukey 1.5×IQR convention. Measurements were collected on a single machine; absolute timings are environment-dependent and intended for relative comparison between versions, not as absolute throughput figures. The v4.0 regression and its resolution in v4.1 are large effects that are robust to measurement noise; the finer v3-vs-v4.1 gap is smaller and more sensitive to measurement conditions, so it should be read as indicative rather than exact.
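To make the plot conventions above concrete, here is a minimal sketch of how the box, cross, and whisker values could be derived from a set of timing samples. It is not the script used for the benchmarks: the names `BoxStats`, `quantile`, and `boxStats` are hypothetical, and the linear-interpolation quantile is only one of several common definitions, so quartiles computed by a plotting library may differ slightly.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Summary statistics behind a Tukey-style box plot (hypothetical helper type).
struct BoxStats {
    double q1;          // first quartile (bottom of the box)
    double median;      // line inside the box
    double q3;          // third quartile (top of the box)
    double mean;        // the cross
    double lowerFence;  // q1 - 1.5 * IQR
    double upperFence;  // q3 + 1.5 * IQR
    double whiskerLow;  // smallest sample that is still >= lowerFence
    double whiskerHigh; // largest sample that is still <= upperFence
};

// Linear-interpolation quantile on a sorted, non-empty sample (assumed definition).
inline double quantile( const std::vector< double >& sorted, double p ) {
    assert( !sorted.empty() );
    const double pos = p * static_cast< double >( sorted.size() - 1 );
    const auto lo = static_cast< std::size_t >( pos );
    const auto hi = std::min( lo + 1, sorted.size() - 1 );
    const double frac = pos - static_cast< double >( lo );
    return sorted[ lo ] + frac * ( sorted[ hi ] - sorted[ lo ] );
}

inline BoxStats boxStats( std::vector< double > samples ) {
    assert( !samples.empty() );
    std::sort( samples.begin(), samples.end() );

    BoxStats r{};
    r.q1 = quantile( samples, 0.25 );
    r.median = quantile( samples, 0.50 );
    r.q3 = quantile( samples, 0.75 );
    r.mean = std::accumulate( samples.begin(), samples.end(), 0.0 )
             / static_cast< double >( samples.size() );

    const double iqr = r.q3 - r.q1;
    r.lowerFence = r.q1 - 1.5 * iqr;
    r.upperFence = r.q3 + 1.5 * iqr;

    // Whiskers end at the most extreme samples that still lie inside the fences;
    // anything beyond them would be drawn as an outlier.
    r.whiskerLow = *std::find_if( samples.begin(), samples.end(),
                                  [ & ]( double s ) { return s >= r.lowerFence; } );
    r.whiskerHigh = *std::find_if( samples.rbegin(), samples.rend(),
                                   [ & ]( double s ) { return s <= r.upperFence; } );
    return r;
}
```

With a sample such as `{ 1, 2, 3, 4, 5, 100 }`, the value 100 falls above the upper fence, so the upper whisker stops at 5 while the mean is still pulled upward. This is why the plots show both the median line and the mean cross.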
-
New extraction callbacks:
RenameCallback (dynamically rename or skip items during extraction, #123), RawDataCallback (stream raw bytes directly to user code without touching the filesystem, #122), and BufferCallback (route each extracted file to a separate, independently chosen buffer) unlock extraction patterns that were previously impossible.

Example

    BitFileExtractor extractor{ lib, BitFormat::Zip };

    // RenameCallback: skip .tmp files, extract everything else under a subfolder
    extractor.extract( "archive.zip", "output/",
        []( uint32_t, const tstring& path ) -> tstring {
            if ( path.find( ".tmp" ) != tstring::npos ) {
                return {}; // empty string = skip this item
            }
            return "backup/" + path;
        } );

    // RawDataCallback: receive a file's raw bytes without writing to disk
    std::vector<byte_t> rawData;
    extractor.extractTo( "archive.zip",
        [&]( const byte_t* data, std::size_t size ) -> bool {
            rawData.insert( rawData.end(), data, data + size );
            return true; // return false to abort
        },
        /* index = */ 0 );

    // BufferCallback: extract all files, each into its own buffer
    std::map<tstring, buffer_t> files;
    extractor.extract( "archive.zip",
        [&]( uint32_t, const tstring& path ) -> buffer_t& {
            return files[ path ];
        } );
-
Multi-volume LRU file-handle cache: when reading or creating large multi-volume archives, bit7z now keeps only the most recently used volume file handles open, preventing file descriptor exhaustion on archives with many volumes (#150, #161).
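As a rough illustration of the idea, the sketch below shows a generic least-recently-used cache that bounds how many volume handles stay open at once. It is not bit7z's implementation: `VolumeHandle` and `LruHandleCache` are hypothetical names, and the "handle" is a plain struct standing in for a real file descriptor or HANDLE.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>

// Stand-in for an open volume; a real implementation would own an OS file handle
// and close it in its destructor.
struct VolumeHandle {
    std::string path;
};

class LruHandleCache {
public:
    explicit LruHandleCache( std::size_t capacity ) : mCapacity{ capacity } {
        assert( capacity > 0 );
    }

    // Returns the handle for `path`, opening it if needed. When the cache is full,
    // the least recently used handle is closed first. References to an evicted
    // handle become invalid, so callers should not keep them across calls.
    VolumeHandle& acquire( const std::string& path ) {
        const auto found = mIndex.find( path );
        if ( found != mIndex.end() ) {
            // Cache hit: move the entry to the front (most recently used).
            mHandles.splice( mHandles.begin(), mHandles, found->second );
            return mHandles.front();
        }
        if ( mHandles.size() == mCapacity ) {
            // Cache full: drop the entry at the back (least recently used).
            mIndex.erase( mHandles.back().path );
            mHandles.pop_back();
            ++mEvictions;
        }
        mHandles.push_front( VolumeHandle{ path } );
        mIndex[ path ] = mHandles.begin();
        return mHandles.front();
    }

    bool isOpen( const std::string& path ) const { return mIndex.count( path ) != 0; }
    std::size_t openCount() const { return mHandles.size(); }
    std::size_t evictions() const { return mEvictions; }

private:
    std::size_t mCapacity;
    std::size_t mEvictions{ 0 };
    std::list< VolumeHandle > mHandles; // front = most recently used
    std::unordered_map< std::string, std::list< VolumeHandle >::iterator > mIndex;
};
```

The list plus hash map combination keeps both lookup and reordering at constant time, which matters when an archive is split into hundreds of volumes and reads jump between them.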
-
Timestamp control: two complementary additions give full control over timestamps when creating archives.
BitAbstractArchiveCreator gains setStoreLastWriteTime(), setStoreCreationTime(), and setStoreLastAccessTime() to control which timestamp types are stored globally (format support varies: 7z has the most complete support; TAR does not support creation/last-access timestamps). Additionally, BitOutputArchive::addFile() now returns a BitInputItem&, allowing per-item timestamp overrides via setCreationTime(), setLastWriteTime(), and setLastAccessTime(). (#184)

Example

    // Global control: choose which timestamp types to embed in the archive
    BitFileCompressor compressor{ lib, BitFormat::SevenZip };
    compressor.setStoreCreationTime( true );    // opt in to creation timestamps
    compressor.setStoreLastAccessTime( true );  // opt in to last access timestamps
    compressor.setStoreLastWriteTime( false );  // suppress last write timestamps
    compressor.compress( { "file1.txt", "file2.txt" }, "archive.7z" );

    // Per-item control: override the timestamp of a specific item
    const auto customTime = std::chrono::system_clock::now();
    BitArchiveWriter writer{ lib, BitFormat::SevenZip };
    writer.addFile( "document.txt" ).setLastWriteTime( customTime );
    writer.compressTo( "archive.7z" );
-
Deferred library loading:
Bit7zLibraryLoader allows constructing a loader without immediately loading the 7-Zip DLL/SO, then loading (and unloading) it at any later point. This is useful for plugin systems and applications that need to control when the native library is brought in.

Example

    Bit7zLibraryLoader loader; // no library loaded yet

    // ... later, e.g. after the user selects the library path ...
    std::error_code ec;
    loader.load( "7z.dll", ec );
    if ( ec ) {
        std::cerr << "Could not load 7-zip: " << ec.message() << "\n";
        return;
    }

    BitFileExtractor extractor{ loader, BitFormat::Zip }; // implicit conversion to Bit7zLibrary&
    extractor.extractTo( "archive.zip", "output/" );
    loader.unload();
-
🆕 Self-extracting (SFX) archive support (since beta): bit7z correctly handles SFX executables. (#317, #326)
  - Requesting an executable format explicitly (BitFormat::Pe, BitFormat::Elf, BitFormat::Macho) now succeeds on unsigned SFX executables, so listing their sections works where it previously failed with an opaque S_FALSE.
  - With BitFormat::Auto, detection now scans for and prioritizes the embedded archive (7z, Rar, Zip, Cab, ...) over the executable wrapper, matching typical SFX behavior. If an embedded encrypted archive is found but cannot be opened, OpenErrorEncrypted is reported. The SFX scan is skipped when an explicit format is requested or ArchiveStartOffset::FileStart is used, so the executable wrapper remains directly accessible.
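To give an intuition for what scanning for an embedded archive involves, the sketch below searches a byte buffer for the 6-byte signature that begins every 7z archive. It is a simplification under stated assumptions, not bit7z's or 7-Zip's detection logic: the helper names (`findSignature`, `sevenZipSignature`, `kNotFound`) are hypothetical, only the 7z format is covered, and a real scanner would also validate the header that follows the signature to rule out false positives.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returned when the signature does not occur in the data.
constexpr std::size_t kNotFound = static_cast< std::size_t >( -1 );

// The signature bytes at the start of a 7z archive: '7' 'z' 0xBC 0xAF 0x27 0x1C.
inline const std::vector< std::uint8_t >& sevenZipSignature() {
    static const std::vector< std::uint8_t > signature{ 0x37, 0x7A, 0xBC, 0xAF, 0x27, 0x1C };
    return signature;
}

// Returns the offset of the first occurrence of `signature` in `data`, or kNotFound.
inline std::size_t findSignature( const std::vector< std::uint8_t >& data,
                                  const std::vector< std::uint8_t >& signature ) {
    assert( !signature.empty() ); // an empty signature would trivially match at offset 0
    const auto it = std::search( data.begin(), data.end(), signature.begin(), signature.end() );
    return it == data.end() ? kNotFound : static_cast< std::size_t >( it - data.begin() );
}
```

In an SFX executable the offset found this way is where the archive payload starts, after the executable stub. Treating the file as an executable instead corresponds to ignoring that payload and reading the wrapper from offset 0.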
-
🆕 Root-folder extraction (since beta): extract the contents of an archive's single top-level folder without knowing its name in advance. (#331)
  - BitInputArchive::rootFolder() returns the single common top-level folder shared by every item, or an empty string when there is no common root.
  - extractRootFolderContent() / extractRootFolderContentTo() (on BitInputArchive, BitArchiveReader, and BitExtractor) strip that folder's prefix so its contents land directly in the output directory, reusing the already-open archive. Archives without a single root folder throw a BitException.

    BitArchiveReader reader{ lib, "node-v24.16.0-win-x64.zip", BitFormat::Zip };
    reader.extractRootFolderContentTo( "C:/Program Files/NODE.JS" ); // drops the top "node-..." folder
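The two operations described above (finding the single common root, then stripping it) can be sketched independently of bit7z as follows. This is an illustrative assumption about the logic, not the library's code: `Item`, `commonRootFolder`, and `stripRoot` are hypothetical names, and paths are assumed to be non-empty, relative, and separated by '/'.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Minimal stand-in for an archive entry (hypothetical type).
struct Item {
    std::string path; // relative path inside the archive, '/'-separated
    bool isDir;       // true for folder entries
};

// Returns the top-level folder shared by every item, or "" when there is none.
inline std::string commonRootFolder( const std::vector< Item >& items ) {
    std::string root;
    for ( const auto& item : items ) {
        const auto separator = item.path.find( '/' );
        if ( separator == std::string::npos && !item.isDir ) {
            return {}; // a file sitting at the top level: no common root folder
        }
        const std::string first = item.path.substr( 0, separator );
        if ( root.empty() ) {
            root = first; // first item decides the candidate root
        } else if ( first != root ) {
            return {}; // a second top-level name: no single root
        }
    }
    return root;
}

// Removes the "root/" prefix from a path known to live under `root`.
// The root folder entry itself maps to an empty path.
inline std::string stripRoot( const std::string& path, const std::string& root ) {
    assert( !root.empty() );
    if ( path == root ) {
        return {};
    }
    assert( path.compare( 0, root.size() + 1, root + '/' ) == 0 );
    return path.substr( root.size() + 1 );
}
```

For an archive containing `node/`, `node/bin/node.exe`, and `node/README.md`, the root is `node` and the stripped paths are `bin/node.exe` and `README.md`, which is the layout that ends up in the output directory.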
-
🆕 Locked-file compression (since beta): the new
setStoreOpenFiles() setting on BitAbstractArchiveCreator opens files with shared read/write access on Windows, mirroring 7-Zip's -ssw switch, so you can archive files that another process holds open for writing instead of failing with "access denied". It has no effect on non-Windows platforms. (#329)
New Features
- Bit7zLibrary::useLargePages(): replaces the deprecated setLargePageMode().
- BitArchiveItem::rawPath() / nativeName(): access item paths and names in their raw/native string forms.
- BitArchiveReader: nested archive constructors: open nested archives directly from a parent BitInputArchive.
- BitArchiveReader: ArchiveStartOffset constructors: open archives embedded in the middle of a file (e.g., self-extracting archives).
- BitArchiveReader::archiveProperties(): access archive-level format properties.
- BitArchiveReader::itemsMatching(): get items matching a wildcard pattern.
- BitError::NoMatchingFile: new error code for filter/regex extraction finding no matching item.
- BitExtractor::extractFolder(): extract a specific folder from an archive.
- BitExtractor::extract() with RenameCallback: rename items on the fly during extraction.
- BitFileCompressor::compress(vector<pair<path, alias>>): compress files using explicit in-archive path aliases. (#313)
- BitIndicesView: lightweight, non-owning span of item indices; implicitly constructible from a single index, vector, array, or initializer list.
- BitInputArchive::extractFolderTo(): extract a single folder from an archive.
- sevenzip_string type alias and to_native_string() conversion functions.
- 🆕 EncryptionScope enum (since beta): setPassword() now takes a strongly typed EncryptionScope instead of a bool cryptHeaders, making it explicit whether only data, or data and headers, are encrypted. The old bool overload is deprecated and forwards to the new one.
- 🆕 ArchiveStartOffset for nested-archive subfiles (since beta): control how the start of a nested archive is located when opening a subfile that is itself an archive (e.g. ArchiveStartOffset::None to search for the header within the subfile's stream).
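For readers unfamiliar with wildcard filtering of the kind itemsMatching() exposes, here is a minimal, library-independent sketch of a matcher supporting `*` (any run of characters) and `?` (exactly one character). The function name `wildcardMatch` is hypothetical, and the exact semantics are an assumption: in particular, `*` here also matches path separators, which may or may not be how bit7z treats them.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns true when `text` matches `pattern`, where '*' matches any (possibly empty)
// run of characters and '?' matches exactly one character.
inline bool wildcardMatch( const std::string& pattern, const std::string& text ) {
    constexpr std::size_t none = static_cast< std::size_t >( -1 );
    std::size_t p = 0;           // current position in the pattern
    std::size_t t = 0;           // current position in the text
    std::size_t starPos = none;  // position of the most recent '*' in the pattern
    std::size_t starText = 0;    // text position where that '*' currently stops matching

    while ( t < text.size() ) {
        if ( p < pattern.size() && pattern[ p ] == '*' ) {
            // Remember the star and first try to let it match nothing.
            starPos = p++;
            starText = t;
        } else if ( p < pattern.size() && ( pattern[ p ] == '?' || pattern[ p ] == text[ t ] ) ) {
            ++p;
            ++t;
        } else if ( starPos != none ) {
            // Mismatch: backtrack and let the last star swallow one more character.
            p = starPos + 1;
            t = ++starText;
            assert( starText <= text.size() );
        } else {
            return false;
        }
    }
    // Any trailing stars can match the empty remainder.
    while ( p < pattern.size() && pattern[ p ] == '*' ) {
        ++p;
    }
    return p == pattern.size();
}
```

Under these assumed semantics, `*.txt` matches `docs/readme.txt` but not `a.txt.bak`, and `file?.log` matches `file1.log` but not `file12.log`.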
Improvements
- Compression indexing performance: BitItemsVector now stores items by value instead of via std::unique_ptr, eliminating one heap allocation per file when indexing items for compression. The improvement scales with the number of files being compressed.
- Filesystem performance: in addition to using native I/O APIs, bit7z now makes fewer unnecessary filesystem accesses, which in some cases degraded performance because of real-time scanning by antivirus software such as Windows Defender (#320).
- Atomic in-place updates: archive updates now write to a temporary file and rename atomically on success.
- Error reporting: new OpenError category for richer archive-opening failure messages.
- MinGW: added Win32Category error category for correct std::error_code handling.
- String conversion: improved UTF-8 <-> UTF-16 correctness; to_tstring() is zero-copy when the input is already a tstring. (#276)
- Performance: callback getters, password(), and itemProperties() return const references; file path retrieval in extract callbacks is skipped when no FileCallback is set; regex extraction accepts pre-compiled tregex objects.
- Default 7-Zip version: updated to 26.01.
- 🆕 Clearer archive-opening errors (since beta): in some cases, 7-Zip handlers return S_FALSE without an error flag (e.g. the PE handler rejecting an executable with trailing data); bit7z now maps such results to OpenError::IsNotArc ("Invalid archive, or wrong format used.") instead of an opaque HRESULT.
- 🆕 More faithful "not found" mapping (since beta): filter/lookup misses (BitError::NoMatchingItems / NoMatchingFile) map to ERROR_NOT_FOUND, and the generic "no such file" fallback maps to ERROR_FILE_NOT_FOUND rather than the misleading ERROR_PATH_NOT_FOUND.
- 🆕 Improved noexcept correctness (since beta): more non-throwing public functions are now marked noexcept.
- 🆕 Consistent path flattening (since beta): when retainDirectories() is disabled, nested structures below the selected extraction folder are now flattened consistently for all FolderPathPolicy values.
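The atomic in-place update pattern mentioned in the list above (write to a temporary file, then rename over the original on success) can be sketched with the C++17 standard library as follows. This is a generic illustration, not bit7z's code: `replaceFileAtomically` and the `.tmp` suffix are hypothetical, and the atomicity of the final rename ultimately depends on the operating system and on both paths being on the same volume.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

// Replaces the content of `target` without ever leaving a half-written file in its place.
// Returns false (and leaves the original untouched) when writing or renaming fails.
inline bool replaceFileAtomically( const std::filesystem::path& target, const std::string& content ) {
    assert( !target.empty() );

    // Write next to the target so that the rename stays on the same volume.
    std::filesystem::path temp = target;
    temp += ".tmp"; // hypothetical naming scheme

    {
        std::ofstream out{ temp, std::ios::binary | std::ios::trunc };
        if ( !out ) {
            return false;
        }
        out << content;
        out.flush();
        if ( !out ) {
            out.close();
            std::error_code ignored;
            std::filesystem::remove( temp, ignored );
            return false;
        }
    } // the stream is closed here, before the rename

    std::error_code ec;
    std::filesystem::rename( temp, target, ec ); // replaces an existing target
    if ( ec ) {
        std::error_code ignored;
        std::filesystem::remove( temp, ignored );
        return false;
    }
    return true;
}
```

The benefit of this pattern is that a crash or an error midway leaves the original archive intact: readers see either the old file or the complete new one, never a partially written mix.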
Build & Packaging 🆕 (since beta)
- Installable package + find_package(bit7z): new install/export rules generate bit7zConfig.cmake and export targets so consumers can find_package(bit7z) and link bit7z::bit7z (or bit7z::bit7z64). Gated behind the new BIT7Z_INSTALL option (ON by default).
  - This removes the need for downstream install patches (e.g. in vcpkg). (#96)
- System dependency resolution: the new BIT7Z_USE_SYSTEM_DEPENDENCIES option (OFF by default) resolves 7-Zip and ghc::filesystem via find_package() first, falling back to CPM.cmake downloads, improving integration with system package managers like vcpkg. (#96)
- Warnings-as-errors is now opt-in: /WX and -Werror were previously always on, breaking consumers and package managers on newer/different compilers. They are now gated behind the new BIT7Z_WARNINGS_AS_ERRORS option, defaulting to BIT7Z_BUILD_TESTS (ON for contributor/CI builds, OFF for plain consumers).
- No more bit7z::fs namespace clash: the public bit7z::fs namespace was replaced with a top-level bit7zfs alias. As a sibling of bit7z, it is never pulled in by using namespace bit7z;, so it can no longer clash with a user's own fs alias.
- Dependency bumps: CPM.cmake updated to 0.42.3; the test data set updated to bit7z-test-data 1.18.
Bug Fixes
- Fixed build on non-Windows platforms with BIT7Z_USE_NATIVE_STRING=ON.
- Fixed build with the C++20 standard.
- 🆕 Fixed current-directory path resolution when extracting folders.
Deprecated
- Bit7zLibrary::setLargePageMode() -> use useLargePages().
- 🆕 setPassword(password, bool cryptHeaders) -> use the EncryptionScope overload.
- 🆕 The legacy (index, path) RenameCallback signature -> use the BitArchiveItem-based callback (LegacyRenameCallback is kept for backward compatibility and will be removed in the future).
Delivered Since the Beta
The beta's "Planned Before Stable Release" list has been addressed:
- ✅ Improved test coverage: new and expanded tests for BitFileCompressor, BitArchiveWriter (compression callbacks, buffer/stream start-offset constructors), BitArchiveEditor (rename/update/delete/add), BitOutputArchive, BitInputArchive (lookup by name, progress-callback abort), the storeOpenFiles option, and the SFX open paths.
- ✅ Code cleanup: numerous clang-tidy fixes, early-return refactors, tighter integer/enum types, and consistent formatting across the codebase.
- ✅ Small features that didn't make the beta: locked-file compression, SFX support, the root-folder API, the EncryptionScope enum, richer RenameCallback context, and improved error reporting (see above).
- ✅ Documentation: new-feature docs and a refined nested-archive description in the README, added Doxygen comments to public headers, a documented SymlinkPolicy enum, corrected vcpkg usage instructions, a new SAST Tools section, and an updated bug-report template.
- ◑ Performance: this cycle focused on the above; only minor incidental performance touches landed (e.g. root-content extraction avoids per-item fs::path work). The residual gap with v3, partly intrinsic to v4.1's extra security validation, is targeted for future releases.
Note
This release includes all improvements, patches and fixes introduced in the v4.0.x series up to and including v4.0.12, and in the v4.1.0-beta.
Full Changelog: v4.1.0-beta...v4.1.0
Binaries built using the default options with Clang 13, GCC 9, MinGW 8, and MSVC 2015, 2017, 2019, and 2022 👇


