Swiss army knife for backup and disaster recovery, like 7z or RAR on steroids, with deduplicated "snapshots" (versions). Conceptually similar to Apple's Time Machine, but much more efficient. A fork of zpaq 7.15.
| Platform | OS package | Version | Video |
|---|---|---|---|
| Windows 32/64bit (sourceforge) | | | |
| Windows 32 (direct) | `zpaqfranz32 upgrade -force` | latest | |
| Windows 64 (direct) | `zpaqfranz upgrade -force` | latest | |
| Windows 64 (HW accelerated) | `zpaqfranzhw upgrade -force` | latest | |
| OpenBSD | `pkg_add zpaqfranz` | | |
| FreeBSD | `pkg install zpaqfranz` | 60.3a | |
| MacOS | `brew install zpaqfranz` | | |
| OpenSUSE | `sudo zypper install zpaqfranz` | | |
| Debian (Ubuntu etc) .deb | `sudo apt install zpaqfranz` | 59.8j / 60.7 | Desktop |
| Linux generic (32/64) | | 60.7u | |
| Arch | AUR user repository (latest) | 58.10i | Terminal |
| Solaris 11-64 | | 60.7u | |
| OmniOS 64 | | 60.7u | |
| NAS (Intel, Synology...) | | 60.7u | |
| NAS (armv8, QNAP...) | | 60.7u | |
| NAS (arm Cortex-C57, QNAP TS433...) | | 60.7u | |
| Haiku | | 60.7u | |
| ESXi | | 60.7u | |
| Freeware GUI for Windows | | latest | |
| Third Party Python software | | | |
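Once installed, a first archive can be created with the standard zpaq-style syntax. A minimal sketch (the archive name and source path are hypothetical):

```shell
# add (or update) the contents of /home/user/docs into backup.zpaq,
# creating a new deduplicated version on every run
zpaqfranz a /tmp/backup.zpaq /home/user/docs

# list the archive's contents
zpaqfranz l /tmp/backup.zpaq
```

Re-running the same `a` command later adds only the changed data as a new version.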
Classic archivers (tar, 7z, RAR etc.) are obsolete for repeated backups (daily etc.) compared to the ZPAQ technology, which maintains "snapshots" (versions) of the data. This is even more true in the case of ASCII dumps of databases (e.g. MySQL/MariaDB).
Let's see: archiving a folder five times, simulating a daily Monday-to-Friday run, with 7z
7Z-1.mp4
Same, but with zpaqfranz
Zpaq-1.mp4
As you can see, the five "daily" .7z backups take ~5x the space of the .zpaq archive.
A more realistic example shows the difference better:
a physical Xeon machine (small file server) with 8 cores, 64GB RAM and NVMe disks, plus a Solaris-based NAS on 1Gb Ethernet.
Rsync update from filesystem to filesystem (real speed)
Rsync.Backup-1.mp4
Rsync update to Solaris NAS (real speed)
Rsync.Nas-1.mp4
Backup update from file system with zpaqfranz (real speed)
Zpaq.Backup-1.mp4
Backup update via zfsbackup (real speed)
Zfs.Backup-1.mp4
At every run, only data changed since the last execution is added, creating a new version (the "snapshot"). It is then possible to restore the data at any single version, just like snapshots in zfs or virtual machines, but at the single-file level.
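The add-and-restore cycle can be sketched like this (archive name, paths and version number are hypothetical; `-until` selects the version, `-to` the destination):

```shell
# Monday..Friday: each run adds only the changed data as a new version
zpaqfranz a /backup/daily.zpaq /home/user/work

# restore the files exactly as they were in version 3 (e.g. Wednesday)
zpaqfranz x /backup/daily.zpaq -until 3 -to /tmp/restore_wednesday
```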
- Keeps a forever copy (even thousands of versions), conceptually similar to Apple's Time Machine, but much more efficient.
- Ideal for virtual machine disk storage (e.g. backups of vmdk), virtual disks (VHDx) and even TrueCrypt containers.
- Easily handles millions of files and tens of TBs of data.
- Allows rsync (or zfs replica) copies to the cloud with minimal data transfer and encryption.
- Multiple possibilities of data verification: fast, advanced and even paranoid.
- Some optimizations for modern hardware (SSD, NVMe, multithreading).
- By default a triple check with "chunked" SHA-1, XXHASH64 and CRC-32 (!).
For an even higher level of paranoia, it is possible to use other hash algorithms, such as
- MD5
- SHA-1 of the full-file (NIST FIPS 180-4)
- XXH3-128
- BLAKE3 128
- SHA-2-256 (NIST FIPS 180-4)
- SHA-3-256 (NIST FIPS 202)
- WHIRLPOOL (ISO/IEC 10118-3)
- HIGHWAY (64,128,256) ...And much more.
No complex (and fragile) repository folders with hundreds of "whatever": just a single file!
It is often important to copy the %desktop% folder, Thunderbird's data, %download%, and in general the data folders of a Windows system, leaving out the programs.
Real speed (encrypted) update of C: without software (-frugal)
vss.mp4
In this case the space used is obviously larger, as is the execution time, but even the "most difficult" folders are included. The bitmap of occupied clusters is deliberately ignored: if you are paranoid, be paranoid all the way down!
It is just like dd. You cannot (for now) restore with zpaqfranz: you have to extract to a temporary folder and then use other software (e.g. 7z, OSFMount) to extract the files directly from the image.
Accelerated speed (encrypted) every-sector update of a 256GB C: @ ~150MB/s
dd.mp4
AFAIK of course
10+ years of development (2009-now).
Who did that?
One of the world's leading scientists in compression.
No, not me, but this guy: ZPAQ - Wikipedia
When?
From 2009 to 2016.
Where?
On a Russian compression forum, one of the most famous, but obviously super-niche.
Why is it not known as 7z or RAR, despite being enormously superior?
Because of a lack of users who... try it!
Who are you?
A user (and a developer) who has proposed and made various improvements that have been implemented over the years. When the author left the project, I made my fork to make the functions I need as a data storage manager.
Why is it no longer developed? Why should I use your fork?
Because Dr. Mahoney is now retired and no longer supports it (he... ran!)
Why should I trust it? It will be one of 1000 other programs that silently fail and give problems.
As the Russians (and Italians) say: trust, but verify.
Archiving data requires safety. How can I be sure that I can later extract it without problems?
It is precisely the portion of the program that I have evolved, implementing a barrage of checks up to the paranoid level, and more. Let's say there are verification mechanisms you have probably never seen. Do you want to use SHA-2/SHA-3 to be very confident? You can.
Accelerated video of real-world archive testing at >1GB/s
test.mp4
ZPAQ (zpaqfranz) allows you to NEVER delete stored data, which remains available forever (in practice you typically start from scratch every 1,000 or 2,000 versions on HDD, for speed reasons; 10K+ on SSD), and to restore the files present in each archived version, even one from a month or three years ago.
Real-speed update (on a QNAP NAS) of a small server (300GB); ~7GB of Thunderbird mbox becomes ~6MB (!) in ~4 minutes.
update-nas.mp4
In this "real world" example (a ~500.000 files / ~500GB file server of a mid-sized enterprise), you will see 1042 "snapshots" stored in a 877GB archive.
```
root@f-server:/copia1/copiepaq/spaz2020 # zpaqfranz i fserver_condivisioni.zpaq
zpaqfranz v51.27-experimental snapshot archiver, compiled May 26 2021
fserver_condivisioni.zpaq:
1042 versions, 1.538.727 files, 15.716.105 fragments, 877.457.003.477 bytes (817.20 GB)
Long filenames (>255) 4.526
Version(s) enumerator
-------------------------------------------------------------------------
< Ver  > < date >  < time >  < added > <removed>   <    bytes added   >
-------------------------------------------------------------------------
00000001 2018-01-09 16:56:02 +00308608 -00000000 ->   229.882.913.501
00000002 2018-01-09 18:06:28 +00007039 -00000340 ->        47.356.864
00000003 2018-01-10 15:06:25 +00007731 -00000159 ->         7.314.709
00000004 2018-01-10 15:17:44 +00007006 -00000000 ->           612.584
00000005 2018-01-10 15:47:03 +00007005 -00000000 ->           611.980
00000006 2018-01-10 18:03:08 +00008135 -00000829 ->     2.698.417.427
(...)
00000011 2018-01-10 19:20:30 +00007007 -00000000 ->           613.273
00000012 2018-01-11 07:00:36 +00007008 -00000000 ->           613.877
(...)
00000146 2018-03-27 17:08:39 +00001105 -00000541 ->       164.399.767
00000147 2018-03-28 17:08:28 +00000422 -00000134 ->       277.237.055
00000148 2018-03-29 17:12:02 +00011953 -00011515 ->       826.218.948
(...)
00001039 2021-05-02 17:17:42 +00030599 -00031135 ->    12.657.155.316
00001040 2021-05-03 17:14:03 +00000960 -00000095 ->       398.358.496
00001041 2021-05-04 17:13:40 +00000605 -00000004 ->        95.909.988
00001042 2021-05-05 17:15:13 +00000579 -00000008 ->        82.487.415
54.799 seconds (all OK)
```
Do you want to restore @ 2018-03-28?
00000147 2018-03-28 17:08:28 +00000422 -00000134 -> 277.237.055
Version 147 =>
zpaqfranz x ... -until 147
Do you want 2021-03-05? Version 984 =>
zpaqfranz x ... -until 984
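For instance, to restore the file server above as it was on 2018-03-28 (version 147) into a scratch directory (the destination path is hypothetical):

```shell
# extract everything as of version 147 into a temporary folder
zpaqfranz x fserver_condivisioni.zpaq -until 147 -to /tmp/restore_2018-03-28
```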
Another real-world example: ~4900 versions, from mid-2017
```
zpaqfranz v51.10-experimental journaling archiver, compiled Apr  5 2021
franz:use comment
old_aserver.zpaq:
4904 versions, 385.830 files, 3.515.679 fragments, 199.406.200.193 bytes (185.71 GB)
Version comments enumerator
------------
00000001 2017-08-16 19:26:15 +00090863 -00000000 ->    79.321.339.869
00000002 2017-08-17 13:29:25 +00000026 -00000000 ->           629.055
00000003 2017-08-17 13:30:41 +00000005 -00000000 ->            18.103
00000004 2017-08-17 14:34:12 +00000005 -00000000 ->            18.149
00000005 2017-08-17 15:28:42 +00000008 -00000000 ->            99.062
00000006 2017-08-17 19:30:03 +00000008 -00000000 ->         1.013.616
00000007 2017-08-18 19:33:14 +00000021 -00000001 ->         2.556.335
00000008 2017-08-19 19:29:23 +00000025 -00000000 ->         1.377.082
00000009 2017-08-20 19:29:56 +00000002 -00000000 ->            24.153
00000010 2017-08-21 19:34:35 +00000031 -00000000 ->         2.554.582
(...)
00004890 2021-02-16 16:40:51 +00000190 -00000005 ->        99.051.540
00004891 2021-02-16 19:30:17 +00000065 -00000006 ->        16.467.364
00004892 2021-02-17 19:34:04 +00000381 -00000257 ->        95.354.305
(...)
00004900 2021-02-25 19:35:47 +00000755 -00000611 ->       132.241.557
00004901 2021-02-26 19:57:16 +00000406 -00000253 ->       122.669.868
00004902 2021-02-27 20:33:45 +00000029 -00000002 ->        12.677.932
00004903 2021-02-28 20:34:00 +00000027 -00000001 ->         6.978.088
00004904 2021-03-01 20:33:52 +00000174 -00000019 ->        77.113.147
```
until 2021 (4 years later)
This is a ~200GB server
```
(...)
- 2019-09-23 10:14:44  2.943.578.106 0666 /tank/mboxstorico/inviata_spazzatura__2017_2018
- 2021-02-18 10:16:25      4.119.172 0666 /tank/mboxstorico/inviata_spazzatura__2017_2018.msf
- 2019-10-25 15:39:15  1.574.715.392 0666 /tank/mboxstorico/nstmp
- 2020-11-28 20:33:22      2.038.165 0666 /tank/mboxstorico/nstmp.msf
- 2021-02-25 17:48:11          8.802 0644 /tank/mboxstorico/sha1.txt
214.379.664.412 (199.66 GB) of 214.379.664.412 (199.66 GB) in 154.975 files shown
```
So for ~4900 versions you would need 200GB × 4900 ≈ 980TB with something like tar, 7z or RAR (yes, 980 terabytes), versus ~200GB (yes, 200GB) with zpaq.
The same goes for virtual machines (vmdks).
Because other software (sometimes very, very good) relies on complex "repositories", very fragile and way too hard to manage (at least for my taste).
It may happen that you have to worry about backing up... the backup, because maybe some files were lost during a transfer, got corrupted, etc.
If it's simple, maybe it will work
Obviously this is not "magic", it is simply the "chaining" of a block deduplicator with a compressor and an archiver. There are faster compressors. There are better compressors. There are faster archivers. There are more efficient deduplicators.
But what I have never found is a combination of these that is so simple to use and reliable, with excellent handling of non-Latin filenames (Chinese, Russian etc).
This is the key: you don't have to use a complex "pipe" of tar | srep | zstd | something, hoping that everything runs fine, but a single ~4MB executable with 7z-like commands.
You don't even have to install a complex program with many dependencies that will have to read a folder (the repository) with maybe thousands of files, hoping that they are all fully functional.
There are also many great features for backup; I mention only the greatest.
The ZPAQ file is append-only: it is never modified.
So rsync --append will copy only the portion actually added, for example over an ssh tunnel to a remote server, or to a local NAS (QNAP etc.), in tiny times.
TRANSLATION
You can pay ~$4 a month for 1TB cloud-storage-space to store just about everything
You don't have to copy or synchronize, say, 700GB of tar.gz, 7z or whatever, but only (say) the 2GB added in the last run; the first 698GB are untouched.
This opens up the concrete possibility of using VDSL connections (upload ~2-4MB/s) to back up even virtual servers of hundreds of gigabytes in a few minutes.
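A typical remote sync can be sketched as follows (host and paths are hypothetical): since the .zpaq archive only grows, rsync's `--append` transfers just the newly added tail.

```shell
# first run copies everything; later runs transfer only the appended bytes
rsync -av --append /backup/daily.zpaq backupuser@nas.local:/volume1/backups/
```

Note that `--append` assumes the existing remote portion is identical; for extra safety, `--append-verify` re-checks the already-transferred part.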
In this (accelerated) video, the rsync transfer of 2 remote backups: a "standard" .zpaq archive (file level) AND a zfsbackup (bit level), for one day's work on a small real-world server.
download.mp4
Bonus: for a developer it's just like "super git versioning".
Just put a zpaq-save-everything at the top of the makefile and you will keep all the versions of your software, even with libraries, SQL dumps etc. A single archive keeps everything, forever, with just one command (or two, to verify).
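Such a "save everything" step can be sketched as two commands (the archive path is hypothetical; `a` adds a new version, `t` tests the archive):

```shell
# snapshot the whole project tree (sources, libraries, SQL dumps...) as a new version
zpaqfranz a ../project-history.zpaq .

# optional second command: verify the archive
zpaqfranz t ../project-history.zpaq
```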
Defects?
Some.
The main one is that listing files is not very fast when there are many versions (thousands), due to the structure of the archive file format. I could get rid of it, but at the cost of breaking the backward compatibility of the file format, so I don't want to. On 52+ there is a workaround (-filelist).
It is not the fastest tool out there, with real-world performance of 80-200MB/s (depending on the case and hardware, of course). Not a big deal for me (I have very powerful hardware and/or run nightly cron tasks).
Extraction can require a number of seeks (due to various deduplicated blocks), which can slow down extraction on magnetic disks (but not on SSDs).
If you have plenty of RAM, it is now possible to bypass this with the w command.
No other significant ones come to mind, except that it is known and used by few.
Very hard to use?
It is a tool for power users and administrators, who are used to the command line. A text-based GUI is being developed to make data selection and complex extraction easier (!).
In this example we want to extract all the .cpp files as .bak from the 1.zpaq archive. This is something you typically cannot do with other archivers such as tar, 7z, rar etc.
- First the f key (find), entering .cpp
- Then s (search): every .cpp substring
- Then r (replace): with .bak
- Then t (to): the z:\example folder
- Finally x, to run the extraction
gui.mp4
I do not trust you, but I am becoming curious. So?
On FreeBSD you can try to build the port (paq, inside archivers), but it is very, very, very old (v6.57 of 2014).
You can get a "not too old" zpaqfranz with pkg install zpaqfranz
On OpenBSD, pkg_add zpaqfranz is usually rather up to date.
On Debian there is a zpaq 7.15 package, and starting with Debian 13 zpaqfranz is available too.
You can download the original version (7.15 of 2016) directly from the author's website and compile it, or get the same sources from GitHub.
In that case be careful, because the source is split into three files, but compilation is nothing difficult.
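Assuming the three 7.15 files (zpaq.cpp, libzpaq.cpp and the libzpaq.h header) are in the current directory, a Unix build is typically something like:

```shell
# compile the original zpaq 7.15 (flags may vary per platform)
g++ -O3 -Dunix zpaq.cpp libzpaq.cpp -pthread -o zpaq
```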
OK, let's assume I want to try out zpaqfranz. How?
From branch 51, all source code is merged into a single zpaqfranz.cpp, aiming to make it as easy as possible to compile on "strange" systems (NAS, vSphere etc.).
Updating, compilation and Makefile are now trivial.
My main development platforms are Intel Windows (non-Intel Windows on ARM is currently unsupported) and FreeBSD.
I rarely use Linux, MacOS or anything else (for compiling), so fixes may be needed.
As explained, the program is a single source file; be careful to link the pthread library.
You need it for ESXi too, even if it doesn't work. Don't be afraid, zpaqfranz knows!
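On a typical Linux or FreeBSD box, the single-file build looks something like this (drop -static if your libc does not support static linking):

```shell
g++ -O3 -Dunix zpaqfranz.cpp -o zpaqfranz -pthread -static
```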