ten megabytes per minute #2
After the unit tests all passed, and the `-n` switch was changed to mean "don't actually burn discs," I ran a test backup of a directory containing a rip of a CD. 700MiB of data, on 30MiB "discs," with no compression, with AES encryption and redundancy, took... well, I forgot what, but the rate worked out to ten megabytes per minute, and subjectively it looked like most of the time was spent running par2. And this was not even counting the time it takes to burn discs. My aim is for darbrrb to back up hundreds of gigabytes of data, and to do it in less than a week.
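To put the goal in numbers, a back-of-the-envelope sketch (300 GB is an illustrative stand-in for "hundreds of gigabytes"):

```python
# Back-of-the-envelope: how far is 10 MB/min from the goal?
size_mb = 300 * 1000                       # 300 GB, illustrative
observed_rate_mb_per_min = 10              # rate measured in the test above
days_at_current_rate = size_mb / observed_rate_mb_per_min / 60 / 24
needed_rate = size_mb / (7 * 24 * 60)      # rate to finish in one week
print(f"{days_at_current_rate:.1f} days at 10 MB/min")   # ~20.8 days
print(f"{needed_rate:.1f} MB/min to finish in a week")   # ~29.8 MB/min
```

So the observed rate is roughly a factor of three too slow for even the low end of the goal.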
par2-tbb is par2cmdline modified to use Intel Threading Building Blocks. It seems this software can work on multiple files at once. But I've never heard of TBB and I'm not keen on using it.
Gibraltar uses CUDA to perform Reed-Solomon encoding and decoding on a GPU rather than the CPU. Besides the impressive results, another great thing about this paper is the quick introduction it gives to how Reed-Solomon coding is commonly done on CPUs.
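For reference, a minimal sketch of that common CPU approach: multiplication in GF(2^8) via log/antilog lookup tables. The 0x11d polynomial is the conventional choice for GF(2^8); the names here are illustrative, not taken from par2cmdline or Gibraltar.

```python
# Table-driven GF(2^8) multiplication, the usual CPU technique.
PRIM_POLY = 0x11d          # x^8 + x^4 + x^3 + x^2 + 1, conventional for GF(2^8)
gfexp = [0] * 256          # antilog table: gfexp[i] = alpha^i
gflog = [0] * 256          # log table: gflog[alpha^i] = i

x = 1
for i in range(255):
    gfexp[i] = x
    gflog[x] = i
    x <<= 1                # multiply by the generator alpha (= 2)
    if x & 0x100:          # reduce modulo the primitive polynomial
        x ^= PRIM_POLY

def gf_mul(a, b):
    """Multiply in GF(2^8): add discrete logs, reduce mod 255."""
    if a == 0 or b == 0:
        return 0
    return gfexp[(gflog[a] + gflog[b]) % 255]
```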
FFLAS appears to be an optimized way to do finite-field matrix and vector operations. It may offer a more optimized means of doing the Reed-Solomon arithmetic than par2cmdline presently has.
I feel an itch to make a par2 utility in Haskell. I'm trying to ignore it and think toward working with par2cmdline instead, but the PAR2 format is so well specified that writing another program that produces it is tempting. par2cmdline is written in about 13,000 lines of C++. I think a PAR2 utility could be written in fewer lines of Haskell, but FFLAS has no Haskell bindings, and PARI doesn't seem to have any yet either, so there would be a detour to get tools ready.
Looked at FFLAS and par2cmdline's reedsolomon.cpp and galois.h. reedsolomon.cpp is right down in the weeds, xoring things, creating lookup tables, and so on. That's what would need to be replaced with calls to FFLAS, I think (the sketch after this comment shows the shape of that computation).

Looked at PARI documentation. It was suggested because it's not made out of C++ templates, which would be difficult to bind to Haskell. But instead of embodying the complexities of types in C++ templates, it swallows those complexities into its C code and interfaces. Not something that can be easily and quickly bound to Haskell.

Those StackOverflow questions linked above were asked by Johannes Weiß, who appears to have settled on NTL for doing whatever he was doing with finite fields.

A possibility I've overlooked so far is not binding the whole finite-field library to Haskell, but writing a Reed-Solomon algorithm using the library and binding that into Haskell. That would be hard enough for me.
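To make that concrete: Reed-Solomon encoding is a coding matrix times the data vector over the Galois field, which is exactly the kind of operation FFLAS specializes in. A hypothetical sketch, using GF(2^8) for brevity (PAR2 itself uses GF(2^16)); the matrix choice and names are illustrative, not PAR2's actual layout:

```python
# Reed-Solomon encoding as a matrix-vector product over GF(2^8).
# This is the computation a call into FFLAS would replace.
PRIM_POLY, ORDER = 0x11d, 255
gfexp, gflog = [0] * 256, [0] * 256
x = 1
for i in range(ORDER):                  # same log/antilog tables as above
    gfexp[i], gflog[x] = x, i
    x <<= 1
    if x & 0x100:
        x ^= PRIM_POLY

def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else gfexp[(gflog[a] + gflog[b]) % ORDER]

def encode(coding_matrix, data):
    """parity[r] = sum over c of A[r][c] * data[c]; field addition is XOR."""
    parity = []
    for row in coding_matrix:
        acc = 0
        for coeff, d in zip(row, data):
            acc ^= gf_mul(coeff, d)
        parity.append(acc)
    return parity

# Vandermonde-style coding matrix, A[r][c] = alpha^(r*c):
# two parity symbols computed from four data symbols.
A = [[gfexp[(r * c) % ORDER] for c in range(4)] for r in range(2)]
print(encode(A, [0x12, 0x34, 0x56, 0x78]))
```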
I ran another test. First I made 200MB of random data, in 20 files. Then I backed it up using darbrrb, with 30MB discs, 3 data discs per set, 2 parity discs per set, 10 slices per disc. This took 770 seconds, and when I did it again, it took 770 seconds again. This was under Debian GNU/Linux i386. When I reinstalled the same machine with Debian GNU/Linux amd64, it took 593 and 590 seconds to do it twice more. These figures work out to something like 15 MB/min and 20 MB/min, respectively. Still not what they need to be, but better.
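For the record, the arithmetic behind those rates:

```python
# Sanity check of the rates quoted above: 200 MB of data per run.
for secs in (770, 593, 590):
    print(f"200 MB / {secs} s = {200 / secs * 60:.1f} MB/min")
# 15.6, 20.2, 20.3 -> "something like 15 MB/min and 20 MB/min"
```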
A careful look through the Gibraltar paper yields a footnote I overlooked before: "Gibraltar is available at http://www.cis.uab.edu/hpcl/gibraltar, along with sample applications to test it."
After a lengthy hiatus, I found that while PAR2 specifies the use of a 16-bit Galois field, Gibraltar uses an 8-bit Galois field. I didn't want to change Gibraltar, because the paper goes into detail about how they optimized its speed; but I didn't want to sprout my own file format either, because PAR2 has been established for 10 years.

I decided to make my own file format after all, found that 33 commits have happened to par2cmdline since I forked it, and saw that several of these are about how you can send par2cmdline into an infinite loop with a maliciously crafted par2 file. This made me realize that there may be more possibilities to deal with in making one's own file format than I reckoned on.

I hacked on Gibraltar anyway, and arrived at jaredjennings/libgibraltar@735f7c5. I found that the log/antilog tables necessary for 16-bit Galois field arithmetic, which are 65536 entries long instead of 256, and are made of 16-bit numbers rather than 8-bit numbers, don't fit in the shared memory of my graphics card (the arithmetic is below). I could be five feet away from a solution to that problem and not know it, because I haven't learned CUDA.

Now it appears that, to get fast redundancy calculations, I need my own file format and a tool to read and write it, and the best I can do is to look at the history and tests of par2cmdline in an attempt to avoid duplicating too much work.
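The size arithmetic, assuming the 16 to 48 KiB of shared memory per block that was typical for CUDA cards of that era (the exact limit depends on the card):

```python
# GF(2^16) log/antilog tables vs. CUDA shared memory.
entries = 2 ** 16                       # 65536 entries per table
bytes_per_entry = 2                     # 16-bit values
table_bytes = entries * bytes_per_entry
print(table_bytes // 1024, "KiB")       # 128 KiB per table, 256 KiB for both
# Compare GF(2^8): 256 one-byte entries = 256 bytes per table.
# With shared memory per block on the order of 16-48 KiB, even one
# GF(2^16) table cannot fit, while the GF(2^8) tables easily do.
```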
I thought I had a breakthrough with Gibraltar and par2cmdline, but then I found I didn't, and came back around to needing a new file format. After much handwringing, paper-reading, googling, and more paper-reading, I've found that cooler, newer codes exist than Reed-Solomon, which may be easier to compute. Nope, just kidding: they're only faster because you implement them in hardware. And there are so many patents around error-correcting codes that it looks like a minefield to try to implement good new software that will do it. I mean, some guy used Intel's SIMD operations (SSE2 or so, I assume) to make a speed breakthrough with Reed-Solomon, but then got a patent smackdown.

Then I realized that darbrrb uses none of the features PAR2 has over PAR1. I fetched parchive, and modified darbrrb slightly to use it instead of par2. I tested the resulting darbrrb on 1.5GB of random data, and found it to be around five and a half times faster than par2. Since I filed this ticket, I've gotten a new computer, so par2 now goes 34 MB/min; parchive went 184 MB/min. Ah, now I see the test above, and will run exactly that one.
OK - same test as above: with par2, 345 seconds; with parchive (PAR1), 60 seconds flat. That's 34.7 MB/min with par2 and 200 MB/min with parchive. (The discrepancy between this parchive figure and the one in the previous comment may be explained by parchive no longer being the dominant influence on the time taken; perhaps dar did something faster when it had smaller data. I don't know.)
OK - there have been several commits to go back to parchive (PAR1), most notably fd0092e. I'm going to call this fixed.