Synology unrar performance on atom - official binary vs spksrc compiled binary #847
Comments
On my DS413 with a PowerPC / QorIQ CPU I have a similar issue.
Hey Hubfront, take a look here: #809
Thanks for your advice. Nevertheless, I can also confirm that the Thecus N5550 does the extraction faster than my DS1513+ with the roshall binary, with the same RAID configuration and HDD setup. The problem is that I don't have the Thecus anymore, because I returned it to the reseller.
I also contacted Synology Support about this issue and got this answer:
BEGIN:
Hi Jaroslaw, Thank you for the reply. rar performance very depends on the content of the archive. Since user was not comparing all machines with the same archive file (5GB on DS and 12GB on Thecus), we cannot tell if this is normal. Please first have the user test with the 12GB archive file on DS, or at least test machines all with the same archive file, and see how it performs. So we can compare apples to apples.
END:
Not very helpful. In my opinion they should try the extraction on a standard notebook with a single HDD. Even there the unrar process is 3-4 times faster than on the DS with a roshall unrar binary. Strange thing.
Best regards,
Most likely, rarlab's unrar is optimized for that specific CPU type/architecture (x86). I would expect (though unconfirmed) that with the correct compiler flags, you should be able to compile an unrar version that has similar performance to the rarlab version. Give it a try :)
I updated the first post with two performance tests for completeness. overrunner: good to hear from you that rarlab also has a fast binary for PowerPC.
Dr-Bean, thanks for this advice. I have already tried to compile unrar within spksrc with optimized compiler flags. Well, it worked: I was able to compile and I replaced the binary. But I haven't noticed any performance boost from the newly compiled binary. Where can I look up the compiler flags optimized for Intel Atom Cedarview? Perhaps I have used the wrong ones. Regards,
Take a look at env.patch. Try and leave that part of the patch out and see if it compiles correctly/improves speed. Next step would be to change the Makefile to use that flag conditionally, based on $(ARCH).
For my Synology DS413 I now use the binary which comes with the IPKG package. It's version 5 and a little bit faster than the 5.01 from the rarlab page.
Using the version from the rarlab website did not work for me, but the IPKG unrar works like a charm.
IPKG recipe for unrar is here: http://svn.nslu2-linux.org/svnroot/optware/trunk/make/unrar.mk but I don't see anything obviously different.
I was looking for that, thanks ;)
So it turns out the flag is not set. Results, with a few seconds more or less between runs:
The results are generated from a modified version of this, which comes from here
Synology's DSM 5.0 Final ships with a fresh unrar 5.01.
I prefer not to depend on Synology's binaries for our own packages. This can prove useful sometimes, and we ensure there's no issue with the binaries as we include everything.
I'm compiling all arches to make sure the flag is supported on all of them... I only added x86 to the fix above to be on the safe side. If not, I'm wondering if there are downsides to adding it to the toolchains makefile... apart from the (possible) need to override it in some cases.
I think I removed it from the toolchains makefile because on some archs, too aggressive an optimization flag causes compilation errors. Another reason is that sometimes it is set upstream.
Results including -O3. It's just a couple milliseconds lower than -O2 (consistently), but increases compilation time a bit:
If compiling with -O2 completes successfully, I'll just remove that part of env.patch. Should be good enough.
Pushed the fix, it works on all arches :)
Dr-Bean, great work indeed and many thanks for this fix! I re-compiled with "ADDITIONAL_CXXFLAGS = -O2" in the Makefile and now unrar_5.01_spksrc_o2 performed 18 percent faster than the official Linux build from rarlab.com, which had been the fastest in my tests before. I updated my first post with the test results.
You might be interested in checking out -O3. If the percentages on my small rarset scale linearly, you could shave another minute off ;)
It's complicated for me. The compiler rejects the changes in the patch. ;-)
In env.patch, put a minus sign in front of
That should patch correctly
I should have asked earlier. ;-)
This is the result: with unrar_5.01_spksrc_o3 my 47 files were unpacked in 22min 05sec. This is around 6 percent faster than unrar_5.01_spksrc_o2, which needed 23min 29sec, and 24 percent faster than unrar_roshall. Definitely good :)
Minute and a half quicker? We're breaking records here ;)
Thank you for your effort!
-O3 compiles just fine so far, only x86 arches are left to compile, and that shouldn't be a problem. I'll merge the change one of these days.
@Dr-Bean -O3 works nicely on x86 for me :)
commit 4972b62 Author: filin20 <filin.20@gmail.com> Date: Sun Mar 16 11:46:33 2014 +0200 update btsync to 1.2.92
commit 27a1d40 Merge: feac255 2333f88 Author: filin20 <filin.20@gmail.com> Date: Sun Mar 16 11:33:32 2014 +0200 Merge remote-tracking branch 'upstream/develop' into develop
commit feac255 Author: filin20 <filin.20@gmail.com> Date: Sat Mar 15 13:59:22 2014 +0200 update btsynk to 1.2.91
commit 2333f88 Author: Dr-Bean <github@beanpoint.com> Date: Wed Mar 12 17:15:16 2014 +0100 Add compiler flag -O2 back to unrar, ref SynoCommunity#847
There is no harm in pushing nzbget too while you're at it. Both it and SABnzbd will greatly benefit from the optimization.
Hi Dr-Bean, thanks for the update. Should this thread be marked as done and closed now?
Yep, while I'm at it, I'll close it ;) I'll look into updating some packages in the next couple of days or so.
I just downloaded and compiled the newest version of unrar, v5.1beta1, with spksrc for my DS214play, after updating the digests file for the new hashes and the Makefile in cross/unrar/. And found it to be blazing fast! For the 47 files, unrar_5.1beta1_spksrc_o3 needed only 7min 30sec! That's 2.94 times the speed of the former number one, unrar_5.01_spksrc_o3 (unrar version 5.0.14), which needed 22min 04sec, i.e. a speed improvement of about 194 percent. To rule out possible systematic influences, e.g. the new DSM update DSM5update1 and possible changes to spksrc, I recompiled unrar_5.01_spksrc_o3 and found no significant change in speed: it needed 22min 10sec, which is a negligible difference and confirms that the speed increase of the new unrar 5.1beta1 comes solely from the improvements in its code. Btw. the new version also uses only one thread on my DS like the other binaries, so CPU usage of the process never exceeds 25%. So stay tuned for the release of unrar 5.1! I'm already using the beta for daily use, of course. :-)
Just a thought: are the par2 binaries fine, or are they not optimized with O2 like the unrar binary?
Good morning, regarding par2: the original par2 isn't able to handle multiple threads, so that's the reason why I use this one. Regards,
Hi @Jarosch, thank you for the par2 hint. I now use that multicore par2 tbb with sabnzbd and it's a lot faster than before. I just retested the rarset with 5.1.2 and it unpacked exactly as fast as 5.1.1: 7min 30sec. As said before, that's about three times as fast as 5.01 on the same rarset.
Downloading the set now. I'll have the script run the whole set, for each version of unrar I tested before, and we see what happens. Sneak preview, only
Tested a couple times, and on all of these runs, unrar 5.1.2 with |
@Dr-Bean, interesting, which processor platform do you use? I just tested unrar 5.1.2 with the O2 flag and the result is exactly the same as with the O3 flag, even for the complete set: 7min 30sec for the first 47 files. For the complete set of 71 parts, which is altogether about 10GB, unrar 5.1.2 (O3 flag) needed 11min 10sec, and unrar 5.1.2 (O2 flag) also needed 11min 10sec.
All tests have been run on a DSM 4.3, Bromolow-based VM. Seems you're right about O2 and O3 not making much of a difference. Here, O3 is usually faster, but the times vary too much for it to be a clear win: times vary between 5m17 and 5m35 for both.
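With run-to-run times spreading as widely as 5m17 to 5m35, one way to compare two builds is to repeat each extraction a few times and only call a winner when the ranges don't overlap. The following is a minimal sketch of that idea, not the script used in this thread: the function names are made up for illustration, and the placeholder command needs to be swapped for a real unrar call on your own rarset.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=3):
    """Run cmd `runs` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (fastest, median, spread) for a list of durations."""
    return (min(durations), statistics.median(durations),
            max(durations) - min(durations))


def clear_win(a, b):
    """True only if every run in a is faster than every run in b."""
    return max(a) < min(b)


if __name__ == "__main__":
    # Placeholder command so the sketch runs anywhere. A real call could look
    # like ["/path/to/unrar", "x", "-o+", "-inul", "set.part01.rar", "/tmp/out/"]
    # (both paths are hypothetical).
    cmd = [sys.executable, "-c", "pass"]
    fastest, median, spread = summarize(time_command(cmd, runs=3))
    print(f"fastest {fastest:.2f}s, median {median:.2f}s, spread {spread:.2f}s")
```

If both builds land anywhere between 317 and 335 seconds, clear_win returns False for either ordering, which matches the "not a clear win" reading above.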
All in all, the new version is quite an improvement, if you test with a proper set of files ;)
Yes, it is indeed. I also tested unrar on my DS214 yesterday, which is a 2-core armadaxp architecture. On this machine unrar 5.0.14 compiled with the O3 flag surprisingly performed a little faster than unrar 5.1.2 with the O3 flag, at least on this rarset. ;-) Ranking for DS214 (armadaxp, 2 cores): So it looks like the new version unrar 5.1.x is quite a boost for atom, although we can't say anything about the other arm or powerpc architectures yet. And the versions compiled with the O3 flag performed faster than the ones with the O2 flag on atom and armadaxp.
Ran your set on 88f6281 arch, so there's another ARM arch. In order:
So it looks like O3 wins, regardless of arch. Ran the script twice, 5.1.2 beat 5.0.14 twice, so I'm calling that good too ;)
Sabnzbd is ready to be updated then :)
Yep, the code is all set, except for a Changelog entry on major unrar speedup ;)
SAB has been updated, and it even has a neat changelog entry for unrar. Sidenote: If it's worth looking further into par2, let's open a new issue. It would be preferred to stick to one source, which we can compile for the various arches, not sure if that's possible though. In the meantime, I think we've tested everything there is to test on unrar. Thanks to all who have brought it to our attention, and have been testing with the various settings and versions.
@Dr-Bean Published.
@moneytoo Great, thanks a lot :)
@Dr-Bean regarding par2 there is par2cmdline+tbb, but unfortunately, ARM does not seem to be supported with tbb 2.2. If somebody is aware of another par2 project running on Linux and having multicore/multithreading support, that would be cool :)
Maybe somebody with a Synology Atom NAS might find this helpful. Thanks to the kind help of Diaoul and Dr-Bean I managed to compile spksrc unrar on Debian 7 32bit for the x686 evansport platform, to compare the performance of unrar compiled with spksrc against the Linux command-line version from http://www.rarlab.com/download.htm
In my test I found the roshall build more than 2.5 times faster than the spksrc build on a password-encrypted rarset with compression level store. I now use the roshall Linux build for nzbget and sabnzbd on my NAS.
Versions were 5.01 for both builds on a DS214play, DSM 4.3. Both versions used only 1 core, which in my case is actually 1 thread (2 physical cores with Hyper-Threading); accordingly the CPU usage of unrar was at 25%. With unrar 5.0 it is theoretically possible for unrar to use more than one core, although I never saw it happen on my NAS. ;-)
Owners of a Synology x64 NAS might find this post helpful: http://www.synology-forum.de/showthread.html?50070-NZBGET-Entpacken-schneller-machen
I unrared a 10GB file from 71 rarfiles, each 150MB in size. To finish 47 rarfiles, unrar_roshall needed 27min 16sec; unrar_spksrc needed 72min 55sec.
Update:
For completeness I also tested the version of unrar which comes with Synology DSM 4.3, which is version 3.80, and additionally a version of unrar which was included in my sabnzbd installation from SynoCommunity / spksrc, which is version 5.0. The rarfiles are exactly the same as above, and so is the NAS hardware.
To finish the 47 files, unrar_380_syn needed 30min 45sec and unrar_5.0_sab_package needed 75min 41sec.
It looks like Synology's own version of unrar has been optimized nicely for the hardware, but the roshall binary ran slightly faster in my test. The unrar included in the sabnzbd package (0.7.16-7) performed similarly to the binary I compiled myself.
After setting ADDITIONAL_CXXFLAGS = -O2 or -O3 respectively in Makefile (many thanks to Dr-Bean for this fix) :) :
This time running on DSM 5.0 final:
To finish the 47 files:
unrar_5.01_spksrc_o3 needed 22min 05sec
unrar_5.01_spksrc_o2 needed 23min 29sec (recompiled with CXXFLAGS = -O2 flag)
unrar_roshall needed 27min 36sec (see above; tested again to check for a possible systematic influence of DSM 5.0 on unrar performance)
unrar_5.01_syn_DSM5.0 needed 30min 02sec (comes with Synology DSM 5.0)
The re-compiled unrar_5.01_spksrc_o3 performed best of all tested unrar builds, unpacking around 6 percent faster than unrar_5.01_spksrc_o2 and 24 percent faster than unrar_roshall on the above rarset. The time difference for unrar_roshall between my tests on DSM 4.3 and DSM 5.0 was small (around 1 percent), so it is more plausibly random variation than a systematic influence of the DSM version.
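For anyone who wants to redo the percentages, here is a small sketch that turns the DSM 5.0 timings into "percent faster" figures, taking speed as 1 / time. The helper names are made up for illustration. With the 27min 36sec re-test figure for unrar_roshall the gain of the O3 build comes out at about 25 percent, close to the roughly 24 percent quoted.

```python
import re


def to_seconds(text):
    """Convert a string like '22min 05sec' to seconds."""
    match = re.fullmatch(r"\s*(\d+)min\s+(\d+)sec\s*", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, seconds = map(int, match.groups())
    return minutes * 60 + seconds


def percent_faster(fast, slow):
    """Speed gain of `fast` over `slow` in percent (speed = 1 / time)."""
    return (to_seconds(slow) / to_seconds(fast) - 1) * 100


# Timings for the first 47 parts on DSM 5.0, as reported in this post.
timings = {
    "unrar_5.01_spksrc_o3": "22min 05sec",
    "unrar_5.01_spksrc_o2": "23min 29sec",
    "unrar_roshall": "27min 36sec",
    "unrar_5.01_syn_DSM5.0": "30min 02sec",
}

best = min(timings, key=lambda name: to_seconds(timings[name]))
for name, duration in sorted(timings.items(), key=lambda kv: to_seconds(kv[1])):
    if name != best:
        gain = percent_faster(timings[best], duration)
        print(f"{best} is {gain:.0f}% faster than {name}")
```

The same helper can be pointed at any other pair of timings in this thread.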
Update with unrar v5.1beta1 (28.03.2014):
I just downloaded and compiled the newest version of unrar, v5.1beta1, with spksrc for my DS214play, after updating the digests file for the new hashes and the Makefile in cross/unrar/.
For the 47 files, unrar_5.1beta1_spksrc_o3 needed only 7min 30sec. That's 2.94 times the speed of the former number one, unrar_5.01_spksrc_o3 (unrar version 5.0.14), which needed 22min 04sec, i.e. a speed improvement of about 194 percent.
To rule out possible systematic influences, e.g. the new DSM update DSM5update1 and possible changes to spksrc, I recompiled unrar_5.01_spksrc_o3 and found no significant change in speed: it needed 22min 10sec, which is a negligible difference and confirms that the speed increase of the new unrar 5.1beta1 comes solely from the improvements in its code.
Ranking for DS214play (evansport x686, 2 cores, 4 threads), unpacking 47 parts:
Ranking for DS214 (armadaxp, 2 cores): (Update 14.04.2014)
unrar 5.0.1 (O3 flag, spksrc): 18min 44sec for 47 parts, 27min 53sec for all 71 parts
unrar 5.1.2 (O3 flag, spksrc): 19min 52sec for 47 parts, 29min 35sec for all 71 parts
unrar 5.0.1 (O2 flag, spksrc): 22min 59sec for 47 parts, 34min 11sec for all 71 parts
unrar 5.0.1 (original DSM5): 23min 16sec for 47 parts, 34min 42sec for all 71 parts
unrar 5.1.2 (O2 flag, spksrc): 23min 48sec for 47 parts, 35min 49sec for all 71 parts
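The DS214 timings can also be read as rough throughput, which makes the 47-part and 71-part runs comparable. This is only a sketch under one assumption: every part is about 150MB, as stated for the rarset above (the last part of a set is often smaller). The function name is made up for illustration.

```python
PART_SIZE_MB = 150  # assumption: each rar part is about 150MB


def throughput_mb_s(parts, minutes, seconds, part_size_mb=PART_SIZE_MB):
    """Approximate unpack throughput in MB/s for `parts` rar parts."""
    return parts * part_size_mb / (minutes * 60 + seconds)


# (build, time for 47 parts, time for all 71 parts), taken from the ranking.
results = [
    ("unrar 5.0.1 (O3 flag, spksrc)", (18, 44), (27, 53)),
    ("unrar 5.1.2 (O3 flag, spksrc)", (19, 52), (29, 35)),
    ("unrar 5.0.1 (O2 flag, spksrc)", (22, 59), (34, 11)),
    ("unrar 5.0.1 (original DSM5)", (23, 16), (34, 42)),
    ("unrar 5.1.2 (O2 flag, spksrc)", (23, 48), (35, 49)),
]

for name, t47, t71 in results:
    print(f"{name}: {throughput_mb_s(47, *t47):.1f} MB/s for 47 parts, "
          f"{throughput_mb_s(71, *t71):.1f} MB/s for 71 parts")
```

Under that assumption the fastest build works out to a little over 6 MB/s on the armadaxp.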