Assembler optimizations #2398
Replies: 4 comments 14 replies
-
If you want to stick with MD5, I'd recommend ignoring the file MD5 and computing block-based MD5s instead. Block MD5s won't be less reliable than a file MD5 (unlike CRC32).
The assumption obviously breaks if the upload isn't a bunch of parts. But as you point out, you generally want to write sequentially as much as possible, without jumping across files. Even in the ideal scenario, though, it scales less well than a block-based MD5 approach can, hence my recommendation to avoid relying on file-based MD5s if possible.
Sparse/pre-allocated files don't prevent you from writing sequentially though. They just provide the option to perform a random write if necessary (e.g. missing articles, the article cache filling up, etc.). Other benefits are listed here.
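To illustrate why block-based MD5s scale better, here is a minimal sketch. It assumes an expected digest is available for every block (where those digests come from is not covered here), and `block_digest` and `find_bad_blocks` are hypothetical names, not SABnzbd functions. Each block is hashed on its own, so blocks can be checked in whatever order they arrive, whereas a single file MD5 has to be fed the bytes strictly in file order.

```python
import hashlib


def block_digest(block: bytes) -> str:
    """MD5 of a single block, independent of every other block."""
    return hashlib.md5(block).hexdigest()


def find_bad_blocks(blocks: dict[int, bytes], expected: dict[int, str]) -> list[int]:
    """Return the indices of blocks whose MD5 does not match the expected digest.

    `blocks` maps block index -> data and may be filled in any order.
    """
    return sorted(
        index
        for index, data in blocks.items()
        if block_digest(data) != expected.get(index)
    )
```

Because no block depends on the state of a previous one, the per-block hashing could also be spread over several threads.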
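As a rough sketch of the pre-allocation idea (the helper names `preallocate` and `write_at` are made up for illustration and are not taken from SABnzbd or NZBGet): the file is created at its final size up front, after which a part can be written at its own offset, sequentially in the normal case or out of order when needed.

```python
def preallocate(path: str, size: int) -> None:
    """Create the file at its final size.

    On filesystems that support sparse files this typically does not
    allocate data blocks until something is actually written.
    """
    with open(path, "wb") as f:
        f.truncate(size)


def write_at(path: str, offset: int, data: bytes) -> None:
    """Write one part at its offset without disturbing the rest of the file."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(data)
```

Writing the parts in order 0, 1, 2 is still a plain sequential write; the offsets only matter when a part shows up late.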
-
Duplicating this from the 16 KB discussion, plus a few additional observations about the assembler's behaviour:
My PC and in-memory testing setup are quite extreme. However, if optimisations can be found that allow faster speeds, they will also reduce CPU usage on slower connections and improve performance on slower devices. If a system can download faster than it can compute MD5, or, I suppose, faster than it can write out the article cache, then the article cache can fill up. At that point the cache is bypassed and articles are written straight to disk, which the assembler then needs to read again. I think there are a couple of areas where we should consider optimisations:
Please share any thoughts on the above or your own ideas for the optimal "should the assembler run" logic.
-
We could do what we do in several other places to reduce useless calls. Add a
-
Looking at the logs with the latest dev build (4.2.0a2), I was curious why the article cache doesn't seem to be used much, and recalled this discussion. In the logs I see "Assembler trigger = 72". What does the 72 refer to?
-
I've started this discussion for more general talk about Python and file assembler optimizations. The 16 KB SSL discussion is getting a bit long and broad and it seems better to keep #2396 specifically for the technical issues of replacing MD5 with CRC32.
I have done some isolated testing of 750 KB blocks while calculating MD5. My conclusion is that it can be parallelized fairly well. Even with just one thread, the calculating and writing overlap somewhat, probably because of the write cache. Doing MD5 in a separate thread can make it more or less fully parallel. In my test branch I've put it in a separate thread: https://github.com/sabnzbd/sabnzbd/pull/2391/files
As long as most Usenet posts consist of lots of RAR files, we could even run multiple MD5 threads by downloading blocks from multiple files simultaneously. If we have x threads, each with its own queue, then we could attach threads with queues to each file (nzf) in a round-robin fashion. That would let SAB scale better as CPUs gain more cores without much increase in clock frequency. It wouldn't work on spinning disks, but those probably won't be faster than single-core MD5 calculations anyway.
There was some discussion about sparse files. I'm not sure that's a good idea, for the same reason. I assume that if we write parts at random positions in a file, the write speed will go down on a spinning drive. I suspect that this may sometimes be the reason why some people get a higher speed with SAB than with NZBGet.
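To make the idea concrete, here is a minimal sketch of hashing in a separate thread while the main thread writes. It is not the code from the linked pull request; `md5_worker` and `assemble` are hypothetical names and the queue size is an arbitrary assumption. Each block is handed to the hashing thread through a queue and then written, so the two can overlap.

```python
import hashlib
import queue
import threading


def md5_worker(blocks: queue.Queue, result: dict) -> None:
    """Consume blocks in order and store the final hex digest in `result`."""
    md5 = hashlib.md5()
    while True:
        block = blocks.get()
        if block is None:  # sentinel: no more blocks
            break
        md5.update(block)
    result["md5"] = md5.hexdigest()


def assemble(blocks, out) -> str:
    """Write every block to `out` while a second thread computes the file MD5."""
    pending: queue.Queue = queue.Queue(maxsize=8)  # bounded so hashing can't lag far behind
    result: dict = {}
    worker = threading.Thread(target=md5_worker, args=(pending, result))
    worker.start()
    for block in blocks:
        pending.put(block)  # hashing happens in the worker thread
        out.write(block)    # writing happens here, at the same time
    pending.put(None)
    worker.join()
    return result["md5"]
```

The single queue keeps the blocks in file order, which a file-based MD5 requires.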
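A rough sketch of the round-robin idea, assuming blocks are submitted from a single thread; `Md5Pool` and its methods are invented for this example and do not exist in SABnzbd. Each file is pinned to one worker queue the first time it is seen, so its blocks are always hashed in order by the same thread, while different files can be hashed in parallel.

```python
import hashlib
import itertools
import queue
import threading


class Md5Pool:
    def __init__(self, num_workers: int):
        self.queues = [queue.Queue() for _ in range(num_workers)]
        self._next_queue = itertools.cycle(self.queues)  # round-robin source
        self._assigned = {}  # file name -> the queue it is pinned to
        self._hashers = {}   # file name -> running MD5 state
        self.threads = [
            threading.Thread(target=self._run, args=(q,)) for q in self.queues
        ]
        for thread in self.threads:
            thread.start()

    def _run(self, q: queue.Queue) -> None:
        while True:
            item = q.get()
            if item is None:  # sentinel: shut this worker down
                break
            name, block = item
            self._hashers[name].update(block)

    def submit(self, name: str, block: bytes) -> None:
        """Queue a block; a new file gets the next queue in round-robin order."""
        if name not in self._assigned:
            self._hashers[name] = hashlib.md5()
            self._assigned[name] = next(self._next_queue)
        self._assigned[name].put((name, block))

    def close(self) -> dict:
        """Stop all workers and return the hex digest of every file."""
        for q in self.queues:
            q.put(None)
        for thread in self.threads:
            thread.join()
        return {name: h.hexdigest() for name, h in self._hashers.items()}
```

With two workers, the first and third file share a queue and the second file gets the other one.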
@mnightingale: Regarding your last message, I think that in theory it shouldn't be necessary to release the GIL by using sleep. It happens anyway both when writing data and calculating MD5. Because of this I think it's already releasing the GIL most of the time it's running. If you want to try without the unnecessary loops and file openings you can apply this change: https://github.com/sabnzbd/sabnzbd/pull/2392/files