
Multi-threaded analysis/compaction #26

Open
vladaad opened this issue Jun 8, 2020 · 3 comments
Labels: enhancement (New feature or request)

Comments


vladaad commented Jun 8, 2020

On my PC, the scanning part only uses ~11% of CPU (Ryzen 2600; ~11% usually means single-threaded work) and ~6% of my SSD (Crucial P1 NVMe 1TB). Would it be possible to somehow speed this up?

Also, sometimes the compression itself doesn't fully use the PC either: disk usage is usually below 50% and CPU usage below 30%. I think there might be a way to optimize that too.

Freaky added the enhancement (New feature or request) label on Jun 10, 2020
Freaky changed the title from "Suggestion - use multiple cores and general optimization" to "Multi-threaded analysis/compaction" on Jun 10, 2020
Freaky (Owner) commented Jun 10, 2020

I tend to be using my computer when I'm running Compactor, so leaving resources free is seen as more of a feature than a bug. This also ties in with future plans to make a set-it-and-forget-it background service.

That said, I'll probably get around to adding optional concurrent operation at some point.


malxau commented Jun 21, 2021

For what it's worth, the kernel compressor and decompressor do use multiple cores/threads when acting on different chunks of a file. The problem for the compressor is that it doesn't know the final size of each compressed chunk, so it compresses multiple chunks in parallel but can only write the earliest compressed chunk into the file. With a very lightweight compression algorithm, I'll bet the parallel compression operations complete quickly, so the single copy/write operation ends up being the bottleneck.
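To illustrate the shape of that constraint, here's a toy sketch (illustrative only, not the actual kernel code; `fake_compress` is a made-up placeholder): workers compress chunks concurrently, but a single writer must emit them in file order, so with a fast codec the writer becomes the bottleneck.

```rust
use std::collections::HashMap;
use std::io::Write;
use std::sync::mpsc;
use std::thread;

// Made-up placeholder compressor; imagine XPRESS or LZX here.
fn fake_compress(data: &[u8]) -> Vec<u8> {
    vec![data[0], (data.len() / 256) as u8]
}

fn main() -> std::io::Result<()> {
    // 16 dummy 4 KB chunks standing in for one file's contents.
    let chunks: Vec<Vec<u8>> = (0..16u8).map(|i| vec![i; 4096]).collect();

    let (tx, rx) = mpsc::channel::<(usize, Vec<u8>)>();
    for (idx, chunk) in chunks.into_iter().enumerate() {
        let tx = tx.clone();
        // Chunks are compressed concurrently and finish in any order.
        thread::spawn(move || tx.send((idx, fake_compress(&chunk))).unwrap());
    }
    drop(tx);

    // Single writer: buffer out-of-order results until the next expected
    // chunk arrives, since only the earliest chunk can be written.
    let mut out = std::fs::File::create("out.bin")?;
    let mut pending = HashMap::new();
    let mut next = 0usize;
    for (idx, data) in rx {
        pending.insert(idx, data);
        while let Some(data) = pending.remove(&next) {
            out.write_all(&data)?;
            next += 1;
        }
    }
    Ok(())
}
```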

Remember that this engine was designed for Surface RT - it uses multiple cores but it was expecting multiple very slow cores.

Also, since the API is per-file, it can only parallelize large files effectively. With a Ryzen 2600 (12 hardware threads) and XPRESS 4KB, a file would need to be at least 48KB before all the hardware threads can be used, and even then there's a draining problem, because the file isn't compressed until the last task is done. You should see the cores loaded when compressing a 1GB file with LZX though, where the compression operation is relatively slow.
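To spell that arithmetic out (a quick illustrative snippet, not code from any of these tools):

```rust
// With a per-file API, every hardware thread needs at least one chunk of
// the file to work on before the machine can be fully loaded.
fn min_size_for_full_parallelism(hw_threads: u64, chunk_size: u64) -> u64 {
    hw_threads * chunk_size
}

fn main() {
    // Ryzen 2600: 12 hardware threads; XPRESS 4KB: 4096-byte chunks.
    let bytes = min_size_for_full_parallelism(12, 4096);
    println!("{} KB", bytes / 1024); // prints "48 KB"
}
```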

For what it's worth, I have a command-line tool for this that uses a thread pool so multiple files are compressed in parallel. That works fairly well for this problem, if you're interested or want to measure the expected results: https://github.com/malxau/yori/tree/master/compact
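In outline the pattern is something like this (a rough sketch only; `compress_file` is a hypothetical stand-in for the real OS compression call, not code from yori or Compactor):

```rust
use std::path::PathBuf;
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical stand-in for whatever actually asks the OS to compress a file.
fn compress_file(path: &PathBuf) {
    println!("compressing {}", path.display());
}

fn main() {
    // Treat each command-line argument as a file to compress.
    let files: Vec<PathBuf> = std::env::args().skip(1).map(PathBuf::from).collect();
    let queue = Arc::new(Mutex::new(files));
    let workers = thread::available_parallelism().map(|n| n.get()).unwrap_or(4);

    let mut handles = Vec::new();
    for _ in 0..workers {
        let queue = Arc::clone(&queue);
        handles.push(thread::spawn(move || loop {
            // Pop the next file; exit when the queue is drained.
            let next = queue.lock().unwrap().pop();
            match next {
                Some(path) => compress_file(&path),
                None => break,
            }
        }));
    }
    for h in handles {
        h.join().unwrap();
    }
}
```

Per-file parallelism sidesteps the draining problem above: a small file doesn't need internal chunk parallelism if another file can keep the otherwise idle cores busy.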

Toys0125 commented

For what it's worth, you can run multiple instances of Compactor to increase compression speed, by manually selecting a different folder to compress in each instance.
