Feature Request: Include a list of file extensions that are known to be compressed to be excluded from compression #2190
Comments
In my experience, Kopia performance does not really improve when skipping already compressed data unless you choose a deliberately CPU-intensive compression setting, in which case I have observed that at least png, jpg and (non-encrypted) zip files can benefit from a second round of compression. In conclusion, I am not sure a pre-included list like this would be of benefit on average, whether applied by default or not.
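For anyone who wants to check this kind of claim on their own data, here is a minimal sketch. It uses Python's standard `zlib` module as a stand-in, not any of Kopia's compressors, and the helper name `recompression_gain` is made up for illustration. It reports what fraction of the size one more compression pass saves.

```python
import os
import zlib


def recompression_gain(data: bytes, level: int = 9) -> float:
    """Fraction of size saved by compressing `data` once more.

    A value near 0 (or slightly negative) means another pass is wasted CPU.
    zlib is only a stand-in here; Kopia's own algorithms will behave differently.
    """
    if not data:
        return 0.0
    return 1.0 - len(zlib.compress(data, level)) / len(data)


# Highly repetitive sample data: compresses very well on the first pass.
text = b"kopia snapshot " * 10_000
# Random bytes behave like well-compressed or encrypted data.
random_bytes = os.urandom(100_000)

print(f"repetitive text: {recompression_gain(text):.3f}")
print(f"random bytes:    {recompression_gain(random_bytes):.3f}")
```

To test real files, read them with `open(path, "rb").read()` and pass the bytes in. The numbers only hint at what to expect; Kopia's own benchmark is the better measure.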
These are interesting findings I wouldn't have expected! My main reason for excluding already compressed files is to save CPU, especially on some older notebooks running Linux that I still have around; backup size is not my main focus. But since you mention compression (a bit off-topic, but still related): it's also quite challenging to select an appropriate compression method. You can choose from 18 different settings in Kopia (2 of them are marked as "not recommended", which is helpful). With respect to UX, this is really overwhelming. If you're a computer scientist or admin, you have quite likely heard of the different algorithms, but I expect a "normal" user would simply have no idea what to choose. I guess which option is best really depends on the data and the hardware you have: <dream-mode> Well, maybe not test all options, but just some, depending on a setting with a slider value from "max speed" to "max compression" ... I had never heard of S2 before I saw it in Kopia, and in my experience zstd seems to be better than zip and gzip in terms of both speed and compression. How does S2 compare to zstd?
Oh, there is a benchmark command already! Oops, I overlooked this. Wow! Kopia is really amazing! Thanks a lot! I'll try benchmarking once with excluded files, once without ... when I've got some spare time.
Closed due to inactivity. Re-open and remove "stale" label if it should remain open for an additional period of time.
I'd assume that in most cases little is gained by compressing an already compressed file again when storing it in the repository.
So, to ease setup, I think it would be helpful to include a list of file extensions that are known to be compressed, in such a way that they can easily be excluded from compression by a policy.
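To illustrate the idea, here is a minimal sketch of an extension-based check. It is not Kopia's implementation; the function name `should_compress` and the sample set `NEVER_COMPRESS` are hypothetical and only show how such a list could drive the decision.

```python
from pathlib import PurePath

# Hypothetical sample set for illustration only; not the list from this issue.
NEVER_COMPRESS = {".zip", ".gz", ".jpg", ".png", ".mp4"}


def should_compress(filename: str, never_compress=NEVER_COMPRESS) -> bool:
    """Return False when the file's final extension is in the never-compress set.

    Matching is case-insensitive and looks only at the last suffix,
    so "archive.tar.gz" is matched via ".gz".
    """
    return PurePath(filename).suffix.lower() not in never_compress


for name in ["notes.txt", "photo.JPG", "archive.tar.gz", "README"]:
    print(name, should_compress(name))
```

Files without an extension fall through to "compress", which seems the safer default for a purely name-based rule.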
Actually, I'm not sure what would be the best way to do this. Ideas:
Anyway, this is the list I'm currently using: