Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allow binary encoding for written files to reduce file size #224

Merged
merged 3 commits into from
Feb 9, 2021

Conversation

karakayasemi
Copy link
Contributor

@karakayasemi karakayasemi commented Dec 20, 2020

Fixes #210, owncloud/core#10831
Needs owncloud/core#38249

This PR makes binary encoding default and writes the encoding info to the file metadata. After this PR, all the newly created files will be encrypted with binary encoding.
For decryption, if no metadata presents we keep continue to decrypt these files with legacy base64 encoding, if header is present we use binary encoding to decrypt those files.

@karakayasemi
Copy link
Contributor Author

@phil-davis @micbar looks like we have common problem in web ui tests independent from this pr. Daily buils are also failing. Probably, it is related with the web ui changes of oc10. (tags, open with etc.)

@phil-davis
Copy link
Contributor

PR #223
is currently running CI. I am trying to get a pass! When that is merged, then other PRs can be rebased.

@karakayasemi
Copy link
Contributor Author

@phil-davis thank you for information.

@phil-davis
Copy link
Contributor

@micbar the repo is still set to require codecov/patch and codecov/project - those should no longer be required.

@micbar
Copy link
Contributor

micbar commented Dec 22, 2020

removed the check

@mmattel
Copy link
Contributor

mmattel commented Dec 22, 2020

@karakayasemi How does this work with existing encrypted files?
Just thinking about that the reduction is as written in the referenced PR ~35%, I would like to see that old files get the reduction too. What about backwards compatibility for existing setups? See also my comment in the referenced issue.
BTW, needs a rebase 😄

@karakayasemi
Copy link
Contributor Author

@mmattel I already read your comments on the issue. In this pr's current implementation, the code checks encryption.binary_encode system config value for binary encoding. If it is not exist, the app works like before. But, I guess this config value more proper as app config. We may move this config value to app config and add a checkbox to admin settings to be able to enable.

For the migration, as you said we can add an occ command.

@karakayasemi
Copy link
Contributor Author

karakayasemi commented Dec 22, 2020

@mmattel I thought about the migration command again. Actually, we have encryption:decrypt-all and encryption:encrypt-all commands. We may just suggest decrypt-all the files, enable the binary encode and re-encrypt all for existing setups.

What do you think? I am open to suggestions.

@mmattel
Copy link
Contributor

mmattel commented Dec 22, 2020

Just reading the occ command description:
encryption:encrypt-all encrypts all data files for all users (no single user encryption... good)
encryption:decrypt-all decrypts all user data files, or optionally a single user (I can live with it)

Yes, this would work
decrypt
enable new config setting
encrypt

I would thing that it would be good to have the choice, at least to see which (old or new) method will be used. That was the reason for my thoughts of a new config setting...

@karakayasemi
Copy link
Contributor Author

Another option would be making binary encoding default and writing the encoding info to the file metadata. I guess no reason to keep base64 encoding format for new files. In this way, all the newly created files encrypted with binary encoding.
For decryption, if no metadata presents we can keep continue to decrypt these files with legacy base64 encoding, if header is present we can use binary encoding to decrypt.

@mmattel
Copy link
Contributor

mmattel commented Dec 22, 2020

making binary encoding default

as long you can still reade the old format for existing files... 😄
(means you can have both types in parallel (old: still readable, new: rw) )
The rest is a matter of documentation.

@mmattel
Copy link
Contributor

mmattel commented Dec 23, 2020

old: still readable, new: rw

Idea: as old files are still readable and new files will have the new format...
What about the idea of an occ command which rewrites existing files? Somthing like:
occ encryption:rewrite (in single user mode only)
To think about possible options, at least -v, or a --skip= option, or a --path= option
Such a command could be interrupted and restarted without harm!

@mmattel
Copy link
Contributor

mmattel commented Dec 27, 2020

It is interresting that the blocksize was set to

old: private $unencryptedBlockSizeSigned = 6072;
new: private $unencryptedBlockSizeSigned = 8096;

I just think about that on the filesystem level, blocks are usually chunks in 4k. Where is 6k coming from, because that allone is producing a overhead and misaglingment ...?

@karakayasemi
Copy link
Contributor Author

karakayasemi commented Dec 27, 2020

@mmattel It is really not easy to understand these calculations for me either. Still I could not exactly understand the reason of magic number 6126 also. As far as I understand, ownCloud filesystem blocksize is 8192. With current base64 encoding format, we can only use 6072 bytes in an 8192 bytes block for storing data. Its the cause of %33 file overhead. Each signed encrypted block also stores initialization vector, signature and extra padding.

Seems like storing signature of each block takes (6126-6072) = 54 bytes and if a block signed, the available bytes reduces to 6072. Since I only touched encoded block, I just adjusted 6072 to 8096 for binary encoded data by multiplying it with coefficient 1.33.

@karakayasemi karakayasemi force-pushed the binary-encode branch 2 times, most recently from 628c5d5 to 747492a Compare January 14, 2021 17:57
@phil-davis
Copy link
Contributor

phil-davis commented Feb 4, 2021

@karakayasemi we were forced to switch drone CI to use .drone.yml in PR #237

@dpakach is now getting the tests sorted out in PR #236 - he has rebased that. (done)

But you will need to rebase this PR anyway so that it uses the "new" .drone.yml. So you may as well rebase it now. Then we can easily cherry-pick tests onto it later today or tomorrow. - I have rebased just now. And soon we will push the commit with the tests. And a commit with tests has been cherry-picked into this PR. Let's see what CI thinks.

@phil-davis
Copy link
Contributor

If everyone is happy, IMO this PR is OK to merge.

@micbar micbar merged commit b10151d into master Feb 9, 2021
@delete-merged-branch delete-merged-branch bot deleted the binary-encode branch February 9, 2021 13:01
@butonic butonic requested review from micbar, pmaier1 and dragotin and removed request for dragotin, micbar and pmaier1 March 19, 2021 16:02
plumbeo added a commit to plumbeo/server that referenced this pull request Apr 21, 2022
Default to the more space-efficient binary encoding for newly encrypted files
instead of the traditional base64 encoding, eliminating the 33% overhead.

The new option 'encryption.use_legacy_encoding' allows to force the legacy
encoding format if needed. Files encoded in the old format remain readable.

Based on owncloud/encryption#224 and
owncloud/core#38249 by karakayasemi.
plumbeo added a commit to plumbeo/server that referenced this pull request Apr 21, 2022
Default to the more space-efficient binary encoding for newly encrypted files
instead of the traditional base64 encoding, eliminating the 33% overhead.

The new option 'encryption.use_legacy_encoding' allows to force the legacy
encoding format if needed. Files encoded in the old format remain readable.

Based on owncloud/encryption#224 and
owncloud/core#38249 by karakayasemi.

Signed-off-by: plumbeo <plumbeo@users.noreply.github.com>
plumbeo added a commit to plumbeo/server that referenced this pull request Apr 22, 2022
Default to the more space-efficient binary encoding for newly encrypted files
instead of the traditional base64 encoding, eliminating the 33% overhead.

The new option 'encryption.use_legacy_encoding' allows to force the legacy
encoding format if needed. Files encoded in the old format remain readable.

Based on owncloud/encryption#224 and
owncloud/core#38249 by karakayasemi.

Signed-off-by: plumbeo <plumbeo@users.noreply.github.com>
plumbeo added a commit to plumbeo/server that referenced this pull request Apr 22, 2022
Default to the more space-efficient binary encoding for newly encrypted files
instead of the traditional base64 encoding, eliminating the 33% overhead.

The new option 'encryption.use_legacy_encoding' allows to force the legacy
encoding format if needed. Files encoded in the old format remain readable.

Based on owncloud/encryption#224 and
owncloud/core#38249 by karakayasemi.

Signed-off-by: plumbeo <plumbeo@users.noreply.github.com>
plumbeo added a commit to plumbeo/server that referenced this pull request Apr 24, 2022
Default to the more space-efficient binary encoding for newly encrypted files
instead of the traditional base64 encoding, eliminating the 33% overhead.

The new option 'encryption.use_legacy_encoding' allows to force the legacy
encoding format if needed. Files encoded in the old format remain readable.

Based on owncloud/encryption#224 and
owncloud/core#38249 by karakayasemi.

Signed-off-by: plumbeo <plumbeo@users.noreply.github.com>
plumbeo added a commit to plumbeo/server that referenced this pull request Apr 24, 2022
Default to the more space-efficient binary encoding for newly encrypted files
instead of the traditional base64 encoding, eliminating the 33% overhead.

The new option 'encryption.use_legacy_encoding' allows to force the legacy
encoding format if needed. Files encoded in the old format remain readable.

Based on owncloud/encryption#224 and
owncloud/core#38249 by karakayasemi.

Signed-off-by: plumbeo <plumbeo@users.noreply.github.com>
plumbeo added a commit to plumbeo/server that referenced this pull request Apr 26, 2022
Default to the more space-efficient binary encoding for newly encrypted files
instead of the traditional base64 encoding, eliminating the 33% overhead.

The new option 'encryption.use_legacy_encoding' allows to force the legacy
encoding format if needed. Files encoded in the old format remain readable.

Based on owncloud/encryption#224 and
owncloud/core#38249 by karakayasemi.

Signed-off-by: plumbeo <plumbeo@users.noreply.github.com>
plumbeo added a commit to plumbeo/server that referenced this pull request Apr 26, 2022
Default to the more space-efficient binary encoding for newly encrypted files
instead of the traditional base64 encoding, eliminating the 33% overhead.

The new option 'encryption.use_legacy_encoding' allows to force the legacy
encoding format if needed. Files encoded in the old format remain readable.

Based on owncloud/encryption#224 and
owncloud/core#38249 by karakayasemi.

Signed-off-by: plumbeo <plumbeo@users.noreply.github.com>
plumbeo added a commit to plumbeo/server that referenced this pull request May 3, 2022
Default to the more space-efficient binary encoding for newly encrypted files
instead of the traditional base64 encoding, eliminating the 33% overhead.

The new option 'encryption.use_legacy_encoding' allows to force the legacy
encoding format if needed. Files encoded in the old format remain readable.

Based on owncloud/encryption#224 and
owncloud/core#38249 by karakayasemi.

Signed-off-by: plumbeo <plumbeo@users.noreply.github.com>
plumbeo added a commit to plumbeo/server that referenced this pull request May 3, 2022
Default to the more space-efficient binary encoding for newly encrypted files
instead of the traditional base64 encoding, eliminating the 33% overhead.

The new option 'encryption.use_legacy_encoding' allows to force the legacy
encoding format if needed. Files encoded in the old format remain readable.

Based on owncloud/encryption#224 and
owncloud/core#38249 by karakayasemi.

Signed-off-by: plumbeo <plumbeo@users.noreply.github.com>
plumbeo added a commit to plumbeo/server that referenced this pull request May 4, 2022
Default to the more space-efficient binary encoding for newly encrypted files
instead of the traditional base64 encoding, eliminating the 33% overhead.

The new option 'encryption.use_legacy_encoding' allows to force the legacy
encoding format if needed. Files encoded in the old format remain readable.

Based on owncloud/encryption#224 and
owncloud/core#38249 by karakayasemi.

Signed-off-by: plumbeo <plumbeo@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unnecessary file size overhead
7 participants