Saving blocks separately instead of final file, on the backend/cache #17

Open
maxux opened this issue Dec 20, 2018 · 3 comments

@maxux
Collaborator

maxux commented Dec 20, 2018

This issue is related and created because of:

If the final file is saved after a download and the same cache is then used with a new flist, any of these files still present won't be updated, even if their blocks have changed.

The only thing sure to be unique and unmodified is a block. Each file has one or more blocks, and each block carries an integrity hash and an encryption key.

It would make sense to me to save each block (decrypted, to avoid decrypting on every access) and serve the portion of the file requested by the system by reading from the blocks, instead of keeping the final file.

This way, we can always verify block integrity, and if a final file has changed, 0-fs will notice because one of its hashes won't be in the cache.
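
To make the proposal concrete, here is a minimal sketch of a block cache keyed by content hash. It is an illustration under stated assumptions, not 0-fs code: `BlockCache`, `block_key`, and the use of SHA-256 are hypothetical stand-ins for whatever hash the flist actually records, and decryption is left out.

```python
import hashlib
from pathlib import Path
from typing import Optional


def block_key(data: bytes) -> str:
    # Assumption: SHA-256 stands in for the integrity hash stored in the flist.
    return hashlib.sha256(data).hexdigest()


class BlockCache:
    """Hypothetical on-disk cache that stores one file per block, named by its hash."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def put(self, data: bytes) -> str:
        # Store the (already decrypted) block under its own hash.
        key = block_key(data)
        path = self.root / key
        if not path.exists():
            path.write_bytes(data)
        return key

    def get(self, key: str) -> Optional[bytes]:
        # Return the block only if it is present and still matches its hash.
        path = self.root / key
        if not path.exists():
            return None  # cache miss: the caller has to download the block
        data = path.read_bytes()
        if block_key(data) != key:
            path.unlink()  # corrupted entry: drop it so it gets fetched again
            return None
        return data
```

Because the key is derived from the content, a changed file in a new flist simply refers to hashes that are not in the cache yet, so stale data is never served for it.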

Cc @zaibon for follow-up.

@muhamadazmy
Member

Saving file blocks instead of the full file actually makes more sense and was considered before, except that it would give very bad performance. Also, the way the FUSE layer works now, when a file is accessed the open file descriptor is passed entirely to the FUSE module, so 0-fs acts like a proxy.

If we start saving files as blocks, we will have to handle all read and seek operations ourselves, and they will all have to go through the 0-fs process, reducing performance.

It still can be done though, but we will have to do some benchmarking to see how much it will affect the performance.

@zaibon

zaibon commented Dec 20, 2018

That's also the feeling I have about this. Re-assembling blocks will make things way slower, I think.
Although @maxux had a point when he said that the syscalls to read a file give the offset and size, so we could just serve the matching blocks.

But indeed, benchmarking needs to be done to make sure we still get enough performance from this.
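
The offset-and-size point can be sketched as follows. This is only an illustration: it assumes every block except the last has the same fixed size, and `block_spans` and `read_range` are hypothetical helper names, not part of 0-fs.

```python
from typing import List, Tuple


def block_spans(offset: int, size: int, block_size: int) -> List[Tuple[int, int, int]]:
    """Split a read request into (block_index, start_in_block, length) pieces."""
    spans = []
    pos = offset
    end = offset + size
    while pos < end:
        index = pos // block_size
        start = pos % block_size
        # Read up to the end of this block or the end of the request.
        length = min(block_size - start, end - pos)
        spans.append((index, start, length))
        pos += length
    return spans


def read_range(blocks: List[bytes], offset: int, size: int, block_size: int) -> bytes:
    """Serve a read from in-memory blocks, clamping the request at end of file."""
    total = sum(len(block) for block in blocks)
    size = max(0, min(size, total - offset))
    return b"".join(
        blocks[index][start:start + length]
        for index, start, length in block_spans(offset, size, block_size)
    )
```

Only the blocks that overlap the requested range are touched, so a small read never needs the whole file to be re-assembled. Whether this is fast enough once every read goes through the 0-fs process is exactly what the benchmarking would have to show.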

@muhamadazmy
Member

Commit 19b8df0 is already an improvement on this, but it doesn't fix the issue of files corrupted by disk failure or other causes once they have been downloaded. Hence a full cache check on boot should be implemented.

As mentioned above, storing the individual blocks can have a huge performance impact, but I can still give it a try. A check against the file hash on boot may be a better solution, but it will cause a boot delay ...
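
A boot-time check along these lines could look like the sketch below. The cache layout is an assumption made for illustration: each cached file is taken to be named after the hex digest of its content, SHA-256 stands in for the real hash, and `file_digest` and `check_cache` are hypothetical names.

```python
import hashlib
from pathlib import Path
from typing import List


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    # Hash the file in chunks so large cached files are not loaded into memory at once.
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_cache(root: Path) -> List[Path]:
    """Remove every cached file whose content no longer matches its name."""
    corrupted = []
    # sorted() materializes the listing before any file is deleted.
    for path in sorted(root.iterdir()):
        if path.is_file() and file_digest(path) != path.name:
            corrupted.append(path)
            path.unlink()  # force a fresh download on next access
    return corrupted
```

The scan reads every byte in the cache, which is where the boot delay comes from: the cost grows with the total cache size rather than with the number of files actually used.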

@despiegk despiegk added this to the later milestone Dec 28, 2020