Upper directory (in-memory fs) should act as a block cache #2

Closed
MeanMangosteen opened this issue Feb 21, 2019 · 5 comments

@MeanMangosteen
Contributor

Similar to a page cache in an operating system, all file reads and writes can first be tried against the image of the file contained in the VFS, on a block-level basis. If the desired block is not present in the file image inside the upper directory (VFS), a 'block fault' occurs: the corresponding block, persisted on disk, is read, decrypted, and then populated into the image in the upper dir.

Every subsequent read of the block would simply be served from memory instead of performing a disk read.

Every write would also populate the upper dir image, ensuring the block in the upper dir contains the most up-to-date data.

With these measures, blocks in the upper dir image cannot become 'dirty', so the integrity of every read of a loaded block in the upper dir is guaranteed.

To know which blocks of a file are currently loaded in the upper dir, a set of loaded block numbers can be maintained for each file. Block numbers would only ever be added, never removed.
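
A minimal sketch of the read and write paths described above, for a single file. All names here (`BlockCache`, `LowerStore`, `InMemoryLower`, `BLOCK_SIZE`) are hypothetical and not taken from the EFS code base, and the lower store is an in-memory stand-in for "read the encrypted block from disk and decrypt it".

```typescript
// Hypothetical sketch only: none of these names come from the EFS code base.
const BLOCK_SIZE = 4096; // assumed block size in bytes

// Stand-in for the lower dir: the real one would read/decrypt and encrypt/write.
interface LowerStore {
  readBlock(blockNum: number): Uint8Array;
  writeBlock(blockNum: number, data: Uint8Array): void;
}

// In-memory lower store, used here only so the sketch can run on its own.
class InMemoryLower implements LowerStore {
  public reads = 0; // counts simulated disk reads
  private readonly data = new Map<number, Uint8Array>();

  readBlock(blockNum: number): Uint8Array {
    this.reads++;
    const block = this.data.get(blockNum);
    // A block that was never written reads back as zeros.
    return block ? block.slice() : new Uint8Array(BLOCK_SIZE);
  }

  writeBlock(blockNum: number, data: Uint8Array): void {
    this.data.set(blockNum, data.slice());
  }
}

// Upper dir image of one file, plus the set of loaded block numbers.
class BlockCache {
  public faults = 0; // number of block faults so far
  private readonly lower: LowerStore;
  private readonly blocks = new Map<number, Uint8Array>();
  private readonly loaded = new Set<number>(); // only ever grows

  constructor(lower: LowerStore) {
    this.lower = lower;
  }

  read(blockNum: number): Uint8Array {
    if (!this.loaded.has(blockNum)) {
      // Block fault: fetch from the lower dir and populate the image.
      this.faults++;
      this.blocks.set(blockNum, this.lower.readBlock(blockNum));
      this.loaded.add(blockNum);
    }
    return this.blocks.get(blockNum)!;
  }

  write(blockNum: number, data: Uint8Array): void {
    // Write through to the lower dir and refresh the image in the same
    // call, so a loaded block never lags behind what was written.
    this.lower.writeBlock(blockNum, data);
    this.blocks.set(blockNum, data.slice());
    this.loaded.add(blockNum);
  }
}
```

In this sketch a second `read` of the same block number does not touch the lower store, and a `write` marks the block as loaded, so a later `read` of it does not fault.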

@MeanMangosteen MeanMangosteen self-assigned this Feb 21, 2019
@CMCDragonkai CMCDragonkai transferred this issue from MatrixAI/Polykey Mar 6, 2019
@MeanMangosteen MeanMangosteen added the development Standard development label Mar 7, 2019
@CMCDragonkai
Member

How is this implemented in our inspirations?

@robert-cronin
Contributor

robert-cronin commented May 8, 2020

Along this line of thinking, one also needs to consider how to handle functions like stat/fstat or utimes. Functions that return properties of a directory or file might only look at the upper directory cache. In that case, efs needs to make sure the lower and upper directories agree on things like access permissions, timestamps, and size. Size is an interesting one: we probably just want to report the size of the decrypted file, since encryption metadata should be transparent to the user.
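
As a rough illustration of the size question, the sketch below derives a plaintext size from the on-disk size. It assumes a layout that this thread does not confirm: each chunk is a fixed number of metadata bytes followed by up to one block of ciphertext that is the same length as its plaintext, and only the last chunk may be short. The function name and the numbers are hypothetical.

```typescript
// Hypothetical sketch: the chunk layout below is an assumption, not EFS's format.
// Assumed layout per chunk: [metadata (overhead bytes)][ciphertext (<= blockSize bytes)]
function plaintextSize(
  encryptedSize: number,
  blockSize: number,
  overhead: number,
): number {
  const chunkSize = blockSize + overhead;
  const fullChunks = Math.floor(encryptedSize / chunkSize);
  const remainder = encryptedSize % chunkSize;
  if (remainder > 0 && remainder < overhead) {
    // A trailing chunk too short to hold its own metadata is malformed.
    throw new RangeError("truncated trailing chunk");
  }
  // A trailing partial chunk contributes its bytes minus the metadata.
  const tail = remainder > 0 ? remainder - overhead : 0;
  return fullChunks * blockSize + tail;
}
```

For example, with 4096-byte blocks and 32 bytes of assumed per-chunk metadata, an 8388-byte encrypted file holds two full chunks (8256 bytes) plus a 132-byte tail, so `plaintextSize(8388, 4096, 32)` is 8192 + 100 = 8292, which is the size stat would report to the user under this layout.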

Another thing to consider is fsync and fdatasync. I think this is more straightforward: we can just flush the data from upperDir downwards using the existing write methods. I have a feeling it's redundant, though, since both the read and write methods already sync the data between the upper and lower directories.

@robert-cronin
Contributor

One issue is knowing which blocks have been read into upperDir and which have yet to be read.

The "paging" system can be implemented by maintaining an in-memory index, an internal private object that keeps track of chunk mapping.
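
One possible shape for such an index, as a sketch: a private map from file path to the set of block numbers already present in upperDir, plus a helper that turns a byte range into block numbers. The names (`ChunkIndex`, `blocksForRange`) and the 4096-byte block size are assumptions made for illustration.

```typescript
// Hypothetical sketch: names and block size are illustrative assumptions.
const BLOCK_SIZE = 4096;

// Block numbers touched by a read or write of `length` bytes at `position`.
function blocksForRange(
  position: number,
  length: number,
  blockSize: number = BLOCK_SIZE,
): number[] {
  if (length <= 0) return [];
  const first = Math.floor(position / blockSize);
  const last = Math.floor((position + length - 1) / blockSize);
  const result: number[] = [];
  for (let b = first; b <= last; b++) result.push(b);
  return result;
}

class ChunkIndex {
  // file path -> block numbers already read into upperDir
  private readonly loaded = new Map<string, Set<number>>();

  markLoaded(path: string, blockNum: number): void {
    let set = this.loaded.get(path);
    if (!set) {
      set = new Set<number>();
      this.loaded.set(path, set);
    }
    set.add(blockNum);
  }

  // Blocks in the requested range that still have to be read from lowerDir.
  missing(path: string, position: number, length: number): number[] {
    const set = this.loaded.get(path);
    return blocksForRange(position, length).filter(
      (b) => !(set !== undefined && set.has(b)),
    );
  }
}
```

A read method could call `missing` first, load only those blocks from lowerDir, mark them as loaded, and then serve the whole range from upperDir.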

One other thing that I don't think is really an issue now (and might never be) is concurrency between multiple instances of EFS. If multiple instances of EFS were ever operating on the same file, we would need to ensure that the in-memory blocks are consistent with those in the encrypted chunks on the lowerDir. This could be done by storing a content hash (of the block) in the encrypted chunk; if the hash has changed, the chunk has been written by another EFS instance.

I can see this maybe happening in distributed file systems, but it would be easy to circumvent by only sending from upperDir to upperDir using transport level encryption. I don't think it is within scope at the moment.
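
For reference, the content hash check could look roughly like the sketch below. It uses SHA-256 from Node's `crypto` module as an assumed choice of hash; `CachedBlock`, `contentHash`, and `isStale` are hypothetical names, and how the hash would actually be stored in and read from the encrypted chunk is left out.

```typescript
// Hypothetical sketch: SHA-256 and all names here are assumptions.
import { createHash } from "node:crypto";

interface CachedBlock {
  data: Uint8Array; // plaintext block held in upperDir
  hash: string; // content hash recorded when the block was loaded or written
}

// Hex-encoded SHA-256 of a block's contents.
function contentHash(data: Uint8Array): string {
  return createHash("sha256").update(data).digest("hex");
}

function cacheBlock(data: Uint8Array): CachedBlock {
  return { data: data.slice(), hash: contentHash(data) };
}

// True when the hash stored in the lowerDir chunk no longer matches the
// cached block, i.e. something else has rewritten the chunk since.
function isStale(cached: CachedBlock, hashInChunk: string): boolean {
  return cached.hash !== hashInChunk;
}
```

On a stale result, the instance would drop the in-memory block and reload it from lowerDir before serving the read.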

@robert-cronin
Contributor

This has been implemented for write but is not yet utilised in the read method. Using it for reads will depend on the in-memory index (chunk mapping) described above.

@robert-cronin
Contributor

Closing on account of the migration to GitLab.

CMCDragonkai pushed a commit that referenced this issue May 7, 2021
Managing UpperFS Permissions and Metadata in the LowerFS

Closes #38, #8, #6, and #2

See merge request MatrixAI/Engineering/Polykey/js-encryptedfs!42