Introduce file system abstraction and an S3 implementation #492

Merged
merged 8 commits into from
Aug 13, 2021

jonatanklosko
Member

Closes #203.

Initially we intended to start with GitHub as an alternative file system, but GitHub doesn't support per-repository API tokens, which raises security concerns in our case. Instead, we went for S3, which makes it possible to limit API credentials to a specific bucket. Additionally, many storage providers expose an S3-compatible API, so we support those as well! Storj offers a bunch of free storage, so it's potentially a neat choice for playing around with Livebook.

Visible changes

There's a new settings page, where we can plug in virtual file system(s) backed by an S3 bucket. All the regular navigation and create/rename/delete operations work as expected, and the same goes for image uploads and relative link navigation. Notebooks can be persisted directly to such a file system, and there's a new option to configure/disable the autosave interval.

Demo

s3.mp4

Technical details

The file system abstraction is encapsulated in the Livebook.FileSystem protocol, which defines all operations we need a virtual file system to support. Currently there are two implementations of this protocol: Livebook.FileSystem.Local and Livebook.FileSystem.S3.

The protocol is relatively low level, as it works with paths directly, so a higher level interface is available through Livebook.FileSystem.File. This module defines a struct, which points to a specific location in a specific file system. Most operations on %File{} proxy to the underlying file system, but some operations (copy, rename) are extended to work across file systems.
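
To illustrate the layering described above, here is a minimal, language-neutral sketch in Python. It is not Livebook's Elixir code: the `FileSystem` interface, its `read`/`write`/`remove` operations, the `InMemoryFileSystem` backend and the `File` class are all hypothetical names chosen for the example, and the real protocol defines more operations than shown here.

```python
from dataclasses import dataclass
from typing import Protocol


class FileSystem(Protocol):
    """Hypothetical low-level interface: every operation takes a raw path."""

    def read(self, path: str) -> bytes: ...
    def write(self, path: str, content: bytes) -> None: ...
    def remove(self, path: str) -> None: ...


class InMemoryFileSystem:
    """Toy backend standing in for a local disk or an S3 bucket."""

    def __init__(self) -> None:
        self._objects = {}  # maps path -> content bytes

    def read(self, path: str) -> bytes:
        return self._objects[path]

    def write(self, path: str, content: bytes) -> None:
        self._objects[path] = content

    def remove(self, path: str) -> None:
        del self._objects[path]


@dataclass(frozen=True)
class File:
    """A specific location (path) in a specific file system."""

    file_system: FileSystem
    path: str

    # Most operations simply proxy to the underlying file system.
    def read(self) -> bytes:
        return self.file_system.read(self.path)

    def write(self, content: bytes) -> None:
        self.file_system.write(self.path, content)

    def remove(self) -> None:
        self.file_system.remove(self.path)

    def copy(self, destination: "File") -> None:
        # Going through read/write means source and destination
        # may live in different file systems.
        destination.write(self.read())

    def rename(self, destination: "File") -> "File":
        # Assumes destination differs from self; a cross-file-system
        # rename is modelled as copy followed by removal of the source.
        self.copy(destination)
        self.remove()
        return destination
```

With this shape, `File(local, "/a.livemd").copy(File(remote, "/a.livemd"))` works the same way whether `local` and `remote` are the same backend or two different ones, which is the point of putting the cross-file-system logic in the higher-level struct rather than in each backend.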

Contrary to regular file systems, we normalize paths so that a trailing slash determines whether the path points to a directory or a regular file. This disambiguation is necessary because S3 (and possibly other file systems) doesn't have a concept of a directory and allows dir/ and dir objects to exist at the same time.
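
The trailing-slash convention can be sketched as follows, again in Python rather than Livebook's Elixir. The helper names (`is_dir`, `ensure_dir`, `ensure_file`, `list_children`) and the leading-slash key layout are assumptions made for the example, not the actual implementation.

```python
def is_dir(path: str) -> bool:
    """A path denotes a directory exactly when it ends with a slash."""
    return path.endswith("/")


def ensure_dir(path: str) -> str:
    """Normalize a path so that it denotes a directory."""
    return path if path.endswith("/") else path + "/"


def ensure_file(path: str) -> str:
    """Normalize a path so that it denotes a regular file."""
    stripped = path.rstrip("/")
    if not stripped:
        raise ValueError("the root path can only be a directory")
    return stripped


def list_children(keys, dir_path):
    """Immediate children of dir_path in a flat key namespace.

    Directories are inferred from key prefixes, as there are no
    real directory entries in an object store.
    """
    prefix = ensure_dir(dir_path)
    children = set()
    for key in keys:
        if key == prefix or not key.startswith(prefix):
            continue
        head, sep, _ = key[len(prefix):].partition("/")
        # sep is "/" when the key continues below head, so head is a directory.
        children.add(prefix + head + sep)
    return sorted(children)
```

Given the keys `/dir`, `/dir/` and `/dir/a.livemd`, listing `/` yields both `/dir` (a file) and `/dir/` (a directory) as distinct entries, which is only unambiguous because the trailing slash is part of the normalized path.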

@dsdshcym

@jonatanklosko
If GitHub is not an option, then what about Gist?
Gist doesn't provide per-gist API tokens, but I personally would still be willing to give gist access to livebook

@jonatanklosko
Member Author

@dsdshcym we could use a single gist with multiple files to mimic a file system, yeah. However, the concern here is similar: we can only generate a personal access token with access to all gists. To be more precise, the problem arises when running in the cloud with multiple users. When you configure a file system and put your token there, any other user can programmatically extract the token, in this case gaining access to all your gists. The only difference is that gist access feels "less critical" than repository access.

@dsdshcym

@jonatanklosko I see... Thank you for the explanation!
I only considered the use case for a single user.
Now you have saved me quite some time submitting a PR for the gist backend. 🤣

BTW you are doing a really great job here! Appreciate it!

Successfully merging this pull request may close these issues.

Add pluggable filesystem
3 participants