Scaffold S3 remote persistence interface #3
The new `RemotePersistence` trait provides an interface for loading and storing files in a remote location. This PR includes one implementation of the interface, for AWS's S3 service. The `RemotePersistence` trait will optionally back the file-based persisters of results and manifests.
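As a rough illustration of the shape such an interface could take, here is a minimal synchronous sketch. Only the names `RemotePersistence` and `PersistenceKind` come from this PR; the method signatures, the `run_id` parameter, and the `InMemoryPersistence` stand-in are assumptions made for the example, and the actual trait may well be async and look different.

```rust
use std::collections::HashMap;
use std::io;

/// Kind of artifact being persisted (variant names taken from the diff in this PR).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum PersistenceKind {
    Manifest,
    Results,
}

/// Hypothetical, simplified shape of the interface: store and load a file
/// in some remote location. Signatures here are assumptions.
pub trait RemotePersistence {
    fn store(&mut self, kind: PersistenceKind, run_id: &str, data: &[u8]) -> io::Result<()>;
    fn load(&self, kind: PersistenceKind, run_id: &str) -> io::Result<Vec<u8>>;
}

/// In-memory stand-in for a remote backend such as S3 (hypothetical, for illustration).
#[derive(Default)]
pub struct InMemoryPersistence {
    files: HashMap<(PersistenceKind, String), Vec<u8>>,
}

impl RemotePersistence for InMemoryPersistence {
    fn store(&mut self, kind: PersistenceKind, run_id: &str, data: &[u8]) -> io::Result<()> {
        // Overwrites any previous contents stored under the same key.
        self.files.insert((kind, run_id.to_owned()), data.to_vec());
        Ok(())
    }

    fn load(&self, kind: PersistenceKind, run_id: &str) -> io::Result<Vec<u8>> {
        self.files
            .get(&(kind, run_id.to_owned()))
            .cloned()
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no such remote file"))
    }
}
```

Keeping the backend behind a trait like this means the file-based persisters can be tested against an in-memory implementation without talking to S3.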
With ubuntu-latest runners, it appears we can end up with queues much further away than we observed with RWX-hosted runners. We should investigate how to reduce this latency other than by increasing batch sizes; as a stopgap, bump the max overhead.
PersistenceKind::Manifest => "manifest",
PersistenceKind::Results => "results",
@ayazhafiz Do we want these to include a file extension, e.g. `.jsonl` or `.jsonl.gz`?
I don't have too much of a preference - what would you prefer?
If it's easy enough to add in a follow-up, I like having the extension so we'll never need to guess the format.
Okay, I'll be sure to add that in a follow-up PR.
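For reference, the extension idea discussed in this thread could look roughly like the sketch below. The `remote_key` function, its `<run_id>/<kind>.<extension>` layout, and the `kind_str` helper are hypothetical and not part of this PR; only the `"manifest"` and `"results"` strings come from the diff.

```rust
#[derive(Clone, Copy, Debug)]
pub enum PersistenceKind {
    Manifest,
    Results,
}

impl PersistenceKind {
    /// Same mapping as in the diff under review.
    fn kind_str(self) -> &'static str {
        match self {
            PersistenceKind::Manifest => "manifest",
            PersistenceKind::Results => "results",
        }
    }
}

/// Hypothetical key layout: `<run_id>/<kind>.<extension>`, so the stored
/// format never has to be guessed from the object's contents.
pub fn remote_key(kind: PersistenceKind, run_id: &str, extension: &str) -> String {
    format!("{}/{}.{}", run_id, kind.kind_str(), extension)
}
```

With this layout, a gzip-compressed results file for a run would be stored under a key ending in `results.jsonl.gz`.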
Support for S3 is added as a feature, so that folks self-compiling ABQ do
not have to compile in S3 support. By default, RWX will build ABQ binaries with
support for S3.
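A minimal sketch of how such feature gating can work in Rust, assuming a Cargo feature named `s3` (the actual feature name and the `select_backend` function are assumptions for illustration, not taken from this PR):

```rust
/// Hypothetical selector reporting which remote backend would be used.
/// The "s3" arm only exists when the crate is built with `--features s3`.
pub fn select_backend(requested: &str) -> Result<&'static str, String> {
    match requested {
        "none" => Ok("none"),
        // Compiled out entirely when the `s3` feature is disabled, so the
        // AWS SDK dependency is never pulled into the build.
        #[cfg(feature = "s3")]
        "s3" => Ok("s3"),
        other => Err(format!(
            "remote backend `{}` is not compiled into this binary",
            other
        )),
    }
}
```

In a binary built without the feature, asking for `"s3"` falls through to the error arm instead of failing at link time.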
Note that the implementation adds a dependency on AWS's Rust SDK,
which is presently a "developer preview" and not stable. However,
anecdotal evidence (1, 2) suggests
the dependency is usable. We also do not have much other choice with
regard to AWS S3 clients.