Add AWS S3 Connector with Streaming support #1176

Closed
mfelsche opened this issue Aug 11, 2021 · 7 comments
Labels: _complexity:medium (a task with a medium complexity that should be challenging), enhancement (new feature or request), mentorship

Comments

@mfelsche
Member

Describe the problem you are trying to solve

It is very common in event processing to stream data to some kind of persistent storage engine for later processing or archiving purposes. One very prominent storage engine is AWS S3.

A common practice is to stream data into files that aggregate across a time window (e.g. 1 hour) or that accumulate a certain number of events or grow to a certain size. An AWS S3 connector should support this style of streaming.
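The three rollover triggers named above (time window, event count, size) can be sketched as a small policy check. This is only an illustration using the standard library: `RolloverPolicy`, `OpenObject`, and their fields are hypothetical names, not tremor configuration or part of any existing connector.

```rust
use std::time::{Duration, Instant};

/// Hypothetical rollover limits; the names are illustrative only.
#[derive(Debug, Clone, Copy)]
pub struct RolloverPolicy {
    pub max_age: Duration,
    pub max_events: usize,
    pub max_bytes: usize,
}

/// Tracks what has accumulated for the object currently being written.
#[derive(Debug)]
pub struct OpenObject {
    opened_at: Instant,
    events: usize,
    bytes: usize,
}

impl OpenObject {
    pub fn new(now: Instant) -> Self {
        Self { opened_at: now, events: 0, bytes: 0 }
    }

    /// Record one serialized event of `len` bytes.
    pub fn record(&mut self, len: usize) {
        self.events += 1;
        self.bytes += len;
    }

    /// True once any one of the three limits has been reached.
    pub fn should_roll(&self, policy: &RolloverPolicy, now: Instant) -> bool {
        now.duration_since(self.opened_at) >= policy.max_age
            || self.events >= policy.max_events
            || self.bytes >= policy.max_bytes
    }
}
```

A sink built along these lines would call `record` for each event and, whenever `should_roll` returns true, finish the current S3 object and start a new one.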

Describe the solution you'd like

We would like to have an AWS S3 Connector that enables tremor to read S3 objects in a streaming fashion (source-part of the connector) and to write data to S3 objects also in a streaming fashion (sink-part of the connector).
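One way the sink part could write in a streaming fashion is to cut the incoming byte stream into parts and hand each part to an S3 multipart upload. That approach is an assumption here, not something this issue prescribes. S3 requires every part except the last to be at least 5 MiB, so a real sink would use a part size at or above that. The sketch below covers only the buffering step and uses the standard library; `PartBuffer` is a hypothetical name, and the actual SDK upload calls are deliberately left out.

```rust
/// Buffers incoming bytes and cuts them into fixed-size parts.
/// Hypothetical helper; not part of tremor or the AWS SDK.
pub struct PartBuffer {
    part_size: usize,
    buf: Vec<u8>,
}

impl PartBuffer {
    pub fn new(part_size: usize) -> Self {
        assert!(part_size > 0, "part_size must be non-zero");
        Self { part_size, buf: Vec::new() }
    }

    /// Append data and return every part that is now complete.
    pub fn push(&mut self, data: &[u8]) -> Vec<Vec<u8>> {
        self.buf.extend_from_slice(data);
        let mut parts = Vec::new();
        while self.buf.len() >= self.part_size {
            // Keep the tail in the buffer and emit the first `part_size` bytes.
            let rest = self.buf.split_off(self.part_size);
            parts.push(std::mem::replace(&mut self.buf, rest));
        }
        parts
    }

    /// Return the remaining bytes as the final, possibly short, part.
    pub fn finish(self) -> Option<Vec<u8>> {
        if self.buf.is_empty() { None } else { Some(self.buf) }
    }
}
```

Each `Vec<u8>` returned by `push` would become one uploaded part, and the value returned by `finish` would be uploaded last before completing the upload.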

It should support all the common ways of authentication to AWS and maintain authentication across the whole lifetime of the connector (e.g. through token refresh etc.).

It should use the official Rust SDK: https://github.com/awslabs/aws-sdk-rust

mfelsche added the enhancement, mentorship, and _complexity:medium labels on Aug 11, 2021
@rahul799

Hey @mfelsche,
This project seems very interesting to me, so I would like to work on it as part of the LFX'21 Mentorship program. Thank you.

@mfelsche
Member Author

Nice!

Please apply via the LFX site once it appears here as a mentorship. This might take some days: https://mentorship.lfx.linuxfoundation.org/#projects_accepting

We will handle it from there. Here is a tutorial-like guide from the LFX on how to apply: https://docs.linuxfoundation.org/lfx/mentorship/mentees/apply-to-a-project

@dak-x
Contributor

dak-x commented Aug 16, 2021

Hi @mfelsche.

  • The aws-sdk is currently available only as alpha, and I do not see any roadmap for an official release yet. Does the support have to be experimental for now?
  • smithy-rs (the code generation tool for the SDK) does not produce runtime-agnostic code. Are we expected to contribute this to their SDK as well? I will have to check whether they are accepting contributions for smithy-rs. I just hope this is straightforward.

@mfelsche
Member Author

Hi @dak-x, it would be cool to not rely on the tokio runtime that is used by the Rust aws-sdk, but changing the codegen tool for the aws-sdk (smithy-rs) is not a requirement. It would be wicked cool, nonetheless 😎

@mfelsche
Member Author

Also, I wouldn't worry about the SDK being experimental. This is fine!

@OliverShang

Hi @mfelsche,

I have applied to this mentorship program via LFX; this project seems interesting. I have some experience with other object storage services such as Aliyun OSS. I hope to get the opportunity to work on this project.

Thanks,

dak-x mentioned this issue on Nov 26, 2021 (merged)
dak-x mentioned this issue on Jan 10, 2022
@mfelsche
Member Author

This is done
