In S3 source connector, how to fetch only new files? #311
You need to set `deleteAfterRead` to `true`; the file will then be deleted once it has been consumed.
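For reference, a minimal connector config sketch with this option set. This assumes the property-name layout used by camel-kafka-connector (e.g. the 0.x releases); the connector name, topic, and bucket are placeholders, so adjust them to your setup:

```properties
name=s3-source-connector
connector.class=org.apache.camel.kafkaconnector.awss3.CamelAwss3SourceConnector
topics=s3.topic
# Bucket to poll (placeholder name)
camel.source.path.bucketNameOrArn=my-bucket
# Delete each object from S3 once it has been consumed
camel.source.endpoint.deleteAfterRead=true
```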
Explained.
What if I don't want to remove content from S3?
You can use the aws2-s3 component with the `moveAfterRead` option.
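A hedged sketch of what that could look like with the aws2-s3 connector. Again, the exact property prefixes depend on your camel-kafka-connector version, and the bucket names here are placeholders:

```properties
connector.class=org.apache.camel.kafkaconnector.aws2s3.CamelAws2s3SourceConnector
topics=s3.topic
camel.source.path.bucketNameOrArn=my-bucket
# Instead of deleting, move each consumed object to another bucket
camel.source.endpoint.moveAfterRead=true
camel.source.endpoint.destinationBucket=my-processed-bucket
```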
I'll add an example in the camel-kafka-connector-examples repository.
Thanks for the help. Using `moveAfterRead` expects a destination bucket. What if I don't want to use a new bucket?
No, that's not possible. The connector is generated from the Camel component, so you need to move the file somewhere else; currently only a different bucket is supported.
Do we plan to add this functionality down the line?
No, we don't; it doesn't make sense. If you use Camel directly, you can add an idempotent repository to the consumer, so each file is consumed just once; at the same time, we cannot change the Camel component's behavior for a particular use case. Moving files to a different bucket is already a good tradeoff.
Got it, thanks.
What do you mean by this?
In Apache Camel you can use the Idempotent Consumer EIP: https://camel.apache.org/components/latest/eips/idempotentConsumer-eip.html So if you really need to avoid moving files to a different bucket, you can skip the camel-aws-s3 connector and switch your application to a pure Camel route, using the camel-aws-s3 component in combination with the camel-kafka component.
Sure, trying it out. A question regarding the idempotent consumer: will duplicate files still reach Kafka, or does this mechanism filter the file list and send only the new ones?
No, they will be filtered before reaching Kafka. If you use the idempotentRepository, don't forget to set `deleteAfterRead=false` so the files won't be deleted.
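To make the filtering behavior concrete, here is a plain-Java sketch of what an idempotent repository does conceptually. This is not Camel's actual `IdempotentRepository` API, just a minimal in-memory stand-in (class and method names are invented for illustration) showing why each S3 key is forwarded only once:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Conceptual sketch (not Camel's API): an in-memory store of S3 keys
// that have already been consumed, used to skip duplicates.
public class IdempotentFilter {
    private final Set<String> seen = new HashSet<>();

    // Returns true only the first time a key is offered; later offers
    // of the same key return false, so the file is not re-consumed.
    public boolean addIfNew(String key) {
        return seen.add(key);
    }

    public static void main(String[] args) {
        IdempotentFilter repo = new IdempotentFilter();
        // First bucket listing: both files are new and get forwarded.
        for (String key : List.of("a.csv", "b.csv")) {
            if (repo.addIfNew(key)) System.out.println("consume " + key);
        }
        // Second listing after uploading c.csv: only c.csv is forwarded,
        // even though a.csv and b.csv are still present in the bucket.
        for (String key : List.of("a.csv", "b.csv", "c.csv")) {
            if (repo.addIfNew(key)) System.out.println("consume " + key);
        }
    }
}
```

A persistent repository (e.g. backed by a database) would behave the same way across restarts, which is what lets you keep `deleteAfterRead=false` without re-reading old files.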
I am using the S3 source connector to fetch files from S3. It works fine and gets back all records.
But on adding new files, it again returns all records.
Here is my configuration
How do I set this configuration?