Clarification Required: S3 Source Connector doesn't fetch new files #1570
Comments
If you want to achieve this, you need to set deleteAfterRead or increase the max messages per poll. Your configuration will always poll the same 10 files, since you don't move or delete them. Another possibility for achieving what you are looking for is using the following connector: https://github.com/apache/camel-kafka-connector/tree/camel-kafka-connector-4.0.0/connectors/camel-aws-s3-cdc-source-kafka-connector. With this one you should be able to consume new files without deleting them. Here are the docs: https://camel.apache.org/camel-kafka-connector/next/reference/connectors/camel-aws-s3-cdc-source-kafka-source-connector.html
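To make the suggestion concrete, a minimal sketch of a Kafka Connect properties file for the suggested CDC connector might look like the following. The property names assume the usual `camel.kamelet.<kamelet-name>.<option>` pattern, and the connector class, bucket name, region, and credentials shown here are illustrative assumptions; consult the linked reference docs for the exact options your version exposes.

```properties
# Illustrative sketch only: camel-aws-s3-cdc-source connector config.
# Connector class and kamelet option names are assumptions based on the
# camel-kafka-connector naming conventions; verify against the linked docs.
name=s3-cdc-source
connector.class=org.apache.camel.kafkaconnector.awss3cdcsource.CamelAwss3cdcsourceSourceConnector
tasks.max=1
# Kafka topic that receives the S3 change events
topics=s3-events
camel.kamelet.aws-s3-cdc-source.bucketNameOrArn=my-bucket
camel.kamelet.aws-s3-cdc-source.region=eu-west-1
camel.kamelet.aws-s3-cdc-source.accessKey=xxx
camel.kamelet.aws-s3-cdc-source.secretKey=xxx
```

Because this connector reacts to S3 change notifications rather than repeatedly listing the bucket, new files can be consumed without deleting or moving the already-processed ones.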
Thanks for the quick response.
Is this value configurable for the connector?
No, it's not exposed. I can add that, but in your case, whatever the value is, it won't cover your use case unless you delete after read or move the files.
It would be great if it were exposed.
Opened an issue on the camel-kamelets project.
Thanks a lot for apache/camel-kamelets#1692 🙏
We first need to release camel-kamelets 4.1.0. I'm planning to do it this week or at the beginning of the next; then we can upgrade it in camel-kafka-connector and release, thanks to @valdar.
Use case:
I'm trying out the S3 source connector. The S3 bucket will be periodically updated, and I want the new files to be sourced to the Kafka topic, without duplicates, without deleting from the existing S3 bucket, and without moving to a new bucket.
Test
My test was with deleteAfterRead set to false and with idempotency enabled (with the Kafka idempotent repository), using the configuration below.
Configuration:
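The original poster's exact configuration is not shown here, but a setup matching that description could be sketched roughly as follows. The connector class, bucket name, credentials, and idempotency property names are illustrative assumptions based on the camel-kafka-connector conventions, not the poster's actual values.

```properties
# Illustrative sketch only (the original poster's real config is not shown).
# Assumes camel-aws-s3-source connector naming; verify option names against
# the camel-kafka-connector reference docs for your version.
name=s3-source
connector.class=org.apache.camel.kafkaconnector.awss3source.CamelAwss3sourceSourceConnector
tasks.max=1
topics=s3-files
camel.kamelet.aws-s3-source.bucketNameOrArn=my-bucket
camel.kamelet.aws-s3-source.region=eu-west-1
camel.kamelet.aws-s3-source.accessKey=xxx
camel.kamelet.aws-s3-source.secretKey=xxx
# Keep files in the bucket after they are consumed
camel.kamelet.aws-s3-source.deleteAfterRead=false
# Idempotency backed by a Kafka topic, to avoid re-emitting seen files
camel.idempotency.enabled=true
camel.idempotency.repository.type=kafka
```

Note that, as discussed above, idempotency alone does not help here: with deleteAfterRead=false the connector keeps polling the same first batch of objects, so later files are never reached.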
Issue:
The new files are not fetched. From the Kafka Connect DEBUG logs, it looks like only the first few files (10 or so) are fetched during each poll to S3.
Other info:
Is maxMessagesPerPoll 10 by default? But then there seems to be no configuration property to set this for the S3 source connector? 🤔
Versions tested
- camel-aws2-s3-kafka-connector 0.11.5
- camel-aws-s3-source-kafka-connector 3.20.6
- camel-aws-s3-source-kafka-connector 4.0.0
Question
Please let me know if the intended use case can be realized, and if so, what I am missing. Kindly advise 🙏