Grab topics using regex and put in S3 by topic name #1020

Closed
cristianburca opened this issue Feb 17, 2021 · 4 comments

cristianburca commented Feb 17, 2021

Hello,

I would like to know whether it is possible to write messages from topics matched by a regex into separate folders in S3, one folder per topic name.

Let's say we have these topics:

test.topic1.error
test.topic2.error
test.topic3.error
test.topic1.raw
test.topic2.raw
test.topic3.raw

My S3 sink connector, which selects topics with a regex, looks like this:

{
  "name": "s3-camel-pattern",
  "config": {
    "connector.class": "org.apache.camel.kafkaconnector.awss3.CamelAwss3SinkConnector",
    "tasks.max": "1",
    "key.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
    "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
    "topics.regex": ".*(error|raw)$",
    "camel.sink.path.bucketNameOrArn": "camelecareconnectortestreg",
    "camel.sink.endpoint.keyName": "${date:now:yyyyMMdd-HHmmssSSS}-${exchangeId}",
    "camel.component.aws-s3.region": "xxx",
    "camel.component.aws-s3.accessKey": "xxx",
    "camel.component.aws-s3.secretKey": "xxx"
  }
}
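
As a quick sanity check on which topics a pattern like this picks up, here is a minimal Python sketch. As far as I know, Kafka Connect evaluates `topics.regex` with Java's regex engine against the whole topic name; for a pattern this simple, Python's `re.fullmatch` should behave the same way. The topic `test.topic1.info` is a made-up name, added only to show a non-match.

```python
import re

# Pattern copied from the "topics.regex" setting in the sink config.
TOPICS_REGEX = r".*(error|raw)$"


def matching_topics(topics, pattern=TOPICS_REGEX):
    """Return the topics whose full name matches the pattern."""
    compiled = re.compile(pattern)
    return [t for t in topics if compiled.fullmatch(t)]


topics = [
    "test.topic1.error",
    "test.topic2.error",
    "test.topic3.error",
    "test.topic1.raw",
    "test.topic2.raw",
    "test.topic3.raw",
    "test.topic1.info",  # hypothetical topic, not from the issue
]

selected = matching_topics(topics)
print(selected)
```

All six `.error` and `.raw` topics are selected, and the hypothetical `.info` topic is left out.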

Right now, the messages from all matched topics are written directly into the root of the S3 bucket.

I was wondering whether we could split them into folders by topic name,
e.g. by adding something like:
"camel.sink.endpoint.keyName": "$TOPIC_NAME/${date:now:yyyyMMdd-HHmmssSSS}-${exchangeId}"
where TOPIC_NAME is resolved automatically to test.topic1.error and the other topic names.
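
To make the desired layout concrete, here is a small Python sketch of the object key each message would get. `build_key` and its arguments are hypothetical and exist only for illustration; they are not part of the connector, and the exchange ID shown is made up.

```python
from datetime import datetime


def build_key(topic: str, now: datetime, exchange_id: str) -> str:
    """Build '<topic>/<yyyyMMdd-HHmmssSSS>-<exchangeId>' (illustrative only)."""
    millis = now.microsecond // 1000  # SSS part of the timestamp
    stamp = now.strftime("%Y%m%d-%H%M%S") + f"{millis:03d}"
    return f"{topic}/{stamp}-{exchange_id}"


example = build_key(
    "test.topic1.error",
    datetime(2021, 2, 17, 10, 30, 15, 123000),
    "ID-example-1",  # made-up exchange ID
)
print(example)  # test.topic1.error/20210217-103015123-ID-example-1
```

With keys shaped like this, every topic ends up under its own prefix, which is what the source connector below would filter on.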

The idea is then to use an S3 source connector that reads the messages under a specific S3 prefix and writes them to a particular topic (e.g. test.topic1.error):

{
  "name": "s3-camel-ecare-source-for-topic1",
  "config": {
    "connector.class": "org.apache.camel.kafkaconnector.awss3.CamelAwss3SourceConnector",
    "tasks.max": "1",
    "key.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
    "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
    "topics": "s3.test.topic1.error",
    "camel.source.path.bucketNameOrArn": "camelecareconnectortestreg?autocloseBody=false",
    "camel.source.endpoint.prefix": "test.topic1.error/",
    "camel.source.endpoint.includeBody": "true",
    "camel.component.aws-s3.deleteAfterRead": "false",
    "camel.component.aws-s3.region": "xxx",
    "camel.component.aws-s3.accessKey": "xxx",
    "camel.component.aws-s3.secretKey": "xxx"
  }
}

oscerd (Contributor) commented Feb 18, 2021

The expression you see in the key name is evaluated by Camel, while the topic name would have to be retrieved from the connector configuration, so no, at the moment this is not possible. If you want to achieve this, you'll have to define n connectors, one for each topic.
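
One way to manage the one-connector-per-topic workaround is to generate the configurations from a topic list. The Python sketch below reuses the settings from the sink config in the original post. The `sink_config` helper, the connector naming scheme, and the idea of hardcoding the topic name as a literal prefix in `keyName` are assumptions and have not been verified against the connector.

```python
import json

# Topic names taken from the original post.
TOPICS = [
    "test.topic1.error",
    "test.topic2.error",
    "test.topic3.error",
    "test.topic1.raw",
    "test.topic2.raw",
    "test.topic3.raw",
]

CONVERTER = "org.apache.kafka.connect.converters.ByteArrayConverter"


def sink_config(topic: str) -> dict:
    """Build one sink connector config that handles a single topic."""
    return {
        "name": f"s3-camel-sink-{topic}",  # hypothetical naming scheme
        "config": {
            "connector.class": "org.apache.camel.kafkaconnector.awss3.CamelAwss3SinkConnector",
            "tasks.max": "1",
            "key.converter": CONVERTER,
            "value.converter": CONVERTER,
            "topics": topic,  # one fixed topic instead of topics.regex
            "camel.sink.path.bucketNameOrArn": "camelecareconnectortestreg",
            # Assumption: the literal topic prefix is passed through unchanged
            # and only the ${...} parts are evaluated by Camel.
            "camel.sink.endpoint.keyName": topic
            + "/${date:now:yyyyMMdd-HHmmssSSS}-${exchangeId}",
            "camel.component.aws-s3.region": "xxx",
            "camel.component.aws-s3.accessKey": "xxx",
            "camel.component.aws-s3.secretKey": "xxx",
        },
    }


configs = [sink_config(t) for t in TOPICS]
print(json.dumps(configs[0], indent=2))
```

Each generated document could then be submitted to the Kafka Connect REST API (`POST /connectors`), one request per topic.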

@dhope-nagesh

@oscerd Is there any timeline for this feature, if you are going to consider it in the future?

oscerd (Contributor) commented Mar 3, 2021

We are actually not considering this. I think I'll close this one.

oscerd (Contributor) commented Mar 3, 2021

Thanks for pinging.

oscerd closed this as completed Mar 3, 2021