Deployment in AWS MSK Connect fails with sts "connect timed out" #30

Closed
kyrsideris opened this issue Jun 23, 2022 · 5 comments

kyrsideris commented Jun 23, 2022

Hello,

Thank you very much for this nice connector, I wish I could use it!
I am deploying it in AWS MSK Connect and the exception that I see is the following:

com.amazonaws.SdkClientException: Unable to execute HTTP request: Connect to sts.amazonaws.com:443 [sts.amazonaws.com/X.Y.Z.V] failed: connect timed out

Any help will be appreciated! 🙏

Configuration

The role that I have provided to the MSK Connector (test-kafka-connect-sqs-source-role) has more than enough permissions in the policies and the trust policy looks like this:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "kafkaconnect.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        },
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::123456789012:root"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}

The trust in kafkaconnect.amazonaws.com is needed so that the service can use this IAM role.

The configuration of the connector is the following:

name=TestSQSToKafka
connector.class=com.nordstrom.kafka.connect.sqs.SqsSourceConnector
key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.storage.StringConverter
topics=event-via-sqs
tasks.max=3
sqs.wait.time.seconds=5
sqs.max.messages=5
sqs.queue.url=https://sqs.eu-west-1.amazonaws.com/123456789012/test-event
sqs.region=eu-west-1
sqs.credentials.provider.class=com.nordstrom.kafka.connect.auth.AWSAssumeRoleCredentialsProvider
sqs.credentials.provider.role.arn=arn:aws:iam::123456789012:role/test-kafka-connect-sqs-source-role
sqs.credentials.provider.session.name=my-kafka-connector-source-sqs-session
sqs.credentials.provider.external.id=test-kafka-connect-sqs-source-role

Kafka authentication method is set to "None"

Versions

Apache Kafka version: 2.6.2
Apache Kafka Connect version: 2.7.1
kafka-connect-sqs: 1.4.0

Logs

The full exception is the following:

 [2022-06-23 13:22:30,701] ERROR [TestSQSToKafka|task-0] WorkerSourceTask{id=TestSQSToKafka-0} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask:191)"
 com.amazonaws.SdkClientException: Unable to execute HTTP request: Connect to sts.amazonaws.com:443 [sts.amazonaws.com/X.Y.Z.V] failed: connect timed out
 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1219)
 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1165)
 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:814)
 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:781)
 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:755)
 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:715)
 	at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:697)
 	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:561)
 	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:541)
 	at com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClient.doInvoke(AWSSecurityTokenServiceClient.java:1727)
 	at com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClient.invoke(AWSSecurityTokenServiceClient.java:1694)
 	at com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClient.invoke(AWSSecurityTokenServiceClient.java:1683)
 	at com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClient.executeAssumeRole(AWSSecurityTokenServiceClient.java:532)
 	at com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClient.assumeRole(AWSSecurityTokenServiceClient.java:501)
 	at com.amazonaws.auth.STSAssumeRoleSessionCredentialsProvider.newSession(STSAssumeRoleSessionCredentialsProvider.java:348)
 	at com.amazonaws.auth.STSAssumeRoleSessionCredentialsProvider.access$000(STSAssumeRoleSessionCredentialsProvider.java:44)
 	at com.amazonaws.auth.STSAssumeRoleSessionCredentialsProvider$1.call(STSAssumeRoleSessionCredentialsProvider.java:93)
 	at com.amazonaws.auth.STSAssumeRoleSessionCredentialsProvider$1.call(STSAssumeRoleSessionCredentialsProvider.java:90)
 	at com.amazonaws.auth.RefreshableTask.refreshValue(RefreshableTask.java:295)
 	at com.amazonaws.auth.RefreshableTask.blockingRefresh(RefreshableTask.java:251)
 	at com.amazonaws.auth.RefreshableTask.getValue(RefreshableTask.java:192)
 	at com.amazonaws.auth.STSAssumeRoleSessionCredentialsProvider.getCredentials(STSAssumeRoleSessionCredentialsProvider.java:320)
 	at com.amazonaws.auth.STSAssumeRoleSessionCredentialsProvider.getCredentials(STSAssumeRoleSessionCredentialsProvider.java:43)
 	at com.nordstrom.kafka.connect.auth.AWSAssumeRoleCredentialsProvider.getCredentials(AWSAssumeRoleCredentialsProvider.java:43)
 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.getCredentialsFromContext(AmazonHttpClient.java:1269)
 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.runBeforeRequestHandlers(AmazonHttpClient.java:845)
 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:794)
 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:781)
 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:755)
 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:715)
 	at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:697)
 	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:561)
 	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:541)
 	at com.amazonaws.services.sqs.AmazonSQSClient.doInvoke(AmazonSQSClient.java:2329)
 	at com.amazonaws.services.sqs.AmazonSQSClient.invoke(AmazonSQSClient.java:2296)
 	at com.amazonaws.services.sqs.AmazonSQSClient.invoke(AmazonSQSClient.java:2285)
 	at com.amazonaws.services.sqs.AmazonSQSClient.executeReceiveMessage(AmazonSQSClient.java:1715)
 	at com.amazonaws.services.sqs.AmazonSQSClient.receiveMessage(AmazonSQSClient.java:1683)
 	at com.nordstrom.kafka.connect.sqs.SqsClient.receive(SqsClient.java:116)
 	at com.nordstrom.kafka.connect.sqs.SqsSourceConnectorTask.poll(SqsSourceConnectorTask.java:80)
 	at org.apache.kafka.connect.runtime.WorkerSourceTask.poll(WorkerSourceTask.java:291)
 	at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:248)
 	at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:189)
 	at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:238)
 	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
 	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
 	at java.base/java.lang.Thread.run(Thread.java:829)
 Caused by: org.apache.http.conn.ConnectTimeoutException: Connect to sts.amazonaws.com:443 [sts.amazonaws.com/209.54.177.164] failed: connect timed out
 	at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:151)
 	at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:376)
 	at jdk.internal.reflect.GeneratedMethodAccessor53.invoke(Unknown Source)
 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
 	at com.amazonaws.http.conn.ClientConnectionManagerFactory$Handler.invoke(ClientConnectionManagerFactory.java:76)
 	at com.amazonaws.http.conn.$Proxy42.connect(Unknown Source)
 	at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393)
 	at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
 	at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
 	at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
 	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
 	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
 	at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1346)
 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1157)
 	... 47 more
 Caused by: java.net.SocketTimeoutException: connect timed out
 	at java.base/java.net.PlainSocketImpl.socketConnect(Native Method)
 	at java.base/java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:412)
 	at java.base/java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:255)
 	at java.base/java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:237)
 	at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
 	at java.base/java.net.Socket.connect(Socket.java:609)
 	at org.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:368)
 	at com.amazonaws.http.conn.ssl.SdkTLSSocketFactory.connectSocket(SdkTLSSocketFactory.java:142)
 	at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142)
 	... 62 more
 [2022-06-23 13:22:30,702] INFO [TestSQSToKafka|task-0] task.stop:OK (com.nordstrom.kafka.connect.sqs.SqsSourceConnectorTask:126)"

@dylanmei (Contributor)

The "connect timed out" error for sts.amazonaws.com implies that the endpoint cannot be reached from the worker. Is it possible that the Kafka Connect cluster runs in an unusual networking configuration?

If the queue and connector accounts are the same, you may not need to assume a role at all, although it is nice to have.
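One way to narrow down the timeout is to test plain TCP reachability from a host that shares the connector's subnets and security groups. The following is a minimal sketch, not part of the connector: `can_connect` is a hypothetical helper, and it assumes you can run Python on such a host (for example an EC2 instance), since code cannot be run directly on the MSK Connect workers.

```python
import socket


def can_connect(host: str, port: int, timeout_seconds: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        # create_connection resolves the name and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers connect timeouts, refused connections, and DNS failures.
        return False
```

Calling `can_connect("sts.amazonaws.com", 443)` and `can_connect("sqs.eu-west-1.amazonaws.com", 443)` from such a host should show whether the AWS endpoints are reachable at all. If both return `False`, the problem is more likely in the network path than in the IAM role or the connector configuration.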

@kyrsideris (Author)

Thanks for your reply @dylanmei. Yes, everything is in the same account. Let's say that I don't want to assume a role at all; should I just remove the following properties?

sqs.credentials.provider.class
sqs.credentials.provider.role.arn
sqs.credentials.provider.session.name
sqs.credentials.provider.external.id

@dylanmei (Contributor)

That's correct, it should just inherit the context of the worker in that case. The README doesn't make it very clear that these properties are optional 😞, but you can see that we don't supply any such values in the demo folder.
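For reference, the edit discussed here amounts to dropping every `sqs.credentials.provider.*` line from the configuration in the original post. A small sketch of that, assuming the configuration is kept as plain properties text; `drop_assume_role_properties` is a hypothetical helper name, not something the connector provides:

```python
ASSUME_ROLE_PREFIX = "sqs.credentials.provider."


def drop_assume_role_properties(properties_text: str) -> str:
    """Return the properties text without the assume-role settings."""
    kept = [
        line
        for line in properties_text.splitlines()
        if not line.strip().startswith(ASSUME_ROLE_PREFIX)
    ]
    return "\n".join(kept)


# Connector configuration as posted in this issue.
config = """\
name=TestSQSToKafka
connector.class=com.nordstrom.kafka.connect.sqs.SqsSourceConnector
key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.storage.StringConverter
topics=event-via-sqs
tasks.max=3
sqs.wait.time.seconds=5
sqs.max.messages=5
sqs.queue.url=https://sqs.eu-west-1.amazonaws.com/123456789012/test-event
sqs.region=eu-west-1
sqs.credentials.provider.class=com.nordstrom.kafka.connect.auth.AWSAssumeRoleCredentialsProvider
sqs.credentials.provider.role.arn=arn:aws:iam::123456789012:role/test-kafka-connect-sqs-source-role
sqs.credentials.provider.session.name=my-kafka-connector-source-sqs-session
sqs.credentials.provider.external.id=test-kafka-connect-sqs-source-role
"""

trimmed = drop_assume_role_properties(config)
print(trimmed)
```

The remaining ten properties are the ones the connector would be deployed with when it relies on the worker's own credentials.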

@kyrsideris (Author)

I removed the above properties and I am observing the same failure mode. 😞
I will not push further to fix this issue.
Thank you again for your help @dylanmei! 🙏
Please close this issue as inconclusive.

@mshidlov

Hey @kyrsideris, I had a similar issue #31. In my case, MSK Connect didn't have internet access. The connector worked fine when I added a NAT to the subnet that the MSK cluster was connected to.
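To check for the missing-NAT situation described above, one option is to look at the route table of the connector's subnets for a default route through a NAT gateway. The sketch below is only an illustration under stated assumptions: `has_default_route_via_nat` is a hypothetical helper, the route dictionaries are assumed to have the shape of the `Routes` entries returned by the EC2 `DescribeRouteTables` API, the IDs are made up, and a NAT instance (as opposed to a NAT gateway) would not be detected.

```python
from typing import Any, Dict, List


def has_default_route_via_nat(routes: List[Dict[str, Any]]) -> bool:
    """Return True if an active 0.0.0.0/0 route targets a NAT gateway."""
    for route in routes:
        is_default = route.get("DestinationCidrBlock") == "0.0.0.0/0"
        is_active = route.get("State") == "active"
        if is_default and is_active and route.get("NatGatewayId"):
            return True
    return False


# Hypothetical route tables: one with only the local VPC route,
# one with an additional default route through a NAT gateway.
local_only = [
    {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local", "State": "active"},
]
with_nat = local_only + [
    {
        "DestinationCidrBlock": "0.0.0.0/0",
        "NatGatewayId": "nat-0123456789abcdef0",
        "State": "active",
    },
]

print(has_default_route_via_nat(local_only))
print(has_default_route_via_nat(with_nat))
```

A subnet whose route table looks like `local_only` has no path to public endpoints such as sts.amazonaws.com, which would be consistent with the "connect timed out" error in this issue.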
