Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connecting Schema-Registry ECS Container to AWS MSK #1126

Open
GeorgeWB19 opened this issue May 23, 2019 · 17 comments
Open

Connecting Schema-Registry ECS Container to AWS MSK #1126

GeorgeWB19 opened this issue May 23, 2019 · 17 comments

Comments

@GeorgeWB19
Copy link

GeorgeWB19 commented May 23, 2019

Hi Everyone,

I am trying for the past days to connect my Schema Registry ECS container to AWS MSK but the container keeps stopping.

2019-05-22 14:06:17 at io.confluent.admin.utils.cli.KafkaReadyCommand.main(KafkaReadyCommand.java:143) 2019-05-22 14:06:17java.lang.RuntimeException: No endpoints found for security protocol [PLAINTEXT]. Endpoints found in ZK [{REPLICATION=b-1-internal.TESTCLUSTER.ttx77r.c3.kafka.eu-west-1.amazonaws.com:9093, CLIENT=b-1.TESTCLUSTER.ttx77r.c3.kafka.eu-west-1.amazonaws.com:9092}] 2019-05-22 14:06:17[main] ERROR io.confluent.admin.utils.cli.KafkaReadyCommand - Error while running kafka-ready. 2019-05-22 14:06:16[main-EventThread] INFO org.apache.zookeeper.ClientCnxn - EventThread shut down for session: 0x3000003ecf70041 2019-05-22 14:06:16[main] INFO org.apache.zookeeper.ZooKeeper - Session: 0x3000003ecf70041 closed 2019-05-22 14:06:16[main-SendThread(172.xxx.xxx.xxx:2181)] INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on server 172.xxx.xxx.xxx/172.xxx.xxx.xxx:2181, sessionid = 0x3000003ecf70041, negotiated timeout = 40000 2019-05-22 14:06:16[main-SendThread(172.xxx.xxx.xxx:2181)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established to 172.xxx.xxx.xxx/172.xxx.xxx.xxx:2181, initiating session 2019-05-22 14:06:16[main-SendThread(172.xxx.xxx.xxx:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket

I am passing the following parameters:

`

SCHEMA_REGISTRY_DEBUG true
SCHEMA_REGISTRY_HOST_NAME arn:aws:kafka:eu-west-1:791843256015:cluster/TestCluster/6d14a907-fe45-4d4b-bd00-b933956a3289-3
SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL 172.xxx.xxx.xxx:2181,172.xxx.xxx.xxx:2181,172.xxx.xxx.xxx:2181
SCHEMA_REGISTRY_KAFKASTORE_TOPIC_REPLICATION_FACTOR 3
SCHEMA_REGISTRY_LISTENERS http://0.0.0.0:8081
`

Has anyone tried before and succeeded on connecting the two?

Thank you!

@OneCricketeer
Copy link
Contributor

OneCricketeer commented Jun 6, 2019

Endpoints found in ZK [{REPLICATION=b-1-internal.TESTCLUSTER.ttx77r.c3.kafka.eu-west-1.amazonaws.com:9093, CLIENT=b-1.TESTCLUSTER.ttx77r.c3.kafka.eu-west-1.amazonaws.com:9092}]

You would need to set CLIENT as the Kafka store connection protocol rather than the default PLAINTEXT because that's apparently what's advertised

I would suggest just using Kafka master election rather than Zookeeper anyway

@chasdevs
Copy link

chasdevs commented Jun 9, 2019

Also running into this issue. Setting CLIENT as the connection protocol does not seem to work, though, because schema registry only accepts PLAINTEXT, SASL_PLAINTEXT, SSL or SASL_SSL (docs).

[main] ERROR io.confluent.admin.utils.cli.KafkaReadyCommand - Error while running kafka-ready.
org.apache.kafka.common.KafkaException: Failed create new KafkaAdminClient
        at org.apache.kafka.clients.admin.KafkaAdminClient.createInternal(KafkaAdminClient.java:378)
        at org.apache.kafka.clients.admin.AdminClient.create(AdminClient.java:64)
        at io.confluent.admin.utils.ClusterStatus.isKafkaReady(ClusterStatus.java:138)
        at io.confluent.admin.utils.cli.KafkaReadyCommand.main(KafkaReadyCommand.java:150)
Caused by: java.lang.IllegalArgumentException: No enum constant org.apache.kafka.common.security.auth.SecurityProtocol.CLIENT
        at java.lang.Enum.valueOf(Enum.java:238)
        at org.apache.kafka.common.security.auth.SecurityProtocol.valueOf(SecurityProtocol.java:26)
        at org.apache.kafka.common.security.auth.SecurityProtocol.forName(SecurityProtocol.java:72)
        at org.apache.kafka.clients.ClientUtils.createChannelBuilder(ClientUtils.java:106)
        at org.apache.kafka.clients.admin.KafkaAdminClient.createInternal(KafkaAdminClient.java:353)
        ... 3 more

@cgrimal
Copy link

cgrimal commented Jun 14, 2019

Hi guys,

I've been struggling with the exact issue for a couple of days, and I think I managed to connect my schema registry (on ECS) to MSK.

I switched to Kafka master election by removing the SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL and replacing it by the SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS. I initially copy-pasted the value from the MSK client information pop-up, but it was still failing. I had to prepend PLAINTEXT:// in front of the addresses like that:
PLAINTEXT://X.Y.Z.A:9092,PLAINTEXT://X.Y.Z.B:9092,PLAINTEXT://X.Y.Z.C:9092

Not entirely sure yet it is exactly what I want, but it sure is better to have the schema registry up and running! I hope this helps and I'll stick around for a little while.

@pgottvalles
Copy link

Hi,

I have a working MSK cluster configured with SSl.
I have set up ACL on all of my topics and I can connet to them accordingly depending on the jks key store I use.
Now I'm trying to establish the same kind of SSL connetion to _schemas topic (protected by an ACL) from the schema registry but I keep on receiving the following error

[2019-06-28 15:45:19,015] INFO Initializing KafkaStore with broker endpoints: SSL://node1.c4.kafka.eu-central-1.amazonaws.com:9094,SSL://node2.c4.kafka.eu-central-1.amazonaws.com:9094,SSL://node3.c4.kafka.eu-central-1.amazonaws.com:9094 (io.confluent.kafka.schemaregistry.storage.KafkaStore) [2019-06-28 15:47:19,118] ERROR Error starting the schema registry (io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication) io.confluent.kafka.schemaregistry.exceptions.SchemaRegistryInitializationException: Error initializing kafka store while initializing schema registry at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.init(KafkaSchemaRegistry.java:203) at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.setupResources(SchemaRegistryRestApplication.java:63) at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.setupResources(SchemaRegistryRestApplication.java:41) at io.confluent.rest.Application.createServer(Application.java:169) at io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain.main(SchemaRegistryMain.java:43) Caused by: io.confluent.kafka.schemaregistry.storage.exceptions.StoreInitializationException: Failed trying to create or validate schema topic configuration at io.confluent.kafka.schemaregistry.storage.KafkaStore.createOrVerifySchemaTopic(KafkaStore.java:175) at io.confluent.kafka.schemaregistry.storage.KafkaStore.init(KafkaStore.java:113) at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.init(KafkaSchemaRegistry.java:201) ... 4 more Caused by: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment. at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45) at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32) at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:104) at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:274) at io.confluent.kafka.schemaregistry.storage.KafkaStore.createOrVerifySchemaTopic(KafkaStore.java:163) ... 6 more Caused by: org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment.

NB:
from the same machine I can consume without any issue my empty _schemas topic

here's my schema-registry.properties

listeners=http://0.0.0.0:8081
kafkastore.bootstrap.servers=SSL://node1.c4.kafka.eu-central-1.amazonaws.com:9094,SSL://node2.c4.kafka.eu-central-1.amazonaws.com:9094,SSL://node3.c4.kafka.eu-central-1.amazonaws.com:9094
kafkastore.topic=_schemas
kafkastore.security.protocol=SSL
kafkastore.ssl.truststore.location=/opt/confluent/etc/kafka-security/msk-client-trust-store/msk.client.truststore.jks
kafkastore.ssl.keytstore.location=/opt/confluent/etc/kafka-security/msk-client-key-store/schemaregistryuser.kafka.client.keystore.jks
kafkastore.ssl.keytstore.password=blabla123
kafkastore.ssl.key.password=blabla123

Any idea?
Thx in advance

@pgottvalles
Copy link

just a quick addition: even if the topic _schemas is not protected by any ACL, the same isse occurs...

@rkotha123
Copy link

Hi Guys , I am facing the same issue ..
Did you guys find any solution ?

Thanks in advance ..

@navinsnn53
Copy link

navinsnn53 commented Aug 21, 2019

Hi Guys ,

Above suggested fix not working. Still not able to connect schema registry to MSK. I am running Docker in a separate instance and have connections enabled to MSK services.

> docker run -d \
>   --net=host \
>   --name=schema-registry \
>   -e SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL=10.95.21.38:2181 \
>   -e SCHEMA_REGISTRY_HOST_NAME=schema-registry \
>   -e SCHEMA_REGISTRY_LISTENERS=http://schema-registry:8081 \
>   -e SCHEMA_REGISTRY_DEBUG=true \
>   docker.io/confluentinc/cp-schema-registry:latest
> 

Error faced when connecting with SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL is below

[main-EventThread] INFO org.apache.zookeeper.ClientCnxn - EventThread shut down for session: 0x30000003c6e0008
[main] ERROR io.confluent.admin.utils.cli.KafkaReadyCommand - Error while running kafka-ready.
java.lang.RuntimeException: No endpoints found for security protocol [PLAINTEXT]. Endpoints found in ZK [{REPLICATION=b-1-internal.kafkaclusternew.jbf0pp.c4.kafka.us-west-2.amazonaws.com:9093, CLIENT=b-1.kafkaclusternew.jbf0pp.c4.kafka.us-west-2.amazonaws.com:9092, CLIENT_SECURE=b-1.kafkaclusternew.jbf0pp.c4.kafka.us-west-2.amazonaws.com:9094, REPLICATION_SECURE=b-1-internal.kafkaclusternew.jbf0pp.c4.kafka.us-west-2.amazonaws.com:9095}]

Also tried to replace SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL with SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS

docker run -d \
  --net=host \
  --name=schema-registry \
  -e SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS=PLAINTEXT://b-2.kafkaclusternew.jbf0pp.c4.kafka.us-west-2.amazonaws.com:9092,b-3.kafkaclusternew.jbf0pp.c4.kafka.us-west-2.amazonaws.com:9092,b-1.kafkaclusternew.jbf0pp.c4.kafka.us-west-2.amazonaws.com:9092 \
  -e SCHEMA_REGISTRY_HOST_NAME=schema-registry \
  -e SCHEMA_REGISTRY_LISTENERS=http://schema-registry:8081 \
  -e SCHEMA_REGISTRY_DEBUG=true \
  docker.io/confluentinc/cp-schema-registry:latest

Still getting error with below message

ERROR Server died unexpectedly:  (io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain)
java.net.SocketException: Unresolved address

Any know solution to this.

@pgottvalles
Copy link

So I go the config resolved:
first of all I had a typo in my schema-registry config:
kafkastore.ssl.keytstore.location=/opt/confluent/etc/kafka-security/msk-client-key-store/schemaregistryuser.kafka.client.keystore.jks
kafkastore.ssl.keytstore.password=blabla123
kafkastore.ssl.key.password=blabla123

kafkastore.ssl.keytstore.location mis-spelt => should have been kafkastore.ssl.keystore.location
same for kafkastore.ssl.keytstore.password => kafkastore.ssl.keystore.password

Then I had an issue with ACL:
I solved that by giving to my shema registry user the following rights:

  • on _schemas topic: consumer, producer and --operation DescribeConfigs
  • on __consumer_offsets topic --operation Describe

that's it!!

@navinsnn53
your bootstrap address is wrong and shoud be:

kafkastore.bootstrap.servers=PLAINTEXT://b-2.kafkaclusternew.jbf0pp.c4.kafka.us-west-2.amazonaws.com:9092,PLAINTEXT://b-3.kafkaclusternew.jbf0pp.c4.kafka.us-west-2.amazonaws.com:9092,PLAINTEXT://b-1.kafkaclusternew.jbf0pp.c4.kafka.us-west-2.amazonaws.com:9092

Hope this helps

@evarga
Copy link

evarga commented Oct 2, 2019

I would like to share with you some additional tips how to setup Schema Registry (at the time of this writing the last stable edition is 5.3.1-1) to run inside AWS ECS using the AWS Fargate launch type. Using a serverless option can make your life easier in cases where you don't need full control over EC2 instances. I hope you will find the next couple of tricks useful.

The decision about specifying brokers is explained in the section Single Primary Architecture of the Confluent's documentation.

Schema Registry requires setting the SCHEMA_REGISTRY_HOST_NAME environment variable. With AWS Fargate and awsvpc networking mode there is no way to explicitly set the host name property inside the task definition. For horizontal scaling you need multiple registry instances each having its own SCHEMA_REGISTRY_HOST_NAME value. I suggest overriding the Command property in a task definition in the following manner (the HOSTNAME is provided by Docker and under awsvpc regime it reliably reflects the host name) to set this env. variable:

sh,-c,export SCHEMA_REGISTRY_HOST_NAME=$HOSTNAME;/etc/confluent/docker/run

You can "insert" additional files into your container by using volumes. Confluent's Docker images follow the convention of exposing a container mount point via the VOLUME Dockerfile statement as /etc/<component name>/secrets. In our case, the component name is schema-registry.

P.S. You may also want to attach a web-based GUI to browse and manage the schemas inside the registry. There is an open-source project Landoop/schema-registry-ui (ships as a Docker image) that mimics the Confluent's Control Center.

@aboyanov
Copy link

@pgottvalles are you using ACM PCA(Private Certificate Authority) to connect your Schema-Registry to MSK over TLS? According to the documentation, if you want to connect over TLS, it is only possible if you have PCA. Does someone know a different way or workaround, I don't want to spin up PCA just for that.
Thanks in advance :)

@pgottvalles
Copy link

Yes we are using ACM PCA.
I don't think you can establish mutual TLS for MSK without it.... But to be honest we didn't even try.

I think that would require serious hacking as MSK needs to be deployed with the PCA arn if you want to enable TLS and authorization

Best regards

@sudarshanGit
Copy link

Hello Guys .I am still getting same issue I tried PLAINTEXT:// without connection URL and with also .

@leohoare
Copy link

Anyone had any luck with establishing an MSK cluster with TLS, without running a PCA? It really hikes up the cost, particularly running multiple environments

@ppiazzolla
Copy link

Anyone had any luck with establishing an MSK cluster with TLS, without running a PCA? It really hikes up the cost, particularly running multiple environments

I finally got it to work with TLS (not mutual TLS) without a PCA. The key for me was to actually set the security protocol.

My setup in Task definition:

SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS=SSL://b-2.xyz.kafka.eu-north-1.amazonaws.com:9094,SSL://b-1.xyz.kafka.eu-north-1.amazonaws.com:9094
SCHEMA_REGISTRY_KAFKASTORE_SECURITY_PROTOCOL=SSL

And with @evarga tips as the Command.
sh,-c,export SCHEMA_REGISTRY_HOST_NAME=$HOSTNAME;/etc/confluent/docker/run

@brettcave
Copy link

Trying to connect to brokers via Zookeeper in MSK fails because of the networking - Zookeeper returns both CLIENT_SECURE and REPLICATION_SECURE endpoints when using SSL through protocol mapping. The REPLICATION_SECURE endpoints are inaccessible outside of the cluster and will fail. If S.R. was able to filter by protocol mapping or endpoint, then it would solve it. See here: https://stackoverflow.com/questions/60149087/configure-schema-registry-to-only-use-client-secure-protcol-mapping-to-connect

@dcorrea777
Copy link

I had a very similar problem, but it was not with the schema-registry but with kafka-connect, I will leave my contribution here =).

When you set up an msk cluster on aws, it recommends that you use 3 brokers, but you can also use only 2.
If you want to use only 2 brokers, you also need to change the default.replication.factor setting to 2, because if the value of default.replication.factor is different from the number of brokers running, you will have some problems with Communication.

Well, that was what worked for me.

@rupeshmore85
Copy link

rupeshmore85 commented Jan 22, 2021

I would like to share with you some additional tips how to setup Schema Registry (at the time of this writing the last stable edition is 5.3.1-1) to run inside AWS ECS using the AWS Fargate launch type. Using a serverless option can make your life easier in cases where you don't need full control over EC2 instances. I hope you will find the next couple of tricks useful.

The decision about specifying brokers is explained in the section Single Primary Architecture of the Confluent's documentation.

Schema Registry requires setting the SCHEMA_REGISTRY_HOST_NAME environment variable. With AWS Fargate and awsvpc networking mode there is no way to explicitly set the host name property inside the task definition. For horizontal scaling you need multiple registry instances each having its own SCHEMA_REGISTRY_HOST_NAME value. I suggest overriding the Command property in a task definition in the following manner (the HOSTNAME is provided by Docker and under awsvpc regime it reliably reflects the host name) to set this env. variable:

sh,-c,export SCHEMA_REGISTRY_HOST_NAME=$HOSTNAME;/etc/confluent/docker/run

You can "insert" additional files into your container by using volumes. Confluent's Docker images follow the convention of exposing a container mount point via the VOLUME Dockerfile statement as /etc/<component name>/secrets. In our case, the component name is schema-registry.

P.S. You may also want to attach a web-based GUI to browse and manage the schemas inside the registry. There is an open-source project Landoop/schema-registry-ui (ships as a Docker image) that mimics the Confluent's Control Center.

Thanks @evarga /@ppiazzolla, I am trying to run the docker schema registry container within an EC2 instance which has access to MSK cluster, however not sure if I have specified the Command correctly in docker run. Do i need to specify command it in DockerFile then use docker build to built the image?

docker run -d  -p 8081:8081   --name=schema-registry   -e SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS="SSL://broker1:9094,SSL://broker2:9094" -e SCHEMA_REGISTRY_KAFKASTORE_SECURITY_PROTOCOL=SSL  -e SCHEMA_REGISTRY_HOST_NAME=localhost -e SCHEMA_REGISTRY_LISTENERS=http://localhost:8081   -e SCHEMA_REGISTRY_DEBUG=true   confluentinc/cp-schema-registry:latest "sh,-c,export SCHEMA_REGISTRY_HOST_NAME=$HOSTNAME;/etc/confluent/docker/run"
1b57745b73260a62212ea449007e45f8e734a0763abc5f4f65015733a4fd2d55
docker: Error response from daemon: OCI runtime create failed: container_linux.go:370: starting container process caused: exec: "sh,-c,export SCHEMA_REGISTRY_HOST_NAME=Private_IPv4_DNS_Address;/etc/confluent/docker/run": stat sh,-c,export SCHEMA_REGISTRY_HOST_NAME=Private_IPv4_DNS_Address;/etc/confluent/docker/run: no such file or directory: unknown.

If i do not include the sh command, the container comes up cleanly and only works in locally(i.e. http://localhost:8081), however i cannot use the schema registry http://public_ipv4_dns:8081 for the exposed port 8081 outside of this instance

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests