
HUDI-1657: build failed on AArch64, Fedora 33 #4617

Merged
merged 1 commit on Feb 14, 2022

Conversation

guyuqi
Member

@guyuqi guyuqi commented Jan 17, 2022


What is the purpose of the pull request

Fix the Hudi build on Arm64 Fedora 33.

Protobuf did not ship AArch64 binaries until Protobuf 3.5.0,
so we should upgrade the protoc dependency to fix the build failure:

[INFO] Protoc version: 3.1.0
protoc-jar: protoc version: 310, detected platform: linux/aarch64
[INFO] Protoc command: /tmp/protocjar1741426592250217060/bin/protoc.exe
[INFO] Input directories:
[INFO]     /home/builder/hudi/hudi-kafka-connect/src/main/resources
[INFO] Output targets:
[INFO]     java: /home/builder/hudi/hudi-kafka-connect/target/generated-sources (add: main, clean: false, plugin: null, outputOptions: null)
[INFO] /home/builder/hudi/hudi-kafka-connect/target/generated-sources does not exist. Creating...
[INFO]     Processing (java): ControlMessage.proto
protoc-jar: executing: [/tmp/protocjar1741426592250217060/bin/protoc.exe, -I/home/builder/hudi/hudi-kafka-connect/src/main/resources, --java_out=/home/builder/hudi/hudi-kafka-connect/target/generated-sources, /home/builder/hudi/hudi-kafka-connect/src/main/resources/ControlMessage.proto]
/tmp/protocjar1741426592250217060/bin/protoc.exe: /tmp/protocjar1741426592250217060/bin/protoc.exe: cannot execute binary file


[ERROR] Failed to execute goal com.github.os72:protoc-jar-maven-plugin:3.1.0.1:run (default) on project hudi-kafka-connect: protoc-jar failed for /home/builder/hudi/hudi-kafka-connect/src/main/resources/ControlMessage.proto. Exit code 126 

Brief change log

Upgrade dependency in pom.xml
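A hypothetical sketch of the kind of pom.xml change this implies: bumping the protoc-jar plugin and the protoc version past 3.5.0, the first Protobuf release with aarch64 binaries. The exact property names, plugin configuration, and versions in Hudi's actual pom may differ; this is an illustration, not the PR's diff.

```xml
<!-- Hypothetical sketch: upgrade protoc-jar-maven-plugin and protoc
     past 3.5.0 so an aarch64 protoc binary is available.
     Exact versions/configuration in Hudi's pom may differ. -->
<plugin>
  <groupId>com.github.os72</groupId>
  <artifactId>protoc-jar-maven-plugin</artifactId>
  <version>3.11.4</version>
  <executions>
    <execution>
      <phase>generate-sources</phase>
      <goals><goal>run</goal></goals>
      <configuration>
        <protocVersion>3.11.4</protocVersion>
        <inputDirectories>
          <include>src/main/resources</include>
        </inputDirectories>
      </configuration>
    </execution>
  </executions>
</plugin>
```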

Verify this pull request

This pull request is a trivial rework / code cleanup without any test coverage.

Committer checklist

  • Has a corresponding JIRA in PR title & commit

  • Commit message is descriptive of the change

  • CI is green

  • Necessary doc changes done or have another open PR

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

Fix the Hudi build on Arm64 Fedora 33.

Change-Id: I4795f5b0483521bfdc419b84f3b99faff4b3a847
Signed-off-by: Yuqi Gu <guyuqi@apache.org>
@guyuqi
Member Author

guyuqi commented Jan 17, 2022

Hudi builds successfully on Arm64 Fedora 33/Ubuntu 20:

[INFO] Dependency-reduced POM written at: /home/builder/hudi/packaging/hudi-kafka-connect-bundle/target/dependency-reduced-pom.xml
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Hudi 0.11.0-SNAPSHOT:
[INFO]
[INFO] Hudi ............................................... SUCCESS [  3.737 s]
[INFO] hudi-common ........................................ SUCCESS [ 28.122 s]
[INFO] hudi-aws ........................................... SUCCESS [  3.771 s]
[INFO] hudi-timeline-service .............................. SUCCESS [  3.532 s]
[INFO] hudi-client ........................................ SUCCESS [  0.186 s]
[INFO] hudi-client-common ................................. SUCCESS [ 18.025 s]
[INFO] hudi-hadoop-mr ..................................... SUCCESS [  6.800 s]
[INFO] hudi-spark-client .................................. SUCCESS [ 41.667 s]
[INFO] hudi-sync-common ................................... SUCCESS [  2.059 s]
[INFO] hudi-hive-sync ..................................... SUCCESS [  8.411 s]
[INFO] hudi-spark-datasource .............................. SUCCESS [  0.161 s]
[INFO] hudi-spark-common_2.11 ............................. SUCCESS [ 43.698 s]
[INFO] hudi-spark2_2.11 ................................... SUCCESS [ 25.266 s]
[INFO] hudi-spark2-common ................................. SUCCESS [  0.220 s]
[INFO] hudi-spark_2.11 .................................... SUCCESS [01:04 min]
[INFO] hudi-utilities_2.11 ................................ SUCCESS [ 13.561 s]
[INFO] hudi-utilities-bundle_2.11 ......................... SUCCESS [ 24.301 s]
[INFO] hudi-cli ........................................... SUCCESS [ 26.444 s]
[INFO] hudi-java-client ................................... SUCCESS [  5.383 s]
[INFO] hudi-flink-client .................................. SUCCESS [ 13.235 s]
[INFO] hudi-spark3-common ................................. SUCCESS [ 15.734 s]
[INFO] hudi-spark3_2.12 ................................... SUCCESS [ 11.753 s]
[INFO] hudi-spark3.1.x_2.12 ............................... SUCCESS [  7.994 s]
[INFO] hudi-dla-sync ...................................... SUCCESS [  3.061 s]
[INFO] hudi-sync .......................................... SUCCESS [  0.142 s]
[INFO] hudi-hadoop-mr-bundle .............................. SUCCESS [ 10.599 s]
[INFO] hudi-hive-sync-bundle .............................. SUCCESS [  3.295 s]
[INFO] hudi-spark-bundle_2.11 ............................. SUCCESS [ 20.337 s]
[INFO] hudi-presto-bundle ................................. SUCCESS [ 12.757 s]
[INFO] hudi-timeline-server-bundle ........................ SUCCESS [  7.888 s]
[INFO] hudi-trino-bundle .................................. SUCCESS [  8.714 s]
[INFO] hudi-hadoop-docker ................................. SUCCESS [  3.496 s]
[INFO] hudi-hadoop-base-docker ............................ SUCCESS [  2.160 s]
[INFO] hudi-hadoop-base-java11-docker ..................... SUCCESS [  2.255 s]
[INFO] hudi-hadoop-namenode-docker ........................ SUCCESS [  2.243 s]
[INFO] hudi-hadoop-datanode-docker ........................ SUCCESS [  2.236 s]
[INFO] hudi-hadoop-history-docker ......................... SUCCESS [  2.381 s]
[INFO] hudi-hadoop-hive-docker ............................ SUCCESS [  2.848 s]
[INFO] hudi-hadoop-sparkbase-docker ....................... SUCCESS [  2.314 s]
[INFO] hudi-hadoop-sparkmaster-docker ..................... SUCCESS [  2.235 s]
[INFO] hudi-hadoop-sparkworker-docker ..................... SUCCESS [  2.236 s]
[INFO] hudi-hadoop-sparkadhoc-docker ...................... SUCCESS [  2.234 s]
[INFO] hudi-hadoop-presto-docker .......................... SUCCESS [  2.369 s]
[INFO] hudi-hadoop-trinobase-docker ....................... SUCCESS [  2.505 s]
[INFO] hudi-hadoop-trinocoordinator-docker ................ SUCCESS [  2.232 s]
[INFO] hudi-hadoop-trinoworker-docker ..................... SUCCESS [  2.228 s]
[INFO] hudi-integ-test .................................... SUCCESS [ 24.196 s]
[INFO] hudi-integ-test-bundle ............................. SUCCESS [01:02 min]
[INFO] hudi-examples ...................................... SUCCESS [ 13.839 s]
[INFO] hudi-flink_2.11 .................................... SUCCESS [  8.843 s]
[INFO] hudi-kafka-connect ................................. SUCCESS [  6.963 s]
[INFO] hudi-flink-bundle_2.11 ............................. SUCCESS [ 26.518 s]
[INFO] hudi-kafka-connect-bundle .......................... SUCCESS [ 30.123 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  10:45 min
[INFO] Finished at: 2022-01-17T07:40:00Z
[INFO] ------------------------------------------------------------------------

@hudi-bot

CI report:

Bot commands: @hudi-bot supports the following commands:
  • @hudi-bot run azure: re-run the last Azure build

@vinothchandar vinothchandar added this to Ready for Review in PR Tracker Board Jan 17, 2022
@nsivabalan nsivabalan added the priority:critical production down; pipelines stalled; Need help asap. label Jan 17, 2022
@nsivabalan nsivabalan assigned nsivabalan and yihua and unassigned nsivabalan Jan 17, 2022
Contributor

@yihua yihua left a comment


LGTM. @guyuqi Could you run the Quick Start Guide for Kafka Connect Sink for Hudi to make sure the Sink functionality is not affected?

@guyuqi
Member Author

guyuqi commented Jan 21, 2022

LGTM. @guyuqi Could you run the Quick Start Guide for Kafka Connect Sink for Hudi to make sure the Sink functionality is not affected?

Thanks for your comments.

Following the Quick Start Guide:

Environment:

export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk/
export CONFLUENT_DIR=/home/builder/confluent-7.0.1
export PATH=${CONFLUENT_DIR}/bin:${PATH}
export KAFKA_HOME=/home/builder/kafka_2.12-3.0.0
export HUDI_DIR=/home/builder/hudi

Linux fdr33-test-vm 5.11.0-43-generic #47~20.04.2-Ubuntu SMP Mon Dec 13 11:10:13 UTC 2021 aarch64 aarch64 aarch64 GNU/Linux

1. Successfully create the Hudi topic for the Sink and insert data into it:

[builder@fdr33-test-vm demo]$ bash setupKafka.sh -n 3
Argument num-kafka-records is 3
Delete Kafka topic hudi-test-topic ...
Create Kafka topic hudi-test-topic ...
Created topic hudi-test-topic.
{"id":1}{"subject":"hudi-test-topic","version":1,"id":1,"schema":"{\"type\":\"record\",\"name\":\"stock_ticks\",\"fields\":[{\"name\":\"volume\",\"type\":\"long\"},{\"name\":\"ts\",\"type\":\"string\"},{\"name\":\"symbol\",\"type\":\"string\"},{\"name\":\"year\",\"type\":\"int\"},{\"name\":\"month\",\"type\":\"string\"},{\"name\":\"high\",\"type\":\"double\"},{\"name\":\"low\",\"type\":\"double\"},{\"name\":\"key\",\"type\":\"string\"},{\"name\":\"date\",\"type\":\"string\"},{\"name\":\"close\",\"type\":\"double\"},{\"name\":\"open\",\"type\":\"double\"},{\"name\":\"day\",\"type\":\"string\"}]}"}Fri Jan 21 08:25:52 UTC 2022
Start batch 1 ...
Fri Jan 21 08:25:53 UTC 2022
 Record key until 3
publish to Kafka ...
[builder@fdr33-test-vm demo]$ bash setupKafka.sh -n 3 -b 3
Argument num-kafka-records is 3
Argument num-batch is 3
Delete Kafka topic hudi-test-topic ...
Create Kafka topic hudi-test-topic ...
Created topic hudi-test-topic.
{"id":1}{"subject":"hudi-test-topic","version":1,"id":1,"schema":"{\"type\":\"record\",\"name\":\"stock_ticks\",\"fields\":[{\"name\":\"volume\",\"type\":\"long\"},{\"name\":\"ts\",\"type\":\"string\"},{\"name\":\"symbol\",\"type\":\"string\"},{\"name\":\"year\",\"type\":\"int\"},{\"name\":\"month\",\"type\":\"string\"},{\"name\":\"high\",\"type\":\"double\"},{\"name\":\"low\",\"type\":\"double\"},{\"name\":\"key\",\"type\":\"string\"},{\"name\":\"date\",\"type\":\"string\"},{\"name\":\"close\",\"type\":\"double\"},{\"name\":\"open\",\"type\":\"double\"},{\"name\":\"day\",\"type\":\"string\"}]}"}Fri Jan 21 08:27:04 UTC 2022
Start batch 1 ...
Fri Jan 21 08:27:04 UTC 2022
 Record key until 3
publish to Kafka ...
Fri Jan 21 08:27:24 UTC 2022
Start batch 2 ...
Fri Jan 21 08:27:25 UTC 2022
 Record key until 6
publish to Kafka ...
Fri Jan 21 08:27:45 UTC 2022
Start batch 3 ...
Fri Jan 21 08:27:45 UTC 2022
 Record key until 9
publish to Kafka ...

2. Run the Sink connector worker

[builder@fdr33-test-vm kafka_2.12-3.0.0]$ ./bin/connect-distributed.sh $HUDI_DIR/hudi-kafka-connect/demo/connect-distributed.properties
[2022-01-21 08:30:48,524] INFO WorkerInfo values:
        jvm.args = -Xms256M, -Xmx2G, -XX:+UseG1GC, -XX:MaxGCPauseMillis=20, -XX:InitiatingHeapOccupancyPercent=35, -XX:+ExplicitGCInvokesConcurrent, -XX:MaxInlineLevel=15, -Djava.awt.headless=true, -Dcom.sun.management.jmxremote, -Dcom.sun.management.jmxremote.authenticate=false, -Dcom.sun.management.jmxremote.ssl=false, -Dkafka.logs.dir=/home/builder/kafka_2.12-3.0.0/bin/../logs, -Dlog4j.configuration=file:./bin/../config/connect-log4j.properties
        jvm.spec = Red Hat, Inc., OpenJDK 64-Bit Server VM, 1.8.0_312, 25.312-b07
        jvm.classpath = /home/builder/kafka_2.12-3.0.0/bin/../libs/activation-1.1.1.jar:/home/builder/kafka_2.12-3.0.0/bin/../libs/aopalliance-repackaged-2.6.1.jar:/home/builder/kafka_2.12-3.0.0/bin/../libs/argparse4j-0.7.0.jar:/home/builder/kafka_2.12-3.0.0/bin/../libs/audience-annotations-0.5.0.jar:/home/builder/kafka_2.12-3.0.0/bin/../libs/commons-cli-1.4.jar:/home/builder/kafka_2.12-3.0.0/bin/../libs/commons-lang3-3.8.1.jar:/home/builder/kafka_2.12-3.0.0/bin/../libs/connect-api-3.0.0.jar:/home/builder/kafka_2.12-3.0.0/bin/../libs/connect-basic-auth-extension-3.0.0.jar:/home/builder/kafka_2.12-3.0.0/bin/../libs/connect-file-3.0.0.jar:/home/builder/kafka_2.12-3.0.0/bin/../libs/connect-json-3.0.0.jar:/home/builder/kafka_2.12-3.0.0/bin/../libs/connect-mirror-3.0.0.jar:/home/builder/kafka_2.12-3.0.0/bin/../libs/connect-mirror-client-3.0.0.jar:/home/builder/kafka_2.12-3.0.0/bin/../libs/connect-runtime-3.0.0.jar:/home/builder/kafka_2.12-3.0.0/bin/../libs/connect-transforms-3.0.0.jar:/home/builder/kafka_2.12-3.0.0/bin/../libs/hk2-api-2.6.1.jar:/home/builder/kafka_2.12-3.0.0/bin/../libs/hk2-locator-2.6.1.jar:/home/builder/kafka_2.12-3.0.0/bin/../libs/hk2-utils-2.6.1.jar:/home/builder/kafka_2.12-3.0.0/bin/../libs/jackson-annotations-2.12.3.jar:/home/builder/kafka_2.12-3.0.0/bin/../libs/jackson-core-2.12.3.jar:/home/builder/kafka_2.12-3.0.0/bin/../libs/jackson-databind-2.12.3.jar:/home/builder/kafka_2.12-3.0.0/bin/../libs/jackson-dataformat-csv-2.12.3.jar:/home/builder/kafka_2.12-3.0.0/bin/../libs/jackson-datatype-jdk8-2.12.3.jar:/home/builder/kafka_2.12-3.0.0/bin/../libs/jackson-jaxrs-base-2.12.3.jar:/home/builder/kafka_2.12-3.0.0/bin/../libs/jackson-jaxrs-json-provider-2.12.3.jar:/home/builder/kafka_2.12-3.0.0/bin/../libs/jackson-module-jaxb-annotations
...............
..........
.......

[2022-01-21 08:31:05,629] INFO [Consumer clientId=consumer-hudi-connect-cluster-3, groupId=hudi-connect-cluster] Subscribed to partition(s): connect-configs-0 (org.apache.kafka.clients.consumer.KafkaConsumer:1121)
[2022-01-21 08:31:05,630] INFO [Consumer clientId=consumer-hudi-connect-cluster-3, groupId=hudi-connect-cluster] Seeking to EARLIEST offset of partition connect-configs-0 (org.apache.kafka.clients.consumer.internals.SubscriptionState:641)
[2022-01-21 08:31:05,641] INFO [Consumer clientId=consumer-hudi-connect-cluster-3, groupId=hudi-connect-cluster] Resetting offset for partition connect-configs-0 to position FetchPosition{offset=0, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[fdr33-test-vm:9092 (id: 0 rack: null)], epoch=0}}. (org.apache.kafka.clients.consumer.internals.SubscriptionState:398)
[2022-01-21 08:31:05,642] INFO Finished reading KafkaBasedLog for topic connect-configs (org.apache.kafka.connect.util.KafkaBasedLog:202)
[2022-01-21 08:31:05,642] INFO Started KafkaBasedLog for topic connect-configs (org.apache.kafka.connect.util.KafkaBasedLog:204)
[2022-01-21 08:31:05,643] INFO Started KafkaConfigBackingStore (org.apache.kafka.connect.storage.KafkaConfigBackingStore:306)
[2022-01-21 08:31:05,643] INFO [Worker clientId=connect-1, groupId=hudi-connect-cluster] Herder started (org.apache.kafka.connect.runtime.distributed.DistributedHerder:322)
[2022-01-21 08:31:05,657] INFO [Worker clientId=connect-1, groupId=hudi-connect-cluster] Cluster ID: n2GHZbkaRAaByx27gk9Bhw (org.apache.kafka.clients.Metadata:287)
[2022-01-21 08:31:05,658] INFO [Worker clientId=connect-1, groupId=hudi-connect-cluster] Discovered group coordinator fdr33-test-vm:9092 (id: 2147483647 rack: null) (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator:849)
[2022-01-21 08:31:05,661] INFO [Worker clientId=connect-1, groupId=hudi-connect-cluster] Rebalance started (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator:222)
[2022-01-21 08:31:05,662] INFO [Worker clientId=connect-1, groupId=hudi-connect-cluster] (Re-)joining group (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator:535)
[2022-01-21 08:31:05,683] INFO [Worker clientId=connect-1, groupId=hudi-connect-cluster] (Re-)joining group (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator:535)
[2022-01-21 08:31:05,688] INFO [Worker clientId=connect-1, groupId=hudi-connect-cluster] Successfully joined group with generation Generation{generationId=1, memberId='connect-1-02d6edfd-fcc3-4565-b4fc-e2a550da3ef4', protocol='sessioned'} (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator:591)
[2022-01-21 08:31:05,722] INFO [Worker clientId=connect-1, groupId=hudi-connect-cluster] Successfully synced group in generation Generation{generationId=1, memberId='connect-1-02d6edfd-fcc3-4565-b4fc-e2a550da3ef4', protocol='sessioned'} (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator:757)
[2022-01-21 08:31:05,723] INFO [Worker clientId=connect-1, groupId=hudi-connect-cluster] Joined group at generation 1 with protocol version 2 and got assignment: Assignment{error=0, leader='connect-1-02d6edfd-fcc3-4565-b4fc-e2a550da3ef4', leaderUrl='http://172.17.0.2:8083/', offset=-1, connectorIds=[], taskIds=[], revokedConnectorIds=[], revokedTaskIds=[], delay=0} with rebalance delay: 0 (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1848)
[2022-01-21 08:31:05,723] INFO [Worker clientId=connect-1, groupId=hudi-connect-cluster] Starting connectors and tasks using config offset -1 (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1378)
[2022-01-21 08:31:05,724] INFO [Worker clientId=connect-1, groupId=hudi-connect-cluster] Finished starting connectors and tasks (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1406)
[2022-01-21 08:31:05,823] INFO [Worker clientId=connect-1, groupId=hudi-connect-cluster] Session key updated (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1716)

3. Add the Hudi Sink to the Connector and check its status:
[builder@fdr33-test-vm ~]$ curl -X GET -H "Content-Type:application/json"  http://localhost:8083/connectors/hudi-sink/status | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  7993  100  7993    0     0   300k      0 --:--:-- --:--:-- --:--:--  300k
{
  "name": "hudi-sink",
  "connector": {
    "state": "RUNNING",
    "worker_id": "localhost:8083"
  },

@yihua
Contributor

yihua commented Jan 23, 2022

@guyuqi Have you verified that the Hudi table written by the sink can be successfully read?

@nsivabalan
Contributor

@guyuqi: can you respond to @yihua's question above?

@guyuqi
Member Author

guyuqi commented Feb 1, 2022

@guyuqi: can you respond to @yihua's question above?

Sorry for the late reply.
I'm on Chinese New Year vacation with limited access to a PC. I'll update the PR at the end of this week. Thanks.

@nsivabalan
Contributor

Sure, thanks! No problem.

@nsivabalan nsivabalan moved this from Ready for Review to Nearing Landing in PR Tracker Board Feb 4, 2022
@nsivabalan
Contributor

@guyuqi: Are there any updates for us? We plan to get this into 0.11, which is why we're being pushy. Sorry about that.

@guyuqi
Member Author

guyuqi commented Feb 10, 2022

I put everything in a Fedora 33 Docker container, since I have no Fedora 33 host,
and then followed the Quick Start (demo) guide.

But when adding the Hudi Sink to the Connector:
curl -X GET -H "Content-Type:application/json" http://localhost:8083/connectors/hudi-sink/status | jq

hudi-sink is running, but its tasks failed:

[builder@f7b3d84dbcab kafka_2.12-3.1.0]$ curl -X GET -H "Content-Type:application/json"  http://localhost:8083/connectors/hudi-sink/status | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  7650  100  7650    0     0   373k      0 --:--:-- --:--:-- --:--:--  373k
{
  "name": "hudi-sink",
  "connector": {
    "state": "RUNNING",
    "worker_id": "172.17.0.3:8083"
  },
  "tasks": [
    {
      "id": 0,
      "state": "FAILED",
      "worker_id": "172.17.0.3:8083",
      "trace": "org.apache.kafka.common.KafkaException: Failed to construct kafka producer\n\tat org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:442)\n\tat org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:292)\n\tat org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:319)\n\tat org.apache.hudi.connect.kafka.KafkaControlProducer.start(KafkaControlProducer.java:59)\n\tat org.apache.hudi.connect.kafka.KafkaControlProducer.<init>(KafkaControlProducer.java:50)\n\tat org.apache.hudi.connect.kafka.KafkaConnectControlAgent.<init>(KafkaConnectControlAgent.java:77)\n\tat org.apache.hudi.connect.kafka.KafkaConnectControlAgent.createKafkaControlManager(KafkaConnectControlAgent.java:86)\n\tat org.apache.hudi.connect.HoodieSinkTask.start(HoodieSinkTask.java:81)\n\tat org.apache.kafka.connect.runtime.WorkerSinkTask.initializeAndStart(WorkerSinkTask.java:312)\n\tat org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:186)\n\tat org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:243)\n\tat java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat java.lang.Thread.run(Thread.java:748)\nCaused by: org.apache.kafka.common.config.ConfigException: No resolvable bootstrap urls given in bootstrap.servers\n\tat org.apache.kafka.clients.ClientUtils.parseAndValidateAddresses(ClientUtils.java:89)\n\tat org.apache.kafka.clients.ClientUtils.parseAndValidateAddresses(ClientUtils.java:48)\n\tat org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:416)\n\t... 15 more\n"
    },
    {
      "id": 1,
      "state": "FAILED",
      "worker_id": "172.17.0.3:8083",
      "trace": "org.apache.kafka.common.KafkaException: Failed to construct kafka producer\n\tat org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:442)\n\tat org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:292)\n\tat org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:319)\n\tat org.apache.hudi.connect.kafka.KafkaControlProducer.start(KafkaControlProducer.java:59)\n\tat org.apache.hudi.connect.kafka.KafkaControlProducer.<init>(KafkaControlProducer.java:50)\n\tat org.apache.hudi.connect.kafka.KafkaConnectControlAgent.<init>(KafkaConnectControlAgent.java:77)\n\tat org.apache.hudi.connect.kafka.KafkaConnectControlAgent.createKafkaControlManager(KafkaConnectControlAgent.java:86)\n\tat org.apache.hudi.connect.HoodieSinkTask.start(HoodieSinkTask.java:81)\n\tat org.apache.kafka.connect.runtime.WorkerSinkTask.initializeAndStart(WorkerSinkTask.java:312)\n\tat org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:186)\n\tat org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:243)\n\tat java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat java.lang.Thread.run(Thread.java:748)\nCaused by: org.apache.kafka.common.config.ConfigException: No resolvable bootstrap urls given in bootstrap.servers\n\tat org.apache.kafka.clients.ClientUtils.parseAndValidateAddresses(ClientUtils.java:89)\n\tat org.apache.kafka.clients.ClientUtils.parseAndValidateAddresses(ClientUtils.java:48)\n\tat org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:416)\n\t... 15 more\n"
    },
    {
      "id": 2,
      "state": "FAILED",
      "worker_id": "172.17.0.3:8083",
      "trace": "org.apache.kafka.common.KafkaException: Failed to construct kafka producer\n\tat org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:442)\n\tat org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:292)\n\tat org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:319)\n\tat org.apache.hudi.connect.kafka.KafkaControlProducer.start(KafkaControlProducer.java:59)\n\tat org.apache.hudi.connect.kafka.KafkaControlProducer.<init>(KafkaControlProducer.java:50)\n\tat org.apache.hudi.connect.kafka.KafkaConnectControlAgent.<init>(KafkaConnectControlAgent.java:77)\n\tat org.apache.hudi.connect.kafka.KafkaConnectControlAgent.createKafkaControlManager(KafkaConnectControlAgent.java:86)\n\tat org.apache.hudi.connect.HoodieSinkTask.start(HoodieSinkTask.java:81)\n\tat org.apache.kafka.connect.runtime.WorkerSinkTask.initializeAndStart(WorkerSinkTask.java:312)\n\tat org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:186)\n\tat org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:243)\n\tat java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat java.lang.Thread.run(Thread.java:748)\nCaused by: org.apache.kafka.common.config.ConfigException: No resolvable bootstrap urls given in bootstrap.servers\n\tat org.apache.kafka.clients.ClientUtils.parseAndValidateAddresses(ClientUtils.java:89)\n\tat org.apache.kafka.clients.ClientUtils.parseAndValidateAddresses(ClientUtils.java:48)\n\tat org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:416)\n\t... 15 more\n"
    },
    {
      "id": 3,
      "state": "FAILED",
      "worker_id": "172.17.0.3:8083",
      "trace": "org.apache.kafka.common.KafkaException: Failed to construct kafka producer\n\tat org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:442)\n\tat org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:292)\n\tat org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:319)\n\tat org.apache.hudi.connect.kafka.KafkaControlProducer.start(KafkaControlProducer.java:59)\n\tat org.apache.hudi.connect.kafka.KafkaControlProducer.<init>(KafkaControlProducer.java:50)\n\tat org.apache.hudi.connect.kafka.KafkaConnectControlAgent.<init>(KafkaConnectControlAgent.java:77)\n\tat org.apache.hudi.connect.kafka.KafkaConnectControlAgent.createKafkaControlManager(KafkaConnectControlAgent.java:86)\n\tat org.apache.hudi.connect.HoodieSinkTask.start(HoodieSinkTask.java:81)\n\tat org.apache.kafka.connect.runtime.WorkerSinkTask.initializeAndStart(WorkerSinkTask.java:312)\n\tat org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:186)\n\tat org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:243)\n\tat java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat java.lang.Thread.run(Thread.java:748)\nCaused by: org.apache.kafka.common.config.ConfigException: No resolvable bootstrap urls given in bootstrap.servers\n\tat org.apache.kafka.clients.ClientUtils.parseAndValidateAddresses(ClientUtils.java:89)\n\tat org.apache.kafka.clients.ClientUtils.parseAndValidateAddresses(ClientUtils.java:48)\n\tat org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:416)\n\t... 15 more\n"
    }
  ],
  "type": "sink"
}

jps:

9713 ConnectDistributed
4898 Kafka
5462 SchemaRegistryMain
4486 QuorumPeerMain

ifconfig:

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.17.0.3  netmask 255.255.0.0  broadcast 172.17.255.255
        ether 02:42:ac:11:00:03  txqueuelen 0  (Ethernet)
        RX packets 403830  bytes 1795559769 (1.6 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 209087  bytes 16087596 (15.3 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

As a newcomer to Hudi, I have no idea how to change 172.17.0.3 to localhost (127.0.0.1).

This PR just upgrades Protobuf from 3.1 to 3.11.

Is there a simple unit test for this instead of this complicated demo, for a newcomer?
Or could you Hudi veterans help verify it on x86 first for the 0.11 release? Thanks.

@yihua
Contributor

yihua commented Feb 14, 2022

@guyuqi I'll test the sink on my side as well.

To change the IP address, could you do:

curl -X GET -H "Content-Type:application/json" http://127.0.0.1:8083/connectors/hudi-sink/status

You may also hardcode the IP address of the host instead of localhost in your config-sink.json. Another way to change the hostname resolution is to add an entry in /etc/hosts, e.g., 127.0.0.1 localhost. However, that is an OS-wide change and can affect others.
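As an illustration of the /etc/hosts approach (hypothetical commands; the host name and loopback mapping here are assumptions, adjust to your container), the name resolution can be checked before restarting the Connect worker:

```shell
# Check how the broker host used in bootstrap.servers resolves
# inside the container (hypothetical host name "localhost").
getent hosts localhost || echo "localhost does not resolve"

# OS-wide workaround: map the name to the loopback address.
# Note: this affects every process in the container.
# echo "127.0.0.1 localhost" >> /etc/hosts
```

After changing /etc/hosts, the Connect worker (and any producer/consumer it spawns) must be restarted to pick up the new resolution.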

@yihua
Contributor

yihua commented Feb 14, 2022

I followed the Quick Start Guide for Kafka Connect Sink for Hudi and verified that the Hudi table written by the sink can be successfully read through the Spark datasource on my x86_64 MacBook.

@guyuqi When you get a chance, could you report your testing on an aarch64 machine after fixing the hostname issue? I'm going to merge this PR for now. Feel free to follow up with any fixes needed.

@yihua yihua merged commit e639d99 into apache:master Feb 14, 2022
PR Tracker Board automation moved this from Nearing Landing to Done Feb 14, 2022
vingov pushed a commit to vingov/hudi that referenced this pull request Apr 3, 2022