Cannot submit to YARN on some CentOS machines! Heron bug? Please check my comments #3500
Comments
Detailed information: [2020-03-27 17:27:15 +0800] [INFO] org.apache.heron.statemgr.zookeeper.curator.CuratorStateManager: Closing the tunnel processes
Is this a Heron bug? Could data of length 2854 be causing the error? [2020-03-27 17:27:18 +0800] [FINER] org.apache.reef.wake.remote.impl.OrderedPushEventHandler: Value length is 2,854
Issue details:
Same Heron version (compiled from last month's code), same Hadoop version (3.2.1), almost the same Hadoop config, same Heron topology.
Submitting to YARN always works on Mac.
Submitting to the YARN cluster sometimes fails on three lab CentOS machines.
Submitting to YARN always fails on another company CentOS machine.
This issue has blocked me for several days, and I have had to switch to another cluster.
My suspicion:
Please help me check the error below; the other logs do not seem to give any hint.
The error is below:
[2020-03-25 10:36:38 +0800] [INFO] org.apache.heron.packing.roundrobin.RoundRobinPacking: Pack internal: container CPU hint: 2.000, RAM hint: ByteAmount{1.0 GB (1073741824 bytes)}, disk hint: ByteAmount{-1 bytes}.
[2020-03-25 10:36:38 +0800] [INFO] org.apache.heron.packing.roundrobin.RoundRobinPacking: Pack internal finalized: container#1 CPU: 2.000000, RAM: ByteAmount{1.0 GB (1073741824 bytes)}, disk: ByteAmount{13.0 GB (13958643712 bytes)}.
[2020-03-25 10:36:38 +0800] [INFO] org.apache.heron.packing.roundrobin.RoundRobinPacking: Initalizing RoundRobinPacking. CPU default: 1.000000, RAM default: ByteAmount{1.0 GB (1073741824 bytes)}, DISK default: ByteAmount{1.0 GB (1073741824 bytes)}, RAM padding: ByteAmount{2.0 GB (2147483648 bytes)}.
[2020-03-25 10:36:38 +0800] [WARNING] org.apache.heron.packing.roundrobin.RoundRobinPacking: Container#1 (max RAM: ByteAmount{1.0 GB (1073741824 bytes)}) is now hosting instances that take up to ByteAmount{0 bytes} RAM. The container may not have enough resource to accommodate internal processes which take up to ByteAmount{2.0 GB (2147483648 bytes)} RAM.
[2020-03-25 10:36:38 +0800] [INFO] org.apache.heron.packing.roundrobin.RoundRobinPacking: Pack internal: container CPU hint: 2.000, RAM hint: ByteAmount{1.0 GB (1073741824 bytes)}, disk hint: ByteAmount{-1 bytes}.
[2020-03-25 10:36:38 +0800] [INFO] org.apache.heron.packing.roundrobin.RoundRobinPacking: Pack internal finalized: container#1 CPU: 2.000000, RAM: ByteAmount{1.0 GB (1073741824 bytes)}, disk: ByteAmount{13.0 GB (13958643712 bytes)}.
[2020-03-25 10:36:38 +0800] [INFO] org.apache.heron.scheduler.yarn.YarnLauncher: Initializing topology: Test3Topology, core: /root/.heron/dist/heron-core.tar.gz
[2020-03-25 10:36:38 +0800] [INFO] org.apache.heron.statemgr.zookeeper.curator.CuratorStateManager: Created node for path: /heron/topologies/Test3Topology
[2020-03-25 10:36:38 +0800] [INFO] org.apache.heron.statemgr.zookeeper.curator.CuratorStateManager: Created node for path: /heron/packingplans/Test3Topology
[2020-03-25 10:36:38 +0800] [INFO] org.apache.heron.statemgr.zookeeper.curator.CuratorStateManager: Created node for path: /heron/executionstate/Test3Topology
[2020-03-25 10:36:38 +0800] [SEVERE] org.apache.reef.runtime.yarn.YarnClasspathProvider: YarnConfiguration.YARN_APPLICATION_CLASSPATH is empty. This indicates a broken cluster configuration.
2020-03-25 10:36:38,705 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[2020-03-25 10:36:39 +0800] [INFO] org.apache.reef.util.REEFVersion: REEF Version: 0.14.0
[2020-03-25 10:36:39 +0800] [INFO] org.apache.heron.scheduler.yarn.ReefClientSideHandlers: Initializing REEF client handlers for Heron, topology: Test3Topology
[INFO] RMProxy - Connecting to ResourceManager at guoxinghua1/127.0.0.1:8032
[2020-03-25 10:36:51 +0800] [WARNING] org.apache.reef.runtime.common.files.JobJarMaker: Failed to delete [/tmp/reef-job-1836122165165029413]
2020-03-25 10:36:54,247 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2020-03-25 10:36:54,666 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2020-03-25 10:36:54,988 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2020-03-25 10:36:55,149 INFO conf.Configuration: resource-types.xml not found
2020-03-25 10:36:55,149 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
[2020-03-25 10:36:55 +0800] [INFO] org.apache.reef.runtime.yarn.client.YarnSubmissionHelper: Submitting REEF Application to YARN. ID: application_1585102108714_0002
2020-03-25 10:36:55,210 INFO impl.YarnClientImpl: Submitted application application_1585102108714_0002
[2020-03-25 10:36:59 +0800] [INFO] org.apache.heron.scheduler.yarn.ReefClientSideHandlers: Topology Test3Topology is running, jobId Test3Topology.
[2020-03-25 10:36:59 +0800] [INFO] org.apache.heron.statemgr.zookeeper.curator.CuratorStateManager: Closing the CuratorClient to: 127.0.0.1:2181
2020-03-25 10:36:59,098 INFO imps.CuratorFrameworkImpl: backgroundOperationsLoop exiting
2020-03-25 10:36:59,104 INFO zookeeper.ZooKeeper: Session: 0x1000030d5e70002 closed
[2020-03-25 10:36:59 +0800] [INFO] org.apache.heron.statemgr.zookeeper.curator.CuratorStateManager: Closing the tunnel processes
2020-03-25 10:36:59,104 INFO zookeeper.ClientCnxn: EventThread shut down for session: 0x1000030d5e70002
[2020-03-25 10:37:04 +0800] [WARNING] org.apache.reef.runtime.common.client.RuntimeErrorProtoHandler: socket://127.0.0.1:52988 Runtime Error: com.google.protobuf.Descriptors$Descriptor.getOneofs()Ljava/util/List;
[2020-03-25 10:37:04 +0800] [SEVERE] org.apache.heron.scheduler.yarn.ReefClientSideHandlers: Failed to start topology: Test3Topology
[2020-03-25 10:37:04 +0800] [WARNING] org.apache.reef.runtime.common.client.RuntimeErrorProtoHandler: socket://127.0.0.1:52990 Runtime Error: Thread main threw an uncaught exception.
[2020-03-25 10:37:04 +0800] [SEVERE] org.apache.heron.scheduler.yarn.ReefClientSideHandlers: Failed to start topology: Test3Topology
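
One possible reading of the first Runtime Error: a bare method signature such as `com.google.protobuf.Descriptors$Descriptor.getOneofs()Ljava/util/List;` looks like the message of a `java.lang.NoSuchMethodError`, which would suggest that an older protobuf-java jar (one without `getOneofs`) is picked up first on the classpath of the failing machines. This is only a guess, not a confirmed diagnosis. To check it, here is a rough Python sketch that walks some directories and reports every jar that bundles the `Descriptors$Descriptor` class, and whether the class file mentions `getOneofs` at all. The function names and the byte-search heuristic are my own, not part of Heron, REEF, or Hadoop.

```python
import sys
import zipfile
from pathlib import Path

# Jar entry for the class named in the Runtime Error above.
DESCRIPTOR_ENTRY = "com/google/protobuf/Descriptors$Descriptor.class"


def scan_jar(jar_path):
    """Return None if the jar does not bundle the Descriptor class (or is unreadable),
    otherwise True/False for whether the class file contains the name getOneofs."""
    try:
        with zipfile.ZipFile(jar_path) as jar:
            if DESCRIPTOR_ENTRY not in jar.namelist():
                return None
            # Heuristic: method names are stored as plain strings in the class file.
            return b"getOneofs" in jar.read(DESCRIPTOR_ENTRY)
    except (zipfile.BadZipFile, OSError):
        return None


def scan_dirs(dirs):
    """Map each jar under the given directories that bundles the class to True/False."""
    found = {}
    for d in dirs:
        for jar_path in sorted(Path(d).rglob("*.jar")):
            result = scan_jar(jar_path)
            if result is not None:
                found[str(jar_path)] = result
    return found


if __name__ == "__main__":
    # Directories to scan are passed on the command line.
    for path, has_method in scan_dirs(sys.argv[1:]).items():
        print(("has getOneofs " if has_method else "NO getOneofs  ") + path)
```

It could be run on both a working and a failing machine, for example as `python scan_protobuf.py /root/.heron <hadoop install dir>` (the script name and directories are placeholders), and the two outputs compared. If the failing machines show a jar reported as "NO getOneofs" that the Mac does not have, a protobuf version conflict becomes more likely.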