
There are 1 datanode(s) running and 1 node(s) are excluded in this operation. #38

Open
MrHurt opened this issue Jul 1, 2019 · 14 comments

Comments

@MrHurt

MrHurt commented Jul 1, 2019

Question:
Cannot write to hdfs://localhost:9000/work/test.txt.
The error log is as follows:

namenode_1                   | java.io.IOException: File /work/test.txt could only be replicated to 0 nodes instead of minReplication (=1).  There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
namenode_1                   |  at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1628)
namenode_1                   |  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3121)
namenode_1                   |  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3045)
namenode_1                   |  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:725)
namenode_1                   |  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:493)
namenode_1                   |  at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
namenode_1                   |  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
namenode_1                   |  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
namenode_1                   |  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2217)
namenode_1                   |  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2213)
namenode_1                   |  at java.security.AccessController.doPrivileged(Native Method)
namenode_1                   |  at javax.security.auth.Subject.doAs(Subject.java:422)
namenode_1                   |  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
namenode_1                   |  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2213)

Has anyone run into the same problem?
Thanks

@c1rew

c1rew commented Feb 26, 2020

@MrHurt
I encountered the same situation. If you are running in a virtual machine, you need to expose the datanode address ports.
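For reference, a minimal sketch of publishing those ports in docker-compose.yml, assuming the bde2020 datanode image used elsewhere in this thread and the Hadoop 3 default datanode HTTP and data-transfer ports (9864 and 9866); adjust names and versions to your own setup:

...
datanode:
    image: bde2020/hadoop-datanode:2.0.0-hadoop3.2.1-java8
    ports:
      - "9864:9864"
      - "9866:9866"
...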

@c1rew

c1rew commented Apr 29, 2020 via email

@bde27

bde27 commented Apr 29, 2020

Thank you very much for your reply.

The solution of opening port 9864 works when you want to write from the server hosting the Hadoop cluster. Cool, thank you!

But when I try to write from an external server, it doesn't work and I get the same error.

Do you have any idea?

@c1rew

c1rew commented Apr 29, 2020

> Thank you very much for your reply.
>
> The solution of opening port 9864 works when you want to write from the server hosting the Hadoop cluster. Cool, thank you!
>
> But when I try to write from an external server, it doesn't work and I get the same error.
>
> Do you have any idea?

If you really want to fix this problem, you can debug the code; then you will find the answer.

When the request reaches the host, the client cannot resolve the container IP, so you need to add the host information to /etc/hosts.
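For example, a single line in the client machine's /etc/hosts might look like the sketch below; the IP is a placeholder for your Docker host, and the hostnames must match the names the cluster advertises (container IDs may also be needed, as noted later in this thread):

192.168.1.10   namenode datanode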

@bde27

bde27 commented Apr 30, 2020

The problem was not with the Hadoop cluster but with the default parameters used when writing.

The parameters "dfs.client.use.datanode.hostname" and "dfs.datanode.use.datanode.hostname" must be forced to true when writing from an external program.
Otherwise the external client uses the server's internal IP rather than the hostnames.

Thank you for your help.
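For illustration, a minimal sketch of such a client-side override via the Hadoop client API in Kotlin (the fs.defaultFS URI and the target path are placeholders; adjust them to your setup):

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.fs.Path

fun main() {
    val conf = Configuration().apply {
        // Placeholder namenode address; replace with your own.
        set("fs.defaultFS", "hdfs://localhost:9000")
        // Address datanodes by hostname instead of their internal (Docker) IP.
        set("dfs.client.use.datanode.hostname", "true")
    }

    val fs = FileSystem.get(conf)
    // Write a small test file, mirroring the path from the original report.
    fs.create(Path("/work/test.txt")).use { out ->
        out.writeBytes("hello hdfs\n")
    }
    fs.close()
}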

@rodrigo-borges

@bde27 aren't these two parameters already forced to true in the entrypoint.sh?

...
if [ "$MULTIHOMED_NETWORK" = "1" ]; then
    echo "Configuring for multihomed network"

    # HDFS
    addProperty /etc/hadoop/hdfs-site.xml dfs.namenode.rpc-bind-host 0.0.0.0
    addProperty /etc/hadoop/hdfs-site.xml dfs.namenode.servicerpc-bind-host 0.0.0.0
    addProperty /etc/hadoop/hdfs-site.xml dfs.namenode.http-bind-host 0.0.0.0
    addProperty /etc/hadoop/hdfs-site.xml dfs.namenode.https-bind-host 0.0.0.0
    addProperty /etc/hadoop/hdfs-site.xml dfs.client.use.datanode.hostname true
    addProperty /etc/hadoop/hdfs-site.xml dfs.datanode.use.datanode.hostname true
...

I'm facing the same issue when trying to write or read files from pyspark in an external client. Could you please elaborate more on what you did to solve the problem?

@haydenzhourepo

> The problem was not with the Hadoop cluster but with the default parameters used when writing.
>
> The parameters "dfs.client.use.datanode.hostname" and "dfs.datanode.use.datanode.hostname" must be forced to true when writing from an external program. Otherwise the external client uses the server's internal IP rather than the hostnames.
>
> Thank you for your help.

Forcing "dfs.client.use.datanode.hostname" and "dfs.datanode.use.datanode.hostname" to true is not working for me with Hadoop set up via Docker.

@A7Khan

A7Khan commented Dec 7, 2020

Hey guys,

Has anyone actually found a fix for this or solved it? I'm facing the same problem with my Hadoop Docker setup. I ran a simple wordcount test to check that everything works, and it does, but as soon as Spark Streaming starts writing into it, HDFS doesn't seem to pick anything up.

2020-12-07 09:20:58.212 WARN 1 --- [ool-22-thread-1] o.a.spark.streaming.CheckpointWriter : Could not write checkpoint for time 1607332854000 ms to file 'hdfs://namenode:8020/dangerousgoods/checkpoint/checkpoint-1607332858000'

2020-12-07 09:20:58.213 INFO 1 --- [uler-event-loop] o.a.spark.storage.memory.MemoryStore : Block broadcast_18 stored as values in memory (estimated size 17.2 KB, free 9.2 GB)

2020-12-07 09:20:58.214 INFO 1 --- [uler-event-loop] o.a.spark.storage.memory.MemoryStore : Block broadcast_18_piece0 stored as bytes in memory (estimated size 7.4 KB, free 9.2 GB)

2020-12-07 09:20:58.214 INFO 1 --- [er-event-loop-8] o.apache.spark.storage.BlockManagerInfo : Added broadcast_18_piece0 in memory on 16b1f170f11c:42679 (size: 7.4 KB, free: 9.2 GB)

2020-12-07 09:20:58.215 INFO 1 --- [uler-event-loop] org.apache.spark.SparkContext : Created broadcast 18 from broadcast at DAGScheduler.scala:1163

2020-12-07 09:20:58.215 INFO 1 --- [uler-event-loop] org.apache.spark.scheduler.DAGScheduler : Submitting 1 missing tasks from ShuffleMapStage 53 (MapPartitionsRDD[28] at mapToPair at RealtimeProcessor.java:256) (first 15 tasks are for partitions Vector(0))

2020-12-07 09:20:58.215 INFO 1 --- [uler-event-loop] o.a.spark.scheduler.TaskSchedulerImpl : Adding task set 53.0 with 1 tasks

2020-12-07 09:20:58.216 INFO 1 --- [er-event-loop-7] o.apache.spark.scheduler.TaskSetManager : Starting task 0.0 in stage 53.0 (TID 19, 10.0.9.185, executor 0, partition 0, PROCESS_LOCAL, 7760 bytes)

2020-12-07 09:20:58.221 INFO 1 --- [r-event-loop-10] o.apache.spark.storage.BlockManagerInfo : Added broadcast_18_piece0 in memory on 10.0.9.185:38567 (size: 7.4 KB, free: 366.2 MB)

2020-12-07 09:20:58.225 INFO 1 --- [result-getter-0] o.apache.spark.scheduler.TaskSetManager : Finished task 0.0 in stage 53.0 (TID 19) in 9 ms on 10.0.9.185 (executor 0) (1/1)

2020-12-07 09:20:58.225 INFO 1 --- [result-getter-0] o.a.spark.scheduler.TaskSchedulerImpl : Removed TaskSet 53.0, whose tasks have all completed, from pool

2020-12-07 09:20:58.226 INFO 1 --- [uler-event-loop] org.apache.spark.scheduler.DAGScheduler : ShuffleMapStage 53 (mapToPair at RealtimeProcessor.java:256) finished in 0.014 s

2020-12-07 09:20:58.226 INFO 1 --- [uler-event-loop] org.apache.spark.scheduler.DAGScheduler : looking for newly runnable stages

2020-12-07 09:20:58.226 INFO 1 --- [uler-event-loop] org.apache.spark.scheduler.DAGScheduler : running: Set()

2020-12-07 09:20:58.226 INFO 1 --- [uler-event-loop] org.apache.spark.scheduler.DAGScheduler : waiting: Set(ResultStage 55)

2020-12-07 09:20:58.226 INFO 1 --- [uler-event-loop] org.apache.spark.scheduler.DAGScheduler : failed: Set()

2020-12-07 09:20:58.227 INFO 1 --- [uler-event-loop] org.apache.spark.scheduler.DAGScheduler : Submitting ResultStage 55 (MapPartitionsRDD[33] at map at RealtimeProcessor.java:264), which has no missing parents

2020-12-07 09:20:58.227 INFO 1 --- [uler-event-loop] o.a.spark.storage.memory.MemoryStore : Block broadcast_19 stored as values in memory (estimated size 8.8 KB, free 9.2 GB)

2020-12-07 09:20:58.228 INFO 1 --- [uler-event-loop] o.a.spark.storage.memory.MemoryStore : Block broadcast_19_piece0 stored as bytes in memory (estimated size 4.4 KB, free 9.2 GB)

2020-12-07 09:20:58.229 INFO 1 --- [er-event-loop-0] o.apache.spark.storage.BlockManagerInfo : Added broadcast_19_piece0 in memory on 16b1f170f11c:42679 (size: 4.4 KB, free: 9.2 GB)

2020-12-07 09:20:58.229 INFO 1 --- [uler-event-loop] org.apache.spark.SparkContext : Created broadcast 19 from broadcast at DAGScheduler.scala:1163

2020-12-07 09:20:58.229 INFO 1 --- [uler-event-loop] org.apache.spark.scheduler.DAGScheduler : Submitting 1 missing tasks from ResultStage 55 (MapPartitionsRDD[33] at map at RealtimeProcessor.java:264) (first 15 tasks are for partitions Vector(0))

That is the first error that appears; after a few seconds I get exactly the same error as in the title of this issue.

@macknight

Hi, has anyone been able to fix this issue?
I'm using the Docker Hub image.
The parameters "dfs.client.use.datanode.hostname" and "dfs.datanode.use.datanode.hostname" are both set to true in my hdfs-site.xml, but I still have the problem.

@shivanshkaushikk

Any update on this issue?

@fenggolang

I encounter the same situation when I deploy the docker-hadoop image in a k8s cluster.

@shivanshkaushikk

> I encounter the same situation when I deploy the docker-hadoop image in a k8s cluster.

Just found out that the issue at my end was connectivity between the datanode and my local Python app; after deploying the app in the same Docker network as Hadoop, it was solved.
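For example (a sketch: docker-hadoop_default is a placeholder for whatever network name docker-compose actually created for the Hadoop stack, and my-python-app is a hypothetical client image; check docker network ls for the real name):

docker run --rm --network docker-hadoop_default my-python-app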

@AChangFeng

AChangFeng commented May 31, 2022

If you want to write from external hosts:

  • Open datanode container ports 9864 and 9866, then recreate the containers.
  • Add ${host ip} datanode namenode ${namenode container id} ${datanode container id} to your local hosts file.
  • Set the client configuration parameter dfs.client.use.datanode.hostname to true:
Configuration().apply {
  this.set("dfs.client.use.datanode.hostname", "true")
}

If you do not want to add ${namenode container id} ${datanode container id} to your local hosts file, you can set the datanode container's hostname to datanode and the namenode container's hostname to namenode with the hostname instruction in the container configuration in docker-compose.yaml:

...
namenode:
    image: bde2020/hadoop-namenode:2.0.0-hadoop3.2.1-java8
    hostname: namenode
...
datanode:
    image: bde2020/hadoop-datanode:2.0.0-hadoop3.2.1-java8
    hostname: datanode
...

@geekyouth

When DataX reports this error on write, you should use:

"hadoopConfig": {
    "dfs.client.use.datanode.hostname": "true",
    "dfs.datanode.use.datanode.hostname": "true"
}
