Fall back to sequential concat when HDFS concat fails #1478

Closed
fnothaft opened this Issue Apr 7, 2017 · 1 comment

Member

fnothaft commented Apr 7, 2017

There's a dizzying array of undocumented cases that will cause HDFS' concat method to fail. The most recent one that I found is calling concat on files in an encrypted HDFS directory (an encryption zone):

17/04/07 01:45:00 ERROR yarn.ApplicationMaster: User class threw exception: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.HadoopIllegalArgumentException): concat can not be called for files in an encryption zone.
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.concatInternal(FSNamesystem.java:2116)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.concatInt(FSNamesystem.java:2081)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.concat(FSNamesystem.java:2043)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.concat(NameNodeRpcServer.java:814)
	at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.concat(AuthorizationProviderProxyClientProtocol.java:280)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.concat(ClientNamenodeProtocolServerSideTranslatorPB.java:562)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)

Yahtzee! Anyways, I'm thinking that the reasonable way to move forward is to fall back on the serial concat code when parallel concatenation fails.
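
To make the fallback idea concrete, here is a minimal sketch, not the project's actual code. It assumes the fast path is hidden behind a hypothetical `FastConcat` interface (in a real setup this might wrap Hadoop's `FileSystem.concat(Path, Path[])`), and it uses local `java.nio.file` paths so it runs without a cluster. It also assumes a failed fast concat leaves the part files untouched, so the sequential path can still read them.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/** Sketch: try a fast concat first, fall back to a sequential copy if it throws. */
final class ConcatWithFallback {

    /** Hypothetical stand-in for the fast path, e.g. a wrapper around HDFS concat. */
    interface FastConcat {
        void concat(Path target, List<Path> parts) throws IOException;
    }

    /** Sequential path: write each part into the target, in order. */
    static void sequentialConcat(Path target, List<Path> parts) throws IOException {
        try (OutputStream out = Files.newOutputStream(target)) {
            for (Path part : parts) {
                Files.copy(part, out);
            }
        }
    }

    /** Returns true if the fast path succeeded, false if the sequential fallback ran. */
    static boolean concat(FastConcat fast, Path target, List<Path> parts) throws IOException {
        try {
            fast.concat(target, parts);
            return true;
        } catch (IOException | RuntimeException e) {
            // The failure cases are undocumented, so this catches broadly
            // instead of matching on specific exception messages.
            System.err.println("Fast concat failed, falling back: " + e.getMessage());
            sequentialConcat(target, parts);
            return false;
        }
    }
}
```

The broad catch is a deliberate trade-off in this sketch: it hides which failure occurred, but it avoids having to enumerate every case where the fast path refuses to run.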

@fnothaft fnothaft added the bug label Apr 7, 2017

@fnothaft fnothaft added this to the 0.23.0 milestone Apr 7, 2017

@fnothaft fnothaft self-assigned this Apr 7, 2017

Member

fnothaft commented Apr 7, 2017

Actually, instead of falling back, I'm thinking it makes more sense to allow the user to disable the use of the fast concat method.
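
A rough sketch of what such a switch could look like, assuming a hypothetical `disableFastConcat` option (the name is made up for illustration and is not taken from the project). The fast path is chosen only when the file system supports it and the user has not opted out.

```java
/** Sketch: let the caller opt out of the fast concat path up front. */
final class ConcatOptions {

    enum Strategy { FAST, SEQUENTIAL }

    // Hypothetical user-facing switch; the name is an assumption of this sketch.
    private final boolean disableFastConcat;

    ConcatOptions(boolean disableFastConcat) {
        this.disableFastConcat = disableFastConcat;
    }

    /** Fast concat only if the file system supports it and the user did not disable it. */
    Strategy choose(boolean fileSystemSupportsFastConcat) {
        return (fileSystemSupportsFastConcat && !disableFastConcat)
                ? Strategy.FAST
                : Strategy.SEQUENTIAL;
    }
}
```

Compared with an automatic fallback, an explicit switch keeps failures visible: a user on an encrypted directory sets the option once, and unexpected concat errors elsewhere still surface.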
