Fall back to sequential concat when HDFS concat fails #1478

Closed
fnothaft opened this issue Apr 7, 2017 · 1 comment

fnothaft commented Apr 7, 2017

There's a dizzying array of undocumented cases that will cause HDFS' concat method to fail. The most recent one that I found is calling concat on files in an encrypted HDFS directory (an encryption zone):

17/04/07 01:45:00 ERROR yarn.ApplicationMaster: User class threw exception: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.HadoopIllegalArgumentException): concat can not be called for files in an encryption zone.
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.concatInternal(FSNamesystem.java:2116)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.concatInt(FSNamesystem.java:2081)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.concat(FSNamesystem.java:2043)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.concat(NameNodeRpcServer.java:814)
	at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.concat(AuthorizationProviderProxyClientProtocol.java:280)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.concat(ClientNamenodeProtocolServerSideTranslatorPB.java:562)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)

Yahtzee! Anyways, I'm thinking that the reasonable way to move forward is to fall back on the serial concat code when parallel concatenation fails.
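
A minimal sketch of that fallback idea, in plain Java with no Hadoop dependency. Everything here is hypothetical and for illustration only: `ConcatWithFallback`, `FastConcat`, and `sequentialConcat` are not names from ADAM or Hadoop, and the `FastConcat` interface merely stands in for a fast merge such as HDFS concat. It also assumes that a failed fast concat leaves the part files untouched, so the serial path can simply start over.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these names come from ADAM or Hadoop.
class ConcatWithFallback {

    /** Stand-in for a fast merge (e.g. HDFS concat) that may be rejected. */
    interface FastConcat {
        void concat(Path target, List<Path> parts) throws IOException;
    }

    /** Serial path: write each part's bytes into the target, in order. */
    static void sequentialConcat(Path target, List<Path> parts) throws IOException {
        // With no options, newOutputStream creates or truncates the target.
        try (OutputStream out = Files.newOutputStream(target)) {
            for (Path part : parts) {
                Files.copy(part, out);
            }
        }
    }

    /**
     * Tries the fast path first and redoes the merge serially if it throws.
     * Returns true if the fast path succeeded, false if the fallback ran.
     */
    static boolean concat(FastConcat fast, Path target, List<Path> parts) throws IOException {
        try {
            fast.concat(target, parts);
            return true;
        } catch (IOException | IllegalArgumentException e) {
            System.err.println("fast concat failed, falling back: " + e.getMessage());
            sequentialConcat(target, parts);
            return false;
        }
    }

    /** Demo: the fast path always fails, so the merged file comes from the fallback. */
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("concat-demo");
            Path p0 = Files.write(dir.resolve("part-0"), "part0\n".getBytes(StandardCharsets.UTF_8));
            Path p1 = Files.write(dir.resolve("part-1"), "part1\n".getBytes(StandardCharsets.UTF_8));
            Path target = dir.resolve("merged");
            FastConcat alwaysFails = (t, p) -> {
                throw new IOException("concat can not be called for files in an encryption zone.");
            };
            concat(alwaysFails, target, Arrays.asList(p0, p1));
            return new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```

The catch is deliberately broad, since the failure cases are undocumented. The cost is that a genuine I/O problem also triggers a second, slower attempt before it surfaces.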

fnothaft added the bug label Apr 7, 2017
fnothaft added this to the 0.23.0 milestone Apr 7, 2017
fnothaft self-assigned this Apr 7, 2017

fnothaft commented Apr 7, 2017

Actually, instead of falling back, I'm thinking it makes more sense to allow the user to disable the use of the fast concat method.
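
As a rough illustration of the opt-out approach, the choice could be driven by a command-line switch. The flag name `--disable-fast-concat` and the `ConcatStrategyChooser` class below are assumptions made up for this sketch, not ADAM's actual option names.

```java
import java.util.Arrays;

// Hypothetical sketch: the flag name and class are illustrative only.
class ConcatStrategyChooser {

    enum Strategy { FAST, SEQUENTIAL }

    /** Assumed flag name; the fast path stays the default unless it is passed. */
    static final String DISABLE_FLAG = "--disable-fast-concat";

    /** Picks the sequential path only when the user explicitly opts out of fast concat. */
    static Strategy choose(String[] args) {
        return Arrays.asList(args).contains(DISABLE_FLAG)
                ? Strategy.SEQUENTIAL
                : Strategy.FAST;
    }

    public static void main(String[] args) {
        System.out.println(choose(args));
    }
}
```

Compared with an automatic fallback, an explicit switch keeps failures visible: a user on an encrypted directory opts out once, and unexpected concat errors are not silently masked.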

fnothaft added a commit to fnothaft/adam that referenced this issue Apr 7, 2017
fnothaft added a commit to fnothaft/adam that referenced this issue Apr 7, 2017
fnothaft added a commit to fnothaft/adam that referenced this issue Apr 7, 2017
fnothaft added a commit to fnothaft/adam that referenced this issue Apr 22, 2017
heuermh added a commit that referenced this issue Apr 24, 2017
heuermh added this to Completed in Release 0.23.0 May 30, 2017