
Scan doesn't work in a cluster with kerberos authentication #6

Closed · ennio1991 opened this issue Sep 2, 2015 · 14 comments

@ennio1991 commented Sep 2, 2015

Hi, I am trying the example jobs in a cluster with Kerberos authentication.
HBaseBulkPutExample.scala works properly, while HBaseScanRDDExample.scala does not.
When I run it in my environment, the job never terminates.
Checking the YARN logs, I see the following exception multiple times:

15/09/01 17:52:50 WARN security.UserGroupInformation: PriviledgedActionException as:tubbl999 (auth:PROXY) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
15/09/01 17:52:50 WARN ipc.RpcClient: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
15/09/01 17:52:50 ERROR ipc.RpcClient: SASL authentication failed. The most likely cause is missing or invalid credentials. Consider 'kinit'.
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179)
at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupSaslConnection(RpcClient.java:770)
at org.apache.hadoop.hbase.ipc.RpcClient$Connection.access$600(RpcClient.java:357)
at org.apache.hadoop.hbase.ipc.RpcClient$Connection$2.run(RpcClient.java:891)
at org.apache.hadoop.hbase.ipc.RpcClient$Connection$2.run(RpcClient.java:888)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:888)
at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1543)
at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1442)
at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661)
at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.get(ClientProtos.java:30304)
at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRowOrBefore(ProtobufUtil.java:1564)
at org.apache.hadoop.hbase.client.HTable$2.call(HTable.java:711)
at org.apache.hadoop.hbase.client.HTable$2.call(HTable.java:709)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:114)
at org.apache.hadoop.hbase.client.HTable.getRowOrBefore(HTable.java:715)
at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:144)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:1140)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1204)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:1092)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:1049)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionLocation(HConnectionManager.java:890)
at org.apache.hadoop.hbase.client.RegionServerCallable.prepare(RegionServerCallable.java:72)
at org.apache.hadoop.hbase.client.ScannerCallable.prepare(ScannerCallable.java:125)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:113)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:90)
at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:283)
at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:188)
at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:183)
at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:110)
at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:739)
at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.restart(TableRecordReaderImpl.java:90)
at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.initialize(TableRecordReaderImpl.java:152)
at org.apache.hadoop.hbase.mapreduce.TableRecordReader.initialize(TableRecordReader.java:125)
at com.cloudera.spark.hbase.HBaseScanRDD$$anon$1.<init>(HBaseScanRDD.scala:94)
at com.cloudera.spark.hbase.HBaseScanRDD.compute(HBaseScanRDD.scala:79)
at com.cloudera.spark.hbase.HBaseScanRDD.compute(HBaseScanRDD.scala:28)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:56)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:121)
at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:223)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
... 48 more

@tmalaska (Contributor) commented Sep 2, 2015

With this version of SparkOnHBase on YARN, you need to run your Spark job in yarn-client mode.

We are working on yarn-cluster mode, and it will work in future versions.

@ennio1991 (Author) commented Sep 2, 2015

I tried the example with the corresponding spark-submit command, so I did run it in yarn-client mode:
https://github.com/cloudera-labs/SparkOnHBase/blob/cdh5-0.0.2/README.md#scan-that-works-on-kerberos

@tmalaska (Contributor) commented Sep 2, 2015

That should work; I know I tested it. Hmm, what to do.

I would open a ticket with Cloudera; then I can help through a WebEx.

If you can't do that, here are the points at which to add debug information around this call (a sketch of such a probe follows the links below):

val creds = SparkHadoopUtil.get.getCurrentUserCredentials()

https://github.com/cloudera-labs/SparkOnHBase/blob/cdh5-0.0.2/src/main/scala/com/cloudera/spark/hbase/HBaseContext.scala#L214

https://github.com/cloudera-labs/SparkOnHBase/blob/cdh5-0.0.2/src/main/scala/com/cloudera/spark/hbase/HBaseContext.scala#L70
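
A minimal sketch of such a probe, under assumptions: logCurrentCredentials is a hypothetical helper (not part of SparkOnHBase), and SparkHadoopUtil.get.getCurrentUserCredentials() is the Spark 1.x call already used by HBaseContext:

```scala
import scala.collection.JavaConverters._

import org.apache.hadoop.security.UserGroupInformation
import org.apache.spark.deploy.SparkHadoopUtil

// Hypothetical debug helper: call this at the two points linked above to
// see whether the HBase delegation token actually reached the current JVM.
def logCurrentCredentials(where: String): Unit = {
  val ugi = UserGroupInformation.getCurrentUser
  println(s"[$where] user=${ugi.getUserName}, auth=${ugi.getAuthenticationMethod}")
  val creds = SparkHadoopUtil.get.getCurrentUserCredentials()
  if (creds == null) {
    println(s"[$where] no credentials attached to the current user")
  } else {
    creds.getAllTokens.asScala.foreach { token =>
      println(s"[$where] token kind=${token.getKind}, service=${token.getService}")
    }
  }
}
```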

@tmalaska (Contributor) commented Sep 3, 2015

Yeah, it looks like this line is missing from HBaseScanRDD:

https://github.com/cloudera-labs/SparkOnHBase/blob/cdh5-0.0.2/src/main/scala/com/cloudera/spark/hbase/HBaseContext.scala#L225

It should also be here:

https://github.com/cloudera-labs/SparkOnHBase/blob/cdh5-0.0.2/src/main/scala/com/cloudera/spark/hbase/HBaseScanRDD.scala#L137

For a workaround, do the following (a fuller sketch follows the sample code below):

  • Make a fake RDD and give it 100 partitions (make sure the number of partitions equals the number of cores you are using)
  • Put one value in each partition
  • Then do an HBaseContext.foreachPartition on the RDD
  • Then do your HBaseScanRDD

Here is sample code to make the RDD of partitions:

val rowRDD = sc.parallelize((0 until numberOfPartitions).map(i => i), numberOfPartitions)

But if you would like to fix the main issue, that would be a better fix.
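
A minimal sketch of that workaround, under assumptions: sc is an existing SparkContext, hbaseContext is an already-constructed com.cloudera.spark.hbase.HBaseContext, and foreachPartition takes the RDD plus a function over (partition iterator, HConnection); verify the exact signature against HBaseContext.scala in your version:

```scala
import org.apache.hadoop.hbase.client.HConnection

// Match the number of partitions to the number of executor cores so that
// every core runs the no-op pass and picks up the credentials.
val numberOfPartitions = 100

// One dummy value per partition.
val rowRDD = sc.parallelize((0 until numberOfPartitions).map(i => i), numberOfPartitions)

// A no-op foreachPartition: its only purpose is the credential setup that
// HBaseContext performs on each executor before invoking the function.
hbaseContext.foreachPartition(rowRDD, (it: Iterator[Int], conn: HConnection) => {
  // intentionally empty
})

// Now run the scan exactly as in HBaseScanRDDExample.scala; the executors
// already hold the credentials.
```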

@ennio1991 (Author) commented Sep 3, 2015

Thank you very much Ted, now the scan works too!

ennio1991 closed this Sep 3, 2015

@tmalaska (Contributor) commented Sep 3, 2015

Hey ennio, can you summarize the steps you took to resolve the problem, so I can add the fix back into the project?

Thanks

@ennio1991 (Author) commented Sep 3, 2015

Ted, I sent you a pull request (#7).
I share the credentials from the driver with all executors via a broadcast, then authenticate each user with the shared credentials.
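
A minimal sketch of that approach, under assumptions: sc is an existing SparkContext, and the two helpers are illustrative rather than the actual code in #7. Hadoop's Credentials is a Writable, not Java-serializable, so it has to be round-tripped through a byte array before broadcasting:

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, DataOutputStream}

import org.apache.hadoop.security.{Credentials, UserGroupInformation}
import org.apache.spark.deploy.SparkHadoopUtil

// Illustrative helper: serialize the Writable Credentials to bytes.
def serializeCredentials(creds: Credentials): Array[Byte] = {
  val bytesOut = new ByteArrayOutputStream()
  val dataOut = new DataOutputStream(bytesOut)
  creds.write(dataOut)
  dataOut.close()
  bytesOut.toByteArray
}

// Illustrative helper: rebuild Credentials from the broadcast bytes.
def deserializeCredentials(bytes: Array[Byte]): Credentials = {
  val creds = new Credentials()
  creds.readFields(new DataInputStream(new ByteArrayInputStream(bytes)))
  creds
}

// On the driver: capture the current user's credentials and broadcast them.
val credsBytes = serializeCredentials(SparkHadoopUtil.get.getCurrentUserCredentials())
val credsBroadcast = sc.broadcast(credsBytes)

// On each executor (e.g., at the start of HBaseScanRDD.compute), before
// opening the HBase connection: attach the shared credentials to the
// current user so the SASL/Kerberos handshake can find them.
UserGroupInformation.getCurrentUser.addCredentials(
  deserializeCredentials(credsBroadcast.value))
```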

@tmalaska (Contributor) commented Sep 3, 2015

Thank you super much.

@drocsid commented Apr 25, 2016

@ennio1991: since you had luck using this, could you give me some advice on my referenced issue #13?

@ennio1991 (Author) commented Apr 26, 2016

@drocsid commented Apr 26, 2016

Hi @ennio1991, I just had a look. I'm not using Kerberos, so I don't expect my issues to be related. Thanks for the suggestion. I'm going to try to work with the examples today.

@nauu commented Nov 25, 2016

@tmalaska is there any solution that works in yarn-cluster mode with Kerberos?

@bhuvana-pathsamatla commented Jan 7, 2019

Hi, has anyone found a solution to this? We are still facing the same issue.

@busbey (Collaborator) commented Jan 7, 2019

The SparkOnHBase project in Cloudera Labs was merged into the upstream HBase project in 2015, and a backport of that work has shipped with CDH since CDH 5.7:

https://blog.cloudera.com/blog/2015/08/apache-spark-comes-to-apache-hbase-with-hbase-spark-module/

This problem is solved there, and you should just use the version that ships with CDH.
