
Conversation

@jerryshao
Contributor

What changes were proposed in this pull request?

This PR proposes adding a new API to SparkHadoopUtil to trigger manual credential updating at run time.

This is mainly intended for long-running Spark applications that need to access different secured systems at run time. For example, when Zeppelin or Spark Shell needs to access a different secured HBase cluster or Hive metastore service at run time, it requires tokens from the new services and needs them propagated to executors immediately. Previously we either had to relaunch the application to get new tokens, or wait for the old tokens to expire before new ones were fetched.

With this new API, users can manually trigger credential updating at run time when required. Credentials are renewed in the AM and updated on the driver and executor side.
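
Roughly, usage would look like the sketch below; the exact method name and signature are whatever this PR finally settles on, and it is shown here without arguments purely for illustration:

```scala
import org.apache.spark.deploy.SparkHadoopUtil

// Illustrative only: after pointing the application's credential provider at a
// new secured HBase cluster or Hive metastore, ask Spark to fetch fresh tokens
// now and propagate them to the driver and executors, instead of waiting for
// the next scheduled renewal.
SparkHadoopUtil.get.updateCredentials()
```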

How was this patch tested?

Manually verified on a secured cluster.

@tgravescs
Contributor

What or who exactly is calling updateCredentials here? The users themselves are calling a developer API in SparkHadoopUtil? I don't know if we ever finished the discussion on DeveloperApi, but that doesn't seem like a very stable interface for users.

@jerryshao
Contributor Author

The users themselves will call this API. Another option discussed in SPARK-14743 is to add this API to SparkContext, but from my understanding it is quite Hadoop-related, so adding it here seems more appropriate.

@tgravescs
Contributor

So, regarding the use case: you are saying a user has a shell/process running, possibly accessing one Hive or HBase cluster (or maybe none), and then they want to access another one. How does the user add/change the confs for the other HBase or Hive clusters, which the trigger would then know how to fetch from?

What are all the steps required?

@SparkQA

SparkQA commented Aug 24, 2016

Test build #64352 has finished for PR 14789 at commit 85370e7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jerryshao
Contributor Author

@tgravescs , with SPARK-14743, credentials/tokens can be managed outside of Spark with a custom credential provider. In that case the user could update the credentials based on configuration changes or something else. For example, we could write a dynamic HBase credential provider which loads its configuration from HDFS; once the user wants to switch to another HBase cluster, they update the configuration files in HDFS and trigger the credential update manually using the API provided in this PR. It mainly depends on how users implement their own credential provider. What is provided here is just a manual credential update mechanism; when and how to trigger it is up to the user.
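
To make that concrete, a dynamic provider along these lines might look roughly like the sketch below, assuming the ServiceCredentialProvider interface added by SPARK-14743; the HDFS-backed configuration path and the two helper methods are placeholders invented for illustration:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.security.Credentials
import org.apache.hadoop.security.token.{Token, TokenIdentifier}
import org.apache.spark.SparkConf
import org.apache.spark.deploy.yarn.security.ServiceCredentialProvider

class DynamicHBaseCredentialProvider extends ServiceCredentialProvider {

  override def serviceName: String = "dynamic-hbase"

  override def credentialsRequired(hadoopConf: Configuration): Boolean = true

  override def obtainCredentials(
      hadoopConf: Configuration,
      sparkConf: SparkConf,
      creds: Credentials): Option[Long] = {
    // Illustrative: re-read the HBase settings from a location on HDFS that the
    // user edits when switching clusters, fetch a delegation token from whichever
    // cluster the file currently points to, and add it to the passed-in Credentials.
    val hbaseConf = loadHBaseConfFromHdfs(hadoopConf)  // placeholder helper
    val token = fetchHBaseDelegationToken(hbaseConf)   // placeholder helper
    creds.addToken(token.getService, token)
    None  // no renewal time reported in this sketch
  }

  // Placeholder helpers, not real Spark/HBase APIs.
  private def loadHBaseConfFromHdfs(conf: Configuration): Configuration = ???
  private def fetchHBaseDelegationToken(conf: Configuration): Token[_ <: TokenIdentifier] = ???
}
```

The provider would be registered through the usual ServiceLoader mechanism; after editing the configuration file on HDFS, the user would call the new update API to make the switch take effect.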

@SparkQA

SparkQA commented Aug 25, 2016

Test build #64390 has finished for PR 14789 at commit a25a347.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mridulm
Contributor

mridulm commented Nov 8, 2016

Hi @tgravescs,
This is specifically for the case where new input sources/output destinations are added after the driver has already started: notebooks, long-running streaming applications, etc., where the configuration can be passed in dynamically.
I do agree that SparkHadoopUtil is a DeveloperApi which we are leveraging here; is there any other API in the yarn module that would be a better fit? I did not find one, but I could have overlooked something.

I have other comments on the PR which I will add, but I wanted to make sure there are no high-level issues with the approach. Thanks

Contributor

@tgravescs @vanzin Any thoughts on keeping this private and executing a closure over the values of the map for use cases like this (instead of exposing it or creating a copy via toMap)? This would also ensure that outside users maintain the expected multi-threading invariants.

Contributor

Yeah, this should stay private since there are synchronization implications. We also want to be careful about holding the lock while trying to send these updates, though, because it could interfere with normal message processing.

Perhaps it needs a new interface and then a DriverEndpoint to handle this. I'd have to look closer.

Contributor

You might want to return a Future for this, which can indicate when the credentials have been updated: as of now, developers invoking this API have no way to know when (if at all) the credentials have been propagated to the driver/executors.

Contributor Author

Agreed, a Future might be better; I will change it.

Contributor

Use ask instead of send.
This allows consumers of the API to have reasonable confidence (based on the Future's timeout) that the update was propagated.

send could result in a task being scheduled before the update is processed.
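
For illustration, the driver-side difference might look roughly like the sketch below; the message class, method name, and endpoint are hypothetical, and only the ask/send distinction is the point:

```scala
import scala.concurrent.Future
import org.apache.spark.rpc.RpcEndpointRef

// Hypothetical message type, invented for this sketch.
case class UpdateCredentials(tokens: Array[Byte])

def pushNewCredentials(driver: RpcEndpointRef, tokens: Array[Byte]): Future[Boolean] = {
  // send() is fire-and-forget: a task could be scheduled before the executors
  // have actually seen the new tokens.
  //   driver.send(UpdateCredentials(tokens))

  // ask() returns a Future, so the caller can wait (subject to the RPC timeout)
  // for the endpoint to acknowledge that the update has been processed.
  driver.ask[Boolean](UpdateCredentials(tokens))
}
```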

@tgravescs
Contributor

I guess I forgot about this PR. I don't think we have any YARN-specific public APIs. Also, theoretically this isn't YARN-specific if others decide to support it as well; Mesos/standalone could also use credentials and updates, it's just that no one has implemented it.

We do have APIs that don't support all deploy modes though, like SparkLauncher.
This feels like something where we should have a public API in a common place and just have the backends handle it if they support it.

A little bit of a side note here: personally I would like to move away from keytabs because they're not as secure, and at least my company doesn't allow them. To get around keytabs, I was going to add an interface to push new credentials from the gateway box: a spark-submit argument that talks to the driver to push new credentials securely over RPC. The reason I bring that up is that if we are adding an API to update credentials, it would be nice to make it flexible enough to handle that use case as well.

Perhaps we should add an interface to get a Credentials-like object which can then have the updateCredentials routine and others as needed; that way everything isn't directly in SparkContext.
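
Purely as a hypothetical shape for such an interface (nothing below exists in Spark; the names and methods are invented for discussion):

```scala
import scala.concurrent.Future

// Hypothetical user-facing handle, separate from SparkContext, that both use
// cases could share: in-Spark renewal (this PR) and credentials pushed in from
// an external process such as the gateway box.
trait CredentialManager {

  /** Ask Spark to regenerate tokens via its credential providers and
   *  propagate them to the driver and executors. */
  def updateCredentials(): Future[Unit]

  /** Accept a serialized set of credentials produced outside of Spark
   *  and distribute them to the driver and executors. */
  def pushCredentials(serializedCredentials: Array[Byte]): Future[Unit]
}
```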

@mridulm
Contributor

mridulm commented Nov 8, 2016

@tgravescs Interesting, so it means there won't be a --principal/--keytab, and what is currently done in AMCredentialRenewer will be done in the launcher itself? So the driver behaves just like the executors currently do?

The reason I want to understand more is that the current Spark design does allow for what is described above, but as soon as this change comes in it will be difficult to support: the driver knows about dynamically added clusters/sources, while the launcher will not.
Unless we propagate from the driver back to the launcher, and then have the launcher update/push credentials to the driver and executors.

If yes, you are right, we will need to be careful while designing this to ensure there is minimal change when we evolve to allow for that scenario as well.
Not sure if I got the problem statement totally confused, though!

Edit: Currently it is possible for the launcher to 'go away' and the Spark job to still continue (IIRC), but this would mean it can't (when tokens have to be renewed), right?

@vanzin
Contributor

vanzin commented Nov 9, 2016

Hey, I haven't had a chance to look at this closely, but this seems to just be an API to trigger the existing credential updater code in Spark, right?

When I filed SPARK-14743 I had a different use case in mind. When I talked about Oozie and HoS, Spark's credential updater would not be enabled. Those systems generally do not use Spark's credential updater, since they do not have the user's keytab. They have their own keytab, which they use to log in to the KDC and generate the tokens, and they run the child applications using a proxy user. They need a way to update those tokens after Spark has been started, which is different from having a way to trigger Spark's token updater mechanism.

I'm not sure I understand how something like Oozie or HoS would use the particular feature added with this change. They can't give their own keytab to the user's application, because the user code should not have access to it.

Tom:

To get around keytabs, I was going to add an interface to push new credentials from the gateway box.

Yes basically that is more in line with what I had in mind.

Contributor

Note that when we modify the above to return a Future, this will also need to happen within the Future (or it can always be done inline).

@mridulm
Contributor

mridulm commented Nov 9, 2016

@vanzin You are right, this does look orthogonal to what you and Tom described, though both will end up modifying similar code paths.
'Where' the renewal happens is the difference: from within Spark or from outside of Spark (Oozie, a gateway node, etc.).

I am wondering if we can model the change such that both use cases can be supported with minimal impedance.

@tgravescs
Contributor

So I was intending to leave the keytab/principal stuff in there and just add the additional mechanism to push credentials. As long as you are OK with keytabs from a security point of view, they make some things easier.

I think both mechanisms could go through a similar type of interface, though, which is why I brought it up here.

@mridulm
Contributor

mridulm commented Nov 10, 2016

Interesting, thanks for clarifying.

The way I was looking at the change was: given that we now have the ability to plug in custom credential providers, we need to support out-of-band updates to credentials (in addition to the current expiry-time-based refresh/renew).
If the same mechanism can be leveraged for other use cases (or evolved to do so), that would be even better.

@vanzin
Contributor

vanzin commented Nov 10, 2016

I think there are two different issues, one that is being addressed by this change and one that Tom and I are talking about:

  1. triggering Spark's credential updater mechanism outside its period, because you want to connect to a new NN or something like that.
  2. triggering the credential fetch mechanism from outside the internal credential updater, which would allow an external entity to update an existing Spark app's credentials.

The latter is a more complicated change since credential update and fetch are kind of intertwined right now; for example, for the latter to work, the AM would also need to fetch the new credentials as if it were an executor, and right now all the AM does is run the update thread.

So I think this is fine for covering case 1 (caveat: I haven't looked at the code yet), but it would be nice to have a solution for case 2 eventually.

@SparkQA

SparkQA commented Nov 11, 2016

Test build #68519 has finished for PR 14789 at commit b1cac5c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tgravescs
Contributor

So I agree these are separate cases, but I think it makes sense for the APIs to be very similar, or at least live in the same sort of class. I don't think we want a public end-user API in SparkHadoopUtil.updateCredentials. I would rather we create something in core, like a Credentials class, and put it in there.

@vanzin
Contributor

vanzin commented Dec 5, 2016

but I think the api makes sense to be very similar, or at least in the same sort of class

I think it will be hard to have the same API serve both use cases. You could have one API to generate new credentials and a second one to distribute a set of credentials, and then you could use the second one for the "update from outside of Spark" case. But they'd still be two different new user-facing methods.

I agree that SparkHadoopUtil is a weird place for this. I don't have a good suggestion for where to add it though - I've been planning to play with removing the credentials stuff from SparkHadoopUtil, since we don't support hadoop 1.x anymore and can clean up that code, but haven't had the time.

@SparkQA

SparkQA commented Feb 25, 2017

Test build #73461 has finished for PR 14789 at commit b1cac5c.

  • This patch fails Spark unit tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@jerryshao closed this Mar 3, 2017