New caching proposal to reduce number of requests sent to the token broker #42

Open
jphalip opened this issue Oct 18, 2022 · 0 comments

jphalip (Collaborator) commented Oct 18, 2022

The purpose of this proposal is to reduce the load on the token broker (TB) by implementing the following new caching strategy:

  • When the client submits a job and requests the creation of a new TB session, the TB returns a session ID as well as an access token.
  • The client then generates a random encryption key, encrypts the access token with that key, and stores the encrypted access token in a file on HDFS with a path name based on the Job ID inside the user's home folder (e.g. /users/bob/.encrypted-access-token-[JOB-ID]).
  • The client then puts the encryption key in the delegation token. It then passes the delegation token over to Yarn, which itself passes it on to all the deployed tasks.
  • When a task needs to access GCS, it reads the encrypted access token from HDFS and decrypts it using the key provided in the delegation token.
  • For long-running jobs, Yarn periodically (e.g. at 80% of the access token's lifetime) sends a "RenewSessionToken" request to the token broker, which returns a new access token.
  • Yarn encrypts the new access token using the same key provided in the delegation token and overwrites the file on HDFS. This ensures that the access token stored in HDFS is always valid for the tasks to use.
  • At the end of the job, Yarn cancels the TB session and deletes the file from HDFS.
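
The steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the token broker's actual code: every function name is hypothetical, a plain dictionary stands in for HDFS, and the cipher is deliberately left as a parameter because the proposal does not name one.

```python
import secrets
from typing import Callable, MutableMapping

# All names below are hypothetical and exist only for this sketch.
# A cipher is any function (key, data) -> data; the proposal does not pick one.
Cipher = Callable[[bytes, bytes], bytes]


def new_job_key(num_bytes: int = 32) -> bytes:
    """Random per-job key that the client would place in the delegation token."""
    return secrets.token_bytes(num_bytes)


def token_file_path(user: str, job_id: str, home_root: str = "/users") -> str:
    """Per-job file path, following the example path given in the proposal."""
    return f"{home_root}/{user}/.encrypted-access-token-{job_id}"


def store_token(fs: MutableMapping[str, bytes], path: str, key: bytes,
                access_token: str, encrypt: Cipher) -> None:
    """Encrypt the access token and (over)write the file; `fs` stands in for HDFS."""
    fs[path] = encrypt(key, access_token.encode("utf-8"))


def load_token(fs: MutableMapping[str, bytes], path: str, key: bytes,
               decrypt: Cipher) -> str:
    """What a task would do: read the file and decrypt it with the job's key."""
    return decrypt(key, fs[path]).decode("utf-8")


def renewal_delay(lifetime_seconds: float, fraction: float = 0.8) -> float:
    """Seconds to wait before renewing, e.g. 80% of the token's lifetime."""
    if not 0 < fraction < 1:
        raise ValueError("fraction must be strictly between 0 and 1")
    return lifetime_seconds * fraction


def delete_token(fs: MutableMapping[str, bytes], path: str) -> None:
    """End-of-job cleanup: remove the file if it is still there."""
    fs.pop(path, None)
```

On renewal, the same `store_token` call would be repeated with the new access token and the unchanged key, so tasks keep reading the same path. An authenticated scheme such as AES-GCM would be a natural fit for the `encrypt` and `decrypt` parameters, since it also detects tampering with the file.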

Encrypting the access tokens on HDFS reduces the risk in case of exfiltration of the HDFS files. Also, because each job uses a different random encryption key, a rogue job wouldn't be able to decrypt another job's access token.

With this implementation, the individual tasks would never have to communicate with the TB directly, which would drastically reduce the load on the TB server. Only the client and the Yarn master would send a few requests to create, renew, and delete the TB session.
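
To get a feel for "a few requests", here is a back-of-the-envelope estimate. It assumes one create request, one renewal at every 80% of the token lifetime while the job is still running, and one final delete request; the function names and the example numbers are made up for illustration.

```python
import math


def renewals_needed(job_seconds: float, token_lifetime_seconds: float,
                    fraction: float = 0.8) -> int:
    """Number of renewals strictly before the job ends (assumed schedule)."""
    interval = token_lifetime_seconds * fraction
    return max(0, math.ceil(job_seconds / interval) - 1)


def broker_requests_per_job(job_seconds: float,
                            token_lifetime_seconds: float) -> int:
    """One create + N renewals + one delete, independent of the task count."""
    return 1 + renewals_needed(job_seconds, token_lifetime_seconds) + 1


# Example: a 10,000-second job with 1-hour tokens renews at 2880s, 5760s, 8640s.
print(broker_requests_per_job(10_000, 3600))
```

The key point is that the number of tasks does not appear anywhere in the estimate: a job with ten tasks and a job with ten thousand tasks would send the same number of requests to the TB.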

jphalip changed the title from "Client-side caching proposal" to "New caching proposal to reduce number of requests sent to the token broker" on Jan 10, 2023