Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GatewayTokenRenewer classes stopped being called with the 2.8.0 release of the jupyter_server package. #1339

Closed
ojarjur opened this issue Oct 18, 2023 · 3 comments · Fixed by #1340
Labels

Comments

@ojarjur
Copy link
Contributor

ojarjur commented Oct 18, 2023

Description

We have an extension for Jupyter that, among other things, configures the gateway_token_renewer_class value to be a class that we provide.

With the 2.7.3 release, this works perfectly. However, with the 2.8.0 release, this has stopped working.

I see that the token renewer class is being configured because I see a log message of the form:

[D 2023-10-18 13:21:34.339 ServerApp] Config changed: {'ExtensionApp': {'log_level': 'DEBUG'}, 'PasswordIdentityProvider': {'hashed_password': '<REDACTED>'}, 'DelegatingWebsocketConnection': {'kernel_ws_protocol': ''}, 'GatewayClient': {'auth_scheme': 'Bearer', 'headers': '{"Cookie": "_xsrf=XSRF", "X-XSRFToken": "XSRF"}', 'gateway_token_renewer_class': <class 'google.cloud.jupyter_config.tokenrenewer.CommandTokenRenewer'>, 'url': '<REDACTED>'}, 'CommandTokenRenewer': {'token_command': 'gcloud config config-helper --format="value(credential.access_token)"'}, 'NotebookApp': {}, 'ServerApp': {'log_level': 'DEBUG', 'password': '<REDACTED>', 'jpserver_extensions': <LazyConfigValue value={'jupyterlab': True, 'dataproc_jupyter_plugin': True, 'jupyter_lsp': True, 'jupyter_server_fileid': True, 'jupyter_server_terminals': True, 'jupyter_server_ydoc': True, 'nbclassic': True, 'notebook_shim': True}>, 'kernel_spec_manager_class': <class 'kernels_mixer.kernelspecs.MixingKernelSpecManager'>, 'kernel_manager_class': <class 'kernels_mixer.kernels.MixingMappingKernelManager'>, 'session_manager_class': <class 'jupyter_server.services.sessions.sessionmanager.SessionManager'>, 'kernel_websocket_connection_class': <class 'kernels_mixer.websockets.DelegatingWebsocketConnection'>}}

... however, the token renewer class is never being called. Specifically, when I add print statements to it for every call to the get_token method, those print statements never show up in the log.

Reproduce

I can consistently reproduce this, but the reproduction steps require a Google account. I'll see if I can put together a smaller, more minimal repro, but what I have now is this:

  1. Install the dataproc-jupyter-plugin package from PyPI, along with jupyter_server==2.8.0 and jupyterlab==4.0.5
  2. Install the gcloud command line tool from the Google Cloud SDK.
  3. Configure gcloud with credentials, a project, and region.
  4. Launch jupyter lab.
  5. See a bunch of HTTP 401: Unauthorized errors in the logs because the token renewer is never being called.

Expected behavior

The expected behavior is what was seen with the 2.7.3 version; the above steps would work without any errors.

@ojarjur ojarjur added the bug label Oct 18, 2023
@ojarjur
Copy link
Contributor Author

ojarjur commented Oct 18, 2023

I found the root cause.

PR #1330 changed the way that token renewers are used at startup.

Previously, the token renewer class could be used to generate the initial auth token, but after that change, they are only invoked if the auth_token field is already set.

This means that if the auth_token config value is not set (to a non-empty string) initially, then it will never be set by the token renewer. We could work around this by setting the initial value of that config option to some nonsense string (and in fact, doing that is how I confirmed the root cause), but there's still on more potential issue:

If the token renewer ever returns an empty string as its result, then it is effectively disabled from then on.

I don't think this is how we want token renewers to work in that scenario.

I don't understand why this change was made. The affected PR is fixing type errors, but I don't have context on what the specific type error was here in order to suggest a better way to fix it. @blink1073 do you still have context on what the type error was that you were trying to fix?

@ojarjur
Copy link
Contributor Author

ojarjur commented Oct 18, 2023

Now that I know the root cause, the simplest repro steps would be:

  1. Configure the c.GatewayClient.auth_token value to be empty.
  2. Configure the c.GatewayClient.token_renewer_class to be non-empty.
  3. Configure the c.GatewayClient.url value to be any valid gateway URL.

Then, when you start Jupyter and try to list kernelspecs, the requests to the Gateway URL will not have an auth token even though the token_renewer_class was configured.

@ojarjur
Copy link
Contributor Author

ojarjur commented Oct 19, 2023

We discussed this in the Jupyter Server community meeting today, and @Zsailer suggested that the typing issue that this change was meant to fix is the type annotation of auth_token parameter to the TokenRenewerBase class's get_token` method.

That parameter is annotated with a type of str, but it was potentially being passed None.

This suggests two possible alternative fixes to the type error that would maintain backwards compatiblity with the 2.7.3 and earlier releases:

  1. Remove the conditional check on line 604 and then change the type annotation of the auth_token parameter to be ty.Union[str, None] instead of just str, or
  2. Change the conditional check to be if self.auth_token is not None: instead of if self.auth_token:.

The difference between the two approaches is whether or not implementations of TokenRenewerBase are expected to handle the case of the auth_token parameter being None.

The default value for the auth_token configuration is an empty string, so I believe the only scenario where the value could be None (and as such, where the type mismatch could have previously occurred at run time) would be if the value was explicitly configured to be None.

If that happened then the outcome would be (depending on scenario):

jupyter_server version token renewer supports None Outcome
2.7.3 Yes None token gets potentially replaced by the renewer and Gateway auth presumably works
2.7.3 No Token renewer potentially crashes
2.8.0 Yes None token never gets replaced and Gateway auth potentially fails
2.8.0 No Token renewer never gets called and so doesn't crash
With Option 1 Yes Token renewer gets called and Gateway auth presumably works
With Option 1 No Token renewer potentially crashes
With Option 2 Yes Token renewer never gets called and so the None token never gets replaced and Gateway auth fails
With Option 2 No Token renewer never gets called and so doesn't crash

This means that neither solution is ideal; they both leave a scenario with a potentially bad outcome, but also that this scenario only occurs in the unlikely event that the GatewayClient.auth_token configuration value is explicitly set to None.

However, there is another element at play here; if we go with Option 1, then any existing extensions of TokenRenewerBase will potentially have to update their type signatures (if they have one). This means that the choice of option here potentially has consequences on other code outside of the jupyter_server repo.

Given that, I think the safest bet is to go with option 2, and only call the token renewer if the auth_token value is not None. This is assuming that I am right about the default value being the empty string.

ojarjur added a commit to ojarjur/jupyter_server that referenced this issue Oct 20, 2023
Prior to the `2.8.0` release, Gateway token renewers would be run
regardless of the value of the initial auth token configured with
the Gateway client.

However, that initial value could potentially be `None`, and token
renewers are not required to support that as a value for their
`auth_token` parameter. This meant that token renewers might be
called with a value they do not support.

The `2.8.0` release fixed that issue by only calling token renewers
if the `auth_token` value was truthy. However, this introduced a
regression because the default value for auth tokens is the empty
string, which is not truthy but which *is* a valid string value
(and, as such, is expected to be supported by token renewers).

This change fixes that by calling token renewers as long as the
`auth_token` value is not `None`. This is a compromise between
the pre-2.8.0 behavior and the 2.8.0 behavior, and is believed
to be the least disruptive option.

This fixes jupyter-server#1339
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant