Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an option to have authentication enabled for all endpoints by default #1392

Merged
merged 20 commits into from Mar 4, 2024

Conversation

krassowski
Copy link
Collaborator

@krassowski krassowski commented Feb 10, 2024

Opening a draft for early feedback on naming and implementation.

References

Closes #389

Builds upon @yuvipanda's idea of using prepare method as described in #1012 (comment).

Code changes

  • adds ServerApp.allow_unauthenticated_access traitlet (True by default for backwards compatibility)
  • adds @allow_unauthenticated decorator, an opposite of Torando's @web.authenticated
  • if allow_unauthenticated_access is False rejects requests to endpoints which do not have @allow_unauthenticated decorator for any non-authenticated user

Punchlist

  • add a test for ServerApp.allow_unauthenticated_access and @allow_unauthenticated
  • add a test ensuring that all methods on all JupyterHandler descendants in jupyter-server have either @web.authenticated or @allow_unauthenticated
    • maybe make it a runtime warning?
    • maybe also add a runtime warning if any handlers which inherit directly from web.RequestHandler rather than AuthenticatedHandler/JupyterHandler get registered with the router?
    • maybe raise or warn if both @allow_unauthenticated and @web.authenticated are set?
  • check if any other endpoint beside login and logout requires @allow_unauthenticated
  • support websockets in the same way
  • consider moving the logic to AuthenticatedHandler, but this also requires setting current_user in AuthenticatedHandler; I do not understand why it is only set in JupyterHandler I opened Should current_user and prepare guards be set in AuthenticatedHandler rather than JupyterHandler? #1398 to track this as a future improvement

@yuvipanda
Copy link
Contributor

Haven't looked too deep into this, but I absolutely love this.

fix passing the setting down from the traitlet to web server
`WebsocketMixing` may be used with or without `JupyterHandler`;
- if used with it, we want to have custom auth implementation
  because redirecting to login page does not make sense for
  a websocket's GET request
- if these are used without `JupyterHandler `we want the auth
  rules to still apply, even though the `current_user`
  logic may differ slightly
@@ -1214,6 +1215,21 @@ def _deprecated_password_config(self, change: t.Any) -> None:
""",
)

allow_unauthenticated_access = Bool(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For ease of use in JupyterHub deployments, I would also suggest making this available as an env variable. I know you can setup a _config.json file reasonably easily, but given the possible high impact here, I think it's worth making an env var for this too. JUPYTER_SERVER_ALLOW_UNAUTHENTICATED_ACCESS or similar?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in cd84175

Also adds `@ws_authenticated` to make the check simpler
@krassowski krassowski marked this pull request as ready for review February 13, 2024 13:04
@krassowski
Copy link
Collaborator Author

This is ready for review. The failing tests (Downstream Tests / jupytext and Downstream Tests / downstream_check) are also failing on the main branch and I believe these are unrelated.

@Wh1isper
Copy link
Contributor

Wh1isper commented Feb 15, 2024

Maybe allow_ws_unauthenticated is better compared to ws_authenticated?

Otherwise, user seems to have to add @ws_authenticated to every websocket's get method...?

@krassowski
Copy link
Collaborator Author

krassowski commented Feb 15, 2024

If you do not turn the flag on you do not have to add it. But all websockets should be authenticated.

Also, you can decorate websocket's get with @allow_unauthenticated if you wish.

@@ -85,3 +85,58 @@ async def inner(self, *args, **kwargs):
return cast(FuncT, wrapper(method))

return cast(FuncT, wrapper)


def allow_unauthenticated(method: FuncT) -> FuncT:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps something even more aggressive, e.g. @UNAUTHENTICATED.

@bollwyvl
Copy link
Contributor

This seems like a really good step forward.

Not sure where it should go, but it seems like this should already start tossing out warnings, which can be hidden with e.g. JUPYTER_SERVER_ANNOUNCE_AUTH=0.

Perhaps related, having an @authorized(RESOURCE_NAME) might avoid a bunch of boilerplate.

@krassowski
Copy link
Collaborator Author

Not sure where it should go, but it seems like this should already start tossing out warnings, which can be hidden with e.g. JUPYTER_SERVER_ANNOUNCE_AUTH=0.

Since the motion of "warnings are bad for users, we want developers to see it, not users" was brought up on the meeting before I started working on it, I kept the warnings only to core endpoints (errors when ServerApp.allow_unauthenticated_access=True; mostly to catch these early in PRs against jupyter-server) and to warnings for extensions endpoints when ServerApp.allow_unauthenticated_access=True.

I think that adding a warning for extension endpoints even if allow_unauthenticated_access=False (for now the default) is a good idea but for purpose of getting this PR merged sooner than later I wanted to avoid the potentially contentious issue of when and how to introduce the warning for extension when allow_unauthenticated_access=False.

jupyter_server/base/handlers.py Outdated Show resolved Hide resolved
@no_type_check
def prepare(self, *args, **kwargs):
"""Handle a get request."""
self._maybe_auth()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

._maybe_auth accesses .current_user, but .prepare() below is where it is set. The order needs to be reversed, but I assume there's a reason this is first. I think the override may need to be done a different way.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but I assume there's a reason this is first

Yes, this is because JupyterHandler.prepare() would raise HTTPError with redirect which does not make sense for websocket.

The order needs to be reversed

So this works in test because .current_user is nominally a getter implemented in tornado.web.RequestHandler. Unfortunately JupyterHandler.prepare() overrides it in a way which will cause a problem for this logic for a non-trivial identity provider (if I see it right, the default user will be used instead of the one provided by IdentityProvider).

It feels like we should move the implementation of setting the current_user from JupyterHandler.prepare to JupyterHandler.get_current_user. Thoughts?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, why is IdentityProvider setting the current user in JupyterHandler and not in AuthenticatedHandler in the first place? I do not see rationale for it in #671

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see the user from identity provider may require awaiting. I guess I can just pass an argument to JupyterHandler.prepare() instructing it to not redirect to login page.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JupyterHandler is responsible for .current_user being set, since it may be async, it cannot be done in a property. It would be appropriate to override the .current_user getter, which should make it very clear it can never be accessed before .prepare() is called:

@property
def current_user(self):
    if not hasattr(self, "_jupyter_current_user"):
        raise RuntimeError(".current_user accessed before being populated in JupyterHandler.prepare()")
    return self. _jupyter_current_user

@current_user.setter
def current_user(self, user):
    self._jupyter_current_user = user

Any time that error is hit is necessarily code that is bypassing Jupyter Server authentication.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 5e7615d (added test) and 4265f4e (fixed it).

Would you like me to add the RuntimeError on accessing current_user too early (from the last comment) in this PR, or would it be better to ensure that this PR does not introduce potentially breaking changes and can be adopted in existing deployments easily?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the code, I guess we already have this RuntimeError in get_current_user. So maybe it's just the message that wants clarifying?

Because it's already there, I don't think moving the RuntimeError to the .current_user getter can break anything. But at the same time, since folks may call get_current_user() or access the (correct) .current_user, then I guess .get_current_user covers both cases. So maybe it only makes sense to put it in .current_user if it helps to have two distinct, clear error messages.

# new in 3.11
return code.co_qualname.startswith("authenticated") # type:ignore[no-any-return]
elif hasattr(code, "co_filename"):
return code.co_filename.replace("\\", "/").endswith("tornado/web.py")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only checks if it's anything from web.py, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it will have false positives in Python < 3.11. In the ideal world we can get a PR into tornado which would set a flag on the wrapped method, but it would still only work in a future version of tornado.

Copy link
Collaborator Author

@krassowski krassowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @minrk for the review!

jupyter_server/base/handlers.py Outdated Show resolved Hide resolved
@no_type_check
def prepare(self, *args, **kwargs):
"""Handle a get request."""
self._maybe_auth()
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but I assume there's a reason this is first

Yes, this is because JupyterHandler.prepare() would raise HTTPError with redirect which does not make sense for websocket.

The order needs to be reversed

So this works in test because .current_user is nominally a getter implemented in tornado.web.RequestHandler. Unfortunately JupyterHandler.prepare() overrides it in a way which will cause a problem for this logic for a non-trivial identity provider (if I see it right, the default user will be used instead of the one provided by IdentityProvider).

It feels like we should move the implementation of setting the current_user from JupyterHandler.prepare to JupyterHandler.get_current_user. Thoughts?

# new in 3.11
return code.co_qualname.startswith("authenticated") # type:ignore[no-any-return]
elif hasattr(code, "co_filename"):
return code.co_filename.replace("\\", "/").endswith("tornado/web.py")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it will have false positives in Python < 3.11. In the ideal world we can get a PR into tornado which would set a flag on the wrapped method, but it would still only work in a future version of tornado.

The websocket test is failing as of this commit
because as pointed out in review the `_maybe_auth`
is using the default user rather the one from `IdentityProvider`
even in websockets (but only if those inherit from JupyterHandler,
and if they do not fallback to previous implementation and warn).
@krassowski
Copy link
Collaborator Author

krassowski commented Feb 20, 2024

Trying to nail down why after switching WebSocketMixin.prepare to re-use JupyterHandler.prepare the identity provider (PasswordIdentityProvider) returns a valid user for requests without proper auth for websocket connection in Jupyter Server Tests / Test Minimum Versions despite passing locally.

enabled `jp_ws_fetch` header override in:
gttps://github.com/jupyter-server/pytest-jupyter/pull/51
@krassowski
Copy link
Collaborator Author

Found it, it was due to minimum versions check using minimum pytest-jupyter version which did not yet have the ability to override headers (jupyter-server/pytest-jupyter#51). Something to consider in maintainer-tools action is whether it should only use minimum versions of the runtime rather than test deps.

Copy link
Contributor

@minrk minrk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

I've written some other auth decorators that are like @web.authenticated (e.g. to raise 403 instead of login redirect, or only accept token auth on this handler, not cookies). How would this interact with those? i.e. how would I explicitly mark that "this handler really is authenticated, even if you can't find @web.authenticated on it. I see there's a check for __allow_unauthenticated, so maybe there should also be __is_authenticated?

@@ -726,7 +745,7 @@ def write_error(self, status_code: int, **kwargs: Any) -> None:
class APIHandler(JupyterHandler):
"""Base class for API handlers"""

async def prepare(self) -> None:
async def prepare(self) -> None: # type:ignore[override]
"""Prepare an API response."""
await super().prepare()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something for later - API handlers should probably 403, not redirect. Now that we have logic for this.

@@ -589,7 +589,7 @@ def check_host(self) -> bool:
)
return allow

async def prepare(self) -> Awaitable[None] | None: # type:ignore[override]
async def prepare(self, *, _redirect_to_login=True) -> Awaitable[None] | None: # type:ignore[override]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

private keyword arg makes this hard for subclasses, e.g. extensions with websockets, but I suppose they will inherit from our own websocket classes, too?

I wonder if this should be a Handler class attribute, rather than a private argument.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a check for inheritance from JupyterHandler in the weboscket mixin class:

if not isinstance(self, JupyterHandler):
should_authenticate = not self.settings.get("allow_unauthenticated_access", False)
if "identity_provider" in self.settings and should_authenticate:
warnings.warn(
"WebSocketMixin sub-class does not inherit from JupyterHandler"
" preventing proper authentication using custom identity provider.",
JupyterServerAuthWarning,
stacklevel=2,
)
self._maybe_auth()
return super().prepare(*args, **kwargs)
return super().prepare(*args, **kwargs, _redirect_to_login=False)

A Handler class attribute is certainly a reasonable alternative that I briefly considered. My thinking was along: if you pass _redirect_to_login argument but are inheriting from wrong class you will get an error. If you set a class attribute but are inheriting from a wrong class you would not get anything. The downside of passing argument is that mypy typing might complain downstream.

@krassowski
Copy link
Collaborator Author

I see there's a check for __allow_unauthenticated, so maybe there should also be __is_authenticated?

Currently you could use __allow_unauthenticated = False.

@minrk
Copy link
Contributor

minrk commented Feb 22, 2024

Currently you could use __allow_unauthenticated = False.

I see, I suppose that would have the desired behavior, even if the name is the opposite of the effect.

Perhaps one issue: it is not the case that @web.authenticated is the only way to authenticate a request. It is a convenient way to encode a specific behavior when authentication fails (specifically: redirect to login on failure). Where redirect on failure is not desired, it is quite appropriate to use other decorators to authenticate request methods (e.g. we should probably use a 403 instead of redirect on ~all of our APIHandlers, using decorators like these in JupyterHub).

So a public API from jupyter_server (e.g. a decorator) that serves only to 'mark' a method as authenticated might be appropriate. Right now, I guess that's @allow_unauthenticated, which is a bit confusing?

@krassowski
Copy link
Collaborator Author

krassowski commented Mar 2, 2024

So a public API from jupyter_server (e.g. a decorator) that serves only to 'mark' a method as authenticated might be appropriate. Right now, I guess that's @allow_unauthenticated, which is a bit confusing?

@do_not_check_authentication_and_do_not_warn_about_it? At this point I would be happy to see a version of this PR land; it feels to me that the developer experience improvements could be made separately (naming things is difficult, but I would trust you or whoever may wish to push to my branch or open a follow up PR to make a right choice).

Edit: regarding naming not sure if everyone here is familiar with this classic from React codebase: __SECRET_INTERNALS_DO_NOT_USE_OR_YOU_WILL_BE_FIRED.

@yuvipanda
Copy link
Contributor

+1 for getting this merged in some form and getting a release out, and tackling the devex separately :)

@minrk
Copy link
Contributor

minrk commented Mar 4, 2024

I'm happy with this as-is. I'm not sure i understand the current state of the downstream tests enough to merge or if things need fixing.

I'll add one note from the JupyterHub perspective: I think even the allow_unauthenticated decorator is not necessary in JupyterHub, as there are no handlers in a single-user server that require unauthenticated access. So the much simpler "just always authenticate" without any checking of individual handlers would work there, I believe (unlike a regular jupyter server, which at least needs to serve static files for the login page itself).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Have handlers be @web.authenticated by default ?
6 participants