-
Notifications
You must be signed in to change notification settings - Fork 0
Use Flask-Login for user session management #60
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
c827645 to
073e11e
Compare
073e11e to
46adb59
Compare
Pull Request Test Coverage Report for Build 10497196660Details
💛 - Coveralls |
46adb59 to
1599330
Compare
Before I jump into a review, my vote would be to keep the flask logout route, but not show this anywhere in the UI. This would allow us to add this functionality in the UI later if we want, but keep it from feeling "off" (agreed there) for the user experience. I do think it's possible our flask app could more meaningfully "logout" (at the ALB and/or Okta level), but this hasn't been requested yet. Keeping this flask route, almost like a stub, I think would be helpful in case we do want to explore that later. |
dd7f80b to
ca48a93
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, this feels like a great approach, and well executed.
My comments and requested changes, many of which are nit-picky or stylistic, come from this being our first implementation of the ALB + lambda authentication, and the very real possibility this could become a template for how we do it in other applications.
I have no doubt this would work as expected, and laid out well at that, but felt this was a good time to offer some slight tweaks and alternate options before we get too baked in.
But again, overall, think it's great. Feels like an elegant approach to managing a "logged in user" when in fact every request that makes it to the flask app is already logged in. Leaning on flask-login feels like it makes sense, as it leans on its strengths of pre-defined patterns, built-in functions (e.g. logout_user, even if not yet really utilizing), and session management.
Nice work all around!
| REGION = "us-east-1" | ||
|
|
||
|
|
||
| def parse_oidc_data( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this utility function could benefit from a docstring. It's doing a lot of the heavy lifting of parsing authenticated user data from the HTTP headers on the request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will add a docstring (wip).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
webapp/app.py
Outdated
| # get user data if new session or refresh user data when access token expires | ||
| if oidc_access_token != session.get("oidc_access_token"): | ||
| oidc_data = parse_oidc_data(oidc_jwt_data, options={"verify_exp": False}) | ||
| user = User( | ||
| id=oidc_data["mit_id"], | ||
| name=oidc_data["name"], | ||
| email=oidc_data["preferred_username"], | ||
| ) | ||
| session.update( | ||
| { | ||
| "oidc_access_token": oidc_access_token, | ||
| "user": user.__dict__, | ||
| } | ||
| ) | ||
| return User(**session["user"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is going to get a little complicated, but will try and make a case here.
Am I recalling correctly that we had discussed an approach that would only ever instantiate the User class in one way here? Where instead of storing a User.__dict__ in the session, we could store the parsed OIDC data and then use that? e.g.
- if token stale or token never set, parse OIDC data from headers
- store parsed OIDC data -- not the
Userobject -- in the session, e.g.session['oidc_data'] = oidc_data - last, always, instantiate a
Userobject by using OIDC data in session:
# last part of function
oidc_data = session['oidc_data']
return User(
id=oidc_data["mit_id"],
name=oidc_data["name"],
email=oidc_data["preferred_username"],
)I would find this more readable, knowing that a User object is only instantiated in one way: pulling specific pieces of OIDC data already in the session. With that cognitive load understood and gone, I could then focus my attention on understanding when and why we might parse OIDC data from headers and set it in the session.
It may seem trivial, but I think this opens up the option of creating a new() or load() or from_session_data() method on the User class and using that data, e.g.:
class User(UserMixin):
# ...
# ...
@classmethod
def from_session_data(cls):
oidc_data = session.get("oidc_data")
return cls(
id=oidc_data["mit_id"],
name=oidc_data["name"],
email=oidc_data["preferred_username"],
)Even if a method like this is not used, I still do think the @login_manager.request_loader function having only one line / one way of instantiating a User object has value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we did do something like this, perhaps we could also add a safeguard handler right before we instantiate and return the User object, something along the lines of:
# rest of method...
if not session.get('oidc_data'):
# do something meaningful...
return User.from_session_data() # or if not the method, just instantiating and returningWe shouldn't be in a situation like this where there is not session data to work with... but this ensures if that somehow arises, the app will gracefully handle that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, apologies for missing this! I do remember us talking about replacing the use of user.__dict__. I updated the code to store OIDC data in the session to instantiate the User.
If we did implement a class method like User.from_session_data, how do you feel about returning None if oidc_data is not in the session? 🤔 I think I like it because it would align with load_user_from_request returning None if it is missing data from the request headers --- which is also worth noting (i.e., because we have these if statements, session must have OIDC data before it ever reaches User instantiation. 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Started a review, but I'll hold off until all changes are finalized.
I think returning None makes a ton of sense. Like you say, it mirrors the behavior if the headers are missing. No matter how you slice it: if we can't instantiate a User, then return None from this method and let the flask-login library decide how to handle things.
Good thinking!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed ☝️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A User.from_session_data class method has been implemented. The request_loader callback is updated to use this method. Additional tests have been added.
Note: I wasn't sure if the idea was to have the condition if session.get('oidc_data') is None as part of the request_loader callback or part of User.from_session_data. 🤔 What I was proposing was the latter -- adding the condition as part of User.from_session_data. For what it's worth, I did try initially writing the condition in the callback, but I was getting a lot of linting errors -- the update I went with resolved those errors. Let me know what ya'll think!
webapp/app.py
Outdated
| """ | ||
|
|
||
| def __init__( | ||
| self, id: str, name: str, email: str | None = None # noqa: A002 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it worth considering something like user_id instead of id to avoid shadowing the python built-in id?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I guess if we're considering not ignoring rule A002, I would propose updating to mit_id instead. What do you think, @ehanson8 ? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we shouldn't be ignoring A002, we can always be more specific than id. I like mit_id!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was going to say +1 to mit_id! But...
I was a little nervous that flask-login's UserMixin requires an id field, given that some other flask-login flow is passing around identifiers (almost like it knows which field to use).
I think this line in flask-login suggests that we do need this id field for it to function properly.
We could, of course, override all those related methods... but feels like a situation where it makes sense to ignore convention and linting a bit.
Sorry for starting this thread in the first place!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do folks feel about overriding UserMixin.get_id?
def get_id(self):
try:
return str(self.mit_id)
except AttributeError:
raise NotImplementedError("No `id` attribute - override `get_id`") from NoneThe error message seems to invite folks to override get_id method. That said, if MIT Data Warehouse decided to ever update this field name, we'd have to update the code in two places instead of one:
- custom
get_idmethod - [creating
Userinstance](https://github.com/MITLibraries/alma-sapinvoices-ui/pull/60/files#diff-55f12a9f17b4ba61d3e37852150b7d78b8aedbe3629bc4ce8173d4b572297a54R83-R8
Given this, is everyone set for overriding this A002?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a nice find about the NotImplementedError, and how it suggests this is designed to be overridden. I'm sold!
I'm in full support of:
- changing
idtomit_id - overridding this method to treat
self.mit_idas the unique,Useridentifier - and then, linting concern
A002no longer an issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that seems like an elegant solution for this situation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The User class has been updated:
idattribute has been renamed tomit_idget_idmethod implemented (overridingUserMixin.get_id)
Note: I chose to ignore ruff error codeEM101to keep the try-except block consistent with parent method.
The line of code instantiating the User has also been updated.
3072861 to
d3d80ba
Compare
webapp/app.py
Outdated
| # get user data if new session or refresh user data when access token expires | ||
| if oidc_access_token != session.get("oidc_access_token"): | ||
| oidc_data = parse_oidc_data(oidc_jwt_data, options={"verify_exp": False}) | ||
| user = User( | ||
| id=oidc_data["mit_id"], | ||
| name=oidc_data["name"], | ||
| email=oidc_data["preferred_username"], | ||
| ) | ||
| session.update( | ||
| { | ||
| "oidc_access_token": oidc_access_token, | ||
| "user": user.__dict__, | ||
| } | ||
| ) | ||
| return User(**session["user"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Started a review, but I'll hold off until all changes are finalized.
I think returning None makes a ton of sense. Like you say, it mirrors the behavior if the headers are missing. No matter how you slice it: if we can't instantiate a User, then return None from this method and let the flask-login library decide how to handle things.
Good thinking!
1289e5b to
f94b6d8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few suggestions but this is looking very good
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Submitting an approval!
Thanks for making the changes per PR discussions. And thanks for trailblazing a new strategy for ALB authentication for lambda flask apps; I really do think we'll be able to reuse a lot of this.
I did make one comment / request, to remove the try/except block and NotImplementedError in the User.get_id() method. I'm comfortable approving, whether or not that change is made via a follow-up comit, but it makes sense to me to remove.
Nice work!
f94b6d8 to
794b470
Compare
794b470 to
2a9200d
Compare
Why these changes are being introduced: * The Application Load Balancer (ALB) that manages authentication and authorization provides an access token and user claims (info about user) in the request headers routed to its target, the Flask app. With Flask-Login, this data can be used to identify the 'current_user' on the app and detect when a session may be stale. How this addresses that need: * Implement 'request_loader' callback method * Add utility method to parse OIDC data from request headers * Add 'User' class * Add 'logout' view * Decorate views using 'login_required' * Add 'AWS_DEFAULT_REGION' to Config * Add unit tests Side effects of this change: * None Relevant ticket(s): * https://mitlibraries.atlassian.net/browse/IN-1033
Why these changes are being introduced: * The application configuration variable LOGIN_DISABLED is a convenient method to globally turn off the login method (i.e., parsing OIDC data from header) for purposes of unit testing and running the app in Dev1 (ALB is turned off for Dev1). How this addresses that need: * Add environment variable 'LOGIN_DISABLED' to Config Side effects of this change: * Environment variable LOGIN_DISABLED must be set to "true" for Dev1 and "false" for Prod and Stage. Relevant ticket(s): * https://mitlibraries.atlassian.net/browse/IN-1033
2a9200d to
5e8380f
Compare
Purpose and background context
The Application Load Balancer (ALB) that manages authentication and authorization provides an access token and user claims (info about user) in the request headers routed to its target, the Flask app. This PR uses Flask-Login to implement a login method to identify users and detect "stale" sessions, refreshing access tokens and user claims JWTs as needed.
Note(s)
SECRET_KEYis introduced. TheSECRET_KEYis required by the Flask app to "securely sign[ing] the session cookie and can be used for any other security related needs by extensions or [the] application" and is set for the Flask app.request_loadercallback method is implemented to set thecurrent_userin the Flask session.How can a reviewer manually see the effects of these changes?
make testand verify all tests are passing.LOGIN_DISABLEDenvironment variable, try setting the variable to "true" and "false" and running the app with the command:pipenv run sapinvoices_flask_app.Includes new or updated dependencies?
YES; updated dependencies and added packages (flask_login, pyjwt) and dev-packages (types-requests)
Changes expectations for external applications?
NO
What are the relevant tickets?
https://mitlibraries.atlassian.net/browse/IN-1033
Developer
Code Reviewer(s)