Remove synchronous blocking frontend backend interactions #180

gene1wood · 2020-01-10T00:05:33Z

Removed synchronous blocking frontend backend interactions and instead this now relies entirely on the pollState frontend polling loop
- Previously there was a mix of frontend backend interactions which were async and one which blocked and waited for other async calls to finish
- This previous blocking interaction was when the frontend called "/redirect_callback". handle_oidc_redirect_callback
  would then call login.exchange_token_for_credentials which would loop and wait for a separate POST from the frontend to /api/roles to
  complete before login.exchange_token_for_credentials would continue and allow /redirect_callback to return a response to the frontend.
- This commit removes this behavior so that all frontend to backend interactions are async and the frontends movement through the workflow
  is entirely governed by the changes to the login.state value
- This new functionality is achieved with the following changes
  - login.exchange_token_for_credentials
    - The method no longer loops, sleeping and waiting for self.role_arn to be set
    - The method now calls self.print_output if it succeeds at getting STS credentials instead of handle_oidc_redirect_callback or login.login
      triggering this as was done previously
    - The method now handles all error cases that it could encounter on its own and returns either "error" if there was a problem, "finished"
      if it succeeded and is done, or "restart_auth" if AWS rejected the ID token and the tool should redirect the user back to the IdP
      to get a new ID token.
  - In cases where the user doesn't provide a role_arn, the set_role function (which is called when the user clicks the role they want and a POST
    is made to /api/roles) is what initiates the login.exchange_token_for_credentials call. This way the user selecting a role is what directly kicks
    off the sequence that ends with credentials being printed on the CLI
Previously this long running synchronous call to /redirect_callback also kept an eye on how long it had been since the frontend had last made a call
to /api/state. The way it did this was that while the backend looped, waiting on login.role_arn to get set, it would on each loop check when /api/state
had last been called. Since we no longer have this loop, this commit adds a new "heartbeat" backend endpoint
- The "/api/heartbeat" endpoint starts an async long running call that loops and watches to make sure the frontend is awake and polling
- /api/heartbeat responds every 30 seconds at which point a new call to /api/heartbeat is started. This is just to ensure the browser doesn't give up
  on waiting for the response from the endpoint.
- This heartbeat call runs in parallel to everything else watching how long it's been since the last call to /api/state and if it's been too long
  the listener kills itself
- In order to maintain the prompt identification (within 2 seconds) by the CLI that the user has closed the frontend window prematurely and still
  accommodate potentially slow calls to external data sources (like Auth0 and the ID token for role endpoint) that could take longer than 2 seconds,
  this commit makes the max_sleep_no_state_check value dynamic. Previously the fact that external data sources would sometimes take longer than 2
  seconds was not an issue because multiple pollStates could be running at the same time. Now with the state.backendInProgress described below, only
  a single pollState will be running at any given time and so slow responses need to be considered.
  - This commit solves this problem by watching the login.state at each call to /api/state and if a login.state is about to be returned which precedes
    an external data source call, the max_sleep_no_state_check is increased from 2 to 10 seconds.
  - Then later when the login.state changes to something not requiring external calls, the max_sleep_no_state_check is reduced back to 2 seconds.
This also adds a new behavior in the frontend to accommodate the fully async model which is to track when each run of pollState begins and ends. This
is to ensure that if one run of pollState takes longer than "sleepTime", a second concurrent execution of pollState won't start.
This is achieved with the new global state value, state.backendInProgress. When the pollState starts, this is set to prevent a later pollState from starting
(as the first pollState hasn't completed yet).
Instead of killing the flask process in response to error conditions, now the tool instead responds to the frontend to call the /shutdown endpoint
and that endpoint is responsible for killing the listener. This will remove the potentially scary scenario where the continued execution of logic
in the tool after an error condition is only prevented by the process being killed instead of returning cleanly and then killing the process.
To achieve this, some methods that previously returned nothing, now return values to either finish execution early in the case of an error or
to communicate success.
In this new model where error cases cause backend calls to complete cleanly, and then for the frontend to call /shutdown, I've removed the "sleep(3600)"
statements scattered through the code as this isn't needed any more
Add click check for the case where a user passes --batch but does not provide a --role-arn as well and exit with a usage complaint
Move the display of the "select a role" message in the frontend to after the role list has been fetched to avoid showing the user a request to select
a role when there are no roles to select from

* Removed synchronous blocking frontend backend interactions and instead this now relies entirely on the pollState frontend polling loop * Previously there was a mix of frontend backend interactions which were async and one which blocked and waited for other async calls to finish * This previous blocking interaction was when the frontend called "/redirect_callback". handle_oidc_redirect_callback would then call login.exchange_token_for_credentials which would loop and wait for a separate POST from the frontend to /api/roles to complete before login.exchange_token_for_credentials would continue and allow /redirect_callback to return a response to the frontend. * This commit removes this behavior so that all frontend to backend interactions are async and the frontends movement through the workflow is entirely governed by the changes to the login.state value * This new functionality is achieved with the following changes * login.exchange_token_for_credentials * The method no longer loops, sleeping and waiting for self.role_arn to be set * The method now calls self.print_output if it succeeds at getting STS credentials instead of handle_oidc_redirect_callback or login.login triggering this as was done previously * The method now handles all error cases that it could encounter on its own and returns either "error" if there was a problem, "finished" if it succeeded and is done, or "restart_auth" if AWS rejected the ID token and the tool should redirect the user back to the IdP to get a new ID token. * In cases where the user doesn't provide a role_arn, the set_role function (which is called when the user clicks the role they want and a POST is made to /api/roles) is what initiates the login.exchange_token_for_credentials call. This way the user selecting a role is what directly kicks off the sequence that ends with credentials being printed on the CLI * Previously this long running synchronous call to /redirect_callback also kept an eye on how long it had been since the frontend had last made a call to /api/state. The way it did this was that while the backend looped, waiting on login.role_arn to get set, it would on each loop check when /api/state had last been called. Since we no longer have this loop, this commit adds a new "heartbeat" backend endpoint * The "/api/heartbeat" endpoint starts an async long running call that loops and watches to make sure the frontend is awake and polling * /api/heartbeat responds every 30 seconds at which point a new call to /api/heartbeat is started. This is just to ensure the browser doesn't give up on waiting for the response from the endpoint. * This heartbeat call runs in parallel to everything else watching how long it's been since the last call to /api/state and if it's been too long the listener kills itself * In order to maintain the prompt identification (within 2 seconds) by the CLI that the user has closed the frontend window prematurely and still accommodate potentially slow calls to external data sources (like Auth0 and the ID token for role endpoint) that could take longer than 2 seconds, this commit makes the max_sleep_no_state_check value dynamic. Previously the fact that external data sources would sometimes take longer than 2 seconds was not an issue because multiple pollStates could be running at the same time. Now with the state.backendInProgress described below, only a single pollState will be running at any given time and so slow responses need to be considered. * This commit solves this problem by watching the login.state at each call to /api/state and if a login.state is about to be returned which precedes an external data source call, the max_sleep_no_state_check is increased from 2 to 10 seconds. * Then later when the login.state changes to something not requiring external calls, the max_sleep_no_state_check is reduced back to 2 seconds. * This also adds a new behavior in the frontend to accommodate the fully async model which is to track when each run of pollState begins and ends. This is to ensure that if one run of pollState takes longer than "sleepTime", a second concurrent execution of pollState won't start. This is achieved with the new global state value, state.backendInProgress. When the pollState starts, this is set to prevent a later pollState from starting (as the first pollState hasn't completed yet). * Instead of killing the flask process in response to error conditions, now the tool instead responds to the frontend to call the /shutdown endpoint and that endpoint is responsible for killing the listener. This will remove the potentially scary scenario where the continued execution of logic in the tool after an error condition is only prevented by the process being killed instead of returning cleanly and then killing the process. To achieve this, some methods that previously returned nothing, now return values to either finish execution early in the case of an error or to communicate success. * In this new model where error cases cause backend calls to complete cleanly, and then for the frontend to call /shutdown, I've removed the "sleep(3600)" statements scattered through the code as this isn't needed any more * Add click check for the case where a user passes --batch but does not provide a --role-arn as well and exit with a usage complaint * Move the display of the "select a role" message in the frontend to after the role list has been fetched to avoid showing the user a request to select a role when there are no roles to select from

april

Initial review.

mozilla_aws_cli/login.py

mozilla_aws_cli/listener.py

mozilla_aws_cli/static/index.js

mozilla_aws_cli/login.py

mozilla_aws_cli/static/index.js

gene1wood requested a review from april January 10, 2020 00:05

april reviewed Jan 10, 2020

View reviewed changes

mozilla-iam deleted a comment from april Jan 10, 2020

A few minor style changes

4c19b5d

gene1wood merged commit 1b974f8 into mozilla-iam:master Jan 11, 2020

gene1wood deleted the remove_mixed_sync_async branch January 11, 2020 00:05

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Remove synchronous blocking frontend backend interactions #180

Remove synchronous blocking frontend backend interactions #180

gene1wood commented Jan 10, 2020 •

edited

april left a comment

Remove synchronous blocking frontend backend interactions #180

Remove synchronous blocking frontend backend interactions #180

Conversation

gene1wood commented Jan 10, 2020 • edited

april left a comment

Choose a reason for hiding this comment

gene1wood commented Jan 10, 2020 •

edited