Skip to content

Conversation

@sohail-hashicorp
Copy link
Collaborator

Description

Display of incorrect UI error message when the user tries to login after unsuccessful attempt.

Short Summary of the Issue

A ticket has been raised with the issue where the user encounter an error message after attempting to log in to Boundary using OpenID Connect (OIDC) with Vault as the authentication provider. While the login flow works as expected when the token is valid, user experienced unexpected behavior once the token expires - despite re-authenticating successfully, an error is displayed.

RCA Details

1. Handling Expiry of Tokens

When a user's OIDC token (from Vault) expires, they are redirected to the Vault login page for re-authentication. However, Vault fails to return a valid token back to Boundary after this process. As a result, Boundary receives an invalid token and responds with a 500 error.

  • Root Cause: Vault does not return a fresh or valid token ID to Boundary after re-authentication. This causes Boundary to treat the session as invalid.
  • Note: This issue has been confirmed and is tracked separately

2. Sticky Error Message on UI

Once the 500 error is received due to the invalid token, an error message is shown in the UI. If the user subsequently logs in successfully (e.g., by manually authenticating through Vault in another tab), the previous error message still persists.

  • Root Cause: Error messages are not automatically cleared when the login button is clicked again, resulting in stale messages being shown post-successful login.

Solution provided for Issue 2 (Sticky Error Message)
To resolve the sticky error message, considered the below solution:

Message Reset on Login

Clear all existing success or error messages when the user initiates a new login attempt. This ensures that only the most recent status message is shown.

🎟️ Jira ticket

Screenshots (if appropriate)

Before:
image

After:
image

How to Test

  1. Set Up Vault as an OIDC Provider
    To get started, create a Vault as an OIDC provider for the login, you can refer the documentation for the setup process and use "HCP Vault Dedicated" because token has expiry time here.
    Note: Use HCP Vault Dedicated, as tokens here have an expiration time, which is essential for testing.

  2. Run the UI with Mirage Disabled
    Ensure that ENABLE_MIRAGE flag is set to false and run the below command:
    ENABLE_MIRAGE=false yarn start:desktop

  3. Login via Vault
    Select the Vault as the login provide and login. If the token is valid, it should take you to the home page of the boundary.

  4. Test Invalid Token Scenario
    If the token is invalid, you will be redirected to Vault's login UI page to re-authenticate. Currently the vault's UI has a known issue of not passing the correct token and the ticket is open. After few minutes the API server will return a 500 error indicating an invalid token, and an error message will be displayed to the user.
    image

image
  1. Workaround for Vault UI Issue
    We can bypass the above issue from Vault's UI by re-authenticating the vault from the browser by navigating to the HCP Vault Dedicated's page from the HCP portal - (configured in the step 1 above)

  2. Verify Successful Re-authentication
    After successful attempt, repeat the step 3 to login the error message should no longer be displayed.

  3. Reference Video
    A video demonstrating both the issue and the fix is attached for reference.

Before Fix:
https://github.com/user-attachments/assets/b73238f3-d118-41eb-91a1-c549ff181df9

After Fix:
https://github.com/user-attachments/assets/a04a8fd2-772d-4a34-9597-49bc00fc87c1

Checklist

  • I have added before and after screenshots for UI changes
  • I have added JSON response output for API changes
  • I have added steps to reproduce and test for bug fixes in the description
  • I have commented on my code, particularly in hard-to-understand areas
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works

@sohail-hashicorp sohail-hashicorp requested a review from a team as a code owner April 11, 2025 17:10
@vercel
Copy link

vercel bot commented Apr 11, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
boundary-ui ❌ Failed (Inspect) May 12, 2025 4:46am
boundary-ui-desktop ❌ Failed (Inspect) May 12, 2025 4:46am

@sohail-hashicorp sohail-hashicorp marked this pull request as draft April 11, 2025 17:11
@sohail-hashicorp sohail-hashicorp marked this pull request as ready for review April 11, 2025 17:27
calcaide
calcaide previously approved these changes Apr 14, 2025
Copy link
Contributor

@calcaide calcaide left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work fixing this issue, and clever test to validate the work!! 🙌 💪

laurenolivia
laurenolivia previously approved these changes Apr 15, 2025
Copy link
Contributor

@laurenolivia laurenolivia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work. The PR description and testing plan were very well written!

Also, welcome to Boundary!

Copy link
Collaborator

@cameronperera cameronperera left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I left a small nitpick in one of the tests.

Copy link
Collaborator

@hashicc hashicc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few requests around using sinon on the existing flash messages service and also setting up the two tests in the same way would help with readability. Thanks for the deep dive into this. I'll try and give it a manual test tomorrow

@sohail-hashicorp
Copy link
Collaborator Author

Thanks @cameronperera and @hashicc for the comments. I have updated the test case, felt like one test case was redundant and have removed that. Here are the latest changes.

Copy link
Collaborator

@hashicc hashicc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for testing this bug and figuring it out. It seems quite involved to get everything setup to properly exercise the bug! I left some feedback on getting the test fixed so that it's running and possibly a better hook for clearing the messages.

Also can you help me with getting this set up, I wanted to test the bug locally. I was able to setup the vault cluster with hashicorp cloud and set it up as an oidc provider and log in successfully with a local boundary dev. I'm not sure how to get the token to a point where it fails and I see the error flash message.

@sohail-hashicorp
Copy link
Collaborator Author

@hashicc, if you'd like to test token expiration behavior, you can adjust the token's TTL (Time To Live) by updating the configuration in the Vault UI. Here's how you can do it:

  1. Go to: HCP portal
  2. Select Vault Dedicated.
  3. Choose your Vault cluster.
  4. Click on Launch Web UI in the top right corner.
  5. In the Web UI, navigate to the left menu and click Access.
  6. Under Authentication Methods, click on userpass/ — this is the method configured via our docs.
  7. Select the end-user entry (this name might vary if you’ve used a custom username).
  8. Click Edit User (top right corner).
  9. Scroll down, expand the Tokens section.
  10. Update the Generated Token’s Initial TTL field to a value like 2, 5, or 10 minutes — this sets how long the token will remain valid.

I've also recorded a short video which showcase the above steps:

Changing_the_tokens_live_time.mov

@hashicc
Copy link
Collaborator

hashicc commented Apr 22, 2025

Thanks for steps, that was really helpful! I was able to go in and make the changes for the token.

I was digging into this a bit further and was thinking of ways that we can clear just the specific message in case other errors were encountered on the way to the oidc route and we wouldn't want to clear those as well. It looks like this message is triggered here. And the notifyError decorator dispatches the toast message here. With this information could we find find the specific message:

// with injected intl service, find the matching message that matches the same key used dispatch
const previousAuthenticationFailedFlashMessage = this.flashMessages.queue.find(flash => flash.message === this.intl.t("errors.authentication-failed.title");
// if it's not-nullish call `destroyMessage` on the message
previousAuthenticationFailedFlashMessage?.destroyMessage();

This is similar to how the clearMessages works but instead of finding a specific message to clear, it clears all of them in a forEach.

I think in the test you might also be able to:

  1. call the ScopesScopeAuthenticateMethodOidcRoute.error(e) method instead which would trigger @notifyError
  2. Assert that the flash message service has one message and the name of that message matches the translated key of "errors.authentication-failed.title"
  3. Then call activate to simulate re-entering the route
  4. Assert the flash message service has zero messages

The benefit of this is the route ever gets refactored this test will likely fail (if either the message gets changed, or the notify error is removed) and will also need to be refactored. This might be doable as an acceptance test too.

@sohail-hashicorp
Copy link
Collaborator Author

Thanks for steps, that was really helpful! I was able to go in and make the changes for the token.

I was digging into this a bit further and was thinking of ways that we can clear just the specific message in case other errors were encountered on the way to the oidc route and we wouldn't want to clear those as well. It looks like this message is triggered here. And the notifyError decorator dispatches the toast message here. With this information could we find find the specific message:

// with injected intl service, find the matching message that matches the same key used dispatch
const previousAuthenticationFailedFlashMessage = this.flashMessages.queue.find(flash => flash.message === this.intl.t("errors.authentication-failed.title");
// if it's not-nullish call `destroyMessage` on the message
previousAuthenticationFailedFlashMessage?.destroyMessage();

This is similar to how the clearMessages works but instead of finding a specific message to clear, it clears all of them in a forEach.

I think in the test you might also be able to:

  1. call the ScopesScopeAuthenticateMethodOidcRoute.error(e) method instead which would trigger @notifyError
  2. Assert that the flash message service has one message and the name of that message matches the translated key of "errors.authentication-failed.title"
  3. Then call activate to simulate re-entering the route
  4. Assert the flash message service has zero messages

The benefit of this is the route ever gets refactored this test will likely fail (if either the message gets changed, or the notify error is removed) and will also need to be refactored. This might be doable as an acceptance test too.

@hashicc Thanks for the suggestion, that is a nice way to do it.

I have updated the code and the test cases with the above suggestion, can you have a look again. Thank you.

@sohail-hashicorp
Copy link
Collaborator Author

As discussed in the Slack thread regarding this issue, the agreed-upon solution is to make the error message non-sticky.

This PR has been updated accordingly, and here is the reference video showing the change in action:

ui_error_fixed.mov

@sohail-hashicorp sohail-hashicorp requested a review from hashicc May 8, 2025 15:50
Copy link
Collaborator

@hashicc hashicc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉 I know there was a bit of back and forth on this one. I appreciate you working on it and getting it to this state. Thank you!

@sohail-hashicorp sohail-hashicorp force-pushed the ICU-16227/fixing-login-ui-error branch from fc6d009 to 260b8e0 Compare May 12, 2025 04:45
@sohail-hashicorp sohail-hashicorp merged commit 3d58745 into main May 12, 2025
11 of 13 checks passed
@sohail-hashicorp sohail-hashicorp deleted the ICU-16227/fixing-login-ui-error branch May 12, 2025 05:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants