fix: token LocalCache to DistributedCache #46109

Closed

Conversation

dsisysteme

@dsisysteme dsisysteme commented Jun 25, 2024

Fix #46165

In a load-balanced platform with multiple application servers, local token caching does not work effectively and leads to login loops. We believe that by making this change, we can use a distributed cache instead.

  • Resolves:
    In a load-balanced platform without sticky sessions, local token caching does not work and causes a login loop.

Summary

Currently, the token is stored in the local cache. In the case of a load-balanced Nextcloud platform without Sticky Sessions, it is necessary to store the token in the distributed cache if it exists.
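To make the distinction concrete, here is a minimal sketch of the two cache layouts, not Nextcloud code. The `Node` class, its method names, and the plain dictionaries standing in for APCu-style local caches and a Redis-style shared cache are all assumptions made for illustration.

```python
class Node:
    """One application server behind the load balancer (hypothetical model)."""

    def __init__(self, name, shared_cache=None):
        self.name = name
        self.local_cache = {}             # lives in this node's memory only
        self.shared_cache = shared_cache  # stand-in for a distributed cache

    def _cache(self):
        # Use the shared store when one is configured, else the local one.
        if self.shared_cache is not None:
            return self.shared_cache
        return self.local_cache

    def remember(self, token_hash, token):
        self._cache()[token_hash] = token

    def recall(self, token_hash):
        return self._cache().get(token_hash)


# Local caches: what node A cached is invisible to node B.
a, b = Node("A"), Node("B")
a.remember("hash-1", "token-1")
local_result = b.recall("hash-1")

# Shared cache: both nodes read and write the same store.
shared = {}
c, d = Node("C", shared), Node("D", shared)
c.remember("hash-1", "token-1")
shared_result = d.recall("hash-1")

print(local_result, shared_result)
```

In this model a miss on node B is only a cache miss. Whether it becomes an authentication failure depends on what the node does next, which is what the discussion below turns on.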

Checklist

On a load-balanced platform, the local token cache does not work.
We think that with this change we can use the distributed cache instead.

Signed-off-by: dsisysteme <72081136+dsisysteme@users.noreply.github.com>
@dsisysteme dsisysteme changed the title Fix token Cache: Local -> Distributed fix: token LocalCache to DistributedCache Jun 25, 2024
@kesselb kesselb added this to the Nextcloud 30 milestone Jun 27, 2024
@kesselb kesselb added bug 3. to review Waiting for reviews labels Jun 27, 2024
@kesselb
Contributor

kesselb commented Jun 27, 2024

> Currently, the token is stored in the local cache. In the case of a load-balanced Nextcloud platform without Sticky Sessions, it is necessary to store the token in the distributed cache if it exists.

The token is stored in oc_authtoken, and the cache is there to reduce database queries.

Changing local to distributed seems like a hack to fix something that's broken somewhere else.

@ChristophWurst
Member

Distributed cache is slow, I would prefer to avoid it.

Can you outline how the local cache can lead to login loops?

@dsisysteme
Author

dsisysteme commented Jun 27, 2024

Sorry, we tried to explain the problem in the issue.

What we see in the flow is this: when requests come back to Nextcloud after passing through user_saml, the node that generated the token handles them correctly, whereas the second node returns a 401 and redirects to user_saml, where the user is already authenticated. user_saml then redirects back to Nextcloud, which generates a new token on whichever node handles that request. As soon as a request arrives on the other node, it returns a 401 again and the loop begins.

When the token is stored in Redis, all nodes can retrieve it. Alternatively, a node that cannot find the token in its local cache should store it there once it has retrieved it.

@ChristophWurst
Member

> whereas the second node will return a 401

I don't yet see why this would happen. The second node doesn't have the app token cached, true, but it also doesn't have a negative cache entry. So it will go to the database and should find the row, right?
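The expectation described here is the usual cache-aside read path. The sketch below is a simplified model of that path, not the actual `PublicKeyTokenProvider` code; the function name, the dictionaries, and the use of `None` as a negative-entry marker are assumptions made for illustration.

```python
def get_token(token_hash, cache, database):
    """Cache-aside lookup: cache first, then database, then fill the cache."""
    if token_hash in cache:
        cached = cache[token_hash]
        if cached is None:
            # Negative entry: this hash was marked invalid earlier.
            raise LookupError("token marked invalid in cache")
        return cached
    # Cache miss with no negative entry: fall through to the database.
    token = database.get(token_hash)
    if token is None:
        raise LookupError("token does not exist")
    cache[token_hash] = token
    return token


database = {"hash-1": "token-1"}  # row written via the first node
node_b_cache = {}                 # second node: nothing cached yet
found = get_token("hash-1", node_b_cache, database)
print(found)
```

Under these assumptions the second node finds the row and warms its own cache, so a plain local-cache miss should not produce a 401 by itself.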

@dsisysteme
Author

Yes, it seems that the record in the oc_authtoken table disappears immediately when the issue arises. However, logically, the second node should make a request to the database to update its local cache. In the case of distributed caching, there is ultimately only one cached record for the entire cluster. This is just for discussion because we do not know all the impacts behind it.

We have conducted upgrade tests and detected that the issue appeared between version 27.1.9 and 27.1.10, if that helps to understand the reason!

@dsisysteme
Author

We detected that right after the SSO login (redirect from SSO to Nextcloud), we get this error and then the redirection loop starts, but the error appears only once:

{
  "reqId": "GMZg7KyZVNMpJ9dUVL58",
  "level": 3,
  "time": "2024-06-27T17:00:15+02:00",
  "remoteAddr": "192.168.1.1",
  "user": "--",
  "app": "core",
  "method": "GET",
  "url": "/index.php/apps/theming/theme/light.css?plain=0&v=81b4a035",
  "message": "Renewing session token failed: Token does not exist: a46eaecb5ab23aa00bce568fdaffbe0de1e1a49c900142d1e23c2c720800c132382cbd3e7c9e74a206b94388c5e43cf7912a5cae01b72ddfbf2edde4843cadc3",
  "userAgent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:127.0) Gecko/20100101 Firefox/127.0",
  "version": "29.0.3.4",
  "exception": {
    "Exception": "OC\\Authentication\\Exceptions\\InvalidTokenException",
    "Message": "Token does not exist: a46eaecb5ab23aa00bce568fdaffbe0de1e1a49c900142d1e23c2c720800c132382cbd3e7c9e74a206b94388c5e43cf7912a5cae01b72ddfbf2edde4843cadc3",
    "Code": 0,
    "Trace": [
      {
        "file": "/var/www/***URL-PLATEFORM***/htdocs/lib/private/Authentication/Token/PublicKeyTokenProvider.php",
        "line": 168,
        "function": "getTokenFromCache",
        "class": "OC\\Authentication\\Token\\PublicKeyTokenProvider",
        "type": "->",
        "args": [
          "*** sensitive parameters replaced ***"
        ]
      },
      {
        "file": "/var/www/***URL-PLATEFORM***/htdocs/lib/private/Authentication/Token/PublicKeyTokenProvider.php",
        "line": 249,
        "function": "getToken",
        "class": "OC\\Authentication\\Token\\PublicKeyTokenProvider",
        "type": "->",
        "args": [
          "*** sensitive parameters replaced ***"
        ]
      },
      {
        "file": "/var/www/***URL-PLATEFORM***/htdocs/lib/public/AppFramework/Db/TTransactional.php",
        "line": 63,
        "function": "OC\\Authentication\\Token\\{closure}",
        "class": "OC\\Authentication\\Token\\PublicKeyTokenProvider",
        "type": "->",
        "args": [
          "*** sensitive parameters replaced ***"
        ]
      },
      {
        "file": "/var/www/***URL-PLATEFORM***/htdocs/lib/private/Authentication/Token/PublicKeyTokenProvider.php",
        "line": 248,
        "function": "atomic",
        "class": "OC\\Authentication\\Token\\PublicKeyTokenProvider",
        "type": "->"
      },
      {
        "file": "/var/www/***URL-PLATEFORM***/htdocs/lib/private/Authentication/Token/Manager.php",
        "line": 172,
        "function": "renewSessionToken",
        "class": "OC\\Authentication\\Token\\PublicKeyTokenProvider",
        "type": "->"
      },
      {
        "file": "/var/www/***URL-PLATEFORM***/htdocs/lib/private/User/Session.php",
        "line": 941,
        "function": "renewSessionToken",
        "class": "OC\\Authentication\\Token\\Manager",
        "type": "->"
      },
      {
        "file": "/var/www/***URL-PLATEFORM***/htdocs/lib/base.php",
        "line": 1132,
        "function": "loginWithCookie",
        "class": "OC\\User\\Session",
        "type": "->",
        "args": [
          "*** sensitive parameters replaced ***"
        ]
      },
      {
        "file": "/var/www/***URL-PLATEFORM***/htdocs/lib/base.php",
        "line": 1039,
        "function": "handleLogin",
        "class": "OC",
        "type": "::"
      },
      {
        "file": "/var/www/***URL-PLATEFORM***/htdocs/index.php",
        "line": 49,
        "function": "handleRequest",
        "class": "OC",
        "type": "::"
      }
    ],
    "File": "/var/www/***URL-PLATEFORM***/htdocs/lib/private/Authentication/Token/PublicKeyTokenProvider.php",
    "Line": 197,
    "message": "Renewing session token failed: Token does not exist: a46eaecb5ab23aa00bce568fdaffbe0de1e1a49c900142d1e23c2c720800c132382cbd3e7c9e74a206b94388c5e43cf7912a5cae01b72ddfbf2edde4843cadc3",
    "user": "***LOGIN***",
    "exception": {},
    "CustomMessage": "Renewing session token failed: Token does not exist: a46eaecb5ab23aa00bce568fdaffbe0de1e1a49c900142d1e23c2c720800c132382cbd3e7c9e74a206b94388c5e43cf7912a5cae01b72ddfbf2edde4843cadc3"
  }
}

@dsisysteme
Author

Hello,

It seems that there is a problem in the getToken() function in PublicKeyTokenProvider.php.
On login, this function is called, but the record is found neither in the database nor in the cache (because of load balancing and the local cache).
The token hash is then marked invalid ($this->cacheInvalidHash($tokenHash);), which explains the loop.

If we comment out the line $this->cacheInvalidHash($tokenHash);, authentication works with the local cache.
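The effect described above can be reproduced in a small model. This is a sketch under stated assumptions, not the Nextcloud implementation: the `Node` class, the `INVALID` marker, and the `cache_invalid` switch (standing in for keeping or commenting out the `cacheInvalidHash` call) are hypothetical. The model also assumes that the first lookup happens before the token row is visible in the database; the thread does not establish why the row is missing at that moment.

```python
INVALID = object()  # marker for a negatively cached token hash


class Node:
    """One application server with a local cache (hypothetical model)."""

    def __init__(self, database, cache_invalid=True):
        self.database = database
        self.cache = {}
        self.cache_invalid = cache_invalid

    def get_token(self, token_hash):
        entry = self.cache.get(token_hash)
        if entry is INVALID:
            return None  # rejected from cache, database is not consulted
        if entry is not None:
            return entry
        token = self.database.get(token_hash)
        if token is None:
            if self.cache_invalid:
                # Remember the miss so later lookups skip the database.
                self.cache[token_hash] = INVALID
            return None
        self.cache[token_hash] = token
        return token


def lookups(cache_invalid):
    database = {}
    node = Node(database, cache_invalid)
    before = node.get_token("hash-1")  # row not visible yet
    database["hash-1"] = "token-1"     # row becomes visible
    after = node.get_token("hash-1")
    return before, after


print(lookups(True))   # negative caching on
print(lookups(False))  # negative caching off
```

With negative caching on, the node keeps rejecting the hash after the row exists, because the invalid marker sits in that node's local cache. With it off, the second lookup reaches the database and succeeds, which matches the observation that commenting out the call makes authentication work.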

Hope this information helps you find the cause.

@blizzz
Member

blizzz commented Jul 10, 2024

A different approach #46398

@dsisysteme
Author

Closing because this is fixed by #46398.

@dsisysteme dsisysteme closed this Jul 30, 2024
@dsisysteme dsisysteme deleted the fix/token-distributed-cache branch July 30, 2024 13:58
@skjnldsv skjnldsv removed this from the Nextcloud 30 milestone Aug 14, 2024
Development

Successfully merging this pull request may close these issues.

[Bug]: Invalid LocalCache Token on a Load-Balanced System
6 participants