Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unnecessary Redis Session Locking On All HTTP GET Requests - Affecting PWA Studio Concurrent GraphQL Requests #34758

Open
1 task
mttjohnson opened this issue Dec 3, 2021 · 42 comments
Labels
Area: APIs Component: Framework/Session Component: GraphQL GraphQL Issue: Confirmed Gate 3 Passed. Manual verification of the issue completed. Issue is confirmed Priority: P1 Once P0 defects have been fixed, a defect having this priority is the next candidate for fixing. Progress: ready for dev Reproduced on 2.4.x The issue has been reproduced on latest 2.4-develop branch Severity: S0 A problem that is blocking the ability to work. An immediate fix is needed.

Comments

@mttjohnson
Copy link

Preconditions (*)

  1. Magento 2.3.4-p1
  2. PWA Studio 12.1.0 (potentially optional)
  3. Setup Magento with Redis Sessions

Helpful Insights

  • New Relic APM Transaction Traces
  • Tested tech stack version (though this may not be relevant)
    • PHP 7.4
    • MySQL 8.0
    • Varnish 6.4
    • Redis 5
    • Elasticsearch 7.12

Steps to reproduce (*)

Configure Magento

  1. Ensure that default configuration of Redis session locking is enabled by setting disable_locking to 0:
cat app/etc/env.php | grep disable_locking
bin/magento -n setup:config:set --session-save-redis-disable-locking=0

Load the PWA storefront home page

  1. Load home page (example: https://syseng-seldon.cldev.io/) which will make multiple graphql calls to the MAGENTO_BACKED_URL
  2. In Chrome Developer Tools on the Network tab you can use a filter to only display requests for the specific domain that are graphql calls domain:syseng-seldon.cldev.io -media -static graphql
  3. Multiple (15) /graphql requests are all sent concurrently to the configured MAGENTO_BACKEND_URL of the PWA app around the same time

Simulate the PWA storefront requests on ANY Magento site

  1. Create a static html file (example: graphqlrepeat.html) in /pub/ of the magento root containing fetch
<!DOCTYPE html>
<html lang="en">
<head><title>JavaScript HTTP Requests</title></head>
<body><script>
 
var domain = "syseng-seldon.cldev.io";
 
fetch("https://" + domain + "/graphql?query=query+GetStoreConfigForCarouselEE%7BstoreConfig%7Bid+product_url_suffix+magento_wishlist_general_is_enabled+enable_multiple_wishlists+__typename%7D%7D&operationName=GetStoreConfigForCarouselEE&variables=%7B%7D", {
  "headers": {
    "accept": "*/*",
    "accept-language": "en-US,en;q=0.9",
    "authorization": "",
    "cache-control": "no-cache",
    "content-type": "application/json",
    "pragma": "no-cache",
    "sec-ch-ua": "\" Not A;Brand\";v=\"99\", \"Chromium\";v=\"96\", \"Google Chrome\";v=\"96\"",
    "sec-ch-ua-mobile": "?0",
    "sec-ch-ua-platform": "\"macOS\"",
    "sec-fetch-dest": "empty",
    "sec-fetch-mode": "cors",
    "sec-fetch-site": "same-origin",
    "store": "default",
    "x-magento-cache-id": "null"
  },
  "referrer": "https://" + domain + "/",
  "referrerPolicy": "strict-origin-when-cross-origin",
  "body": null,
  "method": "GET",
  "mode": "cors",
  "credentials": "include"
});
 
</script></body>
</html>
  1. Replace the domain variable with the domain of the magento site you're testing

  2. Load the HTML page containing js that will fetch multiple graphql requests concurrently

  3. You can open Chrome Developer Tools on the Network tab and where you see one of the GraphQL calls, you can right click on the request line and "Copy as fetch" to get the javascript fetch statement for making that same request in an html page js inline script.
    image

  4. You can do this to re-create all the requests for a specific page by adding each fetch() call to the html file, or duplicate the exact same fetch() graphql request (40x) to reproduce the session locking behavior being seen here.

Simulate SOME of the expected behavior by globally disabling redis session locking on ALL requests

  1. Simulate SOME of the expected behavior by globally disabling redis session locking on ALL requests
cat app/etc/env.php | grep disable_locking
bin/magento -n setup:config:set --session-save-redis-disable-locking=1

Expected result (*)

When concurrent GraphQL GET requests are made from a visitor, requests should be able to complete in parallel to keep page load time minimal, while still allowing requests that require session locking (where important session data is being written) in order to prevent some other request from overwriting data in the session. Important data getting overwritten in a session can negatively affect critical application behavior.

image
image
image

In simulating some of the correct behavior with Redis session locking disabled, I was able to load the home page and all 15 graphql requests within a window of 600 ms. There are other pages that may contain many more GraphQL requests where this can be even more important to have concurrent requests complete in parallel.

Home Page GraphQL Measurements - with Session Locking Disabled
  First started at: 393ms
  Last started at: 586ms
  Last duration was: 401ms
  Last ended at: 586ms + 401ms = 987ms
  Total window of execution: 586ms + 401ms - 393ms = 594ms
  GraphQL Requests Total: 15
  Page Finished 1020ms

image
image

Looking at a waterfall of how the concurrent GraphQL requests complete, can reveal that multiple requests are completing at or within close to the same time.

image

Actual result (*)

With Redis session locking enabled which is the default and recommended safe behavior for redis session configs, the concurrent requests queue up, each waiting in sequence for a redis session lock to clear before the next request is able to complete.

image
image
image

Home Page GraphQL Measurements - with Session Locking Disabled
  First started at: 313ms
  Last started at: 323ms
  Last duration was: 2800ms
  Last ended at: 323ms + 2800ms = 3123ms
  Total window of execution: 323ms + 2800ms - 313ms = 2810ms
  GraphQL Requests Total: 15
  Page Finished 3120ms

image
image
image

This makes it look like several of the graphql requests are taking an excessive amount of time to complete, while others completed in less time, but while these requests started close to the same time, they spent a lot of time waiting for session locking to clear, resulting in requests being completed in sequence rather than in parallel.

image
image

Cause of Behavior

With the PWA Studio (Client Side React App) running as the frontend "storefront" of Magento and sending GraphQL calls to the backend of Magento there are different behavioral patterns in how requests are sent to and processed by the web server from how we have been seeing interactions when using the Magento "theme" as the frontend.

The PWA sends multiple concurrent AJAX calls to /graphql end points on the web server, and these requests are all processed by the Magento backend PHP application. It turns out that Magento architecture has it creating a session and locking that session regardless of the type of request or response being issued.

Some /graphql requests are able to be cached by Varnish and because cached requests in Varnish do not execute code and therefore do not interact with redis sessions. This can mask the fact that each request that hits the backend will lock the session for the request and cause requests to be completed in sequence. Many of the graphql requests are not able to be cached in Varnish at this time.

We have seen this behavior in the Magento "theme" frontend also, but there are typically only a few requests that typically happen concurrently, and we see these results show up in New Relic on transaction traces a lot for AJAX calls that are likely to be running concurrently with other requests with the same session.

image

The problem is not just isolated to AJAX calls like customer/section/load on the Magento "theme" frontend, it also tends to occasionally interfere with other requests on product pages, or AJAX calls in the checkout. In the example below the unlucky visitor ended up waiting 8.8s to get an initial response for loading the page instead of what should have taken 300ms because they had some other request that was locking their session. It doesn't happen often, but it's not pleasant when it does.

The big behavioral difference is that most of the requests in the Magento "theme" frontend do NOT happen concurrently, so session locking on a couple concurrent requests doesn't affect things overall that much, and it tends to be more of an exception and most requests (even customer/section/load AJAX requests) are not delayed waiting for session locks to release on average, so this known deficiency is only causing mild problems with the Magento "theme" frontend overall, and limiting concurrent AJAX calls is a way of working around it.

image

This session locking causes lots of problems with a PWA by delaying concurrent AJAX calls causing them to wait in line and finish sequentially as the locks they are waiting on clear. This ends up making requests randomly appear like they are taking a really long time to complete while under the hood they are primarily just waiting in line for the session to be available so it can lock the session and complete it's request. This makes it very hard to identify from the client side where the problem is as it relates to concurrent requests for the same session and running the requests independently results in a very fast response.

While it's possible to disable session locking, this can cause serious issues with requests that make changes to session data and overwrite each other, and that can result in serious problems on checkout where payments are captured but orders fail to be saved in Magento resulting sometimes in a customer re-submitting the order and getting charged multiple times if they are able to get the order to complete successfully.

The library used by Magento (v2.3.4) is the latest available (colinmollenhour/php-redis-session-abstract v1.4.4) and it does support the ability to process read only requests without locking a session when the global config is set to utilize session locking... it just so happens that Magento does not utilize this functionality and opens all session connections without specifying if it needs to write to the session or not, thus defaulting to write mode and session locking.

  1. https://github.com/magento/magento2/blob/2.4.3-p1/composer.lock#L321
  2. https://github.com/colinmollenhour/php-redis-session-abstract/releases
  3. https://github.com/colinmollenhour/php-redis-session-abstract/blob/v1.4.4/src/Cm/RedisSession/Handler.php#L434
  4. https://github.com/magento/magento2/blob/2.4.3-p1/lib/internal/Magento/Framework/Session/SaveHandler/Redis.php#L63
  5. https://github.com/colinmollenhour/php-redis-session-abstract/blob/v1.4.4/src/Cm/RedisSession/Handler.php#L263

Possible Solution

Requests that come into Magento as GET requests are typically expected to return generic publicly identical response that can be cached by Varnish, while POST requests are explicitly not allowed to be cached by Varnish as they are expected to contain visitor/session/customer private data in the responses. Important write operations should typically happen in POST requests, and those types of requests would be expected to need and utilize session locking, while GET requests would generally be for returning generic data that is not visitor/session/customer specific and the same response would be returned to all requests and not involve any kind of write to the session. If any session write activity were to occur on GET requests, it would likely be to update the timestamp of the most recent request to indicate the session is still active and has not expired yet (this is an assumption that should be verified).


  • Severity: S2 - Affects non-critical data or functionality and forces users to employ a workaround.
@m2-assistant
Copy link

m2-assistant bot commented Dec 3, 2021

Hi @mttjohnson. Thank you for your report.
To speed up processing of this issue, make sure that you provided the following information:

  • Summary of the issue
  • Information on your environment
  • Steps to reproduce
  • Expected and actual results

Make sure that the issue is reproducible on the vanilla Magento instance following Steps to reproduce. To deploy vanilla Magento instance on our environment, Add a comment to the issue:

@magento give me 2.4-develop instance - upcoming 2.4.x release

For more details, review the Magento Contributor Assistant documentation.

Add a comment to assign the issue: @magento I am working on this

To learn more about issue processing workflow, refer to the Code Contributions.


⚠️ According to the Magento Contribution requirements, all issues must go through the Community Contributions Triage process. Community Contributions Triage is a public meeting.

🕙 You can find the schedule on the Magento Community Calendar page.

📞 The triage of issues happens in the queue order. If you want to speed up the delivery of your contribution, join the Community Contributions Triage session to discuss the appropriate ticket.

🎥 You can find the recording of the previous Community Contributions Triage on the Magento Youtube Channel

✏️ Feel free to post questions/proposals/feedback related to the Community Contributions Triage process to the corresponding Slack Channel

@ihor-sviziev
Copy link
Contributor

@mttjohnson,
I can say that precisely the same issue we have on the luma-based frontend.
Actually, because of too frequent requests to customer/section/load, we had even issues with losing customer sessions. The solution for us was the following:

  1. Reduce not needed session starts 2.3...ihor-sviziev:fix-not-needed-session-start
  2. Installed the following extension: https://github.com/integer-net/magento2-session-unblocker/

@m2-community-project m2-community-project bot added Issue: Confirmed Gate 3 Passed. Manual verification of the issue completed. Issue is confirmed and removed Issue: ready for confirmation labels Feb 10, 2022
@cpartica
Copy link
Contributor

I confirmed the issue. I think is valid to address this at least in GraphQL where we don't really need session

@github-jira-sync-bot github-jira-sync-bot removed the Issue: Confirmed Gate 3 Passed. Manual verification of the issue completed. Issue is confirmed label Feb 10, 2022
@cpartica cpartica added Severity: S0 A problem that is blocking the ability to work. An immediate fix is needed. Issue: Confirmed Gate 3 Passed. Manual verification of the issue completed. Issue is confirmed labels Feb 10, 2022
@cpartica cpartica added the Priority: P0 This generally occurs in cases when the entire functionality is blocked. label Feb 10, 2022
@github-jira-sync-bot github-jira-sync-bot removed the Issue: Confirmed Gate 3 Passed. Manual verification of the issue completed. Issue is confirmed label Feb 10, 2022
@cpartica cpartica added Priority: P1 Once P0 defects have been fixed, a defect having this priority is the next candidate for fixing. Issue: Confirmed Gate 3 Passed. Manual verification of the issue completed. Issue is confirmed and removed Priority: P0 This generally occurs in cases when the entire functionality is blocked. labels Feb 10, 2022
@github-jira-sync-bot github-jira-sync-bot removed the Issue: Confirmed Gate 3 Passed. Manual verification of the issue completed. Issue is confirmed label Feb 10, 2022
@magento magento deleted a comment from github-jira-sync-bot Mar 1, 2022
@magento magento deleted a comment from github-jira-sync-bot Mar 1, 2022
@magento magento deleted a comment from github-jira-sync-bot Mar 1, 2022
@damienwebdev
Copy link
Member

I don't know if it's been mentioned yet, but there's work in v2.4.5 to disable sessions on GraphQl as they're not strictly necessary any more.

@cpartica or @sidolov may know best, I haven't messed around with the setting yet.

@luishenriqueamaral
Copy link

We are also facing the same problem using our headless storefront and Magento checkout pages. Although we don't use graphql, our requests towards default and custom rest endpoints are stuck in Redis operations. We contacted Adobe support and they recommended us to disable the session locking and they also mentioned that this is default in the cloud version.

I noticed that @colinmollenhour mentioned a few times (1, 2, 3) that the risk of this is low and that it could only cause minor side effects, which is encouraging.
But we are still hesitant to do so as this comment from @mttjohnson poses a greater risk:

While it's possible to disable session locking, this can cause serious issues with requests that make changes to session data and overwrite each other, and that can result in serious problems on checkout where payments are captured but orders fail to be saved in Magento resulting sometimes in a customer re-submitting the order and getting charged multiple times if they are able to get the order to complete successfully.

Matt, is this something that actually happened to you or is it just a hypothetical situation?

@mttjohnson
Copy link
Author

@damienwebdev I was reading the release notes on 2.4.5 last week and noticed that it mentioned there was some ability to disable sessions on GraphQL requests, and I think that would solve the big issues for operating Magento with a PWA that makes a lot of concurrent GraphQL requests. While writing to and locking sessions on all other requests hitting the backend is still excessive, it would be nice to confirm if this provides a workaround to this issues I identified above, so we could resolve the critical aspects of this issue.

@luishenriqueamaral For the sites I managed we always enforced having session locking enabled, but I have heard reports from other people claiming the issues during checkout when session locking was disabled. I have however witnessed and helped troubleshoot issues where session locking was enabled but long running concurrent requests during checkout placement would cause the session lock to expire and session data similarly getting overwritten, causing payments to be captured without associated orders in the database. Sometimes we'd see this as payments without orders, and other times it would show up as multiple payments from the same initial quote. It's also worth pointing out that I've seen payments without order, and duplicate payments that didn't relate to any session problems, so that symptom can have multiple causes, it is just that session can be related to causing problems. I suspect that most of the session data getting overwritten isn't of much concern, but during checkout if critical data is written to the session, the act of it getting overwritten can result in some seriously difficult to troubleshoot issues in addition to customer service problems.

@Yonn-Trimoreau
Copy link

Yonn-Trimoreau commented Nov 12, 2022

I've come up with an acceptable temporary solution for this: colinmollenhour/php-redis-session-abstract#50

To temporarily fix this issue, here are the patches to apply:

On colinmollenhour/php-redis-session-abstract module => implement-write-lock-and-diff-patch.txt

And on magento/framework module => set-php-serialize-handler-patch.txt

In fact the colinmollenhour/php-redis-session-abstract module implements read locking and NOT write locking.
Basically this means that if multiple requests needing to read the session come to happen concurrently, the second needs to wait for the first to end, the third needs to wait for the second, and so on...
This situation is natively happening frequently when using GraphQL, or in the native checkout, and another example is when enabling the banner functionality in Magento Commerce (the /customer/section/load request happens almost always concurrently with the /banner/ajax/load request).

A handy way of clearly and consistently reproducing this performance issue is by following these steps:

  • Setting mage mode to developer (using bin/magento deploy:mode:set developer and ensuring MAGE_MODE env variable contains the same value)
  • Raising redis log_level to 7 in the app/etc/env.php file
  • Flushing the pub/static/frontend folder
  • Opening a terminal and run the following command tail -f var/log/debug.log | grep --line-buffered "Waiting"
  • Loading any page of the Magento website (and ensure the browser's developer console is opened and the "Disable cache" checkbox is enabled in the Network tab, to avoid getting memory cached files)

The disable_locking feature is clearly not a definitive solution and may result in weird and random problems depending on the performance of the browser and server.
What may happen is the following situation:

           A-------------A
           |             |
         reads        writes I
           |             |
           v             v
Session-------------------------X------Time
                ^               ^
                |               |
              reads          writes J
                |               |
                B---------------B

In this situation, the request B will overwrite the modifications made by the request A, and the resulting session will not contain I modification, but only J.
Depending on the modifications, this can provoke this behavior #10111 or this one https://magento.stackexchange.com/questions/350520/checkout-successful-redirects-to-empty-cart-but-not-using-redis
The reverse situation may also happen (some request removes an entry and the following request unremoves it), and it can cause notifications to show twice for example.

But the scenario I've just described can also happen when using file storage instead of Redis.
If using file storage on your project doesn't produce such problems, then you can safely use disable_locking feature.
The root cause of the overwriting problem is that sessions are read and written as a whole and not "column by column", and there is no diff mechanism or database-oriented language that can tell "write this column specifically but not this other one".

I understand the point of @colinmollenhour which says that API requests should not rely on instantiating the session since the common and widely adopted architectural style for building APIs is REST, and being RESTful implies being stateless.
But currently, most of Magento's checkout APIs and customer/section/load or even banner/ajax/load APIs are not RESTful (and a whole bunch of Magento marketplace's modules include APIs that are not either).
Probably for legacy reasons or wrong conception, but in any case, rewriting all this is quite a lot of work, and in the meantime, the problem will continue to happen.
Having stateful AJAX calls in non-headless websites is also quite common and I can't find why we should put an emergency on forcing them to comply to the REST principles.
By the way, I really don't understand why Magento GraphQL requests are stateful, this sounds like a serious architectural mistake to me (but I don't use it regularly so I won't firmly pronounce on this).

And the solution proposed by @mttjohnson brings the burden of having to list every requests that need read-only access to the session in some kind of whitelist (another Magento xml file?).
Ultimately, this kind of whitelist would not be needed if @colinmollenhour's module was implementing write lock instead of read lock (which the module is actually advertised to).

Maybe the definitive solution is for Magento to implement a "last second session state check" with a diff mechanism that would only propagate actual changes made to the session to the write operation, and apply it on all types of session storage (file, memcache or Redis).
And for @colinmollenhour to implement a simple real write lock solution (probably by passing an option to avoid causing regressions on every non-Magento users of this module).

Both solutions are actually implemented in the patches I supplied at the beginning of this comment. It's now just a matter of moving some part in Magento and another in @colinmollenhour's module, if both parties accept this way of resolving this extremely long running issue once and for all.

@colinmollenhour
Copy link

colinmollenhour commented Nov 13, 2022

@Yonn-Trimoreau I suppose you're right if you are referencing "Only one process may get a write lock on a session." - it should just read "Only one process may get a lock on a session." but that is just semantics..

As of now when you start a session is is basically assumed that there will be a write, like a "SELECT ... FOR UPDATE". Unlike a database, you can't know if there will be a write until the very end, basically during the "COMMIT". So I don't see how a concept of separate read/write locks translates, but I digress..

For example, let's say you add a product to the cart and so a "Added to cart" success message is added to the session storage. Then two requests are made at once and they both start the session reading the data without waiting. When they write the session they will both delete the message so that's fine, but still you have a different outcome than if you had a read lock (two requests get the success message instead of two). Another example may be a "firstPageAfterLogin" flag or a security nonce that is meant to be read only once. The problem is not serializing the writes in these cases, it is the reads.

I'm not familiar with the Magento 2 frontend so I don't know what requests are occurring and why they need to use the session but my general suggestion would be to find which requests modify the session and which ones don't. The ones that don't should just be read-only and the ones that do should just use the normal locking. If you're in a hurry this could even be done outside of Magento 2 code:

In index.php or anywhere before the session is instantiated:

if (preg_match('#^/(banner|customer)/#', $_SERVER['REQUEST_URI']??'')) {
    define('SESSION_READ_ONLY', TRUE);
}

Then replace

$this->connection[$pid] = new \Cm\RedisSession\Handler($this->config, $this->logger);
with:

$this->connection[$pid] = new \Cm\RedisSession\Handler($this->config, $this->logger, defined('SESSION_READ_ONLY'));

The read-only flag is already supported in https://github.com/colinmollenhour/php-redis-session-abstract/blob/master/src/Cm/RedisSession/Handler.php#L263

EDIT: After reading up farther I see/remember this was basically already suggested by @mttjohnson a while back. Using read-only sessions is safer than disabling locking since the read-only sessions can't clobber good data.

@Yonn-Trimoreau
Copy link

Yonn-Trimoreau commented Nov 13, 2022

@colinmollenhour Yes I am referencing this sentence "Only one process may get a write lock on a session". And semantics is actually the key to the problem here.

A read lock is what LibreOffice or Microsoft Word implements for example. When you open it, a lock file is written next to the file you're editing, just to remind you that you've already opened it in another text editor instance, in case you try to double open it. So that you won't overwrite your data from one instance to another.

A write lock is what the filesystem implements, or any NoSQL database, they run write operations one at a time to preserve atomicity of the data which is currently being written.
And Redis does actually implement this by default.

What you're talking about when mentioning the "COMMIT" operation in SQL language is a transaction.
Basically, the SQL database will put some part of the tables you're reading/writing in memory, to let you work with a consistent version of the database, and then when you call COMMIT, it will take the write lock (to avoid anyone else writing during the commit operation), and then write all the INSERT or UPDATE operations you put in the transaction one at a time, until it releases the lock and frees the temporary memory.
But in the meantime, callers can continue reading the tables you're currently modifying in the transaction (and that's the important part in our case).
Transactions are useful for two purposes:

  1. To ensure consistency of the whole database.
    Storing sessions doesn't critically need consistency since we don't have foreign keys, auto increment, etc... And Redis doesn't provide consistency at all.

  2. To ensure the atomicity of multiple write operations, so that a group of operations are either persisted entirely or not at all.
    Internally this works by, in this order: taking the write lock ; taking a snapshot of the current state of the data ; applying the multiple operations to the snapshot ; overwriting the actual data in the database with the prepared snapshot ; releasing the write lock
    And this is exactly what we need in our session use case.

What's a real shame is that Redis actually provide such transactions. But they operate at the entry level, and since the whole session is stored serialized in a single entry, we cannot profit from this.

As you can see, the following workflow is needed:

  1. Read session data when request starts
  2. Keep a snapshot of the first read data in memory
  3. Process the request
  4. Calculate a diff between the first read data and what is programmed to be written
  5. Take the write lock
  6. Read the session data again
  7. Apply the calculated diff on this fresh session data
  8. Write the session
  9. Release the lock

So if Magento takes care of the diff part of the process, your module should expose three methods: read, lock and write, so that the caller can choose when to read, when to lock, and when to write.

And yes, this will change the outcome, 100% agree on that. But since most Magento projects are presently forced to disable_locking for performance reasons, and that works quite ok 95% of the time, we have already proven that Magento can safely work without a read lock. What I would seriously like to remove is the "quite" part of this sentence, by ensuring that there is no random or weird and un-debuggable behavior in all this.

Using read-only is half of the answer, since if you have multiple requests that need to write in parallel (and this actually happens in the Magento checkout), you will still have unnecessary waiting time.

@convenient
Copy link
Contributor

@colinmollenhour We did as you suggested in #34758 (comment) for requests that we're 100% confident we do not care about the session for.

  1. We put some REQUEST_URI checking and handling in the top of pub/index.php to identify if this is a request we don't care about the session for.
  2. We patched in the change so that the handler is passed defined('SESSION_READ_ONLY')

It's not a perfect metric, but you can see where it was deployed and any session variation has dropped off

Screenshot 2022-12-08 at 10 04 43

And you can see average response time has dropped in this table, as well as the "max" time dropping way down because we're no longer fighting with session locking etc i think.

Before
Screenshot 2022-12-08 at 10 08 04

After
Screenshot 2022-12-08 at 10 08 38

Cheers for the simple recommendation!

@Yonn-Trimoreau
Copy link

Yes definitely a simpler and faster solution, if you don't care about locking at all

@convenient
Copy link
Contributor

@Yonn-Trimoreau definitely a stop gap until we can get magento up to date to latest, this helps with some locking performance before the holiday period at least :)

@Swahjak
Copy link
Contributor

Swahjak commented Mar 17, 2023

@Yonn-Trimoreau did you make any progress on this after the setReadOnly method was introduced (by you) on colinmollenhour/php-redis-session-abstract?

@Yonn-Trimoreau
Copy link

@Swahjak sorry no, the patch is working on my customers' ends, and to be frank with you, I know making this MR will take some time, and has probably a 30% chance of being approved and integrated in the core.

Lot of efforts for a small chance of success, and since most of my customers are Enterprise Edition customers (that's why I massively encounter the /banner/ajax/load issue...), I would have hoped that a "Severity S0" issue would be taken more seriously by Adobe's team and that they would take up on the production of a definitive fix, based on my last comprehensive explanation + working patch.

@Swahjak
Copy link
Contributor

Swahjak commented Mar 20, 2023

@Yonn-Trimoreau I feel you. Thank you for taking the time to respond, it's appreciated.

@convenient
Copy link
Contributor

Seems 2.4.5 and above have this flag which may help provided all the graphql endpoints are coded correctly.

private const XML_PATH_GRAPHQL_DISABLE_SESSION = 'graphql/session/disable';

<plugin name="graphql_session_disable" type="Magento\GraphQl\Plugin\DisableSession"/>

@indykoning
Copy link

Monitoring we have suggests this same issue occurs within the admin wysiwyg editor in Magento 2.4.7.

When the editor is loaded it will attempt to load the thumbnails with https://example.com/admin/cms/wysiwyg_images/thumbnail/file/XXXXXXXX/key/XXXXXXXX/
Considering all these thumbnails are requested at the same time and they all implement read locking.

For now i will implement @colinmollenhour's suggestion #34758 (comment)

@denniskopitz
Copy link

denniskopitz commented Jun 12, 2024

Hello,

We have implemented @colinmollenhour's suggestion #34758 (comment) and are testing in pre-production environments. We are primarily concerned about improving performance (ie eliminating unnecessary redis locking) around the checkout flow.

Wondering if other's could share the request URIs they are targeting. So far we have identified these:

'/(amasty_promo|estimate-shipping-methods|amazon_pay|group-prices|customer\/section\/load)/'

Logging the requests in this flow looks like this (SESSION READ indicates where we have modified the redis connection to readOnly):

GUEST CHECKOUT

[2024-06-11T17:48:31.373743+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /mens/watches.html [] []
[2024-06-11T17:48:38.352651+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /customer/section/load/?sections=messages&force_new_section_timestamp=false&_=1718128116420 [] []
[2024-06-11T17:49:19.959767+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /checkout/cart/add/uenc/aHR0cHM6Ly9sb2NhbC5zaGlub2xhLmNvbS9tZW5zL3dhdGNoZXMuaHRtbA%2C%2C/product/1297/ [] []
[2024-06-11T17:49:23.301546+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /customer/section/load/?sections=cart%2Cdirectory-data%2Cammessages%2Csignifyd-fingerprint%2Cmessages%2Clocation&force_new_section_timestamp=true&_=1718128116421 [] []
[2024-06-11T17:49:32.360905+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /mens/watches/therunwell41-leather-watch-s0126.html [] []
[2024-06-11T17:49:42.726151+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /checkout/cart/add/uenc/aHR0cHM6Ly9sb2NhbC5zaGlub2xhLmNvbS9tZW5zL3dhdGNoZXMvdGhlcnVud2VsbDQxLWxlYXRoZXItd2F0Y2gtczAxMjYuaHRtbA%2C%2C/product/1293/ [] []
[2024-06-11T17:49:45.351061+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /customer/section/load/?sections=cart%2Cdirectory-data%2Cammessages%2Csignifyd-fingerprint%2Cmessages%2Clocation&force_new_section_timestamp=true&_=1718128176496 [] []
[2024-06-11T17:49:58.398474+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /checkout/cart/ [] []
[2024-06-11T17:50:05.137050+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /amazon_pay/checkout/config?_=1718128203130 [] []
[2024-06-11T17:50:05.137051+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /reclaim/checkout/reload [] []
[2024-06-11T17:50:05.503370+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/estimate-shipping-methods [] []
[2024-06-11T17:50:07.096253+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /customer/section/load/?sections=cart%2Cmessages%2Clocation&force_new_section_timestamp=true&_=1718128203131 [] []
[2024-06-11T17:50:07.513691+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/totals-information [] []
[2024-06-11T17:50:07.517021+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/totals-information [] []
[2024-06-11T17:50:07.568005+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/totals-information [] []
[2024-06-11T17:50:11.820151+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /amasty_promo/popup/reload/?uenc=aHR0cHM6Ly9sb2NhbC5zaGlub2xhLmNvbS9jaGVja291dC9jYXJ0Lw%2C%2C&_=1718128203132 [] []
[2024-06-11T17:50:12.706038+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /checkout/ [] []
[2024-06-11T17:50:17.680662+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /customer/section/load/?sections=idme-verify&force_new_section_timestamp=false&_=1718128215558 [] []
[2024-06-11T17:50:17.922531+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/internal/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/retrieve-adyen-payment-methods [] []
[2024-06-11T17:50:17.979711+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/estimate-shipping-methods [] []
[2024-06-11T17:50:19.993062+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /customer/section/load/?sections=messages%2Clocation&force_new_section_timestamp=true&_=1718128215559 [] []
[2024-06-11T17:50:20.151170+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/totals-information [] []
[2024-06-11T17:50:20.151197+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/totals-information [] []
[2024-06-11T17:50:22.767240+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /amasty_promo/popup/reload/?uenc=aHR0cHM6Ly9sb2NhbC5zaGlub2xhLmNvbS9jaGVja291dC8%2C&_=1718128215560 [] []
[2024-06-11T17:50:24.932934+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /reclaim/checkout/email [] []
[2024-06-11T17:50:26.557505+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /customer/section/load/?sections=messages%2Clocation&force_new_section_timestamp=true&_=1718128215561 [] []
[2024-06-11T17:50:26.723169+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/customers/isEmailAvailable [] []
[2024-06-11T17:50:35.854990+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/estimate-shipping-methods [] []
[2024-06-11T17:50:35.854846+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/totals-information [] []
[2024-06-11T17:50:38.882791+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /amasty_promo/popup/reload/?uenc=aHR0cHM6Ly9sb2NhbC5zaGlub2xhLmNvbS9jaGVja291dC8%2C&_=1718128215562 [] []
[2024-06-11T17:50:39.081188+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/totals-information [] []
[2024-06-11T17:50:42.093900+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/totals-information [] []
[2024-06-11T17:50:42.093969+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/totals-information [] []
[2024-06-11T17:50:44.664860+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/totals-information [] []
[2024-06-11T17:50:44.664870+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/internal/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/retrieve-adyen-payment-methods [] []
[2024-06-11T17:50:44.664867+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/totals-information [] []
[2024-06-11T17:50:44.665751+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/shipping-information [] []
[2024-06-11T17:50:48.366334+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /amasty_promo/popup/reload/?uenc=aHR0cHM6Ly9sb2NhbC5zaGlub2xhLmNvbS9jaGVja291dC8%2C&_=1718128215563 [] []
[2024-06-11T17:50:50.308453+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /customer/section/load/?sections=messages%2Clocation&force_new_section_timestamp=true&_=1718128215564 [] []
[2024-06-11T17:50:53.869133+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /amazon_pay/checkout/config?_=1718128215566 [] []
[2024-06-11T17:50:53.869097+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /customer/section/load/?sections=messages%2Clocation&force_new_section_timestamp=true&_=1718128215565 [] []
[2024-06-11T17:50:54.177965+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/affirm/checkout/verify [] []
[2024-06-11T17:50:55.737571+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /customer/section/load/?sections=messages%2Clocation&force_new_section_timestamp=true&_=1718128215567 [] []
[2024-06-11T17:51:06.361389+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/set-payment-information [] []
[2024-06-11T17:51:06.374592+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/totals-information [] []
[2024-06-11T17:51:08.573159+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /captcha/refresh/ [] []
[2024-06-11T17:51:08.573166+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /customer/section/load/?sections=messages%2Clocation&force_new_section_timestamp=true&_=1718128215568 [] []
[2024-06-11T17:51:08.886128+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/totals?_=1718128215569 [] []
[2024-06-11T17:51:08.915024+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /amasty_promo/popup/reload/?uenc=aHR0cHM6Ly9sb2NhbC5zaGlub2xhLmNvbS9jaGVja291dC8%2C&_=1718128215570 [] []
[2024-06-11T17:51:13.269034+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/payment-information [] []
[2024-06-11T17:51:18.561105+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /captcha/refresh/ [] []
[2024-06-11T17:51:18.561144+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /customer/section/load/?sections=cart%2Ccaptcha%2Cammessages%2Cmessages%2Clocation&force_new_section_timestamp=true&_=1718128215571 [] []
[2024-06-11T17:51:18.561269+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /checkout/onepage/success/ [] []

@colinmollenhour
Copy link

@denniskopitz Looks like a great step forward, thanks for sharing. Why does /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/totals-information need to lock the session? Surely REST API endpoints don't write messages to the session and just return them in the response body instead? Asking because I genuinely don't know..

@indykoning
Copy link

Wondering if other's could share the request URIs they are targeting. So far we have identified these:

'/(amasty_promo|estimate-shipping-methods|amazon_pay|group-prices|customer\/section\/load)/'

I've mainly been focussing on the backend since that caused the most problems for our clients for that i had:
#^/[^/]+/(catalog/category/categoriesJson|cms/wysiwyg_images/thumbnail|admin/widget_instance|\.thumbs[a-zA-Z]+)/#

Surely REST API endpoints don't write messages to the session and just return them in the response body instead? Asking because I genuinely don't know..

I don't think it was Magento framework code but a plugin, but i can definitely remember finding some messages being set on the session in some rest endpoints (after wondering why the hell our headless frontend wasn't showing these messages)

@denniskopitz
Copy link

@denniskopitz Looks like a great step forward, thanks for sharing. Why does /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/totals-information need to lock the session? Surely REST API endpoints don't write messages to the session and just return them in the response body instead? Asking because I genuinely don't know..

Apologies for the delayed response @colinmollenhour. When testing, we found that setting the totals-information to the readOnly list caused the checkout success page to fail. Rather than spend the time to figure out root cause of that, we just left it out.

@denniskopitz
Copy link

@indykoning Did your effort improve performance on the admin wysiwyg editor? We have also experienced that issue.

@vadim4err
Copy link

vadim4err commented Sep 9, 2024

We have the same issue with Cm\RedisSession\Handler::read 30sec
API request to /V1/carts/mine/totals-information
I've found that the customer's quote has product qty 871 with 251 items.
But, unfortunately, I can't reproduce the issue.

Here is a Redis session config, I've also set debug log level to catch the issue, maybe this can shed light

        'timeout' => '2.5',
        'persistent_identifier' => '',
        'compression_threshold' => '2048',
        'compression_library' => 'gzip',
        'log_level' => '7',
        'max_concurrency' => '25',
        'break_after_frontend' => '5',
        'break_after_adminhtml' => '30',
        'first_lifetime' => '600',
        'bot_first_lifetime' => '60',
        'bot_lifetime' => '7200',
        'disable_locking' => '0',
        'min_lifetime' => '60',
        'max_lifetime' => '2592000'

@colinmollenhour
Copy link

I'm not familiar with Magento 2 checkout - does it send one request per quote item or something like that? Also if any other requests had a fatal error and the session lock was not released that could explain it. E.g. perhaps there is an out of memory error on the request just before the totals-information one.

@AndresInSpace
Copy link

AndresInSpace commented Sep 9, 2024

I'm not familiar with Magento 2 checkout - does it send one request per quote item or something like that? Also if any other requests had a fatal error and the session lock was not released that could explain it. E.g. perhaps there is an out of memory error on the request just before the totals-information one.

The vanilla m2 one step checkout sends everything via ajax to update quote as you enter information, then shipping options, then payment accepted redirects to success page.

We utilize Bolt payment which has to communicate to Bolt servers then updates quote back in Magento via API which is why I'm reading this issue.

@colinmollenhour can you clarify, if there is a fatal error anywhere in Magento 2 process then it does not release the lock for session?

@Yonn-Trimoreau
Copy link

@AndresInSpace there is a "catch all" exception management in place. If any exception occurs, the session is closed cleanly. So the lock is released, don't worry about it.

@colinmollenhour some situations can lead to having at least 3 Ajax requests sent in parallel in the normal checkout (not one per item, but each request can be time-consuming, especially if the cart contains a lot of items)

And we've already discussed this, the 3 definitive solutions are:

  • Magento stops using session in the checkout APIs (which is somewhat difficult, it would imply implementing some kind of JWT token or similar in the front-office instead of relying on the session auth, and some ACLs on checkout-related entities)
  • Colinmollenhour's library implements a real entry-level write lock mechanism (instead of the actual read lock logic)
  • Magento implements a real write lock mechanism on top of colinmollenhour's library

@AndresInSpace you should look into using the patches I have provided above in this discussion. They are presently in place on 5 sites and they fix exactly the kind of issue you are describing. They are however not adapted to the latest version of colinmollenhour's library so I can send you an updated version if you want (which a colleague of mine has made compatible with the latest version).

@colinmollenhour just to clarify things up: I understand why it would be problematic to do this in your library, as this would be a major breaking change for all your non-Magento users, and I don't think you should change your mind because of Magento's current issue.
But still, this would be the cleanest and most elegant solution.

@AndresInSpace
Copy link

AndresInSpace commented Sep 12, 2024

Edit: Please ignore my misunderstanding here :)

What I was trying to say (and figure out) is that it seems this is not specific to just using redis for sessions caching(eg magento session), but also when utilized for default caches or full page cache, as this is an issue with redis and the process that opened the request (eg the 'redis session') right? I realize the module is specifically the php redis session, but I am wondering if the other redis module/configurations exacerbate this issue since despite having multiple instances for the default/fpc cache configuration those caches are always being hammered. Would that have a byproduct effect on the redis session cache(due to the single process nature of redis and reaching memory limits in environment and having 50 concurrent users sending requests.., etc) and possibly cause forced breaks?..

What I would like to figure out is if @mttjohnson or anyone in the thread (who all have done amazing debugging work, apologies for all my edits here and the confusion) can confirm they ran their tests on a storefront with redis configured for all (frontend default caches, fpc, and mage sessions) or was the storefront only having mage session configured to redis? Would be interested to see if redis for all three affects the queue timing differently, vs say redis session and varnish for fpc.

@Yonn-Trimoreau
Copy link

@AndresInSpace I think you are going out of road with this reasoning.
We are talking about Redis session locks. Session locks are specific to sessions and are not used in fpc or other cache types.
Furthermore, you are supposed to use a different Redis database for each cache type. No locking conflict can happen.
And Redis is a database that is built to withstand a huge number of requests in parallel, if you were to have a response time in the order of seconds, your Redis instance would be already completely down I think? Maybe I'm quite wrong on this, but the performance of Redis is not to be discussed and it does not respond slowly because of being hammered down in a Magento context.

@AndresInSpace
Copy link

AndresInSpace commented Sep 13, 2024

My apologies everyone I had a misunderstanding, did not mean to branch the subject.
Thank you for the clarification @Yonn-Trimoreau. I will look into the suggested patches/fixes for our project with this issue.

@vadim4err IIRC the totals-information triggers a calculate method to collect totals and is expensive since that is where tax loops through all items to calculate accordingly based on row eg line item (I believe this was context @colinmollenhour was asking earlier)
https://github.com/magento/magento2/blob/2.4-develop/app/code/Magento/Checkout/Model/TotalsInformationManagement.php

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: APIs Component: Framework/Session Component: GraphQL GraphQL Issue: Confirmed Gate 3 Passed. Manual verification of the issue completed. Issue is confirmed Priority: P1 Once P0 defects have been fixed, a defect having this priority is the next candidate for fixing. Progress: ready for dev Reproduced on 2.4.x The issue has been reproduced on latest 2.4-develop branch Severity: S0 A problem that is blocking the ability to work. An immediate fix is needed.
Projects
Status: Ready for Development
Development

No branches or pull requests