Under load, the cache can fail to read a key it has just written #9121

Closed
marcins opened this issue Jul 5, 2023 · 1 comment · Fixed by #9123

marcins commented Jul 5, 2023

🐛 bug report

Sometimes Parcel will throw an error when writing bundles because it can't read the content from the cache that was produced during Packaging.

😯 Current Behavior

We see this stack trace:

  Error: Key 2fbe5faff60c9359 not found in cache
      at LMDBCache.getBlob (<ROOT>/node_modules/@atlassian/parcel-cache/lib/LMDBCache.js:134:70)
      at Object.run (<ROOT>/node_modules/@atlassian/parcel-core/lib/requests/WriteBundleRequest.js:215:68)
      at RequestTracker.runRequest (<ROOT>/node_modules/@atlassian/parcel-core/lib/RequestTracker.js:749:20)

🔦 Context

Packaging runs in workers, with many workers concurrently writing package contents to the cache. Bundle writing also happens in workers, but potentially in a different worker. At least that's my understanding.

I've tried bumping LMDB to 2.8.2 (via resolutions) but the issue still occurs.

💻 Code Sample

I've managed to reproduce this issue in a standalone test case using Parcel's packages (workers, cache, etc): https://github.com/marcins/lmdb-testbed

Running node index.js will cause the bug to manifest. Lowering the concurrency causes the bug to either go away, or take longer to show up.

Reading the key again after even a setTimeout(..., 0) causes the key to be read correctly.
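
To illustrate that observation, here is a minimal sketch of a retry that yields to the event loop before reading again. Everything in it is hypothetical: `getBlobWithRetry`, `nextEventTurn` and `StaleUntilNextTurnCache` are names made up for this example, and the fake cache only mimics the symptom (reads lagging behind writes until the next turn). It is not Parcel's LMDBCache and not a proposed fix.

```javascript
// Resolve on a later event-loop turn, like the setTimeout(..., 0) mentioned above.
function nextEventTurn() {
  return new Promise(resolve => setTimeout(resolve, 0));
}

// Hypothetical helper: if the read throws, yield to the event loop and try again.
// `cache` is assumed to expose an async getBlob(key) that throws on a missing key.
async function getBlobWithRetry(cache, key, retries = 1) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await cache.getBlob(key);
    } catch (err) {
      if (attempt >= retries) throw err;
      await nextEventTurn();
    }
  }
}

// Stand-in cache (an assumption for the demo): reads see a snapshot that only
// catches up with writes on a later event-loop turn.
class StaleUntilNextTurnCache {
  constructor() {
    this.committed = new Map(); // everything that has been written
    this.snapshot = new Map();  // what reads currently see
  }
  set(key, value) {
    this.committed.set(key, value);
    setTimeout(() => {
      this.snapshot = new Map(this.committed);
    }, 0);
  }
  async getBlob(key) {
    if (!this.snapshot.has(key)) {
      throw new Error(`Key ${key} not found in cache`);
    }
    return this.snapshot.get(key);
  }
}
```

With this stand-in, an immediate `getBlob` after `set` throws, while `getBlobWithRetry` succeeds on its second attempt. A retry like this only papers over the symptom; it says nothing about why the first read misses.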

🌍 Your Environment

Software          Version(s)
Parcel            2.9.3
Node              18.15.0
npm/Yarn          yarn 1.23
Operating System  macOS Ventura 13.4.1

lettertwo commented Jul 5, 2023

I believe we've identified the root cause:

Parcel is not accounting for LMDB’s read transaction behavior. When performing a read, LMDB maintains short-lived transactions that take a snapshot of the DB.

So the error case occurs in this fashion:

  • Parcel reads from the db in a thread (creating a read transaction)
  • While (roughly) concurrently, Parcel writes to the db in another thread (rendering the first thread’s read transaction snapshot stale)
  • After the write, the writing thread yields the key back to the main thread
  • The thread that previously performed a read may now be tasked with a read for this newly written key
  • If this happens quickly enough that the existing read transaction in that thread has not been reset, the error occurs

Here’s the relevant LMDB docs:

Normally, this library will automatically start a reader transaction for get and range operations, periodically reseting the read transaction on new event turns and after any write transactions are committed, to ensure it is using an up-to-date snapshot of the database. However, you can call resetReadTxn if you need to manually force the read transaction to reset to the latest snapshot/version of the database. In particular, this may be useful running with multiple processes where you need to immediately reset the read transaction based on a known update in another process (rather than waiting for the next event turn).
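
As a rough sketch of what the docs describe, a read that must see another thread's write could reset the read transaction first. `getFresh` is a hypothetical helper name, and `FakeSnapshotStore` is a made-up stand-in that only models the snapshot behavior for writes coming from elsewhere; the only real API assumed here is lmdb-js's `get()` and `resetReadTxn()` on the store handle. This is an illustration of the mechanism, not necessarily what #9123 does.

```javascript
// Hypothetical wrapper: force a fresh snapshot before reading a key that
// another thread or process is known to have just written.
// `store` is assumed to expose lmdb-js style get() and resetReadTxn().
function getFresh(store, key) {
  store.resetReadTxn();
  return store.get(key);
}

// Stand-in store (an assumption for the demo): the first get() opens a
// snapshot, and later get() calls keep using it until it is reset.
// put() here plays the role of a write made by a different thread.
class FakeSnapshotStore {
  constructor() {
    this.latest = new Map(); // current database contents
    this.readTxn = null;     // snapshot used by reads, if one is open
  }
  put(key, value) {
    this.latest.set(key, value);
  }
  get(key) {
    if (this.readTxn === null) {
      this.readTxn = new Map(this.latest);
    }
    return this.readTxn.get(key);
  }
  resetReadTxn() {
    this.readTxn = null;
  }
}

const store = new FakeSnapshotStore();
store.get("warmup");                       // opens a read snapshot
store.put("2fbe5faff60c9359", "contents"); // write from "another thread"
console.log(store.get("2fbe5faff60c9359"));      // undefined (stale snapshot)
console.log(getFresh(store, "2fbe5faff60c9359")); // contents
```

Resetting on every read trades a little overhead for freshness, so in practice one would probably reset only at points where a cross-thread write is known to have happened.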
