MDB_PROBLEM: Unexpected problem - txn should abort #153
Comments
I don't think that I can tell what is occurring from this information alone, but I have added some additional details to the error messages for when an MDB_PROBLEM occurs, to better determine the cause (it is possibly related to #41, but I am not sure). I published this additional error information in the v2.3.0-beta3 release if you want to try that out. |
Ok, thanks! I'll tell users to use this version until it's in a stable version then (when they can more reliably reproduce this). |
Have you ever seen this error again (with the more detailed error message)? Still would be interested in knowing what caused this. |
We had to pin lmdb (gatsbyjs/gatsby#35397) so our own tests didn't run this version yet. (cc @pieh) |
That issue (#156) has been fixed though, so you can use latest now, right? |
We need to update https://github.com/gatsbyjs/gatsby/blob/master/packages/gatsby/src/schema/graphql-engine/lmdb-bundling-patch.ts first for the new binary loading -- but that's on us. |
Ok, no rush, was just curious. |
@LekoArts actually looking at the patch module you are using... would you like me to use more statically analyzable code like
Or do you think that would just cause more errors/confusion with vercel (the |
Hey @kriszyp - about the hacky patch we are using: ideally we wouldn't need a patch like that. I made some attempts at modifying the lmdb code earlier, hoping I could submit a pull request, but frankly I didn't like what I was doing there, and that's when I went with the "hacky patch" route.
https://github.com/DoctorEvidence/lmdb-js/blob/4d6371dc6faaa69f2a1a43d980ace0f328c89106/package.json#L22-L32 The binaries loading part also shared that, but I think there were more problems there which I just don't remember (it was a while ago). Other than above, with
For In general - non-web-frontend npm packages have varying levels of "support" for being bundled and it's hard to expect them to work out-of-the-box for that, so I just didn't want to waste your time on our niche use case for |
Ok, thanks for the explanation! And yeah hopefully the node-gyp-build PR merges soon. And again, no rush, was just checking. |
This is the error message I get with the 2.3.0 beta Error message
This is the source code we're hitting: https://github.com/gatsbyjs/gatsby/blob/9b25267009f318949705e2e1faf7af859b0b668a/packages/gatsby/src/schema/types/date.ts#L224-L259
Let me know if this helps or not at all. We're basically checking whether the element is in the cache and, if it is not, setting it. |
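For readers following along, the "check the cache, set it if missing" pattern described above can be sketched roughly as below. This is only an illustration, not Gatsby's actual code: `KeyValueStore`, `MapStore`, `getOrCompute`, and `looksLikeDate` are hypothetical names, and an in-memory `Map` stands in for the LMDB-backed cache.

```typescript
// Hypothetical minimal store interface (an assumption, not Gatsby's or lmdb-js's API).
interface KeyValueStore<V> {
  get(key: string): V | undefined;
  set(key: string, value: V): void;
}

// In-memory stand-in for the real cache, used only to keep the sketch runnable.
class MapStore<V> implements KeyValueStore<V> {
  private readonly entries = new Map<string, V>();
  get(key: string): V | undefined {
    return this.entries.get(key);
  }
  set(key: string, value: V): void {
    this.entries.set(key, value);
  }
}

// Return the cached value if present; otherwise compute it, store it, and return it.
function getOrCompute<V>(store: KeyValueStore<V>, key: string, compute: () => V): V {
  const cached = store.get(key);
  // Compare against undefined so that cached falsy values (such as `false`) still count as hits.
  if (cached !== undefined) return cached;
  const value = compute();
  store.set(key, value);
  return value;
}

// Usage: the expensive check runs once per key, later lookups come from the store.
let computeCalls = 0;
const store = new MapStore<boolean>();
const looksLikeDate = (input: string): boolean => {
  computeCalls++;
  return !Number.isNaN(Date.parse(input));
};
const first = getOrCompute(store, "2022-05-20", () => looksLikeDate("2022-05-20"));
const second = getOrCompute(store, "2022-05-20", () => looksLikeDate("2022-05-20"));
```

Even a read-mostly pattern like this issues a write on every cache miss, so it still goes through LMDB's write path when backed by lmdb.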
That helps, that definitely is not where I had been expecting an I am investigating whether this assertion/error really is necessary and whether the state is detrimental. It may be possible to simply turn off this check or make it a warning (which basically allows unspilling even if a page can't be found in the spilled list), but I will do some checking of the effects. Of course, the potential problem is that this could allow a bad state where invalid data is returned or corruption takes place. Also FYI, I have not removed these more detailed messages; you can use the latest lmdb version (2.3.5 right now), and the extra error messages should still be there. |
I'll see if I can get you a repro. We're not using childTransactions and I don't even think we're using any transactions ourselves. |
Well, everything in LMDB is in a transaction; it is just implicit/automatic if there is no explicit transaction. Usually there is one transaction per event turn, unless there is a lot of heavy writing, in which case more gets batched into the current transaction. |
Ok, v2.3.6 should have the assertion switched to just being a warning (with a little more info), so the application can proceed and we can see if it really is a bad state with detrimental outcomes, or can safely be ignored. |
|
We're using Gatsby cloud for 5 staging sites and seeing this issue pop up multiple times a day. Per Gatsby support team's suggestion, I updated our gatsby version to 4.13.1 yesterday and 3 of the 5 sites failed with this error on that PR build. Please let me know how we can best support you all in finding a solution! https://www.gatsbyjs.com/dashboard/f2f3e6f9-9b33-4fb6-92b2-f1d7711d4e15/sites/f3997d6e-da36-4555-8cba-e424b040cbe3/builds/5a3864d0-73d6-4348-9a1e-0e696ef0ed26/details#rawLogs |
@lauraturk FYI, we believe this may be related or the same issue #164, which also has plenty of discussion of our efforts to track this down. Certainly if you are ever able to come up with a reproducible test case that I could run locally, we would be most grateful (I've never been able to reproduce the issue in this ticket, so most of the discussion has been theorizing about ways that it might possibly occur). |
Hi @kriszyp , unfortunately, I am not able to create a reproducible test case for you to run locally. It's very random across our 5 gatsby sites. Per Gatsby support I upgraded to gatsby@next today, and only one site failed. Hopefully this log is helpful? https://www.gatsbyjs.com/dashboard/f2f3e6f9-9b33-4fb6-92b2-f1d7711d4e15/sites/da690a16-3284-48d1-8e0f-4545d6169068/builds/cb4bfd92-c8a9-4d73-9ff6-b96b3d94ffe1/details#rawLogs |
He can't access your build logs. Could you copy in the relevant lines?
|
@lauraturk I would still love to see what logs you got for that error, if you can paste them in? |
Thanks for your patience, here it is:
|
Thanks for the logs, this is helpful. This is a db corruption error, which is basically impossible to occur in typical usage. There are only a few things that I know of that can lead to this:
It is clear from these errors that there are multiple child processes (rather than just threads). This is fine to do with LMDB, but it is a helpful insight. From this comment #164 (comment) it sounds like that error might be specific to the cloud configuration where it occurs (GCP Kubernetes). I wonder if perhaps this configuration uses a remote/network drive/storage that isn't playing nicely with memory maps (there are known problems with some remote file systems with LMDB)? I don't know enough about the gatsby infrastructure to know whether this is the same platform where the error in this last message occurred. |
…an break the entire locking scheme, parcel-bundler/parcel#8165, #153, #164
I discovered and fixed an issue with file locking (causing locks to break). The fix is in v2.4.5 and might possibly address this issue and #164. |
The CircleCI tests in gatsbyjs/gatsby#35724 seem to consistently fail on I can SSH into that pod and get you any info you need :) |
… necessary and still identified, #153
Sorry, this is kind of trial and error. I believe this latest error is probably because the fix to better ensure database file identification was incorrectly handling the case of initial creation of the database (was doing the stat check before it was created). Published a fix for this in v2.5.1, if you can try it out when you get a chance. |
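The ordering problem described above (stat-ing the database file before it has been created) can be illustrated with a small sketch. This is not the lmdb-js implementation, which lives in native code; `FileIdentity`, `statIdentity`, and `createThenIdentify` are hypothetical names, and using the device and inode numbers as the file's identity is an assumption made for illustration.

```typescript
import { closeSync, mkdtempSync, openSync, statSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical identity record: device plus inode (an assumption for this sketch).
interface FileIdentity {
  dev: number;
  ino: number;
}

// Returns the file's identity, or undefined when the file does not exist yet.
// A stat performed before initial creation ends up in the undefined branch.
function statIdentity(path: string): FileIdentity | undefined {
  const stats = statSync(path, { throwIfNoEntry: false });
  return stats ? { dev: stats.dev, ino: stats.ino } : undefined;
}

// Create the file first (if needed), and only then identify it.
function createThenIdentify(path: string): FileIdentity {
  // Flag "a" creates the file when it is missing and never truncates an existing one.
  closeSync(openSync(path, "a"));
  const identity = statIdentity(path);
  if (!identity) throw new Error(`file disappeared after creation: ${path}`);
  return identity;
}

// Usage: a fresh path has no identity until the file has been created.
const dbPath = join(mkdtempSync(join(tmpdir(), "identity-sketch-")), "data.mdb");
const beforeCreation = statIdentity(dbPath);
const created = createThenIdentify(dbPath);
const reopened = createThenIdentify(dbPath);
```

Here `beforeCreation` is undefined, while `created` and `reopened` refer to the same file, which is the case an identity check has to handle on first open.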
I was OOO for a week, but I'll update the PR to 2.5.2 now and let you know how it goes :) |
BTW, I would also be interested in adding an integration test of gatsby using lmdb-js. If you have any pointers/suggestion of a good test that would efficiently test gatsby's usage of lmdb, would love to add that. |
Hi Kris!
Every now and then a user reports this error. Or our CI or our testing itself runs into it:
User error:
Our CI run (https://app.circleci.com/pipelines/github/gatsbyjs/gatsby/81406/workflows/46bdcdf9-85eb-444a-aab1-b92845954af3/jobs/964466) failed with it intermittently:
I'm not too well versed with all the moving pieces yet so I'll ask my colleague to maybe shed some more light on this, but wanted to report it nevertheless so that a) I don't forget it and b) we can get behind this :)
Files from the stacktrace:
Thanks for your work!