Handle Database has a global failure error during replication #5348
To clarify -- this quote refers to the changes from #5045, which prompt the user to reload the page on an unhandled rejection. The exception in this issue is:
@SCdF As the resident IDB and purging expert, does this ring any bells?
Originally I thought it might be the view cleanup that happens after purge, but looking at the whole log there is a lot more going on. A slightly more complete look at that log is:
The key numbers here are:
I can't get these to line up perfectly, but they get suspiciously close to each other. @newtewt can you confirm that for this user the purging actually occurred? That is to say, were the documents no longer accessible? My hunch is that IndexedDB has broken either before purge has run, or because we are doing an all_docs at the start and idb is "not ready" somehow, and so none of the purges were actually successful. In terms of the actual error, it's hard to track down. Apart from one specific usage that we aren't going to get to, it's just used whenever a transaction fails. One option is that we could inject failures in pdb's code and narrow it down to only the places where the promise chain doesn't catch.
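For anyone picking this up later, the fault injection being suggested could look something like this rough sketch; the database name and the patched method (`allDocs`) are illustrative choices, not from this issue:

```js
// Hypothetical fault-injection sketch: make one PouchDB call reject once,
// then watch the console to see whether the failure is caught and surfaced,
// or silently breaks the rest of the promise chain.
const db = new PouchDB('medic-user-filysa'); // illustrative name
const realAllDocs = db.allDocs.bind(db);
let injected = false;

db.allDocs = function (...args) {
  if (!injected) {
    injected = true;
    return Promise.reject(new Error('injected idb failure'));
  }
  return realAllDocs(...args);
};
```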
I'll confirm today that the docs are unavailable. I was going based on the console log showing that purging completed.
@SCdF purging completes properly. The home_visit docs are not available anymore after refreshing and getting into the app.
This happens consistently for the filysa user on first replication and purging, so it is reproducible. However, I logged in as user madinagu and didn't experience that error. This user has more docs than the filysa user.
Have you tried wiping the |
I'm wiping all the data in the Android app. I would need to try another device to be sure as I can't remember off the top of my head right now.
That's messed up. @SCdF Would you mind investigating this user to see if they have a dodgy doc or purge configuration that's messing them up? The goal is just to get 3.3.0 to the point of go/no-go, so if in your opinion this isn't required then bump it out to 3.5.0 and we'll pick it off the stack when we get to it.
I'm on it (well, once I get creds etc.)
So it took four different devices, but I managed to replicate it on our old friend the Tecno Y4. It's unclear why it happens on that device over the others, though it is the worst device I have. @newtewt can you post a user who does succeed compared to this one, and specifically:
I'm trying to work out if there is some kind of breaking-point ratio. In the meantime, I'm going to pull this user's data down locally and try mucking with the code to see if I can produce any interesting results.
@SCdF Here are the two users I'm using on the WX3 and seeing the error. I tried on my Pixel 2 and Tecno J5 with no errors. Let me know if you want more.
So I'm guessing that we're using a WX3 because Muso people have them deployed?
I believe so. @abbyad mentioned that is what the CHWs are using in Mali.
We already compact the local DB on each write, so this should be unnecessary. It is also causing crashes on some lower-powered devices, presumably because it's holding open some transaction for too long and erroring everything out after that. #5348
So I'm going to remove the compaction (optimise) step. We are already supposedly doing this on every write (and the code seems to exist in PouchDB), and manual compaction is seq-based, so it would run over the whole changes feed again, which is a lot. There is no reason this should cause crashes, but there is also no reason for us to manually compact if we're always compacting when we write. @newtewt this build is available in I'm going to test this locally as well, but you should test it on your phones when you get a chance.
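For reference, the two compaction paths being discussed map onto PouchDB's public API roughly like this; `auto_compaction` and `compact()` are real PouchDB features, but the database name is illustrative:

```js
// Compaction on every write: the behaviour the app already relies on.
const db = new PouchDB('medic-user-filysa', { auto_compaction: true });

// The manual "optimise" step was roughly a one-off call like this,
// which is seq-based and so walks the whole changes feed again:
db.compact()
  .then(() => console.log('manual compaction finished'))
  .catch(err => console.error('manual compaction failed', err));
```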
So far this has been working fine on the purge-upgrade instance. I haven't seen the crash today. I've done initial replication a few times in the browser, on the Tecno WX3, and on the Tecno J5. I'd say let's push a PR, get another 3.3 beta, and give it another once-over.
@SCdF I'll leave it for you to merge and backport accordingly. The diff looks fine - we should probably delete the translations for "optimize" at the same time.
Drop "optimise", the manual compaction step. We don't need to do this as we are compacting on write as we go. Move start. This now happens once we decide we are going to purge, as it's more useful to show this early and have something to show while we pull in and organise documents that need purging. #5348
I made enough changes after @garethbowen's comment that I opened a PR and have linked it to Gareth. @garethbowen leave it to me to merge / forward-port.
Drop "optimise", the manual compaction step. We don't need to do this as we are compacting on write as we go. Move start. This now happens once we decide we are going to purge, as it's more useful to show this early and have something to show while we pull in and organise documents that need purging. #5348
Drop "optimise", the manual compaction step. We don't need to do this as we are compacting on write as we go. Move start. This now happens once we decide we are going to purge, as it's more useful to show this early and have something to show while we pull in and organise documents that need purging. #5348
Drop "optimise", the manual compaction step. We don't need to do this as we are compacting on write as we go. Move start. This now happens once we decide we are going to purge, as it's more useful to show this early and have something to show while we pull in and organise documents that need purging. #5348
@SCdF To get AT unblocked I've ignored your directive and merged the PR. I also forward-ported to
Drop "optimise", the manual compaction step. We don't need to do this as we are compacting on write as we go. Move start. This now happens once we decide we are going to purge, as it's more useful to show this early and have something to show while we pull in and organise documents that need purging. #5348
Coolio, so:
**breathes out**
Replication and purging are occurring as expected. Closing.
When replicating to the Tecno WX3, the IndexedDB database seems to be dying and errors show in the console. Reloading the app finishes the app start process and the app becomes usable.
Steps to reproduce:
Using user filysa on the purge-upgrade-muso-mali instance
Log in to a fresh app install on a Tecno WX3, with chrome://inspect attached to the mobile app
What should happen:
The app should launch without error.
What actually happens:
The app replicates and purges documents, then the 404 error page is shown and the error above appears in the log. We currently have a fix that handles this error when it surfaces as an unhandled rejection, but in this case the error is being handled.
https://medic.slack.com/files/UBHER12S2/FFZ6U3HEJ/indexed_db_went_bad.log
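To illustrate the handled-vs-unhandled distinction described above (the handler shape and the helper names `someDbCall` and `showErrorPage` are hypothetical, not the actual #5045 code):

```js
// A reload prompt wired to unhandled rejections, in the spirit of #5045.
window.addEventListener('unhandledrejection', event => {
  console.error('Unhandled rejection:', event.reason);
  if (window.confirm('Something went wrong. Reload the app?')) {
    window.location.reload();
  }
});

// A chain like this never reaches the listener above, because the
// rejection is caught; showErrorPage() stands in for the 404 page
// behaviour described in this issue.
someDbCall().catch(err => showErrorPage(err));
```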
Environment: