[bug] CACHE_MANAGER BSOD on Windows 10 #1061
This is a bug in a kernel mode component (either a driver or the kernel itself), hence the BSOD instead of a "SumatraPDF has stopped working..." message like you'd see with a regular application crash. Most likely some behaviour of SumatraPDF (possibly in combination with your specific hardware) is exposing this bug. I can't reproduce the BSOD here, but if you could upload a crash dump file (the screen says it's generating one, normally it should be at |
I'll get back to you guys tomorrow. The issue is reproducible with any valid PDF file on the systems I encounter it on. I'll try the pre-release tomorrow as well. |
@philiparvidsson |
@GitHubRulesOK The Windows SDK debugging tools are indeed the best way to diagnose a crash, but the correct approach depends greatly on whether the crash occurs in user or kernel mode. The advice in your link is excellent if you have a problem where SumatraPDF crashes. However, in this case it is the system that is crashing. A user mode application cannot normally cause that by itself; it can merely expose or exacerbate an existing bug in kernel code. More briefly stated, the following step
is slightly over-optimistic for this case 😄 Rather, what will happen is that Sumatra will appear to run for a short time, until the blue screen comes in and it is game over. At this point you can forget about saving a process dump or running Diagnosing a kernel mode crash is in fact very similar to the above, except that you will need (ideally) a second machine to debug the victim machine that is going to crash. However this is not practical to set up for the average user, and a kernel developer can often find enough context to isolate the cause from a crash dump alone (this is 'post mortem debugging'). This is why crash dump files are so useful - if you are met with a bugcheck, you're going to have to reboot regardless, so being able to step through the code wouldn't have been a lot of help anyway. @philiparvidsson I've just discovered that it is possible to regain the good old bugcheck parameters that made the whole damn screen useful in the first place. So if you can't obtain a crash dump, try simply importing this as a .reg file and rebooting:
Your next BSOD should then look something like this: From parameter 1 it is possible to tell that the exact security violation was 'Stack cookie instrumentation code detected a stack-based buffer overrun'. Parameter 2 gives the address of the trap frame for the bugcheck-causing exception, and parameter 3 gives the exception record address for same (viewable with So, this 'number stuff for nerds' is actually pretty damn useful, because it allows developers to actually fix bugs! Too bad Microsoft decided that giant smiley faces are more important. |
So, some results:
I'm sticking with the pre-release for now. Do you guys still want a crash dump? (Also some amazing knowledge sharing re. kernel/user mode and debugging here - big thanks for that!) |
Thanks for the detailed steps. I tried the same on both a live machine and a VirtualBox VM with Windows 10 x64. They are Enterprise editions, not Pro, but I really doubt that matters. I didn't really expect to be able to reproduce the issue to be honest, it seems far too hardware and/or device driver specific for that. And yes, I would really like to see a crash dump of this if it's not too much trouble for you to make one, because this is an interesting case as one of those things that 'shouldn't happen' (TM). Crash dumps can be quite big, but they compress very well due to being sparse. The best method is to reboot (after making sure the crash dump settings are correct as described above), produce the BSOD directly after startup, reboot again and then 7zip the |
Ok, I configured my system according to your instructions - below is a kernel mem dump. BSODParameter values
MEMORY.DMPLink (~2GB, compressed to ~150MB): http://philiparvidsson.com/pub/memory.dmp.zip |
@philiparvidsson There's about 0% chance that this is caused by Sumatra. The contract between the OS and user-mode applications (like Sumatra) is such that a user mode app cannot crash the OS. It's caused by a bad driver. Based on Google research, it's most likely related to a storage device. The way to fix it is to uninstall or update the driver. See e.g.
The screenshots don't provide much information. You can probably find out more by looking at Event Viewer (https://www.reviversoft.com/blog/2013/12/how-to-find-out-the-cause-of-your-bsod/) or by using http://www.nirsoft.net/utils/blue_screen_view.html. You want to find out which driver caused the crash (usually the file with Or google for that name + "bsod" or "windows 10 crash". Chances are there were other people with the same driver causing the same crash and they figured out how to fix it. Let us know if you manage to solve it, but I'll close this bug soon because there's no way this is caused by Sumatra. |
@kjk have a look at my conversation with @Mattiwatti - he might be able to do something with the mem dump. Re. storage - one of the two computers has a Samsung 850 EVO, the other one (from today) has a Samsung 860 EVO. So that might be it, I guess. shrugs I don't have any attached devices (USB storage etc) and PDF was hence read from my hard drive. Both computers only have a single drive in them. They don't share mobo (this one has an Asus Z270F mobo, no idea what the other one has) but I guess it's possible they're sharing some storage controller chip. Also, I only encounter this with SumatraPDF (not meaning to put blame on SumatraPDF here, but it's interesting to investigate what problem its behavior is exposing!) Oh and also, the pre-release version is not causing the BSOD, so it's somehow related to what SumatraPDF is doing - not completely unrelated. |
I ran the analyze on that .dmp: https://gist.github.com/kjk/fbc5ad95b396eaa8dd0161fb355e73a7 The process that actually crashed is dropbox and the crash seems to happen in file reading code. But that's all I can see there. You might try:
|
@kjk, good catch! If I close Dropbox, there's no crash when opening a PDF.
I recently read that Dropbox is injecting code into other processes, so that might be related. See this for some info: https://www.ghacks.net/2018/08/20/about-google-chromes-incompatible-applications-warning/ (TL;DR: Chrome has started issuing warnings on applications that are injecting code into it - Dropbox is one of them. Dropbox basically seems to be injecting code everywhere.) I don't want to turn this into some political discourse, but it bothers me to no end that Dropbox does this (despite pre-release version working fine). I don't know if there's any chance of getting hold of the Dropbox devs re. this. I'm a looong time Dropbox user, and Dropbox is becoming more and more of a monster. Either way, I'm still curious what it is that's actually causing the BSOD with SumatraPDF in particular. |
That looks like a serious bug in Dropbox. I believe they actually do have a kernel driver that might cause a BSOD. You could report this to them, pointing to this discussion, maybe via https://www.dropboxforum.com/t5/Desktop-client-builds/bd-p/101003016 I don't think it's particular to Sumatra. Sumatra seems to trigger that bug in Dropbox, but that doesn't mean there are no other ways to trigger it, just that we don't know them. The issue here seems related to reading the same file a second time. Per your information, this happens when you re-open the same PDF file. I imagine other programs with a similar pattern might trigger the same bug, for example opening the same text file twice in Notepad or opening the same image file twice in an image viewer. I'm closing this bug as there's nothing I can do in Sumatra to fix it. |
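The access pattern described above (open a file, read it, then open and read the same file again) can be sketched in portable C. This is only an illustration of the pattern: the function names are hypothetical, the thread does not establish which file APIs SumatraPDF actually uses, and nothing here is confirmed to trigger the bugcheck.

```c
#include <stdio.h>

/* Read a file from start to end once.
 * Returns the number of bytes read, or -1 on error. */
long read_whole_file(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    char buf[4096];
    long total = 0;
    size_t n;
    while ((n = fread(buf, 1, sizeof buf, f)) > 0)
        total += (long)n;

    int failed = ferror(f);
    fclose(f);
    return failed ? -1 : total;
}

/* Open and read the same file twice in a row (hypothetical helper).
 * Returns 0 if both passes succeed and see the same size, -1 otherwise. */
int read_same_file_twice(const char *path)
{
    long first = read_whole_file(path);
    long second = read_whole_file(path);
    return (first >= 0 && first == second) ? 0 : -1;
}
```

On a healthy system the second pass is normally served from the file cache, which is why a cache manager fault could plausibly show up only on re-reads.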
Haha, yes, I was just writing a reply to blame Dropbox. It is the current process at the time of the crash. There seem to be some strange issues with the dump file that I have never seen before:
This is a bit frustrating because I am looking for evidence that Dropbox is messing up (meaning in kernel mode, because they have recently started deploying kernel drivers(!) via their obnoxious constant updates). However, the stacktrace does not indicate any third party driver interference:
Furthermore no Dropbox driver (that I'm aware of) can be found in your loaded module list, although you do have a lot of crap in it:
These are the 'wtf is this' drivers that I found off hand. Note that Null.sys is NOT a third party driver and should checksum correctly and have debug symbols loaded. But this driver is located in one of the pages that could not fit in the dump file, so I can't say whether this is a real issue or just a hiccup with the crash dump. Unrelated, but you should nuke those Samsung Rapid Mode drivers ASAP. There are numerous issues with those drivers, and worst of all, they even degrade performance compared to not having Rapid Mode enabled (naturally, since the whole idea of it is to poorly reimplement a cache that already exists in the kernel, which does a fine job on its own).
Because the filesystem is not FAT at all but NTFS, this is a bogus analysis, or a lazy bugcheck by a developer who should have used a different bug code. The exact bugcheck call you are seeing is this:
BOOLEAN NTAPI CcCopyReadEx(PFILE_OBJECT FileObject, PLARGE_INTEGER FileOffset, ULONG Length, BOOLEAN Wait, PVOID Buffer, PIO_STATUS_BLOCK IoStatus, PETHREAD IoIssuerThread)
{
    // ...
    if (Length + FileOffset->QuadPart > FileObject->SectionObjectPointer->SharedCacheMap->FileSize.QuadPart)
        KeBugCheckEx(CACHE_MANAGER, 0x273, 0xFFFFFFFFC0000420, 0UL, 0UL);
}
For CC (cache manager) bugchecks, the first parameter (here Interestingly this bugcheck was not always here; older versions of Windows simply truncate the filesize if this happens. Now to find out if and how I can reproduce this so I can crash people's Windows 10 machines 😄 (Edit: updated code snippet using the full |
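As a rough illustration of the snippet above, the following user-mode C sketch models the range check with plain integers and shows how the sign-extended second bugcheck parameter maps back to a 32-bit NTSTATUS. It is not kernel code and all function names are hypothetical. The split of the first parameter into a source file ID and a line number is an assumption taken from Microsoft's description of bug check 0x34, not something stated in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the check: does a read of `length` bytes
 * starting at `offset` run past the end of a file of `file_size` bytes?
 * (Overflow of offset + length is ignored in this sketch.) */
bool read_exceeds_file_size(int64_t offset, uint32_t length, int64_t file_size)
{
    return offset + (int64_t)length > file_size;
}

/* Bugcheck parameters are pointer-sized, so a 32-bit NTSTATUS shows up
 * sign-extended to 64 bits. Keeping the low 32 bits recovers it. */
uint32_t ntstatus_from_bugcheck_param(uint64_t param)
{
    return (uint32_t)(param & 0xFFFFFFFFu);
}

/* Assumption: for CACHE_MANAGER, parameter 1 packs a source file ID in
 * the high 16 bits and a source line number in the low 16 bits. */
uint16_t cc_source_file_id(uint32_t param1)
{
    return (uint16_t)(param1 >> 16);
}

uint16_t cc_source_line(uint32_t param1)
{
    return (uint16_t)(param1 & 0xFFFFu);
}
```

With the values from the snippet, `ntstatus_from_bugcheck_param(0xFFFFFFFFC0000420)` gives `0xC0000420`, the NTSTATUS code for STATUS_ASSERTION_FAILURE, and `read_exceeds_file_size` returns true whenever the requested range ends beyond the cached file size.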
Wow, amazing job @Mattiwatti - genuinely impressive. I'll try disabling the RAPID mode drivers and see what happens! Brb! Would a complete memory dump be more useful for you (would prefer sending it over a private channel, e.g., link via mail)? Do you need a new dump? |
No go, still crashing with RAPID mode off! 👎 |
Sure, I will take a look at a complete dump if you're willing to send me one. However it's possible that no further useful info can be found in it (depending on what exactly Dropbox is doing, from which process, and when). Keep in mind that you'll probably need to increase the size of your pagefile to get such a dump (to, say, your total RAM + 1 GB). I do agree with @kjk (and my own first post... heh) that this issue is unrelated to Sumatra, so I think it would be better to continue this discussion via email. My address is my github username@gmail.com. |
Kudos to @philiparvidsson for perseverance and @Mattiwatti for expert insights. @kjk, it may be worth updating the wiki debug page to point to the moved debugger link I found above, and also linking to this issue as a typical example of tracking down a BSOD crash. |
In case Dropbox devs arrive here (I've forwarded the issue to their sec team which in turn forwarded it to their desktop client team): @Mattiwatti and I are delving deeper into this, feel free to contact me (and I assume @Mattiwatti as well) if you need more information. |
Hello everyone! I apologize for the (almost three years) late off-topic, but despite not being a SumatraPDF user, I have encountered a similar issue as well: an out-of-the-blue CACHE_MANAGER bugcheck. Having Googled about this issue, I am now here. I was hoping you guys might have some insights, as @philiparvidsson said that he and @Mattiwatti were diving deeper into the issue. The NTSTATUS I see is the same: STATUS_ASSERTION_FAILURE. Yet, the call stack is different: nt!KeBugCheckEx The failing process is System, which tells me that this was either directly or indirectly triggered by ring0 code. Perhaps you guys could shed some light? Thanks and apologies again for the off-topic! |
Description
On 2 of 3 Windows 10 machines, I receive a BSOD when I open PDF files with SumatraPDF. It is 100% reproducible on the two machines, but does not happen at all on the third (a Dell XPS laptop). The other two have nothing in common except that they both have Intel i7 CPUs and NVIDIA GeForce GPUs and that they are both stationary with two monitors connected.
Software
Version: SumatraPDF x64 3.1.2 (portable)
OS: Windows 10 Pro x64 (1803)
No AV-software on any of the machines.
Steps to reproduce
Expected result
PDF is displayed on screen.
Actual result
Windows 10 BSODs with a "CACHE_MANAGER" related issue (no more information than that is provided on the BSOD).
Attachments