x32dbg hangs itself and the debugee #1481

Closed
Maktm opened this Issue Mar 2, 2017 · 28 comments

Maktm commented Mar 2, 2017

Debugger version:
x32dbg v25 (Compiled on: Feb 28 2017, 05:07:48)

Operating system version and Service Pack (including 32 or 64 bits):
Windows 7 x64 Service Pack 1

Brief description of the issue:

When setting a breakpoint on a function (INT3 software breakpoint) and clicking submit in the debuggee to trigger the breakpoint, x32dbg hangs completely. I can't click anything and I can't minimize/maximize it. I can tell all input has stopped because my volume keys stop working.

Elaborate reproduction steps for the bug/issue being reported:

  • Start byond.exe
  • Attach to byond.exe
  • Set a breakpoint on DungPager::Login (exported subroutine)
  • Go back to the debuggee and log in with an account

Image of hang
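For readers unfamiliar with the term, an INT3 software breakpoint is set by overwriting the first byte of the target instruction with the opcode 0xCC and restoring it later. The following is a minimal sketch of that bookkeeping on a plain byte buffer. The names SoftwareBreakpoint, setBreakpoint, removeBreakpoint and addressAfterTrap are hypothetical and are not taken from x64dbg; a real debugger would patch the debuggee's memory instead.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model, not x64dbg code: a software breakpoint on a plain
// byte buffer instead of a real process's memory.
struct SoftwareBreakpoint
{
    std::size_t address;   // offset of the patched byte in the buffer
    std::uint8_t original; // byte that was there before patching
};

constexpr std::uint8_t kInt3 = 0xCC; // x86 opcode for INT3

// Save the original byte and overwrite it with INT3.
SoftwareBreakpoint setBreakpoint(std::vector<std::uint8_t>& code, std::size_t address)
{
    SoftwareBreakpoint bp{address, code.at(address)};
    code.at(address) = kInt3;
    return bp;
}

// Put the original byte back so the instruction can run normally.
void removeBreakpoint(std::vector<std::uint8_t>& code, const SoftwareBreakpoint& bp)
{
    code.at(bp.address) = bp.original;
}

// After INT3 executes, the instruction pointer is one byte past the
// breakpoint, so a debugger rewinds it by one before resuming.
std::size_t addressAfterTrap(const SoftwareBreakpoint& bp)
{
    return bp.address + 1;
}
```

The one-byte offset returned by addressAfterTrap is why a thread stopped on such a breakpoint is typically reported at the function address plus one.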

@mrexodia mrexodia added bug gui labels Mar 3, 2017

Member

torusrxxx commented Mar 3, 2017

Member

mrexodia commented Mar 3, 2017

Member

mrexodia commented Mar 4, 2017

@Maktm what plugins do you have? I attempted to reproduce the issue today with x32dbg in debug mode, but I couldn't reproduce it this time. Could you try without plugins and after that also without snowman (use https://github.com/x64dbg/SnowmanDummy/releases)?

Okay, I think the issue is somewhere in snowman. In release mode (with plugins), it hangs with snowman; without snowman it works as expected...

Member

mrexodia commented Mar 4, 2017

Hm no, it turns out it's probably some race condition. I can now reproduce it with a clean release...

X39 commented Mar 4, 2017

Just stumbled across this bug too today.
I accidentally deleted an older release (~a month, maybe two, old); debugging my target now freezes with an int3 breakpoint.

Member

mrexodia commented Mar 4, 2017

I'm having trouble reliably reproducing the issue... Do you know a release that is close to the version that didn't have this issue @X39 ?

Member

mrexodia commented Mar 4, 2017

It appears to be stuck here:

X39 commented Mar 4, 2017

Sadly no.
As said, I removed it accidentally and just downloaded the latest.

Member

mrexodia commented Mar 4, 2017

Once x32dbg hangs, do the following:

  1. Attach to x32dbg.exe with x32dbg
  2. Suspend all threads
  3. Put a breakpoint on PeekMessageW
  4. Replace the code with LoadLibrary("peeker.dll"); call peeker

#include <Windows.h>
#include <stdio.h>
#include <stdlib.h> // rand

HANDLE hFile;

struct BufferedWriter
{
    explicit BufferedWriter(HANDLE hFile, size_t size = 65536)
        : hFile(hFile),
        mBuffer(new char[size]),
        mSize(size),
        mIndex(0)
    {
        memset(mBuffer, 0, size);
    }

    bool Write(const void* buffer, size_t size)
    {
        return Write((const char*)buffer, size);
    }

    bool Write(const char* buffer, size_t size)
    {
        for(size_t i = 0; i < size; i++)
        {
            mBuffer[mIndex++] = buffer[i];
            if(mIndex == mSize)
            {
                if(!flush())
                    return false;
                mIndex = 0;
            }
        }
        return true;
    }

    ~BufferedWriter()
    {
        flush();
        delete[] mBuffer;
        CloseHandle(hFile);
    }

private:
    HANDLE hFile;
    char* mBuffer;
    size_t mSize;
    size_t mIndex;

    bool flush()
    {
        if(!mIndex)
            return true;
        DWORD written;
        auto result = WriteFile(hFile, mBuffer, DWORD(mIndex), &written, nullptr);
        mIndex = 0;
        return !!result;
    }
};

extern "C" __declspec(dllexport) void peeker()
{
    int count = 0;
    {
        BufferedWriter bufWriter(hFile);
        MSG msg;

        while(PeekMessageW(&msg, 0, 0, 0, PM_REMOVE))
        {
            bufWriter.Write(&msg, sizeof(msg));
            count++;
        }
    }
    char fuck[128] = "";
    sprintf_s(fuck, "%d messages fucked", count);
    MessageBoxA(0, fuck, "fuck!", MB_SYSTEMMODAL);
}

BOOL WINAPI DllMain(
    _In_ HINSTANCE hinstDLL,
    _In_ DWORD     fdwReason,
    _In_ LPVOID    lpvReserved
)
{
    if(fdwReason == DLL_PROCESS_ATTACH) // compare, don't assign
    {
        char file[MAX_PATH] = "";
        sprintf_s(file, "fuck%d.fuck", rand());
        hFile = CreateFileA(file, GENERIC_WRITE, 0, nullptr, CREATE_ALWAYS, 0, nullptr);
    }
    return TRUE;
}

Now this will empty the message queue correctly, however it will hang on MessageBoxA with the following call stack:

Address  To       From     Size   Comment                                          Party 
0109D5B8 7549E508 7547CDCC 1C     user32.NtUserSetFocus+C                          System
0109D5D4 75478E71 7549E508 2C     user32.MB_DlgProc+111                            System
0109D600 7548F730 75478E71 24     user32._InternalCallWinProc+2B                   System
0109D624 7548F6C8 7548F730 80     user32.InternalCallWinProc+20                    System
0109D6A4 7548F2D7 7548F6C8 5C     user32.UserCallDlgProcCheckWow+129               System
0109D700 7548F4F5 7548F2D7 20     user32.DefDlgProcWorker+A7                       System
0109D720 75478E71 7548F4F5 2C     user32.DefDlgProcW+25                            System
0109D74C 754790D1 75478E71 94     user32._InternalCallWinProc+2B                   System
0109D7E0 754A78E4 754790D1 6C     user32.UserCallWinProcCheckWow+18E               System
0109D84C 7549044B 754A78E4 10C    user32.SendMessageWorker+1DB                     System
0109D958 75494FBA 7549044B 44     user32.InternalCreateDialog+9C6                  System
0109D99C 7549F907 75494FBA CC     user32.InternalDialogBox+F4                      System
0109DA68 7549F06D 7549F907 164    user32.SoftModalMessageBox+E2C                   System
0109DBCC 754E786B 7549F06D 80     user32.MessageBoxWorker+293                      System
0109DC4C 754E77CB 754E786B 34     user32.MessageBoxTimeoutW+6B                     System
0109DC80 754E760B 754E77CB 20     user32.MessageBoxTimeoutA+7B                     System
0109DCA0 754E75D8 754E760B 1C     user32.MessageBoxExA+1B                          System
0109DCBC 1AF811B2 754E75D8 E4     user32.MessageBoxA+18                            User
0109DDA0 6676FA11 1AF811B2 21AF64 peeker.peeker+102                                User
012B8D04 012F6D10 6676FA11 4      qt5core.QEventDispatcherWin32::processEvents+221 User
012B8D08 0109FC3C 012F6D10 3E00C  012F6D10                                         User
012F6D14 012B8D00 0109FC3C 4      0109FC3C                                         User
012F6D18 00000000 012B8D00        012B8D00                                         User

It just deadlocked somewhere in the kernel... But creating a whole new thread that calls MessageBoxA(0, 0, 0, 0); spawns a message box as expected.

Some further research looking at the ReactOS source code: https://doxygen.reactos.org/dc/dd9/focus_8c.html#add28541af7e513708d0b4fc14d43c84e It calls UserEnterExclusive, which apparently locks up the GUI. The weirdest thing is that after resuming the threads everything continues to work, except for user input.
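The file that peeker writes is just the raw MSG structures back to back, so it can be inspected offline. Below is a minimal sketch of a decoder, assuming the dump came from a 32-bit process (where MSG is 28 bytes: hwnd, message, wParam, lParam, time, pt) and a little-endian host. Msg32, decodeMessages and countMessage are hypothetical names used for illustration only.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Assumed layout of MSG in a 32-bit process (28 bytes, no padding):
// hwnd, message, wParam, lParam, time, pt.x, pt.y.
struct Msg32
{
    std::uint32_t hwnd;
    std::uint32_t message;
    std::uint32_t wParam;
    std::int32_t lParam;
    std::uint32_t time;
    std::int32_t ptX;
    std::int32_t ptY;
};
static_assert(sizeof(Msg32) == 28, "unexpected padding in Msg32");

// Decode a raw dump (the bytes peeker wrote) into records.
// Trailing bytes that do not form a full record are ignored.
std::vector<Msg32> decodeMessages(const std::vector<unsigned char>& dump)
{
    std::vector<Msg32> messages;
    for(std::size_t offset = 0; offset + sizeof(Msg32) <= dump.size(); offset += sizeof(Msg32))
    {
        Msg32 msg;
        std::memcpy(&msg, dump.data() + offset, sizeof(msg)); // assumes little-endian host
        messages.push_back(msg);
    }
    return messages;
}

// Count how many records carry a given message ID (e.g. 0x000F for WM_PAINT).
std::size_t countMessage(const std::vector<Msg32>& messages, std::uint32_t id)
{
    std::size_t count = 0;
    for(const Msg32& msg : messages)
        if(msg.message == id)
            count++;
    return count;
}
```

Counting by message ID is one way to see whether the queue was flooded by a single message type when the hang occurred.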

Contributor

Mattiwatti commented Mar 4, 2017

Try the following:

  • Windows key + R, type services.msc and hit enter
  • Go to 'Windows Font Cache Service'
  • Set it to disabled and stop it.

Contributor

Mattiwatti commented Mar 4, 2017

As a followup, here is the guilty stacktrace:
Stacktrace
(this is the x32dbg.exe main thread)

For some reason win32k.sys keeps looking something up in the font cache infinitely. (But it's not stuck doing it, because PeekMessage does return.) However what the actual underlying cause is I don't know. win32k.sys is well known for being a buggy piece of shit, but there could also be something else at play.

I actually had some trouble reproducing the stacktrace after disabling the font cache, because even after re-enabling it I could still set the breakpoint and step. I had to reboot to get it to hang again.

Edit: I made a new stacktrace after fixing my symbols, because the offsets were fishy. So now it looks like it doesn't have anything to do with the font cache at all - even though disabling it did fix the problem for me. Huh?!

Member

mrexodia commented Mar 5, 2017

Yeah, this issue is really hard to grasp. First I could reproduce it every single time, then I tried compiling x64dbg in debug mode, couldn't reproduce it. Then I tried release, reproduced it again, then I removed snowman from release and it didn't happen for pretty long, but then it happened again. Now I have to kind of keep trying to reproduce it and it happens occasionally. Something I did observe though, sometimes it hangs with 15% (I have 6 cores) CPU usage and sometimes it hangs with the usual 5-6%...
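As a rough aid for reading those percentages, a whole-machine CPU figure can be converted into a number of busy cores. This is only a sketch: busyCores is a hypothetical helper, and it assumes the percentage is measured against the same core count that is passed in (logical versus physical cores would change the result).

```cpp
// Convert a whole-machine CPU percentage into "number of busy cores".
// Assumes totalPercent is relative to all coreCount cores.
double busyCores(double totalPercent, int coreCount)
{
    return totalPercent / 100.0 * coreCount;
}
```

With the numbers above, busyCores(15.0, 6) is about 0.9, which would be consistent with a single thread spinning on one core, while 5-6% works out to roughly a third of a core.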

Maktm commented Mar 5, 2017

> what plugins do you have?

Just the defaults that come with x32dbg/x64dbg. I was using a fresh install when this issue came up.

> I attempted to reproduce the issue today with x32dbg in debug mode, but I couldn't reproduce it this time.

The issue only happens sometimes so it's pretty annoying to debug.

Contributor

Mattiwatti commented Mar 5, 2017

The CPU utilization does seem to be pretty random; when I get a hang with a stacktrace like above I have one fully utilized core, but I can also produce hangs in win32k.sys!xxxRealSleepEx (or whatever the name is) where it really does seem to be blocking on something and the CPU utilization is 0.

Some other interesting stuff:

  • WinDbg is also affected by this! Furthermore it doesn't matter if you use the X86 or the X64 version (the X64 version of WinDbg is the only debugger I know that can debug Wow64); both will hang just like x32dbg does.
  • I managed to fish out an... interesting... stacktrace from the debuggee thread with 13506 stack frames:
0, smokeweed.exe!KiDeliverApc+0x1e3
1, smokeweed.exe!KiCommitThreadWait+0x3dd
2, smokeweed.exe!KeWaitForSingleObject+0x19f
3, smokeweed.exe!DbgkpQueueMessage+0x2a8
4, smokeweed.exe!DbgkpSendApiMessage+0x5c
5, smokeweed.exe! ?? ::NNGAKEGL::`string'+0x2842d
6, smokeweed.exe!KiDispatchException+0x287
7, smokeweed.exe!KiExceptionDispatch+0xc2
8, smokeweed.exe!KiBreakpointTrap+0xf4
9, byondcore.dll!DungPager::Login+0x1
10, 0x50108000f151a5
11, 0x200524c40
12, 0x111c001a5b2
13, 0x10040c3e8
14, 0x2e90c8000404940
15, 0x1745ab628
16, 0x50108002d39ca8
17, 0xf3def600404940
18, 0x40494c00000004
19, 0x156ef2800f172e9
20, 0xc001a5da00524c40
21, 0x856ef2800000000
22, 0xf3b8f900404abc
23, 0x40495c00000000
24, mfc120.dll!_AfxDispatchCmdMsg+0x3f
(...etc...)
13389, mfc120.dll!_AfxDispatchCmdMsg+0x3f
13390, byondcore.dll!DungPager::Login+0x1 (No unwind info)
13391, 0x501080 (No unwind info)
13392, byond.exe+0x272e9 (No unwind info)
13393, mfc120.dll!_AfxDispatchCmdMsg+0x3f
13394, mfc120.dll!CCmdTarget::OnCmdMsg+0x12f
13395, mfc120.dll!CDialog::OnCmdMsg+0x1b
13396, mfc120.dll!CWnd::OnCommand+0x7b
13397, mfc120.dll!CWnd::OnWndMsg+0x62
13398, mfc120.dll!CWnd::WindowProc+0x22
13399, byondwin.dll!CResizableDialog::WindowProc+0x6f (No unwind info)
13400, mfc120.dll!AfxCallWndProc+0x99
13401, mfc120.dll!AfxWndProc+0x34
13402, mfc120.dll!AfxWndProcBase+0x34
13403, user32.dll!gapfnScSendMessage+0x332 (No unwind info)
13404, user32.dll!GetThreadDesktop+0xd7 (No unwind info)
13405, user32.dll!GetWindow+0x3f0 (No unwind info)
13406, user32.dll!SendMessageW+0x4c (No unwind info)
13407, user32.dll!LoadCursorFromFileA+0x1097 (No unwind info)
13408, user32.dll!LoadCursorFromFileA+0x11d8 (No unwind info)
13409, user32.dll!SetKeyboardState+0x1c7e (No unwind info)
13410, user32.dll!IsCharAlphaA+0x1a9f (No unwind info)
13411, user32.dll!gapfnScSendMessage+0x332 (No unwind info)
13412, user32.dll!GetThreadDesktop+0xd7 (No unwind info)
13413, user32.dll!GetClientRect+0xc5 (No unwind info)
13414, user32.dll!CallWindowProcA+0x1b (No unwind info)
13415, mfc120.dll!CWnd::DefWindowProcA+0x46
13416, mfc120.dll!CWnd::WindowProc+0x39
13417, mfc120.dll!AfxCallWndProc+0x99
13418, mfc120.dll!AfxWndProc+0x34
13419, mfc120.dll!AfxWndProcBase+0x34
13420, user32.dll!gapfnScSendMessage+0x332 (No unwind info)
13421, user32.dll!GetThreadDesktop+0xd7 (No unwind info)
13422, user32.dll!CharPrevW+0x138 (No unwind info)
13423, user32.dll!DispatchMessageW+0xf (No unwind info)
13424, user32.dll!IsDialogMessageW+0x11e (No unwind info)
13425, user32.dll!IsDialogMessage+0x5c (No unwind info)
13426, mfc120.dll!CWnd::IsDialogMessageA+0x2e
13427, mfc120.dll!CWnd::PreTranslateInput+0x2b
13428, mfc120.dll!CDialog::PreTranslateMessage+0xc9
13429, mfc120.dll!CWnd::WalkPreTranslateTree+0x21
13430, mfc120.dll!AfxInternalPreTranslateMessage+0x3f
13431, mfc120.dll!CWinThread::PreTranslateMessage+0xb
13432, mfc120.dll!AfxPreTranslateMessage+0x17
13433, mfc120.dll!AfxInternalPumpMessage+0x2b
13434, mfc120.dll!CWnd::RunModalLoop+0xc6
13435, mfc120.dll!CWnd::CreateRunDlgIndirect+0x3e
13436, mfc120.dll!CDialog::DoModal+0x109
13437, byond.exe+0x31e8e (No unwind info)
13438, byond.exe+0x23487 (No unwind info)
13439, byondwin.dll!CVHTMLCtrl::OnBeforeNavigate2Browser+0x71a (No unwind info)
13440, mfc120.dll!_AfxDispatchCall+0x10
13441, mfc120.dll!CCmdTarget::CallMemberFunc+0x1be
13442, mfc120.dll!CCmdTarget::OnEvent+0x18b
13443, mfc120.dll!COccManager::OnEvent+0x14
13444, mfc120.dll!CCmdTarget::OnCmdMsg+0x38
13445, mfc120.dll!CDialog::OnCmdMsg+0x1b
13446, mfc120.dll!COleControlSite::OnEvent+0x45
13447, mfc120.dll!COleControlSite::XEventSink::Invoke+0x4d
13448, ieframe.dll!Ordinal234+0x180b (No unwind info)
13449, ieframe.dll!Ordinal319+0x1763d (No unwind info)
13450, ieframe.dll!Ordinal319+0x1746b (No unwind info)
13451, ieframe.dll!Ordinal319+0x179e4 (No unwind info)
13452, ieframe.dll!Ordinal319+0x30178 (No unwind info)
13453, ieframe.dll!Ordinal319+0x2f345 (No unwind info)
13454, ieframe.dll!SetQueryNetSessionCount+0x1ac2f (No unwind info)
13455, mshtml.dll!CWebOCEvents::BeforeNavigate2+0x23e
13456, mshtml.dll!CDoc::DoNavigate_FireBeforeNavigateEvent+0x10d
13457, mshtml.dll!CDoc::DoNavigate+0x76e
13458, mshtml.dll!CDoc::FollowHyperlink2+0x89b
13459, mshtml.dll!CWindow::FollowHyperlinkHelper+0xf0
13460, mshtml.dll!CWindow::NavigateEx+0xbe
13461, mshtml.dll!COmLocationProxy::InvokeEx+0x4a2
13462, mshtml.dll!CBase::ContextInvokeEx+0x53e
13463, mshtml.dll!CBase::InvokeEx+0x26
13464, mshtml.dll!COmWindowProxy::InvokeEx+0x1ea
13465, mshtml.dll!CBase::VersionedInvokeEx+0xb8
13466, mshtml.dll!PlainInvokeEx+0x99
13467, jscript9.dll!JsVarToExtension+0x127ef3 (No unwind info)
13468, jscript9.dll!DllCanUnloadNow+0x3fc13 (No unwind info)
13469, jscript9.dll!DllCanUnloadNow+0x3fb67 (No unwind info)
13470, jscript9.dll!DllCanUnloadNow+0x3fb33 (No unwind info)
13471, jscript9.dll!DllCanUnloadNow+0x3fc72 (No unwind info)
13472, jscript9.dll!JsVarToExtension+0xbab4 (No unwind info)
13473, jscript9.dll!JsVarToExtension+0x63b76 (No unwind info)
13474, jscript9.dll!JsVarToExtension+0x63bf6 (No unwind info)
13475, jscript9.dll!JsVarToExtension+0xd65e (No unwind info)
13476, jscript9.dll!JsVarToExtension+0x10c39 (No unwind info)
13477, 0x4be04a1 (No unwind info)
13478, jscript9.dll!JsVarToExtension+0x10a66 (No unwind info)
13479, jscript9.dll!JsVarToExtension+0x10c39 (No unwind info)
13480, 0x4be04a9 (No unwind info)
13481, jscript9.dll!JsVarToExtension+0x9dc3 (No unwind info)
13482, jscript9.dll!JsVarToExtension+0xa3f8 (No unwind info)
13483, jscript9.dll!JsVarToExtension+0xa32d (No unwind info)
13484, jscript9.dll!JsVarToExtension+0xa2c0 (No unwind info)
13485, jscript9.dll!DllCanUnloadNow+0x28e6d (No unwind info)
13486, jscript9.dll!JsVarToExtension+0x12a8f6 (No unwind info)
13487, jscript9.dll!JsVarToExtension+0x12ab36 (No unwind info)
13488, mshtml.dll!CWindow::ExecuteCallbackScript+0xb1
13489, mshtml.dll!CWindow::FireTimeOut+0x25c
13490, mshtml.dll!CPaintBeat::ProcessTimers+0x233
13491, mshtml.dll!CPaintBeat::OnWMTimer+0x5d
13492, mshtml.dll!GetBrowserProcess+0x27d
13493, mshtml.dll!GlobalWndProc+0x1d9
13494, user32.dll!gapfnScSendMessage+0x332 (No unwind info)
13495, user32.dll!GetThreadDesktop+0xd7 (No unwind info)
13496, user32.dll!GetClientRect+0xc5 (No unwind info)
13497, user32.dll!CallWindowProcA+0x1b (No unwind info)
13498, mfc120.dll!_AfxActivationWndProc+0x132
13499, user32.dll!gapfnScSendMessage+0x332 (No unwind info)
13500, user32.dll!GetThreadDesktop+0xd7 (No unwind info)
13501, user32.dll!CharPrevW+0x138 (No unwind info)
13502, user32.dll!DispatchMessageA+0xf (No unwind info)
13503, mfc120.dll!AfxInternalPumpMessage+0x3e
13504, byond.exe+0x2f94e (No unwind info)
13505, 0xffffffffc00111f6 (No unwind info)

The bogus stack frames and the double occurrence of byondcore.dll!DungPager::Login+0x1 actually aren't abnormal for a debugged thread, but mfc120.dll seems to need to dispatch a LOT of messages...

Mattiwatti Mar 5, 2017

Contributor

The process definitely seems to be using some anti-debug protections: the attach breakpoint gets overwritten to cause an AV, and there are various pages containing syscall stubs that syscalls get routed through via exception handlers (mfc120.dll involved? My bet is yes.) On an x86 OS it will simply patch KiFastSystemCall to intercept all syscalls. This gave me a nice BSOD when the AFD device was ZwClosed incorrectly. Perhaps they should stick to writing user mode code.

Anyway, anti-debug isn't new, the question is how are they managing to deadlock debuggers this way? I gave it a quick look in a kernel debugger but nothing stood out: there are no pending IRPs or ALPC messages, and the ERESOURCE win32k is trying to get does eventually get acquired exclusively without shared waiters.

mrexodia Mar 5, 2017

Member

Hi,

I have noticed this issue (#1481) and the bug looks familiar.

I am not an expert and might be wrong, but I suggest taking a look at everything that interferes with the event loop due to user interaction. In my code (ax330d/hrdev@1183077#diff-5d2b7017ff23d706dc7b480a368dc400L225) this was due to a timer (I don’t remember many details, but the user moved the mouse over a label, a tooltip appeared, and then the GUI froze because the timer somehow misbehaved in the PyQt version).

I did not post this on GitHub so as not to flood the thread with potentially useless comments.

Cheers,
Arthur

mrexodia Mar 5, 2017

Member

Mattiwatti Mar 6, 2017

Contributor

I noticed that too, I have one thread with a start address of Qt5Core.dll!QThreadPrivate::start calling it, but it doesn't seem to actually be doing anything (it's blocking in WaitForMultipleObjects indefinitely). In any case I don't think Qt is to blame as WinDbg is also affected.

byond.exe does call SendMessage/DispatchMessage/PeekMessage from multiple threads (the ones that start in mshtml.dll), which seems like a very dumb thing to do, but I don't see how that could affect the event loop of the debugger. Even assuming some global shared resource (win32k.sys does have many of those), suspending the thread that owns the resource should result in a session-wide deadlock, which isn't the case because only the debugger process locks up.

I tried deadlock detection with the driver verifier but it turned up nothing, other than locks -v showing that the ERESOURCE x32dbg.exe is acquiring has 11K waiters... but they are all exclusive. There would only be a deadlock if the resource was held shared and someone then tried to acquire it exclusively.

techbliss Mar 6, 2017

I tried the IDA WinDbg debugger and it has the same freezing/hanging issue.
However, their modified Win32 debugger, which is a mod of WinDbg, does not freeze, which is very odd.
I managed to crash-debug IDA. Maybe you will see something.
Here are the logs.

http://pastebin.com/emC3vRi6
http://pastebin.com/iBcFiyXm
http://pastebin.com/Zf7wCdk9
In the last one, Qt5Core.dll!QThreadPrivate:: also seems to be triggering some odd behavior.

Mattiwatti Mar 7, 2017

Contributor

Hmm, nothing in there that I can see. STATUS_BREAKPOINT is certainly odd, but the stack traces are unfortunately useless...

Just to make sure, this is a dump of ida.exe crashing while running with WinDbg as the debugger engine? I might try to reproduce it, but I doubt I'll have more luck.

I tried a clean x86 installation of Windows 10 yesterday because I got tired of Wow64 crap in the stacktraces. Same for Qt really, so I chose WinDbg as the victim debugger because it has a GUI from 1985 and most definitely only 1 window thread. The only reproducible interesting finding was this:
[Image: hang]
(Just to be clear, this is a separate LdrGetProcedureAddress from the ones before it, it did this every time right before hanging, long after already having loaded)
Not sure if this is really meaningful or just a red herring though. But one theory could be that:

  • byond.exe registers for a callback message using TrackMouseEvent to know when the mouse enters or leaves its window (or even the login button area)
  • Some process (CSRSS? dwm.exe?) is meant to deliver this message to byond.exe but can't, because all its threads are frozen by the debugger
  • The debugger is somehow blocked on the above

Who knows though. Another thing I found was dwm.exe creating two ghost windows (one for windbg.exe, one for byond.exe), but terminating the dwm thread and the ghost windows didn't help and I also couldn't reproduce it.

Mattiwatti Mar 9, 2017

Contributor

After spending much more time on this than I reasonably should have, I believe I have finally isolated the cause, namely the following sequence of calls:
[Image: dumb shit]
The purpose of the first three should be obvious. After retrieving the current thread ID and that of the foreground window, the exe calls AttachThreadInput, which (quoting MSDN, because you cannot make this up) "attaches or detaches the input processing mechanism of one thread to that of another thread." This is probably the dumbest function I have ever seen in the entire Win32 API, and I have seen a lot of them.

Since the foreground window will often be the debugger window (or any other fucking window, because it's a PC and Windows has had multitasking since the early 90s), this will bind the input processing of the debugger to the thread that owns the byond.exe main window. When a breakpoint hits, this thread predictably stops responding to all input events and the debugger cannot progress because its input is no longer being processed.
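
The stall described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ThreadId`, `InputEvent`, and `DeliveredTo` are hypothetical names invented for the illustration, and the model says nothing about how win32k.sys implements attached input queues. It only captures the idea that attached threads share one queue that is consumed in order.

```cpp
#include <cstddef>
#include <deque>
#include <set>

// Toy model only: hypothetical types, not how win32k.sys is implemented.
using ThreadId = unsigned;

struct InputEvent {
    ThreadId target;  // thread whose window should receive the event
};

// Counts how many queued events get handled for `who` when all threads in the
// model share one input queue, as after AttachThreadInput. Events are consumed
// strictly in order, so an event aimed at a suspended thread stalls everything
// queued behind it.
std::size_t DeliveredTo(ThreadId who,
                        const std::deque<InputEvent>& sharedQueue,
                        const std::set<ThreadId>& suspended)
{
    std::size_t delivered = 0;
    for (const InputEvent& ev : sharedQueue) {
        if (suspended.count(ev.target) != 0)
            break;  // head of the queue is never consumed: the queue stalls
        if (ev.target == who)
            ++delivered;
    }
    return delivered;
}
```

In this model, if the queue holds one event for the debuggee followed by two for the debugger, the debugger receives both while the debuggee is running and none once the debuggee is suspended at a breakpoint, which matches the observed hang.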

Since NtUserGetForegroundWindow is only called infrequently, I applied the following fairly harmless patch:
[Image: patch]
Since it probably isn't immediately obvious from the name, all this does is make the function exactly the same as GetActiveWindow. In other words, this makes byond.exe think it is the foreground window even when it isn't. I would have preferred to simply return FALSE from AttachThreadInput, but the app didn't seem to like that. If you need accurate NtUserGetForegroundWindow behaviour you can probably think of a better patch.

Anyway, with this patch applied I can no longer reproduce the hang with any debugger on any version of Windows. Hopefully it's the same for you guys.

I guess technically this is a win32k.sys bug, in that AttachThreadInput should be cancelled upon suspension of the thread processing the input, because this will always block the second thread. But the real solution would be to remove the API from Windows and fire whoever came up with it.

mrexodia Mar 9, 2017

Member

Thank you so much @Mattiwatti, I have no words for how impressed I am with this amazing find! ❤️

Until this issue is fixed in Windows I created a plugin Fuck1481 that will trampoline GetForegroundWindow to GetActiveWindow. I tested it 10 times and it doesn't appear to hang anymore!

Perhaps this can be implemented properly in ScyllaHide so it will just never return a window owned by x64dbg from GetForegroundWindow. We'll have to see. I plan to ship this plugin with x64dbg directly as an option after @Maktm or @X39 confirm it solves the issue for them too.
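
The narrower ScyllaHide-style behaviour suggested here could look roughly like the decision function below. This is a sketch, not the plugin's or ScyllaHide's actual code: `WindowHandle`, `ProcessId`, `WindowInfo`, and `FilterForegroundWindow` are hypothetical stand-ins so the logic can be shown without the Win32 headers. A real hook would obtain the owner with something like GetWindowThreadProcessId.

```cpp
#include <cstdint>

// Hypothetical stand-ins for HWND and process IDs so the decision logic can be
// shown without the Win32 headers.
using WindowHandle = std::uintptr_t;
using ProcessId = std::uint32_t;

struct WindowInfo {
    WindowHandle handle;  // the window itself
    ProcessId owner;      // process that owns the window
};

// Decides what a hooked GetForegroundWindow could report to the debuggee:
// the real foreground window, unless the debugger owns it, in which case the
// debuggee's own active window is reported instead.
WindowHandle FilterForegroundWindow(const WindowInfo& realForeground,
                                    ProcessId debuggerPid,
                                    WindowHandle debuggeeActiveWindow)
{
    if (realForeground.owner == debuggerPid)
        return debuggeeActiveWindow;
    return realForeground.handle;
}
```

Compared with always redirecting to GetActiveWindow, this keeps the reported foreground window accurate whenever the debugger is not in front.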

EDIT: Did some more digging and I found this post on MSDN: https://blogs.msdn.microsoft.com/oldnewthing/20130619-00/?p=4043

The documentation of AttachThreadInput also states:

The AttachThreadInput function also fails if a journal record hook is installed. Journal record hooks attach all input queues together.

This is quite interesting, perhaps it is possible to attach a no-op journal record hook to x64dbg to work around this issue without having to change the process memory...

Okay, I tried that and it is a very bad idea! Code:

#include <windows.h>

static HHOOK hHook;

// No-op journal record hook: forwards negative codes and ignores everything else.
static LRESULT CALLBACK HookProc(int code, WPARAM wParam, LPARAM lParam)
{
    if(code < 0)
        return CallNextHookEx(hHook, code, wParam, lParam);
    switch(code)
    {
    case HC_ACTION:
        break;
    case HC_SYSMODALOFF:
        break;
    case HC_SYSMODALON:
        break;
    }
    return 0;
}

int CALLBACK WinMain(
    _In_ HINSTANCE hInstance,
    _In_ HINSTANCE hPrevInstance,
    _In_ LPSTR     lpCmdLine,
    _In_ int       nCmdShow
)
{
    wchar_t msg[128];
    hHook = SetWindowsHookExW(WH_JOURNALRECORD, HookProc, GetModuleHandle(nullptr), 0);
    // The last error is only meaningful when installing the hook failed.
    DWORD lastError = hHook ? ERROR_SUCCESS : GetLastError();
    wsprintfW(msg, L"%p, 0x%X", hHook, lastError);
    // The hook stays installed while the message box is open (MessageBoxW runs a message loop).
    MessageBoxW(0, msg, L"JournalHook", 0);
    if(hHook)
        UnhookWindowsHookEx(hHook);
    return 0;
}

This requires special privileges: you have to sign the executable and place it somewhere in C:\Program Files.

If HookProc is written improperly (and it is, because I have no clue what I'm doing), it will generally lock up your system completely, requiring you to press Ctrl+Alt+Del and sign out to forcefully uninstall the hook. Even when the hook is installed correctly, it will not prevent AttachThreadInput from hanging.

Thanks again,

Duncan

Maktm Mar 9, 2017

After installing the plugin everything works fine. Thanks a lot @Mattiwatti, I owe you one.

Mattiwatti Mar 9, 2017

Contributor

Glad it worked!

It's funny that Raymond Chen shares my feelings re: AttachThreadInput. He usually spends much of his time defending (or explaining the reason for) quirky Windows behaviour that in hindsight was actually implemented for pretty good reasons. His book The Old New Thing is full of these things and is really interesting to read. But in this case it seems we can all agree that this API was bad from idea to concept to implementation.

By the way, this is a bit off topic, but since I spent a lot of time with a kernel debugger attached throughout this, here are some anti-debugging techniques I found used by this exe that you may want to be aware of @Maktm:

  • Two different timers (a winmm.dll realtime priority thread and NtQueryPerformanceCounter) to check for inconsistencies
  • Kernel debugger checks (NtQuerySystemInformation)
  • Debug port/debug flags checks (NtQueryInformationProcess)
  • Standard IsDebuggerPresent/PEB checks
  • Heap flags checks
  • NtClose(0xINVALID)
  • NtAllocateVirtualMemory with the MEM_WRITE_WATCH flag followed by constant NtQueryVirtualMemory calls to check the region for SW BPs
  • NtGetContextThread for HW BPs
  • C++ exceptions/console CTRL+C exceptions (I actually think these were the reason for exposing the bug - the app triggers an exception, forcing you to press SHIFT+F9 in the debugger, and then immediately checks the foreground window)
  • Code in executable pages that is later 'downgraded' to RW or R only or just freed. There are also many DLLs with exactly 1 private page which indicates they were written to, but I don't know of a good tool to check what was modified exactly.
  • Constant use of SEH to transfer execution control. Combined with frame pointer optimizations this makes the stack trace often completely useless. I found running gflags.exe /i byond.exe +ust (explanation) to help somewhat with this. The stack is still useless but you can fish out more interesting info because ntdll is constantly walking the frames. gflags.exe /i byond.exe +sls is also good - a lot of spam at startup, but the app makes heavy use of GetProcAddress so you can see what it is trying to do.

Since I have a pretty heavily modified kernel (or rootkit if you prefer) that is sort of a combination of ScyllaHide and TitanHide, I actually have the ability to circumvent all of these. It was the fact that this didn't matter at all which led me to believe this bug (probably) isn't an anti-debug measure, but that the programmers of the app are just stupid.
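
For readers unfamiliar with the first item in the list above, the idea of cross-checking two timers can be sketched as follows. How byond.exe actually compares its timers is not known from this thread, so `IntervalSample` and `TimersDisagree` are hypothetical names and the tolerance is an arbitrary assumption. The point is presumably that two independent time sources should agree on the length of the same interval unless one of them was paused or faked.

```cpp
#include <chrono>

// Hypothetical sample: the same interval measured by two independent sources,
// e.g. ticks counted by a timer thread and a performance-counter reading.
struct IntervalSample {
    std::chrono::nanoseconds fromTimerThread;
    std::chrono::nanoseconds fromPerfCounter;
};

// Returns true when the two sources disagree about the interval length by
// more than `tolerance`, in either direction.
bool TimersDisagree(const IntervalSample& sample,
                    std::chrono::nanoseconds tolerance)
{
    const auto a = sample.fromTimerThread;
    const auto b = sample.fromPerfCounter;
    const auto diff = a > b ? a - b : b - a;
    return diff > tolerance;
}
```

A tool that hides a debugger by adjusting only one of the two sources would trip a check like this, which is why both have to be handled consistently.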

Maktm Mar 9, 2017

I'm surprised they went to great lengths to protect the executable. From my analysis, I thought 0 measures were taken but it turns out that I was completely wrong.

Btw, what do you mean by

Since I have a pretty heavily modified kernel (or rootkit if you prefer)

Thanks again.

Mattiwatti Mar 9, 2017

Contributor

Keep in mind that they use a lot of heavyweight libraries (DirectX, mshtml, jscript9) that may be doing their own thing for their own reasons, so not all of these are necessarily from byond.exe/byondcore.dll itself. But there are too many of these checks for it to be a coincidence.

Btw, what do you mean by

Since I have a pretty heavily modified kernel (or rootkit if you prefer)

My ntoskrnl.exe isn't exactly vanilla: a lot of the system calls that are known to be used for anti-debugging are modified to notify me through the kernel debugger when an application flagged as 'bad' is doing something questionable. For example, here is a kernel debugger log of the exe from startup until hitting the login breakpoint. (Ignore the NULL addresses for NtWriteVirtualMemory, that's a bug I need to fix...)

Redict Mar 9, 2017

Off-topic:
@Maktm, use my ByondAPI Library: link

Maktm Mar 9, 2017

@Redict I was reversing it for someone else but don't need it anymore. Thanks for the link.
@mrexodia Is this issue considered closable or will it be closed in the next release of x64dbg?
