Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix if the hook is not called at the start of the process, the hook may fail #144

Open
wants to merge 19 commits into
base: main
Choose a base branch
from

Conversation

sonyps5201314
Copy link
Contributor

for example, create a c++ console project, create many threads to call CreateFileW,ReadFile,CloseHandle
and create serveral threads to execute Hook and Unhook CreateFileW and ReadFile function every 100 milliseconds.
it will be failed after some time.

Copy link
Contributor

@bgianfo bgianfo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution!

It looks like you might have mistakenly included all of the other commits from your other PRs here?
Can you pair this down to only the issue you describe in your pull request description?

@bgianfo bgianfo added enhancement This adds new functionallity to the product bug Something isn't working and removed enhancement This adds new functionallity to the product labels Sep 4, 2020
@sonyps5201314 sonyps5201314 force-pushed the fix_maybe_can_not_hook_functions_not_at_the_starttime_of_a_process branch from b492481 to c029e71 Compare September 5, 2020 05:33
@sonyps5201314
Copy link
Contributor Author

@bgianfo, I have rebased to only include this commit.

…ent operations in Detours, because heap memory API functions may be HOOKed. For example, in AcLayers.dll, there may be lock competition between HOOK heap functions. This is only a temporary circumvention solution, the final solution It should also be ensured that the virtual memory API is not locked and can be called.
@bgianfo bgianfo changed the title fix if the hook is not called at the start of the process, the hook may fail Fix if the hook is not called at the start of the process, the hook may fail Jan 24, 2021
src/detours.cpp Outdated Show resolved Hide resolved
src/detours.h Outdated Show resolved Hide resolved
src/detours.h Outdated Show resolved Hide resolved
src/detours.h Outdated Show resolved Hide resolved
src/detours.cpp Outdated Show resolved Hide resolved
src/detours.cpp Outdated Show resolved Hide resolved
src/detours.cpp Outdated Show resolved Hide resolved
src/detours.cpp Outdated Show resolved Hide resolved
@sonyps5201314 sonyps5201314 force-pushed the fix_maybe_can_not_hook_functions_not_at_the_starttime_of_a_process branch from 0f46187 to 5dbb4c8 Compare January 25, 2021 14:21
@sonyps5201314 sonyps5201314 force-pushed the fix_maybe_can_not_hook_functions_not_at_the_starttime_of_a_process branch from 5dbb4c8 to 2da209b Compare January 25, 2021 14:37
@bgianfo bgianfo added this to In Progress in Next Major Release (4.x) Jan 26, 2021
@sonyps5201314 sonyps5201314 force-pushed the fix_maybe_can_not_hook_functions_not_at_the_starttime_of_a_process branch from 4f64a5d to 9d41534 Compare January 28, 2021 05:23
@bgianfo bgianfo added the needs-attention 👋 This issue / or pull request requires maintainer attention. label Feb 1, 2021
@bgianfo
Copy link
Contributor

bgianfo commented Mar 4, 2021

Hey @sonyps5201314 thanks again for the contribution.

@dtarditi and I looked at this togeather today, and we decided the change as it is, is very risk to merge since it's so large and invasive. Can you please desribe what the actual race condition / bug this is fixing is so we can work on trying to scope down this change? It's not clear from the description or the commits where the bug you are fixing actually.

Is it also possible to introduce a stress test / unit test to validate we have fix the issue and don't regress it in the future?

@bgianfo bgianfo added needs-author-feedback This issue / or pull request requires author feedback before more action can be taken. and removed needs-attention 👋 This issue / or pull request requires maintainer attention. labels Mar 4, 2021
@ghost ghost removed the needs-author-feedback This issue / or pull request requires author feedback before more action can be taken. label Mar 5, 2021
@sonyps5201314
Copy link
Contributor Author

sonyps5201314 commented Mar 6, 2021

Is it also possible to introduce a stress test / unit test to validate we have fix the issue and don't regress it in the future?

@bgianfo
TestDetoursBug144.zip is a BUG reproduction program.
Compile, debug and run the DebugMaster configuration, the program will crash immediately, because it uses the official unfixed master branch code.
image

Compile, debug and run the DebugMyFixedBranch configuration, the program will run normally, because it uses the fixed fix_maybe_can_not_hook_functions_not_at_the_starttime_of_a_process code in my PR.

Note: It is recommended to use VS2013 to compile and debug the program of this project, because VS2019 and VS2017 have the bug that the call stack of the program cannot be viewed when the x86 program crashes or deadlocks. It has been reported for almost a year, and the official hasn't repaired it yet.
vs2017 and vs2019 can`t show full call stack when a deadlock occurs in x86 c++.

@bgianfo bgianfo added the needs-attention 👋 This issue / or pull request requires maintainer attention. label Apr 14, 2021
@fredemmott
Copy link

fredemmott commented Jan 16, 2022

@bgianfo Can you please desribe what the actual race condition / bug this is fixing

The original commit seems to mostly be related to #70 (though covers a broader problem): any new/delete/malloc/free etc after any DetourUpdateThread() is risky:

In windows, there is a per-heap lock, which new/delete/malloc/free always need to acquire, and in most applications, practically all operations will be on the same default heap - in effect, roughly speaking, there's a per-process lock on heap operations. See https://docs.microsoft.com/en-us/windows/win32/memory/heap-functions

DetourUpdateThread() ultimately calls SuspendThread(); after this point, if the thread you've passed in happens to already have the heap lock, nothing else can acquire the lock, and all new/delete/malloc/free will deadlock. This includes:

  • the explicit new operations in DetourUpdateThread() - e.g. if you're updating two threads, and the first one has the heap lock, the second DetourUpdateThread() will deadlock
  • the delete operations in DetourTransactionCommit
  • the malloc inside the printfs inside DETOUR_TRACE

You can reproduce this by:

  • creating 3 threads
  • in thread 1, call HeapLock(GetProcessHeap())
  • just loop in thread 2
  • in thread 3, call DetourTransactionBegin(); DetourUpdateThread(thread_1_handle); DetourUpdateThread(thread_2_handle);

--

There are also several other bugs fixed in later commits; I'm not came across them in practice, so I'm not able to explain them.

--

As for how this relates to 'at the start of the process':

  • updating all threads is a relatively common (and necessary) pattern, which significantly raises the risk of hitting this condition. This PR adds a DetourUpdateAllOtherThreads() function to make this pattern easier, but that's tangental to the bug fix.

  • if you attach on process startup, there are many more mallocs/news happening than later in the process lifecycle: as well as the program code, there's all the implicit ones behind LoadLibrary() etc. Again, this significantly raises the risk

@fredemmott
Copy link

fredemmott commented Jan 16, 2022

Here is a small standalone example: https://gist.github.com/fredemmott/adb9c41d7cff44aecf92bece990f4c3f

The explicit lock is just to make it a reliable repro; you can reproduce it less artificially (and less reliably) by replacing the locking threadproc with:

while (true) {
  delete new int;
}

You can replace 'int' with any type.

@fredemmott
Copy link

An alternative fix for that first issue would be to simply call HeapLock(GetProcessHeap()) in DetourTransactionBegin(), and HeapUnlock(GetProcessHeap()) in DetourTransactionCommit() + DetourTransactionAbort()

@sonyps5201314
Copy link
Contributor Author

sonyps5201314 commented Jan 23, 2022

The latest committed code makes Detours use VirtualAlloc/VirtualFree for memory allocation and deallocation, so there is no Heap lock problem you worry about.

// After calling DetourUpdateAllOtherThreads, you should call DetourTransactionCommit(Ex) or DetourTransactionAbort as soon as possible.
// In addition to the DetourAttach(Ex)\DetourDetach(Ex) call, other user operations should not be included between them,
// because other user operations may cause CRT lock competition, and at this time CRT lock may be owned by other threads.
// Other threads have been suspended at this time, so that may cause deadlock problems
LONG WINAPI DetourUpdateThreadEx(_In_ HANDLE hThread, _In_opt_ BOOL fCloseThreadHandleOnDestroyDetourThreadObject /*= TRUE*/);
BOOL WINAPI DetourUpdateAllOtherThreads();

and you can read the commented text above for DetourUpdateThreadEx and DetourUpdateAllOtherThreads to understand the precautions for using them.
and from the above comments and the code from VS2008:

int __cdecl _heap_init (
        int mtflag
        )
{
#if defined _M_AMD64 || defined _M_IA64
        // HEAP_NO_SERIALIZE is incompatible with the LFH heap
        mtflag = 1;
#endif  /* defined _M_AMD64 || defined _M_IA64 */
        //  Initialize the "big-block" heap first.
        if ( (_crtheap = HeapCreate( mtflag ? 0 : HEAP_NO_SERIALIZE,
                                     BYTES_PER_PAGE, 0 )) == NULL )
            return 0;

you can know that HeapLock(GetProcessHeap()) ... is not safe enough.
You might say use HeapLock((HANDLE)_get_heap_handle()); instead
But if a thread already owns the lock and has been suspended by other threads of the user, you call HeapLock((HANDLE)_get_heap_handle()); it will only make the current thread fall into an infinite wait or deadlock, so The best way is to avoid using the HeapAlloc family of functions for memory allocation.
and you must know that the purpose of DetourUpdateAllOtherThreads is to solve the problem that the HOOKed instruction is being run by other threads, not the Heap lock problem. Heap and CRT lock problems are only one problem that may be caused by DetourUpdateAllOtherThreads. Therefore, I introduced DetourMemAlloc, DetourMemReAlloc, DetourMemFree to avoid this problem. The code in the PR has also been used in many programs of my own and my company and has been running stably in many computers for a long time, so you can believe that they are reliable of.

@fredemmott
Copy link

fredemmott commented Jan 23, 2022

But if a thread already owns the lock and has been suspended by other threads of the user, you call HeapLock((HANDLE)_get_heap_handle()); it will only make the current thread fall into an infinite wait or deadlock

This can generally be avoided by calling HeapLock() before suspending any threads.If something else in the process has already locked a thread which holds the lock, either that thread will eventually be resumed (in which case it's not an infinite wait), or the app will be hanging forever anyway. Either way, even if we can install the hook while something else has the heap lock, my detours (and the application code that calls them) still use the heap, so it isn't a practical concern, as long as my thread acquires a heap lock before suspending any threads.

so The best way is to avoid using the HeapAlloc family of functions for memory allocation.

I don't disagree on this, but it is not necessary for many cases.

The code in the PR has also been used in many programs of my own and my company and has been running stably in many computers for a long time, so you can believe that they are reliable of.

I'm looking for minimal - and understandable - changes on top of master. This PR fixes several issues (and I understand this is more than the heaplock race), and adds new features. Regardless of reliability for you, that makes it not fit my goals. Splitting it up with clear explanations would also likely aid Microsoft in reviewing and ideally merging.

That said, I want to be clear that I appreciate your insight and efforts for the community :)

@sonyps5201314
Copy link
Contributor Author

sonyps5201314 commented Jan 24, 2022

DWORD WINAPI ThreadProc_UserCode(LPVOID lpThreadParameter)
{
	Sleep(1000);
	HeapLock((HANDLE)_get_heap_handle());
	Sleep(10000000);
	HeapUnlock((HANDLE)_get_heap_handle());
	return 0;
}

DWORD WINAPI ThreadProc_TestHook(LPVOID lpThreadParameter)
{
	Sleep(2000);
	while (1)
	{
		OutputDebugStringA("ThreadProc_TestHook prepare for hook.\n");
		//replace "OutputDebugStringA" to "printf" function will trigger another deadlock

		HeapLock((HANDLE)_get_heap_handle());

		OutputDebugStringA("ThreadProc_TestHook start to hook.\n");
		DetourTransactionBegin();

		DetourUpdateThread(GetCurrentThread());

		DetourTransactionCommit();

		HeapUnlock((HANDLE)_get_heap_handle());
	}
	return 0;
}
int main()
{
	HANDLE hThread = CreateThread(NULL, 0, ThreadProc_UserCode, NULL, 0, NULL);
	if (hThread)
	{
		CloseHandle(hThread);
		hThread = NULL;
	}
	Sleep(100);

	hThread = CreateThread(NULL, 0, ThreadProc_TestHook, NULL, 0, NULL);
	if (hThread)
	{
		CloseHandle(hThread);
		hThread = NULL;
	}
	Sleep(100 * 1000);
}

The above is a simple example that doesn't do what you expect.
In addition to our own controllable code, a program may also contain system library code, and code injected by third-party DLLs. The code in ThreadProc_UserCode may appear in third-party DLLs and system libraries. For example, in system library AcLayers.dll, it hooks the Heap series functions and has its own lock implementation for the Heap functions, so you can't think that the call to HeapLock returned, you won't Stuck in the next calls to functions such as HeapAlloc.

I don't disagree on this, but it is not necessary for many cases.

Detours, as a general purpose library program, must be able to accommodate all situations, not just most of what you said.

I'm looking for minimal - and understandable - changes on top of master.

image
The latest commit has been merged with the latest official master branch, so your worries are gone.

This PR fixes several issues

No, all its modifications are for one purpose, which is the title of this PR. Please read the code carefully before expressing your opinion.
As I said before, those Heap and CRT lock problems are just new problems that may arise after the introduction of DetourUpdateAllOtherThreads, so I improved the memory allocation implementation inside Detours to solve this new problem.

    warning C6553: The annotation for function 'DetourTransactionBeginEx' on _Param_(1) does not apply to a value type.
    warning C6553: The annotation for function 'DetourUpdateThreadEx' on _Param_(2) does not apply to a value type.

Flagged By: VS 17.2.6 (CL.exe 14.32.31332.0)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working needs-attention 👋 This issue / or pull request requires maintainer attention.
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

None yet

4 participants