This is an implementation of TLS injection (T1055.005) for learning purposes and for testing detection coverage. For details on this technique, please see the report in the TRR Library.
A proper payload for TLS injection would ensure only one instance is running at a time. It would also return after performing its desired task so that a hijacked thread can continue to its entry point; otherwise it might render system elements unstable (depending on the injection target). The demo payload does none of this: it simply creates a message box and exits. This is sufficient for demo purposes.
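The two properties described above can be sketched in portable C. This is only an illustration, not the demo's payload: `tls_callback_sketch`, `do_task`, and the `PVOID_T`/`DWORD_T` typedefs are hypothetical stand-ins, and the sketch assumes that "one instance" can be read as "runs at most once per process". A real TLS callback would also need the NTAPI calling convention.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for the Windows types so the sketch builds anywhere. */
typedef void *PVOID_T;
typedef uint32_t DWORD_T;

static atomic_flag g_payload_ran = ATOMIC_FLAG_INIT;
static int g_task_count = 0;   /* counts how often the "task" actually ran */

/* Placeholder for whatever the payload does; here it only bumps a counter. */
static void do_task(void) { g_task_count++; }

/* Same parameter shape as a TLS callback (module handle, reason, reserved). */
void tls_callback_sketch(PVOID_T dll_handle, DWORD_T reason, PVOID_T reserved)
{
    (void)dll_handle; (void)reason; (void)reserved;

    /* atomic_flag_test_and_set returns the previous value: true means an
       earlier invocation already claimed the payload, so return at once. */
    if (atomic_flag_test_and_set(&g_payload_ran)) {
        return;
    }
    do_task();
    /* Returning hands control back to the caller so the thread can go on
       to its entry point. */
}
```

Calling `tls_callback_sketch` from several threads runs `do_task` once; every later call returns immediately.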
There are many ways to write the injection payload into a remote process'
virtual memory. Any of them could be combined with this technique, but this demo
simply uses VirtualAllocEx and WriteProcessMemory.
The solution can perform the injection in several ways:
- Existing process or new process
- Based on the presence or lack of a TLS Data Directory in the target process
- Based on the trigger setting at the command line
The demo provides the option to create a new suspended process or to inject into an existing one. New processes are a common, easy target for injection because a process created only to be injected serves no legitimate system purpose, so it can be hijacked without impacting normal system operations. A new process will call the TLS callbacks naturally, so no trigger mechanism (or waiting period) is required.
If the target process' mapped binary already has a TLS Data Directory, then
the injection code will simply add the address of the injected payload to the
end of the array of TLS callbacks. This is a single write to wherever the array
of callbacks is located, usually the .data or .rdata section.
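The append step can be modelled on a local array. The following is a minimal sketch, not the demo's code: `find_terminator` and `append_callback` are hypothetical helpers, and the sketch assumes the slot after the old terminator is already zero so that one write is enough to keep the array null-terminated. The source does not say how the demo handles the case where that slot is in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the index of the terminating zero entry, or `capacity` if the
   array is not terminated within `capacity` slots. */
size_t find_terminator(const uint64_t *callbacks, size_t capacity)
{
    size_t i = 0;
    while (i < capacity && callbacks[i] != 0) {
        i++;
    }
    return i;
}

/* Overwrites the terminator with `payload_addr`. Returns 0 on success, or
   -1 if there is no following slot or that slot is not already zero. */
int append_callback(uint64_t *callbacks, size_t capacity, uint64_t payload_addr)
{
    size_t end = find_terminator(callbacks, capacity);
    if (end + 1 >= capacity) {
        return -1;              /* no room left for a new terminator */
    }
    if (callbacks[end + 1] != 0) {
        return -1;              /* next slot holds unrelated data */
    }
    callbacks[end] = payload_addr;  /* the single write */
    return 0;
}
```

In this model the existing callbacks keep their positions and order, and the new entry is reached last.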
If the target process does not have an existing TLS Data Directory, the code
will create all required structures. This requires two writes: one to newly
allocated memory to create all the needed structures, and another to the TLS
data directory field in the process' PE header (in memory) to write in the
address of the new _IMAGE_TLS_DIRECTORY. The latter write is likely to be
flagged by EDR, because modifying the PE header after loading is unusual and
is a common element of process hollowing implementations.
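Which of the two paths applies depends on whether the TLS entry in the image's data directory is populated. The sketch below shows one way to check this from a buffer holding the PE headers. It is an illustration, not the demo's code: `pe_has_tls_directory` and the `rd16`/`rd32` helpers are hypothetical names, and it treats a non-zero RVA in data directory entry 9 (IMAGE_DIRECTORY_ENTRY_TLS) as "has a TLS directory".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TLS_DIRECTORY_INDEX 9u  /* IMAGE_DIRECTORY_ENTRY_TLS */

/* Little-endian readers, so the sketch does not depend on host byte order. */
static uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 if the TLS data directory RVA is non-zero, 0 if it is zero or
   the entry is absent, and -1 if the buffer does not parse as a PE image. */
int pe_has_tls_directory(const uint8_t *img, size_t len)
{
    if (len < 0x40 || rd16(img) != 0x5A4D) return -1;   /* "MZ" */
    size_t pe = rd32(img + 0x3C);                        /* e_lfanew */
    if (pe > len || len - pe < 24 + 2) return -1;
    if (rd32(img + pe) != 0x00004550) return -1;         /* "PE\0\0" */

    size_t opt = pe + 24;  /* 4-byte signature + 20-byte file header */
    uint16_t magic = rd16(img + opt);
    size_t dirs_off;       /* data directory offset inside the optional header */
    if (magic == 0x20B) dirs_off = 112;                  /* PE32+ */
    else if (magic == 0x10B) dirs_off = 96;              /* PE32 */
    else return -1;

    if (len - opt < dirs_off) return -1;
    uint32_t count = rd32(img + opt + dirs_off - 4);     /* NumberOfRvaAndSizes */
    if (count <= TLS_DIRECTORY_INDEX) return 0;

    size_t entry = opt + dirs_off + 8u * TLS_DIRECTORY_INDEX;
    if (entry > len || len - entry < 8) return -1;
    return rd32(img + entry) != 0 ? 1 : 0;               /* VirtualAddress field */
}
```

The function only reads headers, so the first page of a file on disk or of a mapped image is enough input.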
We minimize the number of writes by combining the _IMAGE_TLS_DIRECTORY
structure and all the other values needed for TLS in a single mega-structure:
typedef struct _IMAGE_TLS_DIRECTORY64 {
    ULONGLONG StartAddressOfRawData; //8 bytes
    ULONGLONG EndAddressOfRawData; //8 bytes
    PDWORD AddressOfIndex; //8 bytes, pointer to the location that receives the TLS index. That location must be in a writeable section so the Windows loader can store the TLS index there when it's created.
    PIMAGE_TLS_CALLBACK* AddressOfCallbacks; //8 bytes, pointer to the TLS callback array. We will use the space immediately after the struct as our array of callbacks.
    DWORD SizeOfZeroFill; //4 bytes
    DWORD Characteristics; //4 bytes
} IMAGE_TLS_DIRECTORY64; //0x28 bytes in total
//We follow the _IMAGE_TLS_DIRECTORY struct immediately with the other values we need.
ULONGLONG 0x0000000000000000 //8 bytes - AddressOfIndex points here. This is TLSStructBase + 28h
PVOID PtrToOurPayload //8 bytes - first entry of the callback array that AddressOfCallbacks points to. This is TLSStructBase + 30h
ULONGLONG 0x0000000000000000 //8 bytes - this will serve as the end of the callback array, and also as the value for all the other TLS items we won't be implementing. This is TLSStructBase + 38h
//--------------------------------> Total space for everything is 0x40 bytes.
When testing, conhost.exe is an example of a Windows binary with an existing
TLS directory, and rundll32.exe is an example of one without. The repo
contains a PowerShell script that can be used to search the System32 directory
for binaries with a TLS directory.
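The 0x40-byte layout described earlier can be checked with fixed-width types and compile-time assertions. This is a sketch under assumptions: `TlsDirectory64`, `TlsBlock`, and `make_tls_block` are hypothetical names, the struct is a portable stand-in for the Windows definition, and fields the text does not describe are left at zero, which may differ from what the demo writes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Portable stand-in for IMAGE_TLS_DIRECTORY64 (0x28 bytes). */
typedef struct {
    uint64_t StartAddressOfRawData;
    uint64_t EndAddressOfRawData;
    uint64_t AddressOfIndex;      /* address of the slot that receives the TLS index */
    uint64_t AddressOfCallbacks;  /* address of the null-terminated callback array */
    uint32_t SizeOfZeroFill;
    uint32_t Characteristics;
} TlsDirectory64;

/* Hypothetical name for the combined block. */
typedef struct {
    TlsDirectory64 dir;     /* +0x00 */
    uint64_t index_slot;    /* +0x28 */
    uint64_t callbacks[2];  /* +0x30 payload pointer, +0x38 zero terminator */
} TlsBlock;

_Static_assert(sizeof(TlsDirectory64) == 0x28, "directory must be 0x28 bytes");
_Static_assert(offsetof(TlsBlock, index_slot) == 0x28, "index slot at +0x28");
_Static_assert(offsetof(TlsBlock, callbacks) == 0x30, "callbacks at +0x30");
_Static_assert(sizeof(TlsBlock) == 0x40, "whole block is 0x40 bytes");

/* Fills a local copy of the block as it should look once placed at
   `block_base` in the target's address space. */
TlsBlock make_tls_block(uint64_t block_base, uint64_t payload_addr)
{
    TlsBlock b;
    memset(&b, 0, sizeof b);
    b.dir.AddressOfIndex = block_base + offsetof(TlsBlock, index_slot);
    b.dir.AddressOfCallbacks = block_base + offsetof(TlsBlock, callbacks);
    b.callbacks[0] = payload_addr;
    return b;
}
```

Because the two internal pointers are absolute addresses, the block can only be filled in after the destination address is known.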
The final option supported by the demo is to trigger the injection or wait for it to trigger naturally. Once the TLS callback is in place, any new thread will execute it. If injecting an existing process, you can choose to wait for a new thread (the frequency of this will depend entirely on the target process) or trigger the payload by creating a new thread manually. Waiting is quieter. A manually created thread, on the other hand, serves no legitimate purpose, so it can be given any start address (a payload that never returns will never reach it) and can be hijacked by any payload without needing to resume its original work.
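The three options combine as summarised in the sketch below. It is a restatement of the text in code, not the demo's logic: `InjectOptions`, `InjectPlan`, and `plan_injection` are hypothetical names, and the write count covers only the TLS bookkeeping, not the payload write itself.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    bool new_process;     /* create a suspended process rather than use an existing one */
    bool has_tls_dir;     /* target image already has a TLS data directory */
    bool manual_trigger;  /* command-line choice: create a thread to fire the callback */
} InjectOptions;

typedef struct {
    int  tls_writes;              /* writes needed to register the callback */
    bool touches_pe_header;       /* true when the data directory must be patched */
    bool creates_trigger_thread;  /* true when a thread is created as the trigger */
} InjectPlan;

InjectPlan plan_injection(InjectOptions o)
{
    InjectPlan p;
    /* Existing directory: one append. Otherwise: new structures plus a
       header patch. */
    p.tls_writes = o.has_tls_dir ? 1 : 2;
    p.touches_pe_header = !o.has_tls_dir;
    /* A new process runs its TLS callbacks naturally, so no trigger is needed. */
    p.creates_trigger_thread = !o.new_process && o.manual_trigger;
    return p;
}
```

From a detection point of view, the noisiest combination in this model is an existing process without a TLS directory and with a manual trigger: two writes, a header modification, and a new remote thread.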
This is demonstration code for personal development and to enable detection engineers to test their detection capabilities to ensure coverage. NEVER use this code on a machine that doesn't belong to you!