Branch Monitoring Project


The Branch Monitoring Project.


The Branch Monitor Framework (BMF) is an alternative for runtime process monitoring on modern (Windows) systems. Our approach makes use of Branch Trace Store (BTS) from Intel's processors to implement a dynamic, transparent framework. The framework provides many analysis facilities, such as function call tracing and Control Flow Graph (CFG) reconstruction.


This project is part of Marcus Botacin's master's work. Marcus is a Computer Science master's candidate at the Institute of Computing of the University of Campinas, advised by Prof. Dr. Paulo Lício de Geus and Prof. Dr. André Ricardo Abed Grégio. More detailed information, such as academic papers, can be found at the project page.


From code to real world.


The repository is organized as follows:

  • Client: A simple polling-based driver client able to retrieve and print branch-collected data.
  • BranchClient: An advanced driver client able to perform flow analysis and Call Graph (CG) and CFG reconstruction for a given Process ID (PID). You are required to provide the addresses of all libraries to be monitored.
  • Branch.Tester: A loop program used for validation purposes.
  • Launcher: A tool to ease starting the monitoring process. Given a PID, it dumps all memory addresses and supplies them as inputs to the advanced client.
  • BranchMonitor.NMI: The monitoring driver (NMI handler).
  • BranchMonitor.PMI: The monitoring driver (PMI handler).
  • BranchMonitor.Multi-core: The monitoring driver in a multi-core version (PMI handler).
  • BranchMonitor.Multi-page: The monitoring driver in a multi-page collection version (PMI handler).
  • DumpDLL: A tool to ease introspection headers generation.
  • Kernel: Kernel introspection modules.
  • Misbehavior.Detection: A profiling tool to detect application misbehavior.
  • Transparency.Tests: Tools to attest BranchMonitor's transparency.
  • ROP: CFI verification tools to be used on execution traces.
  • Debugger: A debugger built upon the BranchMonitor framework.
  • Utils: General utils for binary analysis using BranchMonitor.
  • PIN.Branch.Monitor: A DBT-Based branch monitor implementation, used for comparative purposes.
  • RetMonitor: PEBS and LBR support, for additional research purposes.


Currently, the BranchMonitor driver is available in two main versions (in fact, several variants exist, as listed above). The first uses an NMI callback to handle interrupts, whereas the second hooks the performance monitoring interrupt (PMI) handler to do so.



Some settings, such as the monitoring core, should be set in the config.h file.

You should define whether you want debug messages to be displayed or not.

#define DEBUG

In this case, you are also required to define the driver name printed on the debugger screen. This step is important so that you can filter the driver messages being displayed.


You should also set the driver name for the system and for the DOS subsystem. This is the name you use to communicate with the driver through OpenFile.

#define DRIVERNAME L"\\Device\\BranchMonitor"
#define DOSDRIVERNAME L"\\DosDevices\\BranchMonitor"

You should set the core on which the monitor will be enabled.

#define BTS_CORE 3

Introspection Update: As noticed by @smaresca, introspection headers are version-dependent. The supplied values work for Windows 8 x64 6.2 build 9200. Some DLL versions are shown below, whereas others can be found in DLL.Versions.

ProductVersion   FileVersion      FileName                                                                             
--------------   -----------      --------                                                                             
6.2.9200.16384   6.2.9200.1638... C:\Windows\System32\ntdll.dll
6.2.9200.16384   6.2.9200.1638... C:\Windows\System32\kernel32.dll

To run the solution on other systems, you need to dump the target DLL and generate the header file. This process is eased by the DumpDLL tool, which parses DLL dumps and produces the correct, ordered outputs, as shown below:

NTDLL Input:

Function Name     : ZwYieldExecution
Address           : 0x0000000180003040
Relative Address  : 0x00003040
Ordinal           : 1971 (0x7b3)
Filename          : ntdll.dll
Full Path         : C:\Windows\system32\ntdll.dll
Type              : Exported Function

NTDLL Output:

strcpy , 4896
strcat , 4720
memcmp , 4496
_local_unwind , 4432
RtlGetCurrentUmsThread , 4240
RtlEnterCriticalSection , 4192
RtlLeaveCriticalSection , 4112


To build the many components of our framework, you should include their paths in the compilation project, as shown below:

Including libraries paths

On my computer, I was compiling under C:\. If you are compiling from another directory, you need to adjust the /src path accordingly.

To make the BranchClient compilation easier, I included capstone-3.0.4-win64 in the repository.

You should also define system architecture and configurations, as shown below:

Solution Configuration


All required steps for the win!

Driver Installation

As our driver is not signed, you should disable driver signature enforcement in order to use it.

After installing it, you can load it using the services manager, as shown below:

Driver startup


In order to check if the solution is properly working, you can use the simple client to retrieve branch data, as shown below:

Simple client in action

Following the Flow

In order to filter process actions and perform analysis tasks, such as disassembling, you have to start the advanced client with the addresses of the binary and of its libraries, as shown below:

BranchClient usage

In order to ease this process, the Launcher is able to perform the task of retrieving address information and launching the client, as you can see below:

Launcher usage

After its startup, the client is already working, as shown below:

BranchClient in action


The BranchClient\examples directory contains some trace examples obtained from real malware samples. I hope they clarify BranchMonitor's role in binary monitoring. Some identified actions are shown below:

LIB C:\Windows\SysWOW64\user32.dll at 74c68038 (GetCursorPos+0x12) returned to Binary avr.exe at 465806
LIB C:\Windows\SysWOW64\user32.dll at 76489ddc (IsWindowVisible+0x38) returned to Binary Chrome.exe at 4c52a5

In such cases, these functions were used to display the following message:

Message displayed by a malware sample
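
Trace lines such as the two library returns shown above can be split into fields for later filtering. The sketch below is a minimal illustration: the regular expression is inferred from those two sample lines only, other line types in the traces are not handled, and the `Return` tuple and `parse_return` function are hypothetical names.

```python
import re
from collections import namedtuple

Return = namedtuple("Return", "library address function offset binary target")

# Pattern inferred from the two sample lines above (an assumption, not the
# BranchClient output specification).
RETURN_LINE = re.compile(
    r"^LIB (?P<library>.+?) at (?P<address>[0-9a-fA-F]+) "
    r"\((?P<function>[^+)]+)(?:\+0x(?P<offset>[0-9a-fA-F]+))?\) "
    r"returned to Binary (?P<binary>.+?) at (?P<target>[0-9a-fA-F]+)$"
)


def parse_return(line):
    """Return a Return tuple, or None if the line has a different shape."""
    match = RETURN_LINE.match(line.strip())
    if match is None:
        return None
    return Return(
        library=match.group("library"),
        address=int(match.group("address"), 16),
        function=match.group("function"),
        # A missing '+0x..' suffix is treated as offset 0.
        offset=int(match.group("offset") or "0", 16),
        binary=match.group("binary"),
        target=int(match.group("target"), 16),
    )
```

With the parsed tuples, selecting every return into a given binary, or every use of a given API, becomes a simple list comprehension.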


One of the biggest advantages of using BranchMonitor is the transparency it provides. To verify this claim, you can use the checks from the Transparency.Tests directory. My intention is not to provide an exhaustive list of anti-debugging techniques, but rather some insights into transparency.

Currently implemented tests:

  • IsDebuggerPresent
  • CheckRemoteDebugger
  • OutputDebugString

For more information about anti-analysis tricks, check this.


You can check debug messages if the driver was compiled using the DEBUG flag, as shown:

	#define DEBUG

The debug messages are printed on a debug screen. The following figure shows the messages being printed on DbgView, from SysInternals.

Debugging messages printed on DbgView


Applications built upon the developed framework.


A debugger built upon the BranchMonitor framework. The directory is organized as follows:

  • GDB: A GDB stub which can be used to control the BranchMonitor debugger. In the original article, it was integrated into the debugger solution itself, but here I released a standalone version so that people can use it in other applications. It is entirely based on mseaborn's gdb-debug-stub.
  • Driver: To be released.

GDB Usage

The GDB stub is available by setting the remote target on the GDB client, as shown below:

GDB stub

More information is coming soon.

ROP Detector

As a result of the BranchMonitor framework, some Control Flow Integrity (CFI) policies for ROP attack detection were implemented. The ROP directory contains implementations of the CALL-RET and the Gadget-Size policies. Although I previously described a real-time solution in an article, the tools published here are intended for post-analysis. However, you can easily implement these algorithms in the DriverClient, since the traces are retrieved from that tool.

The CALL-RET policy consists of matching pairs of CALLs and RETs, based on the idea that, in an intact (non-hijacked) flow, each RET must be preceded by a matching CALL. This policy is shown below:

('CURRENT STACK ', [['call', 'NewToy', 'printf']])
('CURRENT STACK ', [['call', 'NewToy', 'printf'], ['call', 'printf', '__iob_func']])
('CURRENT STACK ', [['call', 'NewToy', 'printf'], ['call', 'printf', '__iob_func'], ['ret', '__iob_func', 'printf']])
('CURRENT STACK ', [['call', 'NewToy', 'printf']])
('CURRENT STACK ', [['call', 'NewToy', 'printf'], ['ret', 'printf', 'NewToy']])
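
The matching step can be sketched with a shadow stack. This is a simplified illustration rather than the code in the ROP directory: it assumes each event is a `(kind, source, target)` tuple using symbol names, as in the listing above, whereas a real checker would more likely compare return addresses. The function name `check_call_ret` is hypothetical.

```python
def check_call_ret(events):
    """Return the indices of RETs that do not match the most recent CALL.

    Each event is a (kind, source, target) tuple,
    e.g. ('call', 'NewToy', 'printf').
    """
    shadow_stack = []
    violations = []
    for index, (kind, source, target) in enumerate(events):
        if kind == "call":
            shadow_stack.append((source, target))
        elif kind == "ret":
            # A legitimate RET goes from the last callee back to its caller.
            if shadow_stack and shadow_stack[-1] == (target, source):
                shadow_stack.pop()
            else:
                violations.append(index)
    return violations
```

For the sequence shown above (two nested calls followed by their two returns) the function reports no violations, while a RET arriving from a function that was never called is flagged.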

The gadget size policy is a heuristic which assumes that ROP gadgets are shorter than ordinary code blocks, so a moving window is used to register the execution of a given number of small gadgets, as shown below:

('Detected in', [2, 17, 36, 4, 2, 27, 13, 5, 46, 2])
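
A moving-window check of this kind might look like the sketch below. The window length and both thresholds are illustrative assumptions chosen for the example, not the values used by the published tools, and the unit of the block sizes is whatever the trace provides.

```python
from collections import deque


def detect_small_gadget_runs(block_sizes, window=10, small_max=5, min_small=5):
    """Return every full window holding at least `min_small` small blocks.

    window, small_max and min_small are hypothetical parameters.
    """
    recent = deque(maxlen=window)  # keeps only the last `window` sizes
    detections = []
    for size in block_sizes:
        recent.append(size)
        if len(recent) == window:
            small = sum(1 for s in recent if s <= small_max)
            if small >= min_small:
                detections.append(list(recent))
    return detections
```

With these example thresholds, the ten sizes listed above contain five blocks of size 5 or less, so that window would be reported, whereas a run of uniformly large blocks would not.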

Anti-Analysis Tricks Detection

Because the framework is transparent, samples that employ anti-analysis tricks run under it without any problem. This allows us to perform pattern-matching searches for evasion attempts and other tricks. Using such detectors, I was able to identify some of them, shown below:

Fake Conditional Jump:

4001b: xor    %eax,%eax
4001d: jne    4000 <main>
4001f: pop    %rbp

CPU Comparison:

4400:   push   %rbp
4401:   str    0x0(%ebp)
4406:   mov    %rsp,%rbp
4409:   mov    $0x0,%edi
440e:   callq  44013 <main+0x13>
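
The two patterns above lend themselves to simple matching over disassembled blocks. The sketch below is an assumption about how such a detector could be written, not the detector used in this project. It flags a `jne`/`jnz` that directly follows an `xor` of a register with itself (the `xor` always sets the zero flag, so the jump is never taken) and any use of `str`. Instructions are assumed to be `(address, mnemonic, operands)` tuples in AT&T syntax, and `find_tricks` is a hypothetical name.

```python
def find_tricks(instructions):
    """Return (address, label) pairs for the two patterns described above."""
    findings = []
    for i, (address, mnemonic, operands) in enumerate(instructions):
        if mnemonic == "str":
            findings.append((address, "str"))
        if mnemonic in ("jne", "jnz") and i > 0:
            _, prev_mnemonic, prev_operands = instructions[i - 1]
            regs = [op.strip() for op in prev_operands.split(",")]
            # xor %r,%r zeroes the register and sets ZF, so jne cannot fire.
            if prev_mnemonic == "xor" and len(regs) == 2 and regs[0] == regs[1]:
                findings.append((address, "fake conditional jump"))
    return findings
```

A production detector would also need to track flag-preserving instructions placed between the `xor` and the jump, which this sketch deliberately ignores.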

Divergence Analysis

One can also use our transparent tracer as a ground truth for evaluating the way a binary executes inside another tracing tool. The tool under Utils/Divergence.Analysis is suited for this task. A divergence example is shown below:

0x01 | 0x01
0x02 | 0x02
    / \
---- | 0x41
0x03 | 0x42
    \ /
0x05 | 0x05
0x06 | 0x06

The aforementioned tricks were also detected by inspecting the instruction block placed right before a divergent branch instruction.
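
Aligning two traces to locate such divergences can be sketched with Python's standard `difflib` module. This only illustrates the idea and is not the algorithm implemented in Utils/Divergence.Analysis. It assumes each trace is a list of comparable items (here, block identifiers), and `find_divergences` is a hypothetical name.

```python
from difflib import SequenceMatcher


def find_divergences(reference, other):
    """Return (reference_part, other_part) pairs where the traces differ."""
    matcher = SequenceMatcher(a=reference, b=other, autojunk=False)
    return [
        (reference[i1:i2], other[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
```

For the example above, the reference trace `[0x01, 0x02, 0x03, 0x05, 0x06]` and the second trace `[0x01, 0x02, 0x41, 0x42, 0x05, 0x06]` produce a single divergent region in which `0x03` is replaced by `0x41, 0x42`.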


The Utils directory contains some tools and utilities for binary analysis using BranchMonitor. Currently, the following tools are available:

  • PrintFunc: A simple script for printing the functions called on a given trace
  • ManualDisasm: A pybfd-based solution for disassembling small byte sequences.


The PrintFunc utility should be used as follows:

Usage: python <trace> --remove-offsets

The called functions can be printed considering or not their offsets, as shown below:

Considering Offsets:


Discarding Offsets:


You can filter the output in order to increase your analysis power. The following example shows function calls being counted.

Counting command:

python $1 $2 | sort | uniq -c | sort -gr

Command Output:

56 printf
12 WriteFile
10 TerminateThread
2 ExitProcess
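
The same counting can be done directly in Python instead of through the shell pipeline. The sketch below assumes the function names have already been extracted from the trace, one per call, and that offsets, when present, use the `Name+0x..` form seen in the trace examples. `count_calls` is a hypothetical helper, not part of PrintFunc.

```python
from collections import Counter


def count_calls(function_names, remove_offsets=True):
    """Return (name, count) pairs, most frequent first."""
    if remove_offsets:
        # 'GetCursorPos+0x12' -> 'GetCursorPos'
        function_names = [name.split("+", 1)[0] for name in function_names]
    counts = Counter(function_names)
    # Sort by descending count, then by name to make ties deterministic.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

Keeping the offsets (`remove_offsets=False`) distinguishes individual return sites inside the same function, which mirrors the two output modes described above.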


ManualDisasm is a tool to disassemble small pieces of code from trace-retrieved data.

Usage Example:

Usage: python <trace> <addr>

Example: consider the following trace excerpt:

should disasm from 444417 to 444427

Command Example:

python "\x8b\x45\xf0\x3b\xc7\x74\x11\x8d\x4d\xf0\x51\x8b\x4d\x08\x48\x50" 0

Command Output:

0x4 (size=1)	 pop    rsp
0x5 (size=2)	 js     0x000000000000003b
0x7 (size=5)	 xor    eax,0x3066785c
0xC (size=1)	 pop    rsp
0xD (size=2)	 js     0x0000000000000042
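
Note that the output above is consistent with the literal characters of the argument being disassembled rather than the bytes they describe: 0x5c is the backslash (`pop rsp`), 0x78 is the letter `x` (`js`), and the immediate 0x3066785c spells `\xf0` in ASCII. If the script receives the escaped text unchanged from the shell, it may need to be decoded first. A minimal sketch of that decoding step follows; `decode_escaped` is a hypothetical helper, not part of ManualDisasm.

```python
import re

ESCAPED_BYTE = re.compile(r"\\x([0-9a-fA-F]{2})")


def decode_escaped(argument):
    r"""Turn the literal text '\x8b\x45' into the two bytes 0x8b 0x45."""
    return bytes(int(pair, 16) for pair in ESCAPED_BYTE.findall(argument))
```

Applied to the command-line argument above, this yields the 16 raw bytes starting with `8b 45 f0`, which can then be handed to the disassembler.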

Comparing BranchMonitor with other solutions

Whenever possible, I try to compare BranchMonitor with other solutions, either for validation or evaluation. For that purpose, I present here a Dynamic Binary Translation (DBT) tool implemented on Intel PIN. The tool directory, PIN.Branch.Monitor, is organized as follows:

  • Windows: Instrumentation code to be run on Windows.
  • Linux: Instrumentation code to be run on Linux.
  • Comparison: Comparison results between PIN tool and BranchMonitor.

As this tool is implemented as instrumentation code, it can be run on Linux or Windows. The only differences between the two versions are a few function and type names.

The Comparison directory presents the results from running the Branch.Tester code on BranchMonitor and the PIN tool. As can be seen in the example below, the results are similar.

PIN Result:

From: 0000000077332F89 To: 0x7732ec90 Disasm of 1 instr: call
From: 000000007732EC97 To: 0x7732ecab Disasm of 1 instr: jnz
Disasm of 0x7 bytes from 000000007732EC90: 0x48 0x3b 0xd 0x39 0x8e 0xe 0x0

BranchMonitor Result:

Binary C:\BranchMonitoringProject\Branch.Tester\x64\Debug\Branch.Tester.exe at <0x1ca1> to Binary C:\BranchMonitoringProject\Branch.Tester\x64\Debug\Branch.Tester.exe at <0x1c90>
Binary C:\BranchMonitoringProject\Branch.Tester\x64\Debug\Branch.Tester.exe at <0x1c96> to Binary C:\BranchMonitoringProject\Branch.Tester\x64\Debug\Branch.Tester.exe at <0x1c9a>
should disasm from 7ff6d6ec1c90 to 7ff6d6ec1c96

In both cases, the same number of bytes was considered in the execution trace.
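
Branch records like the BranchMonitor lines above are also the raw material for CFG reconstruction, since each record is one edge from a source address to a target address. The sketch below shows the idea under stated assumptions: the line pattern is inferred from the two sample records only, and `build_cfg` is a hypothetical name rather than the BranchClient implementation.

```python
import re
from collections import defaultdict

# Pattern inferred from the sample records above (an assumption).
BRANCH_LINE = re.compile(
    r"^Binary (?P<src_bin>.+?) at <0x(?P<src>[0-9a-fA-F]+)> "
    r"to Binary (?P<dst_bin>.+?) at <0x(?P<dst>[0-9a-fA-F]+)>$"
)


def build_cfg(lines):
    """Map each (binary, offset) source to its sorted list of targets."""
    edges = defaultdict(set)
    for line in lines:
        match = BRANCH_LINE.match(line.strip())
        if match is None:
            continue  # skip lines with a different shape
        source = (match.group("src_bin"), int(match.group("src"), 16))
        target = (match.group("dst_bin"), int(match.group("dst"), 16))
        edges[source].add(target)
    return {source: sorted(targets) for source, targets in edges.items()}
```

Repeated records collapse into a single edge, so a loop executed many times still contributes one edge to the graph.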

Open Implementation Issues

I am performing some code clean up before publishing the final solution. This way, some features are not available yet. I plan to release such features as soon as possible.


This framework is presented as a proof-of-concept (PoC) of the branch monitoring capabilities, so some limitations exist, such as:

  • Single Core Analysis: The branch mechanism should be extended to operate on multicore systems.
  • I/O Limitation: Currently, I/O is performed by polling. The driver should be extended to support IOCTLs.
  • Debug:
    • Debug messages are currently implemented as functions. Macros should be used instead.
    • Debug is enabled using #defines. A dynamic control mechanism should be implemented.
    • Debug messages are printed on every function. We need verbosity control.
  • CPU Checks: PERF_COUNT support check is missing.
  • BranchClient Multi-Thread Support: How to launch more threads without breaking flow tracking?

Other Research Using BranchMonitor

Future Plans

  • Linux version.


I would really like to receive your contributions. For now, here is a non-exhaustive list of possible contributions:

  • Implementing missing features: See Limitations section.
  • Solving TO-DOs: Lots of improvements along the code.
  • Replacing insecure functions: Remove all strcpy and shell=True from the code.
  • Add new Utils: The more analysis tools the better!

Moving forward:

I would really like to make this project more than a proof of concept, but I don't have time to perform all the refactoring required for that. For reference, some of the required modifications are:

  • Integrate all Interrupt handling routines into a single one.
  • Make multi-core support default.
  • Make multi-page data collection default.
  • Implement a userland-kernel page sharing mechanism.
  • Handle x2APIC interrupts.
  • Convert static C headers into a dynamic python-pickle database.
  • Develop an installer.


My academic work related to branch monitoring.

In English

  • We published an academic paper titled Enhancing Branch Monitoring for Security Purposes: From Control Flow Integrity to Malware Analysis and Debugging, in the ACM Transactions on Privacy and Security (TOPS). It covers both theory and practice about branch monitoring. Check Pre Print Here

  • If you want to know more about hardware-assisted monitoring solutions, check out our survey here

In Portuguese


Check out our Youtube playlist.

Mentions to the BranchMonitoring project

It is always great to have our efforts acknowledged, so I present here some mentions of this work:

Please tell me if you are using or referring to this project.


A branch-monitor-based solution for process monitoring.