Latest commit bcda6a1 Feb 23, 2017 @pgoodman pgoodman committed on GitHub Llvm38 (#112)
* - Upgrade to official llvm 3.8
- remove boost
- unify all cmake files into a single cmake file
- use official protobuf
- start factoring out x86-specific stuff to eventually make an arm port easier
- simplify the CLI; now use mcsema-lift, with -arch, -os, -cfg, -entrypoint, and -o. No more having to specify the target triple.
- moves source code slightly closer to our style guide

- lifted bitcode is not quite right in some cases, so this isn't a stable branch!
- TODO: re-add test cases to discover source of stability problems.

* Some minor fixes, one to make sure xmm regs in the state struct are properly aligned

* Added missing std defs for option parsing. This makes /bin/ls work properly :-)

* Remove old cmake files

* Minor changes to get_cfg.py and raiseX86.cpp in relation to those changes. Those changes don't fix anything; the purpose was to make symbol names match between Python and C++. E.g. get_cfg would name things like dta_0xf00, sub_0xf00, ext_... And it seems that it was dta_ instead of data_ for a really flaky and dumb reason, but oh well. I also fixed a subtle bug related to saving and restoring callee-saved registers on ELF 64. I have not made related changes to ELF 32 or PE 32/64, though those may be necessary.

* Minor fix

* Adding mcsema-disass, which is a nice wrapper around get_cfg.py.

* Working on readme and cleaning out (currently) unused stuff from the repo

* Renaming mc-sema dir to mcsema

* new travis file

* Updates to bootstrap and build process

* Minor bootstrap fixes

* Well, don't have windows working yet but this is kind of progress I think

* Travis should work now

* Updating protobuf-cmake files so we can generate a VS2015 solution

* Removing and adding some choco packages from README

* Bootstrap now builds protobuf and generates protobuf files
LLVM should now be built on Windows

* Adding Win32 specific compiler options

* Renamed ConstantInt to CreateConstantInt to satisfy MSVC

* Build Release LLVM to not have linking conflicts of MD vs MDd

* Added some missing instructions

* Adding changes to generate runtimes

* Windows bootstrap works.


McSema

McSema (pronounced 'em see se ma'), short for machine code semantics, is a set of tools for lifting x86 and amd64 binaries to LLVM bitcode modules. McSema is able to lift integer, floating point, and SSE instructions.

McSema is separated into two conceptual parts: control flow recovery and instruction translation. Control flow recovery is performed using the mcsema-disass tool, which uses IDA Pro to disassemble a binary file and produces a CFG file. Instruction translation is performed using the mcsema-lift tool, which converts a CFG file into a lifted bitcode module.

McSema is open-source and licensed under the BSD 3-clause license.

Additional Documentation

Note: McSema is undergoing modernization and architectural changes, so some documentation may be out of date or still being improved.

Getting Help

If you are experiencing undocumented problems with McSema, or just want to learn more and contribute, then ask for help in the #tool-mcsema channel of the Empire Hacking Slack. Alternatively, you can join our mailing list at mcsema-dev@googlegroups.com or email us privately at mcsema@trailofbits.com.

Supported Platforms

McSema is supported on Windows and Linux platforms and has been tested on Windows 10, Ubuntu 14.04, and Ubuntu 16.04.


Name                  Version
Git                   Latest
CMake                 2.8+
Google Protobuf       2.6.1
LLVM                  3.8
Clang                 3.8
Python                2.7
Python Package Index  Latest
python-protobuf       2.6.1
IDA Pro               6.7+
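
Before building, it can help to confirm that an installed tool meets the minimum version listed above. The following is a minimal sketch and not part of McSema: `version_ge` is a hypothetical helper name, and it assumes GNU `sort` with the `-V` (version sort) option, which Ubuntu provides. Only the CMake check is shown; the other tools would follow the same pattern.

```shell
#!/bin/sh
# Hypothetical helper (not part of McSema): succeeds when version $1 >= version $2.
# Assumes GNU sort with -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example: check CMake against the 2.8 minimum from the table above.
if command -v cmake >/dev/null 2>&1; then
    # The first line of `cmake --version` has the form "cmake version X.Y.Z".
    cmake_version=$(cmake --version | head -n 1 | awk '{print $3}')
    if version_ge "$cmake_version" "2.8"; then
        echo "cmake $cmake_version: OK"
    else
        echo "cmake $cmake_version: too old, need 2.8+"
    fi
else
    echo "cmake: not found"
fi
```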

Getting and Building the Code

Step 1: Install dependencies

On Linux

sudo apt-get update
sudo apt-get upgrade

sudo apt-get install \
     git \
     cmake \
     libprotoc-dev libprotobuf-dev protobuf-compiler \
     python2.7 python-pip \
     llvm-3.8 clang-3.8

sudo pip install --upgrade pip
sudo pip install 'protobuf==2.6.1'

Using IDA on 64-bit Ubuntu

If your IDA install does not use the system's Python, you can add the protobuf library manually to IDA's zip of modules.

# The Python module directory is generally under /usr/lib or /usr/local/lib
touch /path/to/lib/python2.7/dist-packages/google/__init__.py
cd /path/to/lib/python2.7/dist-packages/
sudo zip -rv /path/to/ida-6.X/python/lib/python27.zip google/
sudo chown your_user:your_user /path/to/ida-6.X/python/lib/python27.zip

On Windows

Step 1: Download Chocolatey

Download and install Chocolatey.

Step 2: Install Packages

Open Windows PowerShell in administrator mode and run the following.

choco install -y --allowemptychecksum git cmake python2 pip wget unzip 7zip
choco install -y microsoft-visual-cpp-build-tools --installargs "/InstallSelectableItems Win81SDK_CppBuildSKUV1;Win10SDK_VisibleV1"

Step 2: Clone and Enter the Repository

On Linux

Clone the repository

git clone git@github.com:trailofbits/mcsema.git --depth 1

Run the bootstrap script

cd mcsema
./bootstrap.sh Release

On Windows

Clone the repository

Open the Developer Command Prompt for Visual Studio and run the following.

cd C:\
if not exist git mkdir git
cd git

git clone https://github.com/trailofbits/mcsema.git --depth 1

Run the bootstrap script

cd mcsema
bootstrap.bat

Step 3: Build and install the code

On Linux

cd build
sudo make install
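
After installation, a quick check can confirm that the two McSema tools are reachable by name. This is a sketch under the assumption that `make install` places `mcsema-disass` and `mcsema-lift` on your `PATH`; `missing_tools` is a hypothetical helper, not something McSema ships.

```shell
#!/bin/sh
# Hypothetical helper (not part of McSema): print each named command
# that cannot be found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing=$(missing_tools mcsema-disass mcsema-lift)
if [ -z "$missing" ]; then
    echo "McSema tools found on PATH"
else
    # Unquoted on purpose so multiple names print on one line.
    echo "Not found on PATH:" $missing
fi
```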

On Windows

Try it Out

If you have a binary, you can get started with the following commands. First, recover control flow graph information using mcsema-disass, which currently requires IDA Pro as the disassembler.

mcsema-disass --disassembler /path/to/ida/idal64 --arch amd64 --os linux --output /tmp/ls.cfg --binary /bin/ls --entrypoint main

Once you have the control flow graph information, you can lift the target binary using mcsema-lift.

mcsema-lift --arch amd64 --os linux --cfg /tmp/ls.cfg --entrypoint main --output /tmp/ls.bc

There are a few things that we can do with the lifted bitcode. The usual thing to do is to recompile it back to an executable.

clang-3.8 -o /tmp/ls_lifted generated/ELF_64_linux.S /tmp/ls.bc -lpthread -ldl -lpcre /lib/x86_64-linux-gnu/libselinux.so.1
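
The three steps above can be chained in one script. The following is a sketch only: the variable names, the `run` and `lift_and_rebuild` functions, and the `DRY_RUN` switch are illustrative assumptions rather than McSema features, and it uses only the command-line options shown above. The link flags in the final step are the ones used for /bin/ls; other binaries will need their own libraries.

```shell
#!/bin/sh
# Illustrative defaults; override them from the environment.
IDA="${IDA:-/path/to/ida/idal64}"
BINARY="${BINARY:-/bin/ls}"
ENTRY="${ENTRY:-main}"
OUT="${OUT:-/tmp/ls}"

# Print the command, then execute it unless DRY_RUN is set to 1.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

lift_and_rebuild() {
    # 1. Recover control flow with IDA Pro.
    run mcsema-disass --disassembler "$IDA" --arch amd64 --os linux \
        --output "$OUT.cfg" --binary "$BINARY" --entrypoint "$ENTRY" || return 1
    # 2. Lift the CFG to LLVM bitcode.
    run mcsema-lift --arch amd64 --os linux --cfg "$OUT.cfg" \
        --entrypoint "$ENTRY" --output "$OUT.bc" || return 1
    # 3. Recompile the bitcode (libraries here are specific to /bin/ls).
    run clang-3.8 -o "${OUT}_lifted" generated/ELF_64_linux.S "$OUT.bc" \
        -lpthread -ldl -lpcre /lib/x86_64-linux-gnu/libselinux.so.1
}
```

With the functions loaded into your shell, `DRY_RUN=1 lift_and_rebuild` prints the three commands without running them, which is a cheap way to check paths before invoking IDA. Run it from the repository root so that the relative path generated/ELF_64_linux.S resolves.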