rfarley3/CodeXt-ugly: The good, the bad, and the ugly of CodeXt mid-dissertation writing. Just something to get the code online.

Repository contents:

AvalancheTesting
BranchTesting
ByteArrays
CodeXt
DasosPreproc
DiskImgs
EditedS2ESrcFiles
Given-Code
HostFiles
LoadingBinaries
ObfuscationTesting
SavedOutputs
ShellcodeTools
SymbolicTaintTesting
libDasosf
old
outputs
s2ecmd
s2eget
s2ekill
shellcode-wrapper
stubs
udis86-1.7-64b
udis86-1.7
._Makefile
._NoRetranslation.txt
._Retranslation1.txt
.gitignore
CodeXt.sublime-project
CodeXt.sublime-workspace
LICENSE
Makefile
NoRetranslation.txt
Notes.txt
PluginsDir
PluginsReadme.txt
README
Retranslation1.txt
backtrace.sh
calc.pl
constraints.output
dailyCommit.sh
dirtyCommit.sh
firstRan.sh
firstRan.sh-backup
gdb-core.sh
lessLine
make-color
preprocessed.rawshell
randomFillGenerator.c
rsnapshot-s2e-commit.conf
rsnapshot-s2e-dirty.conf
s
scripted1krep20.pl
scriptedRun.pl
searchSource.sh
segmented100k.pl
segmented100k2.pl
segmented10k.pl
showPatch.sh
symlinksWithinS2ESrcTree.txt


 .d8888b.              888       Y88b   d88P888    
d88P  Y88b             888        Y88b d88P 888    
888    888             888         Y88o88P  888    
888        .d88b.  .d88888 .d88b.   Y888P   888888 
888       d88""88bd88" 888d8P  Y8b  d888b   888    
888    888888  888888  88888888888 d88888b  888    
Y88b  d88PY88..88PY88b 888Y8b.    d88P Y88b Y88b.  
 "Y8888P"  "Y88P"  "Y88888 "Y8888d88P   Y88b "Y888

Available at https://github.com/rfarley3/CodeXt-ugly.git

A plugin for S2E (https://sites.google.com/site/dslabepfl/proj/s2e).

Created by Ryan Farley  <rfarley3@gmu.edu> or <rfarley@mitre.org>
       and Xinyuan Wang <xwangc@gmu.edu>
       
See "CodeXt: Automatic Extraction of Obfuscated Attack Code from Memory Dump." 
    In Proceedings of the 17th Information Security and Forensics Society 
    Information Security Conference (ISC 2014). Hong Kong, October 2014.

Keywords: Malware Forensics, Binary Analysis, Symbolic Execution

****************************************************
CodeXt extends S2E (with some modifications to the underlying engines: 
S2E/QEMU/KLEE) to monitor x86 byte code for the purposes of shellcode/malware 
modeling, forensics, or generic code extraction. Upon real-time detection of
an attack, CodeXt can automatically and accurately pinpoint the 
exact start and boundaries of the attack code, even if it is mingled with
random bytes in memory. CodeXt handles self-modifying code and multiple 
layers of encoding generically, and it can automatically extract the 
complete hidden and transient code protected by multiple layers of 
sophisticated encoders without using any signature or pattern of the 
decoder.
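
To illustrate what "multiple layers of encoding" means in an execution 
trace, here is a toy sketch. It is NOT CodeXt's algorithm or any S2E API. 
It uses the generic write-then-execute heuristic: when an address that was 
written earlier is later executed, a new layer of code has been revealed. 
The event format and the name count_layers are assumptions made for this 
example only.

```python
def count_layers(events):
    """Count write-then-execute transitions in a toy trace.

    events: iterable of ("write", addr) or ("exec", addr) tuples.
    This event format is hypothetical, not CodeXt output.
    """
    written = set()   # addresses written since the last revealed layer
    layers = 0
    for kind, addr in events:
        if kind == "write":
            written.add(addr)
        elif kind == "exec" and addr in written:
            # Executing freshly written bytes: treat it as a new layer
            # and start tracking writes for the next one.
            layers += 1
            written.clear()
    return layers


# A decoder at address 0 writes a second stage at 10-11, which in turn
# writes and runs a third stage at 20.
trace = [
    ("exec", 0),
    ("write", 10), ("write", 11),
    ("exec", 10), ("exec", 11),
    ("write", 20),
    ("exec", 20),
]
layer_count = count_layers(trace)
```

For the trace above the function counts two revealed layers. A real 
analysis also has to deal with partial overwrites and with code that 
rewrites itself in place, which this sketch ignores.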

What you can do:
- You can give it a buffer of memory and it will comb through and 
  report back all the executable chunks of code within it. 
- You can give it a fragment of code or a full executable and 
  it will report back detailed execution information.
- You can expand branch exploration by marking segments of 
  memory as symbolic, and each fork will be tracked.
- You can follow how data and instructions influence other data
  via a taint labeling mechanism that leverages KLEE. For example, 
  you can monitor network applications by marking all input 
  as tainted, then determine which attack bytes affect which 
  bytes in a forensics dump. 
- The output information includes:
  * translated instruction trace
  * executed instruction trace
  * trace of data writes within a specified range of memory
  * grouping of instructions into related execution strings
  * visual deltas of memory changes (to follow self-mutating
    code iterations)
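
The taint labeling idea in the list above can be modeled in a few lines. 
This is a minimal sketch only: CodeXt attaches its labels through KLEE, 
while the code below uses a plain dictionary, and the class TaintMap and 
its methods are hypothetical names, not part of CodeXt, S2E, or KLEE.

```python
class TaintMap:
    """Toy byte-level taint tracker (illustrative, not CodeXt's API)."""

    def __init__(self):
        self.labels = {}  # address -> set of input labels

    def mark_input(self, addr, label):
        # Tag an input byte, e.g. one byte of a network packet.
        self.labels[addr] = {label}

    def copy(self, dst, src):
        # A move replaces the destination's labels with the source's.
        self.labels[dst] = set(self.labels.get(src, set()))

    def combine(self, dst, src_a, src_b):
        # A two-operand operation (xor, add, ...) unions both label sets.
        merged = self.labels.get(src_a, set()) | self.labels.get(src_b, set())
        self.labels[dst] = merged

    def sources(self, addr):
        # Which input labels influenced this byte?
        return sorted(self.labels.get(addr, set()))


taint = TaintMap()
taint.mark_input(0x100, "in0")
taint.mark_input(0x101, "in1")
taint.copy(0x200, 0x100)            # 0x200 now depends on in0
taint.combine(0x201, 0x200, 0x101)  # 0x201 depends on in0 and in1
influences = taint.sources(0x201)
```

Querying sources() for each byte of a dump answers the question of which 
input bytes affected it. Overwriting a byte with untainted data clears its 
labels in this model.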

*****************************************************