A taint-tracking plugin for the Valgrind memory checking tool

bugfix: increase TI_MAX to 700

latest commit ecebc4b6e5
Wei Ming Khoo authored


Taintgrind: a Valgrind taint analysis tool

2014-09-25 Support for client requests

2014-09-15 Support for Valgrind 3.10.0, x86_linux and amd64_linux

2013-12-20 Experimental support for 32-bit ARM, tested on Android 4.4 emulator with API 19

2013-11-18 Currently supporting: Valgrind 3.9.0, x86_linux and amd64_linux

Installation

  1. Download Valgrind and build

    [me@machine ~/] tar jxvf valgrind-X.X.X.tar.bz2
    [me@machine ~/] cd valgrind-X.X.X
    [me@machine ~/valgrind-X.X.X] ./autogen.sh
    [me@machine ~/valgrind-X.X.X] ./configure --prefix=`pwd`/inst
    [me@machine ~/valgrind-X.X.X] make && make install
    
  2. Git clone and build taintgrind

    [me@machine ~/valgrind-X.X.X] git clone http://github.com/wmkhoo/taintgrind.git
    [me@machine ~/valgrind-X.X.X] cd taintgrind 
    [me@machine ~/valgrind-X.X.X/taintgrind] ../autogen.sh
    [me@machine ~/valgrind-X.X.X/taintgrind] ./configure --prefix=`pwd`/../inst
    [me@machine ~/valgrind-X.X.X/taintgrind] make && make install
    

Usage

[me@machine ~/valgrind-X.X.X] ./inst/bin/valgrind --tool=taintgrind --help
...
user options for Taintgrind:
    --file-filter=<full_path>   full path of file to taint [""]

If this field is '*', it is equivalent to --taint-all=yes

    --taint-start=[0,800000]    starting byte to taint (in hex) [0]
    --taint-len=[0,800000]      number of bytes to taint from taint-start (in hex)[800000]
    --taint-all= no|yes         taint all bytes of all files read. warning: slow! [no]
    --tainted-ins-only= no|yes  print tainted instructions only [yes]

Tainted instructions are instructions in which one or more input/output variables are tainted.

    --critical-ins-only= no|yes print critical instructions only [no]

At the moment, critical instructions include loads, stores, conditional jumps and indirect jumps/calls. If --critical-ins-only is turned on, no other instructions are printed.

The last two options control the output of taintgrind. If both of these options are 'no', then taintgrind prints every instruction executed. Run without any options, taintgrind taints nothing, so no tainted instructions are listed and the program's output is printed as usual.
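The filtering rules above can be read as a single predicate. The following is only a sketch: the enum, the function names, and in particular the behaviour when both options are 'yes' (an instruction must then be both tainted and critical) are assumptions for illustration, not taken from Taintgrind's source.

```c
#include <stdbool.h>

/* Hypothetical instruction kinds. The first four are the "critical"
 * kinds named in the text; everything else is INS_OTHER. */
typedef enum {
    INS_LOAD,
    INS_STORE,
    INS_COND_JUMP,
    INS_INDIRECT_JUMP,
    INS_OTHER
} ins_kind;

static bool is_critical(ins_kind kind)
{
    return kind != INS_OTHER;
}

/* Returns true if an instruction would be printed under the two
 * output options. Combining the options with a logical AND is an
 * assumption. */
bool should_print(ins_kind kind, bool tainted,
                  bool tainted_ins_only, bool critical_ins_only)
{
    if (tainted_ins_only && !tainted)
        return false;
    if (critical_ins_only && !is_critical(kind))
        return false;
    return true;
}
```

With both options set to 'no', should_print returns true for every instruction, which matches the "prints every instruction executed" case.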

Sample output

Run Taintgrind with e.g.

> valgrind --tool=taintgrind --file-filter=/path/to/test.txt --taint-start=0 --taint-len=1 gzip /path/to/test.txt
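Here --taint-start=0 and --taint-len=1 select only the first byte of the file. As a rough sketch of how the two hex-valued options could map to a byte range: the helper name offset_is_tainted and the half-open interval [start, start + len) are assumptions for illustration, not taken from Taintgrind's source.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: returns true if file offset `off` lies in the
 * range selected by --taint-start and --taint-len. Both option values
 * are parsed as hexadecimal, as the help text states. Treating the
 * range as [start, start + len) is an assumption. */
bool offset_is_tainted(const char *taint_start, const char *taint_len,
                       unsigned long off)
{
    unsigned long start = strtoul(taint_start, NULL, 16);
    unsigned long len   = strtoul(taint_len, NULL, 16);

    /* Written as a subtraction to avoid overflow in start + len. */
    return off >= start && off - start < len;
}
```

Because the values are hexadecimal, --taint-len=10 would cover 16 bytes under this reading, not 10.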

The output of taintgrind is a list of Valgrind IR (VEX) statements of the form

Address/Location | VEX-IRStmt | Runtime value(s) | Taint value(s) | Information flow
0x8049A1B: lm_init (deflate.c:345) | t24_1 = LOAD I8 0x8097ae0 | 0x61 | 0xff | t24_1 <- window

The first instruction indicates a byte (type I8, or int8_t) is loaded from address 0x8097ae0 into temporary variable t24_1. Its run-time value is 0x61, and its taint value is 0xff, which means all 8 bits are tainted. The information flow indicates that taint is flowing from 0x8097ae0 (or window symbol) to t24_1. An instruction with no tainted variables will not have information flow. With debugging information, taintgrind can list the source location (lm_init at deflate.c:345) and the variable name (window).

0x8049A1B: lm_init (deflate.c:345) | t23_1 = 8Sto16 t24_1 | 0x61 | 0xff | t23_1 <- t24_1

Only one run-time/taint value per instruction is shown. That variable is usually the one being assigned, e.g. t23_1 in this case. In the case of an if-goto, it is the conditional variable; in the case of an indirect jump, it is the jump target. Loads and stores have two possible useful run-time values: the address and the data being loaded/stored. We have simply chosen to print the data. Details of VEX operators and IRStmts can be found in VEX/pub/libvex_ir.h.
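For post-processing, the five-column layout can be split on the " | " separator. The sketch below is not part of Taintgrind. The function names are hypothetical, and it assumes that " | " never occurs inside a column, which holds for the sample lines shown here but is not guaranteed.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Splits `line` in place on " | " and stores up to `max` field
 * pointers in `fields`. Returns the number of fields stored. */
size_t split_fields(char *line, char *fields[], size_t max)
{
    static const char sep[] = " | ";
    size_t n = 0;
    char *p = line;

    while (n < max) {
        fields[n++] = p;
        char *next = strstr(p, sep);
        if (next == NULL)
            break;
        *next = '\0';                 /* terminate the current field */
        p = next + (sizeof sep - 1);  /* skip past the separator */
    }
    return n;
}

/* Counts the set bits in a hexadecimal taint value such as "0xff". */
unsigned tainted_bits(const char *taint_field)
{
    unsigned long long mask = strtoull(taint_field, NULL, 16);
    unsigned count = 0;

    for (; mask != 0; mask >>= 1)
        count += (unsigned)(mask & 1ULL);
    return count;
}
```

Applied to the LOAD line above, the fourth field is "0xff" and tainted_bits reports 8, i.e. every bit of the loaded byte is tainted.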

Notes

Taintgrind is based on Valgrind's MemCheck and Flayer.

Taintgrind borrows the bit-precise shadow memory from MemCheck and only propagates explicit data flow. This means that Taintgrind will not propagate taint in control structures such as if-else, for-loops and while-loops. Taintgrind will also not propagate taint through dereferenced tainted pointers: a value loaded via a tainted address is not itself marked as tainted.
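To make the distinction concrete, the three hypothetical functions below all return their argument unchanged, but only the first does so through explicit data flow. Under the rules just described one would expect only copy_explicit to yield a tainted result for a tainted x. This is an illustration of the stated rules and has not been checked against actual Taintgrind output.

```c
/* Explicit data flow: the result is assigned directly from x. */
unsigned char copy_explicit(unsigned char x)
{
    unsigned char y = x;
    return y;
}

/* Implicit (control) flow: x only decides which branches run. Every
 * value written to y comes from the untainted loop counter. */
unsigned char copy_implicit(unsigned char x)
{
    unsigned char y = 0;

    for (unsigned bit = 0; bit < 8; bit++) {
        if (x & (1u << bit))
            y |= (unsigned char)(1u << bit);
    }
    return y;
}

/* Tainted pointer dereference: x is used only as an index, so the
 * address is derived from x while the table contents are not. */
unsigned char copy_via_lookup(unsigned char x)
{
    unsigned char table[256];

    for (int i = 0; i < 256; i++)
        table[i] = (unsigned char)i;
    return table[x];
}
```

All three functions compute the same value, which is why tools that track explicit flow only can under-report taint on code written in the second or third style.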

Client requests

Taintgrind may be further controlled via client requests:

On a 32-bit OS,

TNT_MAKE_MEM_TAINTED ( UInt *buffer, Size_t len )
TNT_MAKE_MEM_UNTAINTED ( UInt *buffer, Size_t len )
TNT_START_PRINT()
TNT_STOP_PRINT()

For example,

> cat -n sign.c
1  #include "taintgrind.h"

The header file taintgrind.h includes all available client requests.

2  int get_sign(int x) {
3      if (x == 0) return 0;
4      if (x < 0)  return -1;
5      return 1;
6  }

Let us assume get_sign is our function of interest.

7  int main(int argc, char **argv)
8  {
9      // Turns on printing
10     TNT_START_PRINT();

The request TNT_START_PRINT() turns on printing and turns off the options --tainted-ins-only and --critical-ins-only.

11     int a = 1000;
12     // Defines int a as tainted
13     TNT_MAKE_MEM_TAINTED(&a,4);

The request TNT_MAKE_MEM_TAINTED allows any buffer to be tainted, not just data that arrives through file I/O or system calls.

14     int s = get_sign(a);
15     // Turns off printing
16     TNT_STOP_PRINT();

TNT_STOP_PRINT() stops further output.

17     return s;
18 }

Compile with

> gcc -Ivalgrind-X.X.X/taintgrind/ -Ivalgrind-X.X.X/include/ -g sign.c -o sign-cl

Run with

[valgrind-X.X.X] ./inst/bin/valgrind --tool=taintgrind ~/sign-cl

This should give the first instruction

0x4007C0: main (sign.c:10) | t11_10756 = r32_0 I64 | 0xdeadbeefdeadbeef | 0x0 | 

And the last instruction

0x400878: main (sign.c:16) | JMP 0x40089e | 0x40089e | 0x0 | 

The first tainted instruction should be

0x40083D: main (sign.c:14) | t22_6841 = LOAD I32 t19_5868 | 0x3e8 | 0xffffffff | t22_6841 <- ffefffda8_unknownobj_0

The two tainted if-gotos should come up as

0x400730: get_sign (sign.c:3) | IF t40_1543 GOTO 0x400732 | 0x0 | 0x1 | t40_1543
0x40073D: get_sign (sign.c:4) | IF t8_9848 GOTO 0x40073f | 0x0 | 0x1 | t8_9848

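As expected, both conditions are false, so their run-time values are 0. Their taint value of 0x1 shows that each condition depends on the tainted variable a. Finally, the return value of get_sign is the constant 1 and, because taint is not propagated through control flow, it is untainted (taint value 0x0):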

0x400746: get_sign (sign.c:5) | r16_3 = 0x1 | 0x1 | 0x0 | 
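The walk-through above taints a single int. The same requests can in principle bracket work on a larger buffer, with TNT_MAKE_MEM_UNTAINTED clearing the taint afterwards. The sketch below uses the request names and the (buffer, len) argument order listed earlier. The USE_TAINTGRIND build flag, the no-op fallback macros and the function names are assumptions added so the file also builds without taintgrind.h.

```c
#include <stddef.h>

#ifdef USE_TAINTGRIND   /* hypothetical flag, e.g. gcc -DUSE_TAINTGRIND */
#include "taintgrind.h"
#else
/* No-op stand-ins (not part of Taintgrind) for builds without the header. */
#define TNT_MAKE_MEM_TAINTED(addr, len)   ((void)0)
#define TNT_MAKE_MEM_UNTAINTED(addr, len) ((void)0)
#define TNT_START_PRINT()                 ((void)0)
#define TNT_STOP_PRINT()                  ((void)0)
#endif

/* Function of interest: sums the bytes of a buffer. Each addition is
 * an explicit data flow from buf[i] into sum. */
static unsigned checksum(const unsigned char *buf, size_t len)
{
    unsigned sum = 0;

    for (size_t i = 0; i < len; i++)
        sum += buf[i];
    return sum;
}

/* Taints the buffer, prints only while checksum runs, then untaints. */
unsigned tainted_checksum(unsigned char *buf, size_t len)
{
    unsigned sum;

    TNT_MAKE_MEM_TAINTED(buf, len);
    TNT_START_PRINT();
    sum = checksum(buf, len);
    TNT_STOP_PRINT();
    TNT_MAKE_MEM_UNTAINTED(buf, len);
    return sum;
}
```

Since the additions are explicit data flows, one would expect sum to be reported as tainted when the program runs under the tool. Without USE_TAINTGRIND the requests compile to nothing and the program behaves like ordinary C.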

License

Taintgrind is licensed under GNU GPLv2.
