# Detecting and Mitigating Memory Corruption Errors

- from a programmer's point of view, there are two main ways to detect buffer overflows in C/C++ programs
- white box testing and black box testing

## White box testing
- the tester has access to the source code
- review the code manually or use a scanning tool to detect memory flaws and vulnerabilities
- commonly done as static analysis, i.e. examining the code without running it
- code can be read manually for memory-related errors such as memory leaks, buffer overruns, etc.
    - pros and cons?
- automated tools can scan the code for the same kinds of errors
    - pros and cons?

## Black box testing
- commonly done as dynamic analysis, i.e. observing the program while it runs
- manually test the binary/executable
- employ `fuzz testing`: use automated tools called fuzzers to feed invalid, unexpected, or random data to the program as input
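
The idea behind fuzz testing can be sketched in a few lines of C++. This is only an illustration under stated assumptions: `copy_name`, `random_input`, and `fuzz` are hypothetical names that do not come from the demos, and real fuzzers are far more sophisticated than a plain random loop. The loop feeds random strings to a target and counts how often a simple invariant is violated.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <random>
#include <string>

// Hypothetical target: copies at most cap-1 bytes into dst and always terminates it.
// Returns the number of bytes copied (excluding the terminator).
std::size_t copy_name(char* dst, std::size_t cap, const std::string& src)
{
    assert(dst != nullptr && cap > 0);
    std::size_t n = src.size() < cap - 1 ? src.size() : cap - 1;
    std::memcpy(dst, src.data(), n);
    dst[n] = '\0';
    return n;
}

// Build a random printable string whose length is in [0, max_len].
std::string random_input(std::mt19937& rng, std::size_t max_len)
{
    std::uniform_int_distribution<std::size_t> len_dist(0, max_len);
    std::uniform_int_distribution<int> char_dist(32, 126); // printable ASCII
    std::string s(len_dist(rng), ' ');
    for (char& c : s) c = static_cast<char>(char_dist(rng));
    return s;
}

// Feed `rounds` random inputs to the target and count invariant violations.
int fuzz(unsigned seed, int rounds)
{
    std::mt19937 rng(seed);
    int failures = 0;
    for (int i = 0; i < rounds; ++i) {
        char buf[20];
        std::string in = random_input(rng, 200);
        std::size_t n = copy_name(buf, sizeof buf, in);
        std::size_t expected = in.size() < sizeof buf - 1 ? in.size() : sizeof buf - 1;
        if (n != expected || buf[n] != '\0') ++failures;
    }
    return failures;
}
```

Calling `fuzz(42, 1000)` on this bounded target should report zero failures. Pointing the same loop at an unbounded copy and running it under a memory checker is how a fuzzer typically surfaces overflows.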

### Use 3rd party scanners such as Valgrind's Memcheck
- https://valgrind.org/docs/manual/quick-start.html
- compile your program with `-g` (debugging info, so that Memcheck's error messages include exact line numbers)
    - `-O0` is also a good idea, if you can tolerate the slowdown; with `-O1`, line numbers in error messages can be inaccurate
- must install the `valgrind` and `libc6-dbg:i386` packages

### Use gcc/g++ compiler flags

### NOTE: Automated tools are not perfect!

In [1]:
! echo kali | sudo -S apt install valgrind -y

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
valgrind is already the newest version (1:3.18.1-1).
The following packages were automatically installed and are no longer required:
  cryptsetup-run fastjar gir1.2-gst-plugins-base-1.0 gnome-desktop3-data
  gnome-session-canberra gstreamer1.0-pulseaudio jarwrapper
  kali-wallpapers-2021.4 kazam libamtk-5-0 libamtk-5-common libavresample4
  libcbor0 libdap27 libdapclient6v5 libepsilon1 libfluidsynth2 libfmt7
  libgdal28 libgeos-3.9.0 libgnome-desktop-3-19 libgupnp-1.2-0 libidn11
  libigdgmm11 libnetcdf18 libntfs-3g883 libodbc1 libodbccr2 libomp-9-dev
  libomp5-9 libperl5.30 libproj19 libqhull8.0 librest-0.7-0 libssl1.0.2
  libtepl-5-0 liburcu6 liburing1 libwireshark14 libwiretap11 libwsutil12
  libxkbregistry0 libxml-dom-perl libxml-perl libxml-regexp-perl libyara4
  linux-image-5.10.0-kali8-amd64 linux-image-5.9.0-kali4-amd64 odbcinst
  odbcinst1debian2 python3-editor python3-exif python3-

In [2]:
# check valgrind version
! valgrind --version

valgrind-3.18.1


In [3]:
%%bash
# install libc6-dbg and libc6-dbg:i386 for debugging x86 (32-bit) programs on x64
echo kali | sudo -S apt install libc6-dbg -y
echo kali | sudo -S apt install libc6-dbg:i386 -y

Reading package lists...
Building dependency tree...
Reading state information...
libc6-dbg is already the newest version (2.33-1).

In [4]:
# let's use demos/memory_leak.cpp program for demo
! cat demos/memory_leak.cpp

  #include <stdlib.h>
  #include <cstring>
  #include <cstdio>

  void f(char * arg)
  {
	 // C dynamic memory
	 int* x = (int *)malloc(10 * sizeof(int));
	 // C++ dynamic memory
	 char* name = new char[20];
	 
	 x[10] = 0;        // problem 1: heap block overrun
		                // problem 2: memory leak -- x not freed
	 strcpy(name, arg);
	 // problem 3: heap block overrun
	 // problem 4: memory leak -- name not freed
	 printf("Hello %s\n", arg);
  }

  int main(int argc, char* argv[1])
  {
	 // what if f() is called over and again in an infinite loop, e.g. 
	 f(argv[1]);
	 return 0;
  }


In [5]:
# compile with -g -O0 options to use with valgrind
! g++ -m32 -g -O0 demos/memory_leak.cpp -o memory_leak.exe

In [6]:
# Run the program with an argument
! ./memory_leak.exe John

Hello John


In [7]:
# program crashes or behaves unexpectedly when the argument does not fit in the 20-byte name buffer
! ./memory_leak.exe "some very very very very long string"

malloc(): corrupted top size


In [8]:
# by default, valgrind prints only a summary of memory leaks
# it doesn't give the full per-leak details
! valgrind ./memory_leak.exe "John Smith"

==41417== Memcheck, a memory error detector
==41417== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==41417== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==41417== Command: ./memory_leak.exe John\ Smith
==41417== 
==41417== Invalid write of size 4
==41417==    at 0x109205: f(char*) (memory_leak.cpp:12)
==41417==    by 0x109264: main (memory_leak.cpp:23)
==41417==  Address 0x4d94a80 is 0 bytes after a block of size 40 alloc'd
==41417==    at 0x483E670: malloc (vg_replace_malloc.c:381)
==41417==    by 0x1091E8: f(char*) (memory_leak.cpp:8)
==41417==    by 0x109264: main (memory_leak.cpp:23)
==41417== 
Hello John Smith
==41417== 
==41417== HEAP SUMMARY:
==41417==     in use at exit: 60 bytes in 2 blocks
==41417==   total heap usage: 4 allocs, 2 frees, 20,028 bytes allocated
==41417== 
==41417== LEAK SUMMARY:
==41417==    definitely lost: 60 bytes in 2 blocks
==41417==    indirectly lost: 0 bytes in 0 blocks
==41417==      possibly lost: 0 bytes in 

In [11]:
! valgrind --leak-check=full -s ./memory_leak.exe "John Smith"

==41441== Memcheck, a memory error detector
==41441== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==41441== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==41441== Command: ./memory_leak.exe John\ Smith
==41441== 
==41441== Invalid write of size 4
==41441==    at 0x109205: f(char*) (memory_leak.cpp:12)
==41441==    by 0x109264: main (memory_leak.cpp:23)
==41441==  Address 0x4d94a80 is 0 bytes after a block of size 40 alloc'd
==41441==    at 0x483E670: malloc (vg_replace_malloc.c:381)
==41441==    by 0x1091E8: f(char*) (memory_leak.cpp:8)
==41441==    by 0x109264: main (memory_leak.cpp:23)
==41441== 
Hello John Smith
==41441== 
==41441== HEAP SUMMARY:
==41441==     in use at exit: 60 bytes in 2 blocks
==41441==   total heap usage: 4 allocs, 2 frees, 20,028 bytes allocated
==41441== 
==41441== 20 bytes in 1 blocks are definitely lost in loss record 1 of 2
==41441==    at 0x484000B: operator new[](unsigned int) (vg_replace_malloc.c:633)
==41441==  
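
The wording "0 bytes after a block of size 40" in the report follows from simple arithmetic: ten 4-byte ints occupy bytes 0..39, so element 10 starts at byte 40, exactly where the block ends. The sketch below spells that out. The helper names are hypothetical, and `std::int32_t` is used only to pin the element size to 4 bytes as in the `-m32` build.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Byte offset of element `index` from the start of an array of 4-byte ints.
constexpr std::size_t byte_offset(std::size_t index)
{
    return index * sizeof(std::int32_t);
}

// x[10] starts exactly at the end of the 40-byte block, hence "0 bytes after".
static_assert(byte_offset(10) == 40, "element 10 begins at byte 40");

// Bounds-checked store: refuses the write that Memcheck reported as invalid.
bool store_checked(std::int32_t* block, std::size_t count,
                   std::size_t index, std::int32_t value)
{
    assert(block != nullptr);
    if (index >= count) return false;  // valid indices are 0..count-1
    block[index] = value;
    return true;
}
```

With a 10-element block, `store_checked(block, 10, 9, 0)` succeeds, while index 10 is rejected instead of writing past the allocation.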

## gcc/g++ Warning flags and AddressSanitizer
- https://en.wikipedia.org/wiki/AddressSanitizer
- https://gcc.gnu.org/onlinedocs/gcc-3.4.6/gcc/Warning-Options.html#Warning-Options

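- `-Wall` - enable a broad set of commonly useful warnings (despite the name, not every warning)
- `-Wpedantic` - warn about code that does not conform to strict ISO C/C++, e.g. compiler extensions
- `-Wextra` - enable additional warnings that `-Wall` does not cover
- `-Wconversion` - warn about implicit type conversions that may change a value
- warnings are produced at compile time, so they act as a lightweight form of static analysis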
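
As a small illustration of what `-Wconversion` catches, consider converting a `std::size_t` length to an `int`. The sketch below is hypothetical (the function names are not from the demos): the commented-out function is the kind of implicit narrowing the flag warns about, and the other two show one possible explicit, range-checked alternative. It assumes C++17 for `std::optional`.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <optional>

// Implicit narrowing: with -Wconversion, g++ warns here because a size_t
// value may not fit in an int.
// int length_unsafe(std::size_t n) { return n; }

// Explicit, range-checked conversion: returns std::nullopt when n does not fit.
std::optional<int> to_int_checked(std::size_t n)
{
    constexpr std::size_t max_int =
        static_cast<std::size_t>(std::numeric_limits<int>::max());
    if (n > max_int) return std::nullopt;
    return static_cast<int>(n);
}

// Variant for call sites where an out-of-range value is a programming error.
int to_int_asserted(std::size_t n)
{
    std::optional<int> v = to_int_checked(n);
    assert(v.has_value() && "value does not fit in int");
    return *v;
}
```

The explicit `static_cast` silences the warning, and the range check before it keeps the conversion from silently changing the value.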

- `-fsanitize=address` - instrument the program with AddressSanitizer (well supported on Linux; availability on other platforms depends on the compiler)
- must compile and run the program to see any buffer overflow errors (dynamic analysis)
- For more: https://www.osc.edu/resources/getting_started/howto/howto_use_address_sanitizer

In [16]:
! g++ -std=c++17 -m32 -g -O0 -Wall -Wpedantic -Wextra -Wconversion -fsanitize=address demos/memory_leak.cpp -o memory_leak.exe

demos/memory_leak.cpp: In function ‘int main(int, char**)’:
   20 |   int main(int argc, char* argv[1])
      |            ~~~~^~~~


In [17]:
# run the program to see the Address Sanitizer's result
# detects overflow during run-time
! ./memory_leak.exe

==41528==ERROR: AddressSanitizer: heap-buffer-overflow on address 0xf5000f78 at pc 0x5655629d bp 0xffffc1f8 sp 0xffffc1ec
WRITE of size 4 at 0xf5000f78 thread T0
    #0 0x5655629c in f(char*) demos/memory_leak.cpp:12
    #1 0x56556338 in main demos/memory_leak.cpp:23
    #2 0xf74fa904 in __libc_start_main ../csu/libc-start.c:332
    #3 0x56556120 in _start (/home/kali/projects/SystemSecurity/memory_leak.exe+0x1120)

0xf5000f78 is located 0 bytes to the right of 40-byte region [0xf5000f50,0xf5000f78)
allocated by thread T0 here:
    #0 0xf7ab4ffb in __interceptor_malloc ../../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x56556249 in f(char*) demos/memory_leak.cpp:8
    #2 0x56556338 in main demos/memory_leak.cpp:23
    #3 0xf74fa904 in __libc_start_main ../csu/libc-start.c:332

SUMMARY: AddressSanitizer: heap-buffer-overflow demos/memory_leak.cpp:12 in f(char*)
Shadow bytes around the buggy address:


In [None]:
# let's compile demos/stack_overflow/so_stdio.cpp with the warning flags and AddressSanitizer
! g++ -std=c++17 -m32 -g -O0 -Wall -Wpedantic -Wextra -Wconversion -fsanitize=address demos/stack_overflow/so_stdio.cpp -o so_stdio.exe

In [18]:
# let's test it manually; this string is not long enough to overflow the buffer
! echo "here you go some long long long string..." | ./so_stdio.exe

buffer is at 0xffffc1b0
Give me some text: Acknowledged: here you go some long long long string... with length 41
Good bye!


In [19]:
# now overflow the buffer, whose BUFFSIZE is 128 bytes
! python -c 'print("A"*200)' | ./so_stdio.exe

buffer is at 0xffffc1b0
==41564==ERROR: AddressSanitizer: stack-buffer-overflow on address 0xffffc230 at pc 0x56556447 bp 0xffffc128 sp 0xffffc11c
WRITE of size 1 at 0xffffc230 thread T0
    #0 0x56556446 in mgets(char*) demos/stack_overflow/so_stdio.cpp:39
    #1 0x565565c4 in bad() demos/stack_overflow/so_stdio.cpp:50
    #2 0x565566fb in main demos/stack_overflow/so_stdio.cpp:58
    #3 0xf74fa904 in __libc_start_main ../csu/libc-start.c:332
    #4 0x565561d0 in _start (/home/kali/projects/SystemSecurity/so_stdio.exe+0x11d0)

Address 0xffffc230 is located in stack of thread T0 at offset 160 in frame
    #0 0x565564a8 in bad() demos/stack_overflow/so_stdio.cpp:45

  This frame has 1 object(s):
    [32, 160) 'buffer' (line 46) <== Memory access at offset 160 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and
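
The report above points at `mgets` writing one byte past the 128-byte `buffer`. The source of `so_stdio.cpp` is not shown here, so the following is only a sketch of how a bounded line reader might look, not the file's actual code or fix. `read_line_bounded` is a hypothetical name: it stores at most `cap - 1` characters, always terminates the buffer, and discards the rest of an overlong line.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>  // std::istringstream, handy for trying the reader without stdin

// Hypothetical bounded replacement for an unbounded gets-style reader.
// Reads one line from `in` into `dst`, storing at most cap-1 characters plus '\0'.
// Characters beyond the capacity are read and discarded up to the newline.
// Returns the number of characters stored.
std::size_t read_line_bounded(std::istream& in, char* dst, std::size_t cap)
{
    assert(dst != nullptr && cap > 0);
    std::size_t n = 0;
    char c;
    while (in.get(c) && c != '\n') {
        if (n + 1 < cap) dst[n++] = c;  // keep room for the terminator
    }
    dst[n] = '\0';
    return n;
}
```

With a 128-byte buffer, a 200-character input line would be truncated to 127 characters instead of overflowing the stack frame. Passing `std::cin` as `in` reads from standard input.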

## Fixing memory leak and overrun vulnerabilities
- find the vulnerable lines of code or functions and fix them
- see `demos/memory_leak_fixed.cpp` for demo
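
Another way to avoid this whole class of bugs in C++ is to let standard containers own the memory. The sketch below is a hypothetical alternative, not the content of `demos/memory_leak_fixed.cpp`: `greet` and `kMaxName` are made-up names, and the 19-character limit simply mirrors the 20-byte `name` buffer of the demo.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

constexpr std::size_t kMaxName = 19;  // assumption: same limit as the 20-byte buffer

// Hypothetical rewrite of f(): containers free their memory automatically,
// and the name is explicitly truncated instead of overflowing a fixed buffer.
std::string greet(const char* arg)
{
    // argv[1] is a null pointer when the program is run without an argument
    const char* safe = (arg != nullptr) ? arg : "";

    std::vector<int> x(10);  // released at scope exit, no free() needed
    x.at(9) = 0;             // at() throws std::out_of_range for index 10

    std::string name(safe);
    if (name.size() > kMaxName) name.resize(kMaxName);
    assert(name.size() <= kMaxName);
    return "Hello " + name;
}
```

For example, `greet("John")` yields `Hello John`, and the long test string from the demo is cut to `Hello some very very very`, matching the output of the fixed program.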

In [12]:
! cat demos/memory_leak_fixed.cpp

#include <stdlib.h>
#include <cstring>
#include <cstdio>

void f(char * arg)
{
	// C dynamic memory
	int* x = (int *)malloc(10 * sizeof(int));
	// C++ dynamic memory
	char* name = new char[20];

	x[9] = 0; // fix for problem 1 (heap block overrun): last valid index is 9
			// fix for problem 2 (memory leak): x is freed below
	strncpy(name, arg, sizeof(char)*20-1);
	name[19] = '\0';
	// fix for problem 3 (heap block overrun): copy at most 19 bytes, then terminate
	// fix for problem 4 (memory leak): name is freed below
	printf("Hello %s\n", name);
	free(x); // C
	delete[] name; // C++
}

int main(int argc, char* argv[1])
{
	// what if f() is called over and again in an infinite loop, e.g. 
	f(argv[1]);
	return 0;
}


In [13]:
# compile with -g -O0 options to use with valgrind
! g++ -m32 -g -O0 demos/memory_leak_fixed.cpp -o memory_leak_fixed.exe

In [14]:
# manually check the fix
! ./memory_leak_fixed.exe "some very very very very long string"

Hello some very very very


In [15]:
# check with valgrind
! valgrind --leak-check=yes ./memory_leak_fixed.exe "some very very very very long string"

==41501== Memcheck, a memory error detector
==41501== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==41501== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==41501== Command: ./memory_leak_fixed.exe some\ very\ very\ very\ very\ long\ string
==41501== 
Hello some very very very
==41501== 
==41501== HEAP SUMMARY:
==41501==     in use at exit: 0 bytes in 0 blocks
==41501==   total heap usage: 4 allocs, 4 frees, 20,028 bytes allocated
==41501== 
==41501== All heap blocks were freed -- no leaks are possible
==41501== 
==41501== For lists of detected and suppressed errors, rerun with: -s
==41501== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
