# Detecting and Mitigating Memory Corruption Errors

## Memory corruption
- https://cwe.mitre.org/data/definitions/787.html
- according to MITRE, "memory corruption" is often used to describe the consequences of writing to memory outside the bounds of a buffer, or to memory that is invalid, when the root cause is something other than a sequential copy of excessive data from a fixed starting location. This may include issues such as incorrect pointer arithmetic, accessing invalid pointers due to incomplete initialization or memory release, etc.

- from a programmer's point of view, there are two main ways to detect memory corruption errors in C/C++ programs:
- white box testing and black box testing

## White box testing
- closely associated with static analysis: the code is examined without running it
- requires access to the source code
- manually read and review the source code for memory-related errors such as memory leaks, buffer overflows, and underflows
    - pros and cons?
- can use automated tools to scan for code and API calls that lead to memory-related errors
    - pros and cons?
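
To make the automated-scan idea concrete, here is a minimal sketch of a scanner that flags calls to a few C string functions commonly involved in overflows. The function name `find_risky_calls` and the list of flagged functions are assumptions made for illustration; real static analyzers parse the code rather than match text.

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: return (line number, function name) for each call to a
// risky C string API found in `source`. This is plain text matching, so it
// also reports matches inside comments and string literals.
std::vector<std::pair<int, std::string>> find_risky_calls(const std::string& source) {
    static const std::vector<std::string> risky = {"strcpy", "strcat", "sprintf", "gets"};
    std::vector<std::pair<int, std::string>> hits;
    std::istringstream in(source);
    std::string line;
    int line_no = 0;
    while (std::getline(in, line)) {
        ++line_no;
        for (const std::string& name : risky) {
            const std::string call = name + "(";
            for (std::size_t pos = line.find(call); pos != std::string::npos;
                 pos = line.find(call, pos + 1)) {
                // Skip matches that are the tail of a longer identifier, e.g. "fgets(".
                const bool inside_longer_name = pos > 0 &&
                    (std::isalnum(static_cast<unsigned char>(line[pos - 1])) || line[pos - 1] == '_');
                if (!inside_longer_name) {
                    hits.emplace_back(line_no, name);
                    break;  // one report per function per line
                }
            }
        }
    }
    return hits;
}
```

Calling `find_risky_calls` on the text of a source file gives a list of lines to review by hand. Its false positives (matches in comments) and blind spots (overflows that involve none of the listed functions) are a starting point for the pros-and-cons question above.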

## Black box testing
- closely associated with dynamic analysis: the program is observed while it runs
- manually test the binary/executable
- employ `fuzz testing`: use automated tools called fuzzers to feed invalid, unexpected, or random data to the program as input
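
The fuzzing loop can be sketched in a few lines. Everything named here is hypothetical: `copy_stays_in_bounds` is only a model of an unchecked `strcpy` into a 20-byte buffer (it reports whether the copy would fit instead of actually overflowing), and `fuzz` is a toy driver. Real fuzzers run the actual program and use much smarter input generation.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <random>
#include <string>
#include <vector>

// Hypothetical stand-in for the program under test: models an unchecked
// strcpy() into a 20-byte buffer and reports whether the copy, including the
// terminating '\0', would stay in bounds.
bool copy_stays_in_bounds(const std::string& input) {
    constexpr std::size_t kBufferSize = 20;
    return input.size() + 1 <= kBufferSize;
}

// Generate `iterations` random printable strings of length 0..max_len, pass
// each one to `target`, and collect the inputs for which it returns false.
std::vector<std::string> fuzz(const std::function<bool(const std::string&)>& target,
                              int iterations, std::size_t max_len, unsigned seed) {
    assert(iterations >= 0);
    std::mt19937 rng(seed);  // fixed seed makes a failing run reproducible
    std::uniform_int_distribution<std::size_t> length(0, max_len);
    std::uniform_int_distribution<int> printable(32, 126);
    std::vector<std::string> failures;
    for (int i = 0; i < iterations; ++i) {
        std::string input(length(rng), ' ');
        for (char& c : input) c = static_cast<char>(printable(rng));
        if (!target(input)) failures.push_back(input);
    }
    return failures;
}
```

For example, `fuzz(copy_stays_in_bounds, 1000, 64, 42u)` returns the generated inputs of 20 or more characters, which is the kind of input that breaks the demo program later in this notebook.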

### Use third-party scanners such as Valgrind's Memcheck
- https://valgrind.org/docs/manual/quick-start.html
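- compile your program with `-g` (debugging info) so that Memcheck's error messages include exact line numbers
    - `-O0` is also a good idea, if you can tolerate the slowdown; with `-O1` and above, line numbers in error messages can be inaccurate
- must install the `valgrind` package; analyzing a 32-bit (x86) program on a 64-bit system also needs `libc6-dbg:i386`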

### Use gcc/g++ compiler flags

### NOTE: Automated tools are not perfect!

In [1]:
! echo kali | sudo -S apt install valgrind -y

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
valgrind is already the newest version (1:3.19.0-1).
The following packages were automatically installed and are no longer required:
  catfish dh-elpa-helper docutils-common gir1.2-xfconf-0 libcfitsio9 libgdal31
  libmpdec3 libnginx-mod-http-geoip libnginx-mod-http-image-filter
  libnginx-mod-http-xslt-filter libnginx-mod-mail libnginx-mod-stream
  libnginx-mod-stream-geoip libpoppler123 libprotobuf23 libpython3.10
  libpython3.10-dev libpython3.10-minimal libpython3.10-stdlib libtiff5
  libzxingcore1 nginx-common nginx-core python-pastedeploy-tpl
  python3-alabaster python3-commonmark python3-docutils python3-imagesize
  python3-roman python3-snowballstemmer python3-speaklater python3-sphinx
  python3.10 python3.10-dev python3.10-minimal ruby3.0 ruby3.0-dev ruby3.0-doc
  sphinx-common
Use 'sudo apt autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.


In [2]:
# check valgrind version
! valgrind --version

valgrind-3.19.0


In [3]:
%%bash
# install libc6-dbg:i386 for debugging x86 program in x64
echo kali | sudo -S sudo apt install libc6-dbg -y
echo kali | sudo -S sudo apt install libc6-dbg:i386 -y

[sudo] password for kali: 



Reading package lists...
Building dependency tree...
Reading state information...
libc6-dbg is already the newest version (2.36-8).
The following packages were automatically installed and are no longer required:
  catfish dh-elpa-helper docutils-common gir1.2-xfconf-0 libcfitsio9 libgdal31
  libmpdec3 libnginx-mod-http-geoip libnginx-mod-http-image-filter
  libnginx-mod-http-xslt-filter libnginx-mod-mail libnginx-mod-stream
  libnginx-mod-stream-geoip libpoppler123 libprotobuf23 libpython3.10
  libpython3.10-dev libpython3.10-minimal libpython3.10-stdlib libtiff5
  libzxingcore1 nginx-common nginx-core python-pastedeploy-tpl
  python3-alabaster python3-commonmark python3-docutils python3-imagesize
  python3-roman python3-snowballstemmer python3-speaklater python3-sphinx
  python3.10 python3.10-dev python3.10-minimal ruby3.0 ruby3.0-dev ruby3.0-doc
  sphinx-common
Use 'sudo apt autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.






Reading package lists...
Building dependency tree...
Reading state information...


E: Unable to locate package libc6-dbg:i386


CalledProcessError: Command 'b'# install libc6-dbg:i386 for debugging x86 program in x64\necho kali | sudo -S sudo apt install libc6-dbg -y\necho kali | sudo -S sudo apt install libc6-dbg:i386 -y\n'' returned non-zero exit status 100.

In [4]:
# let's use demos/memory_leak.cpp program for demo
! cat demos/memory_leak.cpp

  #include <stdlib.h>
  #include <cstring>
  #include <cstdio>

  void f(char * arg)
  {
	 // C dynamic memory
	 int* x = (int *)malloc(10 * sizeof(int));
	 // C++ dynamic memory
	 char* name = new char[20];
	 
	 x[10] = 0;        // problem 1: heap block overrun
		                // problem 2: memory leak -- x not freed
	 strcpy(name, arg);
	 // problem 3: heap block overrun if arg is longer than 19 characters
	 // problem 4: memory leak -- name not freed
	 printf("Hello %s\n", arg);
  }

  int main(int argc, char* argv[1])
  {
	 // what if f() is called over and over, e.g. in an infinite loop?
	 f(argv[1]);
	 return 0;
  }


In [5]:
# compile with -g -O0 for use with valgrind
# compile as a 64-bit binary; valgrind cannot analyze a 32-bit build here
# because the libc6-dbg:i386 package is not installed
! g++ -g -O0 demos/memory_leak.cpp -o memory_leak.exe

In [6]:
# Run the program with an argument
! ./memory_leak.exe John

Hello John


In [7]:
# a long argument overruns the 20-byte name buffer; the program crashes or behaves unexpectedly
! ./memory_leak.exe "some very very very very long string"

malloc(): corrupted top size


In [8]:
# by default, valgrind prints only a summary of memory leaks,
# not the details of each leak
! valgrind ./memory_leak.exe "John Smith"

==305603== Memcheck, a memory error detector
==305603== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==305603== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==305603== Command: ./memory_leak.exe John\ Smith
==305603== 
==305603== Invalid write of size 4
==305603==    at 0x109199: f(char*) (memory_leak.cpp:12)
==305603==    by 0x1091F1: main (memory_leak.cpp:23)
==305603==  Address 0x4d7aca8 is 0 bytes after a block of size 40 alloc'd
==305603==    at 0x48407B4: malloc (vg_replace_malloc.c:381)
==305603==    by 0x10917E: f(char*) (memory_leak.cpp:8)
==305603==    by 0x1091F1: main (memory_leak.cpp:23)
==305603== 
Hello John Smith
==305603== 
==305603== HEAP SUMMARY:
==305603==     in use at exit: 60 bytes in 2 blocks
==305603==   total heap usage: 4 allocs, 2 frees, 73,788 bytes allocated
==305603== 
==305603== LEAK SUMMARY:
==305603==    definitely lost: 60 bytes in 2 blocks
==305603==    indirectly lost: 0 bytes in 0 blocks
==305603==      poss

In [9]:
! valgrind --leak-check=full -s ./memory_leak.exe "John Smith"

==305612== Memcheck, a memory error detector
==305612== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==305612== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==305612== Command: ./memory_leak.exe John\ Smith
==305612== 
==305612== Invalid write of size 4
==305612==    at 0x109199: f(char*) (memory_leak.cpp:12)
==305612==    by 0x1091F1: main (memory_leak.cpp:23)
==305612==  Address 0x4d7aca8 is 0 bytes after a block of size 40 alloc'd
==305612==    at 0x48407B4: malloc (vg_replace_malloc.c:381)
==305612==    by 0x10917E: f(char*) (memory_leak.cpp:8)
==305612==    by 0x1091F1: main (memory_leak.cpp:23)
==305612== 
Hello John Smith
==305612== 
==305612== HEAP SUMMARY:
==305612==     in use at exit: 60 bytes in 2 blocks
==305612==   total heap usage: 4 allocs, 2 frees, 73,788 bytes allocated
==305612== 
==305612== 20 bytes in 1 blocks are definitely lost in loss record 1 of 2
==305612==    at 0x484220F: operator new[](unsigned long) (vg_replace_mall

## gcc/g++ Warning flags and AddressSanitizer
- https://en.wikipedia.org/wiki/AddressSanitizer
- https://gcc.gnu.org/onlinedocs/gcc-3.4.6/gcc/Warning-Options.html#Warning-Options

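- `-Wall` - enable the most commonly useful warnings (despite the name, not every warning)
- `-Wpedantic` - warn about code that strict ISO C/C++ does not allow, such as nonstandard extensions
- `-Wextra` - enable extra warnings that `-Wall` does not cover
- `-Wconversion` - warn about implicit type conversions that may change a value
- warnings are like a lightweight static analysis: they are reported at compile time, without running the program

- `-fsanitize=address` - enable AddressSanitizer (used here with g++ on Linux; Clang on macOS and MSVC on Windows also provide it)
- must compile and run the program to see any buffer overflow errors (dynamic analysis)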
- For more: https://www.osc.edu/resources/getting_started/howto/howto_use_address_sanitizer
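
As an illustration of what `-Wconversion` reports, the sketch below shows a common narrowing case: storing an `int` "character or EOF" value, as returned by `std::getchar`, into a `char`. The helper `to_char` is hypothetical and not part of the demo programs; it shows one way to make the conversion explicit and checked.

```cpp
#include <climits>

// With g++ -Wconversion, an implicit narrowing such as
//     int ch = std::getchar();
//     char c = ch;   // warning: conversion from 'int' to 'char' may change value
// is reported at compile time.
//
// Hypothetical helper: convert only values that fit in an unsigned char and
// report failure for EOF (negative) or any other out-of-range value.
bool to_char(int ch, char& out) {
    if (ch < 0 || ch > UCHAR_MAX) {
        return false;  // EOF or not a valid character value
    }
    out = static_cast<char>(static_cast<unsigned char>(ch));  // explicit, so no warning
    return true;
}
```

The explicit `static_cast` silences the warning only after the range check. Adding a cast without a check would hide the warning while leaving the truncation in place.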

In [10]:
! g++ -std=c++17 -g -O0 -Wall -Wpedantic -Wextra -Wconversion -fsanitize=address demos/memory_leak.cpp -o memory_leak.exe

demos/memory_leak.cpp: In function ‘int main(int, char**)’:
   20 |   int main(int argc, char* argv[1])
      |            ~~~~^~~~


In [11]:
# run the program to see AddressSanitizer's report
# the overflow is detected at run time
! ./memory_leak.exe

==305876==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x604000000038 at pc 0x559c34aa4230 bp 0x7ffcb0bc6910 sp 0x7ffcb0bc6908
WRITE of size 4 at 0x604000000038 thread T0
    #0 0x559c34aa422f in f(char*) demos/memory_leak.cpp:12
    #1 0x559c34aa42a8 in main demos/memory_leak.cpp:23
    #2 0x7f811b846189 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #3 0x7f811b846244 in __libc_start_main_impl ../csu/libc-start.c:381
    #4 0x559c34aa4100 in _start (/home/kali/Sp23/SoftwareSecurity/memory_leak.exe+0x1100)

0x604000000038 is located 0 bytes to the right of 40-byte region [0x604000000010,0x604000000038)
allocated by thread T0 here:
    #0 0x7f811beb89cf in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69
    #1 0x559c34aa41de in f(char*) demos/memory_leak.cpp:8
    #2 0x559c34aa42a8 in main demos/memory_leak.cpp:23
    #3 0x7f811b846189 in __libc_start_

In [9]:
# let's compile demos/stack_overflow/so_stdio.cpp with AddressSanitizer and warning flags
! g++ -std=c++17 -m32 -g -O0 -Wall -Wpedantic -Wextra -Wconversion -fsanitize=address demos/stack_overflow/so_stdio.cpp -o so_stdio.exe

demos/stack_overflow/so_stdio.cpp: In function ‘char* mgets(char*)’:
   33 |         *ptr = ch;
      |                ^~
   39 |         *(++ptr) = ch;
      |                    ^~
demos/stack_overflow/so_stdio.cpp: In function ‘int main(int, char**)’:
   55 | int main(int argc, char *argv[]) {
      |          ~~~~^~~~
   55 | int main(int argc, char *argv[]) {
      |                    ~~~~~~^~~~~~


In [10]:
# let's test it manually; this string is shorter than the buffer, so nothing is reported
! echo "here you go some long long long string..." | ./so_stdio.exe

buffer is at 0xffffbd00
Give me some text: Acknowledged: here you go some long long long string... with length 41
Good bye!


In [12]:
# overflow the buffer (BUFFSIZE is 128) with 200 characters
! python -c 'print("A"*200)' | ./so_stdio.exe

buffer is at 0xffc37b80
==305903==ERROR: AddressSanitizer: stack-buffer-overflow on address 0xffc37c00 at pc 0x565bd437 bp 0xffc37af8 sp 0xffc37aec
WRITE of size 1 at 0xffc37c00 thread T0
    #0 0x565bd436 in mgets(char*) demos/stack_overflow/so_stdio.cpp:39
    #1 0x565bd5b4 in bad() demos/stack_overflow/so_stdio.cpp:50
    #2 0x565bd6eb in main demos/stack_overflow/so_stdio.cpp:58
    #3 0xf7023294  (/lib32/libc.so.6+0x23294)
    #4 0xf7023357 in __libc_start_main (/lib32/libc.so.6+0x23357)
    #5 0x565bd1c6 in _start (/home/kali/Sp23/SoftwareSecurity/so_stdio.exe+0x11c6)

Address 0xffc37c00 is located in stack of thread T0 at offset 160 in frame
    #0 0x565bd498 in bad() demos/stack_overflow/so_stdio.cpp:45

  This frame has 1 object(s):
    [32, 160) 'buffer' (line 46) <== Memory access at offset 160 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind me

## Fixing memory leak and overrun vulnerabilities
- find the vulnerable lines of code or functions and fix them
- see `demos/memory_leak_fixed.cpp` for demo
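
Besides patching each `malloc`/`new` by hand, C++ offers an alternative: let standard containers own the memory. The sketch below is a hypothetical rewrite, not the contents of `demos/memory_leak_fixed.cpp`. The function name `greet` and the choice to return the greeting instead of printing it are assumptions made so the result is easy to check.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical rewrite of f(): std::vector and std::string release their
// memory automatically, so there is nothing to free, and std::string grows as
// needed, so there is no fixed-size buffer to overrun.
std::string greet(const char* arg) {
    constexpr std::size_t kMaxNameLength = 19;  // same limit as a 20-byte buffer
    std::vector<int> x(10);                     // replaces malloc(10 * sizeof(int))
    x.at(9) = 0;                                // at() throws std::out_of_range on a bad index
    std::string name = arg ? arg : "";          // tolerate a missing argument
    if (name.size() > kMaxNameLength) {
        name.resize(kMaxNameLength);            // truncate instead of overflowing
    }
    assert(name.size() <= kMaxNameLength);
    return "Hello " + name;
}
```

With this version, `greet("John")` yields `Hello John`, and a longer argument is truncated to its first 19 characters. Whether truncation is the right policy depends on the program; rejecting over-long input is another option.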

In [13]:
! cat demos/memory_leak_fixed.cpp

#include <stdlib.h>
#include <cstring>
#include <cstdio>

void f(char * arg)
{
	// C dynamic memory
	int* x = (int *)malloc(10 * sizeof(int));
	// C++ dynamic memory
	char* name = new char[20];

	x[9] = 0; // fix 1: index 9 is the last valid element, so no heap block overrun
	// copy at most 19 characters and always null-terminate
	strncpy(name, arg, sizeof(char)*20-1);
	name[19] = '\0'; // fix 3: no heap block overrun
	printf("Hello %s\n", name);
	free(x); // fix 2: C memory released
	delete[] name; // fix 4: C++ memory released
}

int main(int argc, char* argv[1])
{
	// what if f() is called over and over, e.g. in an infinite loop?
	f(argv[1]);
	return 0;
}


In [14]:
# compile with -g -O0 plus warning flags and AddressSanitizer
! g++ -g -O0 -Wpedantic -Wextra -Wconversion -fsanitize=address demos/memory_leak_fixed.cpp -o memory_leak_fixed.exe

In [15]:
# manually check the fix
! ./memory_leak_fixed.exe "some very very very very long strin adfa asf afaf adfa dag"

Hello some very very very


In [16]:
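# check with valgrind
# NOTE: this binary was built with -fsanitize=address, which Valgrind cannot run:
# the ASan runtime message and "0 allocs, 0 frees" below show that the program
# never actually ran under Memcheck. Rebuild without -fsanitize=address for a
# meaningful Valgrind check.
! valgrind --leak-check=yes ./memory_leak_fixed.exe "some very very very very long string"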

==306001== Memcheck, a memory error detector
==306001== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==306001== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==306001== Command: ./memory_leak_fixed.exe some\ very\ very\ very\ very\ long\ string
==306001== 
==306001==ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.
==306001== 
==306001== HEAP SUMMARY:
==306001==     in use at exit: 0 bytes in 0 blocks
==306001==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==306001== 
==306001== All heap blocks were freed -- no leaks are possible
==306001== 
==306001== For lists of detected and suppressed errors, rerun with: -s
==306001== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
