## Accessing Invalid Memory

Accessing invalid memory means that the process tried to access a portion of the system's memory that wasn't assigned to it.

It typically happens with low-level languages like C or C++ where the programmer needs to take care of requesting the memory that the program is going to use and then giving that memory back once it's not needed anymore. In these languages, the variables that store memory addresses are called pointers.

Common programming errors that lead to segmentation faults, or segfaults, include forgetting to initialize a variable, trying to access a list element outside of the valid range, trying to use a portion of memory after having given it back, and trying to write more data than the requested portion of memory can hold. So what can you do if you have a program that's segfaulting? The best way to understand what's going on is to attach a debugger to the faulty program.

For this to be possible, we'll need our program to be compiled with debugging symbols. This means that on top of the information that the computer uses to execute the program, the executable binary needs to include extra information needed for debugging, like the names of the variables and functions being used. These symbols are usually stripped away from the binaries that we run to make them smaller.

When doing this, we might find that the crash happens inside a call to a library function. This is separate from the application itself, so we need to install the debugging symbols for that library. 


**Microsoft compilers can also generate debugging symbols in a separate PDB file**. Some Windows software providers let users download the PDB files that correspond to their binaries so that they can properly debug failures.

One of the trickiest things about this invalid memory business is that we're usually dealing with **undefined behavior**. This means that the code is doing something that's not valid in the programming language.

When trying to understand problems related to handling invalid memory, **valgrind** can help us a lot. Valgrind is a very powerful tool that can tell us if the code is doing any invalid operations, whether it crashes or not. For example, Valgrind lets us know if the code is accessing variables before initializing them.

Valgrind is available on Linux and macOS, and **Dr. Memory** is a similar tool that can be used on both Windows and Linux.



## Unhandled Exceptions and Errors

We might get a type error or an attribute error if we try to take an action on a variable that wasn't properly initialized, or a division by zero error if we try to, well, divide by zero.

When the code generates one of these errors without handling it properly, the program will finish unexpectedly. 

In general, **unhandled errors** happen because the code is making wrong assumptions. Maybe the program is trying to access a resource that's not present, or the code assumes that the user will enter a value but the user entered an empty string instead.
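As a minimal sketch of checking such an assumption before it turns into a crash, the hypothetical `parse_quantity` function below (not part of the original notes) rejects an empty string explicitly instead of assuming the user typed a number:

```python
def parse_quantity(raw_value):
    """Convert user input to a positive integer, rejecting bad values early."""
    text = raw_value.strip()
    if not text:
        # The assumption "the user always types something" does not hold.
        raise ValueError("quantity is required, but an empty value was entered")
    quantity = int(text)  # raises ValueError for non-numeric text such as "abc"
    if quantity <= 0:
        raise ValueError(f"quantity must be positive, got {quantity}")
    return quantity


for raw in ["3", "", "abc"]:
    try:
        print(repr(raw), "->", parse_quantity(raw))
    except ValueError as err:
        print(repr(raw), "-> rejected:", err)
```

The caller decides what to do with the `ValueError`, so the program can keep running and ask for the value again.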


The **traceback** shows the lines of the different functions that were being executed when the problem happened.
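To illustrate what a traceback contains, here is a small sketch using Python's standard `traceback` module. The `average` and `report` functions are made up for this example; the loop prints one entry per function that was running when the error happened.

```python
import traceback


def average(values):
    return sum(values) / len(values)  # fails when values is empty


def report(values):
    return f"average: {average(values)}"


try:
    report([])
except ZeroDivisionError as err:
    # Each frame is one function that was running when the error happened,
    # listed from the outermost call down to the line that actually failed.
    frames = traceback.extract_tb(err.__traceback__)
    for frame in frames:
        print(f"{frame.name} (line {frame.lineno}): {frame.line}")
    called_functions = [frame.name for frame in frames]
```

The last entry is where the error was raised, and the entries before it show how the program got there.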

So when the error message isn't enough, we'll need to debug the code to find out where things are going wrong. For that, we can use the debugging tools available for the application's language. For a Python program, we can use the **pdb** interactive debugger, which lets us do all the typical debugging actions like executing lines of code one by one or looking at how the variables change values.
 
 
Another option is to add statements to the code that print data about what it's doing. Statements like these can show the contents of variables, the return values of functions, or metadata like the length of a list or the size of a file. This technique is called **printf debugging**. The name comes from the printf function used to print messages to the screen in the C programming language.
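A minimal sketch of printf debugging in Python follows. The `total_price` function and its data are invented for illustration; the debug lines go to standard error so they don't mix with the program's normal output.

```python
import sys


def total_price(items):
    total = 0
    for name, price, quantity in items:
        subtotal = price * quantity
        # Temporary debug output showing the values used in each iteration.
        print(f"DEBUG {name}: price={price} quantity={quantity} "
              f"subtotal={subtotal}", file=sys.stderr)
        total += subtotal
    # Metadata such as the number of items processed is often useful too.
    print(f"DEBUG items={len(items)} total={total}", file=sys.stderr)
    return total


print(total_price([("pen", 2, 3), ("notebook", 5, 1)]))
```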

### Logging

When changing code to print messages to the screen, the best approach is to add the messages in a way that can be easily enabled or disabled depending on whether we want the debug info or not. In Python, we can do this using the logging module. This module lets us set how verbose we want the output to be. We can say whether we want to include all debug messages, or only info, warning, or error messages. Then, when printing a message, we specify what type of message we're printing.
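Here is a short sketch of that idea with the standard `logging` module. The logger name `inventory` and the `restock` function are hypothetical; the only line that needs to change to hide the debug messages is the `level` passed to `basicConfig`.

```python
import logging

# Switch to logging.WARNING (or higher) to silence the debug and info messages.
logging.basicConfig(level=logging.DEBUG, format="%(levelname)s: %(message)s")
logger = logging.getLogger("inventory")  # hypothetical logger name


def restock(stock, item, amount):
    logger.debug("restock called with item=%r amount=%r", item, amount)
    if amount <= 0:
        logger.warning("ignoring non-positive amount %r for %r", amount, item)
        return stock
    stock[item] = stock.get(item, 0) + amount
    logger.info("%r now has %d units", item, stock[item])
    return stock


restock({"pen": 2}, "pen", 5)
restock({"pen": 2}, "pen", 0)
```

Each call states its own severity (`debug`, `info`, `warning`), and the configured level decides which of them actually get printed.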


Instead of crashing unexpectedly, you want the program to inform the user of the problem and tell them what they need to do.

For example, say you have an application that crashes with a permission denied error. Rather than the program finishing unexpectedly, you'll want to modify the code to catch that error and tell the user what the permission problem is so they can fix it.
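One way this could look in Python is sketched below. The `read_config` function and the file path are assumptions made for the example; the point is that the error is caught and turned into a message the user can act on.

```python
def read_config(path):
    """Return the file contents, or None after explaining what went wrong."""
    try:
        with open(path, encoding="utf-8") as config_file:
            return config_file.read()
    except PermissionError:
        print(f"Cannot read {path}: permission denied. "
              "Check that your user has read access to the file, then try again.")
    except FileNotFoundError:
        print(f"Cannot read {path}: the file does not exist. Check the path.")
    return None


# Hypothetical path, used only to show the message for a missing file.
read_config("/hypothetical/path/settings.conf")
```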

## Fixing Someone Else's Code

Writing good comments is one of those good habits that pays off when trying to understand code written by others, or by your past self.
 
Another thing that can help us understand someone else's code is reading the tests associated with the code. Well-written tests can tell us what each function is expected to do.
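As an illustration of how tests document behavior, consider this sketch with a hypothetical `normalize_code` function. Even without reading the function body, the test names and assertions say what it is expected to do.

```python
import unittest


def normalize_code(code):
    """Hypothetical function under test: tidy up a product code."""
    return code.strip().upper()


class NormalizeCodeTest(unittest.TestCase):
    def test_removes_surrounding_whitespace(self):
        self.assertEqual(normalize_code("  ab12 "), "AB12")

    def test_converts_to_uppercase(self):
        self.assertEqual(normalize_code("ab12"), "AB12")

    def test_rejects_non_string_input(self):
        with self.assertRaises(AttributeError):
            normalize_code(None)


# Run the tests without exiting the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormalizeCodeTest)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```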
 
But when the project has thousands or tens of thousands of lines of code, you can't really read the whole thing. You'll need to focus on the functions or modules that are part of the problem that you're trying to fix.

One possible approach in this case would be to start with the function where the error happened, then the function or functions that call it, and so on until you can grasp the context that led to the problem.

## Debugging a Segmentation Fault


### Core Files
When an application crashes like this, it's useful to have a core file of the crash. Core files store all the information related to the crash so that we or someone else can debug what's going on. 

To generate core files, we run the **ulimit** command with the **-c** flag for core files, and then say **unlimited** to state that we want core files of any size.

``` 
ulimit -c unlimited
```
This file contains all the information of what was going on with the program when it crashed. 

We can use it to understand why the program crashed by passing it to the **GDB debugger**. We'll call **gdb -c core example**: **-c core** gives it the core file, and **example** tells it where the executable that crashed is located.

We can look at the full backtrace of the crash by using the **backtrace** command.

We can use the **up** command to move to the calling function in the backtrace and check out the line and parameters that caused the crash.

We can get more context for the code that failed by calling the **list** command, which shows the lines around the current one.

GDB uses the dollar sign followed by a number to give separate identifiers to each result it prints.

What are those weird numbers starting with 0x? Those are hexadecimal numbers, and they are used to show the addresses in memory where some data is stored.
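To get a feel for these values, the sketch below converts a hexadecimal address to an integer and back in Python. The address is made up, written in the style a debugger would print.

```python
address_text = "0x7ffe5c3a1b20"  # made-up address, in the style a debugger prints

address = int(address_text, 16)  # parse the hexadecimal text into an integer
print(address)                   # the same value in decimal
print(hex(address))              # and back to hexadecimal

# Hexadecimal keeps address arithmetic readable: moving 16 bytes forward
# only changes the last digits.
print(hex(address + 16))
```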


### Null Pointer
The second element is a pointer to zero, also known as a null pointer. Zero is never a valid pointer. It usually signals the end of data structures in C.

So our code is trying to access the second element in the array, but the array only has one valid element. In other words, the for loop is doing one iteration too many. **This is known as an off-by-one error, and it's a super common error.**
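The same mistake can be sketched in Python, with the difference that Python checks the index and raises an `IndexError` instead of reading invalid memory. The function names here are made up for the example.

```python
names = ["alpha"]  # a list with a single valid element


def print_all_buggy(items):
    # Bug: range(len(items) + 1) runs one iteration too many.
    for i in range(len(items) + 1):
        print(items[i])


def print_all_fixed(items):
    # Valid indexes go from 0 to len(items) - 1.
    for i in range(len(items)):
        print(items[i])


try:
    print_all_buggy(names)
except IndexError as err:
    print("off-by-one caught:", err)

print_all_fixed(names)
```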

## Debugging a Python Crash

We'll start the debugger by running pdb3 and then passing the script that we want to run and any parameters that our script needs. In our case, we'll call pdb3 with the script and the CSV file that it should process.
```
pdb3 updateproducts.py newproducts.csv
```

We could run each of the instructions in the file one by one using the next command. But there's a lot going on here. 

So we need to go through a lot of lines until we reach the failure. Alternatively, we can tell the debugger to continue the execution until it either finishes or crashes.

What are those characters appearing before product code? If we search online for the sequence of characters, we'll find that they represent the Byte Order Mark, or BOM, which is used in UTF-16 to tell the difference between a file stored using little-endian and one stored using big-endian byte order.

There is a special value called UTF-8-sig that we can set as the encoding parameter of the open function. Setting this encoding means that Python will get rid of the BOM when files include it and behave as usual when they don't.
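The effect can be reproduced with a small sketch. The column names and file contents below are invented; the script writes a CSV file that starts with a UTF-8 BOM and then reads its header with both encodings.

```python
import codecs
import csv
import os
import tempfile

# Write a small CSV that starts with a UTF-8 BOM, as some exports do.
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write(codecs.BOM_UTF8 + b"product_code,price\nA1,10\n")
    path = tmp.name


def read_header(path, encoding):
    with open(path, encoding=encoding, newline="") as csv_file:
        return csv.DictReader(csv_file).fieldnames


plain_header = read_header(path, "utf-8")      # BOM stays glued to the first name
clean_header = read_header(path, "utf-8-sig")  # BOM is stripped
os.remove(path)

print(plain_header)
print(clean_header)
```

With plain `utf-8`, the first column name carries the invisible BOM character, so a lookup by the expected column name fails with a `KeyError`. With `utf-8-sig`, the name comes out clean.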

## Resources for Debugging Crashes

Check out the following links for more information:

1. https://realpython.com/python-concurrency/
2. https://hackernoon.com/threaded-asynchronous-magic-and-how-to-wield-it-bba9ed602c32
3. https://stackoverflow.com/questions/33047452/definitive-list-of-common-reasons-for-segmentation-faults
4. https://sites.google.com/a/case.edu/hpcc/home/important-notes-for-new-users/debugging-segmentation-faults


Readable Python code on GitHub:

- https://github.com/fogleman/Minecraft
- https://github.com/cherrypy/cherrypy
- https://github.com/pallets/flask
- https://github.com/tornadoweb/tornado
- https://github.com/gleitz/howdoi
- https://github.com/bottlepy/bottle/blob/master/bottle.py
- https://github.com/sqlalchemy/sqlalchemy
