# Debugging and Profiling

Debugging is the process of identifying and fixing errors, bugs, or issues in software code. It involves locating and resolving the parts of the code that are causing unintended behavior or crashes. Debugging can be done through tools, techniques, and methodologies to trace, analyze, and correct problems in order to ensure the software works as intended.

**Types of Debugging**
1. Print Debugging: Adding print statements in your code to output specific values or messages to the console, helping you understand the flow of the program.

2. Interactive Debugging: Using a debugger, often one built into an integrated development environment (IDE), to set breakpoints, step through code, inspect variables, and analyze the program's behavior while it runs.

3. Logging: Adding log statements at various points in your code to capture information about the program's execution, which can be helpful for analyzing issues later.

4. Remote Debugging: Debugging code running on a remote system or device using tools that allow you to connect to and control the debugging process from your local development environment.

5. Unit Testing: Writing and running small, isolated tests for individual units (functions, classes, methods) of code to ensure they work correctly.

6. Integration Testing: Testing the interactions between different components or modules of your software to identify and fix issues that may arise when they are combined.

7. Memory Debugging: Identifying and fixing memory-related issues such as memory leaks or buffer overflows that can cause crashes or performance problems.

8. Profiling: Analyzing the performance of your code to identify bottlenecks and areas that can be optimized for better efficiency.

9. Static Analysis: Using tools to analyze your code without executing it, detecting potential issues like coding style violations, potential bugs, or security vulnerabilities.

10. Fuzz Testing: Providing random or unexpected inputs to your code to uncover unexpected behavior and potential vulnerabilities.

11. Symbolic Debugging: Analyzing code using symbolic information, such as variable names and high-level abstractions, to understand and fix issues.

Each type of debugging has its own advantages and use cases, and often a combination of these techniques is used to effectively identify and resolve software issues.
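
As a rough illustration of print debugging and logging from the list above, the sketch below uses both in one function. The function `average`, the logger name `demo.average`, and the in-memory stream are hypothetical names chosen only for this example.

```python
import io
import logging

# Send log records to an in-memory stream so the example is self-contained.
log_stream = io.StringIO()
handler = logging.StreamHandler(log_stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))

logger = logging.getLogger("demo.average")  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)
logger.propagate = False  # keep records out of the root logger


def average(values):
    # Print debugging: quick to add, but has to be removed again later.
    print(f"[print] values={values}")
    # Logging: stays in the code and can be filtered by level.
    logger.debug("computing average of %d values", len(values))
    if not values:
        logger.warning("empty input, returning 0.0")
        return 0.0
    return sum(values) / len(values)


result = average([2, 4, 6])
average([])
print(log_stream.getvalue())
```

The print call is the fastest way to look at a value once, while the log records can be kept and switched on or off through the logger's level.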

**Steps in Debugging**
1. Reproduce the Issue: Start by understanding and replicating the problem or issue you're encountering. This helps ensure that you're working with the same conditions under which the problem occurs.

2. Isolate the Problem: Narrow down the scope of the issue to identify which part of the code is causing the problem. This might involve using print statements, logging, or debugging tools to trace the flow of execution.

3. Set Breakpoints: Use breakpoints in your code to pause its execution at specific points. This allows you to inspect variables, step through the code, and understand how it behaves.

4. Step Through the Code: Use debugging tools to step through the code line by line, observing the values of variables and checking if the program is behaving as expected.

5. Inspect Variables: Examine the values of variables and data structures at different points in the code to identify discrepancies or unexpected behavior.

6. Check for Errors: Look for syntax errors, logical errors, or runtime errors that might be causing the issue. Fix these errors as you encounter them.

7. Use Debugging Tools: Leverage debugging tools provided by your IDE or other tools to aid in the process. These might include variable inspectors, call stack viewers, and memory analyzers.

8. Modify and Test: Make changes to the code that you suspect might be causing the issue, and then test the program to see if the problem persists or is resolved.

9. Regression Testing: After making changes, run regression tests to ensure that the changes haven't introduced new issues or broken other parts of the code.

10. Iterate: If the issue isn't resolved, repeat the steps, adjusting your approach based on new insights and information you've gathered.

11. Document and Communicate: Keep track of your debugging process, the steps you've taken, and the changes you've made. This documentation can be valuable for future reference and for communicating with other team members.

12. Verify the Fix: Once the issue is resolved, thoroughly test the code to ensure that the problem is truly fixed and that no new issues have arisen.

Debugging can be an iterative and sometimes challenging process, but with persistence and a systematic approach, you can effectively identify and resolve software issues.
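
A few of the steps above (reproduce, fix, verify, regression test) can be sketched on a toy bug. The functions `buggy_median` and `median` are hypothetical and exist only for this illustration.

```python
def buggy_median(values):
    ordered = sorted(values)
    # Bug: for an even number of values this returns the upper middle
    # element instead of the mean of the two middle elements.
    return ordered[len(ordered) // 2]


def median(values):
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2:  # odd length: single middle element
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Reproduce the issue with a small, fixed input.
reproduced = buggy_median([1, 2, 3, 4]) != 2.5

# Verify the fix on the failing input.
fixed = median([1, 2, 3, 4]) == 2.5

# Regression tests: inputs that already worked must still work.
regression_ok = median([3, 1, 2]) == 2 and median([7]) == 7

print(reproduced, fixed, regression_ok)
```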

Python has the following built-in modules that help us debug and profile our code:
- `bdb` debugger framework
- `faulthandler` dump the Python traceback
- `pdb` the Python debugger
- `cProfile` and `profile` Python profilers
- `timeit` measure the execution time of small code snippets
- `trace` trace or track Python statement execution
- `tracemalloc` trace memory allocations
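
As a quick taste of two of these modules, the sketch below times a small function with `timeit` and measures its memory use with `tracemalloc`. The function `build_squares` and the chosen sizes are assumptions made for this example only.

```python
import timeit
import tracemalloc


def build_squares(n):
    # Hypothetical workload: a list of the first n squares.
    return [i * i for i in range(n)]


# timeit: total wall-clock seconds for 100 calls of the function.
elapsed = timeit.timeit(lambda: build_squares(1_000), number=100)
print(f"100 calls took {elapsed:.4f} s")

# tracemalloc: memory allocated by Python while tracing is active.
tracemalloc.start()
data = build_squares(100_000)
current, peak = tracemalloc.get_traced_memory()  # both values are in bytes
tracemalloc.stop()
print(f"current={current} bytes, peak={peak} bytes")
```

The measured numbers depend on the machine and the Python version, so they are useful for comparing alternatives rather than as absolute values.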

**Note:**
- The [audit events table](https://docs.python.org/3/library/audit_events.html) lists all events raised by `sys.audit()` and `PySys_Audit()` calls throughout the CPython runtime and the standard library.
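
Audit events can be observed with a hook. The following is a minimal sketch, assuming Python 3.8 or newer; the event name `demo.greet` and the hook `audit_hook` are hypothetical and not part of the audit events table.

```python
import sys

seen = []


def audit_hook(event, args):
    # A hook receives every audit event raised in the process,
    # so filter for the one hypothetical event we care about.
    if event == "demo.greet":
        seen.append((event, args))


# Note: a hook added this way cannot be removed again.
sys.addaudithook(audit_hook)

# Raise our own event; the extra arguments arrive in the hook as a tuple.
sys.audit("demo.greet", "world")
print(seen)
```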

## Bdb

The `bdb` module is Python's built-in debugger framework. It provides the basic debugger functions, such as setting breakpoints and managing execution through the trace facility.

**Classes in Bdb**
- `class bdb.Breakpoint(self, file, line, temporary=False, cond=None, funcname=None)` this class implements temporary breakpoints, ignore counts, disabling and re-enabling, and conditionals
  - Breakpoints are indexed by number through a list called `bpbynumber` and by (file, line) pairs through `bplist`.
  - The former points to a single instance of class `Breakpoint`. The latter points to a list of such instances, since there may be more than one breakpoint per line.
  - If a `funcname` is defined, a breakpoint hit will be counted when the first line of that function is executed.

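- `class bdb.Bdb(skip=None)` this is the generic Python debugger base class
  - it takes care of the details of the trace facility; a derived class should implement the user interaction
  - `skip`, if given, must be an iterable of glob-style module name patterns. The debugger will not step into frames that originate in a module matching one of these patterns.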

**Methods in Bdb**
- `bdb.checkfuncname(b, frame)` return True if we should break here, depending on the way the Breakpoint `b` was set
- `bdb.effective(file, line, frame)` return a tuple `(active breakpoint, delete temporary flag)` or `(None, None)` as the breakpoint to act upon
- `bdb.set_trace()` start debugging with a `Bdb` instance from caller's frame
- `bdb.Breakpoint.deleteMe()` delete the breakpoint from the list associated to a file/line. 
- `bdb.Breakpoint.enable()` mark the breakpoint as enabled.
- `bdb.Breakpoint.disable()` mark the breakpoint as disabled.
- `bdb.Breakpoint.bpformat()` return a string with all the information about the breakpoint, nicely formatted:
  - Breakpoint number.
  - Temporary status (del or keep).
  - File/line position.
  - Break condition.
  - Number of times to ignore.
  - Number of times hit.
- `bdb.Breakpoint.bpprint(out=None)` print the output of `bpformat()` to the file out, or if it is None, to standard output.
- `bdb.Bdb` has many more methods, most of which normally do not need to be overridden; they are listed in the [`bdb` documentation](https://docs.python.org/3/library/bdb.html)
  
**Attributes in Bdb**
- `file` file name of the Breakpoint.
- `line` line number of the Breakpoint within file.
- `temporary` True if a Breakpoint at (file, line) is temporary.
- `cond` condition for evaluating a Breakpoint at (file, line).
- `funcname` function name that defines whether a Breakpoint is hit upon entering the function.
- `enabled` True if Breakpoint is enabled.
- `bpbynumber` numeric index for a single instance of a Breakpoint.
- `bplist` dictionary of Breakpoint instances indexed by (file, line) tuples.
- `ignore` number of times to ignore a Breakpoint.
- `hits` count of the number of times a Breakpoint has been hit.

**Note:** 
- Most methods of the `bdb.Bdb` class normally do not need to be overridden; a derived debugger typically overrides only the `user_*` hooks such as `user_line()`. We mainly work with the `bdb.Breakpoint` class when debugging.
- `bdb` is abbreviation for Breakpoint Debugger
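
To show how the framework is meant to be used, here is a minimal sketch of a custom debugger built on `bdb.Bdb`. The class `LineRecorder` and the function `add` are hypothetical; the sketch only overrides the `user_line()` hook and records each executed line instead of prompting the user.

```python
import bdb


class LineRecorder(bdb.Bdb):
    """Hypothetical debugger that records executed lines instead of stopping."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def user_line(self, frame):
        # Called by Bdb whenever it decides to stop at a line.
        self.lines.append((frame.f_code.co_name, frame.f_lineno))


def add(a, b):
    total = a + b
    return total


recorder = LineRecorder()
# runcall() runs the function under the debugger and returns its result.
result = recorder.runcall(add, 2, 3)

print(result)
print(recorder.lines)
```

A real debugger such as `pdb` follows the same pattern, but opens an interactive prompt inside these hooks.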

## FaultHandler

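`faulthandler` provides functions to dump Python tracebacks explicitly, on a fault, after a timeout, or on a user signal.

`faulthandler` is disabled by default. To install the handlers for fatal signals, call `faulthandler.enable()`, pass `-X faulthandler` on the command line, or set the `PYTHONFAULTHANDLER` environment variable. Handlers for user signals must be registered manually with `faulthandler.register()`.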

The fault handler is called in catastrophic cases and therefore can only use signal-safe functions. Because of this limitation, traceback dumping is minimal compared to normal Python tracebacks:
- Only ASCII is supported. The `backslashreplace` error handler is used on encoding.
- Each string is limited to 500 characters.
- Only the filename, the function name and the line number are displayed. (no source code)
- It is limited to 100 frames and 100 threads.
- The order is reversed: the most recent call is shown first.

To see tracebacks, the application must be run in a terminal. Alternatively, a log file can be passed to `faulthandler.enable()`.

By default, the Python traceback is written to `sys.stderr`.

The Python Development Mode calls `faulthandler.enable()` at Python startup.

**Methods in FaultHandler**
- `faulthandler.dump_traceback(file=sys.stderr, all_threads=True)` dump the tracebacks of all threads into file. If all_threads is False, dump only the current thread
- `faulthandler.enable(file=sys.stderr, all_threads=True)` enable the fault handler: 
  - install handlers for the `SIGSEGV, SIGFPE, SIGABRT, SIGBUS and SIGILL` signals to dump the Python traceback. 
  - If all_threads is True, produce tracebacks for every running thread. Otherwise, dump only the current thread

- `faulthandler.disable()` disable the fault handler: 
  - uninstall the signal handlers installed by `enable()`

- `faulthandler.is_enabled()` check if the fault handler is enabled
- `faulthandler.dump_traceback_later(timeout, repeat=False, file=sys.stderr, exit=False)` dump the tracebacks of all threads after `timeout` seconds, or every `timeout` seconds if `repeat` is True
  - If `exit` is True, call `_exit()` with `status=1` after dumping the tracebacks.

- `faulthandler.cancel_dump_traceback_later()` cancel the last call to `dump_traceback_later()`.
- `faulthandler.register(signum, file=sys.stderr, all_threads=True, chain=False)` register a user signal:
  - install a handler for the `signum` signal to dump the traceback of all threads, or of the current thread if `all_threads` is False, into `file`
  - call the previous handler if `chain` is True

- `faulthandler.unregister(signum)` unregister a user signal: 
  - uninstall the handler of the signum signal installed by register()
  - return True if the signal was registered, False otherwise.
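
The explicit dump can be tried without crashing anything. The sketch below is only an illustration: the functions `inner` and `outer` are hypothetical, and a temporary file is used because `faulthandler` writes straight to a file descriptor and therefore needs a real file.

```python
import faulthandler
import tempfile


def inner():
    # faulthandler writes directly to the file descriptor,
    # so the target must be a real file with a fileno().
    with tempfile.TemporaryFile(mode="w+") as f:
        faulthandler.dump_traceback(file=f, all_threads=False)
        f.seek(0)
        return f.read()


def outer():
    return inner()


report = outer()
# The most recent call (inner) is listed before its caller (outer).
print(report)
```

The output shows the minimal format described above: file name, line number and function name per frame, without any source code.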

```
# Enabling faulthandler from the command line
python3 -q -X faulthandler
```

## Pdb

`pdb`, the Python debugger, is an interactive source code debugger for Python programs.

It supports:
- Conditional breakpoints
- Single stepping at source line level
- Inspection of stack frames
- Source code listing
- Evaluating arbitrary code in the context of any stack frame
- Post-mortem debugging

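To use it, do either of the following:
- `import pdb` and call `pdb.set_trace()` at the point where we want to break into the debugger
- or call the built-in `breakpoint()` anywhere in the code where we want a breakpoint (by default it calls `pdb.set_trace()`)

```
import pdb

pdb.set_trace()  # option 1: break into the debugger right here

def alpha(x, y) -> None:
    breakpoint()  # option 2: pause whenever alpha() is called
    # some code
```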

To use `pdb` from the command line, run:
```
python -m pdb script.py
```

When `pdb` is invoked from the command line like this, it automatically enters post-mortem debugging if the program being debugged exits abnormally.


**Classes in Pdb**
- `class pdb.Pdb(completekey='tab', stdin=None, stdout=None, skip=None, nosigint=False, readrc=True)` Pdb is the debugger class
  - `completekey` `stdin` and `stdout` arguments are passed to the underlying `cmd.Cmd` class
  - `skip` if given, must be an iterable of glob-style module name patterns. The debugger will not step into frames that originate in a module that matches one of these patterns.
  - `readrc` argument defaults to true and controls whether Pdb will load `.pdbrc` files from the filesystem

**Methods in Pdb**
- `pdb.run(statement, globals=None, locals=None)` execute the statement (given as a string or a code object) under debugger control

- `pdb.runeval(expression, globals=None, locals=None)` evaluate the expression (given as a string or a code object) under debugger control. When `runeval()` returns, it returns the value of the expression. Otherwise this function is similar to `run()`

- `pdb.runcall(function, *args, **kwds)` call the function (a function or method object, not a string) with the given arguments. When `runcall()` returns, it returns whatever the function call returned. The debugger prompt appears as soon as the function is entered

- `pdb.set_trace(*, header=None)` enter the debugger at the calling stack frame. This is useful to hard-code a breakpoint at a given point in a program, even if the code is not otherwise being debugged (e.g. when an assertion fails). If given, header is printed to the console just before debugging begins

- `pdb.post_mortem(traceback=None)` enter post-mortem debugging of the given traceback object. If no traceback is given, it uses the one of the exception that is currently being handled (an exception must be being handled if the default is to be used)

- `pdb.pm()` enter post-mortem debugging of the traceback found in `sys.last_traceback`
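
Normally `pdb` is driven from the keyboard, but for illustration it can also be scripted. The sketch below is a rough example under that assumption: it passes in-memory streams as `stdin` and `stdout` to `pdb.Pdb` and feeds it two debugger commands. The function `double` and the variable names are hypothetical.

```python
import io
import pdb


def double(x):
    return x * 2


# Debugger commands that would normally be typed at the (Pdb) prompt.
commands = io.StringIO("p x * 2\ncontinue\n")
transcript = io.StringIO()

debugger = pdb.Pdb(
    stdin=commands,      # read commands from the scripted input
    stdout=transcript,   # collect debugger output instead of printing it
    nosigint=True,       # do not install a SIGINT handler on "continue"
    readrc=False,        # ignore any .pdbrc files
)

# The prompt appears as soon as double() is entered.
result = debugger.runcall(double, 21)

print(result)
print(transcript.getvalue())
```

In an interactive session the same two commands, `p x * 2` and `continue`, would be typed by hand.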


**Note:** For the debugger commands, refer to the [`pdb` documentation](https://docs.python.org/3/library/pdb.html).

```
import pdb

pdb.set_trace()

def double(x):
    breakpoint()
    return x * 2

val = 3
print(f"{val} * 2 is {double(val)}")
```