# Bridging Debugging Gaps: Creating a DAP-Compatible Debug Adapter between Binaries and Debuggers

## Introduction

This Jupyter notebook contains the code for our thesis project: a Debug Adapter that implements the Debug Adapter Protocol (DAP). Our objective is to connect development tools to the GDB debugger, thereby providing a unified interface for debugging across multiple platforms.

## What is a Debug Adapter?

A Debug Adapter serves as an intermediary between a development environment and a debugger. Instead of each tool having its own unique interface for every debugger, the Debug Adapter Protocol provides a common set of conventions that any tool can use to communicate with any debugger. As a result, a tool that implements the protocol once can work with every debugger that has an adapter, rather than needing a separate integration for each one.
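
To make the "common set of conventions" concrete: a DAP message is a JSON object preceded by a `Content-Length` header and a blank line. The sketch below is a minimal illustration of that framing only. The helper names `encode_message` and `decode_message` are hypothetical and are not part of this notebook's code.

```python
import json

def encode_message(message: dict) -> bytes:
    """Frame a message as a Content-Length header, a blank line, and the JSON body."""
    body = json.dumps(message).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body

def decode_message(data: bytes) -> dict:
    """Parse one framed message; raises ValueError if the frame is incomplete."""
    header, sep, rest = data.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("header terminator not found")
    length = None
    for line in header.decode("ascii").split("\r\n"):
        name, _, value = line.partition(":")
        if name.strip().lower() == "content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("Content-Length header missing")
    if len(rest) < length:
        raise ValueError("body shorter than Content-Length")
    return json.loads(rest[:length].decode("utf-8"))

# Illustrative request; the field values are placeholders
request = {"seq": 1, "type": "request", "command": "initialize",
           "arguments": {"adapterID": "gdb"}}
framed = encode_message(request)
assert decode_message(framed) == request
```

The length counts bytes of the encoded body, not characters, which matters as soon as a message contains non-ASCII text.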


## Purpose of the Codebase

This code provides an abstraction layer over the GDB debugger, encapsulating the basic functionalities required for debugging such as loading programs, setting command-line arguments, and starting the program execution. Further, it lays the foundation for implementing more advanced features like setting breakpoints, stepping through code, and inspecting variable values, among others.


## Setup and Installation

### 1. Install required modules:

Before running the code, make sure the required tools are installed. The primary debugger used here is `gdb`.

# Introduction to GDB (GNU Debugger)

In this thesis, GDB (the **GNU Debugger**) is used to debug and analyze programs written in languages such as C and C++. This section gives an overview of GDB, its significance, and the reasons for including it in our research.

## Role of GDB

**GDB** serves as a debugging tool used by software developers and researchers to:

1. **Identify and Fix Bugs**: GDB helps developers locate and fix runtime errors, also known as bugs, in their programs. These range from crashes such as segmentation faults to subtle logic errors that affect the program's behavior. Syntax errors, by contrast, are reported by the compiler before the program ever runs.

2. **Inspect Program Execution**: GDB provides the capability to inspect a program's execution step by step. This includes examining variables, memory, and the call stack to understand how a program behaves during runtime.

3. **Set Breakpoints**: Breakpoints are markers set within the code to pause program execution at specific points. GDB allows developers to set breakpoints, making it easier to examine the program's state at critical junctures.

4. **Analyze Core Dumps**: When a program crashes, GDB can be used to analyze core dump files, providing valuable insights into the state of the program at the time of the crash.

5. **Profiling and Performance Analysis**: GDB is not a dedicated profiler, but interrupting a running program and inspecting its call stack can give a rough picture of where time is being spent.

We selected GDB (GNU Debugger) for our thesis because of its central role in debugging and analyzing complex software systems.


### 2. Install GDB:
If you don't have GDB installed, you'll need to get it set up on your system. How you do this will vary depending on your OS:

##### For Debian-based Linux distributions:


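In [14]:
# Shell command: the leading "!" makes the notebook run it in a shell instead of as Python.
# If sudo asks for a password, run the command in a terminal instead.
!sudo apt-get install gdb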

##### For Red Hat-based distributions:

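In [15]:
# Shell command: the leading "!" makes the notebook run it in a shell instead of as Python.
# If sudo asks for a password, run the command in a terminal instead.
!sudo yum install gdb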

##### For macOS (using Homebrew):

In [16]:
# Shell command: the leading "!" makes the notebook run it in a shell instead of as Python
!brew install gdb

##### Ensure GDB is correctly installed by running:

In [None]:
# Shell command: the leading "!" makes the notebook run it in a shell instead of as Python
!gdb --version


## Debugger Class

The following cell establishes a basic framework for interfacing with the GDB debugger. The `Debugger` class provides an abstraction layer that simplifies interaction with the underlying debugger functionality:

- **Imports**: 
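  - `gdb`: GDB's Python API module. It is available only inside GDB's embedded Python interpreter, so this cell must be run by GDB itself rather than by a standalone Python kernel.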
  - `re`: Regular expression library, generally used for string manipulations and parsing.

- **Debugger Class**:
  - **Initialization**: On creation, it takes the path to the executable we wish to debug and loads it into GDB.
  - **Setting Arguments**: Facilitates setting any command-line arguments needed by the program during its execution.
  - **Program Execution**: Provides a method to start the program under the GDB debugger.
  
As the notebook progresses, we'll extend the capabilities of this class to integrate more advanced debugging operations, offering a comprehensive debugging interface.
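
Because `import gdb` only works inside GDB's own Python interpreter, one way to run the class is to save the cells to a script and pass it to GDB with `-x`. The sketch below only builds and optionally launches that command. The helper names and the script name `debug_adapter.py` are hypothetical, and it assumes a GDB build with Python support.

```python
import shutil
import subprocess

def build_gdb_command(script_path, gdb_binary="gdb"):
    """Return the argv list that makes GDB run a Python script at startup."""
    # -q suppresses the startup banner; -x sources the file, and GDB treats
    # files ending in .py as Python scripts
    return [gdb_binary, "-q", "-x", script_path]

def launch(script_path):
    """Start GDB with the script, or report that GDB is not on the PATH."""
    if shutil.which("gdb") is None:
        print("gdb not found on PATH")
        return None
    return subprocess.Popen(build_gdb_command(script_path))

# Only the command is printed here; call launch(...) to actually start GDB
print(build_gdb_command("debug_adapter.py"))
```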

In [None]:
# Import required libraries
import gdb
import re

class Debugger:
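    def __init__(self, path_to_executable=None):
        # The path is optional so that the server can load the program later,
        # when the client sends an 'initialize' request
        self.path = path_to_executable
        self.args = ""
        if self.path:
            self.load_program()

    def load_program(self, path_to_executable=None):
        # Load the program into GDB, optionally replacing the stored path
        if path_to_executable:
            self.path = path_to_executable
        gdb.execute("file {}".format(self.path))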

    def set_args(self, arguments):
        # Set command-line arguments for the program. The string is passed
        # unquoted so that GDB splits it into separate arguments
        self.args = arguments
        gdb.execute("set args {}".format(self.args))

    def start_program(self):
        # Start the program execution
        gdb.execute("run", to_string=True)

    # Additional methods for breakpoints, stepping, and more
    def set_breakpoint(self, location):
        # 'location' is any GDB location, e.g. a function name or "file.c:42"
        gdb.Breakpoint(location)
        print("Breakpoint set at {}".format(location))

    def set_breakpoint_at_line(self, line_number):
        # Assumes the source file is named after the executable, e.g. ./hello -> hello.c
        brkpnt_format = self.path.split("/")[-1] + ".c"
        breakpoint_location = "{}:{}".format(brkpnt_format, line_number)
        gdb.Breakpoint(breakpoint_location)

        return "Breakpoint set at {}".format(breakpoint_location)

    def step_over(self):
        gdb.execute('next')

    def continue_exec(self):
        gdb.execute('continue', to_string=True)

    def get_stack_trace(self):
        print("STACKTRACE:")
        # to_string=True makes GDB return the backtrace instead of only printing it
        return gdb.execute('bt', to_string=True)
    
    def print_current_line(self):
        frame = gdb.selected_frame()
        sal = frame.find_sal()
        print(f"Current line: {sal.line} in {sal.symtab.filename}")

    def quit(self):
        # Disable confirmation prompts
        gdb.execute('set confirm off')
        gdb.execute('quit')

    def get_variable_value(self, variable_name):
        return gdb.parse_and_eval(variable_name)
    
    def print_local_variables(self):
        print("Printing Local Variables")
        locals_info = gdb.execute('info locals', to_string=True)
        print(locals_info)
        # Return the text as well, so callers such as the server can forward it
        return locals_info

    def set_breakpoints_on_all_functions(self):
        # Get the list of all functions
        functions_info = gdb.execute("info functions", to_string=True)

        # Split the information by lines
        lines = functions_info.splitlines()

        # Compile a regular expression to match C/C++ function names,
        # including optional namespace or class qualifiers such as Foo::bar
        regex = re.compile(r'(?:\w+::)*\w+\(')

        # Iterate over each line
        for line in lines:
            # Search for function names using the regex pattern
            match = regex.search(line)
            if match:
                # The regex captures the function name with a trailing '('
                # We remove that '(' to get just the function name
                function_name = match.group()[:-1]
                # Set breakpoint on this function
                self.set_breakpoint(function_name)

    def set_breakpoints_on_returns(self, func_name):
        print(f"Setting breakpoints on return instructions of {func_name}")
        
        # Disassemble the function and collect the output
        disas_output = gdb.execute(f"disassemble {func_name}", to_string=True)
        
        # Split the output by lines
        lines = disas_output.split('\n')

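        # Regular expression to extract the address and the instruction mnemonic
        # from a disassembly line such as "   0x401136 <+16>:\tret"
        regex = re.compile(r"^(?:=>)?\s*(0x[0-9a-f]+)\s*(?:<\+\d+>)?:\s+(\S+)")

        # For each line, check whether the mnemonic is an x86 return instruction.
        # Matching the mnemonic avoids false hits on symbols that contain "ret"
        for line in lines:
            match = regex.match(line)
            if match and match.group(2) in ("ret", "retq"):
                address = match.group(1)
                gdb.Breakpoint("*{}".format(address))
                print(f"Breakpoint set at address {address}")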
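
The return-instruction search in `set_breakpoints_on_returns` can be tried without GDB by running the same kind of parsing on captured text. The sketch below uses a hypothetical sample of `disassemble` output (x86-64, AT&T syntax) and a hypothetical helper `find_return_addresses`. It matches on the instruction mnemonic, since a plain substring test for `ret` would also hit symbol names such as `retrieve` or `retry`.

```python
import re

# Hypothetical sample of `disassemble` output, for illustration only
SAMPLE = """Dump of assembler code for function retrieve:
   0x0000000000401126 <+0>:\tpush   %rbp
   0x0000000000401127 <+1>:\tmov    %rsp,%rbp
=> 0x000000000040112a <+4>:\tcall   0x401030 <retry>
   0x000000000040112f <+9>:\tpop    %rbp
   0x0000000000401130 <+10>:\tret
End of assembler dump."""

# Optional "=>" marker, address, optional <+offset>, colon, then the mnemonic
LINE_RE = re.compile(r"^(?:=>)?\s*(0x[0-9a-f]+)\s*(?:<\+\d+>)?:\s+(\S+)")

def find_return_addresses(disassembly):
    """Return the addresses of lines whose mnemonic is 'ret' or 'retq'."""
    addresses = []
    for line in disassembly.splitlines():
        match = LINE_RE.match(line)
        if match and match.group(2) in ("ret", "retq"):
            addresses.append(match.group(1))
    return addresses

print(find_return_addresses(SAMPLE))
```

Prefixed forms such as `repz retq` are not covered by this sketch and would need the prefix skipped before the comparison.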

# DebugServer

The DebugServer cell serves as a critical bridge between our custom debugger and external clients. It functions as a communication hub, responsible for receiving incoming requests from clients, processing these requests, and orchestrating interactions with the debugger to perform debugging operations. 

- **Imports**: 
  - `socket`: Provides the server-client transport. The DebugServer uses it to create a communication endpoint, bind to a specific address and port, listen for incoming connections, and exchange data with connected clients.
  - `json`: Encodes and decodes JSON data. The server uses it to parse the JSON-formatted requests it receives and to encode responses before sending them back to the client.
  
## Client-Server Interaction Flow

- Server Initialization: The server starts and waits for a client connection.
- Connection Establishment: Upon a client connection, the server enters into a communication loop with the client.
- Command Reception & Processing: The server receives JSON formatted string commands, processes them by invoking the corresponding debugger methods, and formulates a response.
- Response Transmission: The structured response, encoded in JSON format, is then transmitted back to the client, ensuring the client is apprised of the outcome of their request.
- Connection Termination: The server gracefully terminates the connection once the client disconnects.

Together, these steps let a client drive each debugging operation through a single request/response exchange with the debugger.
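
The flow above can be sketched from the client's side. The example assumes the wire format that the server cell parses: one header line, a newline, then the JSON request, with the reply sent back as a bare JSON object. `send_request` and `fake_server` are hypothetical names. `fake_server` is only a stand-in so that the example runs without GDB.

```python
import json
import socket
import threading

def send_request(sock, command, seq=1, **fields):
    """Send one request and return the decoded JSON response."""
    body = json.dumps({"seq": seq, "type": "request", "command": command, **fields})
    payload = body.encode("utf-8")
    # A header line, then the JSON body; the server splits on the first newline
    sock.sendall(f"Content-Length: {len(payload)}\n".encode("ascii") + payload)
    # Assumes the small reply arrives in a single recv() call
    return json.loads(sock.recv(4096).decode("utf-8"))

def fake_server(sock):
    """Stand-in for DebugServer.handle_client, for demonstration only."""
    data = sock.recv(4096).decode("utf-8")
    _headers, json_string = data.split("\n", 1)
    request = json.loads(json_string)
    reply = {"type": "response", "success": True,
             "command": request["command"], "request_seq": request["seq"]}
    sock.sendall(json.dumps(reply).encode("utf-8"))

# A connected socket pair replaces a real TCP connection in this demo
client_end, server_end = socket.socketpair()
worker = threading.Thread(target=fake_server, args=(server_end,))
worker.start()
response = send_request(client_end, "setBreakpoints", location="main")
worker.join()
client_end.close()
server_end.close()
print(response["success"], response["command"])
```

Against the real server, the socket pair would be replaced by `socket.create_connection(("localhost", 12345))`, which matches the default host and port in the server cell.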

In [None]:
import socket
import json

class DebugServer:

    def __init__(self, debugger, host='localhost', port=12345):
        self.debugger = debugger
        self.server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.server_socket.bind((host, port))
        self.server_socket.listen(5)
        print(f"Listening on {host}:{port}")

    def handle_client(self, client_socket):

        seq = 0
        while True:
            # Note: a single recv() is assumed to deliver one complete request
            request_data = client_socket.recv(1024).decode('utf-8')
            if not request_data:  # Check if data received is empty
                print("Client disconnected")
                return  # Return from this function to handle other clients or stop
            print("Data is:", request_data)
            print("\n")
            # The request is a header line followed by a JSON body
            headers, json_string = request_data.split('\n', 1)
            request = json.loads(json_string)

            print(request)

            response = {
                'type': 'response',
                'success': False,
                'seq': seq,
                # Echo the client's sequence number if it sent one
                'request_seq': request.get('seq', 0),
                'body': {},
                'command': request.get('command', ''),
                'message': ''
            }
            

            try:
                if request['command'] == 'initialize':
                    path_to_executable = request.get('path_to_executable')
                    if not path_to_executable:
                        # Handled by the except block below, which fills in the error fields
                        raise ValueError("Path to executable not provided")

                    self.debugger.load_program(path_to_executable)
                    response["success"] = True

                elif request['command'] == 'setArgs':
                    print("Setting arguments:", request['args'])
                    self.debugger.set_args(request['args'])
                    response["success"] = True

                elif request['command'] == 'start':
                    #print("Starting/Running")
                    self.debugger.start_program()
                    response["success"] = True

                elif request['command'] == 'setBreakpoints':
                    #print("Setting")
                    self.debugger.set_breakpoint(request['location'])
                    response["success"] = True

                elif request['command'] == 'next':
                    self.debugger.step_over()
                    response["success"] = True

                elif request['command'] == 'continue':
                    self.debugger.continue_exec()
                    response["success"] = True

                elif request['command'] == 'stackTrace':
                    response["data"] = self.debugger.get_stack_trace()
                    response["success"] = True

                elif request['command'] == 'get_variable':
                    var_value = self.debugger.get_variable_value(request['var_name'])
                    response["data"] = str(var_value)
                    response["success"] = True

                elif request['command'] == 'print_locals':
                    response["data"] = self.debugger.print_local_variables()
                    response["success"] = True
                elif request['command'] == 'setBreakpointAtLine':
                    response['message'] = self.debugger.set_breakpoint_at_line(request['line'])
                    response["success"] = True
                else:
                    response["success"] = False
                    response["status"] = "error"
                    response["data"] = "Unknown command"

            except Exception as e:
                response["success"] = False
                response["status"] = "error"
                response["data"] = str(e)

            response["seq"] = seq
            # sendall() keeps writing until the whole response has been sent
            client_socket.sendall(json.dumps(response).encode('utf-8'))
            seq += 1

    def run(self):
        client_socket, addr = self.server_socket.accept()
        print(f"Accepted connection from {addr}")
        try:
            self.handle_client(client_socket)
        finally:
            # Close the connection even if handling the client raised an error
            client_socket.close()
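
`handle_client` treats each `recv(1024)` as exactly one request, but a TCP stream may deliver part of a message or several messages at once. The sketch below shows one possible way to buffer incoming bytes until a full message is available. It assumes the client uses DAP-style `Content-Length` framing with a blank line after the header. `MessageBuffer` and `frame` are hypothetical helpers, not part of the server above.

```python
import json

class MessageBuffer:
    """Accumulates raw bytes and extracts each complete framed message."""

    def __init__(self):
        self._buffer = b""

    def feed(self, chunk):
        """Add bytes from recv() and return a list of fully received messages."""
        self._buffer += chunk
        messages = []
        while True:
            header, sep, rest = self._buffer.partition(b"\r\n\r\n")
            if not sep:
                break  # header not complete yet
            length = None
            for line in header.split(b"\r\n"):
                name, _, value = line.partition(b":")
                if name.strip().lower() == b"content-length":
                    length = int(value.strip().decode("ascii"))
            if length is None:
                raise ValueError("Content-Length header missing")
            if len(rest) < length:
                break  # body not complete yet
            messages.append(json.loads(rest[:length].decode("utf-8")))
            self._buffer = rest[length:]
        return messages

def frame(message):
    """Frame a message with a Content-Length header (demo helper)."""
    body = json.dumps(message).encode("utf-8")
    return b"Content-Length: %d\r\n\r\n" % len(body) + body

# Two messages arrive split at an arbitrary byte boundary
stream = frame({"command": "next"}) + frame({"command": "continue"})
buffer = MessageBuffer()
first = buffer.feed(stream[:10])   # partial header, so nothing is returned yet
rest = buffer.feed(stream[10:])    # the remainder completes both messages
print(first, rest)
```

In the server loop, each chunk returned by `recv()` would be passed to `feed()`, and the dispatch logic would run once per returned message.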


In [None]:
# Create a debugger instance
debugger = Debugger()

# Create and run the server
server = DebugServer(debugger)
server.run()