Dor-sketch/openu_course20465_project

🛠 Assembler for Custom Assembly Language

This project involved the development of an assembler for a specialized assembly language. Its primary aim was to convert human-readable assembly instructions into binary machine code, bridging the gap between high-level programming concepts and low-level execution on computers.

Implemented in ANSI C, this project demonstrates a strong understanding of foundational programming principles. It was written for the 20465 System Programming Laboratory course at The Open University of Israel during the 2021-2022 academic year and received a grade of 98.



🚀 Features

  • Preprocessing 🧹: The assembler supports preprocessing tasks, including macro expansion and line numbering.

  • Syntax Checking ✅: The assembler ensures syntax accuracy, checking for valid opcodes and operands.

  • Symbol Table 📚: The assembler generates a symbol table, computing label memory addresses.

  • Machine Code Generation 💻: The assembler produces the machine code and data images.

  • Output Files 📁: The assembler prints output files such as the machine code file, external data words file, and entry type symbols file.

  • Error Handling 🚨: The assembler handles various syntax and semantic errors, providing descriptive error messages, including line numbers and error types with clickable links to the relevant code.

  • Dynamic Memory Allocation 🧠: The assembler uses dynamic memory allocation to manage memory efficiently.

  • Modular Design 🧩: The assembler is designed with a modular architecture, with each module responsible for a specific task.

  • Coding Standards 📏: The assembler adheres to the project's coding standards, including naming conventions, indentation, and documentation.

  • Testing 🧪: The assembler is thoroughly tested with a test suite covering a wide range of scenarios, and valgrind reports no memory leaks or errors.


🤖 Usage

New GUI

The assembler now includes a GUI that lets users assemble code with a few clicks. The GUI is built with Gtk+ and written in C++, and it integrates with the assembler's C codebase through extern "C". This lets the assembler's main function obtain services from the GUI, such as the input file path and the output directory path, without changes to the assembler's existing code.
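
The extern "C" integration described above usually rests on a shared header that both compilers can read. The following is a minimal sketch of that pattern, not code from the repository: the header name gui_bridge.h and the functions gui_get_input_path and gui_get_output_dir are hypothetical, and the stand-in definitions exist only so the sketch compiles on its own.

```c
/* gui_bridge.h (hypothetical name): declarations shared by the C assembler
 * and the C++ GUI. */
#ifndef GUI_BRIDGE_H
#define GUI_BRIDGE_H

#ifdef __cplusplus
extern "C" {    /* seen only by a C++ compiler: use C linkage, no name mangling */
#endif

/* Hypothetical services the GUI could offer to the assembler's main(). */
const char *gui_get_input_path(void);
const char *gui_get_output_dir(void);

#ifdef __cplusplus
}
#endif

#endif /* GUI_BRIDGE_H */

/* Stand-in definitions so the sketch links by itself. In a real build the
 * C++ GUI source would define these functions inside extern "C". */
const char *gui_get_input_path(void) { return "input_example"; }
const char *gui_get_output_dir(void) { return "."; }
```

Because the guard hides extern "C" from the C compiler, the same header can be included unchanged from the ANSI C sources and from the C++ GUI.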

Note: The GUI has so far been tested only on Ubuntu 22.04 and macOS Sonoma and is considered a work in progress. For stable usage, please use the command line interface or a previous version of the assembler.


Before running the assembler, make sure gcc and Gtk+ are installed on your machine.

You can install Gtk+ on macOS using brew:

brew install gtk+3

Command Line

Use the assembler by providing an input file with assembly code. The output includes several files: a machine code file, an external data words file, and an entry type symbols file.

make
./assembler <input>    # input file name without the .as extension, e.g. input_example
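
Since the command takes the file name without its extension, the program has to append the extensions itself. The sketch below shows one way to do that in ANSI C; build_file_name is a hypothetical helper, not a function taken from the repository.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated "<base><ext>" string, or NULL if allocation
 * fails. The caller is responsible for freeing the result.
 * build_file_name is a hypothetical helper used for illustration only. */
char *build_file_name(const char *base, const char *ext)
{
    size_t len = strlen(base) + strlen(ext) + 1; /* +1 for the terminator */
    char *name = (char *)malloc(len);

    if (name == NULL)
        return NULL;
    strcpy(name, base);
    strcat(name, ext);
    return name;
}
```

Under this assumption, main could call build_file_name(argv[1], ".as") to open the source and reuse the same base name with ".ob", ".ent" and ".ext" for the output files.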

✅ Examples

Successful Assembly Output

The listings below show the output files generated by the assembler from the input_example.as file:

  • Assembly Code Snippet (ps.am):

    ; Assembly code that defines data, strings, and contains various instructions
    ; including 'add', 'prn', 'lea', 'inc', 'mov', 'sub', 'bne', 'cmp', 'dec', and 'stop'.
    .entry LIST
    .extern  W
    MAIN:  add r3,LIST
    LOOP:  prn #48
      macro m1 ; macro definition
      inc r6
      mov r3, W
      endm
      lea STR, r6
      m1 ; macro call
      sub r1, r4
      bne END
      cmp vall, #-6
      bne END[r15]
      dec K
    .entry MAIN
      sub LOOP[r10],r14
    END:  stop
    STR:  .string "abcd"
    LIST:  .data 6,-9
      .data -100
    .entry K
    K:  .data 31
    .extern vall

    Note: the preprocessor stage expands the macro, producing:

    .entry LIST
    .extern  W
    MAIN:  add r3,LIST
    LOOP:  prn #48
      lea STR, r6
      inc r6
      mov r3, W
      sub r1, r4
      bne END
      cmp vall, #-6
      bne END[r15]
      dec K
    .entry MAIN
      sub LOOP[r10],r14
    END:  stop
    STR:  .string "abcd"
    LIST:  .data 6,-9
      .data -100
    .entry K
    K:  .data 31
    .extern vall
  • Entry Symbol Table (input_example.ent):

      ; List of entry symbols and their addresses
      K,0144,0005
      LIST,0144,0002
      MAIN,0096,0004
    
  • External Symbol References (input_example.ext):

      ; External symbols and their references in the code
      vall BASE 0125
      vall OFFSET 0126
      W BASE 0115
      W OFFSET 0116
    
  • Machine Code Output (input_example.ob):

    ; Binary representation of the assembly code
    41 9
    0100 A4-B0-C0-D0-E4
    ... (additional lines of machine code)
    0149 A4-B0-C0-D1-Ef
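
The sample outputs above can be reproduced with a little arithmetic. In the .ent listing each address appears to be split into a base that is a multiple of 16 plus a remainder offset (MAIN at 100 becomes 0096,0004), and each .ob line appears to print a 20-bit word as five hex digits labelled A through E. The sketch below encodes both readings. The alignment of 16, the 20-bit width and the digit order are assumptions inferred from the sample values, and split_address and format_word are hypothetical names.

```c
#include <stdio.h>

#define BASE_ALIGN 16 /* assumption inferred from the sample .ent values */

/* Splits an address into a base (the largest multiple of BASE_ALIGN that
 * does not exceed it) and the remaining offset. */
void split_address(unsigned address, unsigned *base, unsigned *offset)
{
    *offset = address % BASE_ALIGN;
    *base = address - *offset;
}

/* Formats one word in the style of the .ob listing: a 4-digit decimal
 * address followed by five hex digits labelled A (most significant)
 * through E. buf must have room for at least 32 characters. */
void format_word(char *buf, unsigned address, unsigned long word)
{
    sprintf(buf, "%04u A%lx-B%lx-C%lx-D%lx-E%lx",
            address,
            (word >> 16) & 0xFUL,
            (word >> 12) & 0xFUL,
            (word >> 8) & 0xFUL,
            (word >> 4) & 0xFUL,
            word & 0xFUL);
}
```

With these assumptions, split_address(149, ...) yields base 144 and offset 5, matching the entry for K, and format_word(buf, 149, 0x4001FUL) produces the last line of the listing, where the low digits 1f are the value 31 stored at K.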
    

Error Handling

Below is a screenshot showing how the assembler handles various syntax and semantic errors from errors_example.as. Each error message is designed to be descriptive, guiding the user to identify and rectify the issues within the assembly code.

[Screenshot: error messages produced by the assembler for errors_example.as]

The error messages include issues like undefined operations, missing operands, invalid target registers, and failures to find symbols for direct addressing mode, showcasing the assembler's comprehensive error-checking capabilities.
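
Messages of this kind are often written with a file:line prefix, a form that many terminals and editors turn into a clickable link to the source line. The sketch below shows that convention only. The project's exact message format is not shown here, so format_error, the "error:" label and the line number in the usage note are all assumptions.

```c
#include <stdio.h>

/* Writes "<file>:<line>: error: <message>" into buf and returns the number
 * of characters written. The caller must supply a buffer large enough for
 * the whole message. format_error is a hypothetical helper. */
int format_error(char *buf, const char *file, int line, const char *message)
{
    return sprintf(buf, "%s:%d: error: %s", file, line, message);
}
```

For example, format_error(buf, "errors_example.as", 12, "undefined operation") fills buf with "errors_example.as:12: error: undefined operation".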


🧩 Modules

The assembler includes several modules:

  • 📝 pre.c: Manages preprocessing tasks, including macro expansion and line numbering.
  • 🔎 syntax.c: Ensures syntax accuracy, checking for valid opcodes and operands.
  • 🚦 first_pass.c: Conducts the first assembly pass, generating a symbol table and computing label memory addresses.
  • 🚀 second_pass.c: Performs the second pass, producing the machine code and data images.
  • 🖨️ print_output.c: Prints output files such as the machine code file, external data words file, and entry type symbols file.
  • 🏁 main.c: Coordinates the other modules to produce the final output.
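
To make the preprocessing step concrete, here is a minimal sketch of macro expansion using the macro ... endm syntax from the example above. It is not the code in pre.c: the fixed-size table, the limits and the name expand_macros are assumptions chosen to keep the example short, and it assumes each line is shorter than MAX_LINE and that a macro call is the first word on its line.

```c
#include <stdio.h>
#include <string.h>

#define MAX_LINE   81 /* assumed maximum line length, including terminator */
#define MAX_MACROS 8
#define MAX_BODY   8

struct macro {
    char name[MAX_LINE];
    char body[MAX_BODY][MAX_LINE];
    int  body_len;
};

/* Copies src[0..n-1] into dst, dropping macro definitions and replacing
 * each macro call with the stored body. Returns the number of lines
 * written, never more than dst_cap. */
int expand_macros(const char *src[], int n, char dst[][MAX_LINE], int dst_cap)
{
    struct macro table[MAX_MACROS];
    struct macro *current = NULL; /* macro being defined, if any */
    char first[MAX_LINE], second[MAX_LINE];
    int macro_count = 0, out = 0, i, j, k;

    for (i = 0; i < n; i++) {
        /* Extract the first two whitespace-delimited words of the line. */
        first[0] = second[0] = '\0';
        sscanf(src[i], "%80s %80s", first, second);

        if (current != NULL) { /* inside a definition */
            if (strcmp(first, "endm") == 0)
                current = NULL;
            else if (current->body_len < MAX_BODY)
                strcpy(current->body[current->body_len++], src[i]);
            continue;
        }
        if (strcmp(first, "macro") == 0 && second[0] != '\0'
                && macro_count < MAX_MACROS) {
            current = &table[macro_count++]; /* start a new definition */
            strcpy(current->name, second);
            current->body_len = 0;
            continue;
        }
        /* Look for a macro whose name matches the first word. */
        for (j = 0; j < macro_count; j++)
            if (strcmp(first, table[j].name) == 0)
                break;
        if (j < macro_count) { /* macro call: emit the body */
            for (k = 0; k < table[j].body_len && out < dst_cap; k++)
                strcpy(dst[out++], table[j].body[k]);
        } else if (out < dst_cap) { /* ordinary line: copy unchanged */
            strcpy(dst[out++], src[i]);
        }
    }
    return out;
}
```

Fed the lines around the m1 definition from the example, the sketch drops the definition and emits inc r6 and mov r3, W in place of the m1 call, as in the expanded listing. A real preprocessor would also need dynamic storage and error handling for cases such as a missing endm.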

👥 Contributing

Contributors are welcome! Fork the repository and submit a pull request with your changes. Please ensure your contributions are well-tested and adhere to the project's coding standards.

📜 License

This project is licensed under the MIT License.