hasm stands for Hack Assembler, an assembler for the Hack Platform. This project is based on the sixth chapter of the book "The Elements of Computing Systems: Building a Modern Computer from First Principles", by Nisan and Schocken, where the platform is fully described. For more information, see nand2tetris.
The hasm assembler is a cross-platform command line tool. From a valid .asm file, hasm will generate a valid .hack textual binary file. This output file can then be executed in any implementation of the Hack Platform.
Staying true to the chapter's goal, the assembler is implemented without the help of parser generators or tools like Boost's Spirit. Some parts of the code follow the authors' suggested API, while others set that API aside in favor of a broader implementation.
hasm goes a bit further than the assembler described in the book, offering these extra features:
- code analysis (lexical, syntactic and a tiny bit of semantic) with error messages;
- ability to export the symbol table to a file.
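As an illustration of the kind of lexical check involved, here is a minimal sketch of a symbol validator. It follows the symbol rule from the Hack assembly specification (letters, digits, `_`, `.`, `$` and `:`, not starting with a digit). The function name `is_valid_symbol` is hypothetical and is not taken from the hasm source; the actual checks hasm performs may differ.

```cpp
#include <cctype>
#include <string_view>

// Hypothetical helper, not part of hasm's real API.
// Returns true if `s` is a well-formed Hack symbol: a non-empty sequence of
// letters, digits, '_', '.', '$' or ':' that does not begin with a digit.
bool is_valid_symbol(std::string_view s) {
    constexpr std::string_view extra_chars = "_.$:";
    if (s.empty() || std::isdigit(static_cast<unsigned char>(s.front()))) {
        return false;
    }
    for (char c : s) {
        const auto uc = static_cast<unsigned char>(c);
        if (!std::isalnum(uc) && extra_chars.find(c) == std::string_view::npos) {
            return false;
        }
    }
    return true;
}
```

A lexer built along these lines could report an error message for a name such as `2abc` or `a-b` instead of silently producing wrong output.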
Suppose there is a file named add.asm containing the following Hack assembly program:
@2
D=A
@3
D=D+A
@0
M=D
We can run the assembler from the command line, passing the path to the source file as an argument:
$ ./hasm --input-file add.asm
This will generate an output file called add.hack, with the following content:
0000000000000010
1110110000010000
0000000000000011
1110000010010000
0000000000000000
1110001100001000
This file can then be loaded into any Hack-compliant machine, where it will correctly reflect the logic of the program described in the source file.
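To make the mapping from add.asm to add.hack concrete, here is a minimal sketch of the two instruction encodings defined by the Hack specification: an A-instruction is a 0 followed by a 15-bit value, and a C-instruction is laid out as `111 a cccccc ddd jjj`. The function names `encode_a` and `encode_c` are hypothetical and are not taken from the hasm source, and the lookup tables only cover the mnemonics used in add.asm.

```cpp
#include <bitset>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical helper: encodes "@value" as a 0 bit followed by 15 value bits.
std::string encode_a(std::uint16_t value) {
    return std::bitset<16>(value & 0x7FFF).to_string();
}

// Hypothetical helper: encodes "dest=comp" (no jump) as 111 a cccccc ddd jjj.
// The tables are deliberately partial; a full assembler lists every mnemonic.
std::string encode_c(const std::string& dest, const std::string& comp) {
    static const std::map<std::string, std::string> comp_bits = {
        {"A", "0110000"}, {"D", "0001100"}, {"D+A", "0000010"}};
    static const std::map<std::string, std::string> dest_bits = {
        {"M", "001"}, {"D", "010"}};
    // map::at throws std::out_of_range for mnemonics missing from these tables.
    return "111" + comp_bits.at(comp) + dest_bits.at(dest) + "000";
}
```

With these definitions, `encode_a(2)` yields `0000000000000010` and `encode_c("D", "A")` yields `1110110000010000`, matching the first two lines of add.hack.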
If you want to take a look at the symbol table generated during the assembling process, you can add the flag --symbol-table (or the shorthand -s) to the command line, like so:
$ ./hasm --symbol-table --input-file pong.asm
This will generate (alongside the assembled pong.hack) a text file called pong.sym that could look like this:
0x0001 LCL
0x0023 END_GT
0x0033 END_LT
0x028e LOOP_ball.setdestination
0x072d LOOP_ball.bounce
0x17f6 LOOP_keyboard.readline
0x6000 KBD
The exported .sym table lists the addresses of user-defined symbols (like END_GT and LOOP_ball.bounce in the example above), as well as Hack predefined symbols such as KBD and LCL.
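The layout of the .sym file can be sketched as follows. This is only an approximation inferred from the sample above (a four-digit lowercase hexadecimal address, a space, then the symbol name, ordered by address); the function name `format_symbol_table` and the use of `std::map` as the table type are assumptions, not hasm's actual implementation.

```cpp
#include <algorithm>
#include <cstdint>
#include <iomanip>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: renders a symbol table as "0xADDR NAME" lines,
// sorted by address (ties broken by name).
std::string format_symbol_table(const std::map<std::string, std::uint16_t>& symbols) {
    std::vector<std::pair<std::uint16_t, std::string>> rows;
    for (const auto& [name, address] : symbols) {
        rows.emplace_back(address, name);
    }
    std::sort(rows.begin(), rows.end());

    std::ostringstream out;
    out << std::hex << std::setfill('0');
    for (const auto& [address, name] : rows) {
        // setw applies only to the next value written, i.e. the address.
        out << "0x" << std::setw(4) << address << ' ' << name << '\n';
    }
    return out.str();
}
```

Writing the returned string to a file named after the input (for example pong.sym) would give output in the same shape as the listing above.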
Running hasm from the command line with the argument --help (or the shorthand -h) will print the usage message:
$ ./hasm --help
hasm: assembler for the nand2tetris hack platform
Usage: ./hasm [OPTIONS]
Options:
-h,--help Print this help message and exit
-i,--input-file TEXT input .asm file
-s,--symbol-table Needs: --input-file
export symbol table (to <input file>.sym)
-v,--version Display program version information and exit
The assembler is written in C++20 and uses CMake to manage the build process. Aside from the C++ Standard Library, hasm uses CLI11 and Catch2.
The following list enumerates the minimum requirements for tools and dependencies:
- C++20 compiler
- CMake 3.25 or higher
- CLI11 (managed by the main CMake script)
- Catch2 (managed by the main CMake script)
Run the classic cmake + make on the source directory. It is recommended to do an out-of-source build, that is, to run cmake from a separate directory, usually a build directory inside the source directory. Here is an example on Linux:
hasm$ mkdir build
hasm$ cd build
hasm/build$ cmake ..
hasm/build$ cmake --build .