hack-computer

Implementation of a computer system as described in the book "The Elements of Computing Systems" by Nisan and Schocken.

Blog Posts

Throughout this project I published a series of blog posts:

Hardware platform

1. Boolean Logic

Boolean algebra
Gate logic
- Multibit gate logic
Hardware construction
Hardware description language
Hardware simulation

2. Boolean Arithmetic

Binary numbers
Binary addition
Signed binary numbers
Adders
- Half-adder
- Full-adder
- Adder
- Incrementer
ALU

3. Sequential Logic

The Clock
Flip-Flops
Registers
Memories
Counters
Why Time Matters
- Combinational chips (depend on inputs regardless of time)
- Sequential chips (containing one or more DFF gates)

4. Machine Language

Machines
- Memory
- Processor
- Registers
Languages
Commands
- Arithmetic and logic operations
- Boolean operations
- Memory access
  - Direct addressing
  - Immediate addressing
  - Indirect addressing
- Flow of control
Hack language specification overview

5. Computer Architecture

The stored program concept
The von Neumann architecture
Memory
CPU
Registers
Input and output

Software Hierarchy System

6. Assembler Compiler

The assembler is the first module in the software hierarchy. Chapter 4 presented machine languages in both assembly and binary. This chapter explains how to develop a Hack assembler that generates binary code to run on chapter 5's hardware platform. Assemblers must manage user-defined symbols and resolve them to physical memory addresses, often using a symbol table data structure.

The software projects creating translator programs (assembler, virtual machine, compiler) can be coded in any programming language.

Machine languages are specified in symbolic and binary forms. Binary codes, e.g. 110000101000000110000000000000111, represent machine instructions for the hardware. For example, the 8 leftmost bits could represent an opcode (e.g. LOAD), the next 8 bits a register (e.g. R3), and the final 16 bits an address (e.g. 7).

Depending on the hardware logic design and agreed-upon machine language, a 32-bit pattern can cause the hardware to 'load the contents of Memory[7] into register R3'. To simplify the complexity of machine languages with many operation codes, memory addressing modes, and instruction formats, instructions can be documented using an agreed-upon syntax, e.g. LOAD R3,7 instead of 110000101000000110 000000000000111.

7. Virtual Machine I: Stack Operation

We will approach this task in two stages, each spanning two chapters. High-level programs will first be translated into an intermediate code, which then gets translated into machine language. This two-tier translation model is an old idea from the 1970s, recently revived by modern languages like Java and C#. The intermediate code is designed to run on a Virtual Machine, allowing software to run on many processors and systems without changing the source code. This chapter presents a VM architecture modeled after the Java VM. We focus on two perspectives: motivate and specify the VM, then implement it over the Hack platform with a VM translator. Another implementation is a VM emulator, which runs the VM on a standard PC using Java.

We present a VM language consisting of four types of commands: arithmetic, memory access, program flow, and subroutine calling. We split the implementation into two parts, each covered in a separate chapter and project. This chapter builds a basic translator for arithmetic and memory access commands. The next chapter adds program flow and subroutine calling. The virtual machine illustrates many ideas in computer science, such as emulation, code compatibility, and stack processing.

8. Virtual Machine II: Program Control

Program control
Branching
Functions

9. High-level language

10. Compiler I: Syntax analysis

Background
- Compilers usually consist of two modules: syntax analysis and code generation.
- Syntax analysis is usually divided into two modules: tokenizing and parsing.
Lexical Analysis
- Lexical analysis is grouping the characters into tokens (as defined by the language syntax), while ignoring white space and comments. This could also be called scanning or tokenizing.
Grammars
- To process the tokens, we need to group them based on grammar rules into language constructs such as variable declarations, statements, and expressions.
- A context-free grammar is a set of rules specifying how syntactic elements in some language can be formed from simpler ones.
- The grammar specifies allowable ways to combine tokens, also called terminals, into higher-level syntactic elements, also called non-terminals.
Parsing
- To check whether a grammar ‘‘accepts’’ an input text as valid, we parse it. This means determining the exact correspondence between the text and the rules of a given grammar.
- Since the grammar rules are hierarchical, the output of the parser can be described in a tree-oriented data structure called a parse tree or a derivation tree.
- Recursive Descent Parsing: for every rule in the grammar describing a non-terminal, the parser program has a recursive routine designed to parse that non-terminal.
- Whenever non-terminal has several alternative derivation rules and the first token suffices to resolve which rule to use without ambiguity, the grammar is called LL(0).
The Jack Language Grammar
- Formal spec of the language, aimed at Jack compiler developers
A Syntax Analyzer for the Jack Language
- At each point in the parsing process, knowing the structural identity of the program element that it is currently reading in a complete recursive sense.
The Syntax Analyzer’s Input
The Syntax Analyzer’s Output

11. Compiler II: code generation

Background
- Data translation: programs manipulate many different types of variables which have different life cycles and scopes. The compiler must map all variables on a symbol table based on its type and kind.
- Command translation: expression evaluation and flow control.
Specification
- Usage: prompt> JackCompiler source
- Standard mapping over the virtual machine: file function and naming, memory allocation and access, subroutine calling, returning from void methods, constants, and use of the OS services.
Implementation
- The compiler builds upon the syntax analyzer. The overall compiler architecture will morph into 5 total modules: JackCompiler (top-level driver that sets up and invokes the other modules), JackTokenizer (tokenizer), SymbolTable (symbol table), VMWriter (output module for generating VM code), and the CompilationEngine (recursive top-down compilation engine).

Name		Name	Last commit message	Last commit date
Latest commit History 85 Commits
01		01
02		02
03		03
04		04
05		05
assembler		assembler
jackProgramExample		jackProgramExample
unit4-compiler-i		unit4-compiler-i
unit5-compiler-ii		unit5-compiler-ii
vmTranslator		vmTranslator
vmTranslator2		vmTranslator2
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

hack-computer

Blog Posts

Hardware platform

1. Boolean Logic

2. Boolean Arithmetic

3. Sequential Logic

4. Machine Language

5. Computer Architecture

Software Hierarchy System

6. Assembler Compiler

7. Virtual Machine I: Stack Operation

8. Virtual Machine II: Program Control

9. High-level language

10. Compiler I: Syntax analysis

11. Compiler II: code generation

12. Operating System

About

Releases

Packages

Languages

License

maxdemaio/hack-computer

Folders and files

Latest commit

History

Repository files navigation

hack-computer

Blog Posts

Hardware platform

1. Boolean Logic

2. Boolean Arithmetic

3. Sequential Logic

4. Machine Language

5. Computer Architecture

Software Hierarchy System

6. Assembler Compiler

7. Virtual Machine I: Stack Operation

8. Virtual Machine II: Program Control

9. High-level language

10. Compiler I: Syntax analysis

11. Compiler II: code generation

12. Operating System

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages