usage: compiler.py [-h] [--print-tokens] [--save-tokens SAVE_TOKENS] [--print-productions] [-s SAVE_PRODUCTIONS] [--symbol-table SYMBOL_TABLE] source file

positional arguments:
  source file           Path to the source code file that will be compiled

options:
  -h, --help            show this help message and exit
  --print-tokens        print tokens returned by lexer
  --save-tokens SAVE_TOKENS
                        Specify output file for tokens
  --print-productions   Print productions to console
  -s SAVE_PRODUCTIONS, --save-productions SAVE_PRODUCTIONS
                        Save syntax analyzer productions to a file
  --symbol-table SYMBOL_TABLE
                        Save symbol table to a file
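The help text above looks like output from Python's argparse module. As a rough illustration only, the sketch below shows one way such a command line could be declared. The function name build_parser and the file names program.txt and productions.txt are hypothetical, and the real compiler.py may be organized differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the options listed in the help text (a sketch, not the project's code)."""
    parser = argparse.ArgumentParser(prog="compiler.py")
    # Positional argument; the metavar reproduces the "source file" label in the help.
    parser.add_argument("source_file", metavar="source file",
                        help="Path to the source code file that will be compiled")
    # Flags that take no value are modeled as booleans.
    parser.add_argument("--print-tokens", action="store_true",
                        help="print tokens returned by lexer")
    parser.add_argument("--save-tokens", help="Specify output file for tokens")
    parser.add_argument("--print-productions", action="store_true",
                        help="Print productions to console")
    parser.add_argument("-s", "--save-productions",
                        help="Save syntax analyzer productions to a file")
    parser.add_argument("--symbol-table", help="Save symbol table to a file")
    return parser


# Hypothetical invocation: print the tokens and save the productions to a file.
args = build_parser().parse_args(
    ["--print-tokens", "-s", "productions.txt", "program.txt"]
)
print(args.source_file, args.print_tokens, args.save_productions)
```

Options that are not supplied default to None (or False for the two print flags), so the caller can test each one before producing the corresponding output.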
A comment starts with [* and ends with *]. Comments are ignored by the lexical analyzer and the syntax analyzer.
[* This is a one line comment *]
[* This is a multiline comment
that spans three rows. All three
rows will be ignored by the compiler. *]
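One possible way to discard comments before tokenizing is a non-greedy regular expression, sketched below. The name strip_comments is hypothetical, and two choices here are assumptions rather than part of the description above: each comment is replaced by a single space so that neighboring tokens stay separated, and an unterminated comment is left in place.

```python
import re

# "[*" followed by the shortest run of any characters (including newlines)
# up to the first "*]". DOTALL lets "." match line breaks in multiline comments.
COMMENT = re.compile(r"\[\*.*?\*\]", re.DOTALL)


def strip_comments(source: str) -> str:
    """Return source with every [* ... *] comment replaced by one space."""
    return COMMENT.sub(" ", source)


program = """x = 1; [* This is a one line comment *]
[* This is a multiline comment
that spans three rows. *]
y = 2;"""
print(strip_comments(program).split())
```

A hand-written lexer could equally skip comments while scanning characters; the regular expression is used here only because it keeps the example short.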
An identifier is a sequence of letters (a-z), digits (0-9), and underscores ("_"). The first character of an identifier must be a letter. Case is insignificant: the lexical analyzer must convert all uppercase letters to lowercase, so that every token is lowercase.
The following identifiers are keywords of the language and cannot be used as ordinary identifiers.
boolean else endif
false function if
integer print real
return scan true
while endwhile
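The identifier and keyword rules can be sketched as a small classifier. The function name classify_word and the returned (kind, lexeme) tuple are hypothetical; only the keyword list and the identifier pattern come from the text above.

```python
import re

# The fourteen reserved words listed above.
KEYWORDS = frozenset({
    "boolean", "else", "endif", "false", "function", "if", "integer",
    "print", "real", "return", "scan", "true", "while", "endwhile",
})

# A letter first, then letters, digits, or underscores. ASCII mode keeps
# case-insensitive matching from accepting non-ASCII letters.
IDENTIFIER = re.compile(r"[a-z][a-z0-9_]*", re.IGNORECASE | re.ASCII)


def classify_word(lexeme: str):
    """Return (kind, lowercase lexeme), or None if lexeme is not a valid identifier."""
    if IDENTIFIER.fullmatch(lexeme) is None:
        return None
    lowered = lexeme.lower()  # case is insignificant, so normalize to lowercase
    kind = "keyword" if lowered in KEYWORDS else "identifier"
    return kind, lowered


print(classify_word("While"))    # keyword, because case is ignored
print(classify_word("Total_1"))  # ordinary identifier, lowercased
print(classify_word("_count"))   # None: must start with a letter
```

Checking the keyword set after lowercasing means that While, WHILE, and while are all treated as the same reserved word.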
An integer is an unsigned decimal number, written as a sequence of one or more digits.
Examples of integers
7 153 1849375932
A real is an integer, followed by a dot ("."), followed by an integer. There must be an integer both before and after the dot, so something like 89. or .57 would be invalid.
Examples of reals
3.14 22.5 0.99 25.0
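The integer and real rules translate directly into two patterns. The sketch below is one way to check a lexeme against them; the name classify_number is hypothetical.

```python
import re

INTEGER = re.compile(r"[0-9]+")         # one or more digits, no sign
REAL = re.compile(r"[0-9]+\.[0-9]+")    # digits, a dot, then digits


def classify_number(lexeme: str):
    """Return "real", "integer", or None when lexeme is neither."""
    if REAL.fullmatch(lexeme):
        return "real"
    if INTEGER.fullmatch(lexeme):
        return "integer"
    return None


for sample in ["153", "3.14", "25.0", "89.", ".57"]:
    print(sample, classify_number(sample))
```

Because fullmatch requires the whole lexeme to fit the pattern, 89. and .57 are rejected rather than being read as an integer with a stray dot.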
The following tokens are operators.
+ - * /
== != > <
<= => =
The following tokens are separators.
( ) { }
, ; $
Space (' ') is also a separator used to separate tokens and symbols.
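Several operators share a first character (for example = and ==), so a scanner has to try the two-character symbols before the one-character ones. The sketch below illustrates that longest-match idea for operators and separators only. The name scan_symbols and the (kind, symbol) tuples are hypothetical, and raising ValueError on any other character is a simplification made for the example.

```python
# Two-character operators come first so that "==" is not read as "=" twice.
OPERATORS = ("==", "!=", "<=", "=>", "+", "-", "*", "/", ">", "<", "=")
SEPARATORS = ("(", ")", "{", "}", ",", ";", "$")


def scan_symbols(text: str):
    """Split text made of operators, separators, and spaces into tokens."""
    tokens = []
    i = 0
    while i < len(text):
        if text[i] == " ":  # a space only separates tokens; it yields no token
            i += 1
            continue
        for symbol in OPERATORS + SEPARATORS:
            if text.startswith(symbol, i):
                kind = "operator" if symbol in OPERATORS else "separator"
                tokens.append((kind, symbol))
                i += len(symbol)
                break
        else:
            # Identifiers, numbers, and comments are outside this sketch.
            raise ValueError(f"unexpected character {text[i]!r} at position {i}")
    return tokens


print(scan_symbols("( ) <= = == ;"))
```

A full lexer would combine this with the identifier, number, and comment rules and dispatch on the current character instead of raising an error.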