This is a compiler for a simplified programming language, defined by the grammar below.
Currently in Phase 3.
Program := class id { Memberdecls }
Memberdecls := Fielddecls Methoddecls
Fielddecls := Fielddecl Fielddecls
| λ
Methoddecls := Methoddecl Methoddecls
| λ
Fielddecl := Optionalfinal Type id Optionalexpr ;
| Type id [ intlit ] ;
Optionalfinal := final
| λ
Optionalexpr := = Expr
| λ
Methoddecl := Returntype id ( Argdecls ) { Fielddecls Stmts } Optionalsemi
Optionalsemi := ;
| λ
Returntype := Type
| void
Type := int
| char
| bool
| float
Argdecls := ArgdeclList
| λ
ArgdeclList := Argdecl , ArgdeclList
| Argdecl
Stmts := Stmt Stmts
| λ
Stmt := if ( Expr ) Stmt OptionalElse
| while ( Expr ) Stmt
| Name = Expr ;
| read ( Readlist ) ;
| print ( Printlist ) ;
| printline ( Printlinelist ) ;
| id ( ) ;
| id ( Args ) ;
| return ;
| return Expr ;
| Name ++ ;
| Name -- ;
| { Fielddecls Stmts } Optionalsemi
OptionalElse := else Stmt
| λ
Name := id
| id [ Expr ]
Args := Expr , Args
| Expr
Readlist := Name , Readlist
| Name
Printlist := Expr , Printlist
| Expr
Printlinelist := Printlist
| λ
Expr := Name
| id ( )
| id ( Args )
| intlit
| charlit
| strlit
| floatlit
| true
| false
| ( Expr )
| ~ Expr
| - Expr
| + Expr
| ( Type ) Expr
| Expr Binaryop Expr
| ( Expr ? Expr : Expr )
Binaryop := * | / | + | - | < | > | <= | >= | == | <> | || | &&
There are a few additional rules that are important to note:
● an identifier has a leading letter followed by zero or more letters or digits
● an integer literal has a digit followed by zero or more digits
● a character literal begins with a ', is followed by a single character description, and is
terminated by a '. A single character description can be any legal character other than ' or \;
to signify those characters, prefix them with a \, as in '\'' or '\\'. Some special
characters include '\t' (tab character) and '\n' (newline character)
● a floating point literal consists of one or more digits, followed by a decimal point (.),
followed by one or more digits
● a string literal begins with a ", is followed by zero or more string characters, and ends with a
". The string characters cannot directly include a newline character, a tab character, a backslash
character, or a double quote; these must be signified using \ (as in \n for newline, \t
for tab character, \\ for backslash, and \" for double quote), so "ab\tcd\n\"" is a string consisting of an a, b,
tab character, c, d, newline character, and a double quote character
● white space includes space, newline, return and tab characters
● comments are included as follows:
○ on any line the characters \\ start a comment that is ended at the end of that line
○ \* opens a comment that continues until the first occurrence of *\
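The lexical rules above can be sketched as a small regex-based tokenizer. This is only an illustration in Python, not the project's actual lexer: the token names (FLOATLIT, INTLIT, and so on), the KEYWORDS set (collected from the grammar's terminals), and the tokenize function are all hypothetical names chosen for this sketch.

```python
import re

# Reserved words taken from the grammar's terminals (assumed to be keywords).
KEYWORDS = {
    "class", "final", "void", "int", "char", "bool", "float", "if", "else",
    "while", "read", "print", "printline", "return", "true", "false",
}

# One pattern per rule in the list above. Order matters: FLOATLIT must be
# tried before INTLIT so that "3.14" is not split into "3", ".", "14".
TOKEN_SPEC = [
    ("COMMENT",  r"\\\\[^\n]*|\\\*[\s\S]*?\*\\"),  # \\ to end of line, or \* ... *\
    ("FLOATLIT", r"\d+\.\d+"),                     # digits . digits
    ("INTLIT",   r"\d+"),                          # one or more digits
    ("CHARLIT",  r"'(?:\\.|[^'\\\n])'"),           # escape or one plain character
    ("STRLIT",   r'"(?:\\.|[^"\\\n\t])*"'),        # escapes or plain characters
    ("ID",       r"[A-Za-z][A-Za-z0-9]*"),         # letter, then letters or digits
    ("OP",       r"\+\+|--|<=|>=|==|<>|\|\||&&|[-+*/<>=~?:;,(){}\[\]]"),
    ("WS",       r"[ \t\r\n]+"),                   # space, tab, return, newline
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(src):
    """Return a list of (kind, lexeme, offset) triples, skipping white space and comments."""
    tokens = []
    pos = 0
    while pos < len(src):
        match = MASTER.match(src, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {src[pos]!r} at offset {pos}")
        kind, text = match.lastgroup, match.group()
        if kind == "ID" and text in KEYWORDS:
            kind = "KEYWORD"
        if kind not in ("WS", "COMMENT"):
            tokens.append((kind, text, pos))
        pos = match.end()
    return tokens
```

Calling tokenize on a source string yields the token kind, the lexeme, and the character offset where it starts; a real lexer would more likely report line and column numbers.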
To build, run 'make run'; to clean, run 'make clean'.
At this stage, the output is the token/lexeme pairs and where they appear in the input file.
To run the Phase 1 test, make sure the run target is set to phase1, then build with 'make', or build directly with 'make phase1'.
At this stage, the output is a simplified program structure based on the AST built from a smaller section of the grammar.
To build, run 'make' or, if you are in a later phase, 'make phase2'.
At this stage, the output is a structured program based on the AST of the whole grammar. Additionally, some of the grammar has been refactored to resolve issues such as shift/reduce conflicts.
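One source of such conflicts is the rule Expr := Expr Binaryop Expr, which does not say how a + b * c should be grouped. The sketch below is not the project's refactored grammar; it uses precedence climbing, a hand-written parsing technique, only to illustrate how a precedence table removes that ambiguity. The precedence levels, the left associativity, and the function names are assumptions made for this example, since the grammar above does not state them.

```python
# Assumed precedence levels (higher binds tighter); not taken from the project.
PRECEDENCE = {
    "||": 1, "&&": 2,
    "==": 3, "<>": 3,
    "<": 4, ">": 4, "<=": 4, ">=": 4,
    "+": 5, "-": 5,
    "*": 6, "/": 6,
}


def parse_primary(tokens, pos):
    """Parse an operand: a parenthesized expression or a single id/literal token."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok == "(":
        inner, pos = parse_expr(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return inner, pos + 1
    if tok in PRECEDENCE or tok == ")":
        raise SyntaxError(f"unexpected token {tok!r}")
    return tok, pos + 1


def parse_expr(tokens, pos=0, min_prec=1):
    """Parse binary expressions, returning (ast, next_position).

    The AST is a nested tuple (op, left, right). Operators are treated as
    left-associative: the right operand is parsed with a strictly higher
    minimum precedence than the current operator.
    """
    left, pos = parse_primary(tokens, pos)
    while (pos < len(tokens)
           and tokens[pos] in PRECEDENCE
           and PRECEDENCE[tokens[pos]] >= min_prec):
        op = tokens[pos]
        right, pos = parse_expr(tokens, pos + 1, PRECEDENCE[op] + 1)
        left = (op, left, right)
    return left, pos
```

With this table, the token list a, +, b, *, c groups the multiplication first, and a, -, b, -, c groups from the left. In a parser generator, the same effect is usually achieved with precedence declarations or by splitting Expr into one nonterminal per precedence level.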
To build, run 'make' or, if you are in a later phase, 'make phase3'.
This stage adds a symbol table and type checking to the compiler.
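As a rough illustration of what a symbol table and a type check involve, here is a minimal Python sketch. It is not the project's implementation: the SymbolTable class, its method names, and the check_assign helper are hypothetical, and the rules it enforces (no redeclaration in the same scope, no assignment to final names, exact type match on assignment) are assumptions, since the README does not spell out the language's typing rules.

```python
class SymbolTable:
    """A stack of scopes; each scope maps a name to (type, is_final)."""

    def __init__(self):
        self.scopes = [{}]  # the outermost (class-level) scope

    def enter_scope(self):
        # Called on entering a method body or a { ... } block.
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, type_, is_final=False):
        scope = self.scopes[-1]
        if name in scope:
            raise NameError(f"'{name}' is already declared in this scope")
        scope[name] = (type_, is_final)

    def lookup(self, name):
        # Search from the innermost scope outward so inner names shadow outer ones.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        raise NameError(f"'{name}' is not declared")


def check_assign(table, name, expr_type):
    """Check 'Name = Expr ;' under the assumed rules described above."""
    var_type, is_final = table.lookup(name)
    if is_final:
        raise TypeError(f"cannot assign to final '{name}'")
    if var_type != expr_type:
        raise TypeError(f"cannot assign {expr_type} to '{name}' of type {var_type}")
```

A checker built on this would call enter_scope and exit_scope around each method body and block, declare for each Fielddecl and Argdecl, and check_assign for each assignment statement. A real implementation might also allow implicit conversions, such as int to float, which this sketch rejects.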