Skip to content

πŸ”Œ A compiler that scans, parses, and performs context-sensitive analysis of WLP4 code

Notifications You must be signed in to change notification settings

kushbhag/WLP4Compiler

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

10 Commits
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

πŸ’» WLP4 Compiler πŸ’»

The WLP4 Compiler was a compiler that implements scanning, parsing, context sensitive analysis, and code generation of WLP4 Code. The end product was to write a compiler that translates WLP4 code into MIPS assembly language. WLP4 is a subset of the C++ language, and includes functions, integers, pointers, arrays, conditional statements, and while loops. The source code for this program is kept secret because the compiler was built as a part of UWaterloo's Foundations of Sequential Programs course. Feel free to email me at bhagat.kush.a@gmail.com to see it.


πŸ”§ Process to build the compiler

Scanning

The compiler was built by first tokenizing/scanning the WLP4 code. Scanning was implemented using a Simplified Maximal Munch Algorithm, that attempts to consume input until it gets stuck, at which point it determines whether or not the input is in an accepting state. If the input is in an accepting state, then a token is produced, otherwise it will not backtrack and reject the input. The algorithm scanned through the code using an NFA, where the lexical syntax used can be found here

Parsing

Because of the limitations of regular languages when trying to represent arbitrary nesting of something like parentheses, we implemented parsing using Pushdown Automata to figure out whether the code could be recognized using a Context Free Grammar (CFG). The language in question here can be found here. The end goal of parsing is to generate a derivation of the input string given a CFG. This derivation uniquely defines a parse tree which we can use to represent the structure of the program. Within this compiler, we made use of Bottom-Up Parsing algorithm, specifically the LR(1) parser.

Context-Sensitive Analysis

The next step of the compiler was to identify whether the code followed the context-sensitive rules of WLP4. Two of these rules that are very common in a lot of languages are...

  • Not declaring two variables with the same name
  • A variable cannot be used before it is declared

The full set of semantic rules (type-inference rules) can be found here, while the full set of context-sensitive rules can be found here. This process wasn't too complicated, the compiler simply had to parse through the parse tree (built in the parsing section) and ensure that each of the rules were being followed.

Code Generation and Optimization

Definitely my favourite part of the compilation process was code generation. There wasn't any hand wavy algorithm that we could use for this portion, it was just identifying an optimal approach that we could use to generate MIPS assembly code. The real challenge came when trying to optimize the code generation. The course instructors challenged the students to reduce the size of the MIPS program that the compiler generates for a particular WLP4 program (kept secret). Many optimization techniques were used like Constant Folding, Constant Propogation, Strength Reduction, and Dead Code Elimination. The goal was to reduce the generated code size below 180,000 bytes, which I was able to do, as well as coming in third place.

Score Board Marmoset

Was it worth grinding a couple of extra days to try and come first? Probably not, but I absolutely enjoyed the challenge, and loved the competition that 6533 gave me!

About

πŸ”Œ A compiler that scans, parses, and performs context-sensitive analysis of WLP4 code

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages