Skip to content

tobielf/SER502-Spring2017-Team10

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

SER502-Spring2017-Team10

Members

@tobielf, Guo Xiangyu

@run-dong-zhu, Zhu Rundong

@XimingFeng, Feng Ximing

@kkmacart, MacArthur Katie

Development Plan

MacArthur Katie, Lexical Analysis(Tokenizing)

Feng Ximing, Syntax Analysis(Parsing)

Zhu Rundong, Semantic Analysis(Intermediate Code Generating)

Guo Xiangyu, Runtime Environment(Executing)

System Requirement

Platform

Linux(native)
MacOS(native)
Windows(Cygwin required)

Tools

GNU Make is a tool which controls the generation of executables and other non-source files of a program from the program's source files.

We use make to generating our software, including the unit test of each component, the debug version of the software and of course, the release version of our software.

Doxygen is the de facto standard tool for generating documentation from annotated C++ sources, but it also supports other popular programming languages such as C, Objective-C, C#, PHP, Java, Python, IDL (Corba, Microsoft, and UNO/OpenOffice flavors), Fortran, VHDL, Tcl, and to some extent D.

We use doxygen to generating the documentation of our software based on the comment we wrote in our code.

Valgrind is an instrumentation framework for building dynamic analysis tools. There are Valgrind tools that can automatically detect many memory management and threading bugs, and profile your programs in detail. You can also use Valgrind to build new tools.

We use valgrind to detect and eliminate possible memory leak to ensure the robustness of our software.

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files. It's widely used to build languages, tools, and frameworks. From a grammar, ANTLR generates a parser that can build and walk parse trees.

We use antlr to help us understanding the structure of the parsing tree generated by our language, then we implement our parser/byte code generator by using the diagram of parsing tree generated by antlr as a reference.

Slack brings all your communication together in one place. It's real-time messaging, archiving and search for modern teams. Visit our slack channel.

We heavily relied on Slack to communicate between team members.

Cygwin is: a large collection of GNU and Open Source tools which provide functionality similar to a Linux distribution on Windows. A DLL (cygwin1.dll) which provides substantial POSIX API functionality.

We use Cygwin to compatible with Windows platform.

Install Directions

~$ mkdir SER502-Team10
~$ cd SER502-Team10 
~$ wget https://github.com/tobielf/SER502-Spring2017-Team10/archive/master.zip
~$ tar xvf master.tar.gz --strip 1

Build and Run Directions

Build compiler and runtime. The build script under the project folder will build the compiler and runtime, copy them to "bin" folder.

~$ ./build.sh

Run the compiler and runtime. The run script under the project folder will run the compiler and runtime on the testing program we wrote under "data" folder.

~$ ./run.sh

Or you can write some test code in our language and run the compiler and runtime by yourself.

Usage:
./compiler <input file> <output file>
e.g ./compiler program1.ten program1.asm
Usage:
./runtime <input file>
e.g ./runtime program1.asm

YouTube Video Link

The presentation video about this project is here.

Language Name

TEN will be our language name since we are team TEN.

Also, in Chinese "ten" means perfect, we will try to make this small language project as perfect as possible in the limited time.

Grammar Rule

grammar ten;
program : stmt_list ;
stmt_list : stmt ';' stmt_list | stmt ';' ;
stmt : decl_stmt | assign_stmt | if_stmt | for_stmt | print_stmt | ;
decl_stmt : 'var' ID ;
assign_stmt : ID 'is' expr ;
if_stmt : 'if' '(' boolean_expr ')' 'then' '{' stmt_list '}' | 
          'if' '(' boolean_expr ')' 'then' '{' stmt_list '}' 'else' '{' stmt_list '}' ;
for_stmt : 'for' ID 'from' expr 'to' expr 'step' expr '{' stmt_list '}' ;
print_stmt : 'print' expr ;
boolean_expr : expr '=' expr | expr '<>' expr |
               expr '<' expr | expr '<=' expr | 
               expr '>' expr | expr '>=' expr | 
               boolean_val ;
boolean_val : 'true' | 'false' ;
expr : term res1;
res1 : '+' term | '-' term | ;
term : factor res2;
res2 : '*' factor | '/' factor | '%' factor |  ;
factor : '(' expr ')' | NUMBER | ID ;
NUMBER : [0-9]+ ;
ID : [a-z|A-Z]+ ;
WS : [ \t\r\n]+ -> skip ;

Byte Code

Type 1: declare
DEC var1						; declare a variable

Type 2: assignment
MOV var1, var2/value		; assign var2/value to var1

Type 3: arithmetic operation
ADD var1, var2/value		; add var2/value to var1
SUB var1, var2/value		; sub var2/value from var1
MUL var1, var2/value		; mul var2/value to var1
DIV var1, var2/value		; div var2/value from var1
MOD var1, var2/value		; mod var1 with var2/value

Type 4: compare and jump operation
CMP var1, var2/value		; CMP var1 and var2/value, set $flag = var1 - var2/value.
JE  label			; Jump to label if $flag Equal to zero.
JNE label			; Jump to label if $flag Not Equal to zero.
JL  label			; Jump to label if $flag Less than zero.
JLE label			; Jump to label if $flag Less than or Equal to zero.
JG  label			; Jump to label if $flag Great than zero.
JGE label			; Jump to label if $flag Great than or Equal to zero.
JMP label			; Jump to label, always.

Type 5: output result
OUT var1/value			; Output var1/value.

Design Details

  • Paradigm of the Language

      imperative paradigm
    
  • Parsing Process(Technique)

    Parsing Process

      TopDown Parser
      Syntax Directed Translation
      Operational Semantic
    
  • Example Program in High Level Language

    For more example, please see here

    Example: counting the sum of all the even and odd numbers between 1 and 100 using our language.

     var i;
     var evenSum;
     var oddSum;
     evenSum is 0;
     oddSum is 0;
     for i from 1 to 100 step 1 {
         if ((i % 2) = 0) then {
             evenSum is evenSum + i;
         } else {
             oddSum is oddSum + i;
         };
     };
     print evenSum;
     print oddSum;
    
  • Design Decision

    1. No dynamic memory allocation and function call in our language to keep it as simple as possible.
    2. Only support one primitive type(integer), so we don't need to do the type checking in our compiler.
    3. Ambiguous in "Dangling else". This ambiguity is resolved by forcing the user to use { and } to incorporating a statement list.
    4. For virtual machine(runtime environment), we didn't limit the number of registers. So the intermediate code can use DEC var to declare as many variables as it needs.
    5. Also in the virtual machine, it provided an OUT instruction to allow the user to print the value inside the store.
  • Data Structure

      ArrayList for the symbol table.
      Left-child right-sibling binary tree for the syntax tree.
    
  • Recursion method in the grammar

      Right Recursion
    
  • Tools & developing language planning to use

      C programming as the developing language
      ANTLR helping verify the grammar rule
      Valgrind checking memory leaks in the code
    
  • Runtime Environment

    It will be a virtual machine, changing the storage state by executing the byte-code command in sequence.

##Highlights##

Now, we are going to introduce the highlights of our project. One aspect is how the software has been developed, and the other aspect is what's the extra feature we implemented beyond the minimum requirement.

Development strategy

We divided the two weeks development time into four phases, and using the Agile method with Test Driven Development:

Dev phases:
Dev1: April 15(Sat)-17(Mon), layout the interfaces between components.
Dev2: April 18(Tue)-21(Fri), implement basic data structure and simple statement.
Dev3: April 22(Sat)-24(Mon), implement lexical analysis and complex statement.
Dev4: April 25(Tue)-28(Mon), integrated all parts, testing and fixing the bug.

All detail description of four phases can track on these issues:

Task 7: Developing Phase One.

Task 8: Developing Phase Two.

Task 9: Developing Phase Three.

Task 10: Developing Phase Four.

Before we start We unified the coding style in Task 5: Coding Style. so that the code wrote by different members will look like the same. Also, we manually wrote eight test program and corresponding bytecode under data folder, two tests per person in Task 6: Testing Data. By doing so we can compare them with the compiler actually generate in the final release to verify it works properly.

During the coding Everyone developed his/her code under his/her branch and performed the unit test in their code. You can type make test_link_node make test_symbol_table make test_link_list make test_parsing_tree to generate an independent program to run the unit test for basic data structures, and you can type make test to generate three programs to run the unit test for lexical parser and bytecode.

After the coding We performed code review activity on each members code. At the end of each phase, everyone sent out a Pull/Request to request others review his/her code. Only the code has been thoroughly reviewed, it can merge into the master branch. All Pull/Request and reviewing activity can track on these P/Rs:

Katie {Dev1, Dev2, Dev3, Dev4}

Ximing {Dev1, Dev2, Dev3, Dev4}

Rundong {Dev1, Dev2, Dev3, Dev4}

Xiangyu {Dev1, Dev2, Dev3, Dev4}

Integrated Testing You can use make debug for compiler and runtime, it will generate a debug version of our compiler and runtime. In the debug mode, it will produce extra debugging info on stderr to help you pinpoint the bug. For example, the debug version of the runtime called test_runtime, you can run it with a byte code file then it will output the instructions loaded into the virtual machine and the execution sequence of the runtime.

You can use stream redirect to output these debugging message into a file, like ./test_runtime demo.asm 2>debug.txt the output of the runtime will remain on the terminal, but the debugging message will go into debug.txt. You can also redirect this info, like ./test_runtime demo.asm >result.txt 2>debug.txt, so the result will go into result.txt.

Bonus Point

  1. We implemented print statment in our language to output the variable or value. So you can:

    var i;
    i is -1;
    print i;
    print 1+2;
    print i+2;
    

the output will be

	-1
	3
	1
  1. We implemented variable scope in our runtime environment. So you can declare the variable with same name in different scope:

    var i;
    var j;
    i is 10;
    j is 0;
    for j from 0 to 5 step j + 1 {
    	var i;
    	i is j;
    	print i;
    };
    print i;
    

the output will be

	0
	1
	2
	3
	4
	10

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 4

  •  
  •  
  •  
  •