Fusion


Run all Python tests:

$ ./run_py_tests.sh

Learning (Part - 1):

  • Concept of interpreter.
  • Concept of compiler.
  • Difference between compiler and interpreter.
  • Concept of token.
  • Lexical analysis - The process of breaking the input into tokens is called lexical analysis.
  • How a scanner (or tokenizer) is the part of the interpreter or compiler that does the lexical analysis.
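
A minimal sketch makes lexical analysis concrete. The function below is illustrative only (it is not the repo's actual lexer) and breaks an arithmetic string into (type, value) token tuples:

```python
def tokenize(text):
    """Break an input string like '3 + 5' into (type, value) tokens."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch.isdigit():
            # consume a run of digits as a single INTEGER token
            num = ''
            while i < len(text) and text[i].isdigit():
                num += text[i]
                i += 1
            tokens.append(('INTEGER', int(num)))
        elif ch in '+-*/':
            tokens.append(('OP', ch))
            i += 1
        else:
            raise ValueError('unexpected character: %r' % ch)
    return tokens
```

For example, tokenize('3 + 5') returns [('INTEGER', 3), ('OP', '+'), ('INTEGER', 5)].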

Learning (Part - 2):

  • Concept of lexeme.
  • Parsing - The process of recognizing a phrase in the stream of tokens or, to put it differently, the process of finding structure in the stream of tokens is called parsing.
  • How a parser works.

Learning (Part - 3):

  • Concept of syntax diagram.
  • How syntax analysis works.
  • How syntax analyzer works.
  • How the interpreter returns just the number when the input contains no arithmetic operator.
  • How the interpreter evaluates an arbitrary number of additions and subtractions.
  • How the interpreter evaluates an arbitrary number of multiplications and divisions.
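
Handling an arbitrary number of additions and subtractions boils down to a loop: evaluate the first term, then keep consuming (operator, term) pairs. A simplified sketch (operating on an already-tokenized list rather than the repo's Token objects):

```python
def eval_add_sub(tokens):
    """Evaluate a flat list like [3, '+', 5, '-', 2] left to right."""
    tokens = list(tokens)
    result = tokens.pop(0)          # the first INTEGER
    while tokens:
        op = tokens.pop(0)          # PLUS or MINUS
        operand = tokens.pop(0)     # the next INTEGER
        if op == '+':
            result += operand
        else:
            result -= operand
    return result
```

With a single number and no operator, the loop body never runs and the number itself is returned.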

Learning (Part - 4):

  • The concept of grammar.
  • The idea of rules / productions in the grammar.
  • Concept and identification of terminals (PLUS, MINUS, etc.) and non-terminals (expr, factor).
  • Head and Body of a rule.
  • start symbol - It is the non-terminal symbol of the left side of the first rule.
  • The current token can be set to the first token inside __init__(), because that method runs whenever an Interpreter() object is created, so there is no need to set it explicitly afterwards.

Learning (Part - 5):

  • Concept of associativity and precedence for operators.
  • How we calculate arithmetic expressions containing any number of +, -, *, or / operators.
  • Learned how to write unit tests.
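
Unit tests for the interpreter can be written with the standard unittest module. In this sketch calc() is a stand-in placeholder; a real test would import and drive the repo's interpreter instead:

```python
import unittest

def calc(expression):
    # Placeholder for illustration only; substitute the real interpreter.
    return eval(expression)

class TestInterpreter(unittest.TestCase):
    def test_precedence(self):
        # * binds tighter than +
        self.assertEqual(calc('7 + 3 * 2'), 13)

    def test_left_associativity(self):
        # - associates left to right: (10 - 4) - 3
        self.assertEqual(calc('10 - 4 - 3'), 3)
```

Run with python -m unittest (or via ./run_py_tests.sh in this repo).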

Learning (Part - 6):

  • Learned how to evaluate an arithmetic expression containing parentheses.
  • Learned how to tokenize parentheses.
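
The parentheses handling can be sketched as a compact recursive-descent evaluator: factor either consumes an INTEGER or recurses into a full expr between LPAREN and RPAREN. This is a simplified illustration, not the repo's code:

```python
import re

def evaluate(text):
    tokens = re.findall(r'\d+|[()+\-*/]', text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():
        if peek() == '(':
            eat()                   # LPAREN
            value = expr()
            eat()                   # RPAREN
            return value
        return int(eat())           # INTEGER

    def term():
        value = factor()
        while peek() in ('*', '/'):
            if eat() == '*':
                value *= factor()
            else:
                value //= factor()  # integer division for simplicity
        return value

    def expr():
        value = term()
        while peek() in ('+', '-'):
            if eat() == '+':
                value += term()
            else:
                value -= term()
        return value

    return expr()
```

Because factor calls back into expr, parentheses can nest to any depth.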

Learning (Part - 7):

  • Concept about the tree data structure.
  • Idea of root, node, interior node, leaf node, parent, child(left, right).
  • Concept of concrete syntax tree.
  • AST(abstract-syntax-tree) - An abstract syntax tree (AST) is a tree that represents the abstract syntactic structure of a language construct where each interior node and the root node represents an operator, and the children of the node represent the operands of that operator.
  • Difference between AST and the parse tree.
  • Various traversal methods (preorder, inorder, postorder).
  • Modified the parser and interpreter and split them apart. Now the parser gets tokens from the lexer and generates the AST for the interpreter to traverse and interpret.
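
The AST idea from this part can be sketched in a few lines: interior nodes are operators, leaves are numbers, and a postorder walk (children first, then the operator) evaluates the tree. Names here mirror the tutorial but are simplified:

```python
class Num:
    def __init__(self, value):
        self.value = value

class BinOp:
    def __init__(self, left, op, right):
        self.left, self.op, self.right = left, op, right

def visit(node):
    """Postorder traversal: evaluate children, then apply the operator."""
    if isinstance(node, Num):
        return node.value
    left = visit(node.left)
    right = visit(node.right)
    if node.op == '+':
        return left + right
    if node.op == '*':
        return left * right
    raise ValueError('unknown operator: %s' % node.op)

# AST for 2 * (7 + 3); note the parentheses themselves vanish in the AST
tree = BinOp(Num(2), '*', BinOp(Num(7), '+', Num(3)))
```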

Learning (Part - 8):

  • Concept of unary operators and their precedence.
  • Extending the grammar to handle unary plus and minus operators.
  • Extending the parser to generate an AST with UnaryOp node and the interpreter with a new visit_UnaryOp method to interpret unary operators.
  • factor() calls itself recursively (at #L168 and #L172) so that it can consume the unary operator's operand.
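
The recursion can be sketched like this: when factor sees PLUS or MINUS it calls itself, so chains like - - 5 nest naturally into UnaryOp nodes (a simplified, hypothetical version of the repo's parser and interpreter):

```python
class Num:
    def __init__(self, value):
        self.value = value

class UnaryOp:
    def __init__(self, op, expr):
        self.op, self.expr = op, expr

def parse_factor(tokens):
    tok = tokens.pop(0)
    if tok in ('+', '-'):
        # the recursive call consumes the operand, which may itself
        # start with another unary operator
        return UnaryOp(tok, parse_factor(tokens))
    return Num(tok)

def visit(node):
    if isinstance(node, Num):
        return node.value
    value = visit(node.expr)
    return +value if node.op == '+' else -value
```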

Learning (Part - 9):

  • Made changes to the grammar.
    program : compound_statement DOT
    
    compound_statement : BEGIN statement_list END
    
    statement_list : statement
                   | statement SEMI statement_list
    
    statement : compound_statement
              | assignment_statement
              | empty
    
    assignment_statement : variable ASSIGN expr
    
    empty :
    
    expr: term ((PLUS | MINUS) term)*
    
    term: factor ((MUL | DIV) factor)*
    
    factor : PLUS factor
           | MINUS factor
           | INTEGER
           | LPAREN expr RPAREN
           | variable
    
    variable: ID
    
  • Made changes in the lexer: added new tokens, added peek and _id methods, and updated the get_next_token method to handle the new tokens.
  • Added new AST nodes(Compound, Assign, Var, NoOp) to the parser for new language constructs.
  • Added new methods corresponding to the new grammar rules to our recursive-descent parser and updated the existing methods like factor and term.
  • Added new visitor methods to the interpreter.
  • Added a dictionary for storing variables.
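
The variable store is just a name-to-value dictionary; visit_Assign writes into it and visit_Var reads from it. A sketch (the dictionary name GLOBAL_SCOPE follows the tutorial; the helper functions here are illustrative):

```python
GLOBAL_SCOPE = {}

def assign(name, value):
    # what visit_Assign does: store the evaluated right-hand side
    GLOBAL_SCOPE[name] = value

def lookup(name):
    # what visit_Var does: fetch a value, failing on unknown names
    try:
        return GLOBAL_SCOPE[name]
    except KeyError:
        raise NameError(repr(name))

assign('a', 2)
assign('b', lookup('a') + 3)    # b := a + 3
```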

Learning (Part - 10):

  • Added new grammar rules (block, declarations, variable_declaration, type_spec) and updated existing ones (term, factor). The new grammar:
    program : PROGRAM variable SEMI block DOT
    
      block : declarations compound_statement
    
      declarations : VAR (variable_declaration SEMI)+
                   | empty
    
      variable_declaration : ID (COMMA ID)* COLON type_spec
    
      type_spec : INTEGER | REAL
    
      compound_statement : BEGIN statement_list END
    
      statement_list : statement
                     | statement SEMI statement_list
    
      statement : compound_statement
                | assignment_statement
                | empty
    
      assignment_statement : variable ASSIGN expr
    
      empty :
    
      expr : term ((PLUS | MINUS) term)*
    
      term : factor ((MUL | INTEGER_DIV | FLOAT_DIV) factor)*
    
      factor : PLUS factor
             | MINUS factor
             | INTEGER_CONST
             | REAL_CONST
             | LPAREN expr RPAREN
             | variable
    
      variable: ID
    
  • Made changes in the lexer:
    • Added new tokens.
    • Added new reserved keywords and updated some previous ones.
    • Added a skip_comment method to handle Pascal comments, and updated the integer and get_next_token methods.
  • Added new AST nodes like Program, Block, VarDecl, Type.
  • Added new methods and updated the existing parser methods like program, term, factor.
  • Added new visitor methods and updated the visit_BinOp method.

Learning (Part - 11):

  • Symbol - An identifier of some program entity like a variable, subroutine, or built-in type. Added two classes, BuiltinTypeSymbol and VarSymbol, both inheriting from the Symbol class.
  • Symbol table - A symbol table is an abstract data type (ADT) for tracking various symbols in source code. Added SymbolTable class to handle this operation.
  • Edited the visit_Assign and visit_Var methods so that they check whether a variable is declared before it is used in assignments and expressions.
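
A minimal symbol table along these lines (class names mirror the tutorial, the implementation is simplified):

```python
class Symbol:
    def __init__(self, name, type=None):
        self.name, self.type = name, type

class BuiltinTypeSymbol(Symbol):
    pass

class VarSymbol(Symbol):
    pass

class SymbolTable:
    def __init__(self):
        self._symbols = {}

    def define(self, symbol):
        self._symbols[symbol.name] = symbol

    def lookup(self, name):
        # returns None for undeclared names, which is exactly the
        # check visit_Assign / visit_Var rely on
        return self._symbols.get(name)

table = SymbolTable()
int_type = BuiltinTypeSymbol('INTEGER')
table.define(int_type)
table.define(VarSymbol('x', int_type))
```

Looking up 'x' now succeeds, while looking up an undeclared name returns None, signalling a semantic error.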

Learning (Part - 12):

  • Procedure declaration - it is a language construct that defines an identifier (a procedure name) and associates it with a block of Pascal code.
  • Updated the grammar (declarations):
    declarations : VAR (variable_declaration SEMI)+
                       | (PROCEDURE ID SEMI block SEMI)*
                       | empty
    
  • Updated the lexer by adding PROCEDURE in new token and reserved keywords.
  • Updated the parser by adding the ProcedureDecl AST nodes and modified the declarations method to support procedure declaration.
  • Updated the interpreter by adding a visit_ProcedureDecl in the Interpreter class.

Learning (Part - 13):

  • Semantic analysis - The process of determining whether a program makes sense, and that it has meaning, according to the language definition.
  • Added an algorithm that stores all information about variable declarations in a stash; when a variable reference appears, such as in the assignment statement x := x + y, the stash is searched by the variable's name. If the variable is found, it has been declared; if not, it is a semantic error.
  • How to code a semantic analyzer that walks an AST, builds the symbol table, and does basic semantic checks.

Learning (Part - 14):

  • Concept of scope(global scope, nested scope) in a programming language and how it helps.
    • Every scope creates an isolated name space, which means that variables declared in a scope cannot be accessed from outside of it.
    • You can re-use the same name in different scopes and know exactly, just by looking at the program source code, what declaration the name refers to at every point in the program.
    • In a nested scope you can re-declare a variable with the same name as in the outer scope, thus effectively hiding the outer declaration, which gives you control over access to different variables from the outer scope.
  • Enhanced our SymbolTable class into a ScopedSymbolTable class so that it can handle scopes in our new spi.
  • Changed the grammar and updated the parser so that it can handle the procedure parameters.
  • Concept of procedure symbol and introduced a new class called ProcedureSymbol class.
  • Concept of scope symbol table. Added visit_ProcedureDecl and updated our visit_Program method to create the scoped symbol tables. Updated the semantic analyzer visitor methods visit_VarDecl, visit_Var for inserting and looking up the symbols.
  • Updated the ScopedSymbolTable class with an enclosing_scope variable that holds a pointer to the scope's enclosing scope, linking the scopes together. Updated the visit_Program and visit_ProcedureDecl methods to create an actual link to the scope's enclosing scope.
  • Updated the lookup method in the ScopedSymbolTable class to search for variable declarations through the chain of scopes.
  • source-to-source compiler - It is a compiler that translates a program in some source language into a program in the same (or almost the same) source language.
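
The chained lookup through enclosing_scope can be sketched like this (simplified; the real ScopedSymbolTable stores Symbol objects and logs its actions):

```python
class ScopedSymbolTable:
    def __init__(self, scope_name, scope_level, enclosing_scope=None):
        self._symbols = {}
        self.scope_name = scope_name
        self.scope_level = scope_level
        self.enclosing_scope = enclosing_scope

    def insert(self, name, symbol):
        self._symbols[name] = symbol

    def lookup(self, name):
        symbol = self._symbols.get(name)
        if symbol is not None:
            return symbol
        # not in this scope: walk up the chain of enclosing scopes
        if self.enclosing_scope is not None:
            return self.enclosing_scope.lookup(name)
        return None

global_scope = ScopedSymbolTable('global', 1)
global_scope.insert('x', 'x : INTEGER')
# a procedure scope nested inside the global scope
alpha_scope = ScopedSymbolTable('Alpha', 2, enclosing_scope=global_scope)
alpha_scope.insert('y', 'y : REAL')
```

Inside Alpha, y resolves locally while x is found by following enclosing_scope up to the global scope; if Alpha declared its own x, the local one would shadow the global one.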

Learning (Part - 15):

  • Improved the error reporting in the lexer, parser, and semantic analyzer. Introduced new classes: ErrorCode (an enumeration class), Error, LexerError, ParserError, and SemanticError.
  • To provide the position of the error message we updated the Lexer class constructor and the advance method so that it can count the line and column.
  • Instead of having the token types defined as module-level variables, moved them into an enumeration class called TokenType.
  • Created a new method called _build_reserved_keywords that builds the RESERVED_KEYWORDS dictionary from the TokenType class.
  • Refactored the declarations method and moved the procedure declaration into another method called procedure_declaration.
  • Added the log method to both ScopedSymbolTable and SemanticAnalyzer. Updated the main function and added a command line option "--scope" to turn scope logging on and off (off by default).
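
The _build_reserved_keywords trick relies on keywords being declared as one contiguous block of TokenType members, so the dictionary can be built by slicing the enum's member list. A trimmed-down sketch (the real TokenType has many more members):

```python
from enum import Enum

class TokenType(Enum):
    PLUS = '+'
    MINUS = '-'
    # block of reserved words: PROGRAM must be first, END must be last
    PROGRAM = 'PROGRAM'
    BEGIN = 'BEGIN'
    END = 'END'

def _build_reserved_keywords():
    token_types = list(TokenType)        # preserves declaration order
    start = token_types.index(TokenType.PROGRAM)
    end = token_types.index(TokenType.END)
    return {tt.value: tt for tt in token_types[start:end + 1]}

RESERVED_KEYWORDS = _build_reserved_keywords()
```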

Learning (Part - 16):

  • Extended our statement grammar with a new rule called proccall_statement.
  • Added a method called proccall_statement and a class called ProcedureCall.
  • Distinguished between a procedure call and a variable assignment by updating the statement method.
  • Added a new member, WRONG_PARAMS_NUM, to the ErrorCode class for the wrong-number-of-parameters error in a procedure call. Added a visit_ProcedureCall method to the SemanticAnalyzer class that checks the parameters and raises an error if they don't match.
  • The program grammar is now:
    program : PROGRAM variable SEMI block DOT
    
          block : declarations compound_statement
    
          declarations : (VAR (variable_declaration SEMI)+)? procedure_declaration*
    
          variable_declaration : ID (COMMA ID)* COLON type_spec
    
          procedure_declaration :
               PROCEDURE ID (LPAREN formal_parameter_list RPAREN)? SEMI block SEMI
    
          formal_parameter_list : formal_parameters
                                | formal_parameters SEMI formal_parameter_list
    
          formal_parameters : ID (COMMA ID)* COLON type_spec
    
          type_spec : INTEGER | REAL
    
          compound_statement : BEGIN statement_list END
    
          statement_list : statement
                         | statement SEMI statement_list
    
          statement : compound_statement
                    | proccall_statement
                    | assignment_statement
                    | empty
    
          proccall_statement : ID LPAREN (expr (COMMA expr)*)? RPAREN
    
          assignment_statement : variable ASSIGN expr
    
          empty :
    
          expr : term ((PLUS | MINUS) term)*
    
          term : factor ((MUL | INTEGER_DIV | FLOAT_DIV) factor)*
    
          factor : PLUS factor
                 | MINUS factor
                 | INTEGER_CONST
                 | REAL_CONST
                 | LPAREN expr RPAREN
                 | variable
    
          variable: ID
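
The heart of the WRONG_PARAMS_NUM check is a comparison between formal and actual parameter counts. A simplified sketch (the real visit_ProcedureCall works on AST nodes and a ProcedureSymbol, and raises a SemanticError carrying an ErrorCode):

```python
def check_proc_call(proc_name, formal_params, actual_params):
    """Raise if a call passes the wrong number of arguments."""
    if len(actual_params) != len(formal_params):
        raise Exception(
            'Wrong number of arguments in call to %s: expected %d, got %d'
            % (proc_name, len(formal_params), len(actual_params))
        )

check_proc_call('Alpha', ['a', 'b'], [3, 5])   # counts match: no error
```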
    

Learning (Part - 17):

  • The concept of memory system and why we need it.
  • Stack - A stack is a data structure that is based on a “last-in-first-out” (LIFO) policy, which means that the most recent item added to the stack is the first one that comes out. A stack has the following methods -
    • Push (to add the item into the stack)
    • Pop (to remove the item from the stack)
    • Peek (to return the item at the top of the stack without removing it)
  • Activation record - An activation record is a dictionary-like object for maintaining information about the currently executing invocation of a procedure or function, and also the program itself.
  • Replaced the GLOBAL_MEMORY dictionary with the CallStack() in the interpreter class.
  • Updated the visit_Program method to use the call stack to push and pop an activation record that holds the values of global variables.
  • Updated the visit_Assign method to store a key-value pair in the activation record at the top of the call stack.
  • Updated the visit_Var method to access a value by its name from the activation record at the top of the call stack.
  • Added a log method and updated the visit_Program method to use it to print the contents of the call stack when interpreting a program.
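
The pieces above fit together in a few lines. A minimal sketch of the call stack and activation record (simplified; the real ActivationRecord also tracks a type and nesting level):

```python
class ActivationRecord:
    def __init__(self, name):
        self.name = name
        self.members = {}       # variable name -> value

    def __setitem__(self, key, value):
        self.members[key] = value

    def __getitem__(self, key):
        return self.members[key]

class CallStack:
    def __init__(self):
        self._records = []

    def push(self, ar):
        self._records.append(ar)

    def pop(self):
        return self._records.pop()

    def peek(self):
        # the record of the currently executing invocation
        return self._records[-1]

stack = CallStack()
stack.push(ActivationRecord('PROGRAM'))
stack.peek()['x'] = 7           # what visit_Assign does
value = stack.peek()['x']       # what visit_Var does
```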