# Fourth Project: Semantic Analysis

Once the syntax trees are built, further analysis can be done by evaluating
attributes on tree nodes to gather semantic information from the source code
that is not easily detected during parsing. It usually includes type
checking and building a symbol table to make sure a variable is declared
before it is used.

## Program Checking
First, you will need to define a symbol table that keeps track of
previously declared identifiers.  The symbol table will be consulted
whenever the compiler needs to look up information about variable and
constant declarations.

Next, you will need to define objects that represent the different
built-in data types and record information about their capabilities.

### Type System
Let's define classes that represent types.  There is a general class used to represent
all types.  Each basic type is then a singleton instance of the type class.
```
class uCType(object):
      pass

int_type = uCType("int",...)
float_type = uCType("float",...)
char_type = uCType("char", ...)
```
The contents of the type class are entirely up to you.  However, you will minimally need
to encode some information about which operators are supported (+, -, *, etc.), and
default values.

Once you have defined the built-in types, you will need to make sure they get registered
with any symbol tables or code that checks for type names.

```
class uCType(object):
    '''
    Class that represents a type in the uC language.  Basic
    types are declared as singleton instances of this class.
    '''
    def __init__(self, name, binary_ops=None, unary_ops=None,
                 rel_ops=None, assign_ops=None):
        '''
        You must implement this yourself and figure out what to store.
        '''
        self.typename = name
        # Defaults are None so that instances never share one mutable set.
        self.unary_ops = unary_ops or set()
        self.binary_ops = binary_ops or set()
        self.rel_ops = rel_ops or set()
        self.assign_ops = assign_ops or set()

# Create specific instances of basic types. You will need to add
# appropriate arguments depending on your definition of uCType
IntType = uCType("int",
                 unary_ops   = {"-", "+", "--", "++", "p--", "p++", "*", "&"},
                 binary_ops  = {"+", "-", "*", "/", "%"},
                 rel_ops     = {"==", "!=", "<", ">", "<=", ">="},
                 assign_ops  = {"=", "+=", "-=", "*=", "/=", "%="}
                 )

FloatType = uCType("float",
                   ...
    )
CharType = uCType("char",
                   ...
    )

# Array, Pointer & Function types need to be instantiated for each declaration
class ArrayType(uCType):
    def __init__(self, element_type, size=None):
        """
        element_type: Any of the uCTypes can be used as the array's type. This
                      means that there's support for nested types, like matrices.
        size: Integer with the length of the array.
        """
        self.type = element_type
        self.size = size
        super().__init__(None, unary_ops={"*", "&"}, rel_ops={"==", "!="})
```

...

In your type checking code, you will need to reference the
above type objects.   Think of how you will want to access
them.
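
One possible way to reach the type objects from the checking code is a registry keyed by type name. The sketch below is only an illustration: `TYPE_REGISTRY`, `lookup_type`, and `supports` are hypothetical names, the `uCType` here is a pared-down stand-in for the class above, and only `int` is registered because its operator sets are the only ones spelled out in this text.

```python
class uCType:
    """Pared-down stand-in for the uCType class shown earlier."""
    def __init__(self, name, binary_ops=None, unary_ops=None,
                 rel_ops=None, assign_ops=None):
        self.typename = name
        self.binary_ops = binary_ops or set()
        self.unary_ops = unary_ops or set()
        self.rel_ops = rel_ops or set()
        self.assign_ops = assign_ops or set()

IntType = uCType("int",
                 unary_ops={"-", "+", "--", "++", "p--", "p++", "*", "&"},
                 binary_ops={"+", "-", "*", "/", "%"},
                 rel_ops={"==", "!=", "<", ">", "<=", ">="},
                 assign_ops={"=", "+=", "-=", "*=", "/=", "%="})

# Hypothetical registry: one shared instance per built-in type name.
TYPE_REGISTRY = {IntType.typename: IntType}

def lookup_type(name):
    """Return the registered type object, or None if the name is unknown."""
    return TYPE_REGISTRY.get(name)

def supports(uc_type, op, kind="binary_ops"):
    """Return True if `op` is in the operator set named `kind` of `uc_type`."""
    return op in getattr(uc_type, kind)

print(lookup_type("int") is IntType)            # True
print(supports(IntType, "%"))                   # True
print(supports(IntType, "&&"))                  # False
print(supports(IntType, "<=", kind="rel_ops"))  # True
```

Because every lookup returns the same instance, two types can be compared by identity or equality without comparing names.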

### Visiting the AST
The following class for visiting the AST is modeled after the NodeVisitor class in Python’s ast module:

```
class NodeVisitor(object):
    """ A base NodeVisitor class for visiting uc_ast nodes.
        Subclass it and define your own visit_XXX methods, where
        XXX is the class name you want to visit with these
        methods.

        For example:

        class ConstantVisitor(NodeVisitor):
            def __init__(self):
                self.values = []

            def visit_Constant(self, node):
                self.values.append(node.value)

        Creates a list of values of all the constant nodes
        encountered below the given node. To use it:

        cv = ConstantVisitor()
        cv.visit(node)

        Notes:

        *   generic_visit() will be called for AST nodes for which
            no visit_XXX method was defined.
        *   The children of nodes for which a visit_XXX was
            defined will not be visited - if you need this, call
            generic_visit() on the node.
            You can use:
                NodeVisitor.generic_visit(self, node)
        *   Modeled after Python's own AST visiting facilities
            (the ast module of Python 3.0)
    """

    _method_cache = None

    def visit(self, node):
        """ Visit a node.
        """

        if self._method_cache is None:
            self._method_cache = {}

        visitor = self._method_cache.get(node.__class__.__name__, None)
        if visitor is None:
            method = 'visit_' + node.__class__.__name__
            visitor = getattr(self, method, self.generic_visit)
            self._method_cache[node.__class__.__name__] = visitor

        return visitor(node)

    def generic_visit(self, node):
        """ Called if no explicit visitor function exists for a
            node. Implements preorder visiting of the node.
        """
        for c in node:
            self.visit(c)
```
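
To see the dispatch in action, the sketch below runs the `ConstantVisitor` from the docstring on a tiny tree. The `Node`, `BinaryOp`, and `Constant` classes are hypothetical stand-ins for your own AST classes; the only assumption is that iterating over a node yields its children, which is what `generic_visit()` relies on. The visitor is a compact copy of the one above, without the method cache.

```python
class NodeVisitor:
    """Compact copy of the dispatch logic shown above (no method cache)."""
    def visit(self, node):
        method = 'visit_' + node.__class__.__name__
        return getattr(self, method, self.generic_visit)(node)

    def generic_visit(self, node):
        for child in node:
            self.visit(child)

class Node:
    """Hypothetical AST base class: iterating a node yields its children."""
    def __init__(self, *children):
        self.children = children

    def __iter__(self):
        return iter(self.children)

class BinaryOp(Node):
    def __init__(self, op, left, right):
        super().__init__(left, right)
        self.op = op

class Constant(Node):
    def __init__(self, value):
        super().__init__()
        self.value = value

class ConstantVisitor(NodeVisitor):
    def __init__(self):
        self.values = []

    def visit_Constant(self, node):
        self.values.append(node.value)

# Tree for the expression 1 + (2 * 3)
tree = BinaryOp("+", Constant(1), BinaryOp("*", Constant(2), Constant(3)))
cv = ConstantVisitor()
cv.visit(tree)
print(cv.values)  # [1, 2, 3]
```

No `visit_BinaryOp` is defined here, so both `BinaryOp` nodes fall through to `generic_visit()`, which is why the constants nested below them are still reached.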

### Semantic Rules

Finally, you'll need to write code that walks the AST and enforces
a set of semantic rules.  Here is a complete list of everything you'll
need to check:

1.  Names and symbols:

    All identifiers must be defined before they are used.  This includes variables &
    functions.  For example, this kind of code generates an error:
```
       a = 3;              // Error. 'a' not defined.
       int a;
```
    Note: typenames such as "int", "float", and "char" are built-in names that
    should be defined at the start of the program.

2.  Types of literals

    All literal symbols must be assigned a type of "int", "float", "char" or "string".  
    For example:
```
       42;         // Type "int"
       4.2;        // Type "float"
       'x';        // Type "char"
       "forty";    // Type "string"
```
    To do this assignment, check the Python type of the literal value and attach
    a type name as appropriate.

3.  Binary operator type checking

    Binary operators only operate on operands of the same type and produce a
    result of the same type.   Otherwise, you get a type error.  For example:
```
        int a = 2;
        float b = 3.14;

        int c = a + 3;    // OK
        int d = a + b;    // Error.  int + float
        int e = b + 4.5;  // Error.  int = float
```

4.  Unary operator type checking

    Unary operators return a result that's the same type as the operand.

5.  Supported operators

    Here are some examples of operators supported by each type:
```
    int:      binary_ops { +, -, *, /}, unary_ops { +, -}
    float:    rel_ops { ==, !=, <, <=}, assign_ops { +=, -=}
```
    Attempts to use unsupported operators should result in an error. 
    For example:
```
        char a[] = "Hello" + "World";     // OK
        char b[] = "Hello" * "World";     // Error (unsupported op *)
```

6.  Assignment, indexing, etc.

    The left and right hand sides of an assignment operation must be
    declared as the same type. The sizes of objects must match. The index
    of an array must be of type int, etc. See the examples below:
    ```
    int v[4] = {1, 2, 3};     // Error (size mismatch on initialization)
    float f;
    int j = v[f];             // Error (array index must be of type int)
    j = f;                    // Error (cannot assign float to int)
    ```
    However, string literals can be assigned to arrays of chars. See the example below:
    ```
    char c[] = "Susy";        // Ok
    ```
    In this case, the size of `c` must be inferred from the initialization.
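
Rules 2, 3, and 5 can be sketched as two small functions that work on type names. This is only an illustration under stated assumptions: `literal_type`, `binary_result_type`, and the `BINARY_OPS` table are hypothetical names, the `is_char` flag stands for information that would have to come from the lexer, and the `string` entry lists only the `+` operator that the example in rule 5 marks as OK.

```python
# Hypothetical operator table keyed by type name.
BINARY_OPS = {
    "int": {"+", "-", "*", "/", "%"},
    "string": {"+"},   # assumption: only "+" is shown as valid in rule 5
}

def literal_type(value, is_char=False):
    """Map a Python literal value to a uC type name (rule 2).

    'x' and "x" both arrive as a Python str, so the Python type alone
    cannot separate char from string; `is_char` is a hypothetical flag
    that the lexer would supply.
    """
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "char" if is_char else "string"
    raise TypeError(f"unsupported literal: {value!r}")

def binary_result_type(op, left, right):
    """Return the result type of `left op right` or raise TypeError."""
    if left != right:                                  # rule 3
        raise TypeError(f"type mismatch: {left} {op} {right}")
    if op not in BINARY_OPS.get(left, set()):          # rule 5
        raise TypeError(f"unsupported op {op} for {left}")
    return left

print(binary_result_type("+", literal_type(2), literal_type(3)))  # int
for op, lhs, rhs in [("+", 2, 3.14), ("*", "Hello", "World")]:
    try:
        binary_result_type(op, literal_type(lhs), literal_type(rhs))
    except TypeError as err:
        print(err)
# type mismatch: int + float
# unsupported op * for string
```

In a real checker these errors would normally be reported with the line number of the offending node rather than raised as Python exceptions.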

For walking the AST, use the NodeVisitor class. A shell of the code is provided below. Use it as a guide.


```
class SymbolTable(object):
    '''
    Class representing a symbol table.  It should provide functionality
    for adding and looking up nodes associated with identifiers.
    '''
    def __init__(self):
        self.symtab = {}
    def lookup(self, a):
        # Returns None when the identifier has not been declared
        return self.symtab.get(a)
    def add(self, a, v):
        self.symtab[a] = v

class Visitor(NodeVisitor):
    '''
    Program visitor class. This class uses the visitor pattern. You need to define methods
    of the form visit_NodeName() for each kind of AST node that you want to process.
    Note: You will need to adjust the names of the AST nodes if you picked different names.
    '''
    def __init__(self):
        # Initialize the symbol table
        self.symtab = SymbolTable()

        # Add built-in type names (int, float, char) to the symbol table.
        # The type objects are the instances created in the Type System section.
        self.symtab.add("int", IntType)
        self.symtab.add("float", FloatType)
        self.symtab.add("char", CharType)

    def visit_Program(self, node):
        # 1. Visit all of the global declarations
        # 2. Record the associated symbol table
        for _decl in node.gdecls:
            self.visit(_decl)

    def visit_BinaryOp(self, node):
        # 1. Make sure left and right operands have the same type
        # 2. Make sure the operation is supported
        # 3. Assign the result type
        # Only step 3 is done here; steps 1 and 2 are left for you to add.
        self.visit(node.left)
        self.visit(node.right)
        node.type = node.left.type

    def visit_Assignment(self, node):
        ## 1. Make sure the location of the assignment is defined
        sym = self.symtab.lookup(node.location)
        assert sym, "Assigning to unknown sym"
        ## 2. Check that the types match
        self.visit(node.value)
        assert sym.type == node.value.type, "Type mismatch in assignment"
```
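
The shell stops at the first failed `assert`. As one possible alternative, the sketch below collects error messages so that a single run reports every problem. Everything here is an assumption made for illustration: the `Checker` class, the flattened `Decl`, `Constant`, and `Assignment` nodes, and the use of plain type-name strings instead of type objects are all hypothetical simplifications.

```python
class SymbolTable:
    def __init__(self):
        self.symtab = {}

    def lookup(self, name):
        return self.symtab.get(name)

    def add(self, name, value):
        self.symtab[name] = value

# Hypothetical, flattened statement nodes; a real AST carries more detail.
class Decl:
    def __init__(self, name, type_name):
        self.name, self.type = name, type_name

class Constant:
    def __init__(self, type_name, value):
        self.type, self.value = type_name, value

class Assignment:
    def __init__(self, location, value):
        self.location, self.value = location, value

class Checker:
    """Collects error messages instead of stopping at the first one."""
    def __init__(self):
        self.symtab = SymbolTable()
        self.errors = []

    def check(self, statements):
        for stmt in statements:
            getattr(self, "check_" + stmt.__class__.__name__)(stmt)
        return self.errors

    def check_Decl(self, node):
        self.symtab.add(node.name, node)

    def check_Assignment(self, node):
        sym = self.symtab.lookup(node.location)
        if sym is None:
            self.errors.append(f"'{node.location}' not defined")
        elif sym.type != node.value.type:
            self.errors.append(
                f"cannot assign {node.value.type} to {sym.type}")

program = [
    Assignment("a", Constant("int", 3)),      # a = 3;   used before declaration
    Decl("a", "int"),                         # int a;
    Assignment("a", Constant("float", 1.5)),  # a = 1.5; type mismatch
    Assignment("a", Constant("int", 4)),      # a = 4;   fine
]
for message in Checker().check(program):
    print(message)
# 'a' not defined
# cannot assign float to int
```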
