
# Zara Compiler - Week 1 & Week 2


1. Mohamed Abubakar, J17-1374-2022
2. Victor Cosmas, J17-1239-2022
3. Franck Mutugi, J17-1370-2022

This project builds two components of a compiler for the Zara programming language:
- Week 1: Implementation of a Symbol Table
- Week 2: Implementation of a Lexical Analyzer (Lexer)

Programming Language: **Python 3**  
Platform: **Google Colab**

## Week 1: Symbol Table

In [None]:
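# Symbol Class
class Symbol:
    """A single symbol table entry: an identifier with its type and scope."""

    def __init__(self, name, type_, scope='global'):
        self.name = name
        self.type = type_    # e.g. 'integer', 'float', 'method'
        self.scope = scope   # descriptive label; defaults to 'global'

    def __str__(self):
        return f"Name: {self.name}, Type: {self.type}, Scope: {self.scope}"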

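# Symbol Table Class
class SymbolTable:
    """Maps each symbol name to its Symbol entry."""

    def __init__(self):
        self.symbols = {}

    def add_symbol(self, name, type_, scope='global'):
        """Add a symbol; redeclaring a name replaces its whole entry."""
        if name in self.symbols:
            print(f"Warning: '{name}' already declared. Replacing it with type {type_}, scope {scope}")
        self.symbols[name] = Symbol(name, type_, scope)

    def get_symbol(self, name):
        """Return the Symbol for name, or None if it is not declared."""
        return self.symbols.get(name)

    def update_symbol(self, name, new_type):
        """Change the type of an existing symbol; report an error if absent."""
        if name in self.symbols:
            self.symbols[name].type = new_type
        else:
            print(f"Error: '{name}' not found in Symbol Table.")

    def display(self):
        """Print every entry in insertion order."""
        print("=== Symbol Table Entries ===")
        for symbol in self.symbols.values():
            print(symbol)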

In [None]:
# Create a Symbol Table instance
sym_table = SymbolTable()

# Add Zara program variables and methods
sym_table.add_symbol('x', 'integer')
sym_table.add_symbol('y', 'float')
sym_table.add_symbol('name', 'string')
sym_table.add_symbol('arr', 'array')
sym_table.add_symbol('stk', 'stack')
sym_table.add_symbol('computeSum', 'method')

# Display all symbols
sym_table.display()

# Update a symbol (Example)
sym_table.update_symbol('x', 'float')

# Display after update
print("\n=== After Updating 'x' to float ===")
sym_table.display()

=== Symbol Table Entries ===
Name: x, Type: integer, Scope: global
Name: y, Type: float, Scope: global
Name: name, Type: string, Scope: global
Name: arr, Type: array, Scope: global
Name: stk, Type: stack, Scope: global
Name: computeSum, Type: method, Scope: global

=== After Updating 'x' to float ===
=== Symbol Table Entries ===
Name: x, Type: float, Scope: global
Name: y, Type: float, Scope: global
Name: name, Type: string, Scope: global
Name: arr, Type: array, Scope: global
Name: stk, Type: stack, Scope: global
Name: computeSum, Type: method, Scope: global
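
The table above stores a `scope` label on each symbol but keys entries by name alone, so declaring the same name in two scopes would overwrite the earlier entry. One common way to handle this is a stack of per-scope dictionaries. The following is only a minimal sketch of that idea: the class `ScopedSymbolTable` and its methods `enter_scope`, `exit_scope`, `declare`, and `lookup` are hypothetical names, not part of the assignment code, and raising on redeclaration is an assumed policy.

```python
class ScopedSymbolTable:
    """Hypothetical sketch: one dict per scope, innermost scope last."""

    def __init__(self):
        # Start with a single global scope.
        self.scopes = [("global", {})]

    def enter_scope(self, name):
        # Push a fresh, empty scope (for example on entering a method body).
        self.scopes.append((name, {}))

    def exit_scope(self):
        if len(self.scopes) == 1:
            raise RuntimeError("cannot exit the global scope")
        self.scopes.pop()

    def declare(self, name, type_):
        # Declarations always go into the innermost scope.
        scope_name, table = self.scopes[-1]
        if name in table:
            raise KeyError(f"'{name}' already declared in scope '{scope_name}'")
        table[name] = type_

    def lookup(self, name):
        # Search from the innermost scope outwards.
        for scope_name, table in reversed(self.scopes):
            if name in table:
                return scope_name, table[name]
        return None


table = ScopedSymbolTable()
table.declare("x", "integer")
table.enter_scope("add")
table.declare("x", "float")   # shadows the global x inside 'add'
inner = table.lookup("x")
table.exit_scope()
outer = table.lookup("x")
print(inner, outer)
```

Inside the `add` scope the lookup finds the inner `x`; after `exit_scope()` the same name resolves to the global entry again.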


## Week 2: Lexical Analyzer

In [None]:
import re

# Token specifications. Order matters: earlier patterns win, so KEYWORD and
# DATATYPE come before IDENTIFIER, and '==' comes before '='.
# Inner groups are non-capturing so the named groups are the only captures.
token_specification = [
    ('KEYWORD', r'\b(?:if|else|do|while|for|method|class|return)\b'),
    ('DATATYPE', r'\b(?:integer|float|string|array|stack)\b'),
    ('OPERATOR', r'==|=|\+|\*|/|-|>'),
    ('NUMBER', r'\b\d+(?:\.\d+)?\b'),
    ('IDENTIFIER', r'\b[a-zA-Z_][a-zA-Z0-9_]*\b'),
    ('LPAREN', r'\('),
    ('RPAREN', r'\)'),
    ('LBRACE', r'\{'),
    ('RBRACE', r'\}'),
    ('SEMICOLON', r';'),
    ('COMMA', r','),
    ('NEWLINE', r'\n'),
    ('SKIP', r'[ \t]+'),
    ('MISMATCH', r'.'),  # Any other character
]

# Master regex
tok_regex = '|'.join(f'(?P<{name}>{regex})' for name, regex in token_specification)

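# Lexer function
def lexer(code):
    """Return a list of (kind, value) tokens; raise RuntimeError on an unexpected character."""
    tokens = []
    for mo in re.finditer(tok_regex, code):
        kind = mo.lastgroup   # name of the token pattern that matched
        value = mo.group()
        if kind in ('NEWLINE', 'SKIP'):
            continue          # whitespace is not emitted as a token
        if kind == 'MISMATCH':
            raise RuntimeError(f'Unexpected character: {value!r}')
        tokens.append((kind, value))
    return tokens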
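
The lexer above discards newlines, so its error message cannot say where an unexpected character occurred. A possible extension, sketched here on the assumption that line and column positions are wanted (the assignment text does not say so), is to count `NEWLINE` matches instead of skipping them silently. The function name `lex_with_positions` and the reduced token set `SPEC` are illustrative only and do not represent the full Zara specification.

```python
import re

# Reduced, illustrative token set; not the full Zara specification.
SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("IDENTIFIER", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("OPERATOR", r"==|=|\+|\*|/|-|>"),
    ("SEMICOLON", r";"),
    ("NEWLINE", r"\n"),
    ("SKIP", r"[ \t]+"),
    ("MISMATCH", r"."),
]
REGEX = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in SPEC))


def lex_with_positions(code):
    """Return (kind, value, line, column) tuples; line and column are 1-based."""
    tokens = []
    line = 1
    line_start = 0  # index in `code` where the current line begins
    for mo in REGEX.finditer(code):
        kind, value = mo.lastgroup, mo.group()
        column = mo.start() - line_start + 1
        if kind == "NEWLINE":
            line += 1
            line_start = mo.end()
        elif kind == "SKIP":
            continue
        elif kind == "MISMATCH":
            raise RuntimeError(
                f"Unexpected character {value!r} at line {line}, column {column}"
            )
        else:
            tokens.append((kind, value, line, column))
    return tokens


positions = lex_with_positions("x = 1;\ny = x + 2;")
print(positions[0])   # first token on line 1
print(positions[4])   # first token on line 2
```

Each token now carries the line and column where it starts, and the `RuntimeError` message reports the same position for an unexpected character.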

In [None]:
# Sample Zara Program
zara_code = '''
integer x;
float y;
string name;
if (x > 0) {
    x = x - 1;
}
method add(integer a, integer b) {
    return a + b;
}
'''

# Run the Lexer
token_list = lexer(zara_code)

# Display tokens
print("=== Tokens from Zara Code ===")
for token in token_list:
    print(token)

=== Tokens from Zara Code ===
('DATATYPE', 'integer')
('IDENTIFIER', 'x')
('SEMICOLON', ';')
('DATATYPE', 'float')
('IDENTIFIER', 'y')
('SEMICOLON', ';')
('DATATYPE', 'string')
('IDENTIFIER', 'name')
('SEMICOLON', ';')
('KEYWORD', 'if')
('LPAREN', '(')
('IDENTIFIER', 'x')
('OPERATOR', '>')
('NUMBER', '0')
('RPAREN', ')')
('LBRACE', '{')
('IDENTIFIER', 'x')
('OPERATOR', '=')
('IDENTIFIER', 'x')
('OPERATOR', '-')
('NUMBER', '1')
('SEMICOLON', ';')
('RBRACE', '}')
('KEYWORD', 'method')
('IDENTIFIER', 'add')
('LPAREN', '(')
('DATATYPE', 'integer')
('IDENTIFIER', 'a')
('COMMA', ',')
('DATATYPE', 'integer')
('IDENTIFIER', 'b')
('RPAREN', ')')
('LBRACE', '{')
('KEYWORD', 'return')
('IDENTIFIER', 'a')
('OPERATOR', '+')
('IDENTIFIER', 'b')
('SEMICOLON', ';')
('RBRACE', '}')
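
The token stream also shows how the two weeks could connect: a `DATATYPE` token followed by an `IDENTIFIER` token is a declaration, which is what a symbol table entry records. The sketch below is one rough way to collect such pairs. The function `collect_declarations` is a hypothetical helper, it returns a plain dict instead of the `SymbolTable` class so that it runs on its own, and the `tokens` list is a shortened, hand-copied subset of the output above.

```python
def collect_declarations(tokens):
    """Map each declared name to its type, given (kind, value) token pairs."""
    declared = {}
    # Walk over adjacent token pairs: (current token, following token).
    for (kind, value), (next_kind, next_value) in zip(tokens, tokens[1:]):
        if next_kind != "IDENTIFIER":
            continue
        if kind == "DATATYPE":
            declared[next_value] = value        # e.g. integer x
        elif (kind, value) == ("KEYWORD", "method"):
            declared[next_value] = "method"     # e.g. method add(...)
    return declared


tokens = [
    ("DATATYPE", "integer"), ("IDENTIFIER", "x"), ("SEMICOLON", ";"),
    ("DATATYPE", "float"), ("IDENTIFIER", "y"), ("SEMICOLON", ";"),
    ("KEYWORD", "method"), ("IDENTIFIER", "add"), ("LPAREN", "("),
    ("DATATYPE", "integer"), ("IDENTIFIER", "a"), ("COMMA", ","),
    ("DATATYPE", "integer"), ("IDENTIFIER", "b"), ("RPAREN", ")"),
]
declared = collect_declarations(tokens)
print(declared)
```

This sketch ignores scope, so the parameters `a` and `b` land in the same dict as the globals `x` and `y`; a fuller version would need to track braces and parameter lists.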
