**COMPILERS - LECTURE 04**

**Prof. Luciano Silva**

**LECTURE OBJECTIVES:**

*   Review the syntax analysis process
*   Implement a parser for the TINY-C language



In [None]:
!pip install rply

Collecting rply
  Downloading https://files.pythonhosted.org/packages/c0/7c/f66be9e75485ae6901ae77d8bdbc3c0e99ca748ab927b3e18205759bde09/rply-0.7.8-py2.py3-none-any.whl
Installing collected packages: rply
Successfully installed rply-0.7.8


**REVIEW OF THE SYNTAX ANALYSIS PROCESS**

In our last lecture, we implemented a complete parser for the assignment statement with arithmetic expressions over unsigned integers:

\<atrib\>::= ID "=" \<expression\>

\<expression\> ::= NUMBER

       | <expression> "+" \<expression\>
 
       | <expression> "-" \<expression\>
 
       | <expression> "*" \<expression\>
 
       | <expression> "/" \<expression\>
 
       | "(" <expression> ")"

The first step was to implement a lexer for this grammar, shown below:


In [None]:
from rply import LexerGenerator

lg = LexerGenerator()

lg.add('ID', r'[a-zA-Z][a-zA-Z0-9]*')
lg.add('EQUALS', r'=')
lg.add('NUMBER', r'\d+')
lg.add('PLUS', r'\+')
lg.add('MINUS', r'-')
lg.add('MUL', r'\*')
lg.add('DIV', r'/')
lg.add('OPEN_PARENS', r'\(')
lg.add('CLOSE_PARENS', r'\)')

lg.ignore(r'\s+')

lexer = lg.build()

The second step was to implement the Python classes that represent the nodes of the syntax tree built by the parser:

In [None]:
from rply.token import BaseBox

class Number(BaseBox):
    def __init__(self, value):
        self.value = value

    def eval(self):
        return self.value

class BinaryOp(BaseBox):
    def __init__(self, left, right):
        self.left = left
        self.right = right

class Add(BinaryOp):
    def eval(self):
        return self.left.eval() + self.right.eval()

class Sub(BinaryOp):
    def eval(self):
        return self.left.eval() - self.right.eval()

class Mul(BinaryOp):
    def eval(self):
        return self.left.eval() * self.right.eval()

class Div(BinaryOp):
    def eval(self):
        return self.left.eval() / self.right.eval()

class Attrib(BaseBox):
    def __init__(self, id, expression):
        self.id = id
        self.expression = expression

Finally, the parser for the assignment statement was implemented:

In [None]:
from rply import ParserGenerator

pg = ParserGenerator(
    # A list of all token names, accepted by the lexer.
    ['NUMBER', 'OPEN_PARENS', 'CLOSE_PARENS',
     'PLUS', 'MINUS', 'MUL', 'DIV','ID','EQUALS'
    ],
    # A list of precedence rules with ascending precedence, to
    # disambiguate ambiguous production rules.
    precedence=[
        ('left', ['PLUS', 'MINUS']),
        ('left', ['MUL', 'DIV'])    
    ]
)

# rule <atrib> ::= ID "=" <expression>

@pg.production('atrib : ID EQUALS expression')
def attrib(p):
    return Attrib(p[0].getstr(), p[2])

@pg.production('expression : NUMBER')
def expression_number(p):
    # p is a list of the pieces matched by the right hand side of the
    # rule
    return Number(int(p[0].getstr()))

@pg.production('expression : OPEN_PARENS expression CLOSE_PARENS')
def expression_parens(p):
    return p[1]

@pg.production('expression : expression PLUS expression')
@pg.production('expression : expression MINUS expression')
@pg.production('expression : expression MUL expression')
@pg.production('expression : expression DIV expression')
def expression_binop(p):
    left = p[0]
    right = p[2]
    if p[1].gettokentype() == 'PLUS':
        return Add(left, right)
    elif p[1].gettokentype() == 'MINUS':
        return Sub(left, right)
    elif p[1].gettokentype() == 'MUL':
        return Mul(left, right)
    elif p[1].gettokentype() == 'DIV':
        return Div(left, right)
    else:
        raise AssertionError('Oops, this should not be possible!')

parser = pg.build()

We ran a test with an assignment statement:

In [None]:
arvore=parser.parse(lexer.lex('x=1+2*3'))
print(arvore)
print(arvore.id)
print(arvore.expression.eval())

**EXERCISE**

*Implement and test a parser for the grammar of the TINY-C language. Every node of the syntax tree must implement a print method that, when invoked, shows all the content stored in the node's attributes.*
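
One possible way to meet the print requirement without repeating code in every node is a shared base class that walks the node's attributes. The sketch below is only an illustration under that assumption: `PrintableNode`, `describe`, `DemoNumber` and `DemoAdd` are hypothetical names, and in the notebook the nodes would additionally inherit from `BaseBox`.

```python
class PrintableNode:
    """Hypothetical base class: shows the class name and every attribute."""

    def describe(self, indent=0):
        # Returns the lines to print, so nested nodes can be indented.
        pad = '  ' * indent
        lines = [pad + type(self).__name__]
        for name, value in vars(self).items():
            if isinstance(value, PrintableNode):
                lines.append(f'{pad}  {name}:')
                lines.extend(value.describe(indent + 2))
            else:
                lines.append(f'{pad}  {name}: {value!r}')
        return lines

    def print(self):
        print('\n'.join(self.describe()))


class DemoNumber(PrintableNode):
    def __init__(self, value):
        self.value = value


class DemoAdd(PrintableNode):
    def __init__(self, left, right):
        self.left = left
        self.right = right


tree = DemoAdd(DemoNumber(1), DemoNumber(2))
tree.print()
```

Because `describe` relies on `vars(self)`, any attribute assigned in a node's `__init__` is shown without further changes to the subclass.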




program ::= function (function)*

function ::= type_specifier id '(' param_decl_list ')' compound_stmt

type_specifier ::= int | char

param_decl_list ::= param_decl (',' param_decl)*

param_decl ::= type_specifier id

compound_stmt ::= '{' (var_decl stmt)? '}'

var_decl ::= type_specifier var_decl_list ';'

var_decl_list ::= variable_id (',' variable_id)*

variable_id ::= id ('=' expr)? | id '[' num ']'

stmt ::= compound_stmt | cond_stmt | while_stmt | break ';' | continue ';' | return expr ';' | readint '(' id ')' ';' | writeint '(' expr ')' ';'

cond_stmt ::= if '(' expr ')' stmt (else stmt)?

while_stmt ::= while '(' expr ')' stmt

expr ::= id '=' expr | condition

condition ::= disjunction | disjunction '?' expr ':' condition

disjunction ::= conjunction | disjunction '||' conjunction

conjunction ::= comparison | conjunction '&&' comparison

comparison ::= relation | relation '==' relation

relation ::= sum | sum ('<' | '>') sum

sum ::= sum '+' term | sum '-' term | term

term ::= term '*' factor | term '/' factor | term '%' factor | factor

factor ::= '!' factor | '-' factor | primary

primary ::= num | charconst | id | '(' expr ')'
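
Unlike the ambiguous expression grammar from the review, this grammar encodes operator precedence in its layers: `sum` is built from `term`, which is built from `factor`, which is built from `primary`. The sketch below illustrates that idea with a small hand-written recursive-descent evaluator covering only those four layers and numeric operands. It is not the rply solution the exercise asks for; left recursion is written as a loop, `/` is assumed to be integer division on non-negative values, and `tokenize`, `LayeredEvaluator` and `evaluate` are hypothetical names.

```python
import re

TOKEN_RE = re.compile(r'\s*(\d+|[-+*/%()!])')

def tokenize(source):
    """Split source into number and operator strings."""
    tokens, pos = [], 0
    source = source.rstrip()
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise ValueError(f'unexpected character at position {pos}')
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

class LayeredEvaluator:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def sum(self):
        # sum ::= sum '+' term | sum '-' term | term
        value = self.term()
        while self.peek() in ('+', '-'):
            if self.advance() == '+':
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):
        # term ::= term '*' factor | term '/' factor | term '%' factor | factor
        value = self.factor()
        while self.peek() in ('*', '/', '%'):
            op = self.advance()
            right = self.factor()
            if op == '*':
                value *= right
            elif op == '/':
                value //= right  # assumption: integer division
            else:
                value %= right
        return value

    def factor(self):
        # factor ::= '!' factor | '-' factor | primary
        if self.peek() == '-':
            self.advance()
            return -self.factor()
        if self.peek() == '!':
            self.advance()
            return 0 if self.factor() else 1
        return self.primary()

    def primary(self):
        # primary ::= num | '(' sum ')'   (reduced: no id, charconst or full expr)
        token = self.advance()
        if token == '(':
            value = self.sum()
            if self.advance() != ')':
                raise ValueError("expected ')'")
            return value
        if token is None or not token.isdigit():
            raise ValueError(f'unexpected token {token!r}')
        return int(token)

def evaluate(source):
    evaluator = LayeredEvaluator(tokenize(source))
    value = evaluator.sum()
    if evaluator.peek() is not None:
        raise ValueError(f'unexpected token {evaluator.peek()!r}')
    return value

print(evaluate('1+2*3'), evaluate('10-4-3'), evaluate('-(2+3)*4'))
```

Each method only calls the layer below it for its operands, which is why `*` groups before `+` and why `10-4-3` is evaluated from left to right.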

The implementation of the lexer is available below:

In [117]:
from rply import LexerGenerator

lg = LexerGenerator()

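# Keywords end with \b so that they do not match a prefix of an identifier
# (rply tries the rules in the order they are added).
lg.add('INT', r'int\b') # OK
lg.add('CHAR', r'char\b') # OK
lg.add('BREAK', r'break\b')
lg.add('CONTINUE', r'continue\b')
lg.add('RETURN', r'return\b')
lg.add('READINT', r'readint\b')
lg.add('WRITEINT', r'writeint\b')
lg.add('IF', r'if\b')
lg.add('ELSE', r'else\b')
lg.add('WHILE', r'while\b')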
lg.add('ID', r'[a-zA-Z][a-zA-Z0-9]*') # OK
lg.add('OPEN_PAR', r'\(') # OK
lg.add('CLOSE_PAR', r'\)') # OK
lg.add('OPEN_COL', r'\[') # OK
lg.add('CLOSE_COL', r'\]') # OK
lg.add('VIRG', r'\,')
lg.add('OPEN_CH', r'\{') # OK
lg.add('CLOSE_CH', r'\}') # OK
lg.add('PVIRG', r'\;')
lg.add('COMPEQUALS', r'==') # OK
lg.add('COMPMAIOR', r'\>') # OK
lg.add('COMPMENOR', r'\<') # OK
lg.add('EQUALS', r'=') # OK
lg.add('INTERROG', r'\?')
lg.add('DOISP', r'\:')
lg.add('DISJ', r'\|\|') # OK
lg.add('CONJ', r'&&') # OK
lg.add('NOT', r'\!')
lg.add('NUMBER', r'\d+') # OK
lg.add('CHARCONST', r'\'\S\'')
lg.add('PLUS', r'\+') # OK
lg.add('MINUS', r'-') # OK 
lg.add('MUL', r'\*') # OK
lg.add('DIV', r'/') # OK 
lg.add('MOD', r'\%') # OK

lg.ignore(r'\s+')

lexer = lg.build()

In [118]:
from rply.token import BaseBox

# Draft nodes for the top of the grammar (program and function);
# the parser below does not build them yet.

class Program(BaseBox):
    def __init__(self, functions):
        self.functions = functions  # list of Function nodes

class Function(BaseBox):
    def __init__(self, typedef, id, params_list, func_statement):
        self.typedef = typedef.typedef  # typedef is a Typedef node
        self.id = id
        self.params_list = params_list
        self.func_statement = func_statement

class Typedef(BaseBox):
    def __init__(self, typedef):
        self.typedef = typedef

class Attrib(BaseBox):
    def __init__(self, typedef, id, expression):
        self.typedef = typedef
        self.id = id
        self.expression = expression

class Number(BaseBox):
    def __init__(self, value):
        self.value = value

    def eval(self):
        return self.value

class BinaryOp(BaseBox):
    def __init__(self, left, right):
        self.left = left
        self.right = right

class Comp_maior(BinaryOp):
    def eval(self):
        return 1 if self.left.eval() > self.right.eval() else 0

class Comp_menor(BinaryOp):
    def eval(self):
        return 1 if self.left.eval() < self.right.eval() else 0

class Comp_equal(BinaryOp):
    def eval(self):
        return 1 if self.left.eval() == self.right.eval() else 0

class Conj(BinaryOp):
    def eval(self):
        # Assuming C's convention: any non-zero value counts as true.
        return 1 if self.left.eval() != 0 and self.right.eval() != 0 else 0

class Disj(BinaryOp):
    def eval(self):
        # Assuming C's convention: any non-zero value counts as true.
        return 1 if self.left.eval() != 0 or self.right.eval() != 0 else 0

class Add(BinaryOp):
    def eval(self):
        return self.left.eval() + self.right.eval()

class Sub(BinaryOp):
    def eval(self):
        return self.left.eval() - self.right.eval()

class Mul(BinaryOp):
    def eval(self):
        return self.left.eval() * self.right.eval()

class Div(BinaryOp):
    def eval(self):
        return self.left.eval() / self.right.eval()

class Mod(BinaryOp):
    def eval(self):
        return self.left.eval() % self.right.eval()
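
The `Div` and `Mod` nodes above use Python's `/` and `%`, so `7 / 2` evaluates to `3.5`. The lecture does not state the intended semantics of TINY-C; if it is meant to follow C's integer arithmetic (an assumption), division truncates toward zero and the remainder takes the sign of the dividend. The helpers `c_div` and `c_mod` below are hypothetical and only sketch that behavior.

```python
def c_div(a, b):
    """Integer division truncating toward zero, as C does for ints."""
    quotient = abs(a) // abs(b)
    return quotient if (a < 0) == (b < 0) else -quotient

def c_mod(a, b):
    """Remainder chosen so that a == c_div(a, b) * b + c_mod(a, b)."""
    return a - c_div(a, b) * b

# Python's operators next to the C-style helpers.
print(7 / 2, 7 // 2, c_div(7, 2))
print(-7 // 2, c_div(-7, 2))
print(-7 % 2, c_mod(-7, 2))
```

A node's `eval` could call such helpers instead of `/` and `%`; for non-negative operands `//` alone gives the same result.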

In [135]:
# implement your parser here...
# modify the parser to recognize assignments

from rply import ParserGenerator

pg = ParserGenerator(
    # A list of all token names, accepted by the lexer.
    ['NUMBER',
     'PLUS', 'MINUS', 'MUL', 'DIV','ID','EQUALS', 'COMPEQUALS',
     'COMPMAIOR', 'COMPMENOR', 'OPEN_PAR', 'CLOSE_PAR', 'OPEN_COL', 'CLOSE_COL',
     'OPEN_CH', 'CLOSE_CH','MOD', 'CHAR', 'INT', 'CONJ', 'DISJ'
     ],
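    # A list of precedence rules with ascending precedence, to
    # disambiguate ambiguous production rules. The levels follow the
    # TINY-C grammar: || < && < == < (<, >) < (+, -) < (*, /, %).
    precedence=[
        ('left', ['DISJ']),
        ('left', ['CONJ']),
        ('left', ['COMPEQUALS']),
        ('left', ['COMPMAIOR', 'COMPMENOR']),
        ('left', ['PLUS', 'MINUS']),
        ('left', ['MUL', 'DIV', 'MOD'])
    ]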
)
# rule <atrib> ::= (CHAR | INT) ID "=" <expression>

@pg.production('atrib : CHAR ID EQUALS expression | INT ID EQUALS expression')
def attrib(p):
    return Attrib(p[0].getstr(), p[1].getstr(), p[3])

@pg.production('expression : NUMBER')
def expression_number(p):
    # p is a list of the pieces matched by the right hand side of the
    # rule
    return Number(int(p[0].getstr()))

@pg.production('expression : OPEN_PAR expression CLOSE_PAR')
def expression_parens(p):
    return p[1]

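# Braces and brackets are accepted as grouping here, although the TINY-C
# grammar above only groups expressions with parentheses.
@pg.production('expression : OPEN_CH expression CLOSE_CH')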
def expression_ch(p):
    return p[1]

@pg.production('expression : OPEN_COL expression CLOSE_COL')
def expression_col(p):
    return p[1]

@pg.error
def error_handle(token):
    raise ValueError(token)

@pg.production('expression : expression PLUS expression')
@pg.production('expression : expression MINUS expression')
@pg.production('expression : expression MUL expression')
@pg.production('expression : expression DIV expression')
@pg.production('expression : expression COMPEQUALS expression')
@pg.production('expression : expression COMPMAIOR expression')
@pg.production('expression : expression COMPMENOR expression')
@pg.production('expression : expression MOD expression')
@pg.production('expression : expression CONJ expression')
@pg.production('expression : expression DISJ expression')
def expression_binop(p):
    left = p[0]     
    right = p[2]
    if p[1].gettokentype() == 'PLUS':
        return Add(left, right)
    elif p[1].gettokentype() == 'MINUS':
        return Sub(left, right)
    elif p[1].gettokentype() == 'MUL':
        return Mul(left, right)
    elif p[1].gettokentype() == 'DIV':
        return Div(left, right)
    elif p[1].gettokentype() == 'COMPEQUALS':
        return Comp_equal(left, right)
    elif p[1].gettokentype() == 'COMPMAIOR':
        return Comp_maior(left, right)
    elif p[1].gettokentype() == 'COMPMENOR':
        return Comp_menor(left, right)
    elif p[1].gettokentype() == 'MOD':
        return Mod(left, right)
    elif p[1].gettokentype() == 'CONJ':
        return Conj(left, right)
    elif p[1].gettokentype() == 'DISJ':
        return Disj(left, right)
    else:
        raise AssertionError('Oops, this should not be possible!')

parser = pg.build()



In [136]:
sTest = '''
int x = 7
'''

In [137]:
# test your parser with a small TINY-C program.
arvore=parser.parse(lexer.lex(sTest))
print(arvore)
print(arvore.typedef)
print(arvore.id)
print(arvore.expression.eval())

<__main__.Attrib object at 0x7fccd5b66910>
int
x
7
