# C interpreter

# Second Stage

Design a grammar, to be used with a bottom-up translator, that accepts the following constructs of the C language:
- Conditional statements (if - else)
- *printf* function

## Grammar design


### C Conditional statements (if - else)

In *C* the base structure of the conditional statement is:

``` 
    if(condition)
        instruction ;
```

Or, with a block of instructions:

```
    if(condition) {
        instruction ;
        instruction ;
        ...
        instruction ;
    }
```

In *C*, the condition can be any expression.


We can represent this with the following grammar:

- exprIF -> if `(` condition `)` instruction `;`
- condition -> expr
- instruction -> `{` exprins `;` `}`
- instruction -> expr
- exprins -> exprins `;` expr
- exprins -> expr


Now we can integrate these rules into our grammar and test it. Remember that
the condition and the instructions of the *if* statement have higher priority, so:

- exprIF -> if `(` condition `)` instruction
- condition -> expr
- instruction -> `{` exprins  `}`
- instruction -> def
- exprins -> exprins def
- exprins -> def


- def -> asign `;`
- asign -> ID `=` asign
- asign -> expr
- expr -> exprOR
- exprOR -> exprOR `||` exprAND
- exprOR -> exprAND
- exprAND -> exprAND `&&` exprE
- exprAND -> exprE
- exprE -> exprE `[==, !=]` exprC
- exprE -> exprC
- exprC -> exprC `[<, <=, >, >=]` exprA
- exprC -> exprA
- exprA -> exprA `[+, -]` add
- exprA -> add
- add -> add `[*, /]` fact
- add -> fact
- fact -> `-` fact
- fact -> num
- fact -> `ID`
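
The layered rules above can be exercised without SLY. The following is a minimal sketch, not the bottom-up parser this stage targets: it is a hand-written recursive-descent parser in which each left-recursive layer (`exprOR`, `exprAND`, and so on) becomes a loop. The `tokenize` helper, the `Parser` class and the tuple-shaped tree are hypothetical names introduced for illustration, and a block is assumed to be one or more `asign ;` statements.

```python
import re

# Hypothetical stand-in tokenizer (not the SLY Scanner). Two-character
# operators are listed before the single characters they start with.
TOKEN_RE = re.compile(r'\s*(\d+|[A-Za-z]\w*|==|!=|<=|>=|&&|\|\||[-+*/<>=;(){}])')
ID_RE = re.compile(r'[A-Za-z]\w*')

# One tuple per left-recursive layer, lowest precedence first:
# exprOR, exprAND, exprE, exprC, exprA, add.
LEVELS = [('||',), ('&&',), ('==', '!='), ('<', '<=', '>', '>='), ('+', '-'), ('*', '/')]


def tokenize(text):
    text, tokens, pos = text.rstrip(), [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError('illegal input at position {}'.format(pos))
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def is_id(token):
    return token is not None and token != 'if' and ID_RE.fullmatch(token) is not None


class Parser:
    def __init__(self, text):
        self.tokens, self.pos = tokenize(text), 0

    def peek(self, offset=0):
        index = self.pos + offset
        return self.tokens[index] if index < len(self.tokens) else None

    def expect(self, token):
        if self.peek() != token:
            raise SyntaxError('expected {!r}, found {!r}'.format(token, self.peek()))
        self.pos += 1

    def expr_if(self):  # exprIF -> if ( condition ) instruction
        self.expect('if')
        self.expect('(')
        condition = self.binary(0)  # condition -> expr
        self.expect(')')
        if self.peek() == '{':  # instruction -> { exprins }
            self.pos += 1
            body = [self.definition()]  # assumption: one or more `asign ;`
            while self.peek() != '}':
                body.append(self.definition())
            self.pos += 1
        else:  # instruction -> def
            body = [self.definition()]
        self.expect(None)  # nothing may follow the statement
        return ('if', condition, body)

    def definition(self):  # def -> asign ;
        node = self.assign()
        self.expect(';')
        return node

    def assign(self):  # asign -> ID = asign | expr
        if is_id(self.peek()) and self.peek(1) == '=':
            name = self.peek()
            self.pos += 2
            return ('=', name, self.assign())
        return self.binary(0)

    def binary(self, level):  # exprOR ... add, left-associative
        if level == len(LEVELS):
            return self.fact()
        node = self.binary(level + 1)
        while self.peek() in LEVELS[level]:
            operator = self.peek()
            self.pos += 1
            node = (operator, node, self.binary(level + 1))
        return node

    def fact(self):  # fact -> - fact | num | ID
        token = self.peek()
        if token == '-':
            self.pos += 1
            return ('neg', self.fact())
        if token is not None and (token.isdigit() or is_id(token)):
            self.pos += 1
            return int(token) if token.isdigit() else token
        raise SyntaxError('unexpected token {!r}'.format(token))


print(Parser('if(b < a) c = 4;').expr_if())
# ('if', ('<', 'b', 'a'), [('=', 'c', 4)])
```

Because `add` sits below `exprA`, which in turn sits below `exprC`, operators introduced deeper in the grammar bind tighter: in this sketch `1 + 2 * 3` yields `('+', 1, ('*', 2, 3))`.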

## Implementing the lexical analyzer (Lexer)

We will reuse the main lexical analyzer implementation, adding the *if* handling needed to process the new rules.

In [6]:
from sly import Lexer

class Scanner(Lexer):
    tokens = {ID, CNUM, EQ, NEQ, OR, AND, LEQ, GEQ, IF}
    literals = {'=', '<', '>', '/', '!', '+', '-', '*', ';', '{', '}', '(', ')'}

    # Ignore whitespace and tabulations

    ignore = ' \t'

    # Regular expressions rules for tokens

    ID = r'[a-zA-Z][\w_]*'
    EQ = r'=='
    NEQ = r'!='
    GEQ = r'>='
    LEQ = r'<='
    AND = r'&&'
    OR = r'\|\|'

    # Special cases: an identifier whose text is exactly 'if' becomes an IF token
    ID['if'] = IF

    @_(r'\d+')
    def CNUM(self, t):
        t.value = int(t.value)
        return t

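    # Error handling rule: report the offending character, skip it, and
    # return it as an 'Illegal' token so that callers can detect the failure

    def error(self, t):
        print('<-' * 10, "Illegal character '{}'".format(t.value[0]), '->' * 10)
        self.index += 1
        t.type = 'Illegal'
        t.value = t.value[0]
        return t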
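
The `ID['if'] = IF` remapping in the scanner above matches a whole identifier first and only then checks whether its text is a keyword. The idea can be sketched without SLY as follows; `KEYWORDS` and `classify` are hypothetical names used only for this illustration.

```python
import re

KEYWORDS = {'if': 'IF'}               # mirrors ID['if'] = IF
ID_RE = re.compile(r'[a-zA-Z]\w*')    # same shape as the ID rule


def classify(word):
    """Return the token type for an identifier-shaped lexeme."""
    if ID_RE.fullmatch(word) is None:
        raise ValueError('not an identifier: {!r}'.format(word))
    # Keyword lookup happens after the full identifier has been matched
    return KEYWORDS.get(word, 'ID')


print(classify('if'))    # IF
print(classify('iffy'))  # ID
```

Looking the keyword up after matching keeps names that merely start with `if`, such as `iffy`, as ordinary identifiers.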

## Testing the lexical analyzer (Lexer)


### Importing data

In [11]:
import pandas as pd 

data = pd.read_csv('../assets/testing/if_sentences.csv')
data[['if_sentences']]

|   | if_sentences |
|---|---|
| 0 | `if(a > 5) { b = 4; c = 5 + b; }` |
| 1 | `if(b < a) c = 4;` |


In [10]:
lexer = Scanner()
sentences = data['if_sentences'].values
pass_or_not = []

for index, sentence in enumerate(sentences):
    print('-' * 80, "{} Lexically Testing sentence: '{}'".format(index, sentence), '-' * 80, sep='\n')
    # A sentence passes only if the lexer produced no 'Illegal' token
    all_token_pass = True
    for token in lexer.tokenize(sentence):
        print(" type = '{}', value = '{}'".format(token.type, token.value))
        if token.type == 'Illegal':
            all_token_pass = False

    pass_or_not.append('Pass' if all_token_pass else 'FAIL')

data['Test'] = pass_or_not

--------------------------------------------------------------------------------
0 Lexically Testing sentence: 'if(a > 5) { b = 4; c = 5 + b; }'
--------------------------------------------------------------------------------
 type = 'IF', value = 'if'
 type = '(', value = '('
 type = 'ID', value = 'a'
 type = '>', value = '>'
 type = 'CNUM', value = '5'
 type = ')', value = ')'
 type = '{', value = '{'
 type = 'ID', value = 'b'
 type = '=', value = '='
 type = 'CNUM', value = '4'
 type = ';', value = ';'
 type = 'ID', value = 'c'
 type = '=', value = '='
 type = 'CNUM', value = '5'
 type = '+', value = '+'
 type = 'ID', value = 'b'
 type = ';', value = ';'
 type = '}', value = '}'
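
The pass/fail decision in the loop above could also be factored into a small helper. This is only a sketch: `Token` is a hypothetical stand-in for the objects yielded by `Scanner.tokenize()` (only the `type` and `value` attributes are modelled), and `lexical_verdict` is a name introduced here.

```python
from collections import namedtuple

# Hypothetical stand-in for the tokens yielded by the scanner
Token = namedtuple('Token', ['type', 'value'])


def lexical_verdict(tokens):
    """Return 'Pass' if no token has type 'Illegal', otherwise 'FAIL'."""
    return 'FAIL' if any(token.type == 'Illegal' for token in tokens) else 'Pass'


good = [Token('ID', 'c'), Token('=', '='), Token('CNUM', 4), Token(';', ';')]
bad = good + [Token('Illegal', '@')]
print(lexical_verdict(good), lexical_verdict(bad))  # Pass FAIL
```

Note that `any` stops at the first illegal token, so with a lazy token stream the remaining characters would not be scanned or reported.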


In [12]:
data

|   | if_sentences |
|---|---|
| 0 | `if(a > 5) { b = 4; c = 5 + b; }` |
| 1 | `if(b < a) c = 4;` |
