# C Interpreter

# Fourth Stage

Design a grammar, suitable for a bottom-up translator, that accepts the following constructs of the C language:
- `scanf` statements
- the `main` function
- `void` functions

To develop this, we will divide the problem into three parts. First, we will build the lexical and syntax analyzers for *scanf* statements and the *main* function. Then we will handle *void* functions. Finally, we will introduce the semantic rules and show how to read the program from a file in Python.


## 1. Grammar Definition for scanf

The basic structure of *scanf* is:

```c
scanf("format", &variable_name);
```

It can also read several values in a single call:

```c
scanf("format", &variable_name, &variable_name, &variable_name);
```


We can represent this with the following grammar:
- scanexpr -> `SCANF` `(` stringexpr scanftail
- scanftail -> `,` `&` fact scanftail
- scanftail -> `)`
- stringexpr -> `CSTRING`
- fact -> `-` fact
- fact -> num
- fact -> `ID`
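
To make the productions concrete, here is a minimal sketch of a recognizer for this grammar. It is a hand-written recursive-descent (top-down) check, not the bottom-up parser this stage aims for, and it works on a list of token type names such as `'SCANF'`, `'('` or `'ID'`. The function name `accepts_scanexpr` is hypothetical, and the sketch assumes that `num` arrives as a `CNUM` token.

```python
def accepts_scanexpr(types):
    """Return True if the list of token types can be derived from scanexpr."""
    pos = 0

    def peek():
        return types[pos] if pos < len(types) else None

    def eat(expected):
        nonlocal pos
        if peek() != expected:
            raise SyntaxError("expected {!r}, found {!r}".format(expected, peek()))
        pos += 1

    def fact():
        # fact -> '-' fact | num | ID   (assumption: num is tokenized as CNUM)
        if peek() == '-':
            eat('-')
            fact()
        elif peek() in ('CNUM', 'ID'):
            eat(peek())
        else:
            raise SyntaxError("expected a fact, found {!r}".format(peek()))

    def scanftail():
        # scanftail -> ',' '&' fact scanftail | ')'
        if peek() == ',':
            eat(',')
            eat('&')
            fact()
            scanftail()
        else:
            eat(')')

    try:
        # scanexpr -> SCANF '(' stringexpr scanftail, with stringexpr -> CSTRING
        eat('SCANF')
        eat('(')
        eat('CSTRING')
        scanftail()
    except SyntaxError:
        return False
    return pos == len(types)  # reject trailing tokens


print(accepts_scanexpr(['SCANF', '(', 'CSTRING', ',', '&', 'ID', ')']))  # True
print(accepts_scanexpr(['SCANF', '(', 'CSTRING', ',', 'ID', ')']))       # False
```

The second call fails because the grammar requires `&` before every argument. Note that the grammar does not mention the trailing `;`, so this sketch rejects it as well.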

## Implementing the Lexical Analyzer (Lexer)

In [16]:
from sly import Lexer

class Scanner(Lexer):
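    # CNUM is listed here because a rule for it is defined below
    tokens = {SCANF, CSTRING, ID, CNUM}
    literals = {'(', ')', ',', '&', ';'}

    # Ignore spaces and tabs
    ignore = ' \t'

    # Regular expressions for tokens
    CSTRING = r'\"(\\.|[^\"])*\"'
    ID = r'[a-zA-Z][\w_]*'
    # Remap the identifier "scanf" to its own token type
    ID['scanf'] = SCANF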

    @_(r'\d+')
    def CNUM(self, t):
        t.value = int(t.value)
        return t

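    # Error handling rule: report the offending character, skip it,
    # and return it as a token of type 'Illegal' so callers can detect it
    def error(self, t):
        print('<-' * 10, "Illegal character '{}'".format(t.value[0]), '->' * 10)
        self.index += 1
        t.type = 'Illegal'
        t.value = t.value[0]
        return t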
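
The behaviour of this lexer can be approximated with the standard library alone, which may help when reasoning about what each rule does. The following is only a sketch, not SLY itself: the names `KEYWORDS`, `LITERALS`, `TOKEN_RE` and `tokenize` are hypothetical, and the function yields plain `(type, value)` pairs instead of token objects.

```python
import re

KEYWORDS = {'scanf': 'SCANF'}   # plays the role of ID['scanf'] = SCANF
LITERALS = set('(),&;')         # single-character literals
TOKEN_RE = re.compile(
    r'(?P<CSTRING>"(?:\\.|[^"])*")'   # string literal with escapes
    r'|(?P<ID>[a-zA-Z]\w*)'           # identifier or keyword
    r'|(?P<CNUM>\d+)'                 # integer constant
)


def tokenize(text):
    """Yield (type, value) pairs, loosely mimicking the Scanner class."""
    pos = 0
    while pos < len(text):
        ch = text[pos]
        if ch in ' \t':               # ignored characters
            pos += 1
            continue
        match = TOKEN_RE.match(text, pos)
        if match:
            kind, value = match.lastgroup, match.group()
            if kind == 'ID':
                kind = KEYWORDS.get(value, 'ID')   # keyword remapping
            elif kind == 'CNUM':
                value = int(value)
            pos = match.end()
        elif ch in LITERALS:
            kind, value = ch, ch
            pos += 1
        else:                         # error rule: flag one character and move on
            kind, value = 'Illegal', ch
            pos += 1
        yield kind, value


for token in tokenize('scanf("%s",&name);'):
    print(token)
```

Like the `error` method of the class, the last branch skips exactly one character, so lexing continues after an illegal symbol instead of stopping.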

## Testing the Lexical Analyzer (scanf statements)

In [18]:
import pandas as pd 

data = pd.read_csv('../assets/testing/scanf_sentences.csv', delimiter="'")
data[['scanf_sentences']]

|   | scanf_sentences |
|---|---|
| 0 | `scanf("%d %d %d %d", &minx, &maxx, &miny, &maxy);` |
| 1 | `scanf("%s %d %f %c", &var1, &var2, &var3, &var4);` |
| 2 | `scanf("%s",&name);` |


In [19]:
lexer = Scanner()
sentences = data['scanf_sentences'].values
pass_or_not = []

for index, sentence in enumerate(sentences):
    print('-' * 80, "{} Lexically Testing sentence: '{}'".format(index, sentence), '-' * 80, sep='\n')
    all_token_pass = True  # reset for every sentence
    for token in lexer.tokenize(sentence):
        print(" type = '{}', value = '{}'".format(token.type, token.value))
        # The error rule marks unknown characters with the type 'Illegal'
        if token.type == 'Illegal':
            all_token_pass = False

    pass_or_not.append('Pass' if all_token_pass else 'FAIL')

# Store the verdict for each sentence next to the sentence itself
data['Test'] = pass_or_not

--------------------------------------------------------------------------------
0 Lexically Testing sentence: 'scanf("%d %d %d %d", &minx, &maxx, &miny, &maxy);'
--------------------------------------------------------------------------------
 type = 'SCANF', value = 'scanf'
 type = '(', value = '('
 type = 'CSTRING', value = '"%d %d %d %d"'
 type = ',', value = ','
 type = '&', value = '&'
 type = 'ID', value = 'minx'
 type = ',', value = ','
 type = '&', value = '&'
 type = 'ID', value = 'maxx'
 type = ',', value = ','
 type = '&', value = '&'
 type = 'ID', value = 'miny'
 type = ',', value = ','
 type = '&', value = '&'
 type = 'ID', value = 'maxy'
 type = ')', value = ')'
 type = ';', value = ';'
--------------------------------------------------------------------------------
1 Lexically Testing sentence: 'scanf("%s %d %f %c", &var1, &var2, &var3, &var4);'
--------------------------------------------------------------------------------
 type = 'SCANF', value = 'scanf'
 type = '(',
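
The pass/fail bookkeeping of the test loop can also be factored into a small helper. This is a sketch under the assumption that each token exposes a `type` attribute, as the tokens printed above do. The name `lexical_verdict` is hypothetical, and the `SimpleNamespace` objects are stand-ins for real lexer tokens so that the snippet runs on its own.

```python
from types import SimpleNamespace


def lexical_verdict(tokens):
    """Return 'Pass' if no token is flagged as 'Illegal', otherwise 'FAIL'."""
    return 'FAIL' if any(tok.type == 'Illegal' for tok in tokens) else 'Pass'


# Stand-in tokens (hypothetical); in the notebook they would come from
# lexer.tokenize(sentence).
good = [SimpleNamespace(type='SCANF', value='scanf'),
        SimpleNamespace(type='(', value='(')]
bad = good + [SimpleNamespace(type='Illegal', value='$')]

print(lexical_verdict(good), lexical_verdict(bad))  # Pass FAIL
```

With such a helper, the `Test` column could be filled with `[lexical_verdict(lexer.tokenize(s)) for s in sentences]`, at the cost of no longer printing each token.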

## 2. Grammar Definition for void functions

The basic structure of a *void* function is:

```c
void function_name([parameters]*) { instructions }
```
We can represent this with the following grammar:

- functionDec -> `VOID` `ID` `(` parameters `)` `{` instructions `}`
- instructions -> def instructions
- instructions -> statements instructions
- instructions -> ε
- parameters -> `ID` parameters'
- parameters' -> `,` `ID` parameters'
- parameters' -> ε
- statements -> scanexpr
- scanexpr -> `SCANF` `(` stringexpr scanftail
- scanftail -> `,` `&` fact scanftail
- scanftail -> `)`
- stringexpr -> `CSTRING`
- fact -> `-` fact
- fact -> num
- fact -> `ID`
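
The two `parameters` productions describe a comma-separated list of identifiers. A minimal sketch of a check for them follows, with the right recursion of `parameters'` unrolled into a loop. The function name `accepts_parameters` is hypothetical, and the input is assumed to be a list of token type names.

```python
def accepts_parameters(types):
    """Return True if the token types derive from: parameters -> ID parameters'."""
    pos = 0
    # parameters -> ID parameters'
    if pos >= len(types) or types[pos] != 'ID':
        return False
    pos += 1
    # parameters' -> ',' ID parameters' | empty   (recursion unrolled as a loop)
    while pos < len(types) and types[pos] == ',':
        if pos + 1 >= len(types) or types[pos + 1] != 'ID':
            return False
        pos += 2
    return pos == len(types)


print(accepts_parameters(['ID', ',', 'ID']))  # True
print(accepts_parameters(['ID', ',']))        # False
print(accepts_parameters([]))                 # False
```

The last call shows that, read literally, the grammar requires at least one parameter. A declaration such as `void test()` would therefore seem to need an additional empty production for `parameters`.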

Function calls can be represented with the following grammar:

- def -> asign `;`
- asign -> `ID` `=` asign
- asign -> expr
- expr -> functionexpr
- functionexpr -> `ID` `(` arguments `)`
- arguments -> fact `,` arguments
- arguments -> ε
- fact -> `-` fact
- fact -> num
- fact -> `ID`



## Testing the Lexical Analyzer (void functions)

In [29]:
import pandas as pd 

data = pd.read_csv('../assets/testing/void_functions_sentences.csv', delimiter="'")
data[['void_functions']]

|   | void_functions |
|---|---|
| 0 | `void test(){a=0; b=0; scanf("%d %d", a,b); c =...` |
| 1 | `void test(a , b){ c = b + 3; a = a + 5;}` |
| 2 | `test();` |
| 3 | `test(5, 2);` |
| 4 | `test(a, b);` |


In [25]:

class Scanner(Lexer):
    tokens = {ID, CNUM, EQ, NEQ, OR, AND, LEQ, GEQ, IF, VOID, SCANF, CSTRING}
    literals = {'=', '<', '>', '/', '!', '+', '-', '*', ';', '{', '}', '(', ')', ',', '&'}

    # Ignore spaces and tabs

    ignore = ' \t'

    # Regular expressions rules for tokens

    ID = r'[a-zA-Z][\w_]*'
    EQ = r'=='
    NEQ = r'!='
    GEQ = r'>='
    LEQ = r'<='
    AND = r'&&'
    OR = r'\|\|'
    
    CSTRING = r'\"(\\.|[^\"])*\"'

    # Special cases
    ID['if'] = IF
    ID['void'] = VOID
    ID['scanf'] = SCANF


    @_(r'\d+')
    def CNUM(self, t):
        t.value = int(t.value)
        return t

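    # Error handling rule: report the offending character, skip it,
    # and return it as a token of type 'Illegal' so callers can detect it
    def error(self, t):
        print('<-' * 10, "Illegal character '{}'".format(t.value[0]), '->' * 10)
        self.index += 1
        t.type = 'Illegal'
        t.value = t.value[0]
        return t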
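
This lexer defines two-character operators such as `==` and `&&` as tokens, while `=` and `&` are single-character literals. As far as the SLY documentation describes it, literals are only checked after every token rule has failed, so `==` is never split into two `=` tokens. The sketch below illustrates the underlying idea with Python's `re` module only, whose alternation tries alternatives from left to right. The names `longest_first` and `shortest_first` are hypothetical.

```python
import re

# Two-character operators are listed before the single characters.
longest_first = re.compile(r'==|!=|>=|<=|&&|\|\||[=<>!&]')
# Same alternatives, but the single characters are tried first.
shortest_first = re.compile(r'[=<>!&]|==|!=|>=|<=|&&|\|\|')

text = 'a==b&&c>=d'
print(longest_first.findall(text))   # ['==', '&&', '>=']
print(shortest_first.findall(text))  # ['=', '=', '&', '&', '>', '=']
```

With the single characters first, every operator is broken into one-character pieces, which is the situation the token/literal split in the class avoids.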

In [30]:
lexer = Scanner()
sentences = data['void_functions'].values
pass_or_not = []

for index, sentence in enumerate(sentences):
    print('-' * 80, "{} Lexically Testing sentence: '{}'".format(index, sentence), '-' * 80, sep='\n')
    all_token_pass = True  # reset for every sentence
    for token in lexer.tokenize(sentence):
        print(" type = '{}', value = '{}'".format(token.type, token.value))
        # The error rule marks unknown characters with the type 'Illegal'
        if token.type == 'Illegal':
            all_token_pass = False

    pass_or_not.append('Pass' if all_token_pass else 'FAIL')

# Store the verdict for each sentence next to the sentence itself
data['Test'] = pass_or_not

--------------------------------------------------------------------------------
0 Lexically Testing sentence: 'void test(){a=0; b=0; scanf("%d %d", a,b); c = 3 + 2 + a; d = 5 + b;}'
--------------------------------------------------------------------------------
 type = 'VOID', value = 'void'
 type = 'ID', value = 'test'
 type = '(', value = '('
 type = ')', value = ')'
 type = '{', value = '{'
 type = 'ID', value = 'a'
 type = '=', value = '='
 type = 'CNUM', value = '0'
 type = ';', value = ';'
 type = 'ID', value = 'b'
 type = '=', value = '='
 type = 'CNUM', value = '0'
 type = ';', value = ';'
 type = 'SCANF', value = 'scanf'
 type = '(', value = '('
 type = 'CSTRING', value = '"%d %d"'
 type = ',', value = ','
 type = 'ID', value = 'a'
 type = ',', value = ','
 type = 'ID', value = 'b'
 type = ')', value = ')'
 type = ';', value = ';'
 type = 'ID', value = 'c'
 type = '=', value = '='
 type = 'CNUM', value = '3'
 type = '+', value = '+'
 type = 'CNUM', value = '2'
 type = '+', v