Saurabh702/C_Interpreter

C_Interpreter

Porting an interpreter from Python to C (blog: https://ruslanspivak.com/lsbasi-part1/)

This is an interpreter project. The reference project can be found at the link above.

Our aim is to port the interpreter in the above project from Python to C and add some new features. Unlike Python, C has no classes; related data is grouped in structures (structs), and behaviour lives in separate functions. We therefore had to create some custom header files of our own. The important files in this repository are listed below.

  • calc1.c - The main code which will be updated in the master branch
  • str.h - The header file containing the custom structures and the declarations of some shared functions
  • str_fnt.c - The file containing the implementation of the functions declared in the "str.h" header file
  • UNIT_TESTS - A simple text file containing a list of unit test cases
  • makefile - Builds the interpreter
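
As a rough illustration of how a Python class can map onto C, the sketch below uses a struct for the data and a free function for what would be a method. The type, field, and function names are hypothetical; the actual definitions in str.h may look different.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token kinds; str.h may name or encode these differently. */
typedef enum { TOK_INTEGER, TOK_PLUS, TOK_MINUS, TOK_MUL, TOK_DIV, TOK_EOF } TokenType;

/* Stands in for a Python Token class: plain data, no methods. */
typedef struct {
    TokenType type;
    int value;              /* meaningful only when type == TOK_INTEGER */
} Token;

/* Stands in for a Python Interpreter class. */
typedef struct {
    const char *text;       /* the input expression */
    size_t pos;             /* index of the next unread character */
    Token current_token;    /* the token currently being examined */
} Interpreter;

/* What would be __init__ in Python becomes a function taking the struct by pointer. */
void interpreter_init(Interpreter *it, const char *text) {
    assert(it != NULL && text != NULL);
    it->text = text;
    it->pos = 0;
    it->current_token.type = TOK_EOF;
    it->current_token.value = 0;
}
```

A caller would declare an Interpreter variable and pass its address to interpreter_init() before reading any tokens.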

The compiler used is GCC, the GNU Compiler Collection (originally the GNU C Compiler). It is a popular open-source compiler for C and an alternative to Clang.

The command used to compile is:
gcc <filename> <additional_files> <options>

The command used to compile the interpreter is:

gcc calc1.c str_fnt.c -o calc1 -g -gdwarf

where:
    -o names the output file
    -g produces debugging information
    -gdwarf selects DWARF as the format of the debugging information
    The debugging information is used only by debuggers such as GDB. It allows finer control over, and a clearer view of, the code while debugging.

Concept and Theory

This implementation processes the input token by token.

Take this example: 1 + 1

Step 1: Convert into Tokens

Here the first 1 is one token, + is the second token, and the final 1 is the last token.
    Token: 1, Token: +, Token: 1

The code also checks whether more than one token is present in the input.

  • If the input is a single number, the output is the same as the input

  • If the input is an unrecognised character, the output is a parse error

  • If the input is two integers separated by a space with no operator between them, the output is a parse error

  • All spaces are ignored, so spaces between tokens don't matter
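
The tokenizing step above can be sketched as follows. This is a minimal illustration, not the code in calc1.c: the Lexer struct, the TOK_* names, and the signature of get_next_token() are assumptions, and integer overflow is not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds; TOK_ERROR marks a character the lexer does not know. */
typedef enum { TOK_INTEGER, TOK_PLUS, TOK_MINUS, TOK_MUL, TOK_DIV, TOK_EOF, TOK_ERROR } TokenType;

typedef struct { TokenType type; int value; } Token;

typedef struct { const char *text; size_t pos; } Lexer;

/* Returns the next token and advances lx->pos past it. */
Token get_next_token(Lexer *lx) {
    assert(lx != NULL && lx->text != NULL);
    Token tok = { TOK_EOF, 0 };

    /* Spaces carry no meaning, so skip them. */
    while (isspace((unsigned char)lx->text[lx->pos])) lx->pos++;

    char c = lx->text[lx->pos];
    if (c == '\0') return tok;               /* end of input */

    if (isdigit((unsigned char)c)) {         /* gather a multi-digit integer */
        tok.type = TOK_INTEGER;
        while (isdigit((unsigned char)lx->text[lx->pos])) {
            tok.value = tok.value * 10 + (lx->text[lx->pos] - '0');
            lx->pos++;
        }
        return tok;
    }

    lx->pos++;                               /* consume the single character */
    switch (c) {
        case '+': tok.type = TOK_PLUS;  break;
        case '-': tok.type = TOK_MINUS; break;
        case '*': tok.type = TOK_MUL;   break;
        case '/': tok.type = TOK_DIV;   break;
        default:  tok.type = TOK_ERROR; break;   /* caller reports a parse error */
    }
    return tok;
}
```

Calling get_next_token() repeatedly on "1 + 1" yields an integer token, a plus token, another integer token, and then the end-of-input token.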

Step 2: Parse Tokens

Once the tokens have been obtained, they are parsed. The parse() function compares current_token against an expected token type. If the types match, parse() advances to the next token; otherwise it reports the failure by calling the parse_error() function.

In the above example, parse(1,"INTEGER") checks whether 1 is an integer. parse() is reached through the interpret() function, which takes one argument: the interpreter struct variable that processes the input expression. On a match, parse() calls get_next_token() to move the interpreter variable on to the next token, which is '+' here.
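
The check-then-advance behaviour of parse() might look roughly like the sketch below. The function names follow the text, but their signatures are assumed, the error flag is hypothetical, and a pre-built token array stands in for the real lexer to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Token types are strings here to mirror the parse(1,"INTEGER") example. */
typedef struct { const char *type; int value; } Token;

typedef struct {
    const Token *tokens;    /* pre-lexed input ending in an "EOF" token (stand-in for the lexer) */
    size_t next;            /* index of the token get_next_token() will hand out */
    Token current_token;
    int error;              /* hypothetical flag set by parse_error() */
} Interpreter;

/* Loads the next token into current_token; never reads past "EOF". */
void get_next_token(Interpreter *it) {
    it->current_token = it->tokens[it->next];
    if (strcmp(it->current_token.type, "EOF") != 0) it->next++;
}

/* The real project may print a message or exit; this sketch only records the error. */
void parse_error(Interpreter *it) { it->error = 1; }

/* Returns 1 and advances if the current token has the expected type, else flags an error. */
int parse(Interpreter *it, const char *expected_type) {
    assert(it != NULL && expected_type != NULL);
    if (strcmp(it->current_token.type, expected_type) == 0) {
        get_next_token(it);
        return 1;
    }
    parse_error(it);
    return 0;
}
```

For the tokens of 1 + 1, a first call to parse() with "INTEGER" succeeds and leaves the plus token as the current token; asking for "INTEGER" again at that point sets the error flag.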

Step 3: Evaluate Expressions

Once all the tokens have been generated, the expression is evaluated. Before evaluation, each token is converted to its appropriate type. Every token except the operator is stored in a variable. The operator is then compared against the supported operators, and the operands are combined according to the operator found.

In the above example, the '+' token, which is a character, is compared with + to find out whether the operation to perform is addition. If it is, the operand tokens are added and the result is the output.
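
The operator comparison described above can be sketched as a switch on the operator character. The function name apply_operator() and the ok out-parameter are assumptions made for this illustration, as is the choice to treat division by zero as an error.

```c
#include <assert.h>
#include <stddef.h>

/* Applies op to the two operands. *ok is set to 0 for an unknown
 * operator or a division by zero, and to 1 otherwise. */
int apply_operator(char op, int left, int right, int *ok) {
    assert(ok != NULL);
    *ok = 1;
    switch (op) {
        case '+': return left + right;
        case '-': return left - right;
        case '*': return left * right;
        case '/':
            if (right == 0) { *ok = 0; return 0; }
            return left / right;     /* integer division, truncating toward zero */
        default:
            *ok = 0;                 /* not an operator this sketch knows about */
            return 0;
    }
}
```

With the operands of 1 + 1 stored as integers, apply_operator('+', 1, 1, &ok) produces 2.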

Features

  • Ignores spaces while tokenizing
  • Multiple operations on the same text line
  • Checking errors in input
  • Operator precedence ( * is given higher priority than + )
  • Creation of custom tokens
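
One common way to get the precedence behaviour listed above is a small recursive-descent evaluator in which multiplication and division are handled one level below addition and subtraction. The sketch below shows that technique under assumed names (Parser, factor, term, expr, evaluate); it is not necessarily how calc1.c is organised, and it works directly on characters rather than on token structs to stay short.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef struct { const char *p; int error; } Parser;

void skip_spaces(Parser *ps) {
    while (isspace((unsigned char)*ps->p)) ps->p++;
}

/* factor: a non-negative integer literal */
int factor(Parser *ps) {
    skip_spaces(ps);
    if (!isdigit((unsigned char)*ps->p)) { ps->error = 1; return 0; }
    int value = 0;
    while (isdigit((unsigned char)*ps->p)) {
        value = value * 10 + (*ps->p - '0');
        ps->p++;
    }
    return value;
}

/* term: factor (('*' | '/') factor)*  -- evaluated before + and - */
int term(Parser *ps) {
    int result = factor(ps);
    for (;;) {
        skip_spaces(ps);
        char op = *ps->p;
        if (ps->error || (op != '*' && op != '/')) return result;
        ps->p++;
        int rhs = factor(ps);
        if (ps->error) return 0;
        if (op == '*') result *= rhs;
        else if (rhs == 0) { ps->error = 1; return 0; }
        else result /= rhs;
    }
}

/* expr: term (('+' | '-') term)* */
int expr(Parser *ps) {
    int result = term(ps);
    for (;;) {
        skip_spaces(ps);
        char op = *ps->p;
        if (ps->error || (op != '+' && op != '-')) return result;
        ps->p++;
        int rhs = term(ps);
        if (ps->error) return 0;
        result = (op == '+') ? result + rhs : result - rhs;
    }
}

/* Evaluates text; *ok is 0 on a parse error or leftover input. */
int evaluate(const char *text, int *ok) {
    assert(text != NULL && ok != NULL);
    Parser ps = { text, 0 };
    int result = expr(&ps);
    skip_spaces(&ps);
    *ok = !ps.error && *ps.p == '\0';
    return *ok ? result : 0;
}
```

Because expr() only ever combines whole terms, an input such as 2 + 3 * 4 multiplies first and then adds. Input with leftover characters, such as two integers with no operator between them, is reported through ok.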

To do

  • Add support for parentheses
  • Add support for a stack

Sample Outputs

  1. calc> 1+1
    2
  2. calc> 1 + 1 + 3 + 4
    9
  3. calc> 1*100/100 - 100 + 1
    -98

NOTE : For more info about this project, click this