In [1]:
#This notebook contains a hybrid between C++ and cuppa5. Since yacc is the only parser generator I am familiar with,
#I used yacc for this project. I briefly tried other parsers, but had trouble implementing them.
#I have a particular love for C, and thought the challenge would be fun.
#The implementation was not very successful. I hit multiple roadblocks with syntax errors in the parser.
#I restarted several times over a couple of weeks before finally using cuppa5 as a template.
#I thought of my language as an optimized version of C++, keeping the best features and discarding the rest.
#I didn't quite reach that goal, but if I had, I think I would prefer my language to C or Python.

#In the end I made more of a hybrid between cuppa5 and C++, as I opted to leave out specific features I didn't like.
#One of the primary aspects of the C language that I intentionally left out was the LL(1) requirement.
#A typical C program runs through function declarations, main, then function bodies.
#The LR(1) parser reads the function bodies first, then the main program, then the function declarations.
#It seemed like a waste to create a language that was restricted to declarations and a main function.
#The grammar would have been relatively simple to adjust; I would only have needed to change a few things.
#First, the grammar rules:

#    program : stmt_list
#    stmt_list : stmt stmt_list
#              | empty
#    stmt : VOID_TYPE ID '(' opt_formal_args ')' stmt
#         | data_type ID '(' opt_formal_args ')' stmt
#         | data_type ID opt_init ';'

#Would change to :

#    program : declaration
#    declaration : VOID_TYPE ID '(' opt_formal_args ')' ';' declaration
#                | data_type ID '(' opt_formal_args ')' ';' declaration
#                | data_type ID opt_init ';' declaration
#                | INTEGER_TYPE MAIN '(' opt_formal_args ')' '{' stmt_list '}' opt_func
#    opt_func : VOID_TYPE ID '(' opt_formal_args ')' stmt opt_func
#             | data_type ID '(' opt_formal_args ')' stmt opt_func
#             | empty
#    stmt : data_type ID opt_init ';'
#    ...

#Second, interp_walk would include a pre-pass that checks that every function declaration
#has a matching function body after the main program. It would move those function bodies up to the start
#of the AST in place of the declarations. Then it would turn the main args into variable declarations and
#treat the main function as a block statement. Once done, the actual interp_walk would just read the bodies
#and establish them at global scope. Overall this would have been easy to implement, just tedious, so I chose to tackle
#two more difficult challenges instead.
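
#The pre-pass described above could be sketched roughly as follows. This is only an illustration, not code from
#cinterp_walk: the node tags 'proto', 'main', 'fundecl', and 'block', the tuple layouts, and the name
#hoist_function_bodies are all hypothetical, and the sketch does not verify that the bodies appear after main.

```python
# Hypothetical node shapes (assumptions, not the real cfrontend tuples):
#   ('proto', name, args)           a function declaration ending in ';'
#   ('main', args, body)            the main function
#   ('fundecl', name, args, body)   a function body
#   ('decl', name, type, init)      a variable declaration
# args is a list of (type, name) pairs.

def hoist_function_bodies(nodes):
    """Return a new top-level node list with bodies in place of declarations."""
    protos = {n[1]: n for n in nodes if n[0] == 'proto'}
    bodies = {n[1]: n for n in nodes if n[0] == 'fundecl'}

    # Every declaration needs a matching body, and vice versa.
    for name, proto in protos.items():
        if name not in bodies:
            raise ValueError(f"no body for declared function {name!r}")
        if bodies[name][2] != proto[2]:
            raise ValueError(f"arguments of {name!r} differ from its declaration")
    for name in bodies:
        if name not in protos:
            raise ValueError(f"function {name!r} has a body but no declaration")

    hoisted = []
    for node in nodes:
        kind = node[0]
        if kind == 'proto':
            # Put the body where the declaration used to be.
            hoisted.append(bodies[node[1]])
        elif kind == 'main':
            # main's args become plain declarations, its body a block statement.
            for arg_type, arg_name in node[1]:
                hoisted.append(('decl', arg_name, arg_type, ('nil',)))
            hoisted.append(('block', node[2]))
        elif kind != 'fundecl':
            # Global declarations and other statements keep their position.
            hoisted.append(node)
    return hoisted
```

#After such a pass the walker would only ever see function bodies, declarations, and one block, so the
#existing top-level handling could stay as it is.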

#The big issue I faced was the combination of references and increment/decrement operators. This is where I really ran into
#trouble and why I had to restart multiple times. I didn't realize the impact each change had on the rest of the
#program. I kept making several changes at a time to the grammar, frontend, and lexer. I would reach a point similar to where I
#had been before, try to add a few new rules, and cause a parser error that I didn't understand.
#It took me a long time, but I eventually learned from my mistake. I added pre-increment, pre-decrement, post-increment,
#and post-decrement to my compiler one rule at a time. I had to include a new rule called an operable to keep the
#prefixes and postfixes separate from the actual storable variables.

#Post-decrement and post-increment were both much harder to implement than I expected. They required adding new return
#values, but only to specific tuples. Any statement besides a function declaration can contain a suffix, so they had to be
#added throughout the program. Additionally, I had to create a function to analyze the suffix return value because multiple
#storables can be in the same statement.
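
#One way to picture the suffix handling is the sketch below. It is not the code in cinterp_walk. The 'oper' node
#with an (id ...) child and a (nil) suffix matches the AST dumps in this notebook, but the suffix tags 'postinc' and
#'postdec', the pending list, and all function names are assumptions made for illustration.

```python
def eval_operable(node, env, pending):
    """Evaluate ('oper', ('id', name), suffix) and queue any suffix effect."""
    _, (_, name), suffix = node
    value = env[name]                # a postfix operator yields the old value
    if suffix[0] == 'postinc':
        pending.append((name, +1))
    elif suffix[0] == 'postdec':
        pending.append((name, -1))
    return value

def apply_pending(env, pending):
    """Apply the queued suffix effects once the statement has been evaluated."""
    for name, delta in pending:
        env[name] += delta
    pending.clear()

def exec_print(node, env, out):
    """Execute ('print', operable), collecting the printed value in out."""
    pending = []                     # one list per statement, shared by all storables in it
    out.append(eval_operable(node[1], env, pending))
    apply_pending(env, pending)

env, out = {'a': 0}, []
exec_print(('print', ('oper', ('id', 'a'), ('postinc',))), env, out)  # printf(a++);
exec_print(('print', ('oper', ('id', 'a'), ('postdec',))), env, out)  # printf(a--);
exec_print(('print', ('oper', ('id', 'a'), ('nil',))), env, out)      # printf(a);
print(out)  # [0, 1, 0]
```

#Queuing the effects per statement is one possible design. It lets several storables carry a suffix in the same
#statement without any of them seeing a half-updated value.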

#My architecture contains char, integer, double, and string value types. Each type can be an array type or a function type.
#I never finished adding pointers to my language. They are present in the grammar, frontend, and interp_walk.
#However, I spent all of my time trying to figure out the pre/post inc/dec operators and never added the pointer hierarchy
#to the symbol table or type promotion. Because of this, my program is in a strange state where it accepts pointers
#but ignores them. However, something finally clicked at the very end that allowed me to finish the suffix operators.
#With another day I have no doubt I would have finished the pointers.
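
#For readers unfamiliar with type promotion, a minimal sketch follows. The ordering char < integer < double < string
#is an assumption made for this example, not the table used in my implementation, and the names RANK, promote, and
#coerce are hypothetical. Pointers are deliberately absent, since they never made it into the promotion rules.

```python
# Assumed promotion order, lowest to highest (illustrative only).
RANK = {'char': 0, 'integer': 1, 'double': 2, 'string': 3}

def promote(type_a, type_b):
    """Return whichever of the two types ranks higher in the assumed order."""
    return type_a if RANK[type_a] >= RANK[type_b] else type_b

def coerce(value, target_type):
    """Convert a numeric value to the target numeric type."""
    if target_type == 'double':
        return float(value)
    if target_type == 'integer':
        return int(value)
    raise ValueError(f"no numeric conversion to {target_type!r} in this sketch")

# 'double b = 1;' stores an integer literal in a double variable.
result_type = promote('integer', 'double')
print(result_type, coerce(1, result_type))  # double 1.0
```

#This mirrors the behaviour seen in the function test, where a double initialised with 1 prints as 1.0.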

#Below are a bunch of examples demonstrating all my language rules.

In [2]:
from clex import lexer
from cfrontend import parser
from cstate import state
from cinterp_walk import walk
from grammar_stuff import dump_AST

In [3]:



program = \
'''
char b = 'p';
printf( b );
'''
print("")
print("COMPARISON TEST")
state.initialize()
parser.parse(program, lexer=lexer)
dump_AST(state.AST)
walk(state.AST)


COMPARISON TEST

(seq 
  |(decl b 
  |  |(char) 
  |  |(val 
  |  |  |(char) p)) 
  |(seq 
  |  |(print 
  |  |  |(oper 
  |  |  |  |(id b) 
  |  |  |  |(nil))) 
  |  |(nil)))
p


In [4]:
program = \
'''
void print1(double b) { 
    printf(b);
}
int print2(int a) {
    printf(a);
    return a;
}
double print3(double b) {
    printf(b);
    return b;
}
double b = 1;
int a = 1;
print1(b);
a = print2(a);
b = print3(b);
'''
print("FUNCTION TEST")
state.initialize()
parser.parse(program, lexer=lexer)
#dump_AST(state.AST)
walk(state.AST)


program = \
'''
double b = 1;
int a = 1;
while(a == b) { 
    printf("while");
    a = a + a;
}
a = 1;
b = a *b;
b = a/ b;
b = a-b;
b = 1;

if (a + b == 2) {
    printf("eq");
}
if (a + b <= 2) {
    printf("le");
}
if (a + b >= 2) {
    printf("ge");
}
if (a + b != 1) {
    printf("ne");
}
if (a + b > 1) {
printf("gt");
}
if (a + b < 4) {printf("lt");}
if (a && b) printf("and");
if (a || b) {
    printf("or");
}
b = 10;
for ( a = 0; a<b; a+1) {
printf(a);
}
a+=a;
printf(a);
a *= a;
a/= a;
printf(a);
a -= a;
printf(a);
printf(a++);
printf(a--);
printf(a);
'''
print("")
print("COMPARISON TEST")
state.initialize()
parser.parse(program, lexer=lexer)
#dump_AST(state.AST)
walk(state.AST)

FUNCTION TEST
1.0
1
1.0

COMPARISON TEST
while
eq
le
ge
ne
gt
lt
and
or
0
1
2
3
4
5
6
7
8
9
20
1
0
0
1
0



In [None]:
#I was going through some family troubles during this project, and I wish I had been able to give it more time.
#Now that I finally understand the structure of the tuples and how to effectively manipulate them,
#I believe I could finish the pointer architecture with no trouble. In fact, I think it would have been rather fun.
#The pointers were the part of C++ that I really wanted to implement. However, I also wanted to save them for last,
#and I kept on finding more unique rules and aspects of C++ that I wanted to add.
#In the end though, I am satisfied because I solved the suffix problem that was causing me endless headaches.