# Lesson 3 - Programming Language Specification
Let's use what we have learned so far to specify a programming language of our very own: Wowza. Wowza is a simple, imperative language. In this lesson, we will specify the lexical and syntax rules of the Wowza language, but first, here's a sample Wowza program:

```
/*
  My Recursive Fibonacci Program
*/

func num fib(num n) {
  if (n < 2) { return 0; }
  if (n == 2) { return 1; }
  return fib(n-2) + fib(n-1);
}

num n = 10;
print("The first " + n + " numbers in the Fibonacci sequence:");
for i from 1 to n {
  print(fib(i));
}
```

Pretty neat, right? Let's get into laying out the rules that we will use in later chapters as we build our own Wowza interpreter.

## Lexical Rules
Lexical rules focus on mapping lexemes to their respective tokens. We need to create rules that state: if a lexeme contains these characters, then it must be this token. A good notation to express these rules is [regular expressions](https://en.wikipedia.org/wiki/Regular_expression).

A great resource for practicing and testing regular expressions: https://regexr.com/.

First, let's specify the lexical rule for each operator:
```
assign_op: r"="
add_op: r"\+"
sub_op: r"-"
mul_op: r"\*"
div_op: r"/"
eq_op: r"=="
neq_op: r"!="
lt_op: r"<"
lte_op: r"<="
gt_op: r">"
gte_op: r">="
```
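
Some of these operators are prefixes of others (`=` and `==`, `<` and `<=`), so the order in which rules are tried matters when scanning source text. Here is a minimal Python sketch, assuming a scanner that tries the rules in order at a given position. The names `OPERATOR_RULES` and `match_operator` are hypothetical and not part of Wowza; `+` and `*` are escaped because they are regex metacharacters.

```python
import re

# Operator rules from this lesson. Two-character operators come first so that
# "==" is not read as two separate "=" tokens.
OPERATOR_RULES = [
    ("eq_op", r"=="),
    ("neq_op", r"!="),
    ("lte_op", r"<="),
    ("gte_op", r">="),
    ("assign_op", r"="),
    ("add_op", r"\+"),
    ("sub_op", r"-"),
    ("mul_op", r"\*"),
    ("div_op", r"/"),
    ("lt_op", r"<"),
    ("gt_op", r">"),
]

def match_operator(text, pos):
    """Return (token_name, lexeme) for the operator starting at text[pos], or None."""
    for name, pattern in OPERATOR_RULES:
        found = re.compile(pattern).match(text, pos)
        if found:
            return name, found.group()
    return None

print(match_operator("a == b", 2))  # ('eq_op', '==')
print(match_operator("x = 1", 2))   # ('assign_op', '=')
```

If `assign_op` were listed before `eq_op`, the first call would return `('assign_op', '=')` instead, which is why the longer patterns are tried first in this sketch.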

Next, let's specify the rules for some of the special characters that we will encounter:
```
begin_block: r"\{"
end_block: r"\}"
begin_paren: r"\("
end_paren: r"\)"
comma_sep: r","
end_stmt: r";"
```

Now, let's specify the rules for **keywords**. Each keyword is a short string of only lowercase letters that also appears in the keyword list, so the corresponding regular expression and list are:
```
keyword: r"^[a-z]+$"
kw_list = [
    bool, elif, else, false, for,
    from, func, if, num, print,
    return, string, to, true, void,
    while
]
```

We now need to specify the rules for **identifiers**, which consist of letters, digits, and underscores, and must begin with a letter or an underscore:
```
id: r"^[A-Za-z_][A-Za-z0-9_]*$"
```
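
Every keyword also matches the identifier pattern, so a lexer has to check the keyword list before falling back to `id`. Here is a minimal Python sketch of that check. The function name `classify_word` and the `<name>_kw` token naming are assumptions based on the token names used in the syntax rules; `return` is included because those rules refer to `<return_kw>`.

```python
import re

# Keyword list from this lesson (plus "return", which the syntax rules use).
KEYWORDS = {
    "bool", "elif", "else", "false", "for", "from", "func", "if",
    "num", "print", "return", "string", "to", "true", "void", "while",
}

ID_PATTERN = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def classify_word(lexeme):
    """Return '<word>_kw' for keywords, 'id' for identifiers, else None."""
    if lexeme in KEYWORDS:          # keywords take priority over identifiers
        return lexeme + "_kw"
    if ID_PATTERN.match(lexeme):
        return "id"
    return None

print(classify_word("while"))    # while_kw
print(classify_word("my_name"))  # id
print(classify_word("2fast"))    # None
```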

The rules for a **string literal** are straightforward. Any character, except `"`, can be placed between two quotation marks:
```
string_lit: r"^\"[^\"]*\"$"
```

Finally, we need to define the **number literal**. Simply put, a number literal can be written as any of the following: `2`, `0.0`, `-23.023`, etc. Its corresponding regular expression is this:
```
num_lit: r"^(-)?(([1-9][0-9]*(\.[0-9]+)?)|(0\.[0-9]+)|0)$"
```
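
To see what the two literal patterns accept and reject, here is a small Python sketch that applies them as written. The helper names `is_num_lit` and `is_string_lit` are hypothetical.

```python
import re

NUM_LIT = re.compile(r"^(-)?(([1-9][0-9]*(\.[0-9]+)?)|(0\.[0-9]+)|0)$")
STRING_LIT = re.compile(r"^\"[^\"]*\"$")

def is_num_lit(lexeme):
    """True if the lexeme matches the num_lit rule."""
    return NUM_LIT.match(lexeme) is not None

def is_string_lit(lexeme):
    """True if the lexeme matches the string_lit rule."""
    return STRING_LIT.match(lexeme) is not None

for lexeme in ["2", "0.0", "-23.023", "007", "1.", ".5"]:
    print(lexeme, is_num_lit(lexeme))
# The first three are accepted; "007", "1." and ".5" are rejected.

print(is_string_lit('"WOW"'))   # True
print(is_string_lit('"a"b"'))   # False: a quote inside the literal
```

The pattern rejects leading zeros (`007`) and requires digits on both sides of the decimal point.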

## Syntax Rules
Syntax rules focus on the structure and arrangement of tokens. We need to create rules that state: this component A can be evaluated as a combination of these components (B, C, and D). A good notation to express these rules is [Backus-Naur Form (BNF)](https://en.wikipedia.org/wiki/Backus%E2%80%93Naur_form). Let's now go through the different parts of a program and how we can represent them in BNF.

### Statement List: &lt;stmt_list&gt;
One of the most fundamental components in the Wowza language is the statement list. As you can guess, the simple Wowza language structure is:
```
stmt1();
stmt2();
stmt3();
...
```
A statement list is a sequence of statements, separated by semicolons. We can represent this rule in BNF in this way:

```
<stmt_list> ::= <stmt> |
                <stmt> <end_stmt> |
                <stmt> <end_stmt> <stmt_list>
```
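The `|` character signifies "or". That is to say, a statement list could be just a `<stmt>` (one statement), a `<stmt> <end_stmt>` (one statement followed by a semicolon), or a `<stmt> <end_stmt>` followed by another `<stmt_list>`. The last alternative is recursive, which is what allows a list to contain any number of statements.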
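
The three alternatives translate almost directly into a recursive check. The Python sketch below is a simplification: it assumes each statement has already been reduced to the placeholder token name `"stmt"`, and the function name `is_stmt_list` is hypothetical.

```python
def is_stmt_list(tokens):
    """Check a list of token names against the three <stmt_list> alternatives."""
    if not tokens or tokens[0] != "stmt":
        return False
    rest = tokens[1:]
    if not rest:                      # <stmt>
        return True
    if rest[0] != "end_stmt":
        return False
    rest = rest[1:]
    if not rest:                      # <stmt> <end_stmt>
        return True
    return is_stmt_list(rest)         # <stmt> <end_stmt> <stmt_list>

print(is_stmt_list(["stmt", "end_stmt", "stmt", "end_stmt"]))  # True
print(is_stmt_list(["stmt", "stmt"]))                          # False: no semicolon between
```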

### Statement Block: &lt;stmt_block&gt;
Suppose we write a conditional statement in Wowza:
```
if thisIsTrue {
    doThis();
}
```
We need a way to express all the statements between the `{}`. We can easily do this by using the `stmt_list` that has already been defined:
```
<stmt_block> ::= <begin_block> <stmt_list> <end_block>
```
Looking up at our lexical rules, we can see that a `begin_block` token refers to a `{`, while an `end_block` token refers to a `}`.

### Statement: &lt;stmt&gt;
Every Wowza program is a sequence of statements, similar to many other imperative languages. A statement can be a variable declaration, assignment, conditional statement, for loop, while loop, function definition, function call, or return statement. We can easily define this in BNF as:
```
<stmt> ::= <declare_stmt> |
           <assign_stmt> |
           <cond_stmt> |
           <for_loop> |
           <while_loop> |
           <funcdef_stmt> |
           <func_stmt> |
           <ret_stmt>
```

### Declaration Statement: &lt;declare_stmt&gt;
Let's look at a few variable declarations in Wowza:
```
num number;
string my_name;
bool isThisTrue;
```
So, there are two main components: `type` and `identifier`. A type can only be one of the following keywords: `num`, `string`, `bool`, or `void`. We can express this in BNF as:
```
<declare_stmt> ::= <type> <id>

<type> ::= <num_kw> | 
           <string_kw> | 
           <bool_kw> | 
           <void_kw>
```

### Assignment Statement: &lt;assign_stmt&gt;
Now, let's see a couple of Wowza assignment statements:
```
number = 2;
string wow = "WOW";
```
Notice in the second statement, we are also declaring `wow`. We can express both of these ideas in BNF like this:
```
<assign_stmt> ::= <id> <assign_op> <expr> |
                  <declare_stmt> <assign_op> <expr>
```

### Expressions: &lt;expr&gt;
First, let's look at some expressions in Wowza:
```
// Numerical Expressions
a + 2
5 - n
4 * (n / 2)

// String Expressions
wow + "hello"

// Boolean Expressions
isTrue == isFalse
n < 4
n >= doubleNum(7)

```

Now, let's define the top-level `expr` rule and the `term` rule:
```
<expr> ::= <term> |
           <op_expr> |
           <bool_expr> |
           <begin_paren> <expr> <end_paren>

<term> ::= <id> |
           <num_lit> |
           <string_lit> |
           <bool_lit> |
           <func_stmt>
           
<bool_lit> ::= <true_kw> |
               <false_kw>
```


An `expr` can be a single `term`, an `op_expr` (operator expression), a `bool_expr` (boolean expression), or a parenthesized `expr`. So let's express all the operators (arithmetic and boolean) that can be used in these expressions:

```
<op_expr> ::= <term> <add_op> <expr> |
              <term> <sub_op> <expr> |
              <term> <mul_op> <expr> |
              <term> <div_op> <expr>

<bool_expr> ::= <term> <eq_op> <expr> |
                <term> <neq_op> <expr> |
                <term> <lt_op> <expr> |
                <term> <lte_op> <expr> |
                <term> <gt_op> <expr> |
                <term> <gte_op> <expr>
```
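
To see how these rules behave, here is a minimal recursive-descent sketch in Python. It assumes the lexer has already produced `(token_name, lexeme)` pairs, leaves out function calls as terms for brevity, and returns nested tuples of the form `(operator, left, right)`. The name `parse_expr` and the tuple shape are assumptions made for illustration.

```python
BINARY_OPS = {"add_op", "sub_op", "mul_op", "div_op",
              "eq_op", "neq_op", "lt_op", "lte_op", "gt_op", "gte_op"}
TERM_KINDS = {"id", "num_lit", "string_lit", "true_kw", "false_kw"}

def parse_expr(tokens, pos=0):
    """Parse <expr> starting at tokens[pos]; return (tree, next_pos)."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    kind, text = tokens[pos]
    if kind == "begin_paren":              # <begin_paren> <expr> <end_paren>
        inner, pos = parse_expr(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos][0] != "end_paren":
            raise SyntaxError("expected ')'")
        return inner, pos + 1
    if kind not in TERM_KINDS:
        raise SyntaxError(f"unexpected token {text!r}")
    pos += 1
    if pos < len(tokens) and tokens[pos][0] in BINARY_OPS:
        op = tokens[pos][1]                # <term> <operator> <expr>
        right, pos = parse_expr(tokens, pos + 1)
        return (op, text, right), pos
    return text, pos                       # <term>

# Tokens for: 5 - n - 1
tokens = [("num_lit", "5"), ("sub_op", "-"), ("id", "n"),
          ("sub_op", "-"), ("num_lit", "1")]
print(parse_expr(tokens)[0])  # ('-', '5', ('-', 'n', '1'))
```

Because every operator rule has the form `<term> <operator> <expr>`, chains of operators group to the right, and the grammar itself assigns no precedence between operators. Parentheses are the way to force a different grouping under these rules.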

### Conditional Statement: &lt;cond_stmt&gt;
As in most programming languages, Wowza expresses conditions in an `if...elif...else` manner. All of the following conditional statements are valid:
```
// Single-line
if d {print("d is true!!!")}

// No else
if d == "wow" {
    print(d);
} elif d != "wow" {
    print("not good");
}

// With elif and else
if n < 3 {
    print("n is less than")
} elif n > 3 {
    print("n is greater than")
} else {
    print("n is equal")
}
```

We need a clever way to make sure that `elif`s and `else`s are optional, and that there can be multiple recurring `elif`s. We actually can represent this rule in BNF using three components: `cond_stmt`, `if_stmt`, and `else_stmt`. Here are the rules:
```
<cond_stmt> ::= <if_stmt> |
                <if_stmt> <else_stmt>

<if_stmt> ::= <if_kw> <expr> <stmt_block>

<else_stmt> ::= <else_kw> <stmt_block> |
                <elif_kw> <expr> <stmt_block> |
                <elif_kw> <expr> <stmt_block> <else_stmt>
```
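
The recursion in `else_stmt` can be checked with a small Python sketch. It is a simplification that assumes each condition and block has already been reduced to the placeholder token names `"expr"` and `"stmt_block"`; the function names are hypothetical.

```python
def match_else_stmt(tokens):
    """Check token names against the three <else_stmt> alternatives."""
    if tokens == ["else_kw", "stmt_block"]:
        return True
    if tokens[:3] == ["elif_kw", "expr", "stmt_block"]:
        rest = tokens[3:]
        # Either the chain ends here, or another <else_stmt> follows.
        return not rest or match_else_stmt(rest)
    return False

def match_cond_stmt(tokens):
    """Check token names against <cond_stmt>: an <if_stmt>, optionally followed by <else_stmt>."""
    if tokens[:3] != ["if_kw", "expr", "stmt_block"]:
        return False
    rest = tokens[3:]
    return not rest or match_else_stmt(rest)

IF = ["if_kw", "expr", "stmt_block"]
ELIF = ["elif_kw", "expr", "stmt_block"]
ELSE = ["else_kw", "stmt_block"]

print(match_cond_stmt(IF + ELIF + ELIF + ELSE))  # True
print(match_cond_stmt(IF + ELSE + ELIF))         # False: else must come last
```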

Working through these rules, we can see that they meet all of Wowza's needs: a lone `if`, an `if` followed by any number of `elif`s, and an optional `else` at the end.

### Loops: &lt;for_loop&gt; & &lt;while_loop&gt;
Compared to the conditional statement rules, Wowza syntax rules for loops are very simple. Here are two example loops written in Wowza:
```
for i from 1 to 3 {
    print(i);
}

while thisIsTrue {
    print("this is still true");
}
```

The corresponding BNF notation is:
```
<for_loop> ::= <for_kw> <id> <from_kw> <term> <to_kw> <term> <stmt_block>

<while_loop> ::= <while_kw> <expr> <stmt_block>
```

### Functions: &lt;funcdef_stmt&gt;, &lt;func_stmt&gt;, & &lt;ret_stmt&gt; 
Finally, we have reached the last construct in the Wowza language: functions. In Wowza, we can define and call functions like so:
```
func num foo(num n1, num n2) {
    return n1 + n2;
}

num value = foo(2,2); // 2 + 2 = 4
```

There are three key statements that we must define: `funcdef_stmt` (function definition), `func_stmt` (function call statement), and `ret_stmt` (return statement).

Let's first work through the function definition and return statement parts. A function definition must state a return type, a function identifier, an argument list, and a statement block. Here is the BNF notation:
```
<funcdef_stmt> ::= <func_kw> <type> <id> <begin_paren> <arg_list> <end_paren> <stmt_block>

<arg_list> ::= <type> <id> |
               <type> <id> <comma_sep> <arg_list>

<ret_stmt> ::= <return_kw> <expr>
```

Next, we need to do something very similar for the function call statement. To call a function, we must state its identifier, followed by a pair of parentheses that enclose the parameter list to pass in.
```
<func_stmt> ::= <id> <begin_paren> <param_list> <end_paren>

<param_list> ::= <expr> |
                 <expr> <comma_sep> <param_list>
```
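
As a quick check of these two rules, here is a Python sketch that works on token names only. It assumes each parameter expression has already been reduced to the placeholder `"expr"`; `is_param_list` and `is_func_stmt` are hypothetical helper names.

```python
def is_param_list(tokens):
    """<param_list> ::= <expr> | <expr> <comma_sep> <param_list>"""
    if not tokens or tokens[0] != "expr":
        return False
    if len(tokens) == 1:
        return True
    return tokens[1] == "comma_sep" and is_param_list(tokens[2:])

def is_func_stmt(tokens):
    """<func_stmt> ::= <id> <begin_paren> <param_list> <end_paren>"""
    return (len(tokens) >= 4
            and tokens[0] == "id"
            and tokens[1] == "begin_paren"
            and tokens[-1] == "end_paren"
            and is_param_list(tokens[2:-1]))

# Token names for: foo(2,2)
call = ["id", "begin_paren", "expr", "comma_sep", "expr", "end_paren"]
print(is_func_stmt(call))                                  # True
print(is_func_stmt(["id", "begin_paren", "end_paren"]))    # False
```

The second call shows that, with the rules exactly as written, `param_list` needs at least one expression, so a call with empty parentheses is not accepted by this sketch.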

## Conclusion
We have now fully specified the lexical and syntax rules. In the next chapter we will look into designing and implementing a lexical analyzer and a syntax analyzer, which will get us closer to having a fully-functioning Wowza interpreter.