# Defining Grammars in Herb.jl

The program space in Herb.jl is defined using a (context-free) grammar. 
This notebook demonstrates how such a grammar can be created. 

### Setup

In [1]:
include("../src/HerbExamples.jl")
using .Herb

### Creating a simple grammar

In [2]:
g₁ = Herb.Grammars.@cfgrammar begin
    Int = 1
    Int = 2
    Int = 3
    Int = Int * Int
    Int = Int + Int
end

1: Int = 1
2: Int = 2
3: Int = 3
4: Int = Int * Int
5: Int = Int + Int


This cell contains a very simple arithmetic grammar. 
The grammar is defined using the `@cfgrammar` macro. 
This macro converts the grammar definition in the form of a Julia expression into Herb's internal grammar representation. 
Macros are expanded when the code is parsed, before it runs, so the grammar has to be written out in the source.
If you want to build or load a grammar at run time, have a look at the `Herb.Grammars.expr2cfgrammar` function.
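
To make the difference between macro time and run time concrete, the following sketch builds a grammar definition as an ordinary `Expr` value using only Base Julia. It does not call Herb; the variable names `rule_block` and `rule_exprs` are made up for illustration. An expression like this is presumably what a run-time constructor such as `expr2cfgrammar` would accept, but that is an assumption, so check the function's signature before relying on it.

```julia
# A grammar definition held as data at run time (Base Julia only, no Herb).
rule_block = quote
    Int = 1
    Int = 2
    Int = Int + Int
end

# A quote block is an Expr with head :block. Its args mix LineNumberNodes
# with the actual rules, so keep only the assignment expressions.
rule_exprs = [ex for ex in rule_block.args if ex isa Expr && ex.head == :(=)]

for (i, r) in enumerate(rule_exprs)
    println(i, ": ", r.args[1], " = ", r.args[2])
end
```

Because `rule_block` is a plain value, it can be assembled, filtered, or read from a file while the program is running, which a macro call cannot do.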

Defining every integer one by one is tedious. The `|(...)` syntax instead adds one rule for each element of a Julia iterator, such as a range:

In [3]:
g₁ = Herb.Grammars.@cfgrammar begin
    Int = |(0:9)
    Int = Int * Int
    Int = Int + Int
end

1: Int = 0
2: Int = 1
3: Int = 2
4: Int = 3
5: Int = 4
6: Int = 5
7: Int = 6
8: Int = 7
9: Int = 8
10: Int = 9
11: Int = Int * Int
12: Int = Int + Int


The same works with arrays:

In [4]:
g₁ = Herb.Grammars.@cfgrammar begin
    Int = |([0, 2, 4, 6, 8])
    Int = Int * Int
    Int = Int + Int
end

1: Int = 0
2: Int = 2
3: Int = 4
4: Int = 6
5: Int = 8
6: Int = Int * Int
7: Int = Int + Int
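
The expansion of `|(...)` shown in the two cells above can be sketched in Base Julia. This is not Herb's implementation; `expand_alternatives` is a hypothetical helper that only illustrates the idea of turning one rule with a collection on the right-hand side into one rule per element.

```julia
# Hypothetical helper (not Herb's code): expand `lhs = |(collection)`
# into one rule per element of the collection.
function expand_alternatives(rule::Expr)
    lhs, rhs = rule.args
    # `|(0:9)` parses as a call to the function `|` with a single argument.
    if rhs isa Expr && rhs.head == :call && rhs.args[1] == :| && length(rhs.args) == 2
        elems = eval(rhs.args[2])               # evaluates `0:9` or `[0, 2, 4]`
        return [Expr(:(=), lhs, e) for e in elems]
    end
    return [rule]                               # ordinary rule: leave unchanged
end

from_range = expand_alternatives(:(Int = |(0:9)))
from_array = expand_alternatives(:(Int = |([0, 2, 4, 6, 8])))
println(length(from_range), " rules from the range, ", length(from_array), " from the array")
```

Any rule that does not use the `|(...)` form is returned unchanged, which mirrors how `Int = Int * Int` stays a single rule in the grammars above.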


Variables can also be added to the grammar by just using the variable name:

In [5]:
g₁ = Herb.Grammars.@cfgrammar begin
    Int = |(0:9)
    Int = Int * Int
    Int = Int + Int
    Int = x
end

1: Int = 0
2: Int = 1
3: Int = 2
4: Int = 3
5: Int = 4
6: Int = 5
7: Int = 6
8: Int = 7
9: Int = 8
10: Int = 9
11: Int = Int * Int
12: Int = Int + Int
13: Int = x


Grammars can also work with functions. 
After all, `+` and `*` are just infix operators for Julia's identically-named functions.
You can use functions that are provided by Julia, or functions that you wrote yourself:

In [6]:
f(a) = a + 1

g₁ = Herb.Grammars.@cfgrammar begin
    Int = |(0:9)
    Int = Int * Int
    Int = Int + Int
    Int = f(Int)
    Int = x
end

1: Int = 0
2: Int = 1
3: Int = 2
4: Int = 3
5: Int = 4
6: Int = 5
7: Int = 6
8: Int = 7
9: Int = 8
10: Int = 9
11: Int = Int * Int
12: Int = Int + Int
13: Int = f(Int)
14: Int = x
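
The claim that operators are ordinary functions can be checked directly in Base Julia. The sketch below inspects the expressions rather than any Herb data structure; it redefines the same `f` as in the cell above so that it runs on its own.

```julia
f(a) = a + 1                      # same user-defined function as above

infix  = :(Int + Int)
prefix = :(+(Int, Int))

# Infix syntax is sugar: both spellings parse to the same call expression.
@show infix.head
@show infix.args
@show infix == prefix

# A user-defined function call has exactly the same shape.
@show :(f(Int)).head
@show :(f(Int)).args

# The same holds for values, not only for expressions.
@show 1 + 2 == +(1, 2)
```

Since `Int + Int` and `f(Int)` are both call expressions, a grammar rule can treat built-in operators and your own functions alike.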


### Working with grammars

If you want to implement something using these grammars, it is useful to know the functions you can use to inspect and manipulate a grammar. This section isn't necessarily complete, but it aims to give an overview of the most important functions. 

It is recommended to also read up on [Julia metaprogramming](https://docs.julialang.org/en/v1/manual/metaprogramming/) if you aren't already familiar with that.

One of the most important things about grammars is that each rule has an index associated with it:

In [7]:
g₁ = Herb.Grammars.@cfgrammar begin
    Int = |(0:9)
    Int = Int + Int
    Int = Int * Int
    Int = x
end

collect(enumerate(g₁.rules))

13-element Vector{Tuple{Int64, Any}}:
 (1, 0)
 (2, 1)
 (3, 2)
 (4, 3)
 (5, 4)
 (6, 5)
 (7, 6)
 (8, 7)
 (9, 8)
 (10, 9)
 (11, :(Int + Int))
 (12, :(Int * Int))
 (13, :x)

We can use this index to extract information from the grammar.

### isterminal

`isterminal` returns `true` if a rule is terminal, i.e. its right-hand side contains no nonterminal symbols and so cannot be expanded further. For example, rule 1 is terminal, but rule 11 is not, since it contains the nonterminal symbol `:Int`. 

In [8]:
Herb.Grammars.isterminal(g₁, 1)

true

In [9]:
Herb.Grammars.isterminal(g₁, 11)

false
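
The definition of a terminal rule can be sketched without Herb by walking the right-hand side of a rule and looking for nonterminal symbols. The helpers `mentions_nonterminal` and `is_terminal_sketch` are hypothetical and are not how Herb necessarily implements `isterminal`.

```julia
# Hypothetical helpers, not part of Herb.jl.
# Does the expression `ex` mention any symbol from `nts`?
mentions_nonterminal(ex, nts) = false                      # literals such as 0
mentions_nonterminal(ex::Symbol, nts) = ex in nts
mentions_nonterminal(ex::Expr, nts) = any(a -> mentions_nonterminal(a, nts), ex.args)

# A right-hand side is terminal when it mentions no nonterminal at all.
is_terminal_sketch(rhs, nts) = !mentions_nonterminal(rhs, nts)

nonterminal_set = Set([:Int])
println(is_terminal_sketch(0, nonterminal_set))             # rule 1:  Int = 0
println(is_terminal_sketch(:x, nonterminal_set))            # rule 13: Int = x
println(is_terminal_sketch(:(Int + Int), nonterminal_set))  # rule 11: Int = Int + Int
```

Note that the variable `x` counts as terminal here: it is a symbol, but not one that appears on the left-hand side of any rule.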

### return_type

`return_type` returns the nonterminal symbol on the left-hand side of a rule. The return type for all rules in our grammar is `:Int`.

In [10]:
Herb.Grammars.return_type(g₁, 11)

:Int

### child_types

`child_types` returns a vector with the types of a rule's nonterminal children.
If you just want to know how many children a rule has, and not necessarily which types they have, you can use `nchildren`.

In [11]:
Herb.Grammars.child_types(g₁, 11)

2-element Vector{Symbol}:
 :Int
 :Int

In [12]:
Herb.Grammars.nchildren(g₁, 11)

2
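
As a rough illustration of what these two functions compute, the sketch below collects the nonterminal symbols in a rule's right-hand side from left to right, using only Base Julia. `collect_children` is a hypothetical helper, and the left-to-right ordering is an assumption about what a sensible implementation would do, not a statement about Herb's internals.

```julia
# Hypothetical helper, not part of Herb.jl: gather every nonterminal that
# occurs in `ex`, in left-to-right order.
function collect_children(ex, nts, acc = Symbol[])
    if ex isa Symbol
        ex in nts && push!(acc, ex)
    elseif ex isa Expr
        foreach(a -> collect_children(a, nts, acc), ex.args)
    end
    return acc
end

children = collect_children(:(Int + Int), [:Int])
println(children)            # the child types of `Int = Int + Int`
println(length(children))    # the number of children
```

The number of children is simply the length of the collected vector, which is why a separate `nchildren` is a convenience rather than a different computation.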

### nonterminals

The `nonterminals` function can be used to obtain a list of all nonterminals in the grammar.

In [13]:
Herb.Grammars.nonterminals(g₁)

1-element Vector{Symbol}:
 :Int

### Adding rules

It is also possible to add rules to a grammar during execution. This can be done using the `add_rule!` function.
As with most functions in Julia that end with an exclamation mark, this function modifies its argument (the grammar).

A rule can be provided in the same syntax as is used in the grammar definition.
The rule should be of the `Expr` type, which is a built-in type for representing expressions. 
An easy way of creating an `Expr` value in Julia is to wrap the expression in parentheses and prefix it with a colon:

In [14]:
Herb.Grammars.add_rule!(g₁, :(Int = Int - Int))

1: Int = 0
2: Int = 1
3: Int = 2
4: Int = 3
5: Int = 4
6: Int = 5
7: Int = 6
8: Int = 7
9: Int = 8
10: Int = 9
11: Int = Int + Int
12: Int = Int * Int
13: Int = x
14: Int = Int - Int
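
To see what the quoted rule passed to `add_rule!` actually is, it helps to take the `Expr` apart. The sketch below uses only Base Julia and shows three equivalent ways of producing the same value; the variable names are illustrative.

```julia
rule = :(Int = Int - Int)

@show typeof(rule)
@show rule.head          # the assignment symbol
@show rule.args[1]       # left-hand side: the nonterminal
@show rule.args[2]       # right-hand side: a call expression

# The same value built explicitly from its parts ...
built = Expr(:(=), :Int, Expr(:call, :-, :Int, :Int))
# ... and parsed from a string, which is handy when rules arrive as text.
parsed = Meta.parse("Int = Int - Int")

@show rule == built
@show rule == parsed
```

The string-based form is useful when rules come from user input or a configuration file, since the quoting syntax only works for rules written directly in the source.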


### Removing rules

It is also possible to remove rules in Herb.jl, but this is a bit more involved. 
As said before, each rule has an index associated with it. 
The internal representation of programs that are in the search space defined by the grammar makes use of those indices to efficiently store programs. 
Blindly removing a rule would shift the indices of other rules, and this could mean that existing programs get a different meaning or become invalid. 

Therefore, there are two functions associated with removing rules:

- `remove_rule!` removes a rule from the grammar, but fills its place with a placeholder. Therefore, the indices stay the same, and only programs that use the removed rule become invalid.
- `cleanup_removed_rules!` removes all placeholders and shifts the indices of the other rules.


In [15]:
Herb.Grammars.remove_rule!(g₁, 11)

1: Int = 0
2: Int = 1
3: Int = 2
4: Int = 3
5: Int = 4
6: Int = 5
7: Int = 6
8: Int = 7
9: Int = 8
10: Int = 9
11: nothing = nothing
12: Int = Int * Int
13: Int = x
14: Int = Int - Int


In [16]:
Herb.Grammars.cleanup_removed_rules!(g₁)

1: Int = 0
2: Int = 1
3: Int = 2
4: Int = 3
5: Int = 4
6: Int = 5
7: Int = 6
8: Int = 7
9: Int = 8
10: Int = 9
11: Int = Int * Int
12: Int = x
13: Int = Int - Int
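
The reason for the two-step removal can be illustrated with a plain vector standing in for the rule list. This is a Base Julia sketch, not Herb's data structure: `compact` and the `remap` dictionary are hypothetical, and how existing programs should be updated after a cleanup is outside what this notebook covers.

```julia
# Stand-in for a grammar's rule list, and a "program" stored as rule indices.
rules = Any[:a, :b, :c, :d]
program = [4, 2]

# Step 1: placeholder removal. Indices are unchanged, so the program is
# still valid as long as it does not use the removed rule 3.
rules[3] = nothing
println([rules[i] for i in program])

# Step 2: compaction. Placeholders are dropped and later rules shift down.
# The old => new index map is what a program would need to stay meaningful.
function compact(rules)
    kept = Any[]
    remap = Dict{Int,Int}()
    for (old, r) in enumerate(rules)
        r === nothing && continue
        push!(kept, r)
        remap[old] = length(kept)
    end
    return kept, remap
end

kept, remap = compact(rules)
new_program = [remap[i] for i in program]
println(kept, " ", new_program)
```

Without the remapping step, index 4 in the old program would point past the end of the compacted list, which is exactly the kind of breakage that the placeholder is meant to postpone.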


### Saving & loading grammars

If you want to store a grammar on disk, use `store_cfg` to write it to a file and `read_cfg` to load it again. 

In [17]:
Herb.Grammars.store_cfg("demo.txt", g₁)

In [18]:
Herb.Grammars.read_cfg("demo.txt")

1: Int = 0
2: Int = 1
3: Int = 2
4: Int = 3
5: Int = 4
6: Int = 5
7: Int = 6
8: Int = 7
9: Int = 8
10: Int = 9
11: Int = Int * Int
12: Int = x
13: Int = Int - Int
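
The file format used by `store_cfg` is not described here, so the following is only a sketch of the general idea: a text round trip for rule expressions in Base Julia, with one rule per line. The format and the variable names are assumptions made for illustration and may differ from what Herb writes.

```julia
# Hypothetical round trip (not necessarily Herb's file format).
original = [:(Int = 0), :(Int = Int * Int), :(Int = x), :(Int = Int - Int)]

path = tempname()
open(path, "w") do io
    # Printing an Expr gives back readable Julia source text.
    foreach(r -> println(io, r), original)
end

# Parse each line back into an Expr.
loaded = [Meta.parse(line) for line in eachline(path)]
println(loaded == original)
rm(path)
```

A line-based text format like this keeps stored grammars easy to read and edit by hand.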
