# Introduction to GF

This notebook introduces GF, the Grammatical Framework ([https://www.grammaticalframework.org/](https://www.grammaticalframework.org/)), which is responsible for parsing in GLIF.
We will only work through a "hello world" level example. To learn about the full power of GF, we recommend the [GF tutorial](https://www.grammaticalframework.org/doc/tutorial/gf-tutorial.html).

If you are already familiar with GF, you should still skim through this notebook to see how GF can be used in a GLIF notebook.

In this notebook we will develop a very small grammar that supports sentences like:
* *John loves Mary*
* *Mary runs and jumps*
* *John loves Mary and runs*

## Abstract syntax

A GF grammar consists of an **abstract syntax**, which describes what abstract syntax trees are supported by the grammar, and (potentially multiple) **concrete syntaxes** that describe how the abstract syntax trees correspond to strings in a particular language.

Normally, a GF abstract syntax is described in a `.gf` file, but in a GLIF notebook, we can simply enter it into a code cell:

In [1]:
-- Comments start with --

-- Let us call the abstract syntax `Gossip`:
abstract Gossip = {
    cat           -- the `cat` keyword is used to introduce the syntactic categories
        S;        -- for complete sentences
        Person;   -- "John", "Mary", ...
        Verb;     -- "runs", "loves Mary", ...
    
    fun           -- the `fun` keyword is used to introduce rules
        -- the `sentence` rule combines a `Person` and a `Verb` to get a complete sentence (`S`)
        sentence : Person -> Verb -> S;    -- "John" -> "runs" -> "John runs"
        and: Verb -> Verb -> Verb;         -- "runs" -> "loves Mary" -> "runs and loves Mary"
        
        -- "terminals"
        john: Person;
        mary: Person;
        run: Verb;
        jump: Verb;
        
        -- transitive verbs like "love" require an object
        love : Person -> Verb;    -- "John" -> "loves John"
        hate : Person -> Verb;
}

With the abstract syntax in place, we can now express the abstract syntax trees for our example sentences.
GF usually writes abstract syntax trees as strings: a rule name followed by its arguments, with parentheses around nested applications.
For example, the sentence *John loves Mary and runs* would have the abstract syntax tree `sentence john (and (love mary) run)`.
Abstract syntax trees can be visualized with the `visualize_tree` command:

In [2]:
visualize_tree sentence john (and (love mary) run)

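The other two example sentences from the introduction can be written as trees in the same way. This is a small sketch; we assume each command is entered in its own cell:

```gf
-- "John loves Mary"
visualize_tree sentence john (love mary)

-- "Mary runs and jumps"
visualize_tree sentence mary (and run jump)
```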

## Concrete syntax

In the concrete syntax we describe how abstract syntax trees can be linearized into strings.
GF is designed so that it can also derive a parser from that description.
Let us start by making a concrete syntax for English, again as a code cell:

In [3]:
concrete GossipEng of Gossip = {
    lincat    -- after the `lincat` keyword, we describe what concrete types the syntactic categories have
              -- in this very simple example, everything should be a string
        S = Str; Person = Str; Verb = Str;

    lin       -- after the `lin` keyword, we describe the linearization of rules
        
        -- the `sentence` rule gets a `Person` (`pers`) and a `Verb` (`vrb`) and concatenates them (`++`).
        sentence pers vrb = pers ++ vrb;
        -- the `and` rule takes two `Verb`s and concatenates them with an "and" in between:
        and v1 v2 = v1 ++ "and" ++ v2;
        
        john = "John";
        mary = "Mary";
        run = "runs";
        jump = "jumps";
        love pers = "loves" ++ pers;
        hate pers = "hates" ++ pers;
}

With the concrete syntax in place, we can now linearize and parse sentences:

In [4]:
linearize sentence john (and (love mary) run)

In [5]:
parse "John loves Mary and runs"

By default, GF tries to parse the input as the category `S`. To parse anything else, we have to specify the category; otherwise we get a parser error:

In [6]:
parse "loves Mary"

In [7]:
parse -cat=Verb "loves Mary"

Commands can be chained with the `|` operator, which passes the output of one command on to the next.
For example, we can first parse a sentence and then visualize the resulting abstract syntax tree:

In [8]:
parse "John loves Mary and runs" | visualize_tree

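A sentence can have more than one abstract syntax tree. In our grammar, `and` combines two `Verb`s without fixing how longer chains are grouped, so a chain of three verbs should parse in two ways. The sentence below is our own illustration, not one of the cells above:

```gf
parse "John runs and jumps and hates Mary"
```

We would expect two trees, `sentence john (and run (and jump (hate mary)))` and `sentence john (and (and run jump) (hate mary))`; the order in which they are listed may vary.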

Similarly, we can use the `generate_random` command to generate random abstract syntax trees and directly linearize them:

In [9]:
generate_random -number=8 | linearize

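Like `parse`, `generate_random` starts from the category `S` unless told otherwise. The GF shell accepts a `-cat` flag here as well; the sketch below assumes that GLIF passes this flag through unchanged:

```gf
-- generate three random verb phrases instead of complete sentences
generate_random -cat=Verb -number=3 | linearize
```
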
A common application of GF is translation.
The idea is to have two concrete syntaxes that share the same abstract syntax.
In our example, we have an English concrete syntax and we will additionally create a German concrete syntax.
Then we can parse a sentence using e.g. the English concrete syntax and linearize it using the German concrete syntax, which effectively translates a sentence from English to German.

In [10]:
concrete GossipGer of Gossip = {
    lincat
        S = Str; Person = Str; Verb = Str;
    
    lin
        sentence pers vrb = pers ++ vrb;
        and v1 v2 = v1 ++ "und" ++ v2;
        
        john = "Johann";    -- let us translate names as well
        mary = "Maria";
        run = "rennt";
        jump = "springt";
        love pers = "liebt" ++ pers;
        hate pers = "hasst" ++ pers;
}

Now that we have two concrete syntaxes loaded, we should specify which language we want. Otherwise, GF will use all languages:

In [11]:
-- all languages
linearize sentence john (and (love mary) run)

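When linearizing into several languages at once, it can be hard to tell which line belongs to which concrete syntax. The GF shell offers a `-treebank` flag for `linearize` that labels each linearization with its language name. This is a minimal sketch, assuming GLIF forwards the flag to GF as-is:

```gf
-- show the tree and tag each linearization with its language
linearize -treebank sentence john (love mary)
```
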
In [12]:
-- only German
linearize -lang=Ger sentence john (and (love mary) run)

In [13]:
-- and finally translation
parse -lang=Eng "John loves Mary and runs" | linearize -lang=Ger
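
Since both concrete syntaxes share the same abstract syntax, translation works in the other direction too. As a small sketch with a sentence of our own choosing, we can parse German and linearize to English:

```gf
-- German to English
parse -lang=Ger "Maria liebt Johann" | linearize -lang=Eng
```

With the two grammars above, this should parse to `sentence mary (love john)` and produce *Mary loves John*.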