# Experimenting with `pyrealb`

This Jupyter notebook shows a few live examples of text realization with **pyrealb**.  

It can also be used directly in your browser (through Binder) with [this link](https://mybinder.org/v2/gh/lapalme/pyrealb-jupyter/64a7272b3a15404274b1082825af290b4bfee63e?urlpath=lab%2Ftree%2Fpyrealb-en.ipynb). **On opening, select the menu item `Run/Run all cells`**

**pyrealb** is a system for realizing English or French sentences from a specification given as a Python structure built with constructor and function calls.

The names of the constructors follow the usual conventions of constituent grammars: terminals are embedded in phrases, and each of them can be modified by options specified with object property functions. It is also possible to use a dependency syntax notation to build sentences.

**pyrealb** automatically manages conjugation, declension and most agreements between constituents. One important feature is that once an affirmative sentence has been defined, many variants (e.g. negative, passive, interrogative) can be obtained by just adding specific options. Not all options are described in this document; the complete list can be found in the [**pyrealb** documentation](docs/documentation.html).

This *Jupyter* notebook briefly introduces **pyrealb** syntax with a few simple examples and also shows more challenging ones.  You can modify the examples and immediately see the effects on the realization. Once an expression is modified, it can be executed with shift-return or by clicking one of the appropriate buttons.

*Nota bene*: When **pyrealb** detects a specification error, which often results in a word being realized between double square brackets `[[...]]`, it also writes a warning on the console, which is displayed before the result.

First import the package and indicate that English text will be realized.

In [1]:
from pyrealb import *
loadEn()

## Creation and realization of a first word

We call a constructor to create a Python structure, for example to create a noun.

In [2]:
N('cat')

<pyrealb.Terminal.N at 0x10d064b80>

This call shows that a Python object of type `Terminal` has been created. As we will show, this object can be saved in a variable, used and modified like any Python object. Its realization is obtained by asking for its string value with `str(..)` (`print(..)` does this implicitly) or by calling the object function property `.realize()`.
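The difference between the object display above and the realized string can be reproduced in plain Python. The sketch below uses a hypothetical stand-in class, `Word`, which is not part of **pyrealb** and makes no claim about its internals; it only assumes that `str(..)` delegates to `.realize()`, as described above.

```python
class Word:
    """Hypothetical stand-in for a terminal; not a pyrealb class."""

    def __init__(self, lemma):
        self.lemma = lemma

    def realize(self):
        # a real realizer would apply declension or conjugation here
        return self.lemma

    def __str__(self):
        # str(..) and print(..) go through realize()
        return self.realize()


w = Word("cat")
print(str(w) == w.realize())     # True
print(repr(w).startswith("<"))   # True: without __repr__, the default object display is used
```

Evaluating an object as the last expression of a cell shows its `repr`, which is why the cell above displays an object address rather than the word.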

To simplify notation in the rest of this notebook, we define a function to realize each argument and create a string joining the realizations separated by a separator (comma, by default).

In [3]:
def realize(*exps,sep=', '):
    return sep.join(exp.realize() for exp in exps)

It can be called as follows:

In [4]:
realize(N('cat'),N("dog"),N("butterfly"))

'cat, dog, butterfly'

## Terminal creation
The constructor is called by giving the base form as parameter:
* singular for an article or a noun

In [5]:
realize(D("a"),
        N("cat"))

'a, cat'

* infinitive for a verb, which is conjugated by default in the present tense, third person singular

In [6]:
realize(V("love"))

'loves'

* first person singular for a pronoun, which is declined by default in the neutral third person singular

In [7]:
realize(Pro("I"))

'it'

* adjective, adverb, preposition, conjunction and *canned text*, with the base form which is not declined in English

In [8]:
realize(A("good"),
        Adv("so"),
        P("of"),
        C("or"),
        Q("Wow!!"))

'good, so, of, or, Wow!!'

* date and time specified with the usual JavaScript syntax. When called without argument, it returns the current date and time. We will see later how to display specific fields of a date.

In [9]:
realize(DT("2020-12-25"),
        DT())

'on Friday, December 25, 2020 at 0:00:00 a.m., on Monday, January 30, 2023 at 3:29:38 p.m.'

* number corresponding to the numeric value given as parameter

In [10]:
realize(NO(123),
        NO(45678))

'123, 45,678'

### Terminal modifications such as declension and conjugation with options, i.e. [functions that are object properties of the terminal](https://github.com/lapalme/pyrealb/blob/main/docs/documentation.html):
* number: singular `.n("s")`, plural `.n("p")`
* gender: masculine `.g("m")`, feminine `.g("f")`, neutral `.g("n")`

In [11]:
realize(D("a").n("p"),  # empty string when plural...
        N("cat").n("p"),
        Pro("I").g("n"))

', cats, it'

* tense: simple past `.t("ps")`, future `.t("f")`, ...
* person : first `.pe(1)`, second `.pe(2)` or third `.pe(3)` to be combined with number

In [12]:
realize(Pro("I").pe(3).n("p"),
        V("eat").t("ps").pe(3).n("p"),
        Pro("me").pe(3),
        V("finish").t("f").pe(1).n("p"))

'they, ate, it, will finish'

* date formatting

In [13]:
realize(DT().dOpt({"hour":False,"minute":False,"second":False,"nat":True}),
        DT("2020-12-25").dOpt({"hour":False,"minute":False,"second":False,"nat":False}),
        sep="; ")

'on Monday, January 30, 2023; Friday 12/25/2020'

* number formatting

In [14]:
realize(NO(123).dOpt({"nat":True}),
        NO(15).dOpt({"ord":True}),
        NO(3.141592).dOpt({"mprecision":4}))

'one hundred twenty-three, fifteenth, 3.1416'

* HTML output

In [15]:
realize(D('a').tag("b"),
               A("grey").tag("i"),
               N("cat").tag("a",
                            {"href":"https://en.wikipedia.org/wiki/Cat",
                             "target":"_blank"}))

'<b>a</b>, <i>grey</i>, <a href="https://en.wikipedia.org/wiki/Cat" target="_blank">cat</a>'
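For readers who want to cross-check the date and number outputs shown earlier in this section, the standard library offers rough analogues. This is only a comparison sketch: it does not use **pyrealb** and says nothing about how `dOpt` is implemented. It assumes the default (C) locale, in which day and month names are in English.

```python
from datetime import datetime

# the same ISO date string that was passed to DT above
day = datetime.fromisoformat("2020-12-25")

# weekday and month names, English in the default C locale
long_date = day.strftime("%A, %B %d, %Y")

# thousands separator, and rounding to at most 4 decimals
grouped = format(45678, ",")
rounded = str(round(3.141592, 4))

print(long_date)          # Friday, December 25, 2020
print(grouped, rounded)   # 45,678 3.1416
```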

## Phrase creation
Phrases are created by embedding calls to constructors of terminals or of other phrases. Examples of such phrases are:
* *Noun Phrase* (`NP`) in which the number and gender of the first noun or pronoun is propagated to the other components of the phrase; adjectives are always realized before the noun.
* *Verb Phrase* (`VP`) which often embeds a noun phrase as a direct object
* *Sentence* (`S`) which combines phrases. The first noun phrase in a `S` is taken as the subject of the sentence with which the verb of the `VP` will agree. By default, `S` at the top-level capitalizes the first letter and adds a full stop at the end.

We now create two `NP`s, a `VP` and a `S`.

In [16]:
np1 = NP(D("the"),
         N("cat").n("p"),
         A("small"),
         A("black"))

In [17]:
np2 = NP(D("a"), 
         N("mouse").n("p"))

In [18]:
s = S(np1,
      VP(V("eat").t("f"), 
          np2))

Now realize them:

In [19]:
realize(np1,np2,s)

'the small black cats, mice, The small black cats will eat mice. '

### Phrase modifications using options, i.e. [functions that are object properties of a phrase](https://github.com/lapalme/pyrealb/blob/main/docs/documentation.html)

In [20]:
realize(S(np1,
          VP(V("eat").t("f"), 
             np2)).typ({"neg":True,"pas":True}))

'Mice will not be eaten by the small black cats. '

As options modify the internal structure of a phrase when it is realized, it is preferable to work on a **_copy_** of the original phrase if this phrase is expected to be reused. 

**This _problem_ is potentially worse in the context of this _Jupyter Notebook_ because a cell can be reevaluated many times modifying the original object at each realization.**

A simple way of creating a copy is to define a function that returns a new expression at each call. For our previous `np1`, we can define the following function, which takes the number as a parameter (singular when not specified). By convention, we suffix the name with `_f` as a reminder that it is a function.

In [21]:
def np1_f(n="s"):
  return NP(D("the"),
            N("cat").n(n),
            A("small"),
            A("black"))

This function can be called multiple times as follows:

In [22]:
realize(np1_f("p"),np1_f())

'the small black cats, the small black cat'
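Why a function is safer than a shared object can be illustrated in plain Python. The `Phrase` class below is a hypothetical stand-in whose option modifies the object in place; it is not a **pyrealb** class, and the exact way **pyrealb** alters a phrase may differ. The point is only that changes accumulate on a shared object and not on freshly built ones.

```python
class Phrase:
    """Hypothetical stand-in whose option mutates the object; not a pyrealb class."""

    def __init__(self, words):
        self.words = list(words)

    def negate(self):
        # modifies the object in place and returns it, like a chained option
        self.words.insert(0, "not")
        return self

    def realize(self):
        return " ".join(self.words)


shared = Phrase(["black", "cat"])
first = shared.negate().realize()
second = shared.negate().realize()   # the earlier change is still there


def phrase_f():
    # a new object at each call, so changes never accumulate
    return Phrase(["black", "cat"])


fresh_first = phrase_f().negate().realize()
fresh_second = phrase_f().negate().realize()

print(first, "|", second)               # not black cat | not not black cat
print(fresh_first, "|", fresh_second)   # not black cat | not black cat
```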

As these functions most often only return an expression, we can simplify the notation using a _lambda_. We can define functions for our previous phrases and realize two new sentences:

In [23]:
np2_f = lambda n="s": NP(D("a"),N("cheese").n(n))
s_f = lambda n="s" : S(np1_f(n),VP(V("eat"),np2_f(n)))
realize(s_f("p"),s_f().typ({"neg":True,"pas":True}))

'The small black cats eat cheeses. , A cheese is not eaten by the small black cat. '

To simplify the notation for showing sentence modifications, we define the following function, which realizes the result of a sentence function with optional modifications.

In [24]:
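def show(ph_f,mod_typ=None):
    # build a fresh phrase, apply the optional modifications, then realize it
    # (None instead of {} avoids sharing a mutable default between calls)
    return ph_f().typ(mod_typ or {}).realize()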

### Original sentence:

* its realization

In [25]:
show(s_f)

'The small black cat eats a cheese. '

* its negation

In [26]:
show(s_f,{"neg":True})

'The small black cat does not eat a cheese. '

* its negation in passive mode

In [27]:
show(s_f,{"neg":True, "pas":True})

'A cheese is not eaten by the small black cat. '

* question about the subject of the verb

In [28]:
show(s_f,{"int":"wos"})

'Who eats a cheese? '

* question about the object of the verb

In [29]:
show(s_f,{"int":"wad"})

'What does the small black cat eat? '

### Variant with a subordinate phrase

We add a subordinate phrase as the last constituent of a fresh noun phrase returned by `np2_f()`, so that the original `np2` is not modified. The verb of the subordinate is set to the perfect and the progressive, with a *necessity* modality.

In [30]:
sp_f = lambda : S(np2_f().add(SP(Pro("that"),
                                 np1_f(),
                                 VP(V("eat").t("ps"))).typ({"perf":True,"prog":True,"mod":"nece"})),
                 VP(V("be"),
                    A("white")).t("ps"))

* its realization

In [31]:
show(sp_f)

'A cheese that the small black cat should have been eating was white. '

* its negation

In [32]:
show(sp_f,{"neg":True})

'A cheese that the small black cat should have been eating was not white. '

### Coordinate phrase

`CP` realizes a list of phrases in which the elements are separated by commas, except for the last two, which are separated by a conjunction specified within the phrase (often at the start).

In [33]:
cp_f = lambda: CP(C("and"),
                  NP(D("the"),N("cat")),
                  NP(D("the"),N("mouse")),
                  NP(D("a"),N("rabbit")))  

* its use within a sentence

In [34]:
show(lambda:S(cp_f(), 
              VP(V("come").t("ps"))))

'The cat, the mouse and a rabbit came. '

* new elements can be added to it, here a new `NP` with a number spelled out

In [35]:
show(lambda:S(cp_f().add(NP(NO(25).nat(True),N("dog"))),
              VP(V("come").t("ps"))))

'The cat, the mouse, a rabbit and twenty-five dogs came. '
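The separator rule described for `CP` (commas between elements, the conjunction between the last two) can be sketched on plain strings. The helper `join_coordinated` is a hypothetical name introduced for illustration; it is not part of **pyrealb** and ignores everything else `CP` does, such as agreement.

```python
def join_coordinated(items, conj="and"):
    """Join strings with commas, using `conj` between the last two."""
    items = list(items)
    if len(items) <= 1:
        # nothing to coordinate
        return "".join(items)
    return ", ".join(items[:-1]) + f" {conj} " + items[-1]


print(join_coordinated(["the cat", "the mouse", "a rabbit"]))
# the cat, the mouse and a rabbit
print(join_coordinated(["soup", "pork"], conj="or"))
# soup or pork
```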

### Pronominalization

To ease reading, it is often useful to replace a noun phrase by a pronoun.
Given the two following noun phrases:

In [36]:
man_f = lambda n="s": NP(D("the"), N("man").n(n),A("pretty"))
woman_f = lambda n="s": NP(D("a"), N("woman").n(n),A("intelligent"))

We realize an initial sentence

In [37]:
show(lambda:S(man_f("p"),
              VP(V("love"),
                 woman_f("p"))))

'The pretty men love intelligent women. '

We pronominalize the subject.

In [38]:
show(lambda:S(man_f().pro(),
              VP(V("love"),
                 woman_f())))

'He loves an intelligent woman. '

We now pronominalize the object and see that the appropriate gender has been used.

In [39]:
show(lambda:S(man_f(),
              VP(V("love"),
                woman_f().pro())))

'The pretty man loves her. '

## Dependency creation
To realize sentences, it is also possible to use a notation inspired by the [Dependency Grammar](https://en.wikipedia.org/wiki/Dependency_grammar) formalism. 

A dependency is created with a function giving the name of the relation: `root`, `subj` (subject), `det` (determiner), `comp` (complement), `mod` (modifier). Their first parameter is a `Terminal` which is the head of the dependency. The other parameters, if any, are dependencies associated with the head.

Rather than combining phrases to build a sentence, as shown above, the structure is built by calls to functions corresponding to names of relations determining the role of this dependency in the sentence. **pyrealb** uses the information about the roles to perform agreement between the words of the sentence.

In the following example, the plural on the subject affects the determiner and the verb.

In [40]:
show(lambda:root(V("eat").t("p"),
                 subj(N("cat").n("p"),
                      det(D("a"))),
                 comp(N("mouse"),
                      det(D("the")),
                      mod(A("white")))))

'Cats eat the white mouse. '

Coordinated dependencies are built with the `coord` function, which has a conjunction as head and, as dependents, relations that must all be of the same type. For example:

In [49]:
s1_f = lambda:root(V("eat"),
                   coord(C("and"),
                         subj(N("boy"),det(D("the"))),
                         subj(N("girl"),det(D("the")))),
                   coord(C("or"),
                         comp(N("soup"),
                              mod(N("vegetable")).pos("pre")),                
                         comp(N("pork").n("s")),
                         comp(N("chicken"))))
print(show(s1_f),
      show(s1_f,{"pas":True}),
      sep="\n")

The boy and the girl eat vegetable soup, pork or chicken. 
Vegetable soup, pork or chicken is eaten by the boy and the girl. 


**pyrealb** determines the word ordering in a sentence: `det` and `subj` appear before the head, while `comp` and `mod` come after. This default ordering can be changed by adding the `.pos(..)` option with either `"pre"` or `"post"`, as in the preceding example, where it is preferable to put the modifier before `soup`. When there is a tie in ordering, the realizer uses the order of the specification.
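The ordering rule just described can be sketched in plain Python on a small tree of dictionaries. This is a simplified illustration, not **pyrealb**'s algorithm: the function `linearize` and the dictionary layout are assumptions made for this sketch, the words are given already inflected (no agreement is performed), and any language-specific placement is ignored.

```python
# relations placed before the head by default; all others go after
PRE_BY_DEFAULT = {"det", "subj"}


def linearize(node):
    """Return the words of a dependency tree in surface order.

    A node is a dict with "rel", "word", an optional "pos" ("pre" or "post")
    and an optional list of "deps".
    """
    before, after = [], []
    for dep in node.get("deps", []):
        default = "pre" if dep["rel"] in PRE_BY_DEFAULT else "post"
        side = before if dep.get("pos", default) == "pre" else after
        side.extend(linearize(dep))   # specification order is kept on each side
    return before + [node["word"]] + after


tree = {"rel": "root", "word": "eats", "deps": [
    {"rel": "subj", "word": "boy", "deps": [{"rel": "det", "word": "the"}]},
    {"rel": "comp", "word": "soup", "deps": [
        {"rel": "mod", "word": "vegetable", "pos": "pre"}]},
]}
print(" ".join(linearize(tree)))   # the boy eats vegetable soup
```

Removing the `"pos": "pre"` entry moves `vegetable` after `soup`, which is the default side for `mod` in this sketch.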

## Conclusion

These expressions have illustrated some of the capabilities of **pyrealb** for realizing English sentences. Most agreements are performed automatically; elision is also taken care of. Once the original affirmative sentence structure is set up, many variations can be obtained by means of options.

Other [demonstrations](https://github.com/lapalme/pyrealb/tree/main/demos) are available.

[Guy Lapalme](mailto:lapalme@iro.umontreal.ca)