Generation rules

SimGus edited this page Jan 11, 2020 · 14 revisions

As explained in the syntax specifications, generation rules, also called "templates", make up the contents of a unit definition. This page explains what they are and how to use them.

Definition of a generation rule

A generation rule is a rule that is part of a unit definition. It is parsed by the parser and used by the generator to generate a string (or a set of strings).

A generation rule is a sequence of one or more rule contents, also called sub-rules. A sub-rule is a part of a rule that can itself generate a string (or a set of strings).

Generating a string from a rule simply means using each of its sub-rules to generate a string, then joining the resulting strings together. There are different types of sub-rules, which are described in the following sections.
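
As an illustration only, the idea of "generate each sub-rule, then join" can be sketched in Python. This is a simplified model, not the generator's actual code: the names SubRule, word and generate_rule are hypothetical, and joining with a single space is an assumption made for the example.

```python
from typing import Callable, List

# Hypothetical model: a sub-rule is any zero-argument callable
# that returns the string it generates.
SubRule = Callable[[], str]


def word(text: str) -> SubRule:
    """A word sub-rule can only generate the word it represents."""
    return lambda: text


def generate_rule(sub_rules: List[SubRule]) -> str:
    """Generate one string per sub-rule and join the results.

    Joining with a single space is an assumption of this sketch.
    """
    return " ".join(sub_rule() for sub_rule in sub_rules)


# The rule "I'm back" is made of two word sub-rules.
example = generate_rule([word("I'm"), word("back")])
print(example)
```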

Words

They are the simplest sub-rules that exist: the only thing they are able to generate is the word they represent.

To put a word inside a rule, simply write it, separated from other sub-rules by whitespace or special characters. A sequence of words will thus generate the same sequence.

For example, the rule hello will generate the string hello, while the rule I'm back, made of 2 words (I'm and back), will generate I'm back.

A number of special characters are used to give information to the parser and generator. If you want to use one of those symbols as part of a word, the name of a unit or any other place, you might need to escape it using a backslash \ (e.g. use \; instead of ;).

Here is an exhaustive list of characters that can be escaped:

Character name Symbol
Double slashes //
Slash /
Semi-colon ;
Square brackets [ and ]
Curly braces { and }
Tilde ~
At sign @
Percent symbol %
Backslash \
Pipe |
Question mark ?
Hashtag #
Dollar sign $
Ampersand &

Note that in some unambiguous cases, escaping a character is not required. If parsing fails, however, try escaping the characters listed above, and only those.
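
If you build template files from a script, a small helper can apply this escaping for you. The sketch below is not part of the tool: escape_special and SPECIAL_CHARS are hypothetical names, and the helper conservatively escapes every character from the table above, even in cases where escaping would not be strictly required. It also assumes that escaping each slash separately is enough to cover double slashes.

```python
# Characters taken from the table above. Double slashes are handled by
# escaping each slash individually (an assumption of this sketch).
SPECIAL_CHARS = set("/;[]{}~@%\\|?#$&")


def escape_special(text: str) -> str:
    """Prefix every special character in `text` with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)


escaped = escape_special("50% off; really?")
print(escaped)
```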

Choices

A choice is a set of rules: upon generation, one of those rules is chosen at random and its contents are generated. Since v1.6.0, there are no limitations on the contents of a choice. A choice is denoted with square brackets [ and ] enclosing the whole choice, each of the rules it contains being separated by pipes |. Prior to v1.6.0, the syntax used curly braces { and }, and slashes /. While this older syntax is still accepted, it is deprecated and the newer syntax should be used instead.

For example, [hello|Hi] would generate either hello or Hi at random. Assuming the unit ~[chatbot] defined in the section on unit references further down this page, the rule This is a [nice ~[chatbot]|great thing|cool program] would generate either This is a nice chatbot, This is a nice chatterbot, This is a nice bot, This is a great thing or This is a cool program.

Note that the following unit definitions are equivalent (they would generate the same strings):

~[greetings]
  Hello world
  Hi world

and

~[greetings]
  [Hello|Hi] world

As you can see, choices are never required to get a certain result. They are however very useful to prevent your template files from growing too large. An obvious disadvantage is reduced readability.
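
The equivalence of the two ~[greetings] definitions above can be checked with a small Python model. This is only a sketch under simplifying assumptions: generate_choice and all_strings are hypothetical names, alternatives are plain strings, and nested choices, unit references and escaping are not handled.

```python
import random
from typing import List


def generate_choice(alternatives: List[str], rng: random.Random) -> str:
    """Pick one of the rules contained in the choice at random."""
    return rng.choice(alternatives)


def all_strings(alternatives: List[str], suffix: str) -> List[str]:
    """List every string that '[alt1|alt2|...] suffix' can generate."""
    return sorted(f"{alternative} {suffix}" for alternative in alternatives)


rng = random.Random()

# One random generation of the rule "[Hello|Hi] world".
sample = generate_choice(["Hello", "Hi"], rng) + " world"

# Every string the single rule "[Hello|Hi] world" can produce ...
expanded = all_strings(["Hello", "Hi"], "world")
# ... matches the two rules written out one per line.
two_rule_unit = sorted(["Hello world", "Hi world"])

print(sample, expanded == two_rule_unit)
```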

The special characters that should be escaped if you want to use them inside a choice are the same as those listed for words above.

Another use of choices is to put a single rule inside one in order to apply modifiers to that rule. Modifiers are discussed on the generation modifiers page. Prior to v1.6.0, this was the only accepted use of the newer syntax and was known as a "word group".

Unit references

This type of sub-rule represents a "call" to a unit defined somewhere else in the template file(s). This "call" asks the unit to generate some string of characters.

A unit definition is made of a declaration, which contains the unit's type and name, and a list of rules. When a unit is asked to generate a string, it simply chooses one of those rules and generates a string from it. This is the string that is generated by the unit.

In order to reference a unit within a rule, write the special character that corresponds to the type of the unit you're referencing (either %, @ or ~), followed by the unit's name surrounded by square brackets [ and ].

For example, if we defined the following unit:

~[chatbot]
   chatbot
   chatterbot
   bot

the sub-rule ~[chatbot] will generate either chatbot, chatterbot or bot; the rule Are you a ~[chatbot]? will generate one of the following strings: Are you a chatbot?, Are you a chatterbot? or Are you a bot? (Note that in this last rule, ? is actually considered a word since it is not part of the unit reference.)
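
The "call" made by a unit reference can be pictured with the following Python sketch. It is a simplified model rather than the tool's implementation: UNITS, generate_unit and generate_question are hypothetical names, and only the ~[chatbot] unit from the example above is represented.

```python
import random

# Hypothetical in-memory model: a unit name mapped to its list of rules.
UNITS = {"chatbot": ["chatbot", "chatterbot", "bot"]}


def generate_unit(name: str, rng: random.Random) -> str:
    """Ask a unit to generate a string: it picks one of its rules."""
    return rng.choice(UNITS[name])


def generate_question(rng: random.Random) -> str:
    """Model of the rule 'Are you a ~[chatbot]?'."""
    return f"Are you a {generate_unit('chatbot', rng)}?"


# The three strings the rule can generate, one per rule of the unit.
possible = {f"Are you a {rule}?" for rule in UNITS["chatbot"]}
result = generate_question(random.Random())
print(result)
```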

Once again, several special characters should be escaped (in addition to those escaped for words and word groups):

Character name Symbol
Hashtag #
Dollar sign $

Except for simple words, all of these sub-rules can have their generation behavior altered by modifiers, which are described here.

You can also learn which types of units exist here.