# GHOST (General Holistic Organism Scripting Tool)

# Goals and Vision

The design intention of Ghost (the “General Holistic Organism Scripting Tool”) is to allow human authors to script behaviors for artificial characters.   Significant flexibility is desired, e.g. to support
* Purely textual chatbots, animated characters, or physical robots
* Characters controlled precisely by human-authored rules; or characters with a great deal of autonomy, where human-authored rules serve mainly to tweak parameters and indicate propensities

Ghost is envisioned as having both a textual and a graphical user interface. The textual interface aims for roughly the same level of complexity as ChatScript (on which it is heavily based), so it should be usable by non-programmers who are comfortable with a fairly intricate formal syntax.

The GUI is not yet designed at time of writing, but is intended to make Ghost authoring feasible for a broader class of content authors, and to make it easier and more rapid for everyone.   Some brief notes  regarding the envisioned GUI are given at the end of this document.

# High Level Design

Ghost is implemented as a DSL (Domain-Specific Language) within the Scheme shell associated with the OpenCog engine.    Ghost syntax closely resembles (and in most respects is identical to) ChatScript; but the Ghost interpreter is written in Scheme and runs in the OpenCog Scheme shell.   Ghost rules are interpreted into Atoms living in OpenCog’s Atomspace semantic knowledge store, and are executed within OpenCog using the OpenCog Pattern Matcher and Action Selector and associated mechanisms.

A Ghost rule has two required parts, a pattern and an action, and an optional third part, a goal or set of goals. The general semantics are: when the pattern is observed, if the goal is being pursued by the agent, then consider taking the action. If no goal is specified, a generic goal such as “interact with others” is implicitly assumed. A Ghost “script” consists of a set of rules, which may be grouped into files called “topics.” A topic may have some annotations associated with it, which apply to all the rules in the topic (unless overridden by annotations specific to certain rules).

Both patterns and actions may optionally reference Scheme functions. (Support for functions written in other languages may be added in a later version.) An envisioned usage pattern is to write Ghost rules and, in coordination with the rule authoring process, add new Scheme functions that the rules can call.

An example simple Ghost rule is:
```
#goal: (novelty=1)
s: ( Hi there ^no_people_around ) Who said that? ^look_around
```
This rule is somewhat trivial as it lacks variables.   But it shows the basic format; here we have:
 * Pattern = someone says “Hi there” to the agent, and there are no people around (i.e. the Scheme function no_people_around outputs True)
 * Action = the agent says “Who said that?” and then invokes the Scheme function look_around
 * Goal = novelty … so the agent should use this rule when the “novelty” goal is important to it
 
Behind the scenes, the Ghost interpreter uses OpenCog’s Pattern Matcher to match the patterns of its rules to content in OpenCog’s Atomspace knowledge store.  The specifics of this matching are simple now, but are highly customizable and can be made more sophisticated in future versions.



# Installation
**OpenCog**

Ghost is a module inside [OpenCog](https://github.com/opencog/opencog), so install OpenCog first.

**RelEx**

[RelEx](https://github.com/opencog/relex) is a dependency parser for the English language. It extracts dependency relations from Link Grammar, and adds some shallow semantic analysis. The primary use of RelEx is as a language input front-end to the OpenCog artificial general intelligence system. Ghost uses RelEx.

# Hello in Ghost
Open a Guile shell and import the necessary modules:

In [1]:
(use-modules (opencog)
             (opencog nlp)
             (opencog nlp relex2logic)
             (opencog openpsi)
             (opencog ghost)
             (opencog ghost procedures))

Next, start the [RelEx](https://github.com/opencog/relex) server:

**Without Docker:**
```
$ cd /path/to/relex/project/directory/
$ ./opencog-server.sh
```

**With Docker:**   
When using Docker, set the host of the container running RelEx from inside the started Guile shell. The RelEx server will then be looked up at the specified IP address instead of the default 127.0.0.1.

In [2]:
(set-relex-server-host)

172.21.0.2

Parse your rules using `ghost-parse` for a single rule or `ghost-parse-file` for files containing rules:

In [3]:
(ghost-parse "u: (hello) hello there! ^keep")

Test the parsed rules using `test-ghost`.
When the input doesn't match any pattern:

In [4]:
(test-ghost "hi")

()

When the input matches the pattern of the previously parsed rule:

In [5]:
(test-ghost "hello")

((WordNode "hello")
 (WordNode "there")
 (WordNode "!")
)

The resulting sentence is returned as a list of WordNodes, one per word. To extract the words themselves, map `cog-name` over the list:

In [6]:
(map cog-name (test-ghost "hello"))

(hello there !)

# Design Overview

* A GHOST rule is essentially an OpenPsi rule (aka psi-rule), which is an ImplicationLink that can be expressed as:
```
context AND action -> goal
```

* That is, if the pattern (context) is matched, the action is executed and the goal is said to be achieved.

* A goal also has an urge, which expresses the level of need to achieve that goal.

* Ghost rules are organized into topics. Before parsing any rules, a topic can be set, and all rules created afterwards belong to that topic until another topic is set or the end of the file is reached. If no topic is set, a default topic is used.

* When a GHOST rule is being created, it is first passed to a parser for syntax checking and preliminary interpretation. Any rule that is not syntactically correct, or that uses unsupported features, is rejected at this stage.

* The parser will then pass the intermediate interpretations (aka terms) to a translator that converts them into OpenCog atoms to be stored in the AtomSpace.

* The Ghost action selector is responsible for selecting a rule that is applicable to a given context. When a textual input is received, rules that satisfy the given context are first selected as candidates. A full context evaluation is then done for each of the candidates. The action selector picks one of them based on their satisfiability and their truth value. The satisfiability of rules is compared based on the following criteria:

   * Whether a pattern is matched or not
   * The strength of the rule's goal
   * The urge of the goal
   * Importance of the rule (rules just selected are considered less important)
   * Whether a rule is in the current topic (priority is given)

For example consider two rules with the same pattern but different goal strength:

In [7]:
(ghost-parse "#goal: (goal1=0.5) u: (hi) hi there ^keep")

In [8]:
(ghost-parse "#goal: (goal1=0.7) u: (hi) hello there ^keep")

In [9]:
(test-ghost "hi")

((WordNode "hi")
 (WordNode "there")
)

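All else being equal, the action selector favors the rule with the higher goal strength. The other criteria listed above also enter into the choice, so a given run may still return the reply of the lower-strength rule.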

* Once a rule is selected, it will not be considered again for subsequent inputs unless specified otherwise via the `^keep` function or the `keep` topic feature.

# Syntax

The syntax of GHOST rules is modeled heavily on [ChatScript](https://github.com/bwilcox-1234/ChatScript/blob/master/WIKI/ChatScript-Basic-User-Manual.md#rules). However, GHOST uses several ChatScript features for different purposes than they are normally used in ChatScript; and also contains some additional features.

## Topic
Rules are bundled into topics. All rules after a topic definition belong to that topic. The topic declares its name, its keywords, and then its rules. It ends with the end of the file or a new topic declaration.

```
topic: ~NAME features [list of keywords]
```

In [10]:
(ghost-parse "topic: ~GREETING [hi hello]")

Now all rules created will be under the GREETING topic. If the `keep` topic feature is used, all rules in the topic can be considered repeatedly by the action selector.

## Label
A rule can optionally be given a label, by which other rules or the Guile shell can refer to it:

In [11]:
(ghost-parse "u: lbl (ghost) Ghost is a behavior scripting tool ^keep")

In [12]:
(ghost-get-rule "lbl")

(ImplicationLink (stv 0.9 0.9)
   (AndLink
      (TrueLink
         (ExecutionOutputLink
            (GroundedSchemaNode "scm: ghost-execute-action")
            (ListLink
               (WordNode "Ghost")
               (WordNode "is")
               (WordNode "a")
               (WordNode "behavior")
               (WordNode "scripting")
               (WordNode "tool")
            )
         )
         (PutLink
            (StateLink
               (AnchorNode "GHOST Last Executed")
               (VariableNode "$x")
            )
            (ConceptNode "lbl")
         )
         (ExecutionOutputLink
            (GroundedSchemaNode "scm: ghost-record-executed-rule")
            (ListLink
               (ConceptNode "lbl")
            )
         )
         (PutLink
            (StateLink
               (AnchorNode "GHOST Current Topic")
               (VariableNode "$x")
            )
            (ConceptNode "GHOST GREETING")
         )
      )
      (SatisfactionLink
         (Va

## Goal
In a Ghost rule, satisfying the context and then executing the action implies achievement of the given goal. A goal has a value between 0 and 1, which is the strength of the implication link in the rule. The higher the goal value, the more likely it is that executing the action achieves the goal, given that the context is satisfied.

```
context AND action ==> goal
```

There are two ways of creating goals:

**1. Top level goal(s)**

In [13]:
(ghost-parse "goal: (please_user=0.8)")

In this case, all rules created after it will have the same goal and the same weight, until another top level goal or the end of the file is reached. For example, the following rule uses the above top level goal.

In [14]:
(ghost-parse "u: lbl1 (hello) Hello sweet wonderful human")

`ghost-rule-tv` returns the truth value of the implication link as `stv mean confidence`. The mean value is the strength of the goal.

In [15]:
(ghost-rule-tv "lbl1")

(stv 0.800000 0.900000)

It is also possible to create a list of rules that are ordered.

In [16]:
(ghost-parse "ordered-goal: (please_user=0.8)")

Rules created under an ordered goal have different weights, based on the order of creation. The relationship between the order and the weight forms a geometric sequence with a factor of 0.5.

For example, if there are five rules under the above please_user=0.8 goal, the first rule will have a weight of 0.4, the second one will have 0.2, the third one will have 0.1, and so on. The sum of the weights will get closer to the weight of the top level goal (0.8) if more rules are created under it.
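
The arithmetic above can be checked with a few lines of plain Guile. This is only a sketch of the geometric sequence described here, not GHOST's own code; the helper name `ordered-goal-weights` is hypothetical.

```scheme
;; Hypothetical helper: weight of the i-th rule (counting from 1) under an
;; ordered goal is goal-weight * 0.5^i, as described in the text above.
(define (ordered-goal-weights goal-weight n)
  (map (lambda (i) (* goal-weight (expt 0.5 i)))
       (iota n 1)))            ; (iota n 1) yields 1, 2, ..., n

;; Five rules under please_user=0.8: roughly 0.4, 0.2, 0.1, 0.05, 0.025.
(define weights (ordered-goal-weights 0.8 5))
(display weights) (newline)

;; The running total stays below 0.8 and approaches it as n grows.
(display (apply + weights)) (newline)
```

Calling the helper with a larger `n` shows the total creeping toward the top level weight without ever exceeding it.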

**2. Rule level goal(s)**
In this case, the goals are linked only to the rule created immediately after them. Top level goals, if there are any, are also linked to the rule. A top level goal is overridden by a rule level goal if the same goal is defined in both. Any number of rule level goals can be specified inside the parentheses of the goal declaration.

In [17]:
(ghost-parse "#goal: (novelty=0.67 please_user=0.4) u: (what be you name) I forgot")

In [18]:
(map cog-name (test-ghost "what is your name"))

(I forgot)

## Urge
A goal can have an urge that expresses the level of urgency to achieve that goal. Tweaking the urge value of a goal affects the chances of the corresponding rules being selected by the action selector. If the urge of a goal is increased, the corresponding rules are more likely to be selected.

The urge of a goal is 1 (maximum) by default. The default urge can be changed, and it should be done before creating the goal, for example:

In [19]:
(ghost-parse "urge: (tease_user=1 creativity=0.5)")

The urge value of a goal can be changed via the OpenPsi functions `psi-increase-urge GOAL VALUE` and `psi-decrease-urge GOAL VALUE`.

## Lemma
A lemma is the base word that represents all forms of a word that have the same meaning. In English, for example, ***run***, ***runs***, ***ran*** and ***running*** are forms of the same word, and ***run*** is the lemma. In ChatScript parlance this is the canonical form.

Ghost assists you in generalizing your patterns. It simultaneously matches both the original word and a canonical form of it if your pattern word is in the canonical form. And it checks both lowercase and uppercase forms of your words. For nouns, the canonical form is the singular. So if your pattern is:

In [20]:
(ghost-parse "u: (dog) I have a cat ^keep") 

this will respond equally to ***I like dogs*** and ***I have a dog***.

In [21]:
(map cog-name (test-ghost "I like dogs"))

(I have a cat)

In [22]:
(map cog-name (test-ghost "I have a dog"))

(I have a cat)

Whereas the pattern

```
u: (dogs) I have a cat
```
will only respond to ***I like dogs*** but not to ***I have a dog***.

For verbs, the canonical form is the infinitive. If your pattern is:

In [23]:
(ghost-parse "u: (what be *1) I don't know ^keep")

This will respond equally to ***What is it?***, ***What are you?*** and ***What am I?***

In [24]:
(map cog-name (test-ghost "What is it?"))

(I don't know)

In [25]:
(map cog-name (test-ghost "What are you"))

(I don't know)

In [26]:
(map cog-name (test-ghost "What am I"))

(I don't know)

In [27]:
(map cog-name (test-ghost "What am I and what are you"))

(I don't know)

* Possessive suffixes `'` and `'s` transform to the word `'s`.
* Adjectives and adverbs revert to their base form.
* Determiners (*a*, *an*, *the*, *some*, *these*, *those*, *that*) become *a*.
* Text numbers like *two thousand and twenty one* transcribe into digit format.
* Floating point numbers migrate to integers if they match the value exactly, while currency values become floating point.
* Personal pronouns like *me*, *my*, *myself*, *mine* move to the subject form *I*; *whom*, *whomever*, *whoever*, *whose* shift to *who*; *anyone*, *somebody*, *anybody* become *someone*; *whatever* becomes *what*, *whenever* becomes *when*, and *whichever* becomes *which*.
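
Some of the word mappings listed above can be pictured as a simple lookup table. The plain Guile sketch below is only an illustration of those listed mappings; it is not how GHOST computes canonical forms, and the names `canonical-forms` and `canonical` are hypothetical.

```scheme
;; Illustrative table built from the pronoun and determiner mappings
;; listed above. Keys are lowercase surface words.
(define canonical-forms
  '(("me" . "I") ("my" . "I") ("myself" . "I") ("mine" . "I")
    ("whom" . "who") ("whomever" . "who") ("whoever" . "who") ("whose" . "who")
    ("anyone" . "someone") ("somebody" . "someone") ("anybody" . "someone")
    ("whatever" . "what") ("whenever" . "when") ("whichever" . "which")
    ("an" . "a") ("the" . "a") ("some" . "a")
    ("these" . "a") ("those" . "a") ("that" . "a")))

;; Return the canonical form of a word, or the word itself when the
;; table has no entry for it.
(define (canonical word)
  (let ((entry (assoc (string-downcase word) canonical-forms)))
    (if entry (cdr entry) word)))

(display (map canonical '("Whoever" "took" "my" "book")))
(newline)
;; prints (who took I book)
```

Nouns, verbs, adjectives and numbers need real morphological analysis rather than a table, so they are left out of this sketch.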

## Phrase
When you want Ghost to treat a multiple-word phrase as a single entity, put the words inside double quotes. You should always put multiple-word proper names in double quotes, particularly ones with embedded punctuation, so that Ghost knows the entire phrase is a single entity. For example:

```
u: ( "Dr. Watson" )
u: ( "The Beatles" )
```

In [28]:
(ghost-parse "u: ( \" dr. watson \" ) How may I help You?")

In [29]:
(map cog-name (test-ghost "dr. watson"))

(How may I help You ?)

## Choice
You can match alternate words in the same position by placing those choices in brackets.

In [30]:
(ghost-parse "?: (you [swim ride fish ]) I do ^keep")

In [31]:
(map cog-name (test-ghost "do you swim?"))

(I do)

In [32]:
(map cog-name (test-ghost "do you fish?"))

(I do)

Notice that elements of a choice can be sequences of words, written either as double-quoted phrases or as paren sequences:

In [33]:
(ghost-parse "?: (you [eat ingest \"binge and purge\" (feed my face ) ] meat) I love meat")

In [34]:
(map cog-name (test-ghost "do you feed my face meat"))

(I love meat)

## Concept
Choices are handy for synonyms, but you have to repeat them over and over in different rules. At that point, being able to declare a list of choices in one place and use it everywhere else becomes convenient. This is the concept set. It is hugely important in writing patterns that match meaning.

Unlike choices, a concept cannot use paren notation to hold a sequence of words, though it can use quoted expressions.

A concept is a top-level declaration consisting of a name starting with ~ and consisting of only alpha-numeric characters and underscores. A concept has a list of words it defines. You can use the set name in any pattern or topic keyword list in place of a word.

In [35]:
(ghost-parse "concept: ~eat [eat ingest \"binge and purge\"]")

Once a concept is defined you can use it in your patterns.

In [36]:
(ghost-parse "s: (I ~eat meat) Do you really? I am a vegan.") 

In [37]:
(map cog-name (test-ghost "I ingest meat"))

(Do you really ? I am a vegan .)

See [this](https://github.com/bwilcox-1234/ChatScript/blob/master/WIKI/ChatScript-Basic-User-Manual.md#concepts) for more info.

## Optional
Sometimes you can expect that a word might or might not be supplied. Your pattern can reflect this, swallowing the word when it is present. `{}` is just like choice `[]`, except that the match is optional: it is allowed to fail.

In [38]:
(ghost-parse "u: (define {word concept} hate) Sorry. I don't know it. ^keep ")

In [39]:
(map cog-name (test-ghost "define word hate"))

(Sorry . I don't know it .)

In [40]:
(map cog-name (test-ghost "define concept hate"))

(Sorry . I don't know it .)

You can also use quoted phrases and parenthesis notation.

In [41]:
(ghost-parse "u: ( define { \"the word\" (the meaning of) } love ) Sorry. I don’t know it. ^keep")

In [42]:
(map cog-name (test-ghost "define the word love"))

(Sorry . I don’t know it .)

In [43]:
(map cog-name (test-ghost "define the meaning of love"))

(Sorry . I don’t know it .)

## Indefinite Wildcard
The wildcard * means 0 or more words in sequence. It can be used to widen a pattern:

In [44]:
(ghost-parse "u: (when * you * home) I go home tomorrow ^keep")

This pattern responds to ***When will you go home*** and ***When Roger is with you, will there be anyone at home?***

In [45]:
(map cog-name (test-ghost "When will you go home"))

(I go home tomorrow)

In [46]:
(map cog-name (test-ghost "When Roger is with you, will there be anyone at home?"))

(I go home tomorrow)

In [47]:
(map cog-name (test-ghost "When you home?"))

(I go home tomorrow)

## Precise Wildcard
As you may notice, indefinite wildcards can allow all sorts of mischief to creep into a match. An overprotective way to manage this is using wildcards that tell you exactly how many words can be swallowed up. The * followed by a number names how many words it absorbs.

In [48]:
(ghost-parse "u: (when *1 you *1 to school) I went to school yesterday ^keep")

This matches ***When did you go to school*** but won’t accept wide variances like ***When Roger is with you***, nor will it accept ***when you went to school***, which has no room for the first `*1`.

In [49]:
(map cog-name (test-ghost "When did you go to school?"))

(I went to school yesterday)

In [50]:
(map cog-name (test-ghost "When you go to school"))

()

## Range-restricted Wildcard

The usual way to manage the excesses of the previous wildcards is to use a range-restricted wildcard. This is a `*` followed by a `~` and a number, like `*~3`. It means from 0 up through that number, or approximately that number.

A common choice is `*~2`. This leaves room for some filler words (like a determiner and an adjective or perhaps some kind of adverb), without requiring them or letting the sentence stray.


In [51]:
(ghost-parse "u: (you *~2 go *~2 gym) I often go to that gym. ^keep")

This responds equally to ***You can go to gym*** and ***you should not go to your gym***.

In [52]:
(map cog-name (test-ghost "You can go to gym"))

(I often go to that gym .)

In [53]:
(map cog-name (test-ghost "You should not go to gym"))

(I often go to that gym .)
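
To make the difference between `*`, `*N` and `*~N` concrete, here is a toy matcher in plain Guile. It is a sketch written for this explanation, not GHOST's matcher: it compares lowercase words literally (no lemmas, concepts or variables), and the pattern encoding is an assumption made here: a string is a literal word, the symbol `any` stands for `*`, `(exact . N)` for `*N`, and `(upto . N)` for `*~N`. All function names are hypothetical.

```scheme
;; Split a sentence into lowercase words, dropping punctuation.
(define (tokenize sentence)
  (string-tokenize (string-downcase sentence) char-set:letter+digit))

;; Skip k words, for each k from lo to hi, and try to match the remaining
;; pattern after the skipped words.
(define (try-skips rest words lo hi)
  (let loop ((k lo))
    (and (<= k hi)
         (<= k (length words))
         (or (match-here rest (list-tail words k))
             (loop (+ k 1))))))

;; Match the pattern against the front of words; leftover words are allowed.
(define (match-here pattern words)
  (if (null? pattern)
      #t
      (let ((tok (car pattern))
            (rest (cdr pattern)))
        (cond
         ((string? tok)                       ; literal word
          (and (pair? words)
               (string=? tok (car words))
               (match-here rest (cdr words))))
         ((eq? tok 'any)                      ; *   : 0 or more words
          (try-skips rest words 0 (length words)))
         ((eq? (car tok) 'exact)              ; *N  : exactly N words
          (try-skips rest words (cdr tok) (cdr tok)))
         ((eq? (car tok) 'upto)               ; *~N : 0 up to N words
          (try-skips rest words 0 (cdr tok)))
         (else #f)))))

;; A pattern may begin at any word of the sentence.
(define (wildcard-match? pattern sentence)
  (let loop ((words (tokenize sentence)))
    (or (match-here pattern words)
        (and (pair? words) (loop (cdr words))))))

(define school '("when" (exact . 1) "you" (exact . 1) "to" "school"))
(define gym    '("you" (upto . 2) "go" (upto . 2) "gym"))

(wildcard-match? school "When did you go to school?")   ; => #t
(wildcard-match? school "When you go to school")        ; => #f
(wildcard-match? gym "You should not go to gym")        ; => #t
```

The two bounds passed to `try-skips` are the whole difference between the wildcard kinds: `*` may skip anything, `*N` pins both bounds to `N`, and `*~N` allows any count from 0 to `N`.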

## Match Variable
When you use wildcards and sets in a pattern, you can ask the system to memorize briefly the word it matches. Just place an underscore in front of what you want memorized.

The purpose of memorizing is to be able to use the value on output. The results of memorization are stored on match variables named _0, _1, etc, depending upon how many underscores you use in the pattern.

In [54]:
(ghost-parse "concept: ~meat [ham chicken beef]")

In [55]:
(ghost-parse "u: ( do you eat _~meat ) No, I hate _0. ^keep")

In [56]:
(map cog-name (test-ghost "do you eat ham"))

(No , I hate ham .)

If the input is ***do you eat ham***, the output would be ***No, I hate ham.*** Of course, the value of `_0` is only guaranteed for the execution of this rule. Match variables may be clobbered when you execute another rule. Or they may last for a while.

At most it will last for the duration of the current volley (several sentences maybe) after which it should be presumed trashed. Whenever you start a volley, you should presume match variables all hold unknown junk.

In [57]:
(ghost-parse "u: ( do you eat _[ ham eggs bacon] ) I eat '_0.")

In [58]:
(map cog-name (test-ghost "do you eat eggs"))

(I eat eggs .)

When the system memorizes your underscore match, it stores the original word, its canonical form, and its position in the text. On output, by default you get the canonical form. If you want the original form, you must precede your reference with an apostrophe. In the above rule, for example, if `'_0` were changed to `_0` the output would be `I eat egg`; the canonical form of `eggs` would be used.

For more than one `_` use `_0` and `_1` and so on.

In [59]:
(ghost-parse "u: ( do you like _* or _* ) I don’t like '_0 so I guess that means I prefer '_1.")

In [60]:
(map cog-name (test-ghost "do you like tea or coffee"))

(I don’t like tea so I guess that means I prefer coffee .)

If you memorize an optional area, `{test me}`, then you get either the word that matched or the match variable is set to null if it fails to match. A null variable prints nothing on output.

If you use match variables, they are allocated in the order of the pattern. E.g.,
```
s: ( _~fruit [_~animal _bear] _~like )
```
In the above, `_0` is a fruit, `_2` is a like, and the `_~animal` or `_bear` match is `_1`.

If you had NOT put `_` in front of `bear`, you would be at risk that the `~like` match may be `_1` or `_2`, depending on what happened inside `[]`. That's your headache if you use nested memorization.

See [here](https://github.com/bwilcox-1234/ChatScript/blob/master/WIKI/ChatScript-Basic-User-Manual.md#_-match-variables) for more.

## User Variable

If you need memory that lasts beyond the current input, one source of this is user variables. A variable is named with a starting dollar sign or two and then an alphabetic letter and then the rest must be alpha, digit, underscore, or hyphen. You initialize it using a C-style assignment in the output.

The = assignment operator MUST be separated from the variable and the value by at least one space, otherwise the system has no way to tell you don't want it to simply output some bizarre word.

Unlike match variables, user variables hold a single value only.

See [here](https://github.com/bwilcox-1234/ChatScript/blob/master/WIKI/ChatScript-Basic-User-Manual.md#user_variables) for more.

In [61]:
(ghost-parse "u: ( I eat _*1 ) $food = '_0 I eat oysters.")

In [62]:
(map cog-name (test-ghost "I eat bread"))

(I eat oysters .)

Once user variables are set, you can use them later in the output. In the following rule, `$food` also appears inside the pattern, which first checks that `$food` is set; otherwise the rule won't trigger.

In [63]:
(ghost-parse "u: (what do I eat $food) You eat $food ^keep")

In [64]:
(map cog-name (test-ghost "what do I eat"))

(You eat bread)

## Sentence Boundary
Sometimes, to get a proper meaning in the pattern, you need to actually know where an input begins or ends. For example:

```
u: (what is an elephant) An elephant is a pachyderm. 
```

matches ***Tell me what is an elephant*** and ***what is an elephant*** and ***what is an elephant doing in the room***. 

That last one is inappropriately matched.

The > matches the end of the sentence. This makes it possible to correctly manage the above sentences as follows:

In [65]:
(ghost-parse "u: (what is an elephant > ) An elephant is a pachyderm. ^keep ")

In [66]:
(map cog-name (test-ghost "Tell me what is an elephant"))

(An elephant is a pachyderm .)

In [67]:
(map cog-name (test-ghost "what is an elephant doing in the room"))

(I don't know)

The elephant rule does not match here; the reply comes from the earlier `u: (what be *1)` rule, which was created with `^keep`.

The < doesn’t really match the start of the sentence so much as it sets the current position of matching to the start of the sentence. Thus

In [68]:
(ghost-parse "u: ( roses < I like ) I like roses too. ^keep")

matches ***I like roses*** because it finds roses anywhere in the sentence, then the < resets the match position to the sentence start, and then it finds ***I like*** at the beginning. Of course this will not match ***You know I like roses*** because I is not at the start of the sentence.

In [69]:
(map cog-name (test-ghost "I like roses"))

(I like roses too .)

In [70]:
(map cog-name (test-ghost "You know I like roses"))

()

## Negation

The absence of words is represented using ! and means it must not be found anywhere after the current match location. When placed at the start of the pattern, it means not anywhere in the sentence at all.

In [71]:
(ghost-parse "u: ( ![ not never rarely ] I * eat meat ) You eat meat. ^keep")

In [72]:
(map cog-name (test-ghost "I never eat meat" ))

()

In [73]:
(map cog-name (test-ghost "I eat meat" ))

(You eat meat .)

## Unordered Matching

Often you are interested in matching several keywords, but you explicitly want any order of them. For example the sentence ***I love birds*** is a lot like ***Birds are what I love*** but subject and object move around. One somewhat tedious way to match in any order is:
```
u: ( I < * love < * birds ) I love birds too.
```

This works by going back to the beginning of the sentence and allowing any number of words to match a wildcard until the next keyword is found. It’s ugly. The cleaner way is to use the unordered markers.

In [74]:
(ghost-parse "u: ( << I birds love >> ) I love birds too. ^keep")

In [75]:
(map cog-name (test-ghost "I love birds"))

(I love birds too .)

In [76]:
(map cog-name (test-ghost "Birds are what I love"))

(I love birds too .)

Since the words can be matched in any order, this resets the scanning mechanism back to the original starting condition, which is always `< *`, meaning you can match the next item anywhere in the sentence.

Position is freely reset to the start following the << >> sequence so if you had the pattern:

```
u: ( I * like << really >> photos)
```
and the input ***photos I really like***, then it would match: it finds ***I * like***, then finds ***really*** anywhere, and then, with the position freely reset to the start, finds ***photos*** somewhere in the sentence.
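
The core idea of `<< >>`, that every listed keyword must occur somewhere in the sentence regardless of order, can be sketched in plain Guile. This is a toy model for illustration only, not GHOST's implementation; it compares lowercase words literally, and the function names are hypothetical.

```scheme
(use-modules (srfi srfi-1))   ; for `every`

;; Split a sentence into lowercase words, dropping punctuation.
(define (tokenize sentence)
  (string-tokenize (string-downcase sentence) char-set:letter+digit))

;; True when every keyword occurs somewhere in the sentence, in any order.
(define (unordered-match? keywords sentence)
  (let ((words (tokenize sentence)))
    (every (lambda (k) (and (member k words) #t)) keywords)))

(unordered-match? '("i" "birds" "love") "I love birds")           ; => #t
(unordered-match? '("i" "birds" "love") "Birds are what I love")  ; => #t
(unordered-match? '("i" "birds" "love") "I love cats")            ; => #f
```

Because membership is tested against the whole sentence for each keyword, no match position has to be tracked, which mirrors the position reset described above.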

## Rejoinder

Rejoinders are attempts to predict a user’s immediate response to something the chatbot says. They cannot be triggered except on input immediately after the rule they follow has issued output. The rejoinder hierarchy is set using the letters `a` through `q`.

```
u: (you have a cake) yes do you want some?
    a: (yes) here you go
    a: (no) your loss

```

## Function