# ADVISER Tutorial

In this tutorial, you will learn
1. how to install and run the ADVISER dialog system
2. how to build custom modules
3. how to combine modules into a functional dialog system

## Installation

ADVISER must be run in a Python 3 environment.

After downloading the code, navigate to the top level directory.
It should contain a _requirements.txt_ file that lists all packages you need to install in order to run ADVISER.
We recommend installing them inside a virtual environment, which you can set up by executing the following commands from the top level directory:
1. Make sure you have virtualenv installed by executing
```sh
python3 -m pip install --user virtualenv
```
2. Create the virtual environment (replace _envname_ with a name of your choice)
```sh
python3 -m venv envname
```
3. Source the environment (this has to be repeated every time you want to use ADVISER inside a new terminal session)
```sh
source envname/bin/activate
```
4. Install the required packages
```sh
pip install -r requirements.txt
```

5. To make sure your installation is working, navigate to the *adviser* folder:
```sh
cd adviser
```
and execute
```sh
python run_chat.py --domain restaurants
```

6. To end your conversation, type `bye`.

## Default modules

ADVISER comes with reference implementations for all typical dialog system components:


* **modules.nlu.nlu**: 
An implementation for a Natural Language Understanding Unit (NLU) based on regular expressions.
It takes a natural language utterance string as input and parses it into user actions.
User actions carry the following important information:
    * user action type: describes the generic action type (e.g. whether the user informs about something or requests something)
    * slot: the ontology slot name corresponding to the action (e.g. the user could inform about food type)
    * value: the value the user mentioned for this slot (e.g. italian food, or no value if they requested the food type instead of informing about it) 
  


* **modules.bst.bst**: 
Simple Belief State Tracker (BST) which keeps track of the values and requests given by the user and updates the discourse acts that the user actions correspond to.
Requests are reset after each turn. 
If the user actions provide confidence scores in (0,1), the probabilities for each slot will be normalized.


* **modules.policy.policy_handcrafted**:
A handcrafted policy implementation. 
Its purpose is to choose a fitting system action according to the current belief state.
Output is a list of system actions which carry information similar to the user actions:
    * system action type
    * slot
    * list of values for each slot (note that this differs from the user actions, because actions like `select` need to offer the user multiple values for one slot)


* **modules.nlg.nlg**:
Converts system actions into natural language to be more readable by the user.
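
The normalization step mentioned for the belief state tracker can be pictured with a small standalone sketch. It does not use ADVISER's `BeliefState` class; the function name `normalize_slot_scores` and the dictionary layout are assumptions made purely for illustration, and the real tracker may normalize differently.

```python
def normalize_slot_scores(scores: dict) -> dict:
    """Rescale the confidence scores of one slot so that they sum to 1."""
    total = sum(scores.values())
    if total == 0:
        return dict(scores)  # nothing to normalize
    return {value: score / total for value, score in scores.items()}


# hypothetical confidence scores for the slot 'food'
food_scores = {'italian': 0.6, 'indian': 0.2}
normalized = normalize_slot_scores(food_scores)
```

After the call, the scores for `italian` and `indian` keep their ratio of 3:1 but add up to 1.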

## Implementing your own modules

All dialog system components inherit from the base class **modules.module.Module**.
Most modules depend on the dialog domain (e.g. restaurants or hotels), which is assigned via the constructor.

For illustration purposes, consider this slightly shortened version:

In [None]:
from utils.logger import DiasysLogger

class Module(object):
    def __init__(self, domain, subgraph: dict = None, logger : DiasysLogger = DiasysLogger()):
        self.domain = domain
        self.is_training = False
        # ...

    def forward(self, dialog_graph, **kwargs):
        raise NotImplementedError

    def start_dialog(self, **kwargs):
        return {}

    def end_dialog(self, sim_goal):
        pass
    
    def train(self):
        self.is_training = True
        if self.subgraph is not None:
            for module_name in self.subgraph:
                self.subgraph[module_name].train()

    def eval(self):
        self.is_training = False
        if self.subgraph is not None:
            for module_name in self.subgraph:
                self.subgraph[module_name].eval()   

To implement your own module,
start by creating a new class which inherits from **modules.module.Module** and calling the constructor of the parent class (this is important!):

In [None]:
from modules.module import Module

class MyCustomNLU(Module):
    def __init__(self, domain, subgraph : dict = None):
        super(MyCustomNLU, self).__init__(domain, subgraph=subgraph)

Processing a dialog follows this procedure for every module in the dialog graph:

1. Before the user or system takes their first turn, *start_dialog* is called. This is the place to initialize variables needed for a new dialog.
2. *forward* will be called until the dialog is completed (either system or user ended the dialog, e.g. by saying 'bye', or the maximum number of turns was reached). Typically, one forward pass of the dialog system includes a system turn as well as the user turn.
3. After the dialog has finished, successfully or not, *end_dialog* is called. You can use this to reset variables between dialogs and/or do logging, calculate dialog-level metrics, etc.
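
The three steps above can be sketched with a standalone stand-in class. `TurnCounter` is hypothetical and deliberately does not inherit from `modules.module.Module`, so that the snippet runs without ADVISER installed; the loop at the bottom only imitates what the dialog system does.

```python
class TurnCounter:
    """Stand-in for a module; a real one would inherit from modules.module.Module."""

    def __init__(self):
        self.turns = 0
        self.history = []

    def start_dialog(self, **kwargs):
        self.turns = 0                    # initialize per-dialog state
        return {}

    def forward(self, dialog_graph, **kwargs):
        self.turns += 1                   # called once per turn
        return {'turn_count': self.turns}

    def end_dialog(self, sim_goal):
        self.history.append(self.turns)   # record a dialog-level metric


counter = TurnCounter()
for _ in range(2):                        # two dialogs
    counter.start_dialog()
    for _ in range(3):                    # three turns each
        result = counter.forward(dialog_graph=None)
    counter.end_dialog(sim_goal=None)
```

Because `start_dialog` resets the counter, each dialog is counted separately in `counter.history`.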

While overriding *start_dialog*/*end_dialog* is not required,
**overriding the _forward()_ method is necessary for any module**.
Make sure to include the `**kwargs` parameter and to give every input parameter a default value.

It is your responsibility to ensure that all required information for the forward functions in your dialog graph is available.
To see the required inputs, look at the code or documentation for each used module's forward function.
As an example, let's look at the handcrafted belief state tracker's forward function:

In [None]:
from typing import List
from utils.useract import UserAct
from utils.sysact import SysAct
from utils.beliefstate import BeliefState

# handcrafted belief state tracker forward function signature
def forward(self, 
            dialog_graph,                     # <-- required default parameter, access to the whole dialogsystem
            user_acts: List[UserAct] = None,  # <-- output from the NLU: a list of user actions
            beliefstate: BeliefState = None,  # <-- current beliefstate object 
            sys_act: List[SysAct] = None,     # <-- output from the policy: a list of system actions (e.g. from the last turn if you have the module order NLU -> BST -> POLICY)
            **kwargs
           ) -> dict(beliefstate=BeliefState): # <-- required output, has to be a dictionary! 
        
        # ...
        # process user and system actions, update belief state accordingly
        # ...
        
        return {'beliefstate': beliefstate}   # <-- return value encapsulated in dictionary

As is shown in the code snippet above, return values have to be dictionaries.
This is so that return values can be mapped to named function arguments of other forward functions of modules in the dialog graph.

Going into more detail, the dialog system manages a dictionary.
When calling *start_dialog*, initial values are written into this dictionary.
Each time a module's *forward* function is called, the dialog system checks whether values for the input names of this module are available in the dictionary. If so, they are passed to the function call; if not, the default values provided in the function definition are used.
The return values of this function will then be inserted into the dictionary (or updated, if already present).
After a dialog is completed, the dictionary will be emptied again.

To illustrate this, let's look at an example (feel free to check the code of the existing modules to get a better understanding of this mechanism; they are located in the `adviser/modules` directory and its subdirectories):

Imagine the dialog system's dictionary is empty in the beginning:
```python
{}
```
A user input module which reads user input from the terminal does not require any input. 
Inside its *forward* function, it reads user input from the terminal and returns it as 
```python
def forward(self, dialog_graph, **kwargs):
    user_input = input("User input > ")
    return {'user_utterance': user_input}
```
Suppose the user enters the phrase
*I am looking for an italian restaurant*.
The dialog system's dictionary now looks like this:
```python
{'user_utterance': 'I am looking for an italian restaurant'}
```
The NLU now processes a user utterance and tries to extract the user actions from it.
It will therefore require *user_utterance* as input and return the parsed user actions:
```python
def forward(self, dialog_graph, user_utterance : str = '', **kwargs):
    user_acts = []
    if 'italian' in user_utterance:
        user_acts.append(UserAct(act_type=UserActionType.Inform, slot='food', value='italian'))
    return {'user_acts': user_acts}
```
The dialog system's dictionary is now going to look like this:
```python
{
 'user_utterance': 'I am looking for an italian restaurant',
 'user_acts': [UserAct(act_type=UserActionType.Inform, slot='food', value='italian')]
}
```
The BST then consumes the *user_acts* from this dictionary, etc.
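
The mechanism walked through above can be approximated in a few lines. This is a minimal sketch, not ADVISER's actual `DialogSystem` code: the helper `call_forward`, the classes `FixedInput` and `KeywordNLU`, and the use of plain tuples instead of `UserAct` objects are all assumptions chosen so that the snippet runs on its own.

```python
import inspect


def call_forward(module, state: dict) -> None:
    """Call module.forward with the inputs found in `state`, then merge its outputs."""
    params = inspect.signature(module.forward).parameters
    # pass only those named arguments that are present in the shared dictionary;
    # all others fall back to the default values from the function definition
    available = {name: state[name] for name in params if name in state}
    state.update(module.forward(dialog_graph=None, **available))


class FixedInput:
    def forward(self, dialog_graph, **kwargs):
        return {'user_utterance': 'I am looking for an italian restaurant'}


class KeywordNLU:
    def forward(self, dialog_graph, user_utterance: str = '', **kwargs):
        acts = [('inform', 'food', 'italian')] if 'italian' in user_utterance else []
        return {'user_acts': acts}


state = {}
for module in (FixedInput(), KeywordNLU()):   # order matters: input before NLU
    call_forward(module, state)
```

If the two modules were swapped, `KeywordNLU` would run with its default `user_utterance = ''` and return an empty list, which is why default values and module order both matter.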

A final view of our custom NLU class:

In [None]:
from modules.module import Module
from utils.useract import UserAct, UserActionType

class MyCustomNLU(Module):
    def __init__(self, domain, subgraph : dict = None):
        super(MyCustomNLU, self).__init__(domain, subgraph=subgraph)
        
    def forward(self, dialog_graph, user_utterance : str = '', **kwargs):
        user_acts = []
        if 'italian' in user_utterance:
            user_acts.append(UserAct(act_type=UserActionType.Inform, slot='food', value='italian'))
        return {'user_acts': user_acts}



_train()_ and _eval()_ can be used to tell trainable modules how they should behave, either by overriding these methods or by querying the `is_training` member.
For example, if you want to train a policy, make sure to call _train()_ before simulating dialogs - but to evaluate final performance or chat with the system, you don't want the policy to change further, which is done by calling _eval()_.
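
As a rough illustration of querying the training flag, the following standalone sketch explores randomly only while training. `ToyPolicy`, its `choose_action` method and the action names are hypothetical and are not part of ADVISER.

```python
import random


class ToyPolicy:
    """Stand-in for a trainable module; only the is_training flag matters here."""

    def __init__(self, epsilon: float = 0.3):
        self.is_training = False
        self.epsilon = epsilon
        self.actions = ['request', 'inform', 'bye']

    def train(self):
        self.is_training = True

    def eval(self):
        self.is_training = False

    def choose_action(self, best_action: str) -> str:
        # explore only while training; act greedily during evaluation
        if self.is_training and random.random() < self.epsilon:
            return random.choice(self.actions)
        return best_action


policy = ToyPolicy()
policy.eval()
eval_choices = {policy.choose_action('inform') for _ in range(100)}
```

In evaluation mode the policy always returns the action it was given as best, so `eval_choices` contains a single element.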

Time to test our new custom NLU (it should output an inform action for italian food):

In [None]:
import os
from utils.domain.jsonlookupdomain import JSONLookupDomain

# initialize the ImsCourses domain and our custom NLU module
domain = JSONLookupDomain(name='ImsCourses')
mynlu = MyCustomNLU(domain=domain)

user_input = "i like italian food"                                   # test input
result = mynlu.forward(dialog_graph=None, user_utterance=user_input) # process input
print(result)

## Logging

If you want to record information from certain modules or whole dialogs,
you can make use of the built-in logger module.
It expands on python's logging module, but you still can use all of the python logger methods.

The different log levels (listed in increasing priority):
* DIALOGS - record whole dialogs
* RESULTS - record only results of each dialog (e.g. successful / not successful)
* INFO    - only log module information
* ERRORS  - only log errors
* NONE    - don't log anything

Each log level also logs all information from the log levels with higher priority, e.g. logging results will also log infos and errors.
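
Following the example given above (logging results also logs infos and errors), the level hierarchy can be pictured as a simple threshold. The enum `Level`, its numeric values and the function `is_logged` are assumptions for illustration; they are not ADVISER's `LogLevel` implementation.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical mirror of the levels listed above; not ADVISER's LogLevel."""
    DIALOGS = 0
    RESULTS = 1
    INFO = 2
    ERRORS = 3
    NONE = 4


def is_logged(configured: Level, message: Level) -> bool:
    """A message is written if its level is at least the configured level."""
    return configured != Level.NONE and message >= configured


# which message levels appear when the logger is configured with RESULTS?
shown = [lvl.name for lvl in Level
         if lvl != Level.NONE and is_logged(Level.RESULTS, lvl)]
```

With this threshold, a logger configured with `RESULTS` shows results, infos and errors, but no complete dialogs.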

You can also configure the output of the logger, either logging to 
1) a file or
2) the console.

Every subclass of `Module` has a default logging module of type `utils.logger.DiasysLogger` you can access via the `self.logger` property.
If you don't specify a customized logging instance when initializing a module, 
a standard logger will be used which only logs errors to the console.

The same logger instance may be shared across multiple modules or each module can have its own logger, with different log levels or file output paths.

E.g. a basic logger that will log results to the console (and will create no logfile):

In [None]:
from utils.logger import DiasysLogger, LogLevel

# configuration
logger = DiasysLogger(console_log_lvl=LogLevel.RESULTS)


# ... other imports

# ... your code

To integrate this logger into your new module, just pass the logger to the constructor and call the logging-related methods on the `self.logger` property.

In [None]:
import os
from modules.module import Module
from utils.useract import UserActionType, UserAct
from utils.logger import DiasysLogger, LogLevel
from utils.domain.jsonlookupdomain import JSONLookupDomain

domain = JSONLookupDomain(name='ImsCourses')

# initialize a logger that will print results of our custom NLU to the console
logger = DiasysLogger(console_log_lvl=LogLevel.RESULTS)

class MyCustomNLU(Module):
    def __init__(self, domain, subgraph : dict = None, logger : DiasysLogger = logger):
        # hand the logger to the parent class so that it is available as self.logger
        super(MyCustomNLU, self).__init__(domain, subgraph=subgraph, logger=logger)
        
    def forward(self, dialog_graph, user_utterance : str = '', **kwargs):
        user_acts = []
        if 'italian' in user_utterance:
            user_acts.append(UserAct(act_type=UserActionType.Inform, slot='food', value='italian'))
        
        # Logging
        self.logger.dialog_turn("This message is not going to be printed because of the specified log level")
        self.logger.result("NLU result: " + str(user_acts))
        
        return {'user_acts': user_acts}
    
mynlu = MyCustomNLU(domain=domain)
user_input = "i like italian food"                                   # test input
result = mynlu.forward(dialog_graph=None, user_utterance=user_input) # process input

If you want to log to a file, you can also configure the output path.
The next example sets up a logger which will log complete dialogs to a file and only print errors to the console.

In [None]:
from utils.logger import DiasysLogger, LogLevel

# Configuration
logger = DiasysLogger(console_log_lvl=LogLevel.ERRORS,
                      file_log_lvl=LogLevel.DIALOGS,   # <-- the log level for your log file output
                      logfile_folder='mylogs',         # <-- the folder where your log file should be created
                      logfile_basename='mynewlog')     # <-- the base name of your log file

## The dialog graph + Building & running a dialog system

Typically, dialog system modules are combined in a pipeline as displayed in the following figure (blue modules are included in ADVISER):

<img src="tutorialresources/pipeline.png" width="800" align="center"/>

However, ADVISER allows you to freely re-order, remove or insert modules as long as the required inputs for each module inside the pipeline are provided.
For example, if you want to build an end-to-end system, you may discard most of the modules.
Or instead of a linear structure, you could also create branching structures. 

In ADVISER, you create a dialog graph by providing a list with instances of the modules in the order you wish them to be executed (e.g. it makes sense to place an NLU *after* an input module).

To construct a dialog system, instantiate all components you want to include in your dialog graph and hand them over to the top level class **dialogsystem.DialogSystem**.

Example of a typical dialog system pipeline similar to the one displayed in the above figure (without ASR and TTS):

In [1]:
import os
from dialogsystem import DialogSystem
from modules.nlu import HandcraftedNLU
from modules.bst import HandcraftedBST
from modules.nlg import HandcraftedNLG
from modules.policy import DQNPolicy, HandcraftedPolicy
from modules.surfaces import ConsoleInput, ConsoleOutput
from utils.domain.jsonlookupdomain import JSONLookupDomain
from utils.logger import DiasysLogger, LogLevel

logger = DiasysLogger(console_log_lvl = LogLevel.DIALOGS)

# instantiate domain
domain = JSONLookupDomain(name='ImsCourses')

# instantiate components
nlu = HandcraftedNLU(domain=domain, logger=logger)
bst = HandcraftedBST(domain=domain, logger=logger)
policy = HandcraftedPolicy(domain=domain, logger=logger)
nlg_template = os.path.join('resources', 'templates', 'ImsCoursesMessages.nlg')
nlg = HandcraftedNLG(domain=domain, template_file=nlg_template, logger=logger)

# connect components to a dialog system
hdc_system = DialogSystem(nlu,
                         bst,
                         policy,
                         nlg,
                         logger=logger)

However, this system does not yet know how to handle user input or how to present its output.
We will cover this in the next section.

## Chatting with the dialog system

Currently, two different ways of handling Input and Output are available in the `modules/surfaces` directory:
* Console interface (**ConsoleInput / ConsoleOutput** classes)
* Graphical User Interface (**GuiInput / GuiOutput** classes)

If you want to chat with your dialog system,
make sure to add an instance of either the console or GUI classes for input handling before your NLU (the NLU requires user input) and one for output after your NLG (there is no human-readable text before NLG processing).

Extending the example dialog graph from above with console based in- and output looks like this:

In [2]:
# ... other imports
from modules.surfaces import ConsoleInput, ConsoleOutput
from utils.logger import DiasysLogger, LogLevel

# ... component instantiation as above

# input / output handling modules
user_input = ConsoleInput(logger=logger)
user_output = ConsoleOutput(logger=logger)

# connect components to a dialog system
hdc_system = DialogSystem(user_input,   # <-- user input before the NLU
                         nlu,
                         bst,
                         policy,
                         nlg,
                         user_output,  # <-- user output after the NLG
                         logger=logger)

In [3]:
# start one dialog with a human user 
hdc_system.run_dialog()

logger: # DIALOG 0 STARTED #
logger: # TURN 0 #


Please select your language: English or German
>>> german


logger: User Actions: []
logger: System Action: welcomemsg()
logger: # TURN 1 #


System: Willkommen zum IMS Lehrveranstaltungs Chat Bot. Wie kann ich dir weiterhelfen?
>>> Hi, ich hätte gerne einen Kurs von Franz Kafka.


logger: User Utterance: Hi, ich hätte gerne einen Kurs von Franz Kafka.
logger: User Actions: [UserAct("Hi, ich hätte gerne einen Kurs von Franz Kafka.", UserActionType.Hello, None, None, 1.0), UserAct("Hi, ich hätte gerne einen Kurs von Franz Kafka.", UserActionType.Inform, lecturer, franz kafka, 1.0)]
logger: System Action: inform_byname(name="['introduction to psycholinguistics']",lecturer="['franz kafka']")
logger: # TURN 2 #


System: Der Kurs introduction to psycholinguistics wird von Franz Kafka unterrichtet.
>>> In welchem Semester findet er statt?


logger: User Utterance: In welchem Semester findet er statt?
logger: User Actions: [UserAct("In welchem Semester findet er statt?", UserActionType.Request, turn, None, 1.0)]
logger: System Action: inform_byname(turn="['sose']",name="['introduction to psycholinguistics']")
logger: # TURN 3 #


System: Der Kurs introduction to psycholinguistics wird im Sommersemester angeboten.
>>> Danke und tschüss.


logger: User Utterance: Danke und tschüss.
logger: User Actions: [UserAct("Danke und tschüss.", UserActionType.Bye, None, None, 1.0), UserAct("Danke und tschüss.", UserActionType.Thanks, None, None, 1.0)]
logger: System Action: closingmsg()
logger: # DIALOG 0 FINISHED #


System: Danke schön, bis zum nächsten Mal.


## Creating a new domain

In order to create a new domain, you need a database with the following properties:
* format: SQLite 3
* only one table can be used
* each column is a slot
    * binary slots should only have the values true or false
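
A database with these properties can be created with Python's built-in `sqlite3` module. The sketch below is only an illustration: the table name `restaurants`, its columns and rows are invented, binary values are stored as the strings `'true'` / `'false'` as an assumption, and an in-memory database is used so that nothing is written to disk.

```python
import sqlite3

# in-memory database for demonstration; a real domain needs a .db file on disk
conn = sqlite3.connect(':memory:')
conn.execute(
    "CREATE TABLE restaurants ("
    "name TEXT PRIMARY KEY, "    # human-readable primary key
    "food TEXT, "
    "area TEXT, "
    "has_parking TEXT)"          # binary slot: 'true' or 'false'
)
conn.executemany(
    "INSERT INTO restaurants VALUES (?, ?, ?, ?)",
    [('luigis', 'italian', 'north', 'true'),
     ('taj', 'indian', 'centre', 'false')],
)
conn.commit()

# every column of the single table corresponds to one slot
columns = [row[1] for row in conn.execute("PRAGMA table_info(restaurants)")]
```

To produce a file for the ontology tool, you would connect to a path such as `YourDomainName.db` instead of `':memory:'`.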

1) Create the ontology for the new domain

With such a database, you can execute the domain creation tool by calling
```bash
python tools/ontology/create_ontology.py path/to/your/database/YourDomainName.db
```
from the top level directory.

As a first step, you have to choose the table you want to use (we will use the ImsCourses database located in resources/databases for demonstration purposes):
<img src="tutorialresources/ontcreator_select_db.png" width="500"/>

Afterwards, you are asked to name your new domain.
The following selection screens allow you to choose the slots for each slot type.
An active selection is indicated by a blue circle.

When asked for a primary key, you should select the column in your database that uniquely identifies each database entry (preferably in a human-readable way, not just an index). In the case of restaurants or hotels, this could be the name of the venue.

Possible slot types are
* Informable: information the user can inform the system about
* System requestable: information the system can actively ask the user for
* Requestable: information the user can request from the system

Other things to choose from (when in doubt, just select all of them) are 
* user discourse acts (e.g. acknowledge, hello, repeat, ...)
* user methods - they indicate whether the current user utterance queries the system by constraints or by primary key, whether the dialog goal is fulfilled, or whether the user asked for alternatives

A special value you can select is the `dontcare` value, which, when included, means that the user can just say they don't care about the corresponding slot and leave filling it up to the system (e.g. when searching for a restaurant and the system asks for the _area_, a user might say something like `I don't care` - the system will then omit the _area_ slot as a constraint in the database queries).
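
The effect of `dontcare` on database queries can be sketched as follows. The function `build_query` is hypothetical and not taken from ADVISER; it only shows the idea of dropping a slot from the constraints when the user does not care about it.

```python
def build_query(table: str, constraints: dict) -> tuple:
    """Build a parameterized SELECT that skips slots set to 'dontcare'."""
    active = {slot: value for slot, value in constraints.items()
              if value != 'dontcare'}
    sql = f"SELECT * FROM {table}"
    if active:
        sql += " WHERE " + " AND ".join(f"{slot} = ?" for slot in active)
    return sql, list(active.values())


# the user wants italian food and does not care about the area
sql, params = build_query('restaurants', {'food': 'italian', 'area': 'dontcare'})
```

Here the _area_ slot does not appear in the `WHERE` clause at all, so entries from every area remain candidates.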

Selection screens for slots / values look like this:
<img src="tutorialresources/ontcreator_selections.png" width="700"/>

The resulting .json file containing the ontology will look like this (excerpt):
<img src="tutorialresources/ontcreator_json.png" width="500"/>

If your database is not already located inside the folder `resources/databases`,
confirm the copy operation when asked by the ontology creation tool.

After the tool terminated successfully, 
check that the following two files were created inside the folder `resources/databases`:
* \[YourDomainName\].db
* \[YourDomainName\].json

2) Use the new domain

Create an instance of your new domain by providing the name of your domain and
paths to the newly created JSON ontology file and SQLite database.
These paths have to be relative to the adviser top level directory, e.g. `resources/databases`.

```python
from utils.domain.jsonlookupdomain import JSONLookupDomain

your_domain_instance = JSONLookupDomain(name='YourDomainName', 
                            json_ontology_file='resources/databases/YourDomainName.json', 
                            sqllite_db_file='resources/databases/YourDomainName.db')
                            
```

Use this instance to instantiate the modules that constitute your dialog graph.

## Regular Expressions (Regexes)

Regular expressions are used in the next section, so we briefly define them here and point you to some tutorials.

A _regular expression_ is a sequence of characters that defines a search pattern ([Reference](https://en.wikipedia.org/wiki/Regular_expression)).

There are several online tutorials for learning regular expressions with Python:

* Official Python website: https://docs.python.org/3/howto/regex.html
* https://pymotw.com/3/re/
* https://pythonprogramming.net/regular-expressions-regex-tutorial-python-3/

Here you can test your regular expressions and get a written explanation of what they do: https://regex101.com/
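
As a quick refresher, this is what a regular expression looks like with Python's `re` module. The pattern is made up for this example and is not one of ADVISER's regexes.

```python
import re

# a request phrase followed by a food type; the named group captures the value
pattern = re.compile(
    r"(?:i want|i am looking for) (?:an? )?(?P<food>italian|indian) (?:food|restaurant)",
    re.IGNORECASE,
)

match = pattern.search("I am looking for an italian restaurant")
food = match.group('food') if match else None
no_match = pattern.search("Where is the station?")
```

`search` returns a match object when the pattern occurs anywhere in the string and `None` otherwise, which is why the result is checked before accessing the group.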


## Rule-based Natural Language Understanding (NLU) through Regexes

The Regex Generator (adviser/tools/regex_generator/) is a set of Python files that helps create regular expressions (regexes) to detect the User Intent (also referred to as User Act) and saves them in JSON files (location: diasys/adviser/resources/regexes/).

Note: Act, action and intent are treated as synonyms and are used interchangeably.

We have two levels of Intents: General and Domain-dependent.


### General User Acts 

General User acts are domain-independent, i.e. they are applicable to any new domain. Their respective regular expressions aim at detecting the User Acts: _Hello_, _Bye_, _Affirm_, _Thanks_, _Repeat_, _Request Alternatives_.

The python script GeneralRegexes.py creates and saves the respective regular expressions. Any change to and execution of this file will affect all regex-based NLU modules.

Regexes for general user acts are stored in diasys/adviser/resources/regexes/GeneralRules.json

Example:

Python expression to create regular expressions for the User Act Affirm:

```python
# Affirm
self.rAFFIRM = "((yes|yeah|(\\b|^)ok\\b|(\\b|^)OK|okay|sure|^y$|(\\b|^)yep(\\b|$)" \
               "|(that('?s| is) )?(?<!not\ )(?<!no\ )(right|correct|confirm)))(\s)?$"
self.general_regex[UserActionType.Affirm.value] = self.rAFFIRM
```

Regular expression in JSON file:

```python
"affirm": "((yes|yeah|(\\b|^)ok\\b|(\\b|^)OK|okay|sure|^y$|(\\b|^)yep(\\b|$)|(that('?s| is) )?(?<!not\\ )(?<!no\\ )(right|correct|confirm)))(\\s)?$",

```

The resulting regex can detect these examples:
* yes
* correct
* right

But not 
* not right
* that's not correct
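
The examples above can be checked directly with Python's `re` module. The pattern string is copied from the Affirm expression shown earlier; the helper `is_affirm` and the use of `re.search` without flags are assumptions, since this tutorial does not show how ADVISER compiles and applies the pattern.

```python
import re

# Affirm pattern as shown above, written as raw strings
AFFIRM = (r"((yes|yeah|(\b|^)ok\b|(\b|^)OK|okay|sure|^y$|(\b|^)yep(\b|$)"
          r"|(that('?s| is) )?(?<!not\ )(?<!no\ )(right|correct|confirm)))(\s)?$")
affirm_re = re.compile(AFFIRM)


def is_affirm(utterance: str) -> bool:
    """Hypothetical helper: True if the Affirm pattern is found in the utterance."""
    return affirm_re.search(utterance) is not None


results = {u: is_affirm(u)
           for u in ["yes", "correct", "right", "not right", "that's not correct"]}
```

The two negative lookbehinds `(?<!not\ )` and `(?<!no\ )` are what reject "not right" and "that's not correct".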


### Domain-dependent User Acts

Some User Acts depend on a particular domain (e.g. ImsCourses, ImsImpModules), because they involve a slot or a slot-value pair that refers to information specific to this domain, e.g. Request(slot) and Inform(slot=value).

The python script should be named YourDomainNameRegexes.py, e.g. ImsCoursesRegexes.py.

The ontology resulting from the new domain creation is important at this point, because the script YourDomainNameRegexes.py uses it to iteratively build the Inform and Request regexes.

In YourDomainNameRegexes.py, you can add vocabulary linked to each slot in set_indomain_regexes() and, if necessary, synonyms for each value in set_synonyms().

Examples:

* Vocabulary for slot _lecturer_ from ImsCourses in set_indomain_regexes()

```python
self.slot_vocab["lecturer"] = "(lecturer|teacher|responsible|professor|examiner)"
```

With this, the slot _lecturer_ can also be referred to as _teacher, responsible, professor_ or _examiner_.

* Synonyms for slot _module_name_ and value 'computational linguistics team laboratory' from ImsCourses in set_synonyms()


```python
slot = "module_name"
self.slot_values[slot]['computational linguistics team laboratory'] += '|((CL\ ' \
                                                                               '|Computational\ Linguistics)?' \
                                                                               'Team\ Lab(oratory)?)'
```

With this, the value can be referred to as it appears in the database ('computational linguistics team laboratory'), but also by the specified synonyms, e.g. 'Team Lab', 'Team Laboratory' or 'CL Team Lab'.
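
To see which phrasings the synonym pattern accepts, it can be tried out in isolation. The sketch below joins the database value and the synonym expression with `|`, as in the snippet above; the helper `refers_to_value`, the `re.IGNORECASE` flag and the use of `fullmatch` are assumptions, because in ADVISER the value pattern is embedded into larger Inform regexes.

```python
import re

db_value = 'computational linguistics team laboratory'
synonyms = r'((CL\ |Computational\ Linguistics)?Team\ Lab(oratory)?)'
pattern = re.compile(db_value + '|' + synonyms, re.IGNORECASE)


def refers_to_value(text: str) -> bool:
    """Hypothetical helper: True if the whole text names the value or a synonym."""
    return pattern.fullmatch(text) is not None


checks = {text: refers_to_value(text)
          for text in ['Team Lab', 'CL Team Laboratory', db_value, 'CL']}
```

The prefix group is optional but the part `Team\ Lab(oratory)?` is not, so 'CL' on its own is not accepted.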


For our use case, i.e. ImsCourses, the script ImsCoursesRegexes.py creates domain-dependent regexes stored in:
* Inform Acts: diasys/adviser/resources/regexes/ImsCoursesInformRules.json
* Request Acts: diasys/adviser/resources/regexes/ImsCoursesRequestRules.json


Regex examples:

#### _Inform(lang=de)_ , i.e. language = German

```python
"lang": {
        "de": "(\\\\b|^|\\ )(what\\ about|want|have|need|looking\\ for|used\\ for)(\\ a(n)?)*\\ (\\bde\\b)|(german|deutsch)|(\\bde\\b)|(german|deutsch)(\\ ((would|seems\\ to)\\ be\\ (good|nice)($|[^\\?]$)|seems\\ (good|nice)($|[^\\?]$)))|(\\ |^)(\\bde\\b)|(german|deutsch)(\\ (please|and))*",
       }

```


The resulting regex can detect these examples:
* What about german please
* German seems to be nice



#### _Request(applied_nlp)_

```python
"applied_nlp": "(is\\ (it|(this|the)\\ (course))\\ (from|about|related \\ to)\\ ((applied\\ )?(nlp|natural language processing))(\\?)*)"
```

The resulting regex can detect these examples:
* Is this course related to nlp??
* is the course from applied natural language processing



#### NOTES:

* The Regex Generator works independently of the dialog system. The JSON files containing the regexes should therefore be created before any NLU implementation uses them.

* We recommend using ImsCoursesRegexes.py as a template for any other implementation, because it contains the methods to save the JSON files and to compile the regexes.

* To run the existing ImsCoursesRegexes.py (or any other YourDomainNameRegexes.py) from the terminal:

```
python ImsCoursesRegexes.py
```

## Rule-based and statistical Belief State Tracker (BST)

The BST maintains a representation of the current dialog state. Our rule-based implementation receives a list of user acts from the NLU that are decoded and stored with probabilities in the belief state. The BST also detects the presence of discourse acts, e.g. _Hello_, _Repeat_, _Inform_ and _Request_. Moreover, it stores information from the system history including the last requested slot and last entity offered.

Our machine learning based belief state tracker is trained to predict the belief state directly from text without the need for an NLU.
To track the constraints and requests issued by the user, we feed system actions and user input turn-wise into a recurrent network and concatenate the resulting hidden states of both inputs before predicting the final belief state.

These approaches can be found in 
* Rule-based: adviser/modules/bst/bst.py
* Statistical: adviser/modules/bst/ml/mlbst.py

## Rule-based Policy

The rule-based policy aims to provide users with a single entity matching the constraints they have specified. 
After each turn the policy verifies that the user has not ended the dialog. 
It then reads the current belief state and generates a suitable query for the database. 
If there are multiple results, the next system act will be to request more information from the user to disambiguate. 
Otherwise, the system is able to make an offer -- directly informing the user about a specific entity -- or to give more details about a current offer. 

The implementation is in `adviser/modules/policy/policy_handcrafted.py`
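
The decision logic described above can be condensed into a toy function. This is a simplified sketch under several assumptions: the database is a plain list of dictionaries instead of a real query, `choose_system_act` is a hypothetical name, and `inform_nomatch` is an invented act name for the case that nothing matches. It is not the code from `policy_handcrafted.py`.

```python
def choose_system_act(user_said_bye: bool, constraints: dict, database: list) -> tuple:
    """Toy decision logic: close, ask for more information, or offer an entity."""
    if user_said_bye:
        return ('closingmsg', None)
    matches = [entry for entry in database
               if all(entry.get(slot) == value for slot, value in constraints.items())]
    if len(matches) > 1:
        # several candidates left: ask about a slot the user has not constrained yet
        open_slots = [s for s in matches[0] if s != 'name' and s not in constraints]
        return ('request', open_slots[0] if open_slots else None)
    if len(matches) == 1:
        return ('inform_byname', matches[0]['name'])
    return ('inform_nomatch', None)   # hypothetical act name for "nothing found"


db = [{'name': 'luigis', 'food': 'italian', 'area': 'north'},
      {'name': 'roma', 'food': 'italian', 'area': 'south'}]

ambiguous = choose_system_act(False, {'food': 'italian'}, db)
unique = choose_system_act(False, {'food': 'italian', 'area': 'south'}, db)
goodbye = choose_system_act(True, {}, db)
```

With only the food type given, two entities match and the function asks for the area; once the area is known, a single entity remains and is offered.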


## User Simulator

To support automatic evaluation of policies and training of reinforcement learning policies, a user simulator can be used. 
The simulated user behaves like a real user following a goal and answering any system action with one or more user actions. 

The package `modules.simulator` currently provides the `HandcraftedUserSimulator` class, which implements the [agenda-based user simulator by Schatzmann et al.](http://mi.eng.cam.ac.uk/~sjy/papers/stwy07.pdf)
On initialization, it loads the configuration file `usermodel.cfg` from the same location where the package `modules.simulator.hdc` resides.
This allows you to change the behaviour of the user simulator; the parameters are described in the config file.

<div class="alert alert-block alert-info">
<b>Parameters:</b> The parameters will be used on initialization of the user simulator in the future superseding 
</div>


## Training a reinforcement learning policy

In order to train a reinforcement learning policy with a user simulator instead of chatting with it,
the dialog graph needs to change slightly.
User input and output modules need to be replaced by a user simulator module:

<img src="tutorialresources/policy_training.png" width="350" align="center"/>

For the sake of reproducibility, it is recommended to store the random seeds you used for training and evaluation along with the results and to initialize all random generators used in your pipeline with them.

If you call `utils.common.init_random(seed=12345)` before training, the python random module as well as numpy, pytorch and tensorflow modules will be initialized with this random seed.
If you run the system twice with the same seed, you should get exactly the same results and dialogs.
Random seeds themselves should also be chosen randomly.
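
The effect of seeding can be demonstrated with the standard library alone. `seed_everything` is a hypothetical, reduced stand-in for `utils.common.init_random` that only seeds Python's `random` module, so the snippet runs without numpy, pytorch or tensorflow.

```python
import random


def seed_everything(seed: int) -> None:
    """Hypothetical, reduced stand-in for utils.common.init_random."""
    random.seed(seed)
    # a full version would also seed numpy, pytorch and tensorflow here


def sample_run(seed: int) -> list:
    """Simulate a 'run' that draws a few random numbers after seeding."""
    seed_everything(seed)
    return [random.randint(0, 9) for _ in range(5)]


first, second = sample_run(12345), sample_run(12345)
```

Both runs produce the same sequence because the generator is reset to the same state before drawing.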

Example training and evaluation code:

In [None]:
import os
from dialogsystem import DialogSystem
from modules.bst import HandcraftedBST
from modules.policy import DQNPolicy
from modules.simulator import HandcraftedUserSimulator
from utils.domain.jsonlookupdomain import JSONLookupDomain
from modules.policy.rl.experience_buffer import NaivePrioritizedBuffer
from utils import common

random_seed = 12345

TRAIN_EPOCHS = 4       # number of training epochs
TRAIN_EPISODES = 1000  # training dialogs per epoch -> 4*1000=4000 training dialogs
EVAL_EPISODES = 500    # test dialogs per epoch -> 4*500=2000 evaluation dialogs
MAX_TURNS = -1         # no upper limit on maximum turns

common.init_random(seed=random_seed)

# initialize dialog system components
domain = JSONLookupDomain(name='ImsCourses')
bst = HandcraftedBST(domain=domain)
user_sim = HandcraftedUserSimulator(domain=domain)
policy = DQNPolicy(domain=domain, lr=0.0001, eps_start=0.3, gradient_clipping=0.0,
                   buffer_cls=NaivePrioritizedBuffer, replay_buffer_size=8192,
                   train_dialogs=TRAIN_EPISODES)

# create dialog graph
rlds = DialogSystem(policy,
                    user_sim,
                    bst)

# training loop
for i in range(TRAIN_EPOCHS):
    # begin next training epoch
    rlds.train()
    
    for episode in range(TRAIN_EPISODES):
        # train 1000 dialogs for this epoch
        rlds.run_dialog(max_length=MAX_TURNS)
    rlds.num_dialogs = 0  # reset: IMPORTANT for epsilon scheduler in dqnpolicy
    
    # begin evaluation 
    rlds.eval()
    for episode in range(EVAL_EPISODES):
        # evaluate 500 dialogs for this epoch
        rlds.run_dialog(max_length=MAX_TURNS)
    rlds.num_dialogs = 0 # reset: IMPORTANT for epsilon scheduler in dqnpolicy

policy.save()  # store trained policy

To create your own reinforcement learning policy, the easiest way is probably to derive from the `modules.policy.policy_rl.RLPolicy` class.

It constructs a set of action templates from base action names and slot names from the ontology,
e.g. if you have a slot `food` in your ontology, it will create actions like `inform#food`, `request#food` and `select#food` which are used to inform a user about a certain type of food, to request the food type from the user or to let the user choose between multiple food types.
These action templates are the initial output of the reinforcement learning policy;
they are then expanded using belief state and database information.
Suppose that a restaurant _two two_ was already selected and the user now asks the system what kind of food they serve.
The appropriate system action could be `inform#food`.
Augmenting this with the knowledge from our belief state, we can formulate a database query for the chosen restaurant and food slot.
The result will then be used to create a full action from the template action: `inform(name='two two', food='french')`.
Methods which perform this augmentation start with `_expand_` in their name, e.g. `_expand_inform()`.
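
The two steps described above, building action templates and expanding one of them, can be sketched as follows. This is a simplified illustration under stated assumptions: `build_action_templates`, `expand_inform`, the dictionary-based belief state and the in-memory `DATABASE` are hypothetical stand-ins, not the `RLPolicy` API.

```python
BASE_ACTIONS = ["inform", "request", "select"]

# Toy database with one entry per restaurant (hypothetical data layout).
DATABASE = [
    {"name": "two two", "food": "french"},
    {"name": "pizza hut", "food": "italian"},
]


def build_action_templates(slots):
    """Combine every base action name with every ontology slot name."""
    return [f"{action}#{slot}" for slot in slots for action in BASE_ACTIONS]


def expand_inform(template, beliefstate):
    """Turn a template such as 'inform#food' into a full system action."""
    action, slot = template.split("#")
    if action != "inform":
        raise ValueError(f"not an inform template: {template}")
    name = beliefstate["name"]  # restaurant selected earlier in the dialog
    entry = next(row for row in DATABASE if row["name"] == name)
    return {"type": "inform", "name": name, slot: entry[slot]}


templates = build_action_templates(["food"])
full_action = expand_inform("inform#food", {"name": "two two"})
```

Here `full_action` corresponds to `inform(name='two two', food='french')` from the example above.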

## Evaluation

In order to evaluate your final dialog system, you can add an instance of the evaluation module **modules.policy.evaluation.PolicyEvaluator** at the end of your dialog graph.
It will log average turn number, success rate and average reward.
Additionally, it can generate tensorboard output for live analysis and more details.

Augmenting the system from above to record statistics:

In [None]:
# ... other imports
from modules.policy.evaluation import PolicyEvaluator

# ... component instantiation as above

evaluator = PolicyEvaluator(domain=domain, 
                            use_tensorboard=True,  # <-- create nice live graphs
                            experiment_name='eval_my_awesome_system') # <-- used for naming output folders 

# connect components to a dialog system
rlds = DialogSystem(policy,
                    user_sim,
                    bst,
                    evaluator)

# training loop
for i in range(TRAIN_EPOCHS):
    # begin next training epoch
    rlds.train()
    evaluator.start_epoch()   # <-- notify evaluator that we start a new epoch
    
    for episode in range(TRAIN_EPISODES):
        # train 1000 dialogs for this epoch
        rlds.run_dialog(max_length=MAX_TURNS)
    rlds.num_dialogs = 0  # reset: IMPORTANT for epsilon scheduler in dqnpolicy
    evaluator.end_epoch()   # <-- notify evaluator that epoch ended
    
    # begin evaluation 
    rlds.eval()
    evaluator.start_epoch()   # <-- notify evaluator that we start a new epoch
    for episode in range(EVAL_EPISODES):
        # evaluate 500 dialogs for this epoch
        rlds.run_dialog(max_length=MAX_TURNS)
    rlds.num_dialogs = 0  # reset: IMPORTANT for epsilon scheduler in dqnpolicy
    evaluator.end_epoch()   # <-- notify evaluator that epoch ended

policy.save()  # store trained policy
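
To make the logged quantities concrete, here is a minimal sketch of how average turn number, success rate and average reward could be accumulated per epoch. The class `EpochStats` and the reward values are hypothetical and only illustrate the arithmetic; they do not reflect the internals of `PolicyEvaluator`.

```python
class EpochStats:
    """Hypothetical per-epoch accumulator (not ADVISER's PolicyEvaluator)."""

    def __init__(self):
        self.turns = []
        self.successes = []
        self.rewards = []

    def record_dialog(self, turns, success, reward):
        """Store the outcome of one finished dialog."""
        self.turns.append(turns)
        self.successes.append(bool(success))
        self.rewards.append(reward)

    def summary(self):
        """Average the recorded dialogs of this epoch."""
        num_dialogs = len(self.turns)
        if num_dialogs == 0:
            raise ValueError("no dialogs recorded in this epoch")
        return {
            "avg_turns": sum(self.turns) / num_dialogs,
            "success_rate": sum(self.successes) / num_dialogs,
            "avg_reward": sum(self.rewards) / num_dialogs,
        }


stats = EpochStats()
stats.record_dialog(turns=4, success=True, reward=16.0)   # illustrative values
stats.record_dialog(turns=6, success=False, reward=-6.0)  # illustrative values
epoch_summary = stats.summary()
```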

## Handcrafted Natural Language Generation (NLG) through Sentence Templates

The module `HandcraftedNLG` under `adviser/modules/nlg/` converts a system act into a human-readable sentence by means of templates. Each possible system act is mapped to at least one utterance.

To reduce the large number of mappings, templates are written so that a single template maps multiple system acts to their respective utterances at once. By specifying placeholders for a system act's slots and/or values, the utterance can be formulated independently of the actual realizations:

```
inform(name={X}, ects={Y}) → "The course {X} is worth {Y} ECTS."
```

During the dialog, the system iterates through the templates and chooses the first one for which the system act fits the template's signature.
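
The first-match selection can be sketched in a few lines of Python. The `TEMPLATES` table, the function `realize` and the rule that a signature fits when act type and slot names match exactly are assumptions made for this illustration; the actual matching logic of `HandcraftedNLG` may differ.

```python
# Hypothetical template table: (act type, slot names, utterance pattern).
TEMPLATES = [
    ("inform", ("name", "ects"), "The course {name} is worth {ects} ECTS."),
    ("inform", ("name",), "I found the course {name}."),
    ("request", ("lang",), "In which language shall the course be held?"),
]


def realize(act_type, slot_values):
    """Return the utterance of the first template whose signature fits."""
    for template_type, slots, pattern in TEMPLATES:
        # Assumption: a signature fits if type and slot names match exactly.
        if template_type == act_type and set(slots) == set(slot_values):
            return pattern.format(**slot_values)
    raise ValueError(f"no template for {act_type} with slots {sorted(slot_values)}")


ects_utterance = realize("inform", {"name": "Dialog Systems", "ects": 6})
request_utterance = realize("request", {"lang": None})
```

Because the placeholders `{name}` and `{ects}` are filled at runtime, one template covers every course and every ECTS value.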

The handcrafted templates are stored in text files under `adviser/resources/templates`. Each file is divided into several sections, listed here with examples:

* Templates for system general acts

```
template hello(): "Welcome to the IMS courses chat bot. How may I help you?"
```

* Templates for system requests

```
template request(lang): "In which language shall the course be held?"
```

* Methods for system informs

```
function info(slot, value)
	if slot = "ects": "is worth {value} ECTS"
	if slot = "turn"
		if value = "sose": "is offered in the summer semester"
		if value = "wise": "is offered in the winter semester"
```


* Templates for system informs

```
template inform_byname(name)
	"I found the course {name}. What do you want to know about it?"
template inform_byname(name, turn)
	"The course {name} {info("turn", turn)}."
```

* Similarly, templates for system confirm, request more and select are included.


Example:

If the system act is

```
inform(name='computational linguistics team laboratory', turn='wise')
```

The NLG output will be

```
The course computational linguistics team laboratory is offered in the winter semester.
```
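
The example can be reproduced with a small Python analogue of the `info` function and the `inform_byname` templates shown above. This only mirrors the template logic in plain Python for illustration; it is not how the template files are parsed, and treating the `inform` act as a call to `inform_byname` is an assumption.

```python
def info(slot, value):
    """Python analogue of the template function `info(slot, value)`."""
    if slot == "ects":
        return f"is worth {value} ECTS"
    if slot == "turn":
        if value == "sose":
            return "is offered in the summer semester"
        if value == "wise":
            return "is offered in the winter semester"
    raise ValueError(f"no info text for slot={slot!r}, value={value!r}")


def inform_byname(name, turn=None):
    """Python analogue of the two `inform_byname` templates."""
    if turn is None:
        return f"I found the course {name}. What do you want to know about it?"
    return f"The course {name} {info('turn', turn)}."


utterance = inform_byname(name="computational linguistics team laboratory",
                          turn="wise")
```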


Notes:

* Each domain has its own `YourDomainNameMessages.nlg` file
* No Python script needs to be modified (the NLG module is domain-independent)