*We strongly encourage you to read the [paper](https://rgdoi.net/10.13140/RG.2.2.17884.19846) before reading this tutorial.*

# Introduction

In this tutorial, we build a simple dialogue template in which the user asks, "\<agent\>, How many items does \<random_player\> have?" and the agent must answer with a single sentence: "\<player\> has \<#\> items" or "I don't know how many items \<player\> has." Because the environment is partially observable, there may be more than one correct answer. For example, a player may appear to have only two visible items while actually holding five. Both answers are acceptable in this case.

Let's first create an instance of the DialogueGenerator and an instance of the world where the dialogues will happen. We will use the "easy world" in our example, but you can choose the "hard world" if you prefer.

In [None]:
import os
from dialoguefactory import DialogueGenerator
import dialoguefactory.environments.easy as easy_env

home_directory = os.path.expanduser('~')
log_directory = os.path.join(home_directory, 'dialoguefactory_logs')

error_path = os.path.join(log_directory, 'error.log')
context_path = os.path.join(log_directory, 'context.log')
# Both log files live in the same directory, so one call is enough.
os.makedirs(log_directory, exist_ok=True)

world = easy_env.build_world()
generator = DialogueGenerator(world, error_path, context_path)

# Creating the participants' policies

The first step in creating a new template is to make the participants' policies. There are two types of policies: user-based and agent-based. For our dialogues, we use rule-based policies. A rule-based policy is a deterministic model where the developer programs the policy logic. On the other hand, a model-based approach involves using a mathematical model like a machine learning model. Building rule-based policies is recommended for precise responses, but you can use any method to develop your policies.


## The user policy

Implementing the user policies is straightforward because the user issues a single request sentence and later does not contribute to the dialogue.

Since the user's request is parametrized by the *user*, *agent*, and a *random_player*, creating a function that generates the request is useful. We use "request" and "query" interchangeably in our project. 

We use a Describer object to represent the meaning of the sentence. The Describer contains PropBank arguments. Each describer argument contains a value and a language component (Word, Phrase, or Sentence). The value carries the meaning, and the language component is the syntactic part. The two can differ in some cases. As the request above shows, the syntax can contain additional punctuation and auxiliary words like 'does' that do not contribute to the sentence's meaning. We create an instance of the Describer object using the function describers.have(). If you need a function that initializes a Describer object for a verb that is not available in the describers module, first annotate the sentence with PropBank arguments. To annotate the sentence automatically, we use the [following parser](https://verbnetparser.com/). Note that the automatic parser is not always accurate. Afterwards, you can write code similar to the [describers.have()](https://revivegretel.com/docs/_modules/dialoguefactory/language/describers.html#have) function.
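To make the split between value and language component concrete, here is a minimal sketch that does not depend on the library. `ToyArg` and `ToyDescriber` are hypothetical stand-ins, not the dialoguefactory classes, and the label `Arg-PPT` for the possession is an assumption based on PropBank conventions.

```python
from dataclasses import dataclass, field


@dataclass
class ToyArg:
    """Hypothetical stand-in for a describer argument."""
    value: object  # the meaning, e.g. an entity or a plain string
    part: str      # the surface form used in the sentence


@dataclass
class ToyDescriber:
    """Hypothetical stand-in that maps PropBank-style labels to arguments."""
    args: dict = field(default_factory=dict)


# Meaning of: "Gretel, How many items does Jim have?"
desc = ToyDescriber()
desc.args["Rel"] = ToyArg("have", "have")
desc.args["Arg-PAG"] = ToyArg("Jim", "Jim")                        # the owner
desc.args["Arg-PPT"] = ToyArg("how many items", "How many items")  # assumed label
desc.args["AM-DIS"] = ToyArg("Gretel", "Gretel")                   # the addressee

# The comma, the auxiliary "does" and "?" appear only in the syntax.
surface = "Gretel , How many items does Jim have ?".split()
arg_words = {word for arg in desc.args.values() for word in arg.part.split()}
syntax_only = [word for word in surface if word not in arg_words]
print(syntax_only)  # [',', 'does', '?']
```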

We also assign a describer mapper to the sentence so that a Describer object can be mapped to our query sentence. The describer mapping is useful to convert the semantics (Describer) to syntax (Sentence). The request mapper is used for converting a statement to a request.

The *tmp* parameter (the temporal argument, stored under AM-TMP) is used by the user's [AndPolicy](https://revivegretel.com/docs/dialoguefactory.policies.html#dialoguefactory.policies.user_policies.AndPolicy). An example of an *And* request is: John, go to the kitchen, then Andy, how many items does Hannah have?, then ...
By adding the *tmp* to our "HowMany" request, the "And" user policy can use our new request automatically. The *tmp* in the example above is "then."


In [None]:
import dialoguefactory.language.desc_mappers as dm
import dialoguefactory.language.helpers as he
import dialoguefactory.language.components as lc
import dialoguefactory.language.describers as tdescribers


@he.auto_fill([6], ["speaker"])
def query_have(tmp=(None, None), agent=(None, None), possession=(None, None), 
               owner=(None, None), neg=(None, None), rel=(None, None), speaker=None):
    """
    Creates a Sentence for the verb have in the following form:

        <agent>, <possession> does <owner> (not) have?

    An example is: Gretel, how many items does Jim have?
    """
    if lc.verb_inf(rel[0]) != "have":
        return None

    sent = lc.Sentence([tmp[1],
                        agent[1],
                        lc.Word(","),
                        possession[1],
                        lc.Word("does"),
                        owner[1],
                        neg[1],
                        rel[1],
                        lc.Word('?')],
                       speaker=speaker)
    desc = tdescribers.have(owner, rel, neg, possession)
    desc.args["AM-DIS"] = lc.Arg(agent[0], agent[1])
    if tmp[0] is not None:
        desc.args["AM-TMP"] = lc.Arg(tmp[0], tmp[1])

    sent.describers = [desc]
    sent.customizers["desc_mapping"] = lc.Customizer(dm.have, {})
    sent.customizers["request_mapping"] = lc.Customizer(he.returns_same, {"sentence": sent})

    
    return sent


Here, we show an example of our sentence. As you can see, we do not pass tuples to our query_have function. The decorator auto_fill automatically converts the argument value to a language component (Word, Phrase, Sentence).

In [None]:
example_query = query_have(
    agent=world.player,
    possession=["How", "many", "items"],
    owner=world.player2,
    rel="have",
    speaker=world.player)
print(example_query.to_string())
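The idea behind the automatic conversion can be sketched without the library. The decorator `toy_auto_fill` below is hypothetical and much simpler than the real `auto_fill` (which takes different arguments); it only illustrates how plain values could be turned into (value, component) pairs while some parameters, such as the speaker, are left alone. Strings stand in for the Word and Phrase components.

```python
import functools
import inspect


def toy_auto_fill(skip_names):
    """Hypothetical decorator: wrap plain values into (value, component) pairs."""
    def decorator(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = sig.bind_partial(*args, **kwargs)
            for name, value in list(bound.arguments.items()):
                if name in skip_names or isinstance(value, tuple):
                    continue  # leave skipped names and ready-made pairs alone
                if isinstance(value, list):
                    component = " ".join(str(v) for v in value)  # stands in for a Phrase
                else:
                    component = str(value)                       # stands in for a Word
                bound.arguments[name] = (value, component)
            return func(*bound.args, **bound.kwargs)
        return wrapper
    return decorator


@toy_auto_fill(skip_names=["speaker"])
def toy_query(agent=(None, None), possession=(None, None), speaker=None):
    return agent, possession, speaker


result = toy_query(agent="Gretel", possession=["How", "many", "items"], speaker="Max")
print(result)
```

The caller passes plain values, and the wrapped function still receives tuples, which is why the example call above needs no tuples.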

Once we have our query function built, we can proceed to create the user policy. The user policy must adhere to the "interface" from the [Policy](https://revivegretel.com/docs/dialoguefactory.policies.html#dialoguefactory.policies.base_policies.Policy) class. In this context, the user is represented by self.player.

The implementation of get_steps is straightforward because the user only provides one response at the beginning of the dialogue and does not respond further. We pass the user request to the *sentences.say* function, resulting in the final response: "\<user\> says: \<agent\>, How many items does \<random_player\> have?"

The players are described using properties and attributes. For instance, self.agent can be described by its name, 'Gretel', or by its size and type, 'the medium person.' When the request addresses the agent with the second kind of description, the definite article 'the' is omitted. We therefore generate the description elements using `agent.describe()` and then remove the article if it exists. This changes only the syntax; the argument value remains `self.agent`, an Entity. As a result, two sentences that use different descriptions of the same Entity still compare as identical.
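A minimal sketch of this idea, assuming that comparison looks only at the argument value and not at the wording. `ToyEntity` and `ToyAddress` are hypothetical and do not reproduce the library's Entity or Arg classes.

```python
class ToyEntity:
    """Hypothetical entity with several possible descriptions."""
    def __init__(self, name, size, kind):
        self.name, self.size, self.kind = name, size, kind

    def descriptions(self):
        return [[self.name], ["the", self.size, self.kind]]


def strip_article(elements):
    """Return a copy of the description without a leading 'the'."""
    elements = list(elements)
    if elements and elements[0] == "the":
        del elements[0]
    return elements


class ToyAddress:
    """Pairs the entity (value) with the words used to address it (syntax)."""
    def __init__(self, entity, elements):
        self.value = entity
        self.words = strip_article(elements)

    def __eq__(self, other):
        # Comparison looks at the value only, never at the wording.
        return isinstance(other, ToyAddress) and self.value is other.value

    def to_string(self):
        return " ".join(self.words)


gretel = ToyEntity("Gretel", "medium", "person")
by_name, by_props = (ToyAddress(gretel, d) for d in gretel.descriptions())
print(by_name.to_string())   # Gretel
print(by_props.to_string())  # medium person
print(by_name == by_props)   # True
```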

The policy is reset after the user request because we use one instance of UserPolicy during our dialogue generation. We change the policy's parameters before initializing each new dialogue (shown in the **Creating the template** section below). 

Each policy has a goal; in this case, the agent and the user share the same dialogue goal. We have implemented the get_goal function in the agent's policy for convenience. Hence, the user policy's get_goal function returns None.



When the user's AutoPolicy is used, get_steps returns None unless our new HowManyUserPolicy is active, that is, unless self.agent and self.owner have been set. We do not create multiple instances of the UserPolicy because, in the future, user policies may need to retain information as the simulation runs.

In [None]:
import copy

from dialoguefactory.policies.user_policies import UserPolicy
import dialoguefactory.language.sentences as tsentences

class HowManyUserPolicy(UserPolicy):
    def __init__(self, player, agent=None, owner=None, dialogue=None):
        super().__init__(player, dialogue)
        self.agent = agent
        self.owner = owner

    def get_steps(self, **params):
        sent = None
        if self.agent is None or self.owner is None:
            return None

        if self.dialogue is not None:
            player_prev_utters = self.dialogue.get_player_utters(self.player)
        else:
            player_prev_utters = []

        if len(player_prev_utters) < 1:
            tmp = params.get("tmp", None)
            self.agent.describe()
            agent_desc_elems = copy.copy(self.agent.description.elements)
            if agent_desc_elems[0] == "the":
                del agent_desc_elems[0]
            
            request_how_many = query_have(tmp=tmp,
                                          agent=(self.agent, self.agent.describe(agent_desc_elems)), 
                                          possession = ["How", "many", "items"],
                                          owner= self.owner,
                                          rel="have",
                                          speaker=self.player)

            sent = tsentences.say(self.player, None, 'says',
                                  request_how_many, speaker=self.player)
            self.reset()

        return sent
     
    def reset(self):
        self.agent = None
        self.owner = None

    def save_state(self):
        return self.agent, self.owner

    def recover_state(self, state):
        self.agent = state[0]
        self.owner = state[1]
        

We will test our new policy by having Max (a very large person) act as the user, Gretel (a medium-sized person) as the agent, and Andy (an orange bear) as the object owner. Let's observe what the request looks like. Please note that Max, Gretel, and Andy can also be described using other properties such as their nickname or surname.

In [None]:
example_policy = HowManyUserPolicy(world.inv, world.player, world.bear)
sentence = example_policy.execute()
print(sentence.to_string())


We store all user policies in the DialogueGenerator class under the [user_policy_database](https://revivegretel.com/docs/dialoguefactory.html#dialoguefactory.dialogue_generator.DialogueGenerator.user_policy_database) dictionary. We use a single instance of the user and agent policies as they can retain information as the simulation runs and the context expands. The dictionary maps each player in the world to a list of their user policies. Additionally, we attach the instance of the HowMany policy to each user's [AutoPolicy](https://revivegretel.com/docs/dialoguefactory.policies.html#dialoguefactory.policies.base_policies.AutoPolicy).

Currently, we do not use auto-policies for our dialogues to save time. Auto-policies are useful when the user request is issued in a specific context. For example, if user motivations like being hungry are implemented, the auto-policy will select the user policy that issues the following requests: "Go to the kitchen" or "Order food".


In [None]:
for user in world.players:
    user_pol = HowManyUserPolicy(user)
    generator.user_policy_database[user].append(user_pol)
    generator.user_auto_policy_database[user].list_policies.append(user_pol)
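The single-instance pattern can be sketched as follows. All names here (`ToyHowManyPolicy`, `toy_find_policy`, `policy_database`) are hypothetical, and the lookup by exact class is an assumption about how a stored policy might be found; the library's own lookup may work differently.

```python
from collections import defaultdict


class ToyUserPolicy:
    def __init__(self, player):
        self.player = player


class ToyHowManyPolicy(ToyUserPolicy):
    def __init__(self, player, agent=None, owner=None):
        super().__init__(player)
        self.agent, self.owner = agent, owner


class ToyGoToPolicy(ToyUserPolicy):
    pass


def toy_find_policy(policies, policy_class):
    """Return the first policy of exactly the given class, or None."""
    return next((p for p in policies if type(p) is policy_class), None)


players = ["Max", "Gretel", "Andy"]
policy_database = defaultdict(list)
for player in players:
    policy_database[player].append(ToyGoToPolicy(player))
    policy_database[player].append(ToyHowManyPolicy(player))

# Reuse the stored instance and only change its parameters per dialogue.
policy = toy_find_policy(policy_database["Max"], ToyHowManyPolicy)
policy.agent, policy.owner = "Gretel", "Andy"
same_policy = toy_find_policy(policy_database["Max"], ToyHowManyPolicy)
print(same_policy is policy, same_policy.owner)  # True Andy
```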
    


## The agent's policy

Let's create the agent's policy now, which we'll name HowManyAgentPolicy.

The agent's policy is responsible for returning the agent's response and goal based on the context. The BasePolicy contains an `execute()` function, which takes the first utterance of the dialogue (the user request in this case) and calls the `parse()` function. The parse function verifies that the user request is "HowMany." This verification is useful when the agent's `AutoPolicy` is used to find the appropriate policy in the context automatically.

The owner is extracted from the sentence, and the `task()` is called to determine the next agent's response and goal.
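The role of `parse()` in automatic policy selection can be illustrated with a small sketch. The classes are hypothetical and use dictionaries in place of Sentence objects; the assumption is that an auto-policy tries each policy in turn and keeps the first one that recognizes the request.

```python
class ToyPolicy:
    """Hypothetical base: parse() returns (steps, goal) or (None, None)."""
    def parse(self, request):
        raise NotImplementedError

    def execute(self, request):
        return self.parse(request)


class ToyHowManyAgentPolicy(ToyPolicy):
    def parse(self, request):
        if request.get("rel") != "have" or request.get("possession") != "how many items":
            return None, None  # not a "HowMany" request
        return [f"{request['owner']} has 2 items"], "goal: answer how many"


class ToyGoAgentPolicy(ToyPolicy):
    def parse(self, request):
        if request.get("rel") != "go":
            return None, None
        return [f"go to {request['location']}"], "goal: reach location"


class ToyAutoPolicy(ToyPolicy):
    """Tries each policy in turn and returns the first non-empty result."""
    def __init__(self, list_policies):
        self.list_policies = list_policies

    def parse(self, request):
        for policy in self.list_policies:
            steps, goal = policy.parse(request)
            if steps is not None:
                return steps, goal
        return None, None


auto = ToyAutoPolicy([ToyGoAgentPolicy(), ToyHowManyAgentPolicy()])
request = {"rel": "have", "possession": "how many items", "owner": "Andy"}
print(auto.execute(request))
```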

The logic for the policy is as follows:
1. Count the owner's objects whose location has been revealed. If any objects have been revealed, we add the sentence "\<owner\> has \<#visible\> items" to the list of valid responses.
2. If no objects have been revealed, it is uncertain whether the owner has any items. Therefore, we check whether the agent has knowledge of "\<owner\> has no items" in the knowledge base.
3. If the check is not True, then it means the agent does not know how many items the owner has.
4. Sometimes, the true number of objects that the owner has can differ from the number of objects revealed in the context. We allow the agent to guess the correct answer under these circumstances.
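
The four rules above can be sketched in plain Python. The function `toy_how_many` and its parameters are hypothetical; strings stand in for sentences, and a boolean stands in for the knowledge-base check.

```python
def toy_how_many(owner_items, revealed_items, knows_owner_has_none, owner="Andy"):
    """Return (valid_responses, acceptable_answers) for the four rules above.

    owner_items: everything the owner really holds (hidden from the agent).
    revealed_items: the items whose location was revealed in the context.
    knows_owner_has_none: result of the knowledge-base check in rule 2.
    """
    visible = len(set(owner_items) & set(revealed_items))
    responses = []
    if visible > 0:                                   # rule 1
        responses.append(f"{owner} has {visible} items")
    elif knows_owner_has_none:                        # rule 2
        responses.append(f"{owner} has no items")
    else:                                             # rule 3
        responses.append(f"I don't know how many items {owner} has")

    # Rule 4: the true count is also accepted, so a lucky guess is correct.
    truth = (f"{owner} has {len(owner_items)} items" if owner_items
             else f"{owner} has no items")
    acceptable = responses + ([truth] if truth not in responses else [])
    return responses, acceptable


responses, acceptable = toy_how_many(
    owner_items=["ball", "key", "book", "cup", "toy"],
    revealed_items=["ball", "key"],
    knows_owner_has_none=False,
)
print(responses)   # ['Andy has 2 items']
print(acceptable)  # ['Andy has 2 items', 'Andy has 5 items']
```

The first list is what a rule-following agent would say; the second is the set of answers that would count as correct.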




In some cases, the environment states that the owner has no items, and the updaters store that fact in the knowledge base. Later, our checkers consult the knowledge base to see whether the fact is available. You can find more information about updaters and checkers in the Appendix.

To help developers build agent policies more efficiently, we developed the KnowledgeBase. The KnowledgeBase stores the factual information that continuously comes from the context. It is useful for quickly and easily checking whether the information in a sentence is present in the context.

In [None]:
from dialoguefactory.policies.base_policies import BasePolicy
from dialoguefactory.environment import entities as em
from dialoguefactory.policies import goals as tgoals 

class HowManyAgentPolicy(BasePolicy):

    def parse(self, last_user_command):
        
        describer = last_user_command.describers[0]

        if describer.get_arg("Rel", _type=0).infinitive != "have":
            return None, None
 
        owner = describer.get_arg("Arg-PAG")
        
        how_many_query = query_have(describer.get_arg("AM-TMP"), self.player, ["How", "many", "items"], owner, None, "have")

        if last_user_command == how_many_query:
            if isinstance(owner, em.Entity):
                return self.task(owner)

        return None, None
        
    def task(self, owner):
        knowledge_base = self.dialogue.dia_generator.knowledge_base
        counter = 0
        for obj in owner.objects:
            if knowledge_base.check(tsentences.be([obj, "'s", "location"], "is", None, ["in", owner])):
                counter += 1

        steps, goal_steps = [], []
        if counter > 0:
            visible_objs_sent = tsentences.have(owner, 
                                   'has',
                                   None,
                                   [str(counter), 'items'])
            
            steps.append(visible_objs_sent)
            
        else:
            no_items_sent = tsentences.have(owner,
                                    'has',
                                    'not',
                                    'items')
            if knowledge_base.check(no_items_sent):
                steps.append(no_items_sent)
            else:
                statement = tsentences.have(owner,
                                            'has',
                                            None,
                                            ['how', 'many', 'items'], speaker=self.player)
                del statement.parts[-1]
                for element in reversed(statement.parts[-3:]):
                    statement.parts.insert(0, element)
                del statement.parts[-3:]
                
                do_not_know_sent = tsentences.know(self.player, "not", "know", statement, speaker=self.player)
                steps.append(do_not_know_sent)

        if len(owner.objects) > 0:
            # The true count is also accepted, even when not every item is visible.
            non_vis_objs_sent = tsentences.have(owner,
                                      'has',
                                      None,
                                      [str(len(owner.objects)), 'items'])
            
            if non_vis_objs_sent not in steps:
                goal_steps.append(non_vis_objs_sent)
        elif len(owner.objects) == 0:
            no_items_sent = tsentences.have(owner,
                                    'has',
                                    'not',
                                    'items')
            goal_steps.append(no_items_sent)

        def add_say(sentences):
            new_list = []
            for sent in sentences:
                new_list.append(tsentences.say(self.player, None, "says", sent, speaker=self.player))
            del sentences[:]
            sentences.extend(new_list)


        for listt in [steps, goal_steps]:
            add_say(listt)
                
        
        goal = tgoals.Goal(tgoals.multiple_correct,
                               self.dialogue,
                               self.player,
                               goal_steps+steps,
                               len(self.dialogue.get_utterances()) - 1
                               )
        return steps, goal


We add the `HowManyAgentPolicy` instance to the agent's policy databases in the same way we added the user policy.

In [None]:
for agent in world.players:
    agent_pol = HowManyAgentPolicy(agent)
    generator.agent_policy_database[agent].append(agent_pol)
    generator.agent_auto_policy_database[agent].list_policies.append(agent_pol)

# Creating the template

After we have developed the policies, creating the template is reduced to creating a function. The templates are functions that create and initialize an instance of the Dialogue class. They are considered primitive if they don't contain another template in the function parameters, and complex otherwise. You can check some examples of primitive and complex templates in the module [templates](https://revivegretel.com/docs/dialoguefactory.generation.html#module-dialoguefactory.generation.templates). Upon creating the template function, the template must be appended to generator.primitive_templates if the template is primitive or to generator.complex_templates otherwise.


In [None]:
from dialoguefactory.generation import helpers as gh
from dialoguefactory.generation.templates import init_dialogue

def howmany_template(dia_generator, user, agent, random_player, entities_descriptions=None):
    user_policy = gh.find_policy(dia_generator.user_policy_database[user],
                                      HowManyUserPolicy)
    user_policy.agent = agent
    user_policy.owner = random_player

    agent_policy = gh.find_policy(dia_generator.agent_policy_database[agent],
                                       HowManyAgentPolicy)
    dialogue = init_dialogue(dia_generator, user_policy, agent_policy,
                             entities_descriptions)
    
    return dialogue

The new template includes the following parameters: user, agent, and a random player (the owner of the items). The *entities_descriptions* is a dictionary that maps Entity to Description. This dictionary is used to specify the description of the entities that appear in the dialogue. For instance, you can describe the user, the agent, and the random player with their nicknames. Otherwise, their descriptions are randomly generated as the dialogue runs. 

Since our template is primitive, we're adding it to the list of primitive templates.

In [None]:
generator.primitive_templates.append(howmany_template)

The template is randomly selected during the dialogue generation process. The template parameters *user* and *agent* are randomly generated using the parameter generators. When generated, the parameters are temporarily stored in the Python dictionaries [generator.curr_prim_params](https://revivegretel.com/docs/dialoguefactory.html#dialoguefactory.dialogue_generator.DialogueGenerator.curr_prim_params) and [generator.curr_complex_params](https://revivegretel.com/docs/dialoguefactory.html#dialoguefactory.dialogue_generator.DialogueGenerator.curr_complex_params). This allows the *parameterA* generator to fetch the value of *parameterB* from the dictionaries if the generation of *parameterA* depends on the generation of *parameterB*. Once the template parameters are generated, the dictionaries *generator.curr_prim_params* and *generator.curr_complex_params* are cleared. 
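The dependency mechanism can be sketched with a shared dictionary. The generator functions and the rule that the agent must differ from the user are assumptions made for illustration; they are not taken from the library.

```python
import random


def gen_user(curr_params, players, rng):
    return rng.choice(players)


def gen_agent(curr_params, players, rng):
    # Depends on "user" (assumed rule): the agent must be a different player.
    return rng.choice([p for p in players if p != curr_params["user"]])


def gen_random_player(curr_params, players, rng):
    return rng.choice(players)


def generate_params(param_generators, players, rng):
    """Run the generators in order, sharing one temporary dictionary."""
    curr_params = {}
    for name, generator_fn in param_generators:
        curr_params[name] = generator_fn(curr_params, players, rng)
    params = dict(curr_params)
    curr_params.clear()  # the shared dictionary is emptied once generation ends
    return params


param_generators = [("user", gen_user), ("agent", gen_agent),
                    ("random_player", gen_random_player)]
params = generate_params(param_generators, ["Max", "Gretel", "Andy"], random.Random(0))
print(params)
```

Because `gen_agent` reads `curr_params["user"]`, the order of the generators matters: a parameter must be generated before any parameter that depends on it.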

Since there is no parameter generator for the *random_player* parameter, we create one. We reuse the parameter generator [random_world_list](https://revivegretel.com/docs/dialoguefactory.generation.html#dialoguefactory.generation.param_generators.random_world_list), and we pass the list "players" so that it randomly selects a player from the world.

All parameter generators are stored in the lists *generator.prim_param_generators* and *generator.complex_param_generators*, depending on whether they generate the parameters for the primitive or complex template. We add our *random_player* generator to the *prim_param_generators* since our template is primitive.

In [None]:
from functools import partial
from dialoguefactory.generation import param_generators as pg

generator.prim_param_generators.append(("random_player", partial(pg.random_world_list, generator.curr_prim_params,
                                                                  "players")))

Let's look at some dialogues generated from our template when mixed with other templates.

In [None]:
generator.run(100)
for utterance in generator.context:
    print(utterance.to_string())

# Submitting your templates

Please refer to the following [link](https://github.com/smartinovski/dialoguefactory#submitting-your-templates) to see how to submit your newly developed dialogue templates.

# Appendix

You may find the following sections helpful if your templates require additional components to be developed.

## The knowledge base

In order to assist developers in creating agent policies more efficiently, we have developed the [KnowledgeBase](https://revivegretel.com/docs/dialoguefactory.state.html#dialoguefactory.state.knowledge_base.KnowledgeBase). The KnowledgeBase stores all available factual information from the context and the meta context. It is helpful for quickly and easily determining whether the information conveyed in a sentence is explicitly or implicitly present in the context. The knowledge base includes a set of functions known as checkers, which search for various types of information in the context. During the checking process, each of the checkers is called, and the first one that responds provides the final result. Examples of checkers can be found in the [kn_checkers module](https://revivegretel.com/docs/_modules/dialoguefactory/state/kn_checkers.html).

Additionally, the knowledge base consists of a list of functions called updaters, which are responsible for updating the knowledge base with truthful sentences. Examples of updaters can be found in the [kn_updaters module](https://revivegretel.com/docs/_modules/dialoguefactory/state/kn_updaters.html).

To create a new updater or checker, create a function with parameters *kb_state* and *sent*, where *kb_state* is an instance of the [KnowledgeBase](https://revivegretel.com/docs/dialoguefactory.state.html#dialoguefactory.state.knowledge_base.KnowledgeBase) class and *sent* is an instance of the class [Sentence](https://revivegretel.com/docs/dialoguefactory.language.html#dialoguefactory.language.components.Sentence). Once you have created the function, append it to *knowledge_base.kn_updaters* if it is an updater or to *knowledge_base.kn_checkers* if it is a checker.
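
The following is a simplified sketch of how updaters and checkers could fit together. `ToyKnowledgeBase` is hypothetical; it keeps the *kb_state* and *sent* parameter names and the *kn_updaters* and *kn_checkers* lists from the text, but it represents sentences as tuples and assumes that a checker signals "not my kind of sentence" by returning None.

```python
class ToyKnowledgeBase:
    """Hypothetical, simplified knowledge base built from updaters and checkers."""
    def __init__(self):
        self.facts = set()
        self.kn_updaters = []
        self.kn_checkers = []

    def update(self, sent):
        for updater in self.kn_updaters:
            updater(self, sent)

    def check(self, sent):
        # The first checker that returns something other than None decides.
        for checker in self.kn_checkers:
            result = checker(self, sent)
            if result is not None:
                return result
        return None


def location_updater(kb_state, sent):
    """Store location facts, modelled here as (item, 'in', place) tuples."""
    if len(sent) == 3 and sent[1] == "in":
        kb_state.facts.add(sent)


def location_checker(kb_state, sent):
    """Answer only location questions; return None for anything else."""
    if len(sent) == 3 and sent[1] == "in":
        return sent in kb_state.facts
    return None


kb = ToyKnowledgeBase()
kb.kn_updaters.append(location_updater)
kb.kn_checkers.append(location_checker)

kb.update(("ball", "in", "Andy"))
print(kb.check(("ball", "in", "Andy")))    # True
print(kb.check(("key", "in", "Andy")))     # False
print(kb.check(("Andy", "is", "hungry")))  # None
```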

## Creating new environment actions

If you need to create a new action for your policy, this section is useful. Examples of environment actions include *drop*, *open*, and *say*. For instance, if you want to create a policy for unlocking and locking containers and doors, start by creating a function called "unlock." To kickstart the process, you can refer to the source code of the [environment.actions](https://revivegretel.com/docs/dialoguefactory.environment.html#module-dialoguefactory.environment.actions) module for action examples. Once you've built the action function, you'll need to create an environment policy. The environment policy parses the user request and translates it into an action. For example, if a user says, "Hannah, unlock the green door in the kitchen," the environment will translate the request to the action unlock(player, green_door, kitchen). You can find an example of an environment policy in the [env_policies](https://revivegretel.com/docs/dialoguefactory.policies.html#module-dialoguefactory.policies.env_policies) module.
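
As a rough illustration of the two pieces, the sketch below pairs a toy "unlock" action with a toy environment policy. Everything in it is hypothetical: `ToyDoor`, the dictionary-based request, and the returned strings only mimic the flow and do not follow the signatures used in the actions or env_policies modules.

```python
from dataclasses import dataclass


@dataclass
class ToyDoor:
    name: str
    location: str
    locked: bool = True


def unlock(player, door, location):
    """Hypothetical action: returns a log sentence describing the outcome."""
    if door.location != location:
        return f"{player} cannot find the {door.name} in the {location}"
    if not door.locked:
        return f"The {door.name} is already unlocked"
    door.locked = False
    return f"{player} unlocks the {door.name} in the {location}"


def toy_env_policy(request, doors):
    """Translate a parsed request into an action call, or None if it does not apply."""
    if request.get("rel") != "unlock":
        return None
    door = doors.get(request["object"])
    if door is None:
        return None
    return unlock(request["agent"], door, request["location"])


doors = {"green door": ToyDoor("green door", "kitchen")}
# Parsed form of: "Hannah, unlock the green door in the kitchen"
request = {"agent": "Hannah", "rel": "unlock",
           "object": "green door", "location": "kitchen"}
print(toy_env_policy(request, doors))  # Hannah unlocks the green door in the kitchen
print(toy_env_policy(request, doors))  # The green door is already unlocked
```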