# **Ontology-driven dialogue simulator for generating task-oriented dialogue datasets**

This notebook builds a Dialogue Simulator that generates conversations between a User Simulator and a TOD System, based on a provided ontology.

Our approach aims to bootstrap the development of a conversational agent by generating a dataset that contains an arbitrary number of dialogues for training any neural network of one’s choice. We build a Machine-to-Machine (M2M) system with three main components: a prompt generator, a user simulator, and a task-oriented dialogue system (TODS). Using semantic technologies, the domain knowledge is modeled as an ontology and the dialogue context is represented as a local knowledge graph, while pre-defined rules transform text templates into natural language responses. The final metrics obtained highlight the benefits of this framework.
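As a rough illustration of how rules can turn a text template into a natural language response, the sketch below fills placeholders in a template string. The template text, the placeholder names, and the `fill_template` helper are hypothetical and chosen only for this example; the notebook's own templates may look different.

```python
# Minimal sketch of template-based response generation.
# The template and placeholder names below are hypothetical.
def fill_template(template, values):
    """Replace each placeholder key found in `template` with its value."""
    text = template
    for placeholder, value in values.items():
        text = text.replace(placeholder, str(value))
    return text


template = "I found [COUNT] [ENTITY] with [PARAM] [VALUE]."
response = fill_template(
    template,
    {"[COUNT]": 2, "[ENTITY]": "projects", "[PARAM]": "status", "[VALUE]": "active"},
)
print(response)
```

Keeping the wording in the template and the values in the dialogue state means the same rule can serve any entity type defined in the ontology.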

More details about the work can be found in the attached paper, at https://github.com/IonutIga/Dialogue-Simulator/. The work is the result of a Master's thesis by V.I. Iga, coordinated by Prof. G.C. Silaghi, at FSEGA, UBB, Cluj-Napoca, Romania.

The TOD System alone can be found here --> https://github.com/IonutIga/TOD-System.

An NLU BERT-based model fine-tuned on datasets generated by the Dialogue Simulator can be found here --> https://github.com/IonutIga/Domain-Specific-NLU-BERT.

**In order for this notebook to run properly**, load the sample files from the repository, then the provided ontology (which is also the general KB where all valid discussed instances are inserted).

## Install and import libraries

In [None]:
!pip install rdflib
import rdflib as r
import datetime
import re
import random as rand
import torch
from itertools import permutations
from nltk.corpus import wordnet as wn
import inflect
import ast
import json
import time as ttime
from google.colab import files
from tqdm.auto import tqdm
from rdflib.namespace import CSVW, DC, DCAT, DCTERMS, DOAP, FOAF, ODRL2, ORG, OWL, \
                           PROF, PROV, RDF, RDFS, SDO, SH, SKOS, SOSA, SSN, TIME, \
                           VOID, XMLNS, XSD

## **Define Classes**

In [3]:
# class used to solve certain tasks which are not bot specific

class Utils:
    def __init__(self):
      self.inflection = inflect.engine()

    # function to escape the user input which contains special characters in order for the queries to work

    def escape_special_chars(self, string_to_escape):
      return repr(re.sub('\\\\','\\\\',re.escape(string_to_escape)))

    # function used to get the unique ID which is the concatenation of the current date and time

    def getID(self):
        now = datetime.datetime.now()
        id = now.strftime('%Y%m%d%H%M%S%f')
        return id

    # function to check for the params of a procedure. It returns a dictionary with the action and the mandatory/optional params

    def checkParams(self, params, state):
        p = {
              'act': '',
              'mandatory' : [],
              'optional' : []
          }
          # always check for params
        if params:
            for k, v in params.items():
                if k not in state.keys():
                    if v[0] == 1:
                        p['mandatory'].append(k)
                    else:
                        p['optional'].append(k)
            if len(p['mandatory']) > 0:
                p['act'] = 'requireParams'
            else:
                p['act'] = 'confirmParams'
        else:
            p['act'] = 'default'
        return p

    # function to tokenize the input based on a pattern

    def tokenize(self,pattern, user):
        # build a new list instead of removing items while iterating, which would skip consecutive empty tokens
        return [t for t in re.split(pattern, user) if t != '']

    # function to replace the placeholders with the real values

    def replace_placeholder(self, text, placeholder, values, additional_info = None):
        if additional_info is not None:
            text += additional_info
        for p, v in zip(placeholder, values):
          text = text.replace(p,v)
        return text

    # check if a word is in the ID format; 20 is the number of digits produced by the '%Y%m%d%H%M%S%f' format used in getID

    def isID(self, id, known_entities):
      date = ''
      for e in known_entities:
        if e in id:
          entity, date = re.split(f'({e})', id)[1:]
      if len(date) == 20:
        return True, entity
      else:
        return False, False

    # get the instance ID or the entity type that the user is referring to

    def getInstanceOrEntity(self, tokens, intent_keywords, known_entities):

      tok = [t.lower() for t in tokens]
      if len(tok) <= 1:
        return False, False

      # check if a word is in the intent keywords list
      for v in intent_keywords:
          if v in tok:
            isID, entity = self.isID(tok[tok.index(v) + 1].capitalize(), known_entities)
            if isID:
              return tok[1].capitalize(), entity
            elif len(tok) >= 3:
              en = tok[tok.index(v) + 2].capitalize()
              raw_en = tok[tok.index(v) + 2]
              en = self.inflection.singular_noun(en) if self.inflection.singular_noun(en) != False else en
              if en in known_entities:
                return raw_en, False
              else:
                return False, tokens[tok.index(v) + 2]
            else:
              return False, False

    # get the parameters' values from the user utterance; the findings are based on a fixed position in the utterance
    # the function covers the select, delete and update use cases. Update needs extra handling, therefore the intent is passed

    def getParamsValues(self, tokens, intent, params, slots, general_existence_words):
      ignore_param = 999999999999999
      i = 0
      isActive = ''
      active_values = {}
      if intent == 'update':
        slots['new_values'] = {}
        slots['old_values'] = {}

      if not params:
        return slots

      for t in tokens:
        # keywords for detecting the new values
        if t in ['changing','modifying'] and intent == 'update':
          slots['old_values']  = active_values
          active_values = {}
          isActive = 'new_values'
          # keywords for detecting old values enumeration; old values means the filtering values
        elif t in ['where', 'which','filter', 'filters'] and intent == 'update' :
          slots['new_values'] = active_values
          active_values = {}
          isActive = 'old_values'
        for k in params.keys():
          #print(t, ignore_param)
          tt = 'has' + t.capitalize()
          if tt == k:
            if i != ignore_param:
              v = tokens[i + 2]
              #enable the use of explicit unknown values for a parameter, such as "manager is someone with name John"
              if v.lower() in general_existence_words:

                #special case when a literal parameter is to be filtered, such as "code is something like 12"
                if tokens[i+3] in ['like','containing']:
                  val = ''
                  for d in tokens[i + 2:i + 2 + 3]:
                    val += d + ' '
                    if intent == 'update':
                      active_values[k] = val.strip()
                    else:
                      slots[k] = val.strip()
                else:
                  ignore_param = i + 2 + 2
                  val = ''
                  for d in tokens[i + 2:i + 2 + 4]:
                    val += d + ' '
                  if intent == 'update':
                    active_values[k] = val.strip()
                  else:
                    slots[k] = val.strip()
              else:
                if intent == 'update':
                  active_values[k] = v.strip()
                else:
                  slots[k] = v.strip()
        i += 1

      # put the detected values in the right position; if no old values were provided, then only new values are saved
      if active_values and intent == 'update':
        if isActive == 'old_values':
          slots['old_values'] = active_values
        elif isActive == 'new_values':
          slots['new_values'] = active_values
        else:
          slots['new_values'] = active_values

      return slots

    # function to print elements from a list of dictionaries, according to a certain template

    def printListOfDictionaries(self, dict_list, keys_to_avoid = []):
      results = '\n'
      for r in dict_list:
        response = ' {'
        for k, v in r.items():
          if k not in keys_to_avoid:
            response += f' {k}: {v}; '
        response += '}; '
        results += response + '\n'

      return results

    # function to print elements from a list, having available two templates.

    def printListOfLiterals(self, lit_list, format_params = False, lit_to_avoid = []):
      results = ''
      size = len(lit_list) - len(lit_to_avoid)
      counter = 0
      for r in lit_list:
        if r not in lit_to_avoid:
          counter += 1
          if format_params:
            info = f'{r[3:].lower()}, ' if counter < size else f'{r[3:].lower()}. '
            results += info
          else:
            info = f'{r}, ' if counter < size else f'{r}. '
            results += info
      return results

    # function to print a dictionary according to a certain template

    def printDictionary(self, dict_items, keys_to_avoid = [], extra_words = None):
      results = ''
      size = len(dict_items) - len(keys_to_avoid)
      counter = 0
      for k, v in dict_items.items():
        if k not in keys_to_avoid:
          counter += 1
          if k != 'ID':
            if extra_words:
              info = f'{k} {extra_words} {v}, ' if counter < size else f'{k} {extra_words} {v}'
              results += info
            else:
              info = f"{k[3:]} is {v}, " if counter < size else f"{k[3:]} is {v}; "
              results += info
          else:
            info = f'{k} is {v}, ' if counter < size else f'{k} is {v}; '
            results += info

      return results

    # create the dictionary which maps an instance

    def createDict(self,intent, text, slots, positions):
      return {"intent": intent,
              "text": text,
              "slots": slots,
              "positions": positions}

    # calculate the start and end index of a word in a phrase

    def entityDetails(self,phrase, entity):
      startIndex = phrase.find(entity)
      endIndex = startIndex + len(entity) - 1
      return startIndex, endIndex

    # read a random line from a file

    def read_random_line(self, file):
      # the context manager closes the file even if reading fails
      with open(file) as f:
        lines = f.read().splitlines()
      myline = rand.choice(lines)
      return myline
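
The slot check performed by `checkParams` can be read as a small dialogue-policy rule: ask for missing mandatory slots first, otherwise move on to confirmation. The standalone sketch below restates that rule outside the class. The function name `next_system_act` and the `Project` schema are hypothetical, and the `[min_cardinality, range_type]` layout is assumed from how `checkParams` reads `v[0]`.

```python
# Standalone restatement of the mandatory/optional slot check (a sketch,
# not the notebook's own method).
def next_system_act(params, state):
    """Decide the next act from a slot schema and the slots filled so far.

    `params` maps a slot name to [min_cardinality, range_type]; a minimum
    cardinality of 1 marks the slot as mandatory.
    """
    if not params:
        return {"act": "default", "mandatory": [], "optional": []}
    missing = [k for k in params if k not in state]
    mandatory = [k for k in missing if params[k][0] == 1]
    optional = [k for k in missing if params[k][0] != 1]
    act = "requireParams" if mandatory else "confirmParams"
    return {"act": act, "mandatory": mandatory, "optional": optional}


# Hypothetical schema for a "Project" concept.
schema = {
    "hasName": [1, "XMLSchema#string"],
    "hasCode": [1, "XMLSchema#string"],
    "hasManager": [0, "Employee"],
}

first_turn = next_system_act(schema, {"hasName": "Apollo"})
second_turn = next_system_act(schema, {"hasName": "Apollo", "hasCode": "A12"})
print(first_turn)   # a mandatory slot is still missing
print(second_turn)  # only an optional slot is missing
```

In this sketch the system keeps requesting values until every slot with minimum cardinality 1 is filled; optional slots are reported but never block confirmation.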


In [4]:
# class to query a graph and process the results

class Query:

    def __init__(self, namespace):

        self.namespace = namespace.__str__()
        self.utils = Utils()
        self.pattern = '/|#'

    # function to get the parameters of a specific concept from the ontology

    def params_query(self, graph, entity):

        params = {}
        params_query = f"""
                        PREFIX : <{self.namespace}>
                        PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
                        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
                        PREFIX owl: <http://www.w3.org/2002/07/owl#>
                            SELECT ?p ?y ?z WHERE {{
                                                    {{
                                                        ?p rdfs:domain :{entity};
                                                        rdfs:range ?x.
                                                        ?x owl:minQualifiedCardinality ?y.
                                                        OPTIONAL {{?x owl:onClass|owl:onDataRange ?z }}
                                                    }}
                                                    UNION
                                                    {{
                                                        ?p rdfs:domain owl:Thing;
                                                        rdfs:range ?x.
                                                        ?x owl:minQualifiedCardinality ?y.
                                                        OPTIONAL {{?x owl:onClass|owl:onDataRange ?z }}
                                                    }}
                                            }}"""

        qres = graph.query(params_query)
        for row in qres:
              params[self.utils.tokenize(self.pattern, row.p)[-1]] = [int(row.y), str(row.z.rsplit('/')[-1])]

        return params

    # function to check the existence of a specific instance

    def existence_query(self, graph, subj, pred = '?y', obj = '?z'):

        existence_query = f"""
                                PREFIX : <{self.namespace}>
                                PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
                                PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
                                PREFIX owl: <http://www.w3.org/2002/07/owl#>
                                    ASK {{:{subj} {pred} {obj}}}
                            """

        qres = graph.query(existence_query)
        return qres

    # function to check whether a process was canceled or confirmed

    def confirm_cancel_query(self, graph, entity):

        confirm_cancel_query = f"""
                                PREFIX : <{self.namespace}>
                                PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
                                PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
                                PREFIX owl: <http://www.w3.org/2002/07/owl#>
                                    ASK {{
                                        :System :confirm|:cancelProcedure ?x.
                                        ?x :entity :{entity}.
                                        }}

                            """

        qres = graph.query(confirm_cancel_query)
        return qres

    # function to retrieve the state of an instance

    def state_query(self, graph, entity):

        params = {}
        state_query = f"""
                                PREFIX : <{self.namespace}>
                                PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
                                PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
                                PREFIX owl: <http://www.w3.org/2002/07/owl#>
                                    SELECT * WHERE {{
                                        :{entity} ?x ?y.
                                        FILTER (?x != rdf:type)
                                        }}

                            """

        qres = graph.query(state_query)
        for row in qres:
          params[self.utils.tokenize(self.pattern, row.x)[-1]] = str(self.utils.tokenize(self.pattern, row.y)[-1])

        return params

    # function to check the state of an insert procedure

    def insert_procedure_state_query(self, graph, entity):

        params = {}
        params_requiring_list = {}
        insert_procedure_state_query = f"""
                                PREFIX : <{self.namespace}>
                                PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
                                PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
                                PREFIX owl: <http://www.w3.org/2002/07/owl#>
                                    SELECT ?procedure ?x ?y WHERE {{
                                        ?procedure a :Insert;
                                                  :instance :{entity};
                                                  ?x ?y.
                                        FILTER (?x != rdf:type)
                                        }}

                            """

        qres = graph.query(insert_procedure_state_query)

        for row in qres:

          params['ID'] = self.utils.tokenize(self.pattern, row.procedure)[-1]
          pred = self.utils.tokenize(self.pattern, row.x)[-1]
          if pred not in params_requiring_list.keys():
            params_requiring_list[pred] = [0]
          else:
            params_requiring_list[pred][0] += 1
          try:
            params_requiring_list[pred].append(ast.literal_eval(self.utils.tokenize(self.pattern, row.y)[-1]))
          except Exception:  # not a Python literal, keep the raw string
            params_requiring_list[pred].append(str(self.utils.tokenize(self.pattern, row.y)[-1]))

        for k, v in params_requiring_list.items():
          if v[0] == 0:
            params[k] = v[1]
          else:
            params[k] = v[1:]

        return params

    # function to check if an instance of an entity type with (optional) specific parameters exists. Returns all information about it

    def pre_existing_query(self, graph, entity, paramsq):

        params = {'ID' : ''}
        instances = []

        pre_existing_query = f"""
                                PREFIX : <{self.namespace}>
                                PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
                                PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
                                PREFIX owl: <http://www.w3.org/2002/07/owl#>
                                PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
                                    SELECT * WHERE {{
                                            ?x a :{entity};
                                                {paramsq}
                                                ?y ?z.
                            FILTER (?y NOT IN (owl:topObjectProperty, rdf:type))
                            }}

                            """

        qres = graph.query(pre_existing_query)
        if not qres.bindings:
          return instances
        for row in qres:
          isPreexisting = False
          if instances:
            for w in instances:
              if w['ID'] == self.utils.tokenize(self.pattern, row.x)[-1]:
                isPreexisting = True
                w[self.utils.tokenize(self.pattern, row.y)[-1]] = str(self.utils.tokenize(self.pattern, row.z)[-1])
          if isPreexisting == False:
            if self.utils.tokenize(self.pattern, row.x)[-1] != params['ID'] and params['ID'] != '':
              params = {}
            params['ID'] = self.utils.tokenize(self.pattern, row.x)[-1]
            params[self.utils.tokenize(self.pattern, row.y)[-1]] = str(self.utils.tokenize(self.pattern, row.z)[-1])
            instances.append(params)

        return instances

    # function to check if an instance of an entity type with (optional) specific parameters exists
    # the function enables checking for a specific relationship and only returns relationships of interest

    def pre_existing_param_query(self, graph, entity, param, prop = '?y'):

        params = {'ID' : ''}
        instances = []
        y = prop
        if y != '?y':
            y = ''
        param = self.utils.escape_special_chars(param)
        pre_existing_params_query = f"""
                                PREFIX : <{self.namespace}>
                                PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
                                PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
                                PREFIX owl: <http://www.w3.org/2002/07/owl#>
                                PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
                                    SELECT ?x {y} ?z WHERE {{
                                            ?x a :{entity};
                                             {prop} ?z.
                            {'FILTER (?y NOT IN (owl:topObjectProperty, rdf:type))' if y == '?y' else ''}
                            FILTER regex(STR(?z), {param})
                            }}

                            """
        qres = graph.query(pre_existing_params_query)
        if not qres.bindings:
          return instances

        for row in qres:
          isPreexisting = False
          if instances:
            for w in instances:
              if w['ID'] == self.utils.tokenize(self.pattern, row.x)[-1]:
                isPreexisting = True
                if y == '?y':
                  w[self.utils.tokenize(self.pattern, row.y)[-1]] = str(self.utils.tokenize(self.pattern, row.z)[-1])
                else:
                  w[prop[1:]] = str(self.utils.tokenize(self.pattern, row.z)[-1])
          if isPreexisting == False:
            if self.utils.tokenize(self.pattern, row.x)[-1] != params['ID'] and params['ID'] != '':
              params = {}
            params['ID'] = self.utils.tokenize(self.pattern, row.x)[-1]
            if y == '?y':
                params[self.utils.tokenize(self.pattern, row.y)[-1]] = str(self.utils.tokenize(self.pattern, row.z)[-1])
            else:
                params[prop[1:]] = str(self.utils.tokenize(self.pattern, row.z)[-1])
            instances.append(params)

        return instances

    # function to check if an instance depends (is in relationship with) on other instances

    def dependency_query(self, graph, param):

        params = {'ID' : ''}
        instances = []

        dependency_query = f"""
                                PREFIX : <{self.namespace}>
                                PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
                                PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
                                PREFIX owl: <http://www.w3.org/2002/07/owl#>
                                PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
                                   SELECT  ?x ?b ?c WHERE{{
                                            ?x ?b ?c.
                                          {{
                                             SELECT ?x WHERE {{
    	                                                        ?x ?y ?z.
                                                              FILTER regex(STR(?z), '{param}')
		                                                          FILTER (?y NOT IN (owl:topObjectProperty, rdf:type))
                                                              }}
                                          }}
                                 FILTER (?b NOT IN (owl:topObjectProperty, rdf:type))
                                        }}


                            """

        qres = graph.query(dependency_query)
        if not qres.bindings:
          return instances
        for row in qres:
            isPreexisting = False
            if instances:
              for w in instances:
                if w['ID'] == self.utils.tokenize(self.pattern, row.x)[-1]:
                  isPreexisting = True
                  w[self.utils.tokenize(self.pattern, row.b)[-1]] = str(self.utils.tokenize(self.pattern, row.c)[-1])
            if isPreexisting == False:
              if self.utils.tokenize(self.pattern, row.x)[-1] != params['ID'] and params['ID'] != '':
                params = {}
              params['ID'] = self.utils.tokenize(self.pattern, row.x)[-1]
              params[self.utils.tokenize(self.pattern, row.b)[-1]] = str(self.utils.tokenize(self.pattern, row.c)[-1])
              instances.append(params)

        return instances

    # function to select information about a specific instance

    def select_instance_query(self, graph, ID):

        params = {}

        select_instance_query = f"""
                                PREFIX : <{self.namespace}>
                                PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
                                PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
                                PREFIX owl: <http://www.w3.org/2002/07/owl#>
                                PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
                                    SELECT * WHERE {{
                                            :{ID} ?y ?z.
                            FILTER (?y NOT IN (owl:topObjectProperty))
                            }}

                            """

        qres = graph.query(select_instance_query)
        if not qres.bindings:
            return params
        params['ID'] = ID
        for row in qres:
            params[self.utils.tokenize(self.pattern, row.y)[-1]] = str(self.utils.tokenize(self.pattern, row.z)[-1])

        return params

    # function to retrieve information about a specific procedure

    def procedure_query(self, graph, ID, entity):

      params = {}
      local_params = {}
      instances = []
      params_requiring_list = {}

      procedure_query = f"""
                                PREFIX : <{self.namespace}>
                                PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
                                PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
                                PREFIX owl: <http://www.w3.org/2002/07/owl#>
                                PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
                                   SELECT  ?pred ?obj ?pred2 ?obj2 WHERE {{
                                      :{ID} a :{entity};
                                      ?pred ?obj.
                                      OPTIONAL {{?obj ?pred2 ?obj2}}
                                      FILTER (?pred NOT IN (owl:topObjectProperty))
                            }}

                            """

      qres = graph.query(procedure_query)
      if not qres.bindings:
          return params
      for row in qres:
        if type(row.obj) == r.BNode:
          isPreexisting = False
          if instances:
            for w in instances:
              if w['pred'] == self.utils.tokenize(self.pattern, row.pred)[-1]:
                isPreexisting = True
                w[self.utils.tokenize(self.pattern, row.pred2)[-1]] = str(self.utils.tokenize(self.pattern, row.obj2)[-1])
          if isPreexisting == False:
            local_params = {}
            local_params['pred'] = self.utils.tokenize(self.pattern, row.pred)[-1]
            if row.pred2 != None:
              local_params[self.utils.tokenize(self.pattern, row.pred2)[-1]] = str(self.utils.tokenize(self.pattern, row.obj2)[-1])
            instances.append(local_params)
        else:
          pred = self.utils.tokenize(self.pattern, row.pred)[-1]
          if pred not in params_requiring_list.keys():
            params_requiring_list[pred] = [0]
          else:
            params_requiring_list[pred][0] += 1
          try:
            params_requiring_list[pred].append(ast.literal_eval(self.utils.tokenize(self.pattern, row.obj)[-1]))
          except Exception:  # not a Python literal, keep the raw string
            params_requiring_list[pred].append(str(self.utils.tokenize(self.pattern, row.obj)[-1]))

      for k, v in params_requiring_list.items():
        if v[0] == 0:
          params[k] = v[1]
        else:
          params[k] = v[1:]

      for w in instances:
        pred = w['pred']
        w.pop('pred')
        params[pred] = w

      return params

    # function that retrieves information about a specific concept (Class)

    def class_query(self, graph, entity):
      params = {}
      class_query = f"""
                        PREFIX : <{self.namespace}>
                        PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
                        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
                        PREFIX owl: <http://www.w3.org/2002/07/owl#>
                            SELECT ?x ?y WHERE {{
                                                    :{entity} a owl:Class;
                                                            ?x ?y.
                                                 FILTER (?x NOT IN (rdf:type,rdfs:subClassOf))
                                            }}"""

      qres = graph.query(class_query)
      for row in qres:
        params[self.utils.tokenize(self.pattern, row.x)[-1]] = str(self.utils.tokenize(self.pattern, row.y)[-1])

      return params

    # function that returns the number of instances per concept class

    def nr_of_instances_per_class_query(self, graph, entity):
      params = {}
      nr_of_instances_per_class_query = f"""
                              PREFIX : <{self.namespace}>
                              PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
                              PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
                              PREFIX owl: <http://www.w3.org/2002/07/owl#>
                                  SELECT (COUNT(?x) as ?nre) WHERE {{
                                                          ?x a :{entity}.
                                                      FILTER (?x NOT IN (rdf:type,rdfs:subClassOf))
                                                  }}"""

      qres = graph.query(nr_of_instances_per_class_query)
      for row in qres:
        return row.nre
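
Several methods of `Query` follow the same pattern: each result row is one (subject, predicate, object) triple, IRIs are shortened to their last segment, and rows sharing a subject are merged into one dictionary with an `ID` key. The sketch below shows that pattern on plain tuples so it runs without a graph. The namespace, the sample rows, and the helper names are hypothetical, and it uses a dictionary keyed by ID instead of the list scan used in the class.

```python
import re


def local_name(iri):
    """Return the last non-empty segment of an IRI split on '/' or '#'."""
    return [part for part in re.split("/|#", iri) if part][-1]


def group_rows(rows):
    """Merge (subject, predicate, object) rows into one dict per subject."""
    instances = {}
    for subj, pred, obj in rows:
        instance_id = local_name(subj)
        record = instances.setdefault(instance_id, {"ID": instance_id})
        record[local_name(pred)] = local_name(obj)
    return list(instances.values())


# Hypothetical namespace and rows, standing in for SPARQL SELECT results.
NS = "http://example.org/onto#"
rows = [
    (NS + "Project20240101120000000000", NS + "hasName", "Apollo"),
    (NS + "Project20240101120000000000", NS + "hasCode", "A12"),
    (NS + "Project20240202120000000000", NS + "hasName", "Gemini"),
]
instances = group_rows(rows)
print(instances)
```

As in the class, object values are shortened too, so a literal that itself contains `/` or `#` would be cut to its last segment.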


In [5]:
# class which holds the structure of the prompt generator (for the dialogue between the user simulator and the chatbot)

class PromptGenerator:

  def __init__(self, graph, namespace, supported_entity, general_existence):
    self.prompt = '' # the prompt to be generated
    self.procedure = '' # the procedure to be generated
    self.supported_entity = supported_entity # the entity types supported by the ontology
    self.general_existence = general_existence # terms used to express general existence in natural language words
    self.params = {} # the parameters specific to an entity type
    self.inflection = inflect.engine() # engine to check whether a word is in singular or plural form
    self.graph = graph # the general knowledge graph
    self.query = Query(namespace) # query utility object
    self.utils = Utils() # utils utility object


  #for complex intents (which may have parameters):
    #do procedure a/an/all entity/entities [with param value param something like value param someone with anything value etc.] [where the same only for update]
  #for simple intents: say procedure

  def generate_prompt(self):

    # function to add parameters to a prompt

    def add_params(self, entity, procedure, select_slots = [], insert_instance_ID = None):

      response = ''
      # singular_noun returns False when the word is not recognized as a plural
      singular = self.inflection.singular_noun(entity)
      if singular != False:
        entity = singular
      self.params = self.query.params_query(self.graph, entity)
      select_slots = []

      if not select_slots:
        for k in self.params.keys():
          select_slots.append(k)

      if select_slots:
        nr_of_params_prob = rand.randint(1,len(select_slots))
        # draw one random ordering directly instead of materializing all n! permutations
        permutation = rand.sample(range(nr_of_params_prob), nr_of_params_prob)
      else:
        nr_of_params_prob = 0

      i = 1
      while i <= nr_of_params_prob:

        slot = select_slots[permutation[i-1]]
        type_of_param = str(self.params[slot][1])
        key = slot[3:].lower()

        if type_of_param.startswith('XMLSchema') == False:
          if procedure != 'insert':
            # instance
            type_of_instance = rand.choices([1,2,3,4],[0.48, 0.22, 0.22,0.08])[0]
            instance_params = self.query.params_query(self.graph, type_of_param)
            type_params = self.query.class_query(self.graph, type_of_param)
            general_existence_pool = [[2,5],[0,1]] if type_params['category'] == 'human' else [[0,1],[0,1]]
            parameter = list(instance_params.keys())[rand.randint(0,len(instance_params.keys())-1)][3:].lower()


            match type_of_instance:
              case 1:
                name = type_of_param + 'name' + '.txt'
                value = self.utils.read_random_line(name)
                parameter = 'name'
              case 2:
                name = type_of_param + parameter + '.txt'
                value = self.utils.read_random_line(name)
                value = f'{self.general_existence[rand.randint(general_existence_pool[0][0],general_existence_pool[0][1])]} with {parameter} {value}'
              case 3:
                name = type_of_param + parameter + '.txt'
                value = self.utils.read_random_line(name)
                value = f'{self.general_existence[rand.randint(general_existence_pool[0][0],general_existence_pool[0][1])]} with {self.general_existence[rand.randint(general_existence_pool[1][0],general_existence_pool[1][1])]} {value}'
              case 4:
                all_entity_ids = [instance['ID'] for instance in self.query.pre_existing_query(self.graph, type_of_param,'')]
                value = rand.choice(all_entity_ids) if all_entity_ids else type_of_param + self.utils.getID()
                parameter = 'ID'


            if i == nr_of_params_prob and i != 1:
              response += f'{key} {value}'
            else:
              response += f'{key} {value} '

            i += 1
          else:
            i += 1
        else:
          # literal
          type_of_literal = rand.choices([1,2],[0.7, 0.3])[0]
          name = entity + key + '.txt'

          match type_of_literal:
            case 1:
              value = self.utils.read_random_line(name)
            case 2:
              value = self.utils.read_random_line(name)
              value = value[0:rand.randint(1, len(value)-1)] if len(value) > 1 else value
              value = f'something like {value}'

          if i == nr_of_params_prob and i != 1:
            response += f'{key} {value}'
          else:
            response += f'{key} {value} '

          i += 1

      return response

    # function to generate a procedure

    def procedure_generator(self, procedure,main_cases_prob, with_params_prob):

      for i in range(0,rand.randint(1,2)):

        entity = rand.choice(self.supported_entity)
        choice_entity = entity.lower() if rand.choices([1,2],[0.9,0.1])[0] == 1 else self.utils.read_random_line('wrongentities.txt').lower()
        procedure_prob = rand.choices([1,2,3], main_cases_prob)[0]

        match procedure_prob:
          case 1 | 2:
            #procedure a entity [with] [where only for update] |  #procedure all entities [with] [where only for update]
            self.prompt += f'do {procedure} a {choice_entity.lower()}' if procedure_prob == 1 else f'do {procedure} all {self.inflection.plural(choice_entity.lower())}'
            self.prompt += f' with {add_params(self, entity, procedure)}' if rand.choices([1,2],with_params_prob)[0] == 1 else ''
            self.prompt = self.prompt.strip()
            if procedure == 'update':
               self.prompt += f' where {add_params(self, entity,"update")}' if rand.randint(0,1) == 0 else ''
          case 3:
            #select ID
            all_entity_ids = [instance['ID'] for instance in self.query.pre_existing_query(self.graph, entity, '')]
            value = rand.choice(all_entity_ids) if all_entity_ids else entity + self.utils.getID()
            self.prompt += f'do {procedure} {value if rand.choices([1,2],[0.9,0.1])[0] == 1 else entity + self.utils.getID()}' if rand.choices([1,2],[0.9,0.1])[0] == 1 else f'do {procedure} {value} with {add_params(self, entity, procedure)}'

        self.prompt = self.prompt.strip()
        # drop a dangling 'with'/'where' left behind when no parameters were generated
        while self.prompt.endswith((' with',' where')):
          self.prompt = self.prompt.rsplit(' ', 1)[0].strip()

        self.prompt += '.'

    self.prompt = 'say hello.'
    procedure_prob = rand.choices([1,2,3,4],[0.3,0.3,0.3,0.1])[0]

    match procedure_prob:

      case 1:
        self.procedure = 'select'
        procedure_generator(self, 'select', [0.5, 0.4, 0.1], [0.7,0.3])
      case 2:
        self.procedure = 'insert'
        # insert a entity [with]
        entity = rand.choice(self.supported_entity)
        self.prompt += f'do insert a {entity.lower()}'
        self.prompt += f' with {add_params(self, entity,"insert")}' if rand.choices([0,1],[0.7,0.3])[0] == 0 else ''
        self.prompt = self.prompt.strip()
        # drop a dangling 'with'/'where' left behind when no parameters were generated
        while self.prompt.endswith((' with',' where')):
          self.prompt = self.prompt.rsplit(' ', 1)[0].strip()
        self.prompt += '.'
      case 3:
        self.procedure = 'update'
        procedure_generator(self, 'update', [0.8, 0.1, 0.1], [0.9,0.1])
      case 4:
        self.procedure = 'delete'
        procedure_generator(self, 'delete', [0.88, 0.02, 0.1], [0.9,0.1])

    self.prompt += 'say goodbye.'
    self.reset_params()
    return self.prompt, self.procedure

  def reset_params(self):
    self.params = {}
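
The scenario prompts built by `generate_prompt` follow a dot-separated layout: a greeting, one or more `do <procedure> ...` steps, and a farewell. The standalone sketch below illustrates that layout and how a trailing `with`/`where` can be dropped when no parameters were generated. The helper names `strip_dangling_keywords` and `build_prompt` are hypothetical and are not part of the notebook; this is a simplified illustration, not the generator itself.

```python
def strip_dangling_keywords(prompt, keywords=("with", "where")):
    """Remove trailing connective keywords that have no parameters after them."""
    tokens = prompt.strip().split(" ")
    while tokens and tokens[-1] in keywords:
        tokens.pop()
    return " ".join(tokens)


def build_prompt(procedure, entity, params=""):
    """Assemble a prompt in the 'say hello.do ... .say goodbye.' layout (hypothetical helper)."""
    body = strip_dangling_keywords(f"do {procedure} a {entity} with {params}")
    return f"say hello.{body}.say goodbye."


with_params = build_prompt("select", "project", "name Alpha")
without_params = build_prompt("insert", "employee")
print(with_params)
print(without_params)
```

Splitting such a prompt on `.` yields the individual steps, which is how `chat_pipeline` later recovers the procedure name for its statistics file.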


In [6]:
# class which holds the structure of a TOD system

class TODsystem:

    def __init__(self, intents, templates, graph, namespace, user_templates):
      self.intents = intents # a dictionary of supported user intents and keywords to identify them
      self.templates = templates # a dictionary of phrase templates to be used by the NLG module for each system action
      self.slots = {} # current user turn slots
      self.state = {} # current state of the active instance discussed by the two interlocutors
      self.action = {
                'act' : ''
            } # dictionary to be used by the POL module to decide what action the TOD system should make
      self.general_existence = ['something','anything', 'someone', 'somebody','anybody','anyone'] # list of words to detect general reference
      self.turn = 0  # turn index
      self.instance = False # flag to check if the current instance has all the mandatory params
      self.supported_entity = ['Project', 'Employee', 'Status'] # entities which are part of the TOD system ontology; should be dynamic
      self.entity = '' # the type of the instance which is currently discussed
      self.active_entity = '' # the type of the instance which is to be discussed; e.g. the user wants to insert a new type
      self.response = '' # the response of the TOD system
      self.last_intent = ''
      self.detected_intent = 'default' # the current user intent
      self.params = {} # the params of the current entity
      self.user = '' # user input
      self.pattern = ' |,' # pattern to tokenize the user input
      self.insert_ID = '' # ID for the active instance being under insert intent
      self.ID = {'insert' : '',
                       'update' : '',
                       'delete' : '',
                       'select' : ''} # ID of the current instance being discussed for each procedure
      self.predecesor = [] # list of all the instances mentioned in the discussion which were not canceled or confirmed
      self.graph = graph # the general knowledge graph (contains the ontology)
      self.local_graph = r.Graph() # the conversation graph
      self.namespace = namespace # the general namespace of the instances in the graphs
      self.query = Query(self.namespace) # the Query helper
      self.utils = Utils() # the Utils helper
      self.active_procedure = {'insert' : False,
                       'update' : False,
                       'delete' : False,
                       'select' : False} # a dictionary which saves the current active procedure
      self.procedure_state = {'insert' : {},
                       'update' : {},
                       'delete' : {},
                       'select' : {}} # a dictionary which saves the current state of the active procedure
      self.inflection = inflect.engine() # used to check the form of an entity specified by the user, in order to act accordingly
      self.new_procedure = False # a flag used to verify if the user said specific keywords of a procedure which triggers the creation of a new thread
      self.f = None # object to hold the file where the generated dialogue (whole) is written
      self.user_simulator = User(user_templates, 'say hello.say goodbye.', self.graph, self.namespace, self.local_graph)
      self.prompt_generator = PromptGenerator(self.graph,self.namespace,self.supported_entity, self.general_existence)
      self.annotated_dialogue = {} # the annotated dialogue
      self.annotated_dialogues = {} # all annotated dialogues
      self.index = 0 # index of a user utterance; runs from the first to the last scenario
      self.capped_intents = {'agree': 0, 'thank': 0, 'hello': 0, 'goodbye': 0} # dictionary to cap the simple intents utterances

    # function which defines the general pipeline of the TOD system

    def chat_pipeline(self, scenarios_number, nr_of_test = 1):

      self.annotated_dialogues = {}
      statistics_graph = open(f'statistics_graph_{nr_of_test}.txt','w')
      statistics = open(f'statistics_{nr_of_test}.txt','w')
      out_file = open(f"train_{nr_of_test}.json", "w")
      j = 1

      for i in tqdm(range(1, int(scenarios_number)+1)):

        scenario_prompt, scenario_name = self.prompt_generator.generate_prompt()
        #print(scenario_prompt)
        self.user_simulator.set_prompt(scenario_prompt)
        #self.f = open(f"{scenario_name + self.utils.getID()}.txt", "a")
        #self.f.write(f"scenario prompt: {scenario_prompt} \n")
        #now = datetime.datetime.now()
        #time = now.strftime('%Y-%m-%d-%H-%M')
        #self.f.write(f'time of creation: {time} \n')

        if i == 1:

          now = datetime.datetime.now()
          time = now.strftime('%Y-%m-%d-%H-%M-%S-%f')
          statistics_graph.write(f'start time: {time}')

          se_text = ''
          for se in self.supported_entity:
            se_text += f'\t{se}'

          statistics_graph.write(f'\nNr_of_scenarios{se_text}')
          statistics.write('Nr_of_scenarios\tprompt\tturns')


        #if i == ((int(scenarios_number) / 1) * j) or i == 1:

        se_nr_text = ''
        for se in self.supported_entity:
          nr_of_instances_per_class = self.query.nr_of_instances_per_class_query(self.graph, se)
          se_nr_text += f'{nr_of_instances_per_class}\t'

        statistics_graph.write(f'\n{i}\t{se_nr_text}')

          #if i != 1:
           # j += 1


        while self.detected_intent != 'goodbye':

          #print(f'turn: {self.turn}; active_entity: {self.insert_ID}')
          ac_prod = 'insert'
          for k, v in self.active_procedure.items():
            if v is True:
              ac_prod = k

          self.annotated_dialogue = {}
          self.user = self.user_simulator.chat(self.action, self.procedure_state[ac_prod], self.insert_ID, self.entity, self.predecesor).strip()
          #self.user = input('User: ')
          #print(f'user: {self.user}')
          #self.f.write(f'\nUser: {self.user} \n')
          self.NLU()
          self.DST()
          self.POL()
          self.NLG()

          self.turn += 1

        intents = scenario_prompt.split('.')
        nr_of_intents = len(intents)-3
        statistics.write(f'\n{i}\t{intents[1].split(" ")[1]}{nr_of_intents}\t{self.turn}')

        #self.f.close()
        self.reset_params()

      json.dump(self.annotated_dialogues, out_file, indent = 4)
      now = datetime.datetime.now()
      time = now.strftime('%Y-%m-%d-%H-%M-%S-%f')
      statistics_graph.write(f'\nfinish time: {time}')
      # close the files first so buffered content is flushed before downloading
      statistics_graph.close()
      statistics.close()
      out_file.close()
      files.download(f'statistics_{nr_of_test}.txt')
      files.download(f'statistics_graph_{nr_of_test}.txt')
      files.download(f'train_{nr_of_test}.json')

    # function which maps the NLU module; it detects the user intent, slots and updates the state of the current active instance

    def NLU(self):

        def set_active_procedure(self):
          for k, v in self.active_procedure.items():
            self.active_procedure[k] = False
            if k == self.detected_intent:
              self.active_procedure[self.detected_intent] = True

        def insert_annotated_data(self, key, value, isKey = False):
          self.annotated_dialogue['slots'][key] = value
          user = f'({self.last_intent}) {self.user}'
          self.annotated_dialogue['text'] = user
          SIndex, EIndex = self.utils.entityDetails(user, value[3:].lower() if isKey else value)
          self.annotated_dialogue['positions'][key] = [SIndex, EIndex]



        self.action = {'act' : ''}
        default = True
        self.slots = {}
        tokens = self.utils.tokenize(self.pattern, self.user)

        for k, v in self.intents.items():
            for vv in v:
                if re.search(fr'\b{vv}\b', self.user.lower()):
                    default = False
                    self.last_intent = self.detected_intent
                    self.detected_intent = k
                    self.new_procedure = True

        if default:
            self.last_intent = self.detected_intent
            self.detected_intent = 'default'
            if any('has' + t.capitalize() in self.params.keys() for t in tokens):
              for k, v in self.active_procedure.items():
                if v == True:
                  self.detected_intent = k

        user = f'({self.last_intent}) {self.user}'
        self.annotated_dialogue['text'] = user
        self.annotated_dialogue['slots'] = {}
        self.annotated_dialogue['positions'] = {}

        if self.detected_intent == 'remove':

          slots = {}

          for k, v in self.active_procedure.items():
            if v is True:
              if k != 'insert':
                if k != 'update':
                  for pk, pv in self.procedure_state[k].items():
                    if pk in self.params.keys():
                      slots[pk] = pv
                else:
                  if any(t in ['filters'] for t in tokens):
                    slots = self.procedure_state[k]['old_values']
                    self.slots['update_values'] = 'old_values'
                  else:
                    slots = self.procedure_state[k]['new_values']
                    self.slots['update_values'] = 'new_values'
              else:
                slots = self.state

          self.slots['removedParams'] = []

          i = 0
          for t in tokens:
            for k in slots.keys():
                tt = t.capitalize()
                if tt in k:
                  insert_annotated_data(self, f'remove_param_{i}', k, isKey = True)
                  i += 1
                  self.slots['removedParams'].append(k)

          for k, v in slots.items():
            v = self.utils.escape_special_chars(v)[1:-1]
            if re.search(fr'\b{v}\b', self.user):
              insert_annotated_data(self, f'remove_param_{i}', v)
              i += 1
              self.slots['removedParams'].append(k)

        if self.detected_intent == 'switchEntity':
            if len(self.predecesor) > 0 :
                tok = [t.lower() for t in tokens]
                for v in self.intents['switchEntity']:
                    if v in tok:
                        self.slots['activeEntity'] = tok[tok.index(v) + 2].capitalize()
                # fall back to the default intent when no target instance could be extracted
                if self.slots.get('activeEntity', self.insert_ID) == self.insert_ID:

                  self.detected_intent = 'default'
            else:

                self.detected_intent = 'default'

        if self.detected_intent == 'insert':

          set_active_procedure(self)

          tok = [t.lower() for t in tokens]
          if tok[1].lower() in ['the','it'] and any('has' + t.capitalize() in self.params.keys() for t in tokens) == False:

            insert_annotated_data(self, 'entity', tokens[-1])

            if tok[1].lower() == 'the' and self.entity != tok[-1].capitalize():

              self.detected_intent = 'wrongInsert'
            else:

              self.detected_intent = 'wrongInsert'
              if self.instance == True:

                self.detected_intent = 'agree'

        if self.detected_intent in ['select','delete', 'update']:

          continue_flag = [False, False]
          for phrase in ['continue', 'execute']:
            if phrase == tokens[0].lower():
              continue_flag[0] = True

              insert_annotated_data(self, 'procedure', tokens[-1])

              for k, v in self.active_procedure.items():
                if v is True and tokens[-1].lower() == k:
                  continue_flag[1] = True
              if continue_flag[1] is False:

                self.detected_intent = 'default'



          if continue_flag[0] is False:

            if self.new_procedure:
              keywords = self.intents[self.detected_intent]
              subject, typeOf = self.utils.getInstanceOrEntity(tokens, keywords, self.supported_entity)
              if subject:
                if typeOf is False:

                  insert_annotated_data(self, 'entity', subject)

                  subject = subject.capitalize()
                  self.slots['entity'] = subject
                  en = self.inflection.singular_noun(subject) if self.inflection.singular_noun(subject) != False else subject
                  self.params = self.query.params_query(self.graph, en)
                else:
                  self.slots['instance'] = subject
                  self.slots['instance_type'] = typeOf
                  self.params = self.query.params_query(self.graph, typeOf)

              elif typeOf:
                self.params = {}
                insert_annotated_data(self, 'entity', typeOf)

            self.slots = self.utils.getParamsValues(tokens, self.detected_intent, self.params, self.slots, self.general_existence)

            procedure_state = self.procedure_state[self.detected_intent]
            ids = []

            for key in ['update_ids', 'delete_ids']:
              if key in procedure_state.keys():
                ids = procedure_state[key]
              if ids:
                if 'instance' in self.slots.keys():
                  if self.detected_intent == 'update':
                    if len(self.slots['old_values']) == 0 and len(self.slots['new_values']) == 0:
                      self.new_procedure = False
                      set_active_procedure(self)
                  elif all(key in ['instance','instance_type'] for key in self.slots.keys()):
                    self.new_procedure = False
                    set_active_procedure(self)

                elif 'entity' in self.slots.keys():

                  # fall back to the entity itself when it is already singular
                  en = self.inflection.singular_noun(self.slots['entity'])
                  if en == False:
                    en = self.slots['entity']

                  if en == procedure_state['entity']:
                    if self.detected_intent == 'delete' and  all(key == 'entity' for key in self.slots.keys()):
                      self.new_procedure = False
                      set_active_procedure(self)
                    if self.detected_intent == 'update' and len(self.slots['old_values']) == 0 and len(self.slots['new_values']) == 0:
                      self.new_procedure = False
                      set_active_procedure(self)

            if self.ID[self.detected_intent] == '' or self.new_procedure == True or self.active_procedure[self.detected_intent] == False :
              self.ID[self.detected_intent] = self.detected_intent.capitalize() + self.utils.getID()
              self.local_graph.add((self.namespace[self.ID[self.detected_intent]], RDF.type, self.namespace[self.detected_intent.capitalize()]))

            set_active_procedure(self)

        if self.detected_intent == 'insert':
          keyword_detected = False
          set_active_procedure(self)
          tok = [t.lower() for t in tokens]
          for v in self.intents['insert']:
            if v in tok:
              keyword_detected = True
              en = tok[tok.index(v) + 2].capitalize()

              insert_annotated_data(self, 'entity', tokens[tok.index(v) + 2])

              if en in self.supported_entity:
                self.active_entity = en
              else:

                self.detected_intent = 'default'

          if self.detected_intent != 'default':
            # ongoing of same type
            if self.entity == self.active_entity:

              if self.insert_ID in self.predecesor:
                  self.predecesor.pop(-1)
              has_param = any('has' + t.capitalize() in self.params.keys() for t in tokens)
              if not has_param or self.insert_ID == '' or keyword_detected:
                self.instance = False
                if self.insert_ID != '':
                  self.predecesor.append(self.insert_ID)
                self.insert_ID = self.entity + self.utils.getID()
                self.local_graph.add((self.namespace[self.insert_ID], RDF.type, self.namespace[self.entity]))
                self.params = self.query.params_query(self.graph, self.entity)
                self.ID[self.detected_intent] = self.detected_intent.capitalize() + self.utils.getID()
                self.local_graph.add((self.namespace[self.ID[self.detected_intent]], RDF.type, self.namespace[self.detected_intent.capitalize()]))
                self.local_graph.add((self.namespace[self.ID[self.detected_intent]], self.namespace.instance, self.namespace[self.insert_ID]))


            concan = self.query.confirm_cancel_query(self.local_graph, self.insert_ID).askAnswer


            if self.entity != self.active_entity or concan:
              if concan == False and self.insert_ID not in self.predecesor:
                  if self.insert_ID != '':
                    self.predecesor.append(self.insert_ID)
                    self.instance = False
                    self.insert_ID = self.entity + self.utils.getID()
              else:
                self.instance = False
                if self.insert_ID in self.predecesor:
                  self.predecesor.remove(self.insert_ID)
                if keyword_detected:
                  if self.insert_ID != '':
                    self.predecesor.append(self.insert_ID)
                  self.insert_ID = self.entity + self.utils.getID()

              self.entity = self.active_entity
              if self.query.existence_query(self.local_graph, self.insert_ID).askAnswer == False:
                self.instance = False
                self.insert_ID = self.entity + self.utils.getID()
              self.local_graph.add((self.namespace[self.insert_ID], RDF.type, self.namespace[self.entity]))
              self.params = self.query.params_query(self.graph, self.entity)
              self.ID[self.detected_intent] = self.detected_intent.capitalize() + self.utils.getID()
              self.local_graph.add((self.namespace[self.ID[self.detected_intent]], RDF.type, self.namespace[self.detected_intent.capitalize()]))
              self.local_graph.add((self.namespace[self.ID[self.detected_intent]], self.namespace.instance, self.namespace[self.insert_ID]))
              self.concan = False

            self.slots = self.utils.getParamsValues(tokens, self.detected_intent, self.params, self.slots, self.general_existence)

        b = r.BNode()
        self.local_graph.add((self.namespace.User, self.namespace[self.detected_intent], b))
        self.local_graph.add((b, self.namespace.turn, r.Literal(self.turn)))

        self.annotated_dialogue['intent'] = self.detected_intent

        if self.slots:

          for k, v in self.slots.items():
            if isinstance(v, list):
              for vv in v:
                self.local_graph.add((b, self.namespace[k], r.Literal(vv)))
            else:
              self.local_graph.add((b, self.namespace[k], r.Literal(v)))


          for k, v in self.active_procedure.items():
            if v == True:

              if k != 'update':
                target_id = self.insert_ID if k == 'insert' and 'removedParams' not in self.slots.keys() else self.ID[k]
                for kk, vv in self.slots.items():

                  if kk not in ['removedParams', 'entity']:
                    vv = vv.strip()
                    insert_annotated_data(self, kk, vv)

                  if kk != 'activeEntity':
                    self.local_graph.remove((self.namespace[target_id], self.namespace[kk], None))
                    self.local_graph.add((self.namespace[target_id], self.namespace[kk], r.Literal(vv)))

              elif 'update_values' not in self.slots.keys():
                for kk, vv in self.slots.items():
                  if kk not in ['new_values','old_values','entity']:
                    vv = vv.strip()
                    insert_annotated_data(self, kk, vv)
                  if kk not in ['new_values','old_values']:
                    self.local_graph.remove((self.namespace[self.ID[k]], self.namespace[kk], None))
                    self.local_graph.add((self.namespace[self.ID[k]], self.namespace[kk], r.Literal(vv)))
                update_instance = self.query.procedure_query(self.local_graph, self.ID['update'], 'Update')

                for key in ['new_values','old_values']:

                  bnode = r.BNode()
                  if key in update_instance.keys():
                    for uk, uv in update_instance[key].items():
                      self.local_graph.add((bnode, self.namespace[uk], r.Literal(uv)))

                  for sk, sv in self.slots[key].items():

                    insert_annotated_data(self, f'{key}_{sk}', sv)

                    self.local_graph.remove((bnode, self.namespace[sk], None))
                    self.local_graph.add((bnode, self.namespace[sk], r.Literal(sv)))

                  self.local_graph.remove((self.namespace[self.ID['update']], self.namespace[key], None))
                  self.local_graph.add((self.namespace[self.ID['update']], self.namespace[key], bnode))

        if self.detected_intent not in ['hello', 'goodbye', 'thank', 'select', 'switchEntity', 'update', 'delete']:
            self.local_graph.add((b, self.namespace.entity, self.namespace[self.insert_ID]))

        self.new_procedure = False

        if self.detected_intent in ['agree','hello','thank','goodbye']:
          if self.capped_intents[self.detected_intent] < 550:
            self.capped_intents[self.detected_intent] += 1
            self.annotated_dialogues[self.index] = self.annotated_dialogue
            self.index += 1
          else:
            self.annotated_dialogue = {}
        else:
          self.annotated_dialogues[self.index] = self.annotated_dialogue
          self.index += 1

        for k,v in self.active_procedure.items():
          if v is True:
            if k != 'insert':
              self.procedure_state[k] = self.query.procedure_query(self.local_graph, self.ID[k], k.capitalize())
            else:
              self.procedure_state[k] = self.query.insert_procedure_state_query(self.local_graph, self.insert_ID)
          else:
            self.procedure_state[k] = {}

    # function that maps the DST module; it selects the state of the currently discussed instance, if needed

    def DST(self):

        self.state = self.query.state_query(self.local_graph, self.insert_ID)

    # function that maps the POL module; it reacts to the user's intent with specific actions

    def POL(self):

      def select_instances(self, target_slots, entity):

        # work with the singular form of the entity, as used in the ontology
        singular = self.inflection.singular_noun(entity)
        if singular != False:
          entity = singular

        preds = []
        types = []
        key = []
        values = []
        confirm = {}
        slots = target_slots
        paramsq = ''
        literal_instances = {}
        results = []
        ids = {}

        for k, v in self.params.items():
          if k in slots.keys():

            if str(v[1]).startswith('XMLSchema') == False:
              res = False
              val = self.utils.tokenize(self.pattern, str(slots[k]))
              if val[0] in self.general_existence and val[-2] not in self.general_existence:
                res2 = self.query.pre_existing_param_query(self.graph, v[1], val[-1], ':has' + str(val[-2].capitalize()))

              else:
                res = self.query.existence_query(self.graph, val[-1], 'a', f':{v[1]}').askAnswer
                if val[0] in self.general_existence and val[-2] in self.general_existence:
                  res2 = self.query.pre_existing_param_query(self.graph, v[1], val[-1])
                else:
                  res2 = self.query.pre_existing_param_query(self.graph, v[1], val[-1],':hasName')

                confirm[k] = True

              if res == True:
                ids[k] = [val[-1]]
                instance = self.query.select_instance_query(self.graph, val[-1])
                ids[val[-1]] = instance['hasName']

              if res == False and len(res2) == 0:
                preds.append(k)
                types.append(v[1])
                confirm[k] = res

              if len(res2) >= 1:
                ids[k] = []
                for rs in res2:
                  ids[k].append(rs['ID'])
                  instance = self.query.select_instance_query(self.graph, rs['ID'])
                  ids[rs['ID']] = instance['hasName']
                key.append(k)
                values.append(res2)
                confirm[k] = True

            else:
              t = self.params[k][1].split("#")[1]
              val = self.utils.tokenize(self.pattern, str(slots[k]))

              if len(val) > 1:
                res2 = self.query.pre_existing_param_query(self.graph, entity, val[-1])
                literal_instances[k] = res2

              else:
                paramsq += f' :{k} "{str(slots[k])}"^^xsd:{t}; '


        if False not in confirm.values():


          # select the params which are literal
          resp = self.query.pre_existing_query(self.graph, entity, paramsq)

          if len(resp) > 0 and len(literal_instances) > 0:
            matches = []
            minimum = len(literal_instances)
            for ins in resp:
              app = 0
              for k, v in literal_instances.items():
                for lit_ins in v:
                  if ins['ID'] == lit_ins['ID']:
                    app += 1
              if app == minimum:
                matches.append(ins)
            resp = matches

          if len(resp) > 0 and any(k in key for k in slots.keys()):

            #combine with the instances
            matches = []
            only_keys = [k for k in ids.keys() if k in self.params.keys()]
            minimum = len(only_keys)
            for result in resp:
              app = 0
              for k,v in result.items():
                if k in key or k in ids.keys():
                  if v in ids[k]:
                    app += 1
                    result[k] = f'[ID: {v}; name: {ids[v]}]'
              if app == minimum:
                results.append(result)

          elif len(resp) > 0:

            for result in resp:
              for k, v in result.items():
                if k in self.params.keys() and str(self.params[k][1]).startswith('XMLSchema') == False:
                  instance = self.query.select_instance_query(self.graph, v)
                  result[k] = f'[ID: {v}; name: {instance["hasName"]}]'

            results = resp

        return results

      def checkInsert(self, entity, target_slots, choose_entity_keys, choose_entity_values, preexistence_check = False, update_pre_check = []):

                preds = []
                types = []
                key = []
                values = []
                literal_keys = []
                confirm = {}
                for k, v in self.params.items():
                    if k in target_slots.keys():

                        val = self.utils.tokenize(self.pattern, str(target_slots[k]))
                        if str(v[1]).startswith('XMLSchema') == False:
                            res = False
                            if val[0] in self.general_existence and val[-2] not in self.general_existence:
                                res2 = self.query.pre_existing_param_query(self.graph, v[1], val[-1], ':has' + str(val[-2].capitalize()))
                            else:
                                res = self.query.existence_query(self.graph, val[-1], 'a', f':{v[1]}').askAnswer
                                if val[0] in self.general_existence and val[-2] in self.general_existence:
                                  res2 = self.query.pre_existing_param_query(self.graph, v[1], val[-1])
                                else:
                                  res2 = self.query.pre_existing_param_query(self.graph, v[1], val[-1],':hasName')
                                confirm[k] = True

                            if res == False and len(res2) == 0:
                                preds.append(k)
                                types.append(v[1])
                                confirm[k] = res
                            if len(res2) >= 1:
                                key.append(k)
                                values.append(res2)
                                confirm[k] = True
                        else:
                          if len(val) > 1:
                            literal_keys.append(k)

                if False not in confirm.values():

                    if any(k in key for k in target_slots.keys()):
                      return 'choose', key, values

                    else:
                        resp = []
                        if preexistence_check:
                          all_instances = []
                          if not update_pre_check:
                            update_pre_check = [instance['ID'] for instance in self.query.pre_existing_query(self.graph, entity, '')]
                          for inst in update_pre_check:
                            t_slots = self.query.select_instance_query(self.graph, inst)
                            t_slots.pop('type')
                            t_slots.pop('ID')
                            all_instances.append(t_slots)
                            for k, v in target_slots.items():
                              t_slots[k] = v
                            paramsq = ''
                            for p in self.params.keys():
                                if self.params[p][0] == 1 :
                                  if str(self.params[p][1]).startswith('XMLSchema') == True:
                                        t = self.params[p][1].split("#")[1]
                                        paramsq += f' :{p} "{str(t_slots[p])}"^^xsd:{t}; '
                                  else:
                                        paramsq += f' :{p} :{str(t_slots[p])}; '

                            resp.append(self.query.pre_existing_query(self.graph, entity, paramsq))

                          # guard against an empty result list before inspecting the first match
                          if not resp or not resp[0]:
                            done_flag = False
                            for i in range(len(all_instances)):
                              all_instances_copy = all_instances.copy()
                              all_instances_copy.pop(i)
                              if any(all_instances[i] == dt for dt in all_instances_copy):
                                done_flag = True
                                return 'preExistingEntityCancel', [all_instances[i]], None
                            if done_flag == False:
                              resp = []

                          elif resp[0]:
                            return 'preExistingEntityCancel', resp[0], None

                        else:
                          paramsq = ''
                          for p in self.params.keys():
                            if self.params[p][0] == 1 :
                              if p not in confirm.keys():
                                t = self.params[p][1].split("#")[1]
                                paramsq += f' :{p} "{str(target_slots[p])}"^^xsd:{t}; '
                              else:
                                paramsq += f' :{p} :{str(target_slots[p])}; '

                          resp = (self.query.pre_existing_query(self.graph, entity, paramsq))

                        if len(resp) > 0:
                            return 'preExistingEntityCancel', resp, None

                        elif choose_entity_keys:
                            preds = []
                            keys = []
                            confirm_flag = False
                            for k, v in target_slots.items():
                              for kk, vv in zip(choose_entity_keys, choose_entity_values):
                                if k == kk:
                                  confirm_flag = False
                                  for d in vv:
                                    if v == d['ID']:
                                      confirm_flag = True
                                  if confirm_flag == False:
                                    preds.append(k)
                                    keys.append(str(self.params[k][1]))

                            if preds:
                              # return the types collected in `keys`; the outer `types` list is always empty on this path
                              return 'chooseReject', preds, keys
                            elif literal_keys:
                              return 'wrong', literal_keys, None
                            else:
                              return 'confirm', None, None

                        elif literal_keys:
                          # reject reference-style values (such as "something like value") given for literal params
                          return 'wrong', literal_keys, None
                        else:
                          return 'confirm', None, None
                else:
                  return 'reject', preds, types

      def remove_params(self, keys, target_slots, id, pred = None, obj = None):
        for k in keys:
          if k in target_slots.keys():
            target_slots.pop(k)
            if obj is None:
              self.local_graph.remove((self.namespace[id], self.namespace[k], None))
        if obj is not None:
          bnode = r.BNode()
          for k, v in target_slots.items():
            self.local_graph.add((bnode, self.namespace[k], r.Literal(v)))
          self.local_graph.remove((self.namespace[id], self.namespace[pred], None))
          self.local_graph.add((self.namespace[id], self.namespace[pred], bnode))

        return target_slots

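      def set_action(self, action, intent, set_ID, set_active_procedure, additional_info = None):
        # record the chosen system act and optionally reset the procedure's ID and active flag
        self.action['act'] = action
        if set_ID:
          self.ID[intent] = ''
          if intent == 'insert':
            self.insert_ID = ''
        if set_active_procedure:
          self.active_procedure[intent] = False
        # copy any extra payload (results, keys, ...) onto the action
        for k, v in (additional_info or {}).items():
          self.action[k] = v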

      def get_predecesor(self):

        if self.action['act'] != 'switchEntity':

          if self.insert_ID in self.predecesor:
            self.predecesor.remove(self.insert_ID)

          self.state = {}
          self.instance = False
          self.params = {}

          if self.action['act'] in ['cancelProcedure','preExistingEntityCancel']:
            self.local_graph.remove((self.namespace[self.insert_ID], None, None))
            self.local_graph.add((b, self.namespace.entity, self.namespace[self.insert_ID]))

          self.insert_ID = ''
          self.ID['insert'] = ''


          if self.predecesor:
            self.insert_ID = self.predecesor[-1]
            self.active_procedure['insert'] = True
            insert_state = self.query.insert_procedure_state_query(self.local_graph, self.insert_ID)
            self.ID['insert'] = insert_state['ID']


        for e in self.supported_entity:
          if e in self.insert_ID:
            # slice exactly the entity name out of the ID, wherever it starts
            start = self.insert_ID.index(e)
            self.active_entity = self.insert_ID[start:start + len(e)]
            self.entity = self.active_entity

        self.params = self.query.params_query(self.graph, self.active_entity)
        self.DST()

        checkParams = self.utils.checkParams(self.params, self.state)
        if checkParams['act'] == 'confirmParams':
          self.instance = True

      b = r.BNode()

      if self.detected_intent == 'hello':
            self.action['act'] = 'hello'

      elif self.detected_intent == 'goodbye':
            self.action['act'] = 'goodbye'

      elif self.detected_intent == 'switchEntity':

            if self.slots['activeEntity'] in self.predecesor:

              if self.insert_ID not in self.predecesor:
                  self.predecesor.append(self.insert_ID)

              self.predecesor.remove(self.slots['activeEntity'])
              set_action(self, 'switchEntity', 'insert', True, True)
              self.insert_ID = self.slots['activeEntity']
              self.predecesor.append(self.insert_ID)
              self.active_procedure['insert'] = True
              insert_state = self.query.insert_procedure_state_query(self.local_graph, self.insert_ID)
              self.ID['insert'] = insert_state['ID']

              get_predecesor(self)

            else:
                self.action['act'] = 'wrongEntity'

      elif self.detected_intent == 'select':

          ID = self.ID['select']
          select_state = self.procedure_state[self.detected_intent]
          select_slots = {}

          for k, v in select_state.items():
            if any(k == kk for kk in self.params.keys()):
              select_slots[k] = v


          if 'instance' in select_state.keys():
            if any(key not in ['instance', 'instance_type', 'category', 'type'] for key in select_state.keys()):
              set_action(self, 'wrongSelect', 'select', True, True)

            if self.action['act'] != 'wrongSelect':
              instance = self.query.select_instance_query(self.graph, select_state['instance'])
              if instance:
                for k, v in instance.items():
                      if k in self.params.keys() and str(self.params[k][1]).startswith('XMLSchema') == False:
                        ref_instance = self.query.select_instance_query(self.graph, v)
                        instance[k] = f'[ID: {v}; name: {ref_instance["hasName"]}]'
                set_action(self, 'showSelect', 'select', True, True, {'results': [instance], 'entity': instance['type']})

              else:
                set_action(self, 'wrongSelect', 'select', True, True)

          elif 'entity' in select_state.keys():

            entity = select_state['entity']
            results = select_instances(self, select_slots, entity)

            if results:
              self.action['act'] = 'showSelect'
              self.action['results'] = results
              self.action['entity'] = entity
            else:
              set_action(self, 'wrongSelect', 'select', True, True)


          else:
            set_action(self, 'wrongFormatSelect', 'select', True, True)

          self.local_graph.add((self.namespace[ID], self.namespace['category'], self.namespace[self.action['act']]))

      elif self.detected_intent == 'update':

          ID = self.ID['update']
          #entity = self.slots['instance_type']
          update_state =  self.procedure_state[self.detected_intent]
          update_slots = update_state['new_values']
          filter_slots = update_state['old_values']
          update_ids = []
          choose_update_keys = []
          choose_update_values = []

          if 'choose_update_keys' in update_state.keys():
            if type(update_state['choose_update_keys']) is str:
              choose_update_keys = [update_state['choose_update_keys']]
            else:
              choose_update_keys = update_state['choose_update_keys']

          if 'choose_update_values' in update_state.keys():
            if type(update_state['choose_update_values'][0]) is list:
              choose_update_values = update_state['choose_update_values']
            else:
               choose_update_values = [update_state['choose_update_values']]

          if 'update_ids' in update_state.keys():
            update_ids = update_state['update_ids']

          params = []
          items = {}
          if 'instance' in update_state.keys():
            instance = self.query.select_instance_query(self.graph, update_state['instance'])

            if instance and instance['type'] != 'Class':
              if len(update_ids) > 0:
                if instance['ID'] not in update_ids:
                  set_action(self, 'wrongUpdate', 'update', True, True)
                elif 'old_values' in self.slots.keys() and self.slots['old_values']:
                  set_action(self, 'wrongUpdate', 'update', True, True)

              elif len(filter_slots) != 0:
                set_action(self, 'wrongUpdate', 'update', True, True)

              if len(update_slots) == 0:
                set_action(self, 'wrongUpdate', 'update', True, True)

              if self.action['act'] != 'wrongUpdate':
                entity = update_state['instance_type']
                action, arg1, arg2 = checkInsert(self, entity, update_slots, choose_update_keys, choose_update_values, True, [instance['ID']])

                if action == 'choose':
                  remove_params(self, arg1, update_slots, ID, 'new_values', obj = True)
                  self.local_graph.remove((self.namespace[ID], self.namespace['choose_update_keys'], None))
                  for a in arg1:
                    self.local_graph.add((self.namespace[ID], self.namespace['choose_update_keys'], r.Literal(a)))
                  self.local_graph.remove((self.namespace[ID], self.namespace['choose_update_values'], None))
                  for a in arg2:
                    self.local_graph.add((self.namespace[ID], self.namespace['choose_update_values'], r.Literal(a)))

                  set_action(self, 'chooseEntity', 'update', False, False, {'key': arg1, 'values': arg2})

                elif action == 'wrong':
                  set_action(self, 'wrongLiteralDataFormat', 'update', False, False, {'keys': arg1})
                  remove_params(self, arg1, update_slots, ID, 'new_values', obj = True)

                elif action == 'preExistingEntityCancel':
                  set_action(self, 'preExistingEntityCancel', 'update', True, True, {'entity': arg1})
                  #remove_params(self, arg1, update_slots, ID, 'new_values', obj = True)

                elif action == 'reject':
                  set_action(self, 'dependencyUpdate', 'update', False, False, {'params': arg1})
                  # might integrate it in dependencyUpdate NLG
                  #self.action['type'] = arg2
                  remove_params(self, arg1, update_slots, ID, 'new_values', obj = True)

                elif action == 'chooseReject':
                  set_action(self, 'chooseReject', 'update', False, False, {'pred': arg1, 'type': arg2})
                  # might integrate it in dependencyUpdate NLG
                  #self.action['type'] = arg2
                  remove_params(self, arg1, update_slots, ID, 'new_values', obj = True)

                elif action == 'confirm':

                  for k, v in update_slots.items():
                    if k in self.params.keys() and str(self.params[k][1]).startswith('XMLSchema') == False:
                      ref_instance = self.query.select_instance_query(self.graph, v)
                      if ref_instance:
                        instance[k] = f'[ID: {v}; name: {ref_instance["hasName"]}]'
                        self.graph.remove((self.namespace[instance['ID']], self.namespace[k], None))
                        self.graph.add((self.namespace[instance['ID']], self.namespace[k], self.namespace[v]))

                    elif k in self.params.keys():
                      t = self.params[k][1].split("#")[1]
                      instance[k] = v
                      self.graph.remove((self.namespace[instance['ID']], self.namespace[k], None))
                      self.graph.add((self.namespace[instance['ID']], self.namespace[k], r.Literal(v, datatype = XSD[t])))

                  # set the confirmation once, after every slot has been written to the graph
                  set_action(self, 'confirmUpdate', 'update', True, True, {'results': [instance]})


            else:
              set_action(self, 'wrongUpdate', 'update', True, True)

          elif 'entity' in update_state.keys():

            entity = update_state['entity']
            results = select_instances(self, filter_slots, entity)

            if results:

              if self.inflection.singular_noun(entity) != False:

                entity = self.inflection.singular_noun(entity)

                if len(update_slots) == 0:
                  set_action(self, 'wrongUpdate', 'update', True, True)

                if self.action['act'] != 'wrongUpdate':
                  if update_ids:
                    resp = []
                    for i in update_ids:
                      res = self.query.select_instance_query(self.graph, i)
                      resp.append(res)
                    results = resp

                  action, arg1, arg2 = checkInsert(self, entity, update_slots, choose_update_keys, choose_update_values, True, update_ids)

                  if action == 'choose':
                    remove_params(self, arg1, update_slots, ID, 'new_values', obj = True)
                    self.local_graph.remove((self.namespace[ID], self.namespace['choose_update_keys'], None))
                    for a in arg1:
                      self.local_graph.add((self.namespace[ID], self.namespace['choose_update_keys'], r.Literal(a)))
                    self.local_graph.remove((self.namespace[ID], self.namespace['choose_update_values'], None))
                    for a in arg2:
                      self.local_graph.add((self.namespace[ID], self.namespace['choose_update_values'], r.Literal(a)))

                    set_action(self, 'chooseEntity', 'update', False, False, {'key': arg1, 'values': arg2})

                  elif action == 'wrong':
                    set_action(self, 'wrongLiteralDataFormat', 'update', False, False, {'keys': arg1})
                    remove_params(self, arg1, update_slots, ID, 'new_values', obj = True)

                  elif action == 'preExistingEntityCancel':
                    set_action(self, 'preExistingEntityCancel', 'update', True, True, {'entity': arg1})
                    #remove_params(self, arg1, update_slots, ID, 'new_values', obj = True)

                  elif action == 'reject':
                    set_action(self, 'dependencyUpdate', 'update', False, False, {'params': arg1})
                    # might integrate it in dependencyUpdate NLG
                    #self.action['type'] = arg2
                    remove_params(self, arg1, update_slots, ID, 'new_values', obj = True)

                  elif action == 'chooseReject':
                    set_action(self, 'chooseReject', 'update', False, False, {'pred': arg1, 'type': arg2})
                    # might integrate it in dependencyUpdate NLG
                    remove_params(self, arg1, update_slots, ID, 'new_values', obj = True)

                  elif action == 'confirm':

                    for k, v in update_slots.items():
                      if k in self.params.keys() and str(self.params[k][1]).startswith('XMLSchema') == False:
                          ref_instance = self.query.select_instance_query(self.graph, v)
                          if ref_instance:
                            for res in results:
                              res[k] = f'[ID: {v}; name: {ref_instance["hasName"]}]'
                              self.graph.remove((self.namespace[res['ID']], self.namespace[k], None))
                              self.graph.add((self.namespace[res['ID']], self.namespace[k], self.namespace[v]))

                      elif k in self.params.keys():

                          t = self.params[k][1].split("#")[1]
                          for res in results:
                            res[k] = v
                            self.graph.remove((self.namespace[res['ID']], self.namespace[k], None))
                            self.graph.add((self.namespace[res['ID']], self.namespace[k],  r.Literal(v, datatype = XSD[t])))

                    # set the confirmation once, after every slot has been written to the graph
                    set_action(self, 'confirmUpdate', 'update', True, True, {'results': results})


              else:
                up_ids = []
                for res in results:
                  up_ids.append(res['ID'])
                self.local_graph.remove((self.namespace[ID], self.namespace['update_ids'], None))
                self.local_graph.add((self.namespace[ID], self.namespace['update_ids'], r.Literal(up_ids)))
                set_action(self, 'showUpdate', 'update', False, False,  {'results': results})

            else:
              set_action(self, 'wrongUpdate', 'update', True, True)

          else:
            set_action(self, 'default', 'update', True, True)

          self.local_graph.add((self.namespace[ID], self.namespace['category'], self.namespace[self.action['act']]))

      elif self.detected_intent == 'delete':

        def delete(delete_list):

          dependency_results = []
          dependency_ids = []
          deleted_ids = []

          for did in delete_list:
            result = self.query.dependency_query(self.graph, did['ID'])
            if result:
              for i in range(len(result)):
                info_result = self.query.select_instance_query(self.graph, result[i]['ID'])
                result[i] = info_result
              dependency_results.append(result)
              dependency_ids.append({'ID' : did['ID'], 'hasName' : did['hasName']})
            else:
              self.graph.remove((self.namespace[did['ID']], None, None))
              deleted_ids.append({'ID' : did['ID'], 'hasName' : did['hasName']})

          return dependency_results, dependency_ids, deleted_ids

        params = []
        items = {}
        paramsq = ''
        ID = self.ID['delete']
        delete_state = self.procedure_state[self.detected_intent]
        delete_ids = []
        delete_slots = {}

        for k, v in delete_state.items():
          if any(k == kk for kk in self.params.keys()):
            delete_slots[k] = v

        if 'delete_ids' in delete_state.keys():
          delete_ids = delete_state['delete_ids']

        if 'instance' in delete_state.keys():

          # read the instance from the tracked procedure state, matching the check above
          instance = self.query.select_instance_query(self.graph, delete_state['instance'])

          if instance and instance['type'] != 'Class':

            if delete_ids:
              if not any(instance['ID'] == v['ID'] for v in delete_ids):
                set_action(self, 'wrongDelete', 'delete', True, True)

            for k in self.slots.keys():
                  if any(k == kk for kk in self.params.keys()):
                    set_action(self, 'wrongDelete', 'delete', True, True)

            if self.action['act'] != 'wrongDelete':
              results = self.query.dependency_query(self.graph, instance['ID'])
              if results:
                set_action(self, 'dependencyDelete', 'delete', True, True, {'dependency_results': [results], 'dependency_instances': [{'ID' : instance['ID'], 'hasName' : instance['hasName']}]})
              else:
                self.graph.remove((self.namespace[instance['ID']], None, None))
                set_action(self, 'confirmDelete', 'delete', True, True, {'results': [{'ID': instance['ID'], 'hasName' : instance['hasName']}]})

          else:
            set_action(self, 'wrongDelete', 'delete', True, True)

        elif 'entity' in delete_state.keys():

          entity = delete_state['entity']
          results = select_instances(self, delete_slots, entity)

          if results:

            dependency_results = []
            dependency_ids = []
            deleted_ids = []

            if self.inflection.singular_noun(entity) != False:
              if delete_ids:
                dependency_results, dependency_ids, deleted_ids = delete(delete_ids)
              else:
                dependency_results, dependency_ids, deleted_ids = delete(results)

              if dependency_ids:
                set_action(self, 'dependencyDelete', 'delete', True, True, {'dependency_results': dependency_results, 'dependency_instances': dependency_ids, 'instances': deleted_ids})
              else:
                set_action(self, 'confirmDelete', 'delete', True, True, {'results': deleted_ids})

            else:

              # singular entity: show the matches and ask whether to delete one or all of them
              del_ids = []
              for res in results:
                del_ids.append({'ID' : res['ID'], 'hasName' : res['hasName']})
              self.local_graph.remove((self.namespace[ID], self.namespace['delete_ids'], None))
              self.local_graph.add((self.namespace[ID], self.namespace['delete_ids'], r.Literal(del_ids)))
              set_action(self, 'showDelete', 'delete', False, False, {'results': results})

          else:
            set_action(self, 'wrongDelete', 'delete', True, True)

        else:
          set_action(self, 'default', 'delete', True, True)

        self.local_graph.add((self.namespace[ID], self.namespace['category'], self.namespace[self.action['act']]))

      elif self.detected_intent == 'agree':


        if self.instance:

          insert_procedure_ID = self.ID['insert']
          insert_state = self.procedure_state['insert']
          ID = self.insert_ID

          choose_insert_keys = []
          if 'choose_insert_keys' in insert_state.keys():
            if type(insert_state['choose_insert_keys']) is str:
              choose_insert_keys = [insert_state['choose_insert_keys']]
            else:
              choose_insert_keys = insert_state['choose_insert_keys']

          choose_insert_values = []
          if 'choose_insert_values' in insert_state.keys():
            if type(insert_state['choose_insert_values'][0]) is list:
              choose_insert_values = insert_state['choose_insert_values']
            else:
               choose_insert_values = [insert_state['choose_insert_values']]

          action, arg1, arg2 = checkInsert(self, self.active_entity, self.state, choose_insert_keys, choose_insert_values)

          if action == 'choose':
            set_action(self, 'chooseEntity', 'insert', False, False, {'key': arg1, 'values': arg2})
            self.local_graph.remove((self.namespace[insert_procedure_ID], self.namespace['choose_insert_keys'], None))
            for a in arg1:
              self.local_graph.add((self.namespace[insert_procedure_ID], self.namespace['choose_insert_keys'], r.Literal(a)))
            self.local_graph.remove((self.namespace[insert_procedure_ID], self.namespace['choose_insert_values'], None))
            for a in arg2:
              self.local_graph.add((self.namespace[insert_procedure_ID], self.namespace['choose_insert_values'], r.Literal(a)))

            self.instance = False

            #remove the params inserted from the graph
            self.state = remove_params(self, arg1, self.state, ID, obj = None)

          # reject reference-style values (such as "something like value") given for literal params

          elif action == 'wrong':
            set_action(self, 'wrongLiteralDataFormat', 'insert', False, False, {'keys': arg1})
            self.instance = False

            #remove the params inserted from the graph
            self.state = remove_params(self, arg1, self.state, ID, obj = None)

          elif action == 'preExistingEntityCancel':

            set_action(self, 'preExistingEntityCancel', 'insert', False, True, {'entity': arg1})

            get_predecesor(self)

          elif action == 'confirm':

            self.graph.add((self.namespace[ID], RDF.type, self.namespace[self.entity]))
            for k, v in self.state.items():
              pred = k
              if k in self.params.keys() and str(self.params[k][1]).startswith('XMLSchema') == True:
                if k != 'removedParams':
                  t = self.params[k][1].split("#")[1]
                  self.graph.add((self.namespace[ID], self.namespace[pred], r.Literal(v, datatype = XSD[t])))
              else:
                obj = v
                self.graph.add((self.namespace[ID], self.namespace[pred],self.namespace[obj]))

            self.local_graph.add((b, self.namespace.entity, self.namespace[ID]))

            set_action(self, 'confirm', 'insert', False, True)

            get_predecesor(self)

          elif action in ['reject','chooseReject']:
            # this is the insert procedure; the intent is unused here because both reset flags are False
            set_action(self, action, 'insert', False, False, {'pred': arg1,'type': arg2})
            self.instance = False

            #remove the params inserted from the graph
            self.state = remove_params(self, arg1, self.state, ID, obj = None)

        else:
          self.action['act'] = 'default'

      elif self.detected_intent == 'disagree':
            if self.instance:
                self.action['act'] = 'askStep'
            else:
                self.action['act'] = 'default'

      elif self.detected_intent == 'cancel':

        if True not in self.active_procedure.values():
          self.action['act'] = 'default'

        for k, v in self.active_procedure.items():
          if v == True:
            if k != 'insert':
              set_action(self, 'cancelProcedure', k, True, True)
            else:
              if self.insert_ID != '':
                set_action(self, 'cancelProcedure', 'insert', False, True)
                get_predecesor(self)
              else:
                self.action['act'] = 'default'

      elif self.detected_intent == 'thank':
            self.action['act'] = 'welcome'

      elif self.detected_intent in ['insert', 'wrongInsert']:
            self.action = self.utils.checkParams(self.params, self.state)

      elif self.detected_intent == 'remove':


            if self.slots['removedParams']:

              self.action['params'] = []

              for k, v in self.active_procedure.items():
                if v == True:
                  if k == 'insert':
                    for p in self.slots['removedParams']:
                      if p in self.state.keys():
                        self.state.pop(p)
                        self.action['params'].append(p)
                        self.local_graph.remove((self.namespace[self.insert_ID], self.namespace[p], None))

                    self.instance = False


                  else:

                    bnode = r.BNode()
                    if k == 'update':

                      update_values = self.slots['update_values']

                      # use distinct names so the outer loop variables k, v are not shadowed
                      for uk, uv in self.procedure_state['update'][update_values].items():
                        self.local_graph.add((bnode, self.namespace[uk], r.Literal(uv)))
                      for p in self.slots['removedParams']:
                        self.action['params'].append(p)
                        self.local_graph.remove((bnode, self.namespace[p], None))

                      self.local_graph.remove((self.namespace[self.ID['update']], self.namespace[update_values], None))
                      self.local_graph.add((self.namespace[self.ID['update']], self.namespace[update_values], bnode))

                    else:
                      for p in self.slots['removedParams']:
                        self.action['params'].append(p)
                        self.local_graph.remove((self.namespace[self.ID[k]], self.namespace[p], None))

              self.action['act'] = 'removeParams'

            else:
                self.action['act'] = 'default'


      elif self.detected_intent == 'default':
            self.action['act'] = 'default'


      self.local_graph.add((self.namespace.System, self.namespace[self.action['act']], b))
      self.local_graph.add((b, self.namespace.turn, r.Literal(self.turn)))

      if len(self.action) > 1:
            for k, v in self.action.items():
                if k != 'act':
                  if isinstance(v, list):
                    for vv in v:
                      self.local_graph.add((b, self.namespace[k], r.Literal(vv)))
                  else:
                      self.local_graph.add((b, self.namespace[k], r.Literal(v)))
                      # TODO: the pre-existing case yields a list of dictionaries; decide whether such complex values are worth saving

      if self.action['act'] in ['removeParams', 'requireParams', 'confirmParams', 'reject', 'switchEntity']:
            self.local_graph.add((b, self.namespace.entity, self.namespace[self.insert_ID]))

    # function that maps the NLG module; it converts the system's action (optionally with other values too) into natural language responses

    def NLG(self):

        act = self.action['act']
        if act in self.templates.keys():

            # pick a random template only after confirming the act has templates (avoids a KeyError)
            templates_length = len(self.templates[act])-1
            r = rand.randint(0, templates_length)

            if act not in self.templates['requireVars']:
                self.response = self.templates[act][r]

            elif act == 'showSelect':
              results = self.utils.printListOfDictionaries(self.action['results'])
              self.response = self.utils.replace_placeholder(self.templates[act][r], ['<entity>'], [self.action['entity']], results)

            elif act == 'wrongFormatSelect':
              results = self.utils.printListOfLiterals(self.supported_entity)
              self.response = self.utils.replace_placeholder(self.templates[act][r], ['<entities>'], [results])

            elif act in ['confirmDelete','showDelete','confirmUpdate','showUpdate']:
              results = self.utils.printListOfDictionaries(self.action['results'])
              self.response = self.utils.replace_placeholder(self.templates[act][r], ['<instances>'], [results])

            elif act == 'dependencyDelete':
              dependency = '\n'
              deleted = ''
              for kk, vv in zip(self.action['dependency_instances'], self.action['dependency_results']):
                  response = f" {kk['ID']} ({kk['hasName']}): "
                  for w in vv:
                      response += '{'
                      for k, v in w.items():
                        if k in ['ID', 'hasName']:
                          response += f' {k}: {v}; '
                      response += '};'
                  dependency += f'{response} \n'
              if 'instances' in self.action.keys():
                deleted = 'Also, I did delete the following instances: '
                for instance in self.action['instances']:
                  deleted += f"ID: {instance['ID']} ({instance['hasName']}); "
              self.response = self.utils.replace_placeholder(self.templates[act][r], ['<dependency_instances>'], [dependency], deleted)

            elif act in ['dependencyUpdate','removeParams']:
              results = self.utils.printListOfLiterals(self.action['params'], True)
              self.response = self.utils.replace_placeholder(self.templates[act][r], ['<params>'], [results])

            elif act == 'chooseEntity':
                results = '\n'
                for kk, vv in zip(self.action['key'], self.action['values']):
                    response = f' {kk}: '
                    response += self.utils.printListOfDictionaries(vv)
                    results += response
                self.response = self.utils.replace_placeholder(self.templates[act][r], ['<n>'], [str(len(self.action['key']))], results)

            elif act in ['switchEntity','wrongEntity','confirm', 'cancelProcedure','preExistingEntityCancel']:

                switch_templates_length = len(self.templates['switchEntity'])-1
                switch_r = rand.randint(0, switch_templates_length)

                additional_info = ''
                if self.insert_ID != '':
                  if act != 'wrongEntity':
                    additional_info = self.templates['switchEntity'][switch_r]
                  else:
                    additional_info = 'The active entity is <active_entity>.'
                results = self.utils.printDictionary(self.state, ['removedParams'])
                if results:
                  additional_info += ' The params for it are: ' + results
                if len(self.predecesor) > 1:
                  additional_info += ' If you want, you can switch to another entity from the list: '
                  results = self.utils.printListOfLiterals(self.predecesor, lit_to_avoid = [self.insert_ID])
                  additional_info += results

                if act == 'preExistingEntityCancel':
                  params_results = ''  # guard against an empty entity list
                  for e in self.action['entity']:
                    params_results = self.utils.printDictionary(e)
                  self.response = self.utils.replace_placeholder(self.templates[act][r], ['<params>', '<active_entity>'], [params_results, self.insert_ID], additional_info)
                elif act == 'switchEntity':
                  self.response = self.utils.replace_placeholder(additional_info, ['<active_entity>'], [self.insert_ID])
                else:
                  # 'confirm' and the remaining acts are filled in the same way
                  self.response = self.utils.replace_placeholder(self.templates[act][r], ['<active_entity>'], [self.insert_ID], additional_info)

            elif act in ['reject','chooseReject']:
              preds = self.utils.printListOfLiterals(self.action['pred'], True)
              types = self.utils.printListOfLiterals(self.action['type'])
              self.response = self.utils.replace_placeholder(self.templates[act][r], ['<entity>','<preds>','<types>'],
                                                            [self.entity, preds, types])

            elif act == 'wrongLiteralDataFormat':
              results = self.utils.printListOfLiterals(self.action['keys'])
              self.response = self.utils.replace_placeholder(self.templates[act][r], ['<keys>'], [results])

            elif act == 'requireParams':
              additional_info = ''
              mandatory = self.utils.printListOfLiterals(self.action['mandatory'], True)
              optional = self.utils.printListOfLiterals(self.action['optional'], True)
              if optional != '':
                additional_info = 'There are optional params too: <optional>'
              self.response = self.utils.replace_placeholder(self.templates[act][r], ['<entity>','<mandatory>','<optional>'],
                                                                [self.entity, mandatory, optional], additional_info)

            elif act == 'confirmParams':
              self.instance = True
              results = self.utils.printDictionary(self.state, ['removedParams'])
              self.response = self.utils.replace_placeholder(self.templates[act][r], ['<entity>','<params>'],
                                                                [self.entity, results])

        for k,v in self.active_procedure.items():
          if v is True:
            if k != 'insert':
              self.procedure_state[k] = self.query.procedure_query(self.local_graph, self.ID[k], k.capitalize())
            else:
              self.procedure_state[k] = self.query.insert_procedure_state_query(self.local_graph, self.insert_ID)
          else:
            self.procedure_state[k] = {}

        #print('System: ' + self.response + '\n')
        #print(self.local_graph.serialize())
        #print_params = self.print_params()
        #self.f.write('System: ' + self.response + '\n\n')
        #text = re.split('\\n', self.local_graph.serialize())
        #size = len(text)
        #counter = 0
        #to_print = ''
        #for t in text:
        #  counter += 1
        #  if counter < size:
        #    t = '\t' + t + '\n'
        #  else:
        #    t = '\t' + t
        #  to_print += t
        #self.f.write('Local graph: \n' + to_print)
        #self.f.write(print_params)

    def reset_params(self):

        self.slots = {}
        self.state = {}
        self.action = {
            'act' : ''
        }
        self.turn = 0
        self.entity = ''
        self.active_entity = ''
        self.response = ''
        self.last_intent = ''
        self.detected_intent = 'default'
        self.params = {}
        self.user = ''
        self.instance = False
        self.insert_ID = ''
        self.ID = {'insert' : '',
                       'update' : '',
                       'delete' : '',
                       'select' : ''}
        self.predecesor = []
        self.local_graph = r.Graph()
        self.active_procedure = {'insert' : False,
                       'update' : False,
                       'delete' : False,
                       'select' : False}
        self.procedure_state = {'insert' : {},
                       'update' : {},
                       'delete' : {},
                       'select' : {}}
        self.new_procedure = False
        self.f = None

    def print_params(self):

      string_to_return =  f'''\nCurrent turn {self.turn}:
              NLU
              \t ID: {self.insert_ID}
              \t predecesor: {self.predecesor}
              \t intent: {self.detected_intent}
              \t slots: {self.slots}
              \t params: {self.params}
              \t ac_entity: {self.active_entity}
              \t entity: {self.entity}
              DST
              \t state: {self.state}
              POL
              \t action: {self.action}
              \t instance: {self.instance}
              PROCEDURES
              \t active_procedure: {self.active_procedure}
              \t ID: {self.ID}
              \t procedure_state: {self.procedure_state}
              '''

      return string_to_return
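
The NLG step above boils down to three moves: look up the templates registered for the system act, pick one at random, and substitute the placeholders when the act appears under `requireVars`. The block below is a minimal, standalone sketch of that idea. It does not reproduce the notebook's `Utils.replace_placeholder`, whose implementation is not shown in this cell; `fill_template`, `generate_response`, the template texts, and the `Project`/`Apollo` values are all hypothetical and exist only for illustration.

```python
import random

# Hypothetical templates; the real ones are loaded from the repository's sample files.
TEMPLATES = {
    "requireVars": ["confirmParams"],  # acts whose templates contain placeholders
    "default": ["Sorry, I did not understand that."],
    "confirmParams": [
        "Should I save the <entity> with <params>?",
        "Please confirm the <entity>: <params>.",
    ],
}


def fill_template(template, placeholders, values):
    """Replace each placeholder with its value (illustrative stand-in helper)."""
    for placeholder, value in zip(placeholders, values):
        template = template.replace(placeholder, str(value))
    return template


def generate_response(action, templates=TEMPLATES, rng=random):
    """Map a system action dictionary to a natural-language response."""
    act = action["act"]
    if act not in templates:
        return ""  # no template registered for this act
    template = rng.choice(templates[act])  # random surface form for variety
    if act not in templates["requireVars"]:
        return template  # fixed sentence, nothing to fill in
    params = ", ".join(f"{k}: {v}" for k, v in action["params"].items())
    return fill_template(template, ["<entity>", "<params>"], [action["entity"], params])


response = generate_response(
    {"act": "confirmParams", "entity": "Project", "params": {"hasName": "Apollo"}}
)
print(response)
```

Checking that the act exists before indexing the template dictionary keeps an unknown act from raising a `KeyError`, and drawing the template at random is what gives the generated dialogues their surface variety.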



In [7]:
# class which holds the structure of the user simulator

class User:

  def __init__(self, templates, prompt, graph, namespace, local_graph = None):

    self.templates = templates # templates to use for utterances
    self.prompt = prompt # the prompt from the Prompt Generator
    self.utils = Utils() # utils utility object
    self.tasks = self.utils.tokenize(r'\.', self.prompt) # tasks from the prompt
    self.intent_and_slot = [] # store one dictionary (intent and slots) per task
    self.response = '' # the response to give to the TOD System
    self.inflection = inflect.engine() # engine to check whether a word is in singular or plural form
    self.index = 0 # index of the task
    self.vowels = ['a','e','i','o','u'] # list of vowels
    self.get_next_task_lock = True # flag to prevent the system from moving to the next task until the current one is completed
    self.graph = graph # the general Knowledge Graph
    self.local_graph = local_graph # the local Graph that maps the conversation
    self.namespace = namespace # the namespace of both graphs (the personalized one)
    self.query = Query(self.namespace) # query utility object
    self.params = {} # the parameters of an entity type
    self.general_existence = ['something','anything', 'someone', 'somebody','anybody','anyone'] # list of words to detect general reference
    self.instances = [] # store the instances built throughout the conversation
    self.get_intent_and_slots()

  # function to set the value of self.prompt to a prompt generated by the Prompt Generator

  def set_prompt(self, prompt):
    self.prompt = prompt
    self.tasks = []
    self.intent_and_slot = []
    self.index = 0
    self.tasks = self.utils.tokenize(r'\.', self.prompt)
    self.get_intent_and_slots()

  # function to divide the prompt into tasks, in the form of intent and slots

  def get_intent_and_slots(self):

    for t in self.tasks:
      tokens = self.utils.tokenize(' ', t)


      local_dict = {
            'intent' : tokens[1]
      }

      if tokens[0] == 'do':
        if tokens[2] not in ['a', 'an', 'all']:
          local_dict['instance'] = tokens[2]
        else:
          local_dict['entity'] = tokens[3]

      if len(tokens) > 4:
        if tokens[4] == 'with':
          if local_dict['intent'] == 'update':
            new_values = {}
            old_values = {}
            values = {}
            changed_value = False

            i = 0
            while i < len(tokens[5:]):
              j = 5 + i

              if tokens[j] == 'where':
                changed_value = True
                new_values = values
                values = {}
                i += 1
              else:
                if tokens[j+1] in self.general_existence:
                  if tokens[j+2] == 'like':
                    values[tokens[j]] = f'{tokens[j + 1]} {tokens[j + 2]} {tokens[j + 3]}'
                    i += 4
                  else:
                    values[tokens[j]] = f'{tokens[j + 1]} {tokens[j + 2]} {tokens[j + 3]} {tokens[j + 4]}'
                    i += 5
                else:
                  values[tokens[j]] = tokens[j + 1]
                  i += 2

            if changed_value:
              local_dict['old_values'] = values

            else:
              new_values = values

            local_dict['new_values'] = new_values


          else:
            i = 0
            if len(tokens) > 5:
              while i < len(tokens[5:]):
                j = 5 + i

                if tokens[j+1] in self.general_existence:
                  if tokens[j+2] == 'like':
                    local_dict[tokens[j]] = f'{tokens[j + 1]} {tokens[j + 2]} {tokens[j + 3]}'
                    i += 4
                  else:
                    local_dict[tokens[j]] = f'{tokens[j + 1]} {tokens[j + 2]} {tokens[j + 3]} {tokens[j + 4]}'
                    i += 5
                else:
                  local_dict[tokens[j]] = tokens[j + 1]
                  i += 2

      self.intent_and_slot.append(local_dict)

  # function to generate a user utterance

  def chat(self, sys_act_complete = None, procedure = None, insert_instance = None, active_entity = None, predecesor = None):

    # function to move to the next task

    def set_next_state(self):
      self.get_next_task_lock = True
      if len(self.intent_and_slot)-1 != self.index:
        self.response = 'thanks'
        self.index += 1

    # function to add parameters to the user utterance

    def add_params(self, entity, select_slots = None, insert_instance_ID = None):

      if self.inflection.singular_noun(entity) != False:
        entity = self.inflection.singular_noun(entity)
      self.params = self.query.params_query(self.graph, entity)
      # work on a copy so neither a shared default nor the caller's list is mutated
      select_slots = list(select_slots) if select_slots else []

      if not select_slots:
        # default to every parameter that the procedure has not filled yet
        for k in self.params.keys():
          if k not in procedure.keys():
            select_slots.append(k)

      if len(select_slots) != 0:
        nr_of_params_prob = rand.randint(1,len(select_slots))
        # draw one uniformly random ordering without materialising all n! permutations
        permutation = rand.sample(range(nr_of_params_prob), nr_of_params_prob)

      else:
        set_next_state(self)
        nr_of_params_prob = 0

      i = 1
      while i <= nr_of_params_prob:

        slot = select_slots[permutation[i-1]]
        type_of_param = str(self.params[slot][1])
        key = slot[3:].lower()

        if type_of_param.startswith('XMLSchema') == False:
          # instance
          type_of_instance = rand.choices([1,2,3,4],[0.48, 0.22, 0.22,0.08])[0]
          instance_params = self.query.params_query(self.graph, type_of_param)
          type_params = self.query.class_query(self.graph, type_of_param)
          general_existence_pool = [[2,5],[0,1]] if type_params['category'] == 'human' else [[0,1],[0,1]]
          parameter = list(instance_params.keys())[rand.randint(0,len(instance_params.keys())-1)][3:].lower()
          #print(type_of_instance)
          match type_of_instance:
            case 1:
              name = type_of_param + 'name' + '.txt'
              value = self.utils.read_random_line(name)
              parameter = 'name'
            case 2:
              name = type_of_param + parameter + '.txt'
              value = self.utils.read_random_line(name)
              value = f'{self.general_existence[rand.randint(general_existence_pool[0][0],general_existence_pool[0][1])]} with {parameter} {value}'
            case 3:
              name = type_of_param + parameter + '.txt'
              value = self.utils.read_random_line(name)
              value = f'{self.general_existence[rand.randint(general_existence_pool[0][0],general_existence_pool[0][1])]} with {self.general_existence[rand.randint(general_existence_pool[1][0],general_existence_pool[1][1])]} {value}'
            case 4:
              all_entity_ids = [instance['ID'] for instance in self.query.pre_existing_query(self.graph, type_of_param,'')]
              value = rand.choice(all_entity_ids) if all_entity_ids else type_of_param + self.utils.getID()
              parameter = 'ID'

          if insert_instance_ID:
            for instance in self.instances:
              if insert_instance_ID == instance['ID']:
                instance[key] = [value, parameter]
            if not any(insert_instance_ID == instance['ID'] for instance in self.instances):
              self.instances.append({'ID':insert_instance_ID, key: [value, parameter]})

          if i == nr_of_params_prob and i != 1:
            self.response += f'and {key} is {value}'
          else:
            self.response += f'{key} is {value}{"," if nr_of_params_prob > 1 else ""} '

          i += 1

        else:
          # literal
          type_of_literal = rand.choices([1,2],[0.7, 0.3])[0]
          name = entity + key + '.txt'
          #print(type_of_literal)
          match type_of_literal:
            case 1:
              value = self.utils.read_random_line(name)
            case 2:
              value = self.utils.read_random_line(name)
              value = value[0:rand.randint(1, len(value)-1)] if len(value) > 1 else value
              value = f'something like {value}'
          if i == nr_of_params_prob and i != 1:
            self.response += f'and {key} is {value}'
          else:
            self.response += f'{key} is {value}{"," if nr_of_params_prob > 1 else ""} '

          i += 1

    # function to remove parameters from the user utterance

    def remove_params(self, entity, select_slots = None, insert_instance_ID = None):
      if self.inflection.singular_noun(entity) != False:
        entity = self.inflection.singular_noun(entity)
      self.params = self.query.params_query(self.graph, entity)
      # work on a copy so neither a shared default nor the caller's dictionary is mutated
      select_slots = dict(select_slots) if select_slots else {}

      if not select_slots:
        # default to every procedure slot that is a parameter of the entity
        for k, v in procedure.items():
          if k in self.params.keys():
            select_slots[k] = v

      if len(select_slots) != 0:
        nr_of_params_prob = rand.randint(1,len(select_slots))
        # draw one uniformly random ordering without materialising all n! permutations
        permutation = rand.sample(range(nr_of_params_prob), nr_of_params_prob)
        self.response = 'remove '
      else:
        set_next_state(self)
        nr_of_params_prob = 0

      i = 1
      while i <= nr_of_params_prob:

        keys = list(select_slots.keys())
        key = keys[permutation[i-1]]
        value = select_slots[key]
        key = key[3:].lower()

        if insert_instance_ID:
          for instance in self.instances:
            if insert_instance_ID == instance['ID']:
              if key in instance.keys():
                instance.pop(key)

        if i == nr_of_params_prob and i != 1:
          self.response += f'and {key if rand.randint(0,1) == 0 else value} '
        else:
          self.response += f'{key if rand.randint(0,1) == 0 else value}{"," if nr_of_params_prob > 1 else ""} '
        i += 1

    sys_act = sys_act_complete['act']
    d = self.intent_and_slot[self.index]
    intent = d['intent']
    self.response = ''

    if self.get_next_task_lock:

      templates_length = len(self.templates[intent])-1
      r = rand.randint(0, templates_length)

      if intent not in self.templates['requireVars']:
        self.response = self.templates[intent][r]

        if len(self.intent_and_slot)-1 != self.index and self.get_next_task_lock:
          self.index += 1

      else:
        self.get_next_task_lock = False
        add_info = ''
        entity = ''
        instance = ''

        if 'entity' in d.keys():
          entity = d['entity']
          if self.inflection.singular_noun(entity) != False:
            add_info = 'all <entity> '
          else:
            if entity[0].lower() in self.vowels:
              add_info = 'an <entity> '
            else:
              add_info = 'a <entity> '
        else:
          instance = d['instance']
          add_info = '<instance> '


        if intent == 'update':

          new_values = ''
          if 'new_values' in d.keys():
            add_info += 'by changing <new_values>'
            new_values = self.utils.printDictionary(d['new_values'], extra_words = 'to')

          old_values = ''
          if 'old_values' in d.keys():
            add_info += ' where <old_values>'
            old_values = self.utils.printDictionary(d['old_values'], extra_words = 'is')

          self.response = self.utils.replace_placeholder(self.templates[intent][r], ['<entity>', '<instance>', '<new_values>', '<old_values>'], [entity, instance, new_values, old_values], add_info)

        else:

          if len(d.keys()) > 2 and 'to_spwan' not in d.keys():
            add_info += 'where <values>'
          values = self.utils.printDictionary(d,['intent', 'entity'], extra_words = 'is')
          self.response = self.utils.replace_placeholder(self.templates[intent][r], ['<entity>', '<instance>', '<values>'], [entity, instance, values], add_info)

    elif sys_act != None:

      match intent:
        case 'select':
          if 'instance' in d.keys():
            set_next_state(self)

          else:
            match sys_act:

              case 'showSelect':
                intent_prob = rand.choices([1,2,3,4], [0.42,0.32,0.18,0.08])[0]

                match intent_prob:

                  case 1:
                    set_next_state(self)

                  case 2:
                    add_params(self, procedure['entity'])

                  case 3:
                    remove_params(self, procedure['entity'])
                  case 4:
                    set_next_state(self)
                    self.response = 'cancel the procedure'

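              case 'wrongSelect' | 'wrongFormatSelect' | 'wrongSelectFormat':
                # the system's NLG module names this act 'wrongFormatSelect'; the older spelling is kept as a fallback
                set_next_state(self)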
              case 'removeParams':
                templates_length = len(self.templates['removeParams']) - 1
                r = rand.randint(0, templates_length)
                self.response = self.templates['removeParams'][r]
                self.response += 'select'
              case _:
                self.response = 'I don\'t know what to say'
                set_next_state(self)

        case 'delete':
          match sys_act:
            case 'showDelete':
              delete_ids = [instance['ID'] for instance in procedure['delete_ids']]
              entity = self.inflection.plural(procedure['entity'])

              delete_instance_prob = rand.choices([1,2,3,4,5],[0.49,0.24,0.12,0.1, 0.05])[0]

              match delete_instance_prob:
                case 1:
                  delete_correct_instance_prob = rand.choices([1,2],[0.8,0.2])[0]
                  match delete_correct_instance_prob:
                    case 1:
                      delete_id = rand.choice(delete_ids)
                      self.response = f'delete {delete_id}'
                    case 2:

                      all_entity_ids = [instance['ID'] for instance in self.query.pre_existing_query(self.graph, procedure['entity'],'')]
                      all_entity_ids = [x for x in all_entity_ids if x not in delete_ids]
                      delete_id = rand.choice(all_entity_ids) if all_entity_ids else rand.choice(delete_ids)

                      self.response = f'delete {delete_id}'
                case 2:
                  self.response = f'delete all {entity.lower()}'
                case 3:
                  add_params(self, procedure['entity'])
                case 4:
                  remove_params(self, procedure['entity'])
                case 5:
                  set_next_state(self)
                  self.response = 'cancel the procedure'
            case 'confirmDelete':
              set_next_state(self)
            case 'wrongDelete':
              set_next_state(self)
            case 'dependencyDelete':
              set_next_state(self)
            case 'removeParams':
              templates_length = len(self.templates['removeParams']) - 1
              r = rand.randint(0, templates_length)
              self.response = self.templates['removeParams'][r]
              self.response += 'delete'
            case 'default':
              set_next_state(self)

        case 'update':

          def default_cases(self, entity, probs):

            prob = rand.choices([1,2,3,4,5],probs)[0]

            match prob:
              case 1:
                # complete new values: offer only the parameters not set yet
                select_slots = [k for k in self.params.keys() if k not in procedure['new_values'].keys()]

                add_params(self, entity, select_slots)
              case 2:
                # complete old values (filters): offer only the parameters not set yet
                select_slots = [k for k in self.params.keys() if k not in procedure['old_values'].keys()]

                add_params(self, entity, select_slots)
                if self.response != 'thanks':
                  self.response = 'put to filters ' + self.response
              case 3:
                # remove some of the new values
                remove_params(self, entity, procedure['new_values'])
              case 4:
                # remove some of the old values (filters)
                remove_params(self, entity, procedure['old_values'])
                if self.response != 'thanks':
                  self.response += 'from filters'
              case 5:
                set_next_state(self)
                self.response = 'cancel the procedure'

          match sys_act:

            case 'dependencyUpdate' | 'wrongLiteralDataFormat':
              #TO DO: one more case, insert

              update_instance_prob = rand.choices([1,2],[0.8, 0.2])[0]
              entity = procedure['instance_type'] if 'instance_type' in procedure.keys() else procedure['entity']
              if self.inflection.singular_noun(entity) != False:
                entity = self.inflection.singular_noun(entity)
              self.params = self.query.params_query(self.graph, entity)

              match update_instance_prob:
                case 1:
                  # only for the ones in the wrong format
                  add_params(self, entity, sys_act_complete['params'] if sys_act == 'dependencyUpdate' else sys_act_complete['keys'])

                case 2:
                  default_cases(self, entity, [0.23,0.11,0.11,0.11,0.44])

            case 'showUpdate':

              update_ids = [instance for instance in procedure['update_ids']]
              update_instance_prob = rand.choices([1,2,3],[0.45,0.35,0.2])[0]
              entity = procedure['instance_type'] if 'instance_type' in procedure.keys() else procedure['entity']
              if self.inflection.singular_noun(entity) != False:
                entity = self.inflection.singular_noun(entity)
              self.params = self.query.params_query(self.graph, entity)

              match update_instance_prob:
                case 1:
                  update_correct_instance_prob = rand.choices([1,2],[0.8,0.2])[0]
                  match update_correct_instance_prob:
                    case 1:
                      update_id = rand.choice(update_ids)
                      self.response = f'update {update_id}'
                    case 2:

                      all_entity_ids = [instance['ID'] for instance in self.query.pre_existing_query(self.graph, procedure['entity'],'')]
                      all_entity_ids = [x for x in all_entity_ids if x not in update_ids]
                      update_id = rand.choice(all_entity_ids) if all_entity_ids else rand.choice(update_ids)
                      self.response = f'update {update_id}'
                case 2:
                  entity = self.inflection.plural(procedure['entity'])
                  self.response = f'update all {entity.lower()}'
                case 3:
                  # default_cases already includes the cancel option with its own probability
                  default_cases(self, entity, [0.2,0.25,0.2,0.23,0.12])

            case 'chooseEntity' | 'chooseReject':


              update_keys = procedure['choose_update_keys'] if type(procedure['choose_update_keys']) != str else [procedure['choose_update_keys']]
              update_values = procedure['choose_update_values']
              update_instance_prob = rand.choices([1,2],[0.8,0.2])[0]
              entity = procedure['instance_type'] if 'instance_type' in procedure.keys() else procedure['entity']
              if self.inflection.singular_noun(entity) != False:
                entity = self.inflection.singular_noun(entity)
              self.params = self.query.params_query(self.graph, entity)

              match update_instance_prob:
                case 1:

                  nr_of_params_prob = rand.randint(1,len(update_keys))
                  # draw one uniformly random ordering without materialising all n! permutations
                  permutation = rand.sample(range(nr_of_params_prob), nr_of_params_prob)

                  i = 1

                  while i <= nr_of_params_prob:

                    update_correct_instance_prob = rand.choices([1,2],[0.8,0.2])[0]
                    key = update_keys[permutation[i-1]]

                    match update_correct_instance_prob:

                      case 1:

                        value = update_values[permutation[i-1]]
                        if type(value) == list:
                          value = rand.choice(value)['ID']
                        else:
                          value = value['ID']

                        key = key[3:].lower()

                      case 2:

                        type_of_param = str(self.params[key][1])
                        all_entity_ids = [instance['ID'] for instance in self.query.pre_existing_query(self.graph, type_of_param,'')]
                        value = update_values[permutation[i-1]]
                        if type(value) == list:
                          value = [instance['ID'] for instance in value]
                        else:
                          value = [value['ID']]
                        all_entity_ids = [x for x in all_entity_ids if x not in value]
                        value = rand.choice(all_entity_ids) if all_entity_ids else type_of_param + self.utils.getID()
                        key = key[3:].lower()


                    if i == nr_of_params_prob and i != 1:
                      self.response += f'and {key} is {value}'
                    else:
                      self.response += f'{key} is {value}{"," if nr_of_params_prob > 1 else ""} '

                    i += 1
                case 2:
                  default_cases(self, entity, [0.23,0.11,0.11,0.11,0.44])

            case 'confirmUpdate':
              set_next_state(self)
            case 'wrongUpdate':
              set_next_state(self)
            case 'preExistingEntityCancel':
              set_next_state(self)
            case 'removeParams':
              templates_length = len(self.templates['removeParams']) - 1
              r = rand.randint(0, templates_length)
              self.response = self.templates['removeParams'][r]
              self.response += 'update'
            case 'default':
              set_next_state(self)

        case 'insert':

          def default_cases(self, entity, probabilities, state):

            probs = rand.choices([1,2,3],probabilities)[0]
            # slots that are not yet part of the dialogue state
            select_slots = [k for k in self.params if k not in state]

            if all(x in state.keys() for x in self.params.keys()) and probs == 1:
              probs = 2 if rand.randint(0,1) == 0 else 3

            if not state and probs == 2:
              probs = 3

            match probs:

              case 1:
                add_params(self, entity, select_slots, insert_instance)
              case 2:
                remove_params(self, entity, state, insert_instance)
              case 3:
                # rebuild the list in place; removing items while iterating over it can skip entries
                self.instances[:] = [instance for instance in self.instances if instance['ID'] != insert_instance]

                if len(predecesor) <= 1:
                  set_next_state(self)
                self.response = 'cancel the procedure'

          state = self.query.state_query(self.local_graph, insert_instance)
          entity = active_entity
          # singular_noun returns False when the word is already singular
          singular = self.inflection.singular_noun(entity)
          if singular:
            entity = singular

          self.params = self.query.params_query(self.graph, entity)

          match sys_act:

            case 'requireParams' | 'switchEntity':
              default_cases(self,entity, [0.93,0.05,0.02], state)
            case 'confirmParams':

              for instance in self.instances:
                if insert_instance == instance['ID']:
                  for k, v in state.items():
                    k = k[3:].lower()
                    if k not in instance.keys():
                      instance[k] = v
              if not any(insert_instance == instance['ID'] for instance in self.instances):
                instance = {'ID' : insert_instance}
                for k, v in state.items():
                  instance[k[3:].lower()] = v

                self.instances.append(instance)

              insert_probs = rand.choices([1,2,3],[0.9,0.05,0.05])[0]

              match insert_probs:

                case 1:
                  # sample from the list that is actually indexed (the length was previously taken from self.templates[intent])
                  self.response = rand.choice(self.templates['agree'])
                case 2:
                  self.response = rand.choice(self.templates['disagree'])
                case 3:
                  default_cases(self,entity, [0.85,0.05,0.1], state)
            case 'preExistingEntityCancel' | 'cancelProcedure':

              if insert_instance:

                if len(predecesor) > 1:
                  wrongEntity = f'{entity}{self.utils.getID()}'
                  predecesor = predecesor[0:len(predecesor)-1]
                  correctEntity = predecesor[rand.randint(0,len(predecesor)-1)]
                  self.response = f'switch to {correctEntity if rand.choices([1,2],[0.8,0.2])[0] == 1 else wrongEntity}'

                else:
                  default_cases(self, entity,[0.25,0.25,0.5], state)
              else:
                # with probability 0.5, queue the same insert task again before moving on
                if rand.randint(0,1) == 0:
                  self.intent_and_slot.insert(self.index + 1, {'intent' : 'insert', 'entity' : entity })
                set_next_state(self)
            case 'wrongLiteralDataFormat':

              insert_instance_prob = rand.choices([1,2],[0.9, 0.1])[0]

              match insert_instance_prob:
                case 1:
                  # only for the ones in the wrong format
                  add_params(self, entity, sys_act_complete['keys'])

                case 2:
                  default_cases(self, entity, [0.25,0.25,0.5], state)
            case 'chooseEntity' | 'chooseReject':

              insert_keys = procedure['choose_insert_keys'] if type(procedure['choose_insert_keys']) != str else [procedure['choose_insert_keys']]
              insert_values = procedure['choose_insert_values']
              insert_instance_prob = rand.choices([1,2],[0.9,0.1])[0]

              match insert_instance_prob:
                case 1:

                  nr_of_params_prob = rand.randint(1,len(insert_keys))
                  l = list(permutations(range(0, nr_of_params_prob)))
                  permutation = l[rand.randint(0,len(l)-1)]

                  i = 1

                  while i <= nr_of_params_prob:

                    insert_correct_instance_prob = rand.choices([1,2],[0.8,0.2])[0]
                    key = insert_keys[permutation[i-1]]

                    match insert_correct_instance_prob:

                      case 1:

                        value = insert_values[permutation[i-1]]
                        if type(value) == list:
                          value = rand.choice(value)['ID']
                        else:
                          value = value['ID']

                        key = key[3:].lower()

                      case 2:

                        type_of_param = str(self.params[key][1])
                        all_entity_ids = [instance['ID'] for instance in self.query.pre_existing_query(self.graph, type_of_param,'')]
                        value = insert_values[permutation[i-1]]
                        if type(value) == list:
                          value = [instance['ID'] for instance in value]
                        else:
                          value = [value['ID']]
                        all_entity_ids = [x for x in all_entity_ids if x not in value]
                        value = rand.choice(all_entity_ids) if all_entity_ids else type_of_param + self.utils.getID()
                        key = key[3:].lower()


                    if i == nr_of_params_prob and i != 1:
                      self.response += f'and {key} is {value}'
                    else:
                      self.response += f'{key} is {value}{"," if nr_of_params_prob > 1 else ""} '

                    i += 1
                case 2:
                  default_cases(self, entity, [0.25,0.25,0.5], state)
            case 'removeParams':
              self.response = 'insert it' if rand.randint(0,1) == 0 else f'insert the {entity}'
            case 'askStep':
              prob = rand.choices([1,2,3],[0.45,0.3,0.25])[0]

              match prob:

                case 1:
                  self.intent_and_slot.insert(self.index + 1, {'intent' : 'insert', 'entity' : entity })
                  set_next_state(self)
                case 2:
                  default_cases(self, entity,[0,0,1], state)
                case 3:
                  self.response = 'insert it' if rand.randint(0,1) == 0 else f'insert the {entity}'
            case 'confirm':

              if insert_instance:

                action_prob = rand.choices([1,2,3],[0.5,0.3,0.2] if len(predecesor) > 1 else [0.6,0.4,0])[0]
                not_spawn_flag = True

                for instance in self.instances:
                  if instance['ID'] == insert_instance and 'to_spawn' in instance.keys():
                    not_spawn_flag = False
                    if len(instance['to_spawn']) == 0:
                      self.response = ''
                      for k, v in instance.items():
                        if type(v) == list and k != 'to_spawn':
                          self.response += f'{k} is {v[0]}, '
                      self.response = self.response[:-2]

                    else:
                      pair = instance['to_spawn'][-1]
                      key = pair[0]
                      value = instance[key[3:].lower()]
                      instance['to_spawn'].pop()

                      self.intent_and_slot.insert(self.index + 1, {'intent' : 'insert', 'entity' : pair[1].lower(), value[1] : self.utils.tokenize(' ',value[0])[-1] })
                      set_next_state(self)


                if not_spawn_flag:
                  match action_prob:

                    case 1:
                      self.response = 'insert it' if rand.randint(0,1) == 0 else f'insert the {entity}'
                    case 2:
                      default_cases(self, entity, [0.25, 0.25, 0.5], state)
                    case 3:
                      wrongEntity = f'{entity}{self.utils.getID()}'
                      predecesor = predecesor[0:len(predecesor)-1]
                      correctEntity = predecesor[rand.randint(0,len(predecesor)-1)]
                      self.response = f'switch to {correctEntity if rand.choices([1,2],[0.8,0.2])[0] == 1 else wrongEntity}'
              else:
                set_next_state(self)
            case 'wrongEntity':

              switch_prob = rand.choices([1,2],[0.8,0.2])[0]
              match switch_prob:
                case 1:
                  wrongEntity = f'{entity}{self.utils.getID()}'
                  predecesor = predecesor[0:len(predecesor)-1]
                  correctEntity = predecesor[rand.randint(0,len(predecesor)-1)]
                  self.response = f'switch to {correctEntity if rand.choices([1,2],[0.9,0.1])[0] == 1 else wrongEntity}'
                case 2:
                  set_next_state(self)
            case 'reject':
              insert_instance_prob = rand.choices([1,2,3],[0.9, 0.08,0.02])[0]

              match insert_instance_prob:
                case 1:

                  for instance in self.instances:
                    if insert_instance == instance['ID']:
                      if len(sys_act_complete['pred']) <= 1:
                        key = sys_act_complete['pred'][0]
                        value = instance[key[3:].lower()]
                        instance['to_spawn'] = []
                        #{'intent': 'insert', 'entity': 'employee', 'name': 'John'}
                        self.intent_and_slot.insert(self.index + 1, {'intent' : 'insert', 'entity' : sys_act_complete['type'][0].lower(), value[1] : self.utils.tokenize(' ',value[0])[-1]  })

                      else:

                        instance['to_spawn'] = []
                        for p, t in zip(sys_act_complete['pred'],sys_act_complete['type']):
                          instance['to_spawn'].append([p, t])
                        instance['to_spawn'].pop()
                        key = sys_act_complete['pred'][-1]
                        value = instance[key[3:].lower()]
                        self.intent_and_slot.insert(self.index + 1, {'intent' : 'insert', 'entity' : sys_act_complete['type'][-1].lower(), value[1] : self.utils.tokenize(' ',value[0])[-1]  })

                  set_next_state(self)

                case 2:
                  # only for the params that require pre-existing instances
                  add_params(self, entity, sys_act_complete['pred'])

                case 3:
                  default_cases(self, entity, [0.25,0.25,0.5], state)
            case 'default':
              set_next_state(self)

        case _:
          set_next_state(self)

    return self.response

  def reset_params(self):
    self.prompt = ''
    self.tasks = ''
    self.intent_and_slot = []
    self.index = 0
    self.get_next_task_lock = True
    self.params = {}
    self.instances = []
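
The User Simulator above chooses its next move by drawing a branch number with `rand.choices` and a list of weights (for example `[0.93,0.05,0.02]` after `requireParams`). The sketch below isolates that pattern so the effect of the weights can be inspected. The names `pick_action` and `ACTIONS` are hypothetical and do not exist in the notebook; the branch labels only paraphrase what cases 1, 2 and 3 of `default_cases` do in the `insert` procedure.

```python
import random

# Hypothetical labels for the three branches of default_cases in the 'insert' procedure
ACTIONS = {1: 'add_params', 2: 'remove_params', 3: 'cancel'}

def pick_action(weights, rng=random):
    """Return 1, 2 or 3, drawn with the given relative weights."""
    return rng.choices([1, 2, 3], weights)[0]

# Seeded generator so that repeated runs give the same counts
rng = random.Random(0)
counts = {name: 0 for name in ACTIONS.values()}
for _ in range(10_000):
    counts[ACTIONS[pick_action([0.93, 0.05, 0.02], rng)]] += 1

# Most draws fall in the first branch; the other two are rare
print(counts)
```

A weight of zero disables a branch entirely, which is how the `askStep` case forces a cancellation with `[0,0,1]`.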

## Run the Dialogue Simulator

In [8]:
# templates for generating a user utterance; Each key is a system action with values as a list of possible user utterances to use as a response
# each template will have values added by the User Simulator in natural language

user_templates = {
    'hello' : ['Hello', 'Hi', 'Good morning'],
    'goodbye' : ['Goodbye', 'Have a nice day', 'See you soon', 'bye'],
    'thank' : ['thank you very much','thank you', 'Thanks', 'Thanks for helping me'],
    'insert' : ['insert ', 'add '],
    'select' : ['select ', 'retrieve ','show '],
    'delete' : ['delete '],
    'update' : ['update '],
    'agree' : ['yes', 'correct', 'it is true', 'true'],
    'disagree' : ['no', 'not true', 'false'],
    'remove' : ['remove '],
    'removeParams' : ['continue the ', 'execute the '],
    'cancel' : ['stop ', 'cancel '],
    'switchEntity' : ['switch ', 'change '],
    'requireVars' : ['insert', 'select', 'update', 'delete']
}
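
As a rough illustration of how these templates turn into a user utterance, the sketch below picks a random opening and appends slot/value pairs using the same joining rule as the `chooseEntity` branch of the User Simulator (`key is value, ` for every pair and `and key is value` for the last one). The helper names `render_slots` and `build_utterance`, the `sample_user_templates` excerpt, and the `with` connector are assumptions made for this example; the notebook's own `add_params` may phrase utterances differently.

```python
import random

# Excerpt of the user templates, copied here so the sketch runs on its own
sample_user_templates = {'insert': ['insert ', 'add ']}

def render_slots(slots):
    """Join slot/value pairs following the rule used in the chooseEntity branch."""
    pairs = list(slots.items())
    n = len(pairs)
    text = ''
    for i, (key, value) in enumerate(pairs, start=1):
        if i == n and i != 1:
            # the last of several pairs is introduced with 'and'
            text += f'and {key} is {value}'
        else:
            # a comma is added only when more than one pair is rendered
            text += f'{key} is {value}{"," if n > 1 else ""} '
    return text

def build_utterance(intent, entity, slots, rng=random):
    """Hypothetical helper: template opening + entity + rendered slots."""
    opening = rng.choice(sample_user_templates[intent])
    return f'{opening}{entity} with {render_slots(slots)}'

print(render_slots({'name': 'John', 'age': 30}))  # name is John, and age is 30
print(build_utterance('insert', 'employee', {'name': 'John'}, random.Random(0)))
```

Note that a single pair keeps a trailing space, exactly as the original string concatenation does.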

In [None]:
# create the graph and the namespace

g = r.Graph()
n = r.Namespace("http://www.semanticweb.org/ionut/ontologies/2023/2/projects/")
g.parse("generalKB.ttl")


In [None]:
# provide the dictionary of supported intents. Each key is an intent with values as a list of keywords to be detected in a user utterance

intents = {
    'hello' : ['hello', 'hi', 'good morning'],
    'goodbye' : ['goodbye', 'have a nice day', 'see you soon', 'bye'],
    'thank' : ['thank', 'thanks'],
    'insert' : ['insert', 'add'],
    'select' : ['select', 'give', 'retrieve','show'],
    'delete' : ['delete'],
    'update' : ['update'],
    'agree' : ['yes', 'correct', 'it is true', 'true'],
    'disagree' : ['no', 'not true', 'false'],
    'remove' : ['remove'],
    'cancel' : ['stop', 'cancel'],
    'switchEntity' : ['switch', 'change']
}

# provide the dictionary of templates for system's response. Each key is a system action with values as a list of possible system responses, with/without placeholders (<placeholder>)
# a placeholder is live-substituted with certain values by the TOD system

templates_w_placeholders = {

    'hello' : ['Hello!', 'Hi!'],
    'goodbye' : ['Goodbye!', 'Have a nice day!'],
    'welcome' : ['You are welcome.', 'No problem.', 'For nothing.'],
    'wrongFormatSelect' : ['The filters you are trying to use are in the wrong format. For ID you have "entity + 20 digits" and what entities I support are: <entities>.'],
    'wrongSelect' : ['There are no instances with the specified parameters. Please try again!', 'No instance matches your values, please adjust the filters.'],
    'showSelect' : ['Here is the list of <entity> that match your search: ','There are some <entity> that fit your filters: '],
    'confirmDelete' : ['I did delete the following IDs: <instances>. Proceed with another process.', 'The instances with ID <instances> were successfully removed. Please initiate a process.'],
    'wrongDelete' : ['Your delete query contains mistakes. Please try another process.', 'No instance is affected by your delete query. You may want to restart the process.'],
    'dependencyDelete' : ['I cannot delete some instances because others depend on them: <dependency_instances> Delete those first. '],
    'showDelete': ['You want to delete an instance, but I found more than one matching your filters. Select one or all to be deleted from the following list: <instances>'],
    'confirmUpdate' : ['I have updated the instances that match your search with the new values: <instances>', 'Updated successfully, here is the list with the latest versions: <instances>'],
    'wrongUpdate' : ['Your update query does not affect any instance from the graph. Please try again!', 'No instance matches your values, please adjust the filters or the new values you wish to insert and then try to update again.'],
    'dependencyUpdate' : ['You want to change the value for some params (<params>) but they require pre-existing instances in the graph. Please make sure that the mentioned instances exist before updating.'],
    'showUpdate' : ['You want to update an instance, but the filters you chose match more than one. Please refer to one from the list below by ID: <instances>'],
    'confirm' : ['I did insert the instance. ', 'The instance was successfully inserted. '],
    'wrongLiteralDataFormat' : ['Some values for keys which do not require pre-existing entities in the graph are in the wrong format (<keys>). Please insert other ones.'],
    'cancelProcedure' : ['Okay, I have canceled the procedure. ','I did stop the procedure. ', 'The ongoing procedure was stopped. '],
    'askStep' :['Would you like to cancel the whole procedure or to remove some parameters?'],
    'reject' : ['You want to put a <entity>, but some params (<preds>) require pre-existing instances in the graph. Please insert the corresponding instances (<types>) and then proceed with the <entity>.' ,
                       'A <entity> has params (<preds>) which need pre-existing instances in the graph (<types>). Please insert those first.'
                      ],
    'chooseReject': ['For <preds> you have to choose from the provided list by mentioning its ID.', 'You did not choose from the provided list of instances. Please do so for <preds>.'],
    'removeParams' : ['The following params were removed: <params>','Removing <params> from the process is completed.'],
    'requireParams' : ['I cannot insert the <entity> unless you give me the necessary params: <mandatory>',
                       'The <entity> requires: <mandatory>'
        ],
    'confirmParams' : ['I need you to confirm that you want to insert a <entity> with the following params: <params>',
                       'The <entity> will be inserted with the following params: <params>. Is it correct?'
                      ],
    'preExistingEntityCancel' : ['There is another instance in the graph with the same params (maybe some optionals too): <params>. The process was canceled. ', 'I have found an instance with the same mandatory params in the knowledge base: <params>, maybe some optionals. I did stop the procedure. '],
    'chooseEntity' : ['I have found <n> param(s) with entities which have values for some param(s) which might suit your intent: ', 'It seems that some values you requested have matching entities in my knowledge base: '],
    'wrongEntity' : ['The entity you wanted to switch to is wrong.', 'There is no entity with that ID.'],
    'switchEntity' : ['I switched the entity to <active_entity>. ', 'The switch to <active_entity> was done successfully. '],
    'default' : ['What you said makes no sense right now. Please initiate or continue a procedure.'],
    'requireVars' : ['chooseReject','wrongLiteralDataFormat','wrongFormatSelect', 'showDelete', 'showUpdate', 'confirmUpdate', 'dependencyUpdate', 'dependencyDelete','confirmDelete', 'showDelete', 'showSelect','wrongEntity','switchEntity','confirm','reject','requireParams','confirmParams','removeParams','cancelProcedure', 'preExistingEntityCancel', 'chooseEntity']
}


# generate a desired amount of scenarios, for x times
# there are 3 files generated: train.json (data to use for training a NLU model), statistics.txt (details about a scenario), statistics_graph.txt (details about the general KB)
# the files download automatically
# the graph is re-initialized each turn, to have fair statistics

bot = TODsystem(intents, templates_w_placeholders, g, n, user_templates)
scenarios_number = input('Please, tell me how many scenarios you want: ')
for i in range(1):
  g = r.Graph()
  g.parse("generalKB.ttl")
  bot.graph = g
  bot.chat_pipeline(scenarios_number, i)
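
To make the role of the two dictionaries above more concrete, the sketch below shows one plausible way to consume them: keyword-based intent detection over a user utterance, and substitution of `<placeholder>` tokens in a system template. Both functions (`detect_intents`, `fill_template`) and the `sample_` dictionaries are hypothetical and written only for this illustration; the TOD system in this notebook has its own implementation, which may match keywords and fill placeholders differently.

```python
import re

# Small excerpts of the dictionaries above, so the sketch runs on its own
sample_intents = {
    'hello': ['hello', 'hi', 'good morning'],
    'insert': ['insert', 'add'],
    'select': ['select', 'give', 'retrieve', 'show'],
}
sample_templates = {
    'switchEntity': ['I switched the entity to <active_entity>. '],
    'requireParams': ['The <entity> requires: <mandatory>'],
}

def detect_intents(utterance, intents):
    """Return every intent with a keyword that occurs as a whole word or phrase."""
    text = utterance.lower()
    found = []
    for intent, keywords in intents.items():
        # \b keeps 'hi' from matching inside words such as 'this'
        if any(re.search(rf'\b{re.escape(k)}\b', text) for k in keywords):
            found.append(intent)
    return found

def fill_template(template, values):
    """Replace each <name> with values[name]; unknown placeholders stay as they are."""
    return re.sub(r'<(\w+)>',
                  lambda m: str(values.get(m.group(1), m.group(0))),
                  template)

print(detect_intents('Please show this project', sample_intents))  # ['select']
print(fill_template(sample_templates['switchEntity'][0],
                    {'active_entity': 'employee'}))  # I switched the entity to employee.
```

Leaving unknown placeholders untouched makes missing values easy to spot in the generated dialogues, which is a design choice of this sketch rather than a documented behaviour of the notebook.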
