The following descriptions assume that the Dialog Platform manages its files as local files, which is the primary method.
Files can also be managed in a database instead of as local files; the configuration is done in the same way.
Scenario files use the extension aiml, and the other (non-scenario) configuration files use the extension txt.
The directory structure of the scenario files is as follows.
storage Various File Directories
├── braintree AIML Extraction File Directory
├── categories Scenario File Directory
│ └── *.aiml
├── conversations Dialog History Information Directory
│ └── *.conv
├── debug Debug Information Directory
│ ├── duplicates.txt
│ ├── errors.txt
│ └── *.log
├── lookups File Directory for Substitution Process
│ ├── regex.txt
│ ├── denormal.txt
│ ├── gender.txt
│ ├── normal.txt
│ ├── person.txt
│ └── person2.txt
├── maps map Element Usage File Directory
│ └── *.txt
├── nodes Node Processing Class Definition File Directory
│ ├── pattern_nodes.conf
│ └── template_nodes.conf
├── properties File Directory for Property lists
│ ├── nlu_servers.yaml
│ ├── defaults.txt
│ └── properties.txt
├── security File Directory for Security
│ └── usergroups.yaml
└── sets Set Usage File Directory
└── *.txt
The files to be used are defined in the config file (config.yaml); which data types are used is determined by whether the following entities are set.
There are two kinds of entities, those for common use and those for specific nodes, but both are defined and managed in the same way.
Among the following entities, those with "Automatic generation" set to "No" can be edited by the user and are loaded at startup.
Entities with "Single file" set to "Yes" can specify only one target file; those with "No" can specify multiple files by specifying a directory.
- Common Entities
Entity | Content | Single file | Automatic generation | Description |
---|---|---|---|---|
binaries | Tree data | No | Yes | Stores binary data for the search tree expanding AIML. |
braintree | Tree data | No | Yes | Stores XML formatted data for the search tree expanding AIML. |
categories | dialog AIML | No | No | Stores AIML files used to control dialog. |
conversations | dialog history information | No | Yes | Stores the per-user history files of dialog processing, including past exchanges and the values of variables. |
defaults | Property List | Yes | No | Stores definition files for global variables (name) that are expanded at initial startup. |
duplicates | Duplicate specification information | Yes | Yes | (For debugging) Stores a file of category duplicate information found when expanding AIML files. |
errors | Error Information | Yes | Yes | (For debugging) Stores an error information file when expanding AIML files. |
license_keys | License Key | Yes | No | Stores license key files required for external connections. |
pattern_nodes | Class definition list | Yes | No | Stores the processing class definition file for each node of pattern. |
postprocessors | Class definition list | Yes | No | Stores the processing class definition file for editing response statements. |
preprocessors | Class definition list | Yes | No | Stores the processing class definition file for editing a user utterance in a request. |
properties | Property List | Yes | No | Stores the property definition file used by the bot<pattern_bot> node of the pattern and the bot<template_bot> node of the template. |
spelling_corpus | Spelling Information | No | No | Stores the corpus file used by the spelling checker. |
template_nodes | Class definition list | Yes | No | Stores the processing class definition file for each node in template. |
- Entity for the Pattern node
Entity | Stored Content | Single file | Automatic generation | Description |
---|---|---|---|---|
regex_templates | Regular expression list | Yes | No | Stores the regular expression list file used in the template specification of the regex<pattern_regex> node. |
sets | Target word list | No | No | Stores the word list file used by the set<pattern_set> node. |
- Entity for the Template node
Entity | Stored Content | Single file | Automatic generation | Description |
---|---|---|---|---|
denormal | Transformation dictionary | Yes | No | Stores the dictionary file used for transformations in the denormalize<template_denormalize> node. |
gender | Transformation dictionary | Yes | No | Stores the dictionary file used for transformations in the gender<template_gender> node. |
learnf | Category list | No | Yes | Stores the category information created by the learnf<template_learnf> node on a per-user basis. |
logs | Log Information | No | Yes | Stores the log information created by the log<template_log> node on a per-user basis. |
maps | Property List | No | No | Stores the dictionary file used for translation on the map<template_map> node. |
normal | Transformation dictionary | Yes | No | Stores the dictionary file used for transformations on the normalize<template_normalize> node. |
person | Transformation dictionary | Yes | No | Stores the dictionary file used for transformations on the person<template_person> node. |
person2 | Transformation dictionary | Yes | No | Stores the dictionary file used for transformations on the person2<template_person2> node. |
rdfs | RDF Data List | No | No | Stores the definition file of the RDF<RDF_Support> data to be processed by the RDF related nodes. |
rdf_updates | RDF Update Information | Yes | Yes | Stores change history information for RDF data. |
usergroups | Security Information | Yes | No | Stores the role definition file used by the authorise<template_authorise> node. |
In the client configuration (console in this example), a section called storage has entities as a subsection. For each entity, it specifies the name of the I/O control method (store) to use as the file management method.
Here, the store method name file is specified.
console:
  storage:
    entities:
      binaries: file
      braintree: file
      categories: file
      conversations: file
      defaults: file
      duplicates: file
      errors: file
      license_keys: file
      pattern_nodes: file
      postprocessors: file
      preprocessors: file
      properties: file
      spelling_corpus: file
      template_nodes: file
      regex_templates: file
      sets: file
      denormal: file
      gender: file
      learnf: file
      logs: file
      maps: file
      normal: file
      person: file
      person2: file
      rdf: file
      rdf_updates: file
      usergroups: file
Here, for the store named file, "type: file" specifies that a storage engine performing local file I/O is used.
For local file I/O (file specification), the name of the actual storage engine is the entity name + '_storage'.
The storage engine for each entity is configured in the 'config' subsection as follows:
stores:
  file:
    type: file
    config:
      binaries_storage:
        file: ./storage/braintree/braintree.bin
      braintree_storage:
        file: ./storage/braintree/braintree.xml
      categories_storage:
        dirs: ./storage/categories
        subdirs: true
        extension: aiml
      conversations_storage:
        dirs: ./storage/conversations
      defaults_storage:
        file: ./storage/properties/defaults.txt
      duplicates_storage:
        file: ./storage/debug/duplicates.txt
      errors_storage:
        file: ./storage/debug/errors.txt
      license_keys_storage:
        file: ./storage/licenses/license.keys
      pattern_nodes_storage:
        file: ./storage/nodes/pattern_nodes.conf
      postprocessors_storage:
        file: ./storage/processing/postprocessors.conf
      preprocessors_storage:
        file: ./storage/processing/preprocessors.conf
      properties_storage:
        file: ./storage/properties/properties.txt
      spelling_corpus_storage:
        file: ./storage/spelling/corpus.txt
      template_nodes_storage:
        file: ./storage/nodes/template_nodes.conf
      regex_templates_storage:
        file: ./storage/lookups/regex.txt
      sets_storage:
        dirs: ./storage/sets
        extension: txt
      denormal_storage:
        file: ./storage/lookups/denormal.txt
      gender_storage:
        file: ./storage/lookups/gender.txt
      learnf_storage:
        dirs: ./storage/learnf
      logs_storage:
        dirs: ./storage/debug
      maps_storage:
        dirs: ./storage/maps
        extension: txt
      normal_storage:
        file: ./storage/lookups/normal.txt
      person_storage:
        file: ./storage/lookups/person.txt
      person2_storage:
        file: ./storage/lookups/person2.txt
      rdfs_storage:
        dirs: ./storage/rdfs
        subdirs: true
        extension: txt
      rdf_updates_storage:
        dirs: ./storage/rdf_updates
      usergroups_storage:
        file: ./storage/security/usergroups.yaml
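The naming rule above (entity name + '_storage') can be illustrated with a small lookup. This is only a sketch: the function name resolve_engine_config is hypothetical, and the dictionaries stand in for the parsed entities and stores sections rather than for the platform's actual loader.

```python
def resolve_engine_config(entities, stores, entity):
    """Return (engine_name, settings) for one entity (hypothetical helper)."""
    store_name = entities[entity]        # store method name, e.g. "file"
    engine_name = f"{entity}_storage"    # entity name + '_storage'
    settings = stores[store_name]["config"][engine_name]
    return engine_name, settings


# Hand-written stand-ins for the parsed 'entities' and 'stores' sections.
entities = {"categories": "file", "usergroups": "file"}
stores = {
    "file": {
        "type": "file",
        "config": {
            "categories_storage": {
                "dirs": "./storage/categories",
                "subdirs": True,
                "extension": "aiml",
            },
            "usergroups_storage": {"file": "./storage/security/usergroups.yaml"},
        },
    }
}

engine_name, settings = resolve_engine_config(entities, stores, "categories")
print(engine_name, settings["dirs"])
```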
The definitions in the 'config' subsection take one of two forms, depending on whether the entity uses a single file or multiple files.
For entities that specify a single file, the 'file' attribute specifies the file path.
usergroups_storage:
  file: ./storage/security/usergroups.yaml
For an entity that can use multiple files, specify the following three attributes. However, for automatically generated entities, only the directory path is specified.
- dirs: Specifies the target file directory path.
- subdirs: Specifies (true/false) whether to search subdirectories under the target file directory.
- extension: Specifies the extension of the files to load.
categories_storage:
  dirs: ./storage/categories
  subdirs: true
  extension: aiml
conversations_storage:
  dirs: ./storage/conversations
The following four storage engines are not automatically generated and can specify multiple files.
- categories_storage
- sets_storage
- maps_storage
- rdfs_storage
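How the file, dirs, subdirs, and extension attributes could select files is sketched below. It is an illustration under assumptions, not the platform's loader: collect_files is a hypothetical name, dirs is treated as a single directory path, and subdirs is assumed to default to false when omitted.

```python
import tempfile
from pathlib import Path


def collect_files(settings):
    """List the files selected by one storage engine's settings (sketch)."""
    if "file" in settings:                      # single-file entity
        return [Path(settings["file"])]
    root = Path(settings["dirs"])               # assumption: one directory
    pattern = "*." + settings["extension"] if "extension" in settings else "*"
    walker = root.rglob if settings.get("subdirs", False) else root.glob
    return sorted(p for p in walker(pattern) if p.is_file())


# Build a throwaway directory tree to try the settings on.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp) / "categories"
    (base / "sub").mkdir(parents=True)
    (base / "hello.aiml").write_text("<aiml/>")
    (base / "sub" / "weather.aiml").write_text("<aiml/>")
    (base / "notes.txt").write_text("not a scenario file")

    found = collect_files({"dirs": str(base), "subdirs": True, "extension": "aiml"})
    top_only = collect_files({"dirs": str(base), "subdirs": False, "extension": "aiml"})

print([p.name for p in found])
print([p.name for p in top_only])
```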
An example of using Redis for database management is shown below.
For an entity managed by Redis, specify the store method name redis in the entities subsection of storage.
console:
  storage:
    entities:
      binaries: file
      braintree: file
      categories: redis
      :
The entities that can be input and output through Redis are as follows.
- binaries : Stores binary data for the AIML expanded search tree.
- braintree : Stores XML format data for the AIML expanded search tree.
- conversations : Stores the per-user history of dialog processing, including past exchanges and the values of variables.
- duplicates : (For debugging) Stores category duplicate information when extracting AIML files.
- errors : (For debugging) Stores error information when extracting AIML files.
- learnf : Stores the category information created by the learnf<template_learnf> node for each user.
- logs : Stores the log information created by the log<template_log> node for each user.
In the 'config' subsection, specify the common parameters required to use Redis; the actual I/O, including the key settings, is performed by the storage engine of each entity.
stores:
  redis:
    type: redis
    config:
      host: localhost
      port: 6379
      password: xxxx
      db: 0
      prefix: programy
      drop_all_first: false
This section describes the editable definition files, other than AIML, that are commonly used.
The files specified by the following entities are written in the form 'Name: Value' and are used to set the values of variables and similar items.
- defaults : Defines the initial values of global variables (name).
- properties : Defines the bot properties used by the bot<pattern_bot> node of the pattern and the bot<template_bot> node of the template.
- maps : Defines the key/value relationships used by the map<template_map> node.
- nlu_servers : Configures the NLU servers.
The defaults entity lets you set the values of global variables (name) at initial startup for use in scenarios. However, if a dialog history exists for a user, this setting has no effect because the latest values in the history are used.
That is, it is only valid for new users.
The following example sets the name variable initial_variable to the value "initial value".
initial_variable: initial value
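A minimal sketch of reading the 'Name: Value' format and applying the new-user rule follows. Both function names are hypothetical, and the assumption that an existing history replaces the defaults entirely is an illustration of the rule above, not a statement about the platform's internals.

```python
def parse_name_value(text):
    """Parse 'Name: Value' lines into a dict, splitting on the first colon."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue                      # skip blank or malformed lines
        name, value = line.split(":", 1)
        values[name.strip()] = value.strip()
    return values


def initial_variables(defaults, history=None):
    """Assumption: a user with history keeps the history values unchanged."""
    return dict(history) if history else dict(defaults)


defaults = parse_name_value("initial_variable: initial value\n")
print(initial_variables(defaults))                                   # new user
print(initial_variables(defaults, {"initial_variable": "changed"}))  # returning user
```

Splitting on the first colon only keeps values that themselves contain a colon intact.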
The properties entity sets parameters that determine the default behavior of the bot, as well as the information referenced by the bot node.
Entity | Content | Description | Default Value |
---|---|---|---|
name | bot name | The value obtained when name is specified for the bot<template_bot> name attribute. | (If unspecified, the value of default-get) |
birthdate | bot creation date | The value obtained when birthdate is specified for the bot<template_bot> name attribute. | (If unspecified, the value of default-get) |
grammar_version | grammar version | The value obtained when grammar_version is specified for the bot<template_bot> name attribute. | (If unspecified, the value of default-get) |
app_version | app version | The value obtained when app_version is specified for the bot<template_bot> name attribute. | (If unspecified, the value of default-get) |
default-response | default response | The response returned if no pattern matches. | unknown |
default-get | default get | The string returned by a get on an undefined variable. | unknown |
joiner_terminator | end-of-sentence character | Specifies the character string appended to the end of the response sentence. If not specified, nothing is appended. | . |
joiner_join_chars | excluded end-of-sentence character | Specifies the characters that suppress the automatic appending of joiner_terminator: if the response already ends with one of them, the terminator is not appended. If not specified, the joiner_terminator string is always appended. | .?! |
splitter_split_chars | sentence separator | Specifies the character used internally to divide a sentence. If the utterance contains the specified character, it is treated as multiple sentences, and the response is a string that combines the responses to each sentence. However, metadata returns only what was set for the final sentence. If not specified, the utterance is treated as one sentence. | . |
punctuation_chars | delimiter | Specifies the characters to be treated as delimiters. Delimiters are excluded from matching: the matching process is performed on the utterance and response sentences with the delimiters removed. | (None) |
- Configuration Example
name:Basic Response
birthdate:March 01, 2019
grammar_version:0.0.1
app_version: 0.0.1
default-response: Sorry, I didn't understand.
default-get: I don't know.
version: v0.0.1
joiner_terminator: .
joiner_join_chars: .?!
splitter_split_chars: .
punctuation_chars: ;'",!()[]:’”;
joiner_terminator specifies the character string that is automatically appended to the end of the response sentence. The following example shows the response returned for the utterance "Hello".
- Configuration Example
joiner_terminator: .
<category>
<pattern> Hello </pattern>
<template>Let's have a great day. </template>
</category>
Output: Let's have a great day.
To prevent a punctuation mark from being added at the end of the dialog API response, define the following. (You can disable the terminator by leaving the value after ":" blank.)
joiner_terminator:
If the value is left blank, no punctuation is added to the response.
Output: Let's have a great day
If joiner_join_chars is not specified, the string specified by joiner_terminator is appended even when the response already ends with a terminating character such as a punctuation mark. For example, responses such as "Let's have a great day." or "I feel great!" are returned as "Let's have a great day.." or "I feel great!.". Specifying joiner_join_chars prevents this.
- Configuration Example
joiner_terminator: .
joiner_join_chars: .?!
<category>
<pattern> Hello </pattern>
<template>Let's have a great day.</template>
</category>
<category>
<pattern>It's nice weather today, too.</pattern>
<template> I feel great! </template>
</category>
Input: Hello
Output: Let's have a great day.
Input: It's nice weather today, too
Output: I feel great!
If joiner_join_chars is unspecified, the string specified in joiner_terminator is always appended.
joiner_terminator: .
joiner_join_chars:
Input: Hello
Output: Let's have a great day..
Input: It's nice weather today, too
Output: I feel great!.
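The interaction of joiner_terminator and joiner_join_chars described above can be sketched as a single function. The name apply_terminator is hypothetical, and trimming surrounding whitespace from the template text is an assumption inferred from the outputs shown, not a confirmed detail of the platform.

```python
def apply_terminator(response, terminator=".", join_chars=".?!"):
    """Append the terminator unless the response already ends with a join char."""
    text = response.strip()            # assumption: surrounding blanks are dropped
    if not terminator or not text:
        return text                    # blank terminator: nothing is appended
    if join_chars and text[-1] in join_chars:
        return text                    # already terminated, do not append again
    return text + terminator


print(apply_terminator("Let's have a great day."))
print(apply_terminator(" I feel great! "))
print(apply_terminator("Let's have a great day.", join_chars=""))
print(apply_terminator("I feel great!", join_chars=""))
```

The last two calls reproduce the doubled endings ("..", "!.") that occur when joiner_join_chars is left blank.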
splitter_split_chars specifies the character used internally to divide a sentence. If the utterance contains the specified character, it is treated as multiple sentences, and the response is a string that combines the responses to each sentence. For the utterance "Hello. It's nice weather today, too.", if you specify "." for splitter_split_chars, two sentences are processed: "Hello." and "It's nice weather today, too.".
- Configuration Example
joiner_terminator: .
splitter_split_chars: .
<category>
<pattern>Hello. </pattern>
<template>It's nice weather today, too.</template>
</category>
<category>
<pattern>Let's have a great day. </pattern>
<template>I feel great.</template>
</category>
Output: Let's have a great day. I feel great.
If splitter_split_chars is not specified, the utterance is not split, so "Hello. It's nice weather today, too." is matched as a single sentence, and the default response is returned because no pattern in the AIML above matches it.
splitter_split_chars:
Output: Sorry, I didn't understand.
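The splitting behavior can be sketched as follows. This is an illustration only: the dictionary lookup is a hypothetical stand-in for AIML pattern matching, and dropping the split character from each sentence before lookup is an assumption.

```python
import re


def split_utterance(utterance, split_chars="."):
    """Split an utterance into sentences on any of the split characters."""
    if not split_chars:
        return [utterance.strip()]     # unspecified: treat as one sentence
    parts = re.split("[" + re.escape(split_chars) + "]", utterance)
    return [p.strip() for p in parts if p.strip()]


def respond(utterance, answers, split_chars=".",
            default="Sorry, I didn't understand."):
    """Answer each sentence separately and join the responses with a space."""
    sentences = split_utterance(utterance, split_chars)
    return " ".join(answers.get(s, default) for s in sentences)


# Hypothetical lookup table standing in for the two AIML categories.
answers = {
    "Hello": "It's nice weather today, too.",
    "Let's have a great day": "I feel great.",
}

print(split_utterance("Hello. Let's have a great day."))
print(respond("Hello. Let's have a great day.", answers))
print(respond("Hello. Let's have a great day.", answers, split_chars=""))
```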
punctuation_chars specifies the characters to be treated as delimiters. Delimiters are removed from utterance sentences and are excluded from the matching process. For the inputs "Hello." and "Hello", the characters in punctuation_chars are ignored and both are treated as the same utterance.
- Configuration Example
punctuation_chars: ;'",!()[]:’”;
<category>
<pattern>Hello</pattern>
<template>Let's have a great day</template>
</category>
Input: Hello.
Output: Let's have a great day.
Input: Hello
Output: Let's have a great day.
If punctuation_chars is not specified, "." is also part of the match, so "Hello." and "Hello" are treated as different utterances.
punctuation_chars:
Input: Hello
Output: Let's have a great day.
Input: Hello.
Output: Sorry, I didn't understand.
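Removing delimiters before matching can be sketched as below. The function name strip_punctuation and the character set used in the example are hypothetical; the set deliberately includes "." so that the "Hello." case from the text can be shown.

```python
def strip_punctuation(text, punctuation_chars):
    """Remove every delimiter character, then tidy the remaining whitespace."""
    table = str.maketrans("", "", punctuation_chars)
    return " ".join(text.translate(table).split())


chars = ".,!?;:"   # hypothetical delimiter set for this illustration

# With delimiters configured, both inputs reduce to the same text.
print(strip_punctuation("Hello.", chars) == strip_punctuation("Hello", chars))

# With no delimiters configured, the trailing period stays and the texts differ.
print(strip_punctuation("Hello.", "") == strip_punctuation("Hello", ""))
```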
For the maps entity, the map node references information by file name (excluding the extension), so you can separate files by type of information. The following example is from prefectural_office.txt, which lists the relationships between prefectures and prefectural capitals.
Tokyo Metropolis: Tokyo
Tokyo: Tokyo
Kanagawa Prefecture: Yokohama City
Kanagawa: Yokohama City
Osaka Prefecture: Osaka City
Osaka: Osaka City
:
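Because the map node refers to a map by its file name without the extension, a loader could key each table by the file stem. The sketch below assumes this layout; load_maps is a hypothetical name and the temporary directory stands in for ./storage/maps.

```python
import tempfile
from pathlib import Path


def load_maps(directory, extension="txt"):
    """Load every map file into {file stem: {key: value}} (sketch)."""
    maps = {}
    for path in sorted(Path(directory).glob(f"*.{extension}")):
        entries = {}
        for line in path.read_text(encoding="utf-8").splitlines():
            if ":" in line:
                key, value = line.split(":", 1)
                entries[key.strip()] = value.strip()
        maps[path.stem] = entries          # map name = file name without extension
    return maps


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "prefectural_office.txt").write_text(
        "Tokyo: Tokyo\nKanagawa: Yokohama City\nOsaka: Osaka City\n",
        encoding="utf-8",
    )
    maps = load_maps(tmp)

print(maps["prefectural_office"]["Kanagawa"])
```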
The nlu_servers entity sets the access URL (endpoint) and API key used within the nlu node. Please contact COTOBA DESIGN for the endpoints and API keys to use within the nlu node. (https://www.cotoba.net)
The following example shows how to set two URLs: the first URL has no API key set, and the second URL has an API key set.
nlu:
  - url: http://localhost:5201/run
  - url: http://localhost:3000/run
    apikey: test_key
The files specified by the following entity list the words and strings to be processed.
- sets : Defines a list of words to be matched by the set<pattern_set> node.
For the sets entity, the set node references information by file name (excluding the extension), so you can separate files by type of information. In the case of Japanese, the word segmentation performed during matching may prevent a match, so matching is performed on character strings instead of words.
The following is an example of prefecture.txt, which lists prefecture names:
Tokyo Metropolis
Tokyo
Kanagawa Prefecture
Kanagawa
Osaka Prefecture
Osaka
:
The file specified by the following entity takes the form 'regular expression name: regular expression string'.
- regex_templates : Defines the list of regular expressions used in the template specification of the regex<pattern_regex> node.
The regex_templates entity defines regular expression strings in a common file for use in the matching performed by regex nodes. On the regex node side, the template attribute specifies the regular expression name. Regular expressions must be written on a word basis.
A description example is as follows.
realize:REALI[Z|S]E
elevator: elevator | lift
:
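A sketch of loading this format and matching a single word against a named expression is shown below. The helper names are hypothetical, and both case-insensitive matching and whole-word (fullmatch) matching are assumptions made for the illustration.

```python
import re


def load_regex_templates(text):
    """Parse 'name: regex' lines into compiled patterns keyed by name."""
    templates = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        name, pattern = line.split(":", 1)
        name, pattern = name.strip(), pattern.strip()
        if name and pattern:               # skip incomplete lines
            templates[name] = re.compile(pattern, re.IGNORECASE)
    return templates


templates = load_regex_templates("realize:REALI[Z|S]E\n")


def word_matches(template_name, word):
    """True if the whole word matches the named regular expression."""
    return templates[template_name].fullmatch(word) is not None


print(word_matches("realize", "realise"))
print(word_matches("realize", "realized"))
```

Inside a character class, "|" is a literal character, so [Z|S] also accepts "|"; [ZS] is the tighter equivalent.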
The files specified by the following entities take the form "Value to be converted","Converted value" and are used to create conversion tables.
- normal : Defines the conversion table for the normalize<template_normalize> node.
- denormal : Defines the conversion table for the denormalize<template_denormalize> node.
- gender : Defines the conversion table for the gender<template_gender> node.
- person : Defines the conversion table for the person<template_person> node.
- person2 : Defines the conversion table for the person2<template_person2> node.
For the normal entity, symbols and similar characters in strings are converted into independent words by defining them as follows. For alphabetic characters, if the first character of "Value to be converted" is not ' ' (blank), the entry is assumed to be a symbol conversion and every match in the target string is converted (character substitution).
On the other hand, if the first character is ' ' (blank), the conversion is done by word (word substitution).
The normalize conversion is performed in the following order: character conversion, then word conversion. In both cases, a blank is inserted before and after the converted text.
As an example of combining the two conversions, you can convert "Mr." to " mister dot " by converting "." to "dot" and "Mr" to "mister".
Japanese words are converted to units.
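The two-stage conversion described above (character substitution first, then word substitution) can be sketched as follows. This is an illustration under assumptions: the function name normalize and the rule list are hypothetical, and the padding blanks are collapsed at the end only to keep the printed result readable.

```python
import re


def normalize(text, rules):
    """Apply character rules first, then word rules (leading blank = word rule)."""
    char_rules = [(src, dst) for src, dst in rules if not src.startswith(" ")]
    word_rules = [(src.strip(), dst) for src, dst in rules if src.startswith(" ")]

    for src, dst in char_rules:
        text = text.replace(src, f" {dst} ")        # every occurrence, padded
    for src, dst in word_rules:
        # Match only when the source stands alone between blanks or text edges.
        pattern = r"(?<!\S)" + re.escape(src) + r"(?!\S)"
        text = re.sub(pattern, f" {dst} ", text)
    return " ".join(text.split())                   # collapse padding (assumption)


rules = [
    (".", "dot"),
    ("/", "slash"),
    (":", "colon"),
    (" Mr", "mister"),
    (" can t", "can not"),
]

print(normalize("Mr.", rules))
print(normalize("I can t go", rules))
```

Running the character rules first is what lets "Mr." become the standalone word "Mr", which the word rule can then replace.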
".","dot"
"/","slash"
":","colon"
"*","_"
" Mr","mister"
" can t","can not"
The denormal entity, which is paired with the normal entity conversion, is defined as follows and converts words back to character strings such as symbols. For alphabetic characters, "Value to be converted" is a word, and "Converted value" must include a ' ' (blank) wherever one is needed when it is concatenated with the surrounding string. If there is no blank, the value is joined directly to the surrounding string.
As the reverse of normalize, to change "mister dot" back to "Mr.", you can define the whole phrase as one entry, or define "dot" as "." and "mister" as " Mr" as two separate entries.
"dot","."
"slash","/"
"colon",":"
"_","*"
"mister dot"," Mr."
"can not"," can't "
The gender, person, and person2 entities specify word-by-word conversions; for example, gender defines:
"he","she"
"his","her"
"him","her"
"her","him"
"she","he"
See RDF Support<RDF_Support> for the file format specified by the rdfs entity.
See Security<Security> for the file format specified by the usergroups entity.
The following entities contain implementation-dependent content (class definitions, etc.) and are therefore not described here:
- license_keys : The license key file required for external connections, etc.
- pattern_nodes : The processing class definition file for each node of the pattern.
- postprocessors : The processing class definition file for editing response statements.
- preprocessors : The processing class definition file for editing user utterances in a request.
- spelling_corpus : The corpus file on which the spelling checker is based.
- template_nodes : The processing class definition file for each node of the template.