Improving integration of open domain utterances #4266
Comments
Some open design questions -
The domain file would then look like -
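A hedged sketch of how that domain file might look under a templated-name scheme (the `respond_chitchat` name and layout are assumptions based on the discussion here, not confirmed by this thread):

```yaml
# domain.yml (sketch; action name follows the assumed respond_<utterance type> template)
intents:
- chitchat

actions:
- respond_chitchat
```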
The only place a developer would need to specify the action (if needed) would be in the training story, where they could use the same name based on the template. This also helps in the case where you may want to build only one response selector model for all utterance types, with all utterances classified under a default intent. The config for the response selector in that case would look like -
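For the unified-model case, the config might look roughly like this (a sketch; leaving the component unscoped so a single model covers all utterance types is an assumption):

```yaml
# config.yml (sketch): one ResponseSelector for all utterance types
pipeline:
- name: "ResponseSelector"
  # no utterance type specified: all utterances fall under a default
  # intent and a single model is trained for them
```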
Where should this be specified? Edit - Discussed this with @amn41 and we decided to follow a templated name.
How do we define whether to merge the models into one model for all utterance types in the above case? Edit - Decided to go the most transparent way, which is to have the developer write config for multiple utterance types if needed, and not worry about making it simpler.
Based on user tests and feedback received, allowing multiple responses for similar utterances is a good-to-have capability and we should try to include it, although the current training data format is not the most flexible for it. For example -
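A hypothetical sketch of the current single-response format, inferred from the description below that the first line of a data point carries the response (the exact layout is an assumption, since the original example was lost):

```md
<!-- training data sketch (hypothetical): first line carries the response -->
## my name is Sara
- what's your name?
- who are you?
```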
The above format includes a single response cleanly. To allow multiple responses we would need to add a delimiter between all candidate responses in the first line of the data point, which wouldn't render very cleanly in the markdown format. The most intuitive option is to source other variants of the same response from a different file based on a key lookup; the key can be the primary response itself, and the variants can be written inside that file. What is the best option?
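One way the key-lookup option could look, purely as an illustration (the file name and layout are hypothetical):

```md
<!-- response_variants.md (hypothetical): the key is the primary response -->
## my name is Sara
- my name is Sara
- people call me Sara
- I go by Sara
```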
Document regarding nuances of different training formats -
Fixed in #4233
Description of Problem:
Currently, open domain utterances like chitchat can be integrated, but in a slightly cumbersome manner: each such open domain utterance needs to be classified under a separate intent, and one action needs to be predicted based on a complex OR construct.
For example -
NLU
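As an illustration of this current approach (the original example was lost), the NLU data might look like this in Rasa 1.x markdown format; the micro-intent names are invented for the example:

```md
## intent:chitchat_ask_name
- what's your name?
- who are you?

## intent:chitchat_ask_weather
- how's the weather today?
```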
Story
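The corresponding story then needs an OR construct over every micro intent so that one action can be predicted (a sketch with the same invented intent names):

```md
## handle chitchat
* chitchat_ask_name OR chitchat_ask_weather
    - utter_chitchat
```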
Template
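And the template for that single action lives in the domain (sketch; the response text is illustrative):

```yaml
templates:
  utter_chitchat:
  - text: "I'm a bot, happy to chat!"
```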
Can we do away with the complex OR constructs and micro intents for each utterance type?
Overview of the Solution:
We can treat this as a more end-to-end dialogue system: given the user utterance, if it's of type `open_domain` (e.g. `chitchat`), then directly pick the most appropriate bot response using an ML model. We call this a `ResponseSelector` component, which sits inside the NLU pipeline, works exactly like `EmbeddingIntentClassifier`, and embeds both the candidate response and the user utterance in the same embedding space and computes a similarity between them. Based on the similarity ranking, the model picks the most appropriate response.

We can have multiple types of open domain utterances, e.g. `chitchat`, `faq`, etc., and build a separate response selector model for each or a unified model, depending on configuration.

We always use a mapping policy for the `open_domain` intent to map it to an `utter_x` action, which queries the appropriate response selector model for the appropriate bot utterance. These `utter_x` actions can also be included in training stories in case there are follow-up actions that should be triggered, and this behaviour can be learnt by the other core policies.

A simple PoC of this is described here.
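Concretely, the NLU pipeline described above might be configured like this (a sketch; the component names other than `ResponseSelector` and `EmbeddingIntentClassifier` are assumptions about the surrounding pipeline):

```yaml
# config.yml (sketch)
pipeline:
- name: "WhitespaceTokenizer"
- name: "CountVectorsFeaturizer"
- name: "EmbeddingIntentClassifier"
- name: "ResponseSelector"   # ranks candidate responses against the user utterance
```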
Examples (if relevant):
The new way of integrating open domain utterances could look like -
NLU
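A hedged sketch of the new NLU format (the `chitchat/ask_name` retrieval-style intent naming is an assumption; the original example was lost):

```md
## intent: chitchat/ask_name
- what's your name?
- who are you?
```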
Domain
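And the domain would only need the single mapped action per utterance type (sketch; the `triggers` mapping and action name are assumptions):

```yaml
# domain.yml (sketch)
intents:
- chitchat:
    triggers: respond_chitchat   # mapping policy sends the intent to the selector action

actions:
- respond_chitchat
```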
Definition of Done: