Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix OSS-413: Proper intents in interactive training #12722

Merged
merged 5 commits into from Sep 27, 2023

Conversation

ottonemo
Copy link
Contributor

@ottonemo ottonemo commented Aug 9, 2023

In OSS-413 a user reports that since Rasa 3.1 the interactive training dialogue suggests malformed/shortened intent names, e.g.:

  • a
  • b
  • c

This is due to a bug in the parsing of intent names which assumes that every intent is a dictionary (which it is ONLY when a property such as use_entities is set).

While OSS-413 states that this is solely cosmetic it is an actual bug that caused severe problems that cost several hours to debug: When using the regex matcher it is checked whether the parsed intent is in the domain or not. When it is not, it will fail and attempt to revert the user utterance. The user utterance is then written to the tracker but SlotSet events are not repeated - therefore any form validator will fail for inexplicable reasons.

This also means that there is a bug in the _correct_wrong_nlu for not copying enough or re-starting the ActionExtractSlots action which is not addressed here.

This behavior is rather hard to test, if you have any suggestions, feel free to guide me.

Proposed changes:

  • Make a distinction between dict and str intent names from the domain

Status (please check what you already did):

  • added some tests for the functionality
  • updated the documentation
  • updated the changelog (please check changelog for instructions)
  • reformat files using black (please check Readme for instructions)

@ottonemo ottonemo requested a review from a team as a code owner August 9, 2023 16:19
Copy link
Member

@ancalita ancalita left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ottonemo Great, thank you for your contribution tackling this bug!
Before merging there are a few things we need to do:

  • please add a changelog entry here to describe both the bug and bugfix
  • please share some assistant example to reproduce the bug manually and confirm the bugfix you implemented fixes this
  • please open a separate ticket for the other issue you flagged: there is a bug in the _correct_wrong_nlu for not copying enough or re-starting the ActionExtractSlots action
  • since this is a bugfix it should target branch 3.6.x, please rebase, this way it can be released in a micro patch.

Could we add a unit test for record_messages that will include a lot of mocking and monkeypatching most of the functions used here apart from _validate_nlu which uses the intents list you modified, even here you will have to monkeypatch questionary.confirm to just either print (in which case it's advisable to use capsys fixture from pytest) or return the message to be asserted directly 🙏🏻
Let me know in case of any questions!

rasa/core/training/interactive.py Outdated Show resolved Hide resolved
@ottonemo ottonemo changed the base branch from main to 3.6.x August 18, 2023 11:08
@ottonemo
Copy link
Contributor Author

Hi, thanks for the comments!

please add a changelog entry here to describe both the bug and bugfix

Done.

please share some assistant example to reproduce the bug manually and confirm the bugfix you implemented fixes this

Sure! Luckily the moodbot example is sufficient. Simply run rasa interactive from, for example, the 3.6.x branch:

image

Since I maintain an internal middleware for Rasa that needed to deal with this issue I can confirm that the fix I submitted resolves the issue. However, just to be certain, here's a screenshot from the OSS-413 branch:

image

since this is a bugfix it should target branch 3.6.x, please rebase, this way it can be released in a micro patch.

I rebased the commits and the PR - I hope it is correct now.

Could we add a unit test for record_messages that will include a lot of mocking and monkeypatching most of the functions used here apart from _validate_nlu which uses the intents list you modified, even here you will have to monkeypatch questionary.confirm to just either print (in which case it's advisable to use capsys fixture from pytest) or return the message to be asserted directly 🙏🏻

That sounds very complicated and not very maintainable. What about refactoring the intent retrieval into a separate function and testing it for the two cases? This way we would only need to provide domains (i.e. from Domain.from_yaml().as_dict()).

@NitinKumar94
Copy link

I would love to see this PR merged! I was wondering why rasa-interactive did not display the correct intent names and this does fix it! Bumping this thread so that it can be reviewed

@ancalita ancalita self-requested a review September 13, 2023 10:44
Copy link
Member

@ancalita ancalita left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing my review comments 🚀

I added one more request for the changelog, also please fix the formatting so that the quality check in the CI passes (it's a required step). You will need to run make format locally where you installed 3.6.x, then push the changes. I also recommend running make lint and make types, and fix any issues that might be flagged.

What about refactoring the intent retrieval into a separate function and testing it for the two cases?

Yes, great idea, please go ahead and add this unit test.

Comment on lines +3 to +4
This will also fix a bug where forced user utterances (using the regex matcher) will
be reverted even though they are present in the domain.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add the information you gave in the PR description, as that was clearer than these 2 lines 🙏🏻

In OSS-413 a user reports that since Rasa 3.1 the interactive
training dialogue suggests malformed/shortened intent names, e.g.:

- a
- b
- c

This is due to a bug in the parsing of intent names which assumes that
every intent is a dictionary (which it is ONLY when a property such as
`use_entities` is set).

While OSS-413 states that this is solely cosmetic it is an actual bug
that caused severe problems that cost several hours to debug:
When using the regex matcher it is checked whether the parsed intent
is in the domain or not. When it is not, it will fail and attempt to
revert the user utterance. The user utterance is then written to the
tracker but `SlotSet` events are not repeated - therefore any form
validator will fail for inexplicable reasons.

This also means that there is a bug in the `_correct_wrong_nlu` for
not copying enough or re-starting the `ActionExtractSlots` action which
is not addressed here.
Since the main problem of bug OSS-413 is that intents with
attributes are not retrieved well the test was implemented to
use domains with and without intent definitions using attributes.
@ottonemo
Copy link
Contributor Author

@ancalita I added the test cases and ran linting, formatting and type checkers. Please review again!

Copy link
Member

@ancalita ancalita left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent, thanks @ottonemo

@ancalita ancalita merged commit e37774e into RasaHQ:3.6.x Sep 27, 2023
95 of 96 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants