Impossible to pass end to end test when the same entity is extracted by different classifiers #9771

Closed
2 of 4 tasks
hsm207 opened this issue Oct 1, 2021 · 7 comments
Labels
area:rasa-oss 🎡 Anything related to the open source Rasa framework area:rasa-oss/model-testing Issues focused around testing models (e.g. via `rasa test`) type:bug 🐛 Inconsistencies or issues which will cause an issue or problem for users or implementors.

Comments

@hsm207
Contributor

hsm207 commented Oct 1, 2021

Rasa Open Source version

2.8.7

Rasa SDK version

No response

Rasa X version

No response

Python version

3.8

What operating system are you using?

Linux

What happened?

rasa test will fail when the same entity is extracted by different classifiers.

Steps to reproduce:

  1. Include this in config.yml
  - name: RegexEntityExtractor
  - name: DIETClassifier
    epochs: 100
    constrain_similarities: true
    entity_recognition: true
  2. Define this lookup table:
nlu:
- lookup: country
  examples: |
    - Albania
    - Algeria
    - Andorra
    - Angola
    - Antigua and Barbuda
    - Argentina
    - Armenia
    - Australia
    - Austria
    - Azerbaijan
    - Germany

i.e. list of countries

  3. Define an intent that only has this:
- intent: inform
  examples: |
    - [Germany](country)
    - [Argentina](country)
  4. Write a test story like this:
stories:
  - story: identical to training NLU example
    steps:
    - user: |
        [Germany](country)
      intent: inform
  5. Run rasa test --fail-on-prediction-errors

Command / Request

No response

Relevant log output

2021-10-01 08:45:28 INFO     rasa.core.test  - Evaluating 1 stories
Progress:
  0%|          | 0/1 [00:00<?, ?it/s]
2021-10-01 08:45:28 DEBUG    rasa.nlu.classifiers.diet_classifier  - There is no trained model for 'ResponseSelector': The component is either not trained or didn't receive enough training data.
2021-10-01 08:45:28 DEBUG    rasa.nlu.selectors.response_selector  - Adding following selector key to message property: default
/opt/venv/lib/python3.8/site-packages/rasa/shared/utils/io.py:97: UserWarning: Parsing of message: 'Germany' lead to overlapping entities: Germany of type country extracted by RegexEntityExtractor overlaps with Germany of type country extracted by DIETClassifier. This can lead to unintended filling of slots. Please refer to the documentation section on entity extractors and entities getting extracted multiple times:https://rasa.com/docs/rasa/components#entity-extractors
2021-10-01 08:45:28 DEBUG    rasa.core.processor  - Received user message 'Germany' with intent '{'id': 113713248080627725, 'name': 'inform', 'confidence': 0.9980354905128479}' and entities '[{'entity': 'country', 'start': 0, 'end': 7, 'value': 'Germany', 'extractor': 'RegexEntityExtractor'}, {'entity': 'country', 'start': 0, 'end': 7, 'confidence_entity': 0.9988456964492798, 'value': 'Germany', 'extractor': 'DIETClassifier'}]'
  0%|          | 0/1 [00:00<?, ?it/s]
2021-10-01 08:45:28 DEBUG    rasa.__main__  - Failed to run CLI command due to an exception.
Traceback (most recent call last):
  File "/opt/venv/lib/python3.8/site-packages/rasa/__main__.py", line 117, in main
    cmdline_arguments.func(cmdline_arguments)
  File "/opt/venv/lib/python3.8/site-packages/rasa/cli/test.py", line 260, in test
    run_core_test(args)
  File "/opt/venv/lib/python3.8/site-packages/rasa/cli/test.py", line 133, in run_core_test
    test_core(
  File "/opt/venv/lib/python3.8/site-packages/rasa/model_testing.py", line 182, in test_core
    rasa.utils.common.run_in_loop(
  File "/opt/venv/lib/python3.8/site-packages/rasa/utils/common.py", line 296, in run_in_loop
    result = loop.run_until_complete(f)
  File "uvloop/loop.pyx", line 1456, in uvloop.loop.Loop.run_until_complete
  File "/opt/venv/lib/python3.8/site-packages/rasa/core/test.py", line 1041, in test
    story_evaluation, _, entity_results = await _collect_story_predictions(
  File "/opt/venv/lib/python3.8/site-packages/rasa/core/test.py", line 916, in _collect_story_predictions
    ) = await _predict_tracker_actions(
  File "/opt/venv/lib/python3.8/site-packages/rasa/core/test.py", line 822, in _predict_tracker_actions
    user_uttered_result = _collect_user_uttered_predictions(
  File "/opt/venv/lib/python3.8/site-packages/rasa/core/test.py", line 547, in _collect_user_uttered_predictions
    raise WrongPredictionException(
rasa.core.test.WrongPredictionException: NLU model predicted a wrong intent. Failed Story: 

version: "2.0"
stories:
- story: identical to training NLU example
  steps:
  - intent: inform
    entities:
    - country: Germany
    - country: Germany

WrongPredictionException: NLU model predicted a wrong intent. Failed Story: 

version: "2.0"
stories:
- story: identical to training NLU example
  steps:
  - intent: inform
    entities:
    - country: Germany
    - country: Germany
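
For illustration, the mismatch can be reproduced outside Rasa with plain Python. This is only a minimal sketch, assuming the check amounts to a strict list comparison of entity types and values; the helper names `as_pairs` and `naive_entities_match` are hypothetical and are not Rasa's actual code. The predicted entities are copied from the debug log above.

```python
from typing import Any, Dict, List, Tuple

# What the test story annotates: a single "country" entity.
expected: List[Dict[str, Any]] = [
    {"entity": "country", "start": 0, "end": 7, "value": "Germany"},
]

# What the pipeline reports (taken from the debug log): the same span twice,
# once per extractor.
predicted: List[Dict[str, Any]] = [
    {"entity": "country", "start": 0, "end": 7, "value": "Germany",
     "extractor": "RegexEntityExtractor"},
    {"entity": "country", "start": 0, "end": 7, "value": "Germany",
     "confidence_entity": 0.9988456964492798, "extractor": "DIETClassifier"},
]


def as_pairs(entities: List[Dict[str, Any]]) -> List[Tuple[str, str]]:
    """Reduce each entity to (type, value), keeping order and duplicates."""
    return [(e["entity"], e["value"]) for e in entities]


def naive_entities_match(exp: List[Dict[str, Any]],
                         pred: List[Dict[str, Any]]) -> bool:
    """Hypothetical strict check: the two lists must be identical."""
    return as_pairs(exp) == as_pairs(pred)


# One expected pair versus two predicted pairs, so the check fails.
print(naive_entities_match(expected, predicted))  # False
```

Under that assumption, the duplicate prediction alone is enough to fail the story, even though both extractors agree with the annotation.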

Definition of done

  • Issue reproduced
  • Solution identified
  • Bug fixed
  • @hsm207 notified

@JEM-Mosig is reviewer

hsm207 added the type:bug 🐛 and area:rasa-oss 🎡 labels on Oct 1, 2021
hsm207 changed the title from "Impossible to end to end test when the same entity is extracted by different classifiers" to "Impossible to pass end to end test when the same entity is extracted by different classifiers" on Oct 1, 2021
@TyDunn
Contributor

TyDunn commented Oct 1, 2021

@hsm207 This issue is a Nice to have in 3.0. It would also fix this issue here but only in 3.0+. Let us know if you feel we need to solve this issue here sooner or in 2.x

@hsm207
Contributor Author

hsm207 commented Oct 1, 2021

@TyDunn So far, no customer has brought this up yet. The workaround is to not use the --fail-on-prediction-errors parameter and manually review the failed stories to find the real failures. I think this is acceptable until Rasa 3.0 is released.

But since we encourage people to write test stories, I feel this should be fixed in 3.0.0 and not be treated as a nice to have.

@TyDunn
Contributor

TyDunn commented Oct 1, 2021

@hsm207 The reason it is a nice-to-have is that we want to avoid scope creep with this big release. The scope has been set for months now, and we risk delaying it if we add things at this point. If we have time, we'll get to it. Otherwise, let's keep it high on the CSE issues board and the team will take care of it shortly after the 3.0 release :)

TyDunn added the area:rasa-oss/model-testing and effort:research/4 labels on Oct 1, 2021
@samsucik
Contributor

samsucik commented Oct 14, 2021

I've been able to reproduce this issue and it's worse than I thought 🙂

It boils down to us comparing expected and predicted entities in a naive, overly strict way. The check even insists on the entities being extracted in the same order as they are listed in the test story, which can get you into real trouble:

Imagine you have RegexEntityExtractor that extracts countries, DIETClassifier that extracts job names, and a test story like this:

  - story: simple
    steps:
    - intent: inform
      user: |
        I'm a [researcher](job_name) from [Germany](country)

Now, if you have RegexEntityExtractor before DIETClassifier in your pipeline, the model will fail on the test story, simply because country is extracted before job_name while the test story expects the entities in the order in which they occur in the user utterance (first job_name, then country).
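
A rough sketch of the ordering problem, in plain Python rather than Rasa's own code: the helper `span` and the variable names are made up for this example, and the entity dicts only carry the fields needed for the comparison. The predicted list follows pipeline order, the expected list follows text order.

```python
from typing import Any, Dict, List, Tuple

text = "I'm a researcher from Germany"


def span(message: str, value: str, entity: str) -> Dict[str, Any]:
    """Hypothetical helper: build an entity dict with character offsets."""
    start = message.index(value)
    return {"entity": entity, "value": value,
            "start": start, "end": start + len(value)}


# Test story order: as the entities occur in the utterance.
expected: List[Dict[str, Any]] = [
    span(text, "researcher", "job_name"),
    span(text, "Germany", "country"),
]

# Pipeline order: RegexEntityExtractor (country) runs before
# DIETClassifier (job_name).
predicted: List[Dict[str, Any]] = [
    span(text, "Germany", "country"),
    span(text, "researcher", "job_name"),
]


def by_position(e: Dict[str, Any]) -> Tuple[int, int, str]:
    """Sort key that orders entities by where they sit in the text."""
    return (e["start"], e["end"], e["entity"])


# An order-sensitive comparison rejects a correct prediction ...
order_sensitive = expected == predicted
# ... while sorting both sides by position accepts it.
order_insensitive = (sorted(expected, key=by_position)
                     == sorted(predicted, key=by_position))

print(order_sensitive, order_insensitive)  # False True
```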

The other thing is that the code that compares expected and predicted entities ignores roles and groups (which feels wrong).
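
To make the roles/groups point concrete, here is a small sketch in plain Python. The entity values (a `city` with `departure` and `destination` roles) are invented for the example, and the two key functions are hypothetical, not the ones used in Rasa.

```python
from typing import Any, Dict, List, Optional, Tuple

# Invented example: the story expects Berlin as a destination,
# but the model tags it as a departure city.
expected: List[Dict[str, Any]] = [
    {"entity": "city", "value": "Berlin", "role": "destination"},
]
predicted: List[Dict[str, Any]] = [
    {"entity": "city", "value": "Berlin", "role": "departure"},
]


def key_without_roles(e: Dict[str, Any]) -> Tuple[str, str]:
    """Comparison key that looks at type and value only."""
    return (e["entity"], e["value"])


def key_with_roles(
    e: Dict[str, Any],
) -> Tuple[str, str, Optional[str], Optional[str]]:
    """Comparison key that also includes role and group, if present."""
    return (e["entity"], e["value"], e.get("role"), e.get("group"))


ignoring_roles = ([key_without_roles(e) for e in expected]
                  == [key_without_roles(e) for e in predicted])
including_roles = ([key_with_roles(e) for e in expected]
                   == [key_with_roles(e) for e in predicted])

# The wrong role goes unnoticed unless role/group are part of the key.
print(ignoring_roles, including_roles)  # True False
```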

This being said, I'll focus on fixing the bug reported above and create followup issues for the other things.

@hsm207
Contributor Author

hsm207 commented Oct 14, 2021

Thanks @samsucik for the update!

@samsucik
Contributor

@JEM-Mosig FYI, I'm now implementing a fix. I'm not de-duplicating the extracted entities globally, only for the purposes of the checks which compare expected and predicted entities. I think there's value in leaving duplicates in the extracted entities for now because it might uncover underlying issues with the way one uses multiple entity extractors for the same entity. Fixing only the checking code is also simpler.
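
Roughly, the idea looks like the sketch below. This is not the code from the actual fix; `comparable` and `entities_match` are hypothetical names, and the sketch assumes that entity type, span and value are what should be compared. The duplicates are collapsed only inside the check, and the predicted list itself is left untouched.

```python
from typing import Any, Dict, List, Set, Tuple

EntityKey = Tuple[str, int, int, str]


def comparable(entities: List[Dict[str, Any]]) -> Set[EntityKey]:
    """Build a set of (type, start, end, value) keys.

    Using a set drops extractor and confidence details, collapses
    duplicates and ignores order, without modifying the input list.
    """
    return {(e["entity"], e["start"], e["end"], e["value"]) for e in entities}


def entities_match(expected: List[Dict[str, Any]],
                   predicted: List[Dict[str, Any]]) -> bool:
    """Hypothetical check that tolerates duplicates and reordering."""
    return comparable(expected) == comparable(predicted)


expected = [{"entity": "country", "start": 0, "end": 7, "value": "Germany"}]
predicted = [
    {"entity": "country", "start": 0, "end": 7, "value": "Germany",
     "extractor": "RegexEntityExtractor"},
    {"entity": "country", "start": 0, "end": 7, "value": "Germany",
     "extractor": "DIETClassifier"},
]

print(entities_match(expected, predicted))  # True
print(len(predicted))  # 2, both predictions are still there
```

A genuinely wrong prediction (different type, span or value) still produces a different set, so real failures keep showing up.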

@samsucik
Contributor

@hsm207 I've created a fix in #9875. It addresses the issue with duplicated or differently ordered entity predictions. Note that it does not touch the actual attributes of a Message (i.e. the prediction data attached to it by entity extractors). It only eliminates the error that was previously raised without good reason. It'll be the subject of a future issue to handle duplicated/overlapping entity predictions and to change the status quo where one has to include some training examples in order for RegexEntityExtractor to kick in.

@JEM-Mosig review of the code should be quick but I'd like you to also challenge the overall approach I took 😉
