Impossible to pass end to end test when the same entity is extracted by different classifiers #9771

hsm207 · 2021-10-01T08:47:51Z

Rasa Open Source version

2.8.7

Rasa SDK version

No response

Rasa X version

No response

Python version

3.8

What operating system are you using?

Linux

What happened?

rasa test will fail when the same entity is extracted by different classifiers.

Steps to reproduce:

Include this in config.yml

  - name: RegexEntityExtractor
  - name: DIETClassifier
    epochs: 100
    constrain_similarities: true
    entity_recognition: true

Define this lookup table:

nlu:
- lookup: country
  examples: |
    - Albania
    - Algeria
    - Andorra
    - Angola
    - Antigua and Barbuda
    - Argentina
    - Armenia
    - Australia
    - Austria
    - Azerbaijan
    - Germany

i.e. list of countries

Define an intent that only has this:

- intent: inform
  examples: |
    - [Germany](country)
    - [Argentina](country)

Write a test story like this:

stories:
  - story: identical to training NLU example
    steps:
    - user: |
        [Germany](country)
      intent: inform

Run rasa test --fail-on-prediction-errors

Command / Request

No response

Relevant log output

2021-10-01 08:45:28 INFO     rasa.core.test  - Evaluating 1 stories
Progress:
  0%|          | 0/1 [00:00<?, ?it/s]
2021-10-01 08:45:28 DEBUG    rasa.nlu.classifiers.diet_classifier  - There is no trained model for 'ResponseSelector': The component is either not trained or didn't receive enough training data.
2021-10-01 08:45:28 DEBUG    rasa.nlu.selectors.response_selector  - Adding following selector key to message property: default
/opt/venv/lib/python3.8/site-packages/rasa/shared/utils/io.py:97: UserWarning: Parsing of message: 'Germany' lead to overlapping entities: Germany of type country extracted by RegexEntityExtractor overlaps with Germany of type country extracted by DIETClassifier. This can lead to unintended filling of slots. Please refer to the documentation section on entity extractors and entities getting extracted multiple times:https://rasa.com/docs/rasa/components#entity-extractors
2021-10-01 08:45:28 DEBUG    rasa.core.processor  - Received user message 'Germany' with intent '{'id': 113713248080627725, 'name': 'inform', 'confidence': 0.9980354905128479}' and entities '[{'entity': 'country', 'start': 0, 'end': 7, 'value': 'Germany', 'extractor': 'RegexEntityExtractor'}, {'entity': 'country', 'start': 0, 'end': 7, 'confidence_entity': 0.9988456964492798, 'value': 'Germany', 'extractor': 'DIETClassifier'}]'
  0%|                                                                                                                                                   | 0/1 [00:00<?, ?it/s]
2021-10-01 08:45:28 DEBUG    rasa.__main__  - Failed to run CLI command due to an exception.
Traceback (most recent call last):
  File "/opt/venv/lib/python3.8/site-packages/rasa/__main__.py", line 117, in main
    cmdline_arguments.func(cmdline_arguments)
  File "/opt/venv/lib/python3.8/site-packages/rasa/cli/test.py", line 260, in test
    run_core_test(args)
  File "/opt/venv/lib/python3.8/site-packages/rasa/cli/test.py", line 133, in run_core_test
    test_core(
  File "/opt/venv/lib/python3.8/site-packages/rasa/model_testing.py", line 182, in test_core
    rasa.utils.common.run_in_loop(
  File "/opt/venv/lib/python3.8/site-packages/rasa/utils/common.py", line 296, in run_in_loop
    result = loop.run_until_complete(f)
  File "uvloop/loop.pyx", line 1456, in uvloop.loop.Loop.run_until_complete
  File "/opt/venv/lib/python3.8/site-packages/rasa/core/test.py", line 1041, in test
    story_evaluation, _, entity_results = await _collect_story_predictions(
  File "/opt/venv/lib/python3.8/site-packages/rasa/core/test.py", line 916, in _collect_story_predictions
    ) = await _predict_tracker_actions(
  File "/opt/venv/lib/python3.8/site-packages/rasa/core/test.py", line 822, in _predict_tracker_actions
    user_uttered_result = _collect_user_uttered_predictions(
  File "/opt/venv/lib/python3.8/site-packages/rasa/core/test.py", line 547, in _collect_user_uttered_predictions
    raise WrongPredictionException(
rasa.core.test.WrongPredictionException: NLU model predicted a wrong intent. Failed Story: 

version: "2.0"
stories:
- story: identical to training NLU example
  steps:
  - intent: inform
    entities:
    - country: Germany
    - country: Germany

WrongPredictionException: NLU model predicted a wrong intent. Failed Story: 

version: "2.0"
stories:
- story: identical to training NLU example
  steps:
  - intent: inform
    entities:
    - country: Germany
    - country: Germany

Definition of done

Issue reproduced
Solution identified
Bug fixed
@hsm207 notified

@JEM-Mosig is reviewer

TyDunn · 2021-10-01T13:15:16Z

@hsm207 This issue is a Nice to have in 3.0. It would also fix this issue here but only in 3.0+. Let us know if you feel we need to solve this issue here sooner or in 2.x

hsm207 · 2021-10-01T13:39:45Z

@TyDunn So far, no customer has brought this up yet. The workaround is to not use the --fail-on-prediction-errors parameter and manually review the failed stories to find the real failures. I think this is acceptable until Rasa 3.0 is released.

But since we encourage people to write test stories, I feel this should be fixed in 3.0.0 and not be treated as a nice to have.

TyDunn · 2021-10-01T13:44:49Z

@hsm207 The reason it is a nice to have is that we want to avoid scope creep with this big release. The scope has been set for months now, and we risk delaying it if we add things at this point. If we have time, we'll get to it. Otherwise, let's keep it high in the CSE issues board and the team will take care of it shortly after the 3.0 release :)

samsucik · 2021-10-14T11:17:20Z

I've been able to reproduce this issue and it's worse than I thought 🙂

It boils down to us comparing expected and predicted entities in a naive and harsh way. The check even insists on the entities being extracted in the same order as listed in the test story, which can get you into real trouble:

Imagine you have RegexEntityExtractor that extracts countries, DIETClassifier that extracts job names, and a test story like this:

  - story: simple
    steps:
    - intent: inform
      user: |
        I'm a [researcher](job_name) from [Germany](country)

Now, if you have RegexEntityExtractor before DIETClassifier in your pipeline, the model will fail on the test story, simply because country is extracted before job_name while the test story expects the extracted entities in the other order (as they occur in the user utterance: first job_name, then country).
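
To make the ordering problem concrete, here is a minimal sketch. It is not Rasa's actual comparison code: the helper names `naive_match` and `order_insensitive_match` are hypothetical, and the entity dicts only mirror the format shown in the log output of the original report.

```python
from typing import Any, Dict, List, Tuple

Entity = Dict[str, Any]

# Expected entities, in the order they occur in
# "I'm a researcher from Germany".
expected: List[Entity] = [
    {"entity": "job_name", "start": 6, "end": 16, "value": "researcher"},
    {"entity": "country", "start": 22, "end": 29, "value": "Germany"},
]

# Predicted entities, in pipeline order: RegexEntityExtractor runs first,
# so the country comes before the job name.
predicted: List[Entity] = [
    {"entity": "country", "start": 22, "end": 29, "value": "Germany",
     "extractor": "RegexEntityExtractor"},
    {"entity": "job_name", "start": 6, "end": 16, "value": "researcher",
     "extractor": "DIETClassifier"},
]


def _key(entity: Entity) -> Tuple[str, int, int, str]:
    """Reduce an entity to the fields used for comparison (hypothetical)."""
    return (entity["entity"], entity["start"], entity["end"], entity["value"])


def naive_match(exp: List[Entity], pred: List[Entity]) -> bool:
    """Order-sensitive comparison: fails when extractors run in another order."""
    return [_key(e) for e in exp] == [_key(e) for e in pred]


def order_insensitive_match(exp: List[Entity], pred: List[Entity]) -> bool:
    """Compare sorted keys, so extraction order no longer matters."""
    return sorted(_key(e) for e in exp) == sorted(_key(e) for e in pred)
```

With the data above, `naive_match(expected, predicted)` rejects a prediction that is correct in every respect except order, while `order_insensitive_match(expected, predicted)` accepts it.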

The other thing is that the code that compares expected and predicted entities ignores roles and groups (which feels wrong).
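
As a rough illustration of why ignoring roles and groups matters, consider the sketch below. The two key functions are hypothetical and are not taken from Rasa's code; the only assumption about Rasa is that entity dicts carry optional `role` and `group` keys.

```python
from typing import Any, Dict, Optional, Tuple

Entity = Dict[str, Any]


def key_without_roles(entity: Entity) -> Tuple[str, int, int, str]:
    """Comparison key that ignores role and group (hypothetical helper)."""
    return (entity["entity"], entity["start"], entity["end"], entity["value"])


def key_with_roles(
    entity: Entity,
) -> Tuple[str, int, int, str, Optional[str], Optional[str]]:
    """Comparison key that also includes role and group (hypothetical helper)."""
    return key_without_roles(entity) + (entity.get("role"), entity.get("group"))


# Same span and entity type, but the predicted role is wrong.
expected_entity: Entity = {
    "entity": "city", "start": 0, "end": 6, "value": "Berlin", "role": "departure",
}
predicted_entity: Entity = {
    "entity": "city", "start": 0, "end": 6, "value": "Berlin", "role": "destination",
}

# Ignoring roles hides the mistake; including them exposes it.
roles_ignored_equal = key_without_roles(expected_entity) == key_without_roles(predicted_entity)
roles_checked_equal = key_with_roles(expected_entity) == key_with_roles(predicted_entity)
```

Here `roles_ignored_equal` is `True` even though the role was mispredicted, whereas `roles_checked_equal` is `False`.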

This being said, I'll focus on fixing the bug reported above and create followup issues for the other things.

hsm207 · 2021-10-14T12:45:23Z

thanks @samsucik for the update!

samsucik · 2021-10-14T12:48:19Z

@JEM-Mosig fyi I'm now implementing a fix and I'm not de-duplicating the extracted entities globally, only for the purposes of the checks which compare expected and predicted entities. I think there's value in leaving duplicates in the extracted entities for now because it might uncover underlying issues with the way one uses multiple entity extractors for the same entity. And also because fixing only the checking code is simpler.
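
The idea of de-duplicating only for the comparison could look roughly like the sketch below. This is an illustration under assumptions, not the code from the fix: `comparison_keys` and `entities_match` are hypothetical names, and the predicted entities are copied from the log output in the original report.

```python
from typing import Any, Dict, List, Set, Tuple

Entity = Dict[str, Any]

# Both extractors found the same span, as in the reported log output.
predicted: List[Entity] = [
    {"entity": "country", "start": 0, "end": 7, "value": "Germany",
     "extractor": "RegexEntityExtractor"},
    {"entity": "country", "start": 0, "end": 7,
     "confidence_entity": 0.9988456964492798, "value": "Germany",
     "extractor": "DIETClassifier"},
]

# The test story annotates the entity only once.
expected: List[Entity] = [
    {"entity": "country", "start": 0, "end": 7, "value": "Germany"},
]


def comparison_keys(entities: List[Entity]) -> Set[Tuple[str, int, int, str]]:
    """Build a set of keys that ignores the extractor and confidence.

    Using a set drops duplicates and ordering for the check only;
    the input list itself is left untouched.
    """
    return {(e["entity"], e["start"], e["end"], e["value"]) for e in entities}


def entities_match(exp: List[Entity], pred: List[Entity]) -> bool:
    """Return True if expected and predicted entities agree after de-duplication."""
    return comparison_keys(exp) == comparison_keys(pred)
```

`entities_match(expected, predicted)` passes here, while `predicted` still holds both entries, so the duplicate extraction stays visible to anything that inspects the message afterwards.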

samsucik · 2021-10-14T15:18:52Z

@hsm207 I've created a fix in #9875. It addresses the issue with duplicated or differently ordered entity predictions. Note that it does not touch the actual attributes of a Message (i.e. the prediction data attached to it by entity extractors). It only eliminates the error that was previously raised without good reason. It'll be the subject of a future issue to handle duplicated/overlapping entity predictions and to change the status quo where one has to include some training examples in order for RegexEntityExtractor to kick in.

@JEM-Mosig review of the code should be quick but I'd like you to also challenge the overall approach I took 😉

hsm207 added type:bug 🐛 and area:rasa-oss 🎡 labels Oct 1, 2021

hsm207 changed the title ~~Impossible to end to end test when the same entity is extracted by different classifiers~~ Impossible to pass end to end test when the same entity is extracted by different classifiers Oct 1, 2021

TyDunn added area:rasa-oss/model-testing and effort:research/4 labels Oct 1, 2021

JEM-Mosig assigned samsucik and JEM-Mosig Oct 4, 2021

samsucik mentioned this issue Oct 19, 2021

Consider entity roles & groups in rasa test --fail-on-prediction-errors #9931

Closed

samsucik closed this as completed Oct 21, 2021