[BUG] - Rasa Spelling PLAYGROUND #37

Closed
rmiaouh opened this issue Feb 15, 2021 · 8 comments
Labels
bug Something isn't working

Comments

rmiaouh commented Feb 15, 2021

Hi guys & @koaning. Your work is amazing.

I have an issue with the Rasa Spelling PLAYGROUND

My model was trained with Rasa V2.1.3.
RasaLit: V0.1.2

File "/home/robin/askMona_Projets/1_Projet_NomDeLieux/ArDf/Divers_Tests/TEST_RASA/env/lib/python3.6/site-packages/streamlit/script_runner.py", line 332, in _run_script
    exec(code, module.__dict__)
File "/home/robin/askMona_Projets/1_Projet_NomDeLieux/ArDf/Divers_Tests/TEST_RASA/env/lib/python3.6/site-packages/rasalit/apps/spelling/app.py", line 50, in <module>
    preds = clf.predict_proba(augs)
File "/home/robin/askMona_Projets/1_Projet_NomDeLieux/ArDf/Divers_Tests/TEST_RASA/env/lib/python3.6/site-packages/rasalit/apps/spelling/classifier.py", line 93, in predict_proba
    result.append([ranking_dict[n] for n in self.class_names_])
File "/home/robin/askMona_Projets/1_Projet_NomDeLieux/ArDf/Divers_Tests/TEST_RASA/env/lib/python3.6/site-packages/rasalit/apps/spelling/classifier.py", line 93, in <listcomp>
    result.append([ranking_dict[n] for n in self.class_names_])

KeyError: 'smalltalk residence'
It looks like it doesn't like how I named my intent?

Contributor

koaning commented Feb 15, 2021

I have a feeling that this might be related to your config.yml setting. Did you have a max-ranking set in DIET by any chance?

Author

rmiaouh commented Feb 15, 2021

I will send you my metadata.txt. I don't think I have a max-ranking set?

metadata.txt

Contributor

koaning commented Feb 15, 2021

Ah yeah, I think I've found the issue. Your DIET classifier doesn't return a confidence for every intent because of this setting in DIET:

"ranking_length": 10,

This might be a default value though so it's something that I need to fix on my end. So it's added to the TODO pile for this week. Thanks for reporting!

Will ping here once there's a fix.
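
A minimal sketch of how a truncated ranking can lead to the `KeyError` in the traceback above. The intent names other than `smalltalk residence`, the confidence values, and the variable names are hypothetical; only the list-comprehension lookup mirrors the line shown in the traceback.

```python
# All intents the app expects a score for (hypothetical, except the last one).
class_names = ["greet", "goodbye", "smalltalk residence"]

# A ranking that was cut off before it reached every intent (illustrative values).
intent_ranking = [
    {"name": "greet", "confidence": 0.9},
    {"name": "goodbye", "confidence": 0.1},
]

ranking_dict = {item["name"]: item["confidence"] for item in intent_ranking}

# Intents that the model knows about but that are absent from the ranking.
missing = [name for name in class_names if name not in ranking_dict]

failed_key = None
try:
    # Same lookup pattern as in the traceback: fails on the first missing intent.
    row = [ranking_dict[name] for name in class_names]
except KeyError as err:
    failed_key = err.args[0]

print("missing from ranking:", missing)
print("lookup failed on:", failed_key)
```

With 90 intents and a ranking limited to 10 entries, most intents would be absent from any single prediction, so the lookup is likely to fail regardless of how the intent is named.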

Author

rmiaouh commented Feb 15, 2021

Oh, thank you for everything. Now I understand the logic behind this bug.
I train models with 90 intents. Can this parameter affect the model's performance, or does it only affect the number of intents that are displayed?

Contributor

koaning commented Feb 15, 2021

I'm not 100% sure yet, but I'll try to investigate this.

Contributor

koaning commented Feb 16, 2021

I've just pushed a PR with a fix, once tests are green it will be merged. Could you let me know if it works?

Today I learned that Rasa pipelines don't expose all the intent confidence scores on prediction, unless you configure DIET to do so. The fix turned out to be simple, and we're now also robust against the recent changes to the confidence scores.

In case you haven't seen the announcement yet.
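
The PR itself is not shown in this thread, so the following is only a sketch of one way to make the lookup tolerant of a truncated ranking: intents missing from the ranking get a default score instead of raising. The function name `scores_for_all_classes` and the default of `0.0` are assumptions, not the actual rasalit code.

```python
from typing import Dict, List, Union

RankingItem = Dict[str, Union[str, float]]


def scores_for_all_classes(
    intent_ranking: List[RankingItem],
    class_names: List[str],
    default: float = 0.0,
) -> List[float]:
    """Return one score per class, using `default` for intents not in the ranking."""
    ranking_dict = {item["name"]: item["confidence"] for item in intent_ranking}
    # dict.get avoids the KeyError when the ranking does not cover every intent.
    return [ranking_dict.get(name, default) for name in class_names]


row = scores_for_all_classes(
    [{"name": "greet", "confidence": 0.9}],
    ["greet", "smalltalk residence"],
)
print(row)
```

Filling with a default keeps the output shape fixed at one column per intent, which is what downstream code calling `predict_proba` would typically expect.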

Author

rmiaouh commented Feb 16, 2021

> I've just pushed a PR with a fix, once tests are green it will be merged. Could you let me know if it works?

@koaning It's perfect! Thanks for everything.

May I ask you one last question?
I currently have a problem with DIET entity detection.
For example: if I train the entity "Livre" (which means "book" in French) and I enter the word "Live", the DIET entity extractor treats it as the word "Livre".
I think it's because of the CountVectorsFeaturizer in my config:

  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4

But from what I know, the CountVectorsFeaturizer is essential for good performance.
Do you have a strategy to avoid this problem?
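
A rough illustration of why "Live" and "Livre" look alike under character n-grams of length 1 to 4. Rasa's CountVectorsFeaturizer builds on scikit-learn's `CountVectorizer`; the helper below, `char_wb_ngrams`, is a hypothetical pure-Python approximation of the `char_wb` analyzer (pad the word with spaces, then slide a window), and the lowercasing step is an assumption about the featurizer's settings.

```python
def char_wb_ngrams(word, min_n=1, max_n=4):
    """Approximate char_wb n-grams for a single word, returned as a set."""
    padded = f" {word.lower()} "  # char_wb pads each word with a space on both sides
    grams = set()
    for n in range(min_n, max_n + 1):
        for i in range(len(padded) - n + 1):
            grams.add(padded[i:i + n])
    return grams


live = char_wb_ngrams("Live")
livre = char_wb_ngrams("Livre")
shared = live & livre

print(f"{len(shared)} of {len(live)} n-grams of 'Live' also occur in 'Livre'")
print("only in 'Live':", sorted(live - livre))
```

In this sketch most of the n-grams of "Live" also appear in "Livre", so the two tokens receive largely overlapping sparse features, which is consistent with the confusion described above.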

Thanks for everything, I will watch the video :).

Contributor

koaning commented Feb 16, 2021

What entities are you trying to detect? Maybe a RegexEntityExtractor works better here.
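
To illustrate the idea behind exact pattern matching, here is a small sketch using Python's standard `re` module. It does not call Rasa's RegexEntityExtractor; the pattern, the function name `find_livre`, and the example sentences are hypothetical.

```python
import re

# Hypothetical exact-match pattern for the entity value "livre".
# \b anchors the match to word boundaries, so near-misses like "live" are rejected.
LIVRE_PATTERN = re.compile(r"\blivre\b", re.IGNORECASE)


def find_livre(text):
    """Return (start, end, matched_text) for each exact 'livre' mention."""
    return [(m.start(), m.end(), m.group(0)) for m in LIVRE_PATTERN.finditer(text)]


hits = find_livre("Je cherche un Livre")
no_hits = find_livre("Je regarde un live")

print(hits)
print(no_hits)
```

The trade-off is that exact matching will also miss misspellings and inflected forms that a learned extractor might catch, so it fits best when the set of entity values is small and well defined.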

koaning closed this as completed Feb 16, 2021