Intent classification with garbage text #221
Comments
Hi @Aljumaili85, I have a couple of questions:
If the correct intent was identified and the confidence numbers you are observing are derived from Frank |
Hi @franklevasseur,
Yes, the intent was right in all cases, simply because I have no other intent.
I tried both cases: Botpress and the standalone NLU server (the latest binary version, 1.0.2). Command to run the language server: Command to run the NLU server: Intent JSON:
Thank you! |
Hello again, I have a few points to address:
Instead of relying on the confidence number (which has little value), you can try adding more intents and see if the accuracy is good enough. You might also be interested in Botpress Cloud, which takes advantage of large language models. I hope this information is helpful. Best regards, |
Thank you @franklevasseur . |
Make sure the issue is NLU related
Operating system
Windows
Product used
NLU Server
Deploy Option
Binary
Version
12.30.6
Configuration File
No response
CLI Arguments
No response
Environment variables
No response
Description of the bug
Hi there, I am using the NLU engine as a simple intent classifier, so I train the model using very simple utterances with no slots and no entities.
I notice that when I use, as input, a simple phrase containing only the main keywords, the returned result is excellent, but when I add some garbage text to my phrase, the confidence dramatically goes down.
By garbage text, I mean "please, I would like ... " or "can you please ..." etc.
I assumed that the NLU would be capable of identifying the labels for these words, but unfortunately, adding them lowers the confidence of the returned results.
Example:
"opening hours of the swimming pool" --> result { "intent": "info_swimming_pool", "confidence": "0.9576875041845723" }
"please, I would like to know the opening hours of the swimming pool" --> result { "intent": "info_swimming_pool", "confidence": "0.2959758827595158" }
The language used in my case is Italian. My language server is running with no problems, and the training process was successful.
Any suggestions?
Thanks