Boosting exact matches #110
Comments
I'm glad you're excited about this project. If I understand correctly it sounds like you want to overfit to your training data :) which would probably hurt the model's ability to generalise to new examples.
@amn41 Also, the type of examples I was trying to train the model on are similar to the following: Thanks, Raffi
If there's really only a one-word difference I would recommend using one intent, call it Regarding the confidence, it's the estimated probability that the text matches the predicted intent. So if you have two intents and 51% confidence, that's pretty weak. If you have 50 intents and 25% confidence, that's actually quite high, if that makes sense!
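To make the point about the number of intents concrete, here is a minimal sketch that compares a confidence against the chance level of a uniform guess over all intents. The helper name `times_chance` is made up for illustration and is not part of rasa.

```python
def times_chance(confidence, n_intents):
    """Return how many times larger `confidence` is than a uniform guess.

    A uniform guess over n_intents assigns 1 / n_intents to each intent,
    so the ratio is confidence / (1 / n_intents) = confidence * n_intents.
    """
    return confidence * n_intents


# Two intents, 51% confidence: barely better than a coin flip.
print(times_chance(0.51, 2))   # 1.02

# Fifty intents, 25% confidence: far above the 2% chance level.
print(times_chance(0.25, 50))  # 12.5
```

This is only a rough way to read the number. It says nothing about whether the probabilities themselves are well calibrated.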
Thanks! That makes sense. However, perhaps I am training wrong. I tried the same dataset with both mitie and spacy_sklearn, and the confidence for mitie is much higher (an intent with spacy_sklearn would be 0.3, while with mitie it is 0.9 and above). Is there a reason for this? My mitie training took a little over an hour on a 16 GB MacBook Pro. With spacy_sklearn, the same training takes about 20 seconds. There were 19 intents with 85 utterances. Any insights into this would be much appreciated! Thanks!
The confidence you get from MITIE is a
Thanks so much for taking the time to respond! I think I am actually using your fork already because you had mentioned it in another issue. Even with your fork, does it make sense that training with mitie could take over 6 hours for 19 intents with 115 utterances? spacy_sklearn takes just 30 seconds on the same dataset. Regarding the score, a lowish score with mitie still means that the probability that it mapped to the correct intent is low, correct? The mitie score just seems to make more sense to me than the probability in spacy_sklearn. When I see a score above 0.9 in mitie, that sounds pretty good to me. Does mitie do better on a smaller dataset than spacy_sklearn? With spacy_sklearn, there seemed to be little difference in the probability between a correct and an incorrect mapping of an intent. I also thought mitie did a better job with utterances whose words overlapped. With spacy_sklearn during training I get Thanks again! Any help would be much appreciated!
Re: training times - I think I'll bring the The warning about the f-score is probably the following: when training, rasa randomly splits the data into a training set (80%) and a test set (20%). If you have very few examples of one intent, the test set may not contain any examples of that intent, which generates this warning.
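A minimal sketch of why this happens, assuming a plain shuffled 80/20 split as described above (this is not rasa's actual splitting code, and the function name and `intent_N` labels are made up for illustration). It uses the dataset size mentioned earlier in the thread, 85 utterances over 19 intents, and assumes they are spread roughly evenly.

```python
import random


def intents_missing_from_test(labels, test_fraction=0.2, seed=0):
    """Shuffle the labels, hold out a test slice, and return the
    intents that have no example in that slice."""
    rng = random.Random(seed)
    shuffled = list(labels)
    rng.shuffle(shuffled)
    n_test = max(1, int(round(len(shuffled) * test_fraction)))
    test_labels = shuffled[:n_test]
    return set(labels) - set(test_labels)


# 85 utterances spread over 19 hypothetical intents (4 or 5 each).
labels = ["intent_%d" % (i % 19) for i in range(85)]

missing = intents_missing_from_test(labels)

# The test set holds 17 utterances, so it can cover at most 17 of the
# 19 intents: at least 2 intents are always absent, whatever the seed.
print(len(missing), "intents have no test examples:", sorted(missing))
```

With a dataset this small, some intents will always be unscored in the test set, so the warning is expected rather than a sign that training went wrong.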
Cool, that sounds great! Is
mitie_sklearn is kind of a hybrid, using MITIE's word vectors and named entity recogniser, but sklearn to train the intent classifier. It works, I've used it before, but might not be fully up to date. Try it out and let me know if you get any errors.
Thanks for the information! I see the training and interpreter classes for mitie_sklearn, but I think that the trainer is missing Thanks again for all your help!
Ok yeah it's out of date. I'll find some time to fix this :)
Thanks so much for such an awesome project! I am really enjoying testing it out.
One thing I noticed is that an exact match does not have a confidence of 1.0. Sometimes it may even have a low confidence. What would your approach be to boost exact matches? Obviously, we can just search for an exact match, but I was wondering if there is a way to improve the score in the model itself.
Any help would be much appreciated. Thanks again for the awesome work!
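For reference, the "just search for an exact match" fallback mentioned above could look roughly like the sketch below. Everything here is hypothetical: `normalise`, `build_exact_lookup`, `parse_with_exact_boost` and the `classify` callable are illustrative names, not part of rasa, and `classify` stands in for whatever trained model returns an `(intent, confidence)` pair.

```python
def normalise(text):
    """Lower-case and collapse whitespace so trivial differences still match."""
    return " ".join(text.lower().split())


def build_exact_lookup(training_examples):
    """Map each normalised training utterance to its intent.

    `training_examples` is an iterable of (text, intent) pairs.
    """
    return {normalise(text): intent for text, intent in training_examples}


def parse_with_exact_boost(text, lookup, classify):
    """Return confidence 1.0 for known utterances, else defer to the model."""
    key = normalise(text)
    if key in lookup:
        return {"intent": lookup[key], "confidence": 1.0}
    intent, confidence = classify(text)
    return {"intent": intent, "confidence": confidence}


# Usage with a stand-in classifier (an assumption, not a real model).
lookup = build_exact_lookup([("Show me flights", "search_flights")])


def fake_classify(text):
    return ("search_hotels", 0.3)


print(parse_with_exact_boost("show me  FLIGHTS", lookup, fake_classify))
print(parse_with_exact_boost("book a hotel", lookup, fake_classify))
```

This sits in front of the model and leaves the classifier's own scores untouched, so it does not answer the question of improving the score inside the model itself.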