If you use the data in a publication, please cite our paper:
@article{lorenc2021benchmark,
  title={Benchmark of public intent recognition services},
  author={Lorenc, Petr and Marek, Petr and Pichl, Jan and Konr{\'a}d, Jakub and {\v{S}}ediv{\`y}, Jan},
  journal={Language Resources and Evaluation},
  pages={1--19},
  year={2021},
  publisher={Springer}
}
This project is a collection of code and datasets for evaluating intent classification models, both local and remote. The results can be helpful when building chatbots or other conversational interfaces.
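As an illustration of the kind of evaluation the project performs, here is a minimal sketch of scoring an intent classifier's predictions against gold labels. The intent names and predictions below are hypothetical, not taken from the benchmark datasets:

```python
from collections import Counter

def accuracy(gold, predicted):
    """Fraction of utterances whose predicted intent matches the gold intent."""
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)

def per_intent_accuracy(gold, predicted):
    """Accuracy broken down by gold intent label."""
    totals, hits = Counter(gold), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            hits[g] += 1
    return {intent: hits[intent] / totals[intent] for intent in totals}

# Hypothetical gold/predicted intents for a handful of utterances
gold = ["weather", "weather", "alarm", "music"]
pred = ["weather", "alarm", "alarm", "music"]
print(accuracy(gold, pred))             # 0.75
print(per_intent_accuracy(gold, pred))  # {'weather': 0.5, 'alarm': 1.0, 'music': 1.0}
```

The same per-intent breakdown applies whether the predictions come from a local model or a remote service API.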
Code for performing local testing.
Different approaches for classification
Updated code from NLU-Evaluation-Corpora
Template for getting sentence embeddings - cannot be provided due to privacy issues.
Three datasets from NLU-Evaluation-Corpora - used to perform the comparison.
Our own data for evaluation
Results in JSON format
Confusion matrix visualisation - using Facets
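The confusion matrix behind the visualisation can be computed from gold/predicted intent pairs before it is handed to a tool such as Facets. A minimal sketch, with hypothetical intent labels:

```python
def confusion_matrix(gold, predicted, labels):
    """matrix[g][p] counts utterances with gold intent g predicted as intent p."""
    matrix = {g: {p: 0 for p in labels} for g in labels}
    for g, p in zip(gold, predicted):
        matrix[g][p] += 1
    return matrix

# Hypothetical example: one "weather" utterance misclassified as "alarm"
labels = ["weather", "alarm", "music"]
gold = ["weather", "weather", "alarm", "music"]
pred = ["weather", "alarm", "alarm", "music"]
m = confusion_matrix(gold, pred, labels)
print(m["weather"]["alarm"])  # 1
print(m["alarm"]["alarm"])    # 1
```

Serialising such a nested dict to JSON gives a format that most visualisation front-ends can consume directly.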
If you have any questions, please contact:
Petr Lorenc (Czech Technical University) petr.lorenc@cvut.cz