
AutoFAQ/NLU-Evaluation-Scripts

 
 


Evaluation benchmarks for AutoFAQ NLU (https://autofaq.ai/en).

This repo contains an evaluation of the Autofaq.ai, Cognigy, DialogFlow, Microsoft LUIS, and Watson NLU services.

Data

The data comes from Benchmarking Natural Language Understanding Services for Building Conversational Agents (2019); for details and reproduction, see the v4 folder.
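For orientation, here is a minimal sketch of loading such a train/test split with pandas. The file and column names (`v4/train.csv`, `sentence`, `intent`) are illustrative assumptions, not the repo's actual layout:

```python
import pandas as pd

# Hypothetical file names and columns; the real layout lives in the v4 folder.
train = pd.read_csv("v4/train.csv")  # columns assumed: sentence, intent
test = pd.read_csv("v4/test.csv")

print(train["intent"].nunique(), "intents")
print(len(train), "training sentences,", len(test), "test sentences")
```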

Small subset

640 Training Sentences - 10 Sentences per Intent

1076 Test Sentences

|            | AutoFAQ | Cognigy | DialogFlow | Microsoft LUIS | Watson |
|------------|---------|---------|------------|----------------|--------|
| Accuracy   | 0.808   | 0.751   | 0.656      | 0.655          | 0.690  |
| F1 (macro) | 0.785   | 0.748   | 0.657      | 0.641          | 0.686  |
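The reported numbers are standard classification metrics over the test sentences. A minimal sketch of computing them with scikit-learn, using toy labels in place of a service's actual predictions:

```python
from sklearn.metrics import accuracy_score, f1_score

# Toy stand-ins for the test-set intent labels and a service's predictions.
y_true = ["play_music", "set_alarm", "set_alarm", "weather"]
y_pred = ["play_music", "set_alarm", "weather", "weather"]

accuracy = accuracy_score(y_true, y_pred)
# Macro F1 averages per-intent F1 scores with equal weight, so rare
# intents count as much as frequent ones.
f1_macro = f1_score(y_true, y_pred, average="macro")
print(f"Accuracy: {accuracy:.3f}, F1 (macro): {f1_macro:.3f}")
```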

Large subset

1908 Training Sentences - ~30 Sentences per Intent

5518 Test Sentences

|            | AutoFAQ | Cognigy | DialogFlow | Microsoft LUIS | Watson |
|------------|---------|---------|------------|----------------|--------|
| Accuracy   | 0.854   | 0.846   | 0.761      | 0.788          | 0.810  |
| F1 (macro) | 0.846   | 0.827   | 0.758      | 0.776          | 0.804  |
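A sketch of how fixed-size training subsets like these can be drawn, sampling up to N sentences per intent with pandas. This is an illustration under the column assumptions above, not necessarily how the benchmark's splits were produced:

```python
import pandas as pd

def subsample_per_intent(df: pd.DataFrame, n: int, seed: int = 42) -> pd.DataFrame:
    """Take up to `n` randomly sampled sentences from each intent group."""
    return (
        df.groupby("intent", group_keys=False)
          .apply(lambda g: g.sample(min(len(g), n), random_state=seed))
          .reset_index(drop=True)
    )

# small = subsample_per_intent(train, 10)  # ~10 sentences per intent
# large = subsample_per_intent(train, 30)  # ~30 sentences per intent
```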
