
Bespoken Benchmarking Project

This is Bespoken's open-source benchmarking project. It provides a general mechanism for testing and evaluating NLP platforms.

We have conducted two tests so far: a General Knowledge test and a Speech Recognition test (see the results table below).

Process

We interact with the voice assistants using the Bespoken Device Service, which allows us to interact exactly as a real person would with an actual device. Read more here.

For running the tests and collecting the results, we leverage our batch testing framework:
https://gitlab.com/bespoken/batch-tester

Benchmark Results

Results are intended to be published on a bi-monthly basis. The table below summarizes our tests and results to date:

| Date       | Test Type          | Data Set     | Platforms                                       | Results |
|------------|--------------------|--------------|-------------------------------------------------|---------|
| 7/26/2020  | General Knowledge  | ComQA        | Alexa, Google Assistant, Siri                   | Link    |
| 11/20/2020 | Speech Recognition | DefinedCrowd | Amazon Connect, Google Dialogflow, Twilio Voice | Link    |

The published results are viewable here:
https://benchmark.bespoken.io

Methodology

General Knowledge

We classify a response as correct or incorrect based on whether it contains the expected answer from the dataset.

Where the dataset lists multiple acceptable answers, we count the response as correct if any one of them is present, as sketched below. Read more here
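
As an illustration, here is a minimal sketch of that matching rule. The function name, normalization, and substring comparison are assumptions made for clarity, not the project's actual code:

```typescript
// Hypothetical sketch of the answer-matching rule described above.
// Lowercasing and substring matching are assumptions; the project's
// actual comparison logic may differ.
function isCorrect(response: string, expectedAnswers: string[]): boolean {
  const normalized = response.toLowerCase();
  // The response counts as correct if ANY of the dataset's accepted
  // answers appears within it.
  return expectedAnswers.some((answer) =>
    normalized.includes(answer.toLowerCase().trim())
  );
}

// Example: a ComQA-style question with multiple accepted answers.
const response = "Mark Twain wrote The Adventures of Tom Sawyer.";
console.log(isCorrect(response, ["Mark Twain", "Samuel Clemens"])); // true
```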

Speech Recognition Accuracy

We take datasets from DefinedCrowd and run them through the various platforms using our Virtual Devices for IVR. Read more here
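
The README does not specify the scoring metric, but word error rate (WER) is the conventional measure of speech recognition accuracy. A minimal sketch, assuming a standard WER computed via word-level edit distance (an illustrative assumption, not the project's confirmed method):

```typescript
// Hypothetical word error rate (WER) computation: edit distance between
// the reference transcript and the platform's hypothesis, divided by the
// number of reference words. Assumes a non-empty reference.
function wordErrorRate(reference: string, hypothesis: string): number {
  const ref = reference.toLowerCase().split(/\s+/).filter(Boolean);
  const hyp = hypothesis.toLowerCase().split(/\s+/).filter(Boolean);

  // Levenshtein distance over words: substitutions, insertions, deletions.
  const d: number[][] = Array.from({ length: ref.length + 1 }, (_, i) =>
    Array.from({ length: hyp.length + 1 }, (_, j) =>
      i === 0 ? j : j === 0 ? i : 0
    )
  );
  for (let i = 1; i <= ref.length; i++) {
    for (let j = 1; j <= hyp.length; j++) {
      const cost = ref[i - 1] === hyp[j - 1] ? 0 : 1;
      d[i][j] = Math.min(
        d[i - 1][j] + 1,       // deletion
        d[i][j - 1] + 1,       // insertion
        d[i - 1][j - 1] + cost // substitution
      );
    }
  }
  return d[ref.length][hyp.length] / ref.length;
}

// Example: one substitution ("their" for "there") in a 4-word reference.
console.log(wordErrorRate("put it over there", "put it over their")); // 0.25
```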

Contact

We appreciate all feedback. Open an issue to suggest additional datasets as well as improvements to our methodology.

Contact us at contact@bespoken.io.