SpeechColab ASR leaderboard

1. Overview

"If you can’t measure it, you can’t improve it." -- Peter Drucker

Regarding the current state of Automatic Speech Recognition (ASR), the term "State-Of-The-Art" (SOTA) is vague, in the sense that:

For industry, there is no objective, quantitative benchmark of how commercial APIs perform in real-life scenarios, at least in the public domain.
For academia, it is becoming harder to compare ASR models due to the fragmentation of research toolkits and ecosystems.
How are academic SOTA and industrial SOTA related?

As the figure above shows, the SpeechIO leaderboard serves as an ASR benchmarking platform by providing 3 components:

TestSet Zoo: A collection of test sets covering a wide range of speech recognition scenarios
Model Zoo: A collection of models, including commercial APIs and open-source pretrained models
An automated benchmarking pipeline:
- It defines the simplest possible specification for the recognition interface, the format of input test sets, and the format of output recognition results.
- As long as model submitters conform to this specification, a fully automated pipeline takes care of the rest (e.g. data preparation -> recognition invocation -> text post-processing -> WER/CER/SER evaluation).
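
The final evaluation step of such a pipeline can be illustrated with a minimal sketch. This is not the leaderboard's actual scoring code: the function names `edit_distance` and `error_rates` are hypothetical, text is assumed to be already normalized, words are assumed to be whitespace-separated, and references are assumed to be non-empty. WER and CER are computed as total edit distance divided by total reference length, and SER as the fraction of sentences with at least one word error.

```python
from typing import Dict, List, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance: minimum substitutions + deletions + insertions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def error_rates(pairs: List[Tuple[str, str]]) -> Dict[str, float]:
    """Corpus-level WER, CER and SER over (reference, hypothesis) pairs."""
    word_errs = word_total = char_errs = char_total = sent_errs = 0
    for ref, hyp in pairs:
        ref_words, hyp_words = ref.split(), hyp.split()
        # Characters are compared with spaces removed (illustrative choice).
        ref_chars = list(ref.replace(" ", ""))
        hyp_chars = list(hyp.replace(" ", ""))
        word_errs += edit_distance(ref_words, hyp_words)
        word_total += len(ref_words)
        char_errs += edit_distance(ref_chars, hyp_chars)
        char_total += len(ref_chars)
        sent_errs += int(ref_words != hyp_words)
    return {
        "WER": word_errs / word_total,
        "CER": char_errs / char_total,
        "SER": sent_errs / len(pairs),
    }


if __name__ == "__main__":
    results = error_rates([
        ("the cat sat", "the cat sat"),
        ("a b c d", "a x c"),
    ])
    print(results)
```

CER is the usual metric for Chinese test sets, where word boundaries are not marked in the text, while WER is typical for English.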

With the SpeechIO leaderboard, anyone can benchmark, reproduce, and compare others' systems on a local machine, as long as those systems and test sets are published in the model zoo and test-set zoo.

2. TestSet Zoo

Test Sets From Public Academic Datasets

已公开 Unlocked	编号 TEST_SET_ID	说明 DESCRIPTION	语言 LANGUAGE
✓	LIBRISPEECH_TEST_CLEAN	"test_clean" set of LibriSpeech	en
✓	LIBRISPEECH_TEST_OTHER	"test_other" set of LibriSpeech	en
✓	GIGASPEECH_V1.0.0_DEV	dev set of GigaSpeech	en
✓	GIGASPEECH_V1.0.0_TEST	test set of GigaSpeech	en
✓	AISHELL1_TEST	test set of AISHELL-1	zh
✓	AISHELL2_IOS_TEST	test set of AISHELL-2 (iOS channel)	zh
✓	AISHELL2_ANDROID_TEST	test set of AISHELL-2 (Android channel)	zh
✓	AISHELL2_MIC_TEST	test set of AISHELL-2 (Microphone channel)	zh

SpeechIO Test Sets (ZH)

SpeechIO test sets are carefully curated by the SpeechIO authors. The audio is crawled from publicly available sources (YouTube, TV programs, podcasts, etc.), covers a variety of well-known acoustic scenarios (AM) and content domains (LM & vocabulary), and is labeled by professional annotators.

已公开 Unlocked	编号 TEST_SET_ID	名称 Name	场景 Scenario	内容领域 Topic Domain	时长 hours	难度(1-5) Difficulty
✓	SPEECHIO_ASR_ZH00000	接入调试集 For leaderboard submitter debugging	视频会议、论坛演讲 video conference & forum speech	经济、货币、金融 economy, currency, finance	1.0	★★☆
✓	SPEECHIO_ASR_ZH00001	新闻联播	新闻播报 TV News	时政 news & politics	9	★
✓	SPEECHIO_ASR_ZH00002	鲁豫有约	访谈电视节目 TV interview	名人工作/生活 celebrity & film & music & daily	3	★★☆
✓	SPEECHIO_ASR_ZH00003	天下足球	专题电视节目 TV program	足球 Sports & Football & World Cup	2.7	★★☆
✓	SPEECHIO_ASR_ZH00004	罗振宇跨年演讲	会场演讲 Stadium Public Speech	社会、人文、商业 Society & Culture & Business Trend	2.7	★★
✓	SPEECHIO_ASR_ZH00005	李永乐老师在线讲堂	在线教育 Online Education	科普 Popular Science	4.4	★★★
✗	SPEECHIO_ASR_ZH00006	张大仙 & 骚白王者荣耀直播	直播 Live Broadcasting	游戏 Game	1.6	★★★☆
✗	SPEECHIO_ASR_ZH00007	李佳琪 & 薇娅直播带货	直播 Live Broadcasting	电商、美妆 Makeup & Online shopping/advertising	0.9	★★★★☆
✗	SPEECHIO_ASR_ZH00008	老罗语录	线下培训 Offline lecture	段子、做人 Life & Purpose & Ethics	1.3	★★★★☆
✗	SPEECHIO_ASR_ZH00009	故事FM	播客 Podcast	人生故事、见闻 Ordinary Life Story Telling	4.5	★★☆
✗	SPEECHIO_ASR_ZH00010	创业内幕	播客 Podcast	创业、产品、投资 Startup & Entrepreneur & Product & Investment	4.2	★★☆
✗	SPEECHIO_ASR_ZH00011	罗翔刑法法考培训讲座	在线教育 Online Education	法律法考 Law & Lawyer Qualification Exams	3.4	★★☆
✗	SPEECHIO_ASR_ZH00012	张雪峰考研线上小讲堂	在线教育 Online Education	考研高校报考 University & Graduate School Entrance Exams	3.4	★★★☆
✗	SPEECHIO_ASR_ZH00013	谷阿莫&牛叔说电影	短视频 VLog	电影剪辑 Movie Cuts	1.8	★★★
✗	SPEECHIO_ASR_ZH00014	贫穷料理 & 琼斯爱生活	短视频 VLog	美食、烹饪 Food & Cooking & Gourmet	1	★★★☆
✗	SPEECHIO_ASR_ZH00015	单田芳白眉大侠	评书 Traditional Podcast	江湖、武侠 Kung Fu Fiction	2.2	★★☆
✗	SPEECHIO_ASR_ZH00016	德云社相声演出	剧场相声 Theater Crosstalk Show	包袱段子 Funny Stories	1	★★★
✗	SPEECHIO_ASR_ZH00017	吐槽大会	脱口秀电视节目 Standup Comedy	明星糗事 Celebrity Jokes	1.8	★★☆
✗	SPEECHIO_ASR_ZH00018	小猪佩奇 & 熊出没	少儿动画 Children Cartoon	童话故事、日常 Fairy Tale	0.9	★☆
✗	SPEECHIO_ASR_ZH00019	CCTV5 NBA 比赛转播	体育赛事解说 Sports Game Live	篮球、NBA NBA Game	0.7	★★★
✗	SPEECHIO_ASR_ZH00020	篮球人物	纪录片 Documentary	篮球明星、成长 NBA Super Stars' Life & History	2.2	★★
✗	SPEECHIO_ASR_ZH00021	汽车之家车辆评测	短视频 VLog	汽车测评 Car benchmarks, Road driving test	1.7	★★★☆
✗	SPEECHIO_ASR_ZH00022	小艾大叔豪宅带看	短视频 VLog	房地产、豪宅 Real estate, Mansion tour	1.7	★★★
✗	SPEECHIO_ASR_ZH00023	无聊开箱 & Zealer评测	短视频 VLog	产品开箱评测 Unboxing	2	★★★
✗	SPEECHIO_ASR_ZH00024	付老师种植技术	短视频 VLog	农业、种植 Agriculture, Planting	2.7	★★★☆
✗	SPEECHIO_ASR_ZH00025	石国鹏讲古希腊哲学	线下培训 Offline lecture	历史，古希腊哲学 History, Greek philosophy	1.3	★★☆
✗	SPEECHIO_ASR_ZH00026	张震鬼故事	广播节目 Broadcasting Program	鬼故事 Horror Stories	2.4	★★★
✗	SPEECHIO_ASR_ZH00027	华语辩论世界杯	辩论赛 Debates Contest	兴趣、技能、成长 Hobby, Skill, Growth	1.4	★★★
✗	SPEECHIO_ASR_ZH00028	时政现场同传	同声传译 Simultaneous Translation	时政、社会公共治理 News & Events on Public Governance	2.1	★★★☆

To pull an unlocked test set from the cloud into your local dataset zoo (leaderboard/datasets/*):

ops/pull dataset <TEST_SET_ID>

3. Model Zoo

Cloud API Models

API models are usually small (basically client programs), so we normally keep them in this GitHub repo.

已公开 Unlocked	编号 MODEL_ID	类型 type	模型作者/所有人 model author/owner	简介 description	链接 Service URL
✓	aispeech_api_zh	Cloud API	思必驰 AISpeech	思必驰开放平台 AISpeech Open Platform	https://cloud.aispeech.com
✓	aliyun_api_en	Cloud API	阿里巴巴 Alibaba	阿里云 Alibaba Cloud	https://www.alibabacloud.com/product/intelligent-speech-interaction
✓	aliyun_api_zh	Cloud API	阿里巴巴 Alibaba	阿里云 Alibaba Cloud	https://ai.aliyun.com/nls/asr
✓	baidu_pro_api_zh	Cloud API	百度 Baidu	百度智能云(极速版) Baidu AI Cloud (fast edition)	https://cloud.baidu.com/product/speech/asr
✓	google_api_en	Cloud API	谷歌 Google	谷歌云 Google Cloud	https://cloud.google.com/speech-to-text
✗		Cloud API	讯飞 IFlyTek	讯飞开放平台(听写) IFlyTek Open Platform (dictation)	https://www.xfyun.cn/services/voicedictation
✓	iflytek_lfasr_api_zh	Cloud API	讯飞 IFlyTek	讯飞开放平台(转写) IFlyTek Open Platform (transcription)	https://www.xfyun.cn/services/lfasr
✓	microsoft_rest_api_en	Cloud API	微软 Microsoft	Azure	https://azure.microsoft.com/en-us/services/cognitive-services/speech-to-text/
✓	microsoft_rest_api_zh	Cloud API	微软 Microsoft	Azure	https://azure.microsoft.com/zh-cn/services/cognitive-services/speech-services/
✓	microsoft_sdk_en	Cloud API	微软 Microsoft	Azure	https://azure.microsoft.com/en-us/services/cognitive-services/speech-to-text/
✓	microsoft_sdk_zh	Cloud API	微软 Microsoft	Azure	https://azure.microsoft.com/zh-cn/services/cognitive-services/speech-services/
✓	sogou_api_zh	Cloud API	搜狗 Sogou	AI开放平台 AI Open Platform	https://ai.sogou.com/product/one_recognition/
✓	tencent_api_zh	Cloud API	腾讯 Tencent	腾讯云 Tencent Cloud	https://cloud.tencent.com/product/asr
✓	yitu_api_zh	Cloud API	依图 YituTech	依图语音开放平台 YituTech Speech Open Platform	https://speech.yitutech.com

Local Engine (Open-sourced Pretrained ASR Models)

Local models/engines are normally too large for GitHub, so we store them in the cloud.

已公开 Unlocked	编号 MODEL_ID	类型 type	模型作者/所有人 model author/owner	简介 description
✓	speechio_kaldi_multicn	pretrained model	Xingyu NA(那兴宇)	Kaldi multi_cn recipe
✓	wenet_multi_cn	pretrained model	Binbin Zhang(张彬彬)@wenet-e2e	WeNet multi_cn recipe
✓	vosk_model_cn	batteries-included local engine	alphacephei	Chinese engine of Vosk
✓	wenet_wenetspeech	pretrained model	Binbin Zhang(张彬彬)@wenet-e2e	WeNet wenetspeech recipe

To pull an unlocked model from the cloud into your local model zoo (leaderboard/models/*):

ops/pull model <MODEL_ID>

4. Benchmarking Pipeline

To submit your model to the leaderboard and have it benchmarked on all test sets (including locked ones), follow the specification in HOW_TO_SUBMIT.md.

You can also pull publicly unlocked models and test sets and trigger the benchmarking pipeline on your local machine via:

ops/leaderboard_runner requests/request.yaml

The content of request.yaml is described in the specification above.
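
As a rough sketch of a full local run, the three commands shown in this README can be chained from a small Python helper. The helper names `build_commands` and `run_all` are hypothetical and not part of the repository. The sketch assumes it is run from the repository root and that requests/request.yaml has already been written according to the specification and refers to the same model and test sets.

```python
import subprocess
from typing import List


def build_commands(model_id: str, test_set_ids: List[str],
                   request: str = "requests/request.yaml") -> List[List[str]]:
    """Return the pull and run commands for one local benchmark."""
    cmds = [["ops/pull", "model", model_id]]
    cmds += [["ops/pull", "dataset", test_set_id] for test_set_id in test_set_ids]
    cmds.append(["ops/leaderboard_runner", request])
    return cmds


def run_all(cmds: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in cmds:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the sequence at the first failing command.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    commands = build_commands("wenet_multi_cn",
                              ["AISHELL1_TEST", "SPEECHIO_ASR_ZH00001"])
    run_all(commands, dry_run=True)
```

With `dry_run=True` the helper only prints the commands, which is a convenient way to check the sequence before downloading anything.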

5. Latest Leaderboard Report

Contacts

Email: leaderboard@speechio.ai
