88-Hours-Mandarin-Speech-Data-in-Noisy-Environment-by-Mobile-Phone

Description

Mandarin speech recorded on mobile phones in noisy environments. The data was recorded by 203 speakers from across China, covering all major dialect regions, in a variety of noise scenes such as subways, supermarkets, and restaurants, which makes it better suited to real application scenarios. It can be used for automatic speech recognition, machine translation, and voiceprint recognition.

For more details, please refer to the link: https://www.nexdata.ai/datasets/191?source=Github

Format

16 kHz, 16-bit, uncompressed WAV, mono channel
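
As a minimal sketch of how the stated format could be checked on a downloaded file, the snippet below uses Python's standard `wave` module to compare a file's header against the values listed above. The function names and any file path passed in are hypothetical and not part of the dataset; the directory layout of the dataset is not described here, so none is assumed.

```python
import wave

# Format stated in this README: 16 kHz, 16-bit, mono, uncompressed WAV.
EXPECTED_RATE_HZ = 16_000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16 bits per sample
EXPECTED_CHANNELS = 1


def check_wav_format(path):
    """Return (ok, duration_seconds) for the WAV file at `path`.

    `path` is a hypothetical location chosen by the caller.
    """
    with wave.open(path, "rb") as wav:
        ok = (
            wav.getframerate() == EXPECTED_RATE_HZ
            and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH_BYTES
            and wav.getnchannels() == EXPECTED_CHANNELS
            and wav.getcomptype() == "NONE"  # "NONE" means uncompressed PCM
        )
        duration = wav.getnframes() / wav.getframerate()
    return ok, duration


def raw_pcm_bytes(hours):
    """Approximate audio payload size in bytes, ignoring WAV headers."""
    bytes_per_second = (
        EXPECTED_RATE_HZ * EXPECTED_SAMPLE_WIDTH_BYTES * EXPECTED_CHANNELS
    )
    return int(hours * 3600 * bytes_per_second)
```

At this format each second of audio takes 32,000 bytes, so `raw_pcm_bytes(88)` gives 10,137,600,000 bytes (roughly 10.1 GB) for the 88 hours named in the title, excluding file headers and transcriptions.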

Recording environment

noisy, including subway, market, restaurant, street, airport, etc.

Recording content

common sentences; letters

Speaker

203 people, 57% of whom are male

Device

Android mobile phone; iPhone

Language

Mandarin (without heavy local accent)

Transcription content

text, noise symbols

Accuracy rate

95% (the accuracy rate of noise symbols is not included)

Application scenarios

speech recognition, voiceprint recognition

Licensing Information

Commercial License