Add recipe for ICASSP2024 ICMC-ASR Grand Challenge #1172

Merged (2 commits) on Oct 8, 2023

@yfyeung commented on Oct 8, 2023

The ICMC-ASR Grand Challenge dataset was recorded in a hybrid electric vehicle with speakers sitting in different positions, including the driver seat and the passenger seats. There are over 160 speakers in total, all native Chinese speakers who speak Mandarin without strong accents.

To capture speech across the entire cockpit, two types of recording devices are used: far-field and near-field. Eight distributed microphones are placed at the four seats: the driver's seat (DS01C01, DX01C01), the passenger seat (DS02C01, DX02C01), the rear right seat (DS03C01, DX03C01), and the rear left seat (DS04C01, DX04C01). In addition, two linear microphone arrays, each consisting of two microphones, are placed on the display screen (DL01C01, DL02C02) and at the center of the inner sunroof (DL02C01, DL02C02). All 12 far-field channels are time-synchronized and included in the released dataset.

For transcription purposes, each speaker wears a high-fidelity headphone that records near-field audio, labeled by the seat the speaker occupies: DA01, DA02, DA03, and DA04 denote the driver seat, passenger seat, rear right seat, and rear left seat, respectively. The near-field data are single-channel recordings.

A sizable real noise dataset is also provided. It follows the recording setup of the far-field data but with no speakers talking, and is intended to support research on data simulation for in-car scenarios.
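The seat-to-microphone layout described above can be written down as a small lookup table. The following is only an illustrative sketch: the names `SEATS`, `seat_of`, and `far_field_channels` are hypothetical and are not part of lhotse or of the challenge tooling, and the seat keys are made-up labels. Only the near-field IDs and the eight distributed far-field microphones are covered; the four linear-array channels are left out.

```python
# Illustrative only: seat layout as described in the PR text.
# SEATS, seat_of and far_field_channels are hypothetical names, not lhotse APIs.
SEATS = {
    "driver": {"near": "DA01", "far": ("DS01C01", "DX01C01")},
    "passenger": {"near": "DA02", "far": ("DS02C01", "DX02C01")},
    "rear_right": {"near": "DA03", "far": ("DS03C01", "DX03C01")},
    "rear_left": {"near": "DA04", "far": ("DS04C01", "DX04C01")},
}


def seat_of(channel):
    """Return the seat label for a near-field ID or a distributed far-field channel."""
    for seat, mics in SEATS.items():
        if channel == mics["near"] or channel in mics["far"]:
            return seat
    raise KeyError(f"unknown channel: {channel}")


def far_field_channels():
    """List the eight distributed far-field channels, seat by seat."""
    return [ch for mics in SEATS.values() for ch in mics["far"]]
```

For example, `seat_of("DA03")` resolves a near-field recording to the rear right seat, and `far_field_channels()` enumerates the distributed microphones in seat order.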

Participants need to download the datasets manually from https://icmcasr.org.

@yfyeung changed the title from "Add dataset for ICASSP2024 ICMC-ASR Grand Challenge" to "Add recipe for ICASSP2024 ICMC-ASR Grand Challenge" on Oct 8, 2023

@pzelasko left a comment

LGTM! Looks like a cool challenge and dataset.

@pzelasko added this to the v1.17 milestone on Oct 8, 2023
@pzelasko merged commit 499ddd3 into lhotse-speech:master on Oct 8, 2023
10 checks passed
@yfyeung deleted the icmcasr branch on October 9, 2023 06:51
flyingleafe pushed a commit to flyingleafe/lhotse that referenced this pull request Oct 11, 2023
* Add ICMC-ASR corpus

* Fix isort

---------

Co-authored-by: yfy62 <yfy62@d3-hpc-sjtu-test-005.cm.cluster>