Suspicious supervisions in AMI #83

Closed
pzelasko opened this issue Sep 23, 2020 · 1 comment

Comments

@pzelasko
Collaborator

There are 307 supervisions that have no text, and 5 of them have an extremely long duration (400 - 1600 seconds). See:

image

@jimbozhang could you take a look at it?
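
For reference, a minimal sketch of how a check like this could be run. It is not the lhotse API: `Segment` is a hypothetical stand-in holding only the fields the check needs, the 400-second threshold is taken from the range quoted above, and the second example segment is made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Segment:
    """Hypothetical stand-in for a supervision segment (not the lhotse class)."""
    id: str
    duration: float  # seconds
    text: str


def find_suspicious(
    segments: List[Segment], max_duration: float = 400.0
) -> Tuple[List[Segment], List[Segment]]:
    """Return (segments with empty text, the subset of those that is also very long)."""
    no_text = [s for s in segments if not s.text.strip()]
    too_long = [s for s in no_text if s.duration >= max_duration]
    return no_text, too_long


segments = [
    # Duration derived from the annotation line discussed in this issue.
    Segment("IB4002.Headset-2.wav-11-0", 1406.34 - 108.059, ""),
    # Made-up normal segment, for contrast only.
    Segment("example-normal-0", 2.5, "hello"),
]

no_text, too_long = find_suspicious(segments)
print([s.id for s in no_text])
print([s.id for s in too_long])
```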

@jimbozhang
Contributor

All the annotations are from http://groups.inf.ed.ac.uk/ami/AMICorpusAnnotations/ami_manual_annotations_v1.6.1_export.gzip

For the "No annotation found" warnings:
Indeed, there is no annotation for these 5 audio files in the gzip file.

For the no-text supervisions:
ami_manifests['dev']['supervisions']['IB4011.Headset-2.wav-12-0'].text == '', and the corresponding line in the gzip file is:
IB4011 C MIO095 2 501.733 503.627 501.733 503.627 . 713.34
It has no text. There is speech in this audio segment but no text in the annotation gzip file, so I think this line is wrong.

For the long-duration supervisions:
ami_manifests['dev']['supervisions']['IB4002.Headset-2.wav-11-0'].duration > 1298, and the corresponding line in the gzip file is:
IB4002 C FIO093 2 108.059 109.184 108.059 109.184 . 1406.34
It also has no text.
For normal segments, column 7 (109.184) should equal column 9 (1406.34), counting columns from 0, but for the no-text lines (which I think are simply wrong annotations) the two columns differ. lhotse currently takes 1406.34 as the end time; I'll fix it by taking 109.184 as the end time instead.
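
A rough sketch of the idea, not the actual lhotse change. It assumes whitespace-separated fields indexed from 0, with the end time at index 7 and the other end value at index 9 as described above. Reading the start time from index 6 is my own assumption (it matches index 4 in both example lines), and the function names are hypothetical.

```python
def parse_times(line: str):
    """Return (start, end_col7, end_col9) from one annotation line."""
    fields = line.split()
    start = float(fields[6])     # assumption: start time sits at index 6
    end_col7 = float(fields[7])  # end time the fix would use
    end_col9 = float(fields[9])  # end time that produced the long durations
    return start, end_col7, end_col9


def durations(line: str, tol: float = 1e-3):
    """Return (old_duration, fixed_duration, columns_agree) for one line."""
    start, end_col7, end_col9 = parse_times(line)
    columns_agree = abs(end_col7 - end_col9) <= tol
    return end_col9 - start, end_col7 - start, columns_agree


line = "IB4002 C FIO093 2 108.059 109.184 108.059 109.184 . 1406.34"
old, fixed, agree = durations(line)
print(old)    # longer than 1298 s, matching the duration reported above
print(fixed)  # roughly one second
print(agree)  # False: the two end columns disagree on this line
```

A mismatch between the two columns could also be logged as a warning, so that suspicious annotation lines are visible rather than silently corrected.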
