Trying the diarization pipeline on random .wav files #114

Closed
saisumit opened this issue Jul 24, 2018 · 26 comments

@saisumit

saisumit commented Jul 24, 2018

Hey, following the detailed tutorials, I went through them and trained all the models required for the pipeline. The pipeline works on the AMI dataset, but when I try to reproduce the results on other .wav files (sampled at 16k, mono, 256bps), it fails to diarize the audio.
Here is a brief summary of what I actually did:

  1. Took a random meeting audio file, sampled at 16k, mono and 256bps.
  2. Renamed it to ES2003a and replaced the actual ES2003a with it (thought of it as a workaround to avoid creating another database).
  3. Ran all the pipelines (SAD, SCD, embedding, diarization).

Output :

  1. Speech activity detection works perfectly and is able to classify the speech regions.
  2. Speaker diarization doesn't work; everything is classified as speaker 0.

Can you please tell me whether the pipeline gives wrong diarization outputs because I replaced the actual file, and what a better way to test the pipeline on arbitrary audio would be?

@bml1g12
Contributor

bml1g12 commented Jul 25, 2018

I'm not the developer, but if you follow this method I would think you need to regenerate the precomputed MFCCs at each step.

That being said, I suspect there is a more elegant way of using the pipeline on a new .wav without making a new database, one which would instead use the underlying Python API; I've done this for embeddings but haven't used the pipeline yet.

@saisumit
Author

saisumit commented Jul 25, 2018

Actually, I renamed the .wav to ES2003a, which is the first file for which the MFCCs, raw SAD values and SCD values are generated, so you can simply break the loop after that and still get the required files. It would be great if you could tell me how to do it with the complete pipeline. That being said, am I actually doing anything wrong? The diarization results are simply useless compared to the LIUM / AALTO diarization libraries, which give much more robust results.

@hedonistrh

Hi, I have a similar problem with speaker change detection: it always says there is no change for the whole file (see #111).

I suggest you first check your speaker change detection training. Maybe you have the same kind of problem as me.

@saisumit
Author

As I said, it works perfectly on the AMI dataset, so I don't think the problem is there.
These are the two files I tried this on:
https://drive.google.com/open?id=15Stt_JjWT7rzypHP5v9FfFTU1NmMHcew
https://drive.google.com/open?id=1IP8v5_VMiQQnk426R6p-fZajyhER04Hj

@hedonistrh

hedonistrh commented Jul 25, 2018

Hi,
That is quite interesting. How do you check whether it works perfectly or not? Can you share pyannote.metrics results for the test files, using a script like this one?


# Setup (the precomputed path is a placeholder; point it at your own SCD scores)
from pyannote.database import get_protocol, get_annotated
from pyannote.audio.features import Precomputed
from pyannote.audio.signal import Peak
from pyannote.metrics.diarization import DiarizationPurityCoverageFMeasure

protocol = get_protocol('AMI.SpeakerDiarization.MixHeadset')
precomputed = Precomputed('/path/to/scd')
peak = Peak(alpha=0.5, min_duration=1.0, log_scale=True)
metric = DiarizationPurityCoverageFMeasure()

# Loop on test files
for test_file in protocol.test():

    # load reference annotation and annotated (evaluated) regions
    reference = test_file['annotation']
    uem = get_annotated(test_file)

    # load precomputed change scores as pyannote.core.SlidingWindowFeature
    scd_scores = precomputed(test_file)

    # binarize scores to obtain the speaker change hypothesis as pyannote.core.Timeline
    hypothesis = peak.apply(scd_scores, dimension=1)

    # accumulate speaker change detection evaluation
    metric(reference, hypothesis.to_annotation(), uem=uem)

    # running values over the files processed so far
    purity, coverage, fmeasure = metric.compute_metrics()
    print(f'Purity = {100*purity:.1f}% / Coverage = {100*coverage:.1f}%')

# aggregate values over the whole test set
purity, coverage, fmeasure = metric.compute_metrics()
print(f'Purity = {100*purity:.1f}% / Coverage = {100*coverage:.1f}%')

I ask this because you said:

Speaker diarization doesn't work; everything is classified as speaker 0.

If you get a 100% coverage result from the script, that could be the reason for your problem.

@saisumit
Author

saisumit commented Jul 27, 2018

Hey, I tried out your suggestion; it seems I am getting 1% coverage. Any idea what can be done?

from pyannote.database import get_protocol
protocol = get_protocol('AMI.SpeakerDiarization.MixHeadset')
from pyannote.audio.features import Precomputed
precomputed = Precomputed('/media/DataDriveA/Datasets/sumit/scd')
from pyannote.audio.signal import Peak
peak = Peak(alpha=0.5, min_duration=1.0, log_scale=True)
from pyannote.metrics.diarization import DiarizationPurityCoverageFMeasure
metric = DiarizationPurityCoverageFMeasure()
from pyannote.database import get_annotated
for test_file in protocol.test():
... reference = test_file['annotation']
... uem = get_annotated(test_file)
... scd_scores = precomputed(test_file)
... hypothesis = peak.apply(scd_scores, dimension=1)
... metric(reference, hypothesis.to_annotation(), uem=uem)
...
0.019149014550107722
0.014072230507041606
0.03635586201714968
0.018868535259038234
0.016567144528221323
0.018798456401816547
0.03245998534873909
0.01930586206686937
0.018597586570416588
0.013406219189916744
0.04511515526311587
0.02177338540954314
0.025974370018109923
0.022023084858175945
0.02715933469271844
0.0197217875965152
0.018951645548169263
0.014496894798537002
0.023111019879448098
0.014170020247782397
0.013125944790148964
0.013026454562392011
purity, coverage, fmeasure = metric.compute_metrics()
print(f'Purity = {100*purity:.1f}% / Coverage = {100*coverage:.1f}%')
Purity = 78.4% / Coverage = 1.0%

@hedonistrh

Thanks for the reply.

Could you take a look at your hypothesis.to_annotation() result? It will give us some idea.
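
For instance, something like this quick sketch (assuming hypothesis is the pyannote.core.Timeline returned by peak.apply, as in the scripts above):

annotation = hypothesis.to_annotation()
print(annotation)                          # readable listing of hypothesized segments and labels
for segment in annotation.get_timeline():  # or iterate over the segments yourself
    print(segment.start, segment.end)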

It is quite interesting. We are trying roughly the same thing, yet we get different results. I think the only difference comes from here: I use the weights from the 5th epoch.

!pyannote-change-detection apply tutorials/change-detection/train/AMI.SpeakerDiarization.MixHeadset.train/weights/0005.pt AMI.SpeakerDiarization.MixHeadset raw_scores

@yinruiqing
Contributor

yinruiqing commented Jul 27, 2018

In fact, if you choose different thresholds in Peak, you'll get different results. Usually the threshold is chosen by validation. If you don't want to run validation, you can try different thresholds yourself and plot the coverage and purity curves.
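
For example, something along these lines (a rough, untested sketch; the precomputed path and the alpha grid are placeholders, and it reuses the same objects as the scripts above):

from pyannote.database import get_protocol, get_annotated
from pyannote.audio.features import Precomputed
from pyannote.audio.signal import Peak
from pyannote.metrics.diarization import DiarizationPurityCoverageFMeasure

protocol = get_protocol('AMI.SpeakerDiarization.MixHeadset')
precomputed = Precomputed('/path/to/scd')  # precomputed SCD scores

# sweep the Peak threshold and report aggregate purity/coverage for each value
for alpha in [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]:
    peak = Peak(alpha=alpha, min_duration=1.0, log_scale=True)
    metric = DiarizationPurityCoverageFMeasure()
    for test_file in protocol.test():
        reference = test_file['annotation']
        uem = get_annotated(test_file)
        scd_scores = precomputed(test_file)
        hypothesis = peak.apply(scd_scores, dimension=1)
        metric(reference, hypothesis.to_annotation(), uem=uem)
    purity, coverage, fmeasure = metric.compute_metrics()
    print(f'alpha={alpha:.2f}: Purity = {100*purity:.1f}% / Coverage = {100*coverage:.1f}%')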

@hedonistrh

hedonistrh commented Jul 27, 2018

Thanks for the interest.

When I try this, the output shows that there are changes everywhere:

peak = Peak(alpha=0.1, min_duration=1.0, log_scale=True)

When I try this, it gives 100% coverage:

peak = Peak(alpha=0.13, min_duration=1.0, log_scale=True)

@saisumit
Author

saisumit commented Jul 27, 2018

Hey @hbredin @yinruiqing @hedonistrh, as I understand it, these hyperparameters are chosen at the end by the final pipeline module that we run. That being said, am I correct that "the SCD training module is not working"? Even if I use the trained model from the 500th epoch, it gives strange coverage results with the default parameters provided by the author:
peak = Peak(alpha=0.5, min_duration=1.0, log_scale=True)
I am assuming that both purity and coverage should lie in the [70, 90] range, going by the results described in this paper, though it is for a slightly different dataset: https://pdfs.semanticscholar.org/edff/b62b32ffcc2b5cc846e26375cb300fac9ecc.pdf

@hedonistrh

Yes, it is for the ETAPE dataset; however, I think the results we get are not in line with the paper.

Also, you can visualize your outputs; that can give some insight.

@bml1g12
Contributor

bml1g12 commented Jul 30, 2018

Here is the validation, computed every 100 iterations:

[screenshot]

And here are the results using the tutorial settings after the 1000th epoch, with min_duration=1.0, on the first test file in AMI:

[screenshot]

A very large minimum duration is required to get reasonable coverage. The results do seem quite different from the ETAPE results in the paper.

@hedonistrh

@bml1g12 Thanks for sharing these results. Yes, according to the paper both metrics should be around 90%. We use a different dataset, but I still think the results are not good.

@hbredin
Member

hbredin commented Jul 30, 2018

Taking a (short) break from my (long) summer break to comment on this issue.

You are actually comparing apples and oranges.

The original paper reports SegmentationPurity and SegmentationCoverage while the above script reports DiarizationPurity and DiarizationCoverage.

You should use SegmentationPurity and SegmentationCoverage to reproduce results in the paper.

More info about the different metrics can be found in the pyannote.metrics paper.
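
In practice the evaluation loop above only needs the metrics swapped out, roughly like this (an untested sketch; the precomputed path is a placeholder, and uem handling is omitted for brevity):

from pyannote.database import get_protocol
from pyannote.audio.features import Precomputed
from pyannote.audio.signal import Peak
from pyannote.metrics.segmentation import SegmentationPurity, SegmentationCoverage

protocol = get_protocol('AMI.SpeakerDiarization.MixHeadset')
precomputed = Precomputed('/path/to/scd')  # precomputed SCD scores
peak = Peak(alpha=0.5, min_duration=1.0, log_scale=True)

purity, coverage = SegmentationPurity(), SegmentationCoverage()
for test_file in protocol.test():
    reference = test_file['annotation']
    scd_scores = precomputed(test_file)
    hypothesis = peak.apply(scd_scores, dimension=1).to_annotation()
    purity(reference, hypothesis)    # accumulate per-file components
    coverage(reference, hypothesis)

# abs(metric) returns the value aggregated over all processed files
print(f'Purity = {100 * abs(purity):.1f}% / Coverage = {100 * abs(coverage):.1f}%')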

@bml1g12
Contributor

bml1g12 commented Jul 31, 2018

Indeed, using SegmentationPurity and SegmentationCoverage I obtain:
[screenshot]

@hedonistrh

@hbredin Thanks for the comment.

@bml1g12 Could you share the hypothesis.to_annotation() result? Also, thanks for sharing these results.

@bml1g12
Contributor

bml1g12 commented Jul 31, 2018

Using alpha=0.2, min_duration=1.0 on the first file EN2002b.Mix-Headset, zoomed in on the first 200 seconds for clarity with notebook.crop = Segment(0, 200):

[screenshot]

@hedonistrh

Thanks for the reply.

I will try to train it again because, as I wrote, I always get 100% coverage.

@bml1g12
Contributor

bml1g12 commented Jul 31, 2018

Maybe you are using a bad epoch. How does the validation coverage for the epoch you are using compare, i.e. in the TensorBoard file?

@bml1g12
Contributor

bml1g12 commented Jul 31, 2018

Ah, I see you're using the 5th epoch, whereas I'm using the 1000th. I suspect that is the issue.

@hedonistrh

hedonistrh commented Jul 31, 2018

I agree with you. My computing resources are limited, so I have only used a few epochs. When I trained for 100 epochs, the loss stopped decreasing after the 10th epoch.

Edit: I have tried the weights from the 50th epoch with alpha set to 0.25. Now the results make sense.

[screenshot: JupyterLab output]

Thanks for the help. 🙏

@hedonistrh

@bml1g12 Hello, could you share the weights file? I cannot train up to the 1000th epoch. :)

@saisumit
Author

saisumit commented Aug 1, 2018

Here you go: https://drive.google.com/open?id=10kLHAOBcsvOUlnC_glYFjpuHV25QpmD7

@hedonistrh

@saisumit Thanks! 🙏 I have tried it; however, I got this error:
CUDA driver version is insufficient for CUDA runtime version

I have no access to an Nvidia GPU and I am using Ubuntu 16.04. The error probably occurs because of this.

@bml1g12
Contributor

bml1g12 commented Aug 2, 2018

I'm afraid mine was also run on a CUDA GPU, so I can't help you there either.

@hbredin
Member

hbredin commented Sep 3, 2018

Closing as it seems that this issue has diverged from the original one.

@hbredin hbredin closed this as completed Sep 3, 2018