Trying the diarization pipeline on random .wav files #114
Comments
I'm not the developer, but if you use this method I would think you will need to regenerate the precomputed MFCCs for each step. That being said, I suspect there is a more elegant way of using the pipeline on a new .wav without making a new database, which would instead use the underlying Python API; I've done this for embeddings but haven't used the pipeline yet.
Actually, I renamed the .wav to ES2003a, which is the first file for which MFCCs, raw SAD values, and SCD values are generated, so you can simply break the loop after that and still get the required files. It would be great if you could tell me how to do it with the complete pipeline. That being said, am I actually doing anything wrong? The diarization results are simply useless in comparison to the LIUM / AALTO diarization libraries, which are much more robust.
Hi, I had a similar problem with speaker change detection: it always said there was no change for the whole file (#111). I suggest that you first check your training for speaker change detection. Maybe you have the same kind of problem as me.
As I said, it works perfectly on the AMI dataset, so I don't think the problem is there.
Hi,
I ask because of what you said. If you got a 100% coverage result from the script, that can be the cause of your problem.
Hey, I tried out your suggestion; it seems like I am getting 1% coverage. Any idea what can be done?
Thanks for the reply. Could you look at your hypothesis.to_annotation() result? It would give us some idea. It is quite interesting: we are trying somewhat the same thing, yet we get different results. I think the only difference comes from here: I use weights from the 5th epoch.
In fact, if you choose different thresholds in Peak, you'll get different results. Usually, the threshold is chosen by validation. If you don't want to do validation, you can try different thresholds yourself and plot the coverage and purity curve.
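To make the threshold discussion concrete, here is a toy sketch (pure Python, not pyannote's actual `Peak` class) of how a peak-picking threshold over a speaker-change score curve controls how many change points you get; the score values are made up for illustration:

```python
# Toy sketch (NOT pyannote's Peak class): pick speaker-change peaks
# from a 1-D score curve, keeping local maxima above a threshold.
def pick_peaks(scores, threshold):
    """Return indices of local maxima in `scores` that exceed `threshold`."""
    peaks = []
    for i in range(1, len(scores) - 1):
        if scores[i] > threshold and scores[i] >= scores[i - 1] and scores[i] > scores[i + 1]:
            peaks.append(i)
    return peaks

# Hypothetical per-frame speaker-change scores.
scores = [0.1, 0.8, 0.2, 0.3, 0.95, 0.4, 0.05, 0.6, 0.1]

# A low threshold yields many change points (shorter segments: higher
# segmentation purity, lower coverage); a high threshold yields few
# (longer segments: lower purity, higher coverage).
print(pick_peaks(scores, 0.5))  # three change points
print(pick_peaks(scores, 0.9))  # one change point
```

Sweeping the threshold over a range and computing purity/coverage at each value gives the curve mentioned above.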
Thanks for the interest. When I tried this and looked at the output, it showed that there are always changes.
If I try this, it gives 100% coverage.
Hey @hbredin @yinruiqing @hedonistrh, as I understand it, these hyperparameters are chosen at the end by the final pipeline module that we run. That being said, am I correct that the SCD training module is not working? Because even if I use the trained model at the 500th epoch, it gives strange coverage results using the default parameters provided by the author:
Yes, it is for the ETAPE dataset; however, I think the results we get are not appropriate according to the paper. Also, you can visualize your outputs. That can give some idea.
@bml1g12 Thanks for sharing these results. Yes, according to the paper both metrics should be around 90%. We use a different dataset, but I still think the results are not good.
Taking a (short) break from my (long) summer break to comment on this issue. You are actually comparing apples and oranges. The original paper reports SegmentationPurity and SegmentationCoverage, while the above script reports DiarizationPurity and DiarizationCoverage. You should use the former. More info about the different metrics can be found in the pyannote.metrics paper.
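To illustrate the distinction (a toy pure-Python version, not the pyannote.metrics implementation): segmentation coverage credits each reference segment with the largest fraction of it covered by a single hypothesis segment, so a hypothesis that never detects any change trivially reaches 100% coverage. The segment boundaries below are made-up numbers:

```python
# Toy illustration (NOT pyannote.metrics): segments are (start, end) pairs.
def overlap(a, b):
    """Duration of the intersection of two segments."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

def segmentation_coverage(reference, hypothesis):
    """For each reference segment, take the largest fraction covered by a
    single hypothesis segment; aggregate weighted by reference duration."""
    covered = sum(max(overlap(ref, hyp) for hyp in hypothesis) for ref in reference)
    total = sum(end - start for start, end in reference)
    return covered / total

reference = [(0.0, 10.0), (10.0, 20.0)]

# A hypothesis with no detected change (one long segment) gets 100%
# coverage, which is why 100% coverage alone is a red flag:
print(segmentation_coverage(reference, [(0.0, 20.0)]))  # 1.0

# Segmentation purity is the symmetric quantity (swap the two roles):
print(segmentation_coverage([(0.0, 20.0)], reference))  # 0.5
```

This is why the always-100%-coverage results reported earlier in this thread suggest the SCD model is predicting no changes at all, and why purity must be checked alongside coverage.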
@hbredin Thanks for the comment. @bml1g12 Could you share your hypothesis.to_annotation() result? Also, thanks for sharing these results.
Thanks for the reply. I will try to train it again, because, as I wrote, I always get 100% coverage.
Maybe you are using a bad epoch. How does the validation coverage for the epoch you are using compare, i.e. in the TensorBoard file?
Ah, I see you're using the 5th epoch, whereas I'm using the 1000th; I suspect that is the issue.
@bml1g12 Hello, could you share the weights file? I cannot train until the 1000th epoch. :)
@saisumit Thanks! 🙏 I have tried it, but I got this error. I have no access to an Nvidia GPU and I am using Ubuntu 16.04. Probably the error occurs because of this.
I'm afraid mine was also run on a CUDA GPU, so I can't help you either.
Closing as it seems that this issue has diverged from the original one. |
Hey, as suggested by the detailed tutorials, I went through them and trained all the models required for the pipeline. The pipeline works on the AMI dataset, but when I try to reproduce the results on other .wav files sampled at 16 kHz, mono, 256 kbps, it is not able to diarize the audio.
Here is a brief summary of what I actually did.
Output:
Can you please tell me whether the pipeline is giving wrong diarization outputs because I replaced the actual file, and what is a better way to test the pipeline on arbitrary audio files?
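One easy thing to rule out first is a format mismatch: a quick stdlib-only sanity check (sketch, not part of pyannote; `"test.wav"` below is a placeholder path, and the expected format of 16 kHz / mono / 16-bit is an assumption based on the issue description) that the file really matches what the models were trained on:

```python
# Sanity-check a .wav before feeding it to the pipeline (stdlib only).
import wave

def check_wav(path, rate=16000, channels=1, sampwidth=2):
    """Return a list of format problems; an empty list means the file
    matches the expected rate / channel count / sample width."""
    with wave.open(path, "rb") as w:
        problems = []
        if w.getframerate() != rate:
            problems.append(f"sample rate {w.getframerate()} != {rate}")
        if w.getnchannels() != channels:
            problems.append(f"{w.getnchannels()} channel(s), expected {channels}")
        if w.getsampwidth() != sampwidth:
            problems.append(f"sample width {w.getsampwidth()} bytes, expected {sampwidth}")
    return problems

# Usage (placeholder path):
# problems = check_wav("test.wav")
# if problems: print("\n".join(problems))
```

If the file passes this check, the problem more likely lies in the trained models or the renamed-file trick rather than in the audio itself.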