[WIP] Feat/vtc #694

hadware · 2021-07-01T16:16:35Z

This is a working PR on the future VTC implementation, inspired by @MarvinLvn 's work, to be merged into the next release of pyannote-audio.

Note: nothing has been done yet, this is just to get things started.

hbredin · 2021-07-08T13:02:30Z

FYI: https://github.com/pyannote/pyannote-audio/blob/fe29b430e0f3380a1471c6ba155f165aabf8096f/tutorials/add_your_own_task.ipynb

codecov · 2021-07-11T00:44:38Z

Codecov Report

Merging #694 (5fe2153) into develop (fd0c42c) will increase coverage by 0.63%.
The diff coverage is 0.00%.

❗ Current head 5fe2153 differs from pull request most recent head a28bacb. Consider uploading reports for the commit a28bacb to get more accurate results

@@             Coverage Diff             @@
##           develop     #694      +/-   ##
===========================================
+ Coverage    37.35%   37.98%   +0.63%     
===========================================
  Files           50       50              
  Lines         3167     3046     -121     
===========================================
- Hits          1183     1157      -26     
+ Misses        1984     1889      -95

| Impacted Files | Coverage Δ | |
|---|---|---|
| pyannote/audio/pipelines/multilabel_detection.py | `0.00% <0.00%> (ø)` | |
| ...io/tasks/segmentation/voice_type_classification.py | `0.00% <0.00%> (ø)` | |
| pyannote/audio/utils/signal.py | `0.00% <0.00%> (-20.39%)` | ⬇️ |
| pyannote/audio/core/inference.py | `62.26% <0.00%> (ø)` | |
| pyannote/audio/pipelines/utils.py | `0.00% <0.00%> (ø)` | |
| pyannote/audio/pipelines/__init__.py | `0.00% <0.00%> (ø)` | |
| pyannote/audio/pipelines/clustering.py | `0.00% <0.00%> (ø)` | |
| pyannote/audio/pipelines/resegmentation.py | `0.00% <0.00%> (ø)` | |
| pyannote/audio/pipelines/speaker_diarization.py | `0.00% <0.00%> (ø)` | |
| pyannote/audio/utils/metric.py | | |
| ... and 2 more | | |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fd0c42c...a28bacb. Read the comment docs.

hadware · 2021-09-30T22:18:32Z

All right, it's currently working on my test dataset. I'll try to test it some more on my fake data, and then on some real data (@MarvinLvn 's and our clinical data) to see if it matches (or hopefully even beats) the former scores.

hbredin · 2021-10-05T08:32:21Z

pyannote/audio/pipelines/multilabel_detection.py

+            class_name: ParamDict(
+                onset=Uniform(0., 1.),
+                offset=Uniform(0., 1.),
+                min_duration_on=Uniform(0., 2.),
+                min_duration_off=Uniform(0., 2.),
+                pad_onset=Uniform(-1., 1.),
+                pad_offset=Uniform(-1., 1.)


In relation to pyannote/pyannote-pipeline#34, this is a good use case for freezing only part of a ParamDict. pad_onset and pad_offset are seldom useful, so it makes sense to reduce the dimension of the hyperparameter search space by freezing them to 0.

hadware · 2021-10-05

Yes! I wondered whether pad_{off,on}set was relevant or not, since you weren't using it in the VAD pipeline. I'll freeze that once it's freezable. (What I could also do is simply not parameterize it?)
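
The effect of freezing pad_onset and pad_offset on the search space can be sketched in plain Python. This is only an illustration: `SEARCH_SPACE`, `freeze` and `sample` are hypothetical helpers, not the pyannote-pipeline API, and the ranges are copied from the diff above.

```python
import random

# Per-class search ranges copied from the diff above: (low, high) bounds.
SEARCH_SPACE = {
    "onset": (0.0, 1.0),
    "offset": (0.0, 1.0),
    "min_duration_on": (0.0, 2.0),
    "min_duration_off": (0.0, 2.0),
    "pad_onset": (-1.0, 1.0),
    "pad_offset": (-1.0, 1.0),
}


def freeze(space, frozen):
    """Split a search space into free ranges and fixed values (hypothetical helper)."""
    unknown = set(frozen) - set(space)
    if unknown:
        raise KeyError(f"cannot freeze unknown parameters: {sorted(unknown)}")
    free = {name: bounds for name, bounds in space.items() if name not in frozen}
    return free, dict(frozen)


def sample(free, fixed, rng):
    """Draw one configuration: uniform for free parameters, constant for frozen ones."""
    params = {name: rng.uniform(low, high) for name, (low, high) in free.items()}
    params.update(fixed)
    return params


free, fixed = freeze(SEARCH_SPACE, {"pad_onset": 0.0, "pad_offset": 0.0})
params = sample(free, fixed, random.Random(0))
print(len(SEARCH_SPACE), "->", len(free), "dimensions to search per class")
```

With one such dictionary per class, freezing two of six parameters removes a third of the dimensions the optimizer has to explore.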

hadware · 2021-10-05T14:24:22Z

I actually thought a bit about things, and it needs a bit more tweaking of the custom "MultilabelFScore" metric I've implemented for the optimization part: it currently doesn't support intersections and unions of classes.
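
One way to read "intersections and unions of classes" is at the frame level: a union class is active when any of its members is active, an intersection class when all of them are. The sketch below makes that assumption. The class names, the `derive` and `fscore` helpers and the toy labels are hypothetical, and this is not the MultilabelFScore implementation from this PR.

```python
def derive(frames, unions=None, intersections=None):
    """Add derived classes to per-frame label sets.

    frames: list of sets, each holding the classes active in one frame.
    unions / intersections: {derived_name: [member classes]}.
    """
    unions = unions or {}
    intersections = intersections or {}
    out = []
    for active in frames:
        labels = set(active)
        for name, members in unions.items():
            if any(m in active for m in members):
                labels.add(name)
        for name, members in intersections.items():
            if all(m in active for m in members):
                labels.add(name)
        out.append(labels)
    return out


def fscore(reference, hypothesis, label):
    """Frame-level F-score for one class (1.0 when the class never occurs)."""
    pairs = list(zip(reference, hypothesis))
    tp = sum(label in r and label in h for r, h in pairs)
    fp = sum(label not in r and label in h for r, h in pairs)
    fn = sum(label in r and label not in h for r, h in pairs)
    if tp + fp + fn == 0:
        return 1.0
    return 2 * tp / (2 * tp + fp + fn)


# Toy data with hypothetical class names.
unions = {"ADULT": ["MALE", "FEMALE"]}
intersections = {"OVERLAP": ["MALE", "FEMALE"]}
ref = derive([{"MALE"}, {"MALE", "FEMALE"}, {"FEMALE"}, set()], unions, intersections)
hyp = derive([{"MALE"}, {"FEMALE"}, {"FEMALE"}, {"MALE"}], unions, intersections)
scores = {c: fscore(ref, hyp, c) for c in ["MALE", "FEMALE", "ADULT", "OVERLAP"]}
```

Deriving the extra classes before scoring keeps the per-class metric unchanged: unions and intersections are just additional labels evaluated like any other.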

stale · 2021-12-07T18:22:21Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

hbredin · 2021-12-08T07:45:01Z

Hey @hadware, the stale bot seems unhappy with the status of this PR... Any update on your side?

hadware · 2021-12-08T09:48:01Z

Ah! Well, I'm not planning on giving up on this, but I'm currently a bit overwhelmed by some other work. The dust should settle by mid-December, but I might need some input from you in the meantime to write a proper test of the new pipeline: how do you advise that I test it now that the pyannote-audio CLI endpoint is "deprecated"?

hbredin · 2021-12-09T14:59:43Z

> Ah! Well, I'm not planning on giving up on this, but I'm currently a bit overwhelmed by some other work. The dust should settle by mid-December, but I might need some input from you in the meantime to write a proper test of the new pipeline: how do you advise that I test it now that the pyannote-audio CLI endpoint is "deprecated"?

Tasks are currently tested by simply checking that training is not broken (i.e. does not raise any Exception).
This is done by running notebooks available in /notebook/ directory. You could add your task to the example.ipynb notebook for example.

As far as pipelines are concerned, they are simply not tested for now 👎

hadware · 2022-01-07T19:11:06Z

Sorry for the "supremely concise" commit history, I tried to rebase on your develop branch but really bad things happened. Git is still an unforgiving ~~master~~ main for me, it seems.

hadware · 2022-01-11T19:44:40Z

Small question needed for model testing: what would you advise that I use for checkpointing? PyTorch Lightning's way? I can't really figure out how you're "expecting" it to be done in pyannote's current iteration.

hadware · 2022-01-11T19:57:36Z

Never mind, I figured I should use the pyannote-audio-train CLI endpoint.

hadware · 2022-01-12T10:41:24Z

Sorry for the dumb-ish question, but I've never used Hydra on my own so I'm not very familiar with it. Here's the problem:

- From what I've understood, since I've added a VoiceTypeClassification.yaml "subconfig" to pyannote, I'm mostly good to go.
- From what I've understood, if users want to "override" some of these values, they can just do so using the CLI.
- However, there are some parameters that are mandatory for this task, cannot have default values and shouldn't be defined in the CLI: classes, class unions and class intersections.
- In my opinion, these should be defined in a user-specified YAML file given as an input when running the pipeline via the CLI, and that YAML config should "overload" the one already specified for the VoiceTypeClassification task.

So here's my question:

- What's the structure of such a YAML config file?
- How do you feed it to Hydra's CLI?
(sorry that's actually two questions, but you get the point).
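
The "overloading" behaviour described above amounts to a recursive merge in which user values win over defaults. The sketch below shows that idea with plain dictionaries. It does not use Hydra, and every key and value in `defaults` and `user` is made up for illustration rather than taken from VoiceTypeClassification.yaml.

```python
import copy


def deep_merge(defaults, overrides):
    """Return a new dict where `overrides` wins over `defaults`, recursing into nested dicts."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical defaults shipped with the task.
defaults = {"task": {"duration": 2.0, "batch_size": 32}}

# Hypothetical user file: mandatory fields with no sensible default, plus one override.
user = {
    "task": {
        "batch_size": 64,
        "classes": ["A", "B", "C"],
        "unions": {"A_OR_B": ["A", "B"]},
        "intersections": {"A_AND_B": ["A", "B"]},
    }
}

config = deep_merge(defaults, user)
```

Here `config["task"]` keeps the default `duration`, takes `batch_size` from the user file, and gains the three mandatory fields, while `defaults` itself is left untouched.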

hbredin · 2022-01-12T20:49:42Z

Hmmm. I am not sure why you need to use the CLI for testing the model.
This should be enough:

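from pyannote.audio.models.segmentation import PyanNet
from pytorch_lightning import Trainer

# VoiceTypeClassification is the task added by this PR; its arguments are elided here
task = VoiceTypeClassification(...)
model = PyanNet(task=task)
trainer = Trainer(max_epochs=1)  # one epoch is enough to check that training runs
trainer.fit(model)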

Also, what do you mean by "model testing"?

hadware · 2022-01-13T11:07:09Z

Sorry, by testing I meant "using the new VTC task + pipeline on our data to see if it reproduces our well-known results, or even beats them". This means training, validation and testing. I mostly need help with the training part (which I intend to do using the CLI, cf. my previous post).

hbredin · 2022-01-13T13:01:30Z

Don't bother using the CLI -- try something like this:

from pyannote.database import get_protocol
dataset = get_protocol('YourDataset.SpeakerDiarization.YourProtocol')

from pyannote.audio.tasks import VoiceActivityDetection
vad = VoiceActivityDetection(dataset)

from pyannote.audio.models.segmentation import PyanNet
model = PyanNet(task=vad, sincnet={"stride": 10})

from pytorch_lightning.callbacks import EarlyStopping
from pytorch_lightning.callbacks.model_checkpoint import ModelCheckpoint
from pytorch_lightning.loggers import TensorBoardLogger

value_to_monitor, min_or_max = vad.val_monitor

model_checkpoint = ModelCheckpoint(
    monitor=value_to_monitor, 
    mode=min_or_max, 
    save_top_k=5, 
    every_n_epochs=1, 
    save_last=True, 
    dirpath=".", 
    filename=f"{{epoch}}-{{{value_to_monitor}:.6f}}",
    verbose=True)

early_stopping = EarlyStopping(
    monitor=value_to_monitor,
    mode=min_or_max,
    min_delta=0.0,
    patience=10,
    strict=True,
    verbose=False) 

logger = TensorBoardLogger(".", name="", version="", log_graph=False)

from pytorch_lightning import Trainer
trainer = Trainer(gpus=1, callbacks=[model_checkpoint, early_stopping], logger=logger)
trainer.fit(model)

hadware · 2022-01-29T18:39:50Z

Thanks a lot, this was very helpful!

I have some good news: the whole pipeline seems to be working and giving out nice results. Here's a small table that i'm going to update as we run more tests and might fix some things:

| pyannote version | Model | Dataset | Data augmentation | Best epoch | Tuning iterations | Tuning metric | Fscore | IER |
|---|---|---|---|---|---|---|---|---|
| V1 | PyanNet | Clinical Itws | Noise (MUSAN) | 100 | - | Fscore | 86.6 | 19.6 |
| V2 | PyanNet | Clinical Itws | None | 38 | 50 | IER | 86.80 | 20.69 |
| V2 | PyanNet | Clinical Itws | None | 38 | 50 | Fscore | 86.49 | 21.47 |

Note: on V1, I used the cyclic learning rate scheduler, as per @MarvinLvn 's worthy advice. On V2, nothing in the training part of the experiment has been grid-searched or otherwise tuned: I basically used what you've given and didn't question anything. In my opinion, these results are promising since I haven't yet tried adding data augmentation, tweaking the learning rate, etc.

stale · 2022-04-04T12:47:03Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

hbredin mentioned this pull request Jul 2, 2021

Add more tasks #696

Closed

3 tasks

hadware closed this Jul 10, 2021

hadware force-pushed the feat/vtc branch from b7e5d4d to 2ecdc23 Compare July 10, 2021 15:41

hadware reopened this Jul 11, 2021

hbredin reviewed Oct 5, 2021

hbredin mentioned this pull request Oct 18, 2021

Train models for gender and age recognition #774

Closed

stale bot added the wontfix label Dec 7, 2021

stale bot removed the wontfix label Dec 8, 2021

hadware force-pushed the feat/vtc branch from 897ee81 to e16b900 Compare January 7, 2022 18:55

Re-added files from backup branch

5fe2153

hadware force-pushed the feat/vtc branch from e16b900 to 5fe2153 Compare January 7, 2022 19:09

Re-added to init

1d59d44

Re-added __init__ references, re-added VoiceTypeClassification.yaml default config

7ebbc74

Merge branch 'develop' into feat/vtc

14998e6

Merge branch 'develop' into feat/vtc

fd1e3bc

hadware added 3 commits January 28, 2022 16:25

Fixing Fscore metric, fixing MultilabelPipeline apply code

5cc74da

Merge remote-tracking branch 'origin/feat/vtc' into feat/vtc

98c97d2

Fixed multilabel pipeline apply method.

934cf01

hbredin and others added 5 commits January 31, 2022 09:18

Merge branch 'develop' into feat/vtc

ab7112e

Fixing imports

1f0c63e

Merge remote-tracking branch 'origin/feat/vtc' into feat/vtc

a5947c3

# Conflicts: # pyannote/audio/pipelines/multilabel_detection.py

Fixing imports (again)

b0ec1a2

Fixing imports (again^2)

a28bacb

hadware mentioned this pull request Mar 8, 2022

[WIP] Multilabel Detection #891

Merged

3 tasks

stale bot added the wontfix label Apr 4, 2022

stale bot closed this Apr 11, 2022
