[WIP] Feat/vtc #694
Conversation
Codecov Report

```
@@            Coverage Diff             @@
##           develop     #694      +/-   ##
===========================================
+ Coverage    37.35%   37.98%   +0.63%
===========================================
  Files           50       50
  Lines         3167     3046     -121
===========================================
- Hits          1183     1157      -26
+ Misses        1984     1889      -95
```

Continue to review full report at Codecov.
All right, it's currently working on my test dataset. I'll try to test it some more on my fake data, and then on some real data (@MarvinLvn's and our clinical data) to see if it matches (or hopefully even beats) the former scores.
```python
class_name: ParamDict(
    onset=Uniform(0., 1.),
    offset=Uniform(0., 1.),
    min_duration_on=Uniform(0., 2.),
    min_duration_off=Uniform(0., 2.),
    pad_onset=Uniform(-1., 1.),
    pad_offset=Uniform(-1., 1.),
)
```
In relation with pyannote/pyannote-pipeline#34, this is a good use case for freezing only parts of `ParamDict`. `pad_onset` and `pad_offset` are seldom useful, and it makes sense to reduce the dimension of the hyperparameter search space by freezing them to 0.
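As a toy illustration of why that helps (this is deliberately not the pyannote-pipeline API, whose partial-freezing interface is what #34 discusses; the grids below are made-up values): pinning two of six hyperparameters removes two dimensions from the space an optimizer has to explore.

```python
import itertools

# Hypothetical discretized search space, mirroring the six ParamDict
# hyperparameters above (values are arbitrary illustration grids).
search_space = {
    "onset": [0.3, 0.5, 0.7],
    "offset": [0.3, 0.5, 0.7],
    "min_duration_on": [0.0, 1.0],
    "min_duration_off": [0.0, 1.0],
    "pad_onset": [-0.5, 0.0, 0.5],
    "pad_offset": [-0.5, 0.0, 0.5],
}

def grid(space, frozen=None):
    """Enumerate hyperparameter combinations, pinning frozen params to a fixed value."""
    frozen = frozen or {}
    free = {k: v for k, v in space.items() if k not in frozen}
    keys = list(free)
    for values in itertools.product(*(free[k] for k in keys)):
        point = dict(zip(keys, values))
        point.update(frozen)
        yield point

full = list(grid(search_space))
reduced = list(grid(search_space, frozen={"pad_onset": 0.0, "pad_offset": 0.0}))
print(len(full), len(reduced))  # 324 vs 36: a 9x smaller search space
```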
Yes! I wondered whether `pad_{off,on}set` was relevant, since you weren't using it in the VAD pipeline. I'll freeze it once it's freezable. (Alternatively, I could simply not parameterize it?)
I actually thought a bit about things, and it needs a bit more tweaking of the "custom" `MultilabelFScore` metric I've implemented for the optimization part: it currently doesn't support intersections and unions of classes.
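For what it's worth, the missing feature can be sketched at the frame level like this (a hypothetical illustration, not the actual `MultilabelFScore` implementation; the helper names and class tracks are invented): a derived class can be scored as the frame-wise union of base classes.

```python
def f_score(reference, hypothesis):
    """F1 over two aligned binary frame sequences."""
    tp = sum(r and h for r, h in zip(reference, hypothesis))
    fp = sum((not r) and h for r, h in zip(reference, hypothesis))
    fn = sum(r and (not h) for r, h in zip(reference, hypothesis))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def union(*tracks):
    """Frame-wise union of several binary class tracks."""
    return [any(frames) for frames in zip(*tracks)]

# Two made-up base classes (e.g. key-child and female adult speech), frame by frame.
ref_kchi = [1, 1, 0, 0, 0, 0]
ref_fem  = [0, 0, 1, 1, 0, 0]
hyp_kchi = [1, 0, 0, 0, 0, 0]
hyp_fem  = [0, 0, 1, 1, 1, 0]

# Score the union class "speech" instead of each base class separately.
score = f_score(union(ref_kchi, ref_fem), union(hyp_kchi, hyp_fem))
print(score)  # 0.75
```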
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Hey @hadware, the
Ah! Well, I'm not planning on giving up on this, but I'm currently a bit overwhelmed by some other work. The dust should settle by mid-December, but I might need some input from you in the meantime to write a proper test of the new pipeline: how do you advise that I test it now that the
Tasks are currently tested by simply checking that training is not broken (i.e. does not raise any Exception). As far as pipelines are concerned, they are simply not tested for now 👎
Sorry for the "supremely concise" commit history; I tried to rebase on your develop branch, but really bad things happened. Git is still an unforgiving
Small question needed for model testing: what would you advise I use for checkpointing? PyTorch Lightning's way? I can't really figure out how you're "expecting" it to be done in pyannote's current iteration.
Never mind, I figured I should use the
Sorry for the dumb-ish question, but I've never used Hydra on my own, so I'm not very familiar with it. Here's the problem:

So here's my question:
Hmmm. I am not sure why you need to use the CLI for testing the model.

```python
task = VoiceTypeClassification(...)
model = PyanNet(task=task)
trainer = Trainer(max_epochs=1)
trainer.fit(model)
```

Also, what do you mean by "model testing"?
Sorry, by testing I meant "using the new VTC task + pipeline on our data to see if it reproduces our well-known results, or even beats them". This means training, validation, and testing. I mostly need help on the training part (which I intend to do using the CLI, c.f. my previous post).
Don't bother using the CLI -- try something like this:

```python
from pyannote.database import get_protocol
dataset = get_protocol('YourDataset.SpeakerDiarization.YourProtocol')

from pyannote.audio.tasks import VoiceActivityDetection
vad = VoiceActivityDetection(dataset)

from pyannote.audio.models.segmentation import PyanNet
model = PyanNet(task=vad, sincnet={"stride": 10})

from pytorch_lightning.callbacks import EarlyStopping
from pytorch_lightning.callbacks.model_checkpoint import ModelCheckpoint
from pytorch_lightning.loggers import TensorBoardLogger

value_to_monitor, min_or_max = vad.val_monitor

model_checkpoint = ModelCheckpoint(
    monitor=value_to_monitor,
    mode=min_or_max,
    save_top_k=5,
    every_n_epochs=1,
    save_last=True,
    dirpath=".",
    filename=f"{{epoch}}-{{{value_to_monitor}:.6f}}",
    verbose=True,
)

early_stopping = EarlyStopping(
    monitor=value_to_monitor,
    mode=min_or_max,
    min_delta=0.0,
    patience=10,
    strict=True,
    verbose=False,
)

logger = TensorBoardLogger(".", name="", version="", log_graph=False)

from pytorch_lightning import Trainer
trainer = Trainer(gpus=1, callbacks=[model_checkpoint, early_stopping], logger=logger)
trainer.fit(model)
```
Thanks a lot, this was very helpful! I have some good news: the whole pipeline seems to be working and giving nice results. Here's a small table that I'm going to update as we run more tests and fix some things:
Note: on V1, I used the cyclic learning-rate scheduler, as per @MarvinLvn's worthy advice. On V2, nothing in the training part of the experiment has been grid-searched; I basically used what you gave me and didn't question anything. In my opinion, these results are promising, since I haven't yet started to add data augmentation, tweak the learning rate, etc.
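As a side note, the cyclic (triangular) schedule mentioned above has the shape sketched below; the `base_lr`, `max_lr`, and `step_size` values here are hypothetical illustrations, not the ones used in V1 (the real thing would be `torch.optim.lr_scheduler.CyclicLR`).

```python
import math

def cyclic_lr(step, base_lr=1e-4, max_lr=1e-2, step_size=100):
    """Triangular cyclic policy: lr ramps base->max over step_size steps, then back."""
    cycle = math.floor(1 + step / (2 * step_size))
    x = abs(step / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1 - x)

# Peak at step_size, back to base_lr at the end of each full cycle.
print(cyclic_lr(0), cyclic_lr(100), cyclic_lr(200))  # 0.0001 0.01 0.0001
```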
```
# Conflicts:
#	pyannote/audio/pipelines/multilabel_detection.py
```
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This is a working PR on the future VTC implementation, inspired by @MarvinLvn's work, to be merged into the next release of pyannote-audio.
Note: nothing has been done yet; this is just to get things started.