This PR adds a voice activity detection pipeline that is fully compatible with all of diart's features:
- `Benchmark` on a dataset and computing the detection error rate (instead of the diarization error rate)
- `Benchmark` in parallel
- (`tau_active`)

This is also implemented in the CLI and can be changed using the `--pipeline` argument, which requires the name of the pipeline to run. For the time being, the only possible options are `SpeakerDiarization` and `VoiceActivityDetection`. More to come in the future!

A significant refactoring was needed to squeeze in this feature, so backward compatibility with v0.7 is not guaranteed.
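As a rough illustration of the `--pipeline` selection described above, here is a minimal sketch of mapping a pipeline name to a class with `argparse`. The two classes are empty stand-ins, and the `PIPELINES` registry, the `select_pipeline` helper and the default value are assumptions made for this example, not diart's actual CLI code.

```python
import argparse

# Hypothetical stand-ins for the two pipelines named in this PR.
# The real classes live in diart; these empty classes only illustrate the lookup.
class SpeakerDiarization:
    pass


class VoiceActivityDetection:
    pass


# Assumed registry keyed by the names accepted on the command line.
PIPELINES = {
    "SpeakerDiarization": SpeakerDiarization,
    "VoiceActivityDetection": VoiceActivityDetection,
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Pipeline selection sketch")
    parser.add_argument(
        "--pipeline",
        default="SpeakerDiarization",  # assumed default, not taken from the PR
        choices=sorted(PIPELINES),
        help="Name of the pipeline to run",
    )
    return parser


def select_pipeline(argv):
    """Parse argv and return the pipeline class that was requested."""
    args = build_parser().parse_args(argv)
    return PIPELINES[args.pipeline]
```

Using `choices` makes `argparse` reject unknown names before any model is loaded, which keeps the set of supported pipelines in one place as more are added.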
I also renamed some major classes so that they are clearer:

- `BasePipeline` becomes `Pipeline`
- `BasePipelineConfig` becomes `PipelineConfig`
- `OnlineSpeakerDiarization` becomes `SpeakerDiarization` (all pipelines are online)
- `PipelineConfig` becomes `SpeakerDiarizationConfig` (there are 2 pipelines now)
- `RealTimeInference` becomes `StreamingInference` ("real-time" depends on hardware, I'm more comfortable with "streaming")
- `RealTimePlot` becomes `StreamingPlot` (same as above)

Changelog
- `VoiceActivityDetection` pipeline with its `VoiceActivityDetectionConfig`
- `Pipeline`, `PipelineConfig` and `HyperParameter` to `diart.blocks.base`
- `--pipeline` argument to CLI so the user can select a different pipeline to run, optimize, evaluate, etc.
- `Pipeline` must be able to suggest an evaluation metric if none is provided
- `metric: pyannote.metrics.BaseMetric` parameter to `Optimizer` and `Benchmark.__call__()`
- `direction: Literal["minimize", "maximize"]` parameter to `Optimizer`
- `setup.cfg`
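To make the `metric` and `direction` items above more concrete, here is a small self-contained sketch of how a suggested metric and an optimization direction could interact when tuning `tau_active`. It does not use diart or pyannote: `detection_error_rate` is a frame-level stand-in for a `pyannote.metrics.BaseMetric` object, and `binarize`, `best_threshold` and the `>=` thresholding rule are assumptions made for illustration only.

```python
from typing import Callable, Literal, Optional, Sequence


def detection_error_rate(reference: Sequence[int], hypothesis: Sequence[int]) -> float:
    """(false alarms + missed detections) / reference speech, counted in frames."""
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")
    false_alarm = sum(1 for r, h in zip(reference, hypothesis) if h and not r)
    missed = sum(1 for r, h in zip(reference, hypothesis) if r and not h)
    total = sum(1 for r in reference if r)
    if total == 0:
        raise ValueError("reference contains no speech")
    return (false_alarm + missed) / total


def binarize(scores: Sequence[float], tau_active: float) -> list:
    """Mark a frame as speech when its score reaches tau_active (assumed rule)."""
    return [1 if s >= tau_active else 0 for s in scores]


def best_threshold(
    scores: Sequence[float],
    reference: Sequence[int],
    candidates: Sequence[float],
    metric: Optional[Callable[[Sequence[int], Sequence[int]], float]] = None,
    direction: Literal["minimize", "maximize"] = "minimize",
) -> float:
    """Return the candidate tau_active with the best metric value."""
    if metric is None:
        # No metric provided: fall back to the one this sketch "suggests".
        metric = detection_error_rate
    if direction not in ("minimize", "maximize"):
        raise ValueError(f"unknown direction: {direction!r}")
    pick = min if direction == "minimize" else max
    return pick(candidates, key=lambda tau: metric(reference, binarize(scores, tau)))
```

An error rate should be minimized, whereas a metric such as accuracy should be maximized, which is why the direction has to be passed alongside a custom metric.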
Notes on performance
Using `pyannote/segmentation`, `duration=5s`, `step=0.5s`, `latency=5s` and `tau_active=0.507`, the performance on AMI MixHeadset is: