Online speaker diarization as a block #92

juanmc2005 · 2022-09-08T09:23:17Z

This PR addresses issues #83 and #84.

Changelog

OnlineSpeakerDiarization is now independent from RxPY and can be used as a block (cc @hbredin)
- It can receive a single waveform or a list of them (for batched inference)
- This class is now stateful (it keeps the speaker centroids), so the system can be initialized with centroids from a previous pipeline
from diart import OnlineSpeakerDiarization, PipelineConfig
AudioLoader cannot split files into audio chunks anymore
Emit additional chunks with zero padding at the end of file streams so the pipeline output is fully aggregated
- Before, this was done by notifying DelayedAggregation of the stream duration, then it would concatenate the last non-aggregated output
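The padding idea can be sketched in plain Python. This is only an illustration under simplifying assumptions, not diart's implementation: `sliding_chunks_with_padding` and its parameters are hypothetical names, and samples are plain floats rather than waveforms. The window keeps sliding while it still contains at least one real sample, and any chunk that runs past the end of the file is filled with zeros so all chunks have the same length.

```python
from typing import Iterator, List


def sliding_chunks_with_padding(
    samples: List[float], chunk_size: int, step: int
) -> Iterator[List[float]]:
    """Yield fixed-size chunks; zero-pad chunks that run past the end (hypothetical helper)."""
    if chunk_size <= 0 or step <= 0:
        raise ValueError("chunk_size and step must be positive")
    start = 0
    # Keep sliding while the window still contains at least one real sample.
    while start < len(samples):
        chunk = samples[start:start + chunk_size]
        # Pad with zeros so every emitted chunk has the same length.
        chunk = chunk + [0.0] * (chunk_size - len(chunk))
        yield chunk
        start += step


chunks = list(
    sliding_chunks_with_padding([1.0, 2.0, 3.0, 4.0, 5.0], chunk_size=4, step=2)
)
```

With the extra padded chunks, a downstream aggregation step sees the tail of the file covered by several windows without needing to know the stream duration in advance.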
Fix bug: EmbeddingNormalization was squeezing the output
URIs are no longer required by blocks or output annotations; instead, they're added by sinks
OnlineSpeakerDiarization does not split the stream into chunks anymore, nor does it resample chunks dynamically
- This is now done by RealTimeInference, which still uses RxPY as it is a higher level API
RealTimeInference can now do batched inference and includes new parameters
Benchmark reuses RealTimeInference internally (huge win here)
regularize_audio_stream() renamed to rearrange_audio_stream(), as the notion of regularity is not very clear here
diart.pipelines does not exist anymore
Audio sources don't have a length property anymore
Remove PrecalculatedFeaturesAudioSource
Customizable reading block size in most audio sources
When possible, AudioSource block size is set to the step size
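To see why matching the block size to the step size is convenient, here is a rough sketch of rearranging incoming blocks into overlapping chunks. It is not the code of `rearrange_audio_stream()`: the `rearrange_blocks` function, its arguments, and the use of plain lists are all assumptions made for illustration.

```python
from typing import Iterable, Iterator, List


def rearrange_blocks(
    blocks: Iterable[List[float]], chunk_size: int, step: int
) -> Iterator[List[float]]:
    """Buffer incoming blocks and yield overlapping chunks (hypothetical helper)."""
    if chunk_size <= 0 or step <= 0:
        raise ValueError("chunk_size and step must be positive")
    buffer: List[float] = []
    for block in blocks:
        buffer.extend(block)
        # Emit every chunk that is already complete, then slide by one step.
        while len(buffer) >= chunk_size:
            yield buffer[:chunk_size]
            del buffer[:step]


# Blocks whose size equals the step: once the buffer is full,
# each new block produces exactly one new chunk.
aligned = list(rearrange_blocks([[1, 2], [3, 4], [5, 6], [7, 8]], chunk_size=4, step=2))

# Blocks of arbitrary size yield the same chunks, but in irregular bursts.
irregular = list(rearrange_blocks([[1, 2, 3], [4, 5, 6, 7, 8]], chunk_size=4, step=2))
```

In this sketch both inputs produce the same chunks; the aligned case simply spreads them evenly over the incoming blocks.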
Add OnlineSpeakerDiarization.reset() to reset internal state (centroids and buffers)
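The combination of internal state, batched input, and `reset()` can be illustrated with a toy block. Everything here is hypothetical (`ToyStatefulBlock`, the distance threshold, embeddings as lists of floats) and much simpler than the real pipeline; it only shows the calling pattern of a callable object that remembers centroids between calls.

```python
from typing import Dict, List, Optional, Sequence


class ToyStatefulBlock:
    """Hypothetical stand-in for a stateful pipeline block (not diart's class)."""

    def __init__(
        self,
        threshold: float = 1.0,
        centroids: Optional[Dict[int, List[float]]] = None,
    ):
        self.threshold = threshold
        # Copy the initial centroids so the caller's dictionary is not mutated.
        self.centroids = {k: list(v) for k, v in (centroids or {}).items()}

    def __call__(self, embeddings: Sequence[Sequence[float]]) -> List[int]:
        """Label each embedding in the batch, creating new speakers when needed."""
        labels = []
        for emb in embeddings:
            best, best_dist = None, float("inf")
            for speaker, centroid in self.centroids.items():
                dist = sum((a - b) ** 2 for a, b in zip(emb, centroid)) ** 0.5
                if dist < best_dist:
                    best, best_dist = speaker, dist
            if best is None or best_dist > self.threshold:
                # No centroid is close enough: register a new speaker.
                best = max(self.centroids, default=-1) + 1
                self.centroids[best] = list(emb)
            labels.append(best)
        return labels

    def reset(self) -> None:
        """Forget all centroids so the next call starts from scratch."""
        self.centroids = {}


block = ToyStatefulBlock(threshold=1.0)
first = block([[0.0, 0.0], [5.0, 5.0]])  # a batch with two new speakers
second = block([[0.1, 0.0]])             # matched against the stored centroids
block.reset()
third = block([[5.0, 5.0]])              # state was cleared before this call

# Warm start with centroids taken from a previous run.
warm = ToyStatefulBlock(centroids={0: [5.0, 5.0]})
warm_labels = warm([[5.0, 5.1]])
```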
Add AudioSource.close() to correctly handle termination from external causes
Avoid confusion with Rx observers not being called with do and do_action
- Replace diart.operators.profile with a stateful diart.utils.Chronometer
- Absorb diart.operators.progress in RealTimeInference
- RealTimeInference now handles all the complexity of Rx
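As a rough idea of what a stateful timer looks like compared to a side-effect operator, here is a minimal sketch. It is not the code of `diart.utils.Chronometer`: the `SimpleChronometer` class, its methods, and the injectable `clock` argument are assumptions for illustration. The timer is started and stopped explicitly around the work, so it does not depend on an observer being subscribed.

```python
import time
from statistics import mean
from typing import Callable, List, Optional


class SimpleChronometer:
    """Hypothetical stateful timer that records one duration per start/stop pair."""

    def __init__(self, clock: Callable[[], float] = time.perf_counter):
        self._clock = clock
        self._start: Optional[float] = None
        self.history: List[float] = []

    def start(self) -> None:
        self._start = self._clock()

    def stop(self) -> None:
        if self._start is None:
            raise RuntimeError("stop() called before start()")
        self.history.append(self._clock() - self._start)
        self._start = None

    def mean_duration(self) -> float:
        """Average duration in seconds, or 0.0 if nothing was timed."""
        return mean(self.history) if self.history else 0.0


timer = SimpleChronometer()
for batch in ([1, 2, 3], [4, 5, 6]):
    timer.start()
    _ = sum(x * x for x in batch)  # stand-in for one inference step
    timer.stop()
```

Because the durations live in `history`, a report can still be produced from whatever was recorded if processing stops early.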

RAM usage during inference is considerably reduced (~30%)
Runtime doesn't seem to be impacted


juanmc2005 added 5 commits September 7, 2022 14:18

Make OnlineSpeakerDiarization a block. Add on-the-fly batched inference

34968cb

Replace stream end parameter in DelayedAggregation by audio padding at the end of a file stream

3b5f0bd

Set specific block size for some audio sources

cad2cc3

Fix aggregation error not filling buffer

5a8bf80

Add some useful comments

8b7c8a6

juanmc2005 added the feature, API, and refactoring labels Sep 8, 2022

juanmc2005 added this to the Version 0.6 milestone Sep 8, 2022

juanmc2005 added 6 commits September 11, 2022 15:13

Make profile() wrapper compatible with batched inference

5eb7490

Add OnlineSpeakerDiarization.reset() to reset the internal state

f8816e2

Make progress bars disappear after inference has finished

536f0c3

Add AudioSource.close() to close streams correctly if needed

68153f0

Fix bug: profiling report not shown in case of error

43a5cb3

Fully delegate progress to RealTimeInference to avoid weird behavior from Rx

7917497

juanmc2005 merged commit a53318b into develop Sep 28, 2022

juanmc2005 deleted the feat/pipeline branch September 28, 2022 12:49

This was referenced Sep 28, 2022

Free pipeline from ReactiveX #83

Closed

Move PrecalculatedFeaturesAudioSource inside OnlineSpeakerDiarization #84

Closed

juanmc2005 mentioned this pull request Oct 31, 2022

Version 0.6 #109

Merged
