Problem
Running the online diarization pipeline on an entire dataset is slow and cumbersome because the current implementation simulates an online scenario and processes one chunk at a time.
This is the right approach for real-time use, but a faster implementation would be very useful for evaluation, for example to measure the performance impact of swapping a component (issue #34 is a good example).
Idea
Pre-compute segmentation and embeddings for all chunks in a file, then run only clustering and output reconstruction online. This would considerably speed up the process. It could live in a new `BatchedOnlineSpeakerDiarization` class implementing the same interface as `OnlineSpeakerDiarization`.
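
To make the proposal concrete, here is a minimal sketch of the two-stage flow, not the project's actual API. Every name in it (`sliding_chunks`, `batched_online_diarization`, `segment_batch`, `embed_batch`, `cluster_step`) is hypothetical, and the toy models at the bottom are placeholders for the real segmentation and embedding models. The assumption is that segmentation and embedding are stateless per chunk, so they can be batched, while clustering carries state and must stay sequential.

```python
import numpy as np


def sliding_chunks(waveform, chunk_size, step):
    """Split a 1-D waveform into overlapping chunks of shape (num_chunks, chunk_size).

    Assumes len(waveform) >= chunk_size; trailing samples that do not fill
    a whole chunk are dropped.
    """
    starts = range(0, len(waveform) - chunk_size + 1, step)
    return np.stack([waveform[s:s + chunk_size] for s in starts])


def batched_online_diarization(waveform, chunk_size, step,
                               segment_batch, embed_batch, cluster_step,
                               batch_size=32):
    """Hypothetical two-stage pipeline: batched models, sequential clustering."""
    chunks = sliding_chunks(waveform, chunk_size, step)

    # Stage 1 (batched): segmentation and embeddings do not depend on
    # earlier chunks, so they can be computed for many chunks at once.
    segmentations, embeddings = [], []
    for i in range(0, len(chunks), batch_size):
        batch = chunks[i:i + batch_size]
        seg = segment_batch(batch)
        segmentations.append(seg)
        embeddings.append(embed_batch(batch, seg))
    segmentations = np.concatenate(segmentations)
    embeddings = np.concatenate(embeddings)

    # Stage 2 (online): clustering keeps state across chunks, so it still
    # runs one chunk at a time, in order, as in the real-time pipeline.
    state, outputs = None, []
    for seg, emb in zip(segmentations, embeddings):
        state, out = cluster_step(state, seg, emb)
        outputs.append(out)
    return outputs


# Toy stand-ins for the real models (assumptions, for illustration only).
def toy_segment(batch):
    return np.abs(batch).mean(axis=1, keepdims=True)


def toy_embed(batch, seg):
    return batch[:, :2] * seg


def toy_cluster_step(state, seg, emb):
    count = 0 if state is None else state
    return count + 1, (count, float(emb.sum()))


outputs = batched_online_diarization(
    np.arange(10, dtype=float), chunk_size=4, step=2,
    segment_batch=toy_segment, embed_batch=toy_embed,
    cluster_step=toy_cluster_step, batch_size=3,
)
```

Because stage 2 sees the chunks in the same order as the chunk-by-chunk pipeline, the result should not depend on `batch_size`, which gives a simple sanity check when comparing a batched implementation against the existing one.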