-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update diarization data loader to train meeting data #4567
Conversation
Signed-off-by: Taejin Park <tango4j@gmail.com>
This pull request introduces 1 alert when merging 08dfec2 into fea3775 - view on LGTM.com new alerts:
|
Signed-off-by: Taejin Park <tango4j@gmail.com>
…into update_diar_to_label
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
This pull request introduces 1 alert when merging 932ece6 into fea3775 - view on LGTM.com new alerts:
|
Signed-off-by: Taejin Park <tango4j@gmail.com>
@@ -61,7 +61,7 @@ def get_scale_mapping_list(uniq_timestamps): | |||
return scale_mapping_argmat | |||
|
|||
|
|||
def extract_seg_info_from_rttm(uniq_id, rttm_lines, emb_dict=None, target_spks=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you replaced emb_dict
with mapping_dict
but I see emb_dict
variable still at line 84.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also please add doc strings for these variables
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a mistake and I didn't catch it cause i didn't test seq_eval_mode. removed emb_dict. Adding docstrings.
@@ -274,7 +259,7 @@ def assign_labels_to_longer_segs(self, uniq_id, base_scale_clus_label): | |||
per_scale_clus_label = torch.tensor(per_scale_clus_label) | |||
return per_scale_clus_label, uniq_scale_mapping | |||
|
|||
def get_diar_target_labels(self, uniq_id, fr_level_target): | |||
def get_diar_target_labels(self, uniq_id, sample, fr_level_target): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add doc strings for sample
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
possible to initialize with default values for these variables? comes in handy for debugging issues in the future
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs discussion. I can't fully understand what variable should be initilaized with default values. These arguments should not be assigned with any default values. Adding docstring for sample.
@@ -359,6 +359,19 @@ def rttm_to_labels(rttm_filename): | |||
return labels | |||
|
|||
|
|||
def get_rttm_speaker_index(rttm_labels): | |||
""" | |||
Generate speaker mapping between integer index to RTTM speaker label names. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add doc strings, input and return
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding doctstrings for this function
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks.
Needs to test after we merge MSDD model classes
* Update audio_to_diar_label to train meeting data Signed-off-by: Taejin Park <tango4j@gmail.com> * Style fix with --scope=nemo Signed-off-by: Taejin Park <tango4j@gmail.com> * Style fix problem, re-run style fix Signed-off-by: Taejin Park <tango4j@gmail.com> * Removed remaining commented lines Signed-off-by: Taejin Park <tango4j@gmail.com> * Remove an unused variable Signed-off-by: Taejin Park <tango4j@gmail.com> * Reflected comments Signed-off-by: Taejin Park <tango4j@gmail.com> * style fixed Signed-off-by: Taejin Park <tango4j@gmail.com> * style fix for no reason Signed-off-by: Taejin Park <tango4j@gmail.com> Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com> Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
* Update audio_to_diar_label to train meeting data Signed-off-by: Taejin Park <tango4j@gmail.com> * Style fix with --scope=nemo Signed-off-by: Taejin Park <tango4j@gmail.com> * Style fix problem, re-run style fix Signed-off-by: Taejin Park <tango4j@gmail.com> * Removed remaining commented lines Signed-off-by: Taejin Park <tango4j@gmail.com> * Remove an unused variable Signed-off-by: Taejin Park <tango4j@gmail.com> * Reflected comments Signed-off-by: Taejin Park <tango4j@gmail.com> * style fixed Signed-off-by: Taejin Park <tango4j@gmail.com> * style fix for no reason Signed-off-by: Taejin Park <tango4j@gmail.com> Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com> Signed-off-by: Anas Abou Allaban <aabouallaban@pm.me>
* Update audio_to_diar_label to train meeting data Signed-off-by: Taejin Park <tango4j@gmail.com> * Style fix with --scope=nemo Signed-off-by: Taejin Park <tango4j@gmail.com> * Style fix problem, re-run style fix Signed-off-by: Taejin Park <tango4j@gmail.com> * Removed remaining commented lines Signed-off-by: Taejin Park <tango4j@gmail.com> * Remove an unused variable Signed-off-by: Taejin Park <tango4j@gmail.com> * Reflected comments Signed-off-by: Taejin Park <tango4j@gmail.com> * style fixed Signed-off-by: Taejin Park <tango4j@gmail.com> * style fix for no reason Signed-off-by: Taejin Park <tango4j@gmail.com> Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com> Signed-off-by: Hainan Xu <hainanx@nvidia.com>
Signed-off-by: Taejin Park tango4j@gmail.com
What does this PR do ?
This PR adds a new feature to
audio_to_diar_label.py
.So far, we have only trained with 2 speaker audio where one of the speakers always exists.
This PR makes dataloader to take 2 speaker subset from more-than-2-speaker audio clips with corresponding groundtruth.
Thus, there are cases where grround-truth-cluster label is "-1" where speakers are not included in either of two targeted speakers.
ex) Two speaker ground-truth label ([0,0,1,1,-1,-1,-1,1,1,1,0]
Also removed unnecessary variables.
Collection: [Note which collection this PR will affect]
ASR
Changelog
audio_to_diar_label.py
-- Removed unused variables in the functions.
-- base_clus_label now can have -1 label.
-- Few cosmetic changes
collections.py
A few changes include:
-- Added get_rttm_speaker_index() function since sess_spk_dict could have more than 2 speakers.
speaker_utils.py
-- Added get_rttm_speaker_index that takes speakers from RTTM files.
Usage
The usage will be provided when model and module codes are published.
Before your PR is "Ready for review"
Pre checks:
PR Type:
If you haven't finished some of the above items you can still open "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.
Additional Information