Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update diarization data loader to train meeting data #4567

Merged
merged 14 commits into from
Jul 28, 2022
Merged

Conversation

tango4j
Copy link
Collaborator

@tango4j tango4j commented Jul 20, 2022

Signed-off-by: Taejin Park tango4j@gmail.com

What does this PR do ?

This PR adds a new feature to audio_to_diar_label.py.
So far, we have only trained with 2 speaker audio where one of the speakers always exists.
This PR makes dataloader to take 2 speaker subset from more-than-2-speaker audio clips with corresponding groundtruth.
Thus, there are cases where grround-truth-cluster label is "-1" where speakers are not included in either of two targeted speakers.
ex) Two speaker ground-truth label ([0,0,1,1,-1,-1,-1,1,1,1,0]

Also removed unnecessary variables.

Collection: [Note which collection this PR will affect]

ASR

Changelog

  • audio_to_diar_label.py

-- Removed unused variables in the functions.
-- base_clus_label now can have -1 label.
-- Few cosmetic changes

  • collections.py

A few changes include:

-- Added get_rttm_speaker_index() function since sess_spk_dict could have more than 2 speakers.

  • speaker_utils.py

-- Added get_rttm_speaker_index that takes speakers from RTTM files.

Usage

The usage will be provided when model and module codes are published.

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items you can still open "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.

Additional Information

  • Related to # (issue)

Signed-off-by: Taejin Park <tango4j@gmail.com>
@lgtm-com
Copy link

lgtm-com bot commented Jul 20, 2022

This pull request introduces 1 alert when merging 08dfec2 into fea3775 - view on LGTM.com

new alerts:

  • 1 for Unused local variable

Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
@lgtm-com
Copy link

lgtm-com bot commented Jul 20, 2022

This pull request introduces 1 alert when merging 932ece6 into fea3775 - view on LGTM.com

new alerts:

  • 1 for Unused local variable

@tango4j tango4j requested a review from nithinraok July 22, 2022 21:58
@tango4j tango4j marked this pull request as ready for review July 22, 2022 22:01
@@ -61,7 +61,7 @@ def get_scale_mapping_list(uniq_timestamps):
return scale_mapping_argmat


def extract_seg_info_from_rttm(uniq_id, rttm_lines, emb_dict=None, target_spks=None):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you replaced emb_dict with mapping_dict but I see emb_dict variable still at line 84.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also please add doc strings for these variables

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a mistake and I didn't catch it cause i didn't test seq_eval_mode. removed emb_dict. Adding docstrings.

@@ -274,7 +259,7 @@ def assign_labels_to_longer_segs(self, uniq_id, base_scale_clus_label):
per_scale_clus_label = torch.tensor(per_scale_clus_label)
return per_scale_clus_label, uniq_scale_mapping

def get_diar_target_labels(self, uniq_id, fr_level_target):
def get_diar_target_labels(self, uniq_id, sample, fr_level_target):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add doc strings for sample

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

possible to initialize with default values for these variables? comes in handy for debugging issues in the future

Copy link
Collaborator Author

@tango4j tango4j Jul 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs discussion. I can't fully understand what variable should be initilaized with default values. These arguments should not be assigned with any default values. Adding docstring for sample.

@@ -359,6 +359,19 @@ def rttm_to_labels(rttm_filename):
return labels


def get_rttm_speaker_index(rttm_labels):
"""
Generate speaker mapping between integer index to RTTM speaker label names.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add doc strings, input and return

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding doctstrings for this function

tango4j and others added 5 commits July 26, 2022 15:23
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Copy link
Collaborator

@nithinraok nithinraok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks.
Needs to test after we merge MSDD model classes

@tango4j tango4j merged commit 16c96ba into main Jul 28, 2022
@tango4j tango4j deleted the update_diar_to_label branch August 2, 2022 01:38
Davood-M pushed a commit to Davood-M/NeMo that referenced this pull request Aug 9, 2022
* Update audio_to_diar_label to train meeting data

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Style fix with --scope=nemo

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Style fix problem, re-run style fix

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Removed remaining commented lines

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Remove an unused variable

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Reflected comments

Signed-off-by: Taejin Park <tango4j@gmail.com>

* style fixed

Signed-off-by: Taejin Park <tango4j@gmail.com>

* style fix for no reason

Signed-off-by: Taejin Park <tango4j@gmail.com>

Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
piraka9011 pushed a commit to piraka9011/NeMo that referenced this pull request Aug 25, 2022
* Update audio_to_diar_label to train meeting data

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Style fix with --scope=nemo

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Style fix problem, re-run style fix

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Removed remaining commented lines

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Remove an unused variable

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Reflected comments

Signed-off-by: Taejin Park <tango4j@gmail.com>

* style fixed

Signed-off-by: Taejin Park <tango4j@gmail.com>

* style fix for no reason

Signed-off-by: Taejin Park <tango4j@gmail.com>

Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Signed-off-by: Anas Abou Allaban <aabouallaban@pm.me>
hainan-xv pushed a commit to hainan-xv/NeMo that referenced this pull request Nov 29, 2022
* Update audio_to_diar_label to train meeting data

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Style fix with --scope=nemo

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Style fix problem, re-run style fix

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Removed remaining commented lines

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Remove an unused variable

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Reflected comments

Signed-off-by: Taejin Park <tango4j@gmail.com>

* style fixed

Signed-off-by: Taejin Park <tango4j@gmail.com>

* style fix for no reason

Signed-off-by: Taejin Park <tango4j@gmail.com>

Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants