Spoken Language Identification #4846
Signed-off-by: fayejf <fayejf07@gmail.com>
This pull request introduces 2 alerts when merging e0299de into d19146c - view on LGTM.com new alerts:
Jenkins is failing; you will need to fix it.
Did you get approval to release the model? If yes, you may need to add it to the pretrained models list as well.
min_lr: 0.0001

trainer:
  devices: 2 # number of gpus (original titanet-large was trained on 4 nodes with 8 gpus each)
You may want to update the comment, or is it still the same?
it's the same.
    logging.info(f'Hydra config: {OmegaConf.to_yaml(cfg)}')

    trainer = pl.Trainer(**cfg.trainer)
    exp_manager(trainer, cfg.get("exp_manager", None))
    asr_model = EncDecClassificationModel(cfg=cfg.model, trainer=trainer)

    if cfg.name == 'TitaNet':
Apply lower() and check whether 'titanet' is in cfg.name.lower(), so users can have their own config with titanet appended or prepended to the name.
yeah good point. Updated it.
So this is a temp trick. I didn't add task conditions here because we will refactor the model later.
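The case-insensitive check suggested in this thread can be sketched as follows. Only the test `'titanet' in cfg.name.lower()` comes from the review comment; the helper name `is_titanet_config` and the example config names are hypothetical, chosen purely for illustration.

```python
def is_titanet_config(name: str) -> bool:
    """Return True when a config name mentions TitaNet, ignoring case."""
    # Substring match, so names with a prefix or suffix around "titanet" also qualify.
    return "titanet" in name.lower()


# Hypothetical config names, for illustration only.
for candidate in ["TitaNet", "titanet_langid", "LangID-TITANET-large", "MatchboxNet"]:
    print(candidate, is_titanet_config(candidate))
```

Compared with the exact comparison `cfg.name == 'TitaNet'`, the substring test accepts any user-defined config name that contains the word, at the cost of also matching names that contain it by accident.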
    asr_model = EncDecClassificationModel(cfg=cfg.model, trainer=trainer)

    if cfg.name == 'TitaNet':
        the_model = EncDecSpeakerLabelModel(cfg=cfg.model, trainer=trainer)
minor: maybe just model is enough?
yep
    labels.append(item['label'])
You meant to write labels.append(label) here, I guess?
good catch!!!!
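The slip caught here is appending the raw field instead of the local variable derived from it. The surrounding loop is not shown in this thread, so the following is only a sketch under assumptions: a JSON-lines manifest with a `label` field per entry, and a normalization step applied before collecting. The function name `collect_labels`, the normalization, and the sample entries are all hypothetical.

```python
import json
from collections import Counter


def collect_labels(manifest_lines):
    """Collect one label per manifest entry (JSON-lines format assumed)."""
    labels = []
    for line in manifest_lines:
        item = json.loads(line)
        # Hypothetical processing step: normalize the raw label.
        label = item["label"].strip().lower()
        # Append the processed variable, not item['label'].
        labels.append(label)
    return labels


# Hypothetical manifest entries, for illustration only.
sample = [
    '{"audio_filepath": "a.wav", "label": "EN "}',
    '{"audio_filepath": "b.wav", "label": "de"}',
    '{"audio_filepath": "c.wav", "label": "en"}',
]
labels = collect_labels(sample)
print(Counter(labels))
```

Appending `item['label']` in this sketch would silently keep "EN " and "en" as two separate classes, which is the kind of error that distorts any per-class count computed afterwards.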
Yes, I will submit an nvbug to publish the model and add the link to the bug fix branch. I will possibly also need to update the suggested optim in the yaml file then.
LGTM
* data and cal weight
* add config yaml file
* remove impulse for simplicity
* add langid to speech class train script
* style fix
* auroc and marco acc for val and test
* reflect nithin's comment and fix test/ci
* bring back impulse
* fix test
Each commit: Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: George Zelenfroind <gzelenfroind@nvidia.com>
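One of the commits listed above adds macro accuracy for validation and test. As a reminder of what that metric means in the common multiclass sense, here is a minimal pure-Python sketch: per-class accuracy (recall) averaged with equal weight per class. The function name and sample data are hypothetical, and this is not the implementation used in the PR, which is not shown on this page.

```python
from collections import defaultdict


def macro_accuracy(y_true, y_pred):
    """Average the per-class accuracy, giving every class equal weight."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("need two non-empty sequences of equal length")
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    # Only classes that occur in y_true contribute to the average.
    per_class = [correct[c] / total[c] for c in total]
    return sum(per_class) / len(per_class)


# Hypothetical imbalanced example: four "en" clips, two "de" clips.
y_true = ["en", "en", "en", "en", "de", "de"]
y_pred = ["en", "en", "en", "de", "de", "es"]
print(macro_accuracy(y_true, y_pred))
```

In this example "en" scores 3/4 and "de" scores 1/2, so the macro average is 0.625, while plain accuracy would be 4/6. The gap is why a macro metric is informative when languages are unevenly represented.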
Are the model weights publicly available?
@guillermo-gabrielli-fer The model would possibly be publicly available next week.
@guillermo-gabrielli-fer We have a better version of the lang id model and are in the process of evaluating and publishing it. It will be published ASAP, and I will let you know once it's done. Thanks for your patience.
* data and cal weight
* add config yaml file
* remove impulse for simplicity
* add langid to speech class train script
* style fix
* auroc and marco acc for val and test
* reflect nithin's comment and fix test/ci
* bring back impulse
* fix test
Each commit: Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>
@guillermo-gabrielli-fer The model is published. Thanks for your patience. #5080
@fayejf Thanks for releasing the model. Is there currently any function for pure inference on an audio file, similar to e.g. "verify_speakers", which can be used with Titanet?
You may use this function
model.labels
@nithinraok might as well make this a function publicly available in the Titanet model itself
Yes, will create one.
Good point, guys. Will add a general
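The step this thread revolves around, turning classifier scores into a language label, can be sketched in plain Python. This assumes that `model.labels` is the ordered list of class names and that the model emits one score per entry in the same order; `predict_label`, the label list, and the scores below are hypothetical stand-ins and not NeMo API.

```python
import math


def predict_label(logits, labels):
    """Return (label, probability) for the highest-scoring class."""
    if not labels or len(logits) != len(labels):
        raise ValueError("expected one score per label")
    # Numerically stable softmax: shift by the maximum before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    norm = sum(exps)
    probs = [e / norm for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical stand-ins for model.labels and the model's output scores.
labels = ["de", "en", "es"]
logits = [0.2, 2.5, -1.0]
print(predict_label(logits, labels))
```

Returning the probability alongside the label lets a caller apply a confidence threshold before accepting the prediction.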
* data and cal weight
* add config yaml file
* remove impulse for simplicity
* add langid to speech class train script
* style fix
* auroc and marco acc for val and test
* reflect nithin's comment and fix test/ci
* bring back impulse
* fix test
Each commit: Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>
What does this PR do?
Add training script and config (titanet) for spoken language identification.
Collection: ASR
Changelog
Usage
Before your PR is "Ready for review"
Pre checks:
[doc will be updated in next PR]
PR Type:
If you haven't finished some of the above items you can still open "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.
Additional Information