I am trying to use the NeMo implementation of QuartzNet to replicate the results presented in this paper.
However, I am facing some issues. First, the pretrained encoder model has a different structure from the one implemented in NeMo. In particular, to be able to load the state dictionary I had to modify Masked1DConv to inherit from 1DConv (as in the original Jasper implementation).
Moreover, there are discrepancies in the layer names that have to be fixed before the pretrained model can be loaded properly.
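The layer-name fix above amounts to renaming keys in the checkpoint's state dictionary before loading it. A minimal sketch (the prefixes in `renames` are illustrative placeholders, not the actual NeMo/QuartzNet layer names):

```python
from collections import OrderedDict

def remap_state_dict(state_dict, renames):
    """Return a copy of state_dict with matching key prefixes renamed.

    `renames` maps an old key prefix to the new prefix expected by the
    current model definition.
    """
    remapped = OrderedDict()
    for key, value in state_dict.items():
        new_key = key
        for old_prefix, new_prefix in renames.items():
            if key.startswith(old_prefix):
                new_key = new_prefix + key[len(old_prefix):]
                break
        remapped[new_key] = value
    return remapped
```

Once the names line up, the checkpoint can be loaded as usual, e.g. `model.load_state_dict(remap_state_dict(checkpoint, renames))`.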
After my attempts at fixing these issues, I was still unable to reach the performance reported in the paper. Evaluating on dev_other, I reached a WER of 16.9%, much higher than the 11.58% reported in the paper.
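For context on how the numbers above are computed: WER is the word-level edit distance (substitutions + insertions + deletions) divided by the number of reference words. A minimal sketch:

```python
def wer(reference, hypothesis):
    """Word error rate via dynamic-programming edit distance over words."""
    ref = reference.split()
    hyp = hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / len(ref)
```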
I used the configuration file and the pretrained model that can be found here.
The validation is run inside a Docker container built from the Dockerfile available in the repo. The only minor difference is the version of the PyTorch base image: 19.09 instead of 19.11, because of CUDA driver issues that prevented me from using the GPUs with the newer image.
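The base-image change described above amounts to swapping the `FROM` line of the repo's Dockerfile; the image tags below follow NVIDIA's NGC naming convention and are an assumption:

```dockerfile
# Original (assumed): FROM nvcr.io/nvidia/pytorch:19.11-py3
FROM nvcr.io/nvidia/pytorch:19.09-py3
```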
Any help would be much appreciated. Thank you!
Are you using the master version of NeMo? It is not compatible with the published checkpoints. Could you please try the latest stable version, 0.8.2, which you'll get if you just install from pip (or take the 0.8.2 tag on git)? The published checkpoints correspond to the latest stable version.
(master is 0.9, which is WIP and will be released soon)
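Switching to the stable release suggested above would look roughly like the following (the PyPI package name and the exact tag name are assumptions; check the repo's README):

```shell
# Install the latest stable release from PyPI (package name assumed)
pip install nemo-toolkit==0.8.2

# Or, from a clone of the repository, check out the matching tag
git checkout 0.8.2
```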