Pre-trained models #59
Currently not available, will get to this as soon as I can :)
I have a decent-ish Libri model I can upload somewhere if you'd like.
@ryanleary that would be awesome! Has it been trained on the latest checkpoint system? Would make integration easier
Yes. It's definitely a preliminary model, but somewhat functional. Trained on 11 epochs of 1k hrs libri with augmentation.
Shall we start up a wiki for this kind of thing as well as other documentation?
Really good idea, will get to it ASAP and open a PR to get this together! EDIT: @ryanleary, to keep it simple, do you think just a new file in the repo under the name
Oops, missed the edit. That's probably alright. My only thought for the wiki was so that there wouldn't need to be PRs every time there are new models. I don't really have a strong preference though. Do you have a preference where I upload that model above?
That's a good idea @ryanleary! I'm going to try to push to get the skip_rnn branch merged into pytorch, because I want all the models, at least on initial release, to be the pure DS2 architecture (which requires skip_rnn to be implemented). Then I'll open a new issue to keep track of models trained!
Sure thing. Definitely looking forward to getting full batch norm support. Will retrain once we have a build of pytorch that supports it. |
Since the skip_input work appears to be stalled, did you want to do this now or continue to wait? |
@ryanleary, I'll create a new issue with a plan on what needs to be done for the networks; my initial thought is that skip input isn't viable long term without cuDNN support. My reasoning is that it already takes a long time to train the DS architecture, and not utilising cuDNN slows this down drastically. It will be even worse when NVIDIA's Volta GPUs come out and we can't utilise the new architecture. As a result I think the 'vanilla' architecture will have to stray a bit and be a batch norm on top of the cuDNN RNN (architectures etc. will be outlined in the issue!)
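A rough numpy sketch of the compromise described above: sequence-wise batch norm applied to the *output* of each cuDNN RNN layer rather than inside the recurrence, so the fast fused kernels stay usable. The function name and shapes here are illustrative, not taken from the repo:

```python
import numpy as np

def sequence_batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Sequence-wise batch norm (DS2-style): normalise each feature
    over all timesteps and all utterances in the batch.
    x has shape (time, batch, features)."""
    mean = x.mean(axis=(0, 1), keepdims=True)
    var = x.var(axis=(0, 1), keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

# Stacking sketch: run a (cuDNN-backed) RNN layer, then normalise its
# output before feeding the next layer. The "RNN output" here is just
# random data standing in for a real layer's activations.
rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(50, 8, 16))  # (T, N, H)
y = sequence_batch_norm(x)
```

Because the normalisation sits between layers rather than inside the cell, each RNN layer can still be a single cuDNN call.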
@ryanleary and whoever else has input on this: does it make sense to train all models, regardless of dataset, on the full DS2 architecture (or as close to it as possible)?
I think that's certainly ideal, but we can update models in the future. Having some pretrained models that match what's currently implemented will at least help people experiment with a model that is better than a "toy". As an aside, I'm personally looking forward more to getting BatchNorm and lookahead convolutions implemented and moving toward the "Production" DeepSpeech implementation. It should be easier to train, and it looks like it only costs about a 5% relative performance hit: [Spectrogram -> 2d conv -> 2d conv -> GRU -> GRU -> GRU [forward-only] -> 1D Row Conv -> FC]
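The "1D Row Conv" (lookahead convolution) in that pipeline can be sketched with plain numpy: each output frame is a weighted sum of the current frame and a few future frames, giving a forward-only GRU a small window of right context. The weights here are shared across features for brevity (the paper learns per-feature weights), and the context size of 3 is arbitrary:

```python
import numpy as np

def lookahead_conv(x, weights):
    """Row convolution: out[t] = sum_{tau} weights[tau] * x[t + tau].
    x: (time, features); weights: (context,) shared across features.
    Frames past the end of the utterance are treated as zeros."""
    T, H = x.shape
    C = len(weights)
    padded = np.vstack([x, np.zeros((C - 1, H))])  # zero-pad the future
    return sum(w * padded[tau:tau + T] for tau, w in enumerate(weights))

x = np.arange(12, dtype=float).reshape(6, 2)  # (T=6, H=2) toy activations
w = np.array([0.5, 0.25, 0.25])               # context of 3 frames
y = lookahead_conv(x, w)
```

Unlike a bidirectional RNN, this only needs a fixed, small amount of future input, which is what makes it attractive for streaming/"production" deployment.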
@ryanleary agreed. In my head getting the beam search language model integrated (taking it from the TF fork in the other issue) is the main step towards production DS, and probably the biggest at this stage! I've opened a new ticket at #85 tracking progress of pre-trained models, will close this one.
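For context, the skeleton of a beam search over per-timestep symbol probabilities looks like the sketch below. This is a plain best-path beam; the decoder being discussed would additionally handle CTC blank/repeat merging and fold in the language model score:

```python
import math

def beam_search(log_probs, beam_width=3):
    """log_probs: one dict per timestep mapping symbol -> log probability.
    Keeps the beam_width highest-scoring prefixes at each step and
    returns the best symbol sequence found."""
    beams = [((), 0.0)]  # (prefix tuple, cumulative log probability)
    for step in log_probs:
        candidates = [(prefix + (sym,), score + lp)
                      for prefix, score in beams
                      for sym, lp in step.items()]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]  # prune to the beam width
    return beams[0][0]

steps = [{'a': math.log(0.6), 'b': math.log(0.4)},
         {'a': math.log(0.3), 'b': math.log(0.7)}]
best = beam_search(steps, beam_width=2)  # -> ('a', 'b')
```

With a language model, the per-step score would become the acoustic log probability plus a weighted LM log probability of the extended prefix.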
Any pre-trained models available?