Wavenet model for audio generation #624
This looks great, what kind of feedback were you looking for in the draft stage? The external prerequisites are understandable, and it doesn't introduce any new build dependencies, so that looks good to me. The model itself looks solid, although I haven't tried it yet. Would you like a more thorough review of it now, or wait until you've completed work on the model? Thanks for working on this, it would be a nice model to have in our examples.
@BradLarson Just wanted to make sure the general direction looks reasonable. Detailed review can definitely wait till after pending work is done. Thanks!
It's been a little while, so I'm just checking back in on this model. It would be great to have in, if you still had the time to drive it to completion. If not, I totally understand.
Hey Brad, sorry about the delay here. I'll be picking this back up soon. Once I get the
We're doing another pass on outstanding pull requests, so I just wanted to confirm that you still were planning to move forward with this. Will this need the swift-apis additions first, in order to make this viable?
Hi @BradLarson, sorry for the delay. I've been caught up with other stuff. I definitely plan to return to this at some point but I might not get to it soon. It will require some changes in swift-apis first, unless the
Closing this out since it's outdated. I'll pull in updates from swift-apis etc. and reopen shortly!
This implements a first-cut Wavenet model for the audio generation task on the VCTK dataset, using the Python + TensorFlow version here as a reference implementation.
At the moment only the training loop works, and there are some limitations / missing features that need to be addressed:
Additionally, I have some ideas for future improvements / enhancements that I can include as follow-up PRs:
librosa instead of the simplistic pydub library in Python for reading and processing audio. Initial attempts resulted in the following error:
AVFoundation? Initial attempts at this caused linker issues:

I'd love to get some early feedback on this!