Major maintenance update #961

CorentinJ · 2021-12-28T12:16:14Z

Hello everyone. As you may know, this repo hasn't been actively maintained in the last 1-2 years, aside from everything bluefish has done. Thank you for that, bluefish; it was very helpful. As for myself, I am still working full-time and I became a father recently. I won't give you my life story, but it's clear that maintaining this repository is not a priority for me.

This project was nothing more than a master's thesis. It still is today. It's far from SOTA. In a field that progresses so rapidly, it is already outdated. If you were to build from this repo to start a serious project, code-wise it's okay-ish. I actually found myself pleasantly surprised while making this update; I thought it would be a lot harder than it was. Anyway, the strength of this repo lies in its accessibility. It's got a cool little GUI slapped on top of it that lets you play around even if you don't know much about ML or programming.

Of course, this accessibility drops when just getting the toolbox to run is a can of worms. This update attempts to remedy that.

Changes

Environment

I've updated everything using a fresh Python 3.7 env as reference. All packages were updated to their latest version (where possible) and pinned in requirements.txt. Aside from torch, installing from requirements.txt will get the entire env ready.
- webrtcvad installs nicely on Windows these days, so that's a thorn out of my side
  - As a result, I've removed all --no_trim arguments

All models

- Pretrained models are now downloaded automatically! You can still download them manually if you wish.
- The directory structure for saved models has changed. Where before you had <model_type>/saved_models/pretrained/pretrained.pt, you now have saved_models/<run_id>/<model_type>.pt. You may store different model types in the same run_id if you wish.
- I have made fixes everywhere multiprocessing was involved. Windows now uses multiprocessing again. To this end, I made all matplotlib imports local, because importing matplotlib at module level was a source of problems.
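To make the layout change concrete, here is a minimal sketch of how the old and new checkpoint paths relate. The helper names and the run ID `my_run` are hypothetical and not part of the repository; the three model types are taken from the commit titles in this PR.

```python
from pathlib import Path

# Model types as named in this PR's commits (encoder, synthesizer, vocoder).
MODEL_TYPES = ("encoder", "synthesizer", "vocoder")


def old_model_path(model_type: str) -> Path:
    """Old layout: <model_type>/saved_models/pretrained/pretrained.pt (hypothetical helper)."""
    if model_type not in MODEL_TYPES:
        raise ValueError(f"unknown model type: {model_type!r}")
    return Path(model_type) / "saved_models" / "pretrained" / "pretrained.pt"


def new_model_path(model_type: str, run_id: str) -> Path:
    """New layout: saved_models/<run_id>/<model_type>.pt (hypothetical helper)."""
    if model_type not in MODEL_TYPES:
        raise ValueError(f"unknown model type: {model_type!r}")
    return Path("saved_models") / run_id / f"{model_type}.pt"


# "my_run" is an illustrative run ID; all three model types can share it.
for model_type in MODEL_TYPES:
    print(old_model_path(model_type), "->", new_model_path(model_type, "my_run"))
```

Under this reading, one run directory can hold an encoder, a synthesizer, and a vocoder checkpoint side by side.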

Encoder

- Preprocessing now runs in a process pool instead of an (almost useless) thread pool.
- Preprocessing now supports more audio extensions (but still expects the same datasets).
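The two encoder changes can be sketched roughly as follows. This is not the repository's code: the extension set, the function names, and the placeholder work done per file are all assumptions made for illustration. The point is only that CPU-bound per-file work is farmed out to separate processes rather than to threads, which Python's global interpreter lock would largely serialize.

```python
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path

# Assumed set of accepted extensions; the repository's actual list may differ.
AUDIO_EXTENSIONS = {".wav", ".flac", ".m4a", ".mp3"}


def is_audio_file(path: Path) -> bool:
    """Return True if the file extension (case-insensitive) is in the accepted set."""
    return path.suffix.lower() in AUDIO_EXTENSIONS


def preprocess_file(path: Path):
    """Placeholder for the real per-file work (loading, trimming, feature extraction)."""
    return path.name, path.stat().st_size


def preprocess_dataset(root: Path, n_workers: int = 4):
    """Find audio files under root and process them in a pool of worker processes."""
    files = sorted(p for p in root.rglob("*") if p.is_file() and is_audio_file(p))
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        # map() preserves input order; each file is handled in a separate process.
        return list(pool.map(preprocess_file, files))
```

On Windows, where new processes are spawned rather than forked, a call such as `preprocess_dataset(Path("some_dataset"))` needs to sit under an `if __name__ == "__main__":` guard in the calling script.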

Misc

The "no mp3 support" option was removed. I'll revisit this issue if needed, but I wasn't happy with how it was handled.


ireneb612 · 2021-12-28T14:21:55Z

Thank you for your time and effort!

DoubleF3lix · 2021-12-28T15:04:13Z

Congratulations on being a father! Glad to see this.

raccoonML · 2021-12-28T18:57:34Z

> If you were to build from this repo to start a serious project, code-wise it's okay-ish.

I really like how you perform synthesizer audio preprocessing in this repo. The code gets reused in a lot of my personal projects. I find the train.txt and individual .npy files are much easier to work with than pickle datasets.

Tomcattwo · 2022-01-09T00:19:30Z

Corentin,
Thank you so much for developing and sharing this valuable tool! Congratulations on the birth of your child, and wishing you a lifetime of enjoyment with your now-expanded family!

Do not underestimate the effect that your master's thesis and the toolbox have had upon the world... there are likely many more people than you could ever have imagined who are using it for numerous new and creative things, all over the world. Blessings and thanks.
Regards,
Tomcattwo

CodingRox82 · 2022-08-24T04:34:45Z

@CorentinJ

How do you think this repo will hold up in attempting to do the stuff below nowadays? Do you think this repo is a good fit or are there better options out there? I don't need to do text to speech and I don't need a GUI. All I need to do is be able to run the voice to voice code on demand.

- I will have a number of different audio file samples of different voices reading a variety of sentences. These will be the training models.
- In real time, an input audio file of a person reading a sentence will be sent over the internet and this repo will receive it, convert it into an output file of one of the training models, then send it back over the internet.
- The words in the output audio file should be able to be understood as clearly as they are in the input audio file.

fran478 · 2023-04-25T23:56:40Z

Hello. Would an example path for saved_models/<run_id>/<model_type>.pt be saved_models/encoder/encoder.pt? Thanks for any help.

valvesss · 2023-07-19T00:18:36Z

Time to re-open?

CorentinJ added 16 commits December 27, 2021 19:32

Cleanup of text transforms

898acc9

Convenient method to download default models

2c0a69c

Updated the toolbox

797c9ed

Updated demo_cli

2ec378a

Updated encoder preprocessing

878a6a2

Support for any audio extension in encoder preprocessing

13b5d72

Updated encoder training

966f286

Updated synthesizer audio preprocessing

61e43da

Updated synthesizer training

95b0c10

Updated vocoder preprocessing

cad519f

Updated vocoder training

92004a2

Updated the model display of the toolbox to the new saved models directory structure

9020ac5

Updated the readme

c755441

Fixed encoder training using 0 workers

39dad63

Added new requirements

1dfccaf

Update link in readme

bdfc4e8

CorentinJ merged commit 370e970 into master Dec 28, 2021

CorentinJ deleted the dev branch December 28, 2021 12:18
