How to separate a track #6

Open
LennyPenny opened this issue Nov 4, 2018 · 6 comments

Comments

@LennyPenny

LennyPenny commented Nov 4, 2018

Hey,
thanks for the awesome work!
So I got the checkpoints and added their path to the config, but how do I actually do separation on a new wav file?
The eval_dsd_100.py file only seems to iterate over the dsd100 dataset.

Is there a function I could use that I'm missing? Maybe you have some extra code for this that isn't in the repo.

@sungheonpark
Owner

Hi Lenny,
You can refer to the evaluation code to perform separation on your own wav files.
You can use the librosa.load function to load wav files, and most of the evaluation code can be used without changes. It should not be too difficult to modify the code for your own task. Feel free to ask if you have further questions.
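
A minimal sketch of what the loading step described above might look like, assuming the evaluation code works on a magnitude spectrogram with 512 frequency bins. The sample rate, FFT size, and hop length are illustrative assumptions, not values taken from this repo, and `load_spectrogram` and `trim_to_bins` are hypothetical helper names; check the repo's config and eval_dsd_100.py for the real settings.

```python
import numpy as np

# Assumed values for illustration only; use whatever the repo's config specifies.
SAMPLE_RATE = 22050
N_FFT = 1024
HOP_LENGTH = 256


def load_spectrogram(path, sr=SAMPLE_RATE, n_fft=N_FFT, hop_length=HOP_LENGTH):
    """Load a wav file as mono and return (magnitude, phase) of its STFT."""
    import librosa  # imported here so trim_to_bins can be used without librosa

    audio, _ = librosa.load(path, sr=sr, mono=True)
    stft = librosa.stft(audio, n_fft=n_fft, hop_length=hop_length)
    # The phase is kept so a separated magnitude can be turned back into audio.
    return np.abs(stft), np.angle(stft)


def trim_to_bins(magnitude, n_bins=512):
    """Keep the lowest n_bins frequency rows (an n_fft of 1024 yields 513 rows)."""
    return magnitude[:n_bins, :]
```

The trimmed magnitude could then go through the same network call the evaluation script uses for the DSD100 tracks.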

@LennyPenny
Author

Okay I got it working now! It's really good:)
Maybe I will make a PR with an easy to use command line script.

Do you have any pointers as to how I could turn this into a real time separator? Like for a radio stream or something

@sungheonpark
Owner

I don't have much experience with handling live streaming data. You may be able to find example Python code for dealing with live streams on the Internet.
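
One possible starting point, independent of how the stream is actually fetched: buffer the incoming audio blocks and regroup them into fixed-length segments that can each be processed like a short file. This is only a sketch; `fixed_segments` is a hypothetical helper, and the segment length is something you would have to choose yourself.

```python
import numpy as np


def fixed_segments(blocks, segment_len):
    """Regroup arbitrarily sized 1-D audio blocks into segments of segment_len samples.

    `blocks` is any iterable of arrays, e.g. decoded chunks of a stream.
    Leftover samples shorter than segment_len are dropped at the end.
    """
    buffer = np.zeros(0, dtype=np.float32)
    for block in blocks:
        buffer = np.concatenate([buffer, np.asarray(block, dtype=np.float32)])
        # Emit as many full segments as the buffer currently holds.
        while len(buffer) >= segment_len:
            yield buffer[:segment_len]
            buffer = buffer[segment_len:]
```

Each yielded segment adds at least its own duration as latency, so shorter segments react faster but give the network less context.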

@LennyPenny
Author

LennyPenny commented Nov 29, 2018

Another question before I dive into this too deeply:
Does it run the neural network for each sample of input (i.e. 44.1k times a second for a 44.1 kHz audio file), or is the whole file given to the neural network at once?

If the former is true, then getting this to work with live streaming data should be fairly easy.

@sungheonpark
Owner

When the spectrogram is fed into the network, it is divided into smaller chunks. The spectrogram of a single file has a size of 512 x (length of the spectrogram). The input size of the CNN is 512x64, so the spectrogram is cropped to fit the network. During training, the input spectrogram is cropped to 512x64 at a random position; during testing, the spectrogram is divided sequentially into 512x64 chunks, and each chunk is fed into the network.
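
The chunking described above can be sketched roughly as follows. This is not the repo's code: the function names are hypothetical, and zero-padding the last chunk is an assumption about how a spectrogram whose length is not a multiple of 64 might be handled.

```python
import numpy as np

PATCH_FRAMES = 64  # network input width stated above


def split_into_patches(spec, width=PATCH_FRAMES):
    """Split a (freq, time) spectrogram into consecutive (freq, width) patches.

    The final patch is zero-padded (an assumption of this sketch).
    Returns the stacked patches and the original number of frames.
    """
    n_frames = spec.shape[1]
    n_patches = max(1, -(-n_frames // width))  # ceiling division
    padded = np.zeros((spec.shape[0], n_patches * width), dtype=spec.dtype)
    padded[:, :n_frames] = spec
    patches = padded.reshape(spec.shape[0], n_patches, width).transpose(1, 0, 2)
    return patches, n_frames


def join_patches(patches, n_frames):
    """Inverse of split_into_patches: concatenate along time and drop the padding."""
    full = patches.transpose(1, 0, 2).reshape(patches.shape[1], -1)
    return full[:, :n_frames]


def random_crop(spec, rng, width=PATCH_FRAMES):
    """Training-style crop: a window of `width` frames at a random position."""
    start = rng.integers(0, spec.shape[1] - width + 1)
    return spec[:, start:start + width]
```

At test time, each patch from `split_into_patches` would be passed through the network and the outputs reassembled with `join_patches`. For a live stream, this implies buffering at least 64 spectrogram frames before each network call.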

@pratikshaya

@LennyPenny Can you please tell me how you changed the code to separate the tracks of a test .wav file?
