DAMP-VSEP dataset #222
Conversation
Signed-off-by: Gerardo Roa Dabike <gerardo.roa@gmail.com>
Overall, it looks great, thanks!
I have a concern about on-the-fly resampling that I'd like to discuss.
Also, instead of passing the root_dir, I would probably use sed to prepend it to all the paths in the csv files. But we might get to that when working on the recipe!
import json
import librosa
import warnings
# Silence librosa's audioread-fallback warning (see librosa/librosa#1015).
warnings.filterwarnings('ignore')
What does this filter?
A comment would be welcome.
> What does this filter?

By default, librosa uses soundfile, but if it can't open the file with soundfile it falls back to audioread. Every time it falls back to audioread, it shows a warning: librosa/librosa#1015

> A comment would be welcome.
OK, I will add a comment linking it to the librosa issue.
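As a side note, a sketch of a narrower alternative (not the PR's actual code): instead of silencing all warnings globally, the filter can target only librosa's audioread-fallback message, so other warnings still surface. The message text is the one discussed in librosa/librosa#1015.

```python
import warnings

# Ignore only librosa's soundfile-to-audioread fallback warning
# (librosa/librosa#1015) instead of suppressing every warning globally.
warnings.filterwarnings(
    "ignore",
    message="PySoundFile failed. Trying audioread instead.",
)
```

Any other warning raised during loading would still be visible with this filter in place.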
asteroid/data/dampvsep_dataset.py
Outdated
{'mean': f"{mix.mean():.16f}",
 'std': f"{mix.std():.16f}",
 'scaler': f"{amplitude_scaler:.16f}",
 'vocal': sample['vocal_path'],
 'background': sample['background_path'],
 'vocal_duration': vocal_dur,
 'background_duration': back_dur})
Better to declare the dict first, and then return it; it's more readable.
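A sketch of that suggestion (hypothetical helper name; `mix` is assumed to be a NumPy array as in the snippet above): build the dict as a named variable first, then return it.

```python
import numpy as np

def build_metadata_entry(mix, amplitude_scaler, sample, vocal_dur, back_dur):
    # Name the dict first so each field is easy to scan, then return it,
    # rather than constructing it inline inside an append() call.
    entry = {
        'mean': f"{mix.mean():.16f}",
        'std': f"{mix.std():.16f}",
        'scaler': f"{amplitude_scaler:.16f}",
        'vocal': sample['vocal_path'],
        'background': sample['background_path'],
        'vocal_duration': vocal_dur,
        'background_duration': back_dur,
    }
    return entry
```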
@staticmethod
def _build_metadata(inputs):
    sample, root, sample_rate = inputs
    back, _ = librosa.load(root / sample['background_path'],
We usually use soundfile for loading, but it doesn't resample.
Is resampling needed? If yes, isn't it expensive to resample on the fly when it could be done once beforehand?
> We usually use soundfile for loading, but it doesn't resample.
This is a very good question. First, you need to know that the dataset provides three audio files: mixtures.m4a, background.m4a and vocal.ogg.
Soundfile can't handle M4A files but can handle OGG. In contrast, audioread can handle both.
Librosa tries soundfile first and, on error, falls back to audioread. It also allows resampling and converts to mono.
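That fallback behaviour can be sketched with an injectable-backend helper (a hypothetical function, not librosa's actual code): try the primary backend (soundfile), and on failure warn and use the fallback (audioread).

```python
import warnings

def load_with_fallback(path, primary, fallback):
    # Try the fast backend first (soundfile); if it cannot decode the file
    # (e.g. M4A), warn once and fall back to the slower backend (audioread).
    try:
        return primary(path)
    except Exception:
        warnings.warn(f"Primary backend failed for {path}; trying fallback.")
        return fallback(path)
```

In librosa this fallback path is exactly where the warning discussed above originates, so every M4A file in the dataset triggers it.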
> Is resampling needed? If yes, isn't it expensive to resample on the fly when it could be done once beforehand?
Fixing the sample rate would mean creating a copy of the dataset. For my subset it is not a big deal; in fact, I have a recipe that makes a 16 kHz WAV copy of the corpus in stage 0. But personally, I don't like to keep several copies of my datasets.
Ok, understood. What about torchaudio, out of curiosity?
I would also resample beforehand for simplicity, but this is a choice.
If we merge this, we'll need librosa as a dependency. I guess we couldn't avoid that ^^
Let's add the dependency then, I'll let you do it.
Librosa is also used in AVSpeech, and they have it as a requirement in the recipe.
I could do the same.
asteroid/data/dampvsep_dataset.py
Outdated
def __init__(self, root_path, task, split, samples_per_track=1,
             random_segments=False, sample_rate=16000,
             segment=None, silence_in_segment=None, num_workers=None, norm=None):
Would need docs for the split and norm arguments.
Oh, I missed these.
asteroid/data/dampvsep_dataset.py
Outdated
dataset_name = 'DAMP-VSEP'

def __init__(self, root_path, task, split, samples_per_track=1,
How about having defaults for task and split?
Positional args are always confusing in the long term.
OK, will add task='enh_both' and split=None.
asteroid/data/dampvsep_dataset.py
Outdated
v: ndarray, vocal.
b: ndarray, background.
snr: float, SNR. Default=0
Outputs:
Returns
True
parser.add_argument(
    '--root', type=str, help='root path of dataset',
    default='/media/gerardo/TOSHIBA/DataSets/DAMP/DAMP-VSEP'
We try to remove all user-specific absolute paths.
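One way to apply that (a sketch, not the PR's final code): make `--root` required instead of giving it a machine-specific default, so the script fails with a clear message when the path is missing.

```python
import argparse

parser = argparse.ArgumentParser()
# Require the dataset root explicitly rather than defaulting to a
# user-specific absolute path.
parser.add_argument('--root', type=str, required=True,
                    help='Root path of the DAMP-VSEP dataset')

args = parser.parse_args(['--root', '/data/DAMP-VSEP'])
```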
…URLS_HASHTABLE Signed-off-by: Gerardo Roa Dabike <gerardo.roa@gmail.com>
… MODELS_URLS_HASHTABLE" This reverts commit d81dec7 Signed-off-by: Gerardo Roa Dabike <gerardo.roa@gmail.com>
…URLS_HASHTABLE Signed-off-by: Gerardo Roa Dabike <gerardo.roa@gmail.com> (cherry picked from commit d81dec7)
… MODELS_URLS_HASHTABLE" This reverts commit 0423858 Signed-off-by: Gerardo Roa Dabike <gerardo.roa@gmail.com>
…model in MODELS_URLS_HASHTABLE"" This reverts commit c72eac1 Signed-off-by: Gerardo Roa Dabike <gerardo.roa@gmail.com>
…retrain model in MODELS_URLS_HASHTABLE""" This reverts commit a47921c Signed-off-by: Gerardo Roa Dabike <gerardo.roa@gmail.com>
…nhboth pretrain model in MODELS_URLS_HASHTABLE"""" This reverts commit f2175db Signed-off-by: Gerardo Roa Dabike <gerardo.roa@gmail.com>
…P-VSEP_enhboth pretrain model in MODELS_URLS_HASHTABLE""""" This reverts commit 56b0c61 Signed-off-by: Gerardo Roa Dabike <gerardo.roa@gmail.com>
This reverts commit 84272b1 Signed-off-by: Gerardo Roa Dabike <gerardo.roa@gmail.com>
This reverts commit 274d9f4 Signed-off-by: Gerardo Roa Dabike <gerardo.roa@gmail.com>
This reverts commit 89acbeb Signed-off-by: Gerardo Roa Dabike <gerardo.roa@gmail.com>
This reverts commit d776041 Signed-off-by: Gerardo Roa Dabike <gerardo.roa@gmail.com>
Signed-off-by: Gerardo Roa Dabike <gerardo.roa@gmail.com>
I think this is ready to merge. I'll go through the changes again after the last fixes.
Thanks again!
@@ -150,7 +156,7 @@ def get_tracks(self):

     @staticmethod
     def _build_metadata(inputs):
-        sample, root, sample_rate = inputs
+        sample, root, sample_rate, snr = inputs
It looks weird to unpack like this, but it's a private method, so OK, why not.
""" | ||
|
||
dataset_name = 'DAMP-VSEP' | ||
|
||
def __init__(self, root_path, task, split, samples_per_track=1, | ||
def __init__(self, root_path, task='enh_both', split=None, samples_per_track=1, |
What happens if split is None then?
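One defensive pattern for that case (entirely hypothetical; the split names below are placeholders, not the dataset's actual ones): resolve `split` early in `__init__` so that `None` has a documented meaning and unknown values fail loudly.

```python
VALID_SPLITS = ('train', 'valid', 'test')  # placeholder split names

def resolve_split(split):
    # Treat None as "all splits" and reject anything unrecognised up front,
    # so a bad default cannot silently load the wrong data.
    if split is None:
        return VALID_SPLITS
    if split not in VALID_SPLITS:
        raise ValueError(f"Unknown split {split!r}; expected one of {VALID_SPLITS}")
    return (split,)
```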
Hey @groadabike, could we get this in?
Hey @mpariente, honestly, I did some modifications and I am running some experiments.
Hi @mpariente, I didn't have much time to do all the modifications I wanted.
Thanks for getting back! Yes, it's completely fine.
Closing the PR. It will be resubmitted with updated code and recipe.
Signed-off-by: Gerardo Roa Dabike gerardo.roa@gmail.com
This is my current version of the DAMP-VSEP dataset.
It relies on pre-constructed files with the details of the train, valid and test sets.
If you run dampvsep/test_dataloader/test_dataloader.py, the first run will construct and save the metadata.