Adding audio_path to DATA section #62

Closed
bagustris opened this issue Aug 30, 2023 · 8 comments

Comments

@bagustris
Collaborator

Currently, the filename column in the database file (CSV) must contain a full path rather than just a basename. In most cases, however, the dataset provider only supplies a file listing the basenames (for platform independence). I would therefore like to request adding audio_path to the DATA section of the INI file.

This can be optional: if the option is not given, Nkululeko looks up the file path in the given CSV file (the current behavior).

Example usage (see train.audio_path and dev.audio_path):

[DATA]
databases = ['train', 'test', 'dev']
train = ./data/ravdess/ravdess_train.csv
train.type = csv
train.absolute_path = False
train.split_strategy = train
train.audio_path = ./data/ravdess/ravdess_speech
dev = ./data/ravdess/ravdess_dev.csv
dev.type = csv
dev.absolute_path = False
dev.split_strategy = train
dev.audio_path = ./data/ravdess/ravdess_speech

One important note: Nkululeko should be able to find audio files inside subdirectories of the given audio_path, since database creators sometimes split their audio files across subdirectories instead of keeping them in a single directory.
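To illustrate the subdirectory requirement, here is a minimal sketch of how basenames from a CSV could be resolved against an audio_path by searching it recursively. This is not Nkululeko's implementation; the function names, the set of audio extensions, and the "first match wins" rule for duplicate basenames are all assumptions made for the example.

```python
from pathlib import Path

# Assumption: which file suffixes count as audio files.
AUDIO_EXTENSIONS = {".wav", ".flac", ".mp3"}


def build_audio_index(audio_path):
    """Map every audio basename found under audio_path (recursively) to its path."""
    index = {}
    for path in sorted(Path(audio_path).rglob("*")):
        if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS:
            # Assumption: if a basename occurs twice, keep the first match.
            index.setdefault(path.name, path)
    return index


def resolve_files(basenames, audio_path):
    """Return the path for each basename, raising if any file is missing."""
    index = build_audio_index(audio_path)
    missing = [name for name in basenames if name not in index]
    if missing:
        raise FileNotFoundError(f"not found under {audio_path}: {missing}")
    return [index[name] for name in basenames]
```

Building the index once keeps the lookup cheap even for large datasets, at the cost of assuming that basenames are unique across subdirectories.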

Actually, I want to evaluate my experiment here without much effort with Nkululeko :)

@felixbur
Owner

But you can have relative and absolute paths in nkululeko?
Relative meaning that the path starts from the database root location.
Isn't that sufficient?

@bagustris
Collaborator Author

bagustris commented Sep 1, 2023

Relative and absolute paths define the database path (e.g., the CSV file), not the content of the file column inside the CSV file, right?

Suppose I have train.csv containing the following:

file, emotion
train_001.wav, fear
train_002.wav, sad
train_003.wav, happy
...
train_100.wav, neutral

With the current configuration, is it possible to run nkululeko.nkululeko on this file without any pre-processing?

Also, there is currently a target option for the labels, e.g., target=emotion. How would Nkululeko recognize the header of the audio path column in the CSV file? I am just assuming that all datasets use file as the header. If that is the case, we also need a way to specify the audio header, similar to target.
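As a rough sketch of what a configurable audio header could look like, the snippet below reads a CSV like the one above with the audio column name passed in alongside the label column. The function name and the file_column parameter are hypothetical and are not existing Nkululeko options; only target mirrors the option mentioned above.

```python
import csv
import io


def read_samples(csv_text, file_column="file", target="emotion"):
    """Return (filename, label) pairs from CSV text.

    file_column is a hypothetical option naming the audio path header;
    target names the label header.
    """
    # skipinitialspace drops the blanks that follow each comma in the example CSV.
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    return [(row[file_column].strip(), row[target].strip()) for row in reader]


sample = "file, emotion\ntrain_001.wav, fear\ntrain_002.wav, sad\n"
pairs = read_samples(sample)
```

A dataset that names its column differently, say path, would then only need file_column="path" rather than a rewritten CSV.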

@bagustris
Collaborator Author

I checked: this can be accomplished by setting the root directory under the [EXP] section. It needs to be made clearer in the INI file and the documentation.

@felixbur
Owner

felixbur commented Sep 4, 2023

I checked: this can be accomplished by setting the root directory under the [EXP] section. It needs to be made clearer in the INI file and the documentation.

That's actually the root for the experiment results, not necessarily the databases!

@felixbur
Owner

felixbur commented Sep 4, 2023

And, yeah, the documentation is still mainly my blog; that's a weak point.
So, you have to set the root directory (of the data), and you can specify the path to the audio files from there. But if it would be useful, adding another key for an (optional) audio_path, meaning the path between the database root and the audio files root, would of course be no problem.
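A small sketch of that path composition, assuming the optional audio_path sits between the database root and the filename from the CSV. The function and parameter names are illustrative only, not the actual Nkululeko code.

```python
from pathlib import Path


def resolve_audio_file(db_root, file_entry, audio_path=None):
    """Join database root, optional audio_path, and the CSV file entry."""
    entry = Path(file_entry)
    if entry.is_absolute():
        # Absolute entries are used as given.
        return entry
    base = Path(db_root)
    if audio_path:
        # Hypothetical optional key: path between database root and audio root.
        base = base / audio_path
    return base / entry
```

Without audio_path the entry is simply taken relative to the database root, which matches the relative-path behavior described above.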

@bagustris
Collaborator Author

I was wrong when I stated that defining the root dir would solve this issue. I tried it on the TESS dataset (still in my branch), and it doesn't work: the full path is still needed in the file column of the CSV file. So the feature request is still valid.

This will also help if the dataset directory is not inside Nkululeko's data directory (of course, one can simply create softlinks).

@felixbur
Owner

felixbur commented Sep 4, 2023

OK, I'll add this later today. The data directory does NOT have to be inside the nkululeko root! Actually, a Nkululeko root is NOT necessary at all!

@felixbur
Owner

felixbur commented Sep 4, 2023
