-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add AudioFolder packaged loader #4530
Add AudioFolder packaged loader #4530
Conversation
…ferring labels is not default)
The documentation is not available anymore as the PR was closed or merged. |
…ust in case)" This reverts commit d0b2592.
@lhoestq @mariosasko I don't know what to do with the test, do you have any ideas? :) |
also it's passed in |
…ts into add-audio-folder-new
…ts into add-audio-folder-new
If the error only happens on 3.6, maybe #4460 can help ^^' It seems to work in 3.7 on the windows CI
I think it a missed opportunity to have a consistent API between imagefolder and audiofolder, since they do everything the same way. Can you give more details why you think we should drop the labels by default ? |
Considering audio classification in audio is not as common as image classification in image, I'm ok with having different config defaults as long as they are properly documented (check Papers With Code for stats and compare the classification numbers to the other tasks, do this for both modalities) Also, WDYT about creating a generic folder loader that ImageFolder and AudioFolder then subclass to avoid having to update both of them when there is something to update/fix? |
@lhoestq I think it doesn't change the API itself, it just doesn't infer labels by default, but you can still set
If users load this dataset with At the same time, But this is definitely should be explained in the docs, which I've forgotten to update... I'll add this section soon. Also +to the generic loader, will work on it. |
If a metadata.jsonl file is present, then it doesn't have to infer the labels I agree. Note that this is already the case for imagefolder ;) in your case Labels are only inferred if there are no metadata.jsonl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @polinaeterna and @mariosasko this is super cool ! IMO once the class name is settled we're good to go. We can re-organize the tests in a subsequent PR I think :)
I also added a comment about the default for drop_labels
again. I think that if the documentation focuses first on audio+metadata, it's ok to have the same default as ImageFolder
@@ -0,0 +1,439 @@ | |||
import importlib |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on this !
for module in parent_builder_modules: | ||
extend_module_for_streaming(module, use_auth_token=builder.use_auth_token) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool ! Sounds good to me
class AudioFolderConfig(folder_builder.FolderBuilderConfig): | ||
"""Builder Config for AudioFolder.""" | ||
|
||
drop_labels: bool = True # usually we don't need labels as classification is not the main audio task |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to say that I'm still in favor of setting it to None by default for consistency with imagefolder ^^'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well I don't have a strong opinion here anymore :D
if we set drop_labels=None
by default as you suggested, it might be confusing in cases when users provide only audio files, without metadata (or with broken metadata?). this is probably quite unlikely, so I'm ok with your suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mariosasko what do you think about that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's good to be consistent, so I agree with @lhoestq.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm ok with that but it makes explaining things in documentation a bit more complicated...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i've updated the docs, tried to make it simpler. still not sure that this logic with default None value of "drop_labels" is clear but I guess we'd better see what users say.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lhoestq @mariosasko what do you think about it now? 🤗
also, don't you know what's happening with the CI? why it takes forever and finally some jobs are cancelled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks all good to me ! :D
Not sure what's happening with the CI though. I just re-launched one job to see if it was caused by a bug in github actions or the windows runners
>>> dataset = load_dataset("audiofolder", data_files="https://s3.amazonaws.com/datasets.huggingface.co/SpeechCommands/v0.01/v0.01_test.tar.gz") | ||
``` | ||
|
||
## AudioFolder with metadata |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I would put this section first, since this is the main use case anyway.
docs/source/audio_load.mdx
Outdated
@@ -55,3 +55,126 @@ If you only want to load the underlying path to the audio dataset without decodi | |||
'transcription': 'I would like to set up a joint account with my partner'} | |||
``` | |||
|
|||
## AudioFolder | |||
|
|||
You can also load a dataset with an `AudioFolder` dataset builder. It does not require writing a custom dataloader, making it useful for quickly loading audio data. Your dataset structure might look like: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can remove this part, since it's already presented later in "AudioFolder with labels"
Co-authored-by: Mario Šaško <mario@huggingface.co>
…n't introduce any new params to be more consistent with usual datasets script as it doesn't do anything here anyway
…king ¯\_(ツ)_/¯" This reverts commit fc41118.
…odule Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
…ts into add-audio-folder-new
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM !
will close #3964
AudioFolder is almost identical to ImageFolder except for inferring labels is not the default behavior (
drop_labels
is set to True in config), the option of inferring them is preserved though.The weird thing is happening with the
test_data_files_with_metadata_and_archives
whenstreaming
isTrue
. Here is the log from the CI:I hadn't been able to reproduce this locally until I created the same test environment (I mean with
pip install .[tests]
) with python3.6. The same env but with python3.8 passes the test! I didn't manage to figure out what's wrong, I also tried simply to replace the test wav file and still got the same error. Versions ofsoundfile
,librosa
andlibsndfile
are identical. Might it be something with zip compression? Sounds weird but I don't have any other ideas...TODO: