Adding loader for BAF dataset #583

Merged: 5 commits, Mar 15, 2023

guillemcortes
Collaborator

Adding loader for BAF

Please use the following title: "Adding loader for MyDATASET". If your pull request is work in progress, change your title to "[WIP] Adding loader for MyDATASET" to avoid reviews while the loader is not ready.

Description

Please include the following information at the top level docstring for the dataset's module mydataset.py:

  • Describe annotations included in the dataset
  • Indicate the size of the dataset (e.g. number of files and duration in hours)
  • Mention the origin of the dataset (e.g. creator, institution)
  • Describe the type of music included in the dataset
  • Indicate any relevant papers related to the dataset
  • Include a description about how the data can be accessed and the license it uses (if applicable)

Dataset loaders checklist:

  • Create a script in scripts/, e.g. make_my_dataset_index.py, which generates an index file.
  • Run the script on the canonical version of the dataset and save the index in mirdata/indexes/ e.g. my_dataset_index.json.
  • Create a module in mirdata, e.g. mirdata/my_dataset.py
  • Create tests for your loader in tests/datasets/, e.g. test_my_dataset.py
  • Add your module to docs/source/mirdata.rst and docs/source/table.rst
  • Run tests/test_full_dataset.py on your dataset.
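
As a rough illustration of the first two checklist items, an index script might look something like the sketch below. The index layout (a "version" key plus a "tracks" mapping from track id to a [relative path, MD5 checksum] pair), the function names `md5`, `make_index`, and `write_index`, and the `.wav` extension are all assumptions made for this example; they are not taken from this PR, and the actual BAF index script may be organized differently.

```python
import hashlib
import json
import os


def md5(file_path, chunk_size=65536):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    hasher = hashlib.md5()
    with open(file_path, "rb") as fhandle:
        for chunk in iter(lambda: fhandle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def make_index(data_path, version="1.0", audio_ext=".wav"):
    """Build an index dict for every audio file found under data_path.

    Assumed layout (hypothetical, for illustration only):
    {"version": ..., "tracks": {track_id: {"audio": [rel_path, checksum]}}}
    """
    tracks = {}
    for root, _, files in os.walk(data_path):
        for name in sorted(files):
            if not name.endswith(audio_ext):
                continue
            full_path = os.path.join(root, name)
            # Use the file name without its extension as the track id.
            track_id = os.path.splitext(name)[0]
            rel_path = os.path.relpath(full_path, data_path)
            tracks[track_id] = {"audio": [rel_path, md5(full_path)]}
    return {"version": version, "tracks": tracks}


def write_index(index, out_path):
    """Save the index dict as a JSON file."""
    with open(out_path, "w") as fhandle:
        json.dump(index, fhandle, indent=2)
```

Storing paths relative to the dataset root keeps the index independent of where the dataset is downloaded, and the checksum lets a validation step detect missing or altered files.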

If your dataset is not fully downloadable there are a few extra steps you should follow:

  • Contact the mirdata organizers by opening an issue or PR so we can discuss how to proceed with the closed dataset.
  • Show that the version used to create the checksum is the "canonical" one, either by getting the version from the dataset creator, or by verifying equivalence with several other copies of the dataset.
  • Make sure someone has run pytest -s tests/test_full_dataset.py --local --dataset my_dataset once on your dataset locally and confirmed it passes

Please-do-not-edit flag

To reduce friction, we will make commits on top of contributors' pull requests by default unless they use the please-do-not-edit flag. If you don't want this to happen, don't forget to add the flag when you start your pull request.

@codecov

codecov bot commented Mar 7, 2023

Codecov Report

Merging #583 (2834da4) into master (0620b8c) will increase coverage by 0.04%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #583      +/-   ##
==========================================
+ Coverage   96.84%   96.88%   +0.04%     
==========================================
  Files          56       57       +1     
  Lines        6717     6816      +99     
==========================================
+ Hits         6505     6604      +99     
  Misses        212      212              
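
The percentages in the table can be reproduced from the Hits and Lines rows. The short sketch below recomputes them; the helper names `coverage_percent` and `truncate2` are made up for this example. With these particular numbers, the two-decimal figures in the report match truncation rather than rounding, which is only an observation about this table and not a statement of how Codecov computes its output.

```python
import math


def coverage_percent(hits, lines):
    """Coverage as a percentage of lines that were hit."""
    return 100.0 * hits / lines


def truncate2(value):
    """Truncate a non-negative value to two decimal places (no rounding)."""
    return math.floor(value * 100) / 100


# Values copied from the Hits and Lines rows of the report above.
before = coverage_percent(6505, 6717)  # master
after = coverage_percent(6604, 6816)   # this pull request

# The diff adds 99 lines and 99 hits, so every new line is covered.
new_lines = 6816 - 6717
new_hits = 6604 - 6505
diff_coverage = coverage_percent(new_hits, new_lines)

print(truncate2(before), truncate2(after), truncate2(after - before))
print(diff_coverage)
```

Because the number of misses stays at 212 while the total line count grows, overall coverage rises slightly even though no previously uncovered line gained a test.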

Collaborator

@genisplaja genisplaja left a comment

Hey @guillemcortes, I am just leaving a few tiny comments but the loader looks really nice. Good coverage, docs rendering correctly, and tests passing :) I'll invite @nkundiushuti, @magdalenafuentes, and @harshpalan to take a look, and IMO we can merge!

tests/datasets/test_baf.py (outdated, resolved)
mirdata/datasets/baf.py (outdated, resolved)
mirdata/datasets/baf.py (outdated, resolved)
@guillemcortes
Collaborator Author

Hi @genisplaja, thanks for your comments. I've made the corresponding modifications. I've also improved the csv2pandas handling and atomized the function; I think it's better now. I've also included the corresponding test. Would you mind checking it? Thanks!

@genisplaja
Collaborator

genisplaja commented Mar 9, 2023

That's nice! That is ready to go in my opinion. Thanks @guillemcortes :) Maybe @harshpalan or @magdalenafuentes could take a look? 🙏🏼🙏🏼🙏🏼

Collaborator

@harshpalan harshpalan left a comment

Hi, @guillemcortes @genisplaja sorry for the delayed response. This looks good. I'm approving the changes, please go ahead and merge it. Thanks.

@genisplaja genisplaja merged commit 8c84e50 into mir-dataset-loaders:master Mar 15, 2023

3 participants