Adding loader for DCASE 2023 Task-2 #134

Merged: 18 commits merged into main from tanmay/dcasetask2 on Dec 1, 2023

@tanmayy24 (Collaborator) commented on Oct 23, 2023

Title

Please use the following title: "Adding loader for MyDATASET". If your pull request is work in progress, change your title to "[WIP] Adding loader for MyDATASET" to avoid reviews while the loader is not ready.

Description

Please include the following information in the top-level docstring of the dataset's module, mydataset.py:

  • Describe annotations included in the dataset
  • Indicate the total duration of the dataset in hours, and (optionally) also list the number of individual files
  • Mention the origin of the dataset (e.g. creator, institution)
  • Describe the type of audio included in the dataset
  • Indicate any relevant papers related to the dataset
  • Include a description of how the data can be accessed and the license it uses (if applicable)
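
As an illustration of the list above, here is a minimal sketch of what such a docstring could look like, with a small helper that reports which items are still missing. The label names, the `MODULE_DOCSTRING` constant, and the `missing_labels` helper are all hypothetical and not part of soundata; the angle-bracket values are placeholders to be filled in by the contributor.

```python
# Hypothetical sketch: a skeleton docstring for soundata/my_dataset.py.
# Angle-bracket values are placeholders, and the section labels are
# illustrative rather than mandated by soundata.
MODULE_DOCSTRING = """MyDATASET Dataset Loader

Annotations: <what is annotated>
Duration: <total hours> hours (<number of files> files)
Origin: <creator, institution>
Audio: <type of audio>
Papers: <relevant publications>
Access and license: <how to obtain the data, license name>
"""

# One label per item in the description checklist.
REQUIRED_LABELS = (
    "Annotations:",
    "Duration:",
    "Origin:",
    "Audio:",
    "Papers:",
    "Access and license:",
)


def missing_labels(docstring):
    """Return the required labels that do not appear in the docstring."""
    return [label for label in REQUIRED_LABELS if label not in docstring]


print(missing_labels(MODULE_DOCSTRING))
```

In a real module the text would sit at the very top of the file as the module docstring rather than in a constant; the constant is used here only so the check can run on its own.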

Dataset loaders checklist:

  • Create a script in scripts/, e.g. make_my_dataset_index.py, which generates an index file.
  • Run the script on the canonical version of the dataset and save the index in soundata/indexes/ e.g. my_dataset_index.json.
  • Create a module in soundata, e.g. soundata/my_dataset.py.
  • Create tests for your loader in tests/, e.g. test_my_dataset.py.
  • Add your module to docs/source/soundata.rst and docs/source/quick_reference.rst.
  • Run black, flake8 and mypy (see Running your tests locally).
  • Run tests/test_full_dataset.py on your dataset.
  • Check that codecov coverage does not decrease.
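
The first two checklist items (an index script plus a saved index file) could be sketched roughly as follows. This is a minimal illustration, not the script used in this PR: the function names are hypothetical, and the `"version"`/`"clips"` layout with `[relative_path, checksum]` pairs is an assumption that should be checked against the existing files in soundata/indexes/.

```python
import hashlib
import json
import os


def md5(file_path, chunk_size=65536):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    hasher = hashlib.md5()
    with open(file_path, "rb") as fhandle:
        for chunk in iter(lambda: fhandle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def make_index(data_path, version="1.0"):
    """Map every .wav file under data_path to [relative_path, checksum].

    The index layout is an assumption made for this sketch.
    """
    index = {"version": version, "clips": {}}
    for root, _dirs, files in sorted(os.walk(data_path)):
        for name in sorted(files):
            if not name.endswith(".wav"):
                continue
            full_path = os.path.join(root, name)
            # Store paths relative to the dataset root, with forward slashes.
            rel_path = os.path.relpath(full_path, data_path).replace(os.sep, "/")
            clip_id = os.path.splitext(rel_path)[0]
            index["clips"][clip_id] = {"audio": [rel_path, md5(full_path)]}
    return index


def write_index(index, out_path):
    """Save the index as indented JSON."""
    with open(out_path, "w") as fhandle:
        json.dump(index, fhandle, indent=2)
```

Under these assumptions, a call such as `write_index(make_index("/path/to/my_dataset"), "soundata/indexes/my_dataset_index.json")` would produce the file mentioned in the checklist. Paths are stored relative to the dataset root so the index stays valid wherever the data is downloaded.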

If your dataset is not fully downloadable, there are some extra steps you should follow:

  • Contact the soundata organizers by opening an issue or PR so we can discuss how to proceed with the closed dataset.
  • Show that the version used to create the checksum is the "canonical" one, either by getting the version from the dataset creator, or by verifying equivalence with several other copies of the dataset.
  • Make sure someone has run pytest -s tests/test_full_dataset.py --local --dataset my_dataset once on your dataset locally and confirmed it passes.

Please-do-not-edit flag

To reduce friction, we will make commits on top of contributors' pull requests by default unless they use the please-do-not-edit flag. If you don't want this to happen, don't forget to add the flag when you start your pull request.

codecov bot commented on Oct 23, 2023

Codecov Report

Merging #134 (4a41d2d) into main (5cb1204) will increase coverage by 0.03%.
The diff coverage is 100.00%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #134      +/-   ##
==========================================
+ Coverage   98.69%   98.72%   +0.03%     
==========================================
  Files          28       29       +1     
  Lines        2447     2515      +68     
==========================================
+ Hits         2415     2483      +68     
  Misses         32       32              

@guillemcortes (Collaborator) commented:

Hi! Is it ready to review? @tanmayy24

@tanmayy24 (Collaborator, Author) commented:

Yes! It's ready for review now!

@tanmayy24 changed the title from "[WIP] Adding loader for DCASE 2023 Task-2" to "Adding loader for DCASE 2023 Task-2" on Oct 27, 2023

@genisplaja (Collaborator) left a comment

Again, thanks @tanmayy24, looks good to me! Minor changes requested, basically a minor documentation thing and a missing test, let's see if it can be added. The rest looks good to me!

@@ -19,6 +19,13 @@ Dataset Loaders
.. automodule:: soundata.datasets.marco
    :members:
    :inherited-members:

DCASE23-Task2
^^^^^^^^

Finish line!

Comment on lines 380 to 384
# Check for file existence
if not os.path.exists(metadata_add_train_path):
    raise FileNotFoundError(
        f"Additional training metadata for {machine} not found. Did you run .download()?"
    )

Could we test that?
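
One way such a test might look is sketched below. The loader's real function and fixture names are not shown in this conversation, so `check_additional_train_metadata` is a hypothetical stand-in that only reproduces the existence check quoted above; the test uses pytest's built-in `tmp_path` fixture to point at a file that does not exist.

```python
import os

import pytest


def check_additional_train_metadata(metadata_add_train_path, machine):
    """Hypothetical stand-in for the loader code under review."""
    # Check for file existence
    if not os.path.exists(metadata_add_train_path):
        raise FileNotFoundError(
            f"Additional training metadata for {machine} not found. Did you run .download()?"
        )
    return metadata_add_train_path


def test_missing_additional_train_metadata(tmp_path):
    # tmp_path is an empty temporary directory, so this file cannot exist.
    missing = os.path.join(str(tmp_path), "does_not_exist.csv")
    with pytest.raises(FileNotFoundError, match="Did you run"):
        check_additional_train_metadata(missing, "fan")
```

In the actual test suite the same pattern would call the real loader method with a data home that lacks the additional training metadata, which should cover the `raise` branch.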

@tanmayy24 force-pushed the tanmay/dcasetask2 branch 2 times, most recently from 970ac13 to b40286f, on December 1, 2023 01:43
@tanmayy24 merged commit 73f26b9 into main on Dec 1, 2023
11 checks passed
@magdalenafuentes deleted the tanmay/dcasetask2 branch on February 6, 2024 21:45