Adding loader for 3D-MARCo #71
Adding my work toward the Please advise on how to properly process and test this kind of metadata. By default, soundata checks that the properties and annotations are loaded from a metadata file, not from the
Codecov Report
@@            Coverage Diff             @@
##           master      #71      +/-   ##
==========================================
+ Coverage   97.02%   97.14%   +0.12%
==========================================
  Files          17       18       +1
  Lines        1377     1437      +60
==========================================
+ Hits         1336     1396      +60
  Misses         41       41
@iranroman looks good to me! Just left minor comments.
As part of this PR we could reduce the size of the eigenscape test file from 2.5s to 1s, since now it's quite heavy (>8MB).
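One way the test file could be shortened is sketched below. This is only an illustration, assuming the test audio is an uncompressed PCM WAV that Python's standard `wave` module can read; the function name `trim_wav` and both paths are made up for the example. Any duration assertion in the corresponding test would need updating to match the new length.

```python
import wave


def trim_wav(src_path, dst_path, seconds=1.0):
    """Copy the first `seconds` of a PCM WAV file to `dst_path`.

    Hypothetical helper; returns the number of frames written.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        # Never request more frames than the source contains.
        n_frames = min(src.getnframes(), int(round(seconds * src.getframerate())))
        frames = src.readframes(n_frames)

    with wave.open(dst_path, "wb") as dst:
        # Channel count, sample width and sample rate are kept as-is;
        # the frame count in the header is corrected when the file closes.
        dst.setparams(params)
        dst.writeframes(frames)
    return n_frames
```

Because the channel count and sample width are untouched, the file size should shrink roughly in proportion to the duration.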
soundata/datasets/eigenscape.py
Outdated
time (str): time when the audio signal was recorded
date (str): date when the audio signal was recorded
additional information (str): notes included by the dataset
    authors with otherdetails relevant to the specific clip
typo
soundata/datasets/marco.py
Outdated
source_label = self._clip_metadata.get("source_label")
if source_label is None:
    self.source_label = None
line not covered by tests
removing since there will always be a source_label.
soundata/datasets/marco.py
Outdated
)

if not os.path.exists(json_path):
    raise FileNotFoundError("Metadata not found")
line not covered by tests
Also removing. Since the metadata is extracted from the index.json and that comes built-in with soundata, this file should be there if soundata was installed properly.
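For reference, had the check been kept, the uncovered line could have been exercised with a test along these lines. This is a sketch only: `load_clip_metadata` is a hypothetical stand-in for the loader code quoted in the review, not a soundata function, and the test suite itself would more likely use `pytest.raises` than the plain `try`/`except` shown here.

```python
import json
import os
import tempfile


def load_clip_metadata(json_path):
    """Hypothetical stand-in for the metadata-loading code under review."""
    if not os.path.exists(json_path):
        raise FileNotFoundError("Metadata not found")
    with open(json_path) as fhandle:
        return json.load(fhandle)


def test_missing_metadata_raises():
    # Point the loader at a path that is guaranteed not to exist.
    with tempfile.TemporaryDirectory() as tmp_dir:
        missing = os.path.join(tmp_dir, "index.json")
        try:
            load_clip_metadata(missing)
        except FileNotFoundError as err:
            assert str(err) == "Metadata not found"
        else:
            raise AssertionError("expected FileNotFoundError")
```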
soundata/datasets/marco.py
Outdated
for path_filename in all_paths_filenames:

    clip_id = path_filename

    path, filename = path_filename.split("/")

    source_label = path

    clip_metadata = filename.split("_")

    # remove arbitrary clip numbering used by dataset authors
    clip_metadata = [
        data for data in clip_metadata if data != "" and data[0] != "0"
    ]

    microphone_info = clip_metadata[1:]

    if "deg" in clip_metadata[0]:
        source_angle = "".join(clip_metadata[0].partition("deg")[:2])
nit: maybe remove in-between empty lines?
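A compact version of the same parsing, without the in-between blank lines, might look like the sketch below. The function name `parse_clip_id`, the returned dictionary, and the example clip id are illustrative assumptions. The quoted diff is cut off after the `"deg"` branch, so setting `source_angle` to `None` for clips without a `"deg"` token is also an assumption, not the PR's actual behavior.

```python
def parse_clip_id(path_filename):
    """Split a 'folder/filename' clip id into its metadata fields."""
    path, filename = path_filename.split("/")
    # Drop empty tokens and the zero-padded clip numbering used by the authors.
    tokens = [t for t in filename.split("_") if t != "" and t[0] != "0"]
    # Assumption: clips whose first token lacks "deg" carry no source angle.
    source_angle = None
    if "deg" in tokens[0]:
        # Keep everything up to and including "deg", e.g. "+30deg".
        source_angle = "".join(tokens[0].partition("deg")[:2])
    return {
        "clip_id": path_filename,
        "source_label": path,
        "source_angle": source_angle,
        "microphone_info": tokens[1:],
    }
```

With a made-up clip id such as `"some_source/+30deg_011_OCT3D_2_FR"`, this yields `source_label == "some_source"`, `source_angle == "+30deg"` and `microphone_info == ["OCT3D", "2", "FR"]`, which matches the shape asserted in `tests/datasets/test_marco.py`.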
tests/datasets/test_marco.py
Outdated
# validate metadata
assert jam.file_metadata.duration == 1.0
assert jam.sandbox.microphone_info == ["OCT3D", "2", "FR"]
print(jam.annotations)
remove
tests/test_utils.py
Outdated
@@ -17,6 +17,7 @@ def run_clip_tests(clip, expected_attributes, expected_property_types):

    # test clip attributes
    for attr in clip_attr["attributes"]:
        print(attr)
remove
Thanks a lot for your comments @magdalenafuentes. I have addressed them and pushed my new commits.
@iranroman great! Ready to merge!
3D-MARCo
Description
Please include the following information at the top level docstring for the dataset's module mydataset.py:
Dataset loaders checklist:
scripts/, e.g. make_my_dataset_index.py, which generates an index file.
soundata/indexes/ e.g. my_dataset_index.json.
soundata/my_dataset.py
tests/, e.g. test_my_dataset.py
docs/source/soundata.rst and docs/source/quick_reference.rst
tests/test_full_dataset.py on your dataset.
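As a rough illustration of the first checklist item, an index-generating script could be structured as below. This is a sketch under assumptions: the index layout (a `version` field plus a `clips` mapping from clip id to an `[relative_path, md5]` pair under `"audio"`), the `.wav` filter, and the function names are not taken from this PR and should be checked against the existing scripts in `scripts/`.

```python
import hashlib
import json
import os


def md5(file_path):
    """Return the MD5 checksum of a file, read in small chunks."""
    hash_md5 = hashlib.md5()
    with open(file_path, "rb") as fhandle:
        for chunk in iter(lambda: fhandle.read(4096), b""):
            hash_md5.update(chunk)
    return hash_md5.hexdigest()


def make_index(data_path, version="1.0"):
    """Build an index dictionary for every .wav file under `data_path`."""
    clips = {}
    for root, _dirs, files in os.walk(data_path):
        for name in sorted(files):
            if not name.endswith(".wav"):
                continue
            full_path = os.path.join(root, name)
            # Store paths relative to the dataset root, with forward slashes.
            rel_path = os.path.relpath(full_path, data_path).replace(os.sep, "/")
            clip_id = os.path.splitext(rel_path)[0]
            # Assumed layout: clip id -> {"audio": [relative path, checksum]}
            clips[clip_id] = {"audio": [rel_path, md5(full_path)]}
    return {"version": version, "clips": clips}


def write_index(data_path, index_path):
    """Write the index for `data_path` to `index_path` as JSON."""
    with open(index_path, "w") as fhandle:
        json.dump(make_index(data_path), fhandle, indent=2)
```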