-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Rewrite formats as sub-package of classes, rewrite API #161
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
- add `interface` sub-pacakge with base.py + defines `BaseFormat` - add seq sub-package in `interface` with base.py + defines `SeqLike` interface - add bbox sub-package in `interface` with base.py + definfes `BBoxLike` interface - import BaseFormat and sub-packages in interface/__init__.py - import inside __init__ in Transcriber to avoid circular imports
- add formats/__init__.py - with `as_list` and `by_name` functions that use module-level constant FORMAT constant built dynamically at imported-time from the `__all__`s of `crowsetta.formats.seq` and `crowsetta.formats.bbox`
because current function that converts everything to a list of paths has unexpected side effects. Just want a validator that tests whether the extension is what we expect it to be, that can be used by `Annotation.from_file` methods - no longer accept a list - instead accept "PathLike" as defined in `crowsetta.typing` - no longer return a list, or anything; no side effects - simply check that the paths `endswith` one of the string extensions passed in - revise comment explaining why we use `endswith` not `suffix` for clarity
goal is parametrize `a_notmat` so that any test using it runs multiple test cases with every .not.mat file. https://hynek.me/til/parametrized-pytest-fixtures/ define paths as constants outside fixtures so other modules can access them outside fixtures, and then finally pass one of the constants into a fixture `a_notmat_path` to parametrize.
Calling Sequence.__eq__ raises a FutureWarning due to comparing numpy arrays: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison The method as it was written was pretty verbose, iterating over each array `_labels`, `_onsets_s`, ..., etc., and doing array comparisons on each of those. It's much more concise to just do: - if `other` is not a Sequence, return False - if `other` Sequence doesn't have same number of segments, return False - finally, zip the segments and compare each of them directly, and return boolean of all([self_seg == other_seg]) The Segment.__eq__ works well on its own, so not only is this less wordy but it lets us rely on a method that `attrs` writes for us
- Remove tests/data_for_tests/koumura/Bird0/Annotation.csv - delete tests/data_for_tests/koumura/Bird0/ExpectedError
NickleDave
force-pushed
the
rewrite-formats-as-classes-fix-#99
branch
from
May 6, 2022 02:49
d299628
to
695dc6a
Compare
- add .csv files generated by tests/scripts/remake_test_csv.py - rename / update with correct formatting: tests/data_for_tests/csv/gy6or6_032312.csv -> notmat_gy6or6_032312.csv - update test .csv formatting so it raises correct error: tests/data_for_tests/csv/missing_fields_in_header.csv - add test data .csv: tests/data_for_tests/csv/invalid_fields_in_header.csv - remove test data .csv (wrong format, etc.): unrecognized_fields_in_header.csv - rename + reformat: example_annotation_with_onset_inds_offset_inds_None.csv + remove invalid on/offset_ind columns full of 'None' str + rename: tests/data_for_tests/example_user_annotation.csv - add onset_s_column_but_no_offset_s_column.csv - add tests/data_for_tests/csv/no_onset_or_offset_column.csv
with fixtures that returns paths to .csv files used for different tests
- Rename data_for_tests/audio_wav_annot_phn/ -> timit_kaggle/ - Rename data_for_tests/audio_wav_nist_annot_phn/ -> timit_nist/ - Reorganize data in tests/data_for_tests/timit_nist and timit_kaggle/
- rewrite fixtures in fixtures/timit.py mainly so we can parametrize a fixture with paths, i.e. the `a_phn_path` fixture is parametrized with `ALL_PHN_PATHS`; to have access to this list outside of the fixture, to pass it into the parameter argument, needed to declare constants outside the fixtures.
- to parametrize - and add other fixtures, e.g. das_csv_paths
No longer used
NickleDave
force-pushed
the
rewrite-formats-as-classes-fix-#99
branch
from
May 6, 2022 23:49
085e52d
to
949524d
Compare
NickleDave
added a commit
that referenced
this pull request
May 7, 2022
This was referenced Jun 17, 2022
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Rewrites formats as classes, and closes many related issues.
Big step towards version 4.0 as described in #146
attrs
classes #99attrs
classes #99pandas
-- fixes refactor 'generic-seq' format; usepandas
for generating csv #63simple-seq
can parse various formats #134