Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactored modules related to input-output #194

Merged
merged 14 commits into from
Jun 3, 2024
Merged

Refactored modules related to input-output #194

merged 14 commits into from
Jun 3, 2024

Conversation

niksirbi
Copy link
Member

@niksirbi niksirbi commented May 27, 2024

Description

What is this PR

  • Bug fix
  • Addition of a new feature
  • Other

Why is this PR needed?

The public functions in the load_poses.py module currently assume that users are always loading data from a file (or from a DeepLabCut-style pandas dataframe). However, there are some use-cases where the data are already in Python, in the form of numpy arrays, perhaps imported with custom loaders (this is not hypothetical, a potential user has already asked for it). There is a way to convert such data into a properly-formatted movement dataset, but this way is not easy to find and is not documented.

What does this PR do?

Adds a from_numpy() function that explicitly accepts position (+ optional confidence) data in the form of numpy arrays and returns a movement dataset. Under the hood it calls the ValidPosesDataset validator and the existing _from_valid_data() utility.

The addition of this function enabled me to slightly refactor the load_poses.py module such that from_numpy() is the single point-of-entry into a movement dataset - i.e. every other loading function first reads data into numpy arrays before calling the new function. This was already de facto the case, but it's much more explicit now. Moreover, this refactoring also enabled me to get read of a redundant validation call for LightningPose data.

Here's the schematic of the updated load_poses.py module. The previous version can be found here.

updated_load_poses

How has this PR been tested?

I added a simple unit test for the new function. The underlying ValidPosesDataset is already extensively tested, and so are all file loaders.

Is this a breaking change?

No.

Does this PR require an update to the documentation?

The API index has been updated accordingly. The new function's docstring also includes example usage.

Checklist:

  • The code has been tested locally
  • Tests have been added to cover all new functionality
  • The documentation has been updated to reflect any changes
  • The code has been formatted with pre-commit

EDIT 2024-05-31

Following @sfmig review, the scope of this PR expanded, resulting in a more thorough refactoring of IO-related modules: load_poses.py, save_poses.py, and validators.py. This mostly involved renaming functions and editing docstrings, to make the whole thing more logical and internally consistent.

These are the names of the updated public functions:
Screenshot 2024-05-31 at 15 49 18

Note that we renamed from_dlc_df to from_dlc_style_df (and likewise for save), because LightningPose also uses "DeepLabCut-style" dataframes.

We also decided to rename private functions such that it's clear what is being converted to what, e.g.: _ds_from_sleap_labels_file() instead of _load_from_sleap_labels.file(). There is one remaining inconsistency, namely the fact that public functions start with from_ while private functions start with _ds_from_ or _df_from_. That's because the way public functions are actually invoked is the following:

from movement.io import load_poses

ds = load_poses.from_file(`file/path')

and load_poses.ds_from_file would be redundant. Perhaps there is scope for renaming load_poses to load_dataset (and save_poses to save_dataset accordingly), such that the syntax would be movement.io.load_dataset.from_file(). That could make more sense now, because "poses" is a bit ambiguous, while we've fully defined what a "dataset" is. I'll open an issue about that.

Here's the updated diagram for movement's I/O functionalites.

movement_io_updated

Copy link

codecov bot commented May 27, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.68%. Comparing base (426003c) to head (f99d7d8).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #194      +/-   ##
==========================================
- Coverage   99.68%   99.68%   -0.01%     
==========================================
  Files          11       11              
  Lines         638      634       -4     
==========================================
- Hits          636      632       -4     
  Misses          2        2              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@niksirbi niksirbi requested a review from sfmig May 28, 2024 07:09
@niksirbi niksirbi marked this pull request as ready for review May 28, 2024 07:09
Copy link
Contributor

@sfmig sfmig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice refactoring ✨ !
Just some suggestions on function names and docstrings. Maybe it's not a lot to add to this PR but also happy to have it separately.

movement/io/load_poses.py Outdated Show resolved Hide resolved
movement/io/load_poses.py Outdated Show resolved Hide resolved
movement/io/load_poses.py Outdated Show resolved Hide resolved
movement/io/load_poses.py Outdated Show resolved Hide resolved
tests/test_unit/test_load_poses.py Outdated Show resolved Hide resolved
@niksirbi
Copy link
Member Author

Thanks for the review @sfmig, I like all you suggestions and will implement them here. The scope of this PR will increase to "refactoring load_poses module" and I will edit the PR title and description accordingly.

Copy link

sonarcloud bot commented May 31, 2024

Quality Gate Passed Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

@niksirbi niksirbi changed the title added from_numpy() function to the load_poses module Refactored modules related to input-output May 31, 2024
@niksirbi
Copy link
Member Author

@sfmig I've updated this PR's description and title. I think there is no need to go line-by-line through the diff again, just let me know whether you agree with the changes as I've described them in the updated PR description.

@sfmig
Copy link
Contributor

sfmig commented Jun 3, 2024

Looks fantastic @niksirbi 🚀
And also great recap of the renaming bits, thanks!
Will merge now

@sfmig sfmig added this pull request to the merge queue Jun 3, 2024
Merged via the queue into main with commit 8e20498 Jun 3, 2024
27 checks passed
@niksirbi niksirbi mentioned this pull request Jun 4, 2024
7 tasks
@niksirbi niksirbi deleted the load-from-numpy branch June 15, 2024 09:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants