Always use IDs in plots and brain results #2614
Codecov Report
Base: 62.53% // Head: 62.46% // Decreases project coverage by 0.08%

Additional details and impacted files

@@            Coverage Diff             @@
##           develop    #2614      +/-   ##
===========================================
- Coverage    62.53%   62.46%   -0.08%
===========================================
  Files          249      249
  Lines        42168    42262      +94
  Branches       347      347
===========================================
+ Hits         26371    26399      +28
- Misses       15797    15863      +66
    patches_field, leaf
)

labels = curr_view.values(label_field, unwind=True)
field = curr_view.get_field(label_field)
labels = view._get_values_by_id(
Updates embeddings backend to always pull color-by data using IDs
)

if ids is not None and not is_frames:
    values = samples._get_values_by_id(
All of the refactoring in the plotting methods was to achieve this line: when the user has provided IDs and is asking to pull values from a dataset via path/expression, use `_get_values_by_id()` to ensure that the correct values are pulled, even if `samples` doesn't correspond 1-1 with other data that may have been provided.
plot = fo.scatterplot(
    points=points,
    samples=dataset,
    ids=ids,
This is an example of something that wouldn't previously work. The user provided `ids` + `points` corresponding to a subset of the `samples=dataset` argument that they provided, but they provided paths/expressions for the `labels` and `sizes` arguments.

Previously this would fail because `dataset.values()` would naively be used (which would result in too many labels/sizes). Now the `ids` argument is used to look up the correct labels/sizes in the correct order corresponding to points/ids.
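
The failure mode described above can be reproduced without FiftyOne. The following is a minimal sketch in plain Python, assuming a toy dataset stored as a list of dicts. `naive_values` and `values_by_id` are hypothetical helpers written for illustration; they are not the library's `values()` or `_get_values_by_id()` implementations.

```python
# Toy stand-in for a dataset: each record has an ID and a label.
# All names here are illustrative assumptions, not FiftyOne APIs.
dataset = [
    {"id": "a", "label": "cat"},
    {"id": "b", "label": "dog"},
    {"id": "c", "label": "bird"},
    {"id": "d", "label": "fish"},
]

# The user supplies points and IDs for a subset, in their own order.
ids = ["c", "a"]
points = [(0.1, 0.2), (0.3, 0.4)]


def naive_values(records, field):
    """Positional lookup: one value per record in the whole dataset."""
    return [r[field] for r in records]


def values_by_id(records, field, ids):
    """ID-keyed lookup: one value per requested ID, in the order of `ids`."""
    by_id = {r["id"]: r[field] for r in records}
    return [by_id[i] for i in ids]


naive = naive_values(dataset, "label")
aligned = values_by_id(dataset, "label", ids)

print(len(naive), len(points))  # 4 2: too many labels for the points
print(aligned)  # ['bird', 'cat']: one label per point, in matching order
```

Keying the lookup on IDs means the result length and order follow `ids` rather than the dataset, which is the property the plotting code needs when `points` covers only a subset.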
Embeddings.py changes LGTM - Other plot changes look fine but I'm less familiar with those.
Updates methods like `scatterplot()` and the in-App embeddings backend to always rely on sample/label IDs when pulling data for plots.

Previously, these implementations assumed that the `view` on which embeddings/etc. were computed had not changed in any way. Now, these methods will gracefully handle added/deleted data.

New tests are added to `tests/intensive/plot_tests.py` to verify that the plotting methods function as expected.
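
To illustrate what handling added or deleted data can look like, here is a minimal sketch in plain Python. It assumes that IDs saved at computation time are used as lookup keys and that IDs missing from the current data map to `None`. That fallback is an assumption made for illustration, and `lookup_current_values` is a hypothetical helper rather than a FiftyOne function.

```python
# IDs and points saved when the embeddings were computed (hypothetical data).
stored_ids = ["s1", "s2", "s3"]
stored_points = [(0.0, 0.0), (1.0, 0.5), (0.2, 0.9)]

# Current state of the data: "s2" was deleted and "s4" was added afterwards.
current_labels = {"s1": "cat", "s3": "dog", "s4": "bird"}


def lookup_current_values(values_by_id, ids):
    """Return one value per stored ID; IDs that no longer exist yield None."""
    return [values_by_id.get(i) for i in ids]


labels = lookup_current_values(current_labels, stored_ids)

# Still one entry per stored point; the added sample "s4" is ignored.
print(labels)  # ['cat', None, 'dog']
```

Because the result always has one entry per stored ID, it stays aligned with `stored_points` even after the underlying data has changed.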