Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Always use IDs in plots and brain results #2614

Merged
merged 10 commits into from Feb 6, 2023
Merged

Always use IDs in plots and brain results #2614

merged 10 commits into from Feb 6, 2023

Conversation

brimoor
Copy link
Contributor

@brimoor brimoor commented Feb 5, 2023

Updates methods like scatterplot() and the in-App embeddings backend to always rely on sample/label IDs when pulling data for plots.

Previously, these implementations assumed that the view on which embeddings/etc were computed has not changed in any way. Now, these methods will gracefully handle added/deleted data.

New tests are added to tests/intensive/plot_tests.py to verify that the plotting methods function as expected.

@brimoor brimoor added the enhancement Code enhancement label Feb 5, 2023
@brimoor brimoor requested a review from a team February 5, 2023 22:12
@brimoor brimoor self-assigned this Feb 5, 2023
@codecov
Copy link

codecov bot commented Feb 5, 2023

Codecov Report

Base: 62.53% // Head: 62.46% // Decreases project coverage by -0.08% ⚠️

Coverage data is based on head (a6a0eeb) compared to base (ff3d48a).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #2614      +/-   ##
===========================================
- Coverage    62.53%   62.46%   -0.08%     
===========================================
  Files          249      249              
  Lines        42168    42262      +94     
  Branches       347      347              
===========================================
+ Hits         26371    26399      +28     
- Misses       15797    15863      +66     
Flag Coverage Δ
app 50.03% <ø> (-0.10%) ⬇️
python 99.39% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
tests/unittests/import_export_tests.py 99.83% <100.00%> (+<0.01%) ⬆️
app/packages/state/src/recoil/skeletonFilter.ts 11.85% <0.00%> (-7.87%) ⬇️
app/packages/state/src/recoil/pathFilters/index.ts 34.55% <0.00%> (-0.78%) ⬇️
app/packages/looker/src/overlays/keypoint.ts 18.77% <0.00%> (+1.84%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

patches_field, leaf
)

labels = curr_view.values(label_field, unwind=True)
field = curr_view.get_field(label_field)
labels = view._get_values_by_id(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updates embeddings backend to always pull color-by data using IDs

)

if ids is not None and not is_frames:
values = samples._get_values_by_id(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of the refactoring in the plotting methods was to achieve this line: when the user has provided IDs and is asking to pull values from a dataset via path/expression, use _get_values_by_id() to ensure that the correct values are pulled, even if samples doesn't correspond 1-1 with other data that may have been provided.

plot = fo.scatterplot(
points=points,
samples=dataset,
ids=ids,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an example of something that wouldn't previously work. User provided ids + points corresponding to a subset of the samples=dataset argument that they provided, but they provided paths/expressions for labels and sizes arguments.

Previously this would fail because dataset.values() would naively be used (which would result in too many labels/sizes). Now the ids argument is used to lookup the correct labels/sizes in the correct order corresponding to points/ids.

Copy link
Contributor

@ritch ritch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Embeddings.py changes LGTM - Other plot changes look fine but I'm less familiar with those.

@brimoor brimoor merged commit 85d2f59 into develop Feb 6, 2023
@brimoor brimoor deleted the save-ids branch February 6, 2023 21:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Code enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants