Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

modify uniformgrid filters #2432

Merged
merged 11 commits into from
Apr 9, 2022
Merged

modify uniformgrid filters #2432

merged 11 commits into from
Apr 9, 2022

Conversation

njneeteson
Copy link
Contributor

Overview

I discovered a while ago that many of the UniformGrid (vtkImageData) specific filters do not play nicely with cell data. The purpose of this PR is to throw a band-aid on the problem for the UniformGrid filters that I have added to PyVista so that users don't try to use one of them with cell data and get confused by the results. Some filters simply needed the documentation updated with a warning / explanation of behaviour with cell data, and some needed functionality modified to prohibit use of them with cell data.

Perhaps in the future someone can figure out what is going on under the hood and there could be a more elegant solution to these issues. We could also consider doing something like automatically moving cell arrays to point arrays, processing, and moving back, but it gets messy with potential name collisions. If a user really wants to apply one of these filters to cell data then they can do that conversion manually (just as they would have to if they wanted to do something analogous in VTK).

Resolves #2217

Details

gaussian_smooth and dilate_erode: using it with cell data does not seem to work at all, so I removed the preferences kwarg and added ValueErrors that trigger if the user tries to use it with cell data (if cell data is the active scalar or if the scalar kwarg points to cell data exclusively)

median_smooth and image_threshold: you can use this with cell data, but it will send the output to a point data array and overwrite any existing point data array of the same name. Warning added.

@github-actions github-actions bot added the enhancement Changes that enhance the library label Apr 5, 2022
@codecov
Copy link

codecov bot commented Apr 5, 2022

Codecov Report

Merging #2432 (88db967) into main (0856b08) will decrease coverage by 0.05%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main    #2432      +/-   ##
==========================================
- Coverage   93.70%   93.64%   -0.06%     
==========================================
  Files          74       74              
  Lines       15994    16002       +8     
==========================================
- Hits        14987    14985       -2     
- Misses       1007     1017      +10     

@adeak
Copy link
Member

adeak commented Apr 5, 2022

Thanks for the PR! Just a general reply for now:

We could also consider doing something like automatically moving cell arrays to point arrays, processing, and moving back, but it gets messy with potential name collisions.

Collisions could always be handled by prepending something to the array name until there's no clash. The real issue is that many users deal with huge datasets, and the implicit conversion to point data could take a lot of time and memory. It's a very risky thing to do on our own, not knowing what the users' data is like, so it's definitely a lot safer to make the user convert if they want to.

And a technical note: we recently switched to black for autoformatting, and added an optional feature of using pre-commit hooks. We recommend setting up pre-commit as per the instructions at https://docs.pyvista.org/extras/developer_notes.html#style-checking, you can then add your changes and run pre-commit run (and repeat) to hammer out the kinks before you commit (alternatively, the checks/autoformatting will run before you commit).

@MatthewFlamm
Copy link
Contributor

Collisions could always be handled by prepending something to the array name until there's no clash. The real issue is that many users deal with huge datasets, and the implicit conversion to point data could take a lot of time and memory. It's a very risky thing to do on our own, not knowing what the users' data is like, so it's definitely a lot safer to make the user convert if they want to.

I am such a user and would highly prefer that we don't duplicate data unless it is explicitly required. When we exceed memory requirements in vtk, I often get a segfault without extensive error tracing. It is better, IMO, to raise an error like you have in this PR.

These aren't the only filters that require point data. streamlines, streamlines_evenly_spaced_2D, and contour for example. Can we generalize this into a helper, so that it can be reused?

field = self.get_array_association(scalars, preference=preference)
field = self.get_array_association(scalars, preference='point')
if field.value == 1:
raise ValueError('Can only process point data, given `scalars` are cell data.')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could recommend to the user to use the cell_data_to_point_data filter in the error message.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I wasn't sure how verbose we wanted this to be. Maybe this could be added as a new PyVista-specific error in core/errors.py and then added wherever relevant (including for the other filters you mentioned). Probably a subject for another issue/PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm okay leaving this for a more general solution across the code if you want to keep this fix PR small.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I don't have the time or the familiarity with pyvista's codebase to expand the scope of this PR beyond the uniformgrid filters. I'm fine with changing the error messages I've used here to be more informative but it might be better for someone else to make the decision of what feedback the user should get in this situation (trying to use cell data in a filter that doesn't support it) and then apply that consistently across the board.

@njneeteson
Copy link
Contributor Author

Thanks for the PR! Just a general reply for now:

We could also consider doing something like automatically moving cell arrays to point arrays, processing, and moving back, but it gets messy with potential name collisions.

Collisions could always be handled by prepending something to the array name until there's no clash. The real issue is that many users deal with huge datasets, and the implicit conversion to point data could take a lot of time and memory. It's a very risky thing to do on our own, not knowing what the users' data is like, so it's definitely a lot safer to make the user convert if they want to.

Good point, and with Matthew's points as well, seems like it's best to leave it as-is. I felt my first draft of the PR should be to do as little under the hood as possible and just leave it up to users to work around the limitations of the underlying VTK filters and seems like that's the way to go.

And a technical note: we recently switched to black for autoformatting, and added an optional feature of using pre-commit hooks. We recommend setting up pre-commit as per the instructions at https://docs.pyvista.org/extras/developer_notes.html#style-checking, you can then add your changes and run pre-commit run (and repeat) to hammer out the kinks before you commit (alternatively, the checks/autoformatting will run before you commit).

Yeah I ran the pre-commit stuff and I thought I had fixed the flagged issues before doing the PR but I guess I hadn't. Next time I'll double-check and make sure it's all good before submitting.

@adeak
Copy link
Member

adeak commented Apr 5, 2022

Yeah I ran the pre-commit stuff and I thought I had fixed the flagged issues before doing the PR but I guess I hadn't. Next time I'll double-check and make sure it's all good before submitting.

No problem at all, there's nothing wrong with failing CI. I just wasn't sure if you were aware of the tooling, in case it makes your life easier :)

@akaszynski
Copy link
Member

Made a few changes:

Related

Unrelated but bothering and discovered while going through this PR

  • Smoothing filters were lacking examples. Added to median_smooth and gausian_smooth.
  • Links to perlin_noise and sample_function were missing. Those have been added.

doc/api/core/misc.rst Outdated Show resolved Hide resolved
@akaszynski akaszynski changed the title Feat/modify uniformgrid filters modify uniformgrid filters Apr 9, 2022
@akaszynski akaszynski merged commit 93c20c0 into pyvista:main Apr 9, 2022
@tkoyama010 tkoyama010 mentioned this pull request Apr 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Changes that enhance the library
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None of the UniformGrid-specific filters work on cell_data arrays.
4 participants