Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add many Readers #2496

Merged
merged 45 commits into from
Apr 27, 2022
Merged

Add many Readers #2496

merged 45 commits into from
Apr 27, 2022

Conversation

MatthewFlamm
Copy link
Contributor

@MatthewFlamm MatthewFlamm commented Apr 21, 2022

Overview

This PR will attempt to add all the readers used in the standard_reader_routine in the fileio module to enable #2494.

Details

A followup PR will be made to align the read vs. Reader usage.

A checklist of readers from #2494 that I will track here:

  • vtkBMPReader
  • vtkDEMReader
  • vtkJPEGReader
  • vtkMetaImageReader
  • vtkNrrdReader
  • vtkPNGReader
  • vtkPNMReader
  • vtkSLCReader
  • vtkTIFFReader
  • vtkGLTFReader <- no data
  • vtkHDRReader
  • vtkFluentReader <- no data
  • vtkMFIXReader <- no data
  • vtkPTSReader <- no data
  • vtkAVSucdReader
  • vtkSegYReader <- no data
  • vtkHDFReader

TODO:

  • tests
  • add all to pyvista namespace
  • add to documentation
  • add instantiation tests for readers without available data

Bonus

Resolves #2509

@codecov
Copy link

codecov bot commented Apr 21, 2022

Codecov Report

Merging #2496 (8b64843) into main (6bb517c) will increase coverage by 0.01%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main    #2496      +/-   ##
==========================================
+ Coverage   93.70%   93.72%   +0.01%     
==========================================
  Files          75       75              
  Lines       16072    16106      +34     
==========================================
+ Hits        15061    15095      +34     
  Misses       1011     1011              

@tkoyama010 tkoyama010 added the enhancement Changes that enhance the library label Apr 21, 2022
@MatthewFlamm
Copy link
Contributor Author

I have implemented all the Readers that I can find datasets for in vtk-data. I won't be able to test the readers that have no suitable data. I will indicate which ones in the description above.

I will add those Readers and then start adding in true tests.

@MatthewFlamm
Copy link
Contributor Author

I'm assuming I added too many plots for the docstring test. Should I be skipping testing all doctests that download and plot datasets?

@adeak
Copy link
Member

adeak commented Apr 23, 2022

When I run the doctests locally I see a pretty steep rise in memory need (2 GB at 36% execution, and counting). That has to be something we should fix, right? There's no reason why anything should persist across individual doctest executions, so there should be no accumulation of memory.

I first thought that this is related to doctests sharing the module global namespace, but at least stdlib doctest says

By default, each time doctest finds a docstring to test, it uses a shallow copy of M’s globals, so that running tests doesn’t change the module’s real globals, and so that one test in M can’t leave behind crumbs that accidentally allow another test to work. This means examples can freely use any names defined at top-level in M, and names defined earlier in the docstring being run. Examples cannot see names defined in other docstrings.

So I wonder why allocated memory keeps building up...

@MatthewFlamm MatthewFlamm marked this pull request as ready for review April 23, 2022 23:42
@MatthewFlamm
Copy link
Contributor Author

Other than figuring out the doctest issue, this is ready to go.

I decided not to implement tests for readers without available data. One reader hung when using fake data even when just setting the file name. So please pay particular attention to those 5 implementations during review.

@akaszynski
Copy link
Member

When I run the doctests locally I see a pretty steep rise in memory need (2 GB at 36% execution, and counting). That has to be something we should fix, right? There's no reason why anything should persist across individual doctest executions, so there should be no accumulation of memory.

Locally, I see 8GB + of resident memory usage, which exceeds the GitHub runner Supported runners and hardware resources. Even adding a gc.collect into a conftest.py doesn't help; the references are sticking around between tests.

@akaszynski
Copy link
Member

Played around with this for a bit. I think given the limitations of doctest and pytest, we're going to have to provide ourselves with a workaround. Might be one of the following:

  • Run pytest --doctest-modules pyvista on individual modules within pyvista. This will work, but will require some maintenance if modules are added or removed. For example:
    python -m pytest --doctest-modules pyvista/plotting
    python -m pytest --doctest-modules pyvista/utilities
    ...
    
  • Use @adkea's check_doctest_names.py, but we'll miss out on checking output.

@adeak
Copy link
Member

adeak commented Apr 25, 2022

We could make module discovery automatic with something like

find pyvista -maxdepth 1 -type d -iregex 'pyvista/[a-z][a-z_]*' | xargs python -m pytest --doctest-modules

but I'm sure most of the memory comes from pyvista.examples.downloads. I don't have enough free RAM to try.

@MatthewFlamm
Copy link
Contributor Author

I suspect that we are not closing any plotters during doctests. We only use mesh.plot(), and doctest doesn't do any true teardown. I can't figure out why, but if you run pytest --doctest-modules foldername it doesn't do any explicit plotting to screen. If you run pytest --doctest-modules file.py it explicitly plots to screen and you have to manually close each figure.

https://github.com/astropy/pytest-doctestplus looks very useful. It allows to use pytest fixtures, which enables true teardown functionality. I'm playing with this now, and I'm seeing promising results. It doesn't support the NUMBER flag for doctests, and there are more specific replacements, so I need to figure out how to convert that usage.

@MatthewFlamm
Copy link
Contributor Author

I can confirm locally that using pytest-doctestplus with a teardown fixture that uses pyvista.close_all fixes memory buildup during doctests.

We have to make a decision before converting to pytest-doctestplus, because it breaks compatibility with the normal doctest. One example is that the normal doctest won't use fixtures, but this is okay as long as the fixtures we write aren't required to run the tests.

More important is that pytest-doctestplus changes some of the behaviors when comparing values such as floats or ELLIPSIS. For example, float values will now be rounded to nearest instead of rounded down. Modifying our doctests to accommodate will break the tests on the normal doctest.

@MatthewFlamm
Copy link
Contributor Author

I've started #2509 so we can discuss this there.

@MatthewFlamm
Copy link
Contributor Author

If 5776ca8 fixes the doctest memory issue, we can revert the # doctest: +SKIP commits.

@adeak
Copy link
Member

adeak commented Apr 25, 2022

I copied Alex' pyvista/conftest.py and a local full doctest run ran in 4.5 minutes with barely any increase in used memory. I have 3.6 GB free at the moment.

@adeak
Copy link
Member

adeak commented Apr 25, 2022

pyvista/utilities/reader.py::pyvista.utilities.reader.HDRReader PASSED           [ 96%]
pyvista/utilities/reader.py::pyvista.utilities.reader.TIFFReader PASSED          [ 98%]

👍

Copy link
Member

@akaszynski akaszynski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I made one final change to ignore coverage of conftest.py. If that goes through, feel free to merge.

@adeak
Copy link
Member

adeak commented Apr 25, 2022

@MatthewFlamm earlier you said

I decided not to implement tests for readers without available data. One reader hung when using fake data even when just setting the file name. So please pay particular attention to those 5 implementations during review.

Is this still relevant? Which are the 5 readers that did that?

Ah, the ones with "<- no data" in your PR comment. Got it.

.coveragerc Outdated Show resolved Hide resolved
pyvista/conftest.py Outdated Show resolved Hide resolved
pyvista/examples/downloads.py Outdated Show resolved Hide resolved
pyvista/examples/downloads.py Show resolved Hide resolved
pyvista/examples/downloads.py Outdated Show resolved Hide resolved
pyvista/utilities/reader.py Outdated Show resolved Hide resolved
adeak and others added 3 commits April 26, 2022 00:44
Co-authored-by: Andras Deak <adeak@users.noreply.github.com>
Co-authored-by: Andras Deak <adeak@users.noreply.github.com>
@MatthewFlamm
Copy link
Contributor Author

I will wait for the CI to run and then merge tomorrow if I have time if there are no more comments (or planned reviews)

Copy link
Member

@adeak adeak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've noticed some small (mostly unrelated) issues, but let's not stall this PR with those. Great improvements here, thanks!

@MatthewFlamm MatthewFlamm merged commit 9df8012 into main Apr 27, 2022
@MatthewFlamm MatthewFlamm deleted the align-read-and-readers branch April 27, 2022 15:00
adeak added a commit to adeak/pyvista that referenced this pull request Apr 28, 2022
* upstream/main:
  Make VTK version error clear when PointSet is still abstract (pyvista#2483)
  Use imageio intersphinx links (pyvista#2489)
  Fix glyphs when orienting with cell data (pyvista#2500)
  Bump mypy from 0.942 to 0.950 (pyvista#2522)
  Update hypothesis requirement from <6.45.1 to <6.45.2 (pyvista#2523)
  Add many Readers (pyvista#2496)
  Bump trimesh from 3.10.8 to 3.11.2 (pyvista#2519)
  Return actor from add_mesh_threshold (pyvista#2516)
  fix uniformgrid.x docstring (pyvista#2511)
  Update imageio requirement from <2.18.0 to <2.19.0 (pyvista#2506)
  Update hypothesis requirement from <6.44.1 to <6.45.1 (pyvista#2507)
  add polyhedral example (pyvista#2505)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Changes that enhance the library
Projects
None yet
Development

Successfully merging this pull request may close these issues.

doctest memory increasing
4 participants