Add many Readers #2496

MatthewFlamm · 2022-04-21T20:03:12Z

Overview

This PR will attempt to add all the readers used in the standard_reader_routine in the fileio module to enable #2494.

Details

A follow-up PR will be made to align the read vs. Reader usage.

A checklist of readers from #2494 that I will track here:

TODO:

- tests
- add all to pyvista namespace
- add to documentation
- ~~add instantiation tests for readers without available data~~

Bonus

Resolves #2509

codecov · 2022-04-21T20:11:36Z

Codecov Report

Merging #2496 (8b64843) into main (6bb517c) will increase coverage by 0.01%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main    #2496      +/-   ##
==========================================
+ Coverage   93.70%   93.72%   +0.01%     
==========================================
  Files          75       75              
  Lines       16072    16106      +34     
==========================================
+ Hits        15061    15095      +34     
  Misses       1011     1011

MatthewFlamm · 2022-04-22T20:09:56Z

I have implemented all the Readers that I can find datasets for in vtk-data. I won't be able to test the readers that have no suitable data. I will indicate which ones in the description above.

I will add those Readers and then start adding in true tests.

MatthewFlamm · 2022-04-23T14:01:02Z

I'm assuming I added too many plots for the docstring test. Should I be skipping testing all doctests that download and plot datasets?

adeak · 2022-04-23T20:05:02Z

When I run the doctests locally I see a pretty steep rise in memory need (2 GB at 36% execution, and counting). That has to be something we should fix, right? There's no reason why anything should persist across individual doctest executions, so there should be no accumulation of memory.

I first thought that this is related to doctests sharing the module global namespace, but at least the stdlib doctest documentation says:

> By default, each time doctest finds a docstring to test, it uses a shallow copy of M’s globals, so that running tests doesn’t change the module’s real globals, and so that one test in M can’t leave behind crumbs that accidentally allow another test to work. This means examples can freely use any names defined at top-level in M, and names defined earlier in the docstring being run. Examples cannot see names defined in other docstrings.

So I wonder why allocated memory keeps building up...
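
The isolation described in the quoted documentation can be checked with a small standard-library sketch. The names `first`, `second`, and `run_docstring` are hypothetical and exist only for this illustration; the sketch shows that names do not leak between docstrings, which is why the memory growth has to come from somewhere other than the doctest globals.

```python
import doctest


def first():
    """
    >>> crumb = 41
    >>> crumb + 1
    42
    """


def second():
    """
    >>> crumb  # defined only in first()'s docstring
    41
    """


def run_docstring(func, module_globals):
    """Run the examples in func's docstring; return (failed, attempted)."""
    finder = doctest.DocTestFinder(recurse=False)
    runner = doctest.DocTestRunner(verbose=False)
    failed = attempted = 0
    # find() copies module_globals, and each DocTest copies them again,
    # so neither the caller's dict nor other docstrings are affected.
    for test in finder.find(func, module=False, globs=module_globals):
        result = runner.run(test, out=lambda text: None)  # swallow the report
        failed += result.failed
        attempted += result.attempted
    return failed, attempted
```

Running `first` passes both examples, while `second` fails with a `NameError` because `crumb` was never defined in its own copy of the globals.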

MatthewFlamm · 2022-04-23T23:47:44Z

Other than figuring out the doctest issue, this is ready to go.

I decided not to implement tests for readers without available data. One reader hung when using fake data even when just setting the file name. So please pay particular attention to those 5 implementations during review.

akaszynski · 2022-04-25T04:42:52Z

> When I run the doctests locally I see a pretty steep rise in memory need (2 GB at 36% execution, and counting). That has to be something we should fix, right? There's no reason why anything should persist across individual doctest executions, so there should be no accumulation of memory.

Locally, I see 8 GB+ of resident memory usage, which exceeds what the GitHub runners provide (see "Supported runners and hardware resources"). Even adding a gc.collect call to a conftest.py doesn't help; the references are sticking around between tests.

akaszynski · 2022-04-25T05:30:07Z

Played around with this for a bit. I think given the limitations of doctest and pytest, we're going to have to provide ourselves with a workaround. Might be one of the following:

- Run pytest --doctest-modules pyvista on individual modules within pyvista. This will work, but will require some maintenance if modules are added or removed. For example:
```
python -m pytest --doctest-modules pyvista/plotting
python -m pytest --doctest-modules pyvista/utilities
...
```
- Use @adeak's check_doctest_names.py, but we'll miss out on checking output.

adeak · 2022-04-25T07:49:09Z

We could make module discovery automatic with something like

find pyvista -maxdepth 1 -type d -iregex 'pyvista/[a-z][a-z_]*' | xargs python -m pytest --doctest-modules

but I'm sure most of the memory comes from pyvista.examples.downloads. I don't have enough free RAM to try.
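
For comparison, the same discovery can be sketched in Python with only the standard library. The helper names `doctest_targets` and `doctest_commands` are hypothetical. Unlike the `find | xargs` pipeline, which hands every directory to a single pytest process, this sketch builds one command per directory, matching the per-module runs suggested earlier in the thread.

```python
import re
import sys
from pathlib import Path


def doctest_targets(package_dir):
    """Return the subdirectories directly under package_dir whose names
    match [a-z][a-z_]* (case-insensitively), sorted by path."""
    pattern = re.compile(r"[a-z][a-z_]*", re.IGNORECASE)
    root = Path(package_dir)
    return sorted(
        str(path)
        for path in root.iterdir()
        if path.is_dir() and pattern.fullmatch(path.name)
    )


def doctest_commands(package_dir):
    """Build one pytest command per discovered directory, so that each
    doctest run happens in its own process."""
    return [
        [sys.executable, "-m", "pytest", "--doctest-modules", target]
        for target in doctest_targets(package_dir)
    ]
```

Each command list could then be passed to `subprocess.run`. Directories such as `__pycache__` are skipped because their names do not start with a letter.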

MatthewFlamm · 2022-04-25T14:28:31Z

I suspect that we are not closing any plotters during doctests. We only use mesh.plot(), and doctest doesn't do any true teardown. I can't figure out why, but if you run pytest --doctest-modules foldername it doesn't do any explicit plotting to screen. If you run pytest --doctest-modules file.py it explicitly plots to screen and you have to manually close each figure.

https://github.com/astropy/pytest-doctestplus looks very useful. It allows the use of pytest fixtures, which enables true teardown functionality. I'm playing with this now, and I'm seeing promising results. It doesn't support the NUMBER flag for doctests, and there are more specific replacements, so I need to figure out how to convert that usage.

MatthewFlamm · 2022-04-25T15:12:53Z

I can confirm locally that using pytest-doctestplus with a teardown fixture that uses pyvista.close_all fixes memory buildup during doctests.

We have to make a decision before converting to pytest-doctestplus, because it breaks compatibility with the normal doctest. One example is that the normal doctest won't use fixtures, but this is okay as long as the fixtures we write aren't required to run the tests.

More important is that pytest-doctestplus changes some of the behaviors when comparing values such as floats or ELLIPSIS. For example, float values will now be rounded to nearest instead of rounded down. Modifying our doctests to accommodate this will break the tests on the normal doctest.

MatthewFlamm · 2022-04-25T15:51:20Z

I've started #2509 so we can discuss this there.

MatthewFlamm · 2022-04-25T18:47:09Z

If 5776ca8 fixes the doctest memory issue, we can revert the # doctest: +SKIP commits.

adeak · 2022-04-25T18:48:17Z

I copied Alex's pyvista/conftest.py and a full local doctest run finished in 4.5 minutes with barely any increase in used memory. I have 3.6 GB free at the moment.

This reverts commit c016141.

This reverts commit 5f3c74d.

adeak · 2022-04-25T19:04:22Z

pyvista/utilities/reader.py::pyvista.utilities.reader.HDRReader PASSED           [ 96%]
pyvista/utilities/reader.py::pyvista.utilities.reader.TIFFReader PASSED          [ 98%]

👍

akaszynski

LGTM. I made one final change to ignore coverage of conftest.py. If that goes through, feel free to merge.

adeak · 2022-04-25T19:17:54Z

@MatthewFlamm earlier you said

> I decided not to implement tests for readers without available data. One reader hung when using fake data even when just setting the file name. So please pay particular attention to those 5 implementations during review.

Is this still relevant? Which are the 5 readers that did that?

Ah, the ones with "<- no data" in your PR comment. Got it.

.coveragerc

pyvista/conftest.py

pyvista/examples/downloads.py

pyvista/utilities/reader.py

Co-authored-by: Andras Deak <adeak@users.noreply.github.com>

MatthewFlamm · 2022-04-26T16:29:38Z

I will wait for CI to finish and then merge tomorrow, time permitting, if there are no more comments (or planned reviews).

adeak

I've noticed some small (mostly unrelated) issues, but let's not stall this PR with those. Great improvements here, thanks!

* upstream/main:
  - Make VTK version error clear when PointSet is still abstract (pyvista#2483)
  - Use imageio intersphinx links (pyvista#2489)
  - Fix glyphs when orienting with cell data (pyvista#2500)
  - Bump mypy from 0.942 to 0.950 (pyvista#2522)
  - Update hypothesis requirement from <6.45.1 to <6.45.2 (pyvista#2523)
  - Add many Readers (pyvista#2496)
  - Bump trimesh from 3.10.8 to 3.11.2 (pyvista#2519)
  - Return actor from add_mesh_threshold (pyvista#2516)
  - fix uniformgrid.x docstring (pyvista#2511)
  - Update imageio requirement from <2.18.0 to <2.19.0 (pyvista#2506)
  - Update hypothesis requirement from <6.44.1 to <6.45.1 (pyvista#2507)
  - add polyhedral example (pyvista#2505)

MatthewFlamm added 6 commits April 21, 2022 15:32

add bmp reader

f4adc5f

add DEMReader

1d58f69

add JPEGReader

9a6380b

add MetaImageReader

2af25af

fix whitespace

513edd6

add NRRD Reader

cb7a8ee

tkoyama010 added the enhancement label (Changes that enhance the library) Apr 21, 2022

MatthewFlamm added 8 commits April 21, 2022 19:56

add PNGReader

f42b253

add pnm data and PNM Reader

7a12c7d

add SLC Reader

11fb1a9

add TIFFReader

5d988d2

add HDR Reader

e92e3b8

add PTS Reader

91c9db6

add AVSucdReader

0879669

add HDF Reader

3e33e83

MatthewFlamm added 13 commits April 22, 2022 16:19

add gltf reader

a1b2bc3

Add Fluent Reader

1ff63e7

add MFIX Reader

535515c

add SegYReader

47bc2d4

cleanup can docstring

40581a4

add missing lazy_vtkHDFReader for vtk<9

5aaad77

add missing lazy_vtkSegYReader

115ff3b

test BMPReader

dd8ce1d

test DEMReader

974e5cd

test JPEGReader

f3bdb40

test MetaImageReader

e43b308

test NRRDReader

edaf25e

test PNGReader

e42ced5

MatthewFlamm marked this pull request as ready for review April 23, 2022 23:42

MatthewFlamm mentioned this pull request Apr 25, 2022

doctest memory increasing #2509

Closed

testing conftest with just base doctest-modules

5776ca8

MatthewFlamm added 2 commits April 25, 2022 14:51

Revert "skip hdr doctest; due to file size?"

c199a07

This reverts commit c016141.

Revert "skip TIFFReader docstring; due to file size?"

751aa39

This reverts commit 5f3c74d.

akaszynski mentioned this pull request Apr 25, 2022

Use pytest-doctestplus; fix increasing memory during doctest #2510

Closed

ignore coverage of conftest.py

485d004

akaszynski approved these changes Apr 25, 2022


adeak reviewed Apr 25, 2022


adeak and others added 3 commits April 26, 2022 00:44

Apply suggestions from code review

103c57a

Code review: cells_nd description

503aa78

Co-authored-by: Andras Deak <adeak@users.noreply.github.com>

Clarify where can dataset came from

8b64843

Co-authored-by: Andras Deak <adeak@users.noreply.github.com>

adeak approved these changes Apr 26, 2022


MatthewFlamm merged commit 9df8012 into main Apr 27, 2022

MatthewFlamm deleted the align-read-and-readers branch April 27, 2022 15:00
