Contributing to `pyvista/vtk-data` #1790

adam-grant-hendry · 2021-11-07T23:22:22Z

Discussed in #1789

Originally posted by adam-grant-hendry November 7, 2021
If I want to create a pull request to add a dataset to pyvista/vtk-data, should I fork the repository? It contains LFS files, and downloading them counts against the maintainer's bandwidth.

Are there details on contributing to vtk-data? The Contributing.md only talks about licenses (i.e. don't add a data set for which there isn't a license).

The above may be a good opportunity to elaborate in Contributing.md.

banesullivan · 2021-11-28T17:20:04Z

I'm going to remove the LFS files stored in that repo and place them somewhere else (they're geo example files I put in there for another PyVista-based project). Unfortunately, the only way I've been able to find how to do this is to delete and recreate the whole repository removing history. I'm perfectly content doing this as the history of that repo really isn't important, though it may mean that our examples will go down for a bit as the files are re-uploaded and github's mirrors update.

@pyvista/developers, does anyone have a problem with me moving forward with this in the next few days?

adeak · 2021-11-28T17:22:13Z

I'm not personally affected, so just out of curiosity: why do the files need removing?

banesullivan · 2021-11-28T17:58:38Z

> why do the files need removing?

Honestly, I don't know... just when I was looking up how to remove git-lfs files from a repo, I stumbled upon this: https://stackoverflow.com/questions/34579211/how-to-delete-a-file-tracked-by-git-lfs-and-release-the-storage-quota

There are a lot of mixed answers in there. First, I'm going to try to remove the files, uninstall LFS from the repo, and clean the history to see if this works. But there's some discussion about GitHub not actually purging the files, in which case any clones would still count against the user's LFS quota.

At the end of the day, it may just be easier to delete the repo and re-upload all non git-lfs files rather than spending an hour or so getting it right just to preserve some history that really does not matter.
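Before removing anything, it can help to see which path patterns a repository actually routes through Git LFS. The following is a minimal sketch, assuming the usual `.gitattributes` convention in which LFS-tracked patterns carry the `filter=lfs` attribute. The function name and the sample file contents are hypothetical and are not taken from pyvista/vtk-data, and patterns containing quoted spaces are not handled.

```python
def lfs_tracked_patterns(gitattributes_text):
    """Return the path patterns that a .gitattributes file routes through Git LFS."""
    patterns = []
    for raw_line in gitattributes_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # First token is the path pattern, the rest are attributes.
        pattern, *attributes = line.split()
        # Git LFS marks the patterns it tracks with the attribute filter=lfs.
        if "filter=lfs" in attributes:
            patterns.append(pattern)
    return patterns


# Hypothetical .gitattributes content, for illustration only.
sample = """\
# geo example files
*.tif filter=lfs diff=lfs merge=lfs -text
Data/big/*.vtk filter=lfs diff=lfs merge=lfs -text
*.md text
"""

print(lfs_tracked_patterns(sample))
```

This only lists what is currently tracked. It does not remove anything or free any storage quota.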

adeak · 2021-11-28T17:59:56Z

So I take it the answer is "because they take up git LFS quota" 😆

For what it's worth there's a link to the authoritative reference: https://docs.github.com/en/repositories/working-with-files/managing-large-files/removing-files-from-git-large-file-storage

adam-grant-hendry · 2021-11-28T20:22:08Z

As long as it doesn't affect the maintainer, I'm fine with any approach 🙂. I have a PR to add a DICOM stack dataset that's around 65 MB, if I remember correctly, to

1. implement the VTKDicomReader in pyvista, and
2. add memory use and speed improvements to `add_volume()`,

which is why I was asking.

banesullivan · 2021-11-28T20:42:34Z

We already have a small DICOM dataset that we use for testing: https://github.com/pyvista/vtk-data/blob/master/Data/DICOM_KNEE.dcm and https://github.com/pyvista/vtk-data/tree/master/Data/DICOMDirectory

It's the `examples.download_knee()` data.

Further, the `vtkDICOMImageReader` is implemented in PyVista:

pyvista/pyvista/utilities/fileio.py

Line 33 in 9d05d4d

'.dcm': _vtk.vtkDICOMImageReader,
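The line quoted above is one entry in a mapping from file extension to reader. As a rough sketch of how that kind of extension-based dispatch works, assuming nothing about the rest of `fileio.py`: the `READERS` dict and `reader_name_for` function below are hypothetical, and the values are plain strings rather than VTK classes so the example runs without VTK installed. Only the `.dcm` entry comes from the thread.

```python
from pathlib import Path

# Hypothetical stand-in for an extension-to-reader table.
# Strings are used instead of VTK classes so this runs without VTK.
READERS = {
    ".dcm": "vtkDICOMImageReader",
}


def reader_name_for(filename):
    """Look up the reader registered for a file's extension (case-insensitive)."""
    ext = Path(filename).suffix.lower()
    try:
        return READERS[ext]
    except KeyError:
        raise ValueError(f"No reader registered for extension {ext!r}") from None


print(reader_name_for("Data/DICOM_KNEE.dcm"))
```

A lookup like this picks a reader per file, which is why reading a whole directory of DICOM slices needs separate handling.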

MatthewFlamm · 2021-11-28T21:03:40Z

The DICOMDirectory example would probably suffice for testing. This data set would provide a more interesting example.

We have added Reader classes which allow for more control over reading data files. This request was about adding more control. See https://github.com/pyvista/pyvista/blob/main/pyvista/utilities/reader.py.

banesullivan · 2021-11-28T21:13:56Z

> This request was about adding more control

Ah, forgive me - drive-by commenting here, to be honest. I just wanted to make sure you all were aware that we have and use that data, since there was talk in pyvista/pyvista-support#500 (comment) about adding a reader and a dataset.

MatthewFlamm · 2021-11-28T21:22:24Z

For this PR, I think it is about whether this additional data set would provide some advantage for testing or example usage. I'm not a DICOM user, so I'm not sure if the single file in the existing DICOMDirectory is sufficient.

banesullivan · 2021-11-28T21:29:59Z

> I'm not a DICOM user, so I'm not sure if the single file in the existing DICOMDirectory is sufficient.

Me either, I've assumed it is sufficient though. @adam-grant-hendry, would this new dataset improve our examples/tests to make sure we sufficiently cover what you hope to do?

adam-grant-hendry · 2021-11-29T01:09:07Z

> would this new dataset improve our examples/tests to make sure we sufficiently cover what you hope to do?

@banesullivan Yes, this will. In addition to porting over the VTKDICOMReader and adding the ability to read a directory of files, there are two things my update to `add_volume()` does:

1. Fixes a memory leak that causes RAM to explode and pyvista to crash, and
2. Loads a volume at a speed comparable to ParaView

I can run a memory profile before and after the commit that implements the change in the PR to show the improvement, both in memory usage and load time.
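One possible shape for such a before/after comparison, as a minimal sketch using only the standard library: `profile_call` and `load_volume_stand_in` are hypothetical names, and the stand-in simply allocates a buffer in place of the real volume-loading code. Note the assumption that matters most here: `tracemalloc` only sees allocations made through Python's allocator, so memory allocated inside VTK's C++ layer would likely need a process-level measurement instead.

```python
import time
import tracemalloc


def profile_call(func, *args, **kwargs):
    """Run func once and return (result, elapsed_seconds, peak_traced_bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, elapsed, peak


def load_volume_stand_in(n_bytes):
    # Hypothetical placeholder for the code under test: allocates a buffer.
    return bytearray(n_bytes)


_, seconds, peak = profile_call(load_volume_stand_in, 10_000_000)
print(f"{seconds:.4f} s, peak {peak / 1e6:.1f} MB")
```

Running the same helper on the commit before and the commit after the change would give two comparable (time, peak memory) pairs.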

(FYI, need to test the ability for the DICOMReader to open multiple files)

adam-grant-hendry · 2021-11-29T03:52:38Z

However, ignoring the multi-file reader, I could probably show the memory and time improvements with the knee DICOM example.

akaszynski · 2022-07-29T16:44:18Z

Can this be closed?

adam-grant-hendry · 2022-07-30T04:33:14Z

> Can this be closed?

I think so, yes.

adam-grant-hendry closed this as completed Nov 28, 2021

adam-grant-hendry reopened this Nov 28, 2021

banesullivan mentioned this issue Nov 28, 2021

PyVista add_volume: Garbled Output, Much Slower than ParaView, & Memory Hog pyvista/pyvista-support#500

Closed

akaszynski closed this as completed Jul 30, 2022
