
Feature Request: Temporal data set (or clarification on usage) #412

Open
MatthewFlamm opened this issue Oct 21, 2019 · 2 comments
Labels
discussion Thought provoking threads where decisions have to be made feature-request Please add this cool feature!

Comments

@MatthewFlamm
Contributor

Is there a more standard way to group and use temporal data sets in PyVista than what I am doing? If not, I would like this to be a feature request for a TemporalDataSet implementation.

I have a 'file.pvd' containing a VTK collection with 'timestep' entries, but it cannot be read with pv.read('file.pvd'), so I'm reading the individual files manually and then sorting and extracting the data by hand.

#90 deals with plotting as far as I can tell, and this request is about the structure to handle time dependent data.

Here is a skeleton of what I'm doing now:

import glob

import numpy as np
import pyvista as pv

files_slice = glob.glob('*.vti')

# read each slice into its own mesh and collect them in a MultiBlock
meshes = [pv.read(f) for f in files_slice]
multi_mesh = pv.MultiBlock(meshes)

# pair each block index with its 'TIME' field array, then sort by time
time = [(i, mesh.field_arrays['TIME']) for i, mesh in enumerate(multi_mesh)]
sort_time = sorted(time, key=lambda x: x[1])

for i, t in sort_time:
    max_vel = np.max(multi_mesh[i]['Velocity Magnitude (m/s)'])
    # print time step, time (the field array holds a single value), and
    # maximum velocity magnitude in the slice
    print(f'i={i:2d} time={t[0]:4.1f} max_vel={max_vel:4.2f}')
@banesullivan banesullivan added discussion Thought provoking threads where decisions have to be made feature-request Please add this cool feature! labels Oct 21, 2019
@banesullivan
Member

Hi @MatthewFlamm, thanks for posting your question here, as it brings up an excellent data-management point for temporal datasets. We haven't given a ton of thought to how we should properly manage time-varying datasets in PyVista: VTK handles time variance over a pipeline, and PyVista has effectively removed the pipelining nature of VTK. Thus we're in need of a clear way to manage time-varying data, which can become rather complicated.

Datasets/meshes can have time variance in two basic ways:

  • the data attributes change through time (values on the points/cells change)
  • the spatial reference changes (the node locations change - cell connectivity hopefully doesn't change)

…and you can combine those two for a very dynamic dataset.

Typically, I work with datasets where the spatial reference remains constant but the values on the points/cells change (simulating a time-varying field in a discretized space). To do this, I usually have a single data array for each timestep associated with one mesh. I.e. something like:

import pyvista as pv
import numpy as np

mesh = pv.Sphere()

time_steps = [0, 1, 2]
for t in time_steps:
    # each of these arrays would be my data at different time steps
    mesh["time:{}".format(t)] = np.random.rand(mesh.n_points)

and then I might iterate over those arrays while plotting to create a GIF or something like in the docs.
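When building a GIF like that, one practical wrinkle is keeping the color limits fixed across frames. Here is a minimal NumPy-only sketch of computing the global scalar range over the `time:{}`-named arrays from the example above (a plain dict stands in for the mesh's point data, and the specific names/values are made up for illustration):

```python
import numpy as np

# stand-in for mesh point data: one array per time step, named "time:{t}"
rng = np.random.default_rng(0)
arrays = {f"time:{t}": rng.random(100) for t in range(3)}

# global scalar range across all time steps -- pass this as a fixed
# color limit when rendering each frame so the colorbar doesn't jump
time_names = sorted(n for n in arrays if n.startswith("time:"))
clim = (min(arrays[n].min() for n in time_names),
        max(arrays[n].max() for n in time_names))
print(clim)
```

The same range can then be reused for every frame of the animation loop.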

What might be an interesting idea would be to add a metadata tracker to PyVista meshes, where you could declare that all arrays with a given prefix (e.g. "Velocity ...") correspond to a single array across time steps, which PyVista would know how to process for simple operations like min/max or plotting. Keeping the time step arrays separate like this would allow the data to be tracked on both the VTK and PyVista sides, preserving PyVista's interoperability with any VTK code. With this we could add a level of dimensionality to the data arrays to represent time when they are accessed from PyVista. Implementation could be done by extending the array handlers we currently have so that the multi-dimensional NumPy array still points back to the separated versions of the array on the VTK side. This would be a pretty major undertaking to implement and test properly, but it is definitely doable.
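That prefix-grouping idea could be sketched roughly like this, with a plain dict and a hypothetical `'<prefix> [t=<step>]'` naming convention standing in for the real VTK-side array storage (the function name and convention are illustrative, not an existing API):

```python
import re

import numpy as np


def stack_time_arrays(point_data, prefix):
    """Collect all arrays named '<prefix> [t=<step>]' and stack them
    into a (n_time, n_points) array, ordered by time step."""
    pattern = re.compile(re.escape(prefix) + r" \[t=(\d+)\]$")
    steps = []
    for name, arr in point_data.items():
        m = pattern.match(name)
        if m:
            steps.append((int(m.group(1)), arr))
    steps.sort(key=lambda s: s[0])
    return np.stack([arr for _, arr in steps])


# fake point data: three time steps of a "Velocity" field on 5 points
data = {f"Velocity [t={t}]": np.full(5, float(t)) for t in range(3)}
v = stack_time_arrays(data, "Velocity")
print(v.shape)        # (3, 5)
print(v.max(axis=0))  # per-point maximum over all time steps
```

The stacked view gives the extra time dimension described above, while the underlying per-timestep arrays remain separate and VTK-compatible.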


When it comes to dealing with time variance of the mesh's spatial reference, things get really messy. At the moment, a MultiBlock dataset like the one in your example above is probably the cleanest solution. I honestly can't think of a better approach, since this kind of time variance is exactly what VTK's pipelining model (which PyVista avoids) is well suited for.

@MatthewFlamm
Contributor Author

I am currently using data which has a static spatial grid but has varying values over time points, so your example definitely makes more sense to me. I don't need to use MultiBlock at all. In this case too, it is easy to write your own functions using numpy for things like TemporalStatistics, integrating variables spatially at each timepoint (or integrating temporally at each spatial point), plotting over time, etc., etc.
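For that static-grid case, those operations all reduce to axis reductions on a (n_time, n_points) array. A minimal NumPy sketch of such TemporalStatistics-style reductions (random data stands in for real per-timestep values):

```python
import numpy as np

# (n_time, n_points): one row per time step on a fixed spatial grid
rng = np.random.default_rng(42)
field = rng.random((10, 500))

# temporal statistics: one value per spatial point, reduced over time
t_mean = field.mean(axis=0)
t_min = field.min(axis=0)
t_max = field.max(axis=0)

# and the reverse: a spatial reduction at each time point
max_over_space = field.max(axis=1)
print(t_mean.shape, max_over_space.shape)  # (500,) (10,)
```

Swapping the reduction axis switches between "statistics over time at each point" and "statistics over space at each time".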

It would be great to have a more general time varying data set implementation and corresponding filters (or operations) for more complex cases, but I agree that this needs a lot of thought to keep it as straightforward, flexible, and easy-to-use as the existing functionality. Keep up the great work!

At some point VTK had a TemporalDataSet class that wrapped data objects (see here), but it was removed for being too complicated in the pipeline framework.
