Add other snapshot formats #7

Open
rieder opened this issue Dec 17, 2019 · 11 comments

@rieder

rieder commented Dec 17, 2019

Would you be interested in adding support for other snapshot formats, e.g. the AMUSE HDF5 format?

@rieder
Author

rieder commented Dec 17, 2019

Happy to help if you are :)

@dmentipl
Owner

Hi @rieder. Thanks for showing interest in Plonk!

I am interested in adding AMUSE HDF5 support. However, I'm not familiar with AMUSE. So I'm happy for you to attempt it (with my guidance, as required).

@dmentipl dmentipl added the enhancement New feature or request label Dec 18, 2019
@dmentipl
Owner

A good place to start is by looking at CONTRIBUTING.md.

If you have any questions, please don't hesitate to ask. (Although, responses may be slow over the holiday period.)

@rieder
Author

rieder commented Jun 12, 2020

Perhaps the easiest way is to not write yet another function for reading files, but to directly populate a Plonk snap object with values from an AMUSE particleset. What would be the right way to manually construct such a snap object?

@rieder
Author

rieder commented Jun 25, 2020

@dmentipl any ideas on how this can/should be done?

@dmentipl
Owner

dmentipl commented Jul 8, 2020

Sorry for the delayed response.

The load_snap function is defined in plonk/snap/readers/__init__.py. In that module we can add 'AMUSE' to the _data_sources tuple, and add an if clause to load_snap that checks whether data_source is 'AMUSE'.
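
As a rough illustration of those two changes, here is a minimal, self-contained sketch. It does not import Plonk: the placeholder readers `_read_phantom` and `_read_amuse`, the dict return values, and the exact signature of `load_snap` are assumptions made for the example, not Plonk's actual code.

```python
from pathlib import Path
from typing import Union

# Tuple of supported data sources; 'AMUSE' is the proposed addition.
_data_sources = ('Phantom', 'AMUSE')


def _read_phantom(path: Path) -> dict:
    # Placeholder standing in for the Phantom reader (hypothetical).
    return {'data_source': 'Phantom', 'file_path': path}


def _read_amuse(path: Path) -> dict:
    # Placeholder standing in for a future amuse.py reader (hypothetical).
    return {'data_source': 'AMUSE', 'file_path': path}


def load_snap(filename: Union[str, Path], data_source: str = 'Phantom') -> dict:
    """Dispatch to the reader that matches data_source."""
    if data_source not in _data_sources:
        raise ValueError(f'Unknown data source: {data_source!r}')
    path = Path(filename)
    if data_source == 'Phantom':
        return _read_phantom(path)
    # Only 'AMUSE' remains after the membership check above.
    return _read_amuse(path)
```

In the real module the readers would return a Snap object rather than a dict.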

Then we will need to add a module plonk/snap/readers/amuse.py that contains the actual reader. Have a look at the one for Phantom HDF5 snaps. The function generate_snap_from_file returns a Snap object. This is the function called in load_snap to load the Phantom snap.

The properties of Snap that it sets are:

  • snap.data_source, a string, e.g. 'Phantom'
  • snap.file_path, this is a pathlib.Path to the file
  • snap._file_pointer, this is the h5py.File object
  • snap.properties, this is a dictionary of properties, e.g. 'equation_of_state' set in _header_to_properties
  • snap.units, this is the units of the data, set in _header_to_properties

Now for the actual arrays of data. Plonk loads things lazily: the snap has an _array_registry, a dictionary in which each key is the name of an array and each value is a function that returns that array when called with the Snap object. The same goes for sink particle arrays.

So, we also need to set:

  • snap._array_registry, set in _populate_particle_array_registry
  • snap._sink_registry, set in _populate_sink_array_registry
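
To make the lazy-loading idea concrete, here is a toy sketch of a registry of functions. `MiniSnap`, its `_arrays` cache, and the `calls` list are hypothetical names for illustration only; in particular, caching the loaded array is an assumption of this sketch, not a statement about how Snap behaves.

```python
from typing import Callable, Dict, List


class MiniSnap:
    """Toy stand-in for Snap (hypothetical), showing only the lazy-loading idea."""

    def __init__(self) -> None:
        # name -> function that builds the array when called with the snap
        self._array_registry: Dict[str, Callable[['MiniSnap'], List[float]]] = {}
        # Arrays that have already been loaded (assumed cache).
        self._arrays: Dict[str, List[float]] = {}

    def __getitem__(self, name: str) -> List[float]:
        if name not in self._arrays:
            # Nothing is read until the array is first requested.
            self._arrays[name] = self._array_registry[name](self)
        return self._arrays[name]


calls: List[str] = []


def _mass(snap: MiniSnap) -> List[float]:
    calls.append('mass')  # record each actual load
    return [1.0, 1.0, 1.0]


snap = MiniSnap()
snap._array_registry['mass'] = _mass  # registering does not load anything
```

Accessing `snap['mass']` triggers the first and only call to `_mass`; later accesses reuse the stored list.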

Any of the arrays that are in the HDF5 file directly can be read like

array_registry['position'] = _get_dataset('xyz', 'particles')

In the example above, for Phantom HDF5 data, the particle positions are in the dataset 'particles/xyz'. I.e. using h5py directly, snap._file_pointer['particles/xyz'].
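
One way such a helper could work is as a closure that remembers the dataset path and reads it when handed a snap. The sketch below is an assumption about the shape of the idea, not Plonk's implementation: `FakeSnap` and `get_dataset` are hypothetical, and a plain dict stands in for the open h5py.File so the example runs without an HDF5 file.

```python
from typing import Any, Callable, Mapping


class FakeSnap:
    """Stand-in for Snap (hypothetical); only holds a file pointer."""

    def __init__(self, file_pointer: Mapping[str, Any]) -> None:
        self._file_pointer = file_pointer


def get_dataset(dataset: str, group: str) -> Callable[[FakeSnap], Any]:
    """Return a function that reads '<group>/<dataset>' from a snap's file."""

    def func(snap: FakeSnap) -> Any:
        # With a real h5py.File, appending [()] reads the dataset into memory.
        return snap._file_pointer[f'{group}/{dataset}']

    return func


array_registry = {'position': get_dataset('xyz', 'particles')}

# A plain dict plays the role of the open HDF5 file here.
snap = FakeSnap({'particles/xyz': [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]})
positions = array_registry['position'](snap)
```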

If the array doesn't exist on file, we need to write a small function to construct it. For example, Phantom snaps don't store the density; it is constructed from the smoothing length and mass. See for example _density:

def _density(snap: Snap) -> ndarray:
    # Density from mass and smoothing length: rho = m * (hfact / |h|)^3
    m = _mass(snap)
    h = _get_dataset('h', 'particles')(snap)
    hfact = snap.properties['smoothing_length_factor']
    return m * (hfact / np.abs(h)) ** 3
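
For a quick sanity check of that expression, here is the same formula applied to a single particle in plain Python. The function name `density` and the input values are made up for illustration and carry no physical meaning.

```python
import math


def density(m: float, h: float, hfact: float) -> float:
    """Same expression as in _density, for a single particle."""
    return m * (hfact / abs(h)) ** 3


# Illustrative values only, in arbitrary units.
rho = density(m=1.0e-6, h=0.1, hfact=1.2)
```

With these values rho comes out to about 1.728e-3, since (1.2 / 0.1) ** 3 is 1728 up to floating-point rounding.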

I hope it's not too confusing. The main point is that the array registry is a dictionary of key/values where the value is a function that is called inside Snap, when required, like

self._array_registry['position'](self)

Please let me know if that helps. Or if you need some more assistance.

@dmentipl
Owner

I've made some changes to what is described above. See https://github.com/dmentipl/plonk/compare/1d34668..master.

The comments at the top of https://github.com/dmentipl/plonk/blob/master/plonk/snap/readers/__init__.py explain some of the details.

@dmentipl
Owner

But the fundamentals are unchanged.

@rieder
Author

rieder commented Oct 7, 2020

Would it be possible to create a Plonk Snap object from a particle array that is already in memory, without writing to an HDF5 file and then reading that file again? That would probably be much easier (and more general) to write.

Perhaps it would help to have a chat about this?
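
A sketch of what the in-memory route might look like, given that the array registry maps names to functions: each function can simply return an array that is already in memory. This is only an illustration under assumptions. `registry_from_arrays` and the sample data are hypothetical, and Plonk may need more than a registry (units, properties) to build a full Snap.

```python
from typing import Callable, Dict, Mapping, Sequence

Array = Sequence[float]


def registry_from_arrays(
    arrays: Mapping[str, Array],
) -> Dict[str, Callable[[object], Array]]:
    """Build an array registry whose functions return in-memory arrays."""

    def make_getter(name: str) -> Callable[[object], Array]:
        # Bind name now, so each entry returns its own array.
        def getter(snap: object) -> Array:
            # The snap argument is unused; nothing is read from disk.
            return arrays[name]

        return getter

    return {name: make_getter(name) for name in arrays}


# Hypothetical data, e.g. copied out of an AMUSE particle set beforehand.
in_memory = {'mass': [1.0, 2.0], 'smoothing_length': [0.1, 0.2]}
registry = registry_from_arrays(in_memory)
masses = registry['mass'](None)
```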

@dmentipl
Owner

dmentipl commented Oct 8, 2020

Hi @rieder,

Thanks for the suggestion. That sounds like a good idea.

Unfortunately, I don't have time at the moment to work on it as I'm writing up my PhD thesis. Hopefully, I'll have more time in December, or January next year.

@rieder
Author

rieder commented Oct 8, 2020

of course, that would be fine. good luck with the writing!
