Major rework of BEAST files #299
Comments
A file format that is more understandable than HDF5 would be nice. Possibilities are FITS or ASDF, but there are concerns about speed.
Fewer files would be easier to manage. Possibilities
For tables, we should use astropy.table to take advantage of the extensive work done by the larger community.
I'd like to change the input file format: make
Good point. It would be great to be able to just use a text file. One impact is that it would be easy to set up the BEAST to run from any directory with the BEAST scripts installed on the system.
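As a rough illustration of the text-file idea, here is a minimal sketch using Python's standard-library configparser. The file name, section names, and keys are all hypothetical and are not the BEAST's actual settings; the only point is that a plain-text file can be read from whatever directory the run is started in.

```python
import configparser
import tempfile
from pathlib import Path

# Hypothetical settings file: the section and key names below are
# illustrative assumptions, not the BEAST's real settings.
EXAMPLE_SETTINGS = """\
[project]
name = example_run
obsfile = data/example_phot.fits

[grid]
logt_min = 6.0
logt_max = 10.13
logt_step = 1.0
"""


def read_settings(path):
    """Read a plain-text settings file into a nested dict of strings."""
    parser = configparser.ConfigParser()
    with open(path) as f:
        parser.read_file(f)
    return {section: dict(parser[section]) for section in parser.sections()}


def demo():
    """Write the example file to a temporary directory and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "beast_settings.txt"
        path.write_text(EXAMPLE_SETTINGS)
        return read_settings(path)


settings = demo()
# Values come back as strings, so numeric settings need an explicit cast.
logt_min = float(settings["grid"]["logt_min"])
```

All values are read as strings, so any real version would need to decide where type conversion and validation happen.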
HDF5 info for Python: h5py is closer to numpy than pytables; pytables is more database oriented and includes non-numpy data types.
ASDF info: standard and Python implementation.
@meredith-durbin: I have a memory of you comparing ASDF and HDF5 for speed, but can't find where your results were documented (an issue maybe?). Can you provide a pointer to your results? Or am I remembering something incorrectly and you did not do this?
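In case it helps to redo such a comparison, below is a minimal sketch of a round-trip timing harness. It is not the comparison referred to above. To stay runnable without extra packages it times two standard-library stand-ins (json and pickle) on made-up data; the idea would be to pass in ASDF and HDF5 writer/reader functions instead. The function names and the fake data are assumptions for illustration.

```python
import json
import pickle
import tempfile
import time
from pathlib import Path


def time_roundtrip(write, read, path, data, repeats=3):
    """Return (best write seconds, best read seconds, data read back)."""
    write_times, read_times = [], []
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        write(path, data)
        write_times.append(time.perf_counter() - start)

        start = time.perf_counter()
        result = read(path)
        read_times.append(time.perf_counter() - start)
    return min(write_times), min(read_times), result


# Stand-in formats from the standard library (assumption: swap these for
# ASDF and HDF5 writer/reader functions to do the actual comparison).
def write_json(path, data):
    with open(path, "w") as f:
        json.dump(data, f)


def read_json(path):
    with open(path) as f:
        return json.load(f)


def write_pickle(path, data):
    with open(path, "wb") as f:
        pickle.dump(data, f)


def read_pickle(path):
    with open(path, "rb") as f:
        return pickle.load(f)


def compare_formats(data):
    """Time each format on the same data and check the round trip is lossless."""
    formats = {
        "json": (write_json, read_json),
        "pickle": (write_pickle, read_pickle),
    }
    timings = {}
    with tempfile.TemporaryDirectory() as tmp:
        for name, (write, read) in formats.items():
            path = Path(tmp) / f"test.{name}"
            w, r, back = time_roundtrip(write, read, path, data)
            if back != data:
                raise ValueError(f"{name} round trip changed the data")
            timings[name] = (w, r)
    return timings


# Made-up data standing in for a BEAST output array.
fake_data = {"lnp": [float(i) for i in range(10000)]}
timings = compare_formats(fake_data)
```

Taking the best of a few repeats reduces noise from disk caching, but realistic numbers would need file sizes close to those of the production runs.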
Another possibly relevant piece of information: I got this warning when running one of the large files for PHAT production runs. I have no idea if switching file formats would make this better.
That does not sound good. Any more information on which file this is for? Maybe the lnp file?
From the Feb 2019 BEAST HackDay, the idea emerged that a major rework of the BEAST file data formats could provide significant benefits. One particular issue is the HDF5 format, which has been hard for multiple BEAST developers to understand or manipulate. The benefits could be files that are easier to understand and manipulate, less code for reading and writing (and so less code maintenance), and formats better suited to Mega-BEAST (and BEAST) needs.
Currently, the formats are a mix of ASCII CSV, HDF5, and FITS files.
So, what would the format(s) of the BEAST files look like without considering the current formats? In other words, if we started with a clean slate, what would we do? We understand the needs of the BEAST and Mega-BEAST much better now.
Please comment with ideas/proposals.
Existing issues that touch on this topic:
Output file format: #53
Size of the files: #295
HDF5 issues: #186, #262
unclosed files: #64
eztables: #9, #10