There are methods to export to HDF5 or FITS files, but no method to export a (small enough) selection directly to a classical array or table format such as a numpy array, an astropy table and/or a pandas DataFrame.
ds.evaluate() might be part of the solution, but it does not accept a selection option.
What I had in mind is a few methods in the Dataset class, like:
as_dataframe (for pandas)
as_astropy_table
as_numpy (a dict with numpy arrays)
It makes sense for them to take a 'selection' argument (see for instance Dataset.scatter), plus the arguments column_names and virtual like export_hdf5, and possibly a strings argument like Dataset.get_column_names(..). It may be convenient to give the as_numpy method an extra with_units=False argument to attach units (see http://docs.astropy.org/en/stable/units/ ). Astropy tables can also take units; for pandas DataFrames I am not sure.
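To make the idea concrete, here is a rough sketch of what as_numpy could look like. It is not existing vaex code: a plain dict of numpy arrays stands in for the Dataset, and the selection is assumed to be a boolean mask instead of a named selection.

```python
import numpy as np


def as_numpy(columns, selection=None, column_names=None):
    """Hypothetical helper: return a dict of numpy arrays, optionally filtered.

    columns      -- dict of column name -> 1-D array (stand-in for a Dataset)
    selection    -- boolean mask, or None to keep every row (assumption)
    column_names -- names to export, or None for all columns
    """
    names = list(columns) if column_names is None else list(column_names)
    mask = None if selection is None else np.asarray(selection, dtype=bool)
    result = {}
    for name in names:
        values = np.asarray(columns[name])
        # Apply the selection per column so only the selected rows are copied.
        result[name] = values if mask is None else values[mask]
    return result


# Usage: export only column "y" for the rows where x > 2.
data = {"x": np.arange(5), "y": np.arange(5) ** 2}
subset = as_numpy(data, selection=data["x"] > 2, column_names=["y"])
```

An as_dataframe or as_astropy_table method could probably be built on top of the same dict, since both pandas.DataFrame and astropy.table.Table accept a dict of arrays.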
You might want to check how Dataset.scatter protects against converting too much data, or see https://github.com/maartenbreddels/vaex/blob/2754156da08fd4fad2555fdf0d85373ebae10a35/vaex/export.py#L291 for how I protect against using too much memory, and similarly in the GUI: https://github.com/maartenbreddels/vaex/blob/master/vaex/ui/main.py#L1070 (I just noticed that the GUI code should be refactored to use vaex.utils).
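As an illustration of that kind of guard, a minimal sketch is below. It is not the code behind the links above; the function name, the 1 GiB default limit and the bytes-per-row estimate are all assumptions.

```python
def check_export_size(n_rows, bytes_per_row, max_bytes=1 << 30):
    """Hypothetical guard: refuse conversions that would need too much memory.

    n_rows        -- number of rows in the selection
    bytes_per_row -- summed item size of the exported columns (assumption)
    max_bytes     -- upper limit, 1 GiB by default (arbitrary choice)
    Returns the estimated size in bytes when it fits within the limit.
    """
    needed = n_rows * bytes_per_row
    if needed > max_bytes:
        raise MemoryError(
            "export would need %d bytes, limit is %d" % (needed, max_bytes)
        )
    return needed


# Usage: two float64 columns (16 bytes per row) and 1000 selected rows.
estimated = check_export_size(1000, 16)
```

Calling a check like this before materializing the arrays means the conversion fails early instead of exhausting memory halfway through.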
Please also include unit tests. Looking forward to seeing your PR!