add method to export small data set as numpy array, astropy table and/or pandas DataFrame #9

Closed
bombrun opened this issue Nov 3, 2016 · 2 comments

Comments

bombrun commented Nov 3, 2016

There are methods to export to an HDF5 or FITS file, but no method to export a (small enough) selection directly to a classical array or table format such as a numpy array, an astropy table and/or a pandas DataFrame.

ds.evaluate() might be part of the solution, but it does not accept a selection option.

@maartenbreddels
Member

@quiquinSP:

What I had in mind are a few methods in the Dataset class, like:

  • as_dataframe (for pandas)
  • as_astropy_table
  • as_numpy (a dict with numpy arrays)

It makes sense for them to take a 'selection' argument (see, for instance, Dataset.scatter), the arguments column_names and virtual like export_hdf5, and possibly a strings argument like Dataset.get_column_names(..). It may be convenient to give the as_numpy method an extra with_units=False argument to attach the units (see http://docs.astropy.org/en/stable/units/ ). Astropy tables can also take units; for pandas DataFrames I am not sure.
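
To make the proposal concrete, here is a minimal sketch of how as_numpy and as_dataframe could behave with the column_names and selection arguments. It is not vaex code: TinyDataset, its select method and the "default" selection name are hypothetical stand-ins for illustration, and the virtual, strings and with_units arguments are left out.

```python
import numpy as np


class TinyDataset:
    """Hypothetical stand-in for a dataset: named columns plus named boolean selections."""

    def __init__(self, columns):
        self.columns = {name: np.asarray(values) for name, values in columns.items()}
        self.selections = {}

    def select(self, mask, name="default"):
        # Store a boolean mask under a selection name (assumed convention).
        self.selections[name] = np.asarray(mask, dtype=bool)

    def as_numpy(self, column_names=None, selection=None):
        """Return a dict mapping column name to numpy array, optionally filtered."""
        names = list(self.columns) if column_names is None else list(column_names)
        if selection is None or selection is False:
            # No selection: return a copy of every requested column.
            return {name: self.columns[name].copy() for name in names}
        # selection=True means the default selection, a string names another one.
        key = "default" if selection is True else selection
        mask = self.selections[key]
        return {name: self.columns[name][mask] for name in names}

    def as_dataframe(self, column_names=None, selection=None):
        # Imported lazily so that as_numpy works without pandas installed.
        import pandas as pd

        return pd.DataFrame(self.as_numpy(column_names, selection))


ds = TinyDataset({"x": [1.0, 2.0, 3.0, 4.0], "y": [10.0, 20.0, 30.0, 40.0]})
ds.select(ds.columns["x"] > 2.0)
subset = ds.as_numpy(column_names=["y"], selection=True)  # only rows where x > 2
```

Building as_dataframe on top of as_numpy keeps the selection and column handling in one place, so an astropy variant could reuse it in the same way.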
You might want to check in Dataset.scatter how I protect against converting too much data, or see https://github.com/maartenbreddels/vaex/blob/2754156da08fd4fad2555fdf0d85373ebae10a35/vaex/export.py#L291 for how I protect against using too much memory, and similarly in the GUI https://github.com/maartenbreddels/vaex/blob/master/vaex/ui/main.py#L1070 (I just noticed that the GUI code should be refactored to use vaex.utils).
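
As a rough illustration of such a guard (this is not the code behind the links above), one could estimate the memory an export needs from the row count and column dtypes and refuse when it exceeds a limit. The function name check_export_size and the 100 MiB default are assumptions made up for this sketch.

```python
import numpy as np


def check_export_size(n_rows, dtypes, max_bytes=100 * 1024**2):
    """Return the estimated size in bytes, or raise MemoryError if it exceeds max_bytes."""
    # Bytes needed for one row across all requested columns.
    bytes_per_row = sum(np.dtype(dtype).itemsize for dtype in dtypes)
    needed = n_rows * bytes_per_row
    if needed > max_bytes:
        raise MemoryError(
            "export would need about %d bytes, which is more than the limit of %d"
            % (needed, max_bytes)
        )
    return needed


# Two float64 columns of 1000 rows: 1000 * (8 + 8) bytes.
needed = check_export_size(1000, ["float64", "float64"])
```

A check like this would run before any array is materialized, so an oversized selection fails early instead of exhausting memory halfway through the conversion.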
Please also include unit tests; looking forward to seeing your PR!

@maartenbreddels
Member

Implemented in 7ca315b, with a short description here:
http://vaex.astro.rug.nl/latest/getting_data_in_vaex.html#getting-your-data-out
It should be available in the next release.
