Column-level metadata/functional arguments to aq.matches #98

Open
bmschmidt opened this issue Feb 11, 2021 · 1 comment
Labels
help wanted: For potential community contributions

Comments

@bmschmidt

bmschmidt commented Feb 11, 2021

This is an alternate solution related to the question at #33 about selecting certain subsets of columns based on types.

Arquero could have a metadata slot on each column that borrows from Apache Arrow's support for table- and column-level metadata. If my arrow column is of type arrow.utf8 with a metadata field saying "language": "English", it would be useful to have an arquero table derived from it at some point declare table.metadata = {"language": "English", "arrow_type": "utf8"}. The pyarrow feather export functions do something similar with pandas frames: the feather metadata includes a description of the pandas dtypes.
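
As a rough illustration of the idea, here is a sketch of how per-column metadata might be gathered from an Arrow schema. It assumes a schema-like object whose `fields` each carry a `name`, a `type`, and a `metadata` Map (which is how Arrow JS exposes them, as far as I know). `collectColumnMetadata` and the `arrow_type` key are hypothetical names, not part of Arquero or Arrow, and the `schema` object is a hand-built stand-in rather than a real Arrow schema.

```javascript
// Hypothetical helper: build a { columnName: metadata } record from an
// Arrow-style schema. Nothing here is an existing Arquero API.
function collectColumnMetadata(schema) {
  const result = {};
  for (const field of schema.fields) {
    // Copy the field's own key/value metadata (assumed to be a Map).
    const meta = Object.fromEntries(field.metadata ?? []);
    // Record the type name alongside it, lowercased for easy matching.
    meta.arrow_type = String(field.type).toLowerCase();
    result[field.name] = meta;
  }
  return result;
}

// Stand-in for the schema of a real Arrow table (assumption for illustration).
const schema = {
  fields: [
    { name: "title", type: "Utf8", metadata: new Map([["language", "English"]]) },
    { name: "year", type: "Int32", metadata: new Map() },
  ],
};

const columnMetadata = collectColumnMetadata(schema);
console.log(columnMetadata.title);
```

Keeping the result as a plain object keyed by column name means it could be attached to a table or a column later without committing to either design.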

If a function passed as an argument to aq.matches received the full column (not just the name, as with strings and regexes), @ericemc3's case in #33 could be expressed something like this:

table.select(aq.matches(col => col.metadata.arrow_type.match(/int|float/)))
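
Until something like that exists, a similar effect can probably be approximated outside Arquero. This is a minimal sketch, assuming the metadata lives in a plain object keyed by column name. `metadataByColumn` and `columnsWhere` are hypothetical names, and the final comment assumes the resulting names are handed to `table.select` in the usual way.

```javascript
// Hypothetical side table of per-column metadata, keyed by column name.
const metadataByColumn = {
  title: { arrow_type: "utf8", language: "English" },
  year: { arrow_type: "int32" },
  score: { arrow_type: "float64" },
};

// Return the names of columns whose metadata satisfies the predicate.
function columnsWhere(metadata, predicate) {
  return Object.keys(metadata).filter((name) => predicate(metadata[name], name));
}

// Pick the numeric columns by matching on the recorded Arrow type.
const numeric = columnsWhere(metadataByColumn, (meta) => /int|float/.test(meta.arrow_type));
console.log(numeric);

// With an Arquero table holding these columns: table.select(...numeric)
```

The drawback is that the side table has to be kept in sync by hand as columns are derived or renamed, which is the main argument for a metadata slot on the column itself.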

col.metadata would also conceivably be a useful place to expose information about what autotype inference in fromCSV and fromJSON decided to do.

Could try to find time for a pull request, but obviously this hinges on questions about how/whether you want to use slots on the column object other than data.

@jheer
Member

jheer commented Feb 11, 2021

I had also thought about generally allowing a column metadata property, but it hasn’t (yet) been a priority. PRs welcome!

@jheer added the "help wanted" label Feb 11, 2021