Squib::DataFrame: better xlsx and csv utilities #156

Closed
andymeneely opened this issue Apr 26, 2016 · 0 comments

andymeneely commented Apr 26, 2016

I think the "hash of arrays" idea works really well in Squib, but I often wonder whether we could use more utility methods for it. For example, data['Name'].size feels like a hack to me; it should just be data.rows or something.

Here's what I have in mind. I'll edit this list over time.

  • Call it a Squib::DataFrame
  • It acts like a hash of arrays. So it still supports the [] and returns arrays of values.
  • It has an #nrows method that returns the maximum length of a column across all columns
  • It has an #ncolumns method that returns the number of columns
  • It has a #columns method that returns the column names
  • It has a #pretty_print method that works a lot like my to_json, for better version-control diffing. It can also print out a given card.
  • It has its own trim_whitespace functionality so that it isn't duplicated (csv and xlsx still turn it on or off).
  • It has some metaprogramming that allows for dynamic methods to be declared based on snake-cased access to columns. For example data['Actor Name'] could be accessed by data.actor_name. We could even add another method like #bind_them! that dynamically declares methods in Squib::Deck based on column names. Or are these "too magical"?
  • It provides other validation at construction time, such as detecting duplicate columns.
  • Quantity explosion is implemented in this class, so we're not duplicating code there either.
  • More custom functionality around auto-converting to integers or floats or whatever. Nobody's complained about this yet, but it could be nice for some narrow cases.
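
To make a few of the bullets above concrete, here is a minimal sketch of the row/column helpers and the snake-cased column access. It is only an illustration under stated assumptions, not the code that ended up in Squib: the class name SketchDataFrame and the private helpers column_for and snake_case are hypothetical, and the snake-casing rule (lowercase, with runs of non-alphanumerics collapsed to underscores) is just one plausible choice.

```ruby
# Hypothetical sketch of the proposed data frame; NOT Squib's actual code.
class SketchDataFrame
  # Wraps a hash of arrays: { 'Column Name' => [values...] }
  def initialize(hash = {})
    @hash = hash
  end

  # Still behaves like a hash of arrays: returns the column's values.
  def [](name)
    @hash[name]
  end

  # Column names, in insertion order.
  def columns
    @hash.keys
  end

  def ncolumns
    @hash.size
  end

  # Maximum column length across all columns (0 when there are no columns).
  def nrows
    @hash.values.map(&:size).max || 0
  end

  # Dynamic access: data.actor_name resolves to data['Actor Name'].
  def method_missing(method, *args)
    name = column_for(method)
    name.nil? ? super : @hash[name]
  end

  def respond_to_missing?(method, include_private = false)
    !column_for(method).nil? || super
  end

  private

  # Finds the column whose snake-cased name matches the method name.
  def column_for(method)
    columns.find { |c| snake_case(c) == method.to_s }
  end

  # Assumed rule: lowercase, collapse non-alphanumerics to single underscores.
  def snake_case(name)
    name.to_s.strip.downcase.gsub(/[^a-z0-9]+/, '_').gsub(/\A_|_\z/, '')
  end
end

data = SketchDataFrame.new({ 'Actor Name' => %w[Ann Bob Cy], 'Cost' => [1, 2] })
data.nrows       # longest column wins, so ragged columns still count
data.ncolumns
data.columns
data.actor_name  # same array as data['Actor Name']
```

Using method_missing keeps the hash itself untouched, at the cost of the "too magical" feel mentioned above; defining singleton methods once at construction time would be an alternative with the same surface.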
andymeneely added a commit that referenced this issue Oct 28, 2016
@andymeneely andymeneely added this to the v0.12 milestone Nov 1, 2016