Add helper functions to take the "head" of vector or matrix objects. #69
This uses zero-copy export and import to try to be as efficient (including memory efficient) as possible, but it does modify `.gb_obj`, so it's a little funky. I'm thinking of using this to improve the repr for sparse data.
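To illustrate the idea, here is a minimal sketch of the export, slice, re-import pattern. It uses a hypothetical `ToyVector` class backed by plain Python lists, not the real SuiteSparse export/import calls or `.gb_obj`, so every name in it is an assumption made for illustration.

```python
class ToyVector:
    """Hypothetical stand-in for a sparse vector (not the library's class)."""

    def __init__(self, indices, values):
        self._indices = list(indices)
        self._values = list(values)

    def export(self):
        # Hand the underlying buffers to the caller and leave this object empty.
        indices, values = self._indices, self._values
        self._indices = self._values = None
        return indices, values

    def import_(self, indices, values):
        # Take ownership of the buffers again without copying them.
        self._indices, self._values = indices, values


def head(vec, n):
    """Return copies of the first n stored entries, then restore vec."""
    indices, values = vec.export()
    try:
        # Only the first n entries are copied; the full buffers are untouched.
        return indices[:n], values[:n]
    finally:
        # Give the original data back even if slicing raised an error.
        vec.import_(indices, values)


v = ToyVector([0, 3, 7, 9], [1.0, 2.0, 3.0, 4.0])
original_buffer = v._values
idx, vals = head(v, 2)
```

The `try`/`finally` is a design choice of this sketch: because the object is temporarily emptied, the buffers should be handed back on every code path.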
jim22k left a comment
I really like this approach to taking ownership, grabbing a portion of the data, and giving the original data back. Nice trick!
Did you plan to add a method so this can be called as `myvec.ss.head(2)`?
I'm undecided. I'm tempted to. Probably. What do you think? I may also add I can't guarantee this method for any backend, so it would be inappropriate to add this functionality to |
Also, add a `dtype=` keyword argument to `head` so that it better matches `to_values`.
We don't have |
    (mask is None or mask.structure)
    and df.shape != matrix.shape
    and min(matrix._nvals, max_rows if matrix._nvals <= max_rows else min_rows)
    > 2 * df.count().sum()
Note to self: this condition was chosen so I wouldn't need to rewrite any tests. "Friendlier" conditions for when to switch to the sparse repr may exist.
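For reference, a rough standalone sketch of how I read this condition. The function and parameter names are hypothetical, and `shown_in_dense` stands in for `df.count().sum()`, which I am assuming is the number of non-missing cells in the truncated dense DataFrame.

```python
def rows_to_show(nvals, max_rows, min_rows):
    # Show every stored value if they fit in max_rows, otherwise only min_rows.
    return nvals if nvals <= max_rows else min_rows


def prefer_sparse_repr(nvals, shown_in_dense, shapes_differ,
                       max_rows, min_rows, mask_ok=True):
    # Number of values the sparse repr would be able to display.
    candidate = min(nvals, rows_to_show(nvals, max_rows, min_rows))
    # Switch only when the dense view was truncated (shapes differ) and the
    # sparse view would show more than twice as many values as the dense one.
    return mask_ok and shapes_differ and candidate > 2 * shown_in_dense


many_hidden = prefer_sparse_repr(1000, 3, True, max_rows=60, min_rows=10)
borderline = prefer_sparse_repr(1000, 5, True, max_rows=60, min_rows=10)
not_truncated = prefer_sparse_repr(4, 1, False, max_rows=60, min_rows=10)
```

With these illustrative numbers, the first call favors the sparse repr (10 > 6), while the second does not because 10 is not strictly greater than 10.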
    # SS, SuiteSparse-specific: head
    num_rows = matrix._nvals if matrix._nvals <= max_rows else min_rows
    if matrix._is_transposed:
        cols, rows, vals = matrix._matrix.ss.head(num_rows, sort=True)
Note to self: sorting here for a better repr result. If we find this is too slow for large objects, we can update it so we only sort for small objects (but we need to figure out where the threshold may be).
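A possible shape for that fallback, as a toy sketch only: `head_entries` and `sort_threshold` are hypothetical names, the entries are plain `(row, col, value)` tuples rather than a real matrix, and the default threshold is a placeholder, not a measured value.

```python
def head_entries(entries, n, sort_threshold=10_000):
    """Return the first n entries, sorting first only for small inputs."""
    if len(entries) <= sort_threshold:
        # Small object: pay for a sort so the repr shows the lowest indices.
        return sorted(entries)[:n]
    # Large object: skip the sort and return entries in storage order.
    return entries[:n]


entries = [(2, 0, 3.0), (0, 1, 1.0), (1, 1, 2.0)]
small = head_entries(entries, 2)
large = head_entries(entries, 2, sort_threshold=2)
```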
This was pretty quick to pull together. These aren't the most robust tests, but they're better than nothing. Improving the repr is where the "real" work will be!