eriknw commented Feb 25, 2021

This uses zero-copy export and import to try to be as efficient (including memory efficient) as possible, but it does modify `.gb_obj`, so it's a little funky.

I'm thinking of using this to improve the repr for sparse data.

This was pretty quick to pull together. These aren't the most robust tests, but they're better than nothing. Improving the repr is where the "real" work will be!
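
To make the approach concrete, here is a minimal sketch of the pattern: take ownership of the data, copy only a prefix, then give the original data back. `ToyVector`, `export`, `import_`, and this `head` are hypothetical stand-ins built on plain Python lists. They are not the grblas or SuiteSparse API, and a real zero-copy export would hand over raw buffers rather than lists.

```python
class ToyVector:
    """Hypothetical stand-in for a sparse vector (not the grblas API)."""

    def __init__(self, indices, values):
        self._indices = indices
        self._values = values

    def export(self):
        # Hand the underlying containers to the caller and leave this
        # object empty, mimicking a zero-copy "move" of ownership.
        indices, values = self._indices, self._values
        self._indices = self._values = None
        return indices, values

    def import_(self, indices, values):
        # Take the containers back without copying them.
        self._indices, self._values = indices, values


def head(vec, n):
    """Return copies of the first n stored entries of vec."""
    indices, values = vec.export()  # vec is temporarily empty here
    try:
        # Slicing a list copies only the first n items.
        return indices[:n], values[:n]
    finally:
        # Always give the original data back, even if slicing fails.
        vec.import_(indices, values)
```

The `try`/`finally` is one way to limit the "funky" part: the object is only in its emptied state for the duration of the slice.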


jim22k left a comment

I really like this approach to taking ownership, grabbing a portion of the data, and giving the original data back. Nice trick!

Did you plan to add a method so this can be called as myvec.ss.head(2)?

eriknw commented Feb 25, 2021

> Did you plan to add a method so this can be called as `myvec.ss.head(2)`?

I'm undecided. I'm tempted to. Probably. What do you think?

I may also add a `dtype=` keyword argument to `head` so that it matches `to_values`.

I can't guarantee this method is available for every backend, so it would be inappropriate to add this functionality to `to_values` directly.

Also, add `dtype=` keyword argument to head so that it better matches `to_values`.

eriknw commented Feb 25, 2021

`myvec.ss.head(2)` added. Also added a `dtype=` keyword argument to make it behave more like `.to_values()`.

We don't have an `A.T.ss` attribute yet, but we could add e.g. `A.T.ss.head()` (just like how there's `A.T.to_values()`), where we switch rows and cols.
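
As a rough illustration of the two ideas in this comment (a `dtype=` argument and swapping rows and cols for a transposed view), here is a small sketch over a toy COO layout. `head_coo` and its `transposed` flag are hypothetical names used only for illustration; this is not how grblas implements `head`, and `dtype` here is just a Python callable such as `float`.

```python
def head_coo(rows, cols, vals, n, *, dtype=None, transposed=False):
    """Return the first n (rows, cols, vals) entries of a toy COO matrix.

    Hypothetical helper for illustration only.
    """
    r, c, v = rows[:n], cols[:n], vals[:n]
    if dtype is not None:
        # Cast values on the way out, in the spirit of to_values(dtype=...).
        v = [dtype(x) for x in v]
    if transposed:
        # A transposed view reports the same entries with rows and cols swapped.
        r, c = c, r
    return r, c, v
```

For example, `head_coo([0, 0, 2], [1, 3, 0], [1, 2, 3], 2, dtype=float, transposed=True)` takes the first two stored entries, casts the values to `float`, and returns the column indices in the row position.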

(mask is None or mask.structure)
and df.shape != matrix.shape
and min(matrix._nvals, max_rows if matrix._nvals <= max_rows else min_rows)
> 2 * df.count().sum()

Note to self: this condition was chosen so I wouldn't need to rewrite any tests. "Friendlier" conditions for when to switch to the sparse repr may exist.
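
For readers skimming the excerpt, the condition can be restated as a standalone predicate. This is only a sketch: the function name and parameters are hypothetical, `df_count` stands for `df.count().sum()` (assuming `df` is the pandas DataFrame holding the truncated dense preview), and `mask_ok` stands for `mask is None or mask.structure`.

```python
def use_sparse_repr(nvals, max_rows, min_rows, df_shape, matrix_shape,
                    df_count, mask_ok=True):
    """Hypothetical restatement of the condition in the diff excerpt."""
    # Number of entries the sparse repr would show.
    num_rows = nvals if nvals <= max_rows else min_rows
    return (
        mask_ok
        # The dense preview was truncated, so it differs in shape.
        and df_shape != matrix_shape
        # The sparse repr would show more than twice as many values
        # as are visible in the truncated dense preview.
        and min(nvals, num_rows) > 2 * df_count
    )
```

With `nvals=100`, `max_rows=60`, and `min_rows=10`, the sparse repr would show 10 entries, so it is chosen only if the dense preview shows fewer than 5 values.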

# SS, SuiteSparse-specific: head
num_rows = matrix._nvals if matrix._nvals <= max_rows else min_rows
if matrix._is_transposed:
    cols, rows, vals = matrix._matrix.ss.head(num_rows, sort=True)

Note to self: sorting here for a better repr result. If we find this is too slow for large objects, we can update it so we only sort for small objects (but we need to figure out where the threshold may be).
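
One possible shape for that follow-up, sketched on a plain list of `(row, col, value)` tuples. `SORT_THRESHOLD` and `head_for_repr` are hypothetical, the threshold value is a placeholder rather than a measured number, and sorting the whole list before taking the prefix is an assumption about what `sort=True` implies.

```python
SORT_THRESHOLD = 10_000  # placeholder; the real cutoff would need benchmarking


def head_for_repr(entries, n, *, sort_threshold=SORT_THRESHOLD):
    """Return up to n entries, sorted only when the object is small."""
    if len(entries) <= sort_threshold:
        # Small object: sort so the repr shows entries in index order.
        return sorted(entries)[:n]
    # Large object: skip the sort and show entries in stored order.
    return entries[:n]
```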

@eriknw eriknw merged commit e279df2 into python-graphblas:main Feb 27, 2021