Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BMat8: rows array by reference #11

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

james-d-mitchell
Copy link
Member

This allows for fewer memory allocations if repeatedly computing row spaces.

@@ -193,6 +193,7 @@ class BMat8 {
size_t nr_rows() const;
//! Returns a \c std::vector for rows of \c this
std::vector<uint8_t> rows() const;
void rows(std::array<uint8_t, 8>&) const;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just returning a std::array<uint8_t, 8> instead of a std::vector<uint8_t> should remove all the allocation that motivate the change of API.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but I don't think it's possible to overload functions in C++ by return type, so simply adding a member function with return type std::array<uint8_t, 8> won't work. For backwards compatibility I suggest not removing the member function std::vector<uint8_t> rows() const, maybe the new member function could be called rows_array or something similar?

Copy link
Collaborator

@hivert hivert Jan 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't pay attention to the fact that you where keeping the old function. rows_array looks good to me. Then rows should be an alias for a rows_vector.

inline void BMat8::rows(std::array<uint8_t, 8>& rows) const {
rows.fill(0);
for (size_t i = 0; i < 8; ++i) {
rows[i] = static_cast<uint8_t>(_data << (8 * i) >> 56);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There might be an even (slightly) faster way to do that using _mm_set_epi64x(_data, 0) and the _mm_extract_epi8 instruction. Actually some cast could do it, but we have to avoid some undefined behavior.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be most excellent.

@james-d-mitchell
Copy link
Member Author

I updated this as discussed. I haven't added any tests because I can't get them to compile (see Issue #12).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants