Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve 4x4 matrix multiplication #158

Merged
merged 2 commits into from
Jun 3, 2021

Conversation

lgritz
Copy link
Contributor

@lgritz lgritz commented Jun 2, 2021

operator* and *= muliplying two Matrix44s had two efficiency problems:

  1. They both involved a temporary matrix using the default
    constructor, which sets it to the identity matrix, before then
    immediately putting other values into every element. (I think the
    reason it did not use the special constructor that leaves it
    uninitialized is because that conflicted with the desire for this
    to be constexpr. Uninitialized data during its C++ visible lifetime
    is a no-no for constexpr.)

  2. It relied on the multiply() helper, which took the ADDRESS of
    elements of the matrix. As discussed before in an earlier vector
    ops overhaul, this interferes with good code generation when the
    matrix multiply is inside a loop that we hope will
    autovectorize. Taking the address of an array and then using that
    as a pointer can run up against the fragility of the compiler in
    knowing when it can keep things in SIMD registers, etc.

The solution is as follows:

Add a new matrix() helper that takes two matrix reference params and
returns the result, and its implementation is fully inlined, no pointers
(and thus can be constexpr for C++14). This specific implementation
is backported from OSL, where it was written by Alex Wells from Intel,
who crafted it very carefully to compile to better (and more likely
to be autovectorized) code than the previous Imath code.

I implemented the other 4x4 matrix multiplies in terms of that. This
allows them to remain constexpr(14) while avoiding the unfortunate
use of the initializing constructor.

Signed-off-by: Larry Gritz lg@larrygritz.com

`operator*` and `*=` of two Matrix44s had two efficiency problems:

1. They both involved a temporary matrix using the default
   constructor, which sets it to the identity matrix, before then
   immediately putting other values into every element. (I think the
   reason it did not use the special constructor that leaves it
   uninitialized is because that conflicted with the desire for this
   to be constexpr. Uninitialized data during its C++ visible lifetime
   is a no-no for constexpr.)

2. It relied on the multiply() helper, which took the ADDRESS of
   elements of the matrix. As discussed before in an earlier vector
   ops overhaul, this interferes with good code generation when the
   matrix multiply is inside a loop that we hope will
   autovectorize. Taking the address of an array and then using that
   as a pointer can run up against the fragility of the compiler in
   knowing when it can keep things in SIMD registers, etc.

The solution is as follows:

Add a new matrix() helper that takes two matrix reference params and
returns the result, and its implementation is fully inlined, no pointers
(and thus can be constexpr for C++14). This specific implementation
is backported from OSL, where it was written by Alex Wells from Intel,
who crafted it very carefully to compile to better (and more likely
to be autovectorized) code than the previous Imath code.

I implemented the other 4x4 matrix multiplies in terms of that. This
allows them to remain constexpr(14) while avoiding the unfortunate
use of the initializing constructor.

Signed-off-by: Larry Gritz <lg@larrygritz.com>
Copy link
Member

@cary-ilm cary-ilm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@cary-ilm cary-ilm merged commit b14f035 into AcademySoftwareFoundation:master Jun 3, 2021
@lgritz lgritz deleted the lg-matrix branch September 22, 2021 17:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants