Affine3D #157

emilk · 2021-04-06T12:12:56Z

Part of #25. A simpler version of #156.

This PR introduces the Affine3D type, implemented as 3x Vec4 (NOTE: only f32 version included).
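For readers unfamiliar with the layout, here is a minimal scalar sketch of what "3x Vec4" means: three stored rows of four floats, with the fourth row (0, 0, 0, 1) left implicit. `Affine3DSketch` and its methods are hypothetical names for illustration only, and plain arrays stand in for glam's `Vec4`; this is not the code in this PR.

```rust
/// Sketch of an affine transform stored as three rows of four `f32`s.
/// The fourth row is implicitly (0, 0, 0, 1) and is never stored.
/// `Affine3DSketch` is a hypothetical name, not the type added in this PR.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Affine3DSketch {
    /// rows[i] = [m_i0, m_i1, m_i2, t_i]
    pub rows: [[f32; 4]; 3],
}

impl Affine3DSketch {
    pub const IDENTITY: Self = Self {
        rows: [
            [1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
        ],
    };

    /// Identity rotation/scale with the translation in the last lane of each row.
    pub fn from_translation(t: [f32; 3]) -> Self {
        let mut out = Self::IDENTITY;
        for i in 0..3 {
            out.rows[i][3] = t[i];
        }
        out
    }

    /// Points have an implicit w = 1, so the translation lane is added.
    pub fn transform_point3(&self, p: [f32; 3]) -> [f32; 3] {
        self.rows.map(|r| r[0] * p[0] + r[1] * p[1] + r[2] * p[2] + r[3])
    }

    /// Vectors have an implicit w = 0, so the translation lane is ignored.
    pub fn transform_vector3(&self, v: [f32; 3]) -> [f32; 3] {
        self.rows.map(|r| r[0] * v[0] + r[1] * v[1] + r[2] * v[2])
    }
}
```

Each output component is one row dotted with the input, which is why the point and vector transforms differ only in whether the last lane takes part.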

Benchmark results on my Intel MacBook (best of a few runs):

| op | `Mat4 sse2` | `Affine3D sse2` | `Mat4 scalar` | `Affine3D scalar` |
|---|---|---|---|---|
| `inverse` | `12 ns` | `9 ns` 🥇 | `31 ns` | `16 ns` 🥇 |
| `Self * Self` | `6.1 ns` | `4.2 ns` 🥇 | `21 ns` | `14 ns` 🥇 |
| `transform point3` | `2.6 ns` 🥇 | `3.8 ns` | `3.8 ns` | `3.5 ns` 🥇 |
| `transform vector3` | `2.5 ns` 🥇 | `3.6 ns` | `3.3 ns` | `2.9 ns` 🥇 |

( 🥇 is the fastest in each pair of columns)

It would be nice to speed up the sse2 vector transforms more, as those are probably among the most common operations to do with a matrix. I've tried several different approaches, but I can't get further than this. Unless someone has a better idea, I think we should just recommend turning your Affine3D into a Mat4 on SSE2 targets when doing a lot of mat * vec transforms.

It is also possible that for some platforms (in particular spirv) a 4x Vec3 may actually be faster.

bitshifter · 2021-04-07T04:03:47Z

Thanks for all your work on this!

There are a couple of things I'm not sure about as it currently stands, on naming and being row major.

Naming wise I was thinking of making this a special type, e.g. Transform3D or Affine3D or something along those lines. There are a few reasons:

- It doesn't tie you to an internal representation, e.g. 4x3 vs 3x4 or rows vs columns.
- Methods like inverse, determinant etc. don't make sense for non-square matrices. To be fair, I have helper methods on Mat4 like transform_point3 etc. that don't make mathematical sense either.
- It doesn't make sense to multiply two 3x4 matrices, but these transforms should be able to be multiplied.
- Making it a special type makes it really clear that it contains a valid affine transform and nothing else.

I'm not totally sure about using row major when everything else is column major, it might cause confusion when debugging if the internals behave differently to the other matrix types. For scalar code, the size would be the same for 4x3 or 3x4 and I suspect it wouldn't make a huge difference one way or another for scalar performance. At the moment the SSE2 code isn't benefiting from row major, although I think switching to Vec3A in the implementation would get it closer to the Mat4 performance. Another concern I have is I think accessing the translation and the basis vectors is a very common thing to do and this might be slower if the data is organised in rows, also direct access is nice. Also converting to a Mat4 or Mat3 would require a transpose with row major.
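To make the last two concerns concrete, here is a small sketch (plain arrays, hypothetical function names, not glam's API) of what row-major storage implies: converting to a column-major 4x4 is a transpose, and the translation has to be gathered from the last lane of each row instead of being read as one contiguous column.

```rust
/// Expand three stored rows into the four columns of a column-major 4x4 matrix.
/// Hypothetical helper for illustration; plain arrays stand in for vector types.
pub fn rows_to_mat4_columns(rows: &[[f32; 4]; 3]) -> [[f32; 4]; 4] {
    let mut cols = [[0.0f32; 4]; 4];
    for (r, row) in rows.iter().enumerate() {
        for (c, value) in row.iter().enumerate() {
            // Transpose: row r, lane c becomes column c, lane r.
            cols[c][r] = *value;
        }
    }
    // The implicit bottom row (0, 0, 0, 1) only contributes this one element.
    cols[3][3] = 1.0;
    cols
}

/// With rows, the translation is spread over the last lane of each row.
pub fn translation_from_rows(rows: &[[f32; 4]; 3]) -> [f32; 3] {
    [rows[0][3], rows[1][3], rows[2][3]]
}
```

With column storage both operations would be plain copies of existing columns, which is the direct-access convenience mentioned above.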

emilk · 2021-04-07T06:53:47Z

> Naming wise I was thinking of making this a special type, e.g. Transform3D or Affine3D or something along those lines.

I agree. I think Affine3D is the better name here (since it is more specific).

> I'm not totally sure about using row major when everything else is column major.

I made this choice because I wanted to represent the transform using 12xf32, to save memory (for better cache locality). If we don't care about that then 4xVec3A columns makes much more sense. What the "best" choice is will depend a lot on the circumstances. The extra saved memory can make a big difference in reducing cache misses, but for applications where you can read data in order, it won't matter much at all.
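The size difference can be checked with a quick sketch. `PaddedVec3` below is a hypothetical stand-in that assumes a Vec3A-like type is three floats padded and aligned to 16 bytes; none of these types are glam's definitions.

```rust
use std::mem::size_of;

/// Stand-in for a SIMD-friendly 3D vector: three floats padded and aligned
/// to 16 bytes. (Assumed layout for illustration, not glam's definition.)
#[repr(C, align(16))]
#[derive(Clone, Copy)]
pub struct PaddedVec3(pub [f32; 3]);

/// Three rows of four floats: 12 x f32 with no padding.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct ThreeRows(pub [[f32; 4]; 3]);

/// Four padded columns: each column carries one unused lane.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct FourPaddedColumns(pub [PaddedVec3; 4]);

/// Returns (bytes for the row layout, bytes for the padded column layout).
pub fn layout_sizes() -> (usize, usize) {
    (size_of::<ThreeRows>(), size_of::<FourPaddedColumns>())
}
```

Under these assumptions the row layout takes 48 bytes and the padded column layout takes 64 bytes, the same as a 4x4 matrix of `f32`.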

I can make a competing 4xVec3A column major implementation so we can compare raw numbers.

emilk · 2021-04-07T09:02:40Z

I think if we rename this Affine3D we should also hide all functions related to columns and rows to make it clear that this is a higher abstraction.

I've started work on a 4xVec3A version, but am having some problems. Maybe it can wait a bit and come as a follow-up PR where we switch to a (potentially) faster implementation?

bitshifter · 2021-04-07T09:12:36Z

I think it makes sense to remove those methods. My approach has usually been to provide a pretty minimal amount of functionality and add things that people are requesting.

No problem skipping the 4xVec3A implementation for now if you are having problems with it.

bitshifter · 2021-04-07T09:49:51Z

One thing I was planning on experimenting with was loading scalar types into SIMD to perform math-heavy operations, so it wouldn't just be things like Vec3A that are SIMD optimised. If doing this improves the performance of the Mat3 inverse and vector multiplication functions, an Affine3D type could potentially just be a composition of Mat3 and Vec3. It would be similar in essence to the existing transform type https://github.com/bitshifter/glam-rs/blob/master/src/transform.rs#L217-L224. This does depend a bit on the results of my experiments though :)
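As a rough scalar illustration of that composition idea: an affine transform can be kept as a 3x3 matrix plus a translation, and multiplying two of them reduces to a matrix product plus one point transform. `AffineParts` and its fields are hypothetical names, with column arrays standing in for Mat3 and Vec3.

```rust
/// Sketch of an affine transform as a 3x3 matrix (stored as columns) plus a translation.
/// Hypothetical type for illustration, not glam's API.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct AffineParts {
    pub matrix3: [[f32; 3]; 3], // matrix3[c] is column c
    pub translation: [f32; 3],
}

/// Column-major 3x3 matrix times vector: a weighted sum of the columns.
fn mat3_mul_vec3(m: &[[f32; 3]; 3], v: [f32; 3]) -> [f32; 3] {
    let mut out = [0.0f32; 3];
    for c in 0..3 {
        for r in 0..3 {
            out[r] += m[c][r] * v[c];
        }
    }
    out
}

impl AffineParts {
    pub fn transform_point3(&self, p: [f32; 3]) -> [f32; 3] {
        let r = mat3_mul_vec3(&self.matrix3, p);
        [
            r[0] + self.translation[0],
            r[1] + self.translation[1],
            r[2] + self.translation[2],
        ]
    }

    /// (A1, t1) * (A2, t2) = (A1 * A2, A1 * t2 + t1)
    pub fn mul(&self, rhs: &Self) -> Self {
        Self {
            matrix3: [
                mat3_mul_vec3(&self.matrix3, rhs.matrix3[0]),
                mat3_mul_vec3(&self.matrix3, rhs.matrix3[1]),
                mat3_mul_vec3(&self.matrix3, rhs.matrix3[2]),
            ],
            translation: self.transform_point3(rhs.translation),
        }
    }
}
```

Applying `a.mul(&b)` to a point gives the same result as applying `b` first and then `a`, which is the property any such composition needs to keep.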

The SSE2 dot product is not super fast. The column major vector multiplication doesn't require dot product so it should be a lot faster if SIMD is available, see https://github.com/bitshifter/glam-rs/blob/master/src/core/sse2/matrix.rs#L406-L421.
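A scalar sketch of the difference, using plain arrays and hypothetical function names (the linked SSE2 code is the real reference): with rows, each output component is a dot product, which needs a horizontal add in SIMD; with columns, the result is a sum of columns scaled by the input components, which is only lane-wise multiplies and adds.

```rust
/// One row dotted with (p, 1): multiply lane-wise, then sum across lanes.
fn row_dot(row: [f32; 4], p: [f32; 3]) -> f32 {
    row[0] * p[0] + row[1] * p[1] + row[2] * p[2] + row[3]
}

/// Row-major point transform: three dot products, one per output component.
pub fn transform_point_rows(rows: &[[f32; 4]; 3], p: [f32; 3]) -> [f32; 3] {
    [row_dot(rows[0], p), row_dot(rows[1], p), row_dot(rows[2], p)]
}

/// Column-major point transform: translation + x_axis * p.x + y_axis * p.y + z_axis * p.z.
/// Every step works lane-wise on whole columns, so no horizontal sum is needed.
pub fn transform_point_cols(cols: &[[f32; 3]; 4], p: [f32; 3]) -> [f32; 3] {
    let mut out = cols[3]; // start from the translation column
    for axis in 0..3 {
        for lane in 0..3 {
            out[lane] += cols[axis][lane] * p[axis];
        }
    }
    out
}
```

Both functions compute the same transform; only the order of the multiplies and adds differs.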

bitshifter · 2021-04-07T10:24:59Z

Related to my comment above, I wrote a bit more detail about where I'm thinking of going with glam in #159. You don't really have to do anything with this information for this PR, just thought it might be useful to know what I'm thinking on the performance side of things.

emilk · 2021-04-07T13:26:49Z

Current public interface:

emilk · 2021-04-08T08:11:01Z

I have a working 4xVec3A implementation now too, but so far it is slower at simd self*self than the 3xVec4 version. The transforming of points and vectors is faster though (as fast as for Mat4), so probably worth the trade. I'll do a separate PR for it with proper comparisons once this PR is merged.

bitshifter · 2021-04-09T08:52:26Z

The weird clippy error was just a run of the mill warning due to the method not being used by the DQuat implementation. If you still wanted to use it internally you could make it #[allow(dead_code)].
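For reference, keeping an internal helper without triggering the unused-code warning could look roughly like this. `QuatSketch` and `identity_sketch` are hypothetical stand-ins, not the removed `from_rotation_axes` or any glam type.

```rust
/// Hypothetical stand-in type; not glam's Quat or DQuat.
pub struct QuatSketch(pub [f32; 4]);

impl QuatSketch {
    /// A crate-internal helper that some builds or instantiations never call.
    /// The attribute silences the dead-code warning for this one item only.
    #[allow(dead_code)]
    pub(crate) fn identity_sketch() -> Self {
        Self([0.0, 0.0, 0.0, 1.0])
    }
}
```

Scoping the attribute to the single method keeps the lint active for the rest of the crate.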

I've been asked to do a release by another embarker, so I might hold off merging this until I've done that. If you wanted you could put it on a feature so it won't affect the release, or just wait. I want to hold off having this on by default until the column major PR lands. I don't expect it to take long to do a release.

bitshifter · 2021-04-09T09:48:22Z

The release has been done, so should be good to merge once the conflicts are resolved.

emilk · 2021-04-12T07:03:30Z

Rebased to fix merge conflicts in CHANGELOG.md

emilk force-pushed the mat3x4-take2 branch 5 times, most recently from d500858 to 92b45ad Compare April 6, 2021 13:39

emilk marked this pull request as ready for review April 6, 2021 13:55

emilk marked this pull request as draft April 6, 2021 14:29

emilk force-pushed the mat3x4-take2 branch 3 times, most recently from a98fb57 to d525f64 Compare April 6, 2021 16:35

emilk marked this pull request as ready for review April 6, 2021 17:28

emilk mentioned this pull request Apr 6, 2021

Embark's pull requests EmbarkStudios/rust-ecosystem#20

Open

26 tasks

bitshifter reviewed Apr 6, 2021

bitshifter reviewed Apr 6, 2021

emilk force-pushed the mat3x4-take2 branch from 3442839 to 6fe339b Compare April 7, 2021 07:15

emilk changed the title ~~Mat3x4 (take2)~~ Affine3D Apr 7, 2021

emilk force-pushed the mat3x4-take2 branch from 744eebe to 36afdea Compare April 7, 2021 13:28

emilk mentioned this pull request Apr 7, 2021

WIP: Mat3x4 #156

Closed

4 tasks

emilk requested a review from bitshifter April 7, 2021 18:18

emilk force-pushed the mat3x4-take2 branch from 985b661 to 9ccaedd Compare April 8, 2021 07:48

bitshifter reviewed Apr 8, 2021

emilk requested a review from bitshifter April 8, 2021 11:47

emilk added 24 commits April 12, 2021 09:01

Add VecN::to_array()

234d35b

Add Mat3x4

647a0a7

Optimize Mat3x4 inverse

4907200

Optimize scalar transform_vector3

8fc12e2

Optimize mat3x4 * mat3x4

bf90cf2

Use .into() to convert to/from __m128

303a08d

Add better tests for matrix inverse

dade0e1

Inline more

9fa5cf7

Rename Mat3x4 to Affine3D and remove a lot of functions

2604bf2

Move vector transformation benchmarks to the mat4 and affine3d files

54b5041

Remove duplicated benchmark

b3096cb

Clippy: remove unused function

e3acdb8

Add Affine3D to the changelog

908a557

impl Serialize/Deserialize for Affine3D

6f59346

fix benchmark

6528a46

Serde fix

2ecff19

Optimize inverse

31c4302

Rename mul_vector4 to mul_vec4

fe6718e

Just one impl block

44bb0f4

Into turbofish -> From

0f8f910

Expose setter/getter for Mat3-part of Affine3d

94c6814

make from_rotation_axes pub(crate)

e1011a4

Remove is_invertible

19e2ccc

Remove from_rotation_axes due to weird clippy error

879b917

emilk force-pushed the mat3x4-take2 branch from ccf96cb to 879b917 Compare April 12, 2021 07:02

bitshifter merged commit 55f7cd4 into bitshifter:master Apr 12, 2021

khyperia deleted the mat3x4-take2 branch June 1, 2021 10:07
