Affine3D #157
Thanks for all your work on this! There are a couple of things I'm not sure about as it currently stands, on naming and being row major. Naming wise I was thinking of making this a special type, e.g.
I'm not totally sure about using row major when everything else is column major; it might cause confusion when debugging if the internals behave differently from the other matrix types. For scalar code, the size would be the same for 4x3 or 3x4, and I suspect it wouldn't make a huge difference one way or another for scalar performance. At the moment the SSE2 code isn't benefiting from row major, although I think switching to
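To illustrate the point about scalar size, here is a minimal sketch of the two layouts using plain arrays. The names `RowMajor3x4`, `ColMajor4x3`, `layout_sizes`, and `to_col_major` are hypothetical and are not glam types; the sketch only shows that both layouts hold the same 12 floats and differ in how they are grouped.

```rust
use std::mem::size_of;

/// Hypothetical row-major layout: 3 rows of 4 floats.
/// The implicit fourth row of the affine matrix is (0, 0, 0, 1).
#[repr(C)]
pub struct RowMajor3x4 {
    pub rows: [[f32; 4]; 3],
}

/// Hypothetical column-major layout: 4 columns of 3 floats.
/// The last column holds the translation.
#[repr(C)]
pub struct ColMajor4x3 {
    pub cols: [[f32; 3]; 4],
}

/// Size in bytes of each scalar layout (12 floats either way).
pub fn layout_sizes() -> (usize, usize) {
    (size_of::<RowMajor3x4>(), size_of::<ColMajor4x3>())
}

/// Regroup the same 12 values from rows into columns.
pub fn to_col_major(m: &RowMajor3x4) -> ColMajor4x3 {
    let mut cols = [[0.0f32; 3]; 4];
    for r in 0..3 {
        for c in 0..4 {
            cols[c][r] = m.rows[r][c];
        }
    }
    ColMajor4x3 { cols }
}
```

This only covers the scalar case; a SIMD-backed column type with padding would change the size comparison.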
I agree. I think
I made this choice because I wanted to represent the transform using I can make a competing
I think if we rename this I've started work on a 4xVec3A version, but am having some problems. Maybe it can wait a bit and come as a follow-up PR where we switch to a (potentially) faster implementation?
I think it makes sense to remove those methods. My approach has usually been to provide a pretty minimal amount of functionality and add things that people are requesting. No problem skipping the 4xVec3A implementation for now if you are having problems with it.
One thing I was planning on experimenting with was loading scalar types into SIMD to perform math heavy operations, so it wouldn't just be things like The SSE2 dot product is not super fast. The column major vector multiplication doesn't require dot product so it should be a lot faster if SIMD is available, see https://github.com/bitshifter/glam-rs/blob/master/src/core/sse2/matrix.rs#L406-L421.
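The difference between the two multiplication strategies can be sketched in scalar Rust. This is an illustration with plain arrays, not the SSE2 code linked above; the function names are hypothetical. With rows, each output component is a dot product (a horizontal reduction); with columns, the result is a sum of columns scaled by `x`, `y`, and `z`, which maps onto vertical multiply and add operations.

```rust
/// Row-major: each output component is the dot product of one row with (x, y, z, 1).
pub fn transform_point_row_major(rows: &[[f32; 4]; 3], p: [f32; 3]) -> [f32; 3] {
    let v = [p[0], p[1], p[2], 1.0];
    let mut out = [0.0f32; 3];
    for r in 0..3 {
        out[r] = rows[r][0] * v[0] + rows[r][1] * v[1] + rows[r][2] * v[2] + rows[r][3] * v[3];
    }
    out
}

/// Column-major: start from the translation column and add each axis column
/// scaled by the matching component of the point. No dot product is needed.
pub fn transform_point_col_major(cols: &[[f32; 3]; 4], p: [f32; 3]) -> [f32; 3] {
    let mut out = cols[3];
    for c in 0..3 {
        for i in 0..3 {
            out[i] += cols[c][i] * p[c];
        }
    }
    out
}
```

Both functions compute the same affine transform; only the access pattern differs.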
Related to my comment above, I wrote a bit more detail about where I'm thinking of going with glam here #159. You don't really have to do anything with this information for this PR, just thought it might be useful to know what I'm thinking on the performance side of things.
I have a working 4xVec3A implementation now too, but so far it is slower at SIMD self*self than the 3xVec4 version. The transforming of points and vectors is faster though (as fast as for Mat4), so probably worth the trade. I'll do a separate PR for it with proper comparisons once this PR is merged.
The weird clippy error was just a run-of-the-mill warning due to the method not being used by the I've been asked to do a release by another embarker, so I might hold off merging this until I've done that. If you wanted you could put it on a feature so it won't affect the release, or just wait. I want to hold off having this on by default until the column major PR lands. I don't expect it to take long to do a release.
The release has been done, so should be good to merge once the conflicts are resolved.
Rebased to fix merge conflicts in
Part of #25. A simpler version of #156.
This PR introduces the `Affine3D` type, implemented as `3x Vec4` (NOTE: only `f32` version included).

Benchmark results on my Intel MacBook (best of a few runs):
| | Mat4 sse2 | Affine3D sse2 | Mat4 scalar | Affine3D scalar |
|---|---|---|---|---|
| inverse | 12 ns | 9 ns 🥇 | 31 ns | 16 ns 🥇 |
| Self * Self | 6.1 ns | 4.2 ns 🥇 | 21 ns | 14 ns 🥇 |
| transform point3 | 2.6 ns 🥇 | 3.8 ns | 3.8 ns | 3.5 ns 🥇 |
| transform vector3 | 2.5 ns 🥇 | 3.6 ns | 3.3 ns | 2.9 ns 🥇 |

(🥇 is the fastest in each pair of columns)

It would be nice to speed up the sse2 vector transforms more, as those are probably among the most common operations to do with a matrix. I've tried several different approaches, but I can't get further than this. Unless someone has a better idea, I think we should just recommend turning your `Affine3D` into a `Mat4` on SSE2 targets when doing a lot of `mat * vec` transforms.

It is also possible that for some platforms (in particular spirv) a `4x Vec3` may actually be faster.
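As a rough sketch of that recommendation, the conversion can be done once up front and the resulting 4x4 matrix reused for a whole batch of points. The code below uses plain arrays and hypothetical function names (`affine_rows_to_mat4_cols`, `transform_points`); it is not the API added by this PR, and it assumes the affine transform is stored as three rows of four floats with an implicit last row of (0, 0, 0, 1).

```rust
/// Expand three affine rows into four column-major 4x4 columns,
/// filling in the implicit last row (0, 0, 0, 1).
pub fn affine_rows_to_mat4_cols(rows: &[[f32; 4]; 3]) -> [[f32; 4]; 4] {
    let mut cols = [[0.0f32; 4]; 4];
    for c in 0..4 {
        for r in 0..3 {
            cols[c][r] = rows[r][c];
        }
    }
    cols[3][3] = 1.0;
    cols
}

/// Convert once, then transform every point with the column-major matrix.
pub fn transform_points(rows: &[[f32; 4]; 3], points: &mut [[f32; 3]]) {
    let m = affine_rows_to_mat4_cols(rows);
    for p in points.iter_mut() {
        let [x, y, z] = *p;
        for i in 0..3 {
            // Sum of scaled columns plus the translation column.
            p[i] = m[0][i] * x + m[1][i] * y + m[2][i] * z + m[3][i];
        }
    }
}
```

Whether the one-time conversion pays off depends on the batch size and target, so it is worth measuring rather than assuming.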