Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

We should switch to column-major notation with column-vector #331

Closed
Manishearth opened this issue Mar 28, 2019 · 24 comments
Closed

We should switch to column-major notation with column-vector #331

Manishearth opened this issue Mar 28, 2019 · 24 comments

Comments

@Manishearth
Copy link
Member

@Manishearth Manishearth commented Mar 28, 2019

Previously: #146

In #329 we discovered that euclid uses row-vector notation, i.e. if you wish to transform a vector v you use v * T. Transformations compose as T1 * T2, where T1 is applied first. This is also what Gecko does. Column-vector notation works the other way, with transformations going T * v and composing as T2 * T1.

This is different from what OpenGL and most web specs (geometry, transforms, webxr) do. Overall while documents tend to explicitly specify row-major vs column-major since it varies a lot, documents do not specify row-vector vs column-vector, because column-vector notation is the norm.

Euclid is perfectly self-consistent with its usage of this notation, however, this breaks down at the border. If someone runs Transform3D::create_translation(...).to_row_major_arrays(), the translation values will be in the wrong spot if they expected it to be a column-vector based representation.

An example of this messing things up is here:

https://github.com/servo/webrender/blob/01046c6430992840c3a10e15762dbfb07a4b1039/webrender/src/device/gl.rs#L2323

uniform_matrix_4fv(_, false, _) takes a column-major array, however we call .to_row_major_array(). This works because a row-major row-vector matrix has the same representation as a column-major column-vector matrix.

There are cases in Gecko where similar things happen. Overall due to using a very unpopular convention (this is the first time I've come across this convention in physics, graphics, web specs, or browser impls) users have to deliberately use .row_major methods when they mean .column_major and vice versa, as well as swapping the order of .mXY accessors sometimes.

We should fix both. @nical has stated that they're open to this changing if Gecko is changed in lockstep. A bit of work, but doable.

This is a breaking change, however, and kinda annoying to do. I think it's worth it: This isn't the first time I've stumbled across this when implementing web specs, and last time was much worse since I never realized what the discrepancy was and just fixed it by swapping function calls until it worked (which largely seems to be what every other client of this library has been doing.)

cc @nox @pcwalton

@asajeffrey
Copy link
Member

@asajeffrey asajeffrey commented Jun 24, 2019

An example of this biting us... servo/rust-webvr@2b016cc

This commit took a day to work out, because the specs are pretty clear that they want column-major matrices, and it is very odd that to_row_major_arrtay() is the API for that.

@Manishearth
Copy link
Member Author

@Manishearth Manishearth commented Jun 24, 2019

Worth considering forking euclid (riemann?) with a fix, and using it for everything but webrender.

@pcwalton
Copy link
Contributor

@pcwalton pcwalton commented Jun 24, 2019

If we're going to fork incompatibly we might consider just switching to pathfinder_geometry, which I believe is correct here and also has the advantage of using SIMD everywhere.

@asajeffrey
Copy link
Member

@asajeffrey asajeffrey commented Jun 24, 2019

We cooooould make this part of the phantom type, so in TypedVector3D<T, U> the type parameter U would be used to determine whether the vector was interpreted as a column vector or as a row vector?

@nical
Copy link
Collaborator

@nical nical commented Jun 26, 2019

Worth considering forking euclid (riemann?) with a fix, and using it for everything but webrender.

I don't like how this reads as an ultimatum.

When people express intent to fork an actively maintained project it usually means that either:

  • Different parties want to push the project in uncompatible directions.
  • There is tension between people involved.

I don't think that either of these two apply here. People on the gecko and WebRender side would also prefer matrices to follow the most popular convention, my only ask being that we do it consistently accross gecko as well to avoid subtle bugs.
I'd like to think we are on the same boat and that this isn't a huge thing to ask. I have been actively pushing against people making breaking changes in euclid because of how painful it is to bump euclid in the whole ecosystem of servo crates (or asking people who want to make such changes to take the burden of bumping servo crates, and I've have also had to do that myself).

I think that there is no shortage of good will to try to accommodate and make things work for both sides which shouldn't be hard since there doesn't seem to be disagreement about what everyone wants. This doesn't mean we are good at it but at least we try. I'm open to feedback if things are not working out.

I understand that the prospect of going thourgh Gecko's infamous C++ code isn't appealing, but the last year has been rather intense for people involved with WebRender, and it's been a bit hard for me and others to justify spending time on something non-critical, if that thing makes uplifting patches harder. That's not to say I would have prevented anyone from doing it, I'm happy to assist with reviews. If you want someone else to do the work, there is only so much you can demand from people under pressure of shipping something, even when everyone want the same thing.

So it comes as a surprise to me that the problem is considered important enough to fork euclid and have to juggle with both euclid and its fork in servo, but not enough to put effort into propagating the changes to both servo and gecko.

@Manishearth
Copy link
Member Author

@Manishearth Manishearth commented Jun 26, 2019

I don't like how this reads as an ultimatum.

When people express intent to fork an actively maintained project it usually means that either:

I mean, it's a project we maintain. We're forking our own project. I want a copy free from row matrices to use for the VR stuff. This isn't a hostile fork.

Also, forking frees us of having to worry about non-servo euclid users; we can design the new API without trying to smooth the transition.

This isn't supposed to be an ultimatum, this is just that this bug is causing trouble for us, and a fork is a simple way to solve this problem for the area where it's causing us trouble (VR) without causing too much strife for y'all (fork can avoid touching rendering code).

my only ask being that we do it consistently accross gecko as well to avoid subtle bugs.

We're already regularly hitting subtle bugs on the servo side because of this. We're working a lot with matrices for WebXR, and this bug makes it really easy to slip up since device docs and the WebXR spec assume column vector notation.

That's not to say I would have prevented anyone from doing it, I'm happy to assist with reviews. If you want someone else to do the work, there is only so much you can demand from people under pressure of shipping something, even when everyone want the same thing.

There were signals from Gecko folks (you? I forget) that this task would be incredibly nontrivial, which is why we've held off on this.

So it comes as a surprise to me that the problem is considered important enough to fork euclid and have to juggle with both euclid and its fork in servo

AIUI we don't actually use euclid transforms much in Servo, aside from VR. But we can totally do this fork for the VR stuff without coming close to touching webrender -- i.e. it wouldn't be much work to maintain at all.

@nical
Copy link
Collaborator

@nical nical commented Jun 28, 2019

I mean, it's a project we maintain. We're forking our own project.

This is beside the point. Even then, look at who submitted and reviewed changes in the last couple of years. Do you consider WebRender to be your project as well?
I'd appreciate if we could avoid an us-vs-them relationship since we share a lot of code, there's genuine interest in both servo and gecko succeeding and we have limited resources to make that happen.
Sometimes that means working with extra constraints but I think that it pays off in the long run.

This isn't a hostile fork.

Okay, I apologize for having taken a rather defensive stance here.

There were signals from Gecko folks (you? I forget) that this task would be incredibly nontrivial, which is why we've held off on this.

Could be me, the truth is I don't know most of the code that interacts with matrices so it's probably not incredibly nontrivial but I can't assume I'd be able to expedite it in a trivial amount of time either.

Now, if the situation is that dire on servo's side and you can't afford to spend time on Gecko's code, go ahead and change the matrix convention in euclid now. I would have liked to do gecko and servo in lockstep, but if that's not an option I don't mean to be uncompromising.
If you make this change, please at least update WebRender. Gecko will get updated whenever I or someone else have time.

In counterpart I have a branch with a number of euclid breaking changes that I held off because of how time consuming it is to bump euclid in all of servo (at least for me). If these make it into euclid master and I let you or someone else deal with updating servo, does that sound fair?

@Manishearth
Copy link
Member Author

@Manishearth Manishearth commented Jun 28, 2019

I'd appreciate if we could avoid an us-vs-them relationship since we share a lot of code, there's genuine interest in both servo and gecko succeeding and we have limited resources to make that happen.

I don't think I've done that at all. I only bring up the "we maintain it" because of you painting this fork as a hostile fork.

I don't think the fork is incompatible with working together, indeed I thought it would be the easiest way to work together and make this happen incrementally. Maintaining multiple versions of a single crate gets annoying and is prone to issues, whereas it's much easier for everyone involved if we use a new crate for all the matrices that don't touch webrender, which gives us a stopgap solution until we all actually have time for the coordinated updated.

I would have liked to do gecko and servo in lockstep, but if that's not an option I don't mean to be uncompromising.

I mean, I personally also want to avoid doing a euclid update that's not lockstep: it's not good for anyone involved. That's why I suggested the fork in the first place, it's a good bandaid, and it shouldn't affect y'all at all because our VR code is totally separate. It also means that we can make a breaking change without having to worry about how best to present that to consumers which aren't us. At the moment a major blocker is that I don't know how to change the euclid API in a way that makes it clear that this stuff has changed. We'd have to rename some methods, but it's not clear how.

If these make it into euclid master and I let you or someone else deal with updating servo, does that sound fair?

Sure

@kvark
Copy link
Member

@kvark kvark commented Jun 28, 2019

Trying to catch up on all the things here (with @jrmuizel , who helps a lot), it's not easy. Sorry if I'm missing something important!
So we have two different switches for representing matrices:

  • row or column vectors, which also correspond to how the "transform vector by a matrix" is notated: column-vector does "M * v" (which matches GLSL and the web specs @Manishearth refers to), while row-vector does "v * M" which matches Gecko, Cairo, and Postscript.
  • row or column major-ness, which affects the order in which elements are laid out in memory

Current euclid is row-major with row-vector. @Manishearth suggests to switch it to column-major column-vector. This wouldn't affect most of the logic, i.e. things like post_mul and transform_vector would stay the same. This would only affect the API pieces and documentation relative to conversions back and forth from arrays of data. Arguably, this is a minor change, but disagreeing with Gecko would have a cost, as well as changing the way all current WR engineers think about providing the data to euclid...

There is one external force that can help us make the right decision on these conventions: SIMD compatibility. Euclid today would require SIMD broadcast before doing a matrix-vector multiplication, since each SIMD vector is not continuous in memory. Fixing that would require changing exactly one of the conventions (but not both, like @Manishearth suggests) as well as a lot of internal logic, obviously... This is what @pcwalton suggests, and I sympathize with this approach. I.e. if we are to disrupt the workflow that much, better make it so we are faster as well, for SIMD usage (today or in the future).

That is to say, I just rounded up and explained the choices as I see them. No objectified preference is expressed here. Subjectively, column vectors with row-major representation would be ideal for me.

@Manishearth
Copy link
Member Author

@Manishearth Manishearth commented Jun 28, 2019

@Manishearth suggests to switch it to column-major column-vector.

Actually, I think we should switch to column-vector, I don't care about column/row-major. Pick whichever is more efficient, I recall things are easier to vectorize in one way. If that way is column-vector row-major, so be it, we should do that 😄

@pcwalton
Copy link
Contributor

@pcwalton pcwalton commented Jun 28, 2019

@Manishearth
Copy link
Member Author

@Manishearth Manishearth commented Jun 28, 2019

@nical we should merge your breaking changes first, so that any experimentation can be done on top of that

@Manishearth
Copy link
Member Author

@Manishearth Manishearth commented Jul 3, 2019

Euclid is perfectly self-consistent with its usage of this notation

I was wrong, Euclid has internal bugs due to this too, see #354

The core problem is that column vectors are so ubiquitous that nobody bothers to specify that they're using them, so it's going to be very easy to make such mistakes when working on euclid, let alone when using euclid.

@Manishearth
Copy link
Member Author

@Manishearth Manishearth commented Jul 3, 2019

Looks like our .mXY indices are row-first (unlike web specs, where they are column-first), so those cancel out at least. It's only *_mul() and to_*_major_array() that are affected by the row-vector-ness, everything else is self-consistent, barring #354

@icefoxen
Copy link

@icefoxen icefoxen commented Sep 19, 2019

As someone who isn't terribly good at math and just wants a lightweight vector math library for graphics... is there anything I can do to help this?

@Manishearth
Copy link
Member Author

@Manishearth Manishearth commented Sep 19, 2019

Not really, you'd need to make a coordinated change across Gecko and Servo and that's really tricky to get right.

@nical
Copy link
Collaborator

@nical nical commented Sep 19, 2019

So for the record I'm fine with gecko not being updated now since it's too big a blocker to get things moving. Updating webrender would be nice though (I suspect it wouldn't be too much work), though if servo is not ready to follow with an update soon after, that would be inconvenient (for the servo folks).

@nical
Copy link
Collaborator

@nical nical commented Jul 8, 2020

I'm going to make some breaking changes soon, if there is still motivation around this, now is a good time to change the matrix conventions.

@Manishearth
Copy link
Member Author

@Manishearth Manishearth commented Jul 8, 2020

I have motivation, I just don't have the time to handle any Firefox-side changes

@nical
Copy link
Collaborator

@nical nical commented Jul 9, 2020

It's OK let's forget about firefox side changes.

@nical
Copy link
Collaborator

@nical nical commented Jul 9, 2020

Quick recap of various existing memory layouts and notations to make sure we are on the same page. The following assumes the members are laid out in the same order as the members are provided (so row by row). I highlighted the members that describe the translation (that's how I orient myself in the different conventions out there), and the terminology used by the project for that layout/notation.
Whether vectors are columns or rows is implicitly derived from the above.

The thing that makes efficient simd possible is the layout where the translation is contiguous in memory (euclid and pathfinder's current conventions). It's the only real argument that isn't based on someone's habits and tastes so let's start from there.

Another detail is whether the notation starts with zero or one (m00 or m11 being the first member). Pathfinder seems to be the odd one here starting with zero. I can probably deal with having to juggle between mXY/mYX in gecko's C++ and rust code, but honnestly I just don't want to have to offset the indices on top of that, so I'll pretend it's not a thing.

euclid, gecko

row-major:

 11  12  13  14

 21  22  23  24

 31  32  33  34
 ----------
|41  42  43| 44
 ----------

CSS

column-major

             __
 11  21  31 |41|
            |  |
 12  22  32 |42|
            |  |
 13  23  33 |43|
             --
 14  24  34  44

pathfinder_geometry

(edited)
(With indices starting at 1)

column-major

 11  12  13  14

 21  22  23  24

 31  32  33  34
 ----------
|41  42  43| 44
 ----------

euclid next

@Manishearth IIUC what you want is to use the same notation as pathfinder (swap mAB to mBA names and swap column_major to row_major terminology in method names). Is that right?
Or is it only swapping row-major/column-major names but keeping the member names consistent with their semantics in CSS (in the sense that [m41 m42m m43] describes the translation, etc)?

Also if we go through with this, are you OK with doing the servo update (I can deal with WR)?

@Manishearth
Copy link
Member Author

@Manishearth Manishearth commented Jul 9, 2020

The representation of pahfinder_geometry is backwards in your comment: it has the same model as CSS, but zero indexed and column major

We need to:

  • swap row_major and column_major, ideally using new names
  • swap the multiply functions, ideally using new names

I'd be happy to do the servo side of these changes if you handle webrender

mXY can stay the same

@nical
Copy link
Collaborator

@nical nical commented Jul 9, 2020

Indeed I was looking at its row major constructor. I fixed representation in the original comment.

I'd be happy to do the servo side of these changes if you handle webrender

Deal.

swap the multiply functions, ideally using new names

Ah right I remember some debates around the names.

So current post_mul/pre_mul names were meant to signify "apply a transformation before (pre) this one" or "after (post) this one". The unfortunate presence of "mul" makes it sounds like a left/right multiplication in mathematical-notation sense which led to to confusion.

I vaguely remember a discussion about this where you proposed t1.then(t2) which reads quite nicely. I asked the other gfx team folks in our standup meeting today and everyone present was happy with replacing pre_mul/post_mul with "then". Does that sound good to you?

In any case I want the API to be worded in terms of the semantics of the transformation rather than in terms of mathematical notations for abstract matrix objects whenever possible, and be worded in a way that makes that clear).

@Manishearth
Copy link
Member Author

@Manishearth Manishearth commented Jul 9, 2020

I love then()! This means we don't need to make silent breaking changes for this, too.

Then the only thing that neds to be swapped is row_major and column_major. We could also add all four variants of row_major_row_vector etc

@nical nical closed this Jul 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
6 participants
You can’t perform that action at this time.