Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix perspective projection transform #465

Open
wants to merge 1 commit into
base: master
from

Conversation

@hannobraun
Copy link
Contributor

hannobraun commented Aug 21, 2020

Please review this commit carefully, as I only partially know what I'm
doing. I believe that setting m44 to one for the perspective
projection matrix is wrong.

First, the sources I can find that explain this all set it to zero.
Granted, I can't find many sources online that explain this clearly, but
the few that I can find support my point (see Wikipedia[1] and this
university lecture[2], for example).

Second, I did some transformation on paper, to better understand this,
and while the approaches in the cited sources starting making sense to
me, I can't make head or tails of that one in the m44 position. It just
seems to get in the way.

So while I can't rule out that the way things currently are make total
sense for reasons I don't understand yet, as of right now, it seems more
like an oversight.

[1] https://en.wikipedia.org/wiki/Transformation_matrix#Perspective_projection
[2]
https://courses.cs.washington.edu/courses/cse455/09wi/Lects/lect5.pdf,
page 23

Please review this commit carefully, as I only partially know what I'm
doing. I believe that setting `m44` to one for the perspective
projection matrix is wrong.

First, the sources I can find that explain this all set it to zero.
Granted, I can't find many sources online that explain this clearly, but
the few that I can find support my point (see Wikipedia[1] and this
university lecture[2], for example).

Second, I did some transformation on paper, to better understand this,
and while the approaches in the cited sources starting making sense to
me, I can't make head or tails of that one in the m44 position. It just
seems to get in the way.

So while I can't rule out that the way things currently are make total
sense for reasons I don't understand yet, as of right now, it seems more
like an oversight.

[1] https://en.wikipedia.org/wiki/Transformation_matrix#Perspective_projection
[2]
https://courses.cs.washington.edu/courses/cse455/09wi/Lects/lect5.pdf,
page 23
@nical
Copy link
Collaborator

nical commented Aug 21, 2020

You had me scratch my head for a good hour and a half (and I learned stuff in the process).

Before we go into why, let's start with references (it at least gave me some confidence that we aren't doing something crazy)

Why then ?

It's interesting because your reasoning and your sources do make sense.

My understanding is that it is a question of which plane the perspective transform is projecting things on. In the case of euclid, the matrix projects points onto the plane defined by z = -d. In that wikipedia page, the transform is projecting to the z = 1 plane, same for that washington.edu course (see the "Homogenous coordinates: Geometric intuition" slide).

I think that it makes most sense for euclid to keep perspective as is and continue following the CSS specification. That said this method really need better documentation.
It wouldn't hurt to add other perspective transformations. I don't thing one that projects to z=1 would be particularly useful, but maybe equivalents to gluPerspective and gluLookAt that seem to be popular.

Anyway I'll close this now since I believe that we shouldn't change the behavior of perspective.

@nical nical closed this Aug 21, 2020
bors-servo added a commit that referenced this pull request Aug 21, 2020
Improve the documentation of Transform3D::perspective

There was a bit of confusion about the formulation of the perspective transform in #465. This clarifies things.
@hannobraun
Copy link
Contributor Author

hannobraun commented Aug 22, 2020

Thank you for your reply! I hate to take up more of your time, but here I go.

Before we go into why, let's start with references (it at least gave me some confidence that we aren't doing something crazy)

Those are certainly some strong sources. If I had been aware of those at an earlier point, I probably wouldn't have even posted this pull request. However, I'm too deep into it by now to just yield to these, especially since I don't quite follow your explanation :-)

My understanding is that it is a question of which plane the perspective transform is projecting things on. In the case of euclid, the matrix projects points onto the plane defined by z = -d. In that wikipedia page, the transform is projecting to the z = 1 plane, same for that washington.edu course (see the "Homogenous coordinates: Geometric intuition" slide).

I think this isn't quite on point, for two reasons.

First, yes, the Wikipedia matrix projects to z = 1, but that is just because a specific value was chosen there. The Washington matrix actually does project into z = -d, as best as I can tell. I believe that the slide you cited is just an example that doesn't have a 1-1 relation to the matrices on the next slide.

Second, I don't believe that projecting to z = -d is what the current Euclid implementation is actually doing. Let me demonstrate.

First, let's take a closer look at the matrices. I'm going to ignore the Wikipedia matrix from here on, as it's basically equivalent to the Washington matrix (modulo the hardcoded distance and different viewing direction), and the Washington matrix is much closer to the Euclid one, hence being better suited to demonstrate the difference.

Here's the Washington matrix:

| 1 0   0  0 |
| 0 1   0  0 |
| 0 0   1  0 |
| 0 0 -1/d 0 |

And here's the Euclid matrix. Please note that the Euclid documentation formats matrices differently, based on the order of their fields in memory. I'm translating it into the format used by all of the sources (including yours) here, for clarity:

| 1 0   0  0 |
| 0 1   0  0 |
| 0 0   1  0 |
| 0 0 -1/d 1 |

The only difference between them is the m44 value in the lower-right corner.

I'm claiming that the Washington matrix projects points into the z = -d plane and that the Euclid matrix doesn't. To demonstrate that, I'm doing a simple test: If a point already is in the z = -d plane, projecting it into that plane should not change the point.

Let's try that for the Washington matrix with d = 1:

| 1 0  0 0 |   |  2 |   |  2 |
| 0 1  0 0 |   |  3 |   |  3 |
| 0 0  1 0 | x | -1 | = | -1 |
| 0 0 -1 0 |   |  1 |   |  1 |

We're getting the same point out that we put in. w is still 1, so when the GPU does the perspective division to transform the point into normalized device coordinates, nothing will change either. To the best of my understanding, this is how it should be.

Let's do the same with the Euclid matrix:

| 1 0  0 0 |   |  2 |   |  2 |
| 0 1  0 0 |   |  3 |   |  3 |
| 0 0  1 0 | x | -1 | = | -1 |
| 0 0 -1 1 |   |  1 |   |  2 |

The 1 in m44 changes the value of w to 2. This means the GPU will divide by 2 for the perspective division, resulting in the following normalized device coordinates:

|  1   |
|  1.5 |
| -0.5 |

Which are not the same coordinates that our original point had, even though we were projecting it into the plane it was already in.

Doing the same for other planes leads to similar results. I've done lots more calculations along the same lines, and the result is always the same: The Washington approach (with 0 at m44) makes sense, the Euclid approach (with 1 at m44) doesn't.

Maybe I'm missing something. Maybe the Euclid approach has some advantage that I'm not seeing. Maybe something else needs to happen before or after the perspective transformation to correct for this (I don't know what that might be, or why it might be an advantage).

Thoughts on your sources

If this was just a Euclid thing, I'd be fully convinced that this is buggy behavior that needs to change. However, that you can point to those sources that are doing it the same way, is a strong argument that there is something going on here that I don't understand.

I don't have much insight into browser and web standards development, so what I'm about to say might be preposterous. Still, let me float this possibility: Is it possible, that this is an error in the spec, and everyone just copied the same mistake from the spec and/or each other?

Please consider the following notes, that could make this theory less preposterous:

  • If this is a bug, it is not an app-breaking one. All it does, as far as I can tell, is making the viewing angle wider than it should be. If I had encountered this bug (if it is one) in a different context, I would not have thought to scrutinize the projection matrix. My reaction would have been more along the lines of, "Oh, this is what a 90° field of view looks like? Guess I'll adjust the value".
  • I don't understand the context of this CSS spec. Is the resulting transformation matrix passed to the GPU directly or at all? I'm passing my matrix directly to the GPU (which does the perspective division). Maybe the same doesn't happen in the browsers.

Again, this might be preposterous, as I don't know how much scrutiny goes into the development of these standards and their implementation.

Additional sources

While articles explaining this subject clearly (and without resorting to "call this perspective function to make the matrix") seem to be rare, all I have seen use a 0 in the m44 position (I didn't save the additional ones I found, unfortunately, but can probably dig some of them up again, if requested).

I figured it wouldn't hurt to dig up some more sources, preferably involving real code. Here's what I came up with:

I didn't omit any sources. I just looked up the math libraries I could come up with off the top of my head, and all of them set m44 to 0. This certainly isn't conclusive evidence, but it does support my "web people got it wrong" theory.

I'm sorry for being so persistent when it's so likely I'm just missing something, but I've become obsessed at this point. If the behavior in Euclid is correct, I absolutely need to know why :-)

@nical
Copy link
Collaborator

nical commented Aug 23, 2020

I'm going to have to ponder on this for a bit, don't worry if you don't hear back for a time.
Changing this behavior is definitely app breaking for a browser that has to strictly abide to web specifications, but I'd like to at least better understand what this does and document it properly. And if this behavior only ever make sense for applications tied to CSS specifications, we can always add enough decorum to make it clear and add another function more useful to the wider audience.

@jrmuizel do you have any insight or know who was involved in this spec?

@nical nical reopened this Aug 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

2 participants
You can’t perform that action at this time.