Path finding broken on Raspberry Pi #2088
Comments
Appears that 43647ce has introduced the problem. |
Simple fix proposal that works, since everything else looked tedious to check also w. r. t. compiler settings and compilation times on RPi:
|
Why 1 though? Everyone passes in the same type ... |
I'm seeing |
I'm having difficulty conceiving |
When I discovered that, I wrote a test on Linux on both x86_64 and RPi-ARM that report Can try to write a test case later to look for differences between float and double on both platforms. |
What I did: rotate vector (2, 2) by 1° up to 360 times and then put x, y every time into Compiled over: So yes, there are cases where the downgrade from double to float gives us a different output on different platforms here. |
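A minimal sketch of that kind of float-versus-double comparison, for anyone who wants to repeat it on both platforms. The function the rotated x, y were fed into is not shown in the thread, so `scaledStep` below is a hypothetical stand-in (normalise the vector, scale it, round to an integer), and rotating by an absolute angle instead of accumulating 1° steps is also an assumption:

```cpp
#include <cmath>

// Hypothetical stand-in, not GemRB code: integer step after rounding.
struct Step {
    long x;
    long y;
};

// Rotate (2, 2) by `degrees`, normalise it in precision T, scale and round.
template<typename T>
Step scaledStep(int degrees, T scale)
{
    const T pi = static_cast<T>(3.14159265358979323846);
    const T rad = static_cast<T>(degrees) * pi / static_cast<T>(180);
    const T x = T(2) * std::cos(rad) - T(2) * std::sin(rad);
    const T y = T(2) * std::sin(rad) + T(2) * std::cos(rad);
    const T len = std::sqrt(x * x + y * y);
    return { std::lround(x / len * scale), std::lround(y / len * scale) };
}

// Count the angles in 1..360 where float and double round to different steps.
int countMismatches(int scale)
{
    int mismatches = 0;
    for (int deg = 1; deg <= 360; ++deg) {
        const Step f = scaledStep<float>(deg, static_cast<float>(scale));
        const Step d = scaledStep<double>(deg, static_cast<double>(scale));
        if (f.x != d.x || f.y != d.y) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

Printing `countMismatches(...)` on x86_64 and on the RPi with the same compiler flags would show whether the two platforms disagree. The count itself depends on platform and scale, so no expected value is given here.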
It might help if you elaborated a bit. Have you ruled out |
Let's try being explicit about the rounding mode, I pick |
|
These types of comparisons seem suspicious |
Looking at godbolt.org assembly output for such a check, that does not seem to make any difference (using |
it looks like there is a bug here: |
I guess so. |
If the problem is |
Sounds somewhat counter-intuitive to fix a problem of small errors with even bigger errors? Also, not sure if |
I don't know why you're insisting the problem is one of precision. Just because the problem "goes away" when you use a double means nothing. The math shouldn't require that; all we are doing is scaling a velocity along a vector. A better performing implementation for something on a hot path is always welcome too, provided it is accurate enough, and I don't see why it wouldn't be. |
The problem goes away also when using floats, so I wouldn't discount precision so easily. I can't make sense of that at FLT_EVAL_METHOD being 0 though. |
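For reference, `FLT_EVAL_METHOD` comes from `<cfloat>` and can be checked per platform with a few lines. This is only a sketch; the helper names `currentFltEvalMethod` and `describeFltEvalMethod` are made up for illustration:

```cpp
#include <cfloat>
#include <string>

// Value of FLT_EVAL_METHOD as seen by this translation unit.
int currentFltEvalMethod()
{
    return FLT_EVAL_METHOD;
}

// Human-readable meaning of the standard values.
std::string describeFltEvalMethod(int method)
{
    switch (method) {
        case 0:
            return "operations evaluated in the range and precision of their own type";
        case 1:
            return "float and double evaluated as double";
        case 2:
            return "all operations evaluated as long double";
        default:
            return "indeterminate or implementation-defined";
    }
}
```

A value of 0 means float expressions are not silently evaluated in a wider type, which is why a float/double difference is surprising under that setting.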
In the 2nd paragraph I meant to say that if we have a precision problem here on such a small scale (affecting our calculations in maybe 2/360 cases) and this ends up making 95% of movements invalid already, less precision doesn't look like the fix. But also, the whole picture does not fit since the errors still look too small for that impact, and My gut feeling was somewhat right, this is probably not the place to focus on, at least for this problem. I put traces into the code and tested a case with the same journey on one PC on a map with no moving actors. So full determinism, if I'm not wrong. The result shows an interesting problem. This is my desktop:
And this is the Raspberry:
We can focus on NavmapPoint nmptChild(nmptCurrent.x + 16 * dxAdjacent[i], nmptCurrent.y + 12 * dyAdjacent[i]);
Log(DEBUG, "FindPath", "nmptChild: {}, {} ({}, {})", nmptChild.x, nmptChild.y, i, int8_t{dxAdjacent[i]}); What we see is that on the RPi none of the accesses to With
ARMv8 assembler code in PathFinder.cpp:374
cmp x20, #0x0
add x0, x20, #0x3f
ldr x2, [sp, #200]
csel x0, x0, x20, lt // lt = tstop
negs x1, x20
and x20, x20, #0x3f
asr x0, x0, #6
and x1, x1, #0x3f
csneg x1, x20, x1, mi // mi = first
add x0, x2, x0, lsl #3
tbz x1, #63, 0x7ff7e48fe4
add x1, x1, #0x40
sub x0, x0, #0x8
ldr x2, [x0]
mov x3, #0x1 // #1
adrp x4, 0x7ff7f0b000
add x4, x4, #0x138
lsl x1, x3, x1
orr x1, x2, x1
str x1, [x0]
add x0, x23, #0x10
stp x4, x0, [sp, #160]
add x24, sp, #0x1c8
ldr x4, [sp, #264]
add x19, x4, #0x138
add x21, x4, #0x140
ldrsb w3, [x21]
mov x0, x24
ldrsb w4, [x19] // <---- PC at L374
ldr w1, [sp, #440]
ldr w2, [sp, #444]
add w3, w3, w3, lsl #1
add w1, w1, w4, lsl #4
add w2, w2, w3, lsl #2
bl 0x7ff7cccf50 <_ZN5GemRB5PointC1Eii@plt>
mov x0, x24
bl 0x7ff7ccc570 <_ZN5GemRB3Map18ConvertCoordToTileERKNS_5PointE@plt>
|
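To read the listing against the source: `ldrsb` is a sign-extending byte load, `lsl #4` multiplies dx by 16, and `w3 + (w3 << 1)` followed by `lsl #2` multiplies dy by 12. A standalone sketch of the same neighbour computation follows. The `NavmapPoint` struct is a simplified stand-in, and the offset table values are assumed for illustration only (the real tables are not shown in the thread):

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Simplified stand-in for the GemRB type.
struct NavmapPoint {
    int x;
    int y;
};

// Assumed offsets for the 8 neighbours; hypothetical values and order.
constexpr std::array<std::int8_t, 8> dxAdjacent { 1, 0, -1, 0, 1, 1, -1, -1 };
constexpr std::array<std::int8_t, 8> dyAdjacent { 0, 1, 0, -1, 1, -1, 1, -1 };

// Compute all neighbours of `current` on a 16x12 grid step.
std::array<NavmapPoint, 8> adjacentPoints(const NavmapPoint& current)
{
    std::array<NavmapPoint, 8> children {};
    for (std::size_t i = 0; i < children.size(); ++i) {
        children[i].x = current.x + 16 * dxAdjacent[i];
        children[i].y = current.y + 12 * dyAdjacent[i];
    }
    return children;
}
```

If a standalone build of something like this gives correct neighbours on the RPi while the in-tree code does not, that would point away from the arithmetic itself and towards how the tables are accessed there.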
Wow. Have you tried making them a static const (since constexpr working on classes is apparently a c++20 feature) or compiling with clang? |
Will try clang later. |
Eh ... godbolt pointed me in a good direction. First I was thinking that GCC was outsmarting me by taking all these 0, 1, -1 into its emitted assembly code or something, since there was something disappearing at higher levels of O (see below). But no, this is a minimal case roughly looking like our problem (this is Details of the assembly are irrelevant, I guess, and also Although there are no apparent references to Looking at the result of |
you're misunderstanding me. What I'm saying is what we want to do here (normalize a velocity along a vector) does not require We also shouldn't discount the sort of assumptions |
Clang v14 is good. So what to do? My best guess is that there is some kind of problem that is isolated to GCC (v12) on (RPi-)ARM.
Indications of some kind of bug in GCC are strong but breaking this down into a small test case only gives clues but no reproducibility yet. |
Can you try with 13 or 14? Ideally it'd be a versioned problem and we could simply handle it in cmake. |
Hm, don't know. Raspbian Bookworm only provides v12 for now. |
Docker to the rescue again, although this isn't much fun on such a device. No improvement with GCC v13. |
Ok, thanks. I think the sanest thing for now is to block gcc RPI builds and if anyone external gets annoyed enough to investigate further successfully, we can lift the ban. Should be trivial to do in CONFIGURE_RPI_SPECIFICS, since by the time it's run, the compiler has already been detected. Let me know if you want me to handle it. |
Bug description
I have found a mysterious bug that only shows up when running GemRB on an RPi.
Essentially almost all movement attempts fail on all the (IWD2) maps I've been trying, just leaving this log line as a clue:
gemrb/gemrb/core/Scriptable/Scriptable.cpp
Line 2226 in dbda9db
Yet the 2nd attempt won't work either.
It's bad at master and 43647ce already and I'll try a bisection but that takes a little time since there were also compiler errors around.
GemRB version (check as many as you know apply)
Video Driver (check as many as you know apply)
OPENGL_BACKEND enabled