Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Physics Crash: Force-related vector holding “not a number” #822

Closed
ArsThaumaturgis opened this issue Dec 21, 2019 · 28 comments
Closed

Physics Crash: Force-related vector holding “not a number” #822

ArsThaumaturgis opened this issue Dec 21, 2019 · 28 comments
Labels
bug
Milestone

Comments

@ArsThaumaturgis
Copy link
Contributor

@ArsThaumaturgis ArsThaumaturgis commented Dec 21, 2019

Every so often–very seldom, in fact–and usually when something else is going wrong (such as extraneous circumstances causing an extremely low frame-rate) a crash occurs in what seems to be physics logic.

I only have the most recent such crash to hand, I think, and it’s been long enough since the last one that I don’t know whether it was the same.

That said, here is that most-recent crash:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/direct/showbase/ShowBase.py", line 1815, in updateManagers
    self.physicsMgr.doPhysics(dt)
AssertionError: !child_vector.is_nan() at line 59 of panda/src/physics/linearForce.cxx
Traceback (most recent call last):
  File "GameCore.py", line 1471, in <module>
    gameFramework.run()
  File "/usr/local/lib/python3.6/dist-packages/direct/showbase/ShowBase.py", line 3152, in run
    self.taskMgr.run()
  File "/usr/local/lib/python3.6/dist-packages/direct/task/Task.py", line 537, in run
    self.step()
  File "/usr/local/lib/python3.6/dist-packages/direct/task/Task.py", line 491, in step
    self.mgr.poll()
  File "/usr/local/lib/python3.6/dist-packages/direct/showbase/ShowBase.py", line 1815, in updateManagers
    self.physicsMgr.doPhysics(dt)
AssertionError: !child_vector.is_nan() at line 59 of panda/src/physics/linearForce.cxx

As you can see, the issue seems to occur in “doPhysics”, and seems to result from some vector being “not a number”.

My project uses Bullet for physics, rather than Panda's internal physics system--but it does also use particles, and so enables Panda's physics.

I'm afraid that I don't currently have a reliable way to reproduce this issue. (Which is a bit of a concern for me, too, as it means that I don't know how to test whether the issue is still present, or whether any of my own code is prompting it.)

This comes from a thread on the forum, linked here:
https://discourse.panda3d.org/t/physics-crash-force-related-vector-holding-not-a-number

(It's been pointed out to me that the physics-manager indicated in the error above does seem to be related to the call to "enableParticles", so I've retracted my expression of suspicion that the issue might be Bullet-related.)

@rdb

This comment has been minimized.

Copy link
Member

@rdb rdb commented Dec 22, 2019

What types of forces are exactly active on your particle system?

@rdb rdb added the bug label Dec 22, 2019
@ArsThaumaturgis

This comment has been minimized.

Copy link
Contributor Author

@ArsThaumaturgis ArsThaumaturgis commented Dec 22, 2019

I'm not sure of where I was in the game when this crash happened, so I'm not sure of which particle effects were active at the time (and which, come to think of it, might have been cleaned up).

Based on a quick search, I believe that I have all of the following in one or another particle system:

  • LinearVectorForce
  • LinearFrictionForce
  • LinearNoiseForce
  • LinearSinkForce
@ArsThaumaturgis

This comment has been minimized.

Copy link
Contributor Author

@ArsThaumaturgis ArsThaumaturgis commented Jan 4, 2020

Thanks to @serega-kkz, it seems that we have more information!

Based on their findings in this post and the following, it looks like the issue (in at least one case) comes about as a result of the combination of a LinearVectorForce and a LinearFrictionForce.

In addition, they seem to be able to reliably produce the crash under Windows 7 by minimising the game-window.

Specifically, they found that the following combination of forces produced a crash:
(Copy-pasting from their post.)

force0 = LinearVectorForce(Vec3(0.0000, 0.0000, -9.8000), 1.0000, 0)
force0.setVectorMasks(0, 0, 1)
force0.setActive(1)
f0.addForce(force0)

force1 = LinearFrictionForce(1.0000, 30.0000, 0)
force1.setVectorMasks(0, 0, 1)
force1.setActive(1)
f0.addForce(force1)

Known pipe types:
  wglGraphicsPipe
(all display modules loaded.)
Assertion failed: friction.almost_equal(LVector3::zero()) || IS_NEARLY_EQUAL(nor
malize(v).dot(normalize(friction)), -1.0f), file c:\buildslave\sdk-windows-amd64
\build\panda\src\physics\linearFrictionForce.cxx, line 65

While the following combination did not produce a crash:
(Again, copy-pasting from their post.)

force0 = LinearVectorForce(Vec3(0.0000, 0.0000, -9.8000), 1.0000, 0)
force0.setVectorMasks(0, 1, 0)
force0.setActive(1)
f0.addForce(force0)

force1 = LinearFrictionForce(1.0000, 30.0000, 0)
force1.setVectorMasks(0, 0, 1)
force1.setActive(1)
f0.addForce(force1)

Note that the only difference seems to be that the working version applies a different mask to the Vector Force.

Now, it applies the mask such that no force is applied at all by the Vector Force. Is that relevant? Quite possibly; quite possibly not. Perhaps it's just that the masks are different, perhaps there's some issue with the maths involved in applying both acceleration and friction. It's not clear right now...

I do note that the error given above isn't quite the same as in the original error--but it may well still be part of the same problem.

@rdb

This comment has been minimized.

Copy link
Member

@rdb rdb commented Jan 4, 2020

Great! Could you put that in the form of a self-contained example? Maybe I can get a fix in in time for 1.10.5, if I can reproduce it easily.

@ArsThaumaturgis

This comment has been minimized.

Copy link
Contributor Author

@ArsThaumaturgis ArsThaumaturgis commented Jan 4, 2020

Serega already did so, at least for the non-working version--it's linked in the post to which I linked above, I believe.

For the sake of convenience and thread-completeness, here's the link:
https://discourse.panda3d.org/uploads/short-url/bSt3M7Z4VcIKabvREE8tkc6QN0P.zip

(Note that it contains my leaf-particle image.)

To switch to a "working" version, just edit the ptf file so that "force0" has a vector-mask of (0, 1, 0), rather than (0, 0, 1), if I'm not much mistaken.

[edit] I've updated the link above, Serega having replaced the particle-image. (As it was previously the leaf-image from my game.) The new version also now sets full-screen, which may be related to reproducing the issue.

@rdb

This comment has been minimized.

Copy link
Member

@rdb rdb commented Jan 4, 2020

When the window is minimized, for some reason, the particle positions keep getting larger and larger. This eventually leads to a numerical error in the LinearNoiseForce implementation.

I can fix the LinearNoiseForce issue so that it doesn't return a NaN, but it seems like there's another problem here that causes the positions to get very large when the window is minimized.

@ArsThaumaturgis

This comment has been minimized.

Copy link
Contributor Author

@ArsThaumaturgis ArsThaumaturgis commented Jan 4, 2020

Very interesting. I wonder why it seems to so consistently happen under Windows 7, but not my Linux installation?

@rdb

This comment has been minimized.

Copy link
Member

@rdb rdb commented Jan 4, 2020

I'm not sure.

I think it happens when minimized because Panda slows down the framerate when the main window is minimized, which causes the dt to grow. Perhaps some calculation in the physics system causes positions to grow out of proportions then. Indeed, adding time.sleep(0.1) in a task is enough to reproduce the issue (in lieu of minimization).

@ArsThaumaturgis

This comment has been minimized.

Copy link
Contributor Author

@ArsThaumaturgis ArsThaumaturgis commented Jan 4, 2020

You're quite right--under Linux, the test-program does indeed crash for me if I add "time.sleep(0.1)" within a task.

@rdb

This comment has been minimized.

Copy link
Member

@rdb rdb commented Jan 4, 2020

I'm no physicist, but from what I can tell, LinearFrictionForce appears to be broken. It works by subtracting the current velocity times a coefficient from the object's acceleration.

If the issue isn't already obvious from basic dimensional analysis: after the gravity has been applied, the object's speed is -9.8 * dt which is -1.176 when the dt is (say) 0.12, which means that once multiplied by your amplitude of 30.0 the acceleration applied by friction will exceed the acceleration applied by gravity.

This means that, when the window is minimized or when framerate is low, your leaves will start falling upwards more and more rapidly, until they reach infinity and the simulation explodes.

I'm going to fix the numerical accuracy issue (so that the simulation doesn't explode when the leaves leave the galaxy), but clearly, the friction force should not behave in this way. But I'm not sure how to fix this because it changes the behaviour of existing applications.

I'm inclined to leave it as-is, deprecate the LinearFrictionForce, and instead create a new LinearDragForce (which is probably a more appropriate name to begin with) that does the right thing. I do have a hack in the back of my mind that would partially mitigate it—namely limiting the drag force so that it will never be greater than other forces acting on the object—but there's no elegant way to implement it.

In your use case, I would suggest that you probably don't need to use a LinearFrictionForce to begin with. It may otherwise be possible to work around it by changing the amplitude of the force every frame based on the dt, or to limit the dt of the simulation somehow, or to set the drag coefficient to be significantly lower than the default 1.0.

@ArsThaumaturgis

This comment has been minimized.

Copy link
Contributor Author

@ArsThaumaturgis ArsThaumaturgis commented Jan 4, 2020

Ah, I see. It's a pity that it is so broken!

But fair enough: your solutions make sense, I believe, especially given the constraint of not wanting to break extant behaviour.

Hum. In my case, there may be another solution: remove both the acceleration and the friction, and just provide a static initial velocity on the particle's birth. (I forget the specific parameter offhand; it's part of the system, not the force-group, as I recall.) As long as testing indicates that the noise-force still produces the intended drift, without incurring some terrible vertical acceleration, it should be fine. (And I believe that I have a force-mask applied to the noise-force that should prevent vertical acceleration, so that should be fine.)

One advantage of changing this this way is that I should be able to simply update the particle-file--no rebuilding required!

@serkkz

This comment has been minimized.

Copy link

@serkkz serkkz commented Jan 4, 2020

Hmm, it is strange that dt is used inside the acceleration formula. I think that this is acceptable only for calculating the distance traveled for a certain time. The formula itself must be isolated from non-physical factors.

@rdb

This comment has been minimized.

Copy link
Member

@rdb rdb commented Jan 4, 2020

Yes, well, that is exactly the job of the physics integrator: determine the displacement of the object since the last frame, which is the double integration of the acceleration over dt. The formula to do so ends up being particle.pos += particle.vel * dt + 0.5 * accel * dt * dt, where accel is determined by summing all the forces acting on the particle and dividing that by the particle mass.

@ArsThaumaturgis

This comment has been minimized.

Copy link
Contributor Author

@ArsThaumaturgis ArsThaumaturgis commented Jan 4, 2020

All of that sounds correct thus far, I believe. I'm guessing, then, that the problem is that the friction-force is being applied based on the previous velocity, not the current? Perhaps the solution is to apply friction as a separate step?

@serkkz

This comment has been minimized.

Copy link

@serkkz serkkz commented Jan 4, 2020

I think at the moment it somehow looks like this.
[LinearVectorForce.z*dt + LinearFrictionForce.z*dt]
But it will be more correct.
[LinearVectorForce.z + LinearFrictionForce.z]*dt

@ArsThaumaturgis

This comment has been minimized.

Copy link
Contributor Author

@ArsThaumaturgis ArsThaumaturgis commented Jan 4, 2020

Those should be mathematically equal, I believe, save for precision errors.

@serkkz

This comment has been minimized.

Copy link

@serkkz serkkz commented Jan 5, 2020

Hmm, if calculations are performed in one frame for all forces, then yes. However, this may not be so.

@ArsThaumaturgis

This comment has been minimized.

Copy link
Contributor Author

@ArsThaumaturgis ArsThaumaturgis commented Jan 5, 2020

Indeed; I don't know how it's done internally, I'm afraid.

I still suspect that the problem may be that the friction-force incorporates the previous velocity into its value, rather than the current.

@rdb

This comment has been minimized.

Copy link
Member

@rdb rdb commented Jan 5, 2020

I don't think that's the problem, no. Let's take only a constant gravity force and the drag force to keep things simple. With the current drag equation, and the amplitude of 30 that you set, we get this if we fill in how these forces affect the velocity:

v = v_prev + 9.8 * dt - v_prev * 30 * dt

Given a large enough v_prev, this formula clearly has values for dt above which the resulting velocity will be negative, ie. the drag will be accelerating the object in the other direction. Decreasing the amplitude of the LinearFrictionForce helps a lot, but there are still such values for dt to be found.

The proper equation for drag appears to actually be proportional to the square of the velocity, so at least the units will line up.

I'm way out of my depth with all this, though. It could be that these inaccuracies are bound to happen with iterative rather than analytical solutions.

@serkkz

This comment has been minimized.

Copy link

@serkkz serkkz commented Jan 5, 2020

It seems to me that a distance formula is required, not a speed formula.
The speed formula does not apply time multiplication, division is used.
dt is essentially this time.
s = v*dt

@ArsThaumaturgis

This comment has been minimized.

Copy link
Contributor Author

@ArsThaumaturgis ArsThaumaturgis commented Jan 5, 2020

... we get this if we fill in how these forces affect the velocity:

v = v_prev + 9.8 * dt - v_prev * 30 * dt

Hmm, you're right. Indeed, I wonder whether part of the problem is that a value of 30 is far too high. Looking at the relevant wiki article, I don't see drag coefficients anywhere near that value, offhand. There's also a factor of 0.5 applied in the force-equation.

I'm way out of my depth with all this, though. It could be that these inaccuracies are bound to happen with iterative rather than analytical solutions.

That's entirely possible. I remember an (iterative, I'm pretty sure) air-drag simulation that I did in university that produced some... interesting results if one turned certain values up too high...

rdb added a commit that referenced this issue Jan 5, 2020
@serkkz

This comment has been minimized.

Copy link

@serkkz serkkz commented Jan 5, 2020

Hmm there was a time and it was interesting to me physical modeling.
Oddly enough, I used the air resistance formula. And did not get any problems in the calculations.

Calculation of the maximum speed:

import math

m = 80      # m = mass of the falling body.
g = 9.8     # g = gravity acceleration.
p = 1.184   # ρ = air density.
A = 1       # A = projection area of ​​the body.
C = 0.7     # C = resistance coefficient.

print ('km/h:',(math.sqrt((2*m*g)/(p*A*C))*3600)/1000)
km/h: 156.58518103230244

True, I lost the code, I think I need to run tests on different FPS.

@ArsThaumaturgis

This comment has been minimized.

Copy link
Contributor Author

@ArsThaumaturgis ArsThaumaturgis commented Jan 5, 2020

Note, however, that in this case the coefficient may be the value 30 in posts above, which may be too high...

@rdb

This comment has been minimized.

Copy link
Member

@rdb rdb commented Jan 5, 2020

Indeed, you are far more likely to get stable results if you keep the value below 1.0. The coefficient itself is actually capped at 1.0, but the amplitude isn't, and the amplitude is just a multiplier for the coefficient.

A proper drag implementation needs to be proportional to the square of the velocity, rather than the velocity--but, it can be shown that doing that in itself won't fix this.

I'm going to close this for now since I checked in the fix for the asserts; we can add a note to the manual that amplitude values should not exceed 1.0, and we can add a drag force that is proportional to the velocity later.

@rdb rdb closed this Jan 5, 2020
@rdb rdb added this to the 1.10.5 milestone Jan 5, 2020
@rdb

This comment has been minimized.

Copy link
Member

@rdb rdb commented Jan 5, 2020

Come to think of it, it may be a superior solution to always run the physics simulation at a fixed timestep (say, at 1/60.0). Instead of adjusting the dt based on the framerate, we would simply run fewer or more physics steps in a given frame.

It seems to me that with this approach, the risk of introducing numerical errors due to frame lags would be significantly reduced, since the dt is never higher than the simulation can handle.

@ArsThaumaturgis

This comment has been minimized.

Copy link
Contributor Author

@ArsThaumaturgis ArsThaumaturgis commented Jan 5, 2020

That sounds like it might work, as long as it doesn't have a significantly negative impact on performance! ^_^

@rdb

This comment has been minimized.

Copy link
Member

@rdb rdb commented Jan 5, 2020

Overriding this method in your ShowBase subclass might improve stability:

DT_LIMIT = 1 / 30.0

    def updateManagers(self, task):
        dt = globalClock.getDt()
        while dt > 0:
            limited_dt = min(dt, DT_LIMIT)
            if self.particleMgrEnabled:
                self.particleMgr.doParticles(limited_dt)
            if self.physicsMgrEnabled:
                self.physicsMgr.doPhysics(limited_dt)
            dt -= DT_LIMIT
        return task.cont

It will only take effect when your FPS falls below 30 (configurable). You can adjust DT_LIMIT to be lower (like 1/60) to have more frequent updates and more stable physics, at the expense of performance.

@ArsThaumaturgis

This comment has been minimized.

Copy link
Contributor Author

@ArsThaumaturgis ArsThaumaturgis commented Jan 5, 2020

Ah, thank you for that. ^_^

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.