Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Promote the dangling Variant fix to release builds #1589

Open
RandomShaper opened this issue Oct 1, 2020 · 18 comments
Open

Promote the dangling Variant fix to release builds #1589

RandomShaper opened this issue Oct 1, 2020 · 18 comments

Comments

@RandomShaper
Copy link
Member

RandomShaper commented Oct 1, 2020

Describe the project you are working on:
It's not really related to my project, but to game projects from many people.

Describe the problem or limitation you are having in your project:
The problem is that the current dangling Variant fix (godotengine/godot#38119 plus some fixups) only applies to debug (or release_debug) builds, so people is getting bitten by crashes or wrong logic related to the lack of that protection on production. While it's true that the editor will warn you about those situations at dev time, there are usually still states a game can be in that are hard to cover in testing, but can very well happen to users.

Describe the feature / enhancement and how it helps to overcome the problem or limitation:
The idea is to have the following:

  • Godot 3.x: dangling Variant fix is applied on pure release builds, too. The performance measurements that have been done so far make this completely reasonably: you get extra robustness at the same performance (theoretically a bit worse, but negligible; but even better, as shown in some tests).
  • Godot 4.0: here it would be soon enough to take this some steps further:
    -- Finally get rid of things like is_instance_valid.
    -- Enable decayment of freed objects to null (that was initially in the fix for 3.2, but was removed because it was a change of behavior).
    -- Compare performance of the current approach to protection in 4.0 (quick ObjectDB checks) to the one in 3.2 and leave the fastest.
    UPDATE: I'm "deprecating" the list of ideas for 4.0, since I'd like to check the state of things again and refine the proposal for it. Let's focus on 3.x for now.

Describe how your proposal will work, with code, pseudocode, mockups, and/or diagrams:
I think it's already pretty clear.

If this enhancement will not be used often, can it be worked around with a few lines of script?:
If a developer could really ensure the lifetime of every object in their game is perfectly tracked, maybe some (or a lot of) scripting would fix the problem. But that's not the case.

Is there a reason why this should be core and not an add-on in the asset library?:
This is hardcore core stuff, so not possible.

@oeleo1
Copy link

oeleo1 commented Oct 1, 2020

Thanks for writing this proposal! This is so needed.

Big thumbs up for the 3.2+ fix - release builds are simply broken as they are. Note that we don't have any warning in Debug builds about that, but Release builds crash systematically and invariably.

As for 4.0, it sounds reasonable to compare the 3.2 with the fix performance to the mechanism in 4.0 and decide accordingly. Regarding is_instance_valid and queue_free, let me tell you that our experience with these over time was so painful that we completely dropped them out of our code. We've been bitten by those so many times that we even considered dropping Godot altogether because of them.

FWIW, instead of is_instance_valid we are using a global script procedure based on weakref and ref.is_queued_for_deletion checks. And instead of queue_free we use call_deferred("free") all over the place.

To sum up - a big YES to this proposal which we fully endorse. We can also poor money and beer non-exclusively to get this in.
Thanks!

@reduz
Copy link
Member

reduz commented Oct 1, 2020

I still think adding this on release is just a bad idea. If you don't get errors in the debug build but it crashes on release then this is probably a bug and/or unrelated. In this situation, the dangling pointer should always crash on debug.

While to you it may be better to have this fix, other users probably prefer to fix the errors reported and expect the release build to run as fast as possible.

Instead, I think the course of action should be:
-Figure out why no error appears on debug and fix it, if this is the case
-Make it easier to publish a debug build, if users want safety nets and dont care about performance.

@reduz
Copy link
Member

reduz commented Oct 1, 2020

@oeleo1 also by the way, is_instance_valid is completely trustworthy in recent 3.2 builds as well as 4.0, it was not trustworthy in previous 3.x builds, so it may be a misunderstanding from your part.

@RandomShaper
Copy link
Member Author

is_instance_valid() was made thread-safe, but it's still not something you can trust. An object allocated at the same address as a freed one will report a false positive. That's, of course, considering that what you want to check is that your variable is still pointing to the object you assigned, and not to just some object, which I think is the case.

@reduz
Copy link
Member

reduz commented Oct 1, 2020

Ah that may be still the case of 3.2, but in 4.0 this function is 100% trustworthy

@RandomShaper
Copy link
Member Author

Just a heads up, this is likely to be accepted for 3.2.

The part for 4.0 won't probably be needed because there things are meant to work better out from the box. When I confirm that's the case, I'll remove the relevant part and limit this proposal for 4.0.

@oeleo1
Copy link

oeleo1 commented Oct 1, 2020

Okay. Two points I'd like to draw your attention on:

  1. There is an obvious discrepancy between Debug and Release builds. To oversimplify, debug == dev, release == pro. A project life cycle goes trough the dev phase where debug makes perfect sense and is the right thing to use, but at some point it matures enough to become a product for potential release. That's where we go pro. Currently the whole Godot process is centered around debug, the editor is debug-release and the default export mode is debug. So most of Godot's use is debug-oriented due to the dev process organization. And if you ask me, that's okay, but that's a bit of a problem for entering the pro league. Because when the time comes to go pro (you do care about performance in most games after all) and for other reasons, one goes for a release build. And this is where Godot is quite weak in testing coverage and the above discrepancy is obvious. While I agree that for this particular issue it may not be completely appropriate to integrate the fix in release, a crash is a crash and that's what happens. Regularly. Systematically. Noone wants to gamble with crashes. And indeed, the crashes are probably in a 3rd party lib or some obscure bug unrelated to the variant fix, because the code base is subject to a more aggressive compiler optimization, but still - a systematic or sporadic crash is a no go and currently the release builds are this kind of gamble. Unacceptable. Going back to the subject, this issue is and has been the most probably cause for crashes in a long time. If it would be hiding errors, don't let it in, but if it helps reducing the gap between debug and release, which is my take on the situation, then get it in. Also, please consider shipping Godot in release mode and not debug-release so that the gap between the two fades away. I understand the challenge, but I also understand the problem, and so do you. Seasoned programmers know that things don't always work as intended or as designed, witness the latest exchange about is_instance_valid. There's no way to heal the inflicted pain that easily.

  2. With 4.0 you've got the exceptional one-time opportunity to fix most of the past design mistakes and get things straight. Use it wisely. But to me, 4.0 is at least a year away from any serious production work simply because it is not mature enough, it's pre-alpha, and when it comes out it will be full of bugs. So fixing 3.2+ best effort is of utmost importance.

I personally don't see a problem with this going in and I've tried to argument my understanding, but hey, this is what proposals are for - debate about pros and cons. Long live Godot!

@Calinou
Copy link
Member

Calinou commented Oct 1, 2020

Also, please consider shipping Godot in release mode and not debug-release so that the gap between the two fades away.

The editor relies on functionality only available in debug mode, so I'm afraid that's not possible.

@oeleo1
Copy link

oeleo1 commented Oct 2, 2020

@Calinou - not possible today indeed, but it doesn't have to be that way. Editor and build mode can be dissociated.

@Xrayez
Copy link
Contributor

Xrayez commented Oct 2, 2020

Whatever you do, a good piece of documentation has to be written describing the proper usage at the very least, and the existing limitations which most users are likely not even aware of. One has to be aware of a possibility that people may already rely on "bug-as-a-feature" effect even in production, and even what seems to be proper fixes can be seen as regressions in existing projects.

@remram44
Copy link

Is the "proper usage" to rely on WeakRef to reference Nodes that might go away?

@Sebjugate
Copy link

Sebjugate commented Aug 15, 2021

I whole-heartedly support making release work the same as debug even if there is a small performance hit. In my mind this is a major problem for a few reasons:

  1. Debug builds work fine and do not give any indication (error, warning) that the behavior will not be the same in release.
  2. When these bugs do occur in release, they are usually game-breaking (a reference suddenly pointing to some other random node is rarely insignificant)
  3. When the bugs are noticed in release it is almost impossible to find them in debug because of point number 1
  4. Even if WeakRef is the official work-around, it is mentioned almost nowhere in tutorials and documentation, official or unofficial. In fact, almost everywhere I've seen promotes using a plain reference to a node (i.e. onready var mynode = get_node('MyNode'). Sometimes there is a false sense of security given by suggesting that is_valid_instance() is a safety, when it really only protects against null pointers, not the problem of referencing another random, but valid, node.

Edited for typos and clarity.

@remram44
Copy link

Even if WeakRef is the official work-around, it is mentioned almost nowhere in tutorials and documentation

I can't even get a clear answer here on the bug tracker. I don't know if WeakRef works for Node or just for Reference, I don't even know if the devs know.

@RandomShaper
Copy link
Member Author

@remram44, let me quote https://docs.godotengine.org/en/stable/classes/class_weakref.html:

If this object is not a reference, weakref still works, however, it does not have any effect on the object.

@remram44
Copy link

remram44 commented Aug 16, 2021

What does "still works however does not have any effect" mean in this context? Does it mean anything in any context? If it doesn't have "any" effect, in what sense does it "work"?

@RandomShaper
Copy link
Member Author

Yes, it does. It means that it won't give errors and will tell you back what object it is referencing, regardless it's Reference or not, but only if it is, it will do something in addition to what a regular variable can do.

@remram44
Copy link

Thank you so much for the answer! I will see if I can help improve the docs. I believe for example that simply removing "it does not have any effect on the object" would be a clear improvement, since I cannot see what information this part of the sentence adds.

@akien-mga
Copy link
Member

Implemented for 3.4 by godotengine/godot#51796.

Relevant changes still need to be forward-ported to master for 4.0 too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Implemented
Development

No branches or pull requests

9 participants