Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix crash on removing added sibling in _ready #78834

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Haydoggo
Copy link
Contributor

@Haydoggo Haydoggo commented Jun 29, 2023

If a Node that was added to the tree via add_sibling() was removed/reparented in its _ready() function, it would cause a crash.

Example:

extends Node

class MyNode extends Node:
	var new_parent = Engine.get_main_loop().current_scene
	func _ready() -> void:
		reparent(new_parent)

func _ready() -> void:
	await get_parent().ready
	for i in 3:
		add_sibling(MyNode.new())

@Haydoggo Haydoggo requested a review from a team as a code owner June 29, 2023 12:48
@YuriSizov
Copy link
Contributor

Could you provide a reproduction project?

@AThousandShips
Copy link
Member

Better yet a formal bug report to track this in case this doesn't solve it

@Haydoggo
Copy link
Contributor Author

Could you provide a reproduction project?

Updated description with reproduction example

@AThousandShips
Copy link
Member

AThousandShips commented Jun 29, 2023

Awaiting parent ready signal seems like a huge no-no here to me

Also why is reparent allowed to be called in _ready that seems unsafe, I'd say that should be fixed just like add_child isn't permitted

Edit: realized that won't necessarily catch this as it's not blocked at this point

@Haydoggo
Copy link
Contributor Author

Haydoggo commented Jun 29, 2023

I though the purpose of the _ready function is that it's run when it is safe to add children, that's also why I wait for the parent to emit its own ready signal before starting.

@AThousandShips
Copy link
Member

AThousandShips commented Jun 29, 2023

It's not safe, it's explicitly saying that you can't add children to parent if you try it:

Parent node is busy setting up children, add_child() failed. Consider using add_child.call_deferred(child) instead.

You should do this in a separate function, or do the deferred call to keep this safe

Does this happen if you call reparent deferred instead?

@YuriSizov
Copy link
Contributor

YuriSizov commented Jun 29, 2023

While we should prevent the crash, this example looks confusing. What's the real life need for this kind of logic?

  • Have a node in a scene;
  • When its parent becomes ready, add 3 siblings;
  • But not really, because as siblings become ready, we move them elsewhere.

I don't understand the intent here, and as mentioned ready is not the best time to add children to a node. We should probably add a check to add_sibling, similar to the one we have in add_child.

As for the crash, we likely need to catch its condition with an ERR_* macro and prevent further execution.

@@ -1409,7 +1409,9 @@ void Node::add_sibling(Node *p_sibling, bool p_force_readable_name) {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ERR_FAIL_COND_MSG(data.blocked > 0
should more likely be
ERR_FAIL_COND_MSG(data.parent->data.blocked > 0

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testing out this fix right now, thank you

Of course doesn't fix this case specifically, but still relevant parallel issue

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also reorder checks a bit, two involving p_sibling together and two involving data.parent together.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should add a data.blocked around the actual child additions in _add_child_nocheck, we do for removing and moving

Edit: a quick test with that works for this situation, but can't predict any issues from doing that, so would have to be investigated

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name nocheck suggests that it shouldn't do any checking :P

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't mean checking, but data.blocked++ :P

@Haydoggo
Copy link
Contributor Author

What's the real life need for this kind of logic?

I orignally came across this bug when making a building game. The logic was as follows:

  1. Place some part, and add it to the world with add_sibling.

  2. In the part's _ready function, it checks if it is touching another part, and if it is, it reparents itself so that it shares its parent with the touching part.

As for the crash, we likely need to catch its condition with an ERR_* macro and prevent further execution.

My rationale for not throwing an error here was that replacing add_sibling with get_parent().add_child in script dodges this error all together, and think it would be strange for those two options to not have the same behaviour here.

That being said, add_sibling does have slightly different behaviour to get_parent().add_child in that the node may end up with a different index depending on which one is used, so maybe we should different error handling behaviour?

What do you think @YuriSizov @AThousandShips ?
Btw I appreciate all the feedback so far, I'm just happy to help out and learn to do so better :)

@YuriSizov
Copy link
Contributor

get_parent().add_child shouldn't work either, it should be blocked the same way. If this doesn't happen, there are either other differences or a bug.

I think for now we can go with #78847 and then re-evaluate if there are still problems that cause crashes for your example. Then we can either update this PR to address them, or close it as superseded if #78847 ends up fixing all of it.

@Sauermann
Copy link
Contributor

As far as I understand #78834 (comment), the bug will not be completely fixed by #78847, which solves a related problem.

As mentioned above, it should be investigated, if a data.blocked++-data.blocked---section helps fixing this bug without introducing any problems.

@Haydoggo
Copy link
Contributor Author

Haydoggo commented Sep 23, 2023

get_parent().add_child shouldn't work either, it should be blocked the same way. If this doesn't happen, there are either other differences or a bug.

I think the example I used in the description has caused some confusion, especially in my use of running the example in the _ready function and using await get_parent().ready. These may be bad practice but do not affect the issue at hand. Here's a simpler MRP that avoids these problems:

extends Button

class MyNode extends Node:
	func _ready() -> void:
		get_parent().remove_child(self)

func _pressed() -> void:
	add_sibling(MyNode.new())

As mentioned above, it should be investigated, if a data.blocked++-data.blocked---section helps fixing this bug without introducing any problems.

The only way to address this with data.blocked is to ensure that a parent is always blocked when adding a child.
I tried this by adding a blocking clause around _add_child_nocheck in Node::add_child, and it does fix this bug, but I'm not sure how to ensure this doesn't break anything.

Node::add_child has a check in it to make sure it's not blocked when it's called, so I assume there's some problem with blocking it right after that check?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants