Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Script API: Check that SAOs are still usable before attempting to use them #9390

Merged
merged 1 commit into from Feb 11, 2020

Conversation

sfan5
Copy link
Member

@sfan5 sfan5 commented Feb 11, 2020

fixes #8587, fixes #9387

To do

This PR is a Ready for Review.

Q: Should this print a warning so the mod this happened in can be investigated? (not necessary)

@sfan5 sfan5 merged commit 91eef64 into minetest:master Feb 11, 2020
@sfan5 sfan5 deleted the lua_sao_fix branch February 11, 2020 18:21
@thomasrudin
Copy link
Contributor

thomasrudin commented Feb 12, 2020

I'm now getting crashes from nil-values returned from self.object:get_velocity() not sure if they are related.

I'm opening an issue if it is caused by this..

@sfan5
Copy link
Member Author

sfan5 commented Feb 12, 2020

Hm, well this fix has the side effect that the ObjectRefs that were buggy before are now entirely non-functionality with all relevant methods returning nil.

It would be possible to work around this in the affected mods, but that would be incorrect.
Do the Pushing ObjectRef to removed/deactivated object, this is probably a bug. warnings still appear?

@thomasrudin
Copy link
Contributor

Do the Pushing ObjectRef to removed/deactivated object, this is probably a bug. warnings still appear?

No, not in that time-frame, the patch was only active for about 15 minutes..

@sorcerykid
Copy link
Contributor

Hm, well this fix has the side effect that the ObjectRefs that were buggy before are now entirely non-functionality with all relevant methods returning nil.

I'm definitely glad to hear that there is now fully consistent behavior for all methods of a freed ObjectRef. Perhaps it might be worth mentioning that point in the documentation.

@lhofhansl
Copy link
Contributor

Hmm... I'm seeing this too with various mods that use mobs_redo. Perhaps we should undo this one (I agree it's correct behavior, yet, there are mods that are broken by this).

@sfan5
Copy link
Member Author

sfan5 commented Feb 16, 2020

I believe the ultimately better outcome is to get mods fixed quickly right now instead of turning it into a soft warning and getting mods fixed sometime somewhen (or never).

Also, has anyone noticed errors with a mod other than mobs_redo?

@thomasrudin
Copy link
Contributor

Also, has anyone noticed errors with a mod other than mobs_redo?

Not yet, no, but i'm guessing this will be a mine-field... 😖

I fixed the 3d_armor issues in my own fork (minetest-mods/3d_armor#2) because the original mod by stujones11 is not maintained anymore...

@lhofhansl
Copy link
Contributor

I'm also not 100% sure the change has the behavior we want.
The engine calls on_step, as it should. If "remove" is called inside that, why wouldn't we allow still accessing the object for this step. It is weird that a reference that was valid before suddenly turns into nil.

I haven't looked closely at this part of the code, is there a risk of accessing freed memory?

@sorcerykid
Copy link
Contributor

I have to wonder how this change broke the 3d_armor mod. I looked at the PR linked above and don't see any changes dealing with ObjectRefs, just a lot of renaming of variables. Perhaps I missed something.

@sfan5
Copy link
Member Author

sfan5 commented Feb 16, 2020

why wouldn't we allow still accessing the object for this step?

I get that it's more convenient here but why would we? The mod has just asked to delete the object, it should cease to exist.

is there a risk of accessing freed memory?

There isn't and unlike the reported bug (#9387) for players I don't think using entities after their removal could crash anything either.

@sorcerykid
Copy link
Contributor

I can sort of see @lhofhansl's point. It might be more intuitive that calls to obj:remove() only add the object to a queue to be removed. But the object is not actually removed from the environment until control is returned back to the environment at the conclusion of the server step. (Personally, I thought that's how it already worked on the CPP side, but I suppose I'm recalling wrong).

Nevertheless, I must agree with @sfan5, that if an object is removed, then rightfully it should cease to exist at that point going forward. Mod developers should properly document their own API so that there is no risk of objects still being referenced after removal. Or at least they should provide some kind of callback mechanism so that other mods can be notified of a pending removal. These are mod-specific implementation details, and technically shouldn't be the responsibility of the engine.

@lhofhansl
Copy link
Contributor

lhofhansl commented Feb 17, 2020

I get that it's more convenient here but why would we? The mod has just asked to delete the object, it should cease to exist.

Upon further reflection I disagree.
The mod did not set the reference to nil, it simply called a method on the object - indicating that the object should be removed from the set of active objects, so that no future callbacks are called. Which is precisely the reason why the engine has an "isgone" method.

I vote for reverting... Perhaps with commenting what the delete method means.

@sorcerykid
Copy link
Contributor

sorcerykid commented Feb 17, 2020

If an object is removed from the set of active objects, then what type of object is it at that point? Its not a LuaEntitySAO nor is it a PlayerSAO. Perhaps it is a DeletedServerObj? Shouldn't there be documentation on the API for this new class of DeletedServerObjs?

@sfan5
Copy link
Member Author

sfan5 commented Feb 17, 2020

The mod did not set the reference to nil, it simply called a method on the object - indicating that the object should be removed from the set of active objects, so that no future callbacks are called.

Well not exactly. The mod has marked the object for removal, the earliest point at which that can happen is the current server step (after all step callbacks have run).
It's true that this PR artificially makes the object unusable despite it still existing.

I vote for reverting...

Would you want this reverted for players and Luaentities or just Luaentities?
Allowing interaction with "deleted" players will open up at least one segfaulting code path in the server code again, so that would have to be dealt with in another way.

If an object is removed from the set of active objects, then what type of object is it at that point?

The same it always was, the type does not change.

@lhofhansl
Copy link
Contributor

Would you want this reverted for players and Luaentities or just Luaentities?

Just Luaentities, I think - assuming others agree obviously.

@lhofhansl
Copy link
Contributor

So should we revert the Luaentities part of this?

@sfan5
Copy link
Member Author

sfan5 commented Feb 20, 2020

My vote is still no, I don't know the opinions of other coredevs.

@lhofhansl
Copy link
Contributor

lhofhansl commented Feb 21, 2020

My vote is still yes :)
(and document that soa.remove just removes the object from the set of active objects)

Perhaps some other core devs could weigh in.

And for the record I have reverted that part here, and seen no issues.

@SmallJoker
Copy link
Member

There were already warnings when invalid/deleted objects were accessed. This PR "only" enforced the object lifespan which now became an issue in mods that relied on it.
The handling after this PR is exactly as intended, and mod authors should fix their mods. BUT to ease this process an API function, say obj:is_valid() must be provided to perform a proper check whether the entity is already gone (LuaEntity and Players).

@lhofhansl
Copy link
Contributor

Honestly I'd rather get rid of the warnings then.

The object is not deleted, it's still there, just not in the set of currently active object anymore. That's easy to explain, easy to understand, and easy to use in mods.

We do the same in core. The object is still there, references are not set to null (not possible in C++ anyway) and there's an is_gone method.

Can't do obj:is_valid() if you pull the rug from underneath by returning a null-reference, right?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Segfault on Server::SendMovePlayer Segfault caused by chat sending scrollbar change
5 participants