InventoryList 'main' is currently in use and cannot be deleted or resized. #13785

Closed
kromka-chleba opened this issue Sep 7, 2023 · 23 comments · Fixed by #13894 or #13919
Labels
Bug (issues that were confirmed to be a bug), Regression (something that used to work no longer does), @ Script API

@kromka-chleba
Contributor

kromka-chleba commented Sep 7, 2023

Minetest version

Minetest 5.8.0-dev-debug-4252f9d4d-dirty (Linux)
Using Lua 5.1.5
BUILD_TYPE=Debug
RUN_IN_PLACE=1
USE_CURL=1
STATIC_SHAREDIR="."

Operating system and version

Debian GNU/Linux 12 (bookworm) x86_64

CPU model

AMD EPYC (with IBPB) (2) @ 3.493GHz

Summary

Our server was experiencing weird freezes so we used debug mode this time and got this instead:

terminate called after throwing an instance of 'BaseException'
what():  InventoryList 'main' is currently in use and cannot be deleted or resized.

Not really a C++ guru, but in inventory.cpp there's this:

void InventoryList::checkResizeLock()
{
	if (m_resize_locks == 0)
		return; // OK

	throw BaseException("InventoryList '" + m_name
			+ "' is currently in use and cannot be deleted or resized.");
}

ResizeLocked resizeLock() increments m_resize_locks by 1, and operator() from ResizeUnlocker decrements it by 1.
I guess this can get pretty ugly if the counter doesn't get back to 0?
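To make the counter mechanism described above concrete, here is a minimal sketch. It is not the engine's code: `LockedList`, `Guard`, `lockCount()` and `resizeWouldThrow()` are hypothetical names, and `std::runtime_error` stands in for `BaseException`. It only assumes what the comment states: taking a lock adds 1, releasing it subtracts 1, and the check throws while the counter is non-zero.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for InventoryList; only the lock counter is modelled.
class LockedList {
public:
	explicit LockedList(std::string name) : m_name(std::move(name)) {}

	// Deleter that releases exactly one lock when a guard is destroyed.
	struct Unlocker {
		void operator()(LockedList *list) const
		{
			assert(list->m_resize_locks > 0);
			list->m_resize_locks -= 1;
		}
	};
	using Guard = std::unique_ptr<LockedList, Unlocker>;

	// Takes one lock; the returned guard releases it at end of scope.
	Guard resizeLock()
	{
		m_resize_locks += 1;
		return Guard(this);
	}

	// Throws while at least one guard is alive.
	void checkResizeLock() const
	{
		if (m_resize_locks == 0)
			return; // OK
		throw std::runtime_error("List '" + m_name
				+ "' is currently in use and cannot be deleted or resized.");
	}

	int lockCount() const { return m_resize_locks; }

private:
	std::string m_name;
	int m_resize_locks = 0;
};

// Returns true if a resize or delete would be rejected right now.
bool resizeWouldThrow(const LockedList &list)
{
	try {
		list.checkResizeLock();
		return false;
	} catch (const std::runtime_error &) {
		return true;
	}
}
```

Because the release is tied to the guard's lifetime, the counter in this sketch returns to 0 on every exit path, including early returns and exceptions. The check only fails while a guard is still alive, for example when a callback runs in the middle of an operation that holds one.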

Steps to reproduce

The weird (and possibly unrelated?) freezes are random but reproducible.
So far this is the first time we've seen this particular bug.

@kromka-chleba kromka-chleba added the Unconfirmed bug Bug report that has not been confirmed to exist/be reproducible label Sep 7, 2023
@kromka-chleba
Contributor Author

Here's the GDB output:
https://ufile.io/4lkj3qeg
Couldn't upload to GitHub because it doesn't allow .xz files, and other compression formats can't get the file under the 25 MB limit.

@kromka-chleba
Contributor Author

I guess you also need the binary:
https://ufile.io/0j9r6315

@SmallJoker
Member

SmallJoker commented Sep 7, 2023

The resize locks were introduced due to various invalid memory writes that happened previously and were mostly left unnoticed (#13358 for example). This error is very likely caused by a mod, although it is strange that Minetest terminates instead of gracefully handling the exception (which would give you a Lua backtrace).

About your files: How did you compress them? They're either damaged or not supported by xz version 5.2.5 that I've got.

$ md5sum core.xz 
2af5fdc4b8e5d02b9919b87464fb444c  core.xz
$ file core.xz 
core.xz: data
$ xz -d core.xz 
xz: core.xz: File format not recognized

Would you please be so kind as to either:

  1. Narrow down the cause of the error by enabling mods one-by-one in a new world OR
  2. Compress the files differently and upload them somewhere else (e.g. Dropbox, Google Drive, temporary file hosters)? I've had a blazing fast download speed of 50 KiB/s.

@kromka-chleba
Contributor Author

Okay, this should work hopefully:
https://drive.google.com/file/d/1Mz4_U1VpXStiGv0R3SYx2d9L7ybQKouA/view?usp=drive_link

> Narrow down the cause of the error by enabling mods one-by-one in a new world OR

We use the development branch of our game, Exile:
https://codeberg.org/Mantar/Exile/src/branch/v4
The bug is hard to predict because it happens once or twice a day without leaving any logs or error messages.
All we have is the minetestserver binary using 100% CPU. Debugging the LuaJIT version showed us it is something related to an entity's on_step, but after disabling LuaJIT, Minetest simply stops working and prints the error message mentioned in the title instead of freezing.
We need players to play on the server for several hours for the bug to occur, and players don't mention doing anything specific. So we simply have no clue. We also couldn't reproduce the bug on a local machine, so it could be something network related? Exile is a rather monolithic game, so we can't really remove parts of it.

@kromka-chleba
Contributor Author

> About your files: How did you compress them? They're either damaged or not supported by xz version 5.2.5 that I've got.

xz (XZ Utils) 5.2.8
liblzma 5.2.8

Some file sharing sites recompress files and damage them.

@sfan5
Member

sfan5 commented Sep 7, 2023

Instead of trying to transfer us the coredump and binary you can also just start minetest with the --debugger option and copy the backtrace it outputs.

@kromka-chleba
Contributor Author

Ok, will do.

@jeremyshannon
Contributor

jeremyshannon commented Sep 7, 2023

Attached is the backtrace, copy/pasted out of gdb:
coredump.txt
With a bit of further inspection, it looks like it's a player digging a bones pile, or maybe just taking an ordinary item ("nodes_nature:tanai_dead") out of it.

I think this is unrelated to our freezing bug, which is happening in an object or entity's on_step.

@sfan5
Member

sfan5 commented Sep 7, 2023

Quick guess: does the callback being called (nodemeta_inventory_OnTake) by chance remove the node while the inventory operation is ongoing?

@jeremyshannon
Contributor

The bones pile's on_metadata_inventory_take function does attempt to remove the node if the "main" inventory is empty. This is basically an old ~2019-2020 version of the bones mod, but it's worked fine until yesterday. Does JIT cover up this problem somehow?

@sfan5 sfan5 added Bug Issues that were confirmed to be a bug @ Script API Regression Something that used to work no longer does. and removed Unconfirmed bug Bug report that has not been confirmed to exist/be reproducible labels Sep 7, 2023
@sfan5
Member

sfan5 commented Sep 7, 2023

In this case it's merely due to changes in the engine (0fb6dba) that added these extra checks.
Not sure if this is considered a bug but marking it as one since it deserves attention anyway (it's a regression in any case).

@jeremyshannon
Contributor

Yeah I don't know if the bones thing should be called a bug on our end or not, but MT definitely shouldn't just crash without even an error message like this.

@kromka-chleba
Contributor Author

The bug happened again when I tried to log into a busy server with severe lag (4 players, no LuaJIT).
It timed out before letting me join.

@jeremyshannon
Contributor

I think that was a coincidence, and that another player was digging a bones pile at the time.

@kromka-chleba
Contributor Author

Another one? It looks different from the first. "Sneachan" is a mob. We're logging on_step functions because it appeared to us that the server froze while running on_step for entities. As you can see, the last thing the server prints before crashing is the on_step function. The problem is that we don't print the end of the on_step function, so it is not clear whether the crash is related to on_step.

Sneachan on-step
Sneachan on-step

Thread 5 "Server" received signal SIGABRT, Aborted.
[Switching to Thread 0x7fffe3fff6c0 (LWP 35760)]
__pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
#1  0x00007ffff74a9d9f in __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2  0x00007ffff745af32 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x00007ffff7445472 in __GI_abort () at ./stdlib/abort.c:79
#4  0x00007ffff7445395 in __assert_fail_base (fmt=0x7ffff75b9a90 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x555555b1156b "move_count <= count", file=file@entry=0x555555b11138 "/home/exile/minetest/src/inventorymanager.cpp", line=line@entry=304, function=function@entry=0x555555b11518 "virtual void IMoveAction::apply(InventoryManager*, ServerActiveObject*, IGameDef*)") at ./assert/assert.c:92
#5  0x00007ffff7453e32 in __GI___assert_fail (assertion=0x555555b1156b "move_count <= count", file=0x555555b11138 "/home/exile/minetest/src/inventorymanager.cpp", line=304, function=0x555555b11518 "virtual void IMoveAction::apply(InventoryManager*, ServerActiveObject*, IGameDef*)") at ./assert/assert.c:101
#6  0x0000555555945daa in IMoveAction::apply (this=0x7fffbc6694d0, mgr=0x555555d96a30, player=0x7fffdf3c6650, gamedef=0x7fffffffd160) at /home/exile/minetest/src/inventorymanager.cpp:304
#7  0x000055555572b310 in Server::handleCommand_InventoryAction (this=0x7fffffffd150, pkt=0x7fffe3ffb950) at /home/exile/minetest/src/network/serverpackethandler.cpp:749
#8  0x00005555559ec52c in Server::handleCommand (this=0x7fffffffd150, pkt=0x7fffe3ffb950) at /home/exile/minetest/src/server.cpp:1163
#9  0x00005555559d8382 in Server::ProcessData (this=0x7fffffffd150, pkt=0x7fffe3ffb950) at /home/exile/minetest/src/server.cpp:1238
#10 0x00005555559d7498 in Server::Receive (this=0x7fffffffd150) at /home/exile/minetest/src/server.cpp:1067
#11 0x00005555559cfe67 in ServerThread::run (this=0x555555d44060) at /home/exile/minetest/src/server.cpp:125
#12 0x00005555557f9330 in Thread::threadProc (thr=0x555555d44060) at /home/exile/minetest/src/threading/thread.cpp:188
#13 0x00005555557f9f22 in std::__invoke_impl<void, void (*)(Thread*), Thread*> (__f=@0x5555561e8290: 0x5555557f92be <Thread::threadProc(Thread*)>) at /usr/include/c++/12/bits/invoke.h:61
#14 0x00005555557f9ea5 in std::__invoke<void (*)(Thread*), Thread*> (__fn=@0x5555561e8290: 0x5555557f92be <Thread::threadProc(Thread*)>) at /usr/include/c++/12/bits/invoke.h:96
#15 0x00005555557f9e15 in std::thread::_Invoker<std::tuple<void (*)(Thread*), Thread*> >::_M_invoke<0ul, 1ul> (this=0x5555561e8288) at /usr/include/c++/12/bits/std_thread.h:252
#16 0x00005555557f9dce in std::thread::_Invoker<std::tuple<void (*)(Thread*), Thread*> >::operator() (this=0x5555561e8288) at /usr/include/c++/12/bits/std_thread.h:259
#17 0x00005555557f9db2 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)(Thread*), Thread*> > >::_M_run (this=0x5555561e8280) at /usr/include/c++/12/bits/std_thread.h:210
#18 0x00007ffff76d44a3 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#19 0x00007ffff74a8044 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#20 0x00007ffff75285fc in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
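The comment above notes that only the start of on_step is logged, so the log cannot show whether the crash happened inside it. One way to make that visible is to log both entry and exit. The sketch below illustrates the idea in C++ (the game's on_step is Lua, so this is an illustration rather than a patch); `ScopeLog` and `onStep` are hypothetical names.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical helper: writes "<label> begin" on construction and
// "<label> end" on destruction, flushing each line immediately.
class ScopeLog {
public:
	ScopeLog(std::ostream &out, std::string label) :
		m_out(out), m_label(std::move(label))
	{
		m_out << m_label << " begin" << std::endl;
	}

	~ScopeLog() { m_out << m_label << " end" << std::endl; }

	ScopeLog(const ScopeLog &) = delete;
	ScopeLog &operator=(const ScopeLog &) = delete;

private:
	std::ostream &m_out;
	std::string m_label;
};

// Hypothetical step function; the real work would go where the comment is.
void onStep(std::ostream &out, const std::string &mob_name)
{
	ScopeLog log(out, mob_name + " on_step");
	// ... per-step logic of the mob ...
}
```

With paired lines, a log ending in "begin" without a matching "end" points at the step itself, while a log ending in "end" suggests the crash happened elsewhere. That would fit the backtrace above, whose frames are in the inventory action handler. Flushing on every line matters here, because an abort does not flush buffered output.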

@sfan5
Member

sfan5 commented Sep 13, 2023

That backtrace matches the yet unsolved bug #11805.

@dontfwiththestream

Same issue here: a player digging bones, and the same error in the debugger.

2023-10-22 02:44:02: ACTION[Server]: Casstiel_ takes default:diamond 2 from bones at (-38,-1789,-34)
terminate called after throwing an instance of 'BaseException'
what(): InventoryList 'main' is currently in use and cannot be deleted or resized.
Aborted (core dumped)

@jeremyshannon
Copy link
Contributor

Yeah the bones mod has to be modified to not destroy the node until the inventory operation is finished. Or just stick to using 5.7.0 for the time being and wait to see how this bug gets dealt with.
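The workaround suggested above, not destroying the node until the inventory operation has finished, amounts to deferring the removal. Below is a minimal, language-agnostic sketch of that pattern in C++. `DeferredActions`, `BonesNode`, `removeIfEmpty` and `takeLastItem` are all hypothetical names, and the `in_use` flag is a simplified stand-in for the engine's resize lock. In a Lua mod, scheduling the removal with `minetest.after` might play the role of the queue, though that is an assumption and not the fix adopted in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical queue of actions to run after the current operation ends.
class DeferredActions {
public:
	void defer(std::function<void()> action) { m_pending.push(std::move(action)); }

	// Runs queued actions in order, including ones queued while running.
	void runAll()
	{
		while (!m_pending.empty()) {
			std::function<void()> action = std::move(m_pending.front());
			m_pending.pop();
			action();
		}
	}

	std::size_t size() const { return m_pending.size(); }

private:
	std::queue<std::function<void()>> m_pending;
};

// Hypothetical container node; in_use models "an inventory operation is ongoing".
struct BonesNode {
	std::vector<std::string> items;
	bool in_use = false;
	bool removed = false;
};

// Removes the node if it is empty. Returns false if removal was attempted
// while the node was still in use (where the engine would raise the error).
bool removeIfEmpty(BonesNode &node)
{
	if (!node.items.empty())
		return true; // nothing to do
	if (node.in_use)
		return false;
	node.removed = true;
	return true;
}

// Takes the last item and runs the "on take" logic either immediately
// (the problematic pattern) or deferred (the suggested pattern).
// The node must outlive any deferred action that refers to it.
bool takeLastItem(BonesNode &node, DeferredActions &later, bool defer_removal)
{
	assert(!node.in_use);
	if (node.items.empty())
		return true;
	node.in_use = true;
	node.items.pop_back();
	bool ok = true;
	if (defer_removal)
		later.defer([&node]() { removeIfEmpty(node); });
	else
		ok = removeIfEmpty(node);
	node.in_use = false;
	return ok;
}
```

In this sketch the immediate variant fails because the removal runs while the operation still holds the node. The deferred variant succeeds because `runAll()` is only called once the operation has released it.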

@dontfwiththestream
Copy link

If I could get 5.7.0 to compile I would do so but that's another story. Thanks!

@Desour
Member

Desour commented Oct 22, 2023

Reopening, according to #13894 (review). (cc @grorp)

Idk exactly what's left to do here.

@Desour Desour reopened this Oct 22, 2023
@grorp
Member

grorp commented Oct 22, 2023

If I understand the problem correctly, even though the mod in question won't crash anymore after #13894, it will still error. That's what I meant with "doesn't fully resolve the issue".

If that's intended/expected, I think this issue can be closed.

@jeremyshannon
Contributor

Yeah, is the mod supposed to break? Am I recoding our bones mod? Or is the regression part of this still open?

@Desour
Member

Desour commented Oct 22, 2023

Well, a very widely used mod (bones) still triggers a mod error:
https://github.com/minetest/minetest_game/blob/b58991d4f3d34449da670ee1948d414323fa89db/mods/bones/init.lua#L81
So some fixing is still necessary, before release.
