Crash when .lock files are present #2050

Open
cloudOver opened this issue May 20, 2024 · 4 comments
Labels
bug, community, Effort - Unknown, Frequency - Daily, Reach - VeryFew, Severity - S2

Comments

@cloudOver

Memgraph version
2.14.1

Environment

  • Google GKE, x86
  • docker image memgraph:2.14.1
  • /var/lib/memgraph and /var/log/memgraph mounted via NFS.

Describe the bug
After several restarts during a node pool update, Memgraph did not delete the lock files in the /var/lib/memgraph directory and its subdirectories. The process was probably killed by the host operating system during the node restart. After starting the pod again, the main process in the container crashes with SIGSEGV and without any error message.

Manually removing all lock files solves the issue, but it is difficult to diagnose the problem with only the SIGSEGV to go on.

To Reproduce
Steps to reproduce the behavior:

  1. Kill a running Memgraph instance.
  2. Make sure that the .lock file is present in /var/lib/memgraph and that LOCK files are present in the subdirectories (auth, settings, streams and triggers); a sketch for staging this precondition follows the list.
  3. Start the Memgraph instance again with the lock files still present.
  4. Memgraph crashes with SIGSEGV (Kubernetes reports exit code 139). The host kernel's dmesg shows only segmentation fault info. There is no console log about the present lock files. After removing just the .lock file, Memgraph starts reporting an error about file permissions.
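
For convenience, here is a minimal sketch that stages the stale lock files described in step 2. The /var/lib/memgraph path and the subdirectory names are taken from this report and may differ in other deployments:

from pathlib import Path

# Layout assumed from this report: a top-level .lock file plus LOCK files
# in the auth, settings, streams and triggers subdirectories.
DATA_DIR = Path("/var/lib/memgraph")
SUBDIRS = ["auth", "settings", "streams", "triggers"]

def stage_stale_locks(data_dir=DATA_DIR):
    """Create the leftover lock files that a killed Memgraph process would leave behind."""
    data_dir.mkdir(parents=True, exist_ok=True)
    (data_dir / ".lock").touch()
    for name in SUBDIRS:
        subdir = data_dir / name
        subdir.mkdir(parents=True, exist_ok=True)
        (subdir / "LOCK").touch()

if __name__ == "__main__":
    stage_stale_locks()
    print("Stale lock files staged; start the Memgraph container to reproduce the crash.")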

This problem occurred only on the Kubernetes cluster with /var/lib/memgraph on NFS. With the same image and data, but running directly via Docker, the problem does not occur and Memgraph starts even with the lock files present.

Expected behavior
When .lock files are present, Memgraph should exit gracefully and report the existing lock files instead of crashing with SIGSEGV.
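
To illustrate the expected behavior, here is a minimal sketch (an assumption, not Memgraph's actual startup code) of a pre-flight check that could run before the server starts, for example in an init container, and fail with a clear message instead of letting the process hit SIGSEGV:

import sys
from pathlib import Path

# Path assumed from this report; adjust for other deployments.
DATA_DIR = Path("/var/lib/memgraph")

def find_stale_locks(data_dir=DATA_DIR):
    """Return any leftover lock files from a previous, non-graceful shutdown."""
    locks = [data_dir / ".lock"] if (data_dir / ".lock").exists() else []
    locks += sorted(data_dir.glob("*/LOCK"))
    return locks

if __name__ == "__main__":
    stale = find_stale_locks()
    if stale:
        print("Refusing to start: stale lock files found from a previous run:", file=sys.stderr)
        for path in stale:
            print("  " + str(path), file=sys.stderr)
        sys.exit(1)
    print("No stale lock files found.")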

Logs
The dmesg log looks like:
INFO TIMESTAMPZ [resource.labels.instanceId: xyzxyzxyz] traps: memgraph[pid] general protection fault ip:7a40b7050602 sp:7ffc98f66630 error:0 in libc-2.31.so[7a40b7050000+159000]

Verification Environment

  • memgraph/memgraph:2.14.1
  • x86
  • GKE cluster running on CoreOS nodes
@cloudOver added the bug label May 20, 2024
@matea16 added the community, Effort - Unknown, Severity - S2, Frequency - Daily, Reach - VeryFew labels May 20, 2024
@matea16

matea16 commented May 20, 2024

Hi @cloudOver, thank you for reporting this issue. Could you try updating Memgraph to the latest version and see if the issue still persists?

@cloudOver
Author

Hi,
I've just checked, and the Docker image with the latest tag also crashes without warning. If anything is wrong with file access, it results in SIGSEGV. The easiest way to reproduce it is to change the ownership of the .lock or log files in /var/lib/memgraph.

I've also noticed that the uid was changed in the latest version to 100, from 101 in 2.14. With wrong file permissions it does not produce any log message about the issue.
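
As an illustration only (a hypothetical check, not Memgraph's actual code), a startup-time ownership and writability check along these lines would surface the misconfiguration described above instead of a silent SIGSEGV; the uid 100 and the data path are taken from this thread:

import os
import sys
from pathlib import Path

# Values taken from this thread; they are assumptions, not official constants.
EXPECTED_UID = 100          # uid of the memgraph user in the latest image (101 in 2.14)
DATA_DIR = Path("/var/lib/memgraph")

def check_data_dir(data_dir=DATA_DIR, expected_uid=EXPECTED_UID):
    """Return human-readable problems with ownership or write access."""
    problems = []
    for path in [data_dir, *data_dir.rglob("*")]:
        st = path.stat()
        if st.st_uid != expected_uid:
            problems.append(f"{path} is owned by uid {st.st_uid}, expected {expected_uid}")
        if not os.access(path, os.W_OK):
            problems.append(f"{path} is not writable by the current user")
    return problems

if __name__ == "__main__":
    issues = check_data_dir()
    if issues:
        print("Data directory misconfiguration detected:", file=sys.stderr)
        for line in issues:
            print("  " + line, file=sys.stderr)
        sys.exit(1)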

@matea16

matea16 commented May 22, 2024

Thank you for checking @cloudOver. I've labeled your issue and someone from the team will take a look at it asap. Is this a blocker for you currently?

@cloudOver
Author

Hi, no, we've already fixed that particular failure by removing the files mentioned above, and Memgraph is now running. We're just trying not to restart it unless necessary.
