Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Segfault occuring #201

Closed
NICHOLAS85 opened this issue Feb 24, 2021 · 36 comments
Closed

[BUG] Segfault occuring #201

NICHOLAS85 opened this issue Feb 24, 2021 · 36 comments

Comments

@NICHOLAS85
Copy link
Contributor

NOTE: Please try to upgrade Clight if not running on latest release, before reporting a bug!

Clight version the issue has been seen with:

6bf33b8

Used distribution:

Bedrock Linux 0.7.20beta2 Poki, Debian Testing stratum

Describe the bug
It seems Clight is segfaulting at times, I haven't attempted to reproduce but will once I upload these logs

Expected behavior
Not to segfault

To Reproduce
TBD

clight.log

@NICHOLAS85
Copy link
Contributor Author

Looking at the error message it seems related to the fix implemented in 76917a1 for the high cpu bug, seems to be crashing Clight.

@FedeDP
Copy link
Owner

FedeDP commented Feb 24, 2021

Hi! That's a great news!
Sorry i was not able to test the fix because well...it is quite uncommon. I've got my laptop up since sunday but still nothing :)
Can you check if coredumpctl registered the crash? coredumpctl dump.

Btw I will give a look at this, thank you!

@NICHOLAS85
Copy link
Contributor Author

Thanks for continuing to pursue this issue, hope we can finally solve it soon. Running # coredumpctl dump returns No match found. Running # coredumpctl returns No coredumps found.

@NICHOLAS85
Copy link
Contributor Author

I was able to strace clight up until it crashed recently. Heres is the end of that log:

strace
14:28:38 write(6, "(D)[14:28:38]{timer.c:32}\tSet timeout of 60s 0ns on fd 47.\n", 59) = 59
14:28:38 write(1, "Set timeout of 60s 0ns on fd 47.\n", 33Set timeout of 60s 0ns on fd 47.
) = 33
14:28:38 epoll_wait(5, [{EPOLLIN, {u32=4069835840, u64=94248537211968}}], 64, -1) = 1
14:29:38 read(47, "\1\0\0\0\0\0\0\0", 8) = 8
14:29:38 sendmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\1\0\1'\0\0\0&\0\0\0\230\0\0\0\1\1o\0\33\0\0\0/org/clightd/clightd/Screen\0\0\0\0\0\3\1s\0\24\0\0\0GetEmittedBrightness\0\0\0\0\2\1s\0\32\0\0\0org.clightd.clightd.Screen\0\0\0\0\0\0\6\1s\0\23\0\0\0org.clightd.clightd\0\0\0\0\0\10\1g\0\2ss\0", iov_len=168}, {iov_base="\2\0\0\0:0\0\0\32\0\0\0/home/nicholas/.Xauthority\0", iov_len=39}], msg_iovlen=2, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 207
14:29:38 recvmsg(3, {msg_namelen=0}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
14:29:38 ppoll([{fd=3, events=POLLIN}], 1, {tv_sec=24, tv_nsec=999383000}, NULL, 8) = 1 ([{fd=3, revents=POLLIN}], left {tv_sec=24, tv_nsec=974497544})
14:29:38 recvmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\2\1\1\10\0\0\0\326\1\0\0-\0\0\0\5\1u\0&\0\0\0", iov_len=24}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 24
14:29:38 recvmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\6\1s\0\6\0\0\0:1.467\0\0\10\1g\0\1d\0\0\7\1s\0\4\0\0\0:1.1\0\0\0\0\25\25\25\25\25\25\265?", iov_len=48}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 48
14:29:38 timerfd_settime(47, 0, {it_interval={tv_sec=0, tv_nsec=0}, it_value={tv_sec=60, tv_nsec=0}}, NULL) = 0
14:29:38 stat("/etc/localtime", {st_dev=makedev(0, 0x1e), st_ino=4912, st_mode=S_IFREG|0444, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=3536, st_atime=1614124962 /* 2021-02-23T19:02:42.815182319-0500 */, st_atime_nsec=815182319, st_mtime=1611525923 /* 2021-01-24T17:05:23-0500 */, st_mtime_nsec=0, st_ctime=1612132600 /* 2021-01-31T17:36:40.006005453-0500 */, st_ctime_nsec=6005453}) = 0
14:29:38 write(6, "(D)[14:29:38]{timer.c:32}\tSet timeout of 60s 0ns on fd 47.\n", 59) = 59
14:29:38 write(1, "Set timeout of 60s 0ns on fd 47.\n", 33Set timeout of 60s 0ns on fd 47.
) = 33
14:29:38 epoll_wait(5, [{EPOLLIN, {u32=4069835840, u64=94248537211968}}], 64, -1) = 1
14:30:38 read(47, "\1\0\0\0\0\0\0\0", 8) = 8
14:30:38 sendmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\1\0\1'\0\0\0'\0\0\0\230\0\0\0\1\1o\0\33\0\0\0/org/clightd/clightd/Screen\0\0\0\0\0\3\1s\0\24\0\0\0GetEmittedBrightness\0\0\0\0\2\1s\0\32\0\0\0org.clightd.clightd.Screen\0\0\0\0\0\0\6\1s\0\23\0\0\0org.clightd.clightd\0\0\0\0\0\10\1g\0\2ss\0", iov_len=168}, {iov_base="\2\0\0\0:0\0\0\32\0\0\0/home/nicholas/.Xauthority\0", iov_len=39}], msg_iovlen=2, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 207
14:30:38 recvmsg(3, {msg_namelen=0}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
14:30:38 ppoll([{fd=3, events=POLLIN}], 1, {tv_sec=24, tv_nsec=999670000}, NULL, 8) = 1 ([{fd=3, revents=POLLIN}], left {tv_sec=24, tv_nsec=974689893})
14:30:38 recvmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\2\1\1\10\0\0\0\330\1\0\0-\0\0\0\5\1u\0'\0\0\0", iov_len=24}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 24
14:30:38 recvmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\6\1s\0\6\0\0\0:1.467\0\0\10\1g\0\1d\0\0\7\1s\0\4\0\0\0:1.1\0\0\0\0\25\25\25\25\25\25\265?", iov_len=48}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 48
14:30:38 timerfd_settime(47, 0, {it_interval={tv_sec=0, tv_nsec=0}, it_value={tv_sec=60, tv_nsec=0}}, NULL) = 0
14:30:38 stat("/etc/localtime", {st_dev=makedev(0, 0x1e), st_ino=4912, st_mode=S_IFREG|0444, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=3536, st_atime=1614124962 /* 2021-02-23T19:02:42.815182319-0500 */, st_atime_nsec=815182319, st_mtime=1611525923 /* 2021-01-24T17:05:23-0500 */, st_mtime_nsec=0, st_ctime=1612132600 /* 2021-01-31T17:36:40.006005453-0500 */, st_ctime_nsec=6005453}) = 0
14:30:38 write(6, "(D)[14:30:38]{timer.c:32}\tSet timeout of 60s 0ns on fd 47.\n", 59) = 59
14:30:38 write(1, "Set timeout of 60s 0ns on fd 47.\n", 33Set timeout of 60s 0ns on fd 47.
) = 33
14:30:38 epoll_wait(5, [{EPOLLIN|EPOLLHUP, {u32=4069441360, u64=94248536817488}}], 64, -1) = 1
14:31:38 close(40)                      = 0
14:31:38 stat("/etc/localtime", {st_dev=makedev(0, 0x1e), st_ino=4912, st_mode=S_IFREG|0444, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=3536, st_atime=1614124962 /* 2021-02-23T19:02:42.815182319-0500 */, st_atime_nsec=815182319, st_mtime=1611525923 /* 2021-01-24T17:05:23-0500 */, st_mtime_nsec=0, st_ctime=1612132600 /* 2021-01-31T17:36:40.006005453-0500 */, st_ctime_nsec=6005453}) = 0
14:31:38 write(6, "(W)[14:31:38]{interface.c:443}\tConnection reset by peer\n", 56) = 56
14:31:38 write(1, "Connection reset by peer\n", 25Connection reset by peer
) = 25
14:31:38 epoll_ctl(5, EPOLL_CTL_DEL, 41, 0x55b7f28ebf80) = 0
14:31:38 close(41)                      = 0
14:31:38 writev(2, [{iov_base="corrupted double-linked list", iov_len=28}, {iov_base="\n", iov_len=1}], 2corrupted double-linked list
) = 29
14:31:38 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f313b969000
14:31:38 rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
14:31:38 rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [HUP INT TERM], 8) = 0
14:31:38 getpid()                       = 67978
14:31:38 gettid()                       = 67978
14:31:38 tgkill(67978, 67978, SIGABRT)  = 0
14:31:38 rt_sigprocmask(SIG_SETMASK, [HUP INT TERM], NULL, 8) = 0
14:31:38 --- SIGABRT {si_signo=SIGABRT, si_code=SI_TKILL, si_pid=67978, si_uid=1000} ---
14:31:38 +++ killed by SIGABRT +++
zsh: abort      strace -tvf -s999 clight

@FedeDP
Copy link
Owner

FedeDP commented Feb 24, 2021

Thank you very much!
So, are you able to reproduce it quite often?
If yes, would you be so kind do a couple of tests for me?
Can you try commenting L 445 and 446 of interface module? (https://github.com/FedeDP/Clight/blob/master/src/modules/interface.c#L445)
I am pretty sure from your strace output that it is passing the m_deregister_fd line, then it is crashing somewhere between those 2 lines.
Thank you very much!

@NICHOLAS85
Copy link
Contributor Author

Sure, I've rebuilt clight commenting out these two lines https://github.com/FedeDP/Clight/blob/master/src/modules/interface.c#L445-L446. I'll see if I can reproduce it.

@FedeDP
Copy link
Owner

FedeDP commented Feb 24, 2021

Thank you man!
The next step will obviously be to comment just one of them :D
Debugging by commenting you know...

@NICHOLAS85
Copy link
Contributor Author

Haha yea the most common way of debugging. Here's the strace it seems it still crashed with those lines commented:

strace
14:52:51 write(6, "(D)[14:52:51]{timer.c:32}\tSet timeout of 60s 0ns on fd 47.\n", 59) = 59
14:52:51 write(1, "Set timeout of 60s 0ns on fd 47.\n", 33Set timeout of 60s 0ns on fd 47.
) = 33
14:52:51 epoll_wait(5, [{EPOLLIN, {u32=3381773376, u64=94608626402368}}], 64, -1) = 1
14:53:51 read(47, "\1\0\0\0\0\0\0\0", 8) = 8
14:53:51 sendmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\1\0\1'\0\0\0&\0\0\0\230\0\0\0\1\1o\0\33\0\0\0/org/clightd/clightd/Screen\0\0\0\0\0\3\1s\0\24\0\0\0GetEmittedBrightness\0\0\0\0\2\1s\0\32\0\0\0org.clightd.clightd.Screen\0\0\0\0\0\0\6\1s\0\23\0\0\0org.clightd.clightd\0\0\0\0\0\10\1g\0\2ss\0", iov_len=168}, {iov_base="\2\0\0\0:0\0\0\32\0\0\0/home/nicholas/.Xauthority\0", iov_len=39}], msg_iovlen=2, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 207
14:53:51 recvmsg(3, {msg_namelen=0}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
14:53:51 ppoll([{fd=3, events=POLLIN}], 1, {tv_sec=24, tv_nsec=999794000}, NULL, 8) = 1 ([{fd=3, revents=POLLIN}], left {tv_sec=24, tv_nsec=973352028})
14:53:51 recvmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\2\1\1\10\0\0\0\6\2\0\0-\0\0\0\5\1u\0&\0\0\0", iov_len=24}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 24
14:53:51 recvmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\6\1s\0\6\0\0\0:1.485\0\0\10\1g\0\1d\0\0\7\1s\0\4\0\0\0:1.1\0\0\0\0\25\25\25\25\25\25\265?", iov_len=48}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 48
14:53:51 timerfd_settime(47, 0, {it_interval={tv_sec=0, tv_nsec=0}, it_value={tv_sec=60, tv_nsec=0}}, NULL) = 0
14:53:51 stat("/etc/localtime", {st_dev=makedev(0, 0x1e), st_ino=4912, st_mode=S_IFREG|0444, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=3536, st_atime=1614124962 /* 2021-02-23T19:02:42.815182319-0500 */, st_atime_nsec=815182319, st_mtime=1611525923 /* 2021-01-24T17:05:23-0500 */, st_mtime_nsec=0, st_ctime=1612132600 /* 2021-01-31T17:36:40.006005453-0500 */, st_ctime_nsec=6005453}) = 0
14:53:51 write(6, "(D)[14:53:51]{timer.c:32}\tSet timeout of 60s 0ns on fd 47.\n", 59) = 59
14:53:51 write(1, "Set timeout of 60s 0ns on fd 47.\n", 33Set timeout of 60s 0ns on fd 47.
) = 33
14:53:51 epoll_wait(5, [{EPOLLIN, {u32=3381773376, u64=94608626402368}}], 64, -1) = 1
14:54:51 read(47, "\1\0\0\0\0\0\0\0", 8) = 8
14:54:51 sendmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\1\0\1'\0\0\0'\0\0\0\230\0\0\0\1\1o\0\33\0\0\0/org/clightd/clightd/Screen\0\0\0\0\0\3\1s\0\24\0\0\0GetEmittedBrightness\0\0\0\0\2\1s\0\32\0\0\0org.clightd.clightd.Screen\0\0\0\0\0\0\6\1s\0\23\0\0\0org.clightd.clightd\0\0\0\0\0\10\1g\0\2ss\0", iov_len=168}, {iov_base="\2\0\0\0:0\0\0\32\0\0\0/home/nicholas/.Xauthority\0", iov_len=39}], msg_iovlen=2, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 207
14:54:51 recvmsg(3, {msg_namelen=0}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
14:54:51 ppoll([{fd=3, events=POLLIN}], 1, {tv_sec=24, tv_nsec=999670000}, NULL, 8) = 1 ([{fd=3, revents=POLLIN}], left {tv_sec=24, tv_nsec=974577028})
14:54:51 recvmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\2\1\1\10\0\0\0\10\2\0\0-\0\0\0\5\1u\0'\0\0\0", iov_len=24}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 24
14:54:51 recvmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\6\1s\0\6\0\0\0:1.485\0\0\10\1g\0\1d\0\0\7\1s\0\4\0\0\0:1.1\0\0\0\0\25\25\25\25\25\25\265?", iov_len=48}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 48
14:54:51 timerfd_settime(47, 0, {it_interval={tv_sec=0, tv_nsec=0}, it_value={tv_sec=60, tv_nsec=0}}, NULL) = 0
14:54:51 stat("/etc/localtime", {st_dev=makedev(0, 0x1e), st_ino=4912, st_mode=S_IFREG|0444, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=3536, st_atime=1614124962 /* 2021-02-23T19:02:42.815182319-0500 */, st_atime_nsec=815182319, st_mtime=1611525923 /* 2021-01-24T17:05:23-0500 */, st_mtime_nsec=0, st_ctime=1612132600 /* 2021-01-31T17:36:40.006005453-0500 */, st_ctime_nsec=6005453}) = 0
14:54:51 write(6, "(D)[14:54:51]{timer.c:32}\tSet timeout of 60s 0ns on fd 47.\n", 59) = 59
14:54:51 write(1, "Set timeout of 60s 0ns on fd 47.\n", 33Set timeout of 60s 0ns on fd 47.
) = 33
14:54:51 epoll_wait(5, [{EPOLLIN|EPOLLHUP, {u32=3381378896, u64=94608626007888}}], 64, -1) = 1
14:55:51 close(40)                      = 0
14:55:51 stat("/etc/localtime", {st_dev=makedev(0, 0x1e), st_ino=4912, st_mode=S_IFREG|0444, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=3536, st_atime=1614124962 /* 2021-02-23T19:02:42.815182319-0500 */, st_atime_nsec=815182319, st_mtime=1611525923 /* 2021-01-24T17:05:23-0500 */, st_mtime_nsec=0, st_ctime=1612132600 /* 2021-01-31T17:36:40.006005453-0500 */, st_ctime_nsec=6005453}) = 0
14:55:51 write(6, "(W)[14:55:51]{interface.c:443}\tConnection reset by peer\n", 56) = 56
14:55:51 write(1, "Connection reset by peer\n", 25Connection reset by peer
) = 25
14:55:51 epoll_ctl(5, EPOLL_CTL_DEL, 41, 0x560bc98bbf80) = 0
14:55:51 close(41)                      = 0
14:55:51 epoll_wait(5, [{EPOLLIN, {u32=3381773376, u64=94608626402368}}], 64, -1) = 1
14:55:51 read(47, "\1\0\0\0\0\0\0\0", 8) = 8
14:55:51 writev(2, [{iov_base="malloc(): unsorted double linked list corrupted", iov_len=47}, {iov_base="\n", iov_len=1}], 2malloc(): unsorted double linked list corrupted
) = 48
14:55:51 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f372a243000
14:55:51 rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
14:55:51 rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [HUP INT TERM], 8) = 0
14:55:51 getpid()                       = 72848
14:55:51 gettid()                       = 72848
14:55:51 tgkill(72848, 72848, SIGABRT)  = 0
14:55:51 rt_sigprocmask(SIG_SETMASK, [HUP INT TERM], NULL, 8) = 0
14:55:51 --- SIGABRT {si_signo=SIGABRT, si_code=SI_TKILL, si_pid=72848, si_uid=1000} ---
14:55:51 +++ killed by SIGABRT +++
zsh: abort      strace -tvf -s999 clight
It's weird how I am able to reproduce this within minutes whereas before it would only happen once in months.

@FedeDP
Copy link
Owner

FedeDP commented Feb 24, 2021

Can you comment m_deregister_fd too?

It's weird how I am able to reproduce this within minutes whereas before it would only happen once in months.

I was thinking the same...it's super weird; while on my laptop still nothing after 3-4d! What are you doing, if i can ask? Ie: watching a movie, playing some games... maybe there is a way to increase this bug frequency!

@NICHOLAS85
Copy link
Contributor Author

Rebuilt with that line commented, running it now. I am not doing anything out of the ordinary. I've got firefox open with a couple of tabs but no media playing. Two terminals open, one with the strace -tvf -s999 clight and one viewing tail -f ~/.local/share/clight/clight.log. After that point, I've just been leaving the laptop alone.

@NICHOLAS85
Copy link
Contributor Author

New log, spammed Connection reset by peer so I truncated it with where it started and where it ended:

strace
15:03:42 write(6, "(D)[15:03:42]{interface.c:455}\tEmitting 'BlPct' property\n", 57) = 57
15:03:42 write(1, "Emitting 'BlPct' property\n", 26Emitting 'BlPct' property
) = 26
15:03:42 sendmsg(4, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\4\1\1<\0\0\0\35\0\0\0v\0\0\0\1\1o\0\22\0\0\0/org/clight/clight\0\0\0\0\0\0\2\1s\0\37\0\0\0org.freedesktop.DBus.Properties\0\3\1s\0\21\0\0\0PropertiesChanged\0\0\0\0\0\0\0\10\1g\0\10sa{sv}as\0\0\0", iov_len=136}, {iov_base="\21\0\0\0org.clight.clight\0\0\0\30\0\0\0\0\0\0\0\5\0\0\0BlPct\0\1d\0\0\0\0ffffff\356?\0\0\0\0", iov_len=60}], msg_iovlen=2, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 196
15:03:42 epoll_wait(5, [{EPOLLIN, {u32=4281619520, u64=94029705663552}}], 64, -1) = 1
15:04:21 read(47, "\1\0\0\0\0\0\0\0", 8) = 8
15:04:21 sendmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\1\0\1'\0\0\0%\0\0\0\230\0\0\0\1\1o\0\33\0\0\0/org/clightd/clightd/Screen\0\0\0\0\0\3\1s\0\24\0\0\0GetEmittedBrightness\0\0\0\0\2\1s\0\32\0\0\0org.clightd.clightd.Screen\0\0\0\0\0\0\6\1s\0\23\0\0\0org.clightd.clightd\0\0\0\0\0\10\1g\0\2ss\0", iov_len=168}, {iov_base="\2\0\0\0:0\0\0\32\0\0\0/home/nicholas/.Xauthority\0", iov_len=39}], msg_iovlen=2, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 207
15:04:21 recvmsg(3, {msg_namelen=0}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
15:04:21 ppoll([{fd=3, events=POLLIN}], 1, {tv_sec=24, tv_nsec=999495000}, NULL, 8) = 1 ([{fd=3, revents=POLLIN}], left {tv_sec=24, tv_nsec=968936087})
15:04:21 recvmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\2\1\1\10\0\0\0001\2\0\0-\0\0\0\5\1u\0%\0\0\0", iov_len=24}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 24
15:04:21 recvmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\6\1s\0\6\0\0\0:1.496\0\0\10\1g\0\1d\0\0\7\1s\0\4\0\0\0:1.1\0\0\0\0\25\25\25\25\25\25\265?", iov_len=48}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 48
15:04:21 timerfd_settime(47, 0, {it_interval={tv_sec=0, tv_nsec=0}, it_value={tv_sec=60, tv_nsec=0}}, NULL) = 0
15:04:21 stat("/etc/localtime", {st_dev=makedev(0, 0x1e), st_ino=4912, st_mode=S_IFREG|0444, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=3536, st_atime=1614124962 /* 2021-02-23T19:02:42.815182319-0500 */, st_atime_nsec=815182319, st_mtime=1611525923 /* 2021-01-24T17:05:23-0500 */, st_mtime_nsec=0, st_ctime=1612132600 /* 2021-01-31T17:36:40.006005453-0500 */, st_ctime_nsec=6005453}) = 0
15:04:21 write(6, "(D)[15:04:21]{timer.c:32}\tSet timeout of 60s 0ns on fd 47.\n", 59) = 59
15:04:21 write(1, "Set timeout of 60s 0ns on fd 47.\n", 33Set timeout of 60s 0ns on fd 47.
) = 33
15:04:21 epoll_wait(5, [{EPOLLIN, {u32=4281619520, u64=94029705663552}}], 64, -1) = 1
15:05:21 read(47, "\1\0\0\0\0\0\0\0", 8) = 8
15:05:21 sendmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\1\0\1'\0\0\0&\0\0\0\230\0\0\0\1\1o\0\33\0\0\0/org/clightd/clightd/Screen\0\0\0\0\0\3\1s\0\24\0\0\0GetEmittedBrightness\0\0\0\0\2\1s\0\32\0\0\0org.clightd.clightd.Screen\0\0\0\0\0\0\6\1s\0\23\0\0\0org.clightd.clightd\0\0\0\0\0\10\1g\0\2ss\0", iov_len=168}, {iov_base="\2\0\0\0:0\0\0\32\0\0\0/home/nicholas/.Xauthority\0", iov_len=39}], msg_iovlen=2, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 207
15:05:21 recvmsg(3, {msg_namelen=0}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
15:05:21 ppoll([{fd=3, events=POLLIN}], 1, {tv_sec=24, tv_nsec=999725000}, NULL, 8) = 1 ([{fd=3, revents=POLLIN}], left {tv_sec=24, tv_nsec=971854106})
15:05:21 recvmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\2\1\1\10\0\0\0002\2\0\0-\0\0\0\5\1u\0&\0\0\0", iov_len=24}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 24
15:05:21 recvmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\6\1s\0\6\0\0\0:1.496\0\0\10\1g\0\1d\0\0\7\1s\0\4\0\0\0:1.1\0\0\0\0\25\25\25\25\25\25\265?", iov_len=48}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 48
15:05:21 timerfd_settime(47, 0, {it_interval={tv_sec=0, tv_nsec=0}, it_value={tv_sec=60, tv_nsec=0}}, NULL) = 0
15:05:21 stat("/etc/localtime", {st_dev=makedev(0, 0x1e), st_ino=4912, st_mode=S_IFREG|0444, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=3536, st_atime=1614124962 /* 2021-02-23T19:02:42.815182319-0500 */, st_atime_nsec=815182319, st_mtime=1611525923 /* 2021-01-24T17:05:23-0500 */, st_mtime_nsec=0, st_ctime=1612132600 /* 2021-01-31T17:36:40.006005453-0500 */, st_ctime_nsec=6005453}) = 0
15:05:21 write(6, "(D)[15:05:21]{timer.c:32}\tSet timeout of 60s 0ns on fd 47.\n", 59) = 59
15:05:21 write(1, "Set timeout of 60s 0ns on fd 47.\n", 33Set timeout of 60s 0ns on fd 47.
) = 33
15:05:21 epoll_wait(5, [{EPOLLIN, {u32=4281619520, u64=94029705663552}}], 64, -1) = 1
15:06:21 read(47, "\1\0\0\0\0\0\0\0", 8) = 8
15:06:21 sendmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\1\0\1'\0\0\0'\0\0\0\230\0\0\0\1\1o\0\33\0\0\0/org/clightd/clightd/Screen\0\0\0\0\0\3\1s\0\24\0\0\0GetEmittedBrightness\0\0\0\0\2\1s\0\32\0\0\0org.clightd.clightd.Screen\0\0\0\0\0\0\6\1s\0\23\0\0\0org.clightd.clightd\0\0\0\0\0\10\1g\0\2ss\0", iov_len=168}, {iov_base="\2\0\0\0:0\0\0\32\0\0\0/home/nicholas/.Xauthority\0", iov_len=39}], msg_iovlen=2, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 207
15:06:21 recvmsg(3, {msg_namelen=0}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
15:06:21 ppoll([{fd=3, events=POLLIN}], 1, {tv_sec=24, tv_nsec=999816000}, NULL, 8) = 1 ([{fd=3, revents=POLLIN}], left {tv_sec=24, tv_nsec=974597855})
15:06:21 recvmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\2\1\1\10\0\0\0003\2\0\0-\0\0\0\5\1u\0'\0\0\0", iov_len=24}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 24
15:06:21 recvmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\6\1s\0\6\0\0\0:1.496\0\0\10\1g\0\1d\0\0\7\1s\0\4\0\0\0:1.1\0\0\0\0\25\25\25\25\25\25\265?", iov_len=48}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 48
15:06:21 timerfd_settime(47, 0, {it_interval={tv_sec=0, tv_nsec=0}, it_value={tv_sec=60, tv_nsec=0}}, NULL) = 0
15:06:21 stat("/etc/localtime", {st_dev=makedev(0, 0x1e), st_ino=4912, st_mode=S_IFREG|0444, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=3536, st_atime=1614124962 /* 2021-02-23T19:02:42.815182319-0500 */, st_atime_nsec=815182319, st_mtime=1611525923 /* 2021-01-24T17:05:23-0500 */, st_mtime_nsec=0, st_ctime=1612132600 /* 2021-01-31T17:36:40.006005453-0500 */, st_ctime_nsec=6005453}) = 0
15:06:21 write(6, "(D)[15:06:21]{timer.c:32}\tSet timeout of 60s 0ns on fd 47.\n", 59) = 59
15:06:21 write(1, "Set timeout of 60s 0ns on fd 47.\n", 33Set timeout of 60s 0ns on fd 47.
) = 33
15:06:21 epoll_wait(5, [{EPOLLIN|EPOLLHUP, {u32=4281225040, u64=94029705269072}}], 64, -1) = 1
15:07:21 close(40)                      = 0
15:07:21 stat("/etc/localtime", {st_dev=makedev(0, 0x1e), st_ino=4912, st_mode=S_IFREG|0444, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=3536, st_atime=1614124962 /* 2021-02-23T19:02:42.815182319-0500 */, st_atime_nsec=815182319, st_mtime=1611525923 /* 2021-01-24T17:05:23-0500 */, st_mtime_nsec=0, st_ctime=1612132600 /* 2021-01-31T17:36:40.006005453-0500 */, st_ctime_nsec=6005453}) = 0
15:07:21 write(6, "(W)[15:07:21]{interface.c:443}\tConnection reset by peer\n", 56) = 56
15:07:21 write(1, "Connection reset by peer\n", 25Connection reset by peer
) = 25
15:07:21 epoll_wait(5, [{EPOLLIN|EPOLLHUP, {u32=4281225040, u64=94029705269072}}], 64, -1) = 1
15:07:21 stat("/etc/localtime", {st_dev=makedev(0, 0x1e), st_ino=4912, st_mode=S_IFREG|0444, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=3536, st_atime=1614124962 /* 2021-02-23T19:02:42.815182319-0500 */, st_atime_nsec=815182319, st_mtime=1611525923 /* 2021-01-24T17:05:23-0500 */, st_mtime_nsec=0, st_ctime=1612132600 /* 2021-01-31T17:36:40.006005453-0500 */, st_ctime_nsec=6005453}) = 0
15:07:21 write(6, "(W)[15:07:21]{interface.c:443}\tConnection reset by peer\n", 56) = 56
15:07:21 write(1, "Connection reset by peer\n", 25Connection reset by peer
) = 25

...

15:07:21 epoll_wait(5, [{EPOLLIN|EPOLLHUP, {u32=4281225040, u64=94029705269072}}], 64, -1) = 1
15:07:21 stat("/etc/localtime", {st_dev=makedev(0, 0x1e), st_ino=4912, st_mode=S_IFREG|0444, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=3536, st_atime=1614124962 /* 2021-02-23T19:02:42.815182319-0500 */, st_atime_nsec=815182319, st_mtime=1611525923 /* 2021-01-24T17:05:23-0500 */, st_mtime_nsec=0, st_ctime=1612132600 /* 2021-01-31T17:36:40.006005453-0500 */, st_ctime_nsec=6005453}) = 0
15:07:21 write(6, "(W)[15:07:21]{interface.c:443}\tConnection reset by peer\n", 56) = 56
15:07:21 write(1, "Connection reset by peer\n", 25Connection reset by peer
) = 25
15:07:21 epoll_wait(5, [{EPOLLIN|EPOLLHUP, {u32=4281225040, u64=94029705269072}}, {EPOLLIN, {u32=4281619520, u64=94029705663552}}], 64, -1) = 2
15:07:21 stat("/etc/localtime", {st_dev=makedev(0, 0x1e), st_ino=4912, st_mode=S_IFREG|0444, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=3536, st_atime=1614124962 /* 2021-02-23T19:02:42.815182319-0500 */, st_atime_nsec=815182319, st_mtime=1611525923 /* 2021-01-24T17:05:23-0500 */, st_mtime_nsec=0, st_ctime=1612132600 /* 2021-01-31T17:36:40.006005453-0500 */, st_ctime_nsec=6005453}) = 0
15:07:21 write(6, "(W)[15:07:21]{interface.c:443}\tConnection reset by peer\n", 56) = 56
15:07:21 write(1, "Connection reset by peer\n", 25Connection reset by peer
) = 25
15:07:21 read(47, "\1\0\0\0\0\0\0\0", 8) = 8
15:07:21 sendmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\1\0\1'\0\0\0(\0\0\0\230\0\0\0\1\1o\0\33\0\0\0/org/clightd/clightd/Screen\0\0\0\0\0\3\1s\0\24\0\0\0GetEmittedBrightness\0\0\0\0\2\1s\0\32\0\0\0org.clightd.clightd.Screen\0\0\0\0\0\0\6\1s\0\23\0\0\0org.clightd.clightd\0\0\0\0\0\10\1g\0\2ss\0", iov_len=168}, {iov_base="\2\0\0\0:0\0\0\32\0\0\0/home/nicholas/.Xauthority\0", iov_len=39}], msg_iovlen=2, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 207
15:07:21 recvmsg(3, {msg_namelen=0}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
15:07:21 ppoll([{fd=3, events=POLLIN}], 1, {tv_sec=24, tv_nsec=999801000}, NULL, 8) = 1 ([{fd=3, revents=POLLIN}], left {tv_sec=24, tv_nsec=978045662})
15:07:21 recvmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\2\1\1\10\0\0\0004\2\0\0-\0\0\0\5\1u\0(\0\0\0", iov_len=24}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 24
15:07:21 recvmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\6\1s\0\6\0\0\0:1.496\0\0\10\1g\0\1d\0\0\7\1s\0\4\0\0\0:1.1\0\0\0\0\25\25\25\25\25\25\265?", iov_len=48}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 48
15:07:21 writev(2, [{iov_base="malloc(): unsorted double linked list corrupted", iov_len=47}, {iov_base="\n", iov_len=1}], 2malloc(): unsorted double linked list corrupted
) = 48
15:07:21 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4b00e9e000
15:07:21 rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
15:07:21 rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [HUP INT TERM], 8) = 0
15:07:21 getpid()                       = 74645
15:07:21 gettid()                       = 74645
15:07:21 tgkill(74645, 74645, SIGABRT)  = 0
15:07:21 rt_sigprocmask(SIG_SETMASK, [HUP INT TERM], NULL, 8) = 0
15:07:21 --- SIGABRT {si_signo=SIGABRT, si_code=SI_TKILL, si_pid=74645, si_uid=1000} ---
15:07:21 +++ killed by SIGABRT +++
zsh: abort      strace -tvf -s999 clight

@FedeDP
Copy link
Owner

FedeDP commented Feb 24, 2021

So, this is the version with all 3 lines commented?

@NICHOLAS85
Copy link
Contributor Author

Yes

@FedeDP
Copy link
Owner

FedeDP commented Feb 24, 2021

Dope! So we are now sure that this is #106
And it seems like in the end it is m_deregister_fd to cause the sigabrt. Interesting!
Can you reset the source file to master, and change (always in interface.c) the m_register_fd() call (L543) to
m_register_fd(fd, false, monbus);
Let's see if it crashes with this.

@NICHOLAS85
Copy link
Contributor Author

Sure, heres the diff I applied:

diff --git a/src/modules/interface.c b/src/modules/interface.c
index fadaf10..1e5e5b7 100644
--- a/src/modules/interface.c
+++ b/src/modules/interface.c
@@ -540,7 +540,7 @@ static int start_inhibit_monitor(void) {
     if (r == 0) {
         sd_bus_process(monbus, NULL);
         int fd = sd_bus_get_fd(monbus);
-        m_register_fd(dup(fd), true, monbus);
+        m_register_fd(fd, false, monbus);
     }
     return r;
 }

Rebuilding and running now.

@NICHOLAS85
Copy link
Contributor Author

New log!

strace
15:29:23 write(6, "(D)[15:29:23]{backlight.c:511}\tBacklight 'intel_backlight' level updated: 0.83.\n", 80) = 80
15:29:23 write(1, "Backlight 'intel_backlight' level updated: 0.83.\n", 49Backlight 'intel_backlight' level updated: 0.83.
) = 49
15:29:23 write(18, "@[\24g\361U\0\0", 8) = 8
15:29:23 write(33, "@[\24g\361U\0\0", 8) = 8
15:29:23 write(37, "@[\24g\361U\0\0", 8) = 8
15:29:23 write(39, "@[\24g\361U\0\0", 8) = 8
15:29:23 recvmsg(3, {msg_namelen=0}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
15:29:23 epoll_wait(5, [{EPOLLIN, {u32=1729401888, u64=94495304881184}}, {EPOLLIN, {u32=1729452944, u64=94495304932240}}, {EPOLLIN, {u32=1729478128, u64=94495304957424}}, {EPOLLIN, {u32=1729493216, u64=94495304972512}}], 64, -1) = 4
15:29:23 read(17, "@[\24g\361U\0\0", 8) = 8
15:29:23 read(32, "@[\24g\361U\0\0", 8) = 8
15:29:23 read(36, "@[\24g\361U\0\0", 8) = 8
15:29:23 read(38, "@[\24g\361U\0\0", 8) = 8
15:29:23 stat("/etc/localtime", {st_dev=makedev(0, 0x1e), st_ino=4912, st_mode=S_IFREG|0444, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=3536, st_atime=1614124962 /* 2021-02-23T19:02:42.815182319-0500 */, st_atime_nsec=815182319, st_mtime=1611525923 /* 2021-01-24T17:05:23-0500 */, st_mtime_nsec=0, st_ctime=1612132600 /* 2021-01-31T17:36:40.006005453-0500 */, st_ctime_nsec=6005453}) = 0
15:29:23 write(6, "(D)[15:29:23]{interface.c:455}\tEmitting 'BlPct' property\n", 57) = 57
15:29:23 write(1, "Emitting 'BlPct' property\n", 26Emitting 'BlPct' property
) = 26
15:29:23 sendmsg(4, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\4\1\1<\0\0\0<\0\0\0v\0\0\0\1\1o\0\22\0\0\0/org/clight/clight\0\0\0\0\0\0\2\1s\0\37\0\0\0org.freedesktop.DBus.Properties\0\3\1s\0\21\0\0\0PropertiesChanged\0\0\0\0\0\0\0\10\1g\0\10sa{sv}as\0\0\0", iov_len=136}, {iov_base="\21\0\0\0org.clight.clight\0\0\0\30\0\0\0\0\0\0\0\5\0\0\0BlPct\0\1d\0\0\0\0\316c\2472\237\206\352?\0\0\0\0", iov_len=60}], msg_iovlen=2, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 196
15:29:23 epoll_wait(5, [{EPOLLIN|EPOLLHUP, {u32=1729527632, u64=94495305006928}}], 64, -1) = 1
15:29:26 close(40)                      = 0
15:29:26 stat("/etc/localtime", {st_dev=makedev(0, 0x1e), st_ino=4912, st_mode=S_IFREG|0444, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=3536, st_atime=1614124962 /* 2021-02-23T19:02:42.815182319-0500 */, st_atime_nsec=815182319, st_mtime=1611525923 /* 2021-01-24T17:05:23-0500 */, st_mtime_nsec=0, st_ctime=1612132600 /* 2021-01-31T17:36:40.006005453-0500 */, st_ctime_nsec=6005453}) = 0
15:29:26 write(6, "(W)[15:29:26]{interface.c:443}\tConnection reset by peer\n", 56) = 56
15:29:26 write(1, "Connection reset by peer\n", 25Connection reset by peer
) = 25
15:29:26 epoll_ctl(5, EPOLL_CTL_DEL, 40, 0x55f167167f80) = -1 EBADF (Bad file descriptor)
15:29:26 --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x55f167216008} ---
15:29:26 stat("/etc/localtime", {st_dev=makedev(0, 0x1e), st_ino=4912, st_mode=S_IFREG|0444, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=3536, st_atime=1614124962 /* 2021-02-23T19:02:42.815182319-0500 */, st_atime_nsec=815182319, st_mtime=1611525923 /* 2021-01-24T17:05:23-0500 */, st_mtime_nsec=0, st_ctime=1612132600 /* 2021-01-31T17:36:40.006005453-0500 */, st_ctime_nsec=6005453}) = 0
15:29:26 write(6, "(W)[15:29:26]{main.c:109}\tReceived sigsegv signal. Aborting.\n", 61) = 61
15:29:26 write(1, "Received sigsegv signal. Aborting.\n", 35Received sigsegv signal. Aborting.
) = 35
15:29:26 flock(6, LOCK_UN)              = 0
15:29:26 close(6)                       = 0
15:29:26 rt_sigaction(SIGSEGV, {sa_handler=SIG_DFL, sa_mask=[SEGV], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f88d2c44d60}, {sa_handler=0x55f167076f80, sa_mask=[SEGV], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f88d2c44d60}, 8) = 0
15:29:26 rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [HUP INT SEGV TERM], 8) = 0
15:29:26 getpid()                       = 78830
15:29:26 gettid()                       = 78830
15:29:26 tgkill(78830, 78830, SIGSEGV)  = 0
15:29:26 rt_sigprocmask(SIG_SETMASK, [HUP INT SEGV TERM], NULL, 8) = 0
15:29:26 rt_sigreturn({mask=[HUP INT TERM]}) = 97
15:29:26 --- SIGSEGV {si_signo=SIGSEGV, si_code=SI_TKILL, si_pid=78830, si_uid=1000} ---
15:29:26 +++ killed by SIGSEGV +++
zsh: segmentation fault  strace -tvf -s999 clight

@FedeDP
Copy link
Owner

FedeDP commented Feb 24, 2021

15:29:26 epoll_ctl(5, EPOLL_CTL_DEL, 40, 0x55f167167f80) = -1 EBADF (Bad file descriptor)

That is really really weird...
New patch:
clight.txt

Thank you :)

@NICHOLAS85
Copy link
Contributor Author

Applied patch and rebuilt. High velocity debugging right here

@NICHOLAS85
Copy link
Contributor Author

strace
15:45:23 write(6, "(D)[15:45:23]{timer.c:32}\tSet timeout of 60s 0ns on fd 46.\n", 59) = 59
15:45:23 write(1, "Set timeout of 60s 0ns on fd 46.\n", 33Set timeout of 60s 0ns on fd 46.
) = 33
15:45:23 epoll_wait(5, [{EPOLLIN, {u32=582222912, u64=94485567767616}}], 64, -1) = 1
15:46:23 read(46, "\1\0\0\0\0\0\0\0", 8) = 8
15:46:23 sendmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\1\0\1'\0\0\0'\0\0\0\230\0\0\0\1\1o\0\33\0\0\0/org/clightd/clightd/Screen\0\0\0\0\0\3\1s\0\24\0\0\0GetEmittedBrightness\0\0\0\0\2\1s\0\32\0\0\0org.clightd.clightd.Screen\0\0\0\0\0\0\6\1s\0\23\0\0\0org.clightd.clightd\0\0\0\0\0\10\1g\0\2ss\0", iov_len=168}, {iov_base="\2\0\0\0:0\0\0\32\0\0\0/home/nicholas/.Xauthority\0", iov_len=39}], msg_iovlen=2, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 207
15:46:23 recvmsg(3, {msg_namelen=0}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
15:46:23 ppoll([{fd=3, events=POLLIN}], 1, {tv_sec=24, tv_nsec=999617000}, NULL, 8) = 1 ([{fd=3, revents=POLLIN}], left {tv_sec=24, tv_nsec=972735439})
15:46:23 recvmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\2\1\1\10\0\0\0\302\2\0\0-\0\0\0\5\1u\0'\0\0\0", iov_len=24}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 24
15:46:23 recvmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\6\1s\0\6\0\0\0:1.530\0\0\10\1g\0\1d\0\0\7\1s\0\4\0\0\0:1.1\0\0\0\0\36\36\36\36\36\36\256?", iov_len=48}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 48
15:46:23 timerfd_settime(46, 0, {it_interval={tv_sec=0, tv_nsec=0}, it_value={tv_sec=60, tv_nsec=0}}, NULL) = 0
15:46:23 stat("/etc/localtime", {st_dev=makedev(0, 0x1e), st_ino=4912, st_mode=S_IFREG|0444, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=3536, st_atime=1614124962 /* 2021-02-23T19:02:42.815182319-0500 */, st_atime_nsec=815182319, st_mtime=1611525923 /* 2021-01-24T17:05:23-0500 */, st_mtime_nsec=0, st_ctime=1612132600 /* 2021-01-31T17:36:40.006005453-0500 */, st_ctime_nsec=6005453}) = 0
15:46:23 write(6, "(D)[15:46:23]{timer.c:32}\tSet timeout of 60s 0ns on fd 46.\n", 59) = 59
15:46:23 write(1, "Set timeout of 60s 0ns on fd 46.\n", 33Set timeout of 60s 0ns on fd 46.
) = 33
15:46:23 epoll_wait(5, [{EPOLLIN|EPOLLHUP, {u32=581828432, u64=94485567373136}}], 64, -1) = 1
15:47:23 close(40)                      = 0
15:47:23 stat("/etc/localtime", {st_dev=makedev(0, 0x1e), st_ino=4912, st_mode=S_IFREG|0444, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=3536, st_atime=1614124962 /* 2021-02-23T19:02:42.815182319-0500 */, st_atime_nsec=815182319, st_mtime=1611525923 /* 2021-01-24T17:05:23-0500 */, st_mtime_nsec=0, st_ctime=1612132600 /* 2021-01-31T17:36:40.006005453-0500 */, st_ctime_nsec=6005453}) = 0
15:47:23 write(6, "(W)[15:47:23]{interface.c:444}\tConnection reset by peer\n", 56) = 56
15:47:23 write(1, "Connection reset by peer\n", 25Connection reset by peer
) = 25
15:47:23 epoll_ctl(5, EPOLL_CTL_DEL, 40, 0x55ef22adff80) = -1 EBADF (Bad file descriptor)
15:47:23 writev(2, [{iov_base="corrupted double-linked list", iov_len=28}, {iov_base="\n", iov_len=1}], 2corrupted double-linked list
) = 29
15:47:23 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fce04dbb000
15:47:23 rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
15:47:23 rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [HUP INT TERM], 8) = 0
15:47:23 getpid()                       = 81941
15:47:23 gettid()                       = 81941
15:47:23 tgkill(81941, 81941, SIGABRT)  = 0
15:47:23 rt_sigprocmask(SIG_SETMASK, [HUP INT TERM], NULL, 8) = 0
15:47:23 --- SIGABRT {si_signo=SIGABRT, si_code=SI_TKILL, si_pid=81941, si_uid=1000} ---
15:47:23 +++ killed by SIGABRT +++
zsh: abort      strace -tvf -s999 clight

@NICHOLAS85
Copy link
Contributor Author

Ran it again and seemed to run into a slightly different log:

strace
15:55:57 write(6, "(D)[15:55:57]{interface.c:456}\tEmitting 'BlPct' property\n", 57) = 57
15:55:57 write(1, "Emitting 'BlPct' property\n", 26Emitting 'BlPct' property
) = 26
15:55:57 sendmsg(4, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\4\1\1<\0\0\0A\0\0\0v\0\0\0\1\1o\0\22\0\0\0/org/clight/clight\0\0\0\0\0\0\2\1s\0\37\0\0\0org.freedesktop.DBus.Properties\0\3\1s\0\21\0\0\0PropertiesChanged\0\0\0\0\0\0\0\10\1g\0\10sa{sv}as\0\0\0", iov_len=136}, {iov_base="\21\0\0\0org.clight.clight\0\0\0\30\0\0\0\0\0\0\0\5\0\0\0BlPct\0\1d\0\0\0\0ffffff\356?\0\0\0\0", iov_len=60}], msg_iovlen=2, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 196
15:55:57 epoll_wait(5, [{EPOLLIN|EPOLLHUP, {u32=989695824, u64=94219687268176}}], 64, -1) = 1
15:56:34 close(40)                      = 0
15:56:34 writev(2, [{iov_base="Assertion 'm->n_ref > 0' failed at src/libsystemd/sd-bus/bus-message.c:932, function sd_bus_message_unref(). Aborting.", iov_len=118}, {iov_base="\n", iov_len=1}], 2Assertion 'm->n_ref > 0' failed at src/libsystemd/sd-bus/bus-message.c:932, function sd_bus_message_unref(). Aborting.
) = 119
15:56:34 rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
15:56:34 rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [HUP INT TERM], 8) = 0
15:56:34 getpid()                       = 83610
15:56:34 gettid()                       = 83610
15:56:34 tgkill(83610, 83610, SIGABRT)  = 0
15:56:34 rt_sigprocmask(SIG_SETMASK, [HUP INT TERM], NULL, 8) = 0
15:56:34 --- SIGABRT {si_signo=SIGABRT, si_code=SI_TKILL, si_pid=83610, si_uid=1000} ---
15:56:34 +++ killed by SIGABRT +++
zsh: abort      strace -tvf -s999 clight

@FedeDP
Copy link
Owner

FedeDP commented Feb 24, 2021

I don't really get what's happening here... ok let's reset to master.
New patch:
clight.txt

@NICHOLAS85
Copy link
Contributor Author

Applied patch, rebuilt. New log, message ended up looping in strace and no crash (progress!). Two parts:

strace
16:08:26 write(6, "(D)[16:08:26]{timer.c:32}\tSet timeout of 60s 0ns on fd 47.\n", 59) = 59
16:08:26 write(1, "Set timeout of 60s 0ns on fd 47.\n", 33Set timeout of 60s 0ns on fd 47.
) = 33
16:08:26 epoll_wait(5, [{EPOLLIN|EPOLLHUP, {u32=1787940688, u64=94491068452688}}], 64, -1) = 1
16:09:26 close(40)                      = 0
16:09:26 stat("/etc/localtime", {st_dev=makedev(0, 0x1e), st_ino=4912, st_mode=S_IFREG|0444, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=3536, st_atime=1614124962 /* 2021-02-23T19:02:42.815182319-0500 */, st_atime_nsec=815182319, st_mtime=1611525923 /* 2021-01-24T17:05:23-0500 */, st_mtime_nsec=0, st_ctime=1612132600 /* 2021-01-31T17:36:40.006005453-0500 */, st_ctime_nsec=6005453}) = 0
16:09:26 write(6, "(W)[16:09:26]{interface.c:444}\tConnection reset by peer\n", 56) = 56
16:09:26 write(1, "Connection reset by peer\n", 25Connection reset by peer
) = 25
16:09:26 socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 40
16:09:26 getsockopt(40, SOL_SOCKET, SO_RCVBUF, [212992], [4]) = 0
16:09:26 setsockopt(40, SOL_SOCKET, SO_RCVBUF, [8388608], 4) = 0
16:09:26 getsockopt(40, SOL_SOCKET, SO_RCVBUF, [425984], [4]) = 0
16:09:26 setsockopt(40, SOL_SOCKET, SO_RCVBUFFORCE, [8388608], 4) = -1 EPERM (Operation not permitted)
16:09:26 getsockopt(40, SOL_SOCKET, SO_SNDBUF, [212992], [4]) = 0
16:09:26 setsockopt(40, SOL_SOCKET, SO_SNDBUF, [8388608], 4) = 0
16:09:26 getsockopt(40, SOL_SOCKET, SO_SNDBUF, [425984], [4]) = 0
16:09:26 setsockopt(40, SOL_SOCKET, SO_SNDBUFFORCE, [8388608], 4) = -1 EPERM (Operation not permitted)
16:09:26 connect(40, {sa_family=AF_UNIX, sun_path="/run/user/1000/bus"}, 21) = 0
16:09:26 getsockopt(40, SOL_SOCKET, SO_PEERCRED, {pid=26578, uid=1000, gid=1000}, [12]) = 0
16:09:26 getsockopt(40, SOL_SOCKET, SO_PEERSEC, 0x55f06a91bae0, [64]) = -1 ENOPROTOOPT (Protocol not available)
16:09:26 getsockopt(40, SOL_SOCKET, SO_PEERGROUPS, "\30\0\0\0\31\0\0\0\33\0\0\0\35\0\0\0\36\0\0\0,\0\0\0.\0\0\0i\0\0\0j\0\0\0m\0\0\0p\0\0\0u\0\0\0v\0\0\0\330\3\0\0\350\3\0\0\351\3\0\0", [256->64]) = 0
16:09:26 fstat(40, {st_dev=makedev(0, 0x8), st_ino=768064, st_mode=S_IFSOCK|0777, st_nlink=1, st_uid=1000, st_gid=1000, st_blksize=4096, st_blocks=0, st_size=0, st_atime=0, st_atime_nsec=0, st_mtime=0, st_mtime_nsec=0, st_ctime=0, st_ctime_nsec=0}) = 0
16:09:26 getsockopt(40, SOL_SOCKET, SO_ACCEPTCONN, [0], [4]) = 0
16:09:26 getsockname(40, {sa_family=AF_UNIX}, [128->2]) = 0
16:09:26 sendmsg(40, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\0AUTH EXTERNAL\r\nDATA\r\n", iov_len=22}, {iov_base="NEGOTIATE_UNIX_FD\r\n", iov_len=19}, {iov_base="BEGIN\r\n", iov_len=7}], msg_iovlen=3, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 48
16:09:26 recvmsg(40, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="DATA\r\nOK 3f9198b50042873411968b696035fbd5\r\nAGREE_UNIX_FD\r\n", iov_len=256}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 58
16:09:26 dup(40)                        = 50
16:09:26 epoll_ctl(5, EPOLL_CTL_ADD, 50, {EPOLLIN, {u32=1788415584, u64=94491068927584}}) = 0
16:09:26 epoll_wait(5, [{EPOLLIN|EPOLLHUP, {u32=1787940688, u64=94491068452688}}], 64, -1) = 1
16:09:26 sendmsg(40, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\1\0\1\0\0\0\0\1\0\0\0m\0\0\0\1\1o\0\25\0\0\0/org/freedesktop/DBus\0\0\0\3\1s\0\5\0\0\0Hello\0\0\0\2\1s\0\24\0\0\0org.freedesktop.DBus\0\0\0\0\6\1s\0\24\0\0\0org.freedesktop.DBus\0\0\0\0", iov_len=128}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 128
16:09:26 sendmsg(40, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\1\1\18\0\0\0\2\0\0\0\211\0\0\0\1\1o\0\25\0\0\0/org/freedesktop/DBus\0\0\0\3\1s\0\r\0\0\0BecomeMonitor\0\0\0\2\1s\0\37\0\0\0org.freedesktop.DBus.Monitoring\0\6\1s\0\24\0\0\0org.freedesktop.DBus\0\0\0\0\10\1g\0\3asu\0\0\0\0\0\0\0\0", iov_len=160}, {iov_base=".\0\0\0)\0\0\0destination='org.freedesktop.ScreenSaver'\0\0\0\0\0\0\0", iov_len=56}], msg_iovlen=2, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 216
16:09:26 recvmsg(40, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\2\1\1\v\0\0\0\1\0\0\0=\0\0\0\6\1s\0\6\0\0\0", iov_len=24}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 24
16:09:26 recvmsg(40, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base=":1.168\0\0\5\1u\0\1\0\0\0\10\1g\0\1s\0\0\7\1s\0\24\0\0\0org.freedesktop.DBus\0\0\0\0\6\0\0\0:1.168\0", iov_len=67}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 67
16:09:26 recvmsg(40, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\4\1\1\v\0\0\0\2\0\0\0\215\0\0\0\1\1o\0\25\0\0\0", iov_len=24}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 24
16:09:26 recvmsg(40, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="/org/freedesktop/DBus\0\0\0\2\1s\0\24\0\0\0org.freedesktop.DBus\0\0\0\0\3\1s\0\f\0\0\0NameAcquired\0\0\0\0\6\1s\0\6\0\0\0:1.168\0\0\10\1g\0\1s\0\0\7\1s\0\24\0\0\0org.freedesktop.DBus\0\0\0\0\6\0\0\0:1.168\0", iov_len=147}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 147
16:09:26 recvmsg(40, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\4\1\1\v\0\0\0\3\0\0\0\215\0\0\0\1\1o\0\25\0\0\0", iov_len=24}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 24
16:09:26 recvmsg(40, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="/org/freedesktop/DBus\0\0\0\2\1s\0\24\0\0\0org.freedesktop.DBus\0\0\0\0\3\1s\0\10\0\0\0NameLost\0\0\0\0\0\0\0\0\6\1s\0\6\0\0\0:1.168\0\0\10\1g\0\1s\0\0\7\1s\0\24\0\0\0org.freedesktop.DBus\0\0\0\0\6\0\0\0:1.168\0", iov_len=147}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 147
16:09:26 recvmsg(40, {msg_namelen=0}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
16:09:26 epoll_wait(5, [{EPOLLIN|EPOLLHUP, {u32=1787940688, u64=94491068452688}}], 64, -1) = 1
16:09:26 recvmsg(40, {msg_namelen=0}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
16:09:26 epoll_wait(5, [{EPOLLIN|EPOLLHUP, {u32=1787940688, u64=94491068452688}}], 64, -1) = 1

...

16:09:50 epoll_wait(5, [{EPOLLIN|EPOLLHUP, {u32=1787940688, u64=94491068452688}}], 64, -1) = 1
16:09:50 recvmsg(40, {msg_namelen=0}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
16:09:50 epoll_wait(5, [{EPOLLIN|EPOLLHUP, {u32=1787940688, u64=94491068452688}}], 64, -1) = 1
16:09:50 recvmsg(40, {msg_namelen=0}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
16:09:50 epoll_wait(5, [{EPOLLIN|EPOLLHUP, {u32=1787940688, u64=94491068452688}}], 64, -1) = 1
16:09:50 recvmsg(40, {msg_namelen=0}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
16:09:50 epoll_wait(5, [{EPOLLIN|EPOLLHUP, {u32=1787940688, u64=94491068452688}}], 64, -1) = 1
16:09:50 recvmsg(40, {msg_namelen=0}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
16:09:50 epoll_wait(5, [{EPOLLIN|EPOLLHUP, {u32=1787940688, u64=94491068452688}}], 64, -1) = 1
16:09:50 recvmsg(40, {msg_namelen=0}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
16:09:50 epoll_wait(5, [{EPOLLIN|EPOLLHUP, {u32=1787940688, u64=94491068452688}}], 64, -1) = 1
16:09:50 recvmsg(40, {msg_namelen=0}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
16:09:50 epoll_wait(5, [{EPOLLIN|EPOLLHUP, {u32=1787940688, u64=94491068452688}}], 64, -1) = 1
16:09:50 recvmsg(40, {msg_namelen=0}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
16:09:50 epoll_wait(5, [{EPOLLIN|EPOLLHUP, {u32=1787940688, u64=94491068452688}}], 64, -1) = 1
16:09:50 recvmsg(40, {msg_namelen=0}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
16:09:50 epoll_wait(5, [{EPOLLIN|EPOLLHUP, {u32=1787940688, u64=94491068452688}}], 64, -1) = 1
16:09:50 recvmsg(40, {msg_namelen=0}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
^Cstrace: Process 85875 detached
Received 2. Leaving.

@FedeDP
Copy link
Owner

FedeDP commented Feb 24, 2021

Nice!
First part: you can see how it reaches start_inhibit_monitor() and the m_register_fd(dup(monbus_fd), true, monbus); line:

16:09:26 dup(40) = 50
16:09:26 epoll_ctl(5, EPOLL_CTL_ADD, 50, {EPOLLIN, {u32=1788415584, u64=94491068927584}}) = 0

Then it receives some preparing messages for monitoring;
Finally, it loop on EAGAIN, that is quite weird (obviously!! That is the main theme today! :D )

I added a sd_bus_message_dump call() there, i want to make clear if we are really receiving a

16:09:26 recvmsg(40, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="/org/freedesktop/DBus\0\0\0\2\1s\0\24\0\0\0org.freedesktop.DBus\0\0\0\0\3\1s\0\10\0\0\0NameLost\0\0\0\0\0\0\0\0\6\1s\0\6\0\0\0:1.168\0\0\10\1g\0\1s\0\0\7\1s\0\24\0\0\0org.freedesktop.DBus\0\0\0\0\6\0\0\0:1.168\0", iov_len=147}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 147

NameLost message; and why.
clight.txt

@NICHOLAS85
Copy link
Contributor Author

New log after patch:

strace
16:41:20 write(6, "(D)[16:41:20]{interface.c:458}\tEmitting 'BlPct' property\n", 57) = 57
16:41:20 write(1, "Emitting 'BlPct' property\n", 26Emitting 'BlPct' property
) = 26
16:41:20 sendmsg(4, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\4\1\1<\0\0\0=\0\0\0v\0\0\0\1\1o\0\22\0\0\0/org/clight/clight\0\0\0\0\0\0\2\1s\0\37\0\0\0org.freedesktop.DBus.Properties\0\3\1s\0\21\0\0\0PropertiesChanged\0\0\0\0\0\0\0\10\1g\0\10sa{sv}as\0\0\0", iov_len=136}, {iov_base="\21\0\0\0org.clight.clight\0\0\0\30\0\0\0\0\0\0\0\5\0\0\0BlPct\0\1d\0\0\0\0\177\331=yX\250\345?\0\0\0\0", iov_len=60}], msg_iovlen=2, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 196
16:41:20 epoll_wait(5, [{EPOLLIN|EPOLLHUP, {u32=2749243216, u64=94676713348944}}], 64, -1) = 1
16:41:29 close(40)                      = 0
16:41:29 write(1, "MESSAGE \"\" {\n", 13MESSAGE "" {
) = 13
16:41:29 write(1, "};\n\n", 4};

)          = 4
16:41:29 stat("/etc/localtime", {st_dev=makedev(0, 0x1e), st_ino=4912, st_mode=S_IFREG|0444, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=3536, st_atime=1614124962 /* 2021-02-23T19:02:42.815182319-0500 */, st_atime_nsec=815182319, st_mtime=1611525923 /* 2021-01-24T17:05:23-0500 */, st_mtime_nsec=0, st_ctime=1612132600 /* 2021-01-31T17:36:40.006005453-0500 */, st_ctime_nsec=6005453}) = 0
16:41:29 write(6, "(W)[16:41:29]{interface.c:446}\tConnection reset by peer\n", 56) = 56
16:41:29 write(1, "Connection reset by peer\n", 25Connection reset by peer
) = 25
16:41:29 socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 40
16:41:29 getsockopt(40, SOL_SOCKET, SO_RCVBUF, [212992], [4]) = 0
16:41:29 setsockopt(40, SOL_SOCKET, SO_RCVBUF, [8388608], 4) = 0
16:41:29 getsockopt(40, SOL_SOCKET, SO_RCVBUF, [425984], [4]) = 0
16:41:29 setsockopt(40, SOL_SOCKET, SO_RCVBUFFORCE, [8388608], 4) = -1 EPERM (Operation not permitted)
16:41:29 getsockopt(40, SOL_SOCKET, SO_SNDBUF, [212992], [4]) = 0
16:41:29 setsockopt(40, SOL_SOCKET, SO_SNDBUF, [8388608], 4) = 0
16:41:29 getsockopt(40, SOL_SOCKET, SO_SNDBUF, [425984], [4]) = 0
16:41:29 setsockopt(40, SOL_SOCKET, SO_SNDBUFFORCE, [8388608], 4) = -1 EPERM (Operation not permitted)
16:41:29 connect(40, {sa_family=AF_UNIX, sun_path="/run/user/1000/bus"}, 21) = 0
16:41:29 getsockopt(40, SOL_SOCKET, SO_PEERCRED, {pid=26578, uid=1000, gid=1000}, [12]) = 0
16:41:29 getsockopt(40, SOL_SOCKET, SO_PEERSEC, 0x561ba3de0ae0, [64]) = -1 ENOPROTOOPT (Protocol not available)
16:41:29 getsockopt(40, SOL_SOCKET, SO_PEERGROUPS, "\30\0\0\0\31\0\0\0\33\0\0\0\35\0\0\0\36\0\0\0,\0\0\0.\0\0\0i\0\0\0j\0\0\0m\0\0\0p\0\0\0u\0\0\0v\0\0\0\330\3\0\0\350\3\0\0\351\3\0\0", [256->64]) = 0
16:41:29 fstat(40, {st_dev=makedev(0, 0x8), st_ino=841977, st_mode=S_IFSOCK|0777, st_nlink=1, st_uid=1000, st_gid=1000, st_blksize=4096, st_blocks=0, st_size=0, st_atime=0, st_atime_nsec=0, st_mtime=0, st_mtime_nsec=0, st_ctime=0, st_ctime_nsec=0}) = 0
16:41:29 getsockopt(40, SOL_SOCKET, SO_ACCEPTCONN, [0], [4]) = 0
16:41:29 getsockname(40, {sa_family=AF_UNIX}, [128->2]) = 0
16:41:29 sendmsg(40, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\0AUTH EXTERNAL\r\nDATA\r\n", iov_len=22}, {iov_base="NEGOTIATE_UNIX_FD\r\n", iov_len=19}, {iov_base="BEGIN\r\n", iov_len=7}], msg_iovlen=3, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 48
16:41:29 recvmsg(40, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="DATA\r\nOK 3f9198b50042873411968b696035fbd5\r\nAGREE_UNIX_FD\r\n", iov_len=256}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 58
16:41:29 dup(40)                        = 50
16:41:29 epoll_ctl(5, EPOLL_CTL_ADD, 50, {EPOLLIN, {u32=2749757760, u64=94676713863488}}) = 0
16:41:29 epoll_wait(5, [{EPOLLIN|EPOLLHUP, {u32=2749243216, u64=94676713348944}}], 64, -1) = 1
16:41:29 sendmsg(40, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\1\0\1\0\0\0\0\1\0\0\0m\0\0\0\1\1o\0\25\0\0\0/org/freedesktop/DBus\0\0\0\3\1s\0\5\0\0\0Hello\0\0\0\2\1s\0\24\0\0\0org.freedesktop.DBus\0\0\0\0\6\1s\0\24\0\0\0org.freedesktop.DBus\0\0\0\0", iov_len=128}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 128
16:41:29 sendmsg(40, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\1\1\18\0\0\0\2\0\0\0\211\0\0\0\1\1o\0\25\0\0\0/org/freedesktop/DBus\0\0\0\3\1s\0\r\0\0\0BecomeMonitor\0\0\0\2\1s\0\37\0\0\0org.freedesktop.DBus.Monitoring\0\6\1s\0\24\0\0\0org.freedesktop.DBus\0\0\0\0\10\1g\0\3asu\0\0\0\0\0\0\0\0", iov_len=160}, {iov_base=".\0\0\0)\0\0\0destination='org.freedesktop.ScreenSaver'\0\0\0\0\0\0\0", iov_len=56}], msg_iovlen=2, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 216
16:41:29 recvmsg(40, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\2\1\1\v\0\0\0\1\0\0\0=\0\0\0\6\1s\0\6\0\0\0", iov_len=24}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 24
16:41:29 recvmsg(40, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base=":1.186\0\0\5\1u\0\1\0\0\0\10\1g\0\1s\0\0\7\1s\0\24\0\0\0org.freedesktop.DBus\0\0\0\0\6\0\0\0:1.186\0", iov_len=67}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 67
16:41:29 recvmsg(40, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\4\1\1\v\0\0\0\2\0\0\0\215\0\0\0\1\1o\0\25\0\0\0", iov_len=24}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 24
16:41:29 recvmsg(40, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="/org/freedesktop/DBus\0\0\0\2\1s\0\24\0\0\0org.freedesktop.DBus\0\0\0\0\3\1s\0\f\0\0\0NameAcquired\0\0\0\0\6\1s\0\6\0\0\0:1.186\0\0\10\1g\0\1s\0\0\7\1s\0\24\0\0\0org.freedesktop.DBus\0\0\0\0\6\0\0\0:1.186\0", iov_len=147}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 147
16:41:29 write(1, "MESSAGE \"s\" {\n", 14MESSAGE "s" {
) = 14
16:41:29 ioctl(1, TCGETS, {c_iflags=0x5500, c_oflags=0x5, c_cflags=0xbf, c_lflags=0x8a3b, c_line=0, c_cc="\x03\x1c\x7f\x15\x04\x00\x01\x00\x11\x13\x1a\x00\x12\x0f\x17\x16\x00\x00\x00"}) = 0
16:41:29 ioctl(2, TCGETS, {c_iflags=0x5500, c_oflags=0x5, c_cflags=0xbf, c_lflags=0x8a3b, c_line=0, c_cc="\x03\x1c\x7f\x15\x04\x00\x01\x00\x11\x13\x1a\x00\x12\x0f\x17\x16\x00\x00\x00"}) = 0
16:41:29 write(1, "        STRING \"\33[0;1;39m:1.186\33[0m\";\n", 38        STRING ":1.186";
) = 38
16:41:29 write(1, "};\n\n", 4};

)          = 4
16:41:29 recvmsg(40, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\4\1\1\v\0\0\0\3\0\0\0\215\0\0\0\1\1o\0\25\0\0\0", iov_len=24}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 24
16:41:29 recvmsg(40, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="/org/freedesktop/DBus\0\0\0\2\1s\0\24\0\0\0org.freedesktop.DBus\0\0\0\0\3\1s\0\10\0\0\0NameLost\0\0\0\0\0\0\0\0\6\1s\0\6\0\0\0:1.186\0\0\10\1g\0\1s\0\0\7\1s\0\24\0\0\0org.freedesktop.DBus\0\0\0\0\6\0\0\0:1.186\0", iov_len=147}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 147
16:41:29 write(1, "MESSAGE \"s\" {\n", 14MESSAGE "s" {
) = 14
16:41:29 write(1, "        STRING \"\33[0;1;39m:1.186\33[0m\";\n", 38        STRING ":1.186";
) = 38
16:41:29 write(1, "};\n\n", 4};

)          = 4
16:41:29 recvmsg(40, {msg_namelen=0}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
16:41:29 epoll_wait(5, [{EPOLLIN|EPOLLHUP, {u32=2749243216, u64=94676713348944}}], 64, -1) = 1
16:41:29 recvmsg(40, {msg_namelen=0}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
16:41:29 epoll_wait(5, [{EPOLLIN|EPOLLHUP, {u32=2749243216, u64=94676713348944}}], 64, -1) = 1

(repeats)recvmsg(40, {msg_namelen=0}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable) epoll_wait(5, [{EPOLLIN|EPOLLHUP, {u32=2749243216, u64=94676713348944}}], 64, -1) = 1

@NICHOLAS85
Copy link
Contributor Author

NICHOLAS85 commented Feb 24, 2021

I find that running clight with strace makes this happen a lot faster than without. Tried letting it run on its own and didn't run into issues immediately like with strace.

@FedeDP
Copy link
Owner

FedeDP commented Feb 24, 2021

I'll give it a shot then :)
If i was able to reproduce it as quick as you it would be great!
Thank for all your testing, i'm done for today!
Hopefully i will be able to reproduce it!

@NICHOLAS85
Copy link
Contributor Author

No problem and alright! Let me know if you want me to test anything else out for you in the future, hope you are able to reproduce it yourself reliably.

@NICHOLAS85
Copy link
Contributor Author

Should I keep these changes you sent or reset back to master in the meantime?

@FedeDP
Copy link
Owner

FedeDP commented Feb 24, 2021

I'd reset to master (even if it segfaults...)!

@FedeDP
Copy link
Owner

FedeDP commented Feb 24, 2021

I WAS ABLE TO REPRODUCE IT through strace!!

@FedeDP
Copy link
Owner

FedeDP commented Feb 24, 2021

With latest master, i am not able to reproduce the crash. Even if i don't really understand why.
But it seems like running it through strace interferes with Clight: no Inhibit message is received (Inhibited is always false), thus i am not sure this is a real test. Anyway, i am hoping that fixing this crash will lead us somewhere :D

@NICHOLAS85
Copy link
Contributor Author

Looks like the crash has indeed been fixed. Just found a Connection reset by peer message in my log without any sort of crash or high cpu resulting after. Also, I now notice what you mean with how strace interferes with Clight, strange.

@FedeDP
Copy link
Owner

FedeDP commented Feb 25, 2021

And... Is screensaver inhibition working after the "Connection reset" message?

@NICHOLAS85
Copy link
Contributor Author

Looks like it is, inhibition is being detected and the correct behavior is occurring

@FedeDP
Copy link
Owner

FedeDP commented Feb 25, 2021

I really can't believe 😂

Thank you for all your testing!!

@NICHOLAS85
Copy link
Contributor Author

Happy to help!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants