rar2fs crash with corrupted double linked list error #98
Comments
Update 1: I am unable to reproduce the problem with just mounting the rar2fs point, but I can with mounting 1 rar2fs point and doing a unionfs.
|
Update 2:
|
Thanks for the issue report. I think we should keep this separate from #95, at least until we can prove it has the same root cause. Currently I do not think that is the case, given the very different crash signature and stack traces. I really appreciate all the hard work you are putting into collecting more data for troubleshooting. We will need more of that, since I am currently unable to reproduce anything similar. There are a few things that are interesting to note here:
These kinds of issues are very hard to troubleshoot, especially since we do not seem to be able to collect a decent stack trace. We also cannot rule out that the source of the corruption is located somewhere else, not in rar2fs itself. The crash always points to the main process, but there are several linked-in libraries being used too. What strikes me as a bit odd is that you started to see this after a system upgrade, and that without unionfs it also seems to behave. From what I can tell so far, the last thing you see is a partial read of a file, right? What if the buffer provided by FUSE in the read call simply is not big enough? I am not saying that is the case, only that a corruption can have many different sources. So, we need to narrow it down somehow. Would it be possible for you to downgrade rar2fs to, let's say, v1.24.1? Not that I think it will do much good, but we need to try something. Another thing you can try on the current version is mounting with the -s flag to force rar2fs/FUSE into single-threaded mode. |
Update 2 (sorry for being late, but a new addition to the family totally changed all plans):
|
Update:
|
Yea, I was sort of afraid of this... |
What you can try is to compile rar2fs using the maximum debug level, I believe it is |
Tried that, but it does not really give more information than --enable-debug, which is level 4. Now the interesting bit is that the exception happens in thread 11892. However, this is not rar2fs. All activities in the log happen in 11871. Is that perhaps a clue?
|
The thread id is a hint but not much more than that I am afraid.
The size read is 14288 bytes but still FUSE thinks we read 131072 bytes?
but with a size of 14288 rather than 131072. We can try to add some protection in the raw read function in case we are called from different threads using asynchronous reads. The -osync_read option is hard-coded by rar2fs, so that would surprise me though. Unfortunately we dump the PID and not the TID in the log; otherwise we might have spotted whether the read is coming from the thread that caught the exception. Can you please step up to the latest version of rar2fs? Also, am I right that different files seem to be involved in the crash each time? |
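To illustrate the kind of mismatch discussed above (14288 bytes actually available versus 131072 bytes requested), here is a minimal sketch of a read helper that clamps the copy to what the source really holds and reports the true count. `bounded_read` and all of its parameters are hypothetical names made up for this example; this is not the rar2fs raw read function and not the patch applied later in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT part of rar2fs: copy at most `want` bytes from a
 * source holding `src_len` bytes, starting at `offset`, into `dst`.
 * The caller must make sure `dst` has room for `want` bytes.
 * Returns the number of bytes actually copied, which may be less than `want`. */
static size_t bounded_read(char *dst, size_t want,
                           const char *src, size_t src_len, size_t offset)
{
    assert(dst != NULL && src != NULL);

    if (offset >= src_len)
        return 0;                          /* nothing left to read */

    size_t avail = src_len - offset;       /* bytes the source can still deliver */
    size_t n = want < avail ? want : avail;

    memcpy(dst, src + offset, n);          /* never read past the end of src */
    return n;                              /* report what was really copied */
}
```

The point of the sketch is only that the returned count, not the requested size, is what the caller should trust; a caller that assumes the full request was satisfied would go on to use bytes that were never written.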
Okay, I had a look around and I think we have a bit more information now, still running 1.27.0. I added some more gcc hardening flags such as -fsanitize=address* and a few others, and when things crash I now get a lot more information 👍. The below should help, right? :)
Update1:
Update2:
Update3:
Update4:
[*] for future reference, add -fsanitize=address to CFLAGS/CPPFLAGS and -fsanitize=address -static-libasan to LDFLAGS |
Thanks, will look into it. |
updated post |
Yepp, you spotted a problem that has been there all the time I believe. Nice catch.
That is also what I suspect, this is probably a very silent bug hiding under the radar. Otherwise I think someone (including myself) would have seen this a long time ago. Again, nice catch. |
Please try this patch. I primarily wish to see if it resolves the problem(s) you have observed. This is not necessarily the final solution. |
Thanks, that seems to have solved it. AddressSanitizer no longer complains and the few test cases I have thrown at it all work beautifully. Cheers! |
Excellent. Please give it some more run-time and then report back. It is very possible this also solves #95. The |
This patch fixes a problem discovered by compiling using the -fsanitize=address option while trouble shooting issue #98. Signed-off-by: Hans Beckerus <hans.beckerus at gmail.com>
This patch should resolve the problem with sporadic crashes as reported in issue #98. Signed-off-by: Hans Beckerus <hans.beckerus at gmail.com>
Made a push to master. Please try it and report back. If everything seems to work well we can close this issue. I will probably make an official patch release later this week. |
Been running for the past six days and no issues detected. Finally my sbc is stable again - used to lock up & reboot due to rar2fs. |
Thanks, I will make a release ASAP. Did not manage to get the time last week. |
Continuing from #95
Actually managed to get two core dumps
1
2