diffwr option OOM kill #16
Comments
Fair enough. I'll take a look at your claims.
It is just a remark...
It's not tested, just a suggestion.
Yes, I was analyzing the code before and I came to that conclusion, too. In the process of converting your code, I missed the free() on the "same block" path. I am preparing a fix for it, and I would like to ask you to test it when I push the test branch before I commit the final version. Is that ok?
Moving "free" outside... working like a charm... Now, there are two options:
It was a sneaky bug, because it only occurs when:
Ok, no problem...
Yes, I'm working on a diff moving
I can do that moving in minutes, if you like.
I already did the patch and I'm almost ready to push it for testing. The only thing I'm wondering now is whether we should guard the call: if (rptr) free(rptr); I know that it would be virtually impossible for it to happen, as if the
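On the question of guarding the call: the C standard defines free(NULL) as a no-op, so the if (rptr) check is not required for correctness. A minimal sketch follows; the helper name release_block is hypothetical and is not part of dcfldd, and only the name rptr comes from the discussion.

```c
#include <stdlib.h>

/* Hypothetical helper (not dcfldd code): release a scratch buffer and
 * clear the caller's pointer so that a second call is harmless. */
void release_block(char **rptr)
{
    free(*rptr);   /* no NULL check needed: free(NULL) does nothing */
    *rptr = NULL;  /* a later call now frees NULL instead of a stale pointer */
}
```

Calling release_block(&rptr) twice in a row is safe, because the second call only passes NULL to free().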
Before this patch, free() was called only when blocks were not the same, causing memory leaks when they were, which could eventually cause an OOM kill due to the accumulated effect of those leaks on large files. This patch frees the allocated memory on both paths, thus fixing the bug. Thanks @szolnokit for the report. Closes #16.
Put it boldly without This is good, I have tested:
Yes, I was just trying to check if there could be a corner case in the middle :). I have pushed a branch with the fix. In my tests with valgrind, I didn't see any leaks. Please test it and tell me if it is ok now.
OK, I'm beginning some tests with diffwr, using big test files and real block devices. With commit 2cddf08
Yes, I have seen. There are some other memory leaks, but these are small, and each occurs only once per execution. They are not "inflating" memory leaks. From
Yeah, but I will try to hunt them anyway :).
OK. I tested commit 2cddf08. No more OOM.
Thanks a lot! And sorry for the mistake! I will merge it into master now.
Closed
Out of memory: Killed process 9737 (dcfldd-v1.9) total-vm:847432kB, anon-rss:844620kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:1696kB oom_score_adj:0
Prepare:
The bug:
dcfldd diffwr=on if=/dev/zero of=test.dd # OOM kill
Default option (without diffwr) isn't affected.
The problem:
When the destination blocks are the same (so nothing is written), the memory blocks remain allocated.
Suggestion:
Use my "full_write" function instead of the code in the released v1.9.
First, thank you for the commit.
And sorry, because I had no time to test @davidpolverari's suggestion.
Dear @davidpolverari
Thank you for the suggestion, but your code is BAD :(. Unfortunately, your suggested code became part of the release instead of my code. Your code eats all the memory and triggers an OOM kill in real life. My code may not be as beautiful as yours, but I tested it for weeks and used it on real-life work before I published it.
I only noticed that the v1.9 release contains a memory leak bug because I installed it from a distribution instead of using my own version. When I tried my routine job with the "diffwr" option, I got an OOM kill after dcfldd ate all the memory. And this is caused by your proposed code.
So I tested my version against the released v1.9 with valgrind, and the suspicion turned out to be true. With the diffwr=on option, almost all malloc-ed blocks remain allocated. But only in your suggested code, not in my code.
I see only this leak in your (=v1.9 released) version:
The leak is almost the same size as the (not) written data. With gigabytes (not) written, it is easy to eat all the memory.
Remark:
My do..while(0) is not nice, I know, but it is just a "simulated" try...throw...finally statement. The "break" is the "throw". :D Not nice, but my code always frees the malloc-ed block. I don't know why, but your solution doesn't do that.
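The do..while(0) idiom described in this remark can be sketched as follows. This is not the "full_write" function from the thread, which is not shown here: the function name compare_with_cleanup and its return values are hypothetical and only illustrate how "break" acts as the "throw" and the code after the loop acts as the "finally".

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch of the do { ... } while (0) cleanup idiom.
 * Copies block a into a scratch buffer and compares it with block b.
 * Returns 1 if the blocks are equal, 0 if they differ, -1 on error. */
int compare_with_cleanup(const char *a, const char *b, size_t len)
{
    int result = -1;
    char *tmp = malloc(len);

    do {
        if (tmp == NULL)
            break;                   /* "throw": allocation failed */
        memcpy(tmp, a, len);
        if (memcmp(tmp, b, len) != 0) {
            result = 0;              /* blocks differ */
            break;                   /* "throw": leave the body early */
        }
        result = 1;                  /* blocks are equal */
    } while (0);

    free(tmp);                       /* "finally": runs on every path; free(NULL) is a no-op */
    return result;
}
```

Because every early exit is a break rather than a return, the free() after the loop cannot be skipped, which is the property the "same block" path in v1.9 was missing.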
Maybe my code would have been better in the released version. Now the current release is buggy. :(
Originally posted by @szolnokit in #13 (comment)