-
Notifications
You must be signed in to change notification settings - Fork 138
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Memory leak in #211 #248
Comments
Hi Takahiro,
A cursory inspection may indicate that when the sort file is closed and removed, the gzip buffers
are not freed. Then, when a new sortfile is opened, it may allocate more buffers.
I will have to investigate this better.
Jos
… On 6 Dec 2017, at 19:02, Takahiro Ueda ***@***.***> wrote:
This is a follow-up of #211 <#211>. The example in that issue works:
#-
#: TermsInSmall 128
#: LargePatches 16
#: FilePatches 4
#: SubTermsInSmall 64
#: SubLargePatches 8
#: SubFilePatches 2
Off Statistics;
CFunction f,g;
Symbol x,y;
* 128*16=2048 terms cause a sort of the large buffer to disk.
* multiples of 2048*4=8192 terms cause a stage 4 sort
#define NTERMS "40001"
#define ARGNTERMS "2001"
Local test1 = <f(1)>+...+<f(`NTERMS')>;
Local test2 = g(<f(1)>+...+<f(`ARGNTERMS')>);
.sort
* Cancel all terms, but keep distance so that most terms only cancel in the final sort
Identify f(x?) = f(x) - f(`NTERMS'-x+1);
Argument g;
Identify f(x?) = f(x) - f(`ARGNTERMS'-x+1);
EndArgument;
Print +s;
.end
but now there are some memory leaks, though I'm not sure if it is very serious: valgrind --leak-check=full gives
==2303== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==2303== Command: ./vorm test.frm
==2303==
FORM 4.2.0 (Nov 30 2017, v4.2.0-24-g1514492) 64-bits Run: Wed Dec 6 18:56:14 2017
...
test1 = 0;
test2 =
+ g(0)
;
11.89 sec out of 11.89 sec
==2303==
==2303== HEAP SUMMARY:
==2303== in use at exit: 346,930,544 bytes in 161 blocks
==2303== total heap usage: 1,210 allocs, 1,049 frees, 392,480,661 bytes allocated
==2303==
==2303== 160,112 (112 direct, 160,000 indirect) bytes in 1 blocks are definitely lost in loss record 134 of 152
==2303== at 0x4C2AB96: malloc (vg_replace_malloc.c:299)
==2303== by 0x4FF908: Malloc1 (tools.c:2230)
==2303== by 0x42A966: SetupOutputGZIP (compress.c:85)
==2303== by 0x4E1F32: MergePatches (sort.c:3570)
==2303== by 0x4E4016: EndSort (sort.c:1066)
==2303== by 0x4BA233: Processor (proces.c:431)
==2303== by 0x437809: DoExecute (execute.c:838)
==2303== by 0x44DD20: ExecModule (module.c:274)
==2303== by 0x4AFFC1: PreProcessor (pre.c:962)
==2303== by 0x4E8E1F: main (startup.c:1605)
==2303==
==2303== 320,224 (224 direct, 320,000 indirect) bytes in 2 blocks are definitely lost in loss record 139 of 152
==2303== at 0x4C2AB96: malloc (vg_replace_malloc.c:299)
==2303== by 0x4FF908: Malloc1 (tools.c:2230)
==2303== by 0x42A966: SetupOutputGZIP (compress.c:85)
==2303== by 0x4E1F32: MergePatches (sort.c:3570)
==2303== by 0x4E33A6: StoreTerm (sort.c:4301)
==2303== by 0x4B73E8: Generator (proces.c:3157)
==2303== by 0x4B89A5: Generator (proces.c:3931)
==2303== by 0x4BA125: Processor (proces.c:404)
==2303== by 0x437809: DoExecute (execute.c:838)
==2303== by 0x44DD20: ExecModule (module.c:274)
==2303== by 0x4AFFC1: PreProcessor (pre.c:962)
==2303== by 0x4E8E1F: main (startup.c:1605)
==2303==
==2303== 428,208 (5,952 direct, 422,256 indirect) bytes in 1 blocks are definitely lost in loss record 140 of 152
==2303== at 0x4C2AB96: malloc (vg_replace_malloc.c:299)
==2303== by 0x4E3B46B: deflateInit2_ (in /localstore/theorie/tueda/build/linuxbrew/Cellar/zlib/1.2.11/lib/libz.so.1.2.11)
==2303== by 0x4E3B613: deflateInit_ (in /localstore/theorie/tueda/build/linuxbrew/Cellar/zlib/1.2.11/lib/libz.so.1.2.11)
==2303== by 0x42AA22: SetupOutputGZIP (compress.c:119)
==2303== by 0x4E1F32: MergePatches (sort.c:3570)
==2303== by 0x4E4016: EndSort (sort.c:1066)
==2303== by 0x4BA233: Processor (proces.c:431)
==2303== by 0x437809: DoExecute (execute.c:838)
==2303== by 0x44DD20: ExecModule (module.c:274)
==2303== by 0x4AFFC1: PreProcessor (pre.c:962)
==2303== by 0x4E8E1F: main (startup.c:1605)
==2303==
==2303== LEAK SUMMARY:
==2303== definitely lost: 6,288 bytes in 4 blocks
==2303== indirectly lost: 902,256 bytes in 9 blocks
==2303== possibly lost: 0 bytes in 0 blocks
==2303== still reachable: 346,022,000 bytes in 148 blocks
==2303== suppressed: 0 bytes in 0 blocks
==2303== Reachable blocks (those to which a pointer was found) are not shown.
==2303== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==2303==
==2303== For counts of detected and suppressed errors, rerun with: -v
==2303== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 0 from 0)
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub <#248>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AFLxEt3nDVK72OoqQNMGGELr70c1VKkjks5s9tbCgaJpZM4Q4WKH>.
|
Hi Takahiro,
I think I have indeed found the problem.
I have to go now to the “jamboree”, and will repair it later.
It is indeed that in CloseFile I should free all memory related to compressions.
And in CreateFile I should make sure all the relevant buffers are set to zero.
That should probably fix it.
Cheers
Jos
… On 10 Dec 2017, at 23:32, Jos Vermaseren ***@***.***> wrote:
Hi Takahiro,
A cursory inspection may indicate that when the sort file is closed and removed, the gzip buffers
are not freed. Then, when a new sortfile is opened, it may allocate more buffers.
I will have to investigate this better.
Jos
> On 6 Dec 2017, at 19:02, Takahiro Ueda ***@***.*** ***@***.***>> wrote:
>
> This is a follow-up of #211 <#211>. The example in that issue works:
>
> #-
> #: TermsInSmall 128
> #: LargePatches 16
> #: FilePatches 4
> #: SubTermsInSmall 64
> #: SubLargePatches 8
> #: SubFilePatches 2
>
> Off Statistics;
>
> CFunction f,g;
> Symbol x,y;
>
> * 128*16=2048 terms cause a sort of the large buffer to disk.
> * multiples of 2048*4=8192 terms cause a stage 4 sort
> #define NTERMS "40001"
>
> #define ARGNTERMS "2001"
>
> Local test1 = <f(1)>+...+<f(`NTERMS')>;
> Local test2 = g(<f(1)>+...+<f(`ARGNTERMS')>);
> .sort
>
> * Cancel all terms, but keep distance so that most terms only cancel in the final sort
> Identify f(x?) = f(x) - f(`NTERMS'-x+1);
> Argument g;
> Identify f(x?) = f(x) - f(`ARGNTERMS'-x+1);
> EndArgument;
>
> Print +s;
> .end
> but now there are some memory leaks, though I'm not sure if it is very serious: valgrind --leak-check=full gives
>
> ==2303== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
> ==2303== Command: ./vorm test.frm
> ==2303==
> FORM 4.2.0 (Nov 30 2017, v4.2.0-24-g1514492) 64-bits Run: Wed Dec 6 18:56:14 2017
>
> ...
>
> test1 = 0;
>
> test2 =
> + g(0)
> ;
>
> 11.89 sec out of 11.89 sec
> ==2303==
> ==2303== HEAP SUMMARY:
> ==2303== in use at exit: 346,930,544 bytes in 161 blocks
> ==2303== total heap usage: 1,210 allocs, 1,049 frees, 392,480,661 bytes allocated
> ==2303==
> ==2303== 160,112 (112 direct, 160,000 indirect) bytes in 1 blocks are definitely lost in loss record 134 of 152
> ==2303== at 0x4C2AB96: malloc (vg_replace_malloc.c:299)
> ==2303== by 0x4FF908: Malloc1 (tools.c:2230)
> ==2303== by 0x42A966: SetupOutputGZIP (compress.c:85)
> ==2303== by 0x4E1F32: MergePatches (sort.c:3570)
> ==2303== by 0x4E4016: EndSort (sort.c:1066)
> ==2303== by 0x4BA233: Processor (proces.c:431)
> ==2303== by 0x437809: DoExecute (execute.c:838)
> ==2303== by 0x44DD20: ExecModule (module.c:274)
> ==2303== by 0x4AFFC1: PreProcessor (pre.c:962)
> ==2303== by 0x4E8E1F: main (startup.c:1605)
> ==2303==
> ==2303== 320,224 (224 direct, 320,000 indirect) bytes in 2 blocks are definitely lost in loss record 139 of 152
> ==2303== at 0x4C2AB96: malloc (vg_replace_malloc.c:299)
> ==2303== by 0x4FF908: Malloc1 (tools.c:2230)
> ==2303== by 0x42A966: SetupOutputGZIP (compress.c:85)
> ==2303== by 0x4E1F32: MergePatches (sort.c:3570)
> ==2303== by 0x4E33A6: StoreTerm (sort.c:4301)
> ==2303== by 0x4B73E8: Generator (proces.c:3157)
> ==2303== by 0x4B89A5: Generator (proces.c:3931)
> ==2303== by 0x4BA125: Processor (proces.c:404)
> ==2303== by 0x437809: DoExecute (execute.c:838)
> ==2303== by 0x44DD20: ExecModule (module.c:274)
> ==2303== by 0x4AFFC1: PreProcessor (pre.c:962)
> ==2303== by 0x4E8E1F: main (startup.c:1605)
> ==2303==
> ==2303== 428,208 (5,952 direct, 422,256 indirect) bytes in 1 blocks are definitely lost in loss record 140 of 152
> ==2303== at 0x4C2AB96: malloc (vg_replace_malloc.c:299)
> ==2303== by 0x4E3B46B: deflateInit2_ (in /localstore/theorie/tueda/build/linuxbrew/Cellar/zlib/1.2.11/lib/libz.so.1.2.11)
> ==2303== by 0x4E3B613: deflateInit_ (in /localstore/theorie/tueda/build/linuxbrew/Cellar/zlib/1.2.11/lib/libz.so.1.2.11)
> ==2303== by 0x42AA22: SetupOutputGZIP (compress.c:119)
> ==2303== by 0x4E1F32: MergePatches (sort.c:3570)
> ==2303== by 0x4E4016: EndSort (sort.c:1066)
> ==2303== by 0x4BA233: Processor (proces.c:431)
> ==2303== by 0x437809: DoExecute (execute.c:838)
> ==2303== by 0x44DD20: ExecModule (module.c:274)
> ==2303== by 0x4AFFC1: PreProcessor (pre.c:962)
> ==2303== by 0x4E8E1F: main (startup.c:1605)
> ==2303==
> ==2303== LEAK SUMMARY:
> ==2303== definitely lost: 6,288 bytes in 4 blocks
> ==2303== indirectly lost: 902,256 bytes in 9 blocks
> ==2303== possibly lost: 0 bytes in 0 blocks
> ==2303== still reachable: 346,022,000 bytes in 148 blocks
> ==2303== suppressed: 0 bytes in 0 blocks
> ==2303== Reachable blocks (those to which a pointer was found) are not shown.
> ==2303== To see them, rerun with: --leak-check=full --show-leak-kinds=all
> ==2303==
> ==2303== For counts of detected and suppressed errors, rerun with: -v
> ==2303== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 0 from 0)
> —
> You are receiving this because you are subscribed to this thread.
> Reply to this email directly, view it on GitHub <#248>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AFLxEt3nDVK72OoqQNMGGELr70c1VKkjks5s9tbCgaJpZM4Q4WKH>.
>
|
Nice. Do you think it is the cause of memory leaks in long runs? Happens many times in a run? |
Hi Takahiro
It probably happens in each module for each
Expression that needs a sortfile. And in each worker.
Cheers
Jos
Verstuurd vanaf mijn iPhone
… Op 12 dec. 2017 om 12:51 heeft Takahiro Ueda ***@***.***> het volgende geschreven:
Nice. Do you think it is the cause of memory leaks in long runs? Happens many times in a run?
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub, or mute the thread.
|
tueda
added a commit
that referenced
this issue
Dec 13, 2017
Currently the test fails with asan/valgrind (#248). This will be fixed by the next commit.
Fixed in a484c0a and 4493f01. |
Indeed, this is identified as the source of big memory leaks in long runs. After fixing the leaks, we have confirmed memory usage of a program (TFORM with 24 cores, 93 hours) is considerably reduced (from VSZ 279G RSS 150G to VSZ 118G RSS 77G). |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
This is a follow-up of #211. The example in that issue works:
but now there are some memory leaks, though I'm not sure if it is very serious:
valgrind --leak-check=full
givesThe text was updated successfully, but these errors were encountered: