Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix/2 mem issues #91

Merged
merged 4 commits into from
Oct 2, 2020
Merged

Fix/2 mem issues #91

merged 4 commits into from
Oct 2, 2020

Conversation

dtynn
Copy link
Contributor

@dtynn dtynn commented Apr 21, 2020

hi max, appreciate your great work!
during my usage of c-for-go, i found 2 mem issues. so i made this pr.

descriptions

  1. fatal address (SIGSEGV), most likely related to the gc of a uintptr, since go does not keep track of things inside uintptrs

  2. unexpectedENOMEM while there is still plenty of mem available

reproducing

you can use the demo codes
sorry that i have to use rust to provide the underlying lib

here are some details:

  1. use run-gcfatal for issue-1 and run-enomem for issue-2

  2. codes in gen are purely generated by c-for-go

  3. codes in gen_no_set_finalizer are modified to manually free the alloc map, based on gen

  4. issue-1 can be reproduced on macOS / centos 7.7 within seconds

  5. issue-2 can be reproduced on centos 7.7 / ubuntu 19.04 with ptmalloc2 from glibc
    it will take tens of seconds. we can restart the demo a couple of times to make it happen fastser

  6. for issue-2, the ENOMEM always comes along with a non-NULL ptr from calloc.

    • if i ignore the err in this situation, things seem to be going well, no sigsegv, no mem leaks.
    • and, if i switch the allactor from ptmalloc2 to tcmalloc, the ENOMEM just disappears

logs

gc caused SIGSEGV

 fatal error: fault
[signal SIGSEGV: segmentation violation code=0x1 addr=0x31a00000 pc=0x40a0087]

goroutine 1 [running]:
runtime.throw(0x414fda9, 0x5)
	/Users/wanglin/work/go/src/runtime/panic.go:774 +0x72 fp=0xc00005dd60 sp=0xc00005dd30 pc=0x402c752
runtime.sigpanic()
	/Users/wanglin/work/go/src/runtime/signal_unix.go:401 +0x3de fp=0xc00005dd90 sp=0xc00005dd60 pc=0x403f28e
github.com/dtynn/go-gc/gen.packSGogcPublicReplicaInfo(0xc0000ce000, 0x3d0900, 0x3d0900, 0x31a00000)
	/Users/wanglin/work/ongit/go-gc/gen/cgo_helpers.go:277 +0x57 fp=0xc00005de00 sp=0xc00005dd90 pc=0x40a0087
github.com/dtynn/go-gc/gen.GogcVerifyPost(0xc0000ce000, 0x3d0900, 0x3d0900, 0x3d0900, 0x0, 0xc00001c1c0, 0x0, 0x0)
	/Users/wanglin/work/ongit/go-gc/gen/gen.go:20 +0xbb fp=0xc00005de68 sp=0xc00005de00 pc=0x40a024b
main.main()
	/Users/wanglin/work/ongit/go-gc/gcfatal/main.go:38 +0x2ba fp=0xc00005df60 sp=0xc00005de68 pc=0x40a08ea
runtime.main()
	/Users/wanglin/work/go/src/runtime/proc.go:203 +0x21e fp=0xc00005dfe0 sp=0xc00005df60 pc=0x402e0de
runtime.goexit()
	/Users/wanglin/work/go/src/runtime/asm_amd64.s:1357 +0x1 fp=0xc00005dfe8 sp=0xc00005dfe0 pc=0x4056671

goroutine 8 [syscall]:
github.com/dtynn/go-gc/gen._Cfunc_free(0x31a00000)
	_cgo_gotypes.go:122 +0x41
github.com/dtynn/go-gc/gen.(*cgoAllocMap).Free.func1(0xc00f4fe760)
	/Users/wanglin/work/ongit/go-gc/gen/cgo_helpers.go:63 +0x5e
github.com/dtynn/go-gc/gen.(*cgoAllocMap).Free(0xc00000e020)
	/Users/wanglin/work/ongit/go-gc/gen/cgo_helpers.go:63 +0xa4
created by github.com/dtynn/go-gc/gen.unpackArgSGogcPublicReplicaInfo.func1
	/Users/wanglin/work/ongit/go-gc/gen/cgo_helpers.go:251 +0x41
exit status 2

unexpected enomem

panic: memory alloc error: cannot allocate memory                                                                                                              │34108 root       20   0  124M 37704  4328 S  0.0  0.0 15:29.18 ./node_exporter
                                                                                                                                                               │34120 root       20   0  124M 37704  4328 S  0.0  0.0 17:10.74 ./node_exporter
goroutine 531 [running]:                                                                                                                                       │ 1433 root       20   0  124M 37704  4328 S  0.0  0.0 17:44.54 ./node_exporter
github.com/dtynn/go-gc/gen_no_set_finalizer.allocGogcPublicReplicaInfoMemory(0xc8000, 0xc001d7e000)                                                            │34131 root       20   0  124M 37704  4328 S  0.0  0.0 15:49.83 ./node_exporter
        /home/wanglin/proj/github.com/dtynn/go-gc/gen_no_set_finalizer/cgo_helpers.go:154 +0xcd                                                                │35287 root       20   0  124M 37704  4328 S  0.0  0.0  1:17.60 ./node_exporter
github.com/dtynn/go-gc/gen_no_set_finalizer.unpackArgSGogcPublicReplicaInfo(0xc0166c0000, 0xc8000, 0xc8000, 0x0, 0x0)                                          │34123 root       20   0  124M 37704  4328 S  0.0  0.0 16:37.22 ./node_exporter
        /home/wanglin/proj/github.com/dtynn/go-gc/gen_no_set_finalizer/cgo_helpers.go:251 +0x71                                                                │34110 root       20   0  124M 37704  4328 S  0.0  0.0 16:38.03 ./node_exporter
github.com/dtynn/go-gc/gen_no_set_finalizer.GogcVerifyPost(0xc0166c0000, 0xc8000, 0xc8000, 0xc8000, 0x0, 0x0, 0x0, 0x0)                                        │34099 root       20   0  124M 37704  4328 S  0.0  0.0 17:42.68 ./node_exporter
        /home/wanglin/proj/github.com/dtynn/go-gc/gen_no_set_finalizer/gen.go:17 +0x7d                                                                         │24208 root       20   0  124M 37704  4328 S  0.0  0.0 13:21.61 ./node_exporter
main.fire.func1(0xc0000d6060, 0xc000018198, 0xc0166c0000, 0xc8000, 0xc8000, 0xc8000, 0x6034d512b92edf1a, 0xc0033740e0, 0x38)                                   │34126 root       20   0  124M 37704  4328 S  0.0  0.0 16:18.73 ./node_exporter
        /home/wanglin/proj/github.com/dtynn/go-gc/enomem/main.go:53 +0xaf                                                                                      │ 2408 root       20   0  124M 37704  4328 S  0.0  0.0 11:14.28 ./node_exporter
created by main.fire                                                                                                                                           │34269 root       20   0  124M 37704  4328 S  0.0  0.0 16:52.18 ./node_exporter
        /home/wanglin/proj/github.com/dtynn/go-gc/enomem/main.go:45 +0x230                                                                                     │34182 root       20   0  124M 37704  4328 S  0.0  0.0 17:17.10 ./node_exporter
exit status 2                                                                                                                                                  │34138 root       20   0  124M 37704  4328 S  0.0  0.0 16:33.00 ./node_exporter
make: *** [run-enomem] Error 1

@xlab
Copy link
Owner

xlab commented Apr 21, 2020

Hi @dtynn ! Thanks for the suggestions. But how you would collect allocated C memory if finalizers are not set? The idea was to track all C allocations, from e.g. []byte to char* conversions, and then free them when the allocation map is collected on GC side in Go.

@@ -251,10 +248,7 @@ func (gen *Generator) unpackSlice(buf1 io.Writer, buf2 *reverseBuffer, cgoSpec t
gen.submitHelper(sizeOfPtr)
gen.submitHelper(cgoAllocMap)

fmt.Fprintf(buf1, `allocs = new(cgoAllocMap)
defer runtime.SetFinalizer(&unpacked, func(*%s) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How this removal is compensated? There are some cases like media player where packets of [][][]float32 are being allocated per second.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this change is for the 1st issue, that gc caused fatal address inside the packSGogcPublicReplicaInfo call

since functions like unpackXXX & packXXX are private, i think that we could just focus on the uintptr's lifetime, thus the unpack/pack cycle.

we should keep the uintptr alive until the pack things are completed.
but adding a single runtime.KeepAlive of the uintptr, and leaving the SetFinalizer untouched seems to be useless in my tests.

so, i removed these SetFinalizers here, and added a defer freeing of the returned allocMap after every unpack call, right in the arg proxy logic.

pls correct me if i misunderstood anything~~

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i also made some nested slices to be generated by my patch, to see if it would cause some troubles

see this

i think that all the allocations are tracked by the root alloc map, am i right on this?

@dtynn
Copy link
Contributor Author

dtynn commented Apr 21, 2020

Hi @dtynn ! Thanks for the suggestions. But how you would collect allocated C memory if finalizers are not set? The idea was to track all C allocations, from e.g. []byte to char* conversions, and then free them when the allocation map is collected on GC side in Go.

hey max, your words when the allocation map is collected on GC side in Go here just enlightened me, what about we set finalizier for the alloc map instead, and add a KeepAlive for the map after packXXXX?
i had a demo tested for minutes, and didn't see any leaks or SIGSEGVs

+++++++++
UPDATE: already pushed a new commit for this

@xlab
Copy link
Owner

xlab commented Apr 22, 2020

Now I like it! :)
Yes, Finalizer should be there because alloc maps are tracking refs in allocated objects.
I'm going to test for the leaks using my previously bound packages such as
https://github.com/xlab/vorbis-go

Because in mutimedia apps leaks are more likely to occur.

@whyrusleeping
Copy link

@xlab any update here? I'm really itching to get this fix :)

@xlab
Copy link
Owner

xlab commented Apr 27, 2020

Sorry I had crazy week! :)
I'm putting this on my daily todo list, will check asap

@laser
Copy link

laser commented May 1, 2020

Any luck? :)

@vmx
Copy link

vmx commented Jul 30, 2020

@xlab We are still using this fix. Could you please have another look, it would be great if you could switch to a proper upstream version. Is there any way I can help getting this PR merged?

@xlab xlab merged commit c134bfa into xlab:master Oct 2, 2020
@xlab
Copy link
Owner

xlab commented Oct 2, 2020

Sorry to all people that were blocked by this PR! I just realized that there are enough external projects who can do testing and verification of the changes. In the past, I relied on my own test suite of bindings, but due to a shortage of time, it's better to trust the community now. Cheers!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants