-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Revert "Eliminate memory allocations from Compress/Decompress." #54
Conversation
This reverts commit e068f47. Fix corruption with Compress. Previous change wasnt correct in a sense that it didnt take into account the need to keep byte slice alive when it goes into C call. For example, with the previous code, this is how cgo looks like: ``` //go:cgo_unsafe_args func _Cfunc_ZSTD_compress_wrapper(p0 _Ctype_uintptr_t, p1 _Ctype_size_t, p2 _Ctype_uintptr_t, p3 _Ctype_size_t, p4 _Ctype_int) (r1 _Ctype_size_t) { _cgo_runtime_cgocall(_cgo_d37665ec54a7_Cfunc_ZSTD_compress_wrapper, uintptr(unsafe.Pointer(&p0))) if _Cgo_always_false { _Cgo_use(p0) _Cgo_use(p1) _Cgo_use(p2) _Cgo_use(p3) _Cgo_use(p4) } return } cWritten := (_Cfunc_ZSTD_compress_wrapper)( _Ctype_uintptr_t(uintptr(unsafe.Pointer(&dst[0]))), _Ctype_size_t(len(dst)), _Ctype_uintptr_t(uintptr(unsafe.Pointer(&src[0]))), _Ctype_size_t(len(src)), _Ctype_int(level)) ``` You can see that `_Cgo_use` is used to keep argument alive, but these arguments are only uintptrs, not actual data/unsafe.Pointer, so there's no guarantee that dst or src is sane during the call because uintptr doesn't have pointer semantic. Compared to passing only `unsafe.Pointer` ``` func _Cfunc_ZSTD_compress(p0 unsafe.Pointer, p1 _Ctype_size_t, p2 unsafe.Pointer, p3 _Ctype_size_t, p4 _Ctype_int) (r1 _Ctype_size_t) { _cgo_runtime_cgocall(_cgo_a7b8c310911d_Cfunc_ZSTD_compress, uintptr(unsafe.Pointer(&p0))) if _Cgo_always_false { _Cgo_use(p0) _Cgo_use(p1) _Cgo_use(p2) _Cgo_use(p3) _Cgo_use(p4) } return } written = int(func(_cgo0 _cgo_unsafe.Pointer, _cgo1 _Ctype_size_t, _cgo2 _cgo_unsafe.Pointer, _cgo3 _Ctype_size_t, _cgo4 _Ctype_int) _Ctype_size_t {; _cgoCheckPointer(_cgo0, dst); _cgoCheckPointer(_cgo2, src); return (_Cfunc_ZSTD_compress)(_cgo0, _cgo1, _cgo2, _cgo3, _cgo4);}( unsafe.Pointer(&dst[0]), _Ctype_size_t(len(dst)), unsafe.Pointer(&src[0]), _Ctype_size_t(len(src)), _Ctype_int(level))) ``` Here you can see that cgo does several things: - it performs the `_cgoCheckPointer(_cgo0, dst); _cgoCheckPointer(_cgo2, src);` to make sure that we follow https://golang.org/cmd/cgo/#hdr-Passing_pointers - it correctly have `_Cgo_use(unsafe.Pointer)` so the data is sane. Signed-off-by: Daniel Dao <dqminh89@gmail.com>
Go may garbage collect
The first condition may occur only starting from Go 1.12 where mid-stack inlining has been partially enabled or if The proper fix for the issue is to add diff --git a/zstd.go b/zstd.go
index da29d2e..19993e7 100644
--- a/zstd.go
+++ b/zstd.go
@@ -23,6 +23,7 @@ import (
"bytes"
"errors"
"io/ioutil"
+ "runtime"
"unsafe"
)
@@ -83,6 +84,8 @@ func CompressLevel(dst, src []byte, level int) ([]byte, error) {
srcPtr,
C.size_t(len(src)),
C.int(level))
+ // Prevent from GC'ing src during CGO call above.
+ runtime.KeepAlive(src)
written := int(cWritten)
// Check if the return is an Error code
@@ -106,6 +109,8 @@ func Decompress(dst, src []byte) ([]byte, error) {
C.size_t(len(dst)),
C.uintptr_t(uintptr(unsafe.Pointer(&src[0]))),
C.size_t(len(src)))
+ // Prevent from GC'ing src during CGO call above.
+ runtime.KeepAlive(src)
written := int(cWritten)
// Check error |
In Go 1.12, inlining got more aggressive so here it can now detect that src is not used and GC earlier. We need to force tell Go that we are still using the variable. See #54
Yes, adding KeepAlive manually should also work as we are really just passing normal bytes over. I will see if we can try the patch internally, or perhaps have a better replication for current issue. This should not even be related to Go 1.12+. We had the issue in Go 1.11 for example. The reason i didn't consider it before was that its a bit unorthodox for a lot of normal cgo use cases to use the pattern when all you need is just passing unsafe.Pointer. Doing it otherwise is error prone as we have seen here :'( And Compress/Decompress is not even in the hot path ( if it is probably you are doing it wrong ) so i dont actually care about the allocations. |
In Go 1.12, inlining got more aggressive so here it can now detect that src is not used and GC earlier. We need to force tell Go that we are still using the variable. See DataDog#54
This reverts commit e068f47.
Fix corruption with Compress. Previous change wasnt correct in a sense
that it didnt take into account the need to keep byte slice alive when
it goes into C call.
For example, with the previous code, this is how cgo looks like:
You can see that
_Cgo_use
is used to keep argument alive, but thesearguments are only uintptrs, not actual data/unsafe.Pointer, so there's no
guarantee that dst or src is sane during the call because uintptr
doesn't have pointer semantic.
Compared to passing only
unsafe.Pointer
Here you can see that cgo does several things:
it performs the
_cgoCheckPointer(_cgo0, dst); _cgoCheckPointer(_cgo2, src);
to make sure that we follow https://golang.org/cmd/cgo/#hdr-Passing_pointers
it correctly have
_Cgo_use(unsafe.Pointer)
so the data is sane.Signed-off-by: Daniel Dao dqminh89@gmail.com