Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove unsafe code for appengine #97

Closed
cristaloleg opened this issue Oct 15, 2018 · 11 comments
Closed

Remove unsafe code for appengine #97

cristaloleg opened this issue Oct 15, 2018 · 11 comments

Comments

@cristaloleg
Copy link
Collaborator

cristaloleg commented Oct 15, 2018

This function should be replaced with a safe versions and protected with a build tag:

func bytesToString(b []byte) string {
	return string(b)
}

See: #96

@lukechampine
Copy link
Contributor

I can take a stab at this. Just wondering, how critical is it to avoid the string conversion here? Is bytesToString being called frequently enough to warrant the optimization?

@ineiti
Copy link

ineiti commented Feb 13, 2020

@lukechampine - currently looking at all dependencies with unsafe from c4dt/byzcoin, and I'm also wondering if it's really worth it to use unsafe here?

@cristaloleg
Copy link
Collaborator Author

@ineiti well yes, but everything depends on a context.

AppEngine do not allow you to use anything from unsafe package, but when all your product is build on such platform, you do not have many options. And obvious one is to remove any unsafe things.

IMO: there is no sense to add // build !appengine and remove unsafe until you need it heavily. For BigCache it make sense due to wide usage and different customer's needs.

@janisz
Copy link
Collaborator

janisz commented Feb 13, 2020

I did some benchmarks

name                                               old time/op    new time/op    delta
WriteToCacheWith1Shard-4                              537ns ± 3%     523ns ± 4%   ~     (p=0.400 n=3+3)
WriteToLimitedCacheWithSmallInitSizeAnd1Shard-4       400ns ± 1%     401ns ± 0%   ~     (p=0.600 n=3+3)
WriteToUnlimitedCacheWithSmallInitSizeAnd1Shard-4     836ns ±11%     814ns ±12%   ~     (p=0.700 n=3+3)
WriteToCache/1-shards-4                               519ns ± 1%     503ns ± 1%   ~     (p=0.100 n=3+3)
WriteToCache/512-shards-4                             215ns ± 0%     214ns ± 2%   ~     (p=0.600 n=3+3)
WriteToCache/1024-shards-4                            217ns ± 2%     218ns ± 1%   ~     (p=0.700 n=3+3)
WriteToCache/8192-shards-4                            280ns ± 0%     277ns ± 1%   ~     (p=0.200 n=3+3)
ReadFromCache/1-shards-4                              241ns ± 2%     237ns ± 1%   ~     (p=0.500 n=3+3)
ReadFromCache/512-shards-4                            250ns ± 0%     241ns ± 2%   ~     (p=0.100 n=3+3)
ReadFromCache/1024-shards-4                           250ns ± 2%     240ns ± 1%   ~     (p=0.100 n=3+3)
ReadFromCache/8192-shards-4                           276ns ± 1%     264ns ± 1%   ~     (p=0.100 n=3+3)
ReadFromCacheWithInfo/1-shards-4                      250ns ± 1%     242ns ± 1%   ~     (p=0.100 n=3+3)
ReadFromCacheWithInfo/512-shards-4                    254ns ± 2%     254ns ± 1%   ~     (p=1.000 n=3+3)
ReadFromCacheWithInfo/1024-shards-4                   253ns ± 2%     251ns ± 1%   ~     (p=1.000 n=3+3)
ReadFromCacheWithInfo/8192-shards-4                   279ns ± 1%     275ns ± 1%   ~     (p=0.200 n=3+3)
IterateOverCache/512-shards-4                         159ns ± 1%     180ns ± 0%   ~     (p=0.100 n=3+3)
IterateOverCache/1024-shards-4                        159ns ± 0%     179ns ± 1%   ~     (p=0.100 n=3+3)
IterateOverCache/8192-shards-4                        155ns ± 0%     179ns ± 0%   ~     (p=0.100 n=3+3)
WriteToCacheWith1024ShardsAndSmallShardInitSize-4     320ns ± 8%     297ns ± 4%   ~     (p=0.200 n=3+3)
ReadFromCacheNonExistentKeys/1-shards-4               114ns ± 1%     111ns ± 1%   ~     (p=0.100 n=3+3)
ReadFromCacheNonExistentKeys/512-shards-4             112ns ± 0%     110ns ± 1%   ~     (p=0.100 n=3+3)
ReadFromCacheNonExistentKeys/1024-shards-4            113ns ± 3%     112ns ± 1%   ~     (p=1.000 n=3+3)
ReadFromCacheNonExistentKeys/8192-shards-4            113ns ± 2%     111ns ± 1%   ~     (p=0.300 n=3+3)
FnvHashSum64-4                                       6.51ns ± 0%    6.70ns ± 0%   ~     (p=0.100 n=3+3)
FnvHashStdLibSum64-4                                 40.0ns ± 0%    40.0ns ± 0%   ~     (p=0.600 n=3+3)

name                                               old alloc/op   new alloc/op   delta
WriteToCacheWith1Shard-4                               602B ± 0%      602B ± 0%   ~     (p=1.000 n=3+3)
WriteToLimitedCacheWithSmallInitSizeAnd1Shard-4       41.0B ± 0%     41.0B ± 0%   ~     (all equal)
WriteToUnlimitedCacheWithSmallInitSizeAnd1Shard-4    3.94kB ± 7%    3.90kB ± 7%   ~     (p=0.400 n=3+3)
WriteToCache/1-shards-4                                602B ± 0%      600B ± 0%   ~     (p=0.100 n=3+3)
WriteToCache/512-shards-4                              597B ± 0%      596B ± 0%   ~     (p=0.700 n=3+3)
WriteToCache/1024-shards-4                             596B ± 0%      597B ± 0%   ~     (p=1.000 n=3+3)
WriteToCache/8192-shards-4                             628B ± 2%      619B ± 1%   ~     (p=0.300 n=3+3)
ReadFromCache/1-shards-4                               271B ± 0%      279B ± 0%   ~     (p=0.100 n=3+3)
ReadFromCache/512-shards-4                             271B ± 0%      279B ± 0%   ~     (p=0.100 n=3+3)
ReadFromCache/1024-shards-4                            271B ± 0%      279B ± 0%   ~     (p=0.100 n=3+3)
ReadFromCache/8192-shards-4                            271B ± 0%      279B ± 0%   ~     (p=0.100 n=3+3)
ReadFromCacheWithInfo/1-shards-4                       271B ± 0%      279B ± 0%   ~     (p=0.100 n=3+3)
ReadFromCacheWithInfo/512-shards-4                     271B ± 0%      279B ± 0%   ~     (p=0.100 n=3+3)
ReadFromCacheWithInfo/1024-shards-4                    271B ± 0%      279B ± 0%   ~     (p=0.100 n=3+3)
ReadFromCacheWithInfo/8192-shards-4                    271B ± 0%      279B ± 0%   ~     (p=0.100 n=3+3)
IterateOverCache/512-shards-4                         20.0B ± 0%     36.0B ± 0%   ~     (p=0.100 n=3+3)
IterateOverCache/1024-shards-4                        20.0B ± 0%     36.0B ± 0%   ~     (p=0.100 n=3+3)
IterateOverCache/8192-shards-4                        20.0B ± 0%     36.0B ± 0%   ~     (p=0.100 n=3+3)
WriteToCacheWith1024ShardsAndSmallShardInitSize-4    1.07kB ±24%    0.89kB ±14%   ~     (p=0.400 n=3+3)
ReadFromCacheNonExistentKeys/1-shards-4               7.00B ± 0%     7.00B ± 0%   ~     (all equal)
ReadFromCacheNonExistentKeys/512-shards-4             7.00B ± 0%     7.00B ± 0%   ~     (all equal)
ReadFromCacheNonExistentKeys/1024-shards-4            7.00B ± 0%     7.00B ± 0%   ~     (all equal)
ReadFromCacheNonExistentKeys/8192-shards-4            7.00B ± 0%     7.00B ± 0%   ~     (all equal)
FnvHashSum64-4                                        0.00B          0.00B        ~     (all equal)
FnvHashStdLibSum64-4                                  16.0B ± 0%     16.0B ± 0%   ~     (all equal)

name                                               old allocs/op  new allocs/op  delta
WriteToCacheWith1Shard-4                               3.00 ± 0%      3.00 ± 0%   ~     (all equal)
WriteToLimitedCacheWithSmallInitSizeAnd1Shard-4        3.00 ± 0%      3.00 ± 0%   ~     (all equal)
WriteToUnlimitedCacheWithSmallInitSizeAnd1Shard-4      2.00 ± 0%      2.00 ± 0%   ~     (all equal)
WriteToCache/1-shards-4                                3.00 ± 0%      3.00 ± 0%   ~     (all equal)
WriteToCache/512-shards-4                              3.00 ± 0%      3.00 ± 0%   ~     (all equal)
WriteToCache/1024-shards-4                             3.00 ± 0%      3.00 ± 0%   ~     (all equal)
WriteToCache/8192-shards-4                             3.00 ± 0%      3.00 ± 0%   ~     (all equal)
ReadFromCache/1-shards-4                               2.00 ± 0%      3.00 ± 0%   ~     (p=0.100 n=3+3)
ReadFromCache/512-shards-4                             2.00 ± 0%      3.00 ± 0%   ~     (p=0.100 n=3+3)
ReadFromCache/1024-shards-4                            2.00 ± 0%      3.00 ± 0%   ~     (p=0.100 n=3+3)
ReadFromCache/8192-shards-4                            2.00 ± 0%      3.00 ± 0%   ~     (p=0.100 n=3+3)
ReadFromCacheWithInfo/1-shards-4                       2.00 ± 0%      3.00 ± 0%   ~     (p=0.100 n=3+3)
ReadFromCacheWithInfo/512-shards-4                     2.00 ± 0%      3.00 ± 0%   ~     (p=0.100 n=3+3)
ReadFromCacheWithInfo/1024-shards-4                    2.00 ± 0%      3.00 ± 0%   ~     (p=0.100 n=3+3)
ReadFromCacheWithInfo/8192-shards-4                    2.00 ± 0%      3.00 ± 0%   ~     (p=0.100 n=3+3)
IterateOverCache/512-shards-4                          2.00 ± 0%      3.00 ± 0%   ~     (p=0.100 n=3+3)
IterateOverCache/1024-shards-4                         2.00 ± 0%      3.00 ± 0%   ~     (p=0.100 n=3+3)
IterateOverCache/8192-shards-4                         2.00 ± 0%      3.00 ± 0%   ~     (p=0.100 n=3+3)
WriteToCacheWith1024ShardsAndSmallShardInitSize-4      3.00 ± 0%      3.00 ± 0%   ~     (all equal)
ReadFromCacheNonExistentKeys/1-shards-4                0.00           0.00        ~     (all equal)
ReadFromCacheNonExistentKeys/512-shards-4              0.00           0.00        ~     (all equal)
ReadFromCacheNonExistentKeys/1024-shards-4             0.00           0.00        ~     (all equal)
ReadFromCacheNonExistentKeys/8192-shards-4             0.00           0.00        ~     (all equal)
FnvHashSum64-4                                         0.00           0.00        ~     (all equal)
FnvHashStdLibSum64-4                                   2.00 ± 0%      2.00 ± 0%   ~     (all equal)

@cristaloleg
Copy link
Collaborator Author

@janisz what exactly do you compare? :)

@janisz
Copy link
Collaborator

janisz commented Feb 13, 2020

go test -bench=. -benchmem  ./... -count 3 -cpu 4 > new.txt && \
git rh && \
go test -bench=. -benchmem  ./... -count 3 -cpu 4 > old.txt && \
benchstat old.txt new.txt
diff --git a/bytes.go b/bytes.go
deleted file mode 100644
index 3944bfe..0000000
--- a/bytes.go
+++ /dev/null
@@ -1,14 +0,0 @@
-// +build !appengine
-
-package bigcache
-
-import (
-       "reflect"
-       "unsafe"
-)
-
-func bytesToString(b []byte) string {
-       bytesHeader := (*reflect.SliceHeader)(unsafe.Pointer(&b))
-       strHeader := reflect.StringHeader{Data: bytesHeader.Data, Len: bytesHeader.Len}
-       return *(*string)(unsafe.Pointer(&strHeader))
-}
diff --git a/bytes_appengine.go b/bytes_appengine.go
index 3892f3b..6718f6b 100644
--- a/bytes_appengine.go
+++ b/bytes_appengine.go
@@ -1,5 +1,3 @@
-// +build appengine
-
 package bigcache
 
 func bytesToString(b []byte) string {

@ineiti
Copy link

ineiti commented Feb 13, 2020

So I guess your diff misses the return string(b)...

Anyway, looking at the times and memory allocated, it seems that the unsafe bytesToString is not really worth it.

Funny thing is that https://github.com/prataprc/goparsec/blob/master/scanner.go#L228 is doing nearly the same. Was there an old version of go that misbehaved when you did a string(b)?

@janisz
Copy link
Collaborator

janisz commented Feb 13, 2020

package bigcache_test

import (
	"bytes"
	"reflect"
	"testing"
	"unsafe"
)

func bytesToString(b []byte) string {
	return string(b)
}

func bytesToStringUnsafe(b []byte) string {
	bytesHeader := (*reflect.SliceHeader)(unsafe.Pointer(&b))
	strHeader := reflect.StringHeader{Data: bytesHeader.Data, Len: bytesHeader.Len}
	return *(*string)(unsafe.Pointer(&strHeader))
}

var a = bytes.Repeat([]byte("abcd"), 1024)
var sink = ""

func BenchmarkBytesToString(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sink = bytesToString(a)
	}
}

func BenchmarkBytesToStringUnsafe(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sink = bytesToStringUnsafe(a)
	}
}
# go test -bench="BenchmarkBytesToString" -benchmem x_test.go
goos: linux
goarch: amd64
BenchmarkBytesToString-8         	 1074472	       977 ns/op	    4096 B/op	       1 allocs/op
BenchmarkBytesToStringUnsafe-8   	546898618	         2.06 ns/op	       0 B/op	       0 allocs/op
PASS
ok  	command-line-arguments	3.296s

@cristaloleg
Copy link
Collaborator Author

^^^ it does make sense, 'cause converting []byte to string must create an immutable object (a string). And with a current Go memory model there is no other way to do this without an allocation.

Converting []byte to string via unsafe, via direct memory addressing solve this. It's possible that we do not have such place, where we convert bytes to string and that's why we do not see any changes in benchmarks.

Probably we can get rid of unsafe at all, wdyt @janisz ?

@janisz
Copy link
Collaborator

janisz commented Feb 13, 2020

We need more benchmarks, The only usage of bytesToString is in readKeyFromEntry. We need to investigate the impact. In current benchmarks we can observe additional allocation when not using unsafe. I did some benchmarks for older versions as well

name \ time/op       go1.10.8     go1.11.13    go1.12.16    go1.13.7     go1.14rc     go1.5.4      go1.6.4      go1.7.6      go1.8.7      go1.9.7
goos:linux goarch:amd64
BytesToString         757ns ± 2%   750ns ± 1%   855ns ± 2%   540ns ± 2%   510ns ± 1%                                                       641ns ± 2%
BytesToStringUnsafe  2.05ns ± 0%  2.05ns ± 0%  2.05ns ± 0%  2.09ns ± 2%  2.05ns ± 0%                                                      2.30ns ± 0%

BytesToString                                                                          363ns ± 1%   638ns ±12%   352ns ± 7%   665ns ± 6%
BytesToStringUnsafe                                                                   3.07ns ± 0%  3.08ns ± 0%  2.57ns ± 1%  2.57ns ± 1%

name \ alloc/op      go1.10.8     go1.11.13    go1.12.16    go1.13.7     go1.14rc     go1.5.4      go1.6.4      go1.7.6      go1.8.7      go1.9.7
goos:linux goarch:amd64
BytesToString        4.10kB ± 0%  4.10kB ± 0%  4.10kB ± 0%  4.10kB ± 0%  4.10kB ± 0%                                                      4.10kB ± 0%
BytesToStringUnsafe   0.00B        0.00B        0.00B        0.00B        0.00B                                                            0.00B     

BytesToString                                                                         4.10kB ± 0%  4.10kB ± 0%  4.10kB ± 0%  4.10kB ± 0%
BytesToStringUnsafe                                                                    0.00B        0.00B        0.00B        0.00B     

name \ allocs/op     go1.10.8     go1.11.13    go1.12.16    go1.13.7     go1.14rc     go1.5.4      go1.6.4      go1.7.6      go1.8.7      go1.9.7
goos:linux goarch:amd64
BytesToString          1.00 ± 0%    1.00 ± 0%    1.00 ± 0%    1.00 ± 0%    1.00 ± 0%                                                        1.00 ± 0%
BytesToStringUnsafe    0.00         0.00         0.00         0.00         0.00                                                             0.00     

BytesToString                                                                           1.00 ± 0%    1.00 ± 0%    1.00 ± 0%    1.00 ± 0%
BytesToStringUnsafe                                                                     0.00         0.00         0.00         0.00   

And unsafe time is constant across the versions while safe differs from 352ns ± 7% (1.9.7) to 855ns ± 2% (1.12.16) so I prefer to stay with unsafe to reduce the risk of this having impact on our codebase under different go versions.

I still do not understand why this have so minimal impact on our benchmarks

@siennathesane
Copy link
Collaborator

I prefer to stay with unsafe to reduce the risk of this having impact on our codebase under different go versions.

I think this is the right methodology, constant performance is the right approach.

I still do not understand why this have so minimal impact on our benchmarks

Do share your results here when you find it, I am curious.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants