Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize hexadecimal decoding #568

Merged
merged 5 commits into from
Jul 21, 2021
Merged

Optimize hexadecimal decoding #568

merged 5 commits into from
Jul 21, 2021

Conversation

kkHAIKE
Copy link
Contributor

@kkHAIKE kkHAIKE commented Jul 13, 2021

simple optimize hexToString..
hexToRune is copy from encoding/json

benchmark

func Benchmark_hexToString(b *testing.B) {
	for i := 0; i < b.N; i++ {
		hexToString([]byte("0001F680"), 8)
	}
}
goos: darwin
goarch: arm64
pkg: tstgo
Benchmark_hexToString
Benchmark_hexToString-8    	30278191	        39.64 ns/op	      16 B/op	       2 allocs/op
Benchmark_hexToString2
Benchmark_hexToString2-8   	134665851	         8.828 ns/op	       0 B/op	       0 allocs/op

@kkHAIKE kkHAIKE changed the base branch from master to v2 July 13, 2021 12:21
Copy link
Owner

@pelletier pelletier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your help on go-toml!

Out of curiosity, do you have benchmarks for this function? The existing benchmarks show no time improvement, but I suspect parsing hex strings represents a minority of the time in those.

Before merging this I'd love to see a better-or-equal execution time and allocations benchmark (I'm fine with equal, just want to check for regressions), as well as handling non-hex characters correctly.

Thank you!

parser.go Outdated Show resolved Hide resolved
@kkHAIKE
Copy link
Contributor Author

kkHAIKE commented Jul 21, 2021

sorry, i forgot to name it clearly. hexToString2 test is new function, 8.828 ns/op, 4.5x faster than old one...

@pelletier
Copy link
Owner

Doh! I'm sorry, I somehow didn't see the benchmark in the PR description.

Pushed a commit to benchmark using parser.parseBasicString instead of hexToString, as it should be closer to what the user experiences. This made me realize that this function was only used in basic (multiline) string parsing, with its result always appended to a string buffer. As a result I changed the signature to return the rune directly, which removes some allocation. Then I inline hexToRune to return an error directly from the loop to scope the returned error to the specific invalid character, which should help the user see where the problem is.


Benchmarks:

Baseline vs new hex2string implementation

name                           old time/op    new time/op     delta
ParseBasicStringWithUnicode/4     413ns ± 0%      259ns ± 0%  -37.37%  (p=0.000 n=9+9)
ParseBasicStringWithUnicode/8     353ns ± 0%      246ns ± 0%  -30.35%  (p=0.000 n=10+10)

name                           old speed      new speed       delta
ParseBasicStringWithUnicode/4  91.9MB/s ± 0%  146.8MB/s ± 0%  +59.65%  (p=0.000 n=9+9)
ParseBasicStringWithUnicode/8   119MB/s ± 0%    171MB/s ± 0%  +43.57%  (p=0.000 n=10+10)

name                           old alloc/op   new alloc/op    delta
ParseBasicStringWithUnicode/4      112B ± 0%        88B ± 0%  -21.43%  (p=0.000 n=10+10)
ParseBasicStringWithUnicode/8     96.0B ± 0%      80.0B ± 0%  -16.67%  (p=0.000 n=10+10)

name                           old allocs/op  new allocs/op   delta
ParseBasicStringWithUnicode/4      13.0 ± 0%        7.0 ± 0%  -46.15%  (p=0.000 n=10+10)
ParseBasicStringWithUnicode/8      9.00 ± 0%       5.00 ± 0%  -44.44%  (p=0.000 n=10+10)

Baseline vs direct hex2rune call

name                           old time/op    new time/op     delta
ParseBasicStringWithUnicode/4     413ns ± 0%      146ns ± 0%   -64.62%  (p=0.000 n=9+10)
ParseBasicStringWithUnicode/8     353ns ± 0%      172ns ± 0%   -51.35%  (p=0.000 n=10+9)

name                           old speed      new speed       delta
ParseBasicStringWithUnicode/4  91.9MB/s ± 0%  259.8MB/s ± 0%  +182.67%  (p=0.000 n=9+10)
ParseBasicStringWithUnicode/8   119MB/s ± 0%    244MB/s ± 0%  +105.52%  (p=0.000 n=10+9)

name                           old alloc/op   new alloc/op    delta
ParseBasicStringWithUnicode/4      112B ± 0%        64B ± 0%   -42.86%  (p=0.000 n=10+10)
ParseBasicStringWithUnicode/8     96.0B ± 0%      64.0B ± 0%   -33.33%  (p=0.000 n=10+10)

name                           old allocs/op  new allocs/op   delta
ParseBasicStringWithUnicode/4      13.0 ± 0%        1.0 ± 0%   -92.31%  (p=0.000 n=10+10)
ParseBasicStringWithUnicode/8      9.00 ± 0%       1.00 ± 0%   -88.89%  (p=0.000 n=10+10)

Side by side

name \ time/op                 baseline.txt   branch.txt      branch-2.txt
ParseBasicStringWithUnicode/4     413ns ± 0%      259ns ± 0%      146ns ± 0%
ParseBasicStringWithUnicode/8     353ns ± 0%      246ns ± 0%      172ns ± 0%

name \ speed                   baseline.txt   branch.txt      branch-2.txt
ParseBasicStringWithUnicode/4  91.9MB/s ± 0%  146.8MB/s ± 0%  259.8MB/s ± 0%
ParseBasicStringWithUnicode/8   119MB/s ± 0%    171MB/s ± 0%    244MB/s ± 0%

name \ alloc/op                baseline.txt   branch.txt      branch-2.txt
ParseBasicStringWithUnicode/4      112B ± 0%        88B ± 0%        64B ± 0%
ParseBasicStringWithUnicode/8     96.0B ± 0%      80.0B ± 0%      64.0B ± 0%

name \ allocs/op               baseline.txt   branch.txt      branch-2.txt
ParseBasicStringWithUnicode/4      13.0 ± 0%        7.0 ± 0%        1.0 ± 0%
ParseBasicStringWithUnicode/8      9.00 ± 0%       5.00 ± 0%       1.00 ± 0%

@pelletier
Copy link
Owner

I'm going to merge this as is given the significant improvement. Thank you so much for your patch and your help in improving go-toml!

@pelletier pelletier merged commit a93b34d into pelletier:v2 Jul 21, 2021
@kkHAIKE kkHAIKE deleted the hexToString branch July 21, 2021 09:03
@pelletier pelletier added the performance Issue related to a performance problem or pull request improving performance. label Oct 28, 2021
@pelletier pelletier changed the title hexToString optimization Optimize hexadecimal decoding Oct 28, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Issue related to a performance problem or pull request improving performance.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants