Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

quote: Convert []byte to string using unsafe to avoid realloc #75

Merged
merged 1 commit into from Oct 18, 2022
Merged

quote: Convert []byte to string using unsafe to avoid realloc #75

merged 1 commit into from Oct 18, 2022

Conversation

eNV25
Copy link
Contributor

@eNV25 eNV25 commented Aug 8, 2022

No description provided.

@eNV25 eNV25 marked this pull request as draft August 8, 2022 09:13
@eNV25 eNV25 marked this pull request as ready for review August 8, 2022 09:16
@eNV25
Copy link
Contributor Author

eNV25 commented Aug 9, 2022

@fhs I was wrong about the Unicode part in the previous PR, but after looking at the compiled binary, I saw the buffer was being reallocated into a string.

This uses unsafe.Pointer to convert a buffer to a string. This is the same thing that strings.Builder.String does.

@fhs
Copy link
Owner

fhs commented Oct 16, 2022

I don't like using unsafe. Let's use strings.Builder if we need to. We can do what you did in #72 but iterate on bytes instead of runes. I'm guessing iterating over bytes is faster, but it's hard to be certain without benchmarks. It'd also be nice to see the allocation change from the benchmark.

@eNV25
Copy link
Contributor Author

eNV25 commented Oct 17, 2022

I don't like using unsafe. Let's use strings.Builder if we need to.

When I tried it out and checked the assembly output, it looked like operations weren't inlined. I think buffer+unsafe is a cleaner approach.

If you're concerned about the safety, this method to convert []byte to string is the same used by strings.Builder.String in the stdlib. I think it's fine.

benchmarks

Good idea. I wrote a quick PR #77. I'll look into results later.

@fhs
Copy link
Owner

fhs commented Oct 17, 2022

If you're concerned about the safety, this method to convert []byte to string is the same used by strings.Builder.String in the stdlib. I think it's fine.

I'm more concerned that we're assuming that memory layout of []byte and string are such that you can just cast the pointer. The standard library can make such assumption because it ships with the compiler, but we cannot. I don't think the Go spec specifies the memory layout of slices and strings. Go compiler can decide to change the layout in the future or we may be using a different Go compiler (e.g. gccgo) that has a different layout.

@eNV25
Copy link
Contributor Author

eNV25 commented Oct 17, 2022

That's true, it should be much easier when unafe.{Slice,String}{,Data} ships. golang/go#53003

@eNV25
Copy link
Contributor Author

eNV25 commented Oct 17, 2022

Difference, without using strings.Builder

$ benchstat bench-old.txt bench-new.txt
name     old time/op    new time/op    delta
Quote-4     130ns ± 5%      86ns ±10%  -33.67%  (p=0.000 n=10+10)

name     old alloc/op   new alloc/op   delta
Quote-4     96.0B ± 0%     64.0B ± 0%  -33.33%  (p=0.000 n=10+10)

name     old allocs/op  new allocs/op  delta
Quote-4      2.00 ± 0%      1.00 ± 0%  -50.00%  (p=0.000 n=10+10)

@eNV25
Copy link
Contributor Author

eNV25 commented Oct 17, 2022

Difference, using strings.Builder

diff --git a/mpd/client.go b/mpd/client.go
index addfcc7..5430d5c 100644
--- a/mpd/client.go
+++ b/mpd/client.go
@@ -21,19 +21,19 @@ import (
 // NB: this function shouldn't be used on the PROTOCOL LEVEL because it considers single quotes special chars and
 // escapes them.
 func quote(s string) string {
-	q := make([]byte, 2+2*len(s))
-	i := 0
-	q[i], i = '"', i+1
+	var q strings.Builder
+	q.Grow(2 + 2*len(s))
+	q.WriteByte('"')
 	for _, c := range []byte(s) {
 		// We need to escape single/double quotes and a backslash by prepending them with a '\'
 		switch c {
 		case '"', '\\', '\'':
-			q[i], i = '\\', i+1
+			q.WriteByte('\\')
 		}
-		q[i], i = c, i+1
+		q.WriteByte(c)
 	}
-	q[i], i = '"', i+1
-	return unsafe_String(q[:i:i])
+	q.WriteByte('"')
+	return q.String()
 }
 
 // Quote quotes each string of args in the format understood by MPD.
$ benchstat bench-old.txt bench-new-builder.txt
name     old time/op    new time/op    delta
Quote-4     130ns ± 5%     135ns ± 2%   +4.14%  (p=0.001 n=10+10)

name     old alloc/op   new alloc/op   delta
Quote-4     96.0B ± 0%     64.0B ± 0%  -33.33%  (p=0.000 n=10+10)

name     old allocs/op  new allocs/op  delta
Quote-4      2.00 ± 0%      1.00 ± 0%  -50.00%  (p=0.000 n=10+10)

@eNV25
Copy link
Contributor Author

eNV25 commented Oct 17, 2022

Looks like using both have equal improvements in memory usage.

The strings.Builder version performs a bit worse than the original, but it improves memory usage.

The unsafe version improves both performance and memory, usage.

The version using strings.Builder is probably better, even if it worsens performance by a bit. The only problem is that mpd.quote doesn't inline any more.

@eNV25
Copy link
Contributor Author

eNV25 commented Oct 17, 2022

@fhs What do you think?

@fhs
Copy link
Owner

fhs commented Oct 17, 2022

Let's use the version using strings.Builder and add a note to use unafe.{Slice,String}{,Data} when it's available.

Copy link
Owner

@fhs fhs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@fhs fhs merged commit c182d63 into fhs:master Oct 18, 2022
@eNV25 eNV25 deleted the quote branch October 18, 2022 18:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants