Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reuse buffers when marshaling messages in unary Connect protocol #561

Merged
merged 1 commit into from
Aug 13, 2023

Conversation

jhump
Copy link
Member

@jhump jhump commented Aug 11, 2023

The marshalAppender interface was previously only used in the envelope writer. That means it was already reusing buffers from the pool for marshaling messages in all other protocols (Connect streaming, gRPC, and gRPC-Web); it was just missing from here.

The envelope writer logic is much more complicated, where it tries to use a copy to save the allocation of a bytes.Buffer struct to wrap the []byte. I'm curious if that's been benchmarked and how big of a difference it really makes. I did not add the same logic here, but I can if we really think it's important. Do let me know!

I was also looking through the implementation of bytes.Buffer.Write, which under the does a copy between bytes slices. And in the case where marshalling did not have to increase the buffer capacity, the src and dest slices will actually be the same. So I dug into the runtime source code to see how it handles overlapping regions, hoping that maybe it's smart enough to also no-op when src == dest. It is not that smart. So it does indeed perform a copy of bytes even though it's effectively a no-op 馃し ..

@jhump jhump requested a review from akshayjshah August 11, 2023 02:54
@CLAassistant
Copy link

CLAassistant commented Aug 11, 2023

CLA assistant check
All committers have signed the CLA.

@emcfarlane
Copy link
Contributor

copy(dst, src) looks like a noop. Ran a quick benchmark:

package main

import (
	"crypto/rand"
	"testing"
)

var slice []byte

func BenchmarkCopySame(b *testing.B) {
	b.ReportAllocs()
	size := 1024
	slice = make([]byte, size*size*size)
	rand.Read(slice)
	chunkSize := size * size
	b.Run("CopySame", func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			chunk := slice[i : chunkSize+i]
			copy(slice[i:], chunk)
		}
	})
	b.Run("CopyDiff", func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			chunk := slice[i+2 : chunkSize+i+2]
			copy(slice[i:], chunk)
		}
	})
}
BenchmarkCopySame/CopySame-8            1000000000               0.5857 ns/op          0 B/op          0 allocs/op
BenchmarkCopySame/CopyDiff-8               64106             18709 ns/op               0 B/op          0 allocs/op

@jhump
Copy link
Member Author

jhump commented Aug 11, 2023

@emcfarlane, I suspect that's on your laptop (an Apple silicon Mac, right?). I was looking at the impl for AMD64. (For these low-level things, it's all assembly.)

We'd need to run the benchmark on an x86 machine in the cloud to verify.

@emcfarlane
Copy link
Contributor

Ah right, yeah be good to check. I didn't look at the source. And yep apple silicon: goos: darwin, goarch: arm64.

@akshayjshah
Copy link
Member

We can always include the benchmark in this PR (temporarily) - it's all running on x86 on Github Actions, so we can easily look at the output. We'd need to add a line to the make test definition or the CI github action config temporarily, but that's easy enough.

Copy link
Member

@akshayjshah akshayjshah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! Nice catch, Josh. Any further optimizations can (I think) be follow-on PRs.

@jhump jhump merged commit 04dfeb4 into main Aug 13, 2023
6 checks passed
@jhump jhump deleted the jh/re-use-buffers-for-unary-connect branch August 13, 2023 13:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants