Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

may avoid unnecessary data copy in processMsg for large message payloads #787

Closed
ninjken opened this issue Aug 4, 2021 · 3 comments
Closed
Assignees

Comments

@ninjken
Copy link
Contributor

ninjken commented Aug 4, 2021

In function func (nc *Conn) processMsg(data []byte) there's a snippet of code

	// FIXME(dlc): Need to copy, should/can do COW?
	msgPayload := make([]byte, len(data))
	copy(msgPayload, data)

It appears that every time we see a new message coming in, the payload gets copied into a new buffer. This is reasonable for small payloads since we are reusing the scratch buffer scratch [MAX_CONTROL_LINE_SIZE]byte. But for large payloads a new buffer area sized as that of the actual message payload has already been created by invoking make([]byte)

		// If we will overflow the scratch buffer, just create a
		// new buffer to hold the split message.
		if nc.ps.ma.size > cap(nc.ps.scratch)-len(nc.ps.argBuf) {
			lrem := len(buf[nc.ps.as:])
			nc.ps.msgBuf = make([]byte, lrem, nc.ps.ma.size)
			copy(nc.ps.msgBuf, buf[nc.ps.as:])
		} else {
			nc.ps.msgBuf = nc.ps.scratch[len(nc.ps.argBuf):len(nc.ps.argBuf)]
			nc.ps.msgBuf = append(nc.ps.msgBuf, (buf[nc.ps.as:])...)
		}

Here I think we may avoid this unnecessary copy for large message payloads within processMsg(data []byte) by introducting a bool flag, say largeMsgBuf, indicating that we don't need to copy the message payload a second time

type parseState struct {
    ...
	msgBuf      []byte
	largeMsgBuf bool
	scratch     [MAX_CONTROL_LINE_SIZE]byte
}
func (nc *Conn) processMsg(data []byte)
    ...
	// FIXME(dlc): Need to copy, should/can do COW?
	var msgPayload = data
	if !nc.ps.largeMsgBuf {
		msgPayload = make([]byte, len(data))
		copy(msgPayload, data)
	}

I've done some preliminary benchmark tests locally. The result showed that this fix could gain approximately 5%~6% performance boost for large message mayloads sized at 5K.
How does this small fix look here?

@kozlovic
Copy link
Member

kozlovic commented Aug 4, 2021

I think this could work indeed. I would not call the boolean largeMsgBuf since actually any time we make a copy of the message buffer in the parser, processMsg() should not have to copy it. So copied or something like that may be better. For instance, it could be that the message arguments (PUB ...) takes quite a bit of the scratch buffer and the message payload itself is not that big, but does not fit into the scratch buffer.

Also, you are not showing where the new boolean is set, but I would think that you need to set it in this "if" statement:

if nc.ps.ma.size > cap(nc.ps.scratch)-len(nc.ps.argBuf) {
   nc.ps.copied = true
...

Also, need to reset this boolean to false after processing the message in the parser code.

Do you want to submit a PR?

@ninjken
Copy link
Contributor Author

ninjken commented Aug 5, 2021

Yes. PR opened for current issue, please review.

[CHANGED] avoid unnecessary data copy in processMsg #788

@kozlovic
Copy link
Member

kozlovic commented Aug 9, 2021

Closing this since it is addressed in #788

@kozlovic kozlovic closed this as completed Aug 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants