Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added function for lazy message reading per #222 #224

Merged
merged 4 commits into from
Jul 3, 2015

Conversation

asergeyev
Copy link
Collaborator

No description provided.

@miekg
Copy link
Owner

miekg commented Jun 24, 2015

As I think is my normal review mode, i will go over each change in detail and then (hopefully) see the bigger picture and moan about that :-)

return nil, err
} else if n < 12 {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think 12 is better as a constant, headerSize ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to choose private vs making it public?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[ Quoting notifications@github.com in "Re: [dns] Added function for lazy m..." ]

     return nil, err
  • } else if n < 12 {

Any reason to choose private vs making it public?

Could as well be public.

@miekg
Copy link
Owner

miekg commented Jun 24, 2015

Looks pretty good. Actually hooking it up in the lib itself is another PR (which is really fine)

@asergeyev
Copy link
Collaborator Author

Should I continue in this one to reuse Msg.Unpack properly or finish this one and open new later?

@miekg
Copy link
Owner

miekg commented Jun 26, 2015

Fixing this up first and then a new PR is fine. Does this function need to be exported though?

@asergeyev
Copy link
Collaborator Author

Does this function need to be exported though?

Yeah... it's good function for convenience of those who write clients. Sometimes you only need packets with answers and without this function you'd go and parse all fields in authority/additional even if your client does not care about them.

@miekg
Copy link
Owner

miekg commented Jun 26, 2015

[ Quoting notifications@github.com in "Re: [dns] Added function for lazy m..." ]

Does this function need to be exported though?

Yeah... it's good function for convenience of those who write clients. Sometimes you only need packets with answers and without this function you'd go and parse all fields in authority/additional even if your client does not care about them.

Ok. Works for me.

/Miek

Miek Gieben

@asergeyev
Copy link
Collaborator Author

Well... Here is what I got...

I still followed with changing all we discussed in one PR, since there is not much. Logic is still same as it was in ReadMsg.

I'm not sure there is any good in pursuing creation of a function that's not reserving such large buffers... In TCP it's probably doable, but in UDP things must be loaded in one Read call; it's really not possible to change...

@miekg
Copy link
Owner

miekg commented Jun 29, 2015

[ Quoting notifications@github.com in "Re: [dns] Added function for lazy m..." ]

Well... Here is what I got...

I still followed with changing in one PR, since there is not much changed as of now. Logic is still same. I'm not sure there is any good in pursuing creation of a function that's not reserving such large buffers... In TCP it's probably doable, but in UDP things should be loaded in one read call so it's really not possible to change...

Ah yeah. I see what you mean... you can't do a partial read and then
read-some-more when you're ready for it... Part of the appeal of such
a function it that it would be nice if you could. Not sure how though...

@asergeyev
Copy link
Collaborator Author

Last patch shows how I see it could be done for TCP. 64K is very large buffer I agree. dns.Conn.Read becomes less useful here since it expects to read everything in one pre-allocated buffer in TCP.

//
// Note that this function would not be able to report TSIG error or
// check it got actual DNS payload.
func (co *Conn) ReadMsgBytes(hdr *Header) ([]byte, error) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So... What if we would really makes this only read 12 (or 14 in case of tcp) bytes and let it return the Conn, the bytes read and an error? Maybe the buffer should also by given to the function, so it is really without any internal allocations? So if you get an error you just discard and write FormErr (maybe from within the lib already). If thinks look OK we can parse the bytes to a real header and assemble the packet...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once you read from UDP rest is discarded. You could not really read again there so you end up buffering and faking ability to read which is slow.

http://golang.org/src/net/udpsock_posix.go?s=1547:1620#L53
http://golang.org/src/net/fd_unix.go#L256

then from rcvfrom(2):

All three routines return the length of the message on successful 
completion. If a message is too long to fit in the supplied buffer, 
excess bytes may be discarded depending on the type of socket 
the message is received from.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[ Quoting notifications@github.com in "Re: [dns] Added function for lazy m..." ]

+// Note that this function would not be able to report TSIG error or
+// check it got actual DNS payload.
+func (co *Conn) ReadMsgBytes(hdr *Header) ([]byte, error) {

Once you read from UDP rest is discarded. You could not really read again there so you end up buffering and faking ability to read which is slow.

http://golang.org/src/net/udpsock_posix.go?s=1547:1620#L53
http://golang.org/src/net/fd_unix.go#L256

then from rcvfrom(2):

All three routines return the length of the message on successful completion. If a message
is too long to fit in the supplied buffer, excess bytes may be discarded depending on the
type of socket the message is received from.

Grrrr :( So what would be the win then, this is the typical space/time trade
off? So the allocation of memory still happens, the only thing that you shortcut
is the potentially slowish parsing of the rest of the packet. I rather spend
cycles on making that faster, or fail ealier.

/Miek

Miek Gieben

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ReadMsgBytes -> ReadMsgHeader as a function name.

@asergeyev
Copy link
Collaborator Author

I'll experiment with profiler a bit and let you know. IMO net lib has limits by itself plus fact that dns lib built around strings... Allocs for message reading are probably not worst thing there....

@asergeyev
Copy link
Collaborator Author

I created new clientbench branch and put some test code there... As expected, UDP is close, almost no difference but TCP is faster in new implementation (not reserving 64K each time):

BenchmarkNewUDPClient      10000        342285 ns/op
BenchmarkOldUDPClient       5000        347112 ns/op
BenchmarkNewTCPClient      10000        132508 ns/op
BenchmarkOldTCPClient      10000        202532 ns/op

With pprof and web -cum BenchmarkName it looks that most CPU time was still on syscalls... and pack/unpack StructValue; I decided to stub all of those, wrote small fake net.Conn for that... (second commit in that branch)... I also changed UDPSize to 4096 and run it with benchtime=10s to make sure I'll have small CPU expenses visible (paranoid Alex).

Here is what it looks like for "web -cum ReadMsg" only on new client test:

 https://www.dropbox.com/s/3iw1o23ytnksvie/pprof0002.svg?dl=1

You can see there is almost no time spent where you think we can win something.

@miekg
Copy link
Owner

miekg commented Jul 2, 2015

[ Quoting notifications@github.com in "Re: [dns] Added function for lazy m..." ]

I created new clientbench branch and put some test code there... As expected, UDP is close, almost no difference but TCP is faster in new implementation (not reserving 64K each time):

BenchmarkNewUDPClient 10000 342285 ns/op
BenchmarkOldUDPClient 5000 347112 ns/op
BenchmarkNewTCPClient 10000 132508 ns/op
BenchmarkOldTCPClient 10000 202532 ns/op

With pprof and web -cum BenchmarkName it looks that most CPU time was still on syscalls... and pack/unpack StructValue; I decided to stub all of those, wrote small fake net.Conn for that... (second commit in that branch)... I also changed UDPSize to 4096 and run it with benchtime=10s to make sure I'll have small CPU expenses visible (paranoid Alex).

Here is what it looks like for "web -cum ReadMsg" only on new client test:

https://www.dropbox.com/s/3iw1o23ytnksvie/pprof0002.svg?dl=1

You can see there is almost no time spent where you think we can win something.

Ack. Yep, that points to something different.

@asergeyev
Copy link
Collaborator Author

But convinience that I was asking for in #222 is already there plus small bonus for TCP... Maybe you can merge this one? (I understand if not...)

as of performance, could be separate effort to look at different places but the need of reserving smallish structures with new(...) (or &Name{}) and work with strings is what could be hard to deal with...
And, since your lib is not just wire client but also works with text representation a lot I do not think you should change the approach.

@miekg
Copy link
Owner

miekg commented Jul 2, 2015

[ Quoting notifications@github.com in "Re: [dns] Added function for lazy m..." ]

But convinience that I was asking for in #222 is already there plus small bonus for TCP... Maybe you can merge this one? (I understand if not...)

Yes, I will. But I first want to take another look.

as of performance, could be separate effort to look at different places but the
need of reserving smallish structures with new(...) (or &Name{}) and work with
strings is what could be hard to deal with... And, since your lib is not just
wire client but also works with text representation a lot I do not think you
should change the approach.

No, indeed. Having just strings make it easy to work with.

/Miek

Miek Gieben

return m, err
}

// ReadMsgBytes reads DNS packet, parses and fills wire header (passing nil
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ReadMsgBytes reads a DNS message, parses it and populates hdr (when hdr is not nil)

}

// ReadMsgBytes reads DNS packet, parses and fills wire header (passing nil
// would cause skipping that). Returns message in byte format to parse with
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It return the rest of the message as a byte slice to pasrse with Msg.Unpack later on.

@asergeyev
Copy link
Collaborator Author

Thanks for careful review... I decided not to wait on your comments, I think I followed most of your recommendations here.

@miekg
Copy link
Owner

miekg commented Jul 3, 2015

Ack, two naming nits. But it fix those shortly.

miekg added a commit that referenced this pull request Jul 3, 2015
Added function for lazy message reading per #222
@miekg miekg merged commit 6a8b26e into miekg:master Jul 3, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants