handle receiving duplicate packets #69
c93be00 to cf7eda1
Patrick, Thank you for taking the time to prepare a patch. I do want to note for further discussion that this commit fundamentally Here are a few comments on the existing design of this function I need to understand the implications a little better here though. On Sat, Jul 9, 2016 at 7:32 PM, Patrick Hemmer notifications@github.com
Whitham D. Reeve II |
Alright, I have verified
It would be a good idea to write a test for this scenario, request On Sun, Jul 10, 2016 at 12:57 AM, Whitham Reeve thetawaves@gmail.com
Whitham D. Reeve II |
Oh, and I just realized that the timeout retry behavior in the issue you linked is another cause of this issue. |
I suppose on Windows that is the expected behavior. Though not on Unix-like Also, changing the buffer size to 64k does not change the problem. It On Sun, Jul 10, 2016 at 10:06 AM, Patrick Hemmer notifications@github.com
Whitham D. Reeve II |
Not true.
|
I just wrote a quick benchmark with the idea: Master branch:
64kb buffer:
|
I've added a commit to use a 64kb read buffer. |
26be96e to 07893db
I rather like this commit. Due diligence requires me to understand why this code was implemented in and then: I'm guessing that the second commit supersedes any benefits provided by the On Sun, Jul 10, 2016 at 8:11 PM, Patrick Hemmer notifications@github.com
Whitham D. Reeve II |
It looks like both of those commits were trying to address what is described at the bottom of https://blog.golang.org/go-slices-usage-and-internals, in the section titled A possible "gotcha". The old code (removed in b84d786) allocated 64kb each time something was received. If a reference to that allocated slice was maintained, it could sit in memory until all references were released. So b84d786 tried to solve the issue by allocating less memory. But if thousands of references were held, that would still add up to a lot of memory. So it looks like 3e6fc99 then tried to fix the issue by copying into a slice of exactly the size needed, which allows the original over-sized allocation to be released. This PR changes the approach entirely: instead of performing an allocation every time we read something, we re-use the same buffer over and over. We still do an allocation of the exact size needed to hold the response, and if thousands of those are held, it could add up. But that would be the caller's fault for holding them, not gosnmp's (at least it shouldn't be; I don't think there's anywhere that maintains a permanent reference to the received response). Edit: I amended the commit to add a description, given the controversial history. |
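To make the slice "gotcha" described above concrete, here is a minimal sketch. The function names `readReslice` and `readCopy` are made up for illustration and are not gosnmp code; a `copy` from `payload` stands in for the network read.

```go
package main

import "fmt"

// readReslice mimics the older pattern: allocate a large buffer per read and
// return a sub-slice of it. The returned slice keeps the whole 64 KiB backing
// array reachable for as long as the caller holds on to it.
func readReslice(payload []byte) []byte {
	buf := make([]byte, 64*1024)
	n := copy(buf, payload) // stands in for conn.Read(buf)
	return buf[:n]
}

// readCopy returns a copy sized to exactly the bytes read, so the large
// buffer becomes unreachable as soon as the function returns.
func readCopy(payload []byte) []byte {
	buf := make([]byte, 64*1024)
	n := copy(buf, payload) // stands in for conn.Read(buf)
	out := make([]byte, n)
	copy(out, buf[:n])
	return out
}

func main() {
	msg := []byte("hello")
	a := readReslice(msg)
	b := readCopy(msg)
	fmt.Println(len(a), cap(a)) // 5 65536
	fmt.Println(len(b), cap(b)) // 5 5
}
```

The capacity is the giveaway: both slices hold five bytes, but the re-sliced one still pins the full 64 KiB allocation.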
This commit changes the `receive()` method to use a static 64k buffer. This alleviates the issue where, if we try to read a UDP message that is larger than the buffer, the message is truncated. 64k is the maximum size of an IP packet (even when fragmented), so that is the size of our buffer. We then persist the buffer between `receive()` calls by placing it on the `GoSNMP` struct, so that the garbage collector is not constantly having to clean up a new buffer for every call. In the future, if the Go compiler's escape analysis is smart enough to figure out that the buffer doesn't escape the function, we could declare the buffer within `receive()` and let it sit on the stack. But for the moment escape analysis is not smart enough to do this.
Hello @phemmer, you got the history of it right. As a user of the library, I encountered this nasty bug where memory would not be released, in a non-obvious way. The real solution was to copy the slice to force the runtime to deallocate this unreferenced memory (I think a comment explaining this needs to remain next to the code, as that was definitely a gotcha). The least I could do would be to test out your patch against my code and see if memory usage stays reasonable (at least in my use case). |
If we're this concerned about memory usage, it would be easy to write a test to perform a few thousand |
@phemmer if you can reproduce the problem (by avoiding the copying and triggering the bug), why not. As I recall, I wasn't holding on to any reference, but I must admit I don't recall the details as clearly as I should. Sorry I haven't had the time to look into it today, I'll try to look into it tomorrow. |
Test available at e3df90e Results:
f6904fd (master)
8267234 (this PR)
Note that the high memory usage (in 6a71adf) only happens with octet string responses, meaning gosnmp is likely using a byte slice of the raw response when unmarshalling. Thus the memory usage could get even lower by doing a copy within the unmarshal. Edit: indeed, here is the result with copying the octet string during unmarshal:
|
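The difference between slicing the raw response and copying the octet string can be sketched as follows. The helpers `octetAlias` and `octetCopy` and the simplified tag/length/value layout are assumptions for illustration, not the gosnmp unmarshal code. Writing to the raw buffer afterwards shows that the aliased value shares its memory, which is also why holding it keeps the entire response alive.

```go
package main

import "fmt"

// octetAlias returns a sub-slice of the raw response: no allocation, but the
// value shares memory with raw and keeps all of raw reachable.
func octetAlias(raw []byte, start, length int) []byte {
	return raw[start : start+length]
}

// octetCopy returns an independent copy of the same bytes, so raw can be
// collected once nothing else refers to it.
func octetCopy(raw []byte, start, length int) []byte {
	out := make([]byte, length)
	copy(out, raw[start:start+length])
	return out
}

func main() {
	raw := []byte("\x04\x03abc") // simplified: tag, length, then the value
	alias := octetAlias(raw, 2, 3)
	copied := octetCopy(raw, 2, 3)

	copy(raw[2:], "xyz") // overwrite the value bytes in the raw buffer

	fmt.Println(string(alias))  // xyz
	fmt.Println(string(copied)) // abc
}
```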
I've been backpacking around Brazil for a few months :-) I'd like some time to review this while my technical brain reboots :-D |
* Add a new and improved snmp plugin * update gosnmp for duplicate packet fix gosnmp/gosnmp#68 gosnmp/gosnmp#69
fixes #68
As an unrelated note, there are numerous places in the code where a method is called on a struct and is passed members of that same struct as arguments.
For example in the previous code that this PR replaced:
Why not have the `dispatch` method just access `x.Conn` itself instead of passing it?
Other example:
`packetOut` is a SnmpPacket in which `.Variables` can replace the `pdus` argument, `.PDUType` can replace the `packetOut.PDUType` argument, `.MsgID` can replace the `msgID` argument, and `.RequestID` can replace the `reqID` argument.
And numerous similar examples.
I think it would make the code easier for contributors to work with if these were cleaned up. I personally found it highly confusing trying to figure out why this was being done.
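As a small illustration of the cleanup being suggested, the sketch below contrasts the two styles. The `client` type, its fields, and both method names are hypothetical and do not correspond to the gosnmp API.

```go
package main

import "fmt"

// client and its fields are hypothetical, not the gosnmp types.
type client struct {
	target  string
	retries int
}

// sendPassing takes arguments that the caller always reads from the same
// struct the method is called on.
func (c *client) sendPassing(target string, retries int) string {
	return fmt.Sprintf("%s (retries=%d)", target, retries)
}

// send reads the receiver's own fields, so the call site has nothing to
// pass and nothing to get out of sync.
func (c *client) send() string {
	return fmt.Sprintf("%s (retries=%d)", c.target, c.retries)
}

func main() {
	c := &client{target: "192.0.2.1:161", retries: 3}
	fmt.Println(c.sendPassing(c.target, c.retries))
	fmt.Println(c.send())
}
```

Both calls produce the same string; the second signature simply makes it obvious where the values come from.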