bulk_walk causes a LOT of allocations? #2
To be honest, I never really instrumented the memory behavior. Which tool would you recommend? I chose BTreeMap over HashMap to ensure the OIDs remain in order; perhaps I should rearchitect the API to allow a choice of either.
I used DHAT to try to diagnose it. I'm remaking a larger internal tool from C# that does SNMP and some other things, and I'm watching memory usage and allocations to keep performance high as I go, so I was a bit shocked at the number of allocations. I haven't taken a deeper look at your parsing, but I'm not sure if you knew there's a nom-based parsing crate now that can offload the SNMP parsing side. I'm not sure how it benchmarks against the way you're doing it internally, but it might make your internal work easier to refactor.
Switching from BTreeMap to HashMap didn't do much; it appears most of the allocations are related to BigInt/BigUint, since ASN.1 allows arbitrary-length integers. I'll sample the other available ASN.1 libraries; maybe one of them provides a more efficient representation.
As you might have seen, I started an experimental branch. Comparing the two branches when bulk-walking 1.3.6.1.2.1 on a pretty bog-standard Cisco switch:
Honestly, just time-wise, that seems like quite the improvement, although I'm not sure that it solves your allocation issue -- it allocates fewer blocks but still a very similar number of bytes.
I mean, you're allocating similar bytes but much fewer blocks, and a 10-fold performance increase is pretty darn impressive. Here's my run of my code swapping between branches...
However, with dhat disabled, actual execution time is about the same, ~600 ms; the improvement we're seeing in time is because dhat has to record the much higher volume of memory traffic. Still, saving 47,000 allocations is an improvement regardless. It's odd that the byte count is so high for relatively little data; it sort of points toward the memory usage coming from something outside the decoding. I wonder whether the bytes allocated are coming from the ObjectValue side or the ObjectIdentifier side.
I tried rewriting
So kinda as expected: less memory but more blocks. Also,
Bytes getting drastically reduced makes sense, but it's odd that allocations now went up 2x. On the bright side, it's not just total bytes allocated: t-gmax also went down. I went from a t-gmax of 2 MB to 170 kB, a major savings. In the dhat view, it seems almost all the allocations popping up now are coming from clones of the ObjectIdentifier Vec: 15k of the 24k allocations in my run were from the clones.
Total: 951,036 bytes (31.91%, 93,911.16/s) in 15,851 blocks (64.34%, 1,565.23/s), avg size 60 bytes, avg lifetime 397,278.04 µs (3.92% of program duration)
Started testing with csnmp, since there's not much SNMP support in the Rust package ecosystem, and noticed that the BTreeMap generation seems to result in a lot of allocations: I bulk-walked a table and got ~52k allocations for the 400 returned OIDs. Is this expected behaviour? Also, I was wondering why you went with BTreeMap instead of HashMap, as I imagine that would be lighter on memory footprint, no?