bulk_walk causes a LOT of allocations? #2

Open · cchance27 opened this issue May 6, 2023 · 7 comments

@cchance27

Started testing with csnmp since there aren't many SNMP packages for Rust, and noticed that building the BTreeMap seems to result in a lot of allocations: I bulk-walked a table and saw ~52k allocations for the 400 returned OIDs. Is this expected behaviour? I was also wondering why you went with BTreeMap instead of HashMap, as I'd imagine that would have a lighter memory footprint.

@RavuAlHemio (Owner) commented May 6, 2023

To be honest, I never really instrumented the memory behavior. Which tool would you recommend?

I chose BTreeMap over HashMap to ensure the OIDs remain in order; perhaps I should rearchitect the API to allow a choice of either.
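For context, a minimal sketch (not csnmp's actual types) of the property BTreeMap buys here: iterating the walk results always yields the OIDs in ascending key order, which a HashMap does not guarantee.

```rust
use std::collections::BTreeMap;

fn main() {
    // Keyed by the OID's sub-identifiers; values are just labels here.
    let mut results: BTreeMap<Vec<u32>, &str> = BTreeMap::new();
    results.insert(vec![1, 3, 6, 1, 2, 1, 1, 5, 0], "sysName.0");
    results.insert(vec![1, 3, 6, 1, 2, 1, 1, 1, 0], "sysDescr.0");

    // Iteration is always in ascending key order, so sysDescr.0 (...1.1.0)
    // comes out before sysName.0 (...1.5.0) regardless of insertion order.
    for (oid, label) in &results {
        println!("{:?} => {}", oid, label);
    }
}
```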

@cchance27 (Author)

I used dhat to try to diagnose it. I'm porting a larger internal tool from C# that does SNMP and some other stuff, and I'm watching memory usage and allocations to keep performance high as I go, so I was a bit shocked at the number of allocations.
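For reference, wiring the dhat crate into a binary looks roughly like this; the walk_under_test call is a hypothetical placeholder for whatever code is being measured.

```rust
// Cargo.toml: dhat = "0.3" (version assumed)
#[global_allocator]
static ALLOC: dhat::Alloc = dhat::Alloc;

fn main() {
    // Profiling starts here; a dhat-heap.json file is written when the
    // profiler is dropped at the end of main, and can be opened in dhat's viewer.
    let _profiler = dhat::Profiler::new_heap();

    walk_under_test(); // hypothetical stand-in for the SNMP bulk walk
}

fn walk_under_test() {
    // placeholder work so the example compiles and allocates something
    let _buf: Vec<u8> = vec![0u8; 4096];
}
```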

I haven't taken a deeper look at your parsing yet, but in case you weren't aware, there's now a nom-based parsing crate that could offload the ASN.1 parsing side of SNMP. I'm not sure how it benchmarks against what you're doing internally, but it might make your internals easier to refactor.
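For illustration, a rough sketch of calling der-parser's top-level entry point; the exact function name and return shape are my assumption from the crate's documented API, so double-check against the docs.

```rust
use der_parser::parse_der;

// Parse one DER TLV from a byte buffer and print it; parse_der returns the
// unconsumed remainder of the input plus the decoded object tree.
fn dump_first_tlv(bytes: &[u8]) {
    match parse_der(bytes) {
        Ok((_rest, obj)) => println!("parsed: {:?}", obj),
        Err(e) => eprintln!("DER parse error: {:?}", e),
    }
}

fn main() {
    // 0x02 0x01 0x05 is the DER encoding of INTEGER 5
    dump_first_tlv(&[0x02, 0x01, 0x05]);
}
```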

@RavuAlHemio (Owner)

Switching from BTreeMap to HashMap didn't do much; it appears most of the allocations are related to BigInt/BigUint, since ASN.1 allows arbitrary-length integers. I'll sample the other available ASN.1 libraries; maybe one of them provides a more efficient representation.
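One possible direction, sketched here purely as an assumption rather than anything csnmp actually does: keep ASN.1 INTEGERs that fit in an i64 on the stack and only fall back to a heap-allocated BigInt for genuinely large values.

```rust
use num_bigint::BigInt;

// Hypothetical representation: small integers avoid the allocator entirely.
enum Asn1Int {
    Small(i64),
    Big(BigInt),
}

impl Asn1Int {
    // `bytes` is the big-endian two's-complement content of the INTEGER.
    fn from_content_bytes(bytes: &[u8]) -> Asn1Int {
        if bytes.len() <= 8 {
            // sign-extend into a fixed 8-byte buffer, then read as i64
            let fill = if bytes.first().map_or(false, |b| b & 0x80 != 0) { 0xFF } else { 0x00 };
            let mut buf = [fill; 8];
            buf[8 - bytes.len()..].copy_from_slice(bytes);
            Asn1Int::Small(i64::from_be_bytes(buf))
        } else {
            Asn1Int::Big(BigInt::from_signed_bytes_be(bytes))
        }
    }
}
```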

@RavuAlHemio (Owner)

As you might have seen, I started an experimental derparser branch to switch from simple_asn1 to the nom-based der-parser.

Comparing the two branches when bulk-walking 1.3.6.1.2.1 on a pretty bog-standard Cisco switch:

| parser | bytes | blocks | time |
| --- | --- | --- | --- |
| simple_asn1 | 358,385,203 | 3,692,966 | 51.74 s |
| der-parser | 217,130,033 | 96,592 | 5.66 s |

Honestly, just time-wise, that seems like quite the improvement, although I'm not sure that it solves your allocation issue -- it allocates fewer blocks but still a very similar number of bytes.

@cchance27 (Author) commented May 9, 2023

I mean, you're allocating a similar number of bytes, but much fewer blocks plus a 10-fold performance increase is pretty darn impressive.

Here's a run of my own code, swapping between the two branches:

| parser | bytes | blocks | time |
| --- | --- | --- | --- |
| simple_asn1 | 34,130,594 | 54,246 | 21.84 s |
| der-parser | 32,264,796 | 7,238 | 3.55 s |

However, with dhat disabled, the actual execution time is about the same (~600 ms); the time improvement we're seeing comes from dhat having to record far more memory traffic on the old branch. Still, saving ~47,000 allocations is an improvement regardless. It's odd that the byte count is so high for relatively little data, which sort of points toward the memory usage coming from something outside the decoding...

I wonder whether the allocated bytes are coming from the ObjectValue side or the ObjectIdentifier side.

@RavuAlHemio (Owner) commented May 10, 2023

I tried rewriting ObjectIdentifier to use Vec<u32> instead of a fixed-size array; the dhat results are as follows:

| ObjectIdentifier | bytes | blocks |
| --- | --- | --- |
| array | 313,295,165 | 262,723 |
| vector | 96,831,229 | 488,935 |

So kinda as expected: less memory but more blocks. Also, ObjectIdentifier::new can no longer be const.
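The trade-off in the table can be sketched like this (hypothetical types, a capacity of 128 assumed for the sketch, not csnmp's actual definitions):

```rust
/// Fixed-capacity form: no heap allocation, Clone/Copy is a plain memcpy and
/// the constructor can be const, but every OID carries the full 128 slots.
#[derive(Clone, Copy)]
struct ArrayOid {
    len: usize,
    arcs: [u32; 128],
}

/// Vec-backed form: only as big as the OID itself, but construction and every
/// Clone go through the allocator, and `new` can no longer be const.
#[derive(Clone)]
struct VecOid {
    arcs: Vec<u32>,
}
```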

@cchance27 (Author) commented May 10, 2023

The drastic reduction in bytes makes sense, but it's odd that the number of allocations went up roughly 2x. On the bright side, it's not just total bytes allocated that improved: t-gmax also went down, from about 2 MB to 170 kB, a major saving.

In the dhat view, it seems almost all of the allocations popping up now come from clones of the ObjectIdentifier Vec.

15k of the 24k allocations in my run were from the clones.

Total: 951,036 bytes (31.91%, 93,911.16/s) in 15,851 blocks (64.34%, 1,565.23/s), avg size 60 bytes, avg lifetime 397,278.04 µs (3.92% of program duration)
Max: 57,600 bytes in 960 blocks, avg size 60 bytes
At t-gmax: 57,600 bytes (33.73%) in 960 blocks (68.23%), avg size 60 bytes
At t-end: 0 bytes (0%) in 0 blocks (0%), avg size 0 bytes
Allocated at {
^1: 0x7ff61d217ded: alloc::alloc::impl$1::allocate (alloc\src\alloc.rs:237:0)
^2: 0x7ff61d2152cc: alloc::raw_vec::RawVec<u32,alloc::alloc::Global>::allocate_in<u32,alloc::alloc::Global> (alloc\src\raw_vec.rs:185:0)
^3: 0x7ff61d236884: alloc::slice::hack::impl$1::to_vec<u32,alloc::alloc::Global> (alloc\src\slice.rs:162:0)
^4: 0x7ff61d2107ba: alloc::slice::hack::to_vec (alloc\src\slice.rs:111:0)
^5: 0x7ff61d2107ba: alloc::slice::impl$0::to_vec_in (alloc\src\slice.rs:441:0)
^6: 0x7ff61d2107ba: alloc::vec::impl$11::clone<u32,alloc::alloc::Global> (src\vec\mod.rs:2655:0)
^7: 0x7ff61d17515d: csnmp::oid::impl$18::clone (csnmp\src\oid.rs:75:0)
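One way to make those clones cheap again, sketched here as a suggestion rather than anything implemented in csnmp, would be to share the sub-identifier buffer behind an Arc so that Clone only bumps a reference count:

```rust
use std::sync::Arc;

// Hypothetical type, not csnmp's: the sub-identifier slice lives behind an
// Arc, so constructing an OID allocates once but cloning it is just a
// reference-count bump instead of a fresh Vec<u32> allocation and copy.
#[derive(Clone)]
struct SharedOid {
    arcs: Arc<[u32]>,
}

impl SharedOid {
    fn new(arcs: &[u32]) -> Self {
        SharedOid { arcs: Arc::from(arcs) }
    }
}

fn main() {
    let oid = SharedOid::new(&[1, 3, 6, 1, 2, 1, 1, 1, 0]);
    // 1000 clones, but still only one heap buffer holding the arcs
    let copies: Vec<SharedOid> = (0..1000).map(|_| oid.clone()).collect();
    assert_eq!(copies.len(), 1000);
}
```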
