out_buffer grows until allocation fails #291

Open
mkeeter opened this issue Sep 4, 2019 · 2 comments · May be fixed by #328

Comments


mkeeter commented Sep 4, 2019

I've been using ws-rs in an application which passes images from an embedded device to a webpage, as a simple video streaming solution.

On very rare occasions, the system suffers from memory exhaustion, prints something like
memory allocation of 268447744 bytes failed
and crashes.

I've managed to trigger this situation and catch it in the debugger. The issue appears to be out_buffer growing without bounds. Here's a full backtrace:

#0  0xb6ba293c in raise () from /lib/libc.so.6
#1  0xb6ba3bac in abort () from /lib/libc.so.6
#2  0x006ae5f4 in std::sys::unix::abort_internal ()
#3  0x006a99b4 in rust_oom ()
#4  0x006bb780 in alloc::alloc::handle_alloc_error ()
#5  0x006982c8 in <alloc::raw_vec::RawVec<T, A>>::allocate_in (cap=167772160, zeroed=false, a=...) at liballoc/raw_vec.rs:110
#6  0x00698104 in <alloc::raw_vec::RawVec<T>>::with_capacity (cap=167772160) at liballoc/raw_vec.rs:150
#7  0x00698ec8 in <alloc::vec::Vec<T>>::with_capacity (capacity=167772160) at liballoc/vec.rs:368
#8  0x004701a4 in <ws::connection::Connection<H>>::check_buffer_out (self=0x9098b8, frame=0xbeff9cf8)
    at /usr/src/debug/printercam-server/1.0-r0/cargo_home/bitbake/ws-0.7.9/src/connection.rs:1184
#9  0x0046a5f8 in <ws::connection::Connection<H>>::buffer_frame (self=0x9098b8, frame=...)
    at /usr/src/debug/printercam-server/1.0-r0/cargo_home/bitbake/ws-0.7.9/src/connection.rs:1166
#10 0x0046b9c0 in <ws::connection::Connection<H>>::send_message (self=0x9098b8, msg=...)
    at /usr/src/debug/printercam-server/1.0-r0/cargo_home/bitbake/ws-0.7.9/src/connection.rs:1049
#11 0x00446a14 in <ws::io::Handler<F>>::handle_queue (self=0xbeffe328, poll=0xbeffe400, cmd=...) at /usr/src/debug/printercam-server/1.0-r0/cargo_home/bitbake/ws-0.7.9/src/io.rs:827
#12 0x00442da8 in <ws::io::Handler<F>>::handle_event (self=0xbeffe328, poll=0xbeffe400, token=..., events=...)
    at /usr/src/debug/printercam-server/1.0-r0/cargo_home/bitbake/ws-0.7.9/src/io.rs:635
#13 0x004416f8 in <ws::io::Handler<F>>::event_loop (self=0xbeffe328, poll=0xbeffe400) at /usr/src/debug/printercam-server/1.0-r0/cargo_home/bitbake/ws-0.7.9/src/io.rs:516
#14 0x00449cbc in <ws::io::Handler<F>>::run (self=0xbeffe328, poll=0xbeffe400) at /usr/src/debug/printercam-server/1.0-r0/cargo_home/bitbake/ws-0.7.9/src/io.rs:483
#15 0x0048879c in <ws::WebSocket<F>>::run (self=...) at /usr/src/debug/printercam-server/1.0-r0/cargo_home/bitbake/ws-0.7.9/src/lib.rs:336
#16 0x00489354 in <ws::WebSocket<F>>::listen::{{closure}} (server=...) at /usr/src/debug/printercam-server/1.0-r0/cargo_home/bitbake/ws-0.7.9/src/lib.rs:321
#17 0x004552c8 in <core::result::Result<T, E>>::and_then (self=..., op=...) at libcore/result.rs:647
#18 0x00489320 in <ws::WebSocket<F>>::listen (self=..., addr_spec=...) at /usr/src/debug/printercam-server/1.0-r0/cargo_home/bitbake/ws-0.7.9/src/lib.rs:321
#19 0x00438e30 in printercam_server::main () at src/main.rs:346

I tried to limit the size of out_buffer by configuring out_buffer_grow: false, out_buffer_capacity: 1310720 in the Builder settings. The debugger confirms that these settings have been applied.
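For context, the configuration looks roughly like the following. This is a minimal sketch assuming the ws-rs 0.7.x Builder/Settings API, with a placeholder echo handler and address rather than the actual application code; only the two out_buffer settings mirror the values above.

// Minimal sketch of configuring a bounded out_buffer with ws-rs 0.7.x.
use ws::{Builder, Sender, Settings};

fn main() -> ws::Result<()> {
    let settings = Settings {
        out_buffer_capacity: 1_310_720, // 1.25 MiB, intended as a hard cap
        out_buffer_grow: false,         // expect an error instead of reallocation
        ..Settings::default()
    };
    let socket = Builder::new()
        .with_settings(settings)
        .build(|out: Sender| move |msg: ws::Message| out.send(msg))?; // placeholder echo handler
    socket.listen("127.0.0.1:3012")?;
    Ok(())
}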

However, out_buffer has still grown to a much larger size than expected:

(gdb) fr 8
#8  0x004701a4 in <ws::connection::Connection<H>>::check_buffer_out (self=0x9098b8, frame=0xbeff9cf8)
    at /usr/src/debug/printercam-server/1.0-r0/cargo_home/bitbake/ws-0.7.9/src/connection.rs:1184
1184    in /usr/src/debug/printercam-server/1.0-r0/cargo_home/bitbake/ws-0.7.9/src/connection.rs
(gdb) print self.out_buffer
$4 = std::io::cursor::Cursor<alloc::vec::Vec<u8>> {inner: alloc::vec::Vec<u8> {buf: alloc::raw_vec::RawVec<u8, alloc::alloc::Global> {ptr: core::ptr::Unique<u8> {pointer: core::nonzero::NonZero<*const u8> (0x83cb7008 "3Zs\022\066T\363\243붡\205<\253\225\315g%\250\033\366\215\275\225k\261\211<\264\302\364\256\203MIG\034S\263\212[\214)\255\351@\016\307ZJ`6\214P\300;{\322\366\252\260\005/z\220\nL\373S\001\071\343\212Nz\320\002{\322m\a\005\207\064\206;=\263I\234t\247a\r\371\271'\024\360\a\245\003\023\031\246\064k'\312\312\b\245`\"\373,\003#\313\\\036ئ\033\070\267p\017\347TU\306\065\217\374\362|}j\215\317ڭ:\215\311\354(\217\230\020\255\377\000"), _marker: core::marker::PhantomData<u8>}, cap: 167772160, a: alloc::alloc::Global}, len: 167756563}, pos: 34954476}

Specifically, cap: 167772160 (160 MiB) is 128× the fixed capacity of 1310720 (1.25 MiB) that was configured.

Digging into the source, it seems like the only place this buffer is touched is check_buffer_out. Do you have any ideas why the buffer could be growing large enough to trigger allocation failures?


mkeeter commented Sep 4, 2019

The issue appears to be the allocation strategy in buffer_frame and check_buffer_out.

Printing out_buffer.get_ref().len() and out_buffer.get_ref().capacity() after this line shows the buffer allocating more memory ad infinitum:

[Screenshot: plot of out_buffer.get_ref().len() and out_buffer.get_ref().capacity() over time, both growing without bound]
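The measurement itself was nothing more than printing the backing vector's length and capacity; a hypothetical standalone helper along these lines (log_buffer is not part of ws-rs, just the shape of the instrumentation):

// Hypothetical helper used for the plot above: dump the length and capacity
// of the Vec<u8> behind a Cursor, such as ws-rs's out_buffer, after each frame.
use std::io::Cursor;

fn log_buffer(tag: &str, buf: &Cursor<Vec<u8>>) {
    println!(
        "{}: len = {}, capacity = {}",
        tag,
        buf.get_ref().len(),
        buf.get_ref().capacity()
    );
}

Calling something like this right after frame.format(&mut self.out_buffer)?; in buffer_frame produces the data points plotted above.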


mkeeter commented Sep 5, 2019

Digging deeper, the buffer is often reallocated inside frame.format(...). The Write implementation for Cursor<Vec<u8>> pushes data onto the underlying vector, which can reallocate it (doubling its capacity). That path isn't gated by the out_buffer_grow setting, which is why I saw the buffer grow even with it set to false.
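A standalone demonstration of that behavior, using only std (nothing here is ws-rs code; the 16-byte starting capacity is arbitrary):

// Writing through a Cursor<Vec<u8>> appends to the Vec once the position passes
// its length, so the Vec reallocates (roughly doubling) regardless of whatever
// capacity it was originally created with.
use std::io::{Cursor, Write};

fn main() {
    let mut out_buffer = Cursor::new(Vec::with_capacity(16)); // "fixed" 16-byte capacity
    for i in 0..8 {
        out_buffer.write_all(&[0u8; 8]).unwrap(); // each write stands in for one frame
        println!(
            "after write {}: len = {}, capacity = {}",
            i + 1,
            out_buffer.get_ref().len(),
            out_buffer.get_ref().capacity()
        );
    }
    // By the end, the capacity has grown well past the original 16.
}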

I've patched this locally with this commit, which

  • Correctly detects when frame.format(&mut self.out_buffer)?; will cause a vector to be reallocated, and raises an error in that case if self.settings.out_buffer_grow is false.
  • Allows the vector to shrink (when selecting the capacity for new), rather than monotonically increasing in size, which should prevent the unbounded buffer growth that I was seeing (sketched below).
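The idea behind that check, as a rough self-contained sketch (simplified field names, a String error, and a standalone Connection struct rather than the actual patched ws-rs code):

// Sketch of a pre-check run before a frame is formatted into out_buffer.
use std::io::Cursor;

/// Simplified stand-in for ws-rs's Connection; only the fields the check needs.
struct Connection {
    out_buffer: Cursor<Vec<u8>>,
    out_buffer_capacity: usize, // configured Settings::out_buffer_capacity
    out_buffer_grow: bool,      // configured Settings::out_buffer_grow
}

impl Connection {
    /// Call before formatting a frame of `frame_len` bytes into `out_buffer`.
    fn check_buffer_out(&mut self, frame_len: usize) -> Result<(), String> {
        let len = self.out_buffer.get_ref().len();
        let cap = self.out_buffer.get_ref().capacity();
        if len + frame_len <= cap {
            return Ok(()); // the frame fits without reallocating the Vec
        }
        // Drop bytes already flushed to the socket (everything before the cursor
        // position) before deciding whether growth is really needed.
        let pos = self.out_buffer.position() as usize;
        let pending = self.out_buffer.get_ref()[pos..].to_vec();
        if !self.out_buffer_grow && pending.len() + frame_len > self.out_buffer_capacity {
            // Formatting the frame would force the Vec behind the Cursor to
            // reallocate past the configured size, so raise an error instead.
            return Err("out_buffer capacity exceeded".to_string());
        }
        // Size the replacement buffer from the configured capacity (letting it
        // shrink back down) rather than from the old, possibly doubled, capacity.
        let new_cap = self.out_buffer_capacity.max(pending.len() + frame_len);
        let mut new_buf = Vec::with_capacity(new_cap);
        new_buf.extend_from_slice(&pending);
        self.out_buffer = Cursor::new(new_buf);
        Ok(())
    }
}

With something along these lines, the out_buffer_grow: false path turns the silent reallocation inside frame.format into an explicit capacity error.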

Happy to PR this if you want; until then, I've changed my build to point to the forked repo.

maciejhirsz linked a pull request on Sep 17, 2020 that will close this issue