Stack overflow with Boxed array #53827
Comments
Duplicate of #28008. |
But #28008 is closed. So we should probably keep one issue open for this. Especially since box syntax isn't coming any time soon AFAICT. |
Just 2 cents from the peanut gallery... This bug has been around for 3 years. Surely Rust needs a stable way to allocate large arrays.
|
We did make an attempt at solving this with placement-in, but that didn’t work out overall. Same problems are ailing the |
If Box causes stack overflow, and box is not stable (and isn't coming soon), what is the recommended way to allocate a large block of memory in Rust? |
@opus111 I am most certainly not a rust expert, but I bet the standard response would be to use the container types in the standard library, i.e. Vec. Depending on the application, increasing the allowable stack size may also be an option. Is there a standard workaround for this (probably using unsafe for the cast) that works for any fixed size custom type? |
@drewm1980 Thanks. I am also not an expert, but wouldn't a container also be allocated on the stack? My understanding is that Box or box is how one specifies that memory is not on the stack. Surely, somebody is using Rust for large objects (e.g. images). How is it done? Does one have to be unsafe and call C? |
You could use a |
@opus111 for images consider the ndarray crate. The standard libraries, and libraries like ndarray are doing heap allocations internally. They're probably using unsafe{} blocks internally, but it's OK, writers of standard libraries tend to know what they're doing. Rust, like C, kept the core language very small. Like in C memory allocation is handled in a library, not at the language level. Syntactically speaking, to get an object on the heap, you have to first create it (on the stack like everything else in the core language) and pass it into a library function that does the heap allocation and a copy. If you want to really double-down on using the stack for everything, I just found: It would be neat if rust had an ergonomic way to statically specify the max size of your stack (or have it automatically grow), and have the stack "just work" even for large objects for the main thread. If the rust devs focus on making the heap more ergonomic, people are going to use the heap more. If they focus on making the stack more ergonomic, people will use the stack more. Some people actually need to allocate their big objects on the heap, but some users are just being bitten by a default stack size limit that hasn't kept pace with the hardware, and a stack that can't grow transparently like the heap can. |
I googled a bit more, and discussion around stack size issues in Rust's design unsurprisingly goes waaay back, i.e. : "The conclusion was that we would always use split stacks, but on 64-bit platforms with overcommit we would make the initial segment large enough that it doesn't need to grow in typical use. This was also already implemented in the old runtime." They apparently ripped out the split stack stuff, but I guess the "make the initial segment large enough that it doesn't need to grow in typical use" didn't happen, since by default Rust's stack overflows long before you actually run out of physical RAM. Another recently commented on ancient thread asking for configurable stack size: |
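For readers who only need more stack rather than a heap allocation, here is a minimal sketch of raising the stack size for one thread with std::thread::Builder::stack_size. The names BIG, sum_on_stack and run_with_big_stack and the 64 MiB figure are illustrative assumptions, not something proposed in this thread, and this does not change the main thread's stack.

```rust
use std::thread;

// 16 MiB: larger than the default stack of a spawned thread.
const BIG: usize = 16 * 1024 * 1024;

/// Sums a large array that lives on the current thread's stack.
fn sum_on_stack() -> u64 {
    let data = [1u8; BIG];
    data.iter().map(|&b| u64::from(b)).sum()
}

/// Runs `sum_on_stack` on a thread whose stack is explicitly sized (assumed 64 MiB).
fn run_with_big_stack() -> u64 {
    thread::Builder::new()
        .stack_size(64 * 1024 * 1024)
        .spawn(sum_on_stack)
        .expect("failed to spawn thread")
        .join()
        .expect("thread panicked")
}

fn main() {
    println!("{}", run_with_big_stack());
}
```

The trade-off is that the size has to be chosen up front per thread, which is the limitation the comments above complain about.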
Thanks drew and steven. I am trying to avoid putting this object on the stack, because it is too big. I am surprised to learn that "to get an object on the heap, you have to first create it (on the stack...and pass it into a library function that does the heap allocation and a copy." That doesn't sound very efficient :-( Thanks for the pointer. It's not an image, but I'll try ndarray |
vec![-1; 3000000].into_boxed_slice() A note of difference with the
There is also the |
It seems that |
Anyone know how to do this:
Without blowing the stack? |
At least on nightly you can do
but it does require unsafe code. |
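For comparison, a sketch that should work on stable without unsafe, assuming a compiler recent enough to have the TryFrom<Box<[T]>> for Box<[T; N]> conversion. The size N = 2048 and the helper name make_grid are illustrative. Only one row at a time exists on the stack; the full 2D array is built inside the Vec's heap buffer.

```rust
use std::convert::TryInto;

const N: usize = 2048;

/// Builds an N x N array on the heap and returns it as a boxed 2D array.
fn make_grid(fill: i32) -> Box<[[i32; N]; N]> {
    // One row ([i32; N], 8 KiB) is small enough for the stack; vec! clones it
    // into a heap buffer, so the full 16 MiB array never sits on the stack.
    let rows: Box<[[i32; N]]> = vec![[fill; N]; N].into_boxed_slice();
    // Box<[T]> -> Box<[T; N]> checks the length at run time.
    let grid: Result<Box<[[i32; N]; N]>, _> = rows.try_into();
    match grid {
        Ok(grid) => grid,
        Err(_) => unreachable!("the slice has exactly N rows by construction"),
    }
}

fn main() {
    let mut a = make_grid(-1);
    a[3][5] = 42; // ordinary 2D indexing still works
    println!("{} {}", a[3][5], a[5][3]);
}
```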
Is there a reason you don't use a 1D array and index into it as a 2D array? Your code will be simpler and I believe it results in some minor performance benefits. You'd be able to use @memoryruins' solution above:
let mut a = vec![-1; 2048 * 2048].into_boxed_slice();
fn get_item(x: usize, y: usize) -> i32 {
    a[x % 2048 + y / 2048] // a[x][y]
}
You could simplify a bit further by pulling out a constant:
const SIZE: usize = 2048;
let mut a = vec![-1; SIZE * SIZE].into_boxed_slice();
fn get_item(x: usize, y: usize) -> i32 {
    a[x % SIZE + y / SIZE] // a[x][y]
}
|
Don't you mean:
I thought about that. I wanted to avoid that ugly messing around. After all we have a high level language with a syntax for 2d indexing, we should be able to use it.
Thanks all. |
Whoops, yes, I think I wrote that too fast and mixed it up with the reverse operation (index-to-x/y) One thing you can do to at least hide that "ugly messing around" is to create a struct and hide the array behind a method that gets a mutable reference to the index. That way you can both get and set it like you normally would, and the code that uses it never has to know it's implemented as a 1D array. |
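A minimal sketch of the struct idea, assuming a square grid of i32 stored row-major (element (x, y) at x * SIZE + y). The type name Grid and the choice of Index/IndexMut over a plain method are illustrative, not something specified in this thread.

```rust
use std::ops::{Index, IndexMut};

const SIZE: usize = 2048;

/// Hypothetical wrapper: a SIZE x SIZE grid stored as one flat heap buffer.
struct Grid {
    cells: Box<[i32]>,
}

impl Grid {
    fn new(fill: i32) -> Self {
        Grid { cells: vec![fill; SIZE * SIZE].into_boxed_slice() }
    }
}

impl Index<(usize, usize)> for Grid {
    type Output = i32;
    fn index(&self, (x, y): (usize, usize)) -> &i32 {
        assert!(x < SIZE && y < SIZE, "grid index out of bounds");
        &self.cells[x * SIZE + y] // row-major: a[x][y]
    }
}

impl IndexMut<(usize, usize)> for Grid {
    fn index_mut(&mut self, (x, y): (usize, usize)) -> &mut i32 {
        assert!(x < SIZE && y < SIZE, "grid index out of bounds");
        &mut self.cells[x * SIZE + y]
    }
}

fn main() {
    let mut g = Grid::new(-1);
    g[(3, 5)] = 42;
    println!("{} {}", g[(3, 5)], g[(5, 3)]);
}
```

Callers write g[(x, y)] for both reads and writes and never see the flat layout.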
Just for reference another workaround that works on stable but requires unsafe is:
This at least gives you the benefit that |
I might be missing something but since a "clean" solution to this problem is not going to happen any time soon since it's a tough nut to crack, why not work around the issue by having a macro similar to
/// A macro similar to `vec![$elem; $size]` which returns a boxed array.
///
/// ```rustc
/// let _: Box<[u8; 1024]> = box_array![0; 1024];
/// ```
macro_rules! box_array {
    ($val:expr ; $len:expr) => {{
        // Use a generic function so that the pointer cast remains type-safe
        fn vec_to_boxed_array<T>(vec: Vec<T>) -> Box<[T; $len]> {
            let boxed_slice = vec.into_boxed_slice();
            let ptr = ::std::boxed::Box::into_raw(boxed_slice) as *mut [T; $len];
            unsafe { Box::from_raw(ptr) }
        }
        vec_to_boxed_array(vec![$val; $len])
    }};
}
Isn't it a common enough issue to warrant being provided in the standard library? That would prevent everybody from coming up with their own solutions and the bugs that may come with them.
|
Not sure if I'm understanding everything here properly (one of the reasons I enjoy using Rust is because I don't need to worry too much about managing memory correctly or knowing too much about the underlying structures, since the compiler will have my back), but just wanted to chime in as someone who ran into this problem while trying to figure out a way to do image processing and passing image data structures between C++ and Rust. I believe a use case like this was mentioned by @opus111 earlier in this issue. In particular, I was looking for a way to get an OpenCV I think I'll likely have to either drop back into doing some unsafe code referenced above and go from there, or stick with doing things in C++ instead. |
@quietlychris You may want to try copyless and use it until the compiler gets smart enough to eliminate all the unnecessary memory copying it currently does on various occasions. |
@MSxDOS I hadn't seen that crate before, thanks for the tip! |
There is still no way to initialize an array directly on the heap without using Vector or unsafe? Need a way to emplace data directly in the stack frame or heap. Rust is really missing it. |
Is this the real life?(Is this just fantasy) 7 year old issues, what, are we the gtk filechooser now? I understand that stack exists, but unlike say cpp rust treats array as first-class-for-real. Disappointing to see same old footguns behind the new branding. I mean, AT LEAST you could make it a compile error to declare array that will kill your stack. |
What do you mean? |
There are older issues, Rust teams have different priorities ATM and many of the Rust team members are volunteers with reduced resources. Perhaps a compilation of everything that happened until now can lead to a reasonable and well-founded RFC? It might be worth opening a Zulip thread for further discussion. |
After faceplanting into this brick wall I feel this should be pointed out: the issue is broader than arrays and even broader than Box specifically. In systems domains, preventing large stack allocations and large copies often matters, and we just don't really have a good way to say "take this memory which is known by the compiler to be suitable for this object and create this object in it". Box is where we tend to run into it because it's our answer to
Even simple assignment can overflow the stack by insisting on copying from it:
#![feature(new_uninit)]
#![allow(dead_code)]
use std::io::Write;
use std::hint::black_box;
use std::mem::size_of;
#[derive(Default, Debug, Copy, Clone)]
struct Page([u128; 32], [u128; 32], [u128; 32]);
#[derive(Debug)]
struct Big([[Page; 32]; 32]);
#[derive(Default, Debug)]
struct Small([u8; 16]);
#[derive(Debug)]
#[repr(u8)]
enum EBig {
B(Small) = 0,
A(Big),
}
#[derive(Debug)]
#[repr(u8)]
enum ESmall {
A(Small) = 0,
B(Small)
}
pub fn main() {
// a small enum variant whose tag is 0 is represented as zeroes
// (prints array of 17 zeroes)
//let small: ESmall = ESmall::A(Small::default());
//println!("{:?}", unsafe { *(&small as *const _ as *const [u8; size_of::<ESmall>()]) } );
//let _ = std::io::stdout().flush();
let mut boxed: Box<EBig> = unsafe {
Box::<EBig>::new_zeroed().assume_init()
};
// overflow the stack
*boxed.as_mut() = EBig::B(Small::default());
} |
I was hoping for copy elision here, but it seems like this might overflow on Windows platforms. Until rust-lang/rust#53827 gets a proper fix, this may be the best we can do, even if it's uglier. (We still construct the inner dimension on the stack, which seems fine.)
A very sneaky version of this happens when you derive This sucks even more than having to build the struct using unsafe, as the derive hides the offending code. |
As mentioned by Lokathor:
Is this truly the recommended/best way to create an array on the heap?:
/// Allocates a `Box<T>` with all of the contents being zeroed out.
///
/// This uses the global allocator to create a zeroed allocation and _then_
/// turns it into a Box. In other words, it's 100% assured that the zeroed data
/// won't be put temporarily on the stack. You can make a box of any size
/// without fear of a stack overflow.
///
/// ## Failure
///
/// This fails if the allocation fails.
#[inline]
pub fn try_zeroed_box<T: Zeroable>() -> Result<Box<T>, ()> {
if size_of::<T>() == 0 {
// This will not allocate but simply create a dangling pointer.
let dangling = core::ptr::NonNull::dangling().as_ptr();
return Ok(unsafe { Box::from_raw(dangling) });
}
let layout = Layout::new::<T>();
let ptr = unsafe { alloc_zeroed(layout) };
if ptr.is_null() {
// we don't know what the error is because `alloc_zeroed` is a dumb API
Err(())
} else {
Ok(unsafe { Box::<T>::from_raw(ptr as *mut T) })
}
} |
Because emplace was mentioned: The recommended version of emplace appears to be This means that I can't use this as a workaround to create a large singleton object (not merely an array) directly on the heap. Which means I think there's no safe way in standard Rust to create such an object in a single allocation—I'd have to break components of it into separate Boxes which may have performance implications for access/memory caching. |
Due to a long unfixed issue in Rust (rust-lang/rust#53827), initializing large arrays and boxing them causes a stack overflow. To avoid this, the new_boxed fn allocates itself manually.
This is possibly the same bug as #40862
Using the latest version of Rust: rustc 1.27.2 (58cc626 2018-07-18)
The following code causes a stack overflow
Workarounds
Using Vec<T>
This does not have overhead.
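The original snippet for this workaround did not survive here; a minimal sketch of what it might look like, using the vec![-1; 3000000] call quoted earlier in the thread (the helper name big_buffer is an assumption):

```rust
/// Builds a large buffer on the heap without creating it on the stack first.
fn big_buffer(len: usize) -> Box<[i32]> {
    // vec! allocates the buffer on the heap and fills it there; only the
    // Vec handle itself lives on the stack.
    let v: Vec<i32> = vec![-1; len];
    // Converting to a boxed slice drops the spare-capacity bookkeeping.
    v.into_boxed_slice()
}

fn main() {
    let a = big_buffer(3_000_000);
    println!("{} {}", a.len(), a[0]);
}
```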
Unstably using new_uninit
This requires an unstable API and unsafe, but is more flexible.