Builtin function to tell you the maximum stack size of a given function #157
I propose a
This function makes it a compile error if the named function, or any function it calls, recurses directly or indirectly on a parameter that is not a compile-time constant.
This builtin function could be used when creating a new thread, to determine the stack size. This stack would never have to grow or shrink, and it would be the maximum number of bytes that could possibly be used, likely very small.
It would also force the programmer to not use recursion on user input, a limitation that some categories of software require, and violating this would be caught by the compiler, since it makes returning a correct number from
Users may even want to introduce this limitation on the main thread, and they could do this by adding a top level assert such as
If they do this, then we can modify the binary's stack size to be only as large as necessary, which is probably a minuscule improvement in memory footprint and startup time, but it's a strong guarantee of robustness at least in this one regard.
Another thing that would cause this function to cause a compile error is calls to an external library. External library functions could be annotated with a maximum stack size attribute, or we can have another threading abstraction that does not use
Tangential idea: when Zig outputs a shared library, in the .h file it generates, it can annotate functions with additional information in the comments, such as maximum stack size, and Zig can look for these annotations when importing .h files.
This function also introduces some weird interdependencies. For example, I could do:
Zig would have to detect this dependency loop and report a compile error.
Implementation of this would be a bit tricky since currently it is LLVM that determines the stack size of everything, and optimizations can increase and decrease the stack size of various functions. However, I think it is possible to have a sound upper bound.
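A minimal Python sketch of what such an upper bound could look like, assuming the per-function frame sizes are already known (in practice they would have to come from the backend after optimization). The names `FRAME_BYTES`, `CALLS`, `RecursionFound`, and `max_stack_bytes` are hypothetical, and the numbers are made up. It walks the call graph depth-first, rejects any cycle, and returns the cost of the most expensive call chain.

```python
# Hypothetical per-function frame sizes (bytes) and a static call graph.
FRAME_BYTES = {"main": 48, "parse": 128, "emit": 64, "log": 32}
CALLS = {"main": ["parse", "emit"], "parse": ["log"], "emit": ["log"], "log": []}


class RecursionFound(Exception):
    """Raised when the call graph reachable from the root contains a cycle."""


def max_stack_bytes(func, frames=FRAME_BYTES, calls=CALLS, _active=()):
    # A function that is already on the current call chain means recursion,
    # so no finite bound exists for this root.
    if func in _active:
        raise RecursionFound(" -> ".join(_active + (func,)))
    chain = _active + (func,)
    # The bound is this frame plus the deepest chain below it.
    deepest_callee = max(
        (max_stack_bytes(callee, frames, calls, chain) for callee in calls[func]),
        default=0,
    )
    return frames[func] + deepest_callee
```

With the sample data, `max_stack_bytes("main")` is 208: the chain `main -> parse -> log` costs 48 + 128 + 32 bytes.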
My first interpretation of
Some of your wording sounds like you're going to impose this limitation everywhere, which is a bad idea. Having this feature available for new threads and for the binary's main function is cool, but we can't force users to never recurse on user input. That would make a recursive-descent parser impossible in Zig, which seems like a silly limitation.
What if we could expose the inner workings of this function to the user at compile time? What if we could give the non-recursive stack requirement of each individual function, and additionally give the user a way to walk the call graph? Then the user could detect all the paths where their recursive-descent parser recurses, calculate the maximum stack requirement for a recursion level, and then translate between maximum recursion levels and bytes required for the stack (calculating either one from the other).
We can also have the compiler do all this for us with the limitation that there needs to be a deterministic, finite answer (your proposal), but that special case might miss out on a lot of other potential from this kind of functionality. Or maybe I'm getting too excited about YAGNI-cream pies in the sky, but I think it's worth discussing at least.
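To make the levels-to-bytes translation concrete, here is a small Python sketch. It assumes the user has already summed the frame sizes along one trip around the recursive cycle; the function names, the parser function names, and all numbers are hypothetical.

```python
# Hypothetical frame sizes (bytes) for one recursive cycle of a parser:
# parse_expr -> parse_term -> parse_factor -> parse_expr
CYCLE_FRAMES = {"parse_expr": 96, "parse_term": 80, "parse_factor": 112}
BYTES_PER_LEVEL = sum(CYCLE_FRAMES.values())  # 288
BASE_BYTES = 512  # assumed cost of everything outside the recursion


def bytes_for_depth(depth, base=BASE_BYTES, per_level=BYTES_PER_LEVEL):
    """Stack bytes needed to allow `depth` recursion levels."""
    return base + per_level * depth


def depth_for_bytes(stack_bytes, base=BASE_BYTES, per_level=BYTES_PER_LEVEL):
    """Largest recursion depth that fits in `stack_bytes`."""
    if stack_bytes < base:
        raise ValueError("stack is too small even without recursion")
    return (stack_bytes - base) // per_level
```

With these numbers, 100 levels need 29312 bytes, and a 64 KiB stack allows 225 levels.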
I am not sure, but I think this is undecidable if you allow function pointers to be called. You can manually set up recursion that way. So, I think you would need to restrict any call to any function that is not a compile time constant that can be checked against the function in question.
What is the purpose for this restriction? Having done some embedded coding, I can see a use in that field.
IIRC, LLVM supports tail recursion. In that case, you could have an unknown amount of recursion at compile time yet still have a constant stack size.
In order for this feature to work, zig would have to know the set of functions a function pointer could potentially call. Difficult, but perhaps not impossible.
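One conservative way to get that set, sketched in Python: treat every function whose address is taken and whose signature matches the pointer type as a possible target. This is an assumption about how it could be done, not a description of what Zig does; `resolve_indirect_calls` and all the sample names are hypothetical.

```python
def resolve_indirect_calls(direct_calls, indirect_sites, address_taken, signatures):
    """Return a call graph in which each indirect call site has an edge to
    every address-taken function whose signature matches the pointer type."""
    graph = {func: set(callees) for func, callees in direct_calls.items()}
    for caller, pointer_signature in indirect_sites:
        targets = graph.setdefault(caller, set())
        for candidate in address_taken:
            if signatures[candidate] == pointer_signature:
                targets.add(candidate)
    return graph


# Hypothetical program: `dispatch` calls through a pointer of type fn(i32) void.
direct_calls = {"dispatch": [], "on_read": [], "on_write": [], "reset": []}
indirect_sites = [("dispatch", "fn(i32) void")]
address_taken = {"on_read", "on_write", "reset"}
signatures = {
    "on_read": "fn(i32) void",
    "on_write": "fn(i32) void",
    "reset": "fn() void",
}
graph = resolve_indirect_calls(direct_calls, indirect_sites, address_taken, signatures)
```

Here `graph["dispatch"]` contains `on_read` and `on_write` but not `reset`. The over-approximation is sound for a stack bound, but it can report recursion that never happens at run time.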
Yes, when computing the stack size requirements, zig should identify tail recursion and not count it.
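A Python sketch of how tail recursion could be left out of the count, assuming each call edge is flagged as a tail call or not and that the backend guarantees the elimination. `stack_bound` and the sample data are hypothetical. Only self tail calls are skipped; other tail calls are still counted as ordinary calls, which overestimates but stays a valid upper bound.

```python
def stack_bound(func, frames, calls, _active=()):
    """Upper bound in bytes; `calls[f]` is a list of (callee, is_tail_call)."""
    if func in _active:
        raise ValueError("unbounded recursion: " + " -> ".join(_active + (func,)))
    chain = _active + (func,)
    deepest = 0
    for callee, is_tail_call in calls[func]:
        # A self tail call reuses the current frame, so it adds no stack.
        if callee == func and is_tail_call:
            continue
        deepest = max(deepest, stack_bound(callee, frames, calls, chain))
    return frames[func] + deepest


# Hypothetical example: `loop` calls `step`, then tail-calls itself.
frames = {"loop": 40, "step": 24}
calls = {"loop": [("step", False), ("loop", True)], "step": []}
```

`stack_bound("loop", frames, calls)` is 64, even though the number of iterations is unknown at compile time. Changing the self call to a non-tail call makes the function raise instead.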
(Can't figure out how to quote a list)
I think the most common use cases here will be:
If you want to be able to have a threading model like Erlang or Go, then you would need to do a lot of gymnastics anyway in the run time (and forget embedded). It is unclear that you can scale to that many independent threads without GC, realistically. (As an aside, in the embedded world if you want to retain many threads of control, people tend to use protothreads anyway. Adam Dunkels is amazing.)
My arguments boil down to two things. In the embedded space you already have so many restrictions that accepting harsh ones in order to calculate a static stack size is fine; a proven static stack size is very, very desirable, and you really do not care about the standard library. In the rest of computing, you have enough RAM and CPU to not worry about it.
I have run tests on Linux and created 1000 threads (that did something, not just looping) and found that each used about 32-64k for the stack in real physical RAM. That was 1000 threads in less than 64MB of space. Is that really a problem? With the recent massive increase in physical cores (thank you, AMD) that is happening, having a pile of real threads is not a problem. In Linux I ran the same kind of test, but fork()ed and created new processes and did not get much of a difference. It is very efficient.
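The arithmetic behind that figure, as a quick Python check. It assumes "k" means KiB and "MB" means MiB; `total_stack_mib` is a hypothetical helper name.

```python
KIB = 1024
MIB = 1024 * KIB


def total_stack_mib(threads, stack_kib_per_thread):
    """Total resident stack memory in MiB for `threads` threads."""
    return threads * stack_kib_per_thread * KIB / MIB


low = total_stack_mib(1000, 32)   # 31.25 MiB
high = total_stack_mib(1000, 64)  # 62.5 MiB
```

So even at the high end of the measured range, 1000 threads stay just under 64 MiB.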
What kinds of restrictions would be necessary to have a static stack size calculated? I think almost anything that could be compile time could also have this property. There are surely some edge cases I am missing though :-(
I think this feature is neat for coroutines as a library when combined with long jump. If you look at BearSSL, the author wrote his own statically checked language just to get this feature. It let him do zero allocations in the library, but still use coroutines to yield control flow.
I started a discussion on the llvm-dev mailing list.
I've wanted this feature from a low-level programming language for a long time. It would be very useful in some embedded systems.
As thejoshwolfe mentions, it would probably be good to have both the immediate stack requirement (not sure what the use case is though) and the worst-case requirement.
There are obviously a lot of things you can't do in these functions (I see you covered everything I thought about and then some in the llvm-dev thread, andrewrk). But that's OK. Sometimes it's worth it to work within certain restrictions to be able to prove certain things (like pure/functional functions and global state).