Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Magic buffer sizes #69

Open
joto opened this issue Oct 6, 2016 · 2 comments
Open

Magic buffer sizes #69

joto opened this issue Oct 6, 2016 · 2 comments

Comments

@joto
Copy link

joto commented Oct 6, 2016

All the sample code contains the same lines with magic numbers:

ExecutionBuffer codeAllocator(8192);
Allocator allocator(8192);
FunctionBuffer code(codeAllocator, 8192);

I have been running into cases with more complex expressions where this is too small. But there is no documentation on how big this should be and how the different buffers fit together. Of course it would be even better if this allocation was dynamic, so I don't have to know...

As a side note: FunctionBuffer takes an unsigned as capacity while ExecutionBuffer and Allocator take a size_t which is inconsistent.

@MikeHopcroft
Copy link
Contributor

Thanks for raising this issue. Magic numbers are indeed problematic.

Historically speaking, eliminating the need for the magic numbers wasn’t a priority in Bing because their use case employed fixed-sized buffers in order to maximize performance. Using fixed-sized buffers allowed them to eliminate all heap allocations on the query processing path. Eliminating heap allocations mainly had an impact on reducing lock contention in a multi-threaded environment. The Bing use case went to extreme lengths to boost performance. As an example, they also used the virtual memory system to allocate guard pages beyond the end of buffers to catch overflows so that they could eliminate bounds checking on each byte write operation.

Most users of NativeJIT don’t need these extreme optimizations and could benefit from buffers that grow automatically as needed. I did a brief investigation and feel it would be feasible to add this support to the codebase without hurting the performance for those who want to use fixed sized buffers.

Before I get into the details, I'll suggest that in many use cases you can just pick magic numbers that are ridiculously large and move on. For example, give each buffer 1Mb. This is likely to work in the vast majority of cases and is still inexpensive on modern hardware.

If you are interested in eliminating magic numbers from the API, the details follow. If you are considering putting together a PR, please check with us first to increase the likelihood that we will be able to take your contribution.

  • Allocator This class is just a naïve implementation of IAllocator. One could easily implement an ExpandableAllocator class that would maintain an std::vector<void*> of buffers. As each buffer was depleted, a new one would be added. The simplest version of this class would only free buffers in its destructor, so its memory usage would grow over time to some high water mark. A more nuanced implementation might release excess buffers after calls to IAllocator::Reset().

  • ExecutionBuffer This class is also an IAllocator, but it has the additional requirement that the memory it allocates must have Executable Space Protection disabled. This involves configuring the virtual memory tables, which in some uses cases may be too expensive to do on a per-compilation basis. Those who aren't sensitive to the reconfiguration cost might consider an execution buffer
    that maps addition virtual memory pages as needed.

    One potential issue with this approach is that the JumpTable class associates jump targets with absolute memory addresses, so if the buffer were to grow, one would have ensure that the JumpTable entries remained valid. One approach would be to control the virtual memory mapping in such a way that the buffer can grow while retaining its original start address. Another would be to convert the JumpTable to use relative addresses. Another, less desirable, approach would be to relocate the JumpTable after resizing the buffer.

  • FunctionBuffer constructor This constructor takes a capacity parameter which is passed up the chain of base class constructors to CodeBuffer which uses the value to initialize m_bufferEnd, which is inspected by VerifyNoBufferOverflow() One would need to modify the EmitBytes() method to trigger buffer growth as needed, instead of allowing VerifyNoBufferOverflow() to throw.

@MikeHopcroft
Copy link
Contributor

With regards to @joto's side note, we're transitioning the code base to use size_t anytime we want a default size unsigned value. Since our compiler targets x64, we feel that there is no performance penalty to using size_t in our API (vs unsigned). Once we've finished transitioning to size_t, the only smaller types will be in places where we require a smaller size (e.g. field of a size-sensitive struct). In general, we're trying to avoid signed types as well unless the use case requires negative numbers. We will transition the code over time, but the priority is low - we're mainly fixing things as we encounter them while implementing higher priority changes. Please feel free to submit PRs on this topic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants