runtime: temporarily workaround the LimitedMemoryPool limiting the concurrency of the contract runtime too much #10733
Conversation
```diff
@@ -258,7 +258,7 @@ impl NearVM {
 // that particular function call will be slower. Not to mention there isn't a
 // strong guarantee on the upper bound of the memory that the contract runtime may
 // require.
-LimitedMemoryPool::new(8, 64 * 1024 * 1024).unwrap_or_else(|e| {
+LimitedMemoryPool::new(256, 1 * 1024 * 1024).unwrap_or_else(|e| {
```
The size of each individual memory in the pool grows as required (see CodeMemory::resize). This grow operation is fallible, unfortunately, but at the very least with the change to the LIFO queue for the memory pool, in most "regular" applications only the required number of memories should actually grow. For a validator that tracks 4 shards, that would mean just 4 of these memories.
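For a rough sense of the numbers involved, here is some back-of-the-envelope arithmetic (mapping sizes only, not necessarily resident memory; the 64 MiB growth target is simply the old per-memory size, used here for comparison):

```rust
// Illustrative arithmetic for the pool configurations discussed above.
const MIB: usize = 1024 * 1024;

fn main() {
    let old = 8 * 64 * MIB; // previous pool: 8 memories x 64 MiB up-front
    let naive = 256 * 64 * MIB; // bumping only the count would reserve this much
    let new_initial = 256 * 1 * MIB; // this PR: 256 memories x 1 MiB up-front
    // e.g. a validator tracking 4 shards, assuming each in-use memory
    // eventually grows to the old 64 MiB size:
    let new_grown = 4 * 64 * MIB + 252 * 1 * MIB;

    println!("old pool:            {} MiB", old / MIB); // 512
    println!("naive 256 x 64 MiB:  {} MiB", naive / MIB); // 16384
    println!("new pool, initial:   {} MiB", new_initial / MIB); // 256
    println!("new pool, grown (4): {} MiB", new_grown / MIB); // 508
}
```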
At first glance, this looks good so I am approving to unblock. Still a thorough review from Leo would be great.
LGTM! Just a few comments around mutex handling :)
See individual commits and their descriptions for an easier review experience.
This PR is intended to be easily backportable with minimal risk, with perhaps only the 2nd commit being superfluous in that context. There are definitely things we can do to make this area much more reliable and sensible, but they would result in a much more invasive change. My belief is that we should land both this workaround and fix #10732 as a follow-up.
cc #10732
cc @posvyatokum: we will want this backported to 1.37.1. Let me know if you'd like me to prepare a backport and/or where you'd like it to end up.
runtime: hotfix to allow up to 256 concurrent runtime invocations
This is not a proper fix, as certain configuration settings can still
cause this number to be exceeded, but the changes needed there are a
little more involved than what I'd be comfortable making in a rush,
especially for a backport to the 1.37 release.
In order to keep the initial memory use reasonably bounded, I have
chosen to reduce the initial size of each memory mapping to a smaller
number. Since the runtime reuses the most recently returned memories
first, in use-cases with few concurrent accesses only that many
memories will get resized over time to fit all the required contracts,
thus keeping the memory use reasonably limited. This will result in
slightly less predictable function call performance for the first few
contract invocations that require more memory, but this seems like a
necessary tradeoff to avoid preallocating dozens of GB of mappings that
would later go mostly unused.
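A minimal sketch of the grow-on-demand idea (illustrative only; `PoolMemory` and `ensure_size` are hypothetical stand-ins for the real CodeMemory type and its resize operation, and the real pool is not a plain Vec):

```rust
// Illustrative only: memories start small and are grown lazily the first
// time a contract needs more space.
struct PoolMemory {
    buf: Vec<u8>, // stand-in for the real mmap-backed region
}

impl PoolMemory {
    fn with_capacity(bytes: usize) -> Self {
        Self { buf: Vec::with_capacity(bytes) }
    }

    /// Grow the memory if the contract needs more than is currently mapped.
    /// Mirrors the fallible nature of the real resize operation.
    fn ensure_size(&mut self, required: usize) -> Result<(), String> {
        if required > self.buf.capacity() {
            self.buf
                .try_reserve(required - self.buf.len())
                .map_err(|e| format!("failed to grow memory: {e}"))?;
        }
        Ok(())
    }
}

fn main() {
    // Many small memories up-front instead of a few huge ones.
    let mut pool: Vec<PoolMemory> =
        (0..256).map(|_| PoolMemory::with_capacity(1024 * 1024)).collect();

    // Under low concurrency only the memories that actually get checked out
    // ever grow; the rest stay at their 1 MiB initial size.
    let mut mem = pool.pop().expect("pool is non-empty");
    mem.ensure_size(48 * 1024 * 1024).expect("failed to grow memory");
    pool.push(mem); // LIFO return: this grown memory is reused first next time
}
```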
runtime: avoid losing memories when failing to resize them
This is not something that I am aware of happening, but a fresh look
at the code immediately spotted it as an icky part of the code. The new
ordering will still fail to resize and most likely lead to the
termination of the node, but if the caller chooses to do something else
about the error, this new setup will correctly return the previous
memory to the memory pool.
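A sketch of the reordering (simplified; `Pool`, `Memory`, and `try_resize` are hypothetical stand-ins, not the actual NearVM types): the memory is pushed back into the pool before the resize error is propagated, so a caller that recovers from the error does not permanently shrink the pool.

```rust
// Illustrative only: return the memory to the pool when resizing fails.
struct Memory { size: usize }
struct Pool { free: Vec<Memory> }

impl Pool {
    /// Check a memory out of the pool and grow it to `required` bytes.
    /// On a failed grow, the memory goes back into the pool instead of
    /// being dropped, so the pool does not silently lose capacity.
    fn get_resized(&mut self, required: usize) -> Result<Memory, String> {
        let mut mem = self.free.pop().ok_or("pool exhausted")?;
        if let Err(err) = try_resize(&mut mem, required) {
            // The old ordering effectively lost `mem` here; returning it
            // keeps the pool intact if the caller handles the error.
            self.free.push(mem);
            return Err(err);
        }
        Ok(mem)
    }
}

fn try_resize(mem: &mut Memory, required: usize) -> Result<(), String> {
    // Stand-in for the fallible resize (e.g. a remap that can fail).
    if required > usize::MAX / 2 {
        return Err("resize failed".to_string());
    }
    mem.size = required;
    Ok(())
}

fn main() {
    let mut pool = Pool { free: vec![Memory { size: 1 << 20 }] };
    match pool.get_resized(64 << 20) {
        Ok(mem) => println!("got a memory of {} bytes", mem.size),
        Err(err) => println!("{err}; pool still holds {} memories", pool.free.len()),
    }
}
```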
runtime: switch LimitedMemoryPool to a LIFO queue
crossbeam's implementation is a FIFO queue, which repeatedly touches
all of the memories regardless of the concurrency; this can result in
more memory use than strictly necessary in low-concurrency scenarios.
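To illustrate the difference, here is a minimal simulation (not the actual pool implementation): at concurrency 1, a FIFO queue cycles through every slot in the pool, while a LIFO stack keeps reusing the single most recently returned one.

```rust
use std::collections::{HashSet, VecDeque};

// Simulate repeated checkout/return at concurrency 1 and count how many
// distinct slots ever get handed out. Illustrative only; the real pool
// holds memories, not indices.
fn main() {
    let n: usize = 8;
    let rounds = 100;

    // FIFO: pop from the front, return to the back (the behavior of a
    // FIFO queue such as crossbeam's). Every slot eventually gets used.
    let mut fifo: VecDeque<usize> = (0..n).collect();
    let mut fifo_touched = HashSet::new();
    for _ in 0..rounds {
        let slot = fifo.pop_front().unwrap();
        fifo_touched.insert(slot);
        fifo.push_back(slot);
    }

    // LIFO: pop from the back, return to the back. At concurrency 1 only
    // one slot is ever used, so only that one memory would ever grow.
    let mut lifo: Vec<usize> = (0..n).collect();
    let mut lifo_touched = HashSet::new();
    for _ in 0..rounds {
        let slot = lifo.pop().unwrap();
        lifo_touched.insert(slot);
        lifo.push(slot);
    }

    println!("FIFO touched {} of {n} slots", fifo_touched.len()); // 8
    println!("LIFO touched {} of {n} slots", lifo_touched.len()); // 1
}
```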