
Research Prototype: Finer-grain Concurrency Control for Resource Instance nodes #35393

Closed
wants to merge 3 commits

Conversation

apparentlymart
Contributor

This PR is not intended to be merged, and is instead just a snapshot of some experimentation I've been working on.

For a long time now we've been interested in the idea of making batch requests to providers in various cases, so that we can reduce the number of roundtrips required when an underlying API allows describing multiple operations in a single request. (For older discussion on that, refer to hashicorp/terraform-plugin-sdk#66 and hashicorp/terraform-plugin-sdk#7.)

I've taken various swings at this problem over the years, and one thing that's consistently made these attempts difficult is that Terraform Core's execution model uses entire graph nodes as the unit of scheduling and applies its concurrency limit to the number of concurrent node executions rather than the number of concurrent provider requests. We're therefore not scheduling work at the right granularity to do cleverer things, such as noticing that multiple "refresh" requests are pending and that the provider knows how to coalesce them, so that we could send them all to the provider in a single round-trip.

This PR was an experiment toward one way to avoid that difficulty: allowing particular graph node types to opt out of the whole-node-level concurrency control in favor of handling concurrency control themselves inline.

To experiment with that idea, I've made all of the resource-instance-related graph node types enter their Execute methods without any concurrency limit whatsoever; instead, we pass the concurrency-limiting semaphore into the Execute method so that the implementation can acquire and release the semaphore itself in a more fine-grained way.
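
To make the shape of that concrete, here's a minimal sketch of what such an Execute method could look like. All of the identifiers here (Semaphore, EvalContext, buildReadRequest, recordResult, and so on) are placeholders I've invented for illustration, not the actual names in this branch:

```go
// Sketch only: placeholder names throughout.
func (n *nodeResourceInstance) Execute(ctx EvalContext, sem Semaphore) error {
	// Non-I/O glue work (expression evaluation, state bookkeeping, etc.)
	// runs outside the semaphore, so it doesn't consume a concurrency slot.
	req := n.buildReadRequest(ctx)

	// Hold a concurrency slot only for the duration of the provider round-trip.
	sem.Acquire()
	resp, err := n.provider.ReadResource(req)
	sem.Release()
	if err != nil {
		return err
	}

	// More non-I/O work after the request, again outside the semaphore.
	return n.recordResult(ctx, resp)
}
```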

In this case I made it hold the semaphore only while making a request to a provider, which is the most granular approach possible. I don't think that is actually the right level of granularity in practice though, because it causes us to tell the UI layer that all of the operations are happening concurrently, and so e.g. 500 "Refreshing" messages can appear all at once even though Terraform is still chunking through them in blocks of 10. The UI should only signal that an operation is starting once it's actually starting, so if we did this for real we'd want to acquire the semaphore at the same time as signalling that the refresh is starting and release it when we signal that it's finished.
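
Sketched in the same placeholder style as above, that variant would tie the concurrency slot to the UI hooks rather than to the raw provider call (the hook names here are approximations, not necessarily the real ones):

```go
// Sketch: acquire the concurrency slot before telling the UI the refresh is
// starting, so at most <concurrency> "Refreshing..." messages are ever shown
// as in-progress at once. Placeholder names throughout.
func (n *nodeResourceInstance) refresh(ctx EvalContext, sem Semaphore) (ReadResourceResponse, error) {
	req := n.buildReadRequest(ctx)

	sem.Acquire()
	defer sem.Release()

	ctx.Hooks().PreRefresh(n.Addr)  // UI: operation is genuinely starting now
	resp, err := n.provider.ReadResource(req)
	ctx.Hooks().PostRefresh(n.Addr) // UI: operation has finished
	return resp, err
}
```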


Since Terraform's execution is very provider-I/O-bound in most cases, this change alone doesn't make any material difference to the overall execution time: we're still spending most of our time waiting for the provider requests to complete.

However, this new approach would make it more feasible for us to, for example, centralize the handling of "refresh" requests so that they can be queued as soon as they become ready (ignoring the concurrency limit) and then the subsystem reading from that queue can be the one to acquire the semaphore, before taking multiple queued requests from the queue to handle all at once in a single provider round-trip.

That design exploits the fact that the concurrency limit inherently creates delays where requests become ready but cannot actually begin executing, which therefore creates an opportunity for many such requests to become ready before an execution slot becomes available, and then all of those requests can be evaluated together and batched whenever possible.
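
As a rough illustration of that design (nothing like this exists in this PR; all of the names below are hypothetical):

```go
// Sketch of a hypothetical centralized refresh queue. Node executions
// enqueue their refresh requests as soon as they become ready and wait for
// a reply; a single worker acquires one concurrency slot, drains whatever
// else queued up while it was waiting, and issues one coalesced provider
// call for the whole batch.
type refreshWork struct {
	req   ReadResourceRequest
	reply chan ReadResourceResponse
}

type refreshQueue struct {
	work chan refreshWork
	sem  Semaphore
}

func (q *refreshQueue) run() {
	for first := range q.work {
		q.sem.Acquire()

		batch := []refreshWork{first}
		// Collect everything that became ready while we waited for a slot,
		// without blocking for new arrivals.
	collect:
		for {
			select {
			case w := <-q.work:
				batch = append(batch, w)
			default:
				break collect
			}
		}

		// Hypothetical coalesced provider round-trip covering the batch.
		for i, resp := range q.readResourcesBatch(batch) {
			batch[i].reply <- resp
		}

		q.sem.Release()
	}
}
```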

In practice then, with our default concurrency limit of 10, I expect that a state with 500 objects that need refreshing (and that have no dependencies between them) would start by making 10 individual requests that happened to become runnable first, but then the remaining 490 would all get queued waiting for an execution slot. Terraform could then in principle send a single provider request for all 490 objects in the best case.

Of course, it's unlikely to be quite that ideal, for various reasons including but not limited to:

  • There's an upper limit on what is a reasonable number of refresh requests to bundle into a single gRPC request to a provider.
  • The objects often won't all be of the same type and in that case are less likely to be mutually-coalescable.
  • Underlying APIs often have their own limits for how many objects can be requested concurrently.

...but I expect this optimization would still be pretty profitable, and would be especially profitable in cases like using Terraform to manage user accounts where there tend to be many objects all of the same type, since that's the most commonly-supported batching scenario in underlying APIs. (Generalizations like GraphQL or Multipart-MIME batch processing endpoints can be better, but those facilities are not very common in APIs Terraform is typically used with.)

…elves

Terraform allows the user to configure a concurrency limit which is
intended to ensure that no more than a given number of network requests
(or similar) can be active simultaneously.

Previously that was always handled on a whole-graph-node basis, limiting
the number of nodes whose Execute methods can be running at a time. That's
a good broad default behavior, but over time we've moved towards doing
more work with fewer node types so that we can rely more on normal control
flow in situations where steps run sequentially anyway. As a result, we
now have a few node types whose Execute implementations represent a
potentially large number of sequential provider requests that currently
get treated as a single unit for concurrency-limit purposes.

Now we'll offer graph node types the option of doing their own
concurrency-limit handling, by implementing the new interface
GraphNodeExecuteSema instead of GraphNodeExecute. In that case, the main
graph walk won't acquire the semaphore automatically and will instead
pass it in to the node's Execute method, so the node can decide for
itself when to acquire and release the semaphore.

As of this commit there are no graph node types implementing the new
interface, and so this is really just achieving the same result in a
different way and should not cause any material behavior changes. Later
commits will change some particularly-complex node types to use the
new interface.
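
A plausible shape for that pair of interfaces, going only by the description above (the exact signatures in this branch may differ):

```go
// Existing behavior: the graph walker acquires the concurrency semaphore
// before calling Execute and releases it afterwards.
type GraphNodeExecute interface {
	Execute(ctx EvalContext) error
}

// New opt-in behavior: the walker leaves the semaphore alone and instead
// hands it to the node, which acquires and releases it as it sees fit.
type GraphNodeExecuteSema interface {
	Execute(ctx EvalContext, sem Semaphore) error
}
```
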
This is a helper for a soon-to-be common case of holding a semaphore only
for a short stretch of code representing a single I/O request, to
reduce the boilerplate required and to help ensure correct semaphore
usage discipline.
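
Such a helper might look roughly like this; the name withSemaphore and its signature are my guesses for illustration, not necessarily what this commit adds:

```go
// Hypothetical helper: run fn while holding one slot of the semaphore,
// releasing the slot even if fn panics, so call sites can't forget to pair
// Acquire with Release.
func withSemaphore(sem Semaphore, fn func() error) error {
	sem.Acquire()
	defer sem.Release()
	return fn()
}
```

Call sites would then wrap just the provider round-trip, leaving the surrounding glue code outside the concurrency limit.
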
Instead of holding the concurrency semaphore for the entire execution of
each resource instance graph node, we'll now tell the graph walker we want
these nodes to handle their own concurrency control and then hold the
semaphore only while we're making requests to a provider.

The resource instance graph nodes tend to do a bunch of steps in sequence,
with lots of non-I/O glue code between them, and so this should allow
better interleaving of work by letting the Go scheduler deal with
efficiently executing the CPU-bound work and using our concurrency
semaphore only for its documented purpose of coarsely limiting the number
of concurrently-active API requests.

github-actions bot commented Aug 1, 2024

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked as resolved and limited conversation to collaborators Aug 1, 2024