Rewrite `StackNesting` to use a single-pass algorithm #84688

rjmccall · 2025-10-04T07:26:01Z

The previous algorithm was doing an iterative forward data flow analysis followed by a reverse data flow analysis. I suspect the history here is that it was a reverse analysis, and that didn't really work for infinite loops, and so complexity kind of accumulated.

The new algorithm is quite straightforward and relies on the allocations being properly jointly post-dominated, just not nested. We simply walk forward through the blocks in consistent-with-dominance order, maintaining the stack of active allocations and deferring deallocations that are improperly nested until we deallocate the allocations above it.
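
As a rough illustration of the deferral idea only, here is a minimal sketch restricted to a single straight-line block. It is not the code in `StackNesting.cpp`; the function name `fix_nesting` and the tuple encoding of instructions are hypothetical, and control flow, dead-end regions, and the real SIL instructions are deliberately left out.

```python
def fix_nesting(instructions):
    """Reorder deallocations in one straight-line block so they nest properly.

    `instructions` is a list of (op, name) tuples, where op is "alloc",
    "dealloc", or anything else (passed through unchanged). This encoding
    is an assumption made for the sketch.
    """
    stack = []       # active allocations, innermost last
    pending = set()  # allocations whose dealloc was seen but had to be deferred
    out = []

    for op, name in instructions:
        if op == "alloc":
            stack.append(name)
            out.append((op, name))
        elif op == "dealloc":
            if name not in stack:
                raise ValueError(f"dealloc of inactive allocation {name!r}")
            # Record the request; it is only emitted once `name` is on top.
            pending.add(name)
            # Emit deallocations for as long as the top of the stack is pending.
            while stack and stack[-1] in pending:
                top = stack.pop()
                pending.discard(top)
                out.append(("dealloc", top))
        else:
            out.append((op, name))
    return out


badly_nested = [("alloc", "a"), ("alloc", "b"), ("dealloc", "a"), ("dealloc", "b")]
fixed = fix_nesting(badly_nested)
```

In this example the deallocation of `a` arrives while `b` is still on top, so it is deferred and emitted immediately after `b` is deallocated. Input that is already properly nested comes out unchanged.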

The reason I'm doing this, besides it just being a simpler, faster algorithm, is that modeling some of the uses of the async stack allocator properly requires builtins that cannot just be semantically reordered. That should be somewhat easier to handle with the new approach, although really (1) we should not have runtime functions that need this and (2) we're going to need a conservatively-correct solution that's different from this anyway because hoisting allocations is also limited in its own way.

The test cases that changed are... I don't think the new output is wrong under the current rules that are being enforced, but really we should be enforcing different rules, because it's not really okay to have broken stack nesting in blocks just because they don't lead to an exit. But it's broken coming into StackNesting, and I don't think the rewrite actually makes it worse, so...

The thing that concerns me most about the rewritten pass is that it isn't actually validating joint post-dominance on input, so if you give it bad input, it might be a little mystifying to debug the verifier failures.

rjmccall · 2025-10-04T07:26:12Z

@swift-ci Please test

docs/SIL/SIL.md

include/swift/SIL/SILBasicBlock.h

lib/SILOptimizer/Utils/StackNesting.cpp

test/SILOptimizer/stack_promotion.sil

rjmccall · 2025-10-08T04:49:22Z

@swift-ci Please test

eeckstein · 2025-10-08T06:26:24Z

@rjmccall can you please

- add unit tests
- run the benchmarks to rule out that something stupid went wrong with performance

rjmccall · 2025-10-08T15:51:31Z

> @rjmccall can you please
>
> - add unit tests
> - run the benchmarks to rule out that something stupid went wrong with performance

I saw your comment before, and I'm intending to write a lot more tests, yes. Right now I'm still just trying to get this thing working.

rjmccall · 2025-10-09T00:11:30Z

@swift-ci Please test windows

rjmccall · 2025-10-09T01:41:01Z

@swift-ci Please test windows

rjmccall · 2025-11-01T04:04:02Z

@swift-ci Please test

rjmccall · 2025-11-01T04:04:30Z

Now with the tweaked algorithm and a rather over-pedantic proof. Still needs benchmarking and (mostly) testing.

rjmccall · 2025-11-01T16:40:45Z

@swift-ci Please test

rjmccall · 2025-11-01T21:59:12Z

@swift-ci Please test compiler performance

rjmccall · 2025-11-03T15:54:29Z

Well, the performance run seems to show no drastic regressions, and that Windows failure seems unrelated, so I just need to write tests.

The previous algorithm was doing an iterative forward data flow analysis followed by a reverse data flow analysis. I suspect the history here is that it was a reverse analysis, and that didn't really work for infinite loops, and so complexity accumulated.

The new algorithm is quite straightforward and relies on the allocations being properly jointly post-dominated, just not nested. We simply walk forward through the blocks in consistent-with-dominance order, maintaining the stack of active allocations and deferring deallocations that are improperly nested until we deallocate the allocations above it. The only real subtlety is that we have to delay walking into dead-end regions until we've seen all of the edges into them, so that we can know whether we have a coherent stack state in them. If the state is incoherent, we need to remove any deallocations of previous allocations because we cannot talk correctly about what's on top of the stack.

The reason I'm doing this, besides it just being a simpler and hopefully faster algorithm, is that modeling some of the uses of the async stack allocator properly requires builtins that cannot just be semantically reordered. That should be somewhat easier to handle with the new approach, although really (1) we should not have runtime functions that need this and (2) we're going to need a conservatively-correct solution that's different from this anyway because hoisting allocations is *also* limited in its own way.

I've attached a rather pedantic proof of the correctness of the algorithm.

The thing that concerns me most about the rewritten pass is that it isn't actually validating joint post-dominance on input, so if you give it bad input, it might be a little mystifying to debug the verifier failures.
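
For readers unfamiliar with the phrase "consistent-with-dominance order" in the description above: one standard order with that property is reverse post-order over the CFG, in which every block appears after all of the blocks that dominate it. The sketch below is only an assumption about how such an order could be produced; the pass may compute its order differently, and the function name and the dictionary encoding of the CFG are hypothetical. It also does not model the delayed walk into dead-end regions.

```python
def reverse_post_order(successors, entry):
    """Return the blocks reachable from `entry` in reverse post-order.

    `successors` maps a block name to a list of successor block names
    (a hypothetical encoding chosen for this sketch). The recursion is
    fine for small examples but would need an explicit stack for large CFGs.
    """
    visited = set()
    post = []

    def visit(block):
        visited.add(block)
        for succ in successors.get(block, []):
            if succ not in visited:
                visit(succ)
        # A block is recorded only after everything reachable from it.
        post.append(block)

    visit(entry)
    return post[::-1]


diamond = {
    "entry": ["then", "else"],
    "then": ["merge"],
    "else": ["merge"],
    "merge": [],
}
order = reverse_post_order(diamond, "entry")
```

Here `entry` comes first and `merge` comes after both `then` and `else`, so a forward walk in this order has seen a block's dominators before it reaches the block.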

rjmccall · 2025-11-03T19:51:52Z

@swift-ci Please test

rjmccall · 2025-11-03T19:52:23Z

@eeckstein Tests added, ready for final review.

rjmccall requested review from eeckstein and jckarter as code owners October 4, 2025 07:26

eeckstein reviewed Oct 6, 2025

rjmccall force-pushed the stack-nesting-2 branch 2 times, most recently from 19b34ee to 30df880 Compare October 8, 2025 04:49

rjmccall force-pushed the stack-nesting-2 branch from 94538b0 to 0b5853e Compare November 1, 2025 04:03

rjmccall force-pushed the stack-nesting-2 branch from 0b5853e to 203f0f7 Compare November 1, 2025 16:40

rjmccall force-pushed the stack-nesting-2 branch from 203f0f7 to 8d231d2 Compare November 3, 2025 19:51

rjmccall merged commit ba47e23 into swiftlang:main Nov 4, 2025
5 checks passed

jamieQ mentioned this pull request Nov 5, 2025

Model async let begin/finish as builtins in SIL #84528

Merged
