Partially unrolling loops for better level consumption (Halo B-2)#2538
Merged
copybara-service[bot] merged 1 commit intogoogle:mainfrom Jan 23, 2026
Merged
Partially unrolling loops for better level consumption (Halo B-2)#2538copybara-service[bot] merged 1 commit intogoogle:mainfrom
copybara-service[bot] merged 1 commit intogoogle:mainfrom
Conversation
Collaborator
Author
|
@asraa this one is ready for a first pass review. I left in a few FIXMEs that I intend to convert to TODOs before merging, as described in the PR description. There are also a handful of tests that are failing or timing out, I believe this is due to some small issues with the rewritten level analysis code. |
This was referenced Jan 21, 2026
4c1006e to
5c9b8d8
Compare
asraa
approved these changes
Jan 22, 2026
lib/Transforms/Halo/Patterns.cpp
Outdated
|
|
||
| for (Value iterArg : forOp.getRegionIterArgs()) { | ||
| if (isSecret(iterArg, solver)) { | ||
| if (iterArg.getNumUses() > 1) { |
Collaborator
There was a problem hiding this comment.
out of convenience, there's a nice hasOneUse function that MLIR has
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR implements Solution B-2 from the HALO paper:
Implementation notes
When is the level budget decided?
The loop unrolling logic from HALO asks you to determine if the body of the loop "exhausts the level budget." HEIR, however, is choosing the level budget. I haven't figured out exactly how the integration will work at this point: will the level budget be fixed before the loop is attempted to be unrolled? Or will the loop unrolling logic happen simultaneously with the level budget selection? To avoid answering this question now, I punted by adding a "forceMaxLevel" pass option that lets tests behave as if this level budget were decided in advance.
The reason this is nontrivial is that, if I were to just call
getMaxLevelon the entire IR to decide what my level budget is for unrolling, it may have a max level of 1. This was the case for my simple lit tests, where the entire program was the one (rolled) loop with a single mul (+ ops to maintain loop invariance): if you leave it rolled and bootstrap at every iteration, you never need more than one level (really, one level more than the number of levels required to bootstrap), and so you would never have enough levels to unroll. In this case it would be more optimal to increase the level budget so that you could unroll more and avoid bootstrapping as often.Effective bootstrap level vs "max level"
The "forceMaxLevel" flag implies that we're still ignoring the "effective bootstrap level," i..e, the difference between the max level of the IR and the level after bootstrap is applied (which is less than the max level because bootstrap consumes levels).
I want to change this to an "effective bootstrap level", but I need the backend-specific data about how many levels bootstrap consumes. This is hinted at in simple_ckks_bootstrapping_test.cpp, that it depends on some configurable parameters (levelBudgetEncode and levelBudgetDecode, usually 3) and some implementation details (such as how many levels are used to implement mod1, in OpenFHE's case I think it's 14).
Because we compensate for this manually in
ConfigureCryptoContext.cpp, making the change here would also require updating it there, and the corresponding integration fixes that go along with it.Multiple
iter_argsMost of the use cases we have for rolled loops are FHE kernels that involve a single accumulated ciphertext. However, if there are multiple iter_arg ciphertexts, this pass may be inefficient. Currently it calculates the largest allowable unroll factor for each iter arg, and then unrolls the loop according to the min. If there are two iter args with two very different unroll factors, e.g.,
Xthat can be unrolled by 2 andYthat can be unrolled by 10, then we could instead unroll by 10 (beneficial to bootstrap theYpath less), and insert bootstrap ops for the parts of theXpath that require additional bootstrapping. Overall this would be less bootstrapping. However, it requires a bootstrapping placement method that can be applied to the loop body in isolation, and probably should be smarter than waterline bootstrapping. I'm punting here to a TODO.