Skip to content

Partially unrolling loops for better level consumption (Halo B-2)#2538

Merged
copybara-service[bot] merged 1 commit intogoogle:mainfrom
j2kun:loop-support-3
Jan 23, 2026
Merged

Partially unrolling loops for better level consumption (Halo B-2)#2538
copybara-service[bot] merged 1 commit intogoogle:mainfrom
j2kun:loop-support-3

Conversation

@j2kun
Copy link
Collaborator

@j2kun j2kun commented Jan 16, 2026

This PR implements Solution B-2 from the HALO paper:

Solution B-2: Unrolling the loops reflecting the required level. Loop unrolling reduces its iteration count, thus reducing the number of bootstrap. If computation in a loop iteration does not fully consume ciphertext levels, HALO unrolls the loop to fully utilize the levels recovered from bootstrap.

Implementation notes

When is the level budget decided?

The loop unrolling logic from HALO asks you to determine if the body of the loop "exhausts the level budget." HEIR, however, is choosing the level budget. I haven't figured out exactly how the integration will work at this point: will the level budget be fixed before the loop is attempted to be unrolled? Or will the loop unrolling logic happen simultaneously with the level budget selection? To avoid answering this question now, I punted by adding a "forceMaxLevel" pass option that lets tests behave as if this level budget were decided in advance.

The reason this is nontrivial is that, if I were to just call getMaxLevel on the entire IR to decide what my level budget is for unrolling, it may have a max level of 1. This was the case for my simple lit tests, where the entire program was the one (rolled) loop with a single mul (+ ops to maintain loop invariance): if you leave it rolled and bootstrap at every iteration, you never need more than one level (really, one level more than the number of levels required to bootstrap), and so you would never have enough levels to unroll. In this case it would be more optimal to increase the level budget so that you could unroll more and avoid bootstrapping as often.

Effective bootstrap level vs "max level"

The "forceMaxLevel" flag implies that we're still ignoring the "effective bootstrap level," i..e, the difference between the max level of the IR and the level after bootstrap is applied (which is less than the max level because bootstrap consumes levels).

I want to change this to an "effective bootstrap level", but I need the backend-specific data about how many levels bootstrap consumes. This is hinted at in simple_ckks_bootstrapping_test.cpp, that it depends on some configurable parameters (levelBudgetEncode and levelBudgetDecode, usually 3) and some implementation details (such as how many levels are used to implement mod1, in OpenFHE's case I think it's 14).

Because we compensate for this manually in ConfigureCryptoContext.cpp, making the change here would also require updating it there, and the corresponding integration fixes that go along with it.

Multiple iter_args

Most of the use cases we have for rolled loops are FHE kernels that involve a single accumulated ciphertext. However, if there are multiple iter_arg ciphertexts, this pass may be inefficient. Currently it calculates the largest allowable unroll factor for each iter arg, and then unrolls the loop according to the min. If there are two iter args with two very different unroll factors, e.g., X that can be unrolled by 2 and Y that can be unrolled by 10, then we could instead unroll by 10 (beneficial to bootstrap the Y path less), and insert bootstrap ops for the parts of the X path that require additional bootstrapping. Overall this would be less bootstrapping. However, it requires a bootstrapping placement method that can be applied to the loop body in isolation, and probably should be smarter than waterline bootstrapping. I'm punting here to a TODO.

@j2kun j2kun marked this pull request as draft January 16, 2026 22:34
@j2kun
Copy link
Collaborator Author

j2kun commented Jan 21, 2026

@asraa this one is ready for a first pass review. I left in a few FIXMEs that I intend to convert to TODOs before merging, as described in the PR description. There are also a handful of tests that are failing or timing out, I believe this is due to some small issues with the rewritten level analysis code.


for (Value iterArg : forOp.getRegionIterArgs()) {
if (isSecret(iterArg, solver)) {
if (iterArg.getNumUses() > 1) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out of convenience, there's a nice hasOneUse function that MLIR has

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@j2kun j2kun added the pull_ready Indicates whether a PR is ready to pull. The copybara worker will import for internal testing label Jan 23, 2026
@copybara-service copybara-service bot merged commit 634fbc6 into google:main Jan 23, 2026
6 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pull_ready Indicates whether a PR is ready to pull. The copybara worker will import for internal testing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants