[BGV]: add lazy relinearization #599
Hi @j2kun I would like to work on this. Could you shed some light on how I might approach it? I have gathered some basic knowledge about bgv, openfhe, and their conversion dialects. |
Sure! The basic idea is that an arith.muli lowers to the ops bgv.mul and bgv.relin. The relin is necessary to return the key basis from (1, s, s^2) to (1, s) after the bgv.mul, but sometimes the relin op can be delayed if the output of the mul is combined with other values in that same key basis (1, s, s^2). For example, if you multiply two pairs of values and then add their results, you can do it with one relin op after the addition, rather than two relin ops, one after each mul. The implementation of this feature would, I believe, require some relatively straightforward tablegen patterns. Look for opportunities to push a relin op further back, by matching on ops whose return values have a ciphertext type with the same key basis attribute. If you'd like, we could work through some examples in detail in this issue thread. |
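To make the DRR suggestion concrete, a first pattern might look roughly like the sketch below. This is only a sketch: the op names are written in the style of the bgv dialect, and it ignores the key basis attributes that the real relinearize op carries, which an actual pattern would have to match and rebuild.

```
// Rewrite add(relin(x), relin(y)) -> relin(add(x, y)), delaying the relin.
// Op names are illustrative; attribute handling is omitted.
include "mlir/IR/PatternBase.td"

def DelayRelinThroughAdd : Pat<
  (BGV_AddOp (BGV_RelinearizeOp $x), (BGV_RelinearizeOp $y)),
  (BGV_RelinearizeOp (BGV_AddOp $x, $y))>;
```

This only covers the add-of-two-relins case; for muls the key basis grows rather than staying fixed, which is where matching on the key basis attribute mentioned above comes in.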
While looking at
|
I didn't realize that relinearizing after repeated multiplications is a thing people do. I assume that:
|
It's not really something people do outside of very specific use cases. I think quite a few of the common libraries don't even implement "general" relinearization, because it'd blow up the evaluation key size so much. Coming back to the pass: My suggestion isn't necessarily to emit final programs with repeated multiplications (i.e., the OpenFHE pipeline would always contain |
I do worry a little bit that the key basis in the attributes would explode for some mul-heavy IRs, but I'm down to try it. Another approach would be to take any IR, using relins or not, and re-relinearize-ify it. That would be the most general approach, but couldn't be done with a simple set of patterns. I bet we could formulate it as an ILP if it gets to that. |
Yes, working through some examples would be helpful for me. I was thinking about this pattern
Do we have a possibility of reaching something like this? |
I think so: if we had off-topic question: is it better to add a third pattern |
x and y are not operations. Here I am considering both operands of |
I don't think that'd be valid (assuming we never want to exceed ctxt-degree three) as the output of |
I think perhaps the simplest initial test case would be

```
%x = bgv.mul %0, %1
%x_relin = bgv.relinearize %x
%y = bgv.mul %3, %4
%y_relin = bgv.relinearize %y
%z = bgv.add %x_relin, %y_relin
```

Which should map to

```
%x = bgv.mul %0, %1
%y = bgv.mul %3, %4
%z = bgv.add %x, %y
%z_relin = bgv.relinearize %z
```

So long as |
Looking back on https://www.jeremykun.com/2023/11/15/mlir-a-global-optimization-and-dataflow-analysis/#an-ilp-optimization-pass, I'm not seeing any real obstacle to adapting that ILP to this problem. Basically, after each BGV op in the IR, you add an "InsertRelin" variable (instead of "InsertReduceNoise"), and a "KeyBasisDegree" variable instead of a "NoiseAt" variable. The constraints are simpler because KeyBasisDegree is additive in each mul and unchanged by other ops, and we can constrain the KeyBasisDegree of each starting SSA value and any decrypted/returned SSA values to be 1. There would be no upper bound on the maximum allowable key basis degree, though in practice we may want to make this configurable. The objective would be to minimize the sum of InsertRelin variable values. @ahmedshakill what do you think, are you up for implementing an ILP? The existing implementation is here |
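Spelling that out (a sketch only; the big-M linearization and the convention that a fresh ciphertext has degree 1, so the $(1, s, s^2)$ basis is degree 2, are just one way to encode it): for each ciphertext SSA value $v$ defined by an op, let $r_v \in \{0, 1\}$ decide whether a relin is inserted immediately after the defining op, let $p_v$ be the degree the op produces before that optional relin, and let $d_v$ be the degree its users see. With a large constant $M$:

$$
\begin{aligned}
\min\ & \sum_v r_v \\
\text{s.t.}\ & d_v = 1 && \text{for fresh inputs and for returned/decrypted values}, \\
& p_v = d_a + d_b && \text{if } v = \texttt{bgv.mul}(a, b), \\
& p_v = d_a,\ d_a = d_b && \text{if } v = \texttt{bgv.add}(a, b)\ \text{(operands share a key basis)}, \\
& p_v - M\, r_v \le d_v \le p_v && \text{so } r_v = 0 \Rightarrow d_v = p_v, \\
& 1 \le d_v \le 1 + M (1 - r_v) && \text{so } r_v = 1 \Rightarrow d_v = 1.
\end{aligned}
$$

A cap on the maximum degree, or a per-relin cost that grows with $p_v$ (as discussed below), can be layered on top without changing the structure.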
Yeah, I am up for it. The task is interesting and the resources (blogs and discussions) are really good. I would hate to lose this opportunity. |
Mh, wouldn't you need to keep all the noise-related stuff in the problem, still? Otherwise, you could end up with a trivial solution that lets the degree explode throughout the computation and only relinearizes at the very end, which would be both very slow and absolutely catastrophic for noise growth. Maybe enforcing that no relin is "higher degree" than 3->2 is sufficient to make it work? I vaguely recall papers looking into things more optimal than the basic lazy relin, but I'm not sure if they showed sufficient performance gains to make it worthwhile. |
That is true. You could easily force it to have max degree 3, or add a cost to each relin based on the input degree. I didn't think relin affected noise growth though. Can't you modulus switch in a higher key basis? |
@ahmedshakill Any progress or help needed on this topic? Feel free to post a draft PR and I can take a look and offer some suggestions. |
It's taking me a bit of time. I'll update you on my progress and post a PR asap. |
Hi @j2kun it took me a while to respond. I have created a PR. In its current state it holds an initial template for the pass. Please have a look and let me know your thoughts. |
Btw, I realized that my "simple DRR pattern"-based approach probably wouldn't work, as it wouldn't correctly handle something like:

```
// Assume arg0, arg1, arg2, arg3 are "appropriately typed" function parameters
%0 = bgv.mul(%arg0, %arg1)
%1 = bgv.add(%0, %arg2)
%2 = bgv.mul(%1, %arg3)
```

I.e., if there is any kind of non-mul operation between the multiplications, the patterns wouldn't catch it. Even if you're going for a more ILP/optimization-based approach, I'd expect the mult-depth analysis to come in handy :) |
> what do you think, are you up for implementing an ILP?

I think now I understand the weight of the above line. Here is the current progress status:

- The lazy relin pass is visible in heir-opt.
- Looked into google or-tools tutorial on LP and MIP.
- Need an in-depth look into Jeremy's ILP blog.
- Translate the minimization problem into an ILP one.
- Introduce trait/interface
|
I can certainly help answer any questions if you have them. Feel free to chime in on Discord as well (see the link at https://heir.dev/community/) for less formal discussions.
|
Ideally this is done in a way that the optimization pass can be shared between all relinearizing ops across all schemes. In particular, we should implement a RelinearizeLike trait/interface, and implement the optimization pass against the interface.
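As a rough starting point, the interface declaration might look something like the sketch below; the name, namespace, and method are placeholders rather than a settled design.

```
// Sketch of a shared relinearize-like op interface; everything here is
// illustrative, including the namespace and method name.
include "mlir/IR/OpBase.td"

def RelinearizeLikeOpInterface : OpInterface<"RelinearizeLikeOpInterface"> {
  let description = [{
    An op that reduces the key basis of a ciphertext back to the standard
    (1, s) basis, e.g. bgv.relinearize or its counterparts in other schemes.
  }];
  let cppNamespace = "::mlir::heir";

  let methods = [
    InterfaceMethod<
      /*desc=*/"Returns the key basis degree of the operand ciphertext.",
      /*retTy=*/"int64_t",
      /*methodName=*/"getInputKeyBasisDegree">,
  ];
}
```

The optimization pass would then match on this interface rather than on bgv-specific ops, so the same lazy-relinearization logic applies to every scheme that provides a relinearize-like op.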