-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Thoughts on making Step and ProtocolContext cheaper #139
Comments
what are you proposing is the process to assign globally unique ints to each step? would there be some kind of code generation that occurs on execution? or would you just run the protocol, have it print out all the steps, then manually assign int values to them? or something else? |
I'm proposing that we run the entire process in "debug" mode using a generic implementation that can report on what steps were created (that can be behind a feature flag as it will likely require globally-consistent, shared mutable state and really affect performance). That will produce a list of steps in a form that we can then copy into the code. We might use macros or some codegen to turn that output into the code I described. We might automate this further, but that's where I would start. |
@akoshelev mentioned one possibility where we implement some proc macro that we would add to all steps that would accomplish the same thing at compile time. I like the idea of that since it'd be less maintenance (aside from having to maintain a proc macro) |
Another thing we may want to implement with macro/compile time approach is to measure the depth of the circuit. MP-SPDZ seems to be using topological sort for it, we may benefit from doing it as well. |
Another thing to consider: We might want to consider renaming Step to Gate so that we can use G for the generic parameter, rather than S. (Edit: We can maybe s/Substep/Step/ in this case. You step from gate to gate...) |
We don't really know whether these structures are going to impose significant overheads just yet, but we have identified reasons for both needing some attention.
Warning: lots of rust code that I just typed into a text editor. No guarantees that any of this works.
ProtocolContext
I'm advocating for ProtocolContext being cheap so that we can make copies for every step and every record. This should improve API ergonomics. The structure (as of writing) is:
That's a very small enum for role (1 bit possible, 8 bits likely), an owned string (yikes), and two references (64 bits each maybe, though this might be compressed on some platforms).
The primary offender here is
step
, which we'll get back to, but it seems like there is an obvious fix here. Moving the shared state into another object could make this nicer. One way to do that is to do something like this:This style of indirection might be necessary to allow the in-memory networking to refer to a single global state and a real networking setup to refer to a single helper-wide object. Either way, the overall cost is then dominated by the step identifier. The interface is no more complex. The only cost is following a pointer when we want something.
Step
In debugging and testing, we want something that is descriptive of the process. Our current approach describes steps with strings and then composes those together into something like a path.
We can use that process happily in debugging, but step identifiers need to be passed around a LOT and that might hurt performance when we need to shift a lot of data. What we need is a process that produces an object that can be represented in memory as a simple integer.
One way to do that is to map the path-like structure we naturally produce onto an integer, assigning each node in the path (including "directories") an integer value.
Then we have a way to represent any step in that specific process with a simple and small number1. The obvious way to do this is to collect all of the steps that are used (including intermediate nodes), printing them out, (maybe) sorting them, and assigning each a number in sequence.
We can't do the number thing until we have a complete understanding of all of the steps involved in a protocol execution. For that, we can simply assemble a complete protocol, run it using generic steps, and collect the unique steps that are created during execution. From that, we can develop a complete set of states and a mapping to a number. This might be turned into a distinct type using either code generation (in build.rs say), a macro, or some combination of the two2.
From this we can produce a wrapper around an integer type that encodes this information efficiently based on calls to the
Step::narrow()
function. Recall that each use ofnarrow()
takes an arbitrary enum type as input. We can then define a type that captures this information.Taking the example above, there are two types involved in narrowing, which we might say are enums
SubstepABCD
andSubstepXY
.First, we establish a general interface that allows the new type we are defining to accept sub steps:
Then, using whatever code generation method we decide, we define a new type of step.
Then we define the state transitions that are valid as follows:
For debugging purposes we would still implement
AsRef<str>
for the new type (plus ensuring that it is implemented for each sub-step). For the step type, this can use a macro that just maps back to the debug-friendly strings that we use.This requires that we make the entire protocol context generic on the protocol again, which is a bit annoying, but it should only manifest in the type definition for the protocol context. For this, we define a
Step
like so:In place of the struct we currently have. Then we take that implementation and we can define a blanket implementation for
StepNarrow
.We can also retain the implementation of
Substep
forstr
andString
to make testing simpler, though this will necessarily only work forGenericStep
.When we instantiate a version of the protocol in tests we can do something clean like this:
And for a real helper where we care about performance, we can instantiate a
ProtocolContext
usingExampleStep
or theFullIpaStep
or whatever we call it3. Our performance tests can define a full step mapping so that they aren't burdened with strings.Then the protocol a) won't compile if
narrow()
is passed the wrong types, b) will crash if we pass the wrong values in the wrong places, and c) will not have much overhead.Footnotes
Note that we could also define an enum for this, but that might be complicated as you effectively need nodes to be
Option<T>
so that you can represent the parent node in the tree. Also, it's not clear that Rust will happily collapse the representation of that enum without extensive coaxing. That might be something we can explore later. That said, defining a flat enum is a distinct possibility. ↩I tend to think that a macro is better because we can tune the list to remove nodes that are never used. Rust macros let you define a very clean interface in whatever format you prefer. The compiler provides some amount of type checking here as well, which is nice. ↩
We'll probably need a turbofish for that as we don't need to pass the starting value for the step if we require
Default
. ↩The text was updated successfully, but these errors were encountered: