Thoughts on making Step and ProtocolContext cheaper #139

martinthomson · 2022-10-19T06:07:07Z

We don't really know whether these structures are going to impose significant overheads just yet, but we have identified reasons for both needing some attention.

Warning: lots of rust code that I just typed into a text editor. No guarantees that any of this works.

ProtocolContext

I'm advocating for ProtocolContext being cheap so that we can make copies for every step and every record. This should improve API ergonomics. The structure (as of writing) is:

pub struct ProtocolContext<'a, N> {
    role: Identity,
    step: UniqueStepId, // or Step
    prss: &'a PrssEndpoint,
    gateway: &'a Gateway<N>,
}

That's a very small enum for role (1 bit possible, 8 bits likely), an owned string (yikes), and two references (64 bits each maybe, though this might be compressed on some platforms).

The primary offender here is step, which we'll get back to, but it seems like there is an obvious fix here. Moving the shared state into another object could make this nicer. One way to do that is to do something like this:

trait HelperState {  // name TBD
  type Network: crate::protocol::Network;
  fn role(&self) -> Identity;
  fn gateway(&self) -> &Gateway<Self::Network>;
  fn prss(&self) -> &PrssEndpoint;
}

pub struct ProtocolContext<'a, H> {
  global: &'a H,
  step: UniqueStepId, // or Step
}

impl<H: HelperState> for ProtocolContext<'_, H> {
  // This can access self.global.prss() rather than self.prss
}

This style of indirection might be necessary to allow the in-memory networking to refer to a single global state and a real networking setup to refer to a single helper-wide object. Either way, the overall cost is then dominated by the step identifier. The interface is no more complex. The only cost is following a pointer when we want something.

Step

In debugging and testing, we want something that is descriptive of the process. Our current approach describes steps with strings and then composes those together into something like a path.

We can use that process happily in debugging, but step identifiers need to be passed around a LOT and that might hurt performance when we need to shift a lot of data. What we need is a process that produces an object that can be represented in memory as a simple integer.

One way to do that is to map the path-like structure we naturally produce onto an integer, assigning each node in the path (including "directories") an integer value.

protocol => 0
protocol/a => 1
protocol/a/x => 2
protocol/a/y => 3
protocol/b => 4
protocol/c => 5
protocol/c/x => 6

Then we have a way to represent any step in that specific process with a simple and small number¹. The obvious way to do this is to collect all of the steps that are used (including intermediate nodes), printing them out, (maybe) sorting them, and assigning each a number in sequence.

We can't do the number thing until we have a complete understanding of all of the steps involved in a protocol execution. For that, we can simply assemble a complete protocol, run it using generic steps, and collect the unique steps that are created during execution. From that, we can develop a complete set of states and a mapping to a number. This might be turned into a distinct type using either code generation (in build.rs say), a macro, or some combination of the two².

From this we can produce a wrapper around an integer type that encodes this information efficiently based on calls to the Step::narrow() function. Recall that each use of narrow() takes an arbitrary enum type as input. We can then define a type that captures this information.

Taking the example above, there are two types involved in narrowing, which we might say are enums SubstepABCD and SubstepXY.

First, we establish a general interface that allows the new type we are defining to accept sub steps:

trait StepNarrow<SS> {
  fn narrow(self, sub_step: SS) -> Self;
}

Then, using whatever code generation method we decide, we define a new type of step.

trait Step: AsRef<str> {}
struct ExampleStep(u8);
impl Step for ExampleStep {}

Then we define the state transitions that are valid as follows:

impl StepNarrow<SubstepXY> for ExampleStep {
  fn narrow(self, ss: Self::SS) -> Self {
    Self(match (self.0, ss) {
      (1, SubstepXY::X) => 2,
      (1, SubstepXY::Y) => 3,
      (5, SubstepXY::X) => 6,
      _ => panic!("cannot narrow with {ss.as_ref()} from state {self.as_ref()}");
    })
  }
}

impl StepNarrow<SubstepABCD> for ExampleStep {
  fn narrow(self, ss: Self::SS) -> Self {
    assert_eq!(self.0, 0, "cannot narrow with {ss.as_ref()} from state {self.as_ref()}");
    Self(match ss {
      SubstepABCD::A => 1,
      SubStepABCD::B => 4,
      SubStepABCD::C => 5,
    }) // this form might be more common, but it might not be worth implementing this way
       // because a single form is probably easier; I only did this because I'm hand-rolling code.
  }
}

For debugging purposes we would still implement AsRef<str> for the new type (plus ensuring that it is implemented for each sub-step). For the step type, this can use a macro that just maps back to the debug-friendly strings that we use.

This requires that we make the entire protocol context generic on the protocol again, which is a bit annoying, but it should only manifest in the type definition for the protocol context. For this, we define a Step like so:

pub trait Step: AsRef<str> + Default {}
pub trait Substep: AsRef<str> {
}

In place of the struct we currently have. Then we take that implementation and we can define a blanket implementation for StepNarrow.

#[derive(Default)] // or implement it, whatever suits best
pub struct GenericStep {
  s: String,
  // + all the debug stuff
}
impl<SS: Substep + ?Sized> StepNarrow<SS> for GenericStep {
  fn narrow(&self, substep: SS) -> Self {
    Self { s: self.s + '/' + substep.as_str(), /* etc... */ }
  }
}

We can also retain the implementation of Substep for str and String to make testing simpler, though this will necessarily only work for GenericStep.

When we instantiate a version of the protocol in tests we can do something clean like this:

  let world = make_world();
  let (c0, c1, c2) = make_contexts(&world);
  // cN has a type of `ProtocolContext<'a, Arc<InMemoryEndpoint>, GenericStep>`,
  // or something along those lines.  The type won't need to be exposed in the interface.
  // This works because the test code always uses `GenericStep`

And for a real helper where we care about performance, we can instantiate a ProtocolContext using ExampleStep or the FullIpaStep or whatever we call it³. Our performance tests can define a full step mapping so that they aren't burdened with strings.

Then the protocol a) won't compile if narrow() is passed the wrong types, b) will crash if we pass the wrong values in the wrong places, and c) will not have much overhead.

Note that we could also define an enum for this, but that might be complicated as you effectively need nodes to be Option<T> so that you can represent the parent node in the tree. Also, it's not clear that Rust will happily collapse the representation of that enum without extensive coaxing. That might be something we can explore later. That said, defining a flat enum is a distinct possibility. ↩
I tend to think that a macro is better because we can tune the list to remove nodes that are never used. Rust macros let you define a very clean interface in whatever format you prefer. The compiler provides some amount of type checking here as well, which is nice. ↩
We'll probably need a turbofish for that as we don't need to pass the starting value for the step if we require Default. ↩

The text was updated successfully, but these errors were encountered:

thurstonsand · 2022-10-21T20:27:16Z

what are you proposing is the process to assign globally unique ints to each step? would there be some kind of code generation that occurs on execution? or would you just run the protocol, have it print out all the steps, then manually assign int values to them? or something else?

martinthomson · 2022-10-23T23:44:02Z

I'm proposing that we run the entire process in "debug" mode using a generic implementation that can report on what steps were created (that can be behind a feature flag as it will likely require globally-consistent, shared mutable state and really affect performance). That will produce a list of steps in a form that we can then copy into the code. We might use macros or some codegen to turn that output into the code I described.

We might automate this further, but that's where I would start.

thurstonsand · 2022-10-25T15:19:23Z

@akoshelev mentioned one possibility where we implement some proc macro that we would add to all steps that would accomplish the same thing at compile time. I like the idea of that since it'd be less maintenance (aside from having to maintain a proc macro)

akoshelev · 2023-01-25T22:41:08Z

Another thing we may want to implement with macro/compile time approach is to measure the depth of the circuit. MP-SPDZ seems to be using topological sort for it, we may benefit from doing it as well.

martinthomson · 2023-04-25T07:32:17Z

Another thing to consider: We might want to consider renaming Step to Gate so that we can use G for the generic parameter, rather than S.

(Edit: We can maybe s/Substep/Step/ in this case. You step from gate to gate...)

akoshelev mentioned this issue Oct 21, 2022

New step model rejects attempts to use the same step on multiple records #135

Closed

akoshelev mentioned this issue Oct 28, 2022

Secure apply inv #165

Merged

martinthomson mentioned this issue Nov 20, 2022

Protocol context is a trait #253

Merged

martinthomson mentioned this issue Dec 22, 2022

Channel-per-step for sending #362

Closed

akoshelev mentioned this issue Jan 12, 2023

Explore whether cloning context can be made more efficient #380

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Thoughts on making Step and ProtocolContext cheaper #139

Thoughts on making Step and ProtocolContext cheaper #139

martinthomson commented Oct 19, 2022

thurstonsand commented Oct 21, 2022

martinthomson commented Oct 23, 2022

thurstonsand commented Oct 25, 2022

akoshelev commented Jan 25, 2023

martinthomson commented Apr 25, 2023 •

edited

Loading

Thoughts on making Step and ProtocolContext cheaper #139

Thoughts on making Step and ProtocolContext cheaper #139

Comments

martinthomson commented Oct 19, 2022

ProtocolContext

Step

Footnotes

thurstonsand commented Oct 21, 2022

martinthomson commented Oct 23, 2022

thurstonsand commented Oct 25, 2022

akoshelev commented Jan 25, 2023

martinthomson commented Apr 25, 2023 • edited Loading

martinthomson commented Apr 25, 2023 •

edited

Loading