Oblivious attribution protocol [part 1 - attribution] #108

taikiy · 2022-10-10T09:58:24Z

Overview

This diff implements the first of the two parts ("attribution" and "aggregation") of oblivious attribution in a secure multiparty computation. Currently it supports "Last Touch" attribution only. The result it produces will be used in the second part, "aggregation", which requires sorting the list.

Details

The last touch attribution algorithm is described in detail here:
https://github.com/patcg-individual-drafts/ipa/blob/main/IPA-End-to-End.md#oblivious-last-touch-attribution

The first part of the protocol accesses and accumulates the data of each node in a sorted list, and produces an intermediate result that holds the accumulated credits for each breakdown key. The document also describes how the logic can be optimized as follows:

flag = current.stop_bit * successor.helper_bit * successor.trigger_bit
current.credit = current.credit + flag * successor.credit
current.stop_bit = flag * successor.stop_bit

This is what's implemented in last_touch_attribution(), with the multiplications done in an MPC. I believe there could be more optimizations if we put more constraints on the input, but for this round this basic, general implementation should be enough for us to measure baseline benchmarks.
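As a rough illustration of the update rule above, here is a plaintext sketch. It assumes plain `u64` values instead of secret shares, a step size that doubles on every pass, and stop bits that start at one; the names `Row` and `accumulate_credit` are hypothetical and are not taken from this diff.

```rust
/// One row of the sorted input, in the clear (the real protocol works on secret shares).
#[derive(Clone, Copy, Debug)]
struct Row {
    helper_bit: u64,  // 1 if this row has the same match key as the row before it
    trigger_bit: u64, // 1 for trigger events, 0 for source events
    credit: u64,      // trigger value; 0 for source events
    stop_bit: u64,    // local state, starts at 1
}

impl Row {
    fn new(helper_bit: u64, trigger_bit: u64, credit: u64) -> Self {
        Row { helper_bit, trigger_bit, credit, stop_bit: 1 }
    }
}

/// Applies the three-line update rule with step sizes 1, 2, 4, ...
/// Rows are visited in ascending order, so each successor is read
/// before it is updated in the same pass.
fn accumulate_credit(rows: &mut [Row]) {
    let mut step_size = 1;
    while step_size < rows.len() {
        for i in 0..rows.len() - step_size {
            let successor = rows[i + step_size]; // copy, so `rows[i]` can be borrowed mutably
            let current = &mut rows[i];
            let flag = current.stop_bit * successor.helper_bit * successor.trigger_bit;
            current.credit += flag * successor.credit;
            current.stop_bit = flag * successor.stop_bit;
        }
        step_size *= 2;
    }
}
```

For a sorted list holding a source followed by triggers worth 3 and 5 under one match key, then a source followed by a trigger worth 7 under another, the two source rows end with credits 8 and 7.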

taikiy · 2022-10-10T10:48:48Z

The Rust check workflow is failing because of a clippy error in a code comment. I'll fix this in the next commit when I address feedback.

src/protocol/attribute.rs

martinthomson

Just a short review, based on a few things that I noticed.

src/protocol/attribute.rs

src/protocol/attribution/accumulate.rs

benjaminsavage

This looks good to me. I recommend you coordinate with @martinthomson to see if you should land this first, or wait until he lands his first.

src/protocol/batch.rs

… not so subtle operations

…plement Iterator.

taikiy · 2022-10-17T19:00:18Z

Rebased onto @martinthomson's renewed Step scheme. The most recent commit to be reviewed: 9e4d537

src/protocol/batch.rs

src/protocol/attribution/mod.rs

akoshelev · 2022-10-17T23:15:53Z

src/protocol/attribution/accumulate_credit.rs

+                ));
+            }
+
+            let results = try_join_all(accumulation_futures).await?;


@martinthomson in many protocols there seems to be a pattern of generating a large number of futures and then polling them all. Do you see it as a problem, given that no work is done until the first poll?
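To make the "no work is done until the first poll" point concrete, here is a small standard-library-only sketch. It does not use `try_join_all`; it polls each future once by hand with a no-op waker. The names `NoopWake` and `lazy_futures_demo` are hypothetical.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing; enough to poll futures that never suspend.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Builds `n` futures up front, then polls each once.
/// Returns how many future bodies had run before and after polling.
fn lazy_futures_demo(n: usize) -> (usize, usize) {
    let calls = Arc::new(AtomicUsize::new(0));
    let mut futures: Vec<Pin<Box<dyn Future<Output = usize>>>> = (0..n)
        .map(|i| {
            let calls = Arc::clone(&calls);
            let fut: Pin<Box<dyn Future<Output = usize>>> = Box::pin(async move {
                calls.fetch_add(1, Ordering::SeqCst);
                i
            });
            fut
        })
        .collect();

    // Creating the futures has not executed any of their bodies yet.
    let before = calls.load(Ordering::SeqCst);

    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    for fut in futures.iter_mut() {
        // These futures contain no await points, so one poll completes them.
        assert!(matches!(fut.as_mut().poll(&mut cx), Poll::Ready(_)));
    }
    (before, calls.load(Ordering::SeqCst))
}
```

Building the vector only costs the allocations for the boxed futures; the bodies run when something polls them.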

akoshelev · 2022-10-17T23:23:33Z

src/protocol/attribution/mod.rs

+
+    fn next(&mut self) -> &Self {
+        self.count += 1;
+        self.id = format!("{}_{}", self.name, self.count);


There will be $O(N)$ allocations here; we probably want to mark this as a TODO to fix. I think @martinthomson was planning to add a cheap way to create a unique step id for release builds.

This should go away once we decide to move away from AsRef<str>. I understand AsRef<str> for steps is temporary for this prototype?
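One possible way to cut down the per-call allocation while steps are still `AsRef<str>` is to reuse the same `String` buffer. This is only a sketch under assumptions: the field layout, the `prefix_len` field, and the initial id `name_0` are hypothetical, and it is not the fix planned for release builds.

```rust
use std::fmt::Write;

/// Step counter that rewrites the numeric suffix of one reusable buffer.
struct IterStep {
    id: String,        // holds "<name>_<count>"
    prefix_len: usize, // length of "<name>_"
    count: u32,
}

impl IterStep {
    fn new(name: &str) -> Self {
        // Reserve room for the name, the underscore, and up to 10 digits of a u32.
        let mut id = String::with_capacity(name.len() + 11);
        id.push_str(name);
        id.push('_');
        let prefix_len = id.len();
        id.push('0');
        IterStep { id, prefix_len, count: 0 }
    }

    fn next(&mut self) -> &Self {
        self.count += 1;
        // Keep the "<name>_" prefix and overwrite only the counter digits.
        self.id.truncate(self.prefix_len);
        write!(self.id, "{}", self.count).expect("writing to a String cannot fail");
        self
    }
}

impl AsRef<str> for IterStep {
    fn as_ref(&self) -> &str {
        &self.id
    }
}
```

With the capacity reserved in `new`, later calls to `next` write into the existing buffer instead of building a new `String` with `format!`.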

akoshelev · 2022-10-17T23:27:48Z

src/protocol/attribution/accumulate_credit.rs

+/// Accumulation step for Oblivious Attribution protocol.
+#[allow(dead_code)]
+pub struct AccumulateCredit<'a, F> {
+    input: &'a Batch<AttributionInputRow<F>>,


This input format implies that somebody, after doing the sort, must convert the regular batch of sorted secret shares into this "expanded" format with helper bits. Is the plan to insert an "intermediate" step between sort and attribution to convert the share format, or do you want AccumulateCredit to eventually implement the conversion?

Helper bits will be a part of the sort output. IIUC, the sort protocol will output a vector (or vectors) of indices which correspond to the sorted locations of the input list. I think there will be an intermediate step where we might remove unnecessary fields like match keys and timestamps, but that is still unknown.


akoshelev · 2022-10-17T23:32:50Z

src/protocol/attribution/accumulate_credit.rs

+        // iteration, and the interaction do not depend on the calculation results
+        // of other elements, allowing the algorithm to be executed in parallel.
+
+        let mut iteration_step = IterStep::new("iteration");


Please correct me if I am wrong, but it seems we are using record_id as step and iteration_step as record_id. Is it intended?

iteration_step and multiply_step are used to create sub-contexts for each input row per step_size iteration.

This is the issue @benjaminsavage and I were talking about during the meeting. For each row in the input, I call the sub-protocol get_accumulated_credit. Since this is a sub-protocol, I wanted to pass a reference to the ctx, with each row i having RecordId::from(i). There are 4 multiplications in the sub-protocol, so let's give them steps mult_1, mult_2, .... With this setup, in the sub-protocol, the first multiplication's step should look like protocol/iteration_1/mult_1 with record_id=i. But when I try to run that, I get this error:

thread 'protocol::attribution::accumulate_credit::tests::accumulate' panicked at 'Refined 'protocol' with step 'mult_1' twice', src/protocol/mod.rs:94:13

This is because I'm using the same mult_1 for all rows even though they have different record ids. It seems that we should append record_id=i to the end of the step string to generate a unique step identifier.

As a workaround, I ended up calling another ctx.narrow() for each row to give them unique steps (i.e. "row_i"), and treating each multiplication as a separate record.

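The failure mode and the per-row workaround described in this thread can be mimicked with a toy model. Everything here is an assumption made for illustration: `StepRegistry` is a hypothetical stand-in for the check in src/protocol/mod.rs, it returns an error instead of panicking, and it ignores record ids entirely, which is the behavior being worked around.

```rust
use std::collections::HashSet;

/// Remembers every step path handed out and rejects repeats.
#[derive(Default)]
struct StepRegistry {
    seen: HashSet<String>,
}

impl StepRegistry {
    /// Narrows `parent` by `step`, failing if that exact path was already used.
    fn narrow(&mut self, parent: &str, step: &str) -> Result<String, String> {
        let path = format!("{}/{}", parent, step);
        if self.seen.insert(path.clone()) {
            Ok(path)
        } else {
            Err(format!("Refined '{}' with step '{}' twice", parent, step))
        }
    }
}

/// Reusing `mult_1` for every row: the second row is rejected.
fn shared_step_for_all_rows(rows: usize) -> Result<Vec<String>, String> {
    let mut registry = StepRegistry::default();
    (0..rows)
        .map(|_| registry.narrow("protocol", "mult_1"))
        .collect()
}

/// The workaround: narrow to a per-row step first, then to `mult_1`.
fn unique_step_per_row(rows: usize) -> Result<Vec<String>, String> {
    let mut registry = StepRegistry::default();
    (0..rows)
        .map(|i| {
            let row_ctx = registry.narrow("protocol", &format!("row_{}", i))?;
            registry.narrow(&row_ctx, "mult_1")
        })
        .collect()
}
```

In this model the per-row narrowing succeeds only because it makes every path distinct; the record id plays no part in the check.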

akoshelev · 2022-10-17T23:34:25Z

src/protocol/attribution/accumulate_credit.rs

+        // 1. Create credit and stop_bit vectors
+        // These vectors are updated in each iteration to help accumulate values and determine when to stop accumulating.
+
+        let one = Replicated::one(ctx.mesh().identity());


I wonder if it is possible to make credit and stop bit part of the input?

Credits and stop bits are the states/results generated by running this protocol, so it seems reasonable to create them inside this protocol.


benjaminsavage

This is all looking really good, with the exception of needing to work around that infra issue that prevents us from using the same step name on different records.

benjaminsavage · 2022-10-18T07:11:20Z

src/protocol/attribution/accumulate_credit.rs

+/// Accumulation step for Oblivious Attribution protocol.
+#[allow(dead_code)]
+pub struct AccumulateCredit<'a, F> {
+    input: &'a Batch<AttributionInputRow<F>>,


I have this strong suspicion that generating the "helper bits" within the MPC will be more efficient than trying to generate them in a separate step.

benjaminsavage · 2022-10-18T07:13:57Z

src/protocol/attribution/accumulate_credit.rs

+        // 1. Create credit and stop_bit vectors
+        // These vectors are updated in each iteration to help accumulate values and determine when to stop accumulating.
+
+        let one = Replicated::one(ctx.mesh().identity());


I agree with @taikiy - these are basically local variables to this protocol. They're all initialized to Replicated::one(), or one of the other inputs at the beginning of the loop (and over-written in each iteration) so there's no value in passing them in.

benjaminsavage · 2022-10-18T07:17:20Z

src/protocol/attribution/accumulate_credit.rs

+        // iteration, and the interaction do not depend on the calculation results
+        // of other elements, allowing the algorithm to be executed in parallel.
+
+        let mut iteration_step = IterStep::new("iteration");


We shouldn't have to do this. This is an issue with where that "check" happens in the code. Looking forward to an infra fix that helps avoid the need to do this!

benjaminsavage · 2022-10-18T07:25:39Z

src/protocol/attribution/accumulate_credit.rs

+
+        // first, calculate [successor.helper_bit * successor.trigger_bit]
+        let mut b = ctx
+            .multiply(RecordId::from(0))


This seems wrong. This should be a proper record_id, not just 0 or 1. This is related to the same infra issue from above - we need the ability to run N multiplications in parallel, with the same step, and only the record_id differing.

I agree this is wrong, too. The only way I was able to make the code work was either like this, or create yet another step for each multiplication and pass record_id = 0 to all of them, which seems equally wrong. I'm looking forward to fixing this part when the infra is updated.

taikiy · 2022-10-18T08:03:06Z

Addressed the latest PR comments. All should be good except for the steps part @benjaminsavage mentioned. Would it be possible to merge this and address the steps issue in a separate PR?

benjaminsavage · 2022-10-18T08:22:19Z

OK, that sounds reasonable. We will fix this soon (once possible). In the meantime - let's land this and you can move on to the next protocol!

martinthomson reviewed Oct 11, 2022

src/protocol/attribute.rs

martinthomson requested changes Oct 11, 2022

benjaminsavage reviewed Oct 11, 2022

src/protocol/attribute.rs

benjaminsavage reviewed Oct 11, 2022

src/protocol/attribute.rs

akoshelev reviewed Oct 11, 2022

akoshelev mentioned this pull request Oct 13, 2022

Create a wrapper around Vec that supports indexing by u32 #116

Open

taikiy force-pushed the attribution branch from 1af1e3f to ec655e3 on October 13, 2022 17:34

taikiy commented Oct 13, 2022

src/protocol/attribution/accumulate.rs

taikiy commented Oct 13, 2022

src/protocol/attribution/accumulate.rs

benjaminsavage reviewed Oct 14, 2022