Update to prio 0.18.1-alpha.3 #4472

Merged
tgeoghegan merged 21 commits into main from timg/prio-0.18
Mar 17, 2026
@tgeoghegan tgeoghegan commented Mar 16, 2026

To implement draft-ietf-ppm-dap-17+, we need draft-irtf-cfrg-vdaf-18, which means we need a new prio. 0.18.1-alpha.3 is the current release candidate (divviup/libprio-rs#1391). It has all the major expected changes for the next Prio release, so now is the time to bite off the messy changes and update.

Mostly this is straightforward stuff, renaming variables and fields to match name changes in prio. Some nuances are:

  • Some VDAFs lose a bits parameter and instead use max_measurement (this is because VDAF/prio now use tighter range checks)
  • bad_client integration test is re-enabled now that Janus and prio both use the same rand
  • Support for the L2FixedPoint VDAFs is removed, since it's gone from prio. This also means we can stop enabling feature experimental.
  • We can drop the fixed and cfg_if dependencies
  • Janus no longer has the fpvec_bounded_l2 feature
  • There's some outstanding messiness around some VDAF configuration parameters like Prio3SumVec::max_measurement. These are going to be harmonized in the specification by "Widen types used in VDAF config structs" (ietf-wg-ppm/draft-ietf-ppm-dap#777), and then we'll make it all make sense across taskprov, Janus, divviup-api and the various task representations we have.
  • The collector tool CLI no longer takes --bits since no VDAF has that parameter.

Part of #4402

@tgeoghegan tgeoghegan requested a review from a team as a code owner March 16, 2026 20:20
Comment thread Cargo.toml Outdated
pretty_assertions = "1.4.1"
# Disable default features so that individual workspace crates can choose to re-enable them
-prio = { version = "0.18.1-alpha.2", default-features = false, features = ["experimental"] }
+prio = { git = "https://github.com/divviup/libprio-rs", rev = "215cedb99f8b0d7395a5f885af53cd0398c878f3", default-features = false }
Contributor Author

This should be version = "0.18.1-alpha.3" once we release that.

@tgeoghegan
Contributor Author

tgeoghegan commented Mar 16, 2026

We should still do a mass rename of "prep" and "prepare" to "verifier" and "verify". In this PR, I only made the renames that are more or less forced by updating the prio dependency, to avoid adding unnecessary noise. Another thing I punted on was handling max_measurement as a u128. We agreed last week that the task config encodings in DAP should change, since u32 isn't big enough. The u128-ification of max_measurement will happen when we make the Janus changes to adopt that.

@tgeoghegan
Contributor Author

I just merged a change to libprio-rs to remove the L2FixedPointVec stuff. So that will break the Janus build unless we remove all the stuff gated on feature fpvec_bounded_l2. But I want to do that in a further PR to keep this one manageable. I think that we can leave it all in so long as we just change the interop Dockerfiles to not enable that feature anymore.

@tgeoghegan
Contributor Author

> I just merged a change to libprio-rs to remove the L2FixedPointVec stuff. So that will break the Janus build unless we remove all the stuff gated on feature fpvec_bounded_l2. But I want to do that in a further PR to keep this one manageable. I think that we can leave it all in so long as we just change the interop Dockerfiles to not enable that feature anymore.

Nope, that won't work. We'll have to delete L2FixedPointVec here. Ah, well, at least it should be overwhelmingly deletions, which are easy to review.

Comment thread core/src/test_util/mod.rs Outdated
let mut helper_prepare_transitions = Vec::new();

// Shard inputs into input shares, and initialize the initial PrepareTransitions.
println!("measurement: {measurement:?}");
Contributor

We probably want to be rid of this before merging.

Suggested change
-println!("measurement: {measurement:?}");

Contributor Author

Oops, well spotted. Rather rude of me to keep adding commits while I have this marked as ready to review...

@tgeoghegan tgeoghegan changed the title from "Move to latest prio 0.18.1" to "Update to prio 0.18.1-alpha.3" Mar 17, 2026
Contributor

@jcjones jcjones left a comment

This is a big'un. Please check me on that with_verify_init_fn one - that might be important.

Thanks for taking on this big :oof: of a changeset!

Comment thread messages/src/taskprov.rs Outdated
Comment thread messages/src/taskprov.rs Outdated
Comment thread core/src/vdaf.rs Outdated
))
},
);
let $vdaf = ::prio::vdaf::dummy::Vdaf::new(1).with_verify_next_fn(|_| {
Contributor

This looks like the same construction as aggregator.rs:1041 but it's not, and I think this one is wrong.

So here, for both FakeFailsPrepInit and FakeFailsPrepStep you call with_verify_next_fn -- which makes them functionally identical. By comparison, in aggregator.rs they're split up -- the Init step calls with_verify_init_fn while the Step calls next:

            VdafInstance::FakeFailsPrepInit => VdafOps::Fake(Arc::new(
                dummy::Vdaf::new(1).with_verify_init_fn(|_| -> Result<(), VdafError> {
                    Err(VdafError::Uncategorized(
                        "FakeFailsPrepInit failed at prep_init".to_string(),
                    ))
                }),
            )),

            #[cfg(feature = "test-util")]
            VdafInstance::FakeFailsPrepStep => {
                VdafOps::Fake(Arc::new(dummy::Vdaf::new(1).with_verify_next_fn(
                    |_| -> Result<VerifyTransition<dummy::Vdaf, 0, 16>, VdafError> {

Being not a VDAF expert I'm not sure what the consequences are, but it was odd to me that we called the same thing in two different arms here, and then I remembered the other file.

Contributor Author

You are correct, this was a copy-pasta error on my part. It didn't cause test failures because we just check for something failing with VdafError::Uncategorized, I think.
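
To illustrate why a check on the error variant alone would not catch the copy-paste, here is a minimal, self-contained sketch. The `VdafError` enum below is a stand-in that models only the variant mentioned in this thread, and `fails_at_init`, `fails_at_next`, `is_uncategorized` and `failed_with` are hypothetical helpers, not Janus or prio APIs. The loose check accepts either failure, while a check that also inspects the message tells them apart.

```rust
// Stand-in for prio's error type; only the variant discussed here is modeled.
#[derive(Debug)]
enum VdafError {
    Uncategorized(String),
}

// Hypothetical helpers standing in for the two fake failure modes.
fn fails_at_init() -> Result<(), VdafError> {
    Err(VdafError::Uncategorized("failed at init".to_string()))
}

fn fails_at_next() -> Result<(), VdafError> {
    Err(VdafError::Uncategorized("failed at next".to_string()))
}

// Loose check: passes for any Uncategorized error, whatever its origin.
fn is_uncategorized(result: &Result<(), VdafError>) -> bool {
    matches!(result, Err(VdafError::Uncategorized(_)))
}

// Strict check: additionally requires the expected message.
fn failed_with(result: &Result<(), VdafError>, expected: &str) -> bool {
    matches!(result, Err(VdafError::Uncategorized(msg)) if msg == expected)
}

fn main() {
    let init = fails_at_init();
    let next = fails_at_next();

    // Both failures satisfy the loose check, so swapping them goes unnoticed.
    assert!(is_uncategorized(&init) && is_uncategorized(&next));

    // The strict check distinguishes the two.
    assert!(failed_with(&init, "failed at init"));
    assert!(!failed_with(&next, "failed at init"));
}
```

Whether Janus tests should match on messages is a separate design question; the sketch only shows why the existing assertions would pass either way.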

dp_strategy: _,
max_measurement, ..
} => metrics
.aggregated_report_share_dimension_histogram
Contributor

I guess all these change the metrics, but they'll all change together at least.

Do we need to file a ticket to update any alerts?

Contributor Author

I do not think so. We define an alert:

- alert: janus_report_aggregation_rate
    expr: |-
      sum(rate(
        janus_aggregated_report_share_vdaf_dimension_count[1h]
      )) by (namespace)
      < on () group_left ()
      max(alert_threshold_janus_report_aggregation_rate)
    labels:
      severity: "CRITICAL"
    annotations:
      summary: "Report aggregation rate low"
      description: "Reports are not being aggregated at the expected rate in
        namespace {{ $labels.namespace }}."

That histogram will still exist, with the same bucket boundaries. However, I now think that my math is wrong: the previous code computed bits * length to get a sense of the total bit size of the measurement. max_measurement isn't the same thing, because it doesn't account for vector length. I'm not sure whether we should use max_measurement * length or log2(max_measurement) * length, though. The latter is consistent with the dimension we recorded before, but the former is consistent with what we do for Prio3Sum. IIRC our goal here is to bucket report share counts by the size of the measurement, so I think it'd be nice to be consistent about doing so in units of bits.

Comment thread integration_tests/tests/integration/common.rs Outdated
-PingPongError::CodecPrepMessage(_) => (
+PingPongError::CodecVerifierMessage(_) => (
format!("Couldn't decode {peer_role} prepare message"),
format!("{peer_role}_prep_message_decode_failure"),
Contributor

Suggested change
-format!("{peer_role}_prep_message_decode_failure"),
+format!("{peer_role}_verify_message_decode_failure"),

Comment thread aggregator/src/aggregator/error.rs
Comment thread aggregator/src/aggregator/error.rs
Comment thread aggregator/src/aggregator/error.rs
Comment thread aggregator_core/src/datastore/models.rs
Comment thread interop_binaries/src/commands/janus_interop_collector.rs
Contributor

@jcjones jcjones left a comment

LGTM!

Comment on lines +673 to +674
// Bucket written reports by the number of bits their
// representation uses
Collaborator

This description is a bit clearer:

Suggested change
-// Bucket written reports by the number of bits their
-// representation uses
+// Bucket written reports by the size of their
+// encoded measurement in field elements

.aggregated_report_share_dimension_histogram
.record(
-*max_measurement,
+max_measurement.next_power_of_two().ilog2() as u64,
Collaborator

In libprio we do max_measurement.ilog2() + 1. I think these would differ on powers of two.

Contributor Author

I agree that aligning with what libprio does is right, but what about the case where max_measurement is a power of 2? Then the +1 is unnecessary, right?

Collaborator

Taking 8 as an example, 8.ilog2() is 3 (as 2^3 is 8), so 8.ilog2() + 1 is 4, which is what we want since 8 is 0b1000. The +1 is necessary, but it's a constant offset, as ilog2() and counting bits happen to have the same boundary behavior. With this PR's code, we'd compute 8.next_power_of_two(), which is 8, then take the ilog2() of that, which is 3.

max_measurement.ilog2() + 1 also works with non-powers of two: for 7, the floored base two log is 2, so we get 3, and for 9, the floored base two log is 3, so we get 4.
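
The comparison above can be checked with a small sketch. The function names are hypothetical and chosen only for illustration; the bodies use the standard library's `u64::next_power_of_two` and `u64::ilog2`. Note that `ilog2` panics on zero, so this assumes `max_measurement` is at least 1.

```rust
// The computation in this PR: round up to a power of two, then take log2.
fn bits_via_next_power_of_two(max_measurement: u64) -> u32 {
    max_measurement.next_power_of_two().ilog2()
}

// The computation described for libprio: floored log2 plus one,
// which is the number of bits needed to represent the value.
fn bits_via_ilog2_plus_one(max_measurement: u64) -> u32 {
    max_measurement.ilog2() + 1
}

fn main() {
    // 7 = 0b111 needs 3 bits; both approaches agree.
    assert_eq!(bits_via_next_power_of_two(7), 3);
    assert_eq!(bits_via_ilog2_plus_one(7), 3);

    // 8 = 0b1000 needs 4 bits; the approaches differ on powers of two.
    assert_eq!(bits_via_next_power_of_two(8), 3);
    assert_eq!(bits_via_ilog2_plus_one(8), 4);

    // 9 = 0b1001 needs 4 bits; both approaches agree again.
    assert_eq!(bits_via_next_power_of_two(9), 4);
    assert_eq!(bits_via_ilog2_plus_one(9), 4);
}
```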

Comment on lines +719 to +724
// Each histogram bucket can contain up to
// Field128::modulus(), so 128 bits
Prio3Histogram { length, .. } => metrics
.aggregated_report_share_dimension_histogram
.record(
-u64::try_from(*length).unwrap_or(u64::MAX),
+*length as u64 * 128,
Collaborator

This is incorrect: each measurement is a one-hot vector, so there's no factor of 128.

Contributor Author

OK, I put it back to u64::try_from(*length).unwrap_or(u64::MAX)
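
As a rough illustration of the one-hot point, the sketch below encodes a histogram measurement (a bucket index) as a vector with a single 1. The `one_hot` and `histogram_dimension` helpers are hypothetical, and `u8` stands in for field elements, so this is not how prio represents the encoding internally. It only shows that each entry carries one bit of information, which is why the recorded dimension is just the length.

```rust
// Encode a bucket index as a one-hot vector of `length` entries.
// Returns None if the bucket index is out of range.
fn one_hot(length: usize, bucket: usize) -> Option<Vec<u8>> {
    if bucket >= length {
        return None;
    }
    let mut encoded = vec![0u8; length];
    encoded[bucket] = 1;
    Some(encoded)
}

// Dimension recorded for the metric: one unit per histogram bucket,
// saturating if usize does not fit in u64.
fn histogram_dimension(length: usize) -> u64 {
    u64::try_from(length).unwrap_or(u64::MAX)
}

fn main() {
    let encoded = one_hot(4, 2).expect("bucket index in range");
    assert_eq!(encoded, vec![0, 0, 1, 0]);
    assert_eq!(histogram_dimension(encoded.len()), 4);
}
```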

.saturating_mul(
u64::try_from(*length).unwrap_or(u64::MAX),
),
(length * (max_measurement.ilog2() as usize) + 1)
Collaborator

The parentheses are misplaced here:

Suggested change
-(length * (max_measurement.ilog2() as usize) + 1)
+(length * (max_measurement.ilog2() as usize + 1))
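
A short sketch of how the two expressions differ, using hypothetical function names and free-standing parameters in place of the VDAF config fields. In Rust, `as` binds more tightly than `+` and `*`, so the first form multiplies before adding 1, while the second adds 1 to the per-element bit count before multiplying by the length.

```rust
// Misplaced parentheses: computes (length * floor(log2(max))) + 1.
fn dimension_misplaced(length: usize, max_measurement: u64) -> usize {
    length * (max_measurement.ilog2() as usize) + 1
}

// Suggested form: computes length * (floor(log2(max)) + 1),
// i.e. vector length times bits per element.
fn dimension_suggested(length: usize, max_measurement: u64) -> usize {
    length * (max_measurement.ilog2() as usize + 1)
}

fn main() {
    // With 10 elements and max_measurement = 8 (4 bits per element):
    assert_eq!(dimension_misplaced(10, 8), 31);
    assert_eq!(dimension_suggested(10, 8), 40);
}
```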

@tgeoghegan tgeoghegan merged commit 5662de6 into main Mar 17, 2026
8 checks passed
@tgeoghegan tgeoghegan deleted the timg/prio-0.18 branch March 17, 2026 23:11
3 participants