Helper agg init: remove copy of report share data. #3303
Conversation
Previously, we copied report share data into the transaction callback (re-copying it on transaction retry). Notably, this required us to keep two copies of the output shares in memory. This was done to allow modification of the report aggregations based on information retrieved from the datastore, i.e. if the report was replayed.

Now, we remove the copy by passing an `Arc` of the relevant data into the transaction callback. Modification of the report aggregation is implemented via a `Cow`, which will be borrowed from the data in the `Arc` in the common case of no modification. The aggregation job writer is augmented with the ability to accept report aggregations already stored in a `Cow`, and is taught to be smart enough not to double-wrap `Cow` around its already-`Cow`'ed internal state.
Changes LGTM and I think they accomplish the goal.
Do you have memory profiles that show improvement? A few LGTMs are not necessarily evidence that it works. Broadly, it would be nice to document how we're profiling, if for no other reason than that I'm unfamiliar with Rust profiling strategies (but I don't think that should hold up this PR).
```diff
@@ -507,7 +507,7 @@ where
     aggregation_job: Cow::Borrowed(aggregation_job),
     report_aggregations: report_aggregations
         .iter()
-        .map(Cow::Borrowed)
+        .map(RA::borrow)
```
Check my understanding: at this point, `report_aggregations` is (roughly) a `Vec<impl ReportAggregationUpdate>`, which, concretely, could be a `struct ReportAggregationMetadata` or a `Cow` of it? In which case a naive `map(Cow::Borrowed)` would double-`Cow` a `Cow<'_, ReportAggregationMetadata>`, so we need a trait method to avoid this.
That's correct. The implementation of `ReportAggregationUpdate` over `Cow` exists to allow a `Cow<WritableReportAggregation>` (or a `Cow` of anything else that implements `ReportAggregationUpdate`) to be provided by the user of `AggregationJobWriter`, without needing to deal with `Cow<Cow<T>>` internally.
```diff
@@ -2373,24 +2375,23 @@ impl VdafOps {
         }

         // Write report shares, and ensure this isn't a repeated report aggregation.
-        try_join_all(report_share_data.iter_mut().map(|rsd| {
+        let report_aggregations = try_join_all(report_share_data.iter().map(|rsd| {
             let task = Arc::clone(&task);
```
Can we accomplish the goal of the PR by doing `.iter_mut()` and modifying the elements of `ReportShareData` in place? We would need to at least wrap it in a `std::sync::Mutex` for this to work, but it would avoid copying in all cases and save a bunch of code.

(Just checking whether you've considered it; it's probably a more involved change than I'm thinking.)
We can't modify `report_share_data`: it's in an `Arc` to avoid copying the `ReportShareData` inside the `Vec`, and `Arc` doesn't allow mutable references. Or, to look at it another way, we can't modify `report_share_data` because we might need the unmodified value if we retry the transaction callback.

(We used to be able to modify `report_share_data`, but that's because we were cloning the entire `Vec` and its contents with each transaction retry -- which is the copying we are trying to avoid with this PR.)
FWIW, I was able to coerce my suggestion into "working" by putting `report_share_data` in an `Arc<tokio::sync::Mutex<_>>`, but it then becomes difficult to take a `.report_aggregation` for use in the `AggregationJobWriter` without `.clone()`ing it -- i.e. I landed on needing to pass around a `Cow` anyway. So I think your approach is super-correct 👍. (Not that I doubted it; I was just curious where exactly `Mutex` breaks down.)
Yeah -- N.B. the problem with `Arc<Mutex<Vec<T>>>` is that, without a `Cow`, we would ultimately be modifying the `report_share_data`, which would remain modified in the case of a transaction retry (and on retry, we want to start from the same data as in the previous attempt, to ensure correctness).

As you note, we could do `Arc<Mutex<Vec<Cow<T>>>>`, but at that point I think we're better off just generating a new `Vec<Cow<T>>` based on the `report_share_data`, since we need to modify the vector anyway.
I re-tested this and it demonstrated the desired drop in peak memory usage (somewhat below 50%). I merged https://github.com/divergentdave/janus/tree/david/experiment-dhat-harness-2 onto this branch, built it in release mode with debug symbols, populated test fixture files by running the different subcommands several times, and then ran …
Thank you for profiling this!
Closes #3285.