Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify aggregator_core::task::Task #1524

Closed
Tracked by #2071
tgeoghegan opened this issue Jun 21, 2023 · 0 comments
Closed
Tracked by #2071

Simplify aggregator_core::task::Task #1524

tgeoghegan opened this issue Jun 21, 2023 · 0 comments
Assignees

Comments

@tgeoghegan
Copy link
Contributor

Janus' task representation contains a bunch of unnecessary information:

  • an aggregator never needs to know its own DAP API URL
  • a helper doesn't need to know the other aggregator's DAP API URL
  • a task cannot have more than one VDAF verify key
  • a task cannot have multiple collector auth tokens

And it's possible a task will only ever have a single aggregator auth token, too (we have to decide whether to support rotation of that credential). As we revisit the task creation API in the aggregator API (#1506), this leads to some awkward impedance mismatches. We should resolve those by amending the Task struct.

This goes hand-in-hand with changing the database schema for tasks, tracked in #1521.

Note that we serialize this structure to/from YAML when provisioning tasks with janus_cli, so as long as we intend to support task provisioning with that tool, the Task struct is arguably public API, so we should be careful about breaking this.

@tgeoghegan tgeoghegan self-assigned this Jun 30, 2023
tgeoghegan added a commit that referenced this issue Sep 19, 2023
The security analysis done for DAP-04 concluded that the VDAF verify key
must be agreed upon by aggregators before any reports may be aggregated
and cannot later be changed. That means a DAP task can only ever have a
single VDAF verify key. This commit simplifies the representation of
tasks in the various in-memory structures as well as the database to
reflect this. The biggest change is that we no longer have table
`task_vdaf_verify_keys` and instead add a row `vdaf_verify_key` to table
`tasks`.

Part of #1524, #1521
tgeoghegan added a commit that referenced this issue Sep 20, 2023
The security analysis done for DAP-04 concluded that the VDAF verify key
must be agreed upon by aggregators before any reports may be aggregated
and cannot later be changed. That means a DAP task can only ever have a
single VDAF verify key. This commit simplifies the representation of
tasks in the various in-memory structures as well as the database to
reflect this. The biggest change is that we no longer have table
`task_vdaf_verify_keys` and instead add a row `vdaf_verify_key` to table
`tasks`.

Part of #1524, #1521
tgeoghegan added a commit that referenced this issue Sep 21, 2023
We've decided that each task has exactly one aggregator and collector
auth token, and rotating those means rotating the task. This commit
updates the representations of auth tokens in the database and in memory
(`aggregator_core::task::Task`) accordingly.

Several tests relied upon tasks constructed in tests having two
aggregator auth tokens, one of each supported type. Those tests are
fixed to explicitly construct tasks using `DAP-Auth-Token` tokens where
necessary.

Given that there is now a single aggregator or collector auth token per
task, we could remove the `task_aggregator_auth_tokens` and
`task_collector_auth_tokens` tables and instead add columns
`aggregator_auth_token`, `aggregator_auth_token_type`,
`collector_auth_token` and `collector_auth_token_type` to `tasks`.

I chose not to do this because validating the correctness of those
columns would be tricky. We couldn't make them all `NOT NULL`, because a
helper task doesn't have a collector auth token and a task provisioned
via taskprov won't have either token (instead it'll use tokens from the
`taskprov_*_auth_tokens` tables). So we would have to write constraints
on the `tasks` table to ensure pairs of columns are either `NULL` or
`NOT NULL` in tandem, which I think are expressed more clearly in the
independent `task_*_auth_tokens` tables. Plus, if we ever _do_ decide to
support auth token rotation, it'll be easier to do with these tables in
place.

Part of #1524, #1521
tgeoghegan added a commit that referenced this issue Sep 22, 2023
* Tasks have only one aggregator, collector token

We've decided that each task has exactly one aggregator and collector
auth token, and rotating those means rotating the task. This commit
updates the representations of auth tokens in the database and in memory
(`aggregator_core::task::Task`) accordingly. In particular, the tables
`task_aggregator_auth_tokens` and `task_collector_auth_tokens` tables are
removed and folded into table `tasks`.

Several tests relied upon tasks constructed in tests having two
aggregator auth tokens, one of each supported type. Those tests are
fixed to explicitly construct tasks using `DAP-Auth-Token` tokens where
necessary.

Part of #1524, #1521
tgeoghegan added a commit that referenced this issue Sep 28, 2023
We want to enable aggregators to only store authentication token hashes
if they don't need the actual token (e.g., a helper only receives
aggregation sub-protocol messages). With the existing
`janus_aggregator_core::task::Task` structure, this is awkward, because
that struct was used both as a view into all of a task's parameters and
secrets, useful for test code that orchestrates the provisioning and
execution of a task, as well as a specific DAP participant's view of the
task, which is useful in Janus itself.

To resolve that tension, this commit introduces distinct structures for
those distinct views. `janus_aggregator_core::task::test_util::Task`
contains all of a task's parameters and secrets, regardless of any
protocol role. `janus_aggregatore_core::task::test_util::NewTaskBuilder`
is used in test contexts to build `test_util::Task` values, and then
`test_util::Task::{leader_view, helper_view}` then allow obtaining the
leader or helper specific view of the task as a
`janus_aggregator_core::task::AggregatorTask`. This holistic view of a
task is only used in test code, and thus we confine it to the
`test_util` module.

`AggregatorTask` is an aggregator's view of the task, and uses an enum
to represent the leader, helper or taskprov-provisioned helper specific
parameters. This task representation is what gets used by non-test
configurations of Janus. We also add `SerializediAggregatorTask`, which
much like `SerializedTask` is an adapter so that `AggregatorTask` values
can be serialized to and from YAML, for use in `janus_cli`.

The goal is to migrate all of Janus from the existing
`janus_aggregator_core::task::Task` and
`janus_aggregator_core::task::test_util::TaskBuilder` onto
`task::test_util::Task`, `task::test_util::NewTaskBuilder` and
`task::AggregatorTask`. Doing this all in one go would yield a huge diff
that is difficult to review, so in order to enable incremental adoption
of the new structures throughout Janus, we have them co-exist for now.
Once all usages of `task::Task` and `task::test_util::TaskBuilder` have
been removed, those types can be deleted and `NewTaskBuilder` can be
renamed to just `TaskBuilder`.

To see where all this is going, check out #2014.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 28, 2023
Adopts `janus_aggregator_core::task::test_util::NewTaskBuilder` and
`janus_aggregator_core::task::AggregatorTask` in the
`janus_aggregator_core::datastore` module. Much as the previous change
provides two kinds of `Task` structure, we now provide two sets of
methods for reading and writing tasks: one that deals in the new
`AggregatorTask` and the other which deals in the old `Task`.

We add routines for converting between `task::Task` and
`task::AggregatorTask` to make it easier for these two paths through the
datastore to co-exist. This conversion is lossy because `AggregatorTask`
only retains one of the aggregator endpoints, but this doesn't cause
substantial problems in Janus, and we can live it transitionally.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 28, 2023
Adopts `janus_aggregator_core::task::test_util::NewTaskBuilder` and
`janus_aggregator_core::task::AggregatorTask` in the
`janus_aggregator_core::datastore` module. Much as the previous change
provides two kinds of `Task` structure, we now provide two sets of
methods for reading and writing tasks: one that deals in the new
`AggregatorTask` and the other which deals in the old `Task`.

We add routines for converting between `task::Task` and
`task::AggregatorTask` to make it easier for these two paths through the
datastore to co-exist. This conversion is lossy because `AggregatorTask`
only retains one of the aggregator endpoints, but this doesn't cause
substantial problems in Janus, and we can live it transitionally.

Finally, the SQL schema for tasks is changed so that only the peer
aggregator's endpoint is stored.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 29, 2023
Adopts `janus_aggregator_core::task::AggregatorTask` in the
`janus_aggregator_core::query_type` and `janus_aggregator::query_type`
modules, making use of the routines for converting between
`janus_aggregator_core::Task` and
`janus_aggregator_core::AggregatorTask` to make this minimally
intrusive.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 29, 2023
We want to enable aggregators to only store authentication token hashes
if they don't need the actual token (e.g., a helper only receives
aggregation sub-protocol messages). With the existing
`janus_aggregator_core::task::Task` structure, this is awkward, because
that struct was used both as a view into all of a task's parameters and
secrets, useful for test code that orchestrates the provisioning and
execution of a task, as well as a specific DAP participant's view of the
task, which is useful in Janus itself.

To resolve that tension, this commit introduces distinct structures for
those distinct views. `janus_aggregator_core::task::test_util::Task`
contains all of a task's parameters and secrets, regardless of any
protocol role. `janus_aggregatore_core::task::test_util::NewTaskBuilder`
is used in test contexts to build `test_util::Task` values, and then
`test_util::Task::{leader_view, helper_view}` then allow obtaining the
leader or helper specific view of the task as a
`janus_aggregator_core::task::AggregatorTask`. This holistic view of a
task is only used in test code, and thus we confine it to the
`test_util` module.

`AggregatorTask` is an aggregator's view of the task, and uses an enum
to represent the leader, helper or taskprov-provisioned helper specific
parameters. This task representation is what gets used by non-test
configurations of Janus. We also add `SerializediAggregatorTask`, which
much like `SerializedTask` is an adapter so that `AggregatorTask` values
can be serialized to and from YAML, for use in `janus_cli`.

The goal is to migrate all of Janus from the existing
`janus_aggregator_core::task::Task` and
`janus_aggregator_core::task::test_util::TaskBuilder` onto
`task::test_util::Task`, `task::test_util::NewTaskBuilder` and
`task::AggregatorTask`. Doing this all in one go would yield a huge diff
that is difficult to review, so in order to enable incremental adoption
of the new structures throughout Janus, we have them co-exist for now.
Once all usages of `task::Task` and `task::test_util::TaskBuilder` have
been removed, those types can be deleted and `NewTaskBuilder` can be
renamed to just `TaskBuilder`.

To see where all this is going, check out #2014.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 29, 2023
Adopts `janus_aggregator_core::task::test_util::NewTaskBuilder` and
`janus_aggregator_core::task::AggregatorTask` in the
`janus_aggregator_core::datastore` module. Much as the previous change
provides two kinds of `Task` structure, we now provide two sets of
methods for reading and writing tasks: one that deals in the new
`AggregatorTask` and the other which deals in the old `Task`.

We add routines for converting between `task::Task` and
`task::AggregatorTask` to make it easier for these two paths through the
datastore to co-exist. This conversion is lossy because `AggregatorTask`
only retains one of the aggregator endpoints, but this doesn't cause
substantial problems in Janus, and we can live it transitionally.

Finally, the SQL schema for tasks is changed so that only the peer
aggregator's endpoint is stored.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 29, 2023
Adopts `janus_aggregator_core::task::AggregatorTask` in the
`janus_aggregator_core::query_type` and `janus_aggregator::query_type`
modules, making use of the routines for converting between
`janus_aggregator_core::Task` and
`janus_aggregator_core::AggregatorTask` to make this minimally
intrusive.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 29, 2023
Adopts `janus_aggregator_core::task::AggregatorTask` in the `taskprov`
module and code that uses it in `janus_aggregator`.

Of particular interest is that we do away with
`janus_aggregator_core::taskprov::Task` and instead represent taskprov
tasks as a `janus_aggregator_core::task::Task` with
`AggregatorTaskParameters::TaskProvHelper`. Hopefully we'll be able to
further unify handling of taskprov and regular tasks in the future.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 29, 2023
Adopt `janus_aggregator_core::task::AggregatorTask` throughout
`janus_aggregator_api`. Gratifyingly, this removes some hacks that
resolved impedance mismatches between the `PostTaskReq` and
`janus_aggregator_core::task::Task`, which are no longer needed because
`AggregatorTask` only has the peer aggregator's endpoint in it.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 29, 2023
Adopts `janus_aggregator_core::task::test_util::NewTaskBuilder` and
`janus_aggregator_core::task::AggregatorTask` in the
`janus_aggregator_core::datastore` module. Much as the previous change
provides two kinds of `Task` structure, we now provide two sets of
methods for reading and writing tasks: one that deals in the new
`AggregatorTask` and the other which deals in the old `Task`.

We add routines for converting between `task::Task` and
`task::AggregatorTask` to make it easier for these two paths through the
datastore to co-exist. This conversion is lossy because `AggregatorTask`
only retains one of the aggregator endpoints, but this doesn't cause
substantial problems in Janus, and we can live it transitionally.

Finally, the SQL schema for tasks is changed so that only the peer
aggregator's endpoint is stored.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 29, 2023
Adopts `janus_aggregator_core::task::AggregatorTask` in the
`janus_aggregator_core::query_type` and `janus_aggregator::query_type`
modules, making use of the routines for converting between
`janus_aggregator_core::Task` and
`janus_aggregator_core::AggregatorTask` to make this minimally
intrusive.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 29, 2023
Adopts `janus_aggregator_core::task::AggregatorTask` in the
`janus_aggregator_core::query_type` and `janus_aggregator::query_type`
modules, making use of the routines for converting between
`janus_aggregator_core::Task` and
`janus_aggregator_core::AggregatorTask` to make this minimally
intrusive.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 29, 2023
Adopts `janus_aggregator_core::task::AggregatorTask` in the `taskprov`
module and code that uses it in `janus_aggregator`.

Of particular interest is that we do away with
`janus_aggregator_core::taskprov::Task` and instead represent taskprov
tasks as a `janus_aggregator_core::task::Task` with
`AggregatorTaskParameters::TaskProvHelper`. Hopefully we'll be able to
further unify handling of taskprov and regular tasks in the future.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 29, 2023
Adopt `janus_aggregator_core::task::AggregatorTask` throughout
`janus_aggregator_api`. Gratifyingly, this removes some hacks that
resolved impedance mismatches between the `PostTaskReq` and
`janus_aggregator_core::task::Task`, which are no longer needed because
`AggregatorTask` only has the peer aggregator's endpoint in it.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 29, 2023
Adopts `NewTaskBuilder` and `AggregatorTask` across portions of the
test utilities and tests in the `janus_aggregator` module.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 29, 2023
Adopts `AggregatorTask` in `janus_cli`. There's a functional change
here: the YAML task accepted by `janus_cli` is now an `AggregatorTask`
(via `SerializedAggregatorTask`), containing an aggregator's specific
view of a task, meaning it contains just the peer aggregator's endpoint
URL. In the near future, there will be more changes to the task YAML
format so that for instance leader tasks only get the collector auth
token _hash_ and helper tasks only get the aggregator auth token _hash_.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 29, 2023
Adopts `NewTaskBuilder` and `AggregatorTask` across portions of the
test utilities and tests in the `janus_aggregator` module.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 29, 2023
Adopts `AggregatorTask`, `test_util::Task` and `NewTaskBuilder` across
the integration test and interop binaries modules.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 29, 2023
Adopts `AggregatorTask`, `test_util::Task` and `NewTaskBuilder` across
the integration test and interop binaries modules.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 29, 2023
Adopts `AggregatorTask` and `NewTaskBuilder` in `garbage_collector`
module.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 29, 2023
Removes the old `Task`, `SerializedTask` and `TaskBuilder` structures
now that everything uses the new stuff. `NewTaskBuilder` is renamed to
`TaskBuilder` since it's the only `TaskBuilder` we've got.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 29, 2023
Adopts `AggregatorTask` and `NewTaskBuilder` in `garbage_collector`
module.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 29, 2023
Adopts `AggregatorTask` and `NewTaskBuilder` across the helper's
aggregate share handlers and the leader's collection job driver.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 29, 2023
Adopts `AggregatorTask` and `NewTaskBuilder` across the helper's
aggregate share handlers and the leader's collection job driver.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 29, 2023
Adopts `AggregatorTask` and `NewTaskBuilder` across the aggregation job
handling components.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 29, 2023
Adopts `AggregatorTask` and `NewTaskBuilder` across the aggregation job
handling components.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 29, 2023
Adopts `AggregatorTask` and `NewTaskBuilder` across remainder of test
and non-test portions of `janus_aggregator::aggregator`, and the
graceful shutdown test.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 30, 2023
Adopts `AggregatorTask` and `NewTaskBuilder` across remainder of test
and non-test portions of `janus_aggregator::aggregator`, and the
graceful shutdown test.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 30, 2023
Adopts `AggregatorTask` in `janus_cli`. There's a functional change
here: the YAML task accepted by `janus_cli` is now an `AggregatorTask`
(via `SerializedAggregatorTask`), containing an aggregator's specific
view of a task, meaning it contains just the peer aggregator's endpoint
URL. In the near future, there will be more changes to the task YAML
format so that for instance leader tasks only get the collector auth
token _hash_ and helper tasks only get the aggregator auth token _hash_.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 30, 2023
Adopts `AggregatorTask` in `janus_cli`. There's a functional change
here: the YAML task accepted by `janus_cli` is now an `AggregatorTask`
(via `SerializedAggregatorTask`), containing an aggregator's specific
view of a task, meaning it contains just the peer aggregator's endpoint
URL. In the near future, there will be more changes to the task YAML
format so that for instance leader tasks only get the collector auth
token _hash_ and helper tasks only get the aggregator auth token _hash_.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 30, 2023
Adopts `AggregatorTask`, `test_util::Task` and `NewTaskBuilder` across
the integration test and interop binaries modules.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 30, 2023
Adopts `AggregatorTask`, `test_util::Task` and `NewTaskBuilder` across
the integration test and interop binaries modules.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 30, 2023
Removes the old `Task`, `SerializedTask` and `TaskBuilder` structures
now that everything uses the new stuff. `NewTaskBuilder` is renamed to
`TaskBuilder` since it's the only `TaskBuilder` we've got.

Part of #1524
tgeoghegan added a commit that referenced this issue Sep 30, 2023
Removes the old `Task`, `SerializedTask` and `TaskBuilder` structures
now that everything uses the new stuff. `NewTaskBuilder` is renamed to
`TaskBuilder` since it's the only `TaskBuilder` we've got.

Part of #1524
tgeoghegan added a commit that referenced this issue Oct 2, 2023
The leader only needs the hash of the collector auth token in order to
authenticate collection sub-protocol requests and thus we can avoid the
risk of storing the unhashed token.

This commit changes the `tasks` table to store a hash instead of the
token, and makes the corresponding change to
`janus_aggregator_core::task::AggregatorTask`. There are corresponding
changes to the aggregator API: the leader must now be provided the
collector auth token hash during task creation, and `TaskResp` now only
contains a collector auth token hash.

Part of #1509, #1524
tgeoghegan added a commit that referenced this issue Oct 2, 2023
The leader only needs the hash of the collector auth token in order to
authenticate collection sub-protocol requests and thus we can avoid the
risk of storing the unhashed token.

This commit changes the `tasks` table to store a hash instead of the
token, and makes the corresponding change to
`janus_aggregator_core::task::AggregatorTask`. There are corresponding
changes to the aggregator API: the leader must now be provided the
collector auth token hash during task creation, and `TaskResp` now only
contains a collector auth token hash.

Additionally, we extend the aggregator API's config endpoint so that it
now can report a list of features that will be used by divvi-up API to
decide how to interact with it. The only such feature, which Janus
unconditionally advertises, is `token-hash`, indicating that Janus
stores token hashes instead of unhashed tokens where possible.

Part of #1509, #1524
tgeoghegan added a commit that referenced this issue Oct 3, 2023
The leader only needs the hash of the collector auth token in order to
authenticate collection sub-protocol requests and thus we can avoid the
risk of storing the unhashed token.

This commit changes the `tasks` table to store a hash instead of the
token, and makes the corresponding change to
`janus_aggregator_core::task::AggregatorTask`. There are corresponding
changes to the aggregator API: the leader must now be provided the
collector auth token hash during task creation, and `TaskResp` now only
contains a collector auth token hash.

Additionally, we extend the aggregator API's config endpoint so that it
now can report a list of features that will be used by divvi-up API to
decide how to interact with it. The only such feature, which Janus
unconditionally advertises, is `token-hash`, indicating that Janus
stores token hashes instead of unhashed tokens where possible.

Part of #1509, #1524
tgeoghegan added a commit that referenced this issue Oct 4, 2023
The leader only needs the hash of the collector auth token in order to
authenticate collection sub-protocol requests and thus we can avoid the
risk of storing the unhashed token.

This commit changes the `tasks` table to store a hash instead of the
token, and makes the corresponding change to
`janus_aggregator_core::task::AggregatorTask`. There are corresponding
changes to the aggregator API: the leader must now be provided the
collector auth token hash during task creation, and `TaskResp` now only
contains a collector auth token hash.

Additionally, we extend the aggregator API's config endpoint so that it
now can report a list of features that will be used by divvi-up API to
decide how to interact with it. The only such feature, which Janus
unconditionally advertises, is `token-hash`, indicating that Janus
stores token hashes instead of unhashed tokens where possible.

Part of #1509, #1524
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant