refactor(experiment): make report goroutine safe #1612

bassosimone · 2024-06-05T13:01:01Z

This diff refactors *engine.experiment to make the report field goroutine safe. It also moves the code that was intermixed with the *engine.experiment methods to the bottom of experiment.go.

Part of ooni/probe#2607
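
As an illustration of what "goroutine safe" means for a field like this, here is a minimal sketch that assumes a mutex guards every read and write. The `experiment` and `report` types below are simplified stand-ins, not the actual `*engine.experiment` code, and the real diff may use a different mechanism.

```go
package main

import "sync"

// report is a hypothetical stand-in for the report handle held by the experiment.
type report struct {
	id string
}

// experiment is an illustrative type; it is NOT the actual *engine.experiment.
type experiment struct {
	mu     sync.Mutex // protects the report field
	report *report
}

// setReport stores the report while holding the mutex.
func (e *experiment) setReport(r *report) {
	e.mu.Lock()
	defer e.mu.Unlock()
	e.report = r
}

// getReport returns the current report (possibly nil) while holding the mutex.
func (e *experiment) getReport() *report {
	e.mu.Lock()
	defer e.mu.Unlock()
	return e.report
}

// reportID returns the report ID, or an empty string when no report is set.
func (e *experiment) reportID() string {
	if r := e.getReport(); r != nil {
		return r.id
	}
	return ""
}

// concurrentOpen lets n goroutines set and read the report concurrently
// and returns the ID observed once all of them have finished.
func concurrentOpen(n int) string {
	e := &experiment{}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			e.setReport(&report{id: "report-id"})
			_ = e.reportID()
		}()
	}
	wg.Wait()
	return e.reportID()
}
```

Routing every access through `setReport` and `getReport` keeps the locking in one place, so callers never touch the field directly.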

DecFox · 2024-06-05T17:03:08Z

LGTM!

We originally developed asynchronous experiments as a means to support websteps, a nettest that returned more than one measurement per invocation of its Measure method. Since then, we removed websteps. Therefore, this code is currently technically unused. Additionally, this code further complicates implementing richer input, because it is another way of performing measurements. As such, in the interest of swiftly moving forward with richer input _and_ of simplifying the engine, we are now removing this unused functionality from the tree.

While there, improve the documentation of OnProgress and undeprecate the functions that open and close reports, since it's clear that, for now, using a submitter in cmd/ooniprobe is a bit of a stretch given our current goals. So, it feels best to avoid deprecating them until we have a nice plan for a replacement.

To have more confidence that the job we did is ~correct, we use the following table to cross compare the operations that we previously performed for running sync experiments (i.e., all experiments) in an async way to the code we're using after this diff to run experiments. (Note that the diff itself is such that you can easily see the deleted and the added code inside of the `experiment.go` file.) In both cases, we're looking at the operations performed starting from `MeasureWithContext`.

| Operation                   | Before | After |
| --------------------------- | ------ | ----- |
| MaybeLookupLocationContext  | yes    | yes   |
| use e.session.byteCounter   | yes    | yes   |
| use e.byteCounter           | yes    | yes   |
| newMeasurement              | yes    | yes   |
| save start time             | yes    | yes   |
| initialize experiment args  | yes    | yes   |
| call measurer.Run           | yes    | yes   |
| save end time               | yes    | yes   |
| compute measurement runtime | yes    | yes   |
| scrub measurement           | yes    | yes   |
| return measurement          | yes    | yes   |

Part of ooni/probe#2607

Depends on #1612
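
The sequence of operations in the table above can be sketched as a single synchronous function. This is only an illustration under stated assumptions: the `measurement` and `measurer` types, the `measureSync` and `runDemo` functions, and the `lookup` callback are all hypothetical simplifications, byte counting and experiment args are omitted, and scrubbing is reduced to a flag.

```go
package main

import (
	"context"
	"errors"
	"time"
)

// measurement is a hypothetical, simplified stand-in for the engine's measurement.
type measurement struct {
	Input    string
	Runtime  float64 // seconds spent inside Run
	Scrubbed bool    // placeholder for the scrubbing step
}

// measurer models a synchronous experiment: one Run call fills one measurement.
type measurer interface {
	Run(ctx context.Context, m *measurement) error
}

// measureSync follows the order of the table: look up the location, create
// the measurement, save the start time, call Run, save the end time, compute
// the runtime, scrub, and return the measurement.
func measureSync(ctx context.Context, mx measurer, input string,
	lookup func(context.Context) error) (*measurement, error) {
	if err := lookup(ctx); err != nil {
		return nil, err
	}
	m := &measurement{Input: input}
	start := time.Now()
	err := mx.Run(ctx, m)
	stop := time.Now()
	m.Runtime = stop.Sub(start).Seconds()
	m.Scrubbed = true // a real implementation would scrub sensitive data here
	if err != nil {
		return nil, err
	}
	return m, nil
}

// fakeMeasurer is a measurer that does nothing and always succeeds.
type fakeMeasurer struct{}

func (fakeMeasurer) Run(ctx context.Context, m *measurement) error { return nil }

// runDemo runs measureSync with the fake measurer; when failLookup is true
// the location lookup fails and no measurement is produced.
func runDemo(failLookup bool) (*measurement, error) {
	lookup := func(context.Context) error {
		if failLookup {
			return errors.New("lookup failed")
		}
		return nil
	}
	return measureSync(context.Background(), fakeMeasurer{}, "https://example.com/", lookup)
}
```

With a single synchronous path like this, each call to the measurer yields at most one measurement, which is the property that made the asynchronous wrapper unnecessary once websteps was removed.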

Most of the existing code is designed to move around lists of `model.OOAPIURLInfo` and to measure such URLs. The `model.OOAPIURLInfo` type is like:

```Go
// internal/model/ooapi.go

type OOAPIURLInfo struct {
	CategoryCode string
	CountryCode  string
	URL          string
}
```

This design originally suited Web Connectivity but it's not good enough for richer input because it does not contain options. With this diff, we move in the direction of richer input by replacing `model.OOAPIURLInfo` lists with lists of:

```Go
// internal/model/experiment.go

type ExperimentTarget interface {
	Category() string
	Country() string
	Input() string
}
```

where `*model.OOAPIURLInfo` implements `model.ExperimentTarget` in a trivial way and where, additionally:

1. the `InputLoader` is modified to load `ExperimentTarget`;
2. the `Experiment` is modified to measure an `ExperimentTarget`.

In addition to applying these changes, this diff also adapts the whole tree to use `ExperimentTarget` in all places and adds a trivial constructor to obtain `OOAPIURLInfo` when the category code and the country code are unknown.

With this diff merged, implementing richer input for real is a matter of implementing the following changes:

1. the `*registry.Factory` has a new func field, defined by each experiment, that loads a list of `ExperimentTarget`;
2. we have a library for input loading containing the same code that we currently use for the input loader;
3. the `InputLoader` is gone and instead we use the factory (or its `*engine.experimentBuilder` wrapper) for input loading;
4. we modify the `ExperimentArgs` passed to the `ExperimentMeasurer` to contain an additional field that is the `ExperimentTarget` we want to measure;
5. each experiment that needs richer input type-casts from the `ExperimentTarget` interface to the concrete type that the experiment's richer input should have and accesses any option.

Part of #1612.

This implementation strategy emerged while discussing this matter with @ainghazal, thank you so much for that!

---------

Co-authored-by: DecFox <33030671+DecFox@users.noreply.github.com>
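
To make the plan above more concrete, here is a minimal sketch of how a URL-only type can satisfy the target interface and how an experiment could type-assert to a richer concrete type to read its options. The interface and the `OOAPIURLInfo` fields follow the description above; `newTargetWithDefaults`, the placeholder codes it uses, `richTarget`, `RepeatCount`, and `repeatCountOf` are hypothetical names introduced only for illustration.

```go
package main

// ExperimentTarget mirrors the interface described above.
type ExperimentTarget interface {
	Category() string
	Country() string
	Input() string
}

// OOAPIURLInfo mirrors the struct described above.
type OOAPIURLInfo struct {
	CategoryCode string
	CountryCode  string
	URL          string
}

// The three methods below are the "trivial" implementation of the interface.
func (o *OOAPIURLInfo) Category() string { return o.CategoryCode }
func (o *OOAPIURLInfo) Country() string  { return o.CountryCode }
func (o *OOAPIURLInfo) Input() string    { return o.URL }

// Compile-time check that *OOAPIURLInfo implements ExperimentTarget.
var _ ExperimentTarget = &OOAPIURLInfo{}

// newTargetWithDefaults is a hypothetical constructor for the case where the
// category and country codes are unknown; the placeholder values are assumptions.
func newTargetWithDefaults(url string) *OOAPIURLInfo {
	return &OOAPIURLInfo{CategoryCode: "MISC", CountryCode: "ZZ", URL: url}
}

// richTarget is a hypothetical richer-input type: a URL plus one option.
// Embedding OOAPIURLInfo promotes its methods, so *richTarget also
// implements ExperimentTarget.
type richTarget struct {
	OOAPIURLInfo
	RepeatCount int // hypothetical experiment-specific option
}

// repeatCountOf shows step 5 of the plan: the experiment type-asserts from
// the interface to its concrete type and falls back to a default otherwise.
func repeatCountOf(t ExperimentTarget) int {
	if rt, ok := t.(*richTarget); ok {
		return rt.RepeatCount
	}
	return 1
}
```

Code that only needs the input string keeps working with any `ExperimentTarget`, while an experiment that defines options pays the cost of a single type assertion.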

This commit moves the engine.InputLoader type to a new package called targetloading and adapts the naming to avoid stuttering. We therefore have engine.InputLoaderSession => targetloading.Session and other similar renames.

The reason for moving InputLoader is that the engine package depends on registry and, per the plan described by the first richer input PR, #1615, we want to move target loading directly inside the registry. To this end, we need to move the target loading feature outside of engine to avoid creating import loops, which Go does not support and which would therefore prevent the code from compiling.

We name the package targetloading rather than inputloading since richer input is all about targets, where a target is defined by the (input, options) tuple. We also try to consistently rename types to mention targets.

We keep an integration test inside the engine package because that test appears to check both engine and the Loader together. We may further refactor this test in the future.

Part of #1612

---------

Co-authored-by: DecFox <33030671+DecFox@users.noreply.github.com>

refactor(experiment): make report goroutine safe

83a2f4a

bassosimone requested review from hellais and DecFox as code owners June 5, 2024 13:01

x

a5076b5

bassosimone mentioned this pull request Jun 5, 2024

cleanup: remove asynchronous experiments #1613

Merged

DecFox approved these changes Jun 5, 2024

bassosimone merged commit cf2d2cc into master Jun 5, 2024
17 of 19 checks passed

bassosimone deleted the issue/2607 branch June 5, 2024 17:04

bassosimone mentioned this pull request Jun 5, 2024

feat: introduce richer input #1615

Merged

bassosimone mentioned this pull request Jun 6, 2024

refactor: move engine.InputLoader to targetloading #1616

Merged

bassosimone added the 2024-06-richer-input Tracking 2024-06 richer input work label Jul 2, 2024

BrewTestBot mentioned this pull request Aug 8, 2024

ooniprobe 3.23.0 Homebrew/homebrew-core#180481

Merged
