Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(experiment): make report goroutine safe #1612

Merged
merged 2 commits into from
Jun 5, 2024
Merged

Conversation

bassosimone
Copy link
Contributor

This diff refactors *engine.experiment to make the report field goroutine safe. It also moves at the bottom of experiment.go code that was intermixed with *engine.experiment methods.

Part of ooni/probe#2607

This diff refactors *engine.experiment to make the report field
goroutine safe. It also moves at the bottom of experiment.go code
that was intermixed with *engine.experiment methods.

Part of ooni/probe#2607
@DecFox
Copy link
Contributor

DecFox commented Jun 5, 2024

LGTM!

@bassosimone bassosimone merged commit cf2d2cc into master Jun 5, 2024
17 of 19 checks passed
@bassosimone bassosimone deleted the issue/2607 branch June 5, 2024 17:04
bassosimone added a commit that referenced this pull request Jun 5, 2024
We originally developed asynchronous experiments as a mean to support
websteps, a nettest that returned more than one measurement per
invocation of its Measure method.

Since then, we removed websteps. Therefore, this code is currently
technically unused. Additionally, this code further complicates
implementing richer input, because it is another way of performing
measurements.

As such, in the interest of switfly moving forward with richer input
_and_ of simplifying the engine, we are now removing this unused
functionality from the tree.

While there, better the documentation of OnProgress and undeprecate
functions to open/close reports since it's clear that, for now, using a
submitter in cmd/ooniprobe is a bit of a stretch given our current
goals. So, it feels best to avoid deprecating until we have nice plan
for replacement.

To have more confidence that the job we did is ~correct, we use the
following table to cross compare the operations that we previously
performed for running sync experiments (i.e., all experiments) in an
async way to the code we're using after this diff to run experiments.
(Note that the diff itself is such that you can easily see the deleted
and the added code inside of the `experiment.go` file.) In both cases,
we're looking at the operations performed starting from
`MeasureWithContext`.

| Operation                   | Before | After |          
| --------------------------- | ------ | ----- |
| MaybeLookupLocationContext  | yes    | yes   |
| use e.session.byteCounter   | yes    | yes   |
| use e.byteCounter           | yes    | yes   |
| newMeasurement              | yes    | yes   |
| save start time             | yes    | yes   |
| initialize experiment args  | yes    | yes   |
| call measurer.Run           | yes    | yes   |
| save end time               | yes    | yes   |
| compute measurement runtime | yes    | yes   |
| scrub measurement           | yes    | yes   | 
| return measurement          | yes    | yes   |

Part of ooni/probe#2607

Depends on #1612
bassosimone added a commit that referenced this pull request Jun 5, 2024
Most of the existing code is designed to move around lists of
`model.OOAPIURLInfo` and measuring such URLs.

This design originally suited Web Connectivity but it's not good
enough for richer input because it does not contain options.

With this diff, we move into the direction of richer input by
replacing `model.OOAPIURLInfo` lists with lists of:

```Go
// internal/model/experiment.go

type ExperimentTarget struct {
	Category() string
	Country() string
	Input() string
}
```

where `*model.OOAPIURLInfo` implements `model.ExperimentTarget`
in a trivial way and where, additionally:

1. the `InputLoader` is modified to load `ExperimentTarget`;

2. the `Experiment` is modify to measure an `ExperimentTarget`.

In addition to applying these changes, this diff also adapts the
whole tree to use `ExperimentTarget` in all places and adds a
trivial constructor to obtain `OOAPIURLInfo` when the category
code and the country code are unknown.

With this diff merged, implementing richer input for real is a
matter of implementing the following changes:

1. the `*registry.Factory` has a new func field, defined by
each experiment, that loads a list of `ExperimentTarget`;

2. we have a library for input loading containing the same code
that we currently use for the input loader;

3. the `InputLoader` is gone and instead we use the factory (or its
`*engine.experimentBuilder` wrapper for input loading;

4. we modify the `ExperimentArgs` passed to the `ExperimentMeasurer`
to contain an additional field that is the `ExperimentTarget` we
want to measure;

5. each experiment that needs richer input type-casts from the
`ExperimentTarget` interface to the concrete type that the experiment
richer input should have and accesses any option.

Part of #1612.
bassosimone added a commit that referenced this pull request Jun 5, 2024
Most of the existing code is designed to move around lists of
`model.OOAPIURLInfo` and measuring such URLs.

This design originally suited Web Connectivity but it's not good
enough for richer input because it does not contain options.

With this diff, we move into the direction of richer input by
replacing `model.OOAPIURLInfo` lists with lists of:

```Go
// internal/model/experiment.go

type ExperimentTarget struct {
	Category() string
	Country() string
	Input() string
}
```

where `*model.OOAPIURLInfo` implements `model.ExperimentTarget`
in a trivial way and where, additionally:

1. the `InputLoader` is modified to load `ExperimentTarget`;

2. the `Experiment` is modify to measure an `ExperimentTarget`.

In addition to applying these changes, this diff also adapts the
whole tree to use `ExperimentTarget` in all places and adds a
trivial constructor to obtain `OOAPIURLInfo` when the category
code and the country code are unknown.

With this diff merged, implementing richer input for real is a
matter of implementing the following changes:

1. the `*registry.Factory` has a new func field, defined by
each experiment, that loads a list of `ExperimentTarget`;

2. we have a library for input loading containing the same code
that we currently use for the input loader;

3. the `InputLoader` is gone and instead we use the factory (or its
`*engine.experimentBuilder` wrapper for input loading;

4. we modify the `ExperimentArgs` passed to the `ExperimentMeasurer`
to contain an additional field that is the `ExperimentTarget` we
want to measure;

5. each experiment that needs richer input type-casts from the
`ExperimentTarget` interface to the concrete type that the experiment
richer input should have and accesses any option.

Part of #1612.

This implementation strategy emerged while discussing this matter
with @ainghazal, thank you so much for that!
bassosimone added a commit that referenced this pull request Jun 5, 2024
Most of the existing code is designed to move around lists of
`model.OOAPIURLInfo` and measuring such URLs.

This design originally suited Web Connectivity but it's not good
enough for richer input because it does not contain options.

With this diff, we move into the direction of richer input by
replacing `model.OOAPIURLInfo` lists with lists of:

```Go
// internal/model/experiment.go

type ExperimentTarget struct {
	Category() string
	Country() string
	Input() string
}
```

where `*model.OOAPIURLInfo` implements `model.ExperimentTarget`
in a trivial way and where, additionally:

1. the `InputLoader` is modified to load `ExperimentTarget`;

2. the `Experiment` is modify to measure an `ExperimentTarget`.

In addition to applying these changes, this diff also adapts the
whole tree to use `ExperimentTarget` in all places and adds a
trivial constructor to obtain `OOAPIURLInfo` when the category
code and the country code are unknown.

With this diff merged, implementing richer input for real is a
matter of implementing the following changes:

1. the `*registry.Factory` has a new func field, defined by
each experiment, that loads a list of `ExperimentTarget`;

2. we have a library for input loading containing the same code
that we currently use for the input loader;

3. the `InputLoader` is gone and instead we use the factory (or its
`*engine.experimentBuilder` wrapper for input loading;

4. we modify the `ExperimentArgs` passed to the `ExperimentMeasurer`
to contain an additional field that is the `ExperimentTarget` we
want to measure;

5. each experiment that needs richer input type-casts from the
`ExperimentTarget` interface to the concrete type that the experiment
richer input should have and accesses any option.

Part of #1612.

This implementation strategy emerged while discussing this matter
with @ainghazal, thank you so much for that!

Co-Authored-by: <99027643+ainghazal@users.noreply.github.com>
bassosimone added a commit that referenced this pull request Jun 5, 2024
Most of the existing code is designed to move around lists of
`model.OOAPIURLInfo` and measuring such URLs.

This design originally suited Web Connectivity but it's not good
enough for richer input because it does not contain options.

With this diff, we move into the direction of richer input by
replacing `model.OOAPIURLInfo` lists with lists of:

```Go
// internal/model/experiment.go

type ExperimentTarget struct {
	Category() string
	Country() string
	Input() string
}
```

where `*model.OOAPIURLInfo` implements `model.ExperimentTarget`
in a trivial way and where, additionally:

1. the `InputLoader` is modified to load `ExperimentTarget`;

2. the `Experiment` is modify to measure an `ExperimentTarget`.

In addition to applying these changes, this diff also adapts the
whole tree to use `ExperimentTarget` in all places and adds a
trivial constructor to obtain `OOAPIURLInfo` when the category
code and the country code are unknown.

With this diff merged, implementing richer input for real is a
matter of implementing the following changes:

1. the `*registry.Factory` has a new func field, defined by
each experiment, that loads a list of `ExperimentTarget`;

2. we have a library for input loading containing the same code
that we currently use for the input loader;

3. the `InputLoader` is gone and instead we use the factory (or its
`*engine.experimentBuilder` wrapper for input loading;

4. we modify the `ExperimentArgs` passed to the `ExperimentMeasurer`
to contain an additional field that is the `ExperimentTarget` we
want to measure;

5. each experiment that needs richer input type-casts from the
`ExperimentTarget` interface to the concrete type that the experiment
richer input should have and accesses any option.

Part of #1612.

This implementation strategy emerged while discussing this matter
with @ainghazal, thank you so much for that!

Co-authored-by: <ainghazal42@gmail.com>
bassosimone added a commit that referenced this pull request Jun 5, 2024
Most of the existing code is designed to move around lists of
`model.OOAPIURLInfo` and measuring such URLs.

This design originally suited Web Connectivity but it's not good
enough for richer input because it does not contain options.

With this diff, we move into the direction of richer input by
replacing `model.OOAPIURLInfo` lists with lists of:

```Go
// internal/model/experiment.go

type ExperimentTarget struct {
	Category() string
	Country() string
	Input() string
}
```

where `*model.OOAPIURLInfo` implements `model.ExperimentTarget`
in a trivial way and where, additionally:

1. the `InputLoader` is modified to load `ExperimentTarget`;

2. the `Experiment` is modify to measure an `ExperimentTarget`.

In addition to applying these changes, this diff also adapts the
whole tree to use `ExperimentTarget` in all places and adds a
trivial constructor to obtain `OOAPIURLInfo` when the category
code and the country code are unknown.

With this diff merged, implementing richer input for real is a
matter of implementing the following changes:

1. the `*registry.Factory` has a new func field, defined by
each experiment, that loads a list of `ExperimentTarget`;

2. we have a library for input loading containing the same code
that we currently use for the input loader;

3. the `InputLoader` is gone and instead we use the factory (or its
`*engine.experimentBuilder` wrapper for input loading;

4. we modify the `ExperimentArgs` passed to the `ExperimentMeasurer`
to contain an additional field that is the `ExperimentTarget` we
want to measure;

5. each experiment that needs richer input type-casts from the
`ExperimentTarget` interface to the concrete type that the experiment
richer input should have and accesses any option.

Part of #1612.

This implementation strategy emerged while discussing this matter
with @ainghazal, thank you so much for that!

---------

Co-authored-by: <ainghazal42@gmail.com>
bassosimone added a commit that referenced this pull request Jun 5, 2024
Most of the existing code is designed to move around lists of
`model.OOAPIURLInfo` and measuring such URLs.

This design originally suited Web Connectivity but it's not good
enough for richer input because it does not contain options.

With this diff, we move into the direction of richer input by
replacing `model.OOAPIURLInfo` lists with lists of:

```Go
// internal/model/experiment.go

type ExperimentTarget struct {
	Category() string
	Country() string
	Input() string
}
```

where `*model.OOAPIURLInfo` implements `model.ExperimentTarget`
in a trivial way and where, additionally:

1. the `InputLoader` is modified to load `ExperimentTarget`;

2. the `Experiment` is modify to measure an `ExperimentTarget`.

In addition to applying these changes, this diff also adapts the
whole tree to use `ExperimentTarget` in all places and adds a
trivial constructor to obtain `OOAPIURLInfo` when the category
code and the country code are unknown.

With this diff merged, implementing richer input for real is a
matter of implementing the following changes:

1. the `*registry.Factory` has a new func field, defined by
each experiment, that loads a list of `ExperimentTarget`;

2. we have a library for input loading containing the same code
that we currently use for the input loader;

3. the `InputLoader` is gone and instead we use the factory (or its
`*engine.experimentBuilder` wrapper for input loading;

4. we modify the `ExperimentArgs` passed to the `ExperimentMeasurer`
to contain an additional field that is the `ExperimentTarget` we
want to measure;

5. each experiment that needs richer input type-casts from the
`ExperimentTarget` interface to the concrete type that the experiment
richer input should have and accesses any option.

Part of #1612.

This implementation strategy emerged while discussing this matter
with @ainghazal, thank you so much for that!
bassosimone added a commit that referenced this pull request Jun 6, 2024
Most of the existing code is designed to move around lists of
`model.OOAPIURLInfo` and measuring such URLs.

The `model.OOAPIURLInfo` type is like:

```Go
// internal/model/ooapi.go

type OOAPIURLInfo struct {
	CategoryCode string
	CountryCode string
	URL string
}
```

This design originally suited Web Connectivity but it's not good enough
for richer input because it does not contain options.

With this diff, we move into the direction of richer input by replacing
`model.OOAPIURLInfo` lists with lists of:

```Go
// internal/model/experiment.go

type ExperimentTarget struct {
	Category() string
	Country() string
	Input() string
}
```

where `*model.OOAPIURLInfo` implements `model.ExperimentTarget` in a
trivial way and where, additionally:

1. the `InputLoader` is modified to load `ExperimentTarget`;

2. the `Experiment` is modify to measure an `ExperimentTarget`.

In addition to applying these changes, this diff also adapts the whole
tree to use `ExperimentTarget` in all places and adds a trivial
constructor to obtain `OOAPIURLInfo` when the category code and the
country code are unknown.

With this diff merged, implementing richer input for real is a matter of
implementing the following changes:

1. the `*registry.Factory` has a new func field, defined by each
experiment, that loads a list of `ExperimentTarget`;

2. we have a library for input loading containing the same code that we
currently use for the input loader;

3. the `InputLoader` is gone and instead we use the factory (or its
`*engine.experimentBuilder` wrapper) for input loading;

4. we modify the `ExperimentArgs` passed to the `ExperimentMeasurer` to
contain an additional field that is the `ExperimentTarget` we want to
measure;

5. each experiment that needs richer input type-casts from the
`ExperimentTarget` interface to the concrete type that the experiment
richer input should have and accesses any option.

Part of #1612.

This implementation strategy emerged while discussing this matter with
@ainghazal, thank you so much for that!

---------

Co-authored-by: DecFox <33030671+DecFox@users.noreply.github.com>
bassosimone added a commit that referenced this pull request Jun 6, 2024
This commit moves the engine.InputLoader type to a new package called
inputloading and adapts the naming to avoid stuttering.

The reason for moving InputLoader is that the engine package depends on
registry, and, per the plan described by the first richer input PR,
#1615, we want to move input loading
directly inside the registry. To this end, we need to move the input
loading feature outside of engine to avoid creating import loops.

We keep an integration test inside the engine package because it seems
such an integration test was checking both engine and the InputLoader
together. We may further refactor this test in the future.

Part of #1612
bassosimone added a commit that referenced this pull request Jun 6, 2024
This commit moves the engine.InputLoader type to a new package called
inputloading and adapts the naming to avoid stuttering. We therefore
have engine.InputLoaderSession => targetloading.Session and other
similar renames.

The reason for moving InputLoader is that the engine package depends on
registry, and, per the plan described by the first richer input PR,
#1615, we want to move target
loading directly inside the registry. To this end, we need to move the
target loading feature outside of engine to avoid creating import loops,
which prevent the code from compiling because Go does not support them.

While there, name the package targetloading rather than inputloading
since richer input is all about targets, where a target is defined by
the (input, options) tuple. Also, try to consistently rename types to
mention targets.

We keep an integration test inside the engine package because it seems
such an integration test was checking both engine and the Loader
together. We may further refactor this test in the future.

Part of #1612

---------

Co-authored-by: DecFox <33030671+DecFox@users.noreply.github.com>
@bassosimone bassosimone added the 2024-06-richer-input Tracking 2024-06 richer input work label Jul 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2024-06-richer-input Tracking 2024-06 richer input work
Projects
None yet
2 participants