Merged PR 6969: Elaborate HTML documentation
This PR introduces more elaborate, HTML-based documentation. It is built with `sphinx` and the `read-the-docs` theme and is created automatically when calling `convert_code_to_documentation`.

The following changes are made by this PR:
- Add an explicit `CONTRIBUTORS.md` file.
- Change links in `README.md` so that they work in the HTML version.
- Add a `concepts` folder and stub files for introducing general concepts of BayBE.
- Adjust the docstrings of attributes: These now need to be in the line below the attribute, NOT in the class docstring!
- Add debug functionality for code conversion via the `--debug` flag.
- Simplify the conversion script.
- Fix inheritance, thus implementing #18775.
- Add functionality to include examples when creating the documentation.
- Make the script fail when errors are encountered while building the documentation.
- Fix broken links, typos, and similar issues.
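
The attribute docstring convention from the list above can be illustrated with a minimal sketch. The class `Example` and the helper `attribute_docstrings` are hypothetical and not part of BayBE. The helper only mimics, using the standard `ast` module, how a documentation tool could pick up a string literal placed directly below an attribute; it makes no claim about how `sphinx` implements this internally.

```python
import ast
import textwrap

# Hypothetical source code following the "docstring below the attribute" style.
SOURCE = textwrap.dedent(
    '''
    class Example:
        """A hypothetical class; attributes are NOT described here."""

        threshold: float = 0.5
        """The cut-off value applied to incoming measurements."""

        name: str = "demo"
        """A human-readable label."""
    '''
)


def attribute_docstrings(source: str, class_name: str) -> dict[str, str]:
    """Map attribute names to the string literal directly below them."""
    tree = ast.parse(source)
    cls = next(
        node
        for node in ast.walk(tree)
        if isinstance(node, ast.ClassDef) and node.name == class_name
    )
    docs = {}
    # Walk over consecutive statement pairs in the class body.
    for stmt, following in zip(cls.body, cls.body[1:]):
        is_attribute = isinstance(stmt, ast.AnnAssign) and isinstance(
            stmt.target, ast.Name
        )
        is_string = (
            isinstance(following, ast.Expr)
            and isinstance(following.value, ast.Constant)
            and isinstance(following.value.value, str)
        )
        if is_attribute and is_string:
            docs[stmt.target.id] = following.value.value
    return docs


docs = attribute_docstrings(SOURCE, "Example")
print(docs)
```

Note that the class docstring itself is skipped because it is not preceded by an annotated assignment.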

The following aspects still need to be done and will be part of an upcoming PR:
- Fill concept pages with actual content
- Adjust the `CONTRIBUTING.md` and `CONTRIBUTORS.md` files

Note that there is an open issue that we should still discuss:
- What do we do with the defaults? This is currently bugged (sphinx-toolbox/sphinx-toolbox#146) and might require a workaround.

Related work items: #18775, #19312
AVHopp committed Nov 21, 2023
2 parents 576c269 + 95cb600 commit 488645e
Showing 92 changed files with 1,072 additions and 1,082 deletions.
5 changes: 5 additions & 0 deletions .gitignore
Expand Up @@ -28,3 +28,8 @@ htmlcov

# Pictures created by backtesting
*.png

# Folders that are temporarily created when building the documentation
docs/_autosummary
docs/examples
docs/sdk
4 changes: 3 additions & 1 deletion CHANGELOG.md
Expand Up @@ -13,6 +13,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- `SequentialStrategy` and `StreamingSequentialStrategy` classes
- Telemetry env variable `BAYBE_TELEMETRY_VPN_CHECK` turning the initial connectivity check on/off
- Telemetry env variable `BAYBE_TELEMETRY_VPN_CHECK_TIMEOUT` for setting the connectivity check timeout
- Script for building HTML documentation and corresponding `tox` environment

### Changed
- Reorganized modules into subpackages
Expand All @@ -34,6 +35,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- `baybe.surrogate` replaced with `baybe.surrogates`
- `baybe.targets.Objective` replaced with `baybe.objective.Objective`
- `baybe.strategies.Strategy` replaced with `baybe.strategies.TwoPhaseStrategy`
- Markdown based documentation replaced by HTML based documentation

## [0.5.1] - 2023-10-19
### Added
Expand All @@ -54,7 +56,7 @@ or continuous parameters
- Random recommendation failing for small discrete (sub-)spaces
- Deserialization issue with `TaskParameter`

# [0.5.0] - 2023-09-15
## [0.5.0] - 2023-09-15
### Added
- `TaskParameter` for multitask modelling
- Basic transfer learning capability using multitask kernels
Expand Down
8 changes: 4 additions & 4 deletions CONTRIBUTING.md
Expand Up @@ -21,8 +21,8 @@ The infrastructure used to host the current documentation as well as the design
- Function signatures need to have type hints for both inputs and the return type.
- Type hints should not be added to the docstrings.
- When referencing another class, function, or similar, use the syntax ``:func:`path.to.function` `` where `func` should be replaced by the respective keyword.
- When parts of the comment should appear as `code` in the docstring, use triple backticks ```.
- Since we use [attrs](https://www.attrs.org/en/stable/) for writing classes, the documentation of initialization functions needs to be done in the class docstring. In particular, instance attributes need to be documented there.
- When parts of the comment should appear as `code` in the docstring, use double backticks ``.
- Since we use [attrs](https://www.attrs.org/en/stable/) for writing classes, initialization functions are not documented. Instance attributes thus need to be documented using a docstring in the line below their declaration.
- Class variables are documented by adding a docstring in the line below their declaration.
- When an inherited class sets one of the instance attributes, this attribute needs to be documented in the docstring of the inherited class.
- Magic functions do not require a docstring.
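
As a rough illustration of the docstring conventions listed above, the following sketch shows a function with type hints in the signature only, double backticks for inline code, and a cross-reference using the ``:func:`` role. The function `clip_value` is purely hypothetical and not part of BayBE.

```python
def clip_value(value: float, lower: float = 0.0, upper: float = 1.0) -> float:
    """Clip a value to the closed interval ``[lower, upper]``.

    Behaves like :func:`min` and :func:`max` applied one after the other.

    Args:
        value: The number to be clipped.
        lower: The smallest value that can be returned.
        upper: The largest value that can be returned.

    Returns:
        The clipped value.

    Raises:
        ValueError: If ``lower`` is larger than ``upper``.
    """
    if lower > upper:
        raise ValueError(f"Expected lower <= upper but got {lower} > {upper}.")
    # Types live in the signature only; the docstring describes meaning.
    return max(lower, min(value, upper))


print(clip_value(1.5))
```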
Expand All @@ -39,5 +39,5 @@ For most parts, BayBE's code is organized into different subpackages. When
extending its functionality (for instance, by adding new component subclasses), make
sure that the newly written code is well integrated into the existing package and
module hierarchy. In particular, public functionality should be imported into the
appropriate high-level namespaces for easy user import. For an example, see the
[parameter namespace](./baybe/parameters/__init__.py).
appropriate high-level namespaces for easy user import. For an example, see the
[parameter namespace](baybe.parameters).
11 changes: 11 additions & 0 deletions CONTRIBUTORS.md
@@ -0,0 +1,11 @@
# Contributors

## Authors
- Martin Fitzner (Merck KGaA, Darmstadt, Germany), [Contact](mailto:martin.fitzner@merckgroup.com), [Github](https://github.com/Scienfitz)
- Adrian Šošić (Merck Life Science KGaA, Darmstadt, Germany), [Contact](mailto:adrian.sosic@merckgroup.com), [Github](https://github.com/AdrianSosic)
- Alexander Hopp (Merck KGaA, Darmstadt, Germany) [Contact](mailto:alexander.hopp@merckgroup.com), [Github](https://github.com/AVHopp)
- Alex Lee (EMD Electronics, Tempe, Arizona, USA) [Contact](mailto:alex.lee@emdgroup.com), [Github](https://github.com/galaxee87)

## Contributors
- Emeline Sola (during an internship at Merck KGaA, Darmstadt, Germany):
Auto-documentation of the examples
27 changes: 13 additions & 14 deletions README.md
Expand Up @@ -12,15 +12,14 @@ It provides the necessary functionality to:
- handle measurements data and feed it back into the experimental design.
- compare different DOE strategies through backtesting with synthetic and real data.

## :exclamation::construction: NOTE :construction::exclamation:
## ❗🚧 This repository is under **heavy active development**. 🚧❗

This repository is under **heavy active development**.
Please note that the provided functionality and user interfaces are not stable and may
change in newer releases.
Therefore, if you would like to use the code in its current early stage, we recommend
pinning the version during installation to prevent possible changes in the backend.
In case of questions or comments, feel free to reach out to the **BayBE Dev Team** (see
[pyproject.toml](./pyproject.toml) for contact details).
[the contributors page](docs/misc/contributors_link) for contact details).

## Installation

Expand Down Expand Up @@ -86,12 +85,12 @@ pip install baybe[chem,simulation]
```

The available groups are:
- `chem`: Cheminformatics utilities (e.g. for the `SubstanceParameter`).
- `chem`: Cheminformatics utilities (e.g. for the [`SubstanceParameter`](baybe.parameters.substance.SubstanceParameter)).
- `docs`: Required for creating the documentation.
- `examples`: Required for running the examples/streamlit.
- `lint`: Required for linting and formatting.
- `onnx`: Required for using custom surrogate models in ONNX format.
- `simulation`: Enabling the `simulation` module.
- `simulation`: Enabling the [`simulation`](baybe.simulation) module.
- `test`: Required for running the tests.
- `dev`: All of the above plus `tox` and `pip-audit`.

Expand Down Expand Up @@ -128,7 +127,7 @@ In that sense, the former carry information that **must be** provided by the use
whereas the latter are **optional** settings that can also be set automatically
by BayBE.

A key element in the design of BayBE is the `Campaign` object.
A key element in the design of BayBE is the [`Campaign`](baybe.campaign.Campaign) object.
It acts as a central container for all the necessary information and objects
associated with an experimentation process, ensuring that all independent model
components (e.g. the objective function, the search space, etc.) are properly combined.
Expand All @@ -145,7 +144,7 @@ simultaneously, as an introductory example, we consider a simple scenario where
goal is to **maximize** a single numerical target that represents the yield of a
chemical reaction.

In BayBE's language, the reaction yield can be represented as a `NumericalTarget`
In BayBE's language, the reaction yield can be represented as a [`NumericalTarget`](baybe.targets.numerical)
object:

```python
Expand All @@ -157,7 +156,7 @@ target = NumericalTarget(
)
```

We wrap the target object in an optimization `Objective`, to inform BayBE
We wrap the target object in an optimization [`Objective`](baybe.objective.Objective), to inform BayBE
that this is the only target we would like to consider:

```python
Expand All @@ -167,8 +166,8 @@ objective = Objective(mode="SINGLE", targets=[target])
```

In cases where we need to consider multiple (potentially competing) targets, the
role of the `Objective` is to define how these targets should be balanced.
For more details, see [baybe/targets.py](./baybe/targets.py).
role of the [`Objective`](baybe.objective.Objective) is to define how these targets should be balanced.
For more details, see [the targets section of the user guide](docs/userguide/targets.md).

### Defining the Search Space

Expand Down Expand Up @@ -211,11 +210,11 @@ type-specific settings. In particular case above, for instance:
the substance parameter "Solvent".

For more parameter types and their details, see
[baybe/parameters.py](./baybe/parameters.py).
[the parameters section of the user guide](docs/userguide/parameters).

Additionally, we can define a set of constraints to further specify allowed ranges and
relationships between our parameters.
Details can be found in [baybe/constraints.py](./baybe/constraints.py).
Details can be found in [the constraints section of the user guide](docs/userguide/constraints).
In this example, we assume no further constraints and explicitly indicate this with an
empty variable, for the sake of demonstration:

Expand All @@ -224,7 +223,7 @@ constraints = None
```

With the parameter and constraint definitions at hand, we can now create our
`SearchSpace`:
[`SearchSpace`](baybe.searchspace):

```python
from baybe.searchspace import SearchSpace
Expand All @@ -247,7 +246,7 @@ For our chemistry example, we combine two selection strategies:

For more details on the different strategies, their underlying algorithmic
details, and their configuration settings, see
[baybe/strategies](./baybe/strategies).
[the strategies section of the user guide](docs/userguide/strategy).

```python
from baybe.strategies import TwoPhaseStrategy
Expand Down
36 changes: 17 additions & 19 deletions baybe/acquisition.py
Expand Up @@ -4,6 +4,7 @@
from typing import Any, Callable, List, Optional, Type

import gpytorch.distributions
from attr import define
from botorch.acquisition import AcquisitionFunction
from botorch.models.gpytorch import Model
from botorch.posteriors import Posterior
Expand All @@ -18,7 +19,7 @@ def debotorchize(acqf_cls: Type[AcquisitionFunction]):
This wrapped function becomes generally usable in combination with other non-BoTorch
surrogate models. This is required since BoTorch's acquisition functions expect a
```botorch.model.Model``` to work with, hindering their general use with arbitrary
``botorch.model.Model`` to work with, hindering their general use with arbitrary
probabilistic models. The wrapper class returned by this function resolves this
issue by operating as an adapter that internally creates a helper BoTorch model,
which serves as a translation layer and is passed to the selected BoTorch
Expand Down Expand Up @@ -94,31 +95,28 @@ def posterior( # noqa: D102
return GPyTorchPosterior(mvn)


@define
class PartialAcquisitionFunction:
"""Acquisition function for evaluating points in a hybrid search space.
It can either pin the discrete or the continuous part. The pinned part is assumed
to be a tensor of dimension ```d x 1``` where d is the computational dimension of
to be a tensor of dimension ``d x 1`` where d is the computational dimension of
the search space that is to be pinned. The acquisition function is assumed to be
defined for the full hybrid space.
Args:
acqf: The acquisition function for the hybrid space.
pinned_part: The values that will be attached whenever evaluating the
acquisition function.
pin_discrete: A flag for denoting whether the pinned_part corresponds to the
discrete subspace
"""

def __init__(
self, acqf: AcquisitionFunction, pinned_part: Tensor, pin_discrete: bool
):
self.acqf = acqf
self.pinned_part = pinned_part
self.pin_discrete = pin_discrete
acqf: AcquisitionFunction
"""The acquisition function for the hybrid space."""

pinned_part: Tensor
"""The values that will be attached whenever evaluating the acquisition function."""

pin_discrete: bool
"""A flag for denoting whether ``pinned_part`` corresponds to the discrete
subspace."""

def _lift_partial_part(self, partial_part: Tensor) -> Tensor:
"""Lift ```partial_part``` to the original hybrid space.
"""Lift ``partial_part`` to the original hybrid space.
Depending on whether the discrete or the variable part of the search space is
pinned, this function identifies whether the partial_part is the continuous
Expand Down Expand Up @@ -167,12 +165,12 @@ def __getattr__(self, item):
def set_X_pending(self, X_pending: Optional[Tensor]): # pylint: disable=C0103
"""Inform the acquisition function about pending design points.
Enhances the original ```set_X_pending``` function from the full acquisition
Enhances the original ``set_X_pending`` function from the full acquisition
function as we need to store the full point, i.e., the point in the hybrid space
for the ```PartialAcquisitionFunction``` to work properly.
for the ``PartialAcquisitionFunction`` to work properly.
Args:
X_pending: ```n x d``` Tensor with n d-dim design points that have been
X_pending: ``n x d`` Tensor with n d-dim design points that have been
submitted for evaluation but have not yet been evaluated.
"""
if X_pending is not None: # Lift point to hybrid space and add additional dim
Expand Down
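
The idea of lifting a partial point to the hybrid space, as described in the `PartialAcquisitionFunction` docstrings above, can be sketched with plain Python lists instead of tensors. The function `lift_partial_point` is hypothetical, and the ordering of the coordinates (discrete first, continuous second) is an assumption made for this illustration, not something confirmed by the diff.

```python
from typing import List


def lift_partial_point(
    partial_part: List[float], pinned_part: List[float], pin_discrete: bool
) -> List[float]:
    """Combine a free partial point with the pinned values into one hybrid point.

    Assumption: the hybrid point lists the discrete coordinates first, followed
    by the continuous ones.
    """
    if pin_discrete:
        # The discrete part is fixed, so the free part is the continuous one.
        return pinned_part + partial_part
    # The continuous part is fixed, so the free part is the discrete one.
    return partial_part + pinned_part


pinned_discrete = [1.0, 0.0]  # e.g. a fixed one-hot encoded category
hybrid_point = lift_partial_point([0.3, 0.7, 0.1], pinned_discrete, pin_discrete=True)
print(hybrid_point)
```

The full acquisition function would then be evaluated on the lifted point, which is why pending points also need to be stored in the hybrid space.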
34 changes: 18 additions & 16 deletions baybe/campaign.py
Expand Up @@ -48,38 +48,40 @@ class Campaign(SerialMixin):
series of measurements and the iterative sequence of events involved.
In particular, a campaign:
* defines the objective of an experimentation process.
* defines the search space over which the experimental parameter may vary.
* defines a strategy for traversing the search space.
* Records the measurement data collected during the process.
* Records metadata about the progress of the experimentation process.
Args:
searchspace: The search space in which the experiments are conducted.
objective: The optimization objective.
strategy: The employed strategy.
measurements_exp: The experimental representation of the conducted experiments.
numerical_measurements_must_be_within_tolerance: Flag for forcing numerical
measurements to be within tolerance.
n_batches_done: The number of already processed batches.
n_fits_done: The number of fits already done.
* Defines the objective of an experimentation process.
* Defines the search space over which the experimental parameter may vary.
* Defines a strategy for traversing the search space.
* Records the measurement data collected during the process.
* Records metadata about the progress of the experimentation process.
"""

# DOE specifications
searchspace: SearchSpace = field()
"""The search space in which the experiments are conducted."""

objective: Objective = field()
"""The optimization objective."""

strategy: Strategy = field(factory=TwoPhaseStrategy)
"""The employed strategy."""

# Data
measurements_exp: pd.DataFrame = field(factory=pd.DataFrame, eq=eq_dataframe)
"""The experimental representation of the conducted experiments."""

numerical_measurements_must_be_within_tolerance: bool = field(default=True)
"""Flag for forcing numerical measurements to be within tolerance."""

# Metadata
n_batches_done: int = field(default=0)
"""The number of already processed batches."""

n_fits_done: int = field(default=0)
"""The number of fits already done."""

# Private
_cached_recommendation: pd.DataFrame = field(factory=pd.DataFrame, eq=eq_dataframe)
"""The cached recommendations."""

@property
def parameters(self) -> List[Parameter]:
Expand Down Expand Up @@ -239,7 +241,7 @@ def recommend(self, batch_quantity: int = 5) -> pd.DataFrame:
Dataframe containing the recommendations in experimental representation.
Raises:
ValueError: If ```batch_quantity``` is smaller than 1.
ValueError: If ``batch_quantity`` is smaller than 1.
"""
if batch_quantity < 1:
raise ValueError(
Expand Down
23 changes: 13 additions & 10 deletions baybe/constraints/base.py
Expand Up @@ -26,20 +26,19 @@ class Constraint(ABC, SerialMixin):
Constraints use conditions and chain them together to filter unwanted entries from
the search space.
Args:
parameters: The list of parameters used for the constraint.
"""

# class variables
# TODO: it might turn out these are not needed at a later development stage
eval_during_creation: ClassVar[bool]
"""Class variable encoding whether the condition is evaluated during creation."""

eval_during_modeling: ClassVar[bool]
"""Class variable encoding whether the condition is evaluated during modeling."""

# Object variables
parameters: List[str] = field(validator=min_len(1))
"""The list of parameters used for the constraint."""

@parameters.validator
def _validate_params( # noqa: DOC101, DOC103
Expand All @@ -48,7 +47,7 @@ def _validate_params( # noqa: DOC101, DOC103
"""Validate the parameter list.
Raises:
ValueError: If ```params``` contains duplicate values.
ValueError: If ``params`` contains duplicate values.
"""
if len(params) != len(set(params)):
raise ValueError(
Expand Down Expand Up @@ -77,7 +76,10 @@ class DiscreteConstraint(Constraint, ABC):

# class variables
eval_during_creation: ClassVar[bool] = True
# See base class.

eval_during_modeling: ClassVar[bool] = False
# See base class.

@abstractmethod
def get_invalid(self, data: pd.DataFrame) -> pd.Index:
Expand All @@ -98,20 +100,21 @@ class ContinuousConstraint(Constraint, ABC):
Continuous constraints use parameter lists and coefficients to define in-/equality
constraints over a continuous parameter space.
Args:
parameters: See base class.
coefficients: In-/equality coefficient for each entry in ```parameters```.
rhs: Right-hand side value of the in-/equality.
"""

# class variables
eval_during_creation: ClassVar[bool] = False
# See base class.

eval_during_modeling: ClassVar[bool] = True
# See base class.

# object variables
coefficients: List[float] = field()
"""In-/equality coefficient for each entry in ``parameters``."""

rhs: float = field(default=0.0)
"""Right-hand side value of the in-/equality."""

@coefficients.validator
def _validate_coefficients( # noqa: DOC101, DOC103
Expand Down Expand Up @@ -139,7 +142,7 @@ def to_botorch(
) -> Tuple[Tensor, Tensor, float]:
"""Cast the constraint in a format required by botorch.
Used in calling ```optimize_acqf_*``` functions, for details see
Used in calling ``optimize_acqf_*`` functions, for details see
https://botorch.org/api/optim.html#botorch.optim.optimize.optimize_acqf
Args:
Expand Down
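
To make the role of `parameters`, `coefficients`, and `rhs` more concrete, here is a minimal sketch using plain Python lists rather than tensors. The helpers `to_indexed_form` and `is_satisfied` are hypothetical and do not reproduce the actual `to_botorch` implementation. The sketch assumes the convention documented for BoTorch's `optimize_acqf`, where an inequality constraint is read as `sum_i(coefficients[i] * x[indices[i]]) >= rhs`.

```python
from typing import List, Sequence, Tuple

IndexedConstraint = Tuple[List[int], List[float], float]


def to_indexed_form(
    all_parameters: Sequence[str],
    parameters: List[str],
    coefficients: List[float],
    rhs: float = 0.0,
) -> IndexedConstraint:
    """Translate parameter names into positions within the full parameter list."""
    indices = [all_parameters.index(name) for name in parameters]
    return indices, list(coefficients), rhs


def is_satisfied(
    point: Sequence[float],
    constraint: IndexedConstraint,
    equality: bool = False,
    tol: float = 1e-9,
) -> bool:
    """Check a linear in-/equality constraint for a single point."""
    indices, coefficients, rhs = constraint
    lhs = sum(c * point[i] for i, c in zip(indices, coefficients))
    if equality:
        return abs(lhs - rhs) <= tol
    # Assumed inequality direction: left-hand side must be at least rhs.
    return lhs >= rhs - tol


# Hypothetical constraint: 1.0 * x1 + 2.0 * x3 >= 1.0
constraint = to_indexed_form(["x1", "x2", "x3"], ["x1", "x3"], [1.0, 2.0], rhs=1.0)
print(constraint)
print(is_satisfied([0.2, 9.0, 0.5], constraint))
```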
