diff --git a/website/docs/actuators/working-with-actuators.md b/website/docs/actuators/working-with-actuators.md index 96659e703..2cdf7f535 100644 --- a/website/docs/actuators/working-with-actuators.md +++ b/website/docs/actuators/working-with-actuators.md @@ -13,11 +13,11 @@ You can also add [your own custom experiments](creating-custom-experiments.md) using the special actuator [_custom_experiments_](creating-custom-experiments.md#using-your-custom-experiment). -!!! info end - - Most actuators are plugins: pieces of code that can be installed - independently from `ado` and that `ado` can dynamically discover. Custom - experiments are also plugins. +> [!NOTE] Actuators and Plugins +> +> Most actuators are plugins: pieces of code that can be installed +> independently from `ado` and that `ado` can dynamically discover. Custom +> experiments are also plugins. ## Listing available Actuators @@ -28,13 +28,83 @@ To see a list of available actuators execute ado get actuators ``` -to see the experiments each provides +You can also use `ado get actuators --details` which in addition +outputs the description of the actuators, the number of +experiments they provide and their version. 
Below is an example +of the output: + + + +```commandline +┌────────────────────┬─────────────┬─────────────────────────────────────────────────────┬───────────────────────────┐ +│ ACTUATOR ID │ EXPERIMENTS │ DESCRIPTION │ VERSION │ +├────────────────────┼─────────────┼─────────────────────────────────────────────────────┼───────────────────────────┤ +│ SFTTrainer │ 5 │ An actuator for benchmarking fine-tuning of │ 1.5.1.dev13+ga1833142b │ +│ │ │ foundation models │ │ +│ custom_experiments │ 6 │ Actuator for applying user supplied custom │ 1.5.1.dev8+531c6444.dirty │ +│ │ │ experiments │ │ +│ mock │ 2 │ A actuator class for testing │ 1.5.1.dev8+531c6444.dirty │ +│ replay │ 0 │ Special actuator for handling externally defined │ 1.5.1.dev8+531c6444.dirty │ +│ │ │ experiments (experiments we don't have code for) │ │ +│ robotic_lab │ 1 │ A template for creating an actuator │ 1.5.1.dev13+ga1833142b │ +└────────────────────┴─────────────┴─────────────────────────────────────────────────────┴───────────────────────────┘ +``` + + + +## Listing available Experiments + +To see the experiments each actuator provides ```commandline ado get experiments ``` +You can also see the description of each experiment (if provided) +with `ado get experiments --details`. +The output will be similar to: + + +```terminaloutput +┌────────────────────┬─────────────────────────────────────┬─────────────────────────────────────────────────────────┐ +│ ACTUATOR ID │ EXPERIMENT ID │ DESCRIPTION │ +├────────────────────┼─────────────────────────────────────┼─────────────────────────────────────────────────────────┤ +│ SFTTrainer │ finetune_full_benchmark-v1.0.0 │ Measures the performance of full-finetuning a model for │ +│ │ │ a given (GPU model, number GPUS, batch_size, │ +│ │ │ model_max_length, number nodes) combination. 
│ +│ SFTTrainer │ finetune_full_stability-v1.0.0 │ Performs 5 full finetune runs of 5 steps each on a │ +│ │ │ model and reports the fraction of those that resulted │ +│ │ │ in GPU OOM, Other error, or No Error for a given (GPU │ +│ │ │ model, number GPUS, batch_size, model_max_length) │ +│ │ │ combination. │ +│ SFTTrainer │ finetune_gptq-lora_benchmark-v1.0.0 │ Measures the performance of GPTQ-LORA tuning a model │ +│ │ │ for a given (GPU model, number GPUS, batch_size, │ +│ │ │ model_max_length, number nodes) combination. │ +│ SFTTrainer │ finetune_lora_benchmark-v1.0.0 │ Measures the performance of LORA tuning a model for a │ +│ │ │ given (GPU model, number GPUS, batch_size, │ +│ │ │ model_max_length, number nodes) combination. │ +│ SFTTrainer │ finetune_pt_benchmark-v1.0.0 │ Measures the performance of prompt-tuning a model for a │ +│ │ │ given (GPU model, number GPUS, batch_size, │ +│ │ │ model_max_length, number nodes) combination. │ +│ custom_experiments │ acid_test │ │ +│ custom_experiments │ avoid_oom_recommender │ An AutoConf recommender that suggests the minimum │ +│ │ │ number of gpus per worker and number of workers │ +│ │ │ necessary to execute a Tuning job whilekeeping the per │ +│ │ │ GPU batch size constant │ +│ custom_experiments │ calculate_density │ │ +│ custom_experiments │ min_gpu_recommender │ An AutoConf plugin that suggests the minimum number of │ +│ │ │ gpus per worker and number of workers necessary to │ +│ │ │ execute a Tuning job │ +│ custom_experiments │ ml-multicloud-cost-v1.0 │ │ +│ custom_experiments │ nevergrad_opt_3d_test_func │ │ +│ mock │ test-experiment │ │ +│ mock │ test-experiment-two │ │ +│ robotic_lab │ peptide_mineralization │ Measures adsorption of peptide lanthanide combinations │ +└────────────────────┴─────────────────────────────────────┴─────────────────────────────────────────────────────────┘ +``` + + ## Special actuators: replay and custom_experiments `ado` has two special builtin actuators: `custom_experiments` 
and `replay`. @@ -90,7 +160,7 @@ Some additional notes about this process when you are developing an actuator: ## What's next - +
@@ -111,4 +181,4 @@ Some additional notes about this process when you are developing an actuator: [Creating new Operators :octicons-arrow-right-24:](../operators/working-with-operators.md)
- \ No newline at end of file + \ No newline at end of file diff --git a/website/docs/core-concepts/actuators.md b/website/docs/core-concepts/actuators.md index c6c8debca..1328deb50 100644 --- a/website/docs/core-concepts/actuators.md +++ b/website/docs/core-concepts/actuators.md @@ -1,295 +1,30 @@ ## Experiments -To find the values of certain properties of Entities we need to perform -measurements on them. We use the term "experiment" to describe a particular type -of measurement. This is also referred to as an "experiment protocol". +An **Experiment** +measures the values of a set of output properties given a set of input +properties. Each time an Experiment is applied to an +[Entity](entity-spaces.md) it produces a measurement result. -An experiment will define its inputs - the set of constitutive and observed -properties it requires entities to have. It will also define the properties it -measures. +### Inputs and Outputs -You can list them with `ado get experiments --details`. The output will be -similar to: +Experiments define two things: - -```terminaloutput -┌────────────────────┬─────────────────────────────────────┬─────────────────────────────────────────────────────────┐ -│ ACTUATOR ID │ EXPERIMENT ID │ DESCRIPTION │ -├────────────────────┼─────────────────────────────────────┼─────────────────────────────────────────────────────────┤ -│ SFTTrainer │ finetune_full_benchmark-v1.0.0 │ Measures the performance of full-finetuning a model for │ -│ │ │ a given (GPU model, number GPUS, batch_size, │ -│ │ │ model_max_length, number nodes) combination. │ -│ SFTTrainer │ finetune_full_stability-v1.0.0 │ Performs 5 full finetune runs of 5 steps each on a │ -│ │ │ model and reports the fraction of those that resulted │ -│ │ │ in GPU OOM, Other error, or No Error for a given (GPU │ -│ │ │ model, number GPUS, batch_size, model_max_length) │ -│ │ │ combination. 
│ -│ SFTTrainer │ finetune_gptq-lora_benchmark-v1.0.0 │ Measures the performance of GPTQ-LORA tuning a model │ -│ │ │ for a given (GPU model, number GPUS, batch_size, │ -│ │ │ model_max_length, number nodes) combination. │ -│ SFTTrainer │ finetune_lora_benchmark-v1.0.0 │ Measures the performance of LORA tuning a model for a │ -│ │ │ given (GPU model, number GPUS, batch_size, │ -│ │ │ model_max_length, number nodes) combination. │ -│ SFTTrainer │ finetune_pt_benchmark-v1.0.0 │ Measures the performance of prompt-tuning a model for a │ -│ │ │ given (GPU model, number GPUS, batch_size, │ -│ │ │ model_max_length, number nodes) combination. │ -│ custom_experiments │ acid_test │ │ -│ custom_experiments │ avoid_oom_recommender │ An AutoConf recommender that suggests the minimum │ -│ │ │ number of gpus per worker and number of workers │ -│ │ │ necessary to execute a Tuning job whilekeeping the per │ -│ │ │ GPU batch size constant │ -│ custom_experiments │ calculate_density │ │ -│ custom_experiments │ min_gpu_recommender │ An AutoConf plugin that suggests the minimum number of │ -│ │ │ gpus per worker and number of workers necessary to │ -│ │ │ execute a Tuning job │ -│ custom_experiments │ ml-multicloud-cost-v1.0 │ │ -│ custom_experiments │ nevergrad_opt_3d_test_func │ │ -│ mock │ test-experiment │ │ -│ mock │ test-experiment-two │ │ -│ robotic_lab │ peptide_mineralization │ Measures adsorption of peptide lanthanide combinations │ -└────────────────────┴─────────────────────────────────────┴─────────────────────────────────────────────────────────┘ -``` - - -## Actuators - -Experiments are provided by Actuators. An Actuator usually provides sets of -experiments that work on the same types of entities i.e. have the same or -similar input requirements. As such Actuators usually are related to a -particular domain e.g., computational chemistry, foundation model inference, -robotic biology lab. 
- -`ado get actuators --details` lists the available actuators, the number of -experiments they provide, a description and their version. Below is an example -of the output: - - - -```commandline -┌────────────────────┬─────────────┬─────────────────────────────────────────────────────┬───────────────────────────┐ -│ ACTUATOR ID │ EXPERIMENTS │ DESCRIPTION │ VERSION │ -├────────────────────┼─────────────┼─────────────────────────────────────────────────────┼───────────────────────────┤ -│ SFTTrainer │ 5 │ An actuator for benchmarking fine-tuning of │ 1.5.1.dev13+ga1833142b │ -│ │ │ foundation models │ │ -│ custom_experiments │ 6 │ Actuator for applying user supplied custom │ 1.5.1.dev8+531c6444.dirty │ -│ │ │ experiments │ │ -│ mock │ 2 │ A actuator class for testing │ 1.5.1.dev8+531c6444.dirty │ -│ replay │ 0 │ Special actuator for handling externally defined │ 1.5.1.dev8+531c6444.dirty │ -│ │ │ experiments (experiments we don't have code for) │ │ -│ robotic_lab │ 1 │ A template for creating an actuator │ 1.5.1.dev13+ga1833142b │ -└────────────────────┴─────────────┴─────────────────────────────────────────────────────┴───────────────────────────┘ -``` - - - -A primary way to extend `ado` is by developing new Actuators providing the -ability to do experiments on entities in a new domain. +- **Inputs** — the values an Experiment needs in order to run. Each input + restricts the values it accepts through a **Property Domain** (for example, + a list of allowed model names, or any integer within a range). See + [Properties and Domains](properties-and-domains.md) for the full list of + domain types. +- **Outputs** — the properties the Experiment measures and records. Because + many Experiments may target the same concept (e.g. `tokens_per_second`), + each output is namespaced to the Experiment that produced it — see + [Target and Observed Properties](#target-and-observed-properties). 
-### Example: Experiment from the SFTTrainer actuator +### Example -Here is an example (truncated) description of an experiment from the SFTTrainer -actuator. - - - -```commandline -Identifier: SFTTrainer.finetune_pt_benchmark-v1.0.0 -Description: Measures the performance of prompt-tuning a model for a given (GPU model, number GPUS, batch_size, -model_max_length, number nodes) combination. - - -Required Inputs: - - Constitutive Properties: - ────────────────────────────────────────────────────────────────────────────────────────────────────────────── - Identifier: model_name - Description: The huggingface name or path to the model - - Domain: - - Type: CATEGORICAL_VARIABLE_TYPE - Values: [ - 'allam-1-13b', - 'granite-13b-v2', - 'granite-20b-v2', - 'granite-3-8b', - 'granite-3.0-1b-a400m-base', - 'granite-3.1-2b', - 'granite-3.1-3b-a800m-instruct', - 'granite-3.1-8b-instruct', - 'granite-3.3-8b', - 'granite-34b-code-base', - 'granite-3b-1.5', - 'granite-3b-code-base-128k', - 'granite-4.0-1b', - 'granite-4.0-350m', - 'granite-4.0-h-1b', - 'granite-4.0-h-micro', - 'granite-4.0-h-small', - 'granite-4.0-h-tiny', - 'granite-4.0-micro', - 'granite-7b-base', - 'granite-8b-code-base', - 'granite-8b-code-base-128k', - 'granite-8b-code-instruct', - 'granite-8b-japanese', - 'granite-vision-3.2-2b', - 'hf-tiny-model-private/tiny-random-BloomForCausalLM', - 'llama-13b', - 'llama-7b', - 'llama2-70b', - 'llama3-70b', - 'llama3-8b', - 'llama3.1-405b', - 'llama3.1-70b', - 'llama3.1-8b', - 'llama3.2-1b', - 'llama3.2-3b', - 'llava-v1.6-mistral-7b', - 'mistral-123b-v2', - 'mistral-7b-v0.1', - 'mixtral-8x7b-instruct-v0.1', - 'smollm2-135m' - ] - - ────────────────────────────────────────────────────────────────────────────────────────────────────────────── - ────────────────────────────────────────────────────────────────────────────────────────────────────────────── - Identifier: model_max_length - Description: The maximum context size. 
Dataset entries with more tokens they are truncated. Entries with - fewer are padded - - Domain: - - Type: DISCRETE_VARIABLE_TYPE - Interval: 1 - Range: [1, 131073] - - ────────────────────────────────────────────────────────────────────────────────────────────────────────────── - ────────────────────────────────────────────────────────────────────────────────────────────────────────────── - Identifier: batch_size - Description: The total batch size to use - - Domain: - - Type: DISCRETE_VARIABLE_TYPE - Interval: 1 - Range: [1, 4097] - - ────────────────────────────────────────────────────────────────────────────────────────────────────────────── - ────────────────────────────────────────────────────────────────────────────────────────────────────────────── - Identifier: number_gpus - Description: The total number of GPUs to use - - Domain: - - Type: DISCRETE_VARIABLE_TYPE - Interval: 1 - Range: [0, 33] - - ────────────────────────────────────────────────────────────────────────────────────────────────────────────── - -Optional Inputs and Default Values: - - ────────────────────────────────────────────────────────────────────────────────────────────────────────────── - Identifier: max_steps - Description: The number of optimization steps to perform. 
Set to -1 to respect num_train_epochs instead - - Domain: - - Type: DISCRETE_VARIABLE_TYPE - Interval: 1 - Range: [-1, 10001] - - Default value: -1 - ────────────────────────────────────────────────────────────────────────────────────────────────────────────── - -Outputs: - ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────── - finetune_pt_benchmark-v1.0.0-is_valid - finetune_pt_benchmark-v1.0.0-dataset_tokens_per_second_per_gpu - finetune_pt_benchmark-v1.0.0-train_runtime - finetune_pt_benchmark-v1.0.0-dataset_tokens_per_second - finetune_pt_benchmark-v1.0.0-train_samples_per_second - finetune_pt_benchmark-v1.0.0-train_steps_per_second - finetune_pt_benchmark-v1.0.0-train_tokens_per_second - finetune_pt_benchmark-v1.0.0-train_tokens_per_gpu_per_second - finetune_pt_benchmark-v1.0.0-cpu_compute_utilization - finetune_pt_benchmark-v1.0.0-cpu_memory_utilization - finetune_pt_benchmark-v1.0.0-gpu_compute_utilization_min - finetune_pt_benchmark-v1.0.0-gpu_compute_utilization_avg - finetune_pt_benchmark-v1.0.0-gpu_compute_utilization_max - finetune_pt_benchmark-v1.0.0-gpu_memory_utilization_min - finetune_pt_benchmark-v1.0.0-gpu_memory_utilization_avg - finetune_pt_benchmark-v1.0.0-gpu_memory_utilization_max - finetune_pt_benchmark-v1.0.0-gpu_memory_utilization_peak - finetune_pt_benchmark-v1.0.0-gpu_power_watts_min - finetune_pt_benchmark-v1.0.0-gpu_power_watts_avg - finetune_pt_benchmark-v1.0.0-gpu_power_watts_max - finetune_pt_benchmark-v1.0.0-gpu_power_percent_min - finetune_pt_benchmark-v1.0.0-gpu_power_percent_avg - finetune_pt_benchmark-v1.0.0-gpu_power_percent_max - ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────── -``` - - - -The SFTTrainer actuator provides experiments which measure the performance of -different fine-tuning techniques on a foundation model fine-tuning deployment -configuration. 
Therefore, the entities it takes as input represent fine-tuning -deployment configuration. - -## Experiment Inputs - -Experiments define their inputs they require along with valid values for those -inputs. - -### Required Inputs - -Experiments can define required inputs. There are properties an Entity must have -values for, for it to be a valid input to the Experiment. - -For example for `SFTTrainer.finetune_pt_benchmark-v1.0.0` shown above we can see -it requires an Entity to have 4 constitutive properties defined: `model_name`, -`model_max_length`, `batch_size` and `number_gpus`. Each one has a domain which -defines the allowed values for that property - if an Entity has a value for a -property that is not in the defined domain the experiment cannot run on it. - -For example, the `number_gpu` property can only have the values from 0 to 32 -(range is exclusive of upper bound) - - -```commandline - ────────────────────────────────────────────────────────────────────────────────────────────────────────────── - Identifier: number_gpus - Description: The total number of GPUs to use - - Domain: - - Type: DISCRETE_VARIABLE_TYPE - Interval: 1 - Range: [0, 33] - - ────────────────────────────────────────────────────────────────────────────────────────────────────────────── -``` - - -All the required inputs in the examples above are -[constitutive properties](entity-spaces.md#entities). However, they can also be -observed properties (see next section) i.e. properties measured by other -experiments. If an Experiment, `B` has a required input that is an observed -property it means the experiment measuring that property has to be run on an -Entity before Experiment `B` can be run on it. - -### Optional Properties - -Experiments can also define optional properties. These are properties an Entity -can have but if they don't the Experiment will give it a default value. 
In -addition, the default values of optional properties can be overridden to create -**parameterized experiments**. This is described further in the -[`discoveryspace` resource documentation](../resources/discovery-spaces.md). - -An example experiment with optional properties is +Below is the description of `robotic_lab.peptide_mineralization`, an Experiment +that measures the adsorption of peptide and lanthanide combinations in a +robotic biology lab: ```terminaloutput @@ -376,26 +111,80 @@ Outputs: ``` -Here you can see three optional properties, `temperature`, `replicas` and -`robot_identifier` that are given default values. +The example shows: + +- **Required inputs** — `peptide_identifier`, `peptide_concentration`, and + `lanthanide_concentration` must always be provided. Each declares a domain + that restricts the valid values. +- **Optional inputs** — `temperature`, `replicas`, and `robot_identifier` each + have a default value and can be overridden. +- **Outputs** — two properties are measured and recorded: + `adsorption_timeseries` and `adsorption_plateau_value`. + +### Required Inputs + +Values must be provided for all required inputs before the Experiment can run. +Providing a value outside the declared domain is an error. + +Most required inputs are **constitutive properties** — values that describe the +Entity being measured, such as a model name or a concentration. However, an +input can also be an **observed property** produced by another Experiment: if +Experiment `B` requires a value that Experiment `A` produces, Experiment `A` +must have been run on the Entity first. + +See [Properties and Domains](properties-and-domains.md) for a full description +of constitutive and observed properties and all domain types. + +### Optional Inputs + +Experiments can also declare optional inputs that have default values. +The defaults can be overridden to create **parameterized experiments** — useful +when you want to fix certain settings while exploring others. 
This is described +further in the +[`discoveryspace` resource documentation](../resources/discovery-spaces.md). -## Target and Observed Properties +### Target and Observed Properties -Experiments define properties the properties they measure. However, there may be -many experiments that measure the same property in different ways so we need a -way to differentiate them. +Experiments declare the properties they intend to measure — these are called +**target properties**. However, many different Experiments might target the +same property (e.g. `tokens_per_second`) measured in different ways. To +distinguish them, the actual value recorded by Experiment `A` for target +property `X` is called an **observed property**, named `A-X`. -The properties the experiment targets measuring are called `target properties`, -and the properties it actually measures `observed properties`. If experiment `A` -has target property `X`, then the observed property is `A-X` i.e. the value of -target property `X` measured by experiment `A`. +In the example above: + +- `adsorption_plateau_value` is the **target property** — the concept being + measured. +- `peptide_mineralization-adsorption_plateau_value` is the **observed property** + — that value as recorded by this specific Experiment. + +For a full description of property types see +[Properties and Domains](properties-and-domains.md). ## Measurement Space -A measurement space is simply a set of [experiments](actuators.md#experiments). +A Measurement Space is a collection of [Experiments](#experiments). +As a result a Measurement Space also defines a set of observed properties and target +properties as follows: + +Property Type | Measurement Space Definition +--- | --- +Observed | Union of the observed property sets of its Experiments +Target | Union of the target property sets of its Experiments + +When combined with an Entity Space, a Measurement Space forms a +[Discovery Space](discovery-spaces.md). 
+ +## Actuators + +Experiments are grouped and provided by **Actuators**. An Actuator typically +covers a particular domain - for example, foundation model fine-tuning, +computational chemistry, or robotic biology - and provides a collection of +related Experiments for that domain. -Since each experiment has a set of observed properties, a measurement space also -defines a set of observed properties. +A primary way to extend `ado` is by developing new Actuators to support +Experiments in a new domain. -Since each observed property is an observation of a target property, a -measurement space also defines a set of target properties. +The [Actuator documentation](../actuators/working-with-actuators.md) +has more detail including how to see the Actuators and Experiments +available in your deployment. diff --git a/website/docs/core-concepts/concepts.md b/website/docs/core-concepts/concepts.md index 59bebdd43..8540a71e3 100644 --- a/website/docs/core-concepts/concepts.md +++ b/website/docs/core-concepts/concepts.md @@ -1,40 +1,35 @@ ## Discovery Space -The core concept in `ado` is called a _Discovery Space_. In `ado` you are often -creating and performing operations on Discovery Spaces. - -For users familiar with `pandas` and `dataframes`, a Discovery Space combines: - -- the schema of a `dataframe` i.e. the columns and what they mean -- instructions on how to fill the `dataframe` rows -- the current data in the `dataframe` (and what's missing!) - -A Discovery Space expresses the hidden metadata and contextual -information necessary to understand and extend a dataframe. See -[Discovery Space](discovery-spaces.md) for more details. 
- -A Discovery Space is built from: - -- [Entities and Entity Spaces](entity-spaces.md): The set of things in a - Discovery Space -- [Measurement Spaces](actuators.md#measurement-space): The set of experiments - in a Discovery Space -- [Experiments and Actuators](actuators.md): The available experiments and the - tools that execute them +`ado` is a tool for systematically exploring, measuring, and analysing a space of +entities - for example, configurations, systems and substances. +The core concept enabling this is a +**Discovery Space**. It answers three questions: + +- **How are measurements performed?** A Discovery Space defines + a set of [Experiments](actuators.md). Each Experiment + takes defined inputs and produces measured outputs. The collection of Experiments + is called a [Measurement Space](actuators.md#measurement-space). +- **What do you want to measure?** A Discovery Space defines an + [Entity Space](entity-spaces.md) — the + specific set of things, called _Entities_, you want to measure. +- **What have you measured so far?** A Discovery Space uses + a **Sample Store**, a shared database, to read and store measurement + results. + +For users familiar with `pandas`, a Discovery Space is like a DataFrame that +knows its own schema, knows how to fill in missing values, and shares data +transparently with other DataFrames. See [Discovery Spaces](discovery-spaces.md) +for more. ## Sample Store -In `ado`, data on sampled entities, and the results of experiments on them, are -kept in a **sample store**. - -A single sample store can be used by multiple Discovery Spaces, allowing them to -share data. This means, for example, if an experiment has already been run, -`ado` can reuse the existing results instead of running the experiment again, -saving time and computational resources. +In `ado`, Entities and the results of Experiments on them are kept in a +**Sample Store** — a shared database that multiple Discovery Spaces can use. 
-This ability to transparently share and reuse data is a core feature of `ado`. -See [Shared Sample Stores](data-sharing.md) for more details. +If an Experiment has already been run on an Entity, `ado` can reuse the result +rather than running it again. This transparent data sharing is a core feature of +`ado`. See [Shared Sample Stores](data-sharing.md) for more details. ## What's next @@ -46,17 +41,19 @@ See [Shared Sample Stores](data-sharing.md) for more details. --- - Next go to [resources](../resources/resources.md) to learn more about working with these core-concepts in `ado`. + Go to [resources](../resources/resources.md) to learn more about working + with these core concepts in `ado`. [ado resources :octicons-arrow-right-24:](../resources/resources.md) - :octicons-workflow-24:{ .lg .middle } **Try our examples** - --- + --- - Try some of our [examples](../examples/examples.md) if you want to dive straight in. + Try some of our [examples](../examples/examples.md) if you want to dive + straight in. - [Our examples :octicons-arrow-right-24:](../examples/examples.md) + [Our examples :octicons-arrow-right-24:](../examples/examples.md) \ No newline at end of file diff --git a/website/docs/core-concepts/data-sharing.md b/website/docs/core-concepts/data-sharing.md index b2c93d4a5..8b1075845 100644 --- a/website/docs/core-concepts/data-sharing.md +++ b/website/docs/core-concepts/data-sharing.md @@ -1,102 +1,80 @@ # Shared Sample Stores -In `ado` Entities and measurement results are stored in a database called a -**Sample Store**. This document describes how Sample Stores enable sharing of -data. For more general information about these databases see +In `ado`, Entities and measurement results are stored in a database called a +**Sample Store**. For more on how Sample Stores are configured and managed see [their dedicated page](../resources/sample-stores.md). 
-There are two key points that underpin data reuse in `ado`: +Two principles underpin data reuse in `ado`: -- You can **share** a Sample Store between multiple Discovery Spaces - - This allows a Discovery Space to (re)use relevant Entities and Measurements - stored in the Sample Store by operations on other Discovery Spaces -- **Entities are always shared**. There is only one entry in a Sample Store for - an Entity +- **A Sample Store can be shared across multiple Discovery Spaces.** This allows + any Discovery Space to access Entities and measurements recorded by operations + on other Discovery Spaces that use the same store. +- **Each Entity has exactly one record in a Sample Store.** If two Discovery + Spaces both include the same Entity, they reference the same record — there is + no duplication. > [!NOTE] > -> To maximize the chance of data-reuse, similar Discovery Spaces should use the -> same Sample Store. However, Discovery Spaces do not have to be similar to use -> the same Sample Store. +> To maximise the chance of data reuse, similar Discovery Spaces should use the +> same Sample Store. However, any Discovery Spaces can share a store regardless +> of how similar they are. -## When data can be shared in `ado` - -There are two situations where data can be shared between Discovery Spaces in -`ado`: - -- **Data Retrieval**: retrieving data about entities and measurements from the - Discovery Space e.g. 
`ado show entities space` -- **Data Generation**: When performing an explore operation on a Discovery - Space - this type of data reuse is called `memoization` - -## How `ado` determines what data can be shared - -As a quick recap, a Discovery Space is composed of: - -- an [Entity Space](entity-spaces.md) which describes a set of Entities (points) - to be measured -- a [Measurement Space](actuators.md#measurement-space) which describes a set of - Experiments to apply to the points +## How `ado` matches shared data ### Entities -Each Entity in the Entity Space has a unique identifier, usually determined by -its set of constitutive property values. For example, if an Entity has two -constitutive properties `X` an `Y` with values 4 and 10, its id will be -'X:4-Y:10'. Since the identifiers of all the Entities in the Entity Space are -known, the Sample Store can be searched to see if it contains a record for any -of the Entities. +Each Entity has a unique identifier derived from its +[constitutive property](properties-and-domains.md#property-types) values. +For example, an Entity with properties `X=4` and `Y=10` gets the id +`X.4-Y.10`. `ado` uses these identifiers to look up Entities in the Sample +Store, regardless of which Discovery Space originally recorded them. ### Measurements -Each experiment in a Measurement Space has a unique identifier, determined from -its base name plus any optional properties that have been explicitly set. When -an Entity is retrieved from the Sample Store, it contains results of all the -experiments that have been applied to it. If the identifier of a result matches -the identifier of an Experiment in the Measurement Space, `ado` determines it -can be reused. +Each Experiment also has a unique identifier (its name plus any explicitly set +optional inputs). When an Entity is retrieved from the Sample Store, it carries +the results of all Experiments that have been applied to it. 
`ado` checks +whether any of those result identifiers match an Experiment in the current +Measurement Space — if so, the result can be reused. + +## Data retrieval modes -## Data sharing and data retrieval +When retrieving data from a Discovery Space (e.g. via `ado show entities`), +there are two modes that control whether shared data is included: -When retrieving data from a Discovery Space, e.g. via `ado show entities`, you -are actually retrieving data from the Sample Store that matches the Discovery -Space. When determining what data to retrieve there are two situations to -consider: + +| Mode | What is returned | +| --- | --- | +| **measured** | Only Entities and measurements recorded by operations run directly on *this* Discovery Space. Compatible data from other spaces is excluded. | +| **matching** | All Entities and measurements in the Sample Store that are compatible with this Discovery Space, regardless of which space produced them. | + -- **measured**: retrieve only Entities and measurements that were sampled via an - operation on the given Discovery Space - - this can be considered the "no sharing" mode. If an Entity or measurement - exists in the Sample Store that's compatible with the Discovery Space, but - no operation on the Discovery Space ever visited it, the "measured" mode - will not show it -- **matching**: retrieve all Entities and measurements that match the Discovery - Space - - this can be considered the "sharing" mode. +Use **measured** when you want to see only the results your operations have +produced. Use **matching** when you want the full picture including any +compatible data from other spaces. -## Data sharing and memoization +## Memoization > [!IMPORTANT] > > Each explore operator should provide a way to turn memoization on and off. > Check the operator documentation. -This section explains how data sharing and reuse works during an explore -operation - a feature called _memoization_. 
It's recommended you check the -documentation on [operations](../resources/operation.md) and +*Memoization* is the name for data reuse that happens automatically during an +explore operation. It's recommended you also check the documentation on +[operations](../resources/operation.md) and [explore operators](../operators/explore_operators.md). -Briefly, an explore operation samples a point in the Entity Space of a Discovery -Space and applies the experiments in the Measurement Space to it. In detail, the -sampling process is as follows: +When an operation samples an Entity, it proceeds as follows: -- An Entity is sampled from the Entity Space +- The Entity is sampled from the Entity Space - The Entity's record is retrieved from the Sample Store if present (via its unique identifier) - If **memoization is on** - - for each experiment in the MeasurementSpace, `ado` checks - if a result for it already exists (via the experiment's unique identifier) - - if it does, the result is reused. If there is more than one result, they - are all reused -- if **memoization is off** - - Existing results are ignored. Each experiment in the Measurement Space is - applied again to the Entity. The new results are added to any existing. + - for each Experiment in the Measurement Space, `ado` checks + if a result for it already exists (via the Experiment's unique identifier) + - if it does, the result is reused. If there is more than one result, + they are all reused + - if it does not, the Experiment is applied to the Entity +- If **memoization is off** + - existing results are ignored. Each Experiment in the Measurement Space is + applied again to the Entity. The new results are added to any existing + results.
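As a rough illustration of the memoization steps listed in the `data-sharing.md` changes above, here is a minimal Python sketch. `EntityRecord`, `measure`, and the experiment callables are hypothetical names invented for this example; they are not part of the `ado` API, and the real implementation may differ.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class EntityRecord:
    """Hypothetical stand-in for an Entity's record in a Sample Store."""

    identifier: str
    # Maps an Experiment identifier to every result recorded for it.
    results: dict[str, list[float]] = field(default_factory=dict)


def measure(
    record: EntityRecord,
    experiments: dict[str, Callable[[str], float]],
    memoization: bool = True,
) -> list[str]:
    """Apply experiments to one Entity and return the ids that were executed."""
    executed = []
    for experiment_id, run in experiments.items():
        existing = record.results.get(experiment_id, [])
        if memoization and existing:
            # Memoization on and a result exists: reuse all of them, run nothing.
            continue
        # Memoization off, or no stored result: run the Experiment and
        # append the new result to any existing ones.
        record.results.setdefault(experiment_id, []).append(run(record.identifier))
        executed.append(experiment_id)
    return executed
```

With memoization on, only Experiments without a stored result are executed; with it off, every Experiment runs again and its result is appended alongside the earlier ones.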
diff --git a/website/docs/core-concepts/discovery-spaces.md b/website/docs/core-concepts/discovery-spaces.md index ad1580d2f..544ca3563 100644 --- a/website/docs/core-concepts/discovery-spaces.md +++ b/website/docs/core-concepts/discovery-spaces.md @@ -1,19 +1,22 @@ -A Discovery Space is made up of an [`Entity Space`](entity-spaces.md) and a -[`Measurement Space`](actuators.md#measurement-space). The `Entity Space` -defines the things you want to measure and the `Measurement Space` how you want -to measure them. +A Discovery Space combines an [Entity Space](entity-spaces.md) and a +[Measurement Space](actuators.md#measurement-space). The Entity Space defines +the Entities you want to measure; the Measurement Space defines how they are +measured. Results are stored in a [Sample Store](data-sharing.md). -A Discovery Space is also associated with a [Sample Store](data-sharing.md) -where measurement results and entities are recorded. +A Discovery Space is a **view** rather than a container — data is fetched from +the Sample Store on demand. This means multiple Discovery Spaces can share +measurement results transparently, and any measurement made by anyone using the +same Sample Store becomes immediately available. 
-## Example: Fine-Tuning Deployment Configuration Discovery Space +## Example: Fine-Tuning Deployment Configuration -We can combine the Entity Space example for fine-tuning deployment configuration -[here](entity-spaces.md#example-fine-tuning-deployment-configuration) with one -of the experiments from the `SFTTrainer` actuator to create the following -Discovery Space: +We can combine the +[Entity Space example](entity-spaces.md#example-fine-tuning-deployment-configuration) +with one of the Experiments from the [`SFTTrainer` Actuator](../actuators/sft-trainer.md) +to create the +following Discovery Space: @@ -66,98 +69,78 @@ Sample Store identifier: '2351e8' ``` -Here we can see: +The output shows the unique Discovery Space identifier, the Entity Space (80 +Entities across 7 dimensions), and the Measurement Space (one Experiment with +17 target properties). Together these define exactly what can be measured and +what the resulting data will look like. -- A unique id for the discovery space -- The entity space -- For each experiment in the measurement space (in this case just one) the - target properties it measures. +## Measurement Space and Entity Space Compatibility -## Sampling and Measurement - -A Discovery Space created with an empty Sample Store has no data associated with -it i.e. no sampled and measured entities. Adding data requires applying an -operation, like a Random Walk, to the Discovery Space. This operation samples -entities from the Entity Space, measures them according to the Measurement Space -experiments, and places the results into the Sample Store. 
- -Therefore, at any given point in time a Discovery Space will have some number of - -- sampled and measured entities -- sampled and unmeasured entities (because the measurements failed) -- unsampled entities +Since an [Experiment](actuators.md#experiments) declares the inputs it needs, +an Entity can only be measured by that Experiment if its +[constitutive property](properties-and-domains.md#property-types) values +satisfy those input requirements. -The first two will have corresponding data in the Sample Store. +Since a [Measurement Space](actuators.md#measurement-space) is a set of +Experiments, it defines a set of required constitutive properties. An Entity +Space must therefore contain all those properties, and each Entity Space +[Property Domain](properties-and-domains.md#property-domain-types) must be a +**subdomain** of the corresponding Experiment's input domain. -## Comparison: Discovery Space and a DataFrame +In practice this means the Experiment's declared input domains define the +**maximum possible extent** of any Entity Space used with that Measurement +Space. Your Entity Space is always a focused subset within those bounds. For +example, if an Experiment accepts `batch_size` values from 1 to 4096, your +Entity Space can restrict that to `[1, 2, 4, 8, 16]` — but it cannot extend +beyond `[1, 4096]`. -Comparing a Discovery Space with a DataFrame can help clarify the concept and -also illustrate the benefits +| | Full Experiment extent | Focused Entity Space subset | +| --- | --- | --- | +| `batch_size` | `[1, 4097]` interval 1 | `[1, 2, 4, 8, 16, 32, 64, 128]` | +| `model_name` | 40 model names | `[granite-3-8b, llama3-8b]` | +| `number_gpus` | `[0, 33]` interval 1 | `[2, 4]` | -### A Discovery Space defines a DataFrame schema +You can inspect the full extent of an Experiment's inputs with +`ado get experiments --details`. -When you create a Discovery Space you can imagine you have created a DataFrame -schema where: - -1. 
There are Columns for each entity space dimension -2. There are Columns for each measurement space property -3. Each row is an entity - -If we were to look at the example fine-tuning deployment configuration Discovery -Space this would look like (the rows and columns are truncated) - - -| model_id | gpu_type | batch_size | model_max_length | number_gpus | ... | finetune-lora-fsdp-r-4-a-16-tm-default-v1.2.0.dataset_tokens_per_second | finetune-lora-fsdp-r-4-a-16-tm-default-v1.2.0.gpu_memory_utilization_peak | ... | -| -------- | --------------------- | ---------- | ---------------- | ----------- | --- | ----------------------------------------------------------------------- | ------------------------------------------------------------------------- | --- | -| lama3-8b | NVIDIA-A100-80GB-PCIe | 2 | 512 | 2 | ... | UNK | UNK | ... | -| lama3-8b | NVIDIA-A100-80GB-PCIe | 4 | 512 | 2 | ... | UNK | UNK | ... | -| lama3-8b | NVIDIA-A100-80GB-PCIe | 8 | 512 | 2 | ... | UNK | UNK | ... | -| ... | ... | ... | ... | ... | ... | ... | ... | ... | - - -This DataFrame has 80 rows, one for each entity, and (4+3+17) columns, one for -each of the 7 constitutive properties and the 17 target properties of -`finetune-lora-fsdp-r-4-a-16-tm-default-v1.2.0.` - -We can fill all the entity space columns for all the rows as we know the full -space. No measurements have taken place so all the measurement values are -unknown - -### A Discovery Space defines how to fill all the data in the DataFrame - -In the above example the columns associated with the measurement space have no -data. However, the Discovery Space specifies exactly how to obtain this data, as -it defines the actual experiments, supplied by actuators, that you can execute -to get it. 
+## Sampling and Measurement -Using the Discovery Space at any point we can choose a row (entity) with no -measurement and get the measurements +Data is added to a Discovery Space by running an **operation** on it, for +example a Random Walk or a Bayesian optimisation. The operation selects +Entities from the Entity Space, applies the Experiments in the Measurement +Space to them and stores the results in the Sample Store. Operations are +described in the [resources documentation](../resources/operation.md). -### A Discovery Space populates the schema from a shared external source +An Entity and its measurements only become **associated with a Discovery Space** +when an operation on that space has sampled them. Even if the underlying Sample +Store already contains compatible measurements from another Discovery Space, +those results are not automatically attributed to this one — attribution requires +an explicit operation. This prevents uncontrolled inheritance of data from other +spaces. -A Discovery Space is a view rather than a container. +At any point in time a Discovery Space therefore has: -This means when you generate a DataFrame from a Discovery Space the data in the -rows is fetched from a shared-source. If someone else measured an entity that -corresponds to one of the rows in your DataFrame it will be automatically -populated. +- Entities that have been sampled and successfully measured +- Entities that have been sampled but whose measurements failed +- Entities that have not yet been sampled -As operations are run on a Discovery Space the rows in the table become filled -in. You can choose to look at: +> [!NOTE] +> You can still query compatible data across spaces when needed +> — see [Shared Sample Stores](data-sharing.md). -1. Rows filled in by operations on this space (Entities sampled and measured via - this Discovery Space) -2. 
Rows filled in by operations on other spaces (Entities sampled and - measured via any Discovery Space using same Sample Store) -3. Rows not filled in at all (Unmeasured entities) +## Discovery Space vs DataFrame -### Summary +For users familiar with `pandas`, the table below summarises how a Discovery +Space relates to a DataFrame. The key difference is that a Discovery Space +*knows* its schema and how to fill it, and shares data from a common source +rather than holding a private copy. -| Method | Column Definition | Defines how to acquire missing data? | Data Sharing | -| --------------- | ----------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------ | ------------------------------------------------------------- | -| DataFrame | Ad-Hoc. The data-frame creator defines the columns when it is created. The meaning of the columns must be communicated separately, | Not defined. The DataFrame just holds data | Not possible. A DataFrame is a static object | -| Discovery Space | Defined by the discovery space. A set of Entity Space columns and Measurement Space columns. 
| Yes ,defined by the MeasurementSpace | Yes, values are loaded from a distributed shared db on demand | +| | DataFrame | Discovery Space | +| --- | --- | --- | +| Column definition | Ad-hoc — defined when created; meaning communicated separately | Defined by the Discovery Space: Entity Space dimensions + Measurement Space target properties | +| How to fill missing data | Not defined — a DataFrame just holds data | Defined by the Measurement Space: run the Experiments | +| Data sharing | Not possible — a DataFrame is a static, private object | Yes — values are fetched from a shared Sample Store on demand | - \ No newline at end of file + diff --git a/website/docs/core-concepts/entity-spaces.md b/website/docs/core-concepts/entity-spaces.md index d43eb6be6..ca439dd88 100644 --- a/website/docs/core-concepts/entity-spaces.md +++ b/website/docs/core-concepts/entity-spaces.md @@ -1,27 +1,27 @@ ## Entities -Entities represent things that can be measured. Examples are molecules or points -in an application configuration space. +Entities represent the things you want to measure — for example, a molecule, +a fine-tuning deployment configuration, or a robotic experiment setup. -Entities all have a set of constitutive properties which define them. A -molecule's constitutive properties might be a SMILES or INCHI string. The -constitutive properties of a fine-tuning deployment configuration might be GPU -model, number of GPUs and batch size. +Every Entity is described by a set of +[**constitutive properties**](properties-and-domains.md#property-types), and +corresponding values, that uniquely identify it. For a fine-tuning deployment +configuration these +might be GPU model, number of GPUs, and batch size. For a molecule they might +be a SMILES string. -An entity will also have observed properties. These are properties measured by -an experiment (or experiment protocol). 
For example, a molecule might have an -an observed property for its `band-gap` while a fine-tuning deployment -configuration might have an an observed property related to `tokens throughput`. +Once an Experiment has been run on an Entity, it also gains +[**observed properties**](actuators.md#target-and-observed-properties) — the +measured outputs produced by that Experiment. -### Example: FM Fine-tuning Deployment Configuration +### Example -Here is an example of an entity that represents a FM fine-tuning deployment +Here is an Entity representing a fine-tuning deployment configuration: ```terminaloutput Identifier: dataset_id.news-tokens-16384plus-entries-4096-model_name.llama3-8b-number_gpus.4.0-model_max_length.2048.0-torch_dtype.bfloat16-batch_size.16.0-gpu_model.NVIDIA-A100-80GB-PCIe -Generator: explicit_grid_sample_generator Constitutive properties: name value @@ -32,63 +32,39 @@ Constitutive properties: 4 torch_dtype bfloat16 5 batch_size 16.0 6 gpu_model NVIDIA-A100-80GB-PCIe - -Observed properties: - name experiment target-property values - 0 finetune-lora-fsdp-r-4-a-16-tm-default-v1.2.0-... SFTTrainer.finetune-lora-fsdp-r-4-a-16-tm-defa... gpu_compute_utilization_min [98.14772727272727] - 1 finetune-lora-fsdp-r-4-a-16-tm-default-v1.2.0-... SFTTrainer.finetune-lora-fsdp-r-4-a-16-tm-defa... gpu_compute_utilization_avg [98.26988636363636] - 2 finetune-lora-fsdp-r-4-a-16-tm-default-v1.2.0-... SFTTrainer.finetune-lora-fsdp-r-4-a-16-tm-defa... gpu_compute_utilization_max [98.38636363636364] - 3 finetune-lora-fsdp-r-4-a-16-tm-default-v1.2.0-... SFTTrainer.finetune-lora-fsdp-r-4-a-16-tm-defa... gpu_memory_utilization_min [33.709723284090906] - 4 finetune-lora-fsdp-r-4-a-16-tm-default-v1.2.0-... SFTTrainer.finetune-lora-fsdp-r-4-a-16-tm-defa... gpu_memory_utilization_avg [33.709723284090906] - 5 finetune-lora-fsdp-r-4-a-16-tm-default-v1.2.0-... SFTTrainer.finetune-lora-fsdp-r-4-a-16-tm-defa... 
gpu_memory_utilization_max [33.709723284090906] - 6 finetune-lora-fsdp-r-4-a-16-tm-default-v1.2.0-... SFTTrainer.finetune-lora-fsdp-r-4-a-16-tm-defa... gpu_memory_utilization_peak [34.065475] - 7 finetune-lora-fsdp-r-4-a-16-tm-default-v1.2.0-... SFTTrainer.finetune-lora-fsdp-r-4-a-16-tm-defa... cpu_compute_utilization [98.94999999999999] - 8 finetune-lora-fsdp-r-4-a-16-tm-default-v1.2.0-... SFTTrainer.finetune-lora-fsdp-r-4-a-16-tm-defa... cpu_memory_utilization [6.3182326931818205] - 9 finetune-lora-fsdp-r-4-a-16-tm-default-v1.2.0-... SFTTrainer.finetune-lora-fsdp-r-4-a-16-tm-defa... train_runtime [887.5672] - 10 finetune-lora-fsdp-r-4-a-16-tm-default-v1.2.0-... SFTTrainer.finetune-lora-fsdp-r-4-a-16-tm-defa... train_samples_per_second [4.615] - 11 finetune-lora-fsdp-r-4-a-16-tm-default-v1.2.0-... SFTTrainer.finetune-lora-fsdp-r-4-a-16-tm-defa... train_steps_per_second [0.072] - 12 finetune-lora-fsdp-r-4-a-16-tm-default-v1.2.0-... SFTTrainer.finetune-lora-fsdp-r-4-a-16-tm-defa... train_tokens_per_second [9451.236] - 13 finetune-lora-fsdp-r-4-a-16-tm-default-v1.2.0-... SFTTrainer.finetune-lora-fsdp-r-4-a-16-tm-defa... train_tokens_per_gpu_per_second [2362.809] - 14 finetune-lora-fsdp-r-4-a-16-tm-default-v1.2.0-... SFTTrainer.finetune-lora-fsdp-r-4-a-16-tm-defa... model_load_time [-1.0] - 15 finetune-lora-fsdp-r-4-a-16-tm-default-v1.2.0-... SFTTrainer.finetune-lora-fsdp-r-4-a-16-tm-defa... dataset_tokens_per_second [9451.237044361262] - 16 finetune-lora-fsdp-r-4-a-16-tm-default-v1.2.0-... SFTTrainer.finetune-lora-fsdp-r-4-a-16-tm-defa... dataset_tokens_per_second_per_gpu [2362.8092610903154] - 17 finetune-lora-fsdp-r-4-a-16-tm-default-v1.2.0-... SFTTrainer.finetune-lora-fsdp-r-4-a-16-tm-defa... 
is_valid [1.0] - -Associated experiments: - - SFTTrainer.finetune-lora-fsdp-r-4-a-16-tm-default-v1.2.0 ``` -For more information about the meaning of `observed properties` see -[target & observed properties](actuators.md#target-and-observed-properties) - -## Entity Spaces +The identifier is derived from the constitutive property values, so two Entities +with the same values are the same Entity. Once Experiments have been run on +this Entity, observed properties (measured values such as +`train_tokens_per_second`) will also appear. See +[Target and Observed Properties](actuators.md#target-and-observed-properties) +for more detail. -An Entity Space describes a set of entities. The set could be discrete or -continuous, bounded or unbounded. In `ado` you normally define Entity Spaces and -then sample Entities from them. +> [!IMPORTANT] Measuring Entities with Experiments +> +> In order for an [Experiment](actuators.md#experiments) to measure an Entity, +> the Entity's constitutive property values must fall within the input domains +> declared by the Experiment. -### Example: Molecules +## Entity Spaces -This space has a single dimension with type identifier. This is a property whose -values are a potentially very large set of unique-ids generated in some fashion. +An individual Entity is a single point. An **Entity Space** defines the full +set of Entities you want to explore: all the points you could potentially +measure. -```commandline - Space with non-discrete dimensions. Cannot count entities - Identifier properties: - name - 0 smiles -``` +An Entity Space is a set of constitutive properties, each with a **Property +Domain** that constrains the values it can take. Each property is a dimension +of the space, and every combination of values across all dimensions is an +Entity in the space. That is, +the Entity Space is the Cartesian product of the dimensions. ### Example: Fine-tuning Deployment Configuration -This space has 7 dimensions, 4 categorical and 3 discrete.
Each of the 4 -categorical dimensions has only a single value. The discrete dimensions each -have a range of values they can take. - + ```commandline - Number entities: 80 +Number entities: 80 Categorical properties: name values 0 dataset_id [news-tokens-16384plus-entries-4096] @@ -102,19 +78,17 @@ have a range of values they can take. 1 model_max_length [512, 8193] None [512, 1024, 2048, 4096, 8192] 2 batch_size [1, 129] None [1, 2, 4, 8, 16, 32, 64, 128] ``` + -### Property Domains - -Each property in an entity space can be associated with a domain. The domain is -the range of values the property can take and also the probability of those -values. In the `Fine-tuning Deployment Configuration` example we can see the -domains for each property. The categorical properties have a set of values and -the discrete properties a range and also a set of values. +This space has 7 dimensions: 4 categorical (each fixed to a single value) and +3 discrete. The total number of Entities is the product of the number of values +in each dimension: -In the `Molecules` example we see there is no domain, which means any value of -`smiles` is allowed. When there is no domain it also means the Entity Space -alone does not contain sufficient information by itself on how to sample the -entities. +```text +1 × 1 × 1 × 1 × 2 × 5 × 8 = 80 Entities +``` -By default, the probability is uniform, every value is equally likely, but it -could also be more complex. +Each Property Domain constrains one dimension. The categorical properties list +their allowed values explicitly; the discrete properties specify a range and a +set of values within it. For the full list of domain types see +[Properties and Domains](properties-and-domains.md). 
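The Entity count in the `entity-spaces.md` changes above can be checked with a short Python sketch. The dimension values are taken from the example output and the example Entity shown in this diff; treating the single values of `model_name`, `torch_dtype` and `gpu_model` as those of the example Entity, and `number_gpus` as `[2, 4]`, is an assumption, since the hunk elides those rows.

```python
from itertools import product
from math import prod

# One list of allowed values per Entity Space dimension (assumed, see above).
dimensions = {
    "dataset_id": ["news-tokens-16384plus-entries-4096"],
    "model_name": ["llama3-8b"],
    "torch_dtype": ["bfloat16"],
    "gpu_model": ["NVIDIA-A100-80GB-PCIe"],
    "number_gpus": [2, 4],
    "model_max_length": [512, 1024, 2048, 4096, 8192],
    "batch_size": [1, 2, 4, 8, 16, 32, 64, 128],
}

# Size of the space: the product of the number of values in each dimension.
expected_size = prod(len(values) for values in dimensions.values())

# Every combination of values across all dimensions is one Entity.
entities = [dict(zip(dimensions, combo)) for combo in product(*dimensions.values())]
```

Enumerating the combinations this way only works for finite dimensions; a continuous or unbounded dimension has no countable Cartesian product.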
diff --git a/website/docs/core-concepts/properties-and-domains.md b/website/docs/core-concepts/properties-and-domains.md new file mode 100644 index 000000000..24aa2f33d --- /dev/null +++ b/website/docs/core-concepts/properties-and-domains.md @@ -0,0 +1,250 @@ +# Properties and Domains + +Properties and Property Domains are what `ado` uses to describe the +inputs and outputs of [Experiments](actuators.md), +and the dimensions of [Entity Spaces](entity-spaces.md). + +## Property + +A **Property** is a named concept — a string identifier such as: + +* _gpu-model_ +* _batch-size_ +* _node-selection-method_ +* _solve-time_ + +A Property may optionally carry metadata (a description) that explains what the +identifier represents. + +Some Properties are also associated with a **Property Domain** that specifies +the set of values the Property is allowed to take: + +* gpu-model → one of {A100, H100, MI300} +* batch-size → any integer between 1 and 1024 +* node-selection-method → one of {round-robin, random, greedy} +* solve-time → a positive floating‑point number + +## Property Types + +In `ado` there are three roles a Property can play: + +* **Constitutive properties** — the inputs to Experiments, and the dimensions + of an Entity Space. They describe inherent or assumed characteristics of the + Entity — the "givens". Constitutive properties usually have a Property Domain. +* **Target properties** — the properties an Experiment _intends_ to measure, + e.g. `train_tokens_per_second`. +* **Observed properties** — the values actually recorded by a specific + Experiment. Because many Experiments may target the same property, each + observed property is namespaced to the Experiment that produced it + (e.g. `finetune_lora-train_tokens_per_second`). See + [Target and Observed Properties](actuators.md#target-and-observed-properties). + +> [!NOTE] +> +> In ado, usually only constitutive properties have Property Domains. 
+ +## Property Domain Types + +`ado` supports the following Property Domain types. Each is written under a +`domain:` key in `ado` YAML. + +> [!NOTE] +> +> The different domain types are distinguished by a **Variable Type** field +> (`variableType`). In many cases this can be omitted and `ado` will infer it +> automatically; see [Auto-inference](#auto-inference-of-property-domain-types). + +### Categorical + +A finite, named set of values. Typically strings, though numeric values are +also allowed. + +Used when the property can take one of a fixed list of labels. + +```yaml +domain: + values: [granite-3-8b, llama3-8b, mistral-7b-v0.1] +``` + +### Discrete + +A countable set of numeric values, specified as an explicit list, as a range +with a step interval, or as an interval alone. The first two forms define +finite domains; the third is unbounded. + +Used when the property takes a countable set of numbers. + +**Explicit list:** + +```yaml +domain: + values: [1, 2, 4, 8, 16, 32, 64, 128] +``` + +**Range with interval** (lower inclusive, upper exclusive): + +```yaml +domain: + domainRange: [1, 129] + interval: 1 +``` + +**Interval only** (unbounded discrete: any multiple of the interval): + +```yaml +domain: + interval: 1 +``` + +### Continuous + +A continuous numeric domain. Use for real-valued properties. + +**Bounded range** (any real value within the bounds is valid): + +```yaml +domain: + domainRange: [0, 100] +``` + +**Unbounded** (any real number): + +```yaml +domain: + variableType: CONTINUOUS_VARIABLE_TYPE +``` + +### Binary + +Exactly two values: `true` and `false`. + +```yaml +domain: + variableType: BINARY_VARIABLE_TYPE +``` + +### Open Categorical + +Categorical values where the complete set of categories is not known in advance. +`variableType` must be set explicitly. An optional `values` field can seed a +known subset of categories. + +Used for properties where new categories can appear at runtime, for example a +molecule identifier or an AI model name.
+ +```yaml +domain: + variableType: OPEN_CATEGORICAL_VARIABLE_TYPE +``` + +## Auto-inference of Property Domain Types + +When `variableType` is omitted, `ado` infers it from the other fields: + +| Fields present | Inferred type | +| --- | --- | +| `values` with all numeric entries | `DISCRETE_VARIABLE_TYPE` | +| `values` with any non-numeric entry | `CATEGORICAL_VARIABLE_TYPE` | +| `domainRange` only (no `interval`) | `CONTINUOUS_VARIABLE_TYPE` | +| `domainRange` + `interval` | `DISCRETE_VARIABLE_TYPE` | +| `interval` only (no `domainRange`) | `DISCRETE_VARIABLE_TYPE` | + +`BINARY_VARIABLE_TYPE` and `OPEN_CATEGORICAL_VARIABLE_TYPE` cannot be inferred +and must always be declared explicitly. + +## Probability Functions + +Each domain can optionally specify a probability function that controls how +values are sampled. The default is **uniform** — every value in the domain is +equally likely. + +```yaml +domain: + values: [1, 2, 4, 8, 16] + probabilityFunction: + identifier: uniform +``` + +A **normal** distribution is also available for continuous and discrete domains: + +```yaml +domain: + domainRange: [0.0, 1.0] + probabilityFunction: + identifier: normal + parameters: + mean: 0.5 + std: 0.1 +``` + +When no `probabilityFunction` is specified, uniform sampling is used. + +## Property Subdomains + +Domain A is a **subdomain** of domain B if every value in A is also a valid +value in B. A subdomain represents a narrowed or more specific version of a +parent domain. + +The most common place this matters in `ado` is when defining an +[Entity Space](entity-spaces.md): the domain you assign to each entity space +dimension must be a subdomain of the corresponding experiment input domain. +This ensures that all entities in the space are valid inputs to the experiment. 
+ +### Compatible Subdomain Types + +Not every combination of domain types is valid: the subdomain type must be +compatible with the parent type: + + +| Parent domain | Compatible subdomain types | Notes | +| --- | --- | --- | +| `CONTINUOUS` | `CONTINUOUS`, `DISCRETE` (finite), `BINARY` | Sub-range must lie within the parent range; `BINARY` requires 0 and 1 to be within the range | +| `DISCRETE` | `DISCRETE`, `BINARY` | Sub-values must be a subset of the parent values; `BINARY` only valid if both 0 and 1 appear in the parent | +| `CATEGORICAL` | `CATEGORICAL`, `DISCRETE` (finite), `BINARY` | Sub-values must be a subset of the parent values | +| `BINARY` | `BINARY`, `DISCRETE` (≤2 values) | Values must be a subset of `{0, 1}` / `{false, true}` | +| `OPEN_CATEGORICAL` | `OPEN_CATEGORICAL`, `CATEGORICAL`, `DISCRETE` (finite), `BINARY` | The most permissive categorical parent | + + +### Example + +Suppose an experiment declares the following required input domains: + +```yaml +# Experiment input domains (the maximum possible extent) +model_name: + values: [granite-3-8b, llama3-8b, mistral-7b-v0.1, granite-34b-code-base] + +batch_size: + domainRange: [1, 4097] + interval: 1 + +temperature: + domainRange: [0.0, 100.0] +``` + +A valid entity space could narrow each of these to a focused subdomain: + +```yaml +# Entity space domains (subdomains of the experiment inputs above) +model_name: + values: [granite-3-8b, llama3-8b] # CATEGORICAL ⊆ CATEGORICAL ✓ + +batch_size: + values: [1, 2, 4, 8, 16] # DISCRETE ⊆ DISCRETE ✓ + +temperature: + domainRange: [20.0, 40.0] # CONTINUOUS ⊆ CONTINUOUS ✓ +``` + +The following entity space domains would be **invalid** because they are not +subdomains of the corresponding experiment inputs: + +```yaml +batch_size: + # Values above 4096 are not in the Experiment input domain for batch_size + domainRange: [4096, 8124] + interval: 1028 + +model_name: + # granite-4-3b is not one of the allowed values + values: [granite-4-3b] +``` diff
--git a/website/docs/resources/discovery-spaces.md b/website/docs/resources/discovery-spaces.md index cb5d0b4cc..43de3d7c7 100644 --- a/website/docs/resources/discovery-spaces.md +++ b/website/docs/resources/discovery-spaces.md @@ -802,7 +802,7 @@ explains how to use optional properties. ## Parameterizing Experiments If an experiment has -[optional properties](../core-concepts/actuators.md#optional-properties) you can +[optional input properties](../core-concepts/actuators.md#optional-inputs) you can define equivalent properties in the entity space. If you don't, the default value for the property will be used. diff --git a/website/mkdocs.yml b/website/mkdocs.yml index c868823c8..bcd2157b0 100644 --- a/website/mkdocs.yml +++ b/website/mkdocs.yml @@ -157,8 +157,9 @@ nav: - Efficiently Exploring Parameter Spaces with TRIM: examples/trim.md - Core Concepts: - core-concepts/concepts.md + - Properties and Domains: core-concepts/properties-and-domains.md + - Experiments & Actuators: core-concepts/actuators.md - Entities and Entity Spaces: core-concepts/entity-spaces.md - - Actuators, Experiments & Measurement Spaces: core-concepts/actuators.md - Discovery Spaces: core-concepts/discovery-spaces.md - Shared Sample Stores: core-concepts/data-sharing.md - Resources: