Feature: Configuration - support wildcards and regex for array definitions

# Description, motivation and use case

The current configuration requires explicitly listing all element IDs in array or device definitions. This becomes fragile and hard to maintain when new elements are added or naming conventions evolve.

Two complementary improvements are proposed:

1. Pattern-based selection inside `elements` lists (wildcards, regular expressions and exclusions).
2. Python-based configuration macros, allowing scripted generation of configuration blocks directly from YAML.

This issue focuses on extending the configuration layer to support:

* Wildcards and regular expressions inside `elements`.
* A simple negation syntax for exclusions.
* A scripting mechanism (`elements_code`) allowing dynamic generation of configuration entries using embedded Python.

The goal is to improve scalability, maintainability and expressiveness of the configuration layer while preserving full backward compatibility.

---

# Part 1 — Pattern-based element selection

## Proposed solution

Keep the existing `elements` key unchanged and make the element string itself expressive.

Supported syntax:

| Syntax        | Meaning                                 |
| ------------- | --------------------------------------- |
| `pattern`     | inclusion using exact match or wildcard |
| `~pattern`    | exclusion using exact match or wildcard |
| `re:<regex>`  | inclusion using regular expression      |
| `~re:<regex>` | exclusion using regular expression      |

Examples:

```yaml
elements:
  - BPM_C*-*
  - ~BPM_C04-*
  - re:^BPM_C1[0-9]-[0-9]{2}$
  - ~re:^BPM_C10-.*
```

---

## Interpretation

Given the following example:

```yaml
elements:
  - BPM_C*-*
  - ~BPM_C04-*
  - re:^BPM_C1[0-9]-[0-9]{2}$
  - ~re:^BPM_C10-.*
```

The evaluation proceeds as follows:

1. Include all `BPM_C*-*`
2. Remove those from cell `C04`
3. Add elements matching the regex
4. Remove elements matching the exclusion regex
5. Sort the resulting list according to the reference ordering defined in the accelerator `devices` section

This ensures that array definitions remain stable and consistent with the canonical accelerator device ordering.

---

## Resolution rules

* `elements` remains a list of strings.
* Each entry is resolved independently.

Rule grammar:

```
rule := ["~"] ("re:" <regex> | <glob>)
```
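As an illustration of the grammar, a single entry could be parsed as sketched below. The `Rule` type and `parse_rule` function are hypothetical names, not existing pyAML API; the glob/exact distinction follows the "Matching rules" section.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """Hypothetical parsed form of one `elements` entry."""
    negated: bool  # True when the entry starts with "~"
    kind: str      # "regex", "glob" or "exact"
    pattern: str   # the entry with "~" and "re:" stripped


def parse_rule(entry: str) -> Rule:
    """Split one `elements` entry into (negated, kind, pattern)."""
    negated = entry.startswith("~")
    body = entry[1:] if negated else entry
    if body.startswith("re:"):
        return Rule(negated, "regex", body[len("re:"):])
    if "*" in body or "?" in body:
        return Rule(negated, "glob", body)
    return Rule(negated, "exact", body)


print(parse_rule("~re:^BPM_C10-.*"))
# Rule(negated=True, kind='regex', pattern='^BPM_C10-.*')
```

One consequence of this grammar is that an element ID starting with `~` or `re:` cannot be written literally; this sketch assumes no such IDs exist.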

Resolution semantics:

* Rules are evaluated **sequentially**.
* Positive rules **add** matching elements.
* Negative rules **remove** matching elements.

Matching operations always use the **reference ordering defined in the `devices` section of the accelerator configuration**.

The resulting list:

* preserves the reference ordering of matched elements
* removes duplicates deterministically

Example:

```yaml
elements:
  - BPM_C*
  - ~BPM_C04*
```

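Here all `BPM_C*` elements are selected and those of cell `C04` are then removed. Because rules are applied in sequence, a later positive rule (for example an exact ID from cell `C04`) would selectively reinsert elements, and the result would still follow the accelerator reference ordering.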
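The sequential add/remove semantics can be sketched in a few lines of Python. This is a minimal sketch, not the proposed implementation: `resolve_elements` is a hypothetical name, the reference list is assumed to hold unique IDs, regexes are applied with `re.search` (so anchoring is left to the pattern, as in the examples above), and `fnmatchcase` is assumed acceptable for both globs and exact IDs.

```python
import re
from fnmatch import fnmatchcase


def resolve_elements(rules: list[str], reference: list[str]) -> list[str]:
    """Apply the rules in order, then return the selection in reference order."""
    selected: set[str] = set()
    for rule in rules:
        negated = rule.startswith("~")
        body = rule[1:] if negated else rule
        if body.startswith("re:"):
            regex = re.compile(body[len("re:"):])
            matched = {name for name in reference if regex.search(name)}
        else:
            # fnmatchcase also covers exact IDs (assuming they contain no "[").
            matched = {name for name in reference if fnmatchcase(name, body)}
        if negated:
            selected -= matched
        else:
            selected |= matched
    # Emitting in reference order also removes duplicates deterministically.
    return [name for name in reference if name in selected]


reference = ["BPM_C03-01", "BPM_C04-01", "BPM_C04-02", "BPM_C05-01"]
print(resolve_elements(["BPM_C*", "~BPM_C04*", "BPM_C04-02"], reference))
# ['BPM_C03-01', 'BPM_C04-02', 'BPM_C05-01']
```

The third rule in the usage line reinserts one element of cell `C04` after the exclusion, and the output still follows the order of `reference` rather than the order of the rules.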

---

## Matching rules

Pattern interpretation:

* `re:` prefix → regular expression
* string containing `*` or `?` → wildcard (glob)
* otherwise → exact element ID

Resolution must:

* Use the **device order defined in the accelerator `devices` section** as the canonical ordering.
* Be deterministic.
* Clearly define the behavior for 0 matches: either an error or an empty contribution, specified once and applied consistently.
* Remove duplicates deterministically.

Full backward compatibility shall be preserved.
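The two candidate behaviors for a rule with zero matches could look as follows. This is only a sketch to make the choice concrete: `match_glob` and its `on_empty` parameter are hypothetical, and the proposal does not yet say which policy should be the default.

```python
from fnmatch import fnmatchcase


def match_glob(pattern: str, reference: list[str], on_empty: str = "error") -> list[str]:
    """Return the reference entries matching a glob, applying a zero-match policy."""
    matched = [name for name in reference if fnmatchcase(name, pattern)]
    if not matched and on_empty == "error":
        raise ValueError(f"pattern {pattern!r} matches no element in 'devices'")
    # Under on_empty="ignore", an empty list simply contributes nothing.
    return matched
```

Raising an error catches typos in patterns early, while the empty contribution is more forgiving when a configuration is shared between lattices that do not contain the same devices.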

---

# Part 2 — Python-based configuration macros

## Motivation

While wildcards and regex improve selection, some configurations require:

* Nested loops
* Multiple parameterized ranges
* Systematic naming patterns
* Complex device definitions
* Computed attribute values

For such cases, pattern matching is insufficient. We therefore propose a controlled macro mechanism using embedded Python code.

This was discussed as an additional proposal.

---

## Proposed solution: `elements_code`

Introduce a new optional key:

```yaml
elements_code: |
    <python code>
```

The embedded Python code:

* Must return either:

  * A `dict` (single configuration entry), or
  * A `list[dict]` (multiple entries).
* If a list is returned, it is expanded into the surrounding list.
* The macro block is replaced by the returned structure before normal parsing continues.

This mechanism follows the same philosophy as existing file-expansion logic: detection → execution → replacement → continue parsing.

---

## Example — BPM generation

```yaml
devices:
  - elements_code: |
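      # Entries collected here are handed back to the loader by the final `return`.
      out: list[dict] = []
      # (start, stop) cell ranges; stop is exclusive, so cells 04..32, then 01..03.
      ranges = [(4, 33), (1, 4)]
      for start, stop in ranges:
          for cell in range(start, stop):
              for elem in range(1, 11):  # elements 01..10 in each cell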
                  bpm = {
                      "type": "pyaml.bpm.bpm",
                      "name": f"BPM_C{cell:02d}-{elem:02d}",
                      "model": {
                          "type": "pyaml.bpm.bpm_simple_model",
                          "x_pos": {
                              "type": "tango.pyaml.attribute_read_only",
                              "attribute": f"srdiag/bpm/c{cell:02d}-{elem:02d}/SA_HPosition",
                              "unit": "m",
                          },
                          "y_pos": {
                              "type": "tango.pyaml.attribute_read_only",
                              "attribute": f"srdiag/bpm/c{cell:02d}-{elem:02d}/SA_VPosition",
                              "unit": "m",
                          },
                      },
                  }
                  out.append(bpm)
      return out
```

The resulting list is injected into the configuration tree and parsed normally.
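As a quick check of the loop bounds in this example, the naming part of the macro can be run on its own in plain Python. Only the name generation is reproduced here; the model dictionaries are left out.

```python
# Reproduce only the naming loops of the macro to check their bounds.
ranges = [(4, 33), (1, 4)]  # range() excludes the upper bound
names = [
    f"BPM_C{cell:02d}-{elem:02d}"
    for start, stop in ranges
    for cell in range(start, stop)
    for elem in range(1, 11)
]
print(len(names), names[0], names[-1])
# 320 BPM_C04-01 BPM_C03-10
```

The macro therefore yields 320 entries (32 cells of 10 BPMs), starting at cell `C04` and ending with cell `C03`, in the order in which they were appended.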

---

# Execution semantics

The macro mechanism must:

* Execute before object construction.
* Run in a controlled execution environment.
* Provide:

  * A minimal safe namespace.
  * Optional helper utilities (e.g. `range`, `math`).
* Enforce that the return type is either `dict` or `list[dict]`.
* Raise a clear configuration error otherwise.
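One possible shape for these requirements is sketched below. Everything named here (`run_elements_code`, `SAFE_BUILTINS`, `ConfigError`) is hypothetical, and the allow-list is only an example. The sketch wraps the macro body in a function so that a top-level `return` is valid, as in the BPM example. Note that replacing `__builtins__` limits accidental access to names such as `open`, but it is not a security sandbox against hostile code.

```python
import math
import textwrap

# Hypothetical allow-list; the real set of helpers is still to be decided.
SAFE_BUILTINS = {"range": range, "len": len, "list": list, "dict": dict,
                 "str": str, "int": int, "float": float, "enumerate": enumerate}


class ConfigError(Exception):
    """Hypothetical error type for invalid macro output."""


def run_elements_code(source: str) -> list[dict]:
    """Run a macro body and return its result as a list of dicts."""
    # Wrap the body in a function so that a top-level `return` is valid.
    wrapped = "def __macro__():\n" + textwrap.indent(textwrap.dedent(source), "    ")
    namespace = {"__builtins__": SAFE_BUILTINS, "math": math}
    exec(wrapped, namespace)
    result = namespace["__macro__"]()
    if isinstance(result, dict):
        return [result]  # normalize a single entry to a one-element list
    if isinstance(result, list) and all(isinstance(item, dict) for item in result):
        return result
    raise ConfigError(
        f"elements_code must return dict or list[dict], got {type(result).__name__}"
    )


print(run_elements_code("return [{'name': f'BPM_{i:02d}'} for i in range(1, 3)]"))
# [{'name': 'BPM_01'}, {'name': 'BPM_02'}]
```

Normalizing a single `dict` to a one-element list keeps the later replacement step uniform: the caller always splices a list into the surrounding structure.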

The parsing process becomes:

1. Load YAML.
2. Detect `elements_code`.
3. Execute code.
4. Replace macro block with returned structure.
5. Continue parsing recursively.
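Steps 2 to 5 amount to a recursive walk over the loaded YAML tree. The sketch below assumes that a macro block is a list item whose only key is `elements_code` (as in the BPM example) and that the executor, passed in as `run`, always returns a list of dicts. `expand_macros` and the stand-in runner are hypothetical names used for illustration.

```python
from typing import Any, Callable


def expand_macros(node: Any, run: Callable[[str], list[dict]]) -> Any:
    """Return a copy of `node` with every `elements_code` block expanded."""
    if isinstance(node, list):
        expanded = []
        for item in node:
            if isinstance(item, dict) and set(item) == {"elements_code"}:
                # Splice the generated entries into the surrounding list,
                # then keep expanding in case they contain macros themselves.
                expanded.extend(expand_macros(run(item["elements_code"]), run))
            else:
                expanded.append(expand_macros(item, run))
        return expanded
    if isinstance(node, dict):
        return {key: expand_macros(value, run) for key, value in node.items()}
    return node


def fake_run(code: str) -> list[dict]:
    """Stand-in executor for the demo: one entry per whitespace-separated word."""
    return [{"name": word} for word in code.split()]


tree = {"devices": [{"name": "A"}, {"elements_code": "B C"}]}
print(expand_macros(tree, fake_run))
# {'devices': [{'name': 'A'}, {'name': 'B'}, {'name': 'C'}]}
```

Passing the executor as a parameter keeps the tree walk independent of how the code is actually run, which also makes the expansion step easy to test in isolation.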

---

# Determinism and reproducibility

To preserve reproducibility:

* Execution must be deterministic.
* No implicit access to external state unless explicitly allowed.
* No hidden side effects.
* Ordering of generated elements must be preserved as returned.

---

# Backward compatibility

* Existing YAML files remain valid.
* `elements_code` is optional.
* No change to current schema validation for standard entries.
* Pattern-based `elements` resolution remains independent.

---

# Considered alternatives

### Separate external Python generation script

Instead of embedding Python inside YAML, users could generate YAML files using standalone Python scripts.

**Pros:**

* Clear separation of code and configuration.

**Cons:**

* Breaks the self-contained configuration principle.
* Harder to track provenance.
* More complex CI workflows.

The embedded macro approach keeps configuration declarative while allowing structured generation when necessary.

External generation will always remain possible, independently of the macro mechanism proposed here: nothing in this proposal prevents users from generating YAML files with Python (or any other language) before loading them into pyAML.

---

### Dedicated DSL instead of Python

A domain-specific language could replace Python.

**Pros:**

* Safer, constrained syntax.

**Cons:**

* Reinvents control structures.
* Higher implementation complexity.
* Lower expressiveness.

Given that Python is already the core language of the project, leveraging it is consistent with the project philosophy.

---

# Checklist

* [x] I've assigned this issue to a project
* [x] I've @-mentioned relevant people

