[RFC]: Clean up workflow templates. #28

### Motivation

Some workflow templates include fields that either fail validation or have no runtime effect. Because templates are our main examples, these fields make unsupported behavior look official. This RFC proposes grouping those fields by treatment: removing no-op fields, documenting metadata-only fields, and deciding whether any should become a supported API.

### Proposed change

Group legacy fields by treatment instead of handling each field independently.

#### Remove no-op scheduler controls from templates

These fields are accepted by the schema but not consumed by the server scheduler. They should be removed from templates unless we explicitly decide to implement them.

| Field group | Active template uses | Proposed treatment | Reason |
|---|---:|---|---|
| `resources.replicas` | 44 | **Remove** | Worker matching uses `resources.hardware`; no runtime path fans out, scales, or duplicates tasks from `replicas`. |
| `spec.sloSeconds` | 3 | **Remove** | Accepted on inference/LoRA specs, but dispatcher and worker selection do not read it. |
| `spec.parallel` and `spec.parallel.max_shards` | 4 / 2 | **Remove** | No parser/runtime path expands this into shards; actual dataset sharding support is through `spec.shard`. |
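
The removal itself is mechanical, so it could be scripted rather than done by hand. The following is a minimal sketch that operates on an already-loaded template dictionary. The helper names are hypothetical, and the sketch assumes `resources` is nested under `spec` and that the fields sit at the top level of the workflow (fields nested inside `spec.stages[]` would need an extra loop).

```python
from copy import deepcopy
from typing import Any

# Dotted paths of the no-op scheduler controls listed in the table above.
# ASSUMPTION: `resources` lives under `spec`; adjust if templates place it elsewhere.
# Removing `spec.parallel` also removes `spec.parallel.max_shards`.
NOOP_PATHS = ("spec.resources.replicas", "spec.sloSeconds", "spec.parallel")


def drop_path(doc: dict[str, Any], dotted: str) -> bool:
    """Delete the key at `dotted` from `doc` in place; return True if it existed."""
    *parents, leaf = dotted.split(".")
    node: Any = doc
    for key in parents:
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    if isinstance(node, dict) and leaf in node:
        del node[leaf]
        return True
    return False


def strip_noop_fields(template: dict[str, Any]) -> tuple[dict[str, Any], list[str]]:
    """Return a cleaned copy of `template` plus the list of paths that were removed."""
    cleaned = deepcopy(template)
    removed = [path for path in NOOP_PATHS if drop_path(cleaned, path)]
    return cleaned, removed
```

Returning the removed paths makes it easy to log what changed per template and to confirm that `resources.hardware` and `spec.shard` are left untouched.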

#### Support legacy executor config keys with explicit mappings

These fields live inside executor-owned config dictionaries, so each one should either map to a real backend/FlowMesh behavior or be removed from templates. In the uv-managed environment, TRL 0.23.0 `PPOConfig` directly supports `report_to` and `project`, which can cover legacy logging keys; it does not directly support `target_kl`, `early_stopping`, `optimize_cuda_cache`, `padding_side`, or `generation.do_sample`.

| Field group | Active template uses | Proposed treatment | Reason |
|---|---:|---|---|
| `spec.agent.timeout` | 5 | **Support** | `AgentSpec` already accepts this field and templates already set it; `AgentExecutor` can replace hardcoded per-task execution timeouts with a validated value from `spec.agent.timeout`. |
| PPO `training.target_kl` and `training.early_stopping` | 3 each | **Support** | TRL support: false, but the behavior is important for real PPO stability. Implement as FlowMesh-owned early stopping based on observed KL; do not alias `target_kl` to `kl_coef`. |
| PPO `generation.do_sample` | 3 | **Remove** | TRL `PPOConfig` support: false. TRL `PPOTrainer.train()` hardcodes `do_sample=True`, so there is no direct config mapping. |
| PPO `training.padding_side` | 3 | **Support** | TRL support: false, but FlowMesh owns tokenizer setup. |
| PPO `training.optimize_cuda_cache` | 3 | **Remove** | TRL exact-name support: false. `torch_empty_cache_steps` exists in `PPOConfig`, but PPOTrainer's custom loop already calls `empty_cache()` directly and the boolean template field has no clear step-based contract. |
| PPO `training.log_with` and `training.tracker_project_name` | 3 each | **Support** | TRL exact-name support: false, but direct replacements exist: map `log_with` to `report_to` and `tracker_project_name` to `project`. |
| vLLM `model.source.revision` | 6 | **Support** | vLLM templates set the common source revision field, but `VLLMExecutor` only forwards `revision` and `tokenizer_revision` from `model.vllm`. Other executors do consume `model.source.revision`, so this is a vLLM-specific mismatch. |
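
Two of the mappings in the table are simple enough to sketch directly: renaming the legacy logging keys, and letting vLLM fall back to `model.source.revision`. The function names below are hypothetical, and the precedence rules (a new-style key wins over its legacy alias; `model.vllm.revision` wins over `model.source.revision`) are assumptions this RFC does not settle.

```python
from typing import Any, Optional

# Legacy template key -> replacement name, per the table above.
LEGACY_LOGGING_KEYS = {"log_with": "report_to", "tracker_project_name": "project"}


def map_legacy_logging_keys(training: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a PPO `training` dict with legacy logging keys renamed.

    ASSUMPTION: an explicitly set new-style key wins over its legacy alias.
    """
    mapped = {k: v for k, v in training.items() if k not in LEGACY_LOGGING_KEYS}
    for old, new in LEGACY_LOGGING_KEYS.items():
        if old in training:
            mapped.setdefault(new, training[old])
    return mapped


def resolve_vllm_revision(model: dict[str, Any]) -> Optional[str]:
    """Pick the revision for vLLM.

    ASSUMPTION: `model.vllm.revision` takes precedence, and
    `model.source.revision` is used only as a fallback.
    """
    vllm_cfg = model.get("vllm") or {}
    source_cfg = model.get("source") or {}
    return vllm_cfg.get("revision") or source_cfg.get("revision")
```

Keeping each mapping in a small pure function makes the contract explicit and lets it be unit-tested without loading TRL or vLLM.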

#### Fix schema/runtime mismatches

These cases are not just no-ops; they expose inconsistent contracts between template examples, schema, and executor code.

| Field group | Active template uses | Proposed treatment | Reason |
|---|---:|---|---|
| `spec.stages[].spec.model.adapters[].url` | 1 | **Support** | `templates/lora_then_inference.yaml` fails validation because adapter schema allows `type`, `path`, `name`, and `kwargs`, while `vllm_lora_executor` appears to support URL/task-based LoRA adapters. |
| `metadata.project` | 1 | **Remove** | `templates/agent_paper_collector.yaml` fails strict workflow metadata validation. |
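
For the adapter mismatch, one option is to extend the allowed key set with `url` and check that each adapter can be located. This is only a sketch of what such a check could look like. The function is hypothetical, and the rule that an adapter needs exactly one of `path` or `url` is an assumption, not something the current schema or `vllm_lora_executor` is known to enforce.

```python
from typing import Any

# Keys the adapter schema allows today (per the table above), plus the proposed `url`.
ALLOWED_ADAPTER_KEYS = {"type", "path", "name", "kwargs", "url"}


def validate_adapter(adapter: dict[str, Any]) -> list[str]:
    """Return a list of validation errors for one adapter entry (empty means valid)."""
    errors = [f"unknown key: {key}" for key in sorted(set(adapter) - ALLOWED_ADAPTER_KEYS)]
    # ASSUMPTION: an adapter is located by exactly one of `path` or `url`.
    has_path, has_url = "path" in adapter, "url" in adapter
    if has_path == has_url:
        errors.append("exactly one of `path` or `url` is required")
    return errors
```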

#### Document metadata-only fields

These fields are acceptable to keep, but docs should state clearly that they do not control scheduling or execution.

| Field group | Active template uses | Proposed treatment | Reason |
|---|---:|---|---|
| `metadata.owner` | 29 | **Keep** | Runtime owner comes from submit/auth context, not template metadata. |
| `metadata.annotations.description` | 34 | **Keep** | Useful for humans; no runtime behavior. |
| `apiVersion` and `kind` | 49 each | **Keep** | Required by envelope/schema shape, but not used for task dispatch semantics. |
| `model.source.type` | 35 | **Keep** | Reserved discriminator for future model-source backends such as Hugging Face, local paths, object storage, internal registries, or task artifacts; current executors do not route on it yet. |

#### Add guardrails

Add a template validation CI test that runs every `templates/*.yaml` file through `parse_workflow` using the uv-managed environment. 
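
A minimal sketch of the core of such a check is shown below. Because the import path and exact signature of `parse_workflow` are not stated in this RFC, the parser is passed in as a callable that receives a file path; the function name and that calling convention are assumptions.

```python
from pathlib import Path
from typing import Callable


def find_unparsable_templates(
    template_dir: Path, parse: Callable[[Path], object]
) -> dict[str, str]:
    """Run `parse` on every *.yaml file in `template_dir`; map file name -> error message."""
    failures: dict[str, str] = {}
    for path in sorted(template_dir.glob("*.yaml")):
        try:
            parse(path)
        except Exception as exc:  # collect every failure instead of stopping at the first
            failures[path.name] = f"{type(exc).__name__}: {exc}"
    return failures
```

A CI test could then assert that `find_unparsable_templates(Path("templates"), parse)` returns an empty dictionary, where `parse` is a thin wrapper that loads the file and calls `parse_workflow`. Collecting all failures in one run means a single CI failure reports every broken template.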

#### Implementation plan

We plan to do the cleanup via four PRs:
1. Add a CI test that verifies every template parses, and fix the two templates that currently fail to parse. #29
2. Move templates under `examples`. #32
3. Clean up dead configs that are easy to remove or support. Also, standardize on double quotes in all templates. #34
4. Support `training.target_kl` and `training.early_stopping` in the PPO executor. They change the training dynamics and merit a standalone PR. #37
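
To make the last item concrete, here is a minimal sketch of what FlowMesh-owned KL-based early stopping might look like. The class name, the `tolerance` multiplier, and its default value are assumptions for illustration; the sketch deliberately treats `target_kl` as a stopping threshold and does not alias it to `kl_coef`.

```python
from dataclasses import dataclass


@dataclass
class KLEarlyStopper:
    """Signal a stop when the observed policy KL drifts too far above `target_kl`."""

    target_kl: float
    enabled: bool = True    # mirrors `training.early_stopping`
    tolerance: float = 1.5  # ASSUMPTION: stop once KL exceeds tolerance * target_kl

    def __post_init__(self) -> None:
        if self.target_kl <= 0:
            raise ValueError("target_kl must be positive")

    def should_stop(self, observed_kl: float) -> bool:
        """Return True when the observed KL is above the configured threshold."""
        return self.enabled and observed_kl > self.tolerance * self.target_kl
```

The training loop would call `should_stop` with the KL it measures after each update and decide whether to skip the remaining optimization steps. Where that call sits in the loop, and what "stop" means (skip a batch versus end training), is left to the standalone PR.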

### Alternatives considered

_No response_

### Migration / compatibility

_No response_

### Feedback period

_No response_

### CC list

@kaiitunnz 

### Before submitting

- [x] I have searched existing issues and confirmed this is not a duplicate.

