validators.py: make the mechanical floor language-pluggable (today: hard-coded ruff + pytest)

## Why

`tilth/validators.py` currently hard-codes the mechanical floor as **ruff + pytest** — Python-specific tools running against a Python-specific workspace. The harness itself isn't language-coupled (Brain/Hands/Session is generic; tools are generic), but the *floor* is.

This shows up in two places:

1. The v1 worker–evaluator dialogue sketch ([`proposals/v1-worker-evaluator-dialogue.md`](../proposals/v1-worker-evaluator-dialogue.md)) leans on *"the mechanical floor stays the anchor — the evaluator's prose-judgment sits on top of it, never replaces it."* That principle is sound, but as written it implicitly bakes in Python. Non-Python workspaces have nowhere to plug their own static + test commands.
2. The v2 *custom mechanical checks* direction (Tilth's analog of OpenAI's custom-lints-with-remediation pattern) wants a clean place to add new validators. Today there's no abstraction to extend — every new check requires editing `validators.py` and `run_all()` directly.

## What pluggable should mean

A per-workspace validator config, discoverable from the workspace itself (likely a section in `AGENTS.md`, or a sibling like `.tilth/validators.yaml`), that names a list of `(name, command, success_predicate)` entries. `validators.py` becomes a runner over that list; the Python demo's config happens to be `{ruff: "ruff check .", pytest: "pytest -q"}`.
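To make the shape concrete, here is a minimal sketch of what such a runner could look like. Everything named here (`Validator`, `ValidatorResult`, `run_validators`, `exit_code_zero`, `PYTHON_DEMO`) is hypothetical and not taken from the current `tilth/validators.py`. It also assumes the success predicate is a callable over the finished process, which is only one possible reading of `success_predicate`.

```python
import shlex
import subprocess
from dataclasses import dataclass
from typing import Callable, List


def exit_code_zero(proc: subprocess.CompletedProcess) -> bool:
    """Default success predicate: the command exited with status 0."""
    return proc.returncode == 0


@dataclass(frozen=True)
class Validator:
    # Hypothetical type mirroring one (name, command, success_predicate) entry.
    name: str
    command: str
    success_predicate: Callable[[subprocess.CompletedProcess], bool] = exit_code_zero


@dataclass(frozen=True)
class ValidatorResult:
    name: str
    passed: bool
    output: str


def run_validators(validators: List[Validator], cwd: str = ".") -> List[ValidatorResult]:
    """Run each configured command in the workspace and apply its predicate."""
    results = []
    for v in validators:
        try:
            proc = subprocess.run(
                shlex.split(v.command), cwd=cwd, capture_output=True, text=True
            )
        except FileNotFoundError as exc:
            # A missing tool counts as a failed check rather than a crash.
            results.append(ValidatorResult(v.name, False, str(exc)))
            continue
        results.append(
            ValidatorResult(v.name, v.success_predicate(proc), proc.stdout + proc.stderr)
        )
    return results


# The Python demo's config, expressed as data instead of hard-coded calls.
PYTHON_DEMO = [
    Validator("ruff", "ruff check ."),
    Validator("pytest", "pytest -q"),
]
```

A non-Python workspace would supply its own list, and a workspace with unusual pass criteria could pass a custom predicate, for example one that also accepts pytest's exit code 5 (no tests collected).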

Open design questions worth working through in this issue:

- Where does the config live — `AGENTS.md` section, sibling file, or `pyproject.toml`-style equivalent?
- How does per-task test filtering (`test_t<NNN>_*.py` glob today) generalize across languages?
- Does the seeder need to know about the validator config to author tests that the floor will actually run?
- Does the prep-feature interview need a step that asks the user about their workspace's validators, or do we detect from the project shape?
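As one possible answer to the per-task filtering question, the config could carry a command template with a task placeholder instead of a Python-specific glob. The sketch below is an assumption, not a proposal from the existing code: `render_task_command` and the `${task_id}` placeholder are hypothetical, and it assumes `<NNN>` means a three-digit, zero-padded task number.

```python
from string import Template


def render_task_command(template: str, task_id: int) -> str:
    """Fill a per-task command template with a zero-padded task number."""
    # Assumption: <NNN> in today's glob means a three-digit, zero-padded id.
    return Template(template).substitute(task_id=f"{task_id:03d}")


# Illustrative templates only; the filter syntax belongs to each tool.
PYTHON_FILTER = "pytest -q -k t${task_id}"
GO_FILTER = "go test ./... -run T${task_id}"
```

With this shape the runner stays language-agnostic: it only substitutes the task number and leaves test selection to each tool's own filter flag.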

## Scope

This issue covers the **design + implementation** of the pluggable interface. It's a **v1 prerequisite** if Tilth is to support non-Python workspaces — and even for Python-only use, it cleans up a load-bearing coupling.

## Related

- [`proposals/v1-worker-evaluator-dialogue.md`](../proposals/v1-worker-evaluator-dialogue.md) — the v1 sketch that depends on this
- [`tilth/validators.py`](../tilth/validators.py) — current hard-coded surface
- v2 custom-mechanical-lints direction (OpenAI harness piece)
