feat(rules): agent rules layer with CLI-first directive by deanq · Pull Request #341 · runpod/flash

deanq · 2026-05-25T03:21:34Z

Summary

Minimal agent-rules layer: flash init writes AGENTS.md at project root and creates CLAUDE.md as a symlink to it. That's it.

Why

Agentic AI coding tools (Claude Code, Cursor, Copilot, Codex, Aider, etc.) working on Flash projects routinely bypass the flash CLI and generate raw Runpod REST or GraphQL calls to build, deploy, list, or scale endpoints. Those calls skip auth, config hashing, drift detection, manifest generation, and image selection.

This PR ships a sidecar AGENTS.md whose first section is a "Use the Flash CLI — Do Not Call Runpod REST or GraphQL Directly" directive with an intent → command table. The file is written by flash init for new projects; existing projects run a one-liner documented in the README.

No agentic tool scans site-packages or subdirectories for instruction files — they read named files at project root. Prior art (Prisma, Husky, shadcn, ESLint) never writes into user-owned root files; they prompt or use namespaced directories. This PR follows that pattern: AGENTS.md is Flash-owned (skipped if user already has one), CLAUDE.md is created as a symlink so Claude Code picks up the same content without duplication.

Behavior

flash init writes ./AGENTS.md if absent; symlinks ./CLAUDE.md → AGENTS.md if absent
Existing AGENTS.md / CLAUDE.md are never touched
Broken CLAUDE.md symlink is detected and a warning is logged
Symlink failure (e.g. Windows without dev mode) is logged and skipped; AGENTS.md still installed
Missing AGENTS.md in the wheel raises a clear FileNotFoundError with an actionable message
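Opt-out: delete AGENTS.md. Flash subcommands other than init will not re-create it; a second flash init or an explicit install_agent_files call writes it again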
No flash rules command, no --no-rules flag, no marker injection, no dynamic context, no .flash/context.md
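
The behavior listed above can be sketched roughly as follows. This is a minimal illustration, not the code shipped in runpod_flash.rules: the function name install_agent_files_sketch, the placeholder rules text, the logger name, and the return value (a list of created file names) are all assumptions made for the example.

```python
import logging
import os
import tempfile
from pathlib import Path

log = logging.getLogger("agent_rules_sketch")  # hypothetical logger name

# Placeholder text; the real packaged AGENTS.md is not reproduced here.
PLACEHOLDER_RULES = "# Flash agent rules (placeholder)\n"


def install_agent_files_sketch(project_dir: Path, rules_text: str = PLACEHOLDER_RULES) -> list:
    """Write AGENTS.md if absent, then link CLAUDE.md to it if absent.

    Returns the names of the files this call actually created.
    """
    created = []
    agents = project_dir / "AGENTS.md"
    claude = project_dir / "CLAUDE.md"

    if agents.is_symlink() and not agents.exists():
        # Broken symlink: writing through it would create its target, so skip.
        log.warning("AGENTS.md is a broken symlink; leaving it alone")
    elif not agents.exists():
        agents.write_text(rules_text, encoding="utf-8")
        created.append("AGENTS.md")

    if claude.is_symlink() and not claude.exists():
        log.warning("CLAUDE.md is a broken symlink; leaving it alone")
    elif not claude.exists() and agents.exists():
        try:
            # Relative target on purpose: the link stays valid if the
            # project directory is moved as a whole.
            os.symlink("AGENTS.md", claude)
            created.append("CLAUDE.md")
        except OSError as exc:
            # For example, Windows without developer mode.
            log.warning("could not create CLAUDE.md symlink: %s", exc)
    return created


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        print(install_agent_files_sketch(root))  # first run creates both files
        print(install_agent_files_sketch(root))  # second run creates nothing
```

Returning the list of created files is one way a caller such as the init success panel could avoid listing files that were skipped; whether the real function does this is not stated in the description.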

AGENTS.md content

Four endpoint patterns documented:

A: Queue-based function endpoint
B: Load-balanced routes
C: Class-based stateful worker
D: Pre-built container image (BYOI) — e.g. Endpoint(name="vllm", image="runpod/worker-v1-vllm:v2.18.1", ...) — for workloads that already serve HTTP and don't need a Python handler

Plus a "Rules That Break If Violated" list, a "Common Agent Mistakes" table, and the CLI-first directive at the top.

Existing-project install

python -c "from runpod_flash.rules import install_agent_files; from pathlib import Path; install_agent_files(Path.cwd())"

Diff scope

10 files net change vs main:

Added: src/runpod_flash/rules/{__init__.py,AGENTS.md}, tests/unit/rules/test_install.py
Modified: src/runpod_flash/cli/commands/init.py, README.md, pyproject.toml, .gitignore, skeleton .gitignore
Test mods: tests/unit/cli/commands/test_init.py

Test plan

make quality-check passes (2702 tests, 85.4%+ coverage)
Unit tests cover: write-when-absent, no-overwrite, symlink creation, no-replace-existing CLAUDE.md, symlink failure path, broken-symlink warning, missing-asset error, idempotency
Manual: flash init demo in an empty dir creates AGENTS.md + CLAUDE.md symlink
Manual: flash init demo in a dir with pre-existing AGENTS.md leaves it untouched
Manual: Claude Code session in the demo project picks up the rules via CLAUDE.md → AGENTS.md

Introduces the rules package with flash-rules.md, a static markdown file teaching AI coding agents Flash-specific conventions (import placement, decorator usage, dependency declaration, CLI commands, and common mistakes).
- 457 words (~594 tokens), well under the 3,000-token budget
- Covers the three core patterns: queue-based, load-balanced, class-based
- Includes CLI cheatsheet and common agent mistake table

- Add rules/**/* to package-data so flash-rules.md is included in wheel
- Implement read_static_rules() via importlib.resources for portable access
- Implement inject_rules_section() with full marker lifecycle: replace, append, orphan-start, multiple-pairs, FLASH:DISABLE skip
- Implement generate_agent_files() for CLAUDE.md, .cursorrules, AGENTS.md, .github/copilot-instructions.md, and .flash/context.md placeholder
- 18 new tests across TestRulesPackaging, TestReadStaticRules, TestInjectRulesSection, TestGenerateAgentFiles (all pass)
- engine.py at 100% coverage; total suite 85.53% (2396 passed)

…, fix tests

Commit 5d85bee reverted ~135 lines of main's run.py while adding the rules engine hook: RuntimeScanner renamed back to RemoteDecoratorScanner, tuple return dropped, _discover_resources reverted to the removed core.discovery.ResourceDiscovery path, and hot-reload sys.modules purge removed. This restores run.py to origin/main and re-adds only the update_dynamic_context() call. Also switches two init tests to CliRunner.invoke() so Typer resolves defaults instead of leaking ArgumentInfo sentinels.

…EADME Adds a prominent "Use the Flash CLI — Do Not Call Runpod REST or GraphQL Directly" section to flash-rules.md with an intent → command table covering init/run/build/deploy/preview/undeploy/app/env/rules. Restricts raw runpod.Endpoint(...) use to invoking already-deployed endpoints. Rewrites the README "Coding agent integration" section to surface the files that flash init now writes (CLAUDE.md, AGENTS.md, .cursorrules, .github/copilot-instructions.md, .flash/context.md), the FLASH:START/END markers, `flash rules` regeneration, the per-file FLASH:DISABLE opt-out, and the repo-wide --no-rules opt-out. Closes AE-3155.

promptless · 2026-05-25T03:26:46Z

Promptless prepared a documentation update related to this change.

Triggered by runpod/flash#341

Added documentation for the new AI coding agent integration feature. Created a new flash rules CLI reference page, updated the flash init docs with the --no-rules flag and generated agent files, added the command to the CLI overview, and updated the quickstart with a tip about built-in agent integration.

Review: Flash AI coding agent integration

Renames the static rules file to AGENTS.md (the emerging cross-tool convention). Moves the CLI-first directive to the top of the file so agents read it before any other content. Cuts the GPU/CPU reference tables (out-of-date risk) and the dynamic-context placeholder section.

Single public function writes AGENTS.md if absent and creates a relative CLAUDE.md symlink. Idempotent. Symlink failure on platforms without support is logged and skipped (AGENTS.md still installed).

…wheel asset Code review feedback. Broken symlink case now emits a warning instead of silently skipping. Missing AGENTS.md in package data now raises a FileNotFoundError with an actionable message instead of a bare traceback.

Deletes engine.py, context.py, and the flash rules CLI command. Drops update_dynamic_context hooks from flash run and flash build. Rewires flash init to call install_agent_files, removes the --no-rules flag and the now-unused _get_version helper. Updates test_init.py to match the new behavior. Tasks 4-7 of the minimal agent-rules plan committed as one unit to keep quality-check green throughout: the engine importers in the 4 CLI files are all cleaned up before the commit lands.

Reflects the new behavior: flash init writes AGENTS.md + CLAUDE.md symlink, never touches existing files, no flash rules command, no opt-out flag (delete the file).

The dynamic-context engine was removed in this branch. The .flash/ line already covers everything under that directory; the specific entry plus its 'regenerated on each command' comment are now misleading.

The CLI command is 'flash dev' — 'flash run' does not exist. Was a copy-paste from earlier docs that referenced the older name.

Documents Endpoint(name=..., image=...) for workloads that already serve HTTP — vLLM, TGI, ComfyUI, etc. — and the id= attach mode for connecting to already-deployed endpoints without re-provisioning.

The Runpod-published vLLM worker is the canonical image for serving LLMs on Runpod Serverless. Adjusts the QB call shape to match the worker's input schema (input.prompt, not top-level prompt).

…nore .runpod/ was the pre-rename state directory; nothing in current Flash writes there. The migration in resource_manager.py handles old projects that still have .runpod/resources.pkl. dist/ was duplicated — it's already in the Python section at the top of the file.

Was unintentionally dropped during the Task 8 section rewrite. The skill bundle is complementary to AGENTS.md — static rules cover the CLI-first directive; the skill bundle adds richer Claude Code behavior that goes beyond a markdown file. Both belong in the README.

Copilot

Pull request overview

Adds a minimal “agent rules” layer to Flash: flash init now installs a packaged AGENTS.md into the new project root and (best-effort) creates CLAUDE.md as a symlink to AGENTS.md, with unit tests and packaging updates to support distributing the asset in the wheel.

Changes:

Introduces runpod_flash.rules.install_agent_files() and ships AGENTS.md as package data.
Wires flash init to call install_agent_files() and documents the behavior in README.md.
Adds unit tests covering install behavior and edge cases; updates gitignore defaults and packaging config.

Reviewed changes

Copilot reviewed 8 out of 10 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| `src/runpod_flash/rules/__init__.py` | New installer for `AGENTS.md` + best-effort `CLAUDE.md` symlink. |
| `src/runpod_flash/rules/AGENTS.md` | Packaged rules content intended for AI coding tools. |
| `src/runpod_flash/cli/commands/init.py` | Calls `install_agent_files()` during `flash init` and updates displayed project tree. |
| `tests/unit/rules/test_install.py` | New unit tests for agent file installation behavior. |
| `tests/unit/cli/commands/test_init.py` | Ensures `flash init` invokes `install_agent_files()`. |
| `src/runpod_flash/cli/utils/skeleton_template/.gitignore` | Updates ignore defaults in generated projects. |
| `README.md` | Documents coding agent integration and installation snippet for existing projects. |
| `pyproject.toml` | Ensures `rules/*/` is included in wheel package data. |
| `.gitignore` | Adds `.runpod/` to ignored Flash state paths at repo root. |

Three findings from Copilot review on #341:
- install_agent_files: AGENTS.md broken-symlink case now logs a warning instead of write_text() following the symlink and overwriting its target. Symmetric to the existing CLAUDE.md guard.
- test_broken_claude_md_symlink_logs_warning: skipif on Windows (mirrors test_creates_claude_md_symlink_when_absent).
- flash init success panel: only list AGENTS.md / CLAUDE.md when install_agent_files actually created them. Previously listed unconditionally even when symlink creation failed silently.

Added test_broken_agents_md_symlink_logs_warning_and_skips_write for the new AGENTS.md guard.

Two findings from Copilot review on #341:
- AGENTS.md: 'Three Patterns' heading was stale after Pattern D (BYOI/container image) landed. Renamed to 'Endpoint Patterns' (no count, so it survives future additions).
- README opt-out section: previous 'Flash will not re-create it' was too absolute since a second 'flash init' would write the file again. Now scopes the guarantee to flash subcommands other than init / explicit install_agent_files calls.

runpod-Henrik

Henrik's AI-Powered Bug Finder — PR #341 Review

Verdict: NEEDS WORK

The mechanism (install AGENTS.md + CLAUDE.md symlink, no-overwrite, broken-symlink warning, idempotent) is correct and well-tested at the unit level. CI green on 3.10/3.11/3.12.

The blocker is in the content of AGENTS.md: it ships as packaged data, gets dropped into every new project via flash init, and becomes the canonical rule source for every AI agent touching a Flash project. Two of the rules contradict the codebase.

1. Bug: `AGENTS.md` claim about `workers=N` is wrong

User scenario: an agent reads AGENTS.md (line 109 in the bundled file), sees "workers=N for fixed count, workers=(min, max) for auto-scaling range," and writes Endpoint(name="x", workers=3) for a user who wanted exactly 3 warm workers. They get an endpoint that scales (0, 3) — cold-starting on every burst.

Ground truth from the code, endpoint.py:165-176 (_normalize_workers):

- int: shorthand for (DEFAULT_WORKERS_MIN, n)   # = (0, n)
- (min, max): explicit tuple
- None: defaults to (DEFAULT_WORKERS_MIN, DEFAULT_WORKERS_MAX)

And the SDK reference (docs/Flash_SDK_Reference.md:46) confirms: "workers: N = (0, N). (min, max) = explicit range."

So workers=N is not "fixed count" — it's "auto-scale from 0 to N." The actual fixed-count idiom is workers=(N, N).

Fix: replace the bullet in AGENTS.md with something like:

- workers=N is shorthand for (0, N) — auto-scale 0 → N
- workers=(min, max) for explicit range; use (N, N) to pin a fixed count
- workers=(1, N) keeps at least one warm worker (avoids cold starts)
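
To make the three cases concrete, here is a small sketch of the normalization described above. It is not the code in endpoint.py: the function name normalize_workers_sketch is hypothetical, and DEFAULT_WORKERS_MAX is set to an arbitrary placeholder because its real value is not given in this review. Only DEFAULT_WORKERS_MIN = 0 comes from the text.

```python
from typing import Optional, Tuple, Union

DEFAULT_WORKERS_MIN = 0  # from the review: workers=N means (0, N)
DEFAULT_WORKERS_MAX = 3  # placeholder assumption; the real default is not stated here

WorkersArg = Optional[Union[int, Tuple[int, int]]]


def normalize_workers_sketch(workers: WorkersArg) -> Tuple[int, int]:
    """Return an explicit (min, max) pair for any accepted workers value."""
    if workers is None:
        return (DEFAULT_WORKERS_MIN, DEFAULT_WORKERS_MAX)
    if isinstance(workers, int):
        # An int is only an upper bound; the lower bound stays at the default.
        return (DEFAULT_WORKERS_MIN, workers)
    lo, hi = workers
    return (lo, hi)


print(normalize_workers_sketch(3))       # (0, 3): scales from zero, not a fixed count
print(normalize_workers_sketch((3, 3)))  # (3, 3): pinned at three workers
print(normalize_workers_sketch((1, 3)))  # (1, 3): at least one warm worker
```

The first call is the case the review flags: an agent that reads workers=3 as "exactly three workers" gets a range whose minimum is zero.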

This is the highest-stakes content error — every Flash project initialized after this lands gets this misleading rule, and agents are explicitly told to trust it.

2. Question: Pattern D's "Flash deploys the image" vs. `is_client` semantics

AGENTS.md Pattern D says "Flash deploys the image and gives you HTTP + queue access to it" for Endpoint(name="vllm", image="runpod/worker-v1-vllm:v2.18.1", ...). But the SDK reference (docs/Flash_SDK_Reference.md:200,254) describes image= as a client mode, and is_client returns True when image= is set (endpoint.py:440).

If Endpoint(image=...) actually does provision a fresh endpoint when invoked (as PR #339's description implies for live provisioning), then AGENTS.md is right and the SDK reference is misleading. If image= is purely a client attaching to a pre-deployed endpoint, AGENTS.md is wrong and would have agents writing code that silently no-ops.

Either way, the two source-of-truth documents tell different stories about the same parameter. Worth nailing down which is correct and aligning both before this rule file ships to every new project.

3. Issue: a user with a real `CLAUDE.md` (not symlink) becomes orphaned from rules

User scenario: user has an existing CLAUDE.md (custom Claude Code instructions) and runs flash init (or the existing-project one-liner). The code's not claude.exists() check correctly skips replacement — but the user is now silently in a state where:

AGENTS.md exists and gets used by Cursor, Codex, Aider, Amp.
CLAUDE.md is the user's custom file, NOT linked to AGENTS.md.
Claude Code reads the user's CLAUDE.md and never sees the Flash rules.

The user is unlikely to realise this until they file a "why does Claude break my Flash project but Cursor doesn't" report.

Either:

Log an info line when CLAUDE.md exists as a real file (not symlink) suggesting the user cat AGENTS.md >> CLAUDE.md or symlink manually.
Or document this case explicitly in the README's "Coding agent integration" section.

The current README only addresses "AGENTS.md already exists" and "broken symlink" cases.
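
One way to picture the missing case is a small classifier for the state of CLAUDE.md, sketched below. The function name claude_md_state, the state labels, and the hint wording are all hypothetical; the sketch only shows that a real file can be told apart from a working or broken symlink with standard pathlib calls, so an installer could log a hint in the real-file case.

```python
import os
import tempfile
from pathlib import Path


def claude_md_state(project_dir: Path) -> str:
    """Classify CLAUDE.md as 'missing', 'symlink', 'broken-symlink', or 'real-file'."""
    claude = project_dir / "CLAUDE.md"
    if claude.is_symlink():
        # exists() follows the link, so it is False when the target is gone.
        return "symlink" if claude.exists() else "broken-symlink"
    if claude.exists():
        return "real-file"
    return "missing"


def orphan_hint(project_dir: Path) -> str:
    """Return a hint for the real-file case, or an empty string otherwise."""
    if claude_md_state(project_dir) == "real-file":
        return (
            "CLAUDE.md is a regular file, so Claude Code will not see AGENTS.md. "
            "Consider appending AGENTS.md to it or linking it manually."
        )
    return ""


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "CLAUDE.md").write_text("custom instructions\n", encoding="utf-8")
        print(claude_md_state(root))
        print(orphan_hint(root))
```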

4. Issue: no opt-out at `flash init` time

The PR description is explicit: "No flash rules command, no --no-rules flag, no marker injection." Users in restrictive environments (corporate policy on AI tooling, regulated industries, projects that ban LLM-generated context) have no way to suppress this at init time. Their only recourse is to rm AGENTS.md CLAUDE.md after init.

That's a fine default for the 95% case but worth confirming this is a deliberate product decision rather than a "we'll add it if asked" decision. The earlier commits in this branch (feat(rules): add --no-rules flag to flash init) suggest the flag was considered and removed — would be good to record the reasoning somewhere durable (CHANGELOG, design note in the README) so future contributors don't re-litigate.

5. Issue: test gap — `flash init` integration not verified end-to-end

test_init_calls_install_agent_files patches both install_agent_files and create_project_skeleton, so it only verifies that the function gets called. It doesn't verify the resulting filesystem state. Combined with the manual smoke checks all being unchecked in the PR's own test plan, the actual user-visible behavior of flash init demo → "AGENTS.md and CLAUDE.md appear in demo/" is unverified.

A test that runs flash init end-to-end (without mocks) against a tmp_path and asserts both files exist would close this. The components are individually tested, but the wiring isn't.

Nits

src/runpod_flash/rules/__init__.py:51: os.symlink("AGENTS.md", claude) creates a relative symlink. Robust for in-place use, but the link does not survive if the user later copies the project with tools that don't preserve symlinks (e.g., scp, which copies the link target instead of the link, or some zip extractions). Worth a one-line comment that the relative path is intentional.
.gitignore diff at repo root adds .runpod/ — .runpod/ is documented in core/credentials.py as ~/.runpod/config.toml (user HOME, not project). Why does the repo need to gitignore .runpod/ at project root? If nothing creates it there, the entry is dead code; if something does, that's worth knowing.
Pattern A in AGENTS.md uses GpuType.NVIDIA_GEFORCE_RTX_4090. Verified that enum value exists (gpu.py:179). Consistent.
Pattern B uses cpu="cpu3c-1-2". Verified (cpu.py:34). Consistent.
The skeleton template's gitignore drops .runpod/ and dist/ — verified .runpod/ isn't a project-local artifact, but dist/ is the standard Python build output. New projects that run python -m build or use flash build will have dist/ show up in git status. Worth keeping it ignored.
README uses python -c "..." snippet for existing-project install. A bit heavy for a one-shot operation; consider adding a flash rules install subcommand later (acknowledged as out-of-scope here).

🤖 Reviewed by Henrik's AI-Powered Bug Finder

- AGENTS.md: workers=N is shorthand for (0, N), not fixed count. Clarify fixed/warm idioms (N, N) and (1, N) to prevent agents writing endpoints that cold-start on every burst.
- Flash_SDK_Reference: clarify image= provisions then acts as client; id= is pure client. Disambiguates is_client semantics.
- rules installer: log info when CLAUDE.md exists as a real (non-symlink) file so users learn Claude Code is silently orphaned from Flash rules.
- Add intentionality comment on relative AGENTS.md symlink target.
- .gitignore: drop .runpod/ from repo root (migrated to .flash/; no new project creates .runpod/).
- README: record rationale for not shipping --no-rules / flash rules.
- Tests: cover new real-CLAUDE.md warning path and add end-to-end flash init test asserting AGENTS.md + CLAUDE.md land on disk.

promptless · 2026-05-27T20:02:56Z

Promptless updated the documentation for this change.

Triggered by runpod/flash#341

Corrected the documentation to match the final merged behavior. Removed references to the non-existent flash rules command and --no-rules flag. Updated flash init docs to accurately describe the new agent file generation.

Review: Document Flash AI coding agent integration

deanq added 12 commits April 29, 2026 00:21

feat(rules): add flash rules CLI command

b4f6ecf

feat(rules): integrate agent file generation into flash init

271f931

feat(rules): add dynamic context renderer from manifest data

24fd85b

feat(rules): wire dynamic context generation into flash rules command

f4d9a85

feat(rules): add --no-rules flag to flash init and .gitignore entry

7711260

feat(rules): regenerate dynamic context on flash run and flash build

5d85bee

fix(rules): wrap dynamic context in try/except, remove --disable flag…

a4d3bd3

…, fix tests

chore: ignore entire .flash directory

5131fe7

deanq changed the title ~~feat(rules): AE-3155 agent rules layer with CLI-first directive~~ feat(rules): agent rules layer with CLI-first directive May 25, 2026

deanq added 11 commits May 24, 2026 23:07

feat(rules): add minimal install_agent_files

bc39e08

docs(rules): rewrite agent integration section for minimal design

0c3ec22

chore(rules): drop stale .flash/context.md skeleton gitignore entry

eec27bd

fix(rules): use 'flash dev' not 'flash run' in AGENTS.md

e67edfc

feat(rules): add Pattern D for pre-built container images (BYOI)

b457380

docs(rules): use runpod/worker-v1-vllm:v2.18.1 in Pattern D

f69eb4e

deanq requested a review from Copilot May 25, 2026 08:40

Copilot started reviewing on behalf of deanq May 25, 2026 08:40

Copilot AI reviewed May 25, 2026

deanq requested a review from KAJdev May 25, 2026 08:49

deanq requested review from jhcipar and runpod-Henrik May 25, 2026 08:49

runpod-Henrik reviewed May 26, 2026

KAJdev approved these changes May 27, 2026

deanq and others added 2 commits May 27, 2026 12:19

Merge branch 'main' into deanq/ae-3155-flash-agent-rules-layer

eec6a21

deanq merged commit 4b65121 into main May 27, 2026
4 checks passed

deanq deleted the deanq/ae-3155-flash-agent-rules-layer branch May 27, 2026 19:56

runpod-release-please-bot Bot mentioned this pull request May 27, 2026

chore: release 1.17.0 #343

Open

deanq merged 27 commits into main from deanq/ae-3155-flash-agent-rules-layer

4 participants
