
feat(rules): agent rules layer with CLI-first directive #341

Merged
deanq merged 27 commits into main from deanq/ae-3155-flash-agent-rules-layer
May 27, 2026

Conversation

@deanq
Member

@deanq deanq commented May 25, 2026

Summary

Minimal agent-rules layer: flash init writes AGENTS.md at project root and creates CLAUDE.md as a symlink to it. That's it.

Why

Agentic AI coding tools (Claude Code, Cursor, Copilot, Codex, Aider, etc.) using Flash routinely bypass the flash CLI and generate raw Runpod REST or GraphQL calls to build, deploy, list, or scale endpoints, which skips auth, config hashing, drift detection, manifest generation, and image selection.

This PR ships a sidecar AGENTS.md whose first section is a "Use the Flash CLI — Do Not Call Runpod REST or GraphQL Directly" directive with an intent → command table. The file is written by flash init for new projects; existing projects run a one-liner documented in the README.

No agentic tool scans site-packages or subdirectories for instruction files — they read named files at project root. Prior art (Prisma, Husky, shadcn, ESLint) never writes into user-owned root files; they prompt or use namespaced directories. This PR follows that pattern: AGENTS.md is Flash-owned (skipped if user already has one), CLAUDE.md is created as a symlink so Claude Code picks up the same content without duplication.

Behavior

  • flash init writes ./AGENTS.md if absent; symlinks ./CLAUDE.md → AGENTS.md if absent
  • Existing AGENTS.md / CLAUDE.md are never touched
  • Broken CLAUDE.md symlink is detected and a warning is logged
  • Symlink failure (e.g. Windows without dev mode) is logged and skipped; AGENTS.md still installed
  • Missing AGENTS.md in the wheel raises a clear FileNotFoundError with an actionable message
  • Opt-out: delete AGENTS.md. Flash will not re-create it
  • No flash rules command, no --no-rules flag, no marker injection, no dynamic context, no .flash/context.md
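
The behavior listed above can be sketched roughly as follows. This is a minimal illustration and not the code in src/runpod_flash/rules/__init__.py: the function name install_agent_files_sketch, the rules_text parameter, and the logger name are assumptions made for the example. The real installer reads AGENTS.md from package data and raises FileNotFoundError when the asset is missing, which this sketch sidesteps by taking the text as an argument.

```python
import logging
import os
from pathlib import Path

log = logging.getLogger("agent_rules_sketch")  # hypothetical logger name


def install_agent_files_sketch(project_dir: Path, rules_text: str) -> list:
    """Install AGENTS.md and a CLAUDE.md symlink; return the paths created."""
    created = []
    agents = project_dir / "AGENTS.md"
    claude = project_dir / "CLAUDE.md"

    # An existing AGENTS.md (regular file or symlink of any kind) is never touched.
    if not (agents.exists() or agents.is_symlink()):
        agents.write_text(rules_text, encoding="utf-8")
        created.append(agents)

    if claude.is_symlink() and not claude.exists():
        # A dangling link: warn and leave it for the user to repair.
        log.warning("CLAUDE.md is a broken symlink; leaving it untouched")
    elif not claude.exists() and agents.exists():
        try:
            # Relative target, so the link still resolves if the project moves.
            os.symlink("AGENTS.md", claude)
            created.append(claude)
        except (OSError, NotImplementedError) as exc:
            # e.g. Windows without developer mode; AGENTS.md is already in place.
            log.warning("Could not create CLAUDE.md symlink (%s); skipping", exc)

    return created
```

Calling the function a second time on the same directory returns an empty list, which is the idempotency the unit tests are described as covering.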

AGENTS.md content

Four endpoint patterns documented:

  • A: Queue-based function endpoint
  • B: Load-balanced routes
  • C: Class-based stateful worker
  • D: Pre-built container image (BYOI) — e.g. Endpoint(name="vllm", image="runpod/worker-v1-vllm:v2.18.1", ...) — for workloads that already serve HTTP and don't need a Python handler

Plus a "Rules That Break If Violated" list, a "Common Agent Mistakes" table, and the CLI-first directive at the top.

Existing-project install

python -c "from runpod_flash.rules import install_agent_files; from pathlib import Path; install_agent_files(Path.cwd())"

Diff scope

10 files net change vs main:

  • Added: src/runpod_flash/rules/{__init__.py,AGENTS.md}, tests/unit/rules/test_install.py
  • Modified: src/runpod_flash/cli/commands/init.py, README.md, pyproject.toml, .gitignore, skeleton .gitignore
  • Test mods: tests/unit/cli/commands/test_init.py

Test plan

  • make quality-check passes (2702 tests, 85.4%+ coverage)
  • Unit tests cover: write-when-absent, no-overwrite, symlink creation, no-replace-existing CLAUDE.md, symlink failure path, broken-symlink warning, missing-asset error, idempotency
  • Manual: flash init demo in an empty dir creates AGENTS.md + CLAUDE.md symlink
  • Manual: flash init demo in a dir with pre-existing AGENTS.md leaves it untouched
  • Manual: Claude Code session in the demo project picks up the rules via CLAUDE.md → AGENTS.md

deanq added 12 commits April 29, 2026 00:21
Introduces the rules package with flash-rules.md — a static markdown
file teaching AI coding agents Flash-specific conventions (import
placement, decorator usage, dependency declaration, CLI commands,
and common mistakes).

- 457 words (~594 tokens), well under the 3,000-token budget
- Covers the three core patterns: queue-based, load-balanced, class-based
- Includes CLI cheatsheet and common agent mistake table
- Add rules/**/* to package-data so flash-rules.md is included in wheel
- Implement read_static_rules() via importlib.resources for portable access
- Implement inject_rules_section() with full marker lifecycle:
  replace, append, orphan-start, multiple-pairs, FLASH:DISABLE skip
- Implement generate_agent_files() for CLAUDE.md, .cursorrules, AGENTS.md,
  .github/copilot-instructions.md, and .flash/context.md placeholder
- 18 new tests across TestRulesPackaging, TestReadStaticRules,
  TestInjectRulesSection, TestGenerateAgentFiles — all pass
- engine.py at 100% coverage; total suite 85.53% (2396 passed)
Commit 5d85bee reverted ~135 lines of main's run.py while adding the
rules engine hook: RuntimeScanner renamed back to RemoteDecoratorScanner,
tuple return dropped, _discover_resources reverted to the removed
core.discovery.ResourceDiscovery path, and hot-reload sys.modules purge
removed. This restores run.py to origin/main and re-adds only the
update_dynamic_context() call.

Also switches two init tests to CliRunner.invoke() so Typer resolves
defaults instead of leaking ArgumentInfo sentinels.
…EADME

Adds a prominent "Use the Flash CLI — Do Not Call Runpod REST or GraphQL
Directly" section to flash-rules.md with an intent → command table covering
init/run/build/deploy/preview/undeploy/app/env/rules. Guardrails raw
runpod.Endpoint(...) use to invoking already-deployed endpoints only.

Rewrites the README "Coding agent integration" section to surface the
files that flash init now writes (CLAUDE.md, AGENTS.md, .cursorrules,
.github/copilot-instructions.md, .flash/context.md), the FLASH:START/END
markers, `flash rules` regeneration, FLASH:DISABLE per-file opt-out,
and the --no-rules repo-wide opt-out.

Closes AE-3155.
@promptless

promptless Bot commented May 25, 2026

Promptless prepared a documentation update related to this change.

Triggered by runpod/flash#341

Added documentation for the new AI coding agent integration feature. Created a new flash rules CLI reference page, updated the flash init docs with the --no-rules flag and generated agent files, added the command to the CLI overview, and updated the quickstart with a tip about built-in agent integration.

Review: Flash AI coding agent integration

@deanq deanq changed the title from "feat(rules): AE-3155 agent rules layer with CLI-first directive" to "feat(rules): agent rules layer with CLI-first directive" on May 25, 2026
deanq added 11 commits May 24, 2026 23:07
Renames the static rules file to AGENTS.md (the emerging cross-tool
convention). Moves the CLI-first directive to the top of the file so
agents read it before any other content. Cuts the GPU/CPU reference
tables (out-of-date risk) and the dynamic-context placeholder section.

Single public function writes AGENTS.md if absent and creates a relative
CLAUDE.md symlink. Idempotent. Symlink failure on platforms without
support is logged and skipped (AGENTS.md still installed).

…wheel asset

Code review feedback. Broken symlink case now emits a warning instead of
silently skipping. Missing AGENTS.md in package data now raises a
FileNotFoundError with an actionable message instead of a bare traceback.

Deletes engine.py, context.py, and the flash rules CLI command. Drops
update_dynamic_context hooks from flash run and flash build. Rewires
flash init to call install_agent_files, removes the --no-rules flag and
the now-unused _get_version helper. Updates test_init.py to match the
new behavior.

Tasks 4-7 of the minimal agent-rules plan committed as one unit to keep
quality-check green throughout: the engine importers in the 4 CLI files
are all cleaned up before the commit lands.

Reflects the new behavior: flash init writes AGENTS.md + CLAUDE.md
symlink, never touches existing files, no flash rules command, no
opt-out flag (delete the file).

The dynamic-context engine was removed in this branch. The .flash/ line
already covers everything under that directory; the specific entry plus
its 'regenerated on each command' comment are now misleading.

The CLI command is 'flash dev' — 'flash run' does not exist. Was a
copy-paste from earlier docs that referenced the older name.

Documents Endpoint(name=..., image=...) for workloads that already serve
HTTP — vLLM, TGI, ComfyUI, etc. — and the id= attach mode for connecting
to already-deployed endpoints without re-provisioning.

The Runpod-published vLLM worker is the canonical image for serving
LLMs on Runpod Serverless. Adjusts the QB call shape to match the
worker's input schema (input.prompt, not top-level prompt).

…nore

.runpod/ was the pre-rename state directory; nothing in current Flash
writes there. The migration in resource_manager.py handles old projects
that still have .runpod/resources.pkl. dist/ was duplicated — it's
already in the Python section at the top of the file.

Was unintentionally dropped during the Task 8 section rewrite. The
skill bundle is complementary to AGENTS.md — static rules cover the
CLI-first directive; the skill bundle adds richer Claude Code behavior
that goes beyond a markdown file. Both belong in the README.
Contributor

Copilot AI left a comment

Pull request overview

Adds a minimal “agent rules” layer to Flash: flash init now installs a packaged AGENTS.md into the new project root and (best-effort) creates CLAUDE.md as a symlink to AGENTS.md, with unit tests and packaging updates to support distributing the asset in the wheel.

Changes:

  • Introduces runpod_flash.rules.install_agent_files() and ships AGENTS.md as package data.
  • Wires flash init to call install_agent_files() and documents the behavior in README.md.
  • Adds unit tests covering install behavior and edge cases; updates gitignore defaults and packaging config.

Reviewed changes

Copilot reviewed 8 out of 10 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| src/runpod_flash/rules/__init__.py | New installer for AGENTS.md + best-effort CLAUDE.md symlink. |
| src/runpod_flash/rules/AGENTS.md | Packaged rules content intended for AI coding tools. |
| src/runpod_flash/cli/commands/init.py | Calls install_agent_files() during flash init and updates displayed project tree. |
| tests/unit/rules/test_install.py | New unit tests for agent file installation behavior. |
| tests/unit/cli/commands/test_init.py | Ensures flash init invokes install_agent_files(). |
| src/runpod_flash/cli/utils/skeleton_template/.gitignore | Updates ignore defaults in generated projects. |
| README.md | Documents coding agent integration and installation snippet for existing projects. |
| pyproject.toml | Ensures rules/**/* is included in wheel package data. |
| .gitignore | Adds .runpod/ to ignored Flash state paths at repo root. |


Comment thread src/runpod_flash/rules/__init__.py
Comment thread tests/unit/rules/test_install.py
Comment thread src/runpod_flash/cli/utils/skeleton_template/.gitignore
Comment thread src/runpod_flash/rules/AGENTS.md Outdated
Comment thread README.md Outdated
Comment thread src/runpod_flash/cli/commands/init.py Outdated
Three findings from Copilot review on #341:

- install_agent_files: AGENTS.md broken-symlink case now logs a warning
  instead of write_text() following the symlink and overwriting its
  target. Symmetric to the existing CLAUDE.md guard.
- test_broken_claude_md_symlink_logs_warning: skipif on Windows
  (mirrors test_creates_claude_md_symlink_when_absent).
- flash init success panel: only list AGENTS.md / CLAUDE.md when
  install_agent_files actually created them. Previously listed
  unconditionally even when symlink creation failed silently.

Added test_broken_agents_md_symlink_logs_warning_and_skips_write
for the new AGENTS.md guard.
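
The reason a plain "write if absent" check is not enough can be shown with a small sketch. Path.exists() follows symlinks, so a dangling AGENTS.md link reports as absent, and on POSIX a subsequent write would go through the link and create its target. The helper names below (is_dangling_symlink, write_if_absent, demo) are hypothetical and only illustrate the guard described in this commit. The sketch assumes a platform where os.symlink is permitted.

```python
import os
import tempfile
from pathlib import Path


def is_dangling_symlink(path: Path) -> bool:
    """True when path is a symlink whose target does not exist."""
    return path.is_symlink() and not path.exists()


def write_if_absent(path: Path, text: str) -> bool:
    """Write text only when nothing, not even a dangling symlink, occupies path."""
    if is_dangling_symlink(path) or path.exists():
        return False
    path.write_text(text, encoding="utf-8")
    return True


def demo() -> dict:
    """Set up a dangling AGENTS.md link and report what the guard does with it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        agents = root / "AGENTS.md"
        os.symlink("elsewhere.md", agents)  # the target is never created

        looks_absent = not agents.exists()  # exists() follows the link
        wrote = write_if_absent(agents, "# rules\n")
        target_created = (root / "elsewhere.md").exists()
        return {
            "looks_absent": looks_absent,
            "wrote": wrote,
            "target_created": target_created,
        }
```

In this setup the link looks absent to exists(), yet the guarded write is skipped and the link's target is never created.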
@deanq deanq requested a review from KAJdev May 25, 2026 08:49
@deanq deanq requested review from jhcipar and runpod-Henrik May 25, 2026 08:49
Two findings from Copilot review on #341:

- AGENTS.md: 'Three Patterns' heading was stale after Pattern D
  (BYOI/container image) landed. Renamed to 'Endpoint Patterns' —
  no count, survives future additions.
- README opt-out section: previous 'Flash will not re-create it'
  was too absolute since a second 'flash init' would write the file
  again. Now scopes the guarantee to flash subcommands other than
  init / explicit install_agent_files calls.
Contributor

@runpod-Henrik runpod-Henrik left a comment

Henrik's AI-Powered Bug Finder — PR #341 Review

Verdict: NEEDS WORK

The mechanism (install AGENTS.md + CLAUDE.md symlink, no-overwrite, broken-symlink warning, idempotent) is correct and well-tested at the unit level. CI green on 3.10/3.11/3.12.

The blocker is in the content of AGENTS.md: it ships as packaged data, gets dropped into every new project via flash init, and becomes the canonical rule source for every AI agent touching a Flash project. Two of the rules contradict the codebase.


1. Bug: AGENTS.md claim about workers=N is wrong

User scenario: an agent reads AGENTS.md (line 109 in the bundled file), sees "workers=N for fixed count, workers=(min, max) for auto-scaling range," and writes Endpoint(name="x", workers=3) for a user who wanted exactly 3 warm workers. They get an endpoint that scales (0, 3) — cold-starting on every burst.

Ground truth from the code, endpoint.py:165-176 (_normalize_workers):

- int: shorthand for (DEFAULT_WORKERS_MIN, n)   # = (0, n)
- (min, max): explicit tuple
- None: defaults to (DEFAULT_WORKERS_MIN, DEFAULT_WORKERS_MAX)

And the SDK reference (docs/Flash_SDK_Reference.md:46) confirms: "workers: N = (0, N). (min, max) = explicit range."

So workers=N is not "fixed count" — it's "auto-scale from 0 to N." The actual fixed-count idiom is workers=(N, N).

Fix: replace the bullet in AGENTS.md with something like:

- workers=N is shorthand for (0, N) — auto-scale 0 → N
- workers=(min, max) for explicit range; use (N, N) to pin a fixed count
- workers=(1, N) keeps at least one warm worker (avoids cold starts)

This is the highest-stakes content error — every Flash project initialized after this lands gets this misleading rule, and agents are explicitly told to trust it.
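
To make the semantics concrete, here is a small sketch of the normalization as described in this finding. It is not the code from endpoint.py. DEFAULT_WORKERS_MIN = 0 follows from the "(0, n)" note above, while the value used for DEFAULT_WORKERS_MAX is a placeholder assumption, since the real default is not stated in this thread.

```python
from typing import Optional, Tuple, Union

DEFAULT_WORKERS_MIN = 0  # per the review: workers=N means (0, N)
DEFAULT_WORKERS_MAX = 3  # placeholder assumption; the real default is not given here

Workers = Optional[Union[int, Tuple[int, int]]]


def normalize_workers_sketch(workers: Workers) -> Tuple[int, int]:
    """Return an explicit (min, max) pair for any accepted workers value."""
    if workers is None:
        return (DEFAULT_WORKERS_MIN, DEFAULT_WORKERS_MAX)
    if isinstance(workers, int):
        # An int is only an upper bound; the floor stays at the default minimum.
        return (DEFAULT_WORKERS_MIN, workers)
    low, high = workers
    return (low, high)


pinned = normalize_workers_sketch((3, 3))    # fixed count of three
scaled = normalize_workers_sketch(3)         # scales from zero up to three
warm = normalize_workers_sketch((1, 3))      # keeps at least one warm worker
```

Under these assumptions, workers=3 and workers=(3, 3) normalize to different pairs, which is exactly the distinction the AGENTS.md bullet needs to spell out.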


2. Question: Pattern D's "Flash deploys the image" vs. is_client semantics

AGENTS.md Pattern D says "Flash deploys the image and gives you HTTP + queue access to it" for Endpoint(name="vllm", image="runpod/worker-v1-vllm:v2.18.1", ...). But the SDK reference (docs/Flash_SDK_Reference.md:200,254) describes image= as a client mode, and is_client returns True when image= is set (endpoint.py:440).

If Endpoint(image=...) actually does provision a fresh endpoint when invoked (as PR #339's description implies for live provisioning), then AGENTS.md is right and the SDK reference is misleading. If image= is purely a client attaching to a pre-deployed endpoint, AGENTS.md is wrong and would have agents writing code that silently no-ops.

Either way, the two source-of-truth documents tell different stories about the same parameter. Worth nailing down which is correct and aligning both before this rule file ships to every new project.


3. Issue: a user with a real CLAUDE.md (not symlink) becomes orphaned from rules

User scenario: user has an existing CLAUDE.md (custom Claude Code instructions) and runs flash init (or the existing-project one-liner). The code's not claude.exists() check correctly skips replacement — but the user is now silently in a state where:

  • AGENTS.md exists and gets used by Cursor, Codex, Aider, Amp.
  • CLAUDE.md is the user's custom file, NOT linked to AGENTS.md.
  • Claude Code reads the user's CLAUDE.md and never sees the Flash rules.

The user is unlikely to realise this until they file a "why does Claude break my Flash project but Cursor doesn't" report.

Either:

  • Log an info line when CLAUDE.md exists as a real file (not symlink) suggesting the user cat AGENTS.md >> CLAUDE.md or symlink manually.
  • Or document this case explicitly in the README's "Coding agent integration" section.

The current README only addresses "AGENTS.md already exists" and "broken symlink" cases.
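
One way to detect the orphaned case is to classify the state of CLAUDE.md before deciding what to log. The sketch below is illustrative only: the function name claude_md_state and the state labels are assumptions, not part of the Flash codebase.

```python
from pathlib import Path


def claude_md_state(project_dir: Path) -> str:
    """Classify CLAUDE.md relative to AGENTS.md in project_dir."""
    claude = project_dir / "CLAUDE.md"
    agents = project_dir / "AGENTS.md"

    if claude.is_symlink():
        if not claude.exists():
            return "broken-symlink"
        # Compare fully resolved paths so relative and absolute links both match.
        if claude.resolve() == agents.resolve():
            return "linked"
        return "other-symlink"
    if claude.exists():
        # A real file: Claude Code reads it and never sees the Flash rules.
        return "real-file"
    return "missing"
```

An installer could log the suggested info line only for the "real-file" state and stay quiet for "linked".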


4. Issue: no opt-out at flash init time

PR description explicitly: "No flash rules command, no --no-rules flag, no marker injection." Users in restrictive environments (corporate policy on AI tooling, regulated industries, projects that ban LLM-generated context) have no way to suppress this at init time. Their only recourse is to rm AGENTS.md CLAUDE.md after init.

That's a fine default for the 95% case but worth confirming this is a deliberate product decision rather than a "we'll add it if asked" decision. The earlier commits in this branch (feat(rules): add --no-rules flag to flash init) suggest the flag was considered and removed — would be good to record the reasoning somewhere durable (CHANGELOG, design note in the README) so future contributors don't re-litigate.


5. Issue: test gap — flash init integration not verified end-to-end

test_init_calls_install_agent_files patches both install_agent_files and create_project_skeleton, so it only verifies that the function gets called. It doesn't verify the resulting filesystem state. Combined with the manual smoke checks all being unchecked in the PR's own test plan, the actual user-visible behavior of flash init demo → "AGENTS.md and CLAUDE.md appear in demo/" is unverified.

A test that runs flash init end-to-end (without mocks) against a tmp_path and asserts both files exist would close this. The components are individually tested, but the wiring isn't.


Nits

  • src/runpod_flash/rules/__init__.py:51: os.symlink("AGENTS.md", claude) creates a relative symlink. Robust for in-place use, but breaks if the user later copies the project somewhere with non-symlink-preserving tools (e.g., scp without -p, some zip extractions). Worth a one-line comment that the relative path is intentional.
  • .gitignore diff at repo root adds .runpod/, but .runpod/ is documented in core/credentials.py as ~/.runpod/config.toml (user HOME, not project). Why does the repo need to gitignore .runpod/ at project root? If nothing creates it there, the entry is dead code; if something does, that's worth knowing.
  • Pattern A in AGENTS.md uses GpuType.NVIDIA_GEFORCE_RTX_4090. Verified that enum value exists (gpu.py:179). Consistent.
  • Pattern B uses cpu="cpu3c-1-2". Verified (cpu.py:34). Consistent.
  • The skeleton template's gitignore drops .runpod/ and dist/ — verified .runpod/ isn't a project-local artifact, but dist/ is the standard Python build output. New projects that run python -m build or use flash build will have dist/ show up in git status. Worth keeping it ignored.
  • README uses python -c "..." snippet for existing-project install. A bit heavy for a one-shot operation; consider adding a flash rules install subcommand later (acknowledged as out-of-scope here).
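
On the first nit, a short sketch of why a relative target is a reasonable choice: a relative link keeps resolving after the project directory is renamed or moved, while an absolute one does not. The function name link_survives_move is hypothetical, and the sketch assumes a platform where os.symlink is permitted.

```python
import os
import tempfile
from pathlib import Path


def link_survives_move(relative: bool) -> bool:
    """Create CLAUDE.md -> AGENTS.md, rename the project dir, and check the link."""
    with tempfile.TemporaryDirectory() as tmp:
        old = Path(tmp) / "project"
        old.mkdir()
        agents = old / "AGENTS.md"
        agents.write_text("# rules\n", encoding="utf-8")

        # Relative target is just the file name; absolute embeds the old location.
        target = "AGENTS.md" if relative else str(agents)
        os.symlink(target, old / "CLAUDE.md")

        new = Path(tmp) / "moved"
        os.rename(old, new)
        # exists() follows the link, so it is False once the target is gone.
        return (new / "CLAUDE.md").exists()
```

This does not address the copy-tool concern raised in the nit, since tools that drop symlinks lose the link regardless of how its target is written.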

🤖 Reviewed by Henrik's AI-Powered Bug Finder

deanq and others added 2 commits May 27, 2026 12:19
- AGENTS.md: workers=N is shorthand for (0, N), not fixed count. Clarify
  fixed/warm idioms (N, N) and (1, N) to prevent agents writing endpoints
  that cold-start on every burst.
- Flash_SDK_Reference: clarify image= provisions then acts as client;
  id= is pure client. Disambiguates is_client semantics.
- rules installer: log info when CLAUDE.md exists as a real (non-symlink)
  file so users learn Claude Code is silently orphaned from Flash rules.
- Add intentionality comment on relative AGENTS.md symlink target.
- .gitignore: drop .runpod/ from repo root (migrated to .flash/; no new
  project creates .runpod/).
- README: record rationale for not shipping --no-rules / flash rules.
- Tests: cover new real-CLAUDE.md warning path and add end-to-end
  flash init test asserting AGENTS.md + CLAUDE.md land on disk.
@deanq deanq merged commit 4b65121 into main May 27, 2026
4 checks passed
@deanq deanq deleted the deanq/ae-3155-flash-agent-rules-layer branch May 27, 2026 19:56
@promptless

promptless Bot commented May 27, 2026

Promptless updated the documentation for this change.

Triggered by runpod/flash#341

Corrected the documentation to match the final merged behavior. Removed references to the non-existent flash rules command and --no-rules flag. Updated flash init docs to accurately describe the new agent file generation.

Review: Document Flash AI coding agent integration


4 participants