feat(rules): agent rules layer with CLI-first directive #341
Introduces the rules package with flash-rules.md, a static markdown file teaching AI coding agents Flash-specific conventions (import placement, decorator usage, dependency declaration, CLI commands, and common mistakes).
- 457 words (~594 tokens), well under the 3,000-token budget
- Covers the three core patterns: queue-based, load-balanced, class-based
- Includes CLI cheatsheet and common agent mistake table
- Add rules/**/* to package-data so flash-rules.md is included in wheel
- Implement read_static_rules() via importlib.resources for portable access
- Implement inject_rules_section() with full marker lifecycle: replace, append, orphan-start, multiple-pairs, FLASH:DISABLE skip
- Implement generate_agent_files() for CLAUDE.md, .cursorrules, AGENTS.md, .github/copilot-instructions.md, and .flash/context.md placeholder
- 18 new tests across TestRulesPackaging, TestReadStaticRules, TestInjectRulesSection, TestGenerateAgentFiles, all pass
- engine.py at 100% coverage; total suite 85.53% (2396 passed)
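The marker lifecycle named in this commit can be illustrated with a minimal sketch. It covers only the replace, append, and FLASH:DISABLE cases, not orphan-start or multiple-pairs handling. The HTML-comment marker syntax and the function name `inject_section` are assumptions for illustration; the commit names only FLASH:START, FLASH:END, and FLASH:DISABLE.

```python
import re

# Assumed marker syntax: the thread names FLASH:START / FLASH:END / FLASH:DISABLE
# but not how they are wrapped, so the HTML comments here are a guess.
START = "<!-- FLASH:START -->"
END = "<!-- FLASH:END -->"
DISABLE = "FLASH:DISABLE"


def inject_section(existing: str, rules: str) -> str:
    """Return `existing` with a marker-delimited rules block replaced or appended."""
    if DISABLE in existing:
        # Per-file opt-out: leave the file exactly as the user wrote it.
        return existing

    block = f"{START}\n{rules.strip()}\n{END}"
    pattern = re.compile(re.escape(START) + r".*?" + re.escape(END), re.DOTALL)

    if pattern.search(existing):
        # Replace only the first marker pair. A function replacement avoids
        # backslash-escape surprises in the rules text.
        return pattern.sub(lambda _m: block, existing, count=1)

    # No marker pair yet: append the block, separated by a blank line.
    prefix = existing.rstrip("\n")
    return f"{prefix}\n\n{block}\n" if prefix else f"{block}\n"
```

Content outside the markers is never rewritten in this sketch, which is the property that lets a generated section coexist with user-authored text in the same file.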
Commit 5d85bee reverted ~135 lines of main's run.py while adding the rules engine hook: RuntimeScanner renamed back to RemoteDecoratorScanner, tuple return dropped, _discover_resources reverted to the removed core.discovery.ResourceDiscovery path, and hot-reload sys.modules purge removed. This restores run.py to origin/main and re-adds only the update_dynamic_context() call. Also switches two init tests to CliRunner.invoke() so Typer resolves defaults instead of leaking ArgumentInfo sentinels.
…EADME Adds a prominent "Use the Flash CLI — Do Not Call Runpod REST or GraphQL Directly" section to flash-rules.md with an intent → command table covering init/run/build/deploy/preview/undeploy/app/env/rules. Guardrails raw runpod.Endpoint(...) use to invoking already-deployed endpoints only. Rewrites the README "Coding agent integration" section to surface the files that flash init now writes (CLAUDE.md, AGENTS.md, .cursorrules, .github/copilot-instructions.md, .flash/context.md), the FLASH:START/END markers, `flash rules` regeneration, FLASH:DISABLE per-file opt-out, and the --no-rules repo-wide opt-out. Closes AE-3155.
Promptless prepared a documentation update related to this change. Triggered by runpod/flash#341 Added documentation for the new AI coding agent integration feature. Created a new |
Renames the static rules file to AGENTS.md (the emerging cross-tool convention). Moves the CLI-first directive to the top of the file so agents read it before any other content. Cuts the GPU/CPU reference tables (out-of-date risk) and the dynamic-context placeholder section.
Single public function writes AGENTS.md if absent and creates a relative CLAUDE.md symlink. Idempotent. Symlink failure on platforms without support is logged and skipped (AGENTS.md still installed).
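A rough sketch of the behavior this commit describes (write if absent, relative symlink, idempotent, symlink failure logged and skipped). This is not the PR's code: the name `install_agent_files_sketch`, the `rules_text` parameter (standing in for the packaged file), and the list return value are all assumptions.

```python
import logging
import os
from pathlib import Path

log = logging.getLogger(__name__)


def install_agent_files_sketch(project_root: Path, rules_text: str) -> list[str]:
    """Write AGENTS.md if absent and link CLAUDE.md to it; return names created."""
    created: list[str] = []
    agents = project_root / "AGENTS.md"
    claude = project_root / "CLAUDE.md"

    # is_symlink() also catches broken links, for which exists() returns False,
    # so we never write through a dangling symlink.
    if not agents.exists() and not agents.is_symlink():
        agents.write_text(rules_text, encoding="utf-8")
        created.append("AGENTS.md")

    if not claude.exists() and not claude.is_symlink():
        try:
            # Relative target, so the link stays valid if the project directory moves.
            os.symlink("AGENTS.md", claude)
            created.append("CLAUDE.md")
        except (OSError, NotImplementedError) as exc:
            # Platforms without symlink support: AGENTS.md is still installed.
            log.warning("Could not create CLAUDE.md symlink: %s", exc)

    return created
```

Calling the function a second time creates nothing and leaves both files unchanged, which is what idempotent means here.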
…wheel asset Code review feedback. Broken symlink case now emits a warning instead of silently skipping. Missing AGENTS.md in package data now raises a FileNotFoundError with an actionable message instead of a bare traceback.
Deletes engine.py, context.py, and the flash rules CLI command. Drops update_dynamic_context hooks from flash run and flash build. Rewires flash init to call install_agent_files, removes the --no-rules flag and the now-unused _get_version helper. Updates test_init.py to match the new behavior. Tasks 4-7 of the minimal agent-rules plan committed as one unit to keep quality-check green throughout: the engine importers in the 4 CLI files are all cleaned up before the commit lands.
Reflects the new behavior: flash init writes AGENTS.md + CLAUDE.md symlink, never touches existing files, no flash rules command, no opt-out flag (delete the file).
The dynamic-context engine was removed in this branch. The .flash/ line already covers everything under that directory; the specific entry plus its 'regenerated on each command' comment are now misleading.
The CLI command is 'flash dev' — 'flash run' does not exist. Was a copy-paste from earlier docs that referenced the older name.
Documents Endpoint(name=..., image=...) for workloads that already serve HTTP — vLLM, TGI, ComfyUI, etc. — and the id= attach mode for connecting to already-deployed endpoints without re-provisioning.
The Runpod-published vLLM worker is the canonical image for serving LLMs on Runpod Serverless. Adjusts the QB call shape to match the worker's input schema (input.prompt, not top-level prompt).
…nore .runpod/ was the pre-rename state directory; nothing in current Flash writes there. The migration in resource_manager.py handles old projects that still have .runpod/resources.pkl. dist/ was duplicated — it's already in the Python section at the top of the file.
Was unintentionally dropped during the Task 8 section rewrite. The skill bundle is complementary to AGENTS.md — static rules cover the CLI-first directive; the skill bundle adds richer Claude Code behavior that goes beyond a markdown file. Both belong in the README.
Pull request overview
Adds a minimal “agent rules” layer to Flash: flash init now installs a packaged AGENTS.md into the new project root and (best-effort) creates CLAUDE.md as a symlink to AGENTS.md, with unit tests and packaging updates to support distributing the asset in the wheel.
Changes:
- Introduces `runpod_flash.rules.install_agent_files()` and ships `AGENTS.md` as package data.
- Wires `flash init` to call `install_agent_files()` and documents the behavior in `README.md`.
- Adds unit tests covering install behavior and edge cases; updates gitignore defaults and packaging config.
Reviewed changes
Copilot reviewed 8 out of 10 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| src/runpod_flash/rules/__init__.py | New installer for AGENTS.md + best-effort CLAUDE.md symlink. |
| src/runpod_flash/rules/AGENTS.md | Packaged rules content intended for AI coding tools. |
| src/runpod_flash/cli/commands/init.py | Calls install_agent_files() during flash init and updates displayed project tree. |
| tests/unit/rules/test_install.py | New unit tests for agent file installation behavior. |
| tests/unit/cli/commands/test_init.py | Ensures flash init invokes install_agent_files(). |
| src/runpod_flash/cli/utils/skeleton_template/.gitignore | Updates ignore defaults in generated projects. |
| README.md | Documents coding agent integration and installation snippet for existing projects. |
| pyproject.toml | Ensures rules/**/* is included in wheel package data. |
| .gitignore | Adds .runpod/ to ignored Flash state paths at repo root. |
Three findings from Copilot review on #341:
- install_agent_files: AGENTS.md broken-symlink case now logs a warning instead of write_text() following the symlink and overwriting its target. Symmetric to the existing CLAUDE.md guard.
- test_broken_claude_md_symlink_logs_warning: skipif on Windows (mirrors test_creates_claude_md_symlink_when_absent).
- flash init success panel: only list AGENTS.md / CLAUDE.md when install_agent_files actually created them. Previously listed unconditionally even when symlink creation failed silently.
Added test_broken_agents_md_symlink_logs_warning_and_skips_write for the new AGENTS.md guard.
Two findings from Copilot review on #341:
- AGENTS.md: 'Three Patterns' heading was stale after Pattern D (BYOI/container image) landed. Renamed to 'Endpoint Patterns' (no count, survives future additions).
- README opt-out section: previous 'Flash will not re-create it' was too absolute since a second 'flash init' would write the file again. Now scopes the guarantee to flash subcommands other than init / explicit install_agent_files calls.
runpod-Henrik
left a comment
Henrik's AI-Powered Bug Finder — PR #341 Review
Verdict: NEEDS WORK
The mechanism (install AGENTS.md + CLAUDE.md symlink, no-overwrite, broken-symlink warning, idempotent) is correct and well-tested at the unit level. CI green on 3.10/3.11/3.12.
The blocker is in the content of AGENTS.md: it ships as packaged data, gets dropped into every new project via flash init, and becomes the canonical rule source for every AI agent touching a Flash project. Two of the rules contradict the codebase.
1. Bug: AGENTS.md claim about workers=N is wrong
User scenario: an agent reads AGENTS.md (line 109 in the bundled file), sees "workers=N for fixed count, workers=(min, max) for auto-scaling range," and writes Endpoint(name="x", workers=3) for a user who wanted exactly 3 warm workers. They get an endpoint that scales (0, 3) — cold-starting on every burst.
Ground truth from the code, endpoint.py:165-176 (_normalize_workers):
- int: shorthand for (DEFAULT_WORKERS_MIN, n) # = (0, n)
- (min, max): explicit tuple
- None: defaults to (DEFAULT_WORKERS_MIN, DEFAULT_WORKERS_MAX)
And the SDK reference (docs/Flash_SDK_Reference.md:46) confirms: "workers: N = (0, N). (min, max) = explicit range."
So workers=N is not "fixed count" — it's "auto-scale from 0 to N." The actual fixed-count idiom is workers=(N, N).
Fix: replace the bullet in AGENTS.md with something like:
- workers=N is shorthand for (0, N) — auto-scale 0 → N
- workers=(min, max) for explicit range; use (N, N) to pin a fixed count
- workers=(1, N) keeps at least one warm worker (avoids cold starts)
This is the highest-stakes content error — every Flash project initialized after this lands gets this misleading rule, and agents are explicitly told to trust it.
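The normalization described above can be sketched as follows. This is an illustration of the stated semantics, not the contents of endpoint.py: `DEFAULT_WORKERS_MIN = 0` comes from the review text, while the value used for `DEFAULT_WORKERS_MAX` is a placeholder, since the real default is not given in this thread.

```python
from typing import Tuple, Union

DEFAULT_WORKERS_MIN = 0  # stated in the review: workers=N means (0, N)
DEFAULT_WORKERS_MAX = 3  # placeholder value; the real default is not shown here

Workers = Union[int, Tuple[int, int], None]


def normalize_workers(workers: Workers) -> Tuple[int, int]:
    """Map the three accepted shapes onto an explicit (min, max) pair."""
    if workers is None:
        return (DEFAULT_WORKERS_MIN, DEFAULT_WORKERS_MAX)
    if isinstance(workers, int):
        # An int is only an upper bound; the floor stays at the default minimum.
        return (DEFAULT_WORKERS_MIN, workers)
    lo, hi = workers
    return (lo, hi)


print(normalize_workers(3))       # (0, 3): scales from zero, not a fixed count
print(normalize_workers((3, 3)))  # (3, 3): pinned at exactly three workers
print(normalize_workers((1, 3)))  # (1, 3): one warm worker, bursts to three
```

The first call shows the trap: an agent that reads "fixed count" and passes `workers=3` gets a range starting at zero.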
2. Question: Pattern D's "Flash deploys the image" vs. is_client semantics
AGENTS.md Pattern D says "Flash deploys the image and gives you HTTP + queue access to it" for Endpoint(name="vllm", image="runpod/worker-v1-vllm:v2.18.1", ...). But the SDK reference (docs/Flash_SDK_Reference.md:200,254) describes image= as a client mode, and is_client returns True when image= is set (endpoint.py:440).
If Endpoint(image=...) actually does provision a fresh endpoint when invoked (as PR #339's description implies for live provisioning), then AGENTS.md is right and the SDK reference is misleading. If image= is purely a client attaching to a pre-deployed endpoint, AGENTS.md is wrong and would have agents writing code that silently no-ops.
Either way, the two source-of-truth documents tell different stories about the same parameter. Worth nailing down which is correct and aligning both before this rule file ships to every new project.
3. Issue: a user with a real CLAUDE.md (not symlink) becomes orphaned from rules
User scenario: user has an existing CLAUDE.md (custom Claude Code instructions) and runs flash init (or the existing-project one-liner). The code's not claude.exists() check correctly skips replacement — but the user is now silently in a state where:
- `AGENTS.md` exists and gets used by Cursor, Codex, Aider, Amp.
- `CLAUDE.md` is the user's custom file, NOT linked to AGENTS.md.
- Claude Code reads the user's CLAUDE.md and never sees the Flash rules.
The user is unlikely to realise this until they file a "why does Claude break my Flash project but Cursor doesn't" report.
Either:
- Log an info line when CLAUDE.md exists as a real file (not symlink) suggesting the user `cat AGENTS.md >> CLAUDE.md` or symlink manually.
- Or document this case explicitly in the README's "Coding agent integration" section.
The current README only addresses "AGENTS.md already exists" and "broken symlink" cases.
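One way to tell these cases apart is to classify CLAUDE.md before deciding what to log. The sketch below is hypothetical: the helper name `claude_md_state` and its return labels are invented for illustration and do not come from the PR.

```python
from pathlib import Path


def claude_md_state(project_root: Path) -> str:
    """Classify CLAUDE.md: missing, linked, other_symlink, broken_symlink, or real_file."""
    claude = project_root / "CLAUDE.md"
    if claude.is_symlink():
        if not claude.exists():
            # exists() follows the link, so False here means a dangling target.
            return "broken_symlink"
        same = claude.resolve() == (project_root / "AGENTS.md").resolve()
        return "linked" if same else "other_symlink"
    if claude.exists():
        # A user-authored file: Claude Code will not see AGENTS.md through it.
        return "real_file"
    return "missing"
```

An installer could emit the suggested info line only for `real_file`, keep the existing warning for `broken_symlink`, and stay silent for `linked`.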
4. Issue: no opt-out at flash init time
PR description explicitly: "No flash rules command, no --no-rules flag, no marker injection." Users in restrictive environments (corporate policy on AI tooling, regulated industries, projects that ban LLM-generated context) have no way to suppress this at init time. Their only recourse is to rm AGENTS.md CLAUDE.md after init.
That's a fine default for the 95% case but worth confirming this is a deliberate product decision rather than a "we'll add it if asked" decision. The earlier commits in this branch (feat(rules): add --no-rules flag to flash init) suggest the flag was considered and removed — would be good to record the reasoning somewhere durable (CHANGELOG, design note in the README) so future contributors don't re-litigate.
5. Issue: test gap — flash init integration not verified end-to-end
test_init_calls_install_agent_files patches both install_agent_files and create_project_skeleton, so it only verifies that the function gets called. It doesn't verify the resulting filesystem state. Combined with the manual smoke checks all being unchecked in the PR's own test plan, the actual user-visible behavior of flash init demo → "AGENTS.md and CLAUDE.md appear in demo/" is unverified.
A test that runs flash init end-to-end (without mocks) against a tmp_path and asserts both files exist would close this. The components are individually tested, but the wiring isn't.
Nits
- `src/runpod_flash/rules/__init__.py:51`: `os.symlink("AGENTS.md", claude)` creates a relative symlink. Robust for in-place use, but breaks if the user later copies the project somewhere with non-symlink-preserving tools (e.g., `scp` without `-p`, some zip extractions). Worth a one-line comment that the relative path is intentional.
- `.gitignore` diff at repo root adds `.runpod/`: `.runpod/` is documented in `core/credentials.py` as `~/.runpod/config.toml` (user HOME, not project). Why does the repo need to gitignore `.runpod/` at project root? If nothing creates it there, the entry is dead code; if something does, that's worth knowing.
- Pattern A in AGENTS.md uses `GpuType.NVIDIA_GEFORCE_RTX_4090`. Verified that enum value exists (gpu.py:179). Consistent.
- Pattern B uses `cpu="cpu3c-1-2"`. Verified (cpu.py:34). Consistent.
- The skeleton template's gitignore drops `.runpod/` and `dist/`: verified `.runpod/` isn't a project-local artifact, but `dist/` is the standard Python build output. New projects that run `python -m build` or use `flash build` will have `dist/` show up in `git status`. Worth keeping it ignored.
- README uses `python -c "..."` snippet for existing-project install. A bit heavy for a one-shot operation; consider adding a `flash rules install` subcommand later (acknowledged as out-of-scope here).
🤖 Reviewed by Henrik's AI-Powered Bug Finder
- AGENTS.md: workers=N is shorthand for (0, N), not fixed count. Clarify fixed/warm idioms (N, N) and (1, N) to prevent agents writing endpoints that cold-start on every burst.
- Flash_SDK_Reference: clarify image= provisions then acts as client; id= is pure client. Disambiguates is_client semantics.
- rules installer: log info when CLAUDE.md exists as a real (non-symlink) file so users learn Claude Code is silently orphaned from Flash rules.
- Add intentionality comment on relative AGENTS.md symlink target.
- .gitignore: drop .runpod/ from repo root (migrated to .flash/; no new project creates .runpod/).
- README: record rationale for not shipping --no-rules / flash rules.
- Tests: cover new real-CLAUDE.md warning path and add end-to-end flash init test asserting AGENTS.md + CLAUDE.md land on disk.
Promptless updated the documentation for this change. Triggered by runpod/flash#341 Corrected the documentation to match the final merged behavior. Removed references to the non-existent |
Summary
Minimal agent-rules layer: `flash init` writes `AGENTS.md` at project root and creates `CLAUDE.md` as a symlink to it. That's it.
Why
Agentic AI coding tools (Claude Code, Cursor, Copilot, Codex, Aider, etc.) using Flash routinely bypass the `flash` CLI and generate raw Runpod REST or GraphQL calls to build, deploy, list, or scale endpoints, skipping auth, config hashing, drift detection, manifest generation, and image selection.
This PR ships a sidecar `AGENTS.md` whose first section is a "Use the Flash CLI — Do Not Call Runpod REST or GraphQL Directly" directive with an intent → command table. The file is written by `flash init` for new projects; existing projects run a one-liner documented in the README.
No agentic tool scans `site-packages` or subdirectories for instruction files; they read named files at project root. Prior art (Prisma, Husky, shadcn, ESLint) never writes into user-owned root files; they prompt or use namespaced directories. This PR follows that pattern: AGENTS.md is Flash-owned (skipped if the user already has one), and CLAUDE.md is created as a symlink so Claude Code picks up the same content without duplication.
Behavior
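- `flash init` writes `./AGENTS.md` if absent; symlinks `./CLAUDE.md` → `AGENTS.md` if absent
- Existing `AGENTS.md` / `CLAUDE.md` are never touched
- A broken `CLAUDE.md` symlink is detected and a warning is logged
- If symlink creation fails on a platform without support, `AGENTS.md` is still installed
- A missing packaged `AGENTS.md` raises `FileNotFoundError` with an actionable message
- To opt out, delete `AGENTS.md`. Flash will not re-create it
- No `flash rules` command, no `--no-rules` flag, no marker injection, no dynamic context, no `.flash/context.md`
AGENTS.md content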
Four endpoint patterns documented:
- `Endpoint(name="vllm", image="runpod/worker-v1-vllm:v2.18.1", ...)`: for workloads that already serve HTTP and don't need a Python handler
Plus a "Rules That Break If Violated" list, a "Common Agent Mistakes" table, and the CLI-first directive at the top.
Existing-project install
`python -c "from runpod_flash.rules import install_agent_files; from pathlib import Path; install_agent_files(Path.cwd())"`
Diff scope
10 files net change vs `main`:
- `src/runpod_flash/rules/{__init__.py,AGENTS.md}`, `tests/unit/rules/test_install.py`
- `src/runpod_flash/cli/commands/init.py`, `README.md`, `pyproject.toml`, `.gitignore`, skeleton `.gitignore`
- `tests/unit/cli/commands/test_init.py`
Test plan
- `make quality-check` passes (2702 tests, 85.4%+ coverage)
- `flash init demo` in an empty dir creates AGENTS.md + CLAUDE.md symlink
- `flash init demo` in a dir with pre-existing AGENTS.md leaves it untouched