docs(workflow): refine codex plan-execute prompt#256
Conversation
Reviewer's GuideAdds new MCP tools and helpers for safely listing and reading text artifacts under the outputs/ directory, and refines the Codex workflow documentation into a clearer two-phase plan/execute model aligned with a new Cursor+Codex cheatsheet. Sequence diagram for quantlab_artifact_read MCP toolsequenceDiagram
participant Client
participant MCPServer
participant OutputsHelper
participant FileSystem
Client->>MCPServer: invoke quantlab_artifact_read(relative_path)
MCPServer->>OutputsHelper: readOutputsArtifact(relative_path)
OutputsHelper->>OutputsHelper: resolveOutputsPath(relative_path)
OutputsHelper->>FileSystem: stat(resolvedPath)
FileSystem-->>OutputsHelper: FileStats
alt Path_is_directory
OutputsHelper-->>MCPServer: throw Error("Expected a file under outputs")
MCPServer-->>Client: isError response with failure message
else Path_is_file
OutputsHelper->>OutputsHelper: check BINARY_ARTIFACT_EXTENSIONS
alt Is_binary_extension
OutputsHelper-->>MCPServer: throw Error("Binary artifact reading is not supported")
MCPServer-->>Client: isError response with failure message
else Is_text_file
OutputsHelper->>FileSystem: readFile(resolvedPath, utf8)
FileSystem-->>OutputsHelper: content
OutputsHelper->>OutputsHelper: truncateText(content)
OutputsHelper-->>MCPServer: payload { root, requested_path, absolute_path, bytes, modified_at, truncated, content }
MCPServer->>MCPServer: formatBytes(payload.bytes)
MCPServer-->>Client: text content with metadata and body
end
end
Sequence diagram for quantlab_outputs_list MCP toolsequenceDiagram
participant Client
participant MCPServer
participant OutputsHelper
participant FileSystem
Client->>MCPServer: invoke quantlab_outputs_list(relative_path?)
MCPServer->>OutputsHelper: listOutputs(relative_path)
OutputsHelper->>OutputsHelper: resolveOutputsPath(relative_path)
OutputsHelper->>FileSystem: stat(targetPath)
FileSystem-->>OutputsHelper: DirStats
alt Not_a_directory
OutputsHelper-->>MCPServer: throw Error("Not a directory under outputs")
MCPServer-->>Client: isError response with failure message
else Is_directory
OutputsHelper->>FileSystem: readdir(targetPath, withFileTypes)
FileSystem-->>OutputsHelper: DirEntries
loop For_each_entry
OutputsHelper->>FileSystem: stat(entryPath)
FileSystem-->>OutputsHelper: EntryStats
OutputsHelper->>OutputsHelper: formatBytes(size)
OutputsHelper->>OutputsHelper: toIsoString(mtime)
OutputsHelper->>OutputsHelper: push entry metadata into entries list
end
OutputsHelper->>OutputsHelper: sort entries (directories first, then name)
OutputsHelper-->>MCPServer: payload { root, requested_path, absolute_path, entry_count, entries }
MCPServer-->>Client: JSON text listing outputs entries
end
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 4 issues, and left some high level feedback:
- The binary-artifact guard in
readOutputsArtifactrelies solely on file extensions, which is easy to bypass; consider either inspecting file content (e.g., checking for non-text bytes) or enforcing a size/heuristic limit to avoid trying to render large/binary artifacts as UTF‑8. - The MCP tools
quantlab_outputs_listandquantlab_artifact_readcurrently stringify their payloads into a single text block; if you expect other tools/agents to consume these programmatically, consider returning a more structured, machine-friendly format (e.g., JSON content type or a consistent schema) instead of preformatted text.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The binary-artifact guard in `readOutputsArtifact` relies solely on file extensions, which is easy to bypass; consider either inspecting file content (e.g., checking for non-text bytes) or enforcing a size/heuristic limit to avoid trying to render large/binary artifacts as UTF‑8.
- The MCP tools `quantlab_outputs_list` and `quantlab_artifact_read` currently stringify their payloads into a single text block; if you expect other tools/agents to consume these programmatically, consider returning a more structured, machine-friendly format (e.g., JSON content type or a consistent schema) instead of preformatted text.
## Individual Comments
### Comment 1
<location path="desktop/mcp-server.mjs" line_range="145-151" />
<code_context>
+ return {
+ root: "outputs",
+ requested_path: relativePath || ".",
+ absolute_path: targetPath,
+ entry_count: detailed.length,
+ entries: detailed,
</code_context>
<issue_to_address>
**🚨 suggestion (security):** Avoid exposing absolute filesystem paths in the outputs API payloads.
The `absolute_path` field exposes internal filesystem layout, which can be sensitive if clients are untrusted or logs are shared. Consider removing this field or returning only a path relative to `outputs/` (since `relative_path`/`requested_path` already cover client needs) and keep absolute paths internal to the server.
```suggestion
return {
root: "outputs",
requested_path: relativePath || ".",
entry_count: detailed.length,
entries: detailed,
};
```
</issue_to_address>
### Comment 2
<location path="desktop/mcp-server.mjs" line_range="16-24" />
<code_context>
+const OUTPUTS_ROOT = path.resolve(PROJECT_ROOT, "outputs");
const PYTHON_EXECUTABLE = process.env.QUANTLAB_PYTHON || "python";
const MAX_OUTPUT_CHARS = 12000;
+const BINARY_ARTIFACT_EXTENSIONS = new Set([
+ ".png",
+ ".jpg",
+ ".jpeg",
+ ".gif",
+ ".webp",
+ ".bmp",
+ ".ico",
+ ".pdf",
+]);
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Consider an allowlist of text types instead of a blocklist of binary extensions.
This blocklist only covers a few binary types; others (e.g. `.zip`, `.tar`, `.bin`, custom extensions) could still be treated as UTF‑8, causing noisy or failing reads. Since this tool is for text artifacts, consider switching to an allowlist of known text formats (e.g. `.txt`, `.md`, `.log`, `.json`) and treating everything else as binary by default.
Suggested implementation:
```javascript
const OUTPUTS_ROOT = path.resolve(PROJECT_ROOT, "outputs");
const PYTHON_EXECUTABLE = process.env.QUANTLAB_PYTHON || "python";
const MAX_OUTPUT_CHARS = 12000;
// Only these extensions will be treated as text artifacts.
// Everything else should be assumed to be binary by default.
const TEXT_ARTIFACT_EXTENSIONS = new Set([
".txt",
".md",
".markdown",
".log",
".json",
".jsonl",
".csv",
".tsv",
".yaml",
".yml",
".xml",
".html",
".htm",
]);
```
Anywhere `BINARY_ARTIFACT_EXTENSIONS` is used, you'll need to:
1. Rename the reference to `TEXT_ARTIFACT_EXTENSIONS`.
2. Invert the logic so that text is explicitly allowed and everything else is treated as binary. For example:
- If you currently do something like:
```js
const ext = path.extname(filePath).toLowerCase();
const isBinary = BINARY_ARTIFACT_EXTENSIONS.has(ext);
if (!isBinary) {
// treat as text
}
```
you should change it to:
```js
const ext = path.extname(filePath).toLowerCase();
const isText = TEXT_ARTIFACT_EXTENSIONS.has(ext);
if (isText) {
// treat as text
} else {
// treat as binary
}
```
3. Remove any assumptions elsewhere that "unknown extension => text"; instead, default unknown/absent extensions to binary.
</issue_to_address>
### Comment 3
<location path="desktop/mcp-server.mjs" line_range="123-136" />
<code_context>
+ throw new Error(`Not a directory under outputs/: ${relativePath || "."}`);
+ }
+
+ const entries = await fs.readdir(targetPath, { withFileTypes: true });
+ const detailed = [];
+ for (const entry of entries) {
+ const entryPath = path.join(targetPath, entry.name);
+ const entryStat = await fs.stat(entryPath);
+ detailed.push({
+ name: entry.name,
</code_context>
<issue_to_address>
**suggestion (performance):** Parallelize per-entry `fs.stat` calls in `listOutputs` to reduce latency on large directories.
`fs.stat` is awaited sequentially in the loop, which can be slow for large directories. Consider collecting the `fs.stat` calls into a promise array and using `await Promise.all(...)`, then rebuilding `detailed` from the resolved results to maintain the current ordering.
```suggestion
const entries = await fs.readdir(targetPath, { withFileTypes: true });
// Parallelize fs.stat calls for all entries to reduce latency on large directories.
const entryStats = await Promise.all(
entries.map((entry) => {
const entryPath = path.join(targetPath, entry.name);
return fs.stat(entryPath);
}),
);
const detailed = entries.map((entry, index) => {
const entryPath = path.join(targetPath, entry.name);
const entryStat = entryStats[index];
return {
name: entry.name,
kind: entry.isDirectory() ? "directory" : "file",
relative_path: path
.relative(PROJECT_ROOT, entryPath)
.replaceAll("\\", "/"),
size_bytes: entry.isDirectory() ? null : entryStat.size,
size_human: entry.isDirectory() ? null : formatBytes(entryStat.size),
modified_at: toIsoString(entryStat.mtime),
};
});
```
</issue_to_address>
### Comment 4
<location path="desktop/mcp-server.mjs" line_range="32-39" />
<code_context>
return `${text.slice(0, MAX_OUTPUT_CHARS)}\n...[truncated]`;
}
+function resolveOutputsPath(relativePath) {
+ const requested = relativePath || "";
+ const resolvedPath = path.resolve(OUTPUTS_ROOT, requested);
+ const relative = path.relative(OUTPUTS_ROOT, resolvedPath);
+ if (relative.startsWith("..") || path.isAbsolute(relative)) {
+ throw new Error(`Refusing to access outside outputs/: ${relativePath}`);
+ }
+ return resolvedPath;
+}
+
</code_context>
<issue_to_address>
**🚨 question (security):** Decide whether symlinks under `outputs/` should be allowed to escape the tree.
The current checks block `..` and absolute paths, but they don’t prevent a symlink under `outputs/` from pointing elsewhere in the filesystem and being followed by `fs.stat` / `fs.readFile`. If access is meant to be strictly confined to `outputs/`, consider rejecting symlinks or validating `fs.realpath(resolvedPath)` to ensure it still lies under `OUTPUTS_ROOT`.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| return { | ||
| root: "outputs", | ||
| requested_path: relativePath || ".", | ||
| absolute_path: targetPath, | ||
| entry_count: detailed.length, | ||
| entries: detailed, | ||
| }; |
There was a problem hiding this comment.
🚨 suggestion (security): Avoid exposing absolute filesystem paths in the outputs API payloads.
The absolute_path field exposes internal filesystem layout, which can be sensitive if clients are untrusted or logs are shared. Consider removing this field or returning only a path relative to outputs/ (since relative_path/requested_path already cover client needs) and keep absolute paths internal to the server.
| return { | |
| root: "outputs", | |
| requested_path: relativePath || ".", | |
| absolute_path: targetPath, | |
| entry_count: detailed.length, | |
| entries: detailed, | |
| }; | |
| return { | |
| root: "outputs", | |
| requested_path: relativePath || ".", | |
| entry_count: detailed.length, | |
| entries: detailed, | |
| }; |
| const BINARY_ARTIFACT_EXTENSIONS = new Set([ | ||
| ".png", | ||
| ".jpg", | ||
| ".jpeg", | ||
| ".gif", | ||
| ".webp", | ||
| ".bmp", | ||
| ".ico", | ||
| ".pdf", |
There was a problem hiding this comment.
suggestion (bug_risk): Consider an allowlist of text types instead of a blocklist of binary extensions.
This blocklist only covers a few binary types; others (e.g. .zip, .tar, .bin, custom extensions) could still be treated as UTF‑8, causing noisy or failing reads. Since this tool is for text artifacts, consider switching to an allowlist of known text formats (e.g. .txt, .md, .log, .json) and treating everything else as binary by default.
Suggested implementation:
const OUTPUTS_ROOT = path.resolve(PROJECT_ROOT, "outputs");
const PYTHON_EXECUTABLE = process.env.QUANTLAB_PYTHON || "python";
const MAX_OUTPUT_CHARS = 12000;
// Only these extensions will be treated as text artifacts.
// Everything else should be assumed to be binary by default.
const TEXT_ARTIFACT_EXTENSIONS = new Set([
".txt",
".md",
".markdown",
".log",
".json",
".jsonl",
".csv",
".tsv",
".yaml",
".yml",
".xml",
".html",
".htm",
]);Anywhere BINARY_ARTIFACT_EXTENSIONS is used, you'll need to:
- Rename the reference to
TEXT_ARTIFACT_EXTENSIONS. - Invert the logic so that text is explicitly allowed and everything else is treated as binary. For example:
- If you currently do something like:
you should change it to:
const ext = path.extname(filePath).toLowerCase(); const isBinary = BINARY_ARTIFACT_EXTENSIONS.has(ext); if (!isBinary) { // treat as text }
const ext = path.extname(filePath).toLowerCase(); const isText = TEXT_ARTIFACT_EXTENSIONS.has(ext); if (isText) { // treat as text } else { // treat as binary }
- If you currently do something like:
- Remove any assumptions elsewhere that "unknown extension => text"; instead, default unknown/absent extensions to binary.
| const entries = await fs.readdir(targetPath, { withFileTypes: true }); | ||
| const detailed = []; | ||
| for (const entry of entries) { | ||
| const entryPath = path.join(targetPath, entry.name); | ||
| const entryStat = await fs.stat(entryPath); | ||
| detailed.push({ | ||
| name: entry.name, | ||
| kind: entry.isDirectory() ? "directory" : "file", | ||
| relative_path: path.relative(PROJECT_ROOT, entryPath).replaceAll("\\", "/"), | ||
| size_bytes: entry.isDirectory() ? null : entryStat.size, | ||
| size_human: entry.isDirectory() ? null : formatBytes(entryStat.size), | ||
| modified_at: toIsoString(entryStat.mtime), | ||
| }); | ||
| } |
There was a problem hiding this comment.
suggestion (performance): Parallelize per-entry fs.stat calls in listOutputs to reduce latency on large directories.
fs.stat is awaited sequentially in the loop, which can be slow for large directories. Consider collecting the fs.stat calls into a promise array and using await Promise.all(...), then rebuilding detailed from the resolved results to maintain the current ordering.
| const entries = await fs.readdir(targetPath, { withFileTypes: true }); | |
| const detailed = []; | |
| for (const entry of entries) { | |
| const entryPath = path.join(targetPath, entry.name); | |
| const entryStat = await fs.stat(entryPath); | |
| detailed.push({ | |
| name: entry.name, | |
| kind: entry.isDirectory() ? "directory" : "file", | |
| relative_path: path.relative(PROJECT_ROOT, entryPath).replaceAll("\\", "/"), | |
| size_bytes: entry.isDirectory() ? null : entryStat.size, | |
| size_human: entry.isDirectory() ? null : formatBytes(entryStat.size), | |
| modified_at: toIsoString(entryStat.mtime), | |
| }); | |
| } | |
| const entries = await fs.readdir(targetPath, { withFileTypes: true }); | |
| // Parallelize fs.stat calls for all entries to reduce latency on large directories. | |
| const entryStats = await Promise.all( | |
| entries.map((entry) => { | |
| const entryPath = path.join(targetPath, entry.name); | |
| return fs.stat(entryPath); | |
| }), | |
| ); | |
| const detailed = entries.map((entry, index) => { | |
| const entryPath = path.join(targetPath, entry.name); | |
| const entryStat = entryStats[index]; | |
| return { | |
| name: entry.name, | |
| kind: entry.isDirectory() ? "directory" : "file", | |
| relative_path: path | |
| .relative(PROJECT_ROOT, entryPath) | |
| .replaceAll("\\", "/"), | |
| size_bytes: entry.isDirectory() ? null : entryStat.size, | |
| size_human: entry.isDirectory() ? null : formatBytes(entryStat.size), | |
| modified_at: toIsoString(entryStat.mtime), | |
| }; | |
| }); |
| function resolveOutputsPath(relativePath) { | ||
| const requested = relativePath || ""; | ||
| const resolvedPath = path.resolve(OUTPUTS_ROOT, requested); | ||
| const relative = path.relative(OUTPUTS_ROOT, resolvedPath); | ||
| if (relative.startsWith("..") || path.isAbsolute(relative)) { | ||
| throw new Error(`Refusing to access outside outputs/: ${relativePath}`); | ||
| } | ||
| return resolvedPath; |
There was a problem hiding this comment.
🚨 question (security): Decide whether symlinks under outputs/ should be allowed to escape the tree.
The current checks block .. and absolute paths, but they don’t prevent a symlink under outputs/ from pointing elsewhere in the filesystem and being followed by fs.stat / fs.readFile. If access is meant to be strictly confined to outputs/, consider rejecting symlinks or validating fs.realpath(resolvedPath) to ensure it still lies under OUTPUTS_ROOT.
Summary
Refine the Codex workflow prompt into a clearer two-phase plan/execute template and align the quick-reference cheatsheet with the canonical prompt.
Scope
Documentation only.
Files:
.agents/prompts/codex-master-prompt.md.agents/cursor-codex-cheatsheet.mdAGENTS.mdValidation
git diff --checkNotes
AGENTS.mdnow links to the canonical workflow prompt and cheatsheet.