chore(mcp): hint agents to write human-readable formatted SQL #61105
  columns = serializers.SerializerMethodField(read_only=True)
  query = QueryDefinitionField(
- help_text='HogQL query definition as a JSON object with a "query" key containing the SQL string and a "kind" key (always "HogQLQuery"). Example: {"kind": "HogQLQuery", "query": "SELECT * FROM events LIMIT 100"}',
+ help_text='HogQL query definition as a JSON object with a "query" key containing the SQL string and a "kind" key (always "HogQLQuery"). Format the SQL string multi-line with indentation and inline `--` comments for non-obvious logic — the SQL editor renders it verbatim, so avoid minified single-line SQL. Example: {"kind": "HogQLQuery", "query": "SELECT * FROM events LIMIT 100"}',
The example at the end of the `help_text` shows a single-line query (`"SELECT * FROM events LIMIT 100"`), which directly contradicts the formatting guidance just given. An agent reading this help text will see the inline example and may default to that pattern for simple queries. A multi-line example would reinforce the guidance rather than undermine it, and this string is the single source of truth that propagates into all the generated files and the MCP tool schemas.
- help_text='HogQL query definition as a JSON object with a "query" key containing the SQL string and a "kind" key (always "HogQLQuery"). Format the SQL string multi-line with indentation and inline `--` comments for non-obvious logic — the SQL editor renders it verbatim, so avoid minified single-line SQL. Example: {"kind": "HogQLQuery", "query": "SELECT * FROM events LIMIT 100"}',
+ help_text='HogQL query definition as a JSON object with a "query" key containing the SQL string and a "kind" key (always "HogQLQuery"). Format the SQL string multi-line with indentation and inline `--` comments for non-obvious logic — the SQL editor renders it verbatim, so avoid minified single-line SQL. Example: {"kind": "HogQLQuery", "query": "SELECT\\n event,\\n count() AS cnt\\nFROM events\\nLIMIT 100"}',
Rule Used: Default templates for code should be clear and use... (source)
Learned From
PostHog/posthog#32714
Good catch, fixed in e154fd9. The example now shows a multi-line query (`SELECT\n event,\n count() AS cnt\nFROM events\nGROUP BY event\nLIMIT 100`, JSON-escaped), so it reinforces the guidance instead of contradicting it. I also regenerated the downstream generated files and MCP schema snapshots.
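The reviewer's point could also be guarded mechanically. Below is a minimal sketch, not code from this PR: it assumes the help text keeps an `Example: {...}` suffix, and `HELP_TEXT`, `extract_example`, and `example_is_multiline` are hypothetical names introduced for illustration (the string is an abridged copy, not the exact one in `saved_query.py`).

```python
import json

# Abridged, hypothetical copy of the help text; the real string lives in saved_query.py.
HELP_TEXT = (
    'HogQL query definition as a JSON object with a "query" key containing the SQL string '
    'and a "kind" key (always "HogQLQuery"). Example: '
    '{"kind": "HogQLQuery", "query": "SELECT\\n event,\\n count() AS cnt\\nFROM events\\nGROUP BY event\\nLIMIT 100"}'
)


def extract_example(help_text: str) -> dict:
    """Parse the JSON object that follows the 'Example: ' marker."""
    _, _, tail = help_text.partition("Example: ")
    # raw_decode parses one JSON value and ignores any trailing text.
    obj, _ = json.JSONDecoder().raw_decode(tail)
    return obj


def example_is_multiline(help_text: str) -> bool:
    """True when the example query spans more than one line."""
    query = extract_example(help_text)["query"]
    return len(query.splitlines()) > 1


# Print the decoded query to see roughly what a verbatim renderer would show.
print(extract_example(HELP_TEXT)["query"])
```

A check like `example_is_multiline(HELP_TEXT)` could sit in a unit test so the example cannot drift back to a single line without someone noticing.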
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6d5d18b9a7
Query snapshots: Backend query snapshots updated
Changes: 5 snapshots (5 modified, 0 added, 0 deleted)
Size Change: 0 B
Total Size: 81 MB
Problem
When an agent writes HogQL via the MCP tools and the result is saved as a data warehouse view (or echoed back to a user), the SQL is often a single minified line. The SQL editor renders the stored query string verbatim — there is no auto-formatter — so anyone who opens the view later sees an unreadable wall of text. We want to steer agents toward formatted, human-readable SQL at the point where they write it.
Changes
Adds a short formatting hint to the two MCP surfaces where agents write SQL:
- `execute-sql`: a brief "Format SQL for readability" note in the hand-authored tool prompt template (`services/mcp/src/templates/execute-sql-prompt.md`).
- `view-create` / `view-update` / `view-materialize` / `view-run`: extended the `query` field `help_text` on the data warehouse saved-query serializer (`products/data_warehouse/backend/api/saved_query.py`). This is the shared input field for all the view-* tools, so the guidance lands exactly where the SQL is written. The change propagates through `hogli build:openapi` into the generated frontend types and MCP tool schemas (regenerated files included).

The guidance: write SQL multi-line with indentation, one column/CTE per line, with inline `--` comments for non-obvious logic, and avoid minified single-line SQL.

No logic changes; description/help-text only.
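For illustration, here is a rough sketch of what a caller following this guidance might send. Only the `{"kind": "HogQLQuery", "query": ...}` shape comes from the help text; `build_query_definition` is a hypothetical helper, and the SQL is an example rather than a query from this PR.

```python
import json
import textwrap


def build_query_definition(sql: str) -> dict:
    """Wrap a SQL string in the {"kind", "query"} shape the help_text describes."""
    # dedent + strip lets the SQL be written as an indented triple-quoted block.
    return {"kind": "HogQLQuery", "query": textwrap.dedent(sql).strip()}


definition = build_query_definition(
    """
    SELECT
        event,
        count() AS cnt  -- events per type
    FROM events
    GROUP BY event
    LIMIT 100
    """
)

# json.dumps escapes the line breaks as \n, so the stored string keeps its layout.
payload = json.dumps(definition)
print(payload)
```

Because the editor renders the stored string verbatim, the line breaks and indentation that survive the JSON round trip are what a human would later see.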
How did you test this code?
I'm an agent (Claude Code). I did not perform manual UI testing. Automated/local verification:
- Ran `hogli build:openapi` successfully; confirmed the regenerated diff contains only the `query` help_text string update across the generated files (no unrelated drift).
- `lint-staged` checks (ruff format/lint, `ty check`, JS/markdown formatters) passed on the staged files.
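The PR does not say how the "no unrelated drift" check was done. One possible way to automate it is sketched below, assuming the diff is available as unified-diff text; `unrelated_changes`, the marker phrase, and the sample diff (including its file path) are all made up for illustration.

```python
def unrelated_changes(diff_text: str, marker: str) -> list[str]:
    """Return added/removed diff lines that do not contain `marker`."""
    stray = []
    for line in diff_text.splitlines():
        # Skip file headers such as "--- a/path" and "+++ b/path".
        # Limitation: a removed line whose content starts with "--" is also skipped.
        if line.startswith(("+++", "---")):
            continue
        if line.startswith(("+", "-")) and marker not in line:
            stray.append(line)
    return stray


# Made-up sample diff; the path and the description text are placeholders.
SAMPLE_DIFF = """\
--- a/generated/schema.json
+++ b/generated/schema.json
@@ -1,3 +1,3 @@
 {
-  "description": "HogQL query definition ... old text"
+  "description": "HogQL query definition ... new text"
 }
"""

print(unrelated_changes(SAMPLE_DIFF, "HogQL query definition"))
```

An empty list would mean every changed line mentions the marker phrase; anything else is a candidate for unrelated drift and worth a manual look.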
🤖 Agent context
Authored with Claude Code (Opus 4.8) at the request of @andrewm4894.
Context: while extending a data warehouse "signals reports" view layer, the user noticed saved-view SQL renders as an ugly single line in the PostHog SQL editor. Root cause is that the editor stores/renders the query string verbatim with no formatter, and tools (and agents) tend to emit minified SQL.
Decisions:
- Put the view guidance on the `query` field `help_text` rather than each view tool's top-level description: it is the single shared input for create/update/materialize/run, so one edit covers all of them and sits exactly where the SQL is written. This is more surgical and avoids redundant per-tool prose.
- The generated files were regenerated with `hogli build:openapi`; they were not hand-edited.