docs: agent friendly sql reference #685

Merged
ULookup merged 4 commits into matrixorigin:main from ULookup:docs/agent-friendly-sql-reference
May 12, 2026

@ULookup ULookup commented May 8, 2026

What type of PR is this?

  • Enhancement
  • Displaying
  • Typo
  • Doc Request

Which issue(s) this PR fixes:

issue #

What this PR does / why we need it:

Extends the Agent-friendly documentation rollout (originally landed in PR #676) so every page under docs/MatrixOne/Reference/SQL-Reference/** (129 pages) and docs/MatrixOne/Reference/Functions-and-Operators/** (163 pages) now carries the FULL Agent-friendly frontmatter shape — the same template used by the v3.0.11 follow-up pages such as Datetime/addtime.md and mo-tools/mo_service.md:

---
title: "..."
doc_type: reference
mysql_compat: full | partial | mo_only | unknown
differs_from_mysql: []
mo_only: []
since: unknown
last_updated: 2026-05-08
llms_summary: "one-line summary surfaced to llms.txt / llms-full.txt"
---
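
As a rough illustration of the shape above, a page's frontmatter could be checked along these lines. This is a minimal sketch, not the repository's `scripts/check-compat-frontmatter.js`: the names `readFrontmatter` and `checkFrontmatter` are hypothetical, and the parser assumes flat `key: value` lines only.

```javascript
// Hypothetical checker for the frontmatter shape shown in this PR.
// Assumption: every field is a single-line `key: value` pair.
const REQUIRED_KEYS = [
  "title", "doc_type", "mysql_compat", "differs_from_mysql",
  "mo_only", "since", "last_updated", "llms_summary",
];
const COMPAT_VALUES = new Set(["full", "partial", "mo_only", "unknown"]);

// Read the block between the leading `---` lines into a plain object.
// Indented lines (block-style YAML lists) are skipped, not parsed.
function readFrontmatter(markdown) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(markdown);
  if (!match) return null;
  const fields = {};
  for (const line of match[1].split(/\r?\n/)) {
    const sep = line.indexOf(":");
    if (sep <= 0 || /^\s/.test(line)) continue;
    fields[line.slice(0, sep).trim()] = line.slice(sep + 1).trim();
  }
  return fields;
}

// Return a list of human-readable problems; an empty list means the page passes.
function checkFrontmatter(markdown) {
  const fields = readFrontmatter(markdown);
  if (!fields) return ["missing frontmatter block"];
  const problems = REQUIRED_KEYS
    .filter((key) => !Object.prototype.hasOwnProperty.call(fields, key))
    .map((key) => `missing key: ${key}`);
  const compat = fields.mysql_compat;
  if (compat !== undefined && !COMPAT_VALUES.has(compat)) {
    problems.push(`invalid mysql_compat: ${compat}`);
  }
  return problems;
}
```

A real checker would likely use a YAML parser so that block-style lists under `differs_from_mysql` and `mo_only` are validated too.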

Each upgraded page also picks up a > summary blockquote under its H1, so the same one-liner is visible in both the rendered HTML and the raw .md mirror served to agents.
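
The blockquote injection can be pictured with a small sketch. `injectSummary` is a hypothetical name, not the upgrader's actual function, and the sketch assumes the first line starting with `# ` is the page's H1 (no `# ` comment lines in the frontmatter or in code fences above it).

```javascript
// Hypothetical helper: place `> summary` directly under the first H1.
// Re-running with the same summary leaves the page unchanged.
function injectSummary(markdown, summary) {
  const lines = markdown.split("\n");
  const h1 = lines.findIndex((line) => /^# \S/.test(line));
  if (h1 === -1) return markdown; // no H1: leave the page untouched

  // Skip blank lines to see what currently follows the H1.
  let next = h1 + 1;
  while (next < lines.length && lines[next].trim() === "") next++;

  const quote = `> ${summary}`;
  if (next < lines.length && lines[next] === quote) return markdown;

  // Keep a blank line on both sides so the body text does not become a
  // lazy continuation of the blockquote.
  const insert = ["", quote];
  if (h1 + 1 < lines.length && lines[h1 + 1].trim() !== "") insert.push("");
  lines.splice(h1 + 1, 0, ...insert);
  return lines.join("\n");
}
```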

Every mysql_compat value was hand-verified against MySQL 8.0

All 292 pages were read in full and compared against MySQL 8.0 behavior. The automated classifier's output was used only as a starting point and corrected wherever it disagreed with the actual docs. Notable reclassifications:

SQL-Reference

  • INTERSECT, FULL JOIN → mo_only (MySQL 8.0 supports neither)
  • OUTER JOIN overview → partial (it describes FULL OUTER JOIN)
  • LAST_INSERT_ID → partial (multi-row INSERT returns last value, not first)
  • SHOW TABLES → partial (result column is name, not Tables_in_<db>)
  • EXPLAIN PREPARED → mo_only (uses EXPLAIN FORCE EXECUTE …)
  • SET ROLE, CREATE ROLE, DROP ROLE, DROP USER → partial
  • Data-Manipulation-Language/information-functions/current_role → partial
  • SQL-Type.md → mo_only (it's an index page, not a MySQL-equivalent concept)
  • CREATE FULLTEXT INDEX → partial (TAE-specific implementation)

Functions-and-Operators

  • SINH → mo_only (MySQL 8.0 has no hyperbolic functions)
  • LOAD_FILE → mo_only (parameter is DATALINK, not a file path)
  • Datetime to-days / to-seconds / date / date-add / date-sub / date-format / extract / year / dayofyear / from-unixtime / unix-timestamp / curdate → partial (each body declares its own MySQL divergence)
  • JSON_SET → full (MySQL 8.0 does support it; the previous mo_only was wrong)
  • system-ops/current_user, current_role → partial (MySQL has both names with a different output shape)
  • AES_ENCRYPT / AES_DECRYPT → kept partial, but differs_from_mysql now lists the concrete block-mode and KDF gaps

Expanded differs_from_mysql / mo_only entries on CREATE USER, GRANT, REVOKE, SELECT, CREATE VIEW, CREATE TABLE, and others, covering previously-missing items like 'user'@'host' handling, ON ACCOUNT / ON DATABASE grant targets, NULLS FIRST | LAST, ALGORITHM = {UNDEFINED | MERGE | TEMPTABLE}, auto_increment step, and partition-pruning scope.

Artefacts

  • docs/MatrixOne/Reference/mysql-compatibility-matrix.md regenerated from the new frontmatter — 129 rows, full=30 partial=43 mo_only=56.
  • llms.txt / llms-full.txt / per-page .md mirrors refresh automatically at build time via the existing mkdocs hook.
  • New helper scripts/fo-compat-classification.js captures the per-file classification table used to seed Functions-and-Operators (source of truth: docs/MatrixOne/Overview/feature/mysql-compatibility.md cross-checked against MySQL 8.0).
  • Existing upgrader scripts/upgrade-sql-reference-frontmatter.js generalised to accept --target={sql-reference,functions-operators} and --regen-summary; its markdown-stripping logic fixed so identifiers like JSON_SET / STR_TO_DATE survive round-tripping.
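
To make the identifier-preservation fix concrete, here is one way such a stripper could avoid eating underscores inside names. It is a sketch under assumptions, not the upgrader's actual code: `stripInlineMarkdown` is a hypothetical name, and the rule "an underscore is emphasis only when it sits at a word boundary" is an assumption about how the fix could work.

```javascript
// Hypothetical inline-markdown stripper. Underscore emphasis is removed only
// when the delimiters sit at word boundaries, so JSON_SET and STR_TO_DATE
// keep their underscores.
function stripInlineMarkdown(text) {
  return text
    // `code` -> code
    .replace(/`([^`]*)`/g, "$1")
    // [label](target) -> label
    .replace(/\[([^\]]*)\]\([^)]*\)/g, "$1")
    // **bold** -> bold (delimiters must hug non-space text)
    .replace(/\*\*(\S(?:[^*]*\S)?)\*\*/g, "$1")
    // *italic* -> italic (a bare " * " such as SELECT * is left alone)
    .replace(/\*(\S(?:[^*]*\S)?)\*/g, "$1")
    // _italic_ / __bold__ -> text, but never inside an identifier
    .replace(/(?<![A-Za-z0-9_])(_{1,2})(?=\S)(.+?)(?<=\S)\1(?![A-Za-z0-9_])/g, "$2");
}

console.log(stripInlineMarkdown("Use `JSON_SET` or **STR_TO_DATE** with _care_."));
```

Because code spans are unwrapped first, any asterisks they contained are then seen by the later rules; a production version would likely tokenize instead of chaining regular expressions.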

ULookup added 4 commits May 8, 2026 02:34
…tter

Brings every page under `docs/MatrixOne/Reference/SQL-Reference/**` (129
pages) to the Agent-friendly FULL frontmatter shape established by the
v3.0.11 follow-up pages (e.g. `addtime.md`, `mo_service.md`).

Per-page changes:
- Add `doc_type: reference`, `since: unknown`, `last_updated: 2026-05-08`
- Add `llms_summary: "..."` derived from the page body (prefers the first
  paragraph under `## Description` / Overview / Introduction, falls back
  to the first non-heading / non-blockquote paragraph; stripped of inline
  markdown and trimmed to the first sentence or 280 chars)
- Normalize `differs_from_mysql` / `mo_only` to YAML lists (`[]` when
  absent), keeping existing hand-authored entries intact
- Quote `title` and `llms_summary` consistently for alignment with the
  existing FULL pages
- Inject a `> <summary>` blockquote under the H1 so the summary is
  visible in both the rendered page and the raw markdown mirror
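
The summary derivation described above could look roughly like this. It is a sketch, not the upgrader's code: the function names are hypothetical, the input is assumed to be the page body with frontmatter already removed and headings separated by blank lines, and "first sentence or 280 chars" is read here as "first sentence, capped at 280 characters". Inline-markdown stripping is left out.

```javascript
const PREFERRED_HEADING = /^##\s+(Description|Overview|Introduction)\s*$/i;
const MAX_SUMMARY_LENGTH = 280;

// Pick the paragraph the summary is derived from.
function pickParagraph(body) {
  const blocks = body.split(/\n\s*\n/).map((b) => b.trim()).filter(Boolean);
  const isBodyText = (block) => !/^[#>]/.test(block); // not a heading or blockquote

  // Preferred: the paragraph right after ## Description / Overview / Introduction.
  const headingIndex = blocks.findIndex((block) => PREFERRED_HEADING.test(block));
  if (headingIndex !== -1) {
    const next = blocks[headingIndex + 1];
    if (next && isBodyText(next)) return next;
  }
  // Fallback: the first paragraph that is neither heading nor blockquote.
  return blocks.find(isBodyText) || "";
}

// Keep the first sentence, or the whole text if no sentence end is found,
// and cap the result at MAX_SUMMARY_LENGTH characters.
function trimSummary(paragraph) {
  const flat = paragraph.replace(/\s+/g, " ").trim();
  const sentence = /^.*?[.!?](?=\s|$)/.exec(flat);
  const picked = sentence ? sentence[0] : flat;
  return picked.length > MAX_SUMMARY_LENGTH
    ? picked.slice(0, MAX_SUMMARY_LENGTH).trimEnd()
    : picked;
}

function deriveSummary(body) {
  return trimSummary(pickParagraph(body));
}
```

The sentence rule only stops at punctuation followed by whitespace, so "MySQL 8.0" is not split, though abbreviations such as "e.g." would still end the sentence early.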

Also:
- scripts/upgrade-sql-reference-frontmatter.js: idempotent upgrader with
  `--dry` (report only) and `--restyle` (re-emit existing FULL frontmatter
  without regenerating the summary) modes
- docs/MatrixOne/Reference/mysql-compatibility-matrix.md: regenerated
  from the new frontmatter via `scripts/generate-compat-matrix.js`

Verified:
- `node scripts/check-compat-frontmatter.js` passes on all 129 pages
  (distribution unchanged: full=38, partial=35, mo_only=56, unknown=0)
- `mkdocs build` succeeds; `site/llms.txt` and `site/llms-full.txt`
  regenerate cleanly with the new summaries
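
For reference, the distribution check amounts to a tally like the one below. This is an illustrative sketch with a hypothetical `tallyCompat` helper, not the repository's checker; the input array is synthesized from the counts quoted above rather than read from the docs tree.

```javascript
// Hypothetical tally of mysql_compat values across pages.
function tallyCompat(values) {
  const counts = { full: 0, partial: 0, mo_only: 0, unknown: 0 };
  for (const value of values) {
    if (!Object.prototype.hasOwnProperty.call(counts, value)) {
      throw new Error(`unexpected mysql_compat: ${value}`);
    }
    counts[value] += 1;
  }
  return counts;
}

// Synthetic input reproducing the distribution reported in this commit.
const reported = tallyCompat([
  ...Array(38).fill("full"),
  ...Array(35).fill("partial"),
  ...Array(56).fill("mo_only"),
]);
const totalPages = Object.values(reported).reduce((sum, n) => sum + n, 0);
console.log(reported, totalPages); // the three buckets add up to the 129 pages
```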
…ompat

Extends the Agent-friendly FULL frontmatter rollout to the
Functions-and-Operators corpus (152 of its 163 pages were not yet FULL)
and corrects mysql_compat classifications across both corpora based on
hand-verification against MySQL 8.0 behavior rather than trusting the
automated classifier.

Functions-and-Operators changes:
- All 152 non-FULL pages upgraded to FULL frontmatter (title, doc_type,
  mysql_compat, differs_from_mysql, mo_only, since, last_updated,
  llms_summary) following the addtime.md / mo_service.md template.
- Summary blockquote injected under each H1, consistent with the
  SQL-Reference rollout.
- mysql_compat seeded from a per-page classification table
  (scripts/fo-compat-classification.js), cross-checked against
  docs/MatrixOne/Overview/feature/mysql-compatibility.md.

Hand-verified compat corrections (every page read in full and compared
against MySQL 8.0 documentation):

SQL-Reference
- INTERSECT, FULL JOIN: full -> mo_only (MySQL 8.0 has neither)
- OUTER JOIN overview: full -> partial (lists FULL OUTER JOIN)
- LAST_INSERT_ID: full -> partial (multi-row INSERT returns last, not first)
- SHOW TABLES: full -> partial (column named 'name', not 'Tables_in_<db>')
- EXPLAIN PREPARED: partial -> mo_only (EXPLAIN FORCE EXECUTE is MO syntax)
- SET ROLE, CREATE ROLE, DROP ROLE, DROP USER: mo_only/full -> partial
  (MySQL 8.0 has equivalents with different argument shapes)
- current_role (DML info-func page): mo_only -> partial
- SQL-Type.md: full -> mo_only (index page, not a MySQL concept)
- CREATE FULLTEXT INDEX: full -> partial (TAE-specific implementation)
- Expanded differs_from_mysql and mo_only entries on CREATE USER, GRANT,
  REVOKE, SELECT, CREATE VIEW, CREATE TABLE with previously-missing
  items (user@host, ON ACCOUNT/DATABASE, NULLS FIRST/LAST, ALGORITHM=,
  auto_increment step, partition pruning scope).

Functions-and-Operators
- SINH: full -> mo_only (MySQL 8.0 has no hyperbolic functions)
- Datetime/to-days, to-seconds, date, date-add, date-sub, date-format,
  extract, year, dayofyear, from-unixtime, unix-timestamp, curdate:
  full -> partial (date-literal / two-digit-year / curdate+int differences
  called out in their own bodies)
- Json/json_set: mo_only -> full (MySQL 8.0 does support it)
- Other/load_file: full -> mo_only (parameter is a DATALINK, not a path)
- system-ops/current_user, current_role: mo_only -> partial (MySQL has
  CURRENT_USER/CURRENT_ROLE, output format differs)
- String/aes_encrypt, aes_decrypt: partial with newly-populated differs
  (only aes-128-ecb / aes-256-cbc supported; no KDF arguments)

Also:
- scripts/fo-compat-classification.js: source of truth for the F&O
  seeding pass, with per-file override table and per-directory defaults.
- scripts/upgrade-sql-reference-frontmatter.js: generalized to accept
  --target={sql-reference,functions-operators} and --regen-summary,
  fixed title/summary escape round-tripping, stopped stripping
  underscores inside identifiers, and preferred the pre-H2 intro
  paragraph over Description section when both exist.
- docs/MatrixOne/Reference/mysql-compatibility-matrix.md regenerated
  from the new frontmatter (129 rows; full=30, partial=43, mo_only=56).
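
A lookup with a per-file override table and per-directory defaults might be structured as follows. This is a sketch only: the real layout of scripts/fo-compat-classification.js is not shown in this PR, `classify` is a hypothetical name, the directory defaults are placeholders, and only the three override values are taken from the corrections listed above.

```javascript
// Placeholder defaults, for illustration only.
const DIRECTORY_DEFAULTS = { Json: "full", String: "full" };

// Override values taken from the corrections in this commit message.
const FILE_OVERRIDES = {
  "Json/json_set": "full",
  "Other/load_file": "mo_only",
  "String/aes_encrypt": "partial",
};

const own = (table, key) => Object.prototype.hasOwnProperty.call(table, key);

// Resolve a page path (relative to Functions-and-Operators/) to a compat value:
// per-file override first, then the directory default, else "unknown".
function classify(relativePath) {
  const key = relativePath.replace(/\.md$/, "");
  if (own(FILE_OVERRIDES, key)) return FILE_OVERRIDES[key];
  const directory = key.split("/")[0];
  return own(DIRECTORY_DEFAULTS, directory) ? DIRECTORY_DEFAULTS[directory] : "unknown";
}
```

Falling back to "unknown" rather than a guessed value keeps unclassified pages visible to a later check that requires zero unknowns.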

Verified:
- node scripts/check-compat-frontmatter.js passes on all 129
  SQL-Reference pages; 0 unknown.
- All 163 Functions-and-Operators pages carry llms_summary; 0 unknown.

The GRANT / REVOKE frontmatter referred to 'ON ACCOUNT *' and
'ON DATABASE *'. When the compatibility-matrix generator joined those
bullets onto a single table cell (<br/>-separated), the two trailing
'*'s looked like unbalanced emphasis markers to avtodev/markdown-lint
(MD037 "no-space-in-emphasis").

Wrap each 'GRANT ... ON ACCOUNT *' / 'ON DATABASE *' bullet in backticks
so the asterisks become code-span content instead of emphasis markup,
and regenerate mysql-compatibility-matrix.md.
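
This commit applies the fix in the frontmatter entries themselves. The same idea, expressed as a helper on the generator side, would look roughly like the sketch below; `protectAsterisks` and `toTableCell` are hypothetical names and are not part of scripts/generate-compat-matrix.js as far as this PR shows.

```javascript
// Wrap an entry in backticks when it contains a bare '*', so the asterisk
// becomes code-span content instead of a possible emphasis marker.
// Entries that already contain a backtick are left as they are.
function protectAsterisks(entry) {
  if (!entry.includes("*") || entry.includes("`")) return entry;
  return "`" + entry + "`";
}

// Join several entries into one table cell, separated by <br/>.
function toTableCell(entries) {
  return entries.map(protectAsterisks).join("<br/>");
}

console.log(toTableCell(["GRANT ... ON ACCOUNT *", "GRANT ... ON DATABASE *"]));
```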

Verified: docker run avtodev/markdown-lint:v1 ./docs/MatrixOne is clean.

The parameter table linked to cast() as
../../../Reference/Operators/operators/cast-functions-and-operators/cast/
which has two problems:

1. One leading ../ too many (the file is already under Reference/,
   so the path resolved to docs/MatrixOne/Reference/Reference/...)
2. Trailing slash without `.md`, which markdown-link-check resolves
   to a 400 response

Fix to the correct relative path:
../../Operators/operators/cast-functions-and-operators/cast.md
matching the already-working link in Other/load_file.md.

This is the only "Status: 400" internal dead link the CI workflow
treats as blocking; the remaining artwork 404s (cosine_distance.png,
normalize_l2.png, l2_distance.png) are external warnings and are
left alone in this PR since those images never existed in the
matrixorigin/artwork repo and fixing them is out of scope.
@ULookup ULookup merged commit 3cd81ea into matrixorigin:main May 12, 2026
3 of 4 checks passed