fix(migration): quote identifiers without breaking expression indexes #11
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6530ec964f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
        .chars()
        .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_');
    if needs_quoting {
        format!("\"{}\"", name.replace('"', "\"\""))
Use MySQL-safe identifier quoting
quote_ident always emits double-quoted identifiers, but this helper is also used when generating MySQL statements (for example in DROP INDEX ... ON ... and MODIFY COLUMN). In default MySQL/MariaDB SQL mode, "identifier" is not an identifier quote (unless ANSI_QUOTES is enabled), so migrations for names that trigger quoting (e.g., CamelCase or spaces) now produce invalid SQL where previously unquoted forms worked. Please make quoting dialect-aware (backticks for MySQL) or avoid forcing ANSI-style quotes for that dialect.
Reviewed commit: a61eb474e1
    let upper = indexdef.to_uppercase();
    let on_pos = match upper.find(" ON ") {
        Some(p) => p,
        None => return Vec::new(),
    };

    let after_on = &indexdef[on_pos..];
Avoid slicing with indices from uppercased SQL text
parse_index_columns computes on_pos from indexdef.to_uppercase() and then reuses that byte offset to slice the original indexdef. For quoted identifiers containing non-ASCII characters whose uppercase mapping changes byte length (for example ı, ſ, or similar), this offset can point to the wrong byte boundary and make &indexdef[on_pos..] panic or parse the wrong substring, causing schema loading to fail on otherwise valid PostgreSQL index definitions.
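The mismatch described above can be avoided by searching the original bytes case-insensitively rather than searching an uppercased copy. The following is a minimal sketch of that idea, not the PR's code: `find_ascii_ci` is a hypothetical helper name, and it assumes the needle is pure ASCII.

```rust
/// Hypothetical helper (not from the PR): byte offset of the first
/// ASCII-case-insensitive match of `needle` in `haystack`.
/// Assumes `needle` is ASCII; a match then starts on an ASCII byte,
/// which is always a valid char boundary in the original string.
fn find_ascii_ci(haystack: &str, needle: &str) -> Option<usize> {
    if needle.is_empty() {
        return Some(0); // `windows(0)` would panic, so handle this up front
    }
    haystack
        .as_bytes()
        .windows(needle.len())
        .position(|w| w.eq_ignore_ascii_case(needle.as_bytes()))
}
```

For an input such as `CREATE INDEX "ıdx" ON t (a)`, the offset returned here indexes the original string directly, whereas the offset found in the uppercased copy is one byte short because `ı` (two bytes in UTF-8) uppercases to `I` (one byte).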
Reviewed commit: 0cac54da7d
    match dialect {
        SqlDialect::MySql => format!("`{}`", name.replace('`', "``")),
        _ => format!("\"{}\"", name.replace('"', "\"\"")),
Handle SqlFile dialect when quoting identifiers
In quote_ident, the fallback branch (_) applies PostgreSQL/SQLite-style double-quote escaping to every non-lowercase identifier, but generate_migration is also called with SqlDialect::SqlFile when both inputs are .sql files. That means MySQL-oriented SQL-file comparisons now produce statements like "UserAccounts" and "idx_UserAccounts_email", which are invalid in default MySQL/MariaDB mode unless ANSI_QUOTES is enabled. This is a regression from the previous unquoted output path for SQL-file workflows.
    let after_on = &indexdef[on_pos..];
    let open = match after_on.find('(') {
        Some(p) => on_pos + p,
Locate index column list after table token, not first '('
parse_index_columns takes the first ( after ON as the start of the index column list. If a quoted table name itself contains ( (legal identifier text in PostgreSQL), this points into the table name, so the depth scan never balances and the function returns an empty column list. That causes indexes to load with no columns and can later generate invalid SQL like CREATE INDEX ... ON ...(); or spurious index diffs.
…ismatch on non-ASCII identifiers
Replace to_uppercase().find() with as_bytes().windows().eq_ignore_ascii_case() in parse_index_columns and parse_default. The old approach computed byte offsets from the uppercased copy and reused them to slice the original string, which panics or parses incorrectly when non-ASCII characters change byte length under uppercasing (e.g. ı → I). Also adds a MySQL backtick quoting integration test.
0cac54d to aa5e0ea
Reviewed commit: aa5e0ea87c
    // Index columns are raw clauses from pg_indexes.indexdef and may contain
    // expressions (lower(email)), sort orders (created_at DESC), or
    // already-quoted identifiers. They must NOT be wrapped with quote_ident.
    let cols = idx.columns.join(", ");
Quote index column identifiers for MySQL output
create_index_sql now emits idx.columns verbatim for every dialect, but MySQL index columns loaded from information_schema.statistics are plain identifier strings (not raw SQL clauses). When a MySQL column name needs escaping (for example reserved words like select or names containing spaces), generated SQL becomes CREATE INDEX ... (select) and fails to execute; before this change each column was passed through quote_ident. Restrict raw passthrough to expression-style PostgreSQL clauses, or keep identifier quoting for MySQL/SQLite paths.
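The suggestion above amounts to choosing the rendering per dialect. A rough sketch of that split follows, under stated assumptions: `Dialect`, `quote_always`, and `render_index_columns` are hypothetical names for illustration (the project's own `SqlDialect` has variants not shown in this thread), and the quoting here is simplified to always quote.

```rust
/// Hypothetical stand-in for the project's dialect type; only the variants
/// needed for this sketch are listed.
#[derive(Clone, Copy)]
enum Dialect {
    Postgres,
    MySql,
}

/// Simplified quoting: always quotes, doubling any embedded quote character.
fn quote_always(name: &str, dialect: Dialect) -> String {
    match dialect {
        Dialect::MySql => format!("`{}`", name.replace('`', "``")),
        Dialect::Postgres => format!("\"{}\"", name.replace('"', "\"\"")),
    }
}

/// Renders the column list of a CREATE INDEX statement.
/// Assumption: PostgreSQL columns are raw SQL clauses and pass through
/// untouched, while MySQL columns are plain names and get quoted.
fn render_index_columns(columns: &[String], dialect: Dialect) -> String {
    let rendered: Vec<String> = match dialect {
        Dialect::Postgres => columns.to_vec(),
        Dialect::MySql => columns.iter().map(|c| quote_always(c, dialect)).collect(),
    };
    rendered.join(", ")
}
```

With this split, a PostgreSQL clause like `lower(email)` is emitted as-is, while a MySQL column named `select` is emitted in backticks.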
    match bytes[pos] {
        b'(' => depth += 1,
        b')' => depth -= 1,
        _ => {}
Ignore quoted text when matching index-list parentheses
The parenthesis-depth scan treats every ( and ) byte as structural, including those inside quoted identifiers or string literals, so valid PostgreSQL definitions like ... USING btree ("a)", b) or expressions containing literal parentheses can terminate early and return truncated columns. This yields incorrect index metadata and malformed follow-up migration SQL. The parser needs quote-aware scanning while finding the closing ) for the column list.
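A quote-aware scan of the kind requested here could look like the sketch below. It is an illustration under assumptions, not the PR's implementation: `find_closing_paren` is a hypothetical name, and only double-quoted identifiers and single-quoted string literals are handled (no dollar quoting or backslash escapes).

```rust
/// Hypothetical helper (not from the PR): given the byte index of an opening
/// '(' in `s`, returns the byte index of its matching ')'. Parentheses inside
/// "double-quoted" identifiers or 'single-quoted' literals are ignored.
fn find_closing_paren(s: &str, open: usize) -> Option<usize> {
    let bytes = s.as_bytes();
    if bytes.get(open) != Some(&b'(') {
        return None;
    }
    let mut depth = 0usize;
    let mut quote: Option<u8> = None; // the quote byte we are currently inside
    for (i, &b) in bytes.iter().enumerate().skip(open) {
        match quote {
            // Inside quotes: only the matching quote byte matters. A doubled
            // quote ("" or '') closes and immediately reopens, which is fine.
            Some(q) => {
                if b == q {
                    quote = None;
                }
            }
            None => match b {
                b'"' | b'\'' => quote = Some(b),
                b'(' => depth += 1,
                b')' => {
                    depth -= 1; // depth >= 1 here because the scan starts at '('
                    if depth == 0 {
                        return Some(i);
                    }
                }
                _ => {}
            },
        }
    }
    None // unbalanced input
}
```

Applied to `... USING btree ("a)", b)`, the scan skips the `)` inside `"a)"` and stops at the final `)`, so the column list comes back whole.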
…ote-aware parser
- Restore conditional quoting in quote_ident: only quote identifiers that need it (uppercase, spaces, special chars), fixing SqlFile dialect regression
- Quote index columns for MySQL/SQLite (plain identifiers from information_schema) while keeping PostgreSQL columns raw (SQL expressions from pg_get_indexdef)
- Make parse_index_columns quote-aware: skip double-quoted content when finding opening paren and matching close paren, so table names or column names containing parentheses don't break the parser
Reviewed commit: be72c44c5b
    let needs_quoting = name.is_empty()
        || !name
            .chars()
            .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_');
Quote reserved and digit-leading identifiers
The new needs_quoting heuristic treats any [a-z0-9_] name as safe, so identifiers like user (reserved keyword) or 123table (invalid unquoted start in PostgreSQL) are emitted unquoted and produce invalid migration SQL. This is a regression from the previous behavior, which always quoted identifiers and therefore handled these valid-but-quoted catalog names correctly when generating CREATE/ALTER/DROP statements.
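One possible shape for a stricter check is sketched below. It is an assumption-laden illustration, not the PR's code: the standalone `needs_quoting` function and the `RESERVED` list are hypothetical, and the list is a tiny sample rather than any dialect's real keyword set.

```rust
/// Illustrative sample only; a real list would be dialect-specific and
/// much longer.
const RESERVED: &[&str] = &["user", "select", "table", "order", "group"];

/// Hypothetical variant of the heuristic: a name is left unquoted only if it
/// is non-empty, consists of [a-z0-9_], does not start with a digit, and is
/// not in the reserved-word sample above.
fn needs_quoting(name: &str) -> bool {
    let simple_chars = !name.is_empty()
        && name
            .chars()
            .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_');
    let leading_digit = name
        .as_bytes()
        .first()
        .map_or(false, |b| b.is_ascii_digit());
    let reserved = RESERVED.iter().any(|kw| *kw == name);
    !simple_chars || leading_digit || reserved
}
```

Under this sketch `user` and `123table` are quoted, while `email_address` still passes through unquoted.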
    let on_pos = match indexdef
        .as_bytes()
        .windows(4)
        .position(|w| w.eq_ignore_ascii_case(b" ON "))
    {
Ignore quoted index names when finding ON clause
parse_index_columns takes the first ASCII " ON " anywhere in indexdef without quote awareness, so a valid quoted index name containing that token (for example CREATE INDEX "idx ON weird" ON ...) makes on_pos point inside the name. The subsequent scan then fails to find the real column-list opener and returns an empty column list, which can cascade into incorrect index diffs and malformed generated SQL.
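A quote-aware search for the ON keyword might be sketched as follows. This is illustrative only: `find_on_outside_quotes` is a hypothetical name, and the sketch tracks double-quoted identifiers only.

```rust
/// Hypothetical helper (not from the PR): byte offset of the first " ON "
/// (ASCII case-insensitive) that lies outside double-quoted identifiers.
fn find_on_outside_quotes(indexdef: &str) -> Option<usize> {
    let bytes = indexdef.as_bytes();
    let needle: &[u8] = b" ON ";
    let mut in_quotes = false;
    for (i, &b) in bytes.iter().enumerate() {
        if b == b'"' {
            // A doubled quote ("") toggles twice, leaving the state unchanged.
            in_quotes = !in_quotes;
        } else if !in_quotes
            && bytes.len() - i >= needle.len()
            && bytes[i..i + needle.len()].eq_ignore_ascii_case(needle)
        {
            return Some(i);
        }
    }
    None
}
```

For `CREATE INDEX "idx ON weird" ON t (a)`, the match inside the quoted index name is skipped and the returned offset points at the real ` ON t (a)` clause.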
Summary
- quote_ident(name, dialect) that wraps identifiers in double quotes (PostgreSQL/SQLite) or backticks (MySQL) only when needed; simple lowercase names pass through unquoted for readability
- Index columns are raw clauses from pg_indexes.indexdef (e.g. lower(email), created_at DESC, "MixedCase") and must be emitted verbatim
- parse_index_columns bug: rfind('(') matched inner parens in expressions like lower(email); it now walks from ON table(...) with proper nesting
Subsumes
Closes #9 — our implementation covers the same quoting surface but avoids the critical bug of quoting index column clauses, which breaks expression and sorted indexes.
Test plan
- cargo test: 108 tests pass (83 unit + 22 CLI + 3 config)
- cargo clippy -- -D warnings: clean
- cargo fmt --check: clean
- cargo run -- tests/fixtures/schema_a.sql --schema tests/fixtures/schema_expr_indexes.sql produces valid SQL