Skip to content

Fix split_arguments to handle single-quoted strings and UTF-8#156

Merged
nettrash merged 11 commits intonettrash:v1.0.12from
iBarBuba:154-fix-default-string-comma-parsing
Apr 8, 2026
Merged

Fix split_arguments to handle single-quoted strings and UTF-8#156
nettrash merged 11 commits intonettrash:v1.0.12from
iBarBuba:154-fix-default-string-comma-parsing

Conversation

@iBarBuba
Copy link
Copy Markdown

@iBarBuba iBarBuba commented Apr 7, 2026

This pull request improves the parsing logic for routine argument defaults in order to correctly handle cases where a comma appears inside a single-quoted string, such as in default values like DEFAULT ','. It ensures that such commas are not incorrectly treated as argument separators, addressing Issue #154. The change is thoroughly tested and includes regression tests in the codebase and test schemas.

Parsing improvements:

  • Updated the split_arguments function in routine.rs to correctly handle single-quoted string literals, including escaped quotes, so that commas inside quoted strings are not treated as delimiters. This prevents incorrect splitting of argument lists when default values contain commas in strings.

Testing and regression coverage:

  • Added multiple unit tests for split_arguments to verify correct handling of single-quoted commas, escaped quotes, and parenthesized expressions, ensuring robust parsing and preventing regressions.
  • Added a new test procedure format_csv_line with a comma-in-string default value to both test schemas (schema_a.sql and schema_b.sql) as a regression test for Issue pgc incorrectly parses default string values in procedure parameters (comma delimiter case) #154. [1] [2]
  • Updated test documentation (README.md) to describe the new regression test and clarify the expected behavior when comparing schemas with procedures containing comma-in-string defaults. [1] [2]

The split_arguments method did not account for single-quoted string
literals, causing commas inside quoted default values (e.g. ','::character
varying) to be treated as delimiters. This produced broken argument
fragments when building CREATE OR REPLACE scripts for routines with such
defaults.

Extend the character-by-character state machine with in_quote tracking.
When inside a single-quoted string, commas are ignored as delimiters.
Escaped quotes ('') inside a string literal are consumed as a pair
without toggling the in-quote state. Dollar-quoted strings are also
tracked defensively to handle any future occurrences in default
expressions.

Add five unit tests covering: single quoted-comma default, multiple
defaults with a quoted comma, escaped single-quotes, parenthesised type
expressions (regression), and a full arguments_with_defaults round-trip.

Add format_csv_line procedure with DEFAULT ',' to both integration
fixture schemas (schema_a.sql, schema_b.sql) to validate no false diff
is emitted for this case. Update data/test/README.md accordingly.

Closes nettrash#154
Switch char-vector index to char_indices() so that s[byte_i..] slices
are always valid byte offsets, preventing panics on multi-byte UTF-8
characters before a dollar-quote tag. Also remove the unnecessary
.clone() on in_dollar_quote and add a unit test for dollar-quoted
string parsing.
Dollar-quote handling was added defensively but is not needed for the
issue nettrash#154 fix. Remove in_dollar_quote state, the dollar-tag detection
block, and the related unit test. Keep only single-quote handling and
the char_indices approach required for escaped-quote lookahead.
@iBarBuba iBarBuba requested a review from nettrash as a code owner April 7, 2026 20:31
Copilot AI review requested due to automatic review settings April 7, 2026 20:31
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request fixes routine argument-default parsing so commas inside single-quoted string literals (e.g., DEFAULT ',') are not treated as argument separators, preventing incorrect routine signatures/diffs (Issue #154).

Changes:

  • Enhanced Routine::split_arguments to respect single-quoted string literals (including doubled-quote escaping) in addition to parenthesized groups.
  • Added unit tests covering quoted commas, escaped quotes, and a full arguments_with_defaults round-trip for the DEFAULT ',' case.
  • Added an identical regression-test procedure (format_csv_line) to both test schemas and documented the expected no-diff behavior.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
app/src/dump/routine.rs Improves argument/default splitting logic and adds targeted unit/regression tests.
data/test/schema_a.sql Adds format_csv_line procedure with DEFAULT ',' to reproduce/guard Issue #154.
data/test/schema_b.sql Mirrors format_csv_line procedure so schema comparisons should emit no diff.
data/test/README.md Documents the new regression test and expected comparer behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

… split_arguments

Use a zero-allocation peekable iterator instead of pre-collecting into a
Vec for the single lookahead needed for escaped-quote detection.
Replace clone+clear with std::mem::take when pushing completed segments.
Copilot AI review requested due to automatic review settings April 7, 2026 20:50
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown
Author

@iBarBuba iBarBuba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re: [#discussion_r3047810206] — Good observation. In practice pg_get_expr(proargdefaults, 0) always deparses string literals via quote_literal(), which emits the standard ''-doubling form and never produces E'...' / \' escapes — so this splitter only needs to handle what that function can actually output. E-string handling would be necessary if parsing arbitrary SQL expressions from other sources, but that's out of scope here.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 7, 2026 21:08
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot AI review requested due to automatic review settings April 7, 2026 21:25
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

ealekseevfrompaysend and others added 2 commits April 7, 2026 23:31
Copilot AI review requested due to automatic review settings April 7, 2026 21:33
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 7, 2026 21:37
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@nettrash nettrash added this to the v1.0.12 milestone Apr 8, 2026
@nettrash nettrash merged commit 4bbb141 into nettrash:v1.0.12 Apr 8, 2026
3 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants