Skip to content

fix: preserve array type casts in policy expressions (#345)#348

Merged
tianzhou merged 2 commits intomainfrom
fix/issue-345-array-cast
Mar 8, 2026
Merged

fix: preserve array type casts in policy expressions (#345)#348
tianzhou merged 2 commits intomainfrom
fix/issue-345-array-cast

Conversation

@tianzhou
Copy link
Contributor

@tianzhou tianzhou commented Mar 8, 2026

Summary

  • pgschema dump was stripping ::text from explicit array literal casts like '{nested,key}'::text[], producing invalid SQL '{nested,key}'[]
  • The normalizeExpressionParentheses function's regex '([^']+)'::text matched ::text even when followed by [] (array notation)
  • Fixed by changing the regex to '([^']+)'::text([^[\w]|$) which requires a non-[ character or end-of-string after ::text, preserving the trailing character in the replacement

Fixes #345

Test plan

  • Added dump test TestDumpCommand_Issue345ArrayCast with policy using #>> operator and ::text[] cast
  • All existing dump tests pass (no regressions)
  • All diff tests pass (no regressions)
  • All policy integration tests pass

🤖 Generated with Claude Code

The normalizeExpressionParentheses function was stripping ::text from
array literal casts like '{nested,key}'::text[], leaving invalid SQL
'{nested,key}'[]. Fix the regex to not match ::text when followed by [].

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 8, 2026 13:39
@greptile-apps
Copy link

greptile-apps bot commented Mar 8, 2026

Greptile Summary

This PR fixes a bug where pgschema dump was stripping the type name from array literal casts (e.g., '{nested,key}'::text[] became '{nested,key}'[]) in policy expressions, producing invalid SQL. The root cause was a regex in normalizeExpressionParentheses that greedily matched ::text regardless of whether it was followed by [].

Key changes:

  • ir/normalize.go: Updated regex from '([^']+)'::text to '([^']+)'::text([^[\w]|$) with a corresponding replacement of '$1'$2, so array casts (::text[]) are never consumed while standalone ::text casts are still removed as redundant.
  • New dump test fixture testdata/dump/issue_345_array_cast/ with manifest.json, raw.sql, pgdump.sql, and pgschema.sql verifying the round-trip through the normalizer.
  • cmd/dump/dump_integration_test.go: Added TestDumpCommand_Issue345ArrayCast wiring up the new fixture.

Issues found:

  • The comment describing the new regex only mentions excluding [, but the character class [^[\w] also excludes all word characters ([0-9A-Za-z_]) — which is actually correct behaviour (prevents ::textual from being mis-matched) but is not documented.
  • normalizeDomainDefault (line 811) uses \b (word boundary) for the same purpose, but a word boundary exists between t (in text) and [, meaning '([^']+)'::text\b will still incorrectly match 'val'::text[] in domain defaults. This is the same class of bug as the one being fixed here and was not addressed in this PR.

Confidence Score: 3/5

  • The primary fix is correct and well-tested, but a parallel code path in normalizeDomainDefault has the same array-cast bug and was not addressed.
  • The regex change in normalizeExpressionParentheses correctly solves the reported issue and the new test fixture gives good regression coverage. However, normalizeDomainDefault on line 811 uses \b which does not guard against ::text[] the same way the new fix does — this is an unfixed sibling of the same bug. The fix is also only tested via integration tests against a real database, so there are no unit-level tests that verify the regex in isolation (e.g. with edge inputs like ::text at end-of-string, ::textual, ::text[][]). These factors reduce confidence to 3.
  • ir/normalize.go — both the regex comment accuracy and the normalizeDomainDefault \b pattern warrant attention before merging.

Important Files Changed

Filename Overview
ir/normalize.go Core fix applied correctly in normalizeExpressionParentheses; however, normalizeDomainDefault (line 811) retains the word-boundary (\b) approach that still incorrectly strips ::text[] from array literals in domain defaults. The regex comment on the fix is also misleading about exactly which characters are excluded.
cmd/dump/dump_integration_test.go New test TestDumpCommand_Issue345ArrayCast follows the established short-mode guard + runExactMatchTest pattern; no issues found.
testdata/dump/issue_345_array_cast/pgschema.sql Expected pgschema output correctly preserves ::text[] on the policy expression and strips the redundant 'x'::text cast to just 'x'.
testdata/dump/issue_345_array_cast/pgdump.sql pg_dump fixture accurately represents what PostgreSQL would emit for the policy with both ::text[] and 'x'::text casts; serves as the input side of the round-trip test.
testdata/dump/issue_345_array_cast/raw.sql Clean raw SQL reproducer for the issue; no problems.
testdata/dump/issue_345_array_cast/manifest.json Manifest follows the standard structure with an accurate description and source URL; no issues.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["Policy expression string\n(from pg_dump)"] --> B["normalizeExpressionParentheses()"]
    B --> B1["Step 1: Wrap in outer parens\nif missing"]
    B1 --> B2["Step 2: Remove redundant parens\naround function calls\n(loop until stable)"]
    B2 --> B3["Step 3: Strip redundant ::text casts\nRegex: '([^']+)'::text([^[\\w]|$)\n→ preserves ::text[]"]
    B3 --> C{"Was ::text\nfollowed by []?"}
    C -- "Yes → [^[\\w] won't match [" --> D["Cast preserved\n'{nested,key}'::text[]"]
    C -- "No → char or end-of-string captured as \\$2" --> E["Cast stripped\n'x'::text) → 'x')"]
    D --> F["Normalized expression\nreturned"]
    E --> F

    style D fill:#c8f7c5
    style E fill:#c8f7c5
Loading

Comments Outside Diff (1)

  1. ir/normalize.go, line 811 (link)

    normalizeDomainDefault still strips ::text[] via \b

    The \b word-boundary approach used here has the same root vulnerability as the original normalizeExpressionParentheses bug. After the t in text, a word boundary does exist before [ (word char → non-word char), so '([^']+)'::text\b will match '{a,b}'::text[], consuming only ::text and leaving [] orphaned.

    Unlike normalizeDefaultValue (which intentionally strips array casts because the column type provides context), normalizeDomainDefault is used for domain-level defaults where the array type annotation may carry semantic meaning.

    Consider switching this to the same guard used in normalizeExpressionParentheses:

Last reviewed commit: de8995e

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes pgschema dump output normalization so explicit ::text[] casts in policy expressions aren’t corrupted (e.g., preserving '{nested,key}'::text[] instead of producing invalid '{nested,key}'[]).

Changes:

  • Tightened the normalizeExpressionParentheses redundant text-cast regex to avoid matching ::text when it’s part of an array cast (::text[]).
  • Added a new dump integration fixture for issue #345 covering #>> with an explicit ::text[] cast in an RLS policy.
  • Added an integration test to run the new fixture.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
ir/normalize.go Updates expression normalization regex to preserve ::text[] casts in policy expressions.
cmd/dump/dump_integration_test.go Adds a new integration test to validate the issue #345 fixture.
testdata/dump/issue_345_array_cast/raw.sql Provides a minimal repro schema for the issue.
testdata/dump/issue_345_array_cast/pgdump.sql Captures pg_dump output used as the input for the integration test.
testdata/dump/issue_345_array_cast/pgschema.sql Stores the expected pgschema dump output, ensuring the cast is preserved.
testdata/dump/issue_345_array_cast/manifest.json Adds fixture metadata for the new test case.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

- Fix regex comment to accurately describe [^[\w] excluding both '[' and
  word characters, not just '['
- Use 'source' field instead of 'source_url' and 'notes' as array to
  match other dump fixture manifests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@tianzhou tianzhou merged commit 3619dc0 into main Mar 8, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

pgschema dump produces invalid SQL for literal array casts

2 participants