Skip to content

deparser bug: Incorrect string escaping #124

@rwe

Description

@rwe

Currently, deparse represents most string literals escaped only by single-quote doubling.
If special characters like backslashes occur, they're output literally, possibly resulting in invalid data. (How and whether those backslashes are interpreted is configuration-dependent).

Comment statements (COMMENT ON TABLE "foo" IS …) are, for some reason, the only string literals that are rendered with an attempt to use C-style "escape" strings (E'foo'), which consistently interpret backslash-escaping.

However, the rendering logic for those is completely wrong. It doesn't do the requisite quote doubling (or, alternatively, quote-escaping), and instead of doubling backslashes, it replaces them with the same number of them.

Effectively comments are rendered worse than unescaped, since they both still open for single-quote mangling, and additionally opened up for backslash interpolation.

That means that a validly written comment like:

COMMENT ON COLUMN "foo"."whatever" IS $$
Something blah, this data may have chars like '\n' and '\r' in it.
$$;

Gets rewritten as:

COMMENT ON COLUMN "foo"."whatever" IS E'
Something blah, this column may have chars like '\n' and '\r' in it.
';

which is completely invalid syntax since the single quotes close on the single quote before the [\, n] chars.

If you change that text to '\n' to "\n" (e.g. to pretend that the single quotes had been correctly escaped), then it outputs as:

COMMENT ON COLUMN "foo"."whatever" IS E'
Something blah, this column may have chars like "\n" and "\r" in it.
';

This parses, but not as the original comment. When re-parsed, becomes:

COMMENT ON COLUMN "foo"."whatever" IS E'
Something blah, this column may have chars like "
" and "
" in it.
';

Instead, Deparser.escape should be written something like:

class Deparser {
  // ...
  escape(str: string): string {
    // If the string contains backslashes, render an escape string using the E'…' syntax.
    // Single quotes and backslashes are escaped with a backslash.
    if (str.includes('\\')) {
      return `E'${str.replaceAll(/[\\']/g, '\\$&')}'`;
    }
    // No backslashes: double any internal single quotes.
    return `'${str.replaceAll(/'/g, '$&$&')}'`;
  }

  escapeNull(str: string | null | undefined): string {
    return str == null ? 'NULL' : this.escape(str);
  }
}

and the relevant lines in CommentStmt (

const escapeComment = (str) => {
return str.replace(/\\/g, '\\');
};
if (node.comment) {
if (/[^a-zA-Z0-9]/.test(node.comment)) {
// special chars we care about...
output.push(`E'${escapeComment(node.comment)}'`);
} else {
// find a double \\n or \\ something...
output.push(`'${node.comment}'`);
}
} else {
output.push('NULL');
}
) should be:

-const escapeComment = (str) => {
-  return str.replace(/\\/g, '\\');
-};
-
-if (node.comment) {
-  if (/[^a-zA-Z0-9]/.test(node.comment)) {
-    // special chars we care about...
-    output.push(`E'${escapeComment(node.comment)}'`);
-  } else {
-    // find a double \\n or \\ something...
-    output.push(`'${node.comment}'`);
-  }
-} else {
-  output.push('NULL');
-}
+output.push(this.escapeNull(node.comment));

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions