Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upper case option affect table fields names #156

Closed
royKarseboom opened this issue Jul 4, 2021 · 10 comments
Closed

Upper case option affect table fields names #156

royKarseboom opened this issue Jul 4, 2021 · 10 comments
Labels

Comments

@royKarseboom
Copy link

Hi, in the case were the field name is a also a keyword in the language chosen, the field is being capitilized.
for example:
running formatter on the following with -u -l postgresql options:
CREATE TABLE users(id INT PRIMARY KEY, name TEXT NOT NULL);
yields:
CREATE TABLE users(id INT PRIMARY KEY, NAME TEXT NOT NULL);
as opposed to just leaving it the same way.
tnx!

@nene nene added the bug label May 3, 2022
@nene
Copy link
Collaborator

nene commented Jun 13, 2022

The issue here has multiple levels:

  1. NAME is not a reserved keyword (just a plain keyword) and can therefore be used as an identifier in most contexts. However SQL Formatter currently treats all keywords the same (whether reserved or not).
  2. PostgreSQL is particularly complicated:

    Even reserved key words are not completely reserved in PostgreSQL, but can be used as column labels (for example, SELECT 55 AS CHECK, even though CHECK is a reserved key word).

@karlhorky
Copy link
Contributor

karlhorky commented Nov 26, 2023

Just ran into this now too, actors.name was lowercase before 😲

CREATE TABLE
  actors (
    id INTEGER PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
    NAME VARCHAR(80) NOT NULL
  );

@nene what are your thoughts on extending keywordCase to accept an object?

  1. accepting an object for the option keywordCase, for granular configuration:
const options = {
  language: 'postgresql',
  keywordCase: {
    upper: true, // true: default case
    lower: ['name', 'varchar', 'integer'],
    preserve: ['in'],
  },
}
  1. an option keywordCaseIgnore (array), which would allow for more flexible customization (would not allow for same configurability, I prefer option 1)
const options = {
  language: 'postgresql',
  keywordCase: 'upper',
  keywordCaseIgnore: [
    // Avoid changing case of `name` fields in tables
    'name',
    // Avoid changing case of data types
    'varchar',
    'integer',
    // ...
  ],
}

Edit: Just realized I submitted a similar feature request last month and forgot about it, would also probably resolve this one:

@karlhorky
Copy link
Contributor

karlhorky commented Nov 26, 2023

I also just saw the following PR by @chrjorgensen for a new option identifierCase: 'preserve' | 'upper' | 'lower', which was also released in sql-formatter@14.0.0

I haven't yet upgraded to 14.0.0, but maybe this also resolves the issue with field names being capitalized in CREATE TABLE?

Edit: Just upgraded to 14.0.0 and no it doesn't resolve the issue with fields being uppercased to NAME, because it runs before the other keywordCase transformation

@karlhorky
Copy link
Contributor

karlhorky commented Nov 27, 2023

Workaround

Until it's possible to configure the keywords in a more granular way, one workaround is to patch sql-formatter.

Use patch-package to patch node_modules/sql-formatter/lib/formatter/ExpressionFormatter.js (and node_modules/sql-formatter/dist/sql-formatter.min.cjs if you're using CommonJS) as follows:

  showNonTabularKw(node) {
+    if (['NAME'].includes(node.text)) {
+      return node.text.toLowerCase();
+    }
+
    switch (this.cfg.keywordCase) {
      case 'preserve':
        return equalizeWhitespace(node.raw);

      case 'upper':
        return node.text;

      case 'lower':
        return node.text.toLowerCase();
    }
  }

@nene
Copy link
Collaborator

nene commented Nov 27, 2023

@karlhorky I'd rather not add a separate way of tweaking which keywords get capitalized. There already exists a way to create a custom dialect. One can simply extend an existing dialect and change the list of keywords. See docs for the "dialect" option.

@karlhorky
Copy link
Contributor

There already exists a way to create a custom dialect. One can simply extend an existing dialect and change the list of keywords. See docs for the "dialect" option.

Interesting! Maybe our organization can create a custom PostgreSQL dialect that will work for most cases... 🤔

I guess I would still like to be able to have lowercase data types, but maybe the combination dataTypeCase - #653 and dialects would cover all cases we're currently interested in.

@karlhorky
Copy link
Contributor

Opened a draft PR for experimental datatypeCase option, still need to add the main functionality and tests:

I'm basing it off the identifierCase PR:

@karlhorky
Copy link
Contributor

karlhorky commented Dec 2, 2023

Custom Dialect solution

Maybe our organization can create a custom PostgreSQL dialect that will work for most cases... 🤔

Using the dialect option with a custom dialect (experimental) I made a modified PostgreSQL dialect with the 'LABEL', 'LOCATION', 'NAME' and 'RELEASE' keywords excluded:

import { formatDialect, postgresql } from 'sql-formatter';

// Avoid identifiers being uppercased
// https://github.com/sql-formatter-org/sql-formatter/issues/156#issuecomment-1826846259
const postgresWithoutNameKeyword = {
  ...postgresql,
  tokenizerOptions: {
    ...postgresql.tokenizerOptions,
    reservedKeywords: postgresql.tokenizerOptions.reservedKeywords.filter(
      (reservedKeyword) =>
        ![
          // Uncommon usage with SECURITY LABEL
          // https://www.postgresql.org/docs/current/sql-security-label.html
          'LABEL',
          // Uncommon usage with CREATE TABLESPACE
          // CREATE TABLESPACE dbspace LOCATION '/data/dbs';
          // https://www.postgresql.org/docs/current/sql-createtablespace.html
          'LOCATION',
          // Data type for use in internal system catalogs
          // https://www.postgresql.org/docs/current/datatype-character.html#DATATYPE-CHARACTER-SPECIAL-TABLE
          'NAME',
          // Uncommon usage with RELEASE SAVEPOINT
          // https://www.postgresql.org/docs/current/sql-release-savepoint.html
          'RELEASE',
        ].includes(reservedKeyword),
    ),
  },
};

formatDialect('cReAtE tAbLe actors ( name VARCHAR(80) NOT NULL )', {
  dialect: postgresWithoutNameKeyword,
  keywordCase: 'upper'
});

/*
  Output:

  CREATE TABLE
    actors (
      name VARCHAR(80) NOT NULL
    )
*/

The name identifier stays lowercase 🎉

@nene
Copy link
Collaborator

nene commented Dec 7, 2023

I did a sort-of basic fix for this, removing the following from list of keywords:

  • label
  • location
  • name
  • names
  • release
  • type
  • types

There are quite a bit of other keywords in this list that we probably should drop, but at least for start the most problematic ones should be gone now.

@karlhorky
Copy link
Contributor

Oh nice, that's great!

I'll remove my custom dialect once d34557b is published in a release 🙌

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants