Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix!: EXPOSED-150 Auto-quoted column names change case across databases #1841

Merged
merged 10 commits into from
Sep 22, 2023

Conversation

bog-walk
Copy link
Member

@bog-walk bog-walk commented Aug 24, 2023

Currently, if a reserved keyword is used as a table or column name, Exposed first changes the case based on the proper database case, then quotes the name.

This means that a table object will create different-cased quoted identifiers in different databases, which is problematic when using migration scripts to enable use of a table object across databases:

object Tester : Table("Public") {
    val col = integer("integer")
}

// PostgreSQL -> CREATE TABLE IF NOT EXISTS "public" ("integer" INT NOT NULL)
// H2         -> CREATE TABLE IF NOT EXISTS "PUBLIC" ("INTEGER" INT NOT NULL)
// MySQL      -> CREATE TABLE IF NOT EXISTS `Public` (`integer` INT NOT NULL)

This fix ensures that a reserved keyword used as an identifier is only ever quoted, so the exact case provided by the user will be retained.

Index Names:
Because of this fix, any auto-generated index names would retain column case, causing strings like TABLE_name_IDX.
To ensure that these index names do not have breaking changes, now inProperCase() is only applied to the final buildString, rather than each separate string component.
This is what is already done for auto-generated foreign-key constraint names.

BREAKING CHANGES:

  • H2 & Oracle: This fix primarily affects these databases as they both support folding identifiers to upper case.
    • e.g. If a column was called "name", it would have previously been sent to the DB as "NAME", but will now remain as "name".
  • PostgreSQL: Supports folding identifiers to lower case, so this fix only affects keyword identifiers that are provided in upper-/mixed-case.
    • e.g. If a column was called "NAME", it would have previously been sent to the DB as "name", but will now remain as "NAME".

OPT-IN FLAG:
As per user input (in issue 683, issue 1658, and related EXPOSED 157), a temporary flag has been introduced, preserveKeywordCasing. It is available in the DatabaseConfig block and defaults to false. Users can @OptIn and set this value to true to implement the fix.

A warning will also be logged for every identifier that is found to be a keyword when a table is created, if the user has not opt-in. The warning indicates that this fix may become the default with the implication that changes should be made to table identifiers if the new behavior clashes with any migration schema.

Internal functions that use the check are annotated as @Deprecated to prevent further internal use.


BREAKING CHANGES in test suites (only with Oracle):

  • Test table objects that both use reserved keywords as table names (e.g. "public") and have an auto-increment column (any IdTable subclass) were failing with error ORA-04043: Object "PUBLIC" does not exist.
    • This is a documented Oracle JDBC driver bug that only happens when such a table is used with an insert statement and connection.prepareStatement(String, Array<String>) and auto-generated keys are fetched.
    • Solution A: Update the Oracle JDBC driver version from the current 12.2.0.1 to at minimum 21.1.0.0. This should ideally be investigated in the future, but cannot currently be implemented. If updated, all tests that use batchInsert() or statement.addBatch() would fail with the error Operation not allowed: DML Returning cannot be batched.
    • Solution B (implemented): Change all IdTable test tables to no longer use reserved keywords as table names.

@bog-walk bog-walk force-pushed the bog-walk/fix-identifier-issues branch from 63304cc to e294751 Compare August 24, 2023 20:25
@bog-walk bog-walk marked this pull request as ready for review August 24, 2023 21:26
@bog-walk bog-walk requested review from e5l and joc-a August 24, 2023 21:26
@bog-walk bog-walk marked this pull request as draft September 1, 2023 13:28
@bog-walk bog-walk force-pushed the bog-walk/fix-identifier-issues branch from e294751 to d10524e Compare September 15, 2023 10:06
@bog-walk bog-walk marked this pull request as ready for review September 15, 2023 16:03
@bog-walk bog-walk requested review from e5l and joc-a September 15, 2023 16:21
Column and table names that are reserved keywords are automatically quoted before
being used in SQL statements. Databases that support upper case folding (H2, Oracle)
quote and upper case the identifiers, so attempting to use the tables across different
databases fails.

This fix ensures any reserved keywords used as identifiers are only quoted, so they
now retain whatever case the user provides them in, but it will be equivalent
across databases.

This broke some tests that checked for index name, as names like TABLE_column_IDX,
were being created in those databases. This was avoided by pulling inProperCase() out
of the buildString until the end of the name creation.

BREAKING CHANGE: [H2, Oracle] Reserved words will be treated as quoted identifiers
and no longer have their case automatically changed to upper case.
Change metadata test to include proper keyword column name in H2.

Change table name of tests failing in Oracle due to using a quoted identifier.
These tests fail because of a JDBC driver bug specific to quoted table identifiers
beign used with an insert statement and preparedStatement(). Upgrading the driver
version to at least 21.1.0.0 resolves the issue but causes other failures (namely
batch insert / DML Returning errors). This will be investigated for the future,
but for now the tables are no longer named using reserved keywords.
Add custom name to missed table that uses reserved keyword.
Add IdentifierManager cache for identities that have been checked against the
keywords list.
Add global flag, preserveKeywordCasing, to DatabaseConfig to allow temporary
opt-out from new behavior.

Log warnings if table is created using identifiers that are in keywords list.
Change table with keyword name.

Restrict opt-out test to H2 for proper Database connection and configuration.
Fix failing tests that don't register non-default flag when run out of isolation.
Switch preserveKeywordCasing flag to have @OptInt and be false by default.

Adjust internal and private functions accordingly.

Adjust unit tests to test for when flag is opted-in.

Logged warnings only happen if keyword is identified and flag is not opted-in.
@bog-walk bog-walk force-pushed the bog-walk/fix-identifier-issues branch from a810b09 to 60eb722 Compare September 21, 2023 19:58
Revert changed unit tests to use old behavior since flag set to false as default.
Comment on lines +139 to +145
@RequiresOptIn(
message = "This API is experimental and the behavior defined by setting this value to 'true' may become the default " +
"in future releases. Its usage must be marked with '@OptIn(org.jetbrains.exposed.sql.ExperimentalKeywordApi::class)' " +
"or '@org.jetbrains.exposed.sql.ExperimentalKeywordApi'."
)
@Target(AnnotationTarget.PROPERTY)
annotation class ExperimentalKeywordApi
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

preserveKeywordCasing is now set to false by default and requires opt-in to set to true.

I tried to follow naming and message conventions from Kotlin and Compose Multiplatform repos.

Comment on lines +1221 to +1227
private fun String.warnIfUnflaggedKeyword() {
val warn = TransactionManager.currentOrNull()?.db?.identifierManager?.isUnflaggedKeyword(this) == true
if (warn) {
exposedLogger.warn(
"Keyword identifier used: '$this'. Case sensitivity may not be kept when quoted by default: '${inProperCase()}'. " +
"To keep case sensitivity, opt-in and set 'preserveKeywordCasing' to true in DatabaseConfig block."
)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To reduce possibly overloading users' logs, a warning is now only logged during table creation for a keyword identifier iff the user has not opted-in for the flag.

@bog-walk bog-walk requested a review from e5l September 22, 2023 00:46
@bog-walk bog-walk merged commit 17a9dc8 into main Sep 22, 2023
5 checks passed
@bog-walk bog-walk deleted the bog-walk/fix-identifier-issues branch September 22, 2023 10:13
saral pushed a commit to saral/Exposed that referenced this pull request Oct 3, 2023
…es (JetBrains#1841)

* fix!: EXPOSED-150 Auto-quoted column names change case across databases

Column and table names that are reserved keywords are automatically quoted before
being used in SQL statements. Databases that support upper case folding (H2, Oracle)
quote and upper case the identifiers, so attempting to use the tables across different
databases fails.

This fix ensures any reserved keywords used as identifiers are only quoted, so they
now retain whatever case the user provides them in, but it will be equivalent
across databases.

This broke some tests that checked for index name, as names like TABLE_column_IDX,
were being created in those databases. This was avoided by pulling inProperCase() out
of the buildString until the end of the name creation.

BREAKING CHANGE: [H2, Oracle] Reserved words will be treated as quoted identifiers
and no longer have their case automatically changed to upper case.

* fix!: EXPOSED-150 Auto-quoted column names change case across databases

Change table name of tests failing in Oracle due to using a quoted identifier.
These tests fail because of a JDBC driver bug specific to quoted table identifiers
beign used with an insert statement and preparedStatement(). Upgrading the driver
version to at least 21.1.0.0 resolves the issue but causes other failures (namely
batch insert / DML Returning errors). This will be investigated for the future,
but for now the tables are no longer named using reserved keywords.

* fix!: EXPOSED-150 Auto-quoted column names change case across databases

Add IdentifierManager cache for identities that have been checked against the
keywords list.

* fix!: EXPOSED-150 Auto-quoted column names change case across databases

Add global flag, preserveKeywordCasing, to DatabaseConfig to allow 
opt-out from new behavior.

Log warnings if table is created using identifiers that are in keywords list.

Restrict opt-in test to H2 for proper Database connection and configuration.

* fix!: EXPOSED-150 Auto-quoted column names change case across databases

Switch preserveKeywordCasing flag to have @OptIn and be false by default.

Adjust internal and private functions accordingly.

Adjust unit tests to test for when flag is opted-in.

Logged warnings only happen if keyword is identified and flag is not opted-in.

* fix!: EXPOSED-150 Auto-quoted column names change case across databases

Revert changed unit tests to use old behavior since flag set to false as default.
@FidasL
Copy link
Contributor

FidasL commented Feb 19, 2024

Hello @bog-walk @e5l do you plan to get rid of preserveKeywordCasing flag in the future releases?

@bog-walk
Copy link
Member Author

Hi @FidasL Eventually deprecating preserveKeywordCasing is the possible plan.

If you have a use case for quoted table identifiers not being case-sensitive by default, or a reason for the flag to remain a permanent (non-experimental) fixture in DatabaseConfig, please don't hesitate to open a YouTrack ticket with the relevant details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Inconsistent column names for different databases (ie. H2 & PostgreSQL)
4 participants