DatabaseChangeLogLock race condition exists if two nodes both try to create the table on ORACLE #1036

davidcyp · 2020-03-16T20:05:35Z

Description

See the description for CORE-2596 (https://liquibase.jira.com/browse/CORE-2596). The issue described there still occurs when working with an Oracle database.

The fix for CORE-2596 checks whether the exception message contains the keyword exists, but Oracle responds with the following error message:

_ORA-00955: name is already used by an existing object_

Because the keyword exists does not appear in Oracle's message, the code that handles the race condition is never executed, and the problem described in CORE-2596 remains.

The solution is to adapt the if clause in the code that was added to fix the referenced issue:

} catch (DatabaseException e) {
    if ((e.getMessage() != null) && e.getMessage().contains("exists")) {
        // hit a race condition where the table got created by another node.
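
A minimal sketch of how the check could be broadened, assuming the driver's SQLException is reachable through the cause chain of the caught exception. The class `LockTableRace` and the method `isAlreadyExists` are hypothetical names for illustration and are not part of Liquibase; 955 is the vendor code that corresponds to ORA-00955.

```java
import java.sql.SQLException;
import java.util.Locale;

public final class LockTableRace {
    // Vendor code for ORA-00955: name is already used by an existing object
    private static final int ORA_NAME_ALREADY_USED = 955;

    private LockTableRace() {}

    /**
     * Returns true if the error, or any of its causes, looks like
     * "the object already exists". Hypothetical helper, not Liquibase API.
     */
    public static boolean isAlreadyExists(Throwable error) {
        for (Throwable t = error; t != null; t = t.getCause()) {
            // Prefer the vendor error code when a SQLException is available.
            if (t instanceof SQLException
                    && ((SQLException) t).getErrorCode() == ORA_NAME_ALREADY_USED) {
                return true;
            }
            // Fall back to the message: the existing "exists" keyword check,
            // plus the Oracle error number, which does not contain that keyword.
            String message = t.getMessage();
            if (message != null) {
                String lower = message.toLowerCase(Locale.ROOT);
                if (lower.contains("exists") || lower.contains("ora-00955")) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

Checking the error code first avoids depending on the message text, which may differ between driver versions or locales; the string check is kept only as a fallback.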

To Reproduce
See the linked ticket. The application still ends up in a race condition.

Expected behavior
See the linked ticket.

Screenshots
N/A

Additional context
N/A

SteveDonie · 2020-03-17T15:17:49Z

Thanks for reporting this! We will review and prioritize. If you are able to submit a PR, that will help expedite a fix.

davidcyp · 2020-03-17T16:06:07Z

We implemented a small fix yesterday (by checking the contents of the ORA error message, though I'm not sure that is the way to go). Unfortunately, we ran into another problem after that.

Context:

We have multiple Spring Boot modules which we start in parallel.
All these processes try to execute the same Liquibase scripts.

Some of the processes execute the SQL statements shown in the log below only a few milliseconds apart, which results in the error at the end of the log.

Therefore we changed our approach and now wait for the first process to start completely before launching the other processes in parallel.

Sidenote: We know there are alternative and better approaches to execute Liquibase without these issues, but the situation at the customer is as described.

om.zaxxer.hikari.pool.HikariProxyConnection.setRemarksReporting(boolean)

21:15:17.935 [] [main] INFO  liquibase.executor.jvm.JdbcExecutor  - SELECT COUNT(*) FROM E1S1_ADM.DATABASECHANGELOGLOCK

21:15:17.955 [] [main] INFO  liquibase.executor.jvm.JdbcExecutor  - CREATE TABLE E1S1_ADM.DATABASECHANGELOGLOCK (ID INTEGER NOT NULL, LOCKED NUMBER(1) NOT NULL, LOCKGRANTED TIMESTAMP, LOCKEDBY VARCHAR2(255), CONSTRAINT PK_DATABASECHANGELOGLOCK PRIMARY KEY (ID))

21:15:18.155 [] [main] INFO  liquibase.executor.jvm.JdbcExecutor  - SELECT COUNT(*) FROM E1S1_ADM.DATABASECHANGELOGLOCK

21:15:18.190 [] [main] INFO  liquibase.executor.jvm.JdbcExecutor  - DELETE FROM E1S1_ADM.DATABASECHANGELOGLOCK

21:15:18.208 [] [main] INFO  liquibase.executor.jvm.JdbcExecutor  - INSERT INTO E1S1_ADM.DATABASECHANGELOGLOCK (ID, LOCKED) VALUES (1, 0)

21:15:18.236 [] [main] WARN  igServletWebServerApplicationContext - Exception encountered during context initialization - cancelling refresh attempt: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'liquibase' defined in class path resource [org/springframework/boot/autoconfigure/liquibase/LiquibaseAutoConfiguration$LiquibaseConfiguration.class]: Invocation of init method failed; nested exception is liquibase.exception.LockException: liquibase.exception.DatabaseException: ORA-00001: unique constraint (E1S1_ADM.PK_DATABASECHANGELOGLOCK) violated

[Failed SQL: INSERT INTO E1S1_ADM.DATABASECHANGELOGLOCK (ID, LOCKED) VALUES (1, 0)]

21:15:18.244 [] [main] INFO  org.eclipse.jetty.server.session     - node0 Stopped scavenging
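
In the log above, the INSERT of the initial lock row fails with ORA-00001 because another process inserted the same row a few milliseconds earlier. One possible way to tolerate this is to treat an integrity-constraint violation on that specific INSERT as "another node already initialized the row". The sketch below is an illustration only: `LockRowInit`, `SqlAction`, `isIntegrityViolation`, and `insertLockRow` are hypothetical names, not Liquibase API, and it assumes the driver reports SQLState class 23 (integrity constraint violation) for ORA-00001.

```java
import java.sql.SQLException;
import java.sql.SQLIntegrityConstraintViolationException;

public final class LockRowInit {
    private LockRowInit() {}

    /** A statement execution that may fail with a SQLException. */
    @FunctionalInterface
    public interface SqlAction {
        void run() throws SQLException;
    }

    /** True if the exception signals an integrity-constraint violation. */
    public static boolean isIntegrityViolation(SQLException e) {
        if (e instanceof SQLIntegrityConstraintViolationException) {
            return true;
        }
        // SQLState class "23" is the standard class for integrity violations.
        String state = e.getSQLState();
        return state != null && state.startsWith("23");
    }

    /**
     * Runs the INSERT of the initial lock row.
     * Returns true if this node inserted the row, false if the row
     * was already there because another node won the race.
     */
    public static boolean insertLockRow(SqlAction insert) throws SQLException {
        try {
            insert.run();
            return true;
        } catch (SQLException e) {
            if (isIntegrityViolation(e)) {
                return false; // another node created the row first
            }
            throw e; // anything else is a real failure
        }
    }
}
```

Class 23 also covers NOT NULL and foreign-key violations, so this check is only reasonable for a statement like the fixed `VALUES (1, 0)` insert, where a duplicate key is the only plausible integrity error.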

nealeu · 2021-01-17T18:29:00Z

@davidcyp I'm assuming that the different modules have different Liquibase changelog sets, otherwise you might not need Liquibase to run for all of them.
Certainly you should only need to insert a small delay rather than having those other modules wait for the whole of the first microservice to start.

Another thought on a solution would be to have a Liquibase option to indicate that Liquibase on the other processes should wait for the lock table to exist rather than try to create it.
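
To illustrate the idea, here is a rough sketch of what "wait for the lock table to exist" could look like. No such Liquibase option is implied to exist; `LockTableWaiter` and `waitUntil` are hypothetical names, and the `tableExists` check is supplied by the caller (for example, a query against the lock table).

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

public final class LockTableWaiter {
    private LockTableWaiter() {}

    /**
     * Polls tableExists until it returns true or the timeout elapses.
     * Returns true if the table was seen, false on timeout or interruption.
     */
    public static boolean waitUntil(BooleanSupplier tableExists,
                                    Duration timeout,
                                    Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (tableExists.getAsBoolean()) {
                return true;
            }
            // Overflow-safe comparison against the deadline.
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
    }
}
```

A secondary process would call something like `waitUntil(check, Duration.ofSeconds(30), Duration.ofMillis(200))` and only then try to acquire the lock, instead of issuing its own CREATE TABLE.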

davidcyp · 2021-01-17T20:39:06Z

@nealeu unfortunately not. These processes all operate on the same tables. We fixed it by writing a small orchestration tool for our processes.

ps: I know that designing our codebase differently would also be a fix, but see my initial sidenote ;)

nealeu · 2021-01-18T15:10:25Z

@davidcyp. Surely in that case, you only need to run Liquibase on the one that you are running first - the others are just spending CPU cycles parsing Liquibase changelogs to find that the original process beat them to it.

DiogoParrinha · 2021-11-16T12:20:04Z

This also affects PostgreSQL @molivasdat (3.10.2)

oey · 2023-09-05T07:10:37Z

Discovered this race condition when upgrading to Liquibase 4.23.1.
We've a custom Docker image with some DB tools and Liquibase for managing database upgrades in Kubernetes.

Recently, when upgrading from 4.21.1 to 4.23.1, one of our automated tests for concurrent use failed with ORA-00955: name is already used by an existing object while creating the DATABASECHANGELOG table, because another container had already created it.

As k8s may start several containers, we need to handle concurrent use. We use Liquibase-sessionlock to handle concurrent use in the migration scripts.

A quick fix from our side will be to create the DATABASECHANGELOG table with its current set of columns before we call Liquibase.

davidcyp added the Status:Discovery label Mar 16, 2020

datical-jenkins added Status:Discovery and removed Status:Discovery labels Mar 25, 2020

molivasdat added DBAll DBH2 DBOracle ImpactHigh RiskMedium Severity4 TypeEnhancement and removed RiskMedium labels May 18, 2020

molivasdat added the hacktoberfest label Sep 26, 2020

zholub mentioned this issue Dec 11, 2020

Race condition and possible lock bypass if two nodes run on clean schema #1584

Closed

kataggart removed the StatusDiscovery label May 23, 2022

kataggart added DATABASECHANGELOGLOCK scaling IntegrationSpringboot labels Jul 18, 2022

kataggart added this to To Do in Conditioning++ via automation Jul 18, 2022
