Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DatabaseChangeLogLock race condition exists if two nodes both try to create the table on ORACLE #1036

Open
davidcyp opened this issue Mar 16, 2020 · 7 comments

Comments

@davidcyp
Copy link

davidcyp commented Mar 16, 2020

Description

See the description for CORE-2596(https://liquibase.jira.com/browse/CORE-2596). The issue mentioned in CORE-2596 still arises when working with Oracle database.

While the fix for CORE-2596 checks for the message to contain the keyword exists, Oracle responds with the following error message:

_ora-00955 name is already used by an existing object _. Given that the keyword exists is not present in Oracle's description the code to resolve the deadlock is not executed, resulting in the described problem.

The solution is to adapt the if clause in the code implemented to fix the referenced issue.

} catch (DatabaseException e) {
                if ((e.getMessage() != null) && e.getMessage().contains("exists")) {	                if ((e.getMessage() != null) && e.getMessage().contains("exists")) {
                    //hit a race condition where the table got created by another node.	                    //hit a race condition where the table got created by another

To Reproduce
See the linked ticket. The application still ends up in a race condition.

Expected behavior
See the linked ticket.

Screenshots
N/A
Additional context
N/A

@SteveDonie
Copy link
Contributor

Thanks for reporting this! We will review and prioritize. If you are able to submit a PR, that will help expedite a fix.

@davidcyp
Copy link
Author

We implemented a small fix yesterday(actually by checking for the contents of the ORA error message, though I'm not sure if that is the way to go). Unfortunately we ran into another problem thereafter.

Context:

  • We have multiple spring boot modules which we start in parallel.
  • All these processes try to execute the same liquibase scripts.

Now some of the processes try to execute following SQL statements, but the processes do this with an offset of only a few milliseconds, resulting in the following error below.

Therefore we changed our approach and now wait for the 1st process to be completely started before launching the other processes in parallel.

Sidenote: We know there are alternative and better approaches to execute liquibase without these issues, but the situation at the customer is as described.

om.zaxxer.hikari.pool.HikariProxyConnection.setRemarksReporting(boolean)

21:15:17.935 [] [main] INFO  liquibase.executor.jvm.JdbcExecutor  - SELECT COUNT(*) FROM E1S1_ADM.DATABASECHANGELOGLOCK

21:15:17.955 [] [main] INFO  liquibase.executor.jvm.JdbcExecutor  - CREATE TABLE E1S1_ADM.DATABASECHANGELOGLOCK (ID INTEGER NOT NULL, LOCKED NUMBER(1) NOT NULL, LOCKGRANTED TIMESTAMP, LOCKEDBY VARCHAR2(255), CONSTRAINT PK_DATABASECHANGELOGLOCK PRIMARY KEY (ID))

21:15:18.155 [] [main] INFO  liquibase.executor.jvm.JdbcExecutor  - SELECT COUNT(*) FROM E1S1_ADM.DATABASECHANGELOGLOCK

21:15:18.190 [] [main] INFO  liquibase.executor.jvm.JdbcExecutor  - DELETE FROM E1S1_ADM.DATABASECHANGELOGLOCK

21:15:18.208 [] [main] INFO  liquibase.executor.jvm.JdbcExecutor  - INSERT INTO E1S1_ADM.DATABASECHANGELOGLOCK (ID, LOCKED) VALUES (1, 0)

21:15:18.236 [] [main] WARN  igServletWebServerApplicationContext - Exception encountered during context initialization - cancelling refresh attempt: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'liquibase' defined in class path resource [org/springframework/boot/autoconfigure/liquibase/LiquibaseAutoConfiguration$LiquibaseConfiguration.class]: Invocation of init method failed; nested exception is liquibase.exception.LockException: liquibase.exception.DatabaseException: ORA-00001: unique constraint (E1S1_ADM.PK_DATABASECHANGELOGLOCK) violated

[Failed SQL: INSERT INTO E1S1_ADM.DATABASECHANGELOGLOCK (ID, LOCKED) VALUES (1, 0)]

21:15:18.244 [] [main] INFO  org.eclipse.jetty.server.session     - node0 Stopped scavenging

@molivasdat molivasdat added DBAll DBH2 DBOracle ImpactHigh RiskMedium Changes that require more testing or that affect many different code paths. Severity4 TypeEnhancement and removed RiskMedium Changes that require more testing or that affect many different code paths. labels May 18, 2020
@molivasdat molivasdat added the hacktoberfest a month-long celebration of open-source software and Developers contribute by completing PRs label Sep 26, 2020
@nealeu
Copy link

nealeu commented Jan 17, 2021

@davidcyp I'm assuming that the different modules have different Liquibase changelog sets, otherwise you might not need Liquibase to run for all of them.
Certainly you should only need to insert a small delay rather than having those other modules wait for the whole of the first microservice to start.

Another thought on a solution would be to have a Liquibase option to indicate that Liquibase on the other processes should wait for the lock table to exist rather than try to create it.

@davidcyp
Copy link
Author

@nealeu unfortunately not. These processes all operate on the same tables. We fixed it by writing a small orchestration tool for our processes.

ps: I know that designing our codebase differently would also be a fix, but see my initial sidenote ;)

@nealeu
Copy link

nealeu commented Jan 18, 2021

@davidcyp. Surely in that case, you only need to run Liquibase on the one that you are running first - the others are just spending CPU cycles parsing Liquibase changelogs to find that the original process beat them to it.

@DiogoParrinha
Copy link

DiogoParrinha commented Nov 16, 2021

This also affects PostgreSQL @molivasdat (3.10.2)

@oey
Copy link

oey commented Sep 5, 2023

Discovered this race condition when upgrading to Liquibase 4.23.1.
We've a custom Docker image with some DB tools and Liquibase for managing database upgrades in Kubernetes.

Recently when upgrading to 4.23.1 from 4.21.1, one of our automated tests for concurrent use failed with ORA-00955: name is already used by an existing object when creating the DATABASECHANGELOG table as another container beat this container in creating the table.

As k8s may start several containers we need to handle concurrent use. We use Liquibase-sessionlock to handle concurrent use in the migration scripts.

A quick fix from our side will be to create the DATABASECHANGELOG and the current set of columns before we call Liquibase.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Development

No branches or pull requests

8 participants