Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IDBSchema not designed for case-sensitive databases #19

Closed
IvanSeregin opened this issue Sep 8, 2023 · 13 comments
Closed

IDBSchema not designed for case-sensitive databases #19

IvanSeregin opened this issue Sep 8, 2023 · 13 comments
Assignees
Labels
bug Something isn't working cdo-core CDO Model Repository (Core)
Milestone

Comments

@IvanSeregin
Copy link

Hi Eike,
Based on the discussion https://www.eclipse.org/forums/index.php/t/1113597/ , it would be nice to have org.eclipse.net4j.internal.db.ddl.DBNamedElement.class.name(String name) configurable. If possible, it might return name as is, converted to lower case, converted to upper case.

I don't know if other DBMS are case sensitive or not, but Postgres is case sensitive. I have all objects in lower case, like schemas and table names, so name(String name) converting names to upper case brings some problems.

@estepper estepper changed the title org.eclipse.net4j.internal.db.ddl.DBNamedElement.class.name(String name) returns names in upper case IDBSchema not designed for case-sensitive databases Sep 27, 2023
@estepper estepper self-assigned this Sep 27, 2023
@estepper estepper added bug Something isn't working cdo-core CDO Model Repository (Core) labels Sep 27, 2023
@estepper estepper added this to the 4.25 milestone Sep 27, 2023
@estepper
Copy link
Contributor

This was really tricky! But now it should work with the help of the new IDBAdapter.isCaseSensitive() method ;-)

@estepper
Copy link
Contributor

Hi Ivan, do you have a chance to test my changes and give some feedback how they work for you?

@SebastienRevol
Copy link

Hello Eike, I just tried the related fix you pushed on master and got some errors using our own postgres13 dbAdapter or the default H2 db. In both case, the issue occurs when we restart the cdo server after a first init.
More particularly, org.eclipse.emf.cdo.server.internal.db.DBStore.doActivate() is intializing the schemaName with the name of the repository. This name is later used in DBAdapter.readSchema as the "dbName" to call org.eclipse.net4j.db.DBUtil.forEachTable, and iterate over the tables of the given schema.

But in practice, tables are created in the default "public" schema in PostGres, and "PUBLIC" with H2.
Hence, IDBSchema is not intialized correctly and the server tries to initialize again the DB with the default tables and I get the follwing exception (with H2):

[ERROR] Table "CDO_BRANCHES" already exists; SQL statement:
CREATE TABLE cdo_branches (id INTEGER NOT NULL, name VARCHAR(255) NULL, base_id INTEGER NULL, base_time BIGINT NULL) [42101-168] --> CREATE TABLE cdo_branches (id INTEGER NOT NULL, name VARCHAR(255) NULL, base_id INTEGER NULL, base_time BIGINT NULL)
org.eclipse.net4j.db.DBException: Table "CDO_BRANCHES" already exists; SQL statement:
CREATE TABLE cdo_branches (id INTEGER NOT NULL, name VARCHAR(255) NULL, base_id INTEGER NULL, base_time BIGINT NULL) [42101-168] --> CREATE TABLE cdo_branches (id INTEGER NOT NULL, name VARCHAR(255) NULL, base_id INTEGER NULL, base_time BIGINT NULL)
at org.eclipse.net4j.db.DBUtil.execute(DBUtil.java:830)
at org.eclipse.net4j.spi.db.DBAdapter.createTable(DBAdapter.java:492)
at org.eclipse.net4j.spi.db.DBAdapter$1.visit(DBAdapter.java:386)
at org.eclipse.net4j.internal.db.ddl.delta.DBTableDelta.doAccept(DBTableDelta.java:197)
at org.eclipse.net4j.internal.db.ddl.delta.DBDelta.accept(DBDelta.java:112)
at org.eclipse.net4j.internal.db.ddl.delta.DBDelta.accept(DBDelta.java:121)
at org.eclipse.net4j.spi.db.DBAdapter.updateSchema(DBAdapter.java:475)
at org.eclipse.net4j.internal.db.DBSchemaTransaction.run(DBSchemaTransaction.java:141)
at org.eclipse.net4j.internal.db.DBSchemaTransaction.run(DBSchemaTransaction.java:1)
at org.eclipse.net4j.db.DBUtil.execute(DBUtil.java:796)
at org.eclipse.net4j.internal.db.DBSchemaTransaction.commit(DBSchemaTransaction.java:110)
at org.eclipse.net4j.internal.db.DBSchemaTransaction.commit(DBSchemaTransaction.java:1)
at org.eclipse.emf.cdo.server.internal.db.DBStore.doActivate(DBStore.java:780)
at org.eclipse.net4j.util.lifecycle.Lifecycle.internalActivate(Lifecycle.java:76)
at org.eclipse.net4j.util.lifecycle.Lifecycle.activate(Lifecycle.java:178)
at org.eclipse.net4j.util.lifecycle.LifecycleUtil.activate(LifecycleUtil.java:146)
at org.eclipse.net4j.util.lifecycle.LifecycleUtil.activate(LifecycleUtil.java:136)
at org.eclipse.emf.cdo.internal.server.Repository.doActivate(Repository.java:2793)
at org.eclipse.net4j.util.lifecycle.Lifecycle.internalActivate(Lifecycle.java:76)
at org.eclipse.net4j.util.lifecycle.ShareableLifecycle.internalActivate(ShareableLifecycle.java:43)
at org.eclipse.net4j.util.lifecycle.Lifecycle.activate(Lifecycle.java:178)
at org.eclipse.net4j.util.lifecycle.LifecycleUtil.activate(LifecycleUtil.java:146)
at org.eclipse.net4j.util.lifecycle.LifecycleUtil.activate(LifecycleUtil.java:136)
at org.eclipse.emf.cdo.server.CDOServerUtil.addRepository(CDOServerUtil.java:319)
at org.eclipse.emf.cdo.spi.server.RepositoryConfigurator.configure(RepositoryConfigurator.java:170)
at org.eclipse.emf.cdo.spi.server.RepositoryConfigurator.configure(RepositoryConfigurator.java:139)
at org.eclipse.emf.cdo.internal.server.bundle.CDOServerApplication.startRepositories(CDOServerApplication.java:165)
at org.eclipse.emf.cdo.internal.server.bundle.CDOServerApplication.doStart(CDOServerApplication.java:149)
at org.eclipse.net4j.util.om.OSGiApplication.start(OSGiApplication.java:64)
at org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:203)
at org.eclipse.equinox.internal.app.AnyThreadAppLauncher.run(AnyThreadAppLauncher.java:30)
at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: org.h2.jdbc.JdbcSQLException: Table "CDO_BRANCHES" already exists; SQL statement:
CREATE TABLE cdo_branches (id INTEGER NOT NULL, name VARCHAR(255) NULL, base_id INTEGER NULL, base_time BIGINT NULL) [42101-168]
at org.h2.message.DbException.getJdbcSQLException(DbException.java:329)
at org.h2.message.DbException.get(DbException.java:169)
at org.h2.message.DbException.get(DbException.java:146)
at org.h2.command.ddl.CreateTable.update(CreateTable.java:108)
at org.h2.command.CommandContainer.update(CommandContainer.java:75)
at org.h2.command.Command.executeUpdate(Command.java:230)
at org.h2.jdbc.JdbcStatement.executeInternal(JdbcStatement.java:177)
at org.h2.jdbc.JdbcStatement.execute(JdbcStatement.java:152)
at org.eclipse.net4j.db.DBUtil.execute(DBUtil.java:826)
... 31 more

@estepper
Copy link
Contributor

estepper commented Oct 7, 2023

Hi Séb,

Sorry for the disruption! Usually I try to fix bugs without breaking existing usages. In this case it was very complex because of the different DB behaviors and with restarts of existing databases in mind. At some point I decided to ignore existing Postgres databases because I concluded that they really can't exist with the former (buggy) PostgresAdapter. Of course I didn't consider custom Postgres adapters ;-(

On the other hand I'm still not sure I fully understand the nature of your problem, so here come a few questions:

  1. Is your problem (only) with an existing database, i.e., one with a schema created with an older CDO version?
  2. Have you tried the new CDO version with a fresh empty database, i.e., first start and restart with the new code?
  3. Somehow I understand that the actual schema name and the schema name that CDO uses are different. I wonder which of the two should be fixed? I.e., Would you say that the tables are created in the wrong schema? Or do we need a way to tell the DBStore explicitely the name of the schema to use?

@estepper estepper reopened this Oct 7, 2023
@openMonArch
Copy link

Hi guys, I am highly interested in a working Postgres adapter. Changing the schema that CDO uses is a very good idea. Therefore, we could easily combine CDO with our own database optimizations and functionality. How can I help you?

@SebastienRevol
Copy link

SebastienRevol commented Oct 9, 2023

Hello Eike, thanks for your answser.
To answser your questions :
1:No the problem occurs with fresh new databases.
2: Yes I have tested : with the new code, start a new database, hard cdo server shutdown shutdown, restart the cdo server.
This issue also occurs with the default H2 connector for me! Not only Postgres. But maybe I have something wriong?
3: What I see in my case is that tables are defined in the default "public" schema. I don't see in the table creation queries statements specifying a specific target schema.

To me the new behaviour causing this issue is in the org.eclipse.net4j.db.DBUtil.forEachTable where you are specifically checking the schemaName in the metaData.getTables result. Previously the schema was not checked, and tables names were collected whatever the schema.

But i agree it sounds like a good idea to clearly identify the target schema in the db to allow the integration of CDO into an existing DB.

@estepper
Copy link
Contributor

In my next commit I'll add two ways to explicitly specify the schema name to use for (the DBStore of) a repository:

  1. A new property IDBStore.Props.SCHEMA_NAME = "schemaName" that can also be specified via cdo-server.xml.
  2. A new method IDBAdapter.getDefaultSchemaName(Connection), which returns null by default and "public" for H2.

The effective schema name of a repository is determined in this order:

  1. IDBStore.Props.SCHEMA_NAME
  2. IDBAdapter.getDefaultSchemaName()
  3. repository.getName()

I hope that this resolves the remaining issues and causes no disruption for existing users / databases.

@estepper
Copy link
Contributor

This should be available in build I20231013-0501 in a few minutes.

@SebastienRevol
Copy link

Hello Eike, I managed to make experiments with our postgres adapter. I added both isCaseSensitive and getDefaultSchemaName in it. I managed to get something working, but with the following limitations:

  • isCaseSensitive should still return false. When it returns true, there is an issue when the CDO server restarts. This is due to the fact that by default, Postgres (13) stores all table and column identifiers in lower case. According to this, identifiers should be enclosed in double-quotes to respect the case.

For example, the identifiers FOO, foo, and "foo" are considered the same by PostgreSQL, but "Foo" and "FOO" are different from these three and each other.

  • getDefaultSchemaName should return "public". Setting another value with IDBStore.Props.SCHEMA_NAME fails. As far as I understand, the schemaName is only used for read access queries in CDO. There is no "CREATE SCHEMA" statements in CDO that could allow to store tables in a schema which is not the DB (system specific) default schema. Right?

@openMonArch : for your info, I have shared our Postgres adapter with Eike. We only tested it with Postgres 13, I don't know if it works with other versions, but we would be happy to contribute it to CDO if it makes sense! :)

@estepper
Copy link
Contributor

During the past days I've implemented a few more things related to dealing with DB element naming and schema handling. Most significantly the ability to use quoted element names, which completely prevents all possible problems that were caused by DB vendor-specific mangling of element names. Here's a summary of the new functionalities:

  1. By default now, all schema, table, and field names are properly quoted. If that creates problems with existing databases you can turn off quoting completely with -Dorg.eclipse.net4j.db.DISABLE_QUOTED_NAMES=true.
  2. The names of all CDO-specific tables and fields can be individually overridden, for example -Dorg.eclipse.emf.cdo.server.internal.db.mapping.horizontal.MappingNames.cdo_revised=xyz_REVISED and -Dorg.eclipse.emf.cdo.server.internal.db.DBStoreTables.BranchesTable.name=XYZ_branchName. This may also help overcome problems with quoting in existing databases.
  3. Schema names can not only be specified for use in metadata queries (see IDBSchema not designed for case-sensitive databases #19 (comment)), but now also in DDL statements such as table creations. This behavior can be turned on per repository programmatically via IDBStore.Props.PREPEND_SCHEMA_NAME, or in cdo-server.xml via <property name="prependSchemaName" value="true"/>.
  4. On-demand creation of schemas can be turned on per repository programmatically via IDBStore.Props.CREATE_SCHEMA_IF_NEEDED, or in cdo-server.xml via <property name="createSchemaIfNeeded" value="true"/>.
  5. Now you can more easily log executed SQL statements with -Dorg.eclipse.net4j.db.DEBUG=true

@estepper estepper reopened this Oct 28, 2023
@estepper
Copy link
Contributor

Important Note:

Opening existing databases can be problematic with the new code, especially the automatic quoting of all names!

The new code attempts to detect these situations and outputs something like the following to the console:

[INFO] The CDO system tables are not found in schema "PUBLIC"! Other schemas may contain a CDO repository: PUBLIC
[INFO] This can indicate an attempt to open an existing database with a newer CDO server.
It may help to specify 'schemaName=...' and 'prependSchemaName=true' in cdo-server.xml,
and/or the system property '-Dorg.eclipse.net4j.db.DISABLE_QUOTED_NAMES=true'

I apologize for this potential disruption after migrating to the new code, but I'm convinced that using quoted names is the only way to address the different name conversion behaviors of all the different DB vendors.

@estepper
Copy link
Contributor

estepper commented Oct 28, 2023

All the mentioned changes are now pushed .

Please test this new version and report problems here!

@estepper
Copy link
Contributor

I-build I20231028-0750 is in progress...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working cdo-core CDO Model Repository (Core)
Projects
None yet
Development

No branches or pull requests

4 participants