-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JSON based object mapping in JDBC connectors #7841
JSON based object mapping in JDBC connectors #7841
Conversation
Are two initial commits relevant for PR? |
...ase-jdbc/src/main/java/io/trino/plugin/jdbc/mapping/CaseInsensitiveNameMatchingCacheTtl.java
Outdated
Show resolved
Hide resolved
.../trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/mapping/CacheBasedIdentifierMapping.java
Outdated
Show resolved
Hide resolved
@Inject | ||
public CacheBasedIdentifierMapping( | ||
@CaseInsensitiveNameMatchingCacheTtl Duration caseInsensitiveNameMatchingCacheTtl, | ||
DefaultIdentifierMapping defaultIdentifierMapping, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't we want to be able to provide different implementation here? Seems like you should use annotated interface here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see it resolved but not answered addressed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry about that. I changed the code a lot so I thought that comment no longer applies. See the above comment (copy pasted below):
No. DefaultIdentifierMapping has specific implementation that I wanted to consciously use here. Hence type is not generic IdentifierMapping.
DefaultIdentifierMapping
is part of implementation of CacheBasedIdentifierMapping
, that way there are no logical changes here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
skimmed last commit. Seems fine. A couple questions.
Kind of. First commit is a prerequisite, otherwise guice in Phoenix could look very ugly. Second one is just a litter that I found. |
It is WIP. I will post an update soon. |
Sure :) A hint - it is nice if WIP PRs are marked as |
Sure. I was thinking that if I not ask for a review anyone that will be enough. Sorry about that. |
1ccb0d2
to
bf7e00d
Compare
ec325a7
to
8deaaa3
Compare
Extracted #7851, so review can be easier. |
that one is good. Please merge and rebase. |
|
||
connectorFactory.create("test", TestingH2JdbcModule.createProperties(), new TestingConnectorContext()); | ||
private static ConnectorFactory getConnectorFactory() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: inline?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be used in next commit, when new test is added here.
...ino-sqlserver/src/test/java/io/trino/plugin/sqlserver/BaseSqlCaseInsensitiveMappingTest.java
Outdated
Show resolved
Hide resolved
{ | ||
private final Cache<JdbcIdentity, Map<String, String>> remoteSchemaNames; | ||
private final Cache<RemoteTableNameCacheKey, Map<String, String>> remoteTableNames; | ||
private final DefaultIdentifierMapping defaultIdentifierMapping; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name it delegate
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. DefaultIdentifierMapping
has specific implementation that I wanted to consciously use here. Hence type is not generic IdentifierMapping
.
private final Provider<BaseJdbcClient> baseJdbcClient; | ||
|
||
@Inject | ||
public CacheBasedIdentifierMapping( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think CachingIdentifierMapping
would be a better name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CachingIdentifierMapping
sounds like a layer on top of generic IdentifierMapping
that just do the caching of results of mapping. Here this implementation is using caching to provide the mapping, so identifier mapping is based on caching. Notice that next commit provides mapping based on rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CachingIdentifierMapping sounds like a layer on top of generic IdentifierMapping that just do the caching of results of map
Well. It actually just does that. It just does not cache strictly for the keys it is asked for. But uses extra listers to make caching more efficient.
Also it seems like it should work fine with any implementation of IdentifierMapping
(not only DefaultIdentifierMapping
). Probably I am missing something, please add a code comment explaining the situation for less smart people like me :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or probably I am too stubborn. I think it make sense what you are saying.
import static java.util.Objects.requireNonNull; | ||
import static java.util.concurrent.TimeUnit.MILLISECONDS; | ||
|
||
public final class CacheBasedIdentifierMapping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are here any logical changes vs code which was originally in JdbcClient? Something to focus review on?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No logical changes. No thing is the fall back to DefaultIdentifierMapping
but it was there as well but expressed differently.
I changed this class later, so logical changes are in separate commit.
|
||
String toRemoteTableName(JdbcIdentity identity, Connection connection, String remoteSchema, String tableName); | ||
|
||
String toRemoteColumnName(Connection connection, String columnName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks asymetric. Why this one does not get JdbcIdentity
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be not used. Adding this would make code the symetric, but it would be more difficult to call.
Unless you think that a case where one has a table with columns which names differ by casing and we should implement support for them now (or later as a follow up PR).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whatever. I guess leave it until we really need that.
@@ -817,7 +790,7 @@ public PreparedStatement getPreparedStatement(Connection connection, String sql) | |||
return connection.prepareStatement(sql); | |||
} | |||
|
|||
protected ResultSet getTables(Connection connection, Optional<String> schemaName, Optional<String> tableName) | |||
public ResultSet getTables(Connection connection, Optional<String> schemaName, Optional<String> tableName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rename schemaName
and tableName
to reamoteSchemaName
and remoteTableName
.
Add comment that this method is called by by IdentifierMapping
and it can not call back to it, as this would result in a loop.
Btw. Can we get rid of this cyclic dependency somehow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking about extract 3rd component which could be use by jdbc client and identifier mapping. However since this method is overridden by several vendor specific jdbc client it is not that simple. Identifier mapping introduction is quite invasive. So I would leave extracting such component for later...
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/mapping/IdentifierMappingModule.java
Outdated
Show resolved
Hide resolved
plugin/trino-clickhouse/src/main/java/io/trino/plugin/clickhouse/ClickHouseClient.java
Outdated
Show resolved
Hide resolved
CI hit: #7216 |
156efba
to
f1db060
Compare
CI hit: #7872 |
f1db060
to
afd5223
Compare
@losipiuk AC |
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/mapping/IdentifierMappingModule.java
Outdated
Show resolved
Hide resolved
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/mapping/IdentifierMappingModule.java
Outdated
Show resolved
Hide resolved
...n/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/mapping/RuleBasedIdentifierMapping.java
Outdated
Show resolved
Hide resolved
// With case-insensitive-name-matching enabled colliding schema/table names are considered as errors. | ||
// Some tests here create colliding names which can cause any other concurrent test to fail. | ||
@Test(singleThreaded = true) | ||
public class TestSqlServerCaseInsensitiveRuleBaseMapping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: should the test be SQLServer specific? Can we share test code as we share feature implementation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this could be covered with #7864 we are hitting same with other tests. Extracting base class is not trivial. There is many vendor specific things. Hence I would like to handle it as separate effort.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great. Some editorials. Feel free to ignore.
afd5223
to
3771832
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great.
Thanks.
Some editorials. Feel free to ignore.
Applied them all % one was commented.
// With case-insensitive-name-matching enabled colliding schema/table names are considered as errors. | ||
// Some tests here create colliding names which can cause any other concurrent test to fail. | ||
@Test(singleThreaded = true) | ||
public class TestSqlServerCaseInsensitiveRuleBaseMapping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this could be covered with #7864 we are hitting same with other tests. Extracting base class is not trivial. There is many vendor specific things. Hence I would like to handle it as separate effort.
3771832
to
26df5ec
Compare
Sometimes TestingSqlServer is fails to run: ALTER DATABASE database_xxx SET READ_COMMITTED_SNAPSHOT ON saying: Transaction (Process ID 51) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
This way test is using higher level API that should be a default way to use the base JdbcPlugin.
It extracts the mapping responsibility out of the BaseJdbcClient. Thanks to that it is possible to make it customizable.
Only fail the query when an ambiguous object is accessed.
The idea is to allow JDBC connectors too, to configure ``` case-insensitive-name-matching.config-file=mapping.json case-insensitive-name-matching.config-file.refresh-period=30s ``` Where mapping.json would look like: ``` { "schemas": [ { "remote": "CaseSensitiveName", "mapping": "case_sensitive_name_mapped_to_case_inseitive_1" }, { "remote": "cASEsENSITIVEnAME", "mapping": "case_sensitive_name_mapped_to_case_inseitive_2" }], "tables": [ { "remote-schema": "CaseSensitiveName", "remote-table": "tablex", "mapping": "table_1" }, { "remote-schema": "CaseSensitiveName", "remote-table": "TABLEX", "mapping": "table_2" }] } ``` Having the above, a remote schema `CaseSensitiveName` would be mapped to `case_sensitive_name_mapped_to_case_inseitive_1` schema in Trino. Remote table in `CaseSensitiveName` schema and named `tablex` would be named named to `case_sensitive_name_mapped_to_case_inseitive_1` schema and `table_1` table name in Trino. Notice that this mapping file is refreshable so no restart is needed when it changes.
26df5ec
to
8323863
Compare
The idea is to allow JDBC connectors too, to configure
Where mapping.json would look like:
Having the above, a remote schema
CaseSensitiveName
would be mapped tocase_sensitive_name_mapped_to_case_inseitive_1
schema in Trino. Remote table inCaseSensitiveName
schema and namedtablex
would be named named tocase_sensitive_name_mapped_to_case_inseitive_1
schema andtable_1
table name in Trino.Notice that this mapping file is refreshable so no restart is needed when it changes.