Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ability to set the schema pattern for tables metadata retrieval #109

Merged
merged 9 commits into from
Sep 7, 2016

Conversation

pqueixalos
Copy link

This adds the ability to configure jdbc schemaPattern from connector configuration.
Reported here #32

@ConfluentJenkins
Copy link
Contributor

Can one of the admins verify this patch?

@ghost
Copy link

ghost commented Aug 11, 2016

It looks like @pqueixalos hasn't signed our Contributor License Agreement, yet.

Appreciation of efforts,

clabot

@pqueixalos
Copy link
Author

CLA just signed.

@@ -144,6 +143,9 @@
public static final long TIMESTAMP_DELAY_INTERVAL_MS_DEFAULT = 0;
private static final String TIMESTAMP_DELAY_INTERVAL_MS_DISPLAY = "Delay Interval (ms)";

public static final String SCHEMA_PATTERN_CONFIG = "schemaPattern";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

schema.pattern to be consistent with the config naming conventions

@shikhar
Copy link
Contributor

shikhar commented Aug 19, 2016

@pqueixalos is it possible to just include the schema in the connection URL?

@pqueixalos
Copy link
Author

@shikhar, short answer : no.
Settting the schema in URL configures the JDBC Driver and not all JDBC implementations supports this.
Also it would not change anything (I tried :)), the main issue is that JdbcUtils is aksing for database metadata without providing any schema pattern so every tables of every 'usable' schemas are retrieved.

Regarding the changes you asked (whitespace and parameter naming, sure, I will make those changes next week).

@shikhar
Copy link
Contributor

shikhar commented Aug 19, 2016

@pqueixalos would it work to update the code to use connection.getSchema() as the schema pattern? (in conjunction with the user specifying it as part of the connection URL)

@@ -16,6 +16,12 @@

package io.confluent.connect.jdbc;

import io.confluent.connect.jdbc.source.JdbcSourceConnectorConfig;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's avoid the import rearrangement

@pqueixalos
Copy link
Author

Good point @shikhar, getting the schema from the connection (when set in jdbc connection params) might work fine in some situations !
It won't work in my context for sure, as (silly) Redshift connector does not support it. Also, I don't know how much other jdbc connector implementations honor this, for instance MySQL would probably return null as it does not implement schema concept, connection.getCatalog() would need to be used instead.

@ghost
Copy link

ghost commented Aug 22, 2016

@confluentinc It looks like @pqueixalos just signed our Contributor License Agreement. 👍

Always at your service,

clabot

@pqueixalos pqueixalos changed the title Added schemaPattern connector task configuration, used for fetched ta… Ability to set the schema pattern for tables metadata retrieval Aug 22, 2016
@@ -104,8 +105,8 @@ public void start(Map<String, String> properties) throws ConnectException {
// query.
whitelistSet = Collections.emptySet();
}
tableMonitorThread = new TableMonitorThread(db, context, tablePollMs, whitelistSet,
blacklistSet, tableTypesSet);
tableMonitorThread = new TableMonitorThread(db, schemaPattern, context, tablePollMs,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pqueixalos sorry didn't notice this minor thing previously, but in terms of arg ordering schemaPattern makes sense after the context

@shikhar
Copy link
Contributor

shikhar commented Aug 22, 2016

@pqueixalos ah too bad Redshift JDBC does not support that. So we need this patch to make the JDBC source connector usable against Redshift then.

Left a couple more comments, otherwise LGTM

cc @ewencp in case you have any thoughts on this

@shikhar
Copy link
Contributor

shikhar commented Aug 30, 2016

ok to test

@@ -190,6 +192,7 @@ public static ConfigDef baseConfigDef() {
.define(BATCH_MAX_ROWS_CONFIG, Type.INT, BATCH_MAX_ROWS_DEFAULT, Importance.LOW, BATCH_MAX_ROWS_DOC, CONNECTOR_GROUP, 2, Width.SHORT, BATCH_MAX_ROWS_DISPLAY)
.define(TABLE_POLL_INTERVAL_MS_CONFIG, Type.LONG, TABLE_POLL_INTERVAL_MS_DEFAULT, Importance.LOW, TABLE_POLL_INTERVAL_MS_DOC, CONNECTOR_GROUP, 3, Width.SHORT, TABLE_POLL_INTERVAL_MS_DISPLAY)
.define(TOPIC_PREFIX_CONFIG, Type.STRING, Importance.HIGH, TOPIC_PREFIX_DOC, CONNECTOR_GROUP, 4, Width.MEDIUM, TOPIC_PREFIX_DISPLAY)
.define(SCHEMA_PATTERN_CONFIG, Type.STRING, null, Importance.MEDIUM, SCHEMA_PATTERN_DOC, DATABASE_GROUP, 5, Width.SHORT, QUERY_DISPLAY)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this in the right location? If it is part of the DATABASE_GROUP should we relocate it with that group? And I think the order might not be quite right since it looks like there are only 3 items in the DATABASE_GROUP so far. (It'd also be nice to include a DISPLAY setting as well for a more human-friendly form.)

@ewencp
Copy link
Contributor

ewencp commented Aug 31, 2016

@pqueixalos @shikhar Only took a quick scan through, but the changes seem fine. Major missing piece seems to be tests -- there were some updates to tests but it looks like they just use null for the schema pattern so we haven't actually tested the case where the user sets the schema pattern.

@pqueixalos
Copy link
Author

@ewencp I added a test in JdbcUtilsTest, but taht is probably not enough. Making sure that the configuration is well handled at JdbcSourceConnector level would be better. I will take care of this on a later commit.
I also improved the configuration documentation as you suggested, hoping I did well.

"Schema pattern to fetch tables metadata from:\n"
+ " * \"\" retrieves those without a schema,"
+ " * null (default) means that the schema name should not be used to narrow the search, all tables "
+ "metadata would be fetched, regardless there schema.";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: s/there/their

@shikhar
Copy link
Contributor

shikhar commented Sep 6, 2016

@pqueixalos a few minor comments about the doc, otherwise LGTM

@shikhar shikhar merged commit 7f0e59d into confluentinc:master Sep 7, 2016
@shikhar
Copy link
Contributor

shikhar commented Sep 7, 2016

Thanks for the contrib @pqueixalos!

Copy link

@jmfn jmfn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants