Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DBZ-302 Add support for database.whitelist mongodb connector config… #253

Closed
wants to merge 8 commits into from

Conversation

ekreiser
Copy link
Contributor

DBZ-302 Add support for database.whitelist mongodb connector configuration option

Copy link
Member

@rhauch rhauch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution! This is really nice, but I did note a few small changes. Otherwise, looks great!

@@ -250,7 +252,7 @@ protected boolean performInitialSync() {
// We need to copy each collection, so put the collection IDs into a queue ...
final List<CollectionId> collections = new ArrayList<>();
primaryClient.collections().forEach(id -> {
if (collectionFilter.test(id)) collections.add(id);
if (databaseFilter.test(id.dbName()) && collectionFilter.test(id))collections.add(id);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nit: watch formatting and the space after the if condition.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

space added

@@ -430,6 +432,7 @@ protected boolean handleOplogEvent(ServerAddress primaryAddress, Document event)
logger.error("Get current primary executeBlocking", e);
}
ServerAddress serverAddress = address.get();

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While you're making changes, it'd be great to avoid unrelated changes like this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extra line removed

Copy link
Member

@gunnarmorling gunnarmorling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @ekreiser, thanks a lot! Looks good to me overall, I've added some comments inline.

@@ -162,6 +162,18 @@
.withDependents(COLLECTION_LIST_NAME)
.withDescription("The databases for which changes are to be captured");

/**
* A comma-separated list of regular expressions that match the fully-qualified namespaces of collections to be monitored.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment seems like a Copy & Paste glitch, I don't think it should mention collections here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corrected the comment

* Fully-qualified namespaces for collections are of the form {@code '<databaseName>.<collectionName>'}.
* May not be used with {@link #COLLECTION_BLACKLIST}.
*/
public static final Field DATABASE_WHITELIST = Field.create("database.whitelist")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to have a database.blacklist, too, so to resemble the collection-level options.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added database.blacklist support

if (dbWhitelist != null && !dbWhitelist.trim().isEmpty()){
databaseFilter = Predicates.includes(dbWhitelist);
} else {
databaseFilter = Predicates.includes(".*");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we pass a simple predicate which always returns true instead of passing it through the regex engine?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed to use simple predicate when a pattern isn't set

…uration option as a companion to `database.whitelist`

/**
* Create an instance of the filters.
*
* @param config the configuration; may not be null
*/
public Filters(Configuration config) {
String dbWhitelist = config.getString(MongoDbConnectorConfig.DATABASE_WHITELIST);
String dbBlacklist = config.getString(MongoDbConnectorConfig.DATABASE_BLACKLIST);
if (dbWhitelist != null && !dbWhitelist.trim().isEmpty()){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am sorry but it seems that space is missing between rightmost parenthesis and the left curl brace following

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

corrected

@@ -252,4 +279,14 @@ private static int validateCollectionBlacklist(Configuration config, Field field
}
return 0;
}

private static int validateDatabaseBlacklist(Configuration config, Field field, ValidationOutput problems) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems we have an interesting inconsistency here - MySQL connector supports adding both blacklist and whitelist but PostgreSQL requires setting of only one of those options. I'd vote going MySQL way and retrofit the functionality into Postgres connector too. Let Gunnar decide.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point but was trying to be consistent within the mongo connector. The collection.whitelist and collection.blacklist are currently coded to be mutually exclusive (not sure why). If the current collection setting were not coded this way, I would be able to combine collection.whitelist and collection.blacklist patterns that accomplish what I need without ever needing the settings specifically at the database level (possible since the collection comparisons are performed while including the database name as a prefix).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it turns out, combining includes and exludes aren't supported in the MySQL connector atm. either. Apparently it was meant to be supported at some point (there's some code in place for merging them), but then this validation was added (which you added to MongoDB, too) which prevents giving both options. So I'd suggest we do this for now (i.e. your PR is good as is), and then, as a follow-up task, we allow merging includes + excludes across all the connectors. WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's https://issues.jboss.org/browse/DBZ-307 for the follow-up task.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is a good plan to separate this issue out and look at it holistically cross all connectors.

…uration option as a companion to `database.whitelist - fix formatting issue

protected static Field.Set EXPOSED_FIELDS = ALL_FIELDS;

protected static ConfigDef configDef() {
ConfigDef config = new ConfigDef();
Field.group(config, "MongoDB", HOSTS, USER, PASSWORD, LOGICAL_NAME, CONNECT_BACKOFF_INITIAL_DELAY_MS,
CONNECT_BACKOFF_MAX_DELAY_MS, MAX_FAILED_CONNECTIONS, AUTO_DISCOVER_MEMBERS);
Field.group(config, "Events", DATABASE_LIST, COLLECTION_WHITELIST, COLLECTION_BLACKLIST);
Field.group(config, "Events", DATABASE_LIST, DATABASE_WHITELIST, DATABASE_BLACKLIST, COLLECTION_WHITELIST, COLLECTION_BLACKLIST);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum, so that's a bit weird, there already was DATABASE_LIST before, but just its definition, I can't seem to find code that would read that property, also it's not part of the docs. Seems like an uncompleted attempt of adding an option for specifying the DBs. So I'd suggest to remove this with this PR as well, as it's superseded by the new options now.

@gunnarmorling
Copy link
Member

Hey @ekreiser, so I think the PR s good, apart from my last comment about removing that left-over DATABASE_LIST option. Could you also send in a PR for updating the docs to cover the new options, too?

Thanks!

…s now superseded by `database.whitelist` and `database.blacklist`
@ekreiser
Copy link
Contributor Author

I added DBZ-309 to track the task of updating the documentation to include the new configuration settings included by this PR.

@gunnarmorling
Copy link
Member

Rebased and applied. Thanks a lot, @ekreiser! I took the liberty to squash some of the commits and remove some trailing whitespace inserted accidentally. I'll send a PR for updating the docs, too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants