-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[pulsar-io] pass the pulsar service url to debezium source for history database #11251
[pulsar-io] pass the pulsar service url to debezium source for history database #11251
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM and it will fix the tests/allow connector to run.
This returns us to original problem of having to add extra parameters to configure the pulsar client, this can be address later IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM is this unblocks Debezium tests
IIUC we wanted to use the client in order to access auth and TLS.
With this change it is not possible to do it.
What is your plan?
@@ -50,7 +50,7 @@ public DebeziumMongoDbSourceTester(PulsarCluster cluster) { | |||
sourceConfig.put("mongodb.password", "dbz"); | |||
sourceConfig.put("mongodb.task.id","1"); | |||
sourceConfig.put("database.whitelist", "inventory"); | |||
sourceConfig.put("database.history.pulsar.service.url", pulsarServiceUrl); | |||
sourceConfig.put("history.database.pulsar.service.url", pulsarServiceUrl); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought it should start with DatabaseHistory.CONFIGURATION_FIELD_PREFIX_STRING
which is CONFIGURATION_FIELD_PREFIX_STRING = "database.history."
debezium does
Configuration dbHistoryConfig = config.subset(DatabaseHistory.CONFIGURATION_FIELD_PREFIX_STRING, false)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I just updated them.
Based on Debezium's current paradigm, user can only pass string parameters in the configuration. For our case, it'll be better if they allow passing serialized objects. This will be the most secure way but needs help from Debezium community. A second way would be we can access the PulsarClient via some static method in One last way is we add history database specific pulsar auth parameters in |
adding all parameters is tedious (the way config definition works there needs definition of each parameter with bunch of metadata around).
to pass that to the PulsarDatabaseHistory that can use ClientBuilder.loadConf(..) |
@dlg99 This sounds an interesting idea! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nlu90 what is the plan for supporting authentication for the Pulsar cluster?
@nlu90 I felt this is the most secure solution because the connector users don't need to provide the auth information through function config. The current PR still requires users to submit auth information via function config. Although I admit the code will look a bit hacky, it is still manageable. |
if this patch is unblocking @dlg99 's work to add back support for running Integration Tests of Pulsar IO CI is in a very bad state currently. |
@eolivelli How is that related to CI in a very bad state since the integration tests for debezium is not enabled? I would have a concern about this patch as it doesn't support accessing an authenticated Pulsar cluster yet. Even we merge this pull request for now, the connector can't be used in an authenticated cluster. I think we need to spend more time figuring out how to fix this properly. All the changes related are done in the master which is the development branch for 2.9 (which is targeted to release in September based on a time-based release schedule). I don't see why we need to rush on merging this. We need to fix this properly in order to make the debezium connector work in an authenticated cluster. |
@nlu90 an alternative approach is to expose the client builder via the context. So we can just serialize the client builder to a base64 encoded string and pass it to debezium config. Hence the pulsar database history can get the client builder and create a Pulsar client from that. I created a PR (#11293) to demonstrate the idea. Let me know what you think. |
The point is that currently Debezium related tests are not running on CI, and I meant that if CI is not running the tests this is kind of a "bad state" :-) We cannot enable them (#11154) because currently they will fail due to the problem that this PR is trying to address. I would feel more confident with this change and any other proposal if we could revert #11056, commit #11154 and then be back in resolving the Client Auth problem. So my proposal is:
Does it sound like a good plan ? |
@sijie I second @eolivelli here.
We are dealing with a case of broken functionality in the build. I think the first priority should be fixing the regression quickly and new features should be second. |
an alternative approach for #11251
…y database (apache#11251) ### Motivation The Debezium requires pulsar a service URL for history database usage. In apache#11056 , the `service.url` field from `PulsarKafkaWorkerConfig` is no longer available. And the value is also deleted from multiple yaml config files in this [commit](apache@3ce24c9). This causes the integration test for Debezium connector to fail. Based on the Debezium [paradigm](https://debezium.io/documentation/reference/1.5/connectors/mysql.html#debezium-mysql-connector-database-history-configuration-properties), all configurations should be passed as strings. There's no easy way to inject a PulsarClient via configuration. We need to ask user to provide the pulsar url explicitly and probably auth info also. ### Modifications 1. Make the `database.history.pulsar.service.url` field required 2. Add the config value back to example yaml files 3. Update the integration test config ### Verifying this change - [ ] Make sure that the change passes the CI checks.
…y database (apache#11251) ### Motivation The Debezium requires pulsar a service URL for history database usage. In apache#11056 , the `service.url` field from `PulsarKafkaWorkerConfig` is no longer available. And the value is also deleted from multiple yaml config files in this [commit](apache@3ce24c9). This causes the integration test for Debezium connector to fail. Based on the Debezium [paradigm](https://debezium.io/documentation/reference/1.5/connectors/mysql.html#debezium-mysql-connector-database-history-configuration-properties), all configurations should be passed as strings. There's no easy way to inject a PulsarClient via configuration. We need to ask user to provide the pulsar url explicitly and probably auth info also. ### Modifications 1. Make the `database.history.pulsar.service.url` field required 2. Add the config value back to example yaml files 3. Update the integration test config ### Verifying this change - [ ] Make sure that the change passes the CI checks.
…y database (apache#11251) ### Motivation The Debezium requires pulsar a service URL for history database usage. In apache#11056 , the `service.url` field from `PulsarKafkaWorkerConfig` is no longer available. And the value is also deleted from multiple yaml config files in this [commit](apache@3ce24c9). This causes the integration test for Debezium connector to fail. Based on the Debezium [paradigm](https://debezium.io/documentation/reference/1.5/connectors/mysql.html#debezium-mysql-connector-database-history-configuration-properties), all configurations should be passed as strings. There's no easy way to inject a PulsarClient via configuration. We need to ask user to provide the pulsar url explicitly and probably auth info also. ### Modifications 1. Make the `database.history.pulsar.service.url` field required 2. Add the config value back to example yaml files 3. Update the integration test config ### Verifying this change - [ ] Make sure that the change passes the CI checks.
an alternative approach for apache#11251
Motivation
The Debezium requires pulsar a service URL for history database usage.
In #11056 , the
service.url
field fromPulsarKafkaWorkerConfig
is no longer available. And the value is also deleted from multiple yaml config files in this commit. This causes the integration test for Debezium connector to fail.Based on the Debezium paradigm, all configurations should be passed as strings. There's no easy way to inject a PulsarClient via configuration.
We need to ask user to provide the pulsar url explicitly and probably auth info also.
Modifications
database.history.pulsar.service.url
field requiredVerifying this change