Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wip db id gen #75

Merged
merged 11 commits into from
Oct 25, 2020
Merged

Conversation

dartartem
Copy link
Contributor

No description provided.

@@ -21,6 +21,7 @@ services:
EVENTUATELOCAL_CDC_DB_USER_NAME: root
EVENTUATELOCAL_CDC_DB_PASSWORD: rootpassword
EVENTUATELOCAL_CDC_READER_NAME: MySqlReader
EVENTUATELOCAL_CDC_READER_ID: 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@@ -71,6 +71,21 @@ workflows:
- build-and-test/build-and-test:
name: build-and-test-postgres-wal-json
script: ./scripts/build-and-test-postgres-wal-json.sh
- build-and-test/build-and-test:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an extremely expensive way to test a low-level feature. Every additional job adds minutes to the build:

  1. Wait for an available VM
  2. Create VM to run the job
  3. Restore cache
  4. compile lots of code
  5. Run one test, e.g. io.eventuate.local.polling.PollingBinlogEntryReaderMessageTableTest
  6. shut everything down

A more efficient approach would be a single job that perhaps runs first and starts all of the databases and runs the needed tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok


if (!headers.containsKey("ID")) {
headers = new HashMap<>(headers);
headers.put("ID", idGenerator.genId(binlogEntry.getLongColumn("xid")).asString());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"xid" should not be a magic value. There should be a constant defined in eventuate-common

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, but usually we don't use constants for column names, and having only one can look a little inconsistent, no?
Or should we think about constants for all columns?

@@ -0,0 +1,20 @@
USE eventuate;

CREATE TABLE new_message (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this in common?!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but the same approach is used for testing json schema. Probably is better to move such sqls to java code or gradle plugin, I did not think about it yet. What do you think?

Copy link
Contributor

@cer cer Sep 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. My point is: why is the CDC building images that have already been built in eventuate-common?

export DATABASE=postgres
export MODE=wal
export TEST_MODULE=eventuate-local-java-cdc-connector-postgres-wal
export TEST_CLASS=io.eventuate.local.postgres.wal.PostgresWalBinlogEntryReaderMessageTableTest
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do these profiles postgres,PostgresWal need to be activated when running PostgresWalBinlogEntryReaderMessageTableTest which is specific to postgres,PostgresWal?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chris, you probably ask why I added "postgres" to:

@ActiveProfiles("${SPRING_PROFILES_ACTIVE:PostgresWal}")

we need to use profiles like: "postgres, mssql" to activate properties that used to connect to those databases.

But PostgresWal is used only for postgres, and adding postgres as activated profile can't break something, but helps run that script from Idea. (I don't need to add it as env variable)

However postgres can be used with polling, so it does not make sense to have application-postgres-wal.properties, because we anyway need to have also application-postgres.properties with the same content.

Copy link
Contributor

@cer cer Sep 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. My point is that @activeprofiles on PostgresWalBinlogEntryReaderMessageTableTest can activate the needed profiles `@ActiveProfiles("postgres,PostgresWal").

There is no need to specify SPRING_PROFILES_ACTIVE when executing PostgresWalBinlogEntryReaderMessageTableTest

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh.. I see now, yes, I will remove it

@@ -25,6 +26,6 @@ public CdcPipelineFactory cdcPipelineFactory(DataProducerFactory dataProducerFac
publishingFilter,
new MessageWithDestinationPublishingStrategy(),
meterRegistry),
new BinlogEntryToMessageConverter());
readerId -> new BinlogEntryToMessageConverter(new DatabaseIdGenerator(readerId)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am confused by this.
readerId can be null, right?
DatabaseIdGenerator's constructor should/does verify that readerId is not null, right - since it needs one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chris,

tests uses both id generators, application and database.
Application id generator is used when reader id is not specified, that is why here null is allowed:
https://github.com/eventuate-foundation/eventuate-cdc/pull/75/files#diff-0dfbeeb39750b2d0117def81e57cbe28R20
Also it is allowed for database binlog clients.
But the cdc service requires reader id: https://github.com/eventuate-foundation/eventuate-cdc/pull/75/files#diff-fafb9b7680f553fb1b9f65d6bf4202c8R30

Regarding that line of code (29), reader id is Long (not long) because

BinlogEntryToEventConverterFactory (specified as lambda on line 29) is Function<Long, BinlogEntryToEventConverter>

and as we know Generics cannot be primitives.
Also I don't think that is critical to change BinlogEntryToEventConverterFactory to use explicit method definition.

Copy link
Contributor

@cer cer Sep 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are miss understanding by comment.

readerId -> new BinlogEntryToMessageConverter(new DatabaseIdGenerator(readerId)) creates a BinlogEntryToMessageConverter

But what if readerId is null - i.e. a reader for a schema using old id generation. DatabaseIdGenerator expects a non-null readerId.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My logic was that when we migrate the cdc we should anyway specify a reader id.
But in case if messages were generated with application id, cdc does nothing, because ID header is present:

if (!headers.containsKey("ID")) {
headers = new HashMap<>(headers);
headers.put("ID", idGenerator.genId(binlogEntry.getLongColumn("xid")).asString());
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you have added a new mandatory property for a reader: id which is in reality outboxId.
Not sure that makes sense if someone is using application generated IDs and they haven't thought about migrating to db generated IDs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes migration more straightforward.
As I see a migration scenario:

  1. User updates all libraries, including cdc.
  2. Migrates schema.
  3. Cdc forces him to specify unique id for reader (user should not think about it much, he should only know that id is unique and has 48 unsigned bits).
  4. Everything works as before (application generated id)
  5. At some point that user desides to use database generated id. He only uses outbox id for applications as he used for reader. Because other work already done during an update.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants