Inconsistent date and time handling #681

Closed
geoff-addepar opened this Issue Jun 22, 2017 · 1 comment

@geoff-addepar
Contributor

geoff-addepar commented Jun 22, 2017

I'm having trouble wrapping my head around how dates and times are handled in Maxwell, and I can't seem to get things to work correctly for me. I think I have at least 3 separate bugs here.

To summarize the differences:

  • TIMESTAMP fields show up in the local time zone when --binlog_connector=false, and in UTC when --binlog_connector=true. I don't know which is correct, but I think they should at least be the same, or the difference should be documented. (I prefer local, I think, but either is fine.)
  • When --binlog_connector=false, DATETIME fields that fall in the daylight saving time gap get adjusted to a valid local time, which means the string in Kafka doesn't match the string in the database. If DATETIME fields are being used to store UTC times (as they are in my case), then the data is effectively being corrupted before it is placed in the queue.
  • When --binlog_connector=true, DATETIMEs prior to 1970 that have a fractional seconds part cause a crash.
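To illustrate the first point: a TIMESTAMP identifies a single instant, so the two modes appear to be printing the same instant in different zones. The sketch below is not Maxwell code. The class name `TimestampZoneDemo` and the zone ID `America/Los_Angeles` are assumptions chosen to match the California setup in this report.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Illustration only; hypothetical class, not part of Maxwell.
final class TimestampZoneDemo {
    private static final DateTimeFormatter FMT =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Render one instant as a wall-clock string in the given zone.
    static String render(Instant instant, ZoneId zone) {
        return FMT.withZone(zone).format(instant);
    }

    public static void main(String[] args) {
        // 01:00:00 on 2017-01-01 in California (UTC-8) is 09:00:00 UTC.
        Instant stored = Instant.parse("2017-01-01T09:00:00Z");
        System.out.println(render(stored, ZoneId.of("America/Los_Angeles"))); // 2017-01-01 01:00:00
        System.out.println(render(stored, ZoneOffset.UTC));                   // 2017-01-01 09:00:00
    }
}
```

Neither string is wrong on its own, but a consumer cannot tell which zone was used unless the behaviour is fixed and documented.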

Reproduction steps (I'm in California, which is UTC-8 in this example):

create table test (dt datetime, ts timestamp null default null, fdt datetime(6));
insert into test (dt, ts) values('2017-01-01 01:00:00', '2017-01-01 01:00:00');
insert into test (dt) values('2017-03-12 02:30:00'); -- note that this is in the daylight savings gap in my local time zone
insert into test (fdt) values('1950-01-01 01:00:00.999999');

With --binlog_connector=false:

{"database":"test","table":"test","type":"insert","ts":1498089571,"xid":248502,"commit":true,"data":{"dt":"2017-01-01 01:00:00","ts":"2017-01-01 01:00:00","fdt":null,"fts":null}}
{"database":"test","table":"test","type":"insert","ts":1498089571,"xid":248507,"commit":true,"data":{"dt":"2017-03-12 03:30:00","ts":null,"fdt":null,"fts":null}}
{"database":"test","table":"test","type":"insert","ts":1498089572,"xid":248511,"commit":true,"data":{"dt":null,"ts":null,"fdt":"1950-01-01 01:00:00.999999","fts":null}}
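The shift from 02:30:00 to 03:30:00 in the second row is what one would expect if the zone-less DATETIME value is interpreted as a wall-clock time in the local zone somewhere along the way. The sketch below reproduces that effect with `java.time`. It is not taken from Maxwell, and the class and method names are hypothetical. Whether Maxwell's old replicator goes through an equivalent conversion is an assumption.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

// Illustration only; hypothetical class, not part of Maxwell.
final class DstGapDemo {
    private static final DateTimeFormatter FMT =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Interpret a zone-less DATETIME string as wall-clock time in `zone`,
    // then print it back. ZonedDateTime.of moves a time that falls in a
    // DST gap later by the length of the gap.
    static String roundTripThroughZone(String datetime, ZoneId zone) {
        LocalDateTime local = LocalDateTime.parse(datetime, FMT);
        ZonedDateTime zoned = ZonedDateTime.of(local, zone);
        return FMT.format(zoned);
    }

    // Treat the DATETIME as an opaque local value; no zone is involved.
    static String passThrough(String datetime) {
        return FMT.format(LocalDateTime.parse(datetime, FMT));
    }

    public static void main(String[] args) {
        ZoneId la = ZoneId.of("America/Los_Angeles");
        System.out.println(roundTripThroughZone("2017-03-12 02:30:00", la)); // 2017-03-12 03:30:00
        System.out.println(passThrough("2017-03-12 02:30:00"));              // 2017-03-12 02:30:00
    }
}
```

Keeping DATETIME values as zone-less local values end to end avoids the adjustment, since no offset lookup ever happens.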

With --binlog_connector=true:

{"database":"test","table":"test","type":"insert","ts":1498089667,"xid":248612,"commit":true,"data":{"dt":"2017-01-01 01:00:00","ts":"2017-01-01 09:00:00","fdt":null,"fts":null}}
{"database":"test","table":"test","type":"insert","ts":1498089667,"xid":248617,"commit":true,"data":{"dt":"2017-03-12 02:30:00","ts":null,"fdt":null,"fts":null}}
... and then ...
java.lang.IllegalArgumentException: nanos > 999999999 or < 0
	at java.sql.Timestamp.setNanos(Timestamp.java:389)
	at com.zendesk.maxwell.schema.columndef.DateFormatter.extractTimestamp(DateFormatter.java:27)
	at com.zendesk.maxwell.schema.columndef.DateTimeColumnDef.formatValue(DateTimeColumnDef.java:24)
	at com.zendesk.maxwell.schema.columndef.ColumnDefWithLength.asJSON(ColumnDefWithLength.java:40)
	at com.zendesk.maxwell.replication.BinlogConnectorEvent.writeData(BinlogConnectorEvent.java:88)
	at com.zendesk.maxwell.replication.BinlogConnectorEvent.buildRowMap(BinlogConnectorEvent.java:136)
	at com.zendesk.maxwell.replication.BinlogConnectorEvent.jsonMaps(BinlogConnectorEvent.java:148)
	at com.zendesk.maxwell.replication.BinlogConnectorReplicator.getTransactionRows(BinlogConnectorReplicator.java:160)
	at com.zendesk.maxwell.replication.BinlogConnectorReplicator.getRow(BinlogConnectorReplicator.java:271)
	at com.zendesk.maxwell.replication.AbstractReplicator.work(AbstractReplicator.java:157)
	at com.zendesk.maxwell.util.RunLoopProcess.runLoop(RunLoopProcess.java:27)
	at com.zendesk.maxwell.Maxwell.startInner(Maxwell.java:181)
	at com.zendesk.maxwell.Maxwell.start(Maxwell.java:129)
	at com.zendesk.maxwell.Maxwell.main(Maxwell.java:202)
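One plausible way to hit this exception, sketched under assumptions: if the DATETIME value arrives as a signed count of microseconds since the epoch and the fractional part is extracted with `%`, the remainder is negative for pre-1970 values, and `Timestamp.setNanos` rejects it. I have not confirmed that `DateFormatter.extractTimestamp` does this. The class and method names below are hypothetical.

```java
import java.sql.Timestamp;

// Illustration only; hypothetical class, not part of Maxwell.
final class PreEpochNanosDemo {
    // Naive split: Java's % keeps the sign of the dividend, so any
    // pre-epoch value with a fractional part yields negative nanos.
    static Timestamp naive(long microsSinceEpoch) {
        Timestamp ts = new Timestamp(microsSinceEpoch / 1000);
        ts.setNanos((int) (microsSinceEpoch % 1_000_000) * 1000); // throws if negative
        return ts;
    }

    // Floor-based split: the fractional part is always in [0, 999999].
    static Timestamp floored(long microsSinceEpoch) {
        long seconds = Math.floorDiv(microsSinceEpoch, 1_000_000L);
        int micros = (int) Math.floorMod(microsSinceEpoch, 1_000_000L);
        Timestamp ts = new Timestamp(seconds * 1000);
        ts.setNanos(micros * 1000);
        return ts;
    }

    public static void main(String[] args) {
        long oneMicroBeforeEpoch = -1L; // 1969-12-31 23:59:59.999999 UTC
        System.out.println(floored(oneMicroBeforeEpoch).getNanos()); // 999999000
        try {
            naive(oneMicroBeforeEpoch);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Post-1970 values behave the same in both versions, which would explain why only the 1950 row triggers the crash.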

I haven't looked into the bootstrap mechanism at all, but it also needs to match what's going on here.

timbertson Jun 26, 2017

Contributor

Thanks for the report(s) Geoff. In terms of what should happen, UTC vs local time was discussed a bit in #647. I'm inclined to agree with Ben that we should handle things as UTC unless we have a good reason not to.

Also, --binlog_connector=true is going to be the only option sometime soon, which I guess will incidentally make some of this consistent. It should definitely be documented, though, particularly where it differs from the old behaviour.
