MDEV-11065 - Compressed binary log #247

Closed · wants to merge 4 commits

5 participants
@GCSAdmin

We add new event types to support compressing the binlog, as follows:
QUERY_COMPRESSED_EVENT,
WRITE_ROWS_COMPRESSED_EVENT_V1,
UPDATE_ROWS_COMPRESSED_EVENT_V1,
DELETE_ROWS_COMPRESSED_EVENT_V1,
WRITE_ROWS_COMPRESSED_EVENT,
UPDATE_ROWS_COMPRESSED_EVENT,
DELETE_ROWS_COMPRESSED_EVENT

Of course, in MariaDB 10.2 the compressed row events are always written as the V1 versions.

We introduce two options for this feature: "log_bin_compress" and "log_bin_compress_min_len". The
former is a switch that controls whether the binlog should be compressed, and the latter is the minimum
length of an SQL statement (in statement mode) or record (in row mode) that will be compressed.
On the master, the decision can be described by this pseudocode:

  if binlog_format == statement {
    if log_bin_compress == true and query_len >= log_bin_compress_min_len
      create a Query_compressed_log_event;
    else
      create a Query_log_event;
  }
  if binlog_format == row {
    if log_bin_compress == true and record_len >= log_bin_compress_min_len
      create a Write_rows_compressed_log_event (when INSERT);
    else
      create a Write_rows_log_event (when INSERT);
  }

On the slave, the compressed binlog events are converted to their uncompressed form in the IO thread; for example, a QUERY_COMPRESSED_EVENT is converted to a QUERY_EVENT.
So the SQL and worker threads can stay unchanged, and the SQL thread executes the events as fast as it would an uncompressed binlog.
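
A minimal sketch of that conversion idea (the event type names are the ones used by this patch; the helper function itself is hypothetical, not code from the patch):

  /* Hypothetical helper, for illustration only: map a compressed event
     type code to the uncompressed type the IO thread rewrites it to. */
  static Log_event_type uncompressed_type_of(Log_event_type type)
  {
    switch (type)
    {
    case QUERY_COMPRESSED_EVENT:          return QUERY_EVENT;
    case WRITE_ROWS_COMPRESSED_EVENT_V1:  return WRITE_ROWS_EVENT_V1;
    case UPDATE_ROWS_COMPRESSED_EVENT_V1: return UPDATE_ROWS_EVENT_V1;
    case DELETE_ROWS_COMPRESSED_EVENT_V1: return DELETE_ROWS_EVENT_V1;
    case WRITE_ROWS_COMPRESSED_EVENT:     return WRITE_ROWS_EVENT;
    case UPDATE_ROWS_COMPRESSED_EVENT:    return UPDATE_ROWS_EVENT;
    case DELETE_ROWS_COMPRESSED_EVENT:    return DELETE_ROWS_EVENT;
    default:                              return type;  /* already uncompressed */
    }
  }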

For now, we use zlib as the compression algorithm.
The compressed record consists of 3 parts:
  Record Header: 1 Byte
    0 Bit: Always 1, means compressed;
    1-3 Bit: Reserved for the compression algorithm. 0 means zlib; it's always 0 for now.
    4-7 Bit: Number of bytes of "Record Original Length"
  Record Original Length: 1-4 Bytes
  Compressed Buf: the actual compressed payload.
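
For illustration, a minimal sketch of parsing that header, using the masks from the patch's binlog_buf_uncompress() (top bit = compressed flag, low 3 bits = number of length bytes); the helper name and the big-endian decoding of the length bytes are assumptions, not part of the patch:

  /* Sketch only: validate the 1-byte header and read the original length. */
  static int parse_compressed_header(const unsigned char *src, size_t avail,
                                     unsigned *lenlen, unsigned long *orig_len)
  {
    if (avail < 1 || (src[0] & 0x80) == 0)
      return 1;                              /* not a compressed record */
    *lenlen= src[0] & 0x07;                  /* bytes of "Record Original Length" */
    if (*lenlen < 1 || *lenlen > 4 || avail < 1 + *lenlen)
      return 1;                              /* corrupt or truncated header */
    *orig_len= 0;
    for (unsigned i= 0; i < *lenlen; i++)    /* assumed big-endian layout */
      *orig_len= (*orig_len << 8) | src[1 + i];
    return 0;                                /* compressed payload starts at src + 1 + *lenlen */
  }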

This feature reduces binlog size by 42% ~ 70% in our production environment.

vinchen added some commits Oct 8, 2016

Binlog compressed
Add some event types for the compressed events:
     QUERY_COMPRESSED_EVENT,
     WRITE_ROWS_COMPRESSED_EVENT_V1,
     UPDATE_ROWS_COMPRESSED_EVENT_V1,
     DELETE_ROWS_COMPRESSED_EVENT_V1,
     WRITE_ROWS_COMPRESSED_EVENT,
     UPDATE_ROWS_COMPRESSED_EVENT,
     DELETE_ROWS_COMPRESSED_EVENT.
These events inherit from the corresponding uncompressed events. One of their
constructors and the write function have been overridden to uncompress and
compress; everything else stays the same.

On the slave, the IO thread uncompresses and converts them when receiving the
events from the master, so the SQL and worker threads can stay unchanged.

For now we use zlib as the compression algorithm; other algorithms may be
supported in the future.
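
A rough sketch of the class layout this describes (the class name follows the patch's naming; the exact signatures are assumptions):

  /* Sketch of the inheritance pattern, not the patch's exact declarations. */
  class Write_rows_compressed_log_event : public Write_rows_log_event
  {
  public:
    /* Reading constructor: uncompresses the row data so that all
       inherited member functions see an ordinary row event body. */
    Write_rows_compressed_log_event(const char *buf, uint event_len,
                                    const Format_description_log_event *fdle);
    /* Writing path: compresses the body before it goes to the binlog. */
    virtual bool write();
  };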

@svoj svoj changed the title from Compressed binary log to MDEV-11065 - Compressed binary log Oct 15, 2016

Contributor

svoj commented Oct 15, 2016

Hi GCSAdmin,

Thanks for your contribution. JIRA task has been created to track this pull request: https://jira.mariadb.org/browse/MDEV-11064

This task was added to 10.2.4 backlog, which is planned to be handled between 2016-11-10 and 2016-11-17.

Similar to other open source projects, the MariaDB Foundation needs to have shared ownership of all code that is included in the MariaDB distribution. The easiest way to achieve this is by submitting your code under the BSD-new license.

The other alternative is to sign the code contribution agreement which can be found here: https://mariadb.com/kb/en/mariadb/mca/

Please indicate in a comment below that you are contributing your new code of the whole pull request, including one or several files that are either new files or modified ones, under the BSD-new license or that you have filled out the contribution agreement and sent it.

Thanks,
Sergey

Member

knielsen commented Oct 20, 2016

It seems my reply did not get added here for some reason, but it's also on the maria-developers@ mailing list:

https://lists.launchpad.net/maria-developers/msg10010.html

Member

knielsen commented Oct 25, 2016

GCSAdmin notifications@github.com writes:

We add new event types to support compressing the binlog, as follows:
QUERY_COMPRESSED_EVENT,
WRITE_ROWS_COMPRESSED_EVENT_V1,
UPDATE_ROWS_COMPRESSED_EVENT_V1,
DELETE_ROWS_COMPRESSED_EVENT_V1,
WRITE_ROWS_COMPRESSED_EVENT,
UPDATE_ROWS_COMPRESSED_EVENT,
DELETE_ROWS_COMPRESSED_EVENT

Thanks for the patch!

Overall, I'm much impressed with the quality, I see a lot of attention to
detail. Below, I have a number of comments and suggestions, but they are all
minor for a patch of this complexity.

A number of the comments are style fixes or simple changes, and I found it
easier to just do the changes to the source for you to review. I hope that
is ok. I rebased the series without intermediate merge to simplify review to
a single patch. The rebase with my suggested changes on top is here:

https://github.com/knielsen/server/commits/GCSAdmin-10.2-binlog-compressed2-2

Please check the changes I made and let me know if you disagree with
anything or I misunderstood something. Changes are mostly:

  • A bunch of code style (indentation, lines <= 80 chars, clarify comments,
    and so on).
  • Change LOG_EVENT_IS_QUERY() etc. from macros to static inline functions.
  • check_event_type() updated (new code merged into 10.2 recently).
  • Minor .result file update, also from code merged into 10.2 recently.

I also have the following questions/suggestions:

  1. I think the code should sanity-check the compressed event header, to
    check that it does not access outside the event buffer in corrupt
    events. Eg. if lenlen is 7 and there is less than 7 bytes available in the
    event. I know that replication code in general is not robust to corrupt
    events, but it seems best that new code avoids overflowing buffers in case
    of corrupt event data.
  2. There are some places where the compressed event types are not added to
    switch() or if() statements, mainly related to minimal binlog images, I
    think: Rows_log_event::read_write_bitmaps_cmp(),
    Rows_log_event::get_data_size(), Rows_log_event::do_apply_event(),
    Rows_log_event::write_data_body(). I think the reason for that is that the
    SQL thread never sees the compressed events (they are uncompressed by the IO
    thread), but I would like your confirmation that my understanding is
    correct.
  3. Did you think about testing that BINLOG statements with compressed events
    can execute correctly? This happens with mysqlbinlog | mysql, when there are
    (compressed) row-based events in the binlog. I'm wondering if this could
    expose compressed events to code that normally runs in the SQL thread and
    expects to have events uncompressed for it by the IO thread?

A few detailed comments on the patch below (most comments are done as
changes in the above-linked git branch).

Otherwise, after answer to above 3 questions, I think the patch looks good
to go into 10.2.

  • Kristian.

+#define BINLOG_COMPRESSED_HEADER_LEN 1
+#define BINLOG_COMPRESSED_ORIGINAL_LENGTH_MAX_BYTES 4
+/**
+  Compressed Record
+    Record Header: 1 Byte
+             0 Bit: Always 1, mean compressed;
+             1-3 Bit: Reversed, compressed algorithm, Always 0, means zlib
+             4-7 Bit: Bytes of "Record Original Length"
+    Record Original Length: 1-4 Bytes
+    Compressed Buf:

I did not understand this. Why "reversed"? It seems to be bits 0-2 that have
the bytes of "Record Original Length" and bit 7 that has the '1' bit meaning
compression enabled? Maybe this can be clarified.

+int query_event_uncompress(const Format_description_log_event *description_event, bool contain_checksum,
+                           const char *src, char* buf, ulong buf_size, bool* is_malloc,
+                           char **dst, ulong *newlen)
+{
+  ulong len = uint4korr(src + EVENT_LEN_OFFSET);
+  const char *tmp = src;
+
+  DBUG_ASSERT((uchar)src[EVENT_TYPE_OFFSET] == QUERY_COMPRESSED_EVENT);
+  uint8 common_header_len= description_event->common_header_len;
+  uint8 post_header_len= description_event->post_header_len[QUERY_COMPRESSED_EVENT-1];
+
+  tmp += common_header_len;
+
+  uint db_len = (uint)tmp[Q_DB_LEN_OFFSET];
+  uint16 status_vars_len= uint2korr(tmp + Q_STATUS_VARS_LEN_OFFSET);
+
+  tmp += post_header_len + status_vars_len + db_len + 1;
+
+  uint32 un_len = binlog_get_uncompress_len(tmp);
+  *newlen = (tmp - src) + un_len;

This is one place I would like to see a check on the data in the
event. Maybe just that 'len' is larger than 1+(tmp[0]&7). Or maybe simply
that len is at least 10, the minimal size for compressed events.
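
For illustration, the kind of check being asked for might look like this (a sketch, not the patch's final code; 'len' and 'tmp' are the variables in the quoted function above):

  /* Sketch of the suggested sanity check: make sure the compressed-record
     header byte and the length bytes it announces fit inside the event
     before reading them. */
  if (len < (ulong)(tmp - src) + 1 ||
      len < (ulong)(tmp - src) + 1 + (tmp[0] & 0x07))
    return 1;                                /* corrupt or truncated event */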

+int row_log_event_uncompress(const Format_description_log_event *description_event, bool contain_checksum,
+                             const char *src, char* buf, ulong buf_size, bool* is_malloc,
+                             char **dst, ulong *newlen)
+{

+  uint32 un_len = binlog_get_uncompress_len(tmp);
+  *newlen = (tmp - src) + un_len;

Another place where I'd like a check against looking outside the event
buffer.

+int binlog_buf_uncompress(const char *src, char *dst, uint32 len, uint32 *newlen)
+{
+  if((src[0] & 0x80) == 0)
+  {
+    return 1;
+  }
+
+  uint32 lenlen = src[0] & 0x07;
+  uLongf buflen = *newlen;
+  if(uncompress((Bytef *)dst, &buflen, (const Bytef*)src + 1 + lenlen, len) != Z_OK)

Again check on length.

Member

knielsen commented Oct 25, 2016

Simon Mudd simon.mudd@booking.com writes:

I have not looked at the code in detail but a couple of questions
related to the implementation come to mind:

The author is probably the best to answer, but I can tell what I know from
reviewing the patch:

(1) Why not use a single compressed event type and encapsulate the whole original event inside ?

This would result in higher overhead on each event. There is a fixed header
to each event, which cannot be compressed (without significantly more
changes to the code, at least). Thus, a single event encapsulating another
full compressed event would require two headers, one uncompressed and one
compressed.

There might be other pros and cons to each method, this was just what
occurred to me.

Is it not possible to have a "generic compressed event”? Mark mentions
allowing selection of the compression type and that seems useful and I
would expect the compressed event to have this information and the
compressed content.

The current patch has reserved three bits for the compression method. Thus,
it would be possible in the future to add more compression types (up to a
maximum of 8, at least).

Actually, it would be good if the code would check the compression type and
fail if it is unknown. This will help get proper errors on old slaves if new
compression types are added in the future.

(2) Senders would try to compress the event as requested. If the
compressed event is not any smaller then do not bother compressing it,
just send the original event.

This is not in the current patch, but it could be added easily enough. It
might not be needed though, the log_bin_compress_min_len kind of serves a
similar purpose.

(3) Receivers on receiving a compressed event would decompress it and
then process the extracted event in the same way it’s processed
now. Most of the code should have no idea the event had been
compressed.

That's basically how the patch is written.

Perhaps I am missing something but the need to compress events
individually probably requires quite a bit more logic and will not be
as extensible.

Perhaps. In any case, the patch at hand is based on individually compressed
events. Even with individual event types, the patch is nice and clean and
leaves much of the existing logic unchanged.

My guess is that this will primarily be useful for row-based binlogging,
where reasonable compression rates can be expected in many cases where
row-based binlogging has particularly high overhead compared to
statement-based. Per-event compression is in any case of limited use for
events that are small in size.

  • Kristian.
Member

knielsen commented Oct 25, 2016

On 21 Oct 2016, at 17:34, Kristian Nielsen knielsen@knielsen-hq.org wrote:

Simon Mudd simon.mudd@booking.com writes:

This would result in higher overhead on each event. There is a fixed header

Ok. I’ve been assuming the headers were small (from some casual browsing of things
related to the binlog router some time ago), but that may be wrong.

Yes, they are quite small, 10-20 bytes per event or something like that.

Indeed, one of the things about the current binlog format is that there’s little
complete documentation outside of the code. Code changes and there’s no
clear specification. It makes things much better if what’s currently implicit is
explicit and also if the specs are outside of the code. That’s something I

Tell me about it ... it is very hard to change most anything in
replication without breaking some odd corner somewhere.

Fixing the case for RBR is good but I feel the focus may be too narrow,
especially if the approach can be used more generically.

I certainly have some SBR machines which generate large volumes of bin logs
and to be able to compress the events they generate on disk would be most helpful.

Right. This patch compresses query events (ie. statement-based updates) and
row-events, so both of these are covered. LOAD DATA INFILE in statement mode
is not (but in row-based mode it should be, I think).

I use both LOAD DATA INFILE which is great on the box you load the data into but awful on a downstream
slave that "streams down” the data, only to output it to a temporary file which is loaded back in again…
[ The design is logical but I’d love to see the LOAD DATA INFILE turned directly into a RBR binlog stream,
certainly not be default, but as an option, as that should reduce load on the downstream slaves which would
not have to reprocess the input stream as they do now. ]

I also have “big inserts” of say 16 MB SBR events often of the type: INSERT INTO … VALUES … [ON DUPLICATE KEY UPDATE …].
For usage such as this the “text” is big, and so the individual event does compress pretty well, so the win would be big.

So compression is good and there are several use cases, thus making it as generic as possible would benefit
more people. That may not be appropriate for this suggested patch, and it’s good to see people offering
solutions to their own issues but at least it could perhaps be considered as future functionality.

A side effect of a more generic mechanism would hopefully be that this same mechanism could be implemented upstream
and would work even if the internal events that go through the "compression pipeline” are different. That avoids
feature drift or dual incompatible implementations which would not be very good and has happened already (GTID).

Anyway perhaps I’ve drifted off-topic for your comments on the patch but this certainly “woke me up” … :-)

Simon

Member

knielsen commented Oct 25, 2016

How easy will it be for this to use a lower latency compression algorithm
like lz4, Snappy or zstandard?

On Thu, Oct 20, 2016 at 2:52 PM, Daniel Black daniel.black@au1.ibm.com
wrote:

+#define BINLOG_COMPRESSED_HEADER_LEN 1
+#define BINLOG_COMPRESSED_ORIGINAL_LENGTH_MAX_BYTES 4
+/**
+  Compressed Record
+    Record Header: 1 Byte
+             0 Bit: Always 1, mean compressed;
+             1-3 Bit: Reversed, compressed algorithm, Always 0, means zlib

Good to see this progressed from the old MySQL bug I saw originally.

I'm a bit late into this, however I suggest we can use a better algorithm
than zlib. There are a significant number of algorithms that may have
better compression at similar speed, or be faster, like brotli, zstd, snappy.

https://quixdb.github.io/squash-benchmark/#results-table


Mailing list: https://launchpad.net/~maria-developers
Post to : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help : https://help.launchpad.net/ListHelp

Mark Callaghan
mdcallag@gmail.com

Member

knielsen commented Oct 25, 2016

+#define BINLOG_COMPRESSED_HEADER_LEN 1
+#define BINLOG_COMPRESSED_ORIGINAL_LENGTH_MAX_BYTES 4
+/**
+  Compressed Record
+    Record Header: 1 Byte
+             0 Bit: Always 1, mean compressed;
+             1-3 Bit: Reversed, compressed algorithm, Always 0, means zlib

Good to see this progressed from the old MySQL bug I saw originally.

I'm a bit late into this, however I suggest we can use a better algorithm
than zlib. There are a significant number of algorithms that may have
better compression at similar speed, or be faster, like brotli, zstd, snappy.

https://quixdb.github.io/squash-benchmark/#results-table

Member

knielsen commented Oct 25, 2016

Hi,

Compressed binlogs are something that have interested me for some time[1] so it is nice to see patches which provide this functionality.

I have not looked at the code in detail but a couple of questions related to the implementation come to mind:

(1) Why not use a single compressed event type and encapsulate the whole original event inside ?

Is it not possible to have a "generic compressed event”? Mark mentions allowing selection of the compression type and that seems useful and I would expect the compressed event to have this information and the compressed content.

(2) Senders would try to compress the event as requested. If the compressed event is not any smaller then do not bother compressing it, just send the original event.
(3) Receivers on receiving a compressed event would decompress it and then process the extracted event in the same way it’s processed now. Most of the code should have no idea the event had been compressed.
(4) This logic would only require a single new event type and also allow all types of events (including any future ones) to be compressed without requiring extra logic

Perhaps I am missing something but the need to compress events individually probably requires quite a bit more logic and will not be as extensible.

Thanks for sharing any thoughts on this.

Regards,

Simon
[1] See: https://bugs.mysql.com/bug.php?id=71696 and various other FRs related to this topic.

Member

knielsen commented Oct 25, 2016

Hi Kristian,

On 21 Oct 2016, at 13:22, Kristian Nielsen knielsen@knielsen-hq.org wrote:

Simon Mudd simon.mudd@booking.com writes:

I have not looked at the code in detail but a couple of questions
related to the implementation come to mind:

The author is probably the best to answer, but I can tell what I know from
reviewing the patch:

(1) Why not use a single compressed event type and encapsulate the whole original event inside ?

This would result in higher overhead on each event. There is a fixed header
to each event, which cannot be compressed (without significantly more
changes to the code, at least). Thus, a single event encapsulating another
full compressed event would require two headers, one uncompressed and one
compressed.

Ok. I’ve been assuming the headers were small (from some casual browsing of things
related to the binlog router some time ago), but that may be wrong.

There might be other pros and cons to each method, this was just what
occurred to me.

Is it not possible to have a "generic compressed event”? Mark mentions
allowing selection of the compression type and that seems useful and I
would expect the compressed event to have this information and the
compressed content.

The current patch has reserved three bits for the compression method. Thus,
it would be possible in the future to add more compression types (up to a
maximum of 8, at least).

Ok. good to know.

Actually, it would be good if the code would check the compression type and
fail if it is unknown. This will help get proper errors on old slaves if new
compression types are added in the future.

Indeed, one of the things about the current binlog format is that there’s little
complete documentation outside of the code. Code changes and there’s no
clear specification. It makes things much better if what’s currently implicit is
explicit and also if the specs are outside of the code. That’s something I
tried to mention to Oracle with the new X protocol and obviously out of scope of
this discussion but the intricacies of the binlog traffic are quite hard to follow
for mere mortals…

(2) Senders would try to compress the event as requested. If the
compressed event is not any smaller then do not bother compressing it,
just send the original event.

This is not in the current patch, but it could be added easily enough. It
might not be needed though, the log_bin_compress_min_len kind of serves a
similar purpose.

It just saves the receiver doing extra work and I’d expect that allowing that
would be trivial even if not implemented now. Just a thought anyway.

Perhaps I am missing something but the need to compress events
individually probably requires quite a bit more logic and will not be
as extensible.

Perhaps. In any case, the patch at hand is based on individually compressed
events. Even with individual event types, the patch is nice and clean and
leaves much of the existing logic unchanged.

My guess is that this will primarily be useful for row-based binlogging,
where reasonable compression rates can be expected in many cases where
row-based binlogging has particularly high overhead compared to
statement-based. Per-event compression is in any case of limited use for
events that are small in size.

Sure so workload clearly affects the usefulness of a patch such as this one.
That said I’ve seen SBR traffic with very large packets as many bulk changes
are massaged to fit to max_allowed_packet, so here even in SBR you can
make a huge size saving if you have such statements.

Minimal RBR can help reduce some of the space concerns that RBR generates,
and I use it a lot (and requested this functionality) specifically for this reason.
However there are use cases where it can not be used (exporting data to
external systems that may need/want to have the whole “change log stream”)
so clearly compression here may be a good bit of extra functionality.

Fixing the case for RBR is good but I feel the focus may be too narrow,
especially if the approach can be used more generically.

I certainly have some SBR machines which generate large volumes of bin logs
and to be able to compress the events they generate on disk would be most helpful.

Thanks for sharing your thoughts.

Simon

Member

knielsen commented Oct 25, 2016

Simon Mudd simon.mudd@booking.com writes:

This would result in higher overhead on each event. There is a fixed header

Ok. I’ve been assuming the headers were small (from some casual browsing of things
related to the binlog router some time ago), but that may be wrong.

Yes, they are quite small, 10-20 bytes per event or something like that.

Indeed, one of the things about the current binlog format is that there’s little
complete documentation outside of the code. Code changes and there’s no
clear specification. It makes things much better if what’s currently implicit is
explicit and also if the specs are outside of the code. That’s something I

Tell me about it ... it is very hard to change most anything in
replication without breaking some odd corner somewhere.

Fixing the case for RBR is good but I feel the focus may be too narrow,
especially if the approach can be used more generically.

I certainly have some SBR machines which generate large volumes of bin logs
and to be able to compress the events they generate on disk would be most helpful.

Right. This patch compresses query events (ie. statement-based updates) and
row-events, so both of these are covered. LOAD DATA INFILE in statement mode
is not (but in row-based mode it should be, I think).

  • Kristian.
Member

knielsen commented Oct 26, 2016

2016-10-21 1:06 GMT+08:00 Kristian Nielsen knielsen@knielsen-hq.org:

GCSAdmin notifications@github.com writes:

We add new event types to support compressing the binlog, as follows:
QUERY_COMPRESSED_EVENT,
WRITE_ROWS_COMPRESSED_EVENT_V1,
UPDATE_ROWS_COMPRESSED_EVENT_V1,
DELETE_ROWS_COMPRESSED_EVENT_V1,
WRITE_ROWS_COMPRESSED_EVENT,
UPDATE_ROWS_COMPRESSED_EVENT,
DELETE_ROWS_COMPRESSED_EVENT

Thanks for the patch!

Overall, I'm much impressed with the quality, I see a lot of attention to
detail. Below, I have a number of comments and suggestions, but they are all
minor for a patch of this complexity.

A number of the comments are style fixes or simple changes, and I found it
easier to just do the changes to the source for you to review. I hope that
is ok. I rebased the series without intermediate merge to simplify review to
a single patch. The rebase with my suggested changes on top is here:

https://github.com/knielsen/server/commits/GCSAdmin-10.2-binlog-compressed2-2

Please check the changes I made and let me know if you disagree with
anything or I misunderstood something. Changes are mostly:

  • A bunch of code style (indentation, lines <= 80 chars, clarify comments,
    and so on).
  • Change LOG_EVENT_IS_QUERY() etc. from macros to static inline functions.
  • check_event_type() updated (new code merged into 10.2 recently).
  • Minor .result file update, also from code merged into 10.2 recently.

Thanks for your changes, it seems it's ok. And my new code will base on it.

I also have the following questions/suggestions:

  1. I think the code should sanity-check the compressed event header, to
    check that it does not access outside the event buffer in corrupt
    events. Eg. if lenlen is 7 and there is less than 7 bytes available in the
    event. I know that replication code in general is not robust to corrupt
    events, but it seems best that new code avoids overflowing buffers in case
    of corrupt event data.

Yes, you are right. I will add the check to avoid overflowing.

  2. There are some places where the compressed event types are not added to
    switch() or if() statements, mainly related to minimal binlog images, I
    think: Rows_log_event::read_write_bitmaps_cmp(),
    Rows_log_event::get_data_size(), Rows_log_event::do_apply_event(),
    Rows_log_event::write_data_body(). I think the reason for that is that the
    SQL thread never sees the compressed events (they are uncompressed by the IO
    thread), but I would like your confirmation that my understanding is
    correct.

In the SQL thread, the compressed event uncompresses the “body” in its
constructor, so most member functions of the compressed events can be
inherited directly; there is no need to override them.

For example, in read_write_bitmaps_cmp() the general_type_code is the same
for WRITE_ROWS_EVENT and WRITE_ROWS_COMPRESSED_EVENT, and the body has
already been uncompressed in the constructor, so nothing needs to change.

bool read_write_bitmaps_cmp(TABLE *table)
{
  bool res= FALSE;

  switch (get_general_type_code())
  {
    case DELETE_ROWS_EVENT:
      res= bitmap_cmp(get_cols(), table->read_set);
      break;
    case UPDATE_ROWS_EVENT:
      res= (bitmap_cmp(get_cols(), table->read_set) &&
            bitmap_cmp(get_cols_ai(), table->rpl_write_set));
      break;
    case WRITE_ROWS_EVENT:
      res= bitmap_cmp(get_cols(), table->rpl_write_set);
      break;
    default:
      /*
        We should just compare bitmaps for Delete, Write
        or Update rows events.
      */
      DBUG_ASSERT(0);
  }
  return res;
}
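
To make that concrete, a rough sketch of what the reading constructor does (illustration only; uncompress_row_body() is a hypothetical name for the step the patch performs, not an actual function from it):

  /* Sketch: the compressed row event delegates to the parent class and
     then uncompresses the row body, so inherited members such as
     read_write_bitmaps_cmp() above operate on ordinary row data. */
  Write_rows_compressed_log_event::Write_rows_compressed_log_event(
      const char *buf, uint event_len,
      const Format_description_log_event *description_event)
    : Write_rows_log_event(buf, event_len, description_event)
  {
    uncompress_row_body();   /* hypothetical helper: inflate the row data */
  }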

  3. Did you think about testing that BINLOG statements with compressed events
    can execute correctly? This happens with mysqlbinlog | mysql, when there are
    (compressed) row-based events in the binlog. I'm wondering if this could
    expose compressed events to code that normally runs in the SQL thread and
    expects to have events uncompressed for it by the IO thread?

As mentioned above, the compressed events are uncompressed in the constructor
in the SQL thread, so it would also work even if the IO thread did nothing.

Of course, it also works with "mysqlbinlog | mysql". I have added some test
cases for that in mysql-test, such as "main.mysqlbinlog_row_compressed" and
"main.mysqlbinlog_stmt_compressed".

A few detailed comments on the patch below (most comments are done as
changes in the above-linked git branch).

Otherwise, after answer to above 3 questions, I think the patch looks good
to go into 10.2.

  • Kristian.

+#define BINLOG_COMPRESSED_HEADER_LEN 1
+#define BINLOG_COMPRESSED_ORIGINAL_LENGTH_MAX_BYTES 4
+/**
+  Compressed Record
+    Record Header: 1 Byte
+             0 Bit: Always 1, mean compressed;
+             1-3 Bit: Reversed, compressed algorithm, Always 0, means zlib
+             4-7 Bit: Bytes of "Record Original Length"
+    Record Original Length: 1-4 Bytes
+    Compressed Buf:

I did not understand this. Why "reversed"? It seems to be bits 0-2 that have
the bytes of "Record Original Length" and bit 7 that has the '1' bit meaning
compression enabled? Maybe this can be clarified.

Sorry, there may be some mistakes there. The following should be correct:

Compressed Record
  Record Header: 1 Byte
    7 Bit: Always 1, means compressed;
    4-6 Bit: Reserved for the compression algorithm.
             Now it is always 0, which means zlib.
             Other compression algorithms may be supported in the future.
    0-3 Bit: Bytes of "Record Original Length"
  Record Original Length: 1-4 Bytes
  Compressed Buf:

+int query_event_uncompress(const Format_description_log_event *description_event, bool contain_checksum,
+                           const char *src, char* buf, ulong buf_size, bool* is_malloc,
+                           char **dst, ulong *newlen)
+{
+  ulong len = uint4korr(src + EVENT_LEN_OFFSET);
+  const char *tmp = src;
+
+  DBUG_ASSERT((uchar)src[EVENT_TYPE_OFFSET] == QUERY_COMPRESSED_EVENT);
+  uint8 common_header_len= description_event->common_header_len;
+  uint8 post_header_len= description_event->post_header_len[QUERY_COMPRESSED_EVENT-1];
+
+  tmp += common_header_len;
+
+  uint db_len = (uint)tmp[Q_DB_LEN_OFFSET];
+  uint16 status_vars_len= uint2korr(tmp + Q_STATUS_VARS_LEN_OFFSET);
+
+  tmp += post_header_len + status_vars_len + db_len + 1;
+
+  uint32 un_len = binlog_get_uncompress_len(tmp);
+  *newlen = (tmp - src) + un_len;

This is one place I would like to see a check on the data in the
event. Maybe just that 'len' is larger than 1+(tmp[0]&7). Or maybe simply
that len is at least 10, the minimal size for compressed events.

+int row_log_event_uncompress(const Format_description_log_event *description_event, bool contain_checksum,
+                             const char *src, char* buf, ulong buf_size, bool* is_malloc,
+                             char **dst, ulong *newlen)
+{

+  uint32 un_len = binlog_get_uncompress_len(tmp);
+  *newlen = (tmp - src) + un_len;

Another place where I'd like a check against looking outside the event
buffer.

+int binlog_buf_uncompress(const char *src, char *dst, uint32 len, uint32 *newlen)
+{
+  if((src[0] & 0x80) == 0)
+  {
+    return 1;
+  }
+
+  uint32 lenlen = src[0] & 0x07;
+  uLongf buflen = *newlen;
+  if(uncompress((Bytef *)dst, &buflen, (const Bytef*)src + 1 + lenlen, len) != Z_OK)

Again check on length.

Ok, I will add the check.

Member

knielsen commented Oct 26, 2016

2016-10-21 1:06 GMT+08:00 Kristian Nielsen knielsen@knielsen-hq.org:

GCSAdmin notifications@github.com writes:

We add new event types to support compress the binlog as follow:
QUERY_COMPRESSED_EVENT,
WRITE_ROWS_COMPRESSED_EVENT_V1,
UPDATE_ROWS_COMPRESSED_EVENT_V1,
DELETE_POWS_COMPRESSED_EVENT_V1,
WRITE_ROWS_COMPRESSED_EVENT,
UPDATE_ROWS_COMPRESSED_EVENT,
DELETE_POWS_COMPRESSED_EVENT

Thanks for the patch!

Overall, I'm much impressed with the quality, I see a lot of attention to
detail. Below, I have a number of comments and suggestions, but they are
all
minor for a patch of this complexity.

A number of the comments are style fixes or simple changes, and I found it
easier to just do the changes to the source for you to review. I hope that
is ok. I rebased the series without intermediate merge to simplify review
to
a single patch. The rebase with my suggested changes on top is here:

https://github.com/knielsen/server/commits/GCSAdmin-10.2-bin
log-compressed2-2

Please check the changes I made and let me know if you disagree with
anything or I misunderstood something. Changes are mostly:

  • A bunch of code style (indentation, lines <= 80 chars, clarify comments,
    and so on).
  • Change LOG_EVENT_IS_QUERY() etc. from macros to static inline functions.
  • check_event_type() updated (new code merged into 10.2 recently).
  • Minor .result file update, also from code merged into 10.2 recently.

Thanks for your changes, it seems it's ok. And my new code will base on it.

I also have the following questions/suggestions:

  1. I think the code should sanity-check the compressed event header, to
    check that it does not access outside the event buffer in corrupt
    events. Eg. if lenlen is 7 and there is less than 7 bytes available in the
    event. I know that replication code in general is not robust to corrupt
    events, but it seems best that new code avoids overflowing buffers in case
    of corrupt event data.

Yes, you are right. I will add the check to avoid overflowing.

  1. There are some places where the compressed event types are not added to
    switch() or if() statements, mainly related to minimal binlog images, I
    think: Rows_log_event::read_write_bitmaps_cmp(),
    Rows_log_event::get_data_size(), Rows_log_event::do_apply_event(),
    Rows_log_event::write_data_body(). I think the reason for that is that
    the
    SQL thread never sees the compressed events (they are uncompressed by the
    IO
    thread), but I would like your confirmation that my understanding is
    correct.

In SQL thread, the compressed event will uncompressed the “body” in the
constructor.
So, most of member functions of compressed events can inherit directly, no
need to derived.

Such as read_write_bitmaps_cmp(), there is same general_type_code for
WRITE_ROWS_EVENT and WRITE_ROWS_COMPRESSED_EVENT.
And the body is uncompressed in constructor for compressed events, so there
is no need to change.

bool read_write_bitmaps_cmp(TABLE *table)
{
bool res= FALSE;

switch (get_general_type_code())
{
  case DELETE_ROWS_EVENT:
    res= bitmap_cmp(get_cols(), table->read_set);
    break;
  case UPDATE_ROWS_EVENT:
    res= (bitmap_cmp(get_cols(), table->read_set) &&
          bitmap_cmp(get_cols_ai(), table->rpl_write_set));
    break;
  case WRITE_ROWS_EVENT:
    res= bitmap_cmp(get_cols(), table->rpl_write_set);
    break;
  default:
    /*
      We should just compare bitmaps for Delete, Write
      or Update rows events.
    */
    DBUG_ASSERT(0);
}
return res;

}

  3. Did you think about testing that BINLOG statements with compressed
     events can execute correctly? This happens with mysqlbinlog | mysql,
     when there are (compressed) row-based events in the binlog. I'm
     wondering if this could expose compressed events to code that normally
     runs in the SQL thread and expects to have events uncompressed for it
     by the IO thread?

As mentioned above, compressed events are uncompressed in the constructor
in the SQL thread. So it would also work even if the IO thread did nothing.

Of course, it also works with "mysqlbinlog | mysql". I have added some test
cases for that in mysql-test: main.mysqlbinlog_row_compressed and
main.mysqlbinlog_stmt_compressed.

A few detailed comments on the patch below (most comments are done as
changes in the above-linked git branch).

Otherwise, after answers to the above 3 questions, I think the patch looks
good to go into 10.2.

  • Kristian.

+#define BINLOG_COMPRESSED_HEADER_LEN 1
+#define BINLOG_COMPRESSED_ORIGINAL_LENGTH_MAX_BYTES 4
+/**
+  Compressed Record
+    Record Header: 1 Byte
+             0 Bit:  Always 1, mean compressed;
+             1-3 Bit: Reversed, compressed algorithm, Always 0, means zlib
+             4-7 Bit: Bytes of "Record Original Length"
+    Record Original Length: 1-4 Bytes
+    Compressed Buf:

I did not understand this. Why "reversed"? It seems to be bits 0-2 that have
the bytes of "Record Original Length" and bit 7 that has the '1' bit meaning
compression enabled? Maybe this can be clarified.

Sorry, there was a mistake here. The following should be correct:

Compressed Record
  Record Header: 1 Byte
    7 Bit: Always 1, means compressed;
    4-6 Bit: Reserved for the compression algorithm.
             Now it is always 0, which means zlib.
             Other compression algorithms may be supported in the future.
    0-3 Bit: Bytes of "Record Original Length"
  Record Original Length: 1-4 Bytes
  Compressed Buf:
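
For reference, a minimal sketch (hypothetical helper, not the patch code) of
decoding this header byte with that layout; the low bits are masked with
0x07 here, matching the binlog_buf_uncompress() excerpt further below:

#include <stdint.h>

struct Compressed_header
{
  bool    compressed;   /* bit 7: record is compressed                   */
  uint8_t algorithm;    /* bits 4-6: 0 = zlib, other values reserved     */
  uint8_t lenlen;       /* low bits: bytes of "Record Original Length"   */
};

static Compressed_header decode_compressed_header(uint8_t b)
{
  Compressed_header h;
  h.compressed= (b & 0x80) != 0;
  h.algorithm = (b >> 4) & 0x07;
  h.lenlen    = b & 0x07;
  return h;
}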

+int query_event_uncompress(const Format_description_log_event *description_event,
+                           bool contain_checksum,
+                           const char *src, char* buf, ulong buf_size,
+                           bool* is_malloc, char **dst, ulong *newlen)
+{
+  ulong len = uint4korr(src + EVENT_LEN_OFFSET);
+  const char *tmp = src;
+
+  DBUG_ASSERT((uchar)src[EVENT_TYPE_OFFSET] == QUERY_COMPRESSED_EVENT);
+
+  uint8 common_header_len= description_event->common_header_len;
+  uint8 post_header_len= description_event->post_header_len[QUERY_COMPRESSED_EVENT-1];
+
+  tmp += common_header_len;
+
+  uint db_len = (uint)tmp[Q_DB_LEN_OFFSET];
+  uint16 status_vars_len= uint2korr(tmp + Q_STATUS_VARS_LEN_OFFSET);
+  tmp += post_header_len + status_vars_len + db_len + 1;
+
+  uint32 un_len = binlog_get_uncompress_len(tmp);
+  *newlen = (tmp - src) + un_len;

This is one place I would like to see a check on the data in the
event. Maybe just that 'len' is larger than 1+(tmp[0]&7). Or maybe simply
that len is at least 10, the minimal size for compressed events.

+int row_log_event_uncompress(const Format_description_log_event *description_event,
+                             bool contain_checksum,
+                             const char *src, char* buf, ulong buf_size,
+                             bool* is_malloc, char **dst, ulong *newlen)
+{
+  ...
+  uint32 un_len = binlog_get_uncompress_len(tmp);
+  *newlen = (tmp - src) + un_len;

Another place where I'd like a check against looking outside the event
buffer.

+int binlog_buf_uncompress(const char *src, char *dst, uint32 len,
+                          uint32 *newlen)
+{
+  if((src[0] & 0x80) == 0)
+  {
+    return 1;
+  }
+
+  uint32 lenlen = src[0] & 0x07;
+  uLongf buflen = *newlen;
+  if(uncompress((Bytef *)dst, &buflen, (const Bytef*)src + 1 + lenlen, len) != Z_OK)

Again check on length.

Ok, I will add the check.
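
For illustration, the compression-type check discussed in this thread could
look roughly like this (a sketch only, with a hypothetical helper name; the
actual fix is in the updated branch linked later in this conversation):

/*
  Sketch: bits 4-6 of the header byte name the compression algorithm.
  Only 0 (zlib) is known today, so anything else should make the function
  fail cleanly instead of handing garbage to uncompress().
*/
static int compressed_algorithm_is_supported(unsigned char header_byte)
{
  return ((header_byte >> 4) & 0x07) == 0;   /* 0 = zlib */
}

/* In binlog_buf_uncompress(), roughly: if the algorithm is not supported,
   return 1 before calling uncompress(). */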

@knielsen

knielsen Oct 28, 2016

Member

Thanks for knielsen's nice reply. I would like to add some comments

2016-10-21 22:31 GMT+08:00 Simon Mudd simon.mudd@booking.com:

Hi Kristian,

On 21 Oct 2016, at 13:22, Kristian Nielsen knielsen@knielsen-hq.org
wrote:

Simon Mudd simon.mudd@booking.com writes:

I have not looked at the code in detail but a couple of questions
related to the implementation come to mind:

The author is probably the best to answer, but I can tell what I know
from
reviewing the patch:

(1) Why not use a single compressed event type and encapsulate the
whole original event inside ?

This would result in higher overhead on each event. There is a fixed
header
to each event, which cannot be compressed (without significantly more
changes to the code, at least). Thus, a single event encapsulating
another
full compressed event would require two headers, one uncompressed and one
compressed.

Ok. I’ve been assuming the headers were small (from some casual browsing
of things
related to the binlog router some time ago), but that may be wrong.

Now, the individual compressed event types inherit from the original event
types, so they do not need much extra logic.
If we used a single compressed event type and encapsulated the whole original
event inside, I don't think it would be easy to do without a lot of code
modification. For example:

bool Query_log_event::write()
{
  ...
  return write_header(event_length) ||
         write_data(buf, QUERY_HEADER_LEN) ||
         write_post_header_for_derived() ||
         write_data(start_of_status, (uint) (start-start_of_status)) ||
         write_data(safe_str(db), db_len + 1) ||
         write_data(query, q_len) ||
         write_footer();
}

And I agree that it may bring higher overhead.

There might be other pros and cons to each method, this was just what
occurred to me.

Is it not possible to have a "generic compressed event"? Mark mentions
allowing selection of the compression type and that seems useful, and I
would expect the compressed event to have this information and the
compressed content.

The current patch has reserved three bits for the compression method.
Thus,
it would be possible in the future to add more compression types (up to a
maximum of 8, at least).

Ok. good to know.

Actually, it would be good if the code would check the compression type
and
fail if it is unknown. This will help get proper errors on old slaves if
new
compression types are added in the future.

Indeed, one of the things about the current binlog format is that there’s
little
complete documentation outside of the code. Code changes and there’s no
clear specification. It makes things much better if what’s currently
implicit is
explicit and also if the specs are outside of the code. That’s something I
tried to mention to Oracle with the new X protocol and obviously out of
scope of
this discussion but the intricacies of the binlog traffic are quite hard
to follow
for mere mortals…

And the new code would check the compression type.

(2) Senders would try to compress the event as requested. If the
compressed event is not any smaller then do not bother compressing it,
just send the original event.

This is not in the current patch, but it could be added easily enough. It
might not be needed though, the log_bin_compress_min_len kind of serves a
similar purpose.

It just saves the receiver doing extra work and I’d expect that allowing
that
would be trivial even if not implemented now. Just a thought anyway.

As knielsen said, the main purpose of log_bin_compress_min_len is to avoid
compressing binlog events that are too small.
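
Putting the two ideas together, the writer-side policy discussed here could
look roughly like this (an illustrative sketch using zlib directly, not code
from the patch; it ignores the record header for brevity):

#include <zlib.h>
#include <vector>

/*
  Sketch: skip compression for events shorter than log_bin_compress_min_len,
  and (as Simon suggests) fall back to the original bytes when compression
  does not actually make the payload smaller.
*/
static std::vector<unsigned char>
maybe_compress(const unsigned char *raw, size_t raw_len,
               bool log_bin_compress, size_t log_bin_compress_min_len)
{
  std::vector<unsigned char> original(raw, raw + raw_len);
  if (!log_bin_compress || raw_len < log_bin_compress_min_len)
    return original;                          /* too small to bother        */

  uLongf comp_len= compressBound((uLong) raw_len);
  std::vector<unsigned char> compressed(comp_len);
  if (compress(compressed.data(), &comp_len, raw, (uLong) raw_len) != Z_OK ||
      comp_len >= raw_len)
    return original;                          /* no gain: keep the original */
  compressed.resize(comp_len);
  return compressed;
}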

Perhaps I am missing something but the need to compress events
individually probably requires quite a bit more logic and will not be
as extensible.

Perhaps. In any case, the patch at hand is based on individually
compressed
events. Even with individual event types, the patch is nice and clean and
leaves much of the existing logic unchanged.

My guess is that this will primarily be useful for row-based binlogging,
where reasonable compression rates can be expected in many cases where
row-based binlogging has particularly high overhead compared to
statement-based. Per-event compression is in any case of limited use for
events that are small in size.

Sure so workload clearly affects the usefulness of a patch such as this
one.
That said I’ve seen SBR traffic with very large packets as many bulk
changes
are massaged to fit to max_allowed_packet, so here even in SBR you can
make a huge size saving if you have such statements.

Minimal RBR can help reduce some of the space concerns that RBR generates,
and I use it a lot (and requested this functionality) specifically for
this reason.
However there are use cases where it can not be used (exporting data to
external systems that may need/want to have the whole “change log stream”)
so clearly compression here may be a good bit of extra functionality.

Fixing the case for RBR is good but I feel the focus may be too narrow,
especially if the approach can be used more generically.

I certainly have some SBR machines which generate large volumes of bin logs
and to be able to compress the events they generate on disk would be most
helpful.

Thanks for sharing your thoughts.

Simon

SBR also works with this feature.


@knielsen

knielsen Oct 28, 2016

Member

Hi Mark,

There are 3 bits reserved for the compression algorithm in the Compressed
Record header. At present it is always 0, which means zlib. Other compression
algorithms can easily be supported in the future.

Compressed Record
  Record Header: 1 Byte
    7 Bit: Always 1, means compressed;
    4-6 Bit: Reserved for the compression algorithm.
             Now it is always 0, which means zlib.
             Other compression algorithms may be supported in the future.
    0-3 Bit: Bytes of "Record Original Length"
  Record Original Length: 1-4 Bytes
  Compressed Buf:

2016-10-21 7:32 GMT+08:00 MARK CALLAGHAN mdcallag@gmail.com:

How easy will it be for this to use a lower latency compression algorithm
like lz4, Snappy or zstandard?

On Thu, Oct 20, 2016 at 2:52 PM, Daniel Black daniel.black@au1.ibm.com
wrote:

+#define BINLOG_COMPRESSED_HEADER_LEN 1
+#define BINLOG_COMPRESSED_ORIGINAL_LENGTH_MAX_BYTES 4
+/**
+  Compressed Record
+    Record Header: 1 Byte
+             0 Bit:  Always 1, mean compressed;
+             1-3 Bit: Reversed, compressed algorithm, Always 0, means zlib

Good to see this progressed from the old MySQL bug I saw originally.

I'm a bit late into this, however I suggest we can use a better algorithm
than zlib. There are a significant number of algorithms that may have
better compression at similar or faster speed, like brotli, zstd, snappy.

https://quixdb.github.io/squash-benchmark/#results-table



Mark Callaghan
mdcallag@gmail.com


@knielsen

knielsen Oct 28, 2016

Member

Hi,

On 28 Oct 2016, at 06:18, 陈福荣 <vinchen13@gmail.com> wrote:

Thanks for knielsen's nice reply. I would like to add some comments

(2) Senders would try to compress the event as requested. If the
compressed event is not any smaller then do not bother compressing it,
just send the original event.

This is not in the current patch, but it could be added easily enough. It
might not be needed though, the log_bin_compress_min_len kind of serves a
similar purpose.

It just saves the receiver doing extra work and I’d expect that allowing that
would be trivial even if not implemented now. Just a thought anyway.

As knielsen said, the main purpose of log_bin_compress_min_len is to avoid compressing binlog events that are too small.

Can’t you just find out how big the compressed event is and, if it’s not smaller than the original event, send the original?
That way you know you get some extra compression, which if you’re enabling this feature is what you want anyway.
Having to configure this minimum size means you need to set a specific value. Maybe you can use this
new setting with its current usage and use -1 to mean only write a compressed event if it’s actually smaller?

SBR also works with this feature.

That’s good to hear.

Simon


@vinchen

vinchen Oct 29, 2016

Contributor

Hi knielsen,

The new code is here:
https://github.com/vinchen/server/commits/GCSAdmin-10.2-binlog-compressed2-2

based on
https://github.com/knielsen/server/commits/GCSAdmin-10.2-binlog-compressed2-2

And added two fixes:
1. Avoid overflowing buffers in case of corrupt events.
2. Check the compression algorithm.


@knielsen

knielsen Nov 3, 2016

Member

Hi knielsen,

The new code is here:
https://github.com/vinchen/server/commits/GCSAdmin-10.2-binlog-compressed2-2

based on
https://github.com/knielsen/server/commits/GCSAdmin-10.2-binlog-compressed2-2

And added two fixes:
1. Avoid overflowing buffers in case of corrupt events.
2. Check the compression algorithm.

2016-10-28 21:06 GMT+08:00 Simon Mudd simon.mudd@booking.com:

Hi,

On 28 Oct 2016, at 06:18, 陈福荣 vinchen13@gmail.com wrote:

Thanks for knielsen's nice reply. I would like to add some comments

(2) Senders would try to compress the event as requested. If the
compressed event is not any smaller then do not bother compressing it,
just send the original event.

This is not in the current patch, but it could be added easily enough. It
might not be needed though, the log_bin_compress_min_len kind of serves a
similar purpose.

It just saves the receiver doing extra work and I’d expect that allowing
that would be trivial even if not implemented now. Just a thought anyway.

As knielsen said, the main purpose of log_bin_compress_min_len is to avoid
compressing binlog events that are too small.

Can’t you just find out how big the compressed event is and, if it’s not
smaller than the original event, send the original?
That way you know you get some extra compression, which if you’re enabling
this feature is what you want anyway.
Having to configure this minimum size means you need to set a specific
value. Maybe you can use this new setting with its current usage and use -1
to mean only write a compressed event if it’s actually smaller?

SBR also works with this feature.

That’s good to hear.

Simon


@felixliang

felixliang Nov 3, 2016

Hi @knielsen,
Is there anything that needs to be fixed regarding vinchen's newest commit? Please let us know.


@knielsen

knielsen Nov 3, 2016

Member

vinchen notifications@github.com writes:

The new code is here:
https://github.com/vinchen/server/commits/GCSAdmin-10.2-binlog-compressed2-2

And added two fixes:
1. Avoid overflowing buffers in case of corrupt events.
2. Check the compression algorithm.

Looks fine, thanks for fixing this.

I have now merged and pushed this to MariaDB 10.2.

Rows_log_event::write_data_body(). I think the reason for that is that the
SQL thread never sees the compressed events (they are uncompressed by the
IO thread), but I would like your confirmation that my understanding is
correct.

  1. Did you think about testing that BINLOG statements with compressed

As mentioned above, the compressed events are uncompressed in the constructor
in the SQL thread.

Right, thanks, I understand now.

So the BINLOG statements in the output of mysqlbinlog have the uncompressed
data. This makes sense, just like the SQL queries are uncompressed before
being output by mysqlbinlog.

So in fact, one change I suggested was wrong: to accept compressed event
types in BINLOG statements. So I reverted this change again, and also added
some test cases:

56a041c

BTW, I think this is recently merged code (from the delayed replication
feature) that would not have appeared in your original patch.

Thanks for the explanation to help me understand this.

I also fixed a .result file - this is a failure that would only show up when
running the test suite with --embedded:

3c0ff61

So I think everything should be fine now and this should appear in MariaDB
10.2.3, if I understand correctly.

Once again thanks for the patch. It was a pleasure to see a replication
addition that was so well done, and with so much attention to detail.

  • Kristian.
@knielsen

knielsen Nov 3, 2016

Member

Thanks, pushed to MariaDB 10.2.3.


@knielsen knielsen closed this Nov 3, 2016

@svoj svoj added this to the 10.2 milestone Jul 11, 2017
