Support InnoDB-based binary log (transactional binlog stored within InnoDB) #25
ScottStroz started this conversation in Extensibility/Ecosystem
Originally submitted by @vgrippa
Abstract:
This session proposes an InnoDB-based transactional binary log architecture for MySQL, where binlog events are stored directly within InnoDB instead of standalone files. The approach aims to simplify durability and crash recovery, reduce commit overhead by eliminating an extra fsync and two-phase commit coordination, and provide transactionally consistent binlog state for replication and CDC ecosystems, while preserving the existing replication protocol.
Short Description of the Feature:
Move the binary log out of standalone files and into an InnoDB-managed tablespace, so that data changes and their binlog events are written within the same mini-transaction and made durable by a single redo flush. This removes one fsync from every durable commit, eliminates the two-phase commit between InnoDB redo and the binary log, simplifies crash recovery to a single redo pass, and makes binlog state transactionally consistent with the rows it describes. The replication wire protocol stays unchanged. The feature is opt-in via a new binlog_storage = FILE | INNODB setting, defaulting to FILE.
Bug DB Link
https://bugs.mysql.com/bug.php?id=120430
Full Description
The binary log is currently a separate, file-based log written outside InnoDB's transactional layer. Every durable, replicated commit pays for two fsyncs, one for the binary log (sync_binlog=1) and one for the InnoDB redo log (innodb_flush_log_at_trx_commit=1), and for an internal two-phase commit between redo and binlog. After a crash, recovery has to reconcile the two logs, which is non-trivial. Storing binlog events inside InnoDB would collapse this into a single durability domain.
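To make the fsync arithmetic concrete, here is a minimal toy model of the two commit paths. It is a sketch, not MySQL code: `Disk`, `commit_file_binlog`, and `commit_innodb_binlog` are hypothetical names invented for illustration, and group commit, which amortizes these flushes across concurrent transactions, is deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class Disk:
    """Stand-in for real I/O: records which log each fsync targets."""
    fsyncs: list = field(default_factory=list)

    def fsync(self, target: str) -> None:
        self.fsyncs.append(target)


def commit_file_binlog(disk: Disk) -> None:
    """Simplified durable commit with today's file-based binlog."""
    # Phase 1: InnoDB prepare; redo is made durable
    # (innodb_flush_log_at_trx_commit=1).
    disk.fsync("redo")
    # Phase 2: binlog events are made durable (sync_binlog=1).
    # Once this returns, recovery will treat the transaction as committed.
    disk.fsync("binlog")
    # The InnoDB commit record follows but needs no extra flush in this
    # model, because recovery can decide the outcome from the binlog.


def commit_innodb_binlog(disk: Disk) -> None:
    """Hypothetical commit path under the proposal (assumption)."""
    # Row changes and binlog events share one redo stream, so a single
    # redo flush makes both durable.
    disk.fsync("redo")


if __name__ == "__main__":
    for commit in (commit_file_binlog, commit_innodb_binlog):
        disk = Disk()
        commit(disk)
        print(commit.__name__, disk.fsyncs)
```

The model only counts flushes on the commit path; it says nothing about how large each flush is or how the InnoDB-resident binlog would be laid out.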
Why it matters:
Performance: one fsync per commit on the hot path; simpler group commit.
Reliability: single recovery pass over InnoDB redo, no XA reconciliation between two logs.
Operability: binlog state is transactional, so it can be queried, purged, and backed up atomically with the data.
Ecosystem: CDC consumers (Debezium, Maxwell, Readyset, ProxySQL mirroring) get a log that is transactionally consistent with the rows they read.
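The reliability point can be sketched the same way. With two logs, recovery has to cross-check them: a transaction that InnoDB left in the prepared state is committed only if its XID made it into the binary log, and rolled back otherwise. With one log, the outcome is read off the redo stream alone. The functions below are an illustrative simplification with hypothetical names and record shapes, not the server's actual recovery code.

```python
def recover_file_binlog(prepared_xids, binlog_xids):
    """Two-log reconciliation (simplified).

    prepared_xids: XIDs InnoDB found in the prepared state after a crash.
    binlog_xids:   XIDs found while scanning the last binlog file.
    Returns (to_commit, to_roll_back), preserving input order.
    """
    in_binlog = set(binlog_xids)
    to_commit = [x for x in prepared_xids if x in in_binlog]
    to_roll_back = [x for x in prepared_xids if x not in in_binlog]
    return to_commit, to_roll_back


def recover_innodb_binlog(redo_records):
    """Single-pass recovery under the proposal (assumed behavior).

    redo_records: ordered (kind, xid) tuples, kind being "change" or
    "commit". A transaction and its binlog events survive together only
    if its commit record reached the redo log.
    """
    committed = {xid for kind, xid in redo_records if kind == "commit"}
    seen = dict.fromkeys(xid for _, xid in redo_records)  # keeps first-seen order
    to_commit = [x for x in seen if x in committed]
    to_roll_back = [x for x in seen if x not in committed]
    return to_commit, to_roll_back


if __name__ == "__main__":
    print(recover_file_binlog([101, 102, 103], [100, 101, 103]))
    print(recover_innodb_binlog(
        [("change", 7), ("commit", 7), ("change", 8)]
    ))
```

In the second function there is no second log to consult, which is the simplification the proposal is after.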
Proposal
mysql-community-design-proposal.json