I enjoyed the introductory blog post for Maxwell at:
https://developer.zendesk.com/blog/introducing-maxwell-a-mysql-to-kafka-binlog-processor
This looks like a great project!

I just wanted to respond to one comment:

"We're fairly sure LinkedIn's engineers, while writing DataBus solved this problem by patching mysql to output full schema information into its binlogs. They never seemed to release the patch, though."

Actually, we didn't make any such patch. The use case for Databus consuming from MySQL (as opposed to Oracle) is limited to Espresso, LinkedIn's NoSQL store. The Open Replicator-based binlog-to-Databus event translator has access to the Espresso schema registry, which contains all the necessary information about the MySQL columnar data.

Espresso stores and retrieves Avro documents via a REST API. The URL path for a document is defined by a table schema that specifies which columns compose the primary key in MySQL; all other columns are common to every table in the system. The documents themselves are stored as Avro binary in a LONGBLOB column, so the MySQL tables are not further specialized based on the layout of the documents they hold. When processing a row event from the binlog, one only needs the table name from the corresponding TableMapEvent to fetch (and cache) that table's schema from the registry and obtain all the information necessary to decode the row.
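The fetch-and-cache step described above can be sketched roughly as follows. This is a hypothetical illustration, not Espresso's actual code: `SchemaResolver`, the string-valued registry, and the table name are all stand-ins; a real translator would hold parsed Avro schemas and talk to a remote schema registry service.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: resolving a row event's schema using only the
// table name carried by the preceding TableMapEvent in the binlog.
public class SchemaResolver {
    // Cache of table name -> schema (a String here as a stand-in for a
    // parsed Avro schema object).
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    // Stand-in for the remote Espresso schema registry.
    private final Map<String, String> registry;

    public SchemaResolver(Map<String, String> registry) {
        this.registry = registry;
    }

    // First lookup for a table fetches from the registry; subsequent row
    // events for the same table are served from the local cache.
    public String schemaFor(String tableName) {
        return cache.computeIfAbsent(tableName, registry::get);
    }

    public static void main(String[] args) {
        Map<String, String> registry = new ConcurrentHashMap<>();
        registry.put("MemberProfile",
                "{\"type\":\"record\",\"name\":\"MemberProfile\"}");
        SchemaResolver resolver = new SchemaResolver(registry);

        String first = resolver.schemaFor("MemberProfile");  // registry fetch
        String again = resolver.schemaFor("MemberProfile");  // cache hit
        System.out.println(first.equals(again)); // prints true
    }
}
```

Because every Espresso table shares the same fixed column layout apart from its primary key, this single table-name-to-schema lookup is all the metadata the translator needs per row event, with no MySQL patch required.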
Tom Quiggle,
https://www.linkedin.com/in/tquiggle