Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Detect ddl add handlers #323

Merged
merged 3 commits into from
Feb 24, 2022
Merged

Detect ddl add handlers #323

merged 3 commits into from
Feb 24, 2022

Conversation

shivnagarajan
Copy link
Contributor

This is a slightly alternate version of

#322

@pawandubey

binlog_streamer.go Outdated Show resolved Hide resolved
binlog_streamer.go Show resolved Hide resolved
@shivnagarajan shivnagarajan changed the base branch from detect-ddl to master December 9, 2021 18:14
@shivnagarajan shivnagarajan force-pushed the detect-ddl-add-handlers-temp branch 2 times, most recently from b6b3c98 to ad0ba1e Compare December 9, 2021 18:53
@shivnagarajan
Copy link
Contributor Author

This changes the code to provide the ability to add a binlogEvent handler for individual events. The golang test has been updated to test adding such a handler for a query event and validates that it actually works.

Since the code no longer needs to parse the binlog events explicitly, we did not need to pull in the full parser for the ddl events that we had earlier.

@shivnagarajan shivnagarajan marked this pull request as ready for review December 9, 2021 19:04
// reset following the next RowsQueryEvent before the corresponding RowsEvent(s)
query = nil
} else {
query, err = s.defaultEventHandler(ev, query, &es)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the defaultEventHandler fires events to the eventListeners, but if one were to override the eventHandler, eventListeners will no longer be called? This also means that the default event listener that handles the row event (by sending it to the BinlogWriter) will no longer be called. You would have to manually reregister handleRowEvent as a eventHandler? This feels very confusing and somewhat error prone? Are there any cases where we don't want to row events to be handled as well?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shivnagarajan and I discussed this briefly: the event listeners would only be missed if a custom handler for ROWS_EVENT was registered (overwriting the default handler behaviour). The problem is that the callers would not have access to s.eventListeners if they did want to fire them off themselves. That's probably okay since we are providing a hook into event handling and not a hook into the bodies of built-in event handlers themselves.

There are a few ways to tackle this, that I wanted to summarize:

  • Accept this limitation and note it in the documentation (possibly as a gotcha).
    This is the current approach. If there is demand for this, we can later come back and adopt one of the alternative approaches later.

  • Always firing off the eventListeners for ROWS_EVENT, even if a custom handler had been registered.
    This would solve the specific problem
    This solves the specific problem of firing off the listeners for ROWS_EVENT without increasing the surface area of the API exposed to the handler, but it will be a surprising, snowflake-y behaviour which cannot be turned off even if you wanted to.

  • Make eventListeners available to the callers inside the handler so that they can do with them as they wish.
    This solves the whole problem at the cost of muddling the handler signature with data unrelated to the event state. Moreover, the listeners are currently only be useful for one specific event but would need to be a part of the signature for handlers of all events (this is perhaps not too bad but still seems unnecessary).

  • Another approach not considered here?

I think we should go with the first option for now and revisit later if needed, FWIW.


Separately, I think if we wanted to be "truly" flexible in handler design then we should get to a place where all our default handlers could be registered as custom handlers instead of being implemented in a private fallback function. But we don't need to take on that complexity all at once.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK the first option sounds OK for the time being.

I suspect this part of the code will need to be revisited if we ever look into a larger restructure of Ghostferry as I think the complexity is increasing with this change (and also would with any other changes proposed). Perhaps the callback system would need to be revamped 🤷 , but that's a problem for another day.

@shivnagarajan shivnagarajan force-pushed the detect-ddl-add-handlers-temp branch 2 times, most recently from 8a1bb95 to a8f2db5 Compare December 17, 2021 03:52
Co-authored-by: Pawan Dubey <pawan.dubey@shopify.com>
Co-authored-by: Shiv Nagarajan <shiv.nagarajan@shopify.com>
Co-authored-by: Pawan Dubey <2499863+pawandubey@users.noreply.github.com>
@shivnagarajan
Copy link
Contributor Author

I just rebased this to latest master. Will clean up a little based on the remaining comments.

Co-authored-by: Pawan Dubey <2499863+pawandubey@users.noreply.github.com>
}

// try attaching a handler to a valid event type
err := this.binlogStreamer.AddBinlogEventHandler(replication.TABLE_MAP_EVENT, qe)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noice!

// reset following the next RowsQueryEvent before the corresponding RowsEvent(s)
query = nil
} else {
query, err = s.defaultEventHandler(ev, query, &es)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shivnagarajan and I discussed this briefly: the event listeners would only be missed if a custom handler for ROWS_EVENT was registered (overwriting the default handler behaviour). The problem is that the callers would not have access to s.eventListeners if they did want to fire them off themselves. That's probably okay since we are providing a hook into event handling and not a hook into the bodies of built-in event handlers themselves.

There are a few ways to tackle this, that I wanted to summarize:

  • Accept this limitation and note it in the documentation (possibly as a gotcha).
    This is the current approach. If there is demand for this, we can later come back and adopt one of the alternative approaches later.

  • Always firing off the eventListeners for ROWS_EVENT, even if a custom handler had been registered.
    This would solve the specific problem
    This solves the specific problem of firing off the listeners for ROWS_EVENT without increasing the surface area of the API exposed to the handler, but it will be a surprising, snowflake-y behaviour which cannot be turned off even if you wanted to.

  • Make eventListeners available to the callers inside the handler so that they can do with them as they wish.
    This solves the whole problem at the cost of muddling the handler signature with data unrelated to the event state. Moreover, the listeners are currently only be useful for one specific event but would need to be a part of the signature for handlers of all events (this is perhaps not too bad but still seems unnecessary).

  • Another approach not considered here?

I think we should go with the first option for now and revisit later if needed, FWIW.


Separately, I think if we wanted to be "truly" flexible in handler design then we should get to a place where all our default handlers could be registered as custom handlers instead of being implemented in a private fallback function. But we don't need to take on that complexity all at once.

Copy link
Contributor

@shuhaowu shuhaowu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I requested a few minor comments, as I feel the complexity of the code in the BinlogStreamer increasing a bit, but that's likely OK for now.

binlog_streamer.go Show resolved Hide resolved
binlog_streamer.go Show resolved Hide resolved
Co-authored-by: Pawan Dubey <2499863+pawandubey@users.noreply.github.com>

Co-authored-by: Pawan Dubey <2499863+pawandubey@users.noreply.github.com>
@shivnagarajan shivnagarajan merged commit 05fb06b into master Feb 24, 2022
@shivnagarajan shivnagarajan deleted the detect-ddl-add-handlers-temp branch February 24, 2022 15:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants