Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for specifying an end log_pos #357

Merged
merged 8 commits into from
Oct 18, 2021

Conversation

paulvic
Copy link
Contributor

@paulvic paulvic commented Sep 14, 2021

I have a use case where we want to read binary logs from a specific start position to a specific end position.

I was originally planning to just filter the events returned by the BinLogStreamReader but it just sits waiting in self._stream_connection.read_packet() and never returns the events (the DB has very sporadic low traffic).

I was wondering if there would be an appetite to add support for an end_log_pos parameter? I've tested this out locally and it solves my problem by exiting the loop when we've moved past the end_pos.

There may be a better way to achieve my goal here, so all feedback is appreciated. Thanks!

@dongwook-chan
Copy link
Collaborator

dongwook-chan commented Sep 15, 2021

This is an interesting feature.
I do have some questions, though.
What behavior would you expect when all the events in the current binlog file are exhausted?
Go on to the next binlog file and begin from log_pos 0? Or just stop at it?

@paulvic
Copy link
Contributor Author

paulvic commented Sep 15, 2021

This is an interesting feature.
I do have some questions, though.
What behavior would you expect when all the events in the current binlog file is exhausted?
Go on to the next binlog file and begin from log_pos 0? Or just stop at it?

Thanks for the reply - good questions! For my use case, I would always be passing a start and end position in the same file. If the current binlog file is exhausted, then I think perhaps it would be best to just stop there.

We could add an end_log_file parameter that would enable stopping in the same or another file?

@paulvic
Copy link
Contributor Author

paulvic commented Sep 16, 2021

@dongwook-chan any thoughts on how to proceed? I can add an end_log_file parameter if you think that's a good idea.

@dongwook-chan
Copy link
Collaborator

@paulvic
Great idea.

Before going any further, I need to take a look at code.
I'll leave comments if anything comes across my mind.

@dongwook-chan
Copy link
Collaborator

dongwook-chan commented Sep 24, 2021

@paulvic
Sorry for the delay.
It was Korean thanks giving holiday.
I've studied code for a bit.
It seems that the main issue will be when to stop the stream.
It might take a while to determine when.
I'll leave comments for any progress.

And please invite me as a collaborator to your repo, paulvic:patch-1.

@dongwook-chan
Copy link
Collaborator

dongwook-chan commented Oct 10, 2021

@paulvic

Seems like you made some changes in paulvic:patch-1 for your own sake.
Some of the changes you made are not eligible to be merged in to the project's master branch.
So I don't think I can continue to work on in your forked repo anymore.

Plus, I think using boolean variable to 'should_continue' isn't the correct solution.
I would like to know if it is tested.
Do you want me continue the work on my forked repo?
Hope to hear back from you.

@paulvic
Copy link
Contributor Author

paulvic commented Oct 11, 2021

@paulvic

Seems like you made some changes in paulvic:patch-1 for your own sake. Some of the changes you made are not eligible to be merged in to the project's master branch. So I don't think I can continue to work on in your forked repo anymore.

Plus, I think using boolean variable to 'should_continue' isn't the correct solution. I would like to know if it is tested. Do you want me continue the work on my forked repo? Hope to hear back from you.

Hi @dongwook-chan yes I'd still like to continue with this work. Your forked repo is fine with me.

@julien-duponchelle
Copy link
Owner

Hi @paulvic I'm no longer very active but I'm really excited to see intercom is using my code for something :D

@dongwook-chan
Copy link
Collaborator

dongwook-chan commented Oct 16, 2021

@paulvic
I removed should_continue variable and added test in my forked branch.
However, I think you should be credited for your contribution.
There seems to be a few options:

  • Remove commits in paulvic:patch-1 that are not relevant to this PR. (Or maybe move the commits to another branch of yours?)
  • I'll create a PR in intercom:patch-1 then in noplay:main, once PR in intercom:patch-1 is merged.

I'll be waiting for your reply before making any progress with the PR.

@dongwook-chan
Copy link
Collaborator

dongwook-chan commented Oct 16, 2021

@paulvic
I would like to know if you want end_log_pos to be inclusive.
I assumed that you wanted it this way and if there is a binlog event that ends at end_log_pos, it will be the last event to be read by BinLogStreamReader.

@paulvic
Copy link
Contributor Author

paulvic commented Oct 16, 2021

@paulvic I removed should_continue variable and added test in my forked branch.

Hi @dongwook-chan, thanks for your help here. I added a comment to your proposed changes dongwook-chan@67d3a71.

  • Remove commits in paulvic:patch-1 that are not relevant to this PR. (Or maybe move the commits to another branch of yours?)

Which commits are you referring to? There are just 3 commits, and I believe they're all related to this PR but maybe I'm misunderstanding 😄.

I would like to know if you want end_log_pos to be inclusive.

Yes 👍 , I would like the end_log_pos to be inclusive.

@paulvic
Copy link
Contributor Author

paulvic commented Oct 16, 2021

Hi @paulvic I'm no longer very active but I'm really excited to see intercom is using my code for something :D

😄 @noplay We're appreciative your work here!

* Binlog event at end_pos is the last event to be read (end_pos is inclusive)
* iter() ends when fetchone() returns None
@dongwook-chan
Copy link
Collaborator

dongwook-chan commented Oct 17, 2021

@paulvic

Hi @dongwook-chan, thanks for your help here. I added a comment to your proposed changes dongwook-chan/python-mysql-replication@67d3a71.

Thank you for your review. You made a good point and the issue will be dealt with ASAP.

Which commits are you referring to? There are just 3 commits, and I believe they're all related to this PR but maybe I'm misunderstanding 😄.

My bad, they're indeed related to this PR. Now I don't see any reason to remove commits.
I've copied commits from dongwook-chan/python-mysql-replication@67d3a71 to paulvic:patch-1

As paulvic pointed out in [the
comment](dongwook-chan@67d3a71),

iteration will be stalled indefinitely if there is no events to read
past end_log_pos.

`is_past_end_log_pos` attribute will be added to BinLogStreamReader to stop iteration even if it is the last event.
@dongwook-chan
Copy link
Collaborator

dongwook-chan commented Oct 17, 2021

@paulvic

I've fixed the issue.
I have done some tests and it seems to work right.
Would you please test the code for your specific case?
I'll ask noplay, the owner to review and merge this PR if no issue arises.

@paulvic
Copy link
Contributor Author

paulvic commented Oct 17, 2021

@paulvic

I've fixed the issue. I have done some tests and it seems to work right. Would you please test the code for your specific case? I'll ask noplay, the owner to merge this PR if no issue arises.

@dongwook-chan - confirmed the fix works for me 👍

@dongwook-chan
Copy link
Collaborator

@noplay
Could you review this PR if you have time?
Please take your time.
I appreciate your effort to keep the repo going.

@julien-duponchelle
Copy link
Owner

That 's awesome

@julien-duponchelle julien-duponchelle merged commit cf314b2 into julien-duponchelle:main Oct 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants