Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/remove column schema #477

Conversation

sean-k1
Copy link
Collaborator

@sean-k1 sean-k1 commented Sep 15, 2023

Resolve #473
Changes

Remove Column Schema Information
Do not rely on current database schemas

Reason

For now, we can benefit from the column schema information, but in the future (since MYSQL 8.0.14) we will rely on optional_meta_data to get table information.

We apologize to those using MYSQL 5, but we no longer support python-mysql-replication prior to version 0.44.0.

Similar projects like ours, go-mysql and rust-mysql-common, do not SELECT information_schema, but only refer to column names in optional metadata.

We've been relying on the current database to give us values based on incorrect column information if the table at the time of the current event is different from the current database table.

The column information must be obtained from the information in the MYSQL Binlog packets so that we can build a reliable open source project.

Detail

We currently have two problems with relying on the column_schmea of the database.

  1. the table objects in our table_map do not change when DDL Event occurs.

  2. if the table information at the time of the event is different from the column_schema of the current database, we have no way to find out. (because it depends on the current database)

This is a serious problem as it indicates incorrect column information changes on UPDATE, INSERT, and DELETE.

So with this PR, we hope to no longer rely on the database.

In this PR, we don't currently know the column name information because we don't rely on the current database.
Therefore, we set the Table object's column_name_flag = False.

We have now changed all the test case events to work correctly when column_name_flag==True.

After this PR, we are going to write something to change column_name_flag=True when we get column information due to optional_meta_data.
We can also provide it based on the table column information at the time of the event.
(the table objects in our table_map will be changed)

Reference

https://dev.mysql.com/blog-archive/more-metadata-is-written-into-binary-log/

remove using column_schema

remove using column_schemas
@sean-k1 sean-k1 marked this pull request as draft September 15, 2023 15:58
fix : testcase

fix testcase

fix: testcase
table init changed
@sean-k1 sean-k1 marked this pull request as ready for review September 16, 2023 02:59
@sean-k1
Copy link
Collaborator Author

sean-k1 commented Sep 16, 2023

@julien-duponchelle

Hi Julien!
I'd like to ask you to think deeply about this issue.
By solving this, we can become a more reliable open source project.

@julien-duponchelle
Copy link
Owner

Maybe we could keep the old mode and use the new one if the database is compatible even if it's create a lot of additional complexity. I think it will be usefull for people using the library for moving from old MySQL deployment.

I agree that the new mode is better.

@dongwook-chan
Copy link
Collaborator

@julien-duponchelle @sean-k1
Considering the depth of this change and its impact on the project's core functionality, I believe this introduces a new chapter for python-mysql-replication. With that in mind, I'd recommend considering this for a major version increment. This would help signal to the users the significance of the change and set the right expectations regarding compatibility and functionality differences.

@julien-duponchelle
Copy link
Owner

Agree a major version increase is a good idea

@julien-duponchelle
Copy link
Owner

I'm not strongly attached to backward compatibility with MySQL 5 even if it will be ideal. If it's too much effort we can make it clear in the documentation that old version require an old version of the library.

@sean-k1
Copy link
Collaborator Author

sean-k1 commented Sep 17, 2023

@julien-duponchelle
I Check this one the database is compatible
By SHOW VARIABLES LIKE 'BINLOG_ROW_METADATA';

If this value is None, provide logging to the user at the warning level.
using python-mysql-replication version earlier than 1.0

if this value is Minimal
스크린샷 2023-09-18 오전 12 29 37

@julien-duponchelle
Copy link
Owner

Great

Let me know when you are ready for a merge

@sean-k1
Copy link
Collaborator Author

sean-k1 commented Sep 18, 2023

@julien-duponchelle I'm ready to merge this PR.
If this PR merged, I resolve the conflict sync Column Pr

delete print
@sean-k1
Copy link
Collaborator Author

sean-k1 commented Sep 19, 2023

@julien-duponchelle
It would be nice if this PR was also merged!
This PR should have been merged before the column sync PR.
but as the order of merges changed, the comment didn't change all the way.

@julien-duponchelle julien-duponchelle merged commit a301596 into julien-duponchelle:main Sep 20, 2023
5 checks passed
@julien-duponchelle
Copy link
Owner

And voilà

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

How about remove information about self.column_schemas?
3 participants