fix - Bug/increase history list length #431


chungeun-choi
Contributor

@chungeun-choi chungeun-choi commented Aug 23, 2023

Overview

Issue: #430

When using the pymysqlreplication package, the InnoDB history list length kept growing under the following conditions:

  • MySQL version 8.x or higher.
  • Continuous execution of DML (Data Manipulation Language) statements.

As a result, SELECT queries gradually slowed down.

For background on this problem, see:

https://minervadb.xyz/troubleshooting-innodb-history-length-with-hung-mysql-transaction/

I therefore investigated the affected code and made the necessary modifications. The performance before and after the fix can be compared using monitoring tools such as Prometheus and MySQL Exporter.
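
For anyone who wants to check the value without a full monitoring stack, here is a minimal sketch that reads the history list length from the output of `SHOW ENGINE INNODB STATUS`. The function names are hypothetical and not part of this package, and the cursor is assumed to be a tuple-style DB-API cursor (not a `DictCursor`).

```python
import re

# InnoDB prints this line in the TRANSACTIONS section of its status report.
_HISTORY_RE = re.compile(r"^History list length\s+(\d+)\s*$", re.MULTILINE)


def parse_history_list_length(status_text):
    """Return the history list length found in the status text, or None."""
    match = _HISTORY_RE.search(status_text)
    return int(match.group(1)) if match else None


def read_history_list_length(cursor):
    """Run SHOW ENGINE INNODB STATUS on a DB-API cursor and parse the report.

    Assumption: a tuple-style cursor, where the report text is the third
    column (Type, Name, Status) of the single returned row.
    """
    cursor.execute("SHOW ENGINE INNODB STATUS")
    row = cursor.fetchone()
    return parse_history_list_length(row[2])
```

Calling `read_history_list_length` periodically while the binlog stream is running should show whether the value keeps climbing or stays flat.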

Fixed

  • Modified code

    def __get_table_information(self, schema, table):
        for i in range(1, 3):
            try:
                if not self.__connected_ctl:
                    self.__connect_to_ctl()

                cur = self._ctl_connection.cursor()
                cur.execute("""
                    SELECT
                        COLUMN_NAME, COLLATION_NAME, CHARACTER_SET_NAME,
                        COLUMN_COMMENT, COLUMN_TYPE, COLUMN_KEY, ORDINAL_POSITION,
                        DATA_TYPE, CHARACTER_OCTET_LENGTH
                    FROM
                        information_schema.columns
                    WHERE
                        table_schema = %s AND table_name = %s
                    ORDER BY ORDINAL_POSITION
                    """, (schema, table))
                result = cur.fetchall()
                cur.close()

                return result
            ...
            ...
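
The snippet above stores the rows in a local variable and closes the cursor before returning. As a side note, one way to make sure the cursor is closed even when the query raises is `contextlib.closing`. This is only a sketch of that pattern, not the code in this PR; `fetch_table_columns` and `COLUMNS_QUERY` are hypothetical names, and `connection` is assumed to be any DB-API connection.

```python
from contextlib import closing

# Same column metadata query as in the snippet above.
COLUMNS_QUERY = """
    SELECT
        COLUMN_NAME, COLLATION_NAME, CHARACTER_SET_NAME,
        COLUMN_COMMENT, COLUMN_TYPE, COLUMN_KEY, ORDINAL_POSITION,
        DATA_TYPE, CHARACTER_OCTET_LENGTH
    FROM
        information_schema.columns
    WHERE
        table_schema = %s AND table_name = %s
    ORDER BY ORDINAL_POSITION
"""


def fetch_table_columns(connection, schema, table):
    """Fetch column metadata and always close the cursor afterwards."""
    # closing() calls cur.close() on exit, whether or not an exception occurred.
    with closing(connection.cursor()) as cur:
        cur.execute(COLUMNS_QUERY, (schema, table))
        return cur.fetchall()
```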

Performance Comparison After the Code Modifications

  • Before modification

    image

  • After modification

    image

@chungeun-choi
Contributor Author

We also found that part of the problem can be solved by enabling autocommit on the `_ctl_connection`, so we added that setting:

    def __connect_to_ctl(self):
        if not self._ctl_connection_settings:
            self._ctl_connection_settings = dict(self.__connection_settings)
        self._ctl_connection_settings["db"] = "information_schema"
        self._ctl_connection_settings["cursorclass"] = DictCursor
        self._ctl_connection_settings["autocommit"] = True # Changed
        self._ctl_connection = self.pymysql_wrapper(**self._ctl_connection_settings)
        self._ctl_connection._get_table_information = self.__get_table_information
        self.__connected_ctl = True
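
A likely reason this helps: with autocommit off, a SELECT on an InnoDB-backed connection starts a transaction that stays open until the next commit or rollback, and a long-lived read view keeps old undo records from being purged, which is what the history list length counts. The sketch below shows the same settings logic as a standalone function. `build_ctl_settings` is a hypothetical name used only for illustration; `autocommit` is a keyword argument accepted by `pymysql.connect`.

```python
def build_ctl_settings(connection_settings, cursorclass=None):
    """Return settings for the metadata connection with autocommit enabled.

    Hypothetical helper: mirrors the logic of __connect_to_ctl without
    opening a connection.
    """
    # Copy first so the caller's settings are not modified.
    settings = dict(connection_settings)
    settings["db"] = "information_schema"
    # Each statement commits on its own, so no read transaction lingers.
    settings["autocommit"] = True
    if cursorclass is not None:
        settings["cursorclass"] = cursorclass
    return settings
```

The result can be passed on as `pymysql.connect(**build_ctl_settings(base_settings, pymysql.cursors.DictCursor))`, where `base_settings` stands for your own connection parameters.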

@dongwook-chan
Collaborator

This PR makes it feasible for major corporations handling extensive traffic to utilize this library. It addresses potential query response lags and reduces the strain on database servers. One of the standout features of the replication protocol is its minimal impact on database performance, and this PR ensures the library capitalizes on that strength.

@julien-duponchelle
Owner

Great, thanks for the investigation. Small fix but big problem indeed.

@julien-duponchelle julien-duponchelle merged commit 4c2dcf2 into julien-duponchelle:main Aug 23, 2023
4 checks passed