Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bug] tpcc failed after transfer happened #15091

Merged
merged 1 commit into from
Mar 22, 2024

Conversation

volgariver6
Copy link
Contributor

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue https://github.com/matrixorigin/MO-Cloud/issues/2652

What this PR does / why we need it:

  1. the conditionn to check current session is in txn is wrong, fix it
  2. re-use the buffer of the previous connection in proxy.

@matrix-meow matrix-meow added the size/M Denotes a PR that changes [100,499] lines label Mar 21, 2024
@mergify mergify bot requested a review from sukki37 March 21, 2024 08:52
@mergify mergify bot added the kind/bug Something isn't working label Mar 21, 2024
@matrix-meow
Copy link
Contributor

@volgariver6 Thanks for your contributions!

Here are review comments for file pkg/frontend/mysql_protocol.go:

Pull Request Review:

Title and Body:

The title and body of the pull request clearly indicate that it addresses a bug related to the failure of TPCC after a transfer occurred. The body provides a brief overview of the changes made, including fixing a condition check and reusing a buffer in the proxy. It also references the related issue on GitHub. Overall, the title and body effectively communicate the purpose of the pull request.

Changes in mysql_protocol.go:

  1. Security Issue - Bitwise Operation in extendStatus Function:

    • The extendStatus function performs a bitwise operation using the & operator on old, SERVER_QUERY_WAS_SLOW, and SERVER_SESSION_STATE_CHANGED. However, the intention seems to be to set the bits corresponding to SERVER_QUERY_WAS_SLOW and SERVER_SESSION_STATE_CHANGED in the statusFlags variable. The current implementation will not achieve this and may lead to unexpected behavior.
    • Suggestion: To set the bits for SERVER_QUERY_WAS_SLOW and SERVER_SESSION_STATE_CHANGED in statusFlags, use the | operator instead of &.
  2. Code Optimization - DRY Principle:

    • The extendStatus function is used multiple times to modify statusFlags in makeOKPayload and makeOKPayloadWithEof functions. To adhere to the DRY (Don't Repeat Yourself) principle, consider consolidating the logic for modifying statusFlags into a single function to avoid code duplication and potential inconsistencies.
  3. Code Readability - Magic Numbers:

    • Magic numbers like 128, 10, and HeaderOffset are used in the calculation of slice lengths and positions. These numbers lack context and can make the code harder to understand and maintain.
    • Suggestion: Define constants or variables with descriptive names for these magic numbers to improve code readability and maintainability.
  4. Consistency - Code Formatting:

    • In the makeEOFPayload function, the status variable is modified using extendStatus(status) before writing it to the data buffer. However, in the makeOKPayload and makeOKPayloadWithEof functions, the modification of statusFlags is done after writing to the buffer.
    • Suggestion: Ensure consistency in the order of operations within these functions for better code organization and readability.

Overall Suggestions:

  • Address the security issue by correcting the bitwise operation in the extendStatus function to set the desired bits in statusFlags.
  • Refactor the code to consolidate the logic for modifying statusFlags into a single function to avoid redundancy.
  • Improve code readability by replacing magic numbers with descriptive constants or variables.
  • Maintain consistency in the order of operations within functions for better code organization.

By addressing these issues and optimizing the code changes, the pull request can enhance the reliability, security, and maintainability of the codebase.

Here are review comments for file pkg/proxy/mysql_conn_buf.go:

Pull Request Review:

Title:

The title indicates that this pull request is addressing a bug related to TPCC failing after a transfer happened.

Body:

The body of the pull request provides information about the type of PR (BUG), the specific issue it fixes (issue #2652), and a brief description of the changes made, including fixing a condition to check the current session in a transaction and reusing the buffer of the previous connection in the proxy.

Changes in pkg/proxy/mysql_conn_buf.go:

  1. Concurrency Issue:

    • The msgBuf struct previously used a sync.Mutex for inTxn state. In the updated code, inTxn is now an atomic.Bool. However, there is a potential issue in the setTxnStatus function where the peer field is accessed without proper synchronization. This could lead to race conditions and incorrect behavior. To address this, ensure proper synchronization when accessing shared data between msgBuf instances.
  2. Error Handling:

    • In the handleEOFPacket function, there is a check for the length of the msg slice, but the check should be done before accessing msg[7:] to avoid potential out-of-bounds errors. Update the error handling to check the length of the msg slice before accessing specific indices.
  3. Code Optimization:

    • In the newMySQLConn function, the conditional check for mc.msgBuf == nil can be simplified by directly assigning newMsgBuf to mc.msgBuf without the conditional check. This can make the code cleaner and more concise.
  4. Code Comment Cleanup:

    • There are comments mentioning "FIXME" and instructions to remove lines after fixing certain issues. It's advisable to remove such comments once the issues are resolved to maintain code cleanliness.

Suggestions for Improvement:

  1. Concurrency Handling:

    • Ensure proper synchronization mechanisms are in place when accessing shared data to prevent race conditions. Consider using sync.Mutex or other synchronization primitives where necessary.
  2. Error Handling Improvement:

    • Enhance error handling by validating input data sizes before accessing specific indices to prevent potential panics due to out-of-bounds errors.
  3. Code Refactoring:

    • Simplify conditional checks and remove unnecessary comments to improve code readability and maintainability.
  4. Testing:

    • Consider adding or updating unit tests to cover the changes made in this PR to ensure the correctness and stability of the code.

By addressing the concurrency issues, improving error handling, optimizing the code, and refining the comments, the quality and reliability of the codebase can be enhanced. Additionally, incorporating thorough testing will help validate the correctness of the changes made in this pull request.

Here are review comments for file pkg/proxy/mysql_conn_buf_test.go:

Pull Request Review:

Title:

The title of the pull request indicates that it is addressing a bug related to the failure of "tpcc" after a transfer occurred. This is a clear and concise description of the purpose of the pull request.

Body:

The body of the pull request provides information about the type of PR (bug fix), the specific issue it fixes, and a brief description of the changes made. It mentions fixing a condition check related to the current session being in a transaction and reusing the buffer of the previous connection in the proxy. The body could be improved by providing more detailed information about the specific changes made and the impact of those changes.

Changes in pkg/proxy/mysql_conn_buf_test.go:

  1. Issue with newMySQLConn function call: In the changes made to the mysql_conn_buf_test.go file, there are multiple instances where the newMySQLConn function call is modified by adding an additional nil argument at the end. This change seems inconsistent and may introduce errors or unexpected behavior. It is important to ensure that the function calls are consistent throughout the codebase to maintain clarity and avoid potential issues.

    Suggestion: Review the newMySQLConn function definition and ensure that the function calls are consistent with the expected parameters. If the additional nil argument is necessary, document the reason for its inclusion to improve code readability.

  2. Code duplication: In some test cases, there is code duplication where the newMySQLConn function call is repeated with the same arguments. This duplication can make the code harder to maintain and may lead to inconsistencies if changes are required in the future.

    Suggestion: Consider refactoring the test cases to eliminate code duplication by extracting common setup logic into separate functions or using setup methods provided by testing frameworks to improve code readability and maintainability.

  3. Magic numbers: There are magic numbers like 10, 4, 13, etc., used in the test cases. Magic numbers can make the code less readable and harder to maintain. It is recommended to replace these magic numbers with meaningful constants or variables to improve code clarity.

    Suggestion: Define constants or variables with descriptive names for these magic numbers to enhance code readability and maintainability.

  4. Comments: In one instance, there is a commented-out line of code (// sc := newMySQLConn("source", src, 10, nil, nil)) in a test case. Including commented-out code in the codebase can lead to confusion and should be avoided.

    Suggestion: Remove commented-out code to keep the codebase clean and maintain clarity.

Security Concerns:

The changes in the pull request do not appear to introduce any immediate security concerns based on the provided diff data.

Overall Suggestions:

  • Ensure consistency in function calls and parameter usage to prevent errors and maintain code clarity.
  • Refactor test cases to eliminate code duplication and improve maintainability.
  • Replace magic numbers with constants or variables for better code readability.
  • Remove commented-out code to keep the codebase clean and understandable.

By addressing the issues mentioned above and implementing the suggested improvements, the codebase can be enhanced in terms of maintainability, readability, and consistency.

Here are review comments for file pkg/proxy/tunnel.go:

Pull Request Review:

Title and Body:

The title and body of the pull request clearly indicate that this PR is addressing a bug related to the failure of TPCC after a transfer occurred. The body provides a brief overview of the changes made to fix the issue, including correcting a condition check and reusing the buffer of the previous connection in the proxy. It also references the related GitHub issue.

Changes in pkg/proxy/tunnel.go:

  1. Inconsistent Synchronization in replaceServerConn and doReplaceConnection:

    • The replaceServerConn function now takes an additional parameter sync bool, but the usage of this parameter is inconsistent. Depending on the value of sync, the function either updates the pipes or sets the destination/source for pipes. This inconsistency can lead to unexpected behavior and should be clarified or made consistent throughout the codebase.
  2. Inadequate Handling of Transfer Intent in handleTransferIntent:

    • In the handleTransferIntent function, the condition if p.safeToTransfer() && p.tun != nil is used to determine if it is safe to transfer the session. However, the logic inside this condition is based on the negation of safeToTransfer, which might be confusing. It's recommended to revise the logic to ensure clarity and maintainability.
  3. Potential Security Issue in safeToTransfer:

    • The comment in the safeToTransfer function mentions that it must be a server-to-client pipe. However, the function does not explicitly check for this condition, which could lead to security vulnerabilities if used incorrectly. It's advisable to add a check to ensure that this function is only called for the intended pipe direction.
  4. Code Duplication in transfer and transferSync Functions:

    • There is code duplication in the transfer and transferSync functions where both functions call canStartTransfer with different parameters. This duplication can be avoided by refactoring the code to eliminate redundancy and improve maintainability.
  5. Missing Error Handling in getNewServerConn:

    • The getNewServerConn function does not handle errors returned by newMySQLConn properly. It's essential to add appropriate error handling to ensure robustness and prevent unexpected failures.
  6. Unused Variables in kickoff Function:

    • In the kickoff function, the variables src and dst are assigned values but not used. Removing these unused variables can improve code readability and maintainability.

Suggestions for Optimization:

  • Consistent Parameter Usage: Ensure that the usage of the sync parameter is consistent across functions like replaceServerConn and doReplaceConnection to avoid confusion and maintain code clarity.

  • Refactor Code Duplication: Refactor the transfer and transferSync functions to eliminate code duplication by extracting common logic into a separate function, improving code maintainability.

  • Enhance Error Handling: Improve error handling in functions like getNewServerConn to handle potential errors returned by newMySQLConn effectively, providing better resilience to failures.

  • Clarify Functionality: Clarify the logic and conditions in functions like safeToTransfer to ensure that they are used correctly and securely, especially considering the security implications mentioned in the comments.

By addressing these issues and optimizing the code changes, the pull request can be enhanced for better functionality, maintainability, and security.

Here are review comments for file pkg/proxy/tunnel_test.go:

Pull Request Review:

Title:

The title of the pull request indicates that there was a bug causing an issue with TPCC after a transfer occurred.

Body:

The body of the pull request specifies that the PR is a bug fix related to the incorrect condition for checking if the current session is in a transaction and reusing the buffer of the previous connection in the proxy. It also references the related issue on GitHub.

Changes in pkg/proxy/tunnel_test.go:

  1. TestTunnelClientToServer:

    • Changed the assertion from scp.isInTxn() to scp.safeToTransfer() multiple times.
    • Suggested improvement: It seems that the condition scp.safeToTransfer() is now used to check if it's safe to transfer. Ensure that this change accurately reflects the intended behavior and that all relevant test cases pass.
  2. TestTunnelReplaceConn:

    • Modified the function replaceServerConn by adding an extra parameter false.
    • Suggested improvement: Ensure that the additional parameter false is correctly handled and documented to avoid any confusion in the future.
  3. TestPipeCancelError, TestPipeStart, TestPipeStartAndPause, TestPipeMultipleStartAndPause:

    • Updated the creation of newMySQLConn by adding an extra parameter nil.
    • Suggested improvement: Verify if the additional nil parameter is necessary and if it's being used appropriately in the context of these tests.
  4. TestCanStartTransfer:

    • Refactored the canStartTransfer function to accept an additional parameter false.
    • Suggested improvement: Ensure that the new parameter false is correctly utilized within the canStartTransfer function and that it doesn't introduce any unexpected behavior.
  5. TestReplaceServerConn:

    • Modified the replaceServerConn function by adding an extra parameter false.
    • Suggested improvement: Confirm that the new parameter false in replaceServerConn is essential for the intended functionality and is properly integrated into the existing code logic.

Suggestions for Optimization:

  1. Consistency in Parameter Usage:

    • Ensure consistent usage of additional parameters (like false) across functions to maintain clarity and avoid confusion.
  2. Documentation:

    • Add or update comments in the code to explain the purpose of the changes made and the significance of any new parameters introduced.
  3. Testing:

    • Validate that all test cases related to the changes pass successfully to guarantee the correctness of the bug fixes.
  4. Code Review:

    • Conduct a thorough code review to check for any other potential bugs, inefficiencies, or security vulnerabilities that may exist in the codebase.

By addressing these points and ensuring the correctness and clarity of the changes, the quality and reliability of the codebase can be improved.

@sukki37 sukki37 merged commit 37eee99 into matrixorigin:1.1-dev Mar 22, 2024
15 of 16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Something isn't working size/M Denotes a PR that changes [100,499] lines
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants