Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[cherry-pick-1.1-dev] improve delete from table without where & limit (#13120) #15372

Merged
merged 5 commits into from Apr 8, 2024

Conversation

noorall
Copy link
Contributor

@noorall noorall commented Apr 8, 2024

When a table does not have a foreign key. So for delete from table, it can be optimized to truncate.

Approved by: @m-schen, @nnsgmsone, @badboynt1, @aunjgr, @ouyuanning

(cherry picked from commit 118e577)

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue #https://github.com/matrixorigin/MO-Cloud/issues/2919

What this PR does / why we need it:

improve delete from table without where & limit (#13120)

…13120)

When a table does not have a foreign key. So for delete from table, it can be optimized to truncate.

Approved by: @m-schen, @nnsgmsone, @badboynt1, @aunjgr, @ouyuanning

(cherry picked from commit 118e577)
@matrix-meow matrix-meow added the size/S Denotes a PR that changes [10,99] lines label Apr 8, 2024
@mergify mergify bot requested a review from sukki37 April 8, 2024 01:43
@matrix-meow
Copy link
Contributor

@noorall Thanks for your contributions!

Here are review comments for file pkg/pb/plan/plan.pb.go:

Pull Request Review:

Title:

The title of the pull request is clear and informative, indicating that the changes are related to improving the delete operation from a table without a where clause and limit.

Body:

The body of the pull request provides context about the optimization made for deleting from a table without a foreign key by using the truncate operation. It also lists the approvals and the commit from which the changes were cherry-picked. The PR type, related issue, and the purpose of the changes are also mentioned.

Changes in pkg/pb/plan/plan.pb.go:

  1. Addition of TruncateTable struct:

    • A new struct TruncateTable has been added to the plan.pb.go file.
    • The struct includes fields like ClusterTable, TableId, ForeignTbl, and a new field IsDelete.
    • The addition of IsDelete boolean field seems to be related to indicating whether the operation is a delete.
  2. Modification of DeleteCtx struct:

    • A new field TruncateTable of type *TruncateTable has been added to the existing DeleteCtx struct.
    • This change seems to be related to incorporating the new TruncateTable struct into the existing DeleteCtx struct.

Suggestions for Improvement:

  1. Clarify the Purpose of IsDelete Field:

    • The purpose of the IsDelete field in the TruncateTable struct should be clearly documented to explain its significance and usage. This will help other developers understand its role in the context of the codebase.
  2. Consistency in Field Naming:

    • Ensure consistency in field naming conventions. For example, the field IsDelete in the TruncateTable struct could be named more descriptively to align with other field names in the struct.
  3. Code Optimization:

    • Consider refactoring the code to ensure that the new fields and structs are integrated efficiently without redundancy or unnecessary complexity.
  4. Security Consideration:

    • When introducing new fields like IsDelete, ensure that proper validation and handling mechanisms are in place to prevent any potential security vulnerabilities, such as injection attacks or unauthorized deletions.
  5. Documentation Update:

    • Update the code comments and documentation to reflect the changes made, especially regarding the new TruncateTable struct and its fields. Clear documentation helps in maintaining the codebase and onboarding new developers.

By addressing the above suggestions, the codebase can be improved in terms of clarity, consistency, efficiency, and security.

Overall, the changes seem to be aimed at enhancing the delete operation from a table without a foreign key by introducing a more optimized approach using the truncate operation. It is essential to ensure that the new additions are well-documented and integrated seamlessly into the existing codebase.

Here are review comments for file pkg/sql/compile/compile.go:

Pull Request Review:

Title and Body:

The title of the pull request is clear and concise, indicating that it aims to improve the delete operation from a table without a where clause and limit. The body provides additional context about the optimization by suggesting the use of truncate when a table does not have a foreign key. The approval list and cherry-picked commit information are also included, which is helpful for tracking and accountability.

Changes in pkg/sql/compile/compile.go:

  1. Line 568-570:

    • Issue: The condition if s.Plan == nil before assigning s.Plan = pn seems redundant as it checks if s.Plan is nil and then assigns it to pn, which can be simplified to directly assigning s.Plan = pn.
    • Suggestion: Remove the conditional check and directly assign s.Plan = pn.
  2. Line 1138-1154:

    • Issue: The added code block introduces a new condition to check n.DeleteCtx.CanTruncate before creating a new scope for truncating a table. However, there is a potential security issue here. Allowing a delete operation without a WHERE clause and LIMIT can be risky as it can lead to unintentional data loss if executed accidentally or maliciously.
    • Security Concern: Truncating a table removes all rows without the possibility of rollback, which can have severe consequences if done mistakenly.
    • Suggestion: Consider adding additional safeguards or confirmation steps before allowing a truncate operation, such as requiring explicit user confirmation or restricting this feature to specific privileged users.

Overall Suggestions:

  • Ensure code consistency by simplifying unnecessary conditional checks for variable assignments.
  • Prioritize data safety and integrity by carefully evaluating the implications of allowing operations like table truncation without conditions.
  • Consider implementing safeguards or restrictions to prevent accidental or unauthorized data loss.

By addressing the mentioned issues and considering the suggestions provided, the pull request can be optimized for better code quality and security.

Here are review comments for file pkg/sql/compile/ddl.go:

Pull Request Review:

Title and Body:

The title of the pull request is clear and informative, indicating that it aims to improve the delete operation on a table without a foreign key by optimizing it to truncate. The body provides additional context about the optimization and mentions the approval of relevant team members. It also references the issue being addressed and the commit from which the changes were cherry-picked. Overall, the title and body effectively communicate the purpose of the pull request.

Changes in pkg/sql/compile/ddl.go:

  1. Variable Initialization:

    • The addition of keepAutoIncrement and affectedRows variables is appropriate for tracking whether to retain auto-increment values and the number of affected rows during the operation.
    • Suggestion: Ensure that these variables are properly scoped and named descriptively to avoid confusion.
  2. Condition for Delete Operation:

    • The condition if tqry.IsDelete is used to determine if the operation is a delete, setting keepAutoIncrement and fetching affectedRows.
    • Issue: The use of IsDelete to determine the operation type might be confusing or ambiguous. It's recommended to use a more descriptive flag or method to differentiate between operations clearly.
  3. Truncate Operation:

    • The logic for truncating temporary tables and handling the truncate operation seems appropriate.
    • Optimization: Consider refactoring the code to improve readability and maintainability, especially around the conditional blocks.
  4. Function Parameters:

    • The addition of keepAutoIncrement as a parameter in the Truncate function call is necessary for handling auto-increment values appropriately.
    • Suggestion: Ensure that the parameter is used consistently and documented clearly for future maintainability.
  5. Affected Rows Update:

    • The update to add the affected rows to the compiler context is essential for tracking the impact of the operation.
    • Optimization: Ensure that the update is done efficiently without unnecessary overhead.

Security Concerns:

  • Data Sanitization: Ensure that user input is properly sanitized to prevent SQL injection vulnerabilities, especially when dealing with table names and IDs.
  • Authorization: Verify that the user has the necessary permissions to perform truncate operations on the specified table to prevent unauthorized data manipulation.

General Suggestions:

  • Code Comments: Add or improve comments to explain the purpose and logic behind the code changes for better understanding by other developers.
  • Error Handling: Enhance error handling to provide informative messages and handle exceptions gracefully.
  • Unit Tests: Include unit tests to cover the new functionality and ensure that edge cases are handled correctly.

Overall, the changes in the pull request are focused on optimizing the delete operation by utilizing the truncate functionality. Addressing the mentioned suggestions and ensuring code quality standards are maintained will enhance the reliability and maintainability of the codebase.

Here are review comments for file pkg/sql/plan/build_dml_util.go:

Pull Request Review:

Title:

The title of the pull request indicates that it aims to improve the delete operation from a table without a WHERE clause and LIMIT. The title also includes the reference to the branch (cherry-pick-1.1-dev) and the pull request number (#13120), which is helpful for tracking purposes.

Body:

The body of the pull request provides some context on the optimization being made for tables without foreign keys, suggesting that truncating the table could be more efficient than a traditional delete operation. It also lists the approval from various team members and references the commit from which the changes were cherry-picked. The body could be more detailed in explaining the specific optimization and its benefits.

Changes:

  1. Addition of indexTableNames and foreignTbl in deleteNodeInfo struct:

    • The addition of these fields seems to be related to handling foreign keys and index tables during the delete operation.
    • Issue: The usage of these fields is not clearly explained in the context of the optimization for tables without foreign keys.
    • Suggestion: Provide comments or documentation explaining the purpose and usage of these new fields to improve code readability and maintainability.
  2. Modification in buildDeletePlans function:

    • The logic for determining whether to truncate the table instead of performing a delete operation has been added.
    • Issue: The logic for determining when to truncate the table is complex and could be simplified for better readability.
    • Suggestion: Refactor the logic into separate functions or add comments to clarify the conditions under which truncation should occur.
  3. Changes in makeOneDeletePlan function:

    • The addition of the canTruncate parameter and the logic to handle truncating the table.
    • Issue: The logic for creating the TruncateTable object and setting its properties could be extracted into a separate function for better code organization.
    • Suggestion: Create a helper function to handle the creation and initialization of the TruncateTable object to improve code readability.
  4. Updates in makeDeleteNodeInfo function:

    • Addition of logic to populate foreignTbl and indexTableNames based on table definitions.
    • Issue: The purpose of populating these fields is not explicitly mentioned in the context of the optimization.
    • Suggestion: Add comments to explain the significance of populating these fields and how they contribute to the overall functionality.

Security Concerns:

  • The changes in the pull request do not seem to introduce any direct security vulnerabilities. However, it is crucial to ensure that the logic for truncating tables is implemented securely to prevent unintended data loss or unauthorized access.

General Suggestions:

  • Improve code documentation by adding comments to explain the purpose of new fields and logic.
  • Refactor complex conditional statements into separate functions for better code readability.
  • Consider simplifying the logic for determining when to truncate a table for improved maintainability.

Overall, the changes aim to optimize the delete operation for tables without foreign keys, but there is room for improvement in terms of code clarity and maintainability. By addressing the suggestions provided, the codebase can be enhanced for better understanding and future modifications.

Here are review comments for file proto/plan.proto:

Pull Request Review:

Title and Body:

The title of the pull request is clear and informative, indicating that the changes are related to improving the delete operation for a table without a foreign key. The body provides additional context about the optimization being made and lists the approvers. However, there are some areas that need attention:

  1. Issue Link Format: The issue link in the body should be corrected to remove the unnecessary "issue #" prefix. It should be in the format: https://github.com/matrixorigin/MO-Cloud/issues/2919.

  2. Explanation Clarity: The body could benefit from a more detailed explanation of why truncating the table is considered an improvement over the standard delete operation. This would help other team members understand the rationale behind the change.

Changes in proto/plan.proto:

The changes made in the proto/plan.proto file are related to adding a new field is_delete to the TruncateTable message. While the changes seem to align with the goal of optimizing the delete operation, there are some issues and suggestions for improvement:

  1. Naming Convention: The field is_delete in the TruncateTable message might be misleading. Since the message is named TruncateTable, having a field named is_delete could cause confusion. Consider renaming the field to something like is_truncate for clarity.

  2. Documentation: It's important to update the documentation or comments in the code to explain the purpose of the new field is_delete in the TruncateTable message. This will help other developers understand its significance and usage.

  3. Consistency: Ensure that the new field is_delete is used consistently throughout the codebase and that its functionality is correctly implemented wherever it is referenced.

  4. Security Concern: While the optimization of truncating a table instead of deleting can be beneficial in certain scenarios, it's crucial to consider the implications, especially in terms of data retention and unintended data loss. Make sure there are proper safeguards in place to prevent accidental truncation of critical data.

  5. Testing: It's essential to include thorough testing for this new functionality to ensure that it works as intended and does not introduce any regressions or unexpected behavior.

Suggestions for Optimization:

  1. Code Review: Conduct a thorough code review to ensure that all changes are necessary, well-implemented, and adhere to coding standards.

  2. Performance Testing: Consider performing performance testing to validate that the optimization indeed improves the efficiency of the delete operation.

  3. Feedback Incorporation: Address any feedback or suggestions provided by reviewers or team members to enhance the quality of the code changes.

By addressing the issues mentioned above and considering the suggestions for optimization, the pull request can be improved in terms of clarity, functionality, and overall code quality.

noorall and others added 3 commits April 8, 2024 09:52
# Conflicts:
#	pkg/pb/plan/plan.pb.go
#	pkg/sql/plan/build_dml_util.go
#	proto/plan.proto
@sukki37 sukki37 merged commit ca03166 into matrixorigin:1.1-dev Apr 8, 2024
15 of 16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/enhancement size/S Denotes a PR that changes [10,99] lines
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants