memorycache: code clean-ups #16313

Merged
merged 4 commits into from
Jun 6, 2024
Conversation

reusee
Contributor

@reusee reusee commented May 22, 2024

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue #15508

What this PR does / why we need it:

fileservice: remove IOEntry.ReadFromOSFile

memorycache: remove RCBytes

@matrix-meow added the size/M label (denotes a PR that changes [100,499] lines) on May 22, 2024
@matrix-meow
Contributor

@reusee Thanks for your contributions!

Here are review comments for file pkg/fileservice/disk_cache.go:

Pull Request Review:

Title and Body:

The title and body of the pull request are clear and concise, indicating that the changes are related to code clean-ups in the memorycache and fileservice packages. The PR is categorized as an improvement and references the issue it fixes (#15508). The specific changes mentioned are the removal of IOEntry.ReadFromOSFile in fileservice and the removal of RCBytes in memorycache.

Changes in disk_cache.go:

  1. Unsafe Package Usage:

    • The addition of import "unsafe" in the disk_cache.go file introduces the usage of the unsafe package. The unsafe package should be used with caution as it bypasses Go's type safety and memory management. It is generally recommended to avoid using the unsafe package unless absolutely necessary for specific low-level operations.
    • Suggestion: If possible, try to refactor the code to avoid the use of the unsafe package. If the use of unsafe is unavoidable, ensure that it is well-documented and thoroughly tested to prevent any unintended consequences.
  2. Complex Read Function Refactoring:

    • The refactored Read function in the DiskCache struct has become more complex due to the introduction of an anonymous function for reading from the file. While the refactoring might aim to improve readability or performance, the increased complexity could make the code harder to maintain and understand.
    • Suggestion: Consider breaking down the anonymous function into smaller, more descriptive functions to improve readability and maintainability. This can help in isolating specific functionalities and make the code easier to test and debug.
  3. Error Handling and Logging:

    • The error handling in the refactored code seems appropriate, with errors being returned and logged using logutil.Warn. However, the comment "ignore error" might be misleading and could lead to potential issues if errors are not properly handled or escalated.
    • Suggestion: Instead of ignoring errors, consider handling them appropriately based on the context. Logging errors is good practice, but ensure that critical errors are not ignored and are handled or reported effectively.
  4. Resource Management:

    • The refactored code includes resource management for memory allocation and deallocation using getMallocAllocator(). While resource management is essential, the complexity introduced by managing memory manually can lead to memory leaks or unsafe memory access if not handled correctly.
    • Suggestion: Ensure that memory allocation and deallocation are done securely and efficiently. Consider using higher-level abstractions or Go's built-in memory management features to simplify resource handling and reduce the risk of memory-related issues.

Overall Suggestions for Optimization:

  • Simplify Complex Code: Break down complex logic into smaller, more manageable functions to improve readability and maintainability.
  • Avoid Unnecessary Packages: Minimize the use of the unsafe package unless absolutely necessary for specific low-level operations.
  • Enhance Error Handling: Ensure that errors are handled appropriately and not ignored, especially critical errors that could impact the application's stability.
  • Optimize Resource Management: Use higher-level abstractions or Go's built-in features for memory management to reduce the complexity and potential risks associated with manual memory handling.

By addressing these points and optimizing the code changes, the pull request can not only improve the codebase but also enhance its security, maintainability, and overall quality.

Here are review comments for file pkg/fileservice/io_entry.go:

Pull Request Review:

Title: memorycache: code clean-ups

Problem 1: Unused Code

  • The ReadFromOSFile function in io_entry.go has been removed in this pull request. However, it seems like this function was not only removed but also commented out. This can lead to confusion for other developers who might wonder why the function is still present but not being used.

Solution 1:

  • Instead of just commenting out the function, it should be completely removed from the codebase to avoid confusion and keep the codebase clean.

Problem 2: Incomplete Explanation

  • The pull request mentions removing RCBytes from memorycache, but there are no changes related to RCBytes in the provided diff. This lack of alignment between the description in the pull request body and the actual changes can be misleading.

Solution 2:

  • Ensure that the description in the pull request body accurately reflects the changes made in the codebase. If RCBytes removal is intended but not included in this diff, it should be addressed in a separate commit or pull request.

Optimization:

  • Since the primary goal of this pull request is code clean-up, it would be beneficial to include additional clean-up tasks such as removing commented-out code, unused imports, or any other redundant code snippets to further enhance the codebase's readability and maintainability.

By addressing the mentioned problems and optimizing the clean-up process, the codebase can be improved in terms of clarity and maintainability.

Here are review comments for file pkg/fileservice/memorycache/cache.go:

Pull Request Review:

Title: memorycache: code clean-ups

Summary:

This pull request aims to clean up the code in the memorycache package by removing unnecessary code related to RCBytes and IOEntry.ReadFromOSFile. The changes include restructuring the Cache struct and modifying functions in cache.go.

Feedback:

  1. Unused Code Removal:

    • The removal of RCBytes related code is good as it seems to be unnecessary now.
    • However, it's important to ensure that no functionality is lost due to the removal. Double-check if RCBytes is not needed anywhere else in the codebase.
  2. Struct Refactoring:

    • The restructuring of the Cache struct is a positive change for better organization.
    • Consider adding comments to the struct fields to improve code readability and maintainability.
  3. Atomic Operations:

    • The addition of atomic.Int64 for size in the Cache struct is a good practice for concurrent safety.
    • Ensure that all operations related to size are properly synchronized using atomic operations to prevent race conditions.
  4. Function Refactoring:

    • The refactoring of functions like Alloc, Get, and Set seems reasonable.
    • Verify that the changes do not introduce any functional regressions and that the behavior remains consistent.
  5. Error Handling:

    • In the Set function, consider adding proper error handling in case the type assertion value.(*Data) fails.
    • It's essential to handle potential errors to prevent unexpected behavior or panics.
  6. Code Optimization:

    • The code can be further optimized by removing commented-out code or unnecessary comments to enhance code cleanliness.
    • Ensure consistent formatting throughout the file for better code readability.
  7. Security Concerns:

    • While the changes seem focused on code clean-up, it's crucial to conduct a security review to ensure that no security vulnerabilities are introduced inadvertently.
    • Perform a thorough review of the changes to identify any potential security risks, especially when dealing with memory management and data handling.
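Points 3 and 5 above can be illustrated together. The following is a sketch under assumptions, not the actual cache.go: the Cache and Data types are cut-down stand-ins, and Set returning an error is a hypothetical signature chosen to show the comma-ok form of the type assertion.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Data stands in for the cache's value type; only its length matters here.
type Data struct {
	buf []byte
}

// Cache is a hypothetical, stripped-down cache that only tracks total size.
type Cache struct {
	size atomic.Int64 // total bytes accounted for; safe for concurrent use
}

// Set accounts for the size of value. The comma-ok form turns a wrong type
// into an error instead of the panic a bare value.(*Data) would raise.
func (c *Cache) Set(value any) error {
	d, ok := value.(*Data)
	if !ok {
		return fmt.Errorf("unexpected value type %T", value)
	}
	c.size.Add(int64(len(d.buf)))
	return nil
}

func main() {
	var c Cache
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = c.Set(&Data{buf: make([]byte, 10)})
		}()
	}
	wg.Wait()

	fmt.Println(c.size.Load())        // 100 goroutines, 10 bytes each
	fmt.Println(c.Set("not a *Data")) // reports the type mismatch
}
```

Because every update goes through atomic.Int64, concurrent Set calls cannot lose increments, and no separate mutex is needed for the counter alone.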

Overall, the code clean-up in the memorycache package is a positive step towards maintaining a clean and organized codebase. However, it's essential to validate the changes thoroughly, address the feedback provided, and consider incorporating additional optimizations and error handling to enhance the overall quality of the codebase.

Here are review comments for file pkg/fileservice/memorycache/data.go:

Pull Request Review:

Title and Body:

The title and body of the pull request clearly state that the changes are related to code clean-ups in the memorycache module. The PR aims to remove specific functions and make improvements. It references issue #15508 and provides a brief explanation of the changes made. The PR is categorized as an Improvement.

Changes in pkg/fileservice/memorycache/data.go:

  1. Addition of Data struct and related methods:

    • A new Data struct is introduced to represent a reference-counted byte buffer.
    • The struct includes fields for size, buffer, reference count, pointer, deallocator, and a global size counter.
    • New methods like newData, free, Bytes, Buf, Slice, acquire, and Release are added to handle the Data struct operations.
  2. Changes in existing methods:

    • The Truncate method is renamed to Slice and its return type is changed to CacheData.
    • The release method is renamed to Release and now directly calls free method without passing size.

Suggestions for Improvement:

  1. Error Handling:

    • Ensure proper error handling mechanisms are in place, especially when dealing with memory allocation and deallocation.
    • Add error checks when allocating memory to handle potential failures gracefully.
  2. Consistency in Naming:

    • Ensure consistent naming conventions across methods and variables for better code readability.
    • Review and maintain a consistent naming style for functions like Slice, Release, etc.
  3. Documentation:

    • Add or update comments/documentation for newly added methods and structs to explain their purpose, parameters, and return values.
    • Clear and concise documentation will help developers understand the codebase more easily.
  4. Memory Management:

    • Double-check memory management operations to prevent memory leaks or unsafe memory access.
    • Verify that memory is properly allocated and deallocated in all scenarios to avoid potential issues.
  5. Testing:

    • Consider adding unit tests to cover the new functionality and ensure the correctness of the code changes.
    • Test edge cases and error scenarios to validate the behavior of the Data struct and its methods.
  6. Optimization:

    • Review the code for any redundant operations or unnecessary allocations that could be optimized for better performance.
    • Look for opportunities to streamline the code without compromising readability or functionality.

By addressing these suggestions, the codebase can be enhanced in terms of reliability, maintainability, and performance. It is essential to prioritize code quality and security while making improvements to the memorycache module.

Here are review comments for file pkg/fileservice/memorycache/data_alloc_free.go:

Pull Request Review:

Title:

The title of the pull request is clear and concise, indicating that the changes are related to code clean-ups in the memory cache module.

Body:

The body of the pull request provides information about the type of PR, the specific issue it addresses, and a brief description of the changes made in the PR. It is helpful for understanding the context of the changes.

Changes Made:

The changes in the data_alloc_free.go file involve the removal of certain functions related to memory allocation and deallocation in the memory cache module. Specifically, the functions newData and free have been removed along with their associated code blocks.

Feedback and Suggestions:

  1. License Information:

    • The removed license information at the beginning of the file is important for maintaining compliance with the Apache License, Version 2.0. It should not be deleted from the file.
    • Suggestion: Instead of removing the license information, consider leaving it intact to ensure legal compliance.
  2. Function Removal:

    • The functions newData and free have been removed from the file. It is essential to ensure that these functions are not critical for the functionality of the memory cache module.
    • Suggestion: Before removing these functions, verify that they are not being used elsewhere in the codebase. If they are no longer needed, consider deprecating them with proper documentation.
  3. Code Cleanup:

    • The removal of unnecessary code is a good practice for maintaining a clean and efficient codebase. However, it is crucial to ensure that the removed code does not impact the functionality of the application.
    • Suggestion: Perform thorough testing after removing code to verify that the functionality of the memory cache module is not affected.
  4. Security Considerations:

    • When making changes related to memory allocation and deallocation, it is essential to consider potential security vulnerabilities such as memory leaks or buffer overflows.
    • Suggestion: Conduct a security review to ensure that the changes do not introduce any security risks. Consider using tools like static code analyzers to identify potential vulnerabilities.
  5. Documentation Update:

    • If the removed functions were part of the public API or were documented, ensure that the documentation is updated to reflect the changes.
    • Suggestion: Update the documentation to reflect the removal of newData and free functions, explaining the reason for their removal.
  6. Optimization:

    • While removing unused code is beneficial for code cleanliness, consider optimizing other aspects of the codebase for performance improvements.
    • Suggestion: Look for opportunities to optimize memory usage, improve algorithm efficiency, or enhance overall performance in the memory cache module.

By addressing the mentioned points and considering the suggestions provided, you can ensure that the code clean-up in the memory cache module is done effectively while maintaining code quality and security standards.

Here are review comments for file pkg/fileservice/memorycache/data_test.go:

Pull Request Review:

Title:

The title of the pull request is clear and concise, indicating that the changes are related to code clean-ups in the memorycache module.

Body:

The body of the pull request provides relevant information about the type of PR, the specific issue it addresses, and a brief description of the changes made in the PR. It would be helpful to include more details about why the RCBytes and IOEntry.ReadFromOSFile were removed for better context.

Changes in data_test.go:

  1. Truncate Method Change:

    • Issue: The Truncate method is replaced with Slice(0). This change might not be functionally equivalent and could lead to unexpected behavior.
    • Suggestion: Ensure that the Slice method behaves as intended and has the same functionality as the Truncate method. If not, consider reverting this change or updating the method to match the previous behavior.
  2. Release Method Change:

    • Issue: The release method is replaced with Release(). The original method seems to have accepted a parameter &size, which is no longer passed in the new Release method. This could lead to potential issues if size was being used in the release method.
    • Suggestion: If size was crucial in the release method, consider passing it as a parameter to the new Release method or ensure that the removal of &size does not impact the functionality negatively.
  3. Refactoring:

    • Consider adding comments to explain the purpose of the Slice and Release methods to improve code readability and maintainability.

Security Concerns:

  • Ensure that the removal of RCBytes and IOEntry.ReadFromOSFile does not introduce any security vulnerabilities or break existing functionality. It's essential to thoroughly test these changes to prevent any unforeseen security risks.

Optimization Suggestions:

  • Consider adding unit tests to cover the changes made in the data_test.go file to ensure that the refactored methods work as expected.
  • If possible, provide more detailed comments explaining the rationale behind the changes to aid future developers in understanding the codebase.

By addressing the identified issues and considering the optimization suggestions, the quality and maintainability of the codebase can be improved.

Here are review comments for file pkg/fileservice/memorycache/rc_bytes.go:

Pull Request Review:

Title:

The title of the pull request is clear and concise, indicating that the changes are related to cleaning up the code in the memorycache package.

Body:

The body of the pull request provides relevant information about the type of PR, the specific issue it addresses, and a brief description of the changes made. It would be beneficial to include more details about why the RCBytes struct and its associated methods are being removed for better context.

Changes in rc_bytes.go:

  1. License Information: The license information at the beginning of the file is removed. It's important to ensure that all necessary licensing information is retained in the codebase for legal compliance.

  2. RCBytes Struct and Methods Removal:

    • The RCBytes struct along with its methods Release(), Bytes(), and Slice() are completely removed from the file.
    • This removal may cause issues if other parts of the codebase depend on these methods or the RCBytes struct. It's crucial to assess the impact of removing these functionalities.

Suggestions for Improvement:

  1. License Information:

    • It's recommended to keep the license information intact in all files to comply with licensing requirements. If there is a specific reason for removing it, it should be documented.
  2. RCBytes Removal:

    • Before removing the RCBytes struct and its methods, ensure that they are no longer used or have been replaced with alternative implementations.
    • If these methods are no longer needed, consider adding comments explaining why they were removed to provide clarity for future developers.
  3. Documentation:

    • Enhance the PR description to provide more context on why the RCBytes struct and methods were removed. This will help reviewers and future developers understand the rationale behind the changes.
  4. Impact Assessment:

    • Conduct a thorough review to determine if any other parts of the codebase rely on the functionalities provided by the RCBytes struct. If so, appropriate adjustments or replacements should be made to prevent breaking existing functionality.
  5. Testing:

    • Ensure that relevant tests are updated or added to cover the changes made in this PR, especially if the removal of RCBytes impacts the functionality of the memorycache package.

By addressing the points mentioned above, the pull request can be improved in terms of code quality, maintainability, and ensuring the overall stability of the codebase.

Here are review comments for file pkg/fileservice/memorycache/rc_bytes_test.go:

Pull Request Review:

Title:

The title of the pull request is clear and concise, indicating that the changes are related to code clean-ups in the memorycache.

Body:

The body of the pull request provides information about the type of PR, the specific issue it fixes, and a brief description of the changes made. It mentions the removal of IOEntry.ReadFromOSFile in fileservice and the removal of RCBytes in memorycache.

Changes in rc_bytes_test.go:

  1. License Header Removal:

    • The copyright and license information has been removed from the file. It is important to ensure that the appropriate license information is retained in all source files for legal compliance and attribution.
  2. Test Function Modification:

    • The TestRCBytes function has been modified to remove the test cases related to RCBytes. This indicates that the RCBytes functionality has been removed from the codebase.
    • The test cases for Bytes, Slice, and Release related to RCBytes have been removed. It is essential to ensure that all test cases are updated or replaced when removing functionality to maintain test coverage.

Suggestions for Improvement:

  1. License Information:

    • Ensure that the appropriate license information is included in the file as per the project's licensing requirements. If the license header was removed intentionally, provide a justification for this change.
  2. Test Coverage:

    • If RCBytes functionality has been removed, update the test cases to reflect the changes. Consider either updating the existing tests to cover other relevant functionality or removing them entirely if they are no longer applicable.
  3. Code Cleanup:

    • While removing code is a part of the clean-up process, ensure that the removal of RCBytes does not impact other parts of the codebase that may be dependent on it. Perform a thorough review to identify any potential dependencies or references that need to be addressed.
  4. Documentation:

    • If RCBytes is a public API or used in other parts of the codebase, update the documentation to reflect the removal of this functionality. This will help other developers understand the changes and prevent confusion.
  5. Review Process:

    • It is important to have a clear review process in place to ensure that code clean-ups do not inadvertently introduce bugs or regressions. Consider involving team members in the review process to catch any potential issues early on.

By addressing the above suggestions and ensuring thorough testing and documentation updates, the code clean-ups in the memorycache can be effectively implemented while maintaining the integrity and quality of the codebase.

Here are review comments for file pkg/fileservice/memorycache/types.go:

Pull Request Review:

Title and Body:

The title of the pull request is clear and concise, indicating that it involves code clean-ups in the memorycache module. The body of the pull request provides relevant information about the type of PR, the associated issue, and a brief description of the changes made. It would be beneficial to include more details about the specific clean-up tasks performed to give a better understanding of the modifications.

Changes in pkg/fileservice/memorycache/types.go:

  1. Removal of Unused Code:

    • The pull request removes several unused types and functions from the types.go file, such as RCBytes, Data, and related methods.
    • Issue: Removing unused code is a good practice, but it's essential to ensure that the removed code is truly unnecessary and does not impact any existing functionality.
    • Suggestion: Before deleting unused code, verify that it is not required for any current or future features. Consider commenting out the code first and running tests to confirm that it is safe to remove.
  2. Global Variable Removal:

    • The EnableTracing global variable is removed from the file.
    • Issue: Global variables can introduce unexpected behavior and make code harder to reason about. It's better to avoid global state whenever possible.
    • Suggestion: If EnableTracing is necessary, consider encapsulating it within a struct or function to limit its scope and improve maintainability.
  3. Struct Refactoring:

    • The Cache struct is removed from the file.
    • Issue: Removing a struct like Cache may impact other parts of the codebase that rely on it. Ensure that its removal does not break any existing functionality.
    • Suggestion: If Cache is no longer needed, verify that its responsibilities are appropriately handled elsewhere in the codebase or refactor it into a different structure if required.
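The suggestion to scope a flag such as EnableTracing to a struct could look like the sketch below. Everything here is hypothetical (the Option type, WithTracing, the events field); it illustrates the functional-options pattern and is not how the memorycache package is written.

```go
package main

import "fmt"

// Cache is a hypothetical cache whose tracing flag is per-instance state
// rather than a package-level variable.
type Cache struct {
	enableTracing bool
	events        []string // recorded only when tracing is enabled
}

// Option configures a Cache at construction time.
type Option func(*Cache)

// WithTracing turns tracing on or off for one Cache instance.
func WithTracing(enabled bool) Option {
	return func(c *Cache) { c.enableTracing = enabled }
}

// NewCache builds a Cache and applies the given options in order.
func NewCache(opts ...Option) *Cache {
	c := &Cache{}
	for _, opt := range opts {
		opt(c)
	}
	return c
}

// Get records a trace event when tracing is enabled; lookup is omitted.
func (c *Cache) Get(key string) {
	if c.enableTracing {
		c.events = append(c.events, "get "+key)
	}
}

func main() {
	traced := NewCache(WithTracing(true))
	plain := NewCache()

	traced.Get("a")
	plain.Get("a")

	// Only the traced instance recorded the access.
	fmt.Println(len(traced.events), len(plain.events))
}
```

With the flag on the instance, two caches in the same process can be traced independently, and tests no longer need to mutate shared package state.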

Security Concerns:

No specific security concerns are identified in the provided changes.

Overall Suggestions:

  1. Code Review and Testing:

    • Conduct a thorough code review to ensure that all removed code is truly unnecessary.
    • Run comprehensive tests to validate that the changes do not introduce any regressions or unexpected behavior.
  2. Documentation Update:

    • Update the documentation to reflect the changes made in the types.go file, especially if any public APIs have been modified or removed.
  3. Optimization Opportunities:

    • Consider optimizing other parts of the codebase while performing clean-ups to enhance performance and maintainability.

In conclusion, the pull request shows progress in cleaning up the codebase by removing unused code and global variables. However, it is crucial to verify the impact of these changes on the overall functionality and conduct thorough testing before merging the PR. Additionally, providing more detailed information in the pull request description would improve clarity for reviewers and maintainers.

@reusee reusee force-pushed the cacheopts branch 3 times, most recently from d9d1d21 to 77d031b Compare May 23, 2024 15:59
@reusee reusee force-pushed the cacheopts branch 6 times, most recently from d6b9a6b to d23d481 Compare June 4, 2024 02:23
memorycache: remove RCBytes
@reusee
Contributor Author

reusee commented Jun 6, 2024

@mergify refresh

Contributor

mergify bot commented Jun 6, 2024

refresh

✅ Pull request refreshed

@mergify mergify bot merged commit e1fd511 into matrixorigin:main Jun 6, 2024
16 of 18 checks passed
XuPeng-SH pushed a commit to XuPeng-SH/matrixone that referenced this pull request Jun 7, 2024
* fix bvt test (matrixorigin#16605)

fix bvt test

Approved by: @heni02

* remove duplicates in object list for flushing (matrixorigin#16677)

- reduce the size of tombstone files

Approved by: @XuPeng-SH

* fix stats for prefix_eq function (matrixorigin#16666)

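Because the index table always serializes the primary key into it, the NDV is very high and the selectivity estimate for the index table is badly wrong, which causes the optimizer to misclassify TP/AP statements.
The selectivity of the prefix_eq function is now computed from the selectivity of the original filter condition.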

Approved by: @aunjgr

* Fix condition to ignore delete booking if no transfer needed (matrixorigin#16633)

Add a log statement to improve traceability when the transfer is not needed

Approved by: @XuPeng-SH

* make sure all pipeline run in single parallel for tp query (matrixorigin#16685)

make sure all pipeline run in single parallel for tp query

Approved by: @ouyuanning, @aunjgr

* [Cherry-pick] handle null in convertRowsIntoBatch (matrixorigin#16676)

handle null in convertRowsIntoBatch

Approved by: @daviszhen

* Fix enumtype system variable check  (matrixorigin#16691)

Fix enumtype system variable check

Approved by: @daviszhen

* support query replica count of special cn (matrixorigin#16642)

support query replica count of special cn

Approved by: @reusee, @daviszhen

* split build operator into merge and build operators (matrixorigin#16673)

Split the data send/receive functionality out of the build operator, yielding merge + build, in preparation for later refactoring.

Approved by: @m-schen, @ouyuanning, @aunjgr

* fileservice: add caching dns resolver (matrixorigin#16702)

fileservice: longer timeout for http client

Approved by: @zhangxu19830126

* Fix-16620 (matrixorigin#16681)

1.  Reuse latest partition state.

Approved by: @badboynt1, @m-schen, @XuPeng-SH

* rmTag16601_16597 (matrixorigin#16700)

rm  tag 16601 and 16597

Approved by: @heni02

* optimize top operator in pipeline for tp query (matrixorigin#16704)

optimize top operator in pipeline for tp query, don't need mergetop

Approved by: @m-schen

* optimize limit operator in pipeline for tp query (matrixorigin#16705)

optimize limit operator in pipeline for tp query, don't need toplimit

Approved by: @m-schen

* add global system variable and session variable account isolation cases (matrixorigin#16694)

add global system variable and session variable account isolation cases

Approved by: @aressu1985

* Add issue 16613 cases (matrixorigin#16719)

Add issue 16613 cases

Approved by: @aressu1985

* add case for function hex() and unhex() (matrixorigin#16711)

add case for hex() and unhex().

Approved by: @heni02

* optimize group operator in pipeline for tp query (matrixorigin#16717)

optimize group operator in pipeline for tp query, don't need mergegroup

Approved by: @m-schen

* add debug info for panic (matrixorigin#16634)

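The problem reported in the issue is an abnormal transaction state.

Add transaction-state checks along the call stack where the problem occurs.

txnIsValid checks whether the transaction state is abnormal.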

Approved by: @badboynt1, @m-schen, @ouyuanning, @triump2020, @qingxinhome, @aunjgr

* add optimizer hint exectype to force query to be ap or tp (matrixorigin#16722)

add optimizer hint exectype to force query to be ap or tp

Approved by: @ouyuanning

* update bloom filter for the new prefix bf (matrixorigin#16684)

support prefix bloom filter for object reader and writer

Approved by: @XuPeng-SH

* memorycache: code clean-ups (matrixorigin#16313)

fileservice: remove IOEntry.ReadFromOSFile

memorycache: remove RCBytes

Approved by: @zhangxu19830126

* optimize offset operator in pipeline for tp query (matrixorigin#16706)

optimize offset operator in pipeline for tp query, don't need mergeoffset

Approved by: @m-schen

* fix merge

---------

Co-authored-by: YANGGMM <www.yangzhao123@gmail.com>
Co-authored-by: aptend <49832303+aptend@users.noreply.github.com>
Co-authored-by: nitao <badboynt@126.com>
Co-authored-by: Wei Ziran <weiziran125@gmail.com>
Co-authored-by: Kai Cao <ck89119@users.noreply.github.com>
Co-authored-by: fagongzi <zhangxu19830126@gmail.com>
Co-authored-by: reusee <reusee@gmail.com>
Co-authored-by: triump2020 <63033222+triump2020@users.noreply.github.com>
Co-authored-by: Ariznawlll <ariznawl@163.com>
Co-authored-by: heni02 <113406637+heni02@users.noreply.github.com>
Co-authored-by: davis zhen <daviszhen007@gmail.com>
Co-authored-by: GreatRiver <2552853833@qq.com>
XuPeng-SH added a commit to XuPeng-SH/matrixone that referenced this pull request Jun 11, 2024
* fix bvt test (matrixorigin#16605)

fix bvt test

Approved by: @heni02

* remove duplicates in object list for flushing (matrixorigin#16677)

- reduce the size of tombstone files

Approved by: @XuPeng-SH

* fix stats for prefix_eq function (matrixorigin#16666)

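Because the index table always serializes the primary key into it, the NDV is very high and the selectivity estimate for the index table is badly wrong, which causes the optimizer to misclassify TP/AP statements.
The selectivity of the prefix_eq function is now computed from the selectivity of the original filter condition.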

Approved by: @aunjgr

* Fix condition to ignore delete booking if no transfer needed (matrixorigin#16633)

Add a log statement to improve traceability when the transfer is not needed

Approved by: @XuPeng-SH

* make sure all pipeline run in single parallel for tp query (matrixorigin#16685)

make sure all pipeline run in single parallel for tp query

Approved by: @ouyuanning, @aunjgr

* [Cherry-pick] handle null in convertRowsIntoBatch (matrixorigin#16676)

handle null in convertRowsIntoBatch

Approved by: @daviszhen

* Fix enumtype system variable check  (matrixorigin#16691)

Fix enumtype system variable check

Approved by: @daviszhen

* support query replica count of special cn (matrixorigin#16642)

support query replica count of special cn

Approved by: @reusee, @daviszhen

* split build operator into merge and build operators (matrixorigin#16673)

Split the data send/receive functionality out of the build operator, yielding merge + build, in preparation for later refactoring.

Approved by: @m-schen, @ouyuanning, @aunjgr

* fileservice: add caching dns resolver (matrixorigin#16702)

fileservice: longer timeout for http client

Approved by: @zhangxu19830126

* Fix-16620 (matrixorigin#16681)

1.  Reuse latest partition state.

Approved by: @badboynt1, @m-schen, @XuPeng-SH

* rmTag16601_16597 (matrixorigin#16700)

rm  tag 16601 and 16597

Approved by: @heni02

* optimize top operator in pipeline for tp query (matrixorigin#16704)

optimize top operator in pipeline for tp query, don't need mergetop

Approved by: @m-schen

* optimize limit operator in pipeline for tp query (matrixorigin#16705)

optimize limit operator in pipeline for tp query, don't need toplimit

Approved by: @m-schen

* add global system variable and session variable account isolation cases (matrixorigin#16694)

add global system variable and session variable account isolation cases

Approved by: @aressu1985

* Add issue 16613 cases (matrixorigin#16719)

Add issue 16613 cases

Approved by: @aressu1985

* add case for function hex() and unhex() (matrixorigin#16711)

add case for hex() and unhex().

Approved by: @heni02

* optimize group operator in pipeline for tp query (matrixorigin#16717)

optimize group operator in pipeline for tp query, don't need mergegroup

Approved by: @m-schen

* add debug info for panic (matrixorigin#16634)

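The problem reported in the issue is an abnormal transaction state.

Add transaction state checks on the call stack where the problem occurs.

txnIsValid checks whether the transaction state is abnormal.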

Approved by: @badboynt1, @m-schen, @ouyuanning, @triump2020, @qingxinhome, @aunjgr

* add optimizer hint exectype to force query to be ap or tp (matrixorigin#16722)

add optimizer hint exectype to force query to be ap or tp

Approved by: @ouyuanning

* update bloom filter for the new prefix bf (matrixorigin#16684)

support prefix bloom filter for object reader and writer

Approved by: @XuPeng-SH

* memorycache: code clean-ups (matrixorigin#16313)

fileservice: remove IOEntry.ReadFromOSFile

memorycache: remove RCBytes

Approved by: @zhangxu19830126

* optimize offset operator in pipeline for tp query (matrixorigin#16706)

optimize offset operator in pipeline for tp query, don't need mergeoffset

Approved by: @m-schen

* add issue 16139 cases (matrixorigin#16733)

add issue 16139 cases

Approved by: @aressu1985

* handle Restore Duplicate Entry (matrixorigin#16567)

During SQL execution, bind the transaction's WriteOffset to the current statement, which solves the Halloween problem when reading data.

MO Checkin Regression test success:
https://github.com/matrixorigin/ci-test/actions/runs/9362961560
https://github.com/matrixorigin/ci-test/actions/runs/9379340928

Approved by: @daviszhen, @badboynt1, @m-schen, @reusee, @zhangxu19830126, @XuPeng-SH, @aunjgr, @triump2020

* Handle Cancel Restore Statement Fail (matrixorigin#16735)

handle `ctrl+c` failed to cancel during restore data

Approved by: @daviszhen

* fix a bug that cause ap performance regression on multi cn (matrixorigin#16737)

fix a bug that cause ap performance regression on multi cn

Approved by: @ouyuanning

* optimize order operator in pipeline for tp query (matrixorigin#16742)

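Split the mergeorder operator into merge+mergeorder, separating out the data-receiving functionality.
For tp query, the pipeline becomes scan->order->mergeorder directly, with no need to connect through connector-merge.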

Approved by: @ouyuanning, @m-schen

* dashboard: refactor runtime dashboard (matrixorigin#16746)

refine go runtime metrics dashboard

Approved by: @zhangxu19830126

* malloc: add profiler (matrixorigin#16699)

malloc: refactor config

malloc: add chainDeallocator, FuncDeallocator; optimize metrics allocator

malloc: optimize metrics allocator

malloc: enable metrics default

Approved by: @zhangxu19830126

* skip stats for create view (matrixorigin#16728)

skip stats for create view

Approved by: @daviszhen, @aunjgr

* fileservice: add disk-based object storage (matrixorigin#16610)

add local disk s3 fs object storage for testing purposes

Approved by: @zhangxu19830126

* block reader supports between filter. (matrixorigin#16674)

block reader supports between filters.

Approved by: @XuPeng-SH, @heni02, @aunjgr

* change LIMIT,OFFSET's data type from int64 to uint64 (matrixorigin#16697)

Change the data type of limit and offset to uint64.

Approved by: @m-schen, @reusee, @ouyuanning, @badboynt1, @aunjgr, @aressu1985

* fix shard service panic (matrixorigin#16759)

fix shard service panic

Approved by: @reusee

* make add txn error trace async (matrixorigin#16757)

Fix hang when adding txn error trace

Approved by: @iamlinjunhong

* Revert "resubmit pipeline client max connections is too large (matrixorigin#16209)" (matrixorigin#16754)

Revert "resubmit pipeline client max connections is too large (matrixorigin#16209)"

Approved by: @badboynt1, @daviszhen

* [bug] launch: fix timeout mechanism when TN service startup (matrixorigin#16760)

fix timeout mechanism when TN service startup:
5m timeout for total wait and 5s timeout for each request.

Approved by: @zhangxu19830126

* add a code owner for vector (matrixorigin#16762)

add XuPeng-SH as vector owner

Approved by: @fengttt

* refactor the block reader filter to support more expressions (matrixorigin#16756)

1.  `>, >=, <, <=`
2. `between and, prefix in, prefix between, prefix eq`
3. `in, eq`

Approved by: @XuPeng-SH, @aressu1985

* update readme(1.2.0) (matrixorigin#16243)

* Remove list checkpoint meta file (matrixorigin#16723)

Remove list checkpoint meta file

Approved by: @XuPeng-SH

* fileservice: fix fd leak in LocalFS.read (matrixorigin#16748)

fix fd leak in LocalFS.read

Approved by: @fengttt

* [BugFix]: Remove unnecessary projections in master index (matrixorigin#16766)

Remove unnecessary projections from Master Index.

```sql
mysql> explain analyze SELECT tbl.a100  FROM tbl  WHERE tbl.a75 = 'I2nJ0RqIQu';
+---------------------------------------------------------------------------------------------------------------------------------------------------------+
| AP QUERY PLAN ON MULTICN(10 core)                                                                                                                       |
+---------------------------------------------------------------------------------------------------------------------------------------------------------+
| Project                                                                                                                                                 |
|   Analyze: timeConsumed=0ms waitTime=6ms inputRows=2 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=0bytes                                |
|   ->  Join                                                                                                                                              |
|         Analyze: timeConsumed=2ms waitTime=44ms inputRows=2 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=0bytes                         |
|         Join Type: INDEX                                                                                                                                |
|         Join Cond: (tbl.a100 = #[1,0])                                                                                                                  |
|         Runtime Filter Build: #[-1,0]                                                                                                                   |
|         ->  Table Scan on a.tbl [ForceOneCN]                                                                                                            |
|               Analyze: timeConsumed=0ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=50bytes     |
|               Filter Cond: (tbl.a75 = 'I2nJ0RqIQu')                                                                                                     |
|               Block Filter Cond: (tbl.a75 = 'I2nJ0RqIQu')                                                                                               |
|               Runtime Filter Probe: tbl.a100                                                                                                            |
|         ->  Table Scan on a.__mo_index_secondary_019003d8-3fd8-7455-b750-bd977ca13178 [ForceOneCN]                                                      |
|               Analyze: timeConsumed=2ms waitTime=0ms inputBlocks=6 inputRows=49152 outputRows=1 InputSize=3mb OutputSize=24bytes MemorySize=696320bytes |
|               Filter Cond: prefix_eq(#[0,0], 'F74 FI2nJ0RqIQu ')                                                                                      |
|               Block Filter Cond: prefix_eq(#[0,0], 'F74 FI2nJ0RqIQu ')                                                                                |
+---------------------------------------------------------------------------------------------------------------------------------------------------------+
16 rows in set (0.01 sec)


mysql> explain analyze SELECT tbl.a100  FROM tbl  WHERE tbl.a37 = '3Tfm6CEXy5' AND tbl.a94 = '6PRBdXpsVB';
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| AP QUERY PLAN ON MULTICN(10 core)                                                                                                                                                                         |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Project                                                                                                                                                                                                   |
|   Analyze: timeConsumed=0ms waitTime=15ms inputRows=3 outputRows=1 InputSize=72bytes OutputSize=24bytes MemorySize=0bytes                                                                                 |
|   ->  Join                                                                                                                                                                                                |
|         Analyze: timeConsumed=4ms waitTime=72ms inputRows=2 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=0bytes                                                                           |
|         Join Type: INDEX                                                                                                                                                                                  |
|         Join Cond: (tbl.a100 = #[1,0])                                                                                                                                                                    |
|         Runtime Filter Build: #[-1,0]                                                                                                                                                                     |
|         ->  Table Scan on a.tbl [ForceOneCN]                                                                                                                                                              |
|               Analyze: timeConsumed=0ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=72bytes OutputSize=24bytes MemorySize=75bytes                                                       |
|               Filter Cond: (tbl.a94 = '6PRBdXpsVB'), (tbl.a37 = '3Tfm6CEXy5')                                                                                                                             |
|               Block Filter Cond: (tbl.a94 = '6PRBdXpsVB'), (tbl.a37 = '3Tfm6CEXy5')                                                                                                                       |
|               Runtime Filter Probe: tbl.a100                                                                                                                                                              |
|         ->  Join                                                                                                                                                                                          |
|               Analyze: timeConsumed=4ms probe_time=[total=0ms,min=0ms,max=0ms,dop=10] build_time=[4ms] waitTime=57ms inputRows=2 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=361187bytes |
|               Join Type: INNER                                                                                                                                                                            |
|               Join Cond: (#[0,0] = #[1,0])                                                                                                                                                                |
|               ->  Table Scan on a.__mo_index_secondary_019003d8-3fd8-7455-b750-bd977ca13178 [ForceOneCN]                                                                                                  |
|                     Analyze: timeConsumed=3ms waitTime=0ms inputBlocks=4 inputRows=32768 outputRows=1 InputSize=2mb OutputSize=24bytes MemorySize=679936bytes                                             |
|                     Filter Cond: prefix_eq(#[0,0], 'F36 F3Tfm6CEXy5 ')                                                                                                                                  |
|                     Block Filter Cond: prefix_eq(#[0,0], 'F36 F3Tfm6CEXy5 ')                                                                                                                            |
|               ->  Table Scan on a.__mo_index_secondary_019003d8-3fd8-7455-b750-bd977ca13178 [ForceOneCN]                                                                                                  |
|                     Analyze: timeConsumed=3ms waitTime=0ms inputBlocks=6 inputRows=49152 outputRows=1 InputSize=3mb OutputSize=24bytes MemorySize=696320bytes                                             |
|                     Filter Cond: prefix_eq(#[0,0], 'F93 F6PRBdXpsVB ')                                                                                                                                  |
|                     Block Filter Cond: prefix_eq(#[0,0], 'F93 F6PRBdXpsVB ')                                                                                                                            |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
24 rows in set (0.00 sec)

```

Approved by: @badboynt1

* fix

* fix

* fix

* fix

---------

Co-authored-by: YANGGMM <www.yangzhao123@gmail.com>
Co-authored-by: aptend <49832303+aptend@users.noreply.github.com>
Co-authored-by: nitao <badboynt@126.com>
Co-authored-by: Wei Ziran <weiziran125@gmail.com>
Co-authored-by: Kai Cao <ck89119@users.noreply.github.com>
Co-authored-by: fagongzi <zhangxu19830126@gmail.com>
Co-authored-by: reusee <reusee@gmail.com>
Co-authored-by: triump2020 <63033222+triump2020@users.noreply.github.com>
Co-authored-by: Ariznawlll <ariznawl@163.com>
Co-authored-by: heni02 <113406637+heni02@users.noreply.github.com>
Co-authored-by: davis zhen <daviszhen007@gmail.com>
Co-authored-by: GreatRiver <2552853833@qq.com>
Co-authored-by: qingxinhome <70939751+qingxinhome@users.noreply.github.com>
Co-authored-by: gouhongshen <gouhongshen@hotmail.com>
Co-authored-by: zengyan1 <93656539+zengyan1@users.noreply.github.com>
Co-authored-by: LiuBo <g.user.lb@gmail.com>
Co-authored-by: yangj1211 <153493538+yangj1211@users.noreply.github.com>
Co-authored-by: Arjun Sunil Kumar <arjunsk@users.noreply.github.com>
XuPeng-SH added a commit to XuPeng-SH/matrixone that referenced this pull request Jun 12, 2024
* fix bvt test (matrixorigin#16605)

fix bvt test

Approved by: @heni02

* remove duplicates in object list for flushing (matrixorigin#16677)

- reduce the size of tombstone files

Approved by: @XuPeng-SH

* fix stats for prefix_eq function (matrixorigin#16666)

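Because the index table always serializes the primary key into its entries, the NDV is very high and the selectivity estimate for the index table is badly wrong, which causes the optimizer to misclassify TP/AP statements.
The selectivity of the prefix_eq function is now computed from the selectivity of the original filter condition.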

Approved by: @aunjgr

* Fix condition to ignore delete booking if no transfer needed (matrixorigin#16633)

Add a log statement to improve traceability when the transfer is not needed

Approved by: @XuPeng-SH

* make sure all pipeline run in single parallel for tp query (matrixorigin#16685)

make sure all pipeline run in single parallel for tp query

Approved by: @ouyuanning, @aunjgr

* [Cherry-pick] handle null in convertRowsIntoBatch (matrixorigin#16676)

handle null in convertRowsIntoBatch

Approved by: @daviszhen

* Fix enumtype system variable check  (matrixorigin#16691)

Fix enumtype system variable check

Approved by: @daviszhen

* support query replica count of special cn (matrixorigin#16642)

support query replica count of special cn

Approved by: @reusee, @daviszhen

* split build operator into merge and build operators (matrixorigin#16673)

Split the data send/receive functionality out of the build operator, giving merge+build, in preparation for later refactoring.

Approved by: @m-schen, @ouyuanning, @aunjgr

* fileservice: add caching dns resolver (matrixorigin#16702)

fileservice: longer timeout for http client

Approved by: @zhangxu19830126

* Fix-16620 (matrixorigin#16681)

1.  Reuse latest partition state.

Approved by: @badboynt1, @m-schen, @XuPeng-SH

* rmTag16601_16597 (matrixorigin#16700)

rm  tag 16601 and 16597

Approved by: @heni02

* optimize top operator in pipeline for tp query (matrixorigin#16704)

optimize top operator in pipeline for tp query, don't need mergetop

Approved by: @m-schen

* optimize limit operator in pipeline for tp query (matrixorigin#16705)

optimize limit operator in pipeline for tp query, don't need toplimit

Approved by: @m-schen

* add global system variable and session variable account isolation cases (matrixorigin#16694)

add global system variable and session variable account isolation cases

Approved by: @aressu1985

* Add issue 16613 cases (matrixorigin#16719)

Add issue 16613 cases

Approved by: @aressu1985

* add case for function hex() and unhex() (matrixorigin#16711)

add case for hex() and unhex().

Approved by: @heni02

* optimize group operator in pipeline for tp query (matrixorigin#16717)

optimize group operator in pipeline for tp query, don't need mergegroup

Approved by: @m-schen

* add debug info for panic (matrixorigin#16634)

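The problem reported in the issue is an abnormal transaction state.

Add transaction state checks on the call stack where the problem occurs.

txnIsValid checks whether the transaction state is abnormal.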

Approved by: @badboynt1, @m-schen, @ouyuanning, @triump2020, @qingxinhome, @aunjgr

* add optimizer hint exectype to force query to be ap or tp (matrixorigin#16722)

add optimizer hint exectype to force query to be ap or tp

Approved by: @ouyuanning

* update bloom filter for the new prefix bf (matrixorigin#16684)

support prefix bloom filter for object reader and writer

Approved by: @XuPeng-SH

* memorycache: code clean-ups (matrixorigin#16313)

fileservice: remove IOEntry.ReadFromOSFile

memorycache: remove RCBytes

Approved by: @zhangxu19830126

* optimize offset operator in pipeline for tp query (matrixorigin#16706)

optimize offset operator in pipeline for tp query, don't need mergeoffset

Approved by: @m-schen

* add issue 16139 cases (matrixorigin#16733)

add issue 16139 cases

Approved by: @aressu1985

* handle Restore Duplicate Entry (matrixorigin#16567)

During SQL execution, bind the transaction's WriteOffset to the current statement, which solves the Halloween problem when reading data.

MO Checkin Regression test success:
https://github.com/matrixorigin/ci-test/actions/runs/9362961560
https://github.com/matrixorigin/ci-test/actions/runs/9379340928

Approved by: @daviszhen, @badboynt1, @m-schen, @reusee, @zhangxu19830126, @XuPeng-SH, @aunjgr, @triump2020

* Handle Cancel Restore Statement Fail (matrixorigin#16735)

handle `ctrl+c` failed to cancel during restore data

Approved by: @daviszhen

* fix a bug that cause ap performance regression on multi cn (matrixorigin#16737)

fix a bug that cause ap performance regression on multi cn

Approved by: @ouyuanning

* optimize order operator in pipeline for tp query (matrixorigin#16742)

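Split the mergeorder operator into merge+mergeorder, separating out the data-receiving functionality.
For tp query, the pipeline becomes scan->order->mergeorder directly, with no need to connect through connector-merge.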

Approved by: @ouyuanning, @m-schen

* dashboard: refactor runtime dashboard (matrixorigin#16746)

refine go runtime metrics dashboard

Approved by: @zhangxu19830126

* malloc: add profiler (matrixorigin#16699)

malloc: refactor config

malloc: add chainDeallocator, FuncDeallocator; optimize metrics allocator

malloc: optimize metrics allocator

malloc: enable metrics default

Approved by: @zhangxu19830126

* skip stats for create view (matrixorigin#16728)

skip stats for create view

Approved by: @daviszhen, @aunjgr

* fileservice: add disk-based object storage (matrixorigin#16610)

add local disk s3 fs object storage for testing purposes

Approved by: @zhangxu19830126

* block reader supports between filter. (matrixorigin#16674)

block reader supports between filters.

Approved by: @XuPeng-SH, @heni02, @aunjgr

* change LIMIT,OFFSET's data type from int64 to uint64 (matrixorigin#16697)

Change the data type of limit and offset to uint64.

Approved by: @m-schen, @reusee, @ouyuanning, @badboynt1, @aunjgr, @aressu1985

* fix shard service panic (matrixorigin#16759)

fix shard service panic

Approved by: @reusee

* make add txn error trace async (matrixorigin#16757)

Fix hang when adding txn error trace

Approved by: @iamlinjunhong

* Revert "resubmit pipeline client max connections is too large (matrixorigin#16209)" (matrixorigin#16754)

Revert "resubmit pipeline client max connections is too large (matrixorigin#16209)"

Approved by: @badboynt1, @daviszhen

* [bug] launch: fix timeout mechanism when TN service startup (matrixorigin#16760)

fix timeout mechanism when TN service startup:
5m timeout for total wait and 5s timeout for each request.

Approved by: @zhangxu19830126

* add a code owner for vector (matrixorigin#16762)

add XuPeng-SH as vector owner

Approved by: @fengttt

* refactor the block reader filter to support more expressions (matrixorigin#16756)

1.  `>, >=, <, <=`
2. `between and, prefix in, prefix between, prefix eq`
3. `in, eq`

Approved by: @XuPeng-SH, @aressu1985

* update readme(1.2.0) (matrixorigin#16243)

* Remove list checkpoint meta file (matrixorigin#16723)

Remove list checkpoint meta file

Approved by: @XuPeng-SH

* fileservice: fix fd leak in LocalFS.read (matrixorigin#16748)

fix fd leak in LocalFS.read

Approved by: @fengttt

* [BugFix]: Remove unnecessary projections in master index (matrixorigin#16766)

Remove unnecessary projections from Master Index.

```sql
mysql> explain analyze SELECT tbl.a100  FROM tbl  WHERE tbl.a75 = 'I2nJ0RqIQu';
+---------------------------------------------------------------------------------------------------------------------------------------------------------+
| AP QUERY PLAN ON MULTICN(10 core)                                                                                                                       |
+---------------------------------------------------------------------------------------------------------------------------------------------------------+
| Project                                                                                                                                                 |
|   Analyze: timeConsumed=0ms waitTime=6ms inputRows=2 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=0bytes                                |
|   ->  Join                                                                                                                                              |
|         Analyze: timeConsumed=2ms waitTime=44ms inputRows=2 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=0bytes                         |
|         Join Type: INDEX                                                                                                                                |
|         Join Cond: (tbl.a100 = #[1,0])                                                                                                                  |
|         Runtime Filter Build: #[-1,0]                                                                                                                   |
|         ->  Table Scan on a.tbl [ForceOneCN]                                                                                                            |
|               Analyze: timeConsumed=0ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=50bytes     |
|               Filter Cond: (tbl.a75 = 'I2nJ0RqIQu')                                                                                                     |
|               Block Filter Cond: (tbl.a75 = 'I2nJ0RqIQu')                                                                                               |
|               Runtime Filter Probe: tbl.a100                                                                                                            |
|         ->  Table Scan on a.__mo_index_secondary_019003d8-3fd8-7455-b750-bd977ca13178 [ForceOneCN]                                                      |
|               Analyze: timeConsumed=2ms waitTime=0ms inputBlocks=6 inputRows=49152 outputRows=1 InputSize=3mb OutputSize=24bytes MemorySize=696320bytes |
|               Filter Cond: prefix_eq(#[0,0], 'F74 FI2nJ0RqIQu ')                                                                                      |
|               Block Filter Cond: prefix_eq(#[0,0], 'F74 FI2nJ0RqIQu ')                                                                                |
+---------------------------------------------------------------------------------------------------------------------------------------------------------+
16 rows in set (0.01 sec)


mysql> explain analyze SELECT tbl.a100  FROM tbl  WHERE tbl.a37 = '3Tfm6CEXy5' AND tbl.a94 = '6PRBdXpsVB';
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| AP QUERY PLAN ON MULTICN(10 core)                                                                                                                                                                         |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Project                                                                                                                                                                                                   |
|   Analyze: timeConsumed=0ms waitTime=15ms inputRows=3 outputRows=1 InputSize=72bytes OutputSize=24bytes MemorySize=0bytes                                                                                 |
|   ->  Join                                                                                                                                                                                                |
|         Analyze: timeConsumed=4ms waitTime=72ms inputRows=2 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=0bytes                                                                           |
|         Join Type: INDEX                                                                                                                                                                                  |
|         Join Cond: (tbl.a100 = #[1,0])                                                                                                                                                                    |
|         Runtime Filter Build: #[-1,0]                                                                                                                                                                     |
|         ->  Table Scan on a.tbl [ForceOneCN]                                                                                                                                                              |
|               Analyze: timeConsumed=0ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=72bytes OutputSize=24bytes MemorySize=75bytes                                                       |
|               Filter Cond: (tbl.a94 = '6PRBdXpsVB'), (tbl.a37 = '3Tfm6CEXy5')                                                                                                                             |
|               Block Filter Cond: (tbl.a94 = '6PRBdXpsVB'), (tbl.a37 = '3Tfm6CEXy5')                                                                                                                       |
|               Runtime Filter Probe: tbl.a100                                                                                                                                                              |
|         ->  Join                                                                                                                                                                                          |
|               Analyze: timeConsumed=4ms probe_time=[total=0ms,min=0ms,max=0ms,dop=10] build_time=[4ms] waitTime=57ms inputRows=2 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=361187bytes |
|               Join Type: INNER                                                                                                                                                                            |
|               Join Cond: (#[0,0] = #[1,0])                                                                                                                                                                |
|               ->  Table Scan on a.__mo_index_secondary_019003d8-3fd8-7455-b750-bd977ca13178 [ForceOneCN]                                                                                                  |
|                     Analyze: timeConsumed=3ms waitTime=0ms inputBlocks=4 inputRows=32768 outputRows=1 InputSize=2mb OutputSize=24bytes MemorySize=679936bytes                                             |
|                     Filter Cond: prefix_eq(#[0,0], 'F36 F3Tfm6CEXy5 ')                                                                                                                                  |
|                     Block Filter Cond: prefix_eq(#[0,0], 'F36 F3Tfm6CEXy5 ')                                                                                                                            |
|               ->  Table Scan on a.__mo_index_secondary_019003d8-3fd8-7455-b750-bd977ca13178 [ForceOneCN]                                                                                                  |
|                     Analyze: timeConsumed=3ms waitTime=0ms inputBlocks=6 inputRows=49152 outputRows=1 InputSize=3mb OutputSize=24bytes MemorySize=696320bytes                                             |
|                     Filter Cond: prefix_eq(#[0,0], 'F93 F6PRBdXpsVB ')                                                                                                                                  |
|                     Block Filter Cond: prefix_eq(#[0,0], 'F93 F6PRBdXpsVB ')                                                                                                                            |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
24 rows in set (0.00 sec)

```

Approved by: @badboynt1

* add SERVER_MORE_RESULTS_EXISTS judgment (matrixorigin#16712)

Complete the SERVER_MORE_RESULTS_EXISTS setting when responding to the client

Approved by: @daviszhen

* fileservice: tune metrics dashboard (matrixorigin#16770)

add metrics for io.ReadAll

Approved by: @zhangxu19830126

* [opt] retain IN expression in prepared stmt (matrixorigin#16744)

don't convert IN expression to OR list in prepared statements

Approved by: @ouyuanning

* Refactor view scope execute (matrixorigin#15984)

Reduce the coupling between the create-view and create-table operations and improve code readability.

Approved by: @badboynt1, @ouyuanning, @m-schen, @aunjgr

* optimize join pipeline for tp query (matrixorigin#16773)

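For tp query, optimize away the build-side pipeline and add the build operator directly to the right subtree, with no need to connect through connector->merge.
This is done directly at compile time, so the pipeline no longer needs to be modified at runtime.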

Approved by: @ouyuanning

* add optimizer hints  (matrixorigin#16782)

Added an optimizer hints option that forces every right join to be rewritten as a left join.
Changed how the query ap/tp hint is implemented.

Approved by: @aunjgr

* [BugFix]: Add ColName to MasterIndexScan for Filter PushDown (matrixorigin#16778)

- Add ColName to the MasterIndex optimizer plan so that filters can be pushed down.
- With this change, master index performance is reasonable, so the Experimental Flag is removed.

1 Filter Query QPS
- No index: 500
- 1 Master: 2820
- 100 Secondary: 2958
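
To put the 1-filter numbers in perspective, here is a minimal Go sketch that computes the relative speedups from the QPS figures listed above. The `speedup` helper is a hypothetical name introduced for illustration; it is not part of this PR.

```go
package main

import "fmt"

// speedup returns how many times higher the indexed throughput is
// than the baseline throughput, both given in queries per second.
// Hypothetical helper, used only for this illustration.
func speedup(baselineQPS, indexedQPS float64) float64 {
	return indexedQPS / baselineQPS
}

func main() {
	// QPS figures copied from the 1-filter results above.
	const noIndex, master, secondary = 500.0, 2820.0, 2958.0

	fmt.Printf("master index:    %.2fx over no index\n", speedup(noIndex, master))
	fmt.Printf("secondary index: %.2fx over no index\n", speedup(noIndex, secondary))
	fmt.Printf("master QPS is %.1f%% of secondary QPS\n", 100*master/secondary)
}
```

With the reported figures, the master index comes out at roughly 5.6x the no-index throughput and the secondary index at roughly 5.9x, so the master index reaches about 95% of the secondary-index QPS for this query.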

<details>
<summary> Query Plan </summary>

```sql
-- Master Index

mysql> explain analyze SELECT tbl.a100  FROM tbl  WHERE tbl.a48 = 'b92k7dWP5t';
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| AP QUERY PLAN ON MULTICN(10 core)                                                                                                                   |
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| Project                                                                                                                                             |
|   Analyze: timeConsumed=0ms waitTime=4ms inputRows=2 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=0bytes                            |
|   ->  Join                                                                                                                                          |
|         Analyze: timeConsumed=1ms waitTime=29ms inputRows=2 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=0bytes                     |
|         Join Type: INDEX                                                                                                                            |
|         Join Cond: (tbl.a100 = __mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17.__mo_index_pri_col)                                        |
|         Runtime Filter Build: #[-1,0]                                                                                                               |
|         ->  Table Scan on a.tbl [ForceOneCN]                                                                                                        |
|               Analyze: timeConsumed=0ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=50bytes |
|               Filter Cond: (tbl.a48 = 'b92k7dWP5t')                                                                                                 |
|               Block Filter Cond: (tbl.a48 = 'b92k7dWP5t')                                                                                           |
|               Runtime Filter Probe: tbl.a100                                                                                                        |
|         ->  Table Scan on a.__mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17 [ForceOneCN]                                                  |
|               Analyze: timeConsumed=1ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=79bytes OutputSize=24bytes MemorySize=80bytes |
|               Filter Cond: prefix_eq(__mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17.__mo_index_idx_col, 'F47 Fb92k7dWP5t ')            |
|               Block Filter Cond: prefix_eq(__mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17.__mo_index_idx_col, 'F47 Fb92k7dWP5t ')      |
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
16 rows in set (0.00 sec)


-- Secondary Index

mysql> explain analyze SELECT tbl.a100  FROM tbl  WHERE tbl.a48 = 'b92k7dWP5t';
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| TP QURERY PLAN                                                                                                                                      |
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| Project                                                                                                                                             |
|   Analyze: timeConsumed=0ms waitTime=0ms inputRows=2 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=0bytes                            |
|   ->  Join                                                                                                                                          |
|         Analyze: timeConsumed=0ms waitTime=1ms inputRows=1 outputRows=1 InputSize=24bytes OutputSize=24bytes MemorySize=0bytes                      |
|         Join Type: INDEX                                                                                                                            |
|         Join Cond: (tbl.a100 = __mo_index_secondary_01900577-f81d-7a7b-802c-b61a09a28067.__mo_index_pri_col)                                        |
|         Runtime Filter Build: #[-1,0]                                                                                                               |
|         ->  Table Scan on a.tbl [ForceOneCN]                                                                                                        |
|               Analyze: timeConsumed=0ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=50bytes |
|               Filter Cond: (tbl.a48 = 'b92k7dWP5t')                                                                                                 |
|               Block Filter Cond: (tbl.a48 = 'b92k7dWP5t')                                                                                           |
|               Runtime Filter Probe: tbl.a100                                                                                                        |
|         ->  Table Scan on a.__mo_index_secondary_01900577-f81d-7a7b-802c-b61a09a28067 [ForceOneCN]                                                  |
|               Analyze: timeConsumed=0ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=74bytes OutputSize=24bytes MemorySize=75bytes |
|               Filter Cond: prefix_eq(__mo_index_secondary_01900577-f81d-7a7b-802c-b61a09a28067.__mo_index_idx_col, 'Fb92k7dWP5t ')                 |
|               Block Filter Cond: prefix_eq(__mo_index_secondary_01900577-f81d-7a7b-802c-b61a09a28067.__mo_index_idx_col, 'Fb92k7dWP5t ')           |
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
16 rows in set (0.00 sec)



```
</details>
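
In the plans above, the master index scan filters with `prefix_eq` on a literal that appears to carry a column-position component before the value (`'F47 Fb92k7dWP5t '`), while the secondary index literal carries only the value (`'Fb92k7dWP5t '`). The Go sketch below illustrates the general idea of a prefix lookup on such a composite key. The key layout, the `|` separator, and the names `masterKey` and `prefixEq` are assumptions made for illustration; they do not reproduce MatrixOne's actual serialization.

```go
package main

import (
	"fmt"
	"strings"
)

// masterKey builds a hypothetical master-index key made of the column
// position, the column value, and the primary key. The "|" separator is
// illustrative only and is not the real encoding.
func masterKey(colPos int, value, pk string) string {
	return fmt.Sprintf("%d|%s|%s", colPos, value, pk)
}

// prefixEq reports whether key starts with prefix, which is conceptually
// what a prefix_eq filter checks.
func prefixEq(key, prefix string) bool {
	return strings.HasPrefix(key, prefix)
}

func main() {
	keys := []string{
		masterKey(47, "b92k7dWP5t", "pk1"),
		masterKey(30, "b92k7dWP5t", "pk2"), // same value, different column
		masterKey(47, "zzz", "pk3"),        // same column, different value
	}

	// The probe fixes both the column position and the value, so only
	// entries for that column and that value can match.
	probe := "47|b92k7dWP5t|"
	for _, k := range keys {
		fmt.Println(k, prefixEq(k, probe))
	}
}
```

Because one master index table holds entries for every indexed column, the column component in the prefix is what keeps a value in one column from matching the same value stored for another column.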


2 Filter Query QPS
- No Index:
- 1 Master: 1335 (both filters are currently applied and combined with an inner join)
- 100 Secondary: 1725 (only one secondary index table is used)

<details>
<summary> Query Plan </summary>


```sql
-- master index
mysql> explain analyze SELECT tbl.a100  FROM tbl  WHERE tbl.a89 = '40u4JSeGvz' AND tbl.a31 = '3X5ZOcJbol';
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| AP QUERY PLAN ON MULTICN(10 core)                                                                                                                                                                            |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Project                                                                                                                                                                                                      |
|   Analyze: timeConsumed=0ms waitTime=36ms inputRows=3 outputRows=1 InputSize=72bytes OutputSize=24bytes MemorySize=0bytes                                                                                    |
|   ->  Join                                                                                                                                                                                                   |
|         Analyze: timeConsumed=11ms waitTime=167ms inputRows=2 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=0bytes                                                                            |
|         Join Type: INDEX                                                                                                                                                                                     |
|         Join Cond: (tbl.a100 = __mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17.__mo_index_pri_col)                                                                                                 |
|         Runtime Filter Build: #[-1,0]                                                                                                                                                                        |
|         ->  Table Scan on a.tbl [ForceOneCN]                                                                                                                                                                 |
|               Analyze: timeConsumed=0ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=72bytes OutputSize=24bytes MemorySize=75bytes                                                          |
|               Filter Cond: (tbl.a89 = '40u4JSeGvz'), (tbl.a31 = '3X5ZOcJbol')                                                                                                                                |
|               Block Filter Cond: (tbl.a89 = '40u4JSeGvz'), (tbl.a31 = '3X5ZOcJbol')                                                                                                                          |
|               Runtime Filter Probe: tbl.a100                                                                                                                                                                 |
|         ->  Join                                                                                                                                                                                             |
|               Analyze: timeConsumed=11ms probe_time=[total=0ms,min=0ms,max=0ms,dop=10] build_time=[11ms] waitTime=145ms inputRows=2 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=361187bytes |
|               Join Type: INNER                                                                                                                                                                               |
|               Join Cond: (__mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17.__mo_index_pri_col = __mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17.__mo_index_pri_col)                       |
|               ->  Table Scan on a.__mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17 [ForceOneCN]                                                                                                     |
|                     Analyze: timeConsumed=5ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=79bytes OutputSize=24bytes MemorySize=80bytes                                                    |
|                     Filter Cond: prefix_eq(__mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17.__mo_index_idx_col, 'F30 F3X5ZOcJbol ')                                                               |
|                     Block Filter Cond: prefix_eq(__mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17.__mo_index_idx_col, 'F30 F3X5ZOcJbol ')                                                         |
|               ->  Table Scan on a.__mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17 [ForceOneCN]                                                                                                     |
|                     Analyze: timeConsumed=11ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=79bytes OutputSize=24bytes MemorySize=80bytes                                                   |
|                     Filter Cond: prefix_eq(__mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17.__mo_index_idx_col, 'F88 F40u4JSeGvz ')                                                               |
|                     Block Filter Cond: prefix_eq(__mo_index_secondary_01900572-3257-76e5-9a7d-e4eaa2a28f17.__mo_index_idx_col, 'F88 F40u4JSeGvz ')                                                         |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
24 rows in set (0.03 sec)



-- secondary index

mysql> explain analyze SELECT tbl.a100  FROM tbl  WHERE tbl.a89 = '40u4JSeGvz' AND tbl.a31 = '3X5ZOcJbol';
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| TP QURERY PLAN                                                                                                                                      |
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| Project                                                                                                                                             |
|   Analyze: timeConsumed=0ms waitTime=0ms inputRows=2 outputRows=1 InputSize=48bytes OutputSize=24bytes MemorySize=0bytes                            |
|   ->  Join                                                                                                                                          |
|         Analyze: timeConsumed=0ms waitTime=1ms inputRows=1 outputRows=1 InputSize=24bytes OutputSize=24bytes MemorySize=0bytes                      |
|         Join Type: INDEX                                                                                                                            |
|         Join Cond: (tbl.a100 = __mo_index_secondary_01900568-4e08-7f3a-9fb9-755355944df6.__mo_index_pri_col)                                        |
|         Runtime Filter Build: #[-1,0]                                                                                                               |
|         ->  Table Scan on a.tbl [ForceOneCN]                                                                                                        |
|               Analyze: timeConsumed=0ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=72bytes OutputSize=24bytes MemorySize=75bytes |
|               Filter Cond: (tbl.a89 = '40u4JSeGvz'), (tbl.a31 = '3X5ZOcJbol')                                                                       |
|               Block Filter Cond: (tbl.a89 = '40u4JSeGvz'), (tbl.a31 = '3X5ZOcJbol')                                                                 |
|               Runtime Filter Probe: tbl.a100                                                                                                        |
|         ->  Table Scan on a.__mo_index_secondary_01900568-4e08-7f3a-9fb9-755355944df6 [ForceOneCN]                                                  |
|               Analyze: timeConsumed=0ms waitTime=0ms inputBlocks=1 inputRows=1 outputRows=1 InputSize=74bytes OutputSize=24bytes MemorySize=75bytes |
|               Filter Cond: prefix_eq(__mo_index_secondary_01900568-4e08-7f3a-9fb9-755355944df6.__mo_index_idx_col, 'F3X5ZOcJbol ')                 |
|               Block Filter Cond: prefix_eq(__mo_index_secondary_01900568-4e08-7f3a-9fb9-755355944df6.__mo_index_idx_col, 'F3X5ZOcJbol ')           |
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
16 rows in set (0.00 sec)
```
</details>

Approved by: @daviszhen, @badboynt1, @heni02

* dashboard: fix runtime dashboard (matrixorigin#16771)

refine runtime metrics dashboard

Approved by: @zhangxu19830126

* remove 1PC commands (matrixorigin#16786)

remove 1PC subcommands

Approved by: @XuPeng-SH

* do not apply delete when flush (matrixorigin#16731)

do not apply delete when flush

Approved by: @XuPeng-SH

* fix merge

* rm 1pc

---------

Co-authored-by: YANGGMM <www.yangzhao123@gmail.com>
Co-authored-by: aptend <49832303+aptend@users.noreply.github.com>
Co-authored-by: nitao <badboynt@126.com>
Co-authored-by: Wei Ziran <weiziran125@gmail.com>
Co-authored-by: Kai Cao <ck89119@users.noreply.github.com>
Co-authored-by: fagongzi <zhangxu19830126@gmail.com>
Co-authored-by: reusee <reusee@gmail.com>
Co-authored-by: triump2020 <63033222+triump2020@users.noreply.github.com>
Co-authored-by: Ariznawlll <ariznawl@163.com>
Co-authored-by: heni02 <113406637+heni02@users.noreply.github.com>
Co-authored-by: davis zhen <daviszhen007@gmail.com>
Co-authored-by: GreatRiver <2552853833@qq.com>
Co-authored-by: qingxinhome <70939751+qingxinhome@users.noreply.github.com>
Co-authored-by: gouhongshen <gouhongshen@hotmail.com>
Co-authored-by: zengyan1 <93656539+zengyan1@users.noreply.github.com>
Co-authored-by: LiuBo <g.user.lb@gmail.com>
Co-authored-by: XuPeng-SH <xupeng3112@163.com>
Co-authored-by: yangj1211 <153493538+yangj1211@users.noreply.github.com>
Co-authored-by: Arjun Sunil Kumar <arjunsk@users.noreply.github.com>
Co-authored-by: CJKkkk_ <66134511+CJKkkk-315@users.noreply.github.com>
Co-authored-by: bRong Njam <longran1989@gmail.com>
Labels
kind/enhancement size/M Denotes a PR that changes [100,499] lines