refactor: sync create inverted index #15379

Merged 11 commits into datafuselabs:main on Apr 30, 2024

@b41sh b41sh commented Apr 30, 2024

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

Creating an inverted index takes a long time, so under continuous inserts a data block can exist before its index does, which causes queries to fail. To fix this, the inverted index is now refreshed synchronously with data block creation, in the same way as the bloom index.
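To illustrate the idea, here is a minimal sketch of building the index in the same write step as the data block. It is not the code in this PR: `Storage`, `build_inverted_index`, `write_block_with_index`, and the `index/` and `block/` paths are hypothetical names, the storage is an in-memory map, and writing the index before the block is an illustrative choice rather than a statement about the actual write order.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for object storage: path -> bytes.
#[derive(Default)]
struct Storage {
    files: HashMap<String, Vec<u8>>,
}

/// Hypothetical toy inverted index: lowercased term -> ids of rows containing it.
fn build_inverted_index(rows: &[&str]) -> HashMap<String, Vec<usize>> {
    let mut index: HashMap<String, Vec<usize>> = HashMap::new();
    for (row_id, text) in rows.iter().enumerate() {
        for term in text.split_whitespace() {
            let postings = index.entry(term.to_lowercase()).or_default();
            // Skip duplicates when a term repeats within the same row.
            if postings.last() != Some(&row_id) {
                postings.push(row_id);
            }
        }
    }
    index
}

/// Builds the index and writes it together with the block, so a reader
/// that can see the block can also see its index (assumed ordering).
fn write_block_with_index(storage: &mut Storage, block_id: &str, rows: &[&str]) {
    let index = build_inverted_index(rows);

    // Toy serialization: sorted debug output, for illustration only.
    let mut terms: Vec<_> = index.iter().collect();
    terms.sort();
    let index_bytes = format!("{:?}", terms).into_bytes();

    storage.files.insert(format!("index/{block_id}.idx"), index_bytes);
    storage
        .files
        .insert(format!("block/{block_id}.data"), rows.join("\n").into_bytes());
}
```

With this shape, there is no window in which a background job still has to catch up on a freshly written block.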

Other changes

  • Remove inverted_indexes from the snapshot. The snapshot may change while the index is being created, so changes to inverted_indexes could fall out of sync with it. Whether an inverted index file needs to be created is now decided by checking whether the file already exists.
  • Fix bloom index creation for nullable map types, so that queries on map data can be filtered by the bloom index.
  • Add inverted index metrics to help locate performance issues:
    • fuse_block_inverted_index_write_nums
    • fuse_block_inverted_index_write_bytes
    • fuse_block_inverted_index_write_milliseconds
    • fuse_block_inverted_index_generate_milliseconds
    • fuse_block_inverted_index_read_milliseconds
    • fuse_block_inverted_index_search_milliseconds
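
As a rough sketch of the file-existence check and the write metrics described above (again, not the code in this PR): `IndexWriteMetrics` and `refresh_index_if_missing` are hypothetical names, a `HashSet` of paths stands in for the storage listing, and the struct fields only loosely mirror the `write_nums`, `write_bytes`, and `write_milliseconds` metrics.

```rust
use std::collections::HashSet;
use std::time::Instant;

/// Hypothetical counters loosely mirroring the write metrics listed above.
#[derive(Default, Debug)]
struct IndexWriteMetrics {
    write_nums: u64,
    write_bytes: u64,
    write_milliseconds: u128,
}

/// Writes the index file only if it does not exist yet.
/// Returns true when a file was written, false when it was skipped.
fn refresh_index_if_missing(
    existing: &mut HashSet<String>, // stand-in for "files present in storage"
    index_path: &str,
    index_bytes: &[u8],
    metrics: &mut IndexWriteMetrics,
) -> bool {
    // The decision depends on the file itself, not on snapshot metadata.
    if existing.contains(index_path) {
        return false;
    }

    let start = Instant::now();
    existing.insert(index_path.to_string()); // simulated write
    metrics.write_nums += 1;
    metrics.write_bytes += index_bytes.len() as u64;
    metrics.write_milliseconds += start.elapsed().as_millis();
    true
}
```

Calling the function twice with the same path writes once and skips the second time, which makes a refresh safe to repeat.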

part of #14825

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):


@github-actions github-actions bot added the pr-refactor this PR changes the code base without new features or bugfix label Apr 30, 2024
@b41sh b41sh marked this pull request as ready for review April 30, 2024 10:52
@sundy-li sundy-li added this pull request to the merge queue Apr 30, 2024
@BohuTANG BohuTANG removed this pull request from the merge queue due to a manual request Apr 30, 2024
@BohuTANG BohuTANG merged commit 2783378 into datafuselabs:main Apr 30, 2024
72 checks passed
Labels
pr-refactor this PR changes the code base without new features or bugfix
4 participants