
branch-4.0: [fix](partition_prune) Move the pruning of predicates that are always true after partition pruning into the PlanPostProcessor #63111 #63466

Closed
feiniaofeiafei wants to merge 1513 commits into apache:master from feiniaofeiafei:pick_prune_predicate_mv_4.0

Conversation

@feiniaofeiafei
Contributor

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

airborne12 and others added 30 commits March 4, 2026 20:06
… bug fixes (apache#61028)

### What problem does this PR solve?

Squashed backport of all search() function improvements and bug fixes
from master to branch-4.0.

This PR combines the following master PRs into a single backport:

| Master PR | Type | Description |
|-----------|------|-------------|
| apache#59747 | fix | Make AND/OR/NOT operators case-sensitive in search DSL |
| apache#60654 | refactor | Refactor SearchDslParser to single-phase ANTLR parsing and fix ES compatibility issues |
| apache#60782 | fix | Upgrade query type for variant subcolumns with analyzer-based indexes |
| apache#60784 | fix | Fix MATCH_ALL_DOCS query failing in multi-field search mode |
| apache#60786 | feat | Support field-grouped query syntax field:(term1 OR term2) |
| apache#60790 | fix | Add searcher cache reuse and DSL result cache for search() function |
| apache#60793 | fix | Fix wildcard query on variant subcolumns returning empty results |
| apache#60798 | fix | Use FE-provided analyzer key for multi-index columns in search() |
| apache#60814 | fix | Fix implicit conjunction incorrectly modifying preceding term in lucene mode |
| apache#60834 | test | Add regression test for wildcard query on variant subcolumns with multi-index |
| apache#60873 | fix | Fix MATCH_ALL_DOCS losing occur attribute in multi-field expansion |
| apache#60891 | fix | Inject MATCH_ALL_DOCS for multi-MUST_NOT queries in lucene mode |

### Release note

Backport search() function improvements including DSL parser
refactoring, multi-field search fixes, variant subcolumn support, query
caching, and field-grouped query syntax.

### Check List (For Author)

- Test
    - [x] Regression test
    - [x] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [x] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason

- Behavior changed:
    - [ ] No.
    - [x] Yes. New search() function features and bug fixes backported from master.

- Does this need documentation?
    - [x] No.
    - [ ] Yes.

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label
…efault is eventual consistency apache#60114 (apache#61032)

Cherry-picked from apache#60114

Co-authored-by: 924060929 <lanhuajian@selectdb.com>
… update on eos apache#60615 (apache#61039)

Cherry-picked from apache#60615

Co-authored-by: Pxl <xl@selectdb.com>
 (apache#61033)

Cherry-picked from apache#60288

Co-authored-by: minghong <zhouminghong@selectdb.com>
…pache#60947 (apache#61008)

Cherry-picked from apache#60947

Co-authored-by: Socrates <suyiteng@selectdb.com>
Cherry-picked from apache#61035

Co-authored-by: yujun <yujun@selectdb.com>
…acing temp partition in cloud mode apache#60888 (apache#61093)

Cherry-picked from apache#60888

Co-authored-by: hui lai <laihui@selectdb.com>
… result with MySQL driver 9.5.0 apache#61050 (apache#61062)

Cherry-picked from apache#61050

Co-authored-by: Mingyu Chen (Rayner) <morningman@163.com>
… slots apache#61029 (apache#61068)

### What problem does this PR solve?
pick apache#61029
… causing std::bad_function_call crash apache#61077 (apache#61098)

Cherry-picked from apache#61077

Co-authored-by: zzzxl <yangsiyu@selectdb.com>
…GEINT. apache#61081 (apache#61105)

Cherry-picked from apache#61081

Co-authored-by: Mryange <yanxuecheng@selectdb.com>
…maScanNode apache#61086 (apache#61106)

Cherry-picked from apache#61086

Co-authored-by: hoshinojyunn <85017683+hoshinojyunn@users.noreply.github.com>
…unningTime apache#61042 (apache#61101)

Cherry-picked from apache#61042

Co-authored-by: HappenLee <happenlee@selectdb.com>
… two-level table+snapshot structure apache#60478 (apache#60741)

- Cherry-picked from apache#60478
- Keep branch-4.0 compatibility in PaimonExternalCatalog without
introducing apache#58894
…FE restart after swap table apache#61046 (apache#61128)

Cherry-picked from apache#61046

Co-authored-by: hui lai <laihui@selectdb.com>
…nt configuration apache#60906 (apache#61159)

Cherry-picked from apache#60906

Co-authored-by: Yixuan Wang <wangyixuan@selectdb.com>
…ogs (apache#61132)

This is a cherry-pick of apache#60738 to branch-4.0.

The new parameter only takes effect in metadata_failure_recovery mode.

Problem Summary:
When using metadata failure recovery mode, sometimes the journal logs
need to be truncated to a specific journal ID to recover from metadata
corruption. This PR adds a new parameter `--recovery_journal_id` to
allow users to specify the target journal ID, and all journals with IDs
greater than this value will be removed.

### How to use

When the Frontend (FE) metadata is corrupted and needs to be recovered,
follow these steps:

1. Stop the FE process
2. Start FE with both `--metadata_failure_recovery` and
`--recovery_journal_id` parameters:

```bash
./start_fe.sh --metadata_failure_recovery --recovery_journal_id <journal_id>
```

For example, to recover to journal ID 12345:
```bash
./start_fe.sh --metadata_failure_recovery --recovery_journal_id 12345
```

3. The system will:
   - Remove all journal databases with IDs greater than the specified journal ID
   - Truncate the target journal database by deleting keys with IDs greater than the specified value
   - Start FE in metadata failure recovery mode

**Note**: This operation permanently deletes metadata journals. Make
sure to back up your data before using this feature.
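
The truncation rule in step 3 can be sketched as a toy model. This is not the FE implementation: `truncate_journals` is a hypothetical helper, and the sketch assumes each journal database is identified by the first journal ID it holds.

```python
def truncate_journals(journal_dbs, recovery_journal_id):
    """Model of the truncation rule described above.

    journal_dbs maps a database ID (assumed here to be the first journal ID
    stored in that database) to the list of journal IDs it contains.
    """
    kept = {}
    for db_id, journal_ids in sorted(journal_dbs.items()):
        if db_id > recovery_journal_id:
            # The whole database is newer than the target: remove it.
            continue
        # Target (or older) database: delete only the keys past the target.
        kept[db_id] = [j for j in journal_ids if j <= recovery_journal_id]
    return kept


dbs = {1: list(range(1, 101)), 101: list(range(101, 201)), 201: list(range(201, 251))}
recovered = truncate_journals(dbs, 150)
```

In this model, recovering to journal ID 150 keeps database 1 untouched, trims database 101 to IDs 101 through 150, and drops database 201 entirely.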
…le had been updated. apache#61112 (apache#61137)

Cherry-picked from apache#61112

Co-authored-by: daidai <changyuwei@selectdb.com>
### What problem does this PR solve?
pick apache#61057

…rpret_cast overflow (apache#61150)

## Proposed changes

Cherry-pick of apache#61120 to branch-4.0.

On ARM64, std::string is 24 bytes but StringRef is 16 bytes. Several
places pass StringRef* through void* and then reinterpret_cast to
std::string*, reading 8 bytes beyond the buffer.

1. **function_multi_match.cpp**: Convert StringRef to std::string before
passing as query_value. Downstream FullTextIndexReader::query()
reinterpret_casts query_value as std::string* (24 bytes on ARM64), but
StringRef is only 16 bytes, causing stack-buffer-overflow.

2. **in_list_predicate.h**: Fix 3 sites where HybridSet iterator returns
StringRef* via get_value(), but code treats it as std::string*. Add `if
constexpr (is_string_type(Type))` guard to construct std::string from
StringRef data/size before use.
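
The size mismatch can be illustrated with a small model outside of C++. This is only a sketch: it uses Python's `struct` module with assumed 8-byte fields to stand in for the 16-byte StringRef and the 24-byte std::string object, and it models the data pointer as an offset into a byte buffer. The names `reinterpret_ok` and `to_owned_string` are hypothetical.

```python
import struct

STRING_REF = struct.Struct("<QQ")    # (data pointer, size): 16 bytes
STD_STRING = struct.Struct("<QQQ")   # stand-in for a 24-byte std::string object


def reinterpret_ok(buf: bytes) -> bool:
    """Return True only if buf is large enough to be read as the 24-byte layout."""
    try:
        STD_STRING.unpack(buf)
        return True
    except struct.error:
        # Python refuses the short buffer; C++ would read 8 bytes past it.
        return False


def to_owned_string(memory: bytes, ref: bytes) -> bytes:
    """The safe path: read data/size from the StringRef and copy the bytes out."""
    offset, size = STRING_REF.unpack(ref)
    return memory[offset:offset + size]


memory = b"hello world"
ref = STRING_REF.pack(6, 5)          # refers to "world"
```

Python rejects the undersized read, whereas the C++ reinterpret_cast silently reads past the buffer. Building an owned string from data/size first, as the fix does, avoids the problem.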

Cherry-pick applied cleanly with no conflicts.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…rties and cover Paimon HMS+OSS case (apache#61155)

…

PaimonSysTableJniScanner and related JNI paths currently only consume hadoopProps. When a catalog uses the new HMS kerberos parameters:

- hive.metastore.authentication.type
- hive.metastore.client.principal
- hive.metastore.client.keytab

but does not explicitly configure HDFS kerberos parameters, JNI cannot recognize the kerberos identity from getHadoopProperties().

This is especially visible in the HMS kerberos + OSS storage scenario: HMS access should use kerberos, while storage access should still use OSS credentials.

## What Changed

### 1. Temporary compatibility in CatalogProperty#getHadoopProperties()

When catalog properties contain:

- hive.metastore.authentication.type=kerberos
- hive.metastore.client.principal
- hive.metastore.client.keytab

getHadoopProperties() now injects canonical Hadoop kerberos keys:

- hadoop.security.authentication=kerberos
- hadoop.kerberos.principal=<hive.metastore.client.principal>
- hadoop.kerberos.keytab=<hive.metastore.client.keytab>

If configured, it also passes through:

- hadoop.security.auth_to_local

This is a short-term compatibility fix so JNI paths can reuse HMS kerberos identity even when only HMS-side kerberos properties are provided.
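
The key mapping described above can be sketched as follows. This is an illustration, not the Java code in CatalogProperty: `derive_hadoop_kerberos_props` is a hypothetical name, and the sketch returns only the injected keys without modelling how they are merged with properties the user already set.

```python
def derive_hadoop_kerberos_props(catalog_props: dict) -> dict:
    """Return the Hadoop kerberos keys implied by HMS-side kerberos settings."""
    principal = catalog_props.get("hive.metastore.client.principal")
    keytab = catalog_props.get("hive.metastore.client.keytab")
    auth_type = catalog_props.get("hive.metastore.authentication.type")

    injected = {}
    # All three HMS kerberos properties must be present for the mapping to apply.
    if auth_type == "kerberos" and principal and keytab:
        injected["hadoop.security.authentication"] = "kerberos"
        injected["hadoop.kerberos.principal"] = principal
        injected["hadoop.kerberos.keytab"] = keytab
        # Optional pass-through, only when the catalog configures it.
        if "hadoop.security.auth_to_local" in catalog_props:
            injected["hadoop.security.auth_to_local"] = catalog_props[
                "hadoop.security.auth_to_local"
            ]
    return injected


hms_only = {
    "hive.metastore.authentication.type": "kerberos",
    "hive.metastore.client.principal": "hive/_HOST@EXAMPLE.COM",
    "hive.metastore.client.keytab": "/etc/security/hive.keytab",
}
derived = derive_hadoop_kerberos_props(hms_only)
```

The principal and keytab values are placeholders. A catalog without the HMS kerberos properties yields an empty result in this sketch, so storage credentials such as OSS keys are left alone.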

… batch delete_rowset_data apache#60919 (apache#61161)

Cherry-picked from apache#60919

Co-authored-by: Xin Liao <liaoxin@selectdb.com>
…61162)

Cherry-picked from apache#61060

Co-authored-by: Calvin Kirs <guoqiang@selectdb.com>
…SELECT on catalog privilege checks apache#61147 (apache#61163)

Cherry-picked from apache#61147

Co-authored-by: Calvin Kirs <guoqiang@selectdb.com>
github-actions Bot and others added 17 commits May 12, 2026 19:21
…Gson replay apache#63094 (apache#63176)

Cherry-picked from apache#63094

Co-authored-by: hui lai <laihui@selectdb.com>
…er the first record is received apache#63141 (apache#63162)

Cherry-picked from apache#63141

Co-authored-by: wudi <wudi@selectdb.com>
…ersion range resolution failure apache#63278 (apache#63283)

Cherry-picked from apache#63278

Co-authored-by: Dongyang Li <lidongyang@selectdb.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…pache#63220 (apache#63236)

Cherry-picked from apache#63220

Co-authored-by: TengJianPing <tengjianping@selectdb.com>
… bypass apache#63208 (apache#63223)

Cherry-picked from apache#63208

Co-authored-by: morrySnow <zhangwenxin@selectdb.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…yType recursion apache#63201 (apache#63212)

Cherry-picked from apache#63201

Co-authored-by: morrySnow <zhangwenxin@selectdb.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ing sequentially reading key apache#62476 (apache#63122)

Cherry-picked from apache#62476

Co-authored-by: Yixuan Wang <wangyixuan@selectdb.com>
… after FE restart apache#62331 (apache#62545)

Cherry-picked from apache#62331

Co-authored-by: hui lai <laihui@selectdb.com>
…ers in catalog case apache#62313 (apache#62328)

Cherry-picked from apache#62313

Co-authored-by: Calvin Kirs <guoqiang@selectdb.com>
apache#62317)

Cherry-picked from apache#62274

Co-authored-by: Calvin Kirs <guoqiang@selectdb.com>
…loud schema change apache#62256 (apache#62310)

Cherry-picked from apache#62256

Co-authored-by: bobhan1 <baohan@selectdb.com>
…on_threshold default to 36 apache#61984 (apache#62002)

Cherry-picked from apache#61984

Co-authored-by: bobhan1 <baohan@selectdb.com>
…ng columns apache#62686 (apache#63302)

cherry-pick: apache#62686

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… true after partition pruning into the PlanPostProcessor (apache#63111)

Related PR: apache#57169

Problem Summary:
After partition pruning, predicates that evaluate to constant true are
removed as an optimization (introduced by apache#57169). However, this
introduced a bug in materialized view rewriting: once such a predicate
is removed, it is no longer compensated above the rewritten materialized
view. Because the materialized view itself is not partitioned, the
rewritten query returns incorrect results.

This PR fixes the issue by moving the removal of constant-true
predicates to the planPostProcessors phase. This ensures that predicate
removal does not interfere with materialized view rewriting, preserving
correctness while still retaining the optimization benefit.
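
The failure mode can be reproduced with a toy model. This is a sketch under assumptions, not Doris planner code: the table, partition names, cutoff date, and helper functions are all hypothetical, and the materialized view is modelled as an unpartitioned copy of the base rows.

```python
CUTOFF = "2024-02-01"  # query predicate: dt >= CUTOFF

# Partition name -> (lower bound of the partition, rows as (dt, value)).
PARTITIONS = {
    "p202401": ("2024-01-01", [("2024-01-15", 1)]),
    "p202402": ("2024-02-01", [("2024-02-10", 10)]),
    "p202403": ("2024-03-01", [("2024-03-05", 100)]),
}

# Unpartitioned materialized view holding every base row.
MV_ROWS = [row for _, rows in PARTITIONS.values() for row in rows]


def scan_base_pruned():
    """Scan only partitions where the predicate is true for every row."""
    return [r for lower, rows in PARTITIONS.values() if lower >= CUTOFF for r in rows]


def total(rows, keep_predicate):
    """Sum values, applying dt >= CUTOFF only when the predicate is kept."""
    return sum(v for dt, v in rows if not keep_predicate or dt >= CUTOFF)


# On the pruned base table the predicate is redundant, so dropping it is safe.
base_answer = total(scan_base_pruned(), keep_predicate=False)
# Dropping it before MV rewriting loses the filter on the unpartitioned MV.
buggy_mv_answer = total(MV_ROWS, keep_predicate=False)
# Keeping it until after rewriting lets it be compensated above the MV.
fixed_mv_answer = total(MV_ROWS, keep_predicate=True)
```

Here the early removal makes the MV-based plan include the January row, while deferring the removal to a post-processing step keeps both plans in agreement.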

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@github-actions
Contributor

Possible file(s) that should be tracked in LFS detected: 🚨

The following file(s) exceed the file size limit of 1048576 bytes, as set in the .yml configuration files:

  • be/dict/pinyin/polyphone.txt

Consider using git-lfs to manage large files.

@github-actions (bot) added the lfs-detected! label (warning label for use when LFS is detected in the commits of a pull request) on May 21, 2026
@feiniaofeiafei marked this pull request as draft on May 21, 2026 07:10