branch-4.1: [Improvement](function) support window funnel v2 #61566#61935
Open
github-actions[bot] wants to merge 1 commit into branch-4.1
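Documentation update: apache/doris-website#3506

```sql
select count(user_id) from (
    SELECT
        user_id,
        WINDOW_FUNNEL_v1(
            1800,
            'default',
            event_time,
            event_type = 'view',
            event_type = 'add_to_cart',
            event_type = 'purchase'
        ) AS funnel_step
    FROM user_events
    GROUP BY user_id
) t
where funnel_step = 1;
-- 1.57 sec -> 0.12 sec
```

<hr>
<h3>Semantic differences between V2 and V1</h3>
<p>V2 behaves identically to V1 in the DEFAULT, INCREASE, and DEDUPLICATION modes. Only the FIXED mode has an intentional semantic change:</p>
<p><strong>FIXED mode semantic change</strong></p>
<ul>
<li><strong>V1 semantics (physically adjacent rows)</strong>: no row that fails to match a condition may sit between two adjacent matched events in the funnel chain; otherwise the chain breaks.</li>
<li><strong>V2 semantics (consecutive event levels)</strong>: the chain stays intact as long as the matched event levels increase consecutively (level 1→2→3→4). Rows that match no condition are never stored in V2, so they cannot affect chain continuity.</li>
</ul>
<p><strong>Example</strong>: user 100123 has the event sequence <code>login(10:01) → visit(10:02) → login2(10:03) → order(10:04) → payment(10:10)</code>, and the funnel conditions are <code>login → visit → order → payment</code>. Here <code>login2</code> matches none of the funnel conditions:</p>
<ul>
<li>V1 result: 2 (<code>login2</code> breaks physical row adjacency, so the chain ends after <code>visit</code>)</li>
<li>V2 result: 4 (<code>login2</code> does not take part in matching, the funnel levels 1→2→3→4 increase consecutively, and the chain is complete)</li>
</ul>
<p>The V2 FIXED semantics are closer to business intuition: users care about whether the funnel steps were completed consecutively, not about whether unrelated rows were inserted in between. The V1 physical-adjacency check depends on whether unrelated rows happen to be present in the data, which easily produces unexpected results in real business scenarios.</p>
<p><strong>Affected test files</strong>:</p>

| File | Change | Reason |
| -- | -- | -- |
| regression-test/data/nereids_p0/aggregate/window_funnel.out | window_funnel_fixed1: user 100123 changes from 2 to 4 | FIXED mode semantic change |
| regression-test/data/nereids_p0/sql_functions/aggregate_functions/test_aggregate_window_functions.out | agg_window_window_funnel: user 100123 changes from 2 to 4 (×5 rows) | Same as above (window function form, one output row per partition row) |

This pull request introduces a new implementation of the `window_funnel` aggregate function, called `window_funnel_v2`, which is designed to be more memory efficient by only storing matched events as (timestamp, event_index) pairs. The changes also rename the original implementation to `window_funnel_v1` and update function registrations and aliases accordingly. Additionally, the front-end and test files are updated to support and validate the new implementation.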
The most important changes are:

### Backend (BE) Function Implementation and Registration

* Added a new file `aggregate_function_window_funnel_v2.cpp` implementing `window_funnel_v2`, which only supports DateTime types for the window argument and is registered with the factory. [[1]](diffhunk://#diff-a4382e322c2d2744d8c9078201207dac43afab8cca4cf0ee27c430057ba4f3f7R1-R53) [[2]](diffhunk://#diff-d14e703e022713963a2ea0aa14bb71f0fb4ccbf8ada913c71347e97e74cfddb8R59) [[3]](diffhunk://#diff-d14e703e022713963a2ea0aa14bb71f0fb4ccbf8ada913c71347e97e74cfddb8R113)
* Renamed the original `window_funnel` function to `window_funnel_v1` in the registration, and set up an alias so that `window_funnel` now points to `window_funnel_v1` for backward compatibility.

### Frontend (FE) Function Classes and Registration

* Added a new class `WindowFunnelV2` in the FE codebase, with signature checks and visitor support, to represent the new aggregate function. [[1]](diffhunk://#diff-9a98756874c78819dbec3dd1b58e35ca9812fd97b52fd62063776d80a6d8a5a7R1-R133) [[2]](diffhunk://#diff-df12d42e7cf55119d84a24ac5dc8e7b8207e18e88ea0776f8176593facf5dd76R98) [[3]](diffhunk://#diff-0a9452ef18cf7215271cce4f886ed4759322441a133d15bcbbf57aac61b7652fR96) [[4]](diffhunk://#diff-0a9452ef18cf7215271cce4f886ed4759322441a133d15bcbbf57aac61b7652fR408-R411)
* Updated the function registration so that `window_funnel` now points to `WindowFunnelV2`, and the old implementation is available as `window_funnel_v1`. [[1]](diffhunk://#diff-df12d42e7cf55119d84a24ac5dc8e7b8207e18e88ea0776f8176593facf5dd76L196-R198) [[2]](diffhunk://#diff-1311d9eafa9e1244a161e21a24d97af049dcd05b08be9a3bc655805ab1780a8eL68-R68) [[3]](diffhunk://#diff-bf1978d57acdd865d169ed2b9c0b67c932698cf9313e5947078f11fb8d151335R58)

### Test and Output Updates

* Added a new regression test output file for `window_funnel_v2` with various test cases.
* Updated existing test output files to reflect the new behavior and results of the `window_funnel` function, which now uses the v2 implementation by default. [[1]](diffhunk://#diff-ec7a35e4876ce5b3093a11196b969d389806bfa0bfc7570a3d9163c3f45adcdeL120-R120) [[2]](diffhunk://#diff-6cdd8c456c7c16a3e70306901d104420d68a89d90960f931e81e088db6c805adL810-R814)

### Miscellaneous

* Made `string_to_window_funnel_mode` inline for potential performance improvement.
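
To make the FIXED-mode difference described above concrete, here is a minimal C++ sketch. It is not the code from `aggregate_function_window_funnel_v2.cpp`; the `Row` type, the helper names, the "try every level-1 event as a chain start" strategy, and the one-hour window are all assumptions made for illustration. Rows are assumed to be sorted by timestamp, and `level` is the 1-based index of the funnel condition a row matches (0 if it matches none).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical input row: timestamp in seconds plus the 1-based funnel
// condition it matches (0 means the row matches no condition).
struct Row {
    int64_t ts;
    int level;
};

// Longest chain 1 -> 2 -> ... over `events`. Any entry that is not the next
// expected level, or that falls outside the window, ends the chain.
// Simplification (assumption): every level-1 entry is tried as a chain start.
static int longest_fixed_chain(const std::vector<std::pair<int64_t, int>>& events,
                               int64_t window) {
    assert(window >= 0);
    int best = 0;
    for (std::size_t i = 0; i < events.size(); ++i) {
        if (events[i].second != 1) continue;  // a chain must start at level 1
        int reached = 1;
        for (std::size_t j = i + 1; j < events.size(); ++j) {
            const bool in_window = events[j].first - events[i].first <= window;
            if (!in_window || events[j].second != reached + 1) break;
            ++reached;
        }
        best = std::max(best, reached);
    }
    return best;
}

// V1-like behavior: every row takes part in the scan, so a row that matches
// no condition (level 0) sits between matched events and breaks the chain.
int fixed_v1_like(const std::vector<Row>& rows, int64_t window) {
    std::vector<std::pair<int64_t, int>> all;
    for (const Row& r : rows) all.emplace_back(r.ts, r.level);
    return longest_fixed_chain(all, window);
}

// V2-like behavior: only matched rows are kept, as (timestamp, event_index)
// pairs, so unmatched rows cannot influence chain continuity.
int fixed_v2_like(const std::vector<Row>& rows, int64_t window) {
    std::vector<std::pair<int64_t, int>> matched;
    for (const Row& r : rows) {
        if (r.level > 0) matched.emplace_back(r.ts, r.level);
    }
    return longest_fixed_chain(matched, window);
}
```

Encoding the user 100123 sequence as `{{60, 1}, {120, 2}, {180, 0}, {240, 3}, {600, 4}}` (seconds after 10:00, with `login2` as level 0) and using an assumed window of 3600 seconds, `fixed_v1_like` returns 2 and `fixed_v2_like` returns 4, which matches the results quoted in the description.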
Contributor
run buildall
Contributor
FE UT Coverage Report
Increment line coverage
yiguolei
approved these changes
Mar 31, 2026
Contributor
Author
PR approved by at least one committer and no changes requested.
Contributor
Author
PR approved by anyone and no changes requested.
Contributor
skip buildall
Cherry-picked from #61566