New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize segment SQL when segment subqueries are used #21016
Conversation
05cd479
to
25cbc21
Compare
…s internally instead of flat structure (this is to make re-ordering easier)
…hey are within the same OR sequence or are alone within the same AND sequence
25cbc21
to
a349dd2
Compare
…s so they are still supported
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
I've looked through the change and I can't confidently give it an approval on its merits, I could only do it on the fact the tests are passing, but the tests were adjusted as well, so I would need more time or defer to someone else. |
@sgiehl Now the urlencoding issue is resolved, could you take another look at reviewing this PR when you get a chance? Thanks! 🙂 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general the code looks fine to me. But as mentioned before it's really hard to ensure if it might introduce any possible regressions regarding the resulting data.
To minimize the risk I did some more local testing by creating a bunch of fake visits for a certain day. Then I created various different segments and added some logging to receive the amount of rows inserted in the temporary segment table. Then I ran archiving once without the config flag and once with it and compared the numbers. For my local tests the numbers were the same. We nevertheless could so some similar testing with bigger data sets or maybe really compare the resulting visit ids to ensure there are no differences.
This issue is in "needs review" but there has been no activity for 7 days. ping @matomo-org/core-reviewers |
…tree (#21669) Co-authored-by: Ben Burgess <88810029+bx80@users.noreply.github.com>
I've reverted the feature flag change and merged in Michal's improved parsing code. If somebody wants to give this a last quick check over then it should be ready to merge into 5.x-dev 🏁 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ben and Stefan tested this extensively and we have done some more tests in the cloud environment without any obvious issues, so I guess that's as much as we can do at this stage.
* fix build for matomo-org/matomo#21016 * improve test --------- Co-authored-by: diosmosis <diosmosis@users.noreply.github.com> Co-authored-by: Ben <ben.burgess@innocraft.com> Co-authored-by: Michal Kleiner <michal@innocraft.com>
Description:
Fixes #20467
Changes:
Review