Speed up sessions list query #2934
Note clickhouse behavior is unchanged.
In addition to offset, postgres now returns a dict containing person_id and timestamp, which is used to make sure we filter events on different pages correctly.
Since we're ordering by end_time we know events before last end_time are all processed.
This gets used by the view
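A minimal sketch of how pagination state like this could work, under stated assumptions: the function name `paginate_sessions`, the exact dict layout, and the use of person_id as a tie-breaker are all hypothetical and not taken from the PR's code. It assumes sessions are ordered by end_time, so anything at or before the last returned (end_time, person_id) pair can be treated as already processed.

```python
from typing import Any, Dict, List, Optional, Tuple

Session = Dict[str, Any]
Pagination = Dict[str, Any]


def paginate_sessions(
    sessions: List[Session], limit: int, pagination: Optional[Pagination] = None
) -> Tuple[List[Session], Optional[Pagination]]:
    """Return one page of sessions plus the state needed for the next page.

    Assumes limit >= 1 and that each session has "end_time" and "person_id".
    """
    # Order by end_time; person_id only breaks ties deterministically.
    ordered = sorted(sessions, key=lambda s: (s["end_time"], s["person_id"]))

    if pagination is not None:
        last_seen = (pagination["timestamp"], pagination["person_id"])
        # Everything at or before the last returned pair is already processed.
        ordered = [s for s in ordered if (s["end_time"], s["person_id"]) > last_seen]

    page = ordered[:limit]
    if len(ordered) <= limit:
        return page, None  # nothing left after this page

    last = page[-1]
    offset = (pagination["offset"] if pagination else 0) + len(page)
    return page, {
        "offset": offset,
        "timestamp": last["end_time"],
        "person_id": last["person_id"],
    }
```

The caller hands the returned dict back unchanged to get the next page; a `None` state means there are no more pages.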
Need to draft this for a bit - there are some bugs in here I'm finding.
b9a8c4d to 880c153
880c153 to 8c215d8
590f678 to 0ccf718
initial code comments. will manually QA in a bit
const [page, setPage] = useState(1)
const [pageSize, setPageSize] = useState(50)

async function loadSessionEvents(): Promise<void> {
i think we're trying to keep most data logic in kea logics if possible. even if this data is scoped here only, it'll be more modular if there's an accompanying logic instead of local state
In this case the data was only used locally and the kea logic would have ended up quite a bit more convoluted.
But moving this out will become necessary soon, as we will want to show highlighted events in session recording. Ok if I punt on this until then?
yup just a heads up not strict
return [session for i, session in enumerate(sessions) if session["distinct_id"] in person_ids]

@action(methods=["GET"], detail=False)
def session_events(self, request: request.Request, *args: Any, **kwargs: Any) -> Response:
does clickhouse also use this?
There's a bit of feature drift here.
For clickhouse, the sessions query still returns events and the frontend knows to use those. For postgres, events are loaded separately (for perf reasons) -> separate query is needed.
Not sure what the correct approach here is, I don't think it's worth refactoring clickhouse query here right now. Thoughts?
mm yeah I figured it was along these lines. I think if the frontend can handle it then it's no problem. This will just be part of the consideration in consolidating our offering into clickhouse only. I think the duality will eventually be a major hindrance in moving quickly
QA looks good!
Relevant issue: #2739
This PR rewrites how self-hosted postgres instances calculate sessions.
This query was slow because LAG was causing datasets to get re-sorted multiple times on postgres disk.
The solution is sort of silly:
One consequence of this is that pagination logic got quite a bit more complex for sessions. We're now passing opaque data blobs back and forth between BE and FE for this purpose.
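To illustrate what passing an opaque blob between backend and frontend might look like, here is a hedged Python sketch. The helper names `encode_pagination` and `decode_pagination` and the base64-wrapped JSON encoding are assumptions for the example; the PR may well pass the state in another form. The point is only that the frontend stores the value and returns it untouched with the next request.

```python
import base64
import json
from typing import Any, Dict, Optional


def encode_pagination(state: Optional[Dict[str, Any]]) -> Optional[str]:
    """Serialize backend pagination state into a string the frontend never inspects."""
    if state is None:
        return None  # no further pages
    raw = json.dumps(state, sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_pagination(blob: Optional[str]) -> Optional[Dict[str, Any]]:
    """Recover the pagination state from a blob sent back by the frontend."""
    if not blob:
        return None  # first page: no state yet
    return json.loads(base64.urlsafe_b64decode(blob.encode("ascii")))
```

Keeping the blob opaque means the backend can change what it stores (for example, adding a field next to offset) without any frontend changes.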
Behavioral changes: