Navigation Menu

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

695-Optimize funnel rendering #792

Merged
merged 22 commits into from May 18, 2020
Merged

695-Optimize funnel rendering #792

merged 22 commits into from May 18, 2020

Conversation

fuziontech
Copy link
Member

Changes

  • Generate lateral view sql for generating funnel metrics
  • Refactored sort logic for top 100 people on each step

Checklist

  • [ X ] All querysets/queries filter by Team (if applicable)
  • [ ] Backend tests (if applicable) <- working on this next

@timgl timgl temporarily deployed to posthog-695-funnel2-lkhgh4rdqf May 16, 2020 20:21 Inactive
@timgl timgl temporarily deployed to posthog-695-funnel2-lkhgh4rdqf May 17, 2020 01:01 Inactive
@timgl timgl temporarily deployed to posthog-695-funnel2-lkhgh4rdqf May 17, 2020 01:02 Inactive
@timgl timgl temporarily deployed to posthog-695-funnel2-lkhgh4rdqf May 17, 2020 01:49 Inactive
@timgl timgl temporarily deployed to posthog-695-funnel2-lkhgh4rdqf May 17, 2020 01:56 Inactive
@fuziontech fuziontech requested a review from timgl May 17, 2020 02:02
Copy link
Collaborator

@timgl timgl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome. For me it brought down a representative query from 8 seconds to 700ms, so an order of magnitude better, so great work on that.

with connection.cursor() as cursor:
query_bodies = {step: cursor.mogrify(*q.query.sql_with_params()) for step, q in query_bodies.items()}
query = sql.SQL(QUERY_HEADER).format(
# step=steps[0],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be removed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

woops, yes it can.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

woops! yeah, definitely

.replace("'2000-01-01T00:00:00+00:00'::timestamptz", "{prev_step_ts}")
.replace('"posthog_event"."distinct_id"', '"pdi"."person_id"')
.replace('FROM "posthog_event"', 'FROM posthog_event JOIN posthog_persondistinctid pdi ON pdi.distinct_id = posthog_event.distinct_id'))
if i == 0:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be worth splitting out each of these if/elif statements into each own function with a descriptive function name, so it's more obvious what's happening here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been trying to split these out into functions, but I think it is more readable in this if/elif/else block, but that might just be my golang showing. How about I annotate this with comments and see if it's more readable? If it's still ambiguous I'll refactor into individual funcs.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to like smaller functions as a way of avoiding big blocks of logic, but I'm actually happy with these comments as they explain it clearly!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally me too, but this is so procedural in nature and depends on the surrounding context it was becoming a bowl of spaghetti when broken out into sep functions

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

i = 0
for step, qb in query_bodies.items():
q = sql.SQL(qb.decode('utf-8')
.replace("'1234321'", "{prev_step_person_id}")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason this couldn't happen in _gen_lateral_bodies? There might be, though if so might be good to add comment explaining that

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Moved to _gen_lateral_bodies

@timgl timgl temporarily deployed to posthog-695-funnel2-lkhgh4rdqf May 18, 2020 14:31 Inactive
@timgl timgl temporarily deployed to posthog-695-funnel2-lkhgh4rdqf May 18, 2020 14:32 Inactive
with connection.cursor() as cursor:
event_string = cursor.mogrify(*event.query.sql_with_params())
query = sql.SQL(event_string.decode('utf-8')
.replace("'1234321'", "{prev_step_person_id}")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little confused about why these have to be replaced at all? I assume it's because the Django ORM doesn't support {}? In that case maybe worth adding a comment explaining

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

.replace("'2000-01-01T00:00:00+00:00'::timestamptz", "{prev_step_ts}")
.replace('"posthog_event"."distinct_id"', '"pdi"."person_id"')
.replace('FROM "posthog_event"', 'FROM posthog_event JOIN posthog_persondistinctid pdi ON pdi.distinct_id = posthog_event.distinct_id'))
if i == 0:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to like smaller functions as a way of avoiding big blocks of logic, but I'm actually happy with these comments as they explain it clearly!

@timgl timgl temporarily deployed to posthog-695-funnel2-lkhgh4rdqf May 18, 2020 15:08 Inactive
@timgl timgl temporarily deployed to posthog-695-funnel2-lkhgh4rdqf May 18, 2020 15:31 Inactive
@timgl timgl temporarily deployed to posthog-695-funnel2-lkhgh4rdqf May 18, 2020 15:53 Inactive
@timgl timgl merged commit 8472a8c into master May 18, 2020
@timgl timgl deleted the 695-funnel2 branch May 18, 2020 16:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants