Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure time_to_convert bins stay uniform #5316

Merged
merged 2 commits into from
Jul 23, 2021
Merged

Ensure time_to_convert bins stay uniform #5316

merged 2 commits into from
Jul 23, 2021

Conversation

neilkakkar
Copy link
Collaborator

Changes

https://posthog.slack.com/archives/C01G8S5T08Z/p1626980526060300?thread_ts=1626980124.058400&cid=C01G8S5T08Z - sometimes, the bin sizes can stretch out unreasonably if the time to convert values are incorrect (similar to #5116 ).

This ensures we discard those values, and stick to the "nice" buckets, which ensures the frontend doesn't look like: https://posthog.slack.com/files/U01SAS8J92Q/F029M7CN7LY/screen_cast_2021-07-22_at_2.43.10_pm.gif

Checklist

  • All querysets/queries filter by Organization, by Team, and by User
  • Django backend tests
  • Jest frontend tests
  • Cypress end-to-end tests
  • Migrations are safe to run at scale (e.g. PostHog Cloud) – present proof if not obvious
  • New/changed UI is decent on smartphones (viewport width around 360px)

@macobo
Copy link
Contributor

macobo commented Jul 23, 2021

Me, with 0% context on the query:

Full outer joins and right outer joins are semantically really different. What is on the LEFT, what is the historic context for FULL OUTER JOIN, why is it incorrect and could it be captured in a test?

@neilkakkar
Copy link
Collaborator Author

The way this query is constructed, right & full outer join should be equivalent: On the left, you have bin start times & counts, on the right you have all possible bin start times. (counts are filled to 0).

(left outer join is not equivalent because some bins might be empty on the left, since 0 count, which aren't captured, and we aim to return a uniform bucket list)

With full outer join, since we union left and right, some discrepancies in left might come, like the conversion times being outside of our calculated range (because of the sporadic weirdness in conversion times - step_runs CTE being run again here). This isn't possible on the right, since it's counting all the bins from to with the same pre-calculated range.

Hard to capture in a test (intermittent. I haven't seen it happen except in the slack example), but since all 6 existing tests pass, I can sortof convince you it's equivalent.

@EDsCODE EDsCODE merged commit 259d12d into master Jul 23, 2021
@EDsCODE EDsCODE deleted the timetoconvert_bins branch July 23, 2021 14:40
neilkakkar added a commit that referenced this pull request Jul 29, 2021
macobo pushed a commit that referenced this pull request Jul 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants