Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable person breakdowns querying for all ordering funnels #5043

Merged
merged 15 commits into from
Jul 12, 2021

Conversation

neilkakkar
Copy link
Collaborator

Changes

Resolves #5030

Checklist

  • All querysets/queries filter by Organization, by Team, and by User
  • Django backend tests
  • Jest frontend tests
  • Cypress end-to-end tests
  • Migrations are safe to run at scale (e.g. PostHog Cloud) – present proof if not obvious
  • New/changed UI is decent on smartphones (viewport width around 360px)

timestamp="2020-01-01T13:00:00Z",
)
_create_event(
team=self.team,
event="buy",
distinct_id=f"person_{num}_{i}",
properties={"key": "val", "some_breakdown_val": f"{num}"},
properties={"key": "val", "some_breakdown_val": num},
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This enables testing for both, numeric and string values. Did a quick search on Metabase, and not everyone sends strings always

@timgl timgl temporarily deployed to posthog-pr-5043 July 8, 2021 14:31 Inactive
Base automatically changed from refactor to master July 8, 2021 17:25
@timgl timgl temporarily deployed to posthog-pr-5043 July 9, 2021 10:32 Inactive
@timgl timgl temporarily deployed to posthog-pr-5043 July 9, 2021 11:53 Inactive
@timgl timgl temporarily deployed to posthog-pr-5043 July 9, 2021 12:01 Inactive
@neilkakkar neilkakkar marked this pull request as ready for review July 9, 2021 12:02
@neilkakkar neilkakkar requested review from Twixes and EDsCODE July 9, 2021 12:02
@neilkakkar neilkakkar changed the title Enable person breakdowns querying for regular + strict funnels Enable person breakdowns querying for all ordering funnels Jul 9, 2021
@timgl timgl temporarily deployed to posthog-pr-5043 July 9, 2021 12:23 Inactive
Copy link
Member

@EDsCODE EDsCODE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you expecting the breakdown to contain double quotes now or did you handle it somewhere that I'm missing?

@@ -66,7 +64,8 @@ def _format_single_funnel(self, result, with_breakdown=False):
serialized_result.update({"average_conversion_time": None})

if with_breakdown:
serialized_result.update({"breakdown": result[-1][1:-1]}) # strip quotes
serialized_result.update({"breakdown": result[-1]})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, do you not handle the double quotes now when the breakdown val is a string?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't handle them (so the tests have double quotes now). The reason being: with JSONExtractRaw used everywhere, the keys on which we filter props have double quotes when they're strings, and they are strings when they're input as numbers. I attempted playing with this extraction such that we get no quotes in our keys, but that turned out to be very annoying when you don't know your data type apriori. (I have a test for this now, where the values of a property are both strings and integers)

Us removing double quotes ourselves has implications: the front-end has to guess whether the key had double quotes or not when they try to get the persons for this breakdown, which makes things hard.

Hence, I propose: we don't modify the keys at all. If we want to change the presentation, we let the front-end strip the quotes, if they exist. But, when asking for persons for a breakdown, we expect the front-end to send the specific key which we sent ourselves, sans modification.

I think this approach makes things cleaner.

Thoughts? Also cc: @liyiy @samwinslow (or whoever is going to implement the frontend bit 😅 )

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To give a concrete example of JSONExtractRaw:

SELECT JSONExtractRaw('{"a": "hello", "b": [-100, 200.0, 300], "c": "chrome", "d": 12}', 'c');

returns the string "Chrome"

SELECT JSONExtractRaw('{"a": "hello", "b": [-100, 200.0, 300], "c": "chrome", "d": 12}', 'd');

returns the string 12.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we expect the front-end to send the specific key which we sent ourselves, sans modification.

Do you mean with or without quotes here then? I think sending with quotes would start getting confusing especially since it's inconsistent with the rest of our app

Otherwise, I would agree as long as it's handled somewhere and stays consistent with how we're displaying the breakdown values elsewhere

Copy link
Collaborator Author

@neilkakkar neilkakkar Jul 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I'm not talking about how we display the values at all, just about how we should pass them to the front-end.

To display the problem with stripping quotes more concretely:

SELECT JSONExtractRaw('{"a": "1", "b": 1}', 'a'); -- returns the string "1"
SELECT JSONExtractRaw('{"a": "1", "b": 1}', 'b'); -- returns the string 1

In effect, we can't arbitrarily strip quotes, because it changes the key.

I'm all for the front-end removing the quotes, if they exist, to display breakdown values without quotes, but I don't think we can do this in the backend, since it changes our filter keys for props :$ '"1"' != '1'

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AH! I see, there's the property filtering, which removes this from JSONExtractRaw: https://github.com/PostHog/posthog/blob/master/ee/clickhouse/models/property.py#L89

I guess we just need to follow this everywhere for breakdowns as well, and we should be good to go (we don't yet). I'll fix this like so instead then!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SELECT trim(BOTH '\"' FROM JSONExtractRaw('{"a": "1", "b": 1}', 'a'));
is same as
SELECT trim(BOTH '\"' FROM JSONExtractRaw('{"a": "1", "b": 1}', 'b'));

So we (will be) good.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense to me. I mean, we could actually convert to the appropriate type in the backend really, but in a smarter way that woud result in '"1"''1' (string), '1'1 (number).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually tried that with JSONExtract('{"a": "1", "b": 1}', 'a', <type>) - but it didn't let me infer the types using type inference as well :$ - so annoying.

I.e. JSONExtract('{"a": "1", "b": 1}', 'a', JSONType('{"a": "1", "b": 1}', 'a')) doesn't work 😢

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this'd have to happen in Django at earliest

@timgl timgl temporarily deployed to posthog-pr-5043 July 12, 2021 14:43 Inactive
Copy link
Member

@EDsCODE EDsCODE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Collaborator

@Twixes Twixes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good to me but 55dde4c broke tests

@timgl timgl temporarily deployed to posthog-pr-5043 July 12, 2021 16:28 Inactive
@neilkakkar
Copy link
Collaborator Author

neilkakkar commented Jul 12, 2021

I resolved this for trends + funnels both. Some existing tests were reliant on internal sorting, so I've added an explicit sort on top of this to ensure there's no test failures here.

I don't think the order in the list means anything with breakdowns, we don't sort it based on anything, but let me know if the tests were testing for something I'm not aware of.

You can see these specific changes here: b74ba00

@Twixes Twixes merged commit c554bb5 into master Jul 12, 2021
@Twixes Twixes deleted the persons_and_breakdowns branch July 12, 2021 16:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support breakdowns in strict and unordered funnels
4 participants