-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable person breakdowns querying for all ordering funnels #5043
Conversation
timestamp="2020-01-01T13:00:00Z", | ||
) | ||
_create_event( | ||
team=self.team, | ||
event="buy", | ||
distinct_id=f"person_{num}_{i}", | ||
properties={"key": "val", "some_breakdown_val": f"{num}"}, | ||
properties={"key": "val", "some_breakdown_val": num}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This enables testing for both, numeric and string values. Did a quick search on Metabase, and not everyone sends strings always
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you expecting the breakdown to contain double quotes now or did you handle it somewhere that I'm missing?
@@ -66,7 +64,8 @@ def _format_single_funnel(self, result, with_breakdown=False): | |||
serialized_result.update({"average_conversion_time": None}) | |||
|
|||
if with_breakdown: | |||
serialized_result.update({"breakdown": result[-1][1:-1]}) # strip quotes | |||
serialized_result.update({"breakdown": result[-1]}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, do you not handle the double quotes now when the breakdown val is a string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't handle them (so the tests have double quotes now). The reason being: with JSONExtractRaw
used everywhere, the keys on which we filter props have double quotes when they're strings, and they are strings when they're input as numbers. I attempted playing with this extraction such that we get no quotes in our keys, but that turned out to be very annoying when you don't know your data type apriori. (I have a test for this now, where the values of a property are both strings and integers)
Us removing double quotes ourselves has implications: the front-end has to guess whether the key had double quotes or not when they try to get the persons for this breakdown, which makes things hard.
Hence, I propose: we don't modify the keys at all. If we want to change the presentation, we let the front-end strip the quotes, if they exist. But, when asking for persons for a breakdown, we expect the front-end to send the specific key which we sent ourselves, sans modification.
I think this approach makes things cleaner.
Thoughts? Also cc: @liyiy @samwinslow (or whoever is going to implement the frontend bit 😅 )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To give a concrete example of JSONExtractRaw
:
SELECT JSONExtractRaw('{"a": "hello", "b": [-100, 200.0, 300], "c": "chrome", "d": 12}', 'c');
returns the string "Chrome"
SELECT JSONExtractRaw('{"a": "hello", "b": [-100, 200.0, 300], "c": "chrome", "d": 12}', 'd');
returns the string 12
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we expect the front-end to send the specific key which we sent ourselves, sans modification.
Do you mean with or without quotes here then? I think sending with quotes would start getting confusing especially since it's inconsistent with the rest of our app
Otherwise, I would agree as long as it's handled somewhere and stays consistent with how we're displaying the breakdown values elsewhere
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, I'm not talking about how we display the values at all, just about how we should pass them to the front-end.
To display the problem with stripping quotes more concretely:
SELECT JSONExtractRaw('{"a": "1", "b": 1}', 'a'); -- returns the string "1"
SELECT JSONExtractRaw('{"a": "1", "b": 1}', 'b'); -- returns the string 1
In effect, we can't arbitrarily strip quotes, because it changes the key.
I'm all for the front-end removing the quotes, if they exist, to display breakdown values without quotes, but I don't think we can do this in the backend, since it changes our filter keys for props :$ '"1"' != '1'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AH! I see, there's the property filtering, which removes this from JSONExtractRaw
: https://github.com/PostHog/posthog/blob/master/ee/clickhouse/models/property.py#L89
I guess we just need to follow this everywhere for breakdowns as well, and we should be good to go (we don't yet). I'll fix this like so instead then!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SELECT trim(BOTH '\"' FROM JSONExtractRaw('{"a": "1", "b": 1}', 'a'));
is same as
SELECT trim(BOTH '\"' FROM JSONExtractRaw('{"a": "1", "b": 1}', 'b'));
So we (will be) good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense to me. I mean, we could actually convert to the appropriate type in the backend really, but in a smarter way that woud result in '"1"'
→ '1'
(string
), '1'
→ 1
(number
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually tried that with JSONExtract('{"a": "1", "b": 1}', 'a', <type>)
- but it didn't let me infer the types using type inference as well :$ - so annoying.
I.e. JSONExtract('{"a": "1", "b": 1}', 'a', JSONType('{"a": "1", "b": 1}', 'a'))
doesn't work 😢
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this'd have to happen in Django at earliest
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks good to me but 55dde4c broke tests
I resolved this for trends + funnels both. Some existing tests were reliant on internal sorting, so I've added an explicit sort on top of this to ensure there's no test failures here. I don't think the order in the list means anything with breakdowns, we don't sort it based on anything, but let me know if the tests were testing for something I'm not aware of. You can see these specific changes here: b74ba00 |
Changes
Resolves #5030
Checklist