Include a window/tab/session "id" (better names are welcome) #81
Link? :-)
I'm OK with it if it's purely an extra level of detail inside a flow_id. If we get into a situation where the same window id appears in multiple flow ids, then we've got some extra correlation concerns with PII etc. (e.g. we might be able to identify two users who share a single computer). That said, it's not obvious to me what the concrete use-case is here. What's the concrete thing we want to measure but can't using the current events?
I don't have a link but my recollection of the conversation runs something like: some events occur in both the tab where the user submits the sign-in form and in the tab where the user confirms their email. We need to differentiate between the two, because reasons. @shane-tomlinson, can you fill in the blanks? 😄
@shane-tomlinson, do we still need/want this? |
I think we can get by without it, using distinct event names if a screen is used in multiple places. |
I'll close it out in that case, thanks! |
The flow id performs excellently at its stated aim of allowing us to measure user journeys across devices/windows. However, sometimes we need metrics that are specific to the window or tab the user is in at that moment. @shane-tomlinson encountered such a problem recently.
For those cases, our current model is insufficient. We need one more piece of metadata, which is some kind of non-identifiable, er, identifier (if you see what I mean), so that we can group events from a single flow into separate windows/tabs/sessions.
I don't think it would make any sense for such an identifier to be emitted with our back-end metrics, so fortunately no API changes would be necessary there. We'd just need to emit it from the content server metrics module and then update the lua output script, the import scripts and the redshift schemata.
We have some experience of adding extra fields to the CSVs before and it went okay (apart from the one bit where I ran out of disk space and briefly uploaded empty CSVs to S3 😊). Trying to make the import scripts work conditionally proved problematic last time, so I think we'd want to follow a similar path here and pad all the historical CSVs with an extra comma at the end of each line (so any future re-imports work smoothly). The extra column can be added to `flow_events` with default null and then we just start filling it with data as and when it's available.

@rfk, what do you think about the above? Maybe there's a simpler solution to the problem that I don't see yet?