Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementing Person-on-Events #9802

Open
12 of 19 tasks
neilkakkar opened this issue May 16, 2022 · 2 comments
Open
12 of 19 tasks

Implementing Person-on-Events #9802

neilkakkar opened this issue May 16, 2022 · 2 comments

Comments

@neilkakkar
Copy link
Collaborator

neilkakkar commented May 16, 2022

This issue is managing implementation work around person-on-events.

Broad things to keep in mind:

  1. We use an instance setting to decide when to use person-on-event queries.
  2. We're going piecemeal: Get a vertical slice on prod, then continue on the rest. Trends is first.
  3. All tests using journeys_for, or flush_person_and_events() now automatically get person properties on events. But, this doesn't matter unless we override config to use the new queries we write.
  4. Also note: The automated CI for person-on-events disregards snapshots, since these would obviously be different. Make sure there's at least one new test to confirm that the new snapshots are correct.
  5. Aliasing can be tricky, since there's no pdi.person_id anymore, but e.person_id instead. Remember to handle these.

... and the relevant tasks to attack: (Put an @ next to the task you pick up - they're in priority order :) )

  1. Setup CI to run all tests using the new instance setting: This ensures all old tests work with our changes. @neilkakkar
  2. Trends person-on-events querying: @neilkakkar
  3. Funnels querying - @EDsCODE
  4. Retention querying - @neilkakkar
  5. Lifecycle querying - @neilkakkar
  6. Correlation Analysis querying (depends on funnels^) - @neilkakkar
  7. Paths querying (funnel connection depends on funnels^)
  8. Stickyness querying
  9. Cohorts querying
  10. ???

The migration plan

Team ingestion set up a test team (team_id 7964) with only person-on-events. We use this for all initial testing, ensuring things work, and then move on to enabling it for team 2 when the backfill is done, which will take a bit longer.

Each piece of work should be self-contained, such that enabling this setting doesn't bork other things which haven't yet been switched on. The new CI tests should automatically ensure this.


Clean up

Once this migration is over, there's a lot of things in flux that we should clean up (and think about self-hosted users upgrade path & ramifications). List of things to remove:

  1. Aggregation by distinct IDs
  2. Person property pushdowns
  3. Person and distinctID joins
  4. Group joins
  5. Remove tests marked with TODO: Delete after
  6. Switch the default CI tests over to person-on-events
  7. (maybe?) remove using_person_on_events parameter

Problems to address

This section is for things that came up during implementation that we need a broader discussion for

  1. Funnel Property Filters and Breakdowns can get borked when using point-in-time person properties for funnels where the first (or last) event doesn't have that property (ex: pre signup)
  2. Lifecycle query depends on person.created_at column, which isn't yet exposed on the events table. If this query ever supports groups, the same would apply to $group_X.created_at
@Twixes
Copy link
Collaborator

Twixes commented Nov 18, 2022

Anything remaining from this list? 👀

@neilkakkar
Copy link
Collaborator Author

Yes, all the cleanup

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants