Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add person info to events #9404

Merged
merged 9 commits into from
Apr 26, 2022
Merged

feat: Add person info to events #9404

merged 9 commits into from
Apr 26, 2022

Conversation

tiina303
Copy link
Contributor

@tiina303 tiina303 commented Apr 12, 2022

Problem

For #9180

Changes

👉 Stay up-to-date with PostHog coding conventions for a smoother review.

Depends on #9377

  • added UUID caching as that's what we'll want to be using in CH (based on the table types, but also I assume postgres person_id would be pretty useless there)
  • Query cache if the value isn't there query backing datastore and put it into cache. Designed it in a way that the user doesn't need to think about needing to updating the cache.
  • The person info is grouped together for efficiency as we'll be using it together (I assume), so if the value isn't in cache we'll query postgres once & can run the cache lookup in parallel (if they all separately fell back to postgres we'd either need to not run them in parallel or have multiple postgres reads & redis writes).

How did you test this code?

  1. ran the migration from feat(persons-on-events): add required person and group columns to events table #9251
  2. Turned off Geo IP (to have less data in properties)
  3. Ran locally with
CLICKHOUSE_DISABLE_EXTERNAL_SCHEMAS_TEAMS="1,2,3,4,5" PERSON_INFO_TO_REDIS_TEAMS="1,2,3,4,5" ./bin/start
  1. from py lib
>>> posthog.capture('test-buf-test-7', 'test-event-2', properties={ '$set': { 'userProperty': 997 } })
>>> posthog.capture('test-buf-test-7', 'test-event-3')
  1. saw the event was ingested
    In local CH instance
SELECT
    event,
    uuid,
    timestamp,
    distinct_id,
    person_id,
    person_properties
FROM events
WHERE distinct_id = 'test-buf-test-7'

Query id: 1d063222-2675-4655-b0de-bb3c05b839f1

┌─event────────┬─uuid─────────────────────────────────┬──────────────────timestamp─┬─distinct_id─────┬─person_id────────────────────────────┬─person_properties────┐
│ test-event-3 │ 01802553-d076-0000-6642-d0a13cc95343 │ 2022-04-13 23:48:11.064000 │ test-buf-test-7 │ 0180254f-3025-0000-26b6-ee09cce318fa │ {"userProperty":997} │
└──────────────┴──────────────────────────────────────┴────────────────────────────┴─────────────────┴──────────────────────────────────────┴──────────────────────┘
┌─event────────┬─uuid─────────────────────────────────┬──────────────────timestamp─┬─distinct_id─────┬─person_id────────────────────────────┬─person_properties────┐
│ test-event-2 │ 0180254f-301c-0000-3739-106d42db8d98 │ 2022-04-13 23:43:07.990000 │ test-buf-test-7 │ 0180254f-3025-0000-26b6-ee09cce318fa │ {"userProperty":997} │
└──────────────┴──────────────────────────────────────┴────────────────────────────┴─────────────────┴──────────────────────────────────────┴──────────────────────┘

SELECT *
FROM person_distinct_id2

Query id: 80833953-d523-4e52-9275-619a838c610b

┌─team_id─┬─distinct_id─────┬─person_id────────────────────────────┬─is_deleted─┬─version─┬──────────_timestamp─┬─_offset─┬─_partition─┐
│       1 │ test-buf-test-7 │ 0180254f-3025-0000-26b6-ee09cce318fa │          0 │       0 │ 2022-04-13 23:43:13 │      20 │          0 │
└─────────┴─────────────────┴──────────────────────────────────────┴────────────┴─────────┴─────────────────────┴─────────┴────────────┘

btw didn't see any change in the UI so we might need to change that too (note that I wasn't running on top of the branch with migration):
Screen Shot 2022-04-14 at 01 51 10

@tiina303 tiina303 marked this pull request as draft April 12, 2022 22:58
@tiina303 tiina303 force-pushed the poe-person-info-to-events branch 3 times, most recently from fd7cd27 to 8d2f222 Compare April 13, 2022 00:10
Base automatically changed from poe-write-to-redis to master April 13, 2022 10:34
@tiina303 tiina303 force-pushed the poe-person-info-to-events branch 2 times, most recently from 6d4e322 to d873578 Compare April 13, 2022 13:46
@tiina303 tiina303 marked this pull request as ready for review April 14, 2022 00:03
Copy link
Contributor

@yakkomajuri yakkomajuri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking good, some nits.

I wonder if it's a worthy project for the future to have posthog_persondistinctid map distinct id -> UUID and add an index for posthog_person to use UUID. Probably is - toss it in the backlog lol

if (personId !== null) {
return await this.getPersonInfoThroughCache(teamId, personId)
}
return [null, null, null]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd rather return an object as that makes it much easier to work with instead of having to remember the order of the returned values

plugin-server/src/utils/db/db.ts Outdated Show resolved Hide resolved
plugin-server/src/utils/db/db.ts Outdated Show resolved Hide resolved
plugin-server/src/utils/db/db.ts Outdated Show resolved Hide resolved
plugin-server/src/worker/ingestion/process-event.ts Outdated Show resolved Hide resolved
Copy link
Contributor

@yakkomajuri yakkomajuri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mostly looks good. one comment + have you tested JSON ingestion with this change?

@@ -576,6 +576,9 @@ export class EventsProcessor {

const elementsChain = elements && elements.length ? elementsToString(elements) : ''

// TODO: don't parse back and forth with json
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's then just add a flag getRawProperties or something. It can event be a separate value in the return object

@tiina303
Copy link
Contributor Author

mostly looks good. one comment + have you tested JSON ingestion with this change?

Yes, check the test plan.

@yakkomajuri
Copy link
Contributor

Wanna give the Kafka Inspector a try and see what the raw message looks like? @tiina303 😉

@@ -663,6 +768,7 @@ export class DB {
}

await this.updatePersonPropertiesCache(updatedPerson.team_id, updatedPerson.id, updatedPerson.properties)
await this.updatePersonCreatedAtCache(updatedPerson.team_id, updatedPerson.id, updatedPerson.created_at)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

forgot this earlier - on merges we might be updating create_at currently too

@tiina303
Copy link
Contributor Author

Screen Shot 2022-04-25 at 15 26 43

@yakkomajuri
Copy link
Contributor

Very nice!

Copy link
Contributor

@yakkomajuri yakkomajuri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

let's monitor as we merge and confirm:

a) that all events are being ingested with no issues
b) what the messages look like in Kafka

@tiina303
Copy link
Contributor Author

Tested without CLICKHOUSE_DISABLE_EXTERNAL_SCHEMAS_TEAMS & PERSON_INFO_TO_REDIS_TEAMS enabled too. Only have tested on top of the schema change.

@tiina303
Copy link
Contributor Author

tiina303 commented Apr 26, 2022

Tested locally without the schema change, it worked fine (DEBUG=1 ./bin/start & events got ingested)

@tiina303 tiina303 merged commit 98619cf into master Apr 26, 2022
@tiina303 tiina303 deleted the poe-person-info-to-events branch April 26, 2022 11:05
fuziontech added a commit that referenced this pull request Apr 28, 2022
* master: (137 commits)
  feat(cohorts): add cohort filter grammars (#9540)
  feat(cohorts): Backwards compatibility of groups and properties (#9462)
  perf(ingestion): unsubscribe from buffer topic while no events are produced to it (#9556)
  fix: Fix `Loading` positioning and `LemonButton` disabled state (#9554)
  test: Speed up backend tests (#9289)
  fix: LemonSpacer -> LemonDivider (#9549)
  feat(funnels): Highlight significant deviations in new funnel viz (#9536)
  docs(storybook): Lemon UI (#9426)
  feat: add support for list of teams to enable the conversion buffer for (#9542)
  chore(onboarding): cleanup framework grid experiment (#9527)
  fix(signup): domain provisioning on cloud (#9515)
  chore: split out async migrations ci (#9539)
  feat(ingestion): enable json ingestion for self-hosted by default (#9448)
  feat(cohort): add all cohort filter selectors to Storybook (#9492)
  feat(ingestion): conversion events buffer consumer (#9432)
  ci(run-backend-tests): remove CH version default (#9532)
  feat: Add person info to events (#9404)
  feat(ingestion): produce to buffer partitioned by team_id:distinct_id (#9518)
  fix: bring latest_migrations.manifest up to date (#9525)
  chore: removes unused feature flag (#9529)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants