Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(web-analytics): Add Sessions Table V2 #23023

Merged
merged 48 commits into from
Jun 25, 2024

Conversation

robbie-c
Copy link
Collaborator

@robbie-c robbie-c commented Jun 17, 2024

Don't merge this until #22954 is in, as this PR uses that as a base

This PR currently has some failing tests, these tests are failing on master and are not introduced by this PR. I'll rebase when they are fixed. See https://posthog.slack.com/archives/C0113360FFV/p1718961268054819

Before this PR gets merged, we should change the where clause of the MV to only include events in the future. I'll do this as a ~last step before merging, as we might need to change it depending on when this gets merged.

Problem

The previous session design was not fast enough, and did not allow for sampling. See https://github.com/PostHog/product-internal/pull/601

Changes

  • Add migration that adds the v2 raw_sessions table
  • Add v2 raw_sessions and lazy sessions tables to hogql
  • Add hogl modifier to enable this on a per-customer basis
  • Update generate_demo_data to create events with a v7 sessions table
  • Update tests

The biggest difference between the v1 and v2 tables is that v2 relies on the session_id being a uuid7, and can use this assumption to have a much better index. When just querying the session table alone for date range queries, this was around twice as fast, and should be even more of an upgrade when joining to the events table.

It also will support sampling, I expect we will have to tune the index granularity to work well but in theory we should be able to get very fast queries working, sampling by session.

Does this work well for both Cloud and self-hosted?

Yes

How did you test this code?

Added tests for the new mode, kept running tests against current mode

Copy link
Contributor

github-actions bot commented Jun 17, 2024

Size Change: +485 B (+0.05%)

Total Size: 1.06 MB

ℹ️ View Unchanged
Filename Size Change
frontend/dist/toolbar.js 1.06 MB +485 B (+0.05%)

compressed-size-action

@robbie-c robbie-c force-pushed the add-uuidv7-based-sessions-table-for-testing branch 3 times, most recently from 0621875 to a94e711 Compare June 19, 2024 13:06
@robbie-c robbie-c changed the base branch from master to remove-old-feature-flags June 19, 2024 13:46
@robbie-c robbie-c changed the title WIP: Add Sessions Table V2 feat(web-analytics): Add Sessions Table V2 Jun 19, 2024
Base automatically changed from remove-old-feature-flags to master June 19, 2024 15:04
@robbie-c robbie-c force-pushed the add-uuidv7-based-sessions-table-for-testing branch from 305f328 to c06e1cc Compare June 19, 2024 15:13
posthog/hogql/modifiers.py Outdated Show resolved Hide resolved
@robbie-c robbie-c force-pushed the add-uuidv7-based-sessions-table-for-testing branch 7 times, most recently from b3c281c to b9748f9 Compare June 21, 2024 08:44
@PostHog PostHog deleted a comment from posthog-bot Jun 21, 2024
@PostHog PostHog deleted a comment from posthog-bot Jun 21, 2024
@PostHog PostHog deleted a comment from posthog-bot Jun 21, 2024
@PostHog PostHog deleted a comment from posthog-bot Jun 21, 2024
@PostHog PostHog deleted a comment from posthog-bot Jun 21, 2024
@PostHog PostHog deleted a comment from posthog-bot Jun 21, 2024
@PostHog PostHog deleted a comment from posthog-bot Jun 21, 2024
@PostHog PostHog deleted a comment from posthog-bot Jun 21, 2024
@PostHog PostHog deleted a comment from posthog-bot Jun 21, 2024
@PostHog PostHog deleted a comment from posthog-bot Jun 21, 2024
@PostHog PostHog deleted a comment from posthog-bot Jun 21, 2024
@PostHog PostHog deleted a comment from posthog-bot Jun 21, 2024
@PostHog PostHog deleted a comment from posthog-bot Jun 21, 2024
@robbie-c robbie-c force-pushed the add-uuidv7-based-sessions-table-for-testing branch from efd2a06 to d88e4d0 Compare June 25, 2024 10:08
@posthog-bot
Copy link
Contributor

📸 UI snapshots have been updated

1 snapshot changes in total. 0 added, 1 modified, 0 deleted:

  • chromium: 0 added, 1 modified, 0 deleted (diff for shard 2)
  • webkit: 0 added, 0 modified, 0 deleted

Triggered by this commit.

👉 Review this PR's diff of snapshots.

@posthog-bot
Copy link
Contributor

📸 UI snapshots have been updated

1 snapshot changes in total. 0 added, 1 modified, 0 deleted:

  • chromium: 0 added, 1 modified, 0 deleted (diff for shard 2)
  • webkit: 0 added, 0 modified, 0 deleted

Triggered by this commit.

👉 Review this PR's diff of snapshots.

@Twixes Twixes requested a review from pauldambra June 25, 2024 11:32
@Twixes
Copy link
Collaborator

Twixes commented Jun 25, 2024

I see Paul reviewed https://github.com/PostHog/product-internal/pull/601, so he probably has some context already – tagged him here.

Copy link
Collaborator

@Twixes Twixes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks good to me, but sadly I don't have any context on sessions table v2

posthog/models/raw_sessions/sql.py Outdated Show resolved Hide resolved
posthog/hogql/test/test_parser_cpp.py Show resolved Hide resolved
@robbie-c robbie-c force-pushed the add-uuidv7-based-sessions-table-for-testing branch from a88fefc to 7e0f9d3 Compare June 25, 2024 13:25
@robbie-c robbie-c force-pushed the add-uuidv7-based-sessions-table-for-testing branch from 7e0f9d3 to cf099f6 Compare June 25, 2024 13:28
@robbie-c robbie-c merged commit bf8f4da into master Jun 25, 2024
92 checks passed
@robbie-c robbie-c deleted the add-uuidv7-based-sessions-table-for-testing branch June 25, 2024 14:12
Copy link

sentry-io bot commented Jun 25, 2024

Suspect Issues

This pull request was deployed and Sentry observed the following issues:

  • ‼️ NotImplementedError: FieldAliasType.resolve_database_field not implemented posthog.tasks.tasks.process_query_task View Issue
  • ‼️ NotImplementedError: FieldAliasType.resolve_database_field not implemented posthog.tasks.tasks.process_query_task View Issue
  • ‼️ NotImplementedError: FieldAliasType.resolve_database_field not implemented posthog.tasks.tasks.process_query_task View Issue
  • ‼️ NotImplementedError: FieldAliasType.resolve_database_field not implemented posthog.tasks.tasks.process_query_task View Issue
  • ‼️ CHQueryErrorCannotInsertNullInOrdinaryColumn: DB::Exception: Cannot convert NULL value to non-Nullable type: while converting source column url... posthog.clickhouse.client.execute in sync_execute View Issue

Did you find this useful? React with a 👍 or 👎

timgl pushed a commit that referenced this pull request Jun 27, 2024
* Add raw_sessions table

* Fix

* Change time chunking to 5 minutes

* Add modes of operation, and some comments

* WIP wire up sessions table V2

* More working v2 sessions tests

* Optimize imports

* Fix v1 sessions table test

* Fix more tests

* Fix channel type tests

* Fix session replay joining with v2

* Web analytics queries and their tests working

* Fix where clause extractor tests for v1

* Fix backfill script

* Show last select query instead of first

* Run ruff

* Fix ids in tests

* Fix database init

* spelling

* Fix test_query

* Formatting

* Handle session properties with v2 session table

* Add more columns, fix some properties

* Add new properties to taxonomy

* Fix trends tests

* Fix modifiers

* Set v1 sessions table to default

* Capture viewport size

* Fix keyword arg rename

* Update query snapshots

* Update query snapshots

* Fix test_utils

* Fix test_trends

* Make it easier to run test_parser_cpp from pytcharm

* Update query snapshots

* Update query snapshots

* Add last external click url to the sessions MV

* Update query snapshots

* Add test_last_external_click_url

* Add ingest from date

* Run schema build after a rebase

* Tweak test_all

* Update query snapshots

* Run schema after rebase

* Update query snapshots

* Update UI snapshots for `chromium` (2)

* Update UI snapshots for `chromium` (2)

* Change ingestion date and add explaining comment

---------

Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants