feat(oauth): log scope-ceiling rejections at /authorize by MattBro · Pull Request #61216 · PostHog/posthog

MattBro · 2026-06-02T17:42:03Z

Problem

The per-app OAuth scope ceiling (#60477) enforces, at /authorize, that a client may only be granted scopes within its OAuthApplication.scopes set. When a request asks for a scope outside that ceiling, OAuthValidator.validate_scopes returns False, and oauthlib turns that into a 302 redirect carrying error=invalid_scope.

That reject path is silent: no logger call, no capture_exception, and a 302 is not a 4xx/5xx, so error-rate and APM dashboards miss it too. The failure mode this hides is a first-party app whose scopes ceiling is empty or only partially seeded. Its OAuth clients (for example the setup wizard, which requests llm_gateway:read among others) begin failing /authorize with invalid_scope, users can't complete login, and nothing server-side records which client requested which scope.
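To make the blind spot concrete, here is a minimal sketch of how the rejection looks from outside the server: the status code is 302, and the failure is only visible in the redirect's query string. The function name and the example callback URL are illustrative assumptions, not part of PostHog or oauthlib.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def oauth_error_from_redirect(status_code: int, location: str) -> Optional[str]:
    """Return the OAuth `error` query parameter carried by a 302 redirect, if any.

    Hypothetical helper for illustration only. It inspects the query string,
    which is where an authorization-code flow reports errors.
    """
    if status_code != 302:
        return None
    query = parse_qs(urlparse(location).query)
    # parse_qs maps each key to a list of values; take the first one.
    return query.get("error", [None])[0]


# A status-code-based dashboard sees only "302"; the error is in the URL.
error = oauth_error_from_redirect(
    302, "https://client.example/callback?error=invalid_scope&state=abc"
)
```

A monitor keyed on 4xx/5xx responses never sees this value, which is why the server has to log the rejection itself.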

This is the observability gap behind a real regression: after the ceiling enforcement shipped, a first-party client's logins broke with invalid_scope and there was no server-side trace of the rejection. The only signals were the client's own exception telemetry (fragmented across install paths, unalerted) and users noticing.

Changes

posthog/api/oauth/views.py: in validate_scopes, emit a structured logger.warning("oauth_scope_ceiling_rejected", ...) on the reject branch, carrying client_id, is_first_party, the requested out-of-ceiling scopes, and the app's effective ceiling. The boolean resolution is byte-for-byte unchanged; this only adds a log side-effect on the existing False path.
posthog/api/oauth/test_views.py: asserts the event fires once with the expected fields when a scope is rejected.
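
The shape of the change can be sketched roughly as follows. This is a simplified stand-in, not the code in posthog/api/oauth/views.py: the function name, its parameters, and the `warn` callable (which mimics a structlog-style `logger.warning(event, **fields)`) are assumptions made so the example runs on its own, and it ignores the no-ceiling branch entirely.

```python
from typing import Callable, Iterable, List


def scopes_within_ceiling(
    client_id: str,
    is_first_party: bool,
    requested: Iterable[str],
    ceiling: Iterable[str],
    warn: Callable[..., None],
) -> bool:
    """Return True if every requested scope is inside the ceiling.

    Hypothetical simplification: the only behaviour added on the reject
    path is the structured warning; the boolean result is the same as it
    would be without the log call.
    """
    allowed = set(ceiling)
    rejected: List[str] = [scope for scope in requested if scope not in allowed]
    if rejected:
        warn(
            "oauth_scope_ceiling_rejected",
            client_id=client_id,
            is_first_party=is_first_party,
            requested=rejected,
            ceiling=sorted(allowed),
        )
        return False
    return True


# Record warnings in a list instead of a real logger.
events = []


def record(event: str, **fields) -> None:
    events.append((event, fields))


ok = scopes_within_ceiling(
    "test_confidential_client_id", True, ["experiment:write"], ["experiment:read"], record
)
```

Passing the logger call in as a parameter is only a convenience for this sketch; the PR patches the module-level logger in its test instead.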

is_first_party is included so a downstream alert can page only on first-party rejections, which have a near-zero baseline and almost always indicate a misconfiguration (an unseeded or partially seeded ceiling). Third-party clients over-asking is expected; those rejections are logged with is_first_party=false.

Follow-ups

A log alert on oauth_scope_ceiling_rejected filtered to is_first_party=true, created once this deploys (the event has to exist before there's anything to alert on). That alert is what turns this from "users report it" into "we're paged at deploy time."
Client-side telemetry enrichment + issue grouping for the wizard: wizard#501 (feat: enrich oauth login failure telemetry for diagnosis)
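
As a rough illustration of the alert condition described in the first follow-up, the filter reduces to a predicate over each log record. The record layout here (a flat mapping with an `event` key alongside the logged fields) and the function name are assumptions; the actual alerting backend and its query syntax are not specified in this PR.

```python
from typing import Any, Mapping


def should_page(record: Mapping[str, Any]) -> bool:
    """Return True only for first-party scope-ceiling rejections.

    Hypothetical predicate: assumes the event name is stored under "event"
    and that is_first_party is a real boolean, not a string.
    """
    return (
        record.get("event") == "oauth_scope_ceiling_rejected"
        and record.get("is_first_party") is True
    )


first_party = {"event": "oauth_scope_ceiling_rejected", "is_first_party": True}
third_party = {"event": "oauth_scope_ceiling_rejected", "is_first_party": False}
```

Third-party rejections still appear in the logs for debugging, but they would not trigger a page under this condition.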

How did you test this code?

I'm an agent (Claude). Automated only:

test_authorize_rejection_emits_ceiling_log patches the module logger and asserts a single oauth_scope_ceiling_rejected warning with the expected client_id / is_first_party / requested / ceiling. It ran green in the Django CI matrix.
ruff check and ruff format clean on both files.
I could not run the Django test harness locally in this environment; the backend suite (including the new test) ran in CI.

Automatic notifications

Publish to changelog?
Alert Sales and Marketing teams?

Docs update

skip-inkeep-docs — no user-facing docs change.

🤖 Agent context

Authored by Claude Code (Opus 4.8). Follow-up observability on the per-app scope ceiling (#60477): the reject path in validate_scopes was silent, so a misconfigured first-party ceiling produced invalid_scope with no server-side signal. Chose a single logger.warning on the existing False branch (structlog, matching the other logger.warning calls in this file) over capture_exception, since the rejection is a normal 302 redirect rather than a 4xx/5xx error path. Boolean resolution is unchanged; only the log was added.

The per-app scope ceiling rejects out-of-ceiling requests by returning False, which oauthlib turns into a 302 with error=invalid_scope. That path emitted no log or capture, so a misconfigured first-party app (empty/partial ceiling) failed silently. Emit a structured warning carrying client_id, is_first_party, and the requested vs allowed sets so a log alert can page on first-party rejections. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>


greptile-apps · 2026-06-02T18:12:30Z

Issue 1 of 1: posthog/api/oauth/test_views.py:2290-2305

Prefer a parameterised test to cover both rejection paths

Reviews (1): Last reviewed commit: "chore: ruff format the ceiling-rejection..."

github-actions · 2026-06-02T18:12:43Z

🎭 Playwright report · View test results →

⚠️ 1 flaky test:

Change date range and toggle comparison (chromium)

These issues are not necessarily caused by your changes.
Annoyed by this comment? Help fix flakies and failures and it'll disappear!

greptile-apps · 2026-06-02T18:12:48Z

+    def test_authorize_rejection_emits_ceiling_log(self):
+        self._set_ceiling("experiment:read")
+        with patch("posthog.api.oauth.views.logger") as mock_logger:
+            response = self.client.get(f"{self.base_authorization_url}&scope=experiment:write")
+        self.assertEqual(response.status_code, status.HTTP_302_FOUND)
+        rejection_calls = [
+            call
+            for call in mock_logger.warning.call_args_list
+            if call.args and call.args[0] == "oauth_scope_ceiling_rejected"
+        ]
+        self.assertEqual(len(rejection_calls), 1)
+        kwargs = rejection_calls[0].kwargs
+        self.assertEqual(kwargs["client_id"], "test_confidential_client_id")
+        self.assertEqual(kwargs["is_first_party"], self.confidential_application.is_first_party)
+        self.assertEqual(kwargs["requested"], ["experiment:write"])
+        self.assertEqual(kwargs["ceiling"], ["experiment:read"])


Prefer a parameterised test to cover both rejection paths

The new test only exercises the has_ceiling=True branch. The has_ceiling=False branch (no app ceiling, client requests a privileged or wildcard scope) also reaches the logger.warning call and is worth asserting in the same spirit. Following the team's preference for parameterised tests, both paths could live under a single @pytest.mark.parametrize (or subTest): one case with a ceiling set where the requested scope falls outside it, and one with no ceiling where a privileged scope is requested.

+1, but it's a nit

Parameterized it. Covers the empty-ceiling reject path now too: 7d74e35
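
For readers without the commit at hand, a parameterised test covering both branches could look something like the sketch below, using `unittest`'s `subTest`. Everything except the event name is a stand-in: `check_scopes`, `DEFAULT_ALLOWED`, and the scope `hypothetical_privileged:write` are invented for illustration and do not reflect the real default scope set or the test in 7d74e35.

```python
import unittest
from unittest.mock import MagicMock

# Hypothetical broad default applied when an app has no ceiling configured.
DEFAULT_ALLOWED = {"experiment:read", "experiment:write"}


def check_scopes(requested, ceiling, logger):
    """Stand-in for the ceiling check; not the PostHog implementation."""
    allowed = set(ceiling) if ceiling else DEFAULT_ALLOWED
    rejected = [scope for scope in requested if scope not in allowed]
    if rejected:
        logger.warning(
            "oauth_scope_ceiling_rejected",
            requested=rejected,
            ceiling=sorted(ceiling),
        )
        return False
    return True


class CeilingRejectionLogTest(unittest.TestCase):
    def test_rejection_emits_log_on_both_branches(self):
        cases = [
            # (case name, app ceiling, requested scope)
            ("ceiling_set", ["experiment:read"], "experiment:write"),
            ("ceiling_empty", [], "hypothetical_privileged:write"),
        ]
        for name, ceiling, scope in cases:
            with self.subTest(name):
                logger = MagicMock()
                self.assertFalse(check_scopes([scope], ceiling, logger))
                # Exactly one warning, with the expected event and fields.
                logger.warning.assert_called_once()
                args, kwargs = logger.warning.call_args
                self.assertEqual(args[0], "oauth_scope_ceiling_rejected")
                self.assertEqual(kwargs["requested"], [scope])
                self.assertEqual(kwargs["ceiling"], ceiling)
```

Using `subTest` keeps both cases in one test method while still reporting each failure separately; `@pytest.mark.parametrize` would be the equivalent choice in a pytest-style suite.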

fercgomes · 2026-06-03T11:14:44Z

+1, but it's a nit

Parameterize the test so it asserts the oauth_scope_ceiling_rejected log fires on both branches: a set ceiling with an out-of-ceiling scope, and an empty ceiling with a privileged scope excluded by the broad default. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

github-actions · 2026-06-03T17:15:49Z

ClickHouse migration SQL per cloud environment

unset

all

CREATE OR REPLACE VIEW persons_batch_export ON CLUSTER posthog AS (
    with new_persons as (
        select
            id,
            max(version) as version,
            argMax(_timestamp, person.version) AS _timestamp2
        from
            person
        where
            team_id = {team_id:Int64}
            and id in (
                select
                    id
                from
                    person
                where
                    team_id = {team_id:Int64}
                    and _timestamp >= {interval_start:DateTime64}
                    AND _timestamp < {interval_end:DateTime64}
            )
        group by
            id
        having
            (
                _timestamp2 >= {interval_start:DateTime64}
                AND _timestamp2 < {interval_end:DateTime64}
            )
    ),
    new_distinct_ids as (
        SELECT
            argMax(person_id, person_distinct_id2.version) as person_id
        from
            person_distinct_id2
        where
            team_id = {team_id:Int64}
            and distinct_id in (
                select
                    distinct_id
                from
                    person_distinct_id2
                where
                    team_id = {team_id:Int64}
                    and _timestamp >= {interval_start:DateTime64}
                    AND _timestamp < {interval_end:DateTime64}
            )
        group by
            distinct_id
        having
            (
                argMax(_timestamp, person_distinct_id2.version) >= {interval_start:DateTime64}
                AND argMax(_timestamp, person_distinct_id2.version) < {interval_end:DateTime64}
            )
    ),
    all_new_persons as (
        select
            id,
            version
        from
            new_persons
        UNION
        ALL
        select
            id,
            max(version)
        from
            person
        where
            team_id = {team_id:Int64}
            and id in new_distinct_ids
        group by
            id
    )
    select
        p.team_id AS team_id,
        pd.distinct_id AS distinct_id,
        toString(p.id) AS person_id,
        p.properties AS properties,
        pd.version AS person_distinct_id_version,
        p.version AS person_version,
        p.created_at AS created_at,
        multiIf(
            (
                pd._timestamp >= {interval_start:DateTime64}
                AND pd._timestamp < {interval_end:DateTime64}
            )
            AND NOT (
                p._timestamp >= {interval_start:DateTime64}
                AND p._timestamp < {interval_end:DateTime64}
            ),
            pd._timestamp,
            (
                p._timestamp >= {interval_start:DateTime64}
                AND p._timestamp < {interval_end:DateTime64}
            )
            AND NOT (
                pd._timestamp >= {interval_start:DateTime64}
                AND pd._timestamp < {interval_end:DateTime64}
            ),
            p._timestamp,
            least(p._timestamp, pd._timestamp)
        ) AS _inserted_at
    from
        person p
        INNER JOIN (
            SELECT
                distinct_id,
                max(version) AS version,
                argMax(person_id, person_distinct_id2.version) AS person_id2,
                argMax(_timestamp, person_distinct_id2.version) AS _timestamp
            FROM
                person_distinct_id2
            WHERE
                team_id = {team_id:Int64}
                and person_id IN (
                    select
                        id
                    from
                        all_new_persons
                )
            GROUP BY
                distinct_id
        ) AS pd ON p.id = pd.person_id2
    where
        team_id = {team_id:Int64}
        and (id, version) in all_new_persons
    ORDER BY
        _inserted_at
)

CREATE OR REPLACE VIEW events_batch_export ON CLUSTER posthog AS (
    SELECT DISTINCT ON (team_id, event, cityHash64(events.distinct_id), cityHash64(events.uuid))
        team_id AS team_id,
        timestamp AS timestamp,
        event AS event,
        distinct_id AS distinct_id,
        toString(uuid) AS uuid,
        COALESCE(inserted_at, _timestamp) AS _inserted_at,
        created_at AS created_at,
        elements_chain AS elements_chain,
        toString(person_id) AS person_id,
        nullIf(properties, '') AS properties,
        nullIf(person_properties, '') AS person_properties,
        nullIf(JSONExtractString(properties, '$set'), '') AS set,
        nullIf(JSONExtractString(properties, '$set_once'), '') AS set_once
    FROM
        events
    PREWHERE
        COALESCE(events.inserted_at, events._timestamp) >= {interval_start:DateTime64}
        AND COALESCE(events.inserted_at, events._timestamp) < {interval_end:DateTime64}
    WHERE
        team_id = {team_id:Int64}
        AND events.timestamp >= {interval_start:DateTime64} - INTERVAL {lookback_days:Int32} DAY
        AND events.timestamp < {interval_end:DateTime64} + INTERVAL 1 DAY
        AND (length({include_events:Array(String)}) = 0 OR event IN {include_events:Array(String)})
        AND (length({exclude_events:Array(String)}) = 0 OR event NOT IN {exclude_events:Array(String)})
    ORDER BY
        _inserted_at, event
    SETTINGS optimize_aggregation_in_order=1
)

CREATE OR REPLACE VIEW events_batch_export_unbounded ON CLUSTER posthog AS (
    SELECT DISTINCT ON (team_id, event, cityHash64(events.distinct_id), cityHash64(events.uuid))
        team_id AS team_id,
        timestamp AS timestamp,
        event AS event,
        distinct_id AS distinct_id,
        toString(uuid) AS uuid,
        COALESCE(inserted_at, _timestamp) AS _inserted_at,
        created_at AS created_at,
        elements_chain AS elements_chain,
        toString(person_id) AS person_id,
        nullIf(properties, '') AS properties,
        nullIf(person_properties, '') AS person_properties,
        nullIf(JSONExtractString(properties, '$set'), '') AS set,
        nullIf(JSONExtractString(properties, '$set_once'), '') AS set_once
    FROM
        events
    PREWHERE
        COALESCE(events.inserted_at, events._timestamp) >= {interval_start:DateTime64}
        AND COALESCE(events.inserted_at, events._timestamp) < {interval_end:DateTime64}
    WHERE
        team_id = {team_id:Int64}
        AND (length({include_events:Array(String)}) = 0 OR event IN {include_events:Array(String)})
        AND (length({exclude_events:Array(String)}) = 0 OR event NOT IN {exclude_events:Array(String)})
    ORDER BY
        _inserted_at, event
    SETTINGS optimize_aggregation_in_order=1
)

CREATE OR REPLACE VIEW events_batch_export_backfill ON CLUSTER posthog AS (
    SELECT DISTINCT ON (team_id, event, cityHash64(events.distinct_id), cityHash64(events.uuid))
        team_id AS team_id,
        timestamp AS timestamp,
        event AS event,
        distinct_id AS distinct_id,
        toString(uuid) AS uuid,
        timestamp AS _inserted_at,
        created_at AS created_at,
        elements_chain AS elements_chain,
        toString(person_id) AS person_id,
        nullIf(properties, '') AS properties,
        nullIf(person_properties, '') AS person_properties,
        nullIf(JSONExtractString(properties, '$set'), '') AS set,
        nullIf(JSONExtractString(properties, '$set_once'), '') AS set_once
    FROM
        events
    WHERE
        team_id = {team_id:Int64}
        AND events.timestamp >= {interval_start:DateTime64}
        AND events.timestamp < {interval_end:DateTime64}
        AND (length({include_events:Array(String)}) = 0 OR event IN {include_events:Array(String)})
        AND (length({exclude_events:Array(String)}) = 0 OR event NOT IN {exclude_events:Array(String)})
    ORDER BY
        _inserted_at, event
    SETTINGS optimize_aggregation_in_order=1
)

CREATE OR REPLACE VIEW persons_batch_export_backfill ON CLUSTER posthog AS (
    SELECT
        pd.team_id AS team_id,
        pd.distinct_id AS distinct_id,
        toString(p.id) AS person_id,
        p.properties AS properties,
        pd.version AS person_distinct_id_version,
        p.version AS person_version,
        p.created_at AS created_at,
        multiIf(
            pd._timestamp < {interval_end:DateTime64}
                AND NOT p._timestamp < {interval_end:DateTime64},
            pd._timestamp,
            p._timestamp < {interval_end:DateTime64}
                AND NOT pd._timestamp < {interval_end:DateTime64},
            p._timestamp,
            least(p._timestamp, pd._timestamp)
        ) AS _inserted_at
    FROM (
        SELECT
            team_id,
            distinct_id,
            max(version) AS version,
            argMax(person_id, person_distinct_id2.version) AS person_id,
            argMax(_timestamp, person_distinct_id2.version) AS _timestamp
        FROM
            person_distinct_id2
        PREWHERE
            team_id = {team_id:Int64}
        GROUP BY
            team_id,
            distinct_id
    ) AS pd
    INNER JOIN (
        SELECT
            team_id,
            id,
            max(version) AS version,
            argMax(properties, person.version) AS properties,
            argMax(created_at, person.version) AS created_at,
            argMax(_timestamp, person.version) AS _timestamp
        FROM
            person
        PREWHERE
            team_id = {team_id:Int64}
        GROUP BY
            team_id,
            id
    ) AS p ON p.id = pd.person_id AND p.team_id = pd.team_id
    WHERE
        pd.team_id = {team_id:Int64}
        AND p.team_id = {team_id:Int64}
        AND (
            pd._timestamp < {interval_end:DateTime64}
            OR p._timestamp < {interval_end:DateTime64}
        )
    ORDER BY
        _inserted_at
)

CREATE OR REPLACE VIEW persons_batch_export ON CLUSTER posthog AS (
    with new_persons as (
        select
            id,
            max(version) as version,
            argMax(_timestamp, person.version) AS _timestamp2
        from
            person
        where
            team_id = {team_id:Int64}
            and id in (
                select
                    id
                from
                    person
                where
                    team_id = {team_id:Int64}
                    and _timestamp >= {interval_start:DateTime64}
                    AND _timestamp < {interval_end:DateTime64}
            )
        group by
            id
        having
            (
                _timestamp2 >= {interval_start:DateTime64}
                AND _timestamp2 < {interval_end:DateTime64}
            )
    ),
    new_distinct_ids as (
        SELECT
            argMax(person_id, person_distinct_id2.version) as person_id
        from
            person_distinct_id2
        where
            team_id = {team_id:Int64}
            and distinct_id in (
                select
                    distinct_id
                from
                    person_distinct_id2
                where
                    team_id = {team_id:Int64}
                    and _timestamp >= {interval_start:DateTime64}
                    AND _timestamp < {interval_end:DateTime64}
            )
        group by
            distinct_id
        having
            (
                argMax(_timestamp, person_distinct_id2.version) >= {interval_start:DateTime64}
                AND argMax(_timestamp, person_distinct_id2.version) < {interval_end:DateTime64}
            )
    ),
    all_new_persons as (
        select
            id,
            version
        from
            new_persons
        UNION ALL
        select
            id,
            max(version)
        from
            person
        where
            team_id = {team_id:Int64}
            and id in new_distinct_ids
        group by
            id
    )
    select
        p.team_id AS team_id,
        pd.distinct_id AS distinct_id,
        toString(p.id) AS person_id,
        p.properties AS properties,
        pd.version AS person_distinct_id_version,
        p.version AS person_version,
        p.created_at AS created_at,
        multiIf(
            (
                pd._timestamp >= {interval_start:DateTime64}
                AND pd._timestamp < {interval_end:DateTime64}
            )
            AND NOT (
                p._timestamp >= {interval_start:DateTime64}
                AND p._timestamp < {interval_end:DateTime64}
            ),
            pd._timestamp,
            (
                p._timestamp >= {interval_start:DateTime64}
                AND p._timestamp < {interval_end:DateTime64}
            )
            AND NOT (
                pd._timestamp >= {interval_start:DateTime64}
                AND pd._timestamp < {interval_end:DateTime64}
            ),
            p._timestamp,
            least(p._timestamp, pd._timestamp)
        ) AS _inserted_at
    from
        person p
        INNER JOIN (
            SELECT
                distinct_id,
                max(version) AS version,
                argMax(person_id, person_distinct_id2.version) AS person_id2,
                argMax(_timestamp, person_distinct_id2.version) AS _timestamp
            FROM
                person_distinct_id2
            WHERE
                team_id = {team_id:Int64}
                and person_id IN (
                    select
                        id
                    from
                        all_new_persons
                )
            GROUP BY
                distinct_id
        ) AS pd ON p.id = pd.person_id2
    where
        team_id = {team_id:Int64}
        and (id, version) in all_new_persons
    ORDER BY
        _inserted_at
)

CREATE OR REPLACE VIEW events_batch_export_recent ON CLUSTER posthog AS (
    SELECT DISTINCT ON (team_id, event, cityHash64(events_recent.distinct_id), cityHash64(events_recent.uuid))
        team_id AS team_id,
        timestamp AS timestamp,
        event AS event,
        distinct_id AS distinct_id,
        toString(uuid) AS uuid,
        inserted_at AS _inserted_at,
        created_at AS created_at,
        elements_chain AS elements_chain,
        toString(person_id) AS person_id,
        nullIf(properties, '') AS properties,
        nullIf(person_properties, '') AS person_properties,
        nullIf(JSONExtractString(properties, '$set'), '') AS set,
        nullIf(JSONExtractString(properties, '$set_once'), '') AS set_once
    FROM
        events_recent
    PREWHERE
        events_recent.inserted_at >= {interval_start:DateTime64}
        AND events_recent.inserted_at < {interval_end:DateTime64}
    WHERE
        team_id = {team_id:Int64}
        AND (length({include_events:Array(String)}) = 0 OR event IN {include_events:Array(String)})
        AND (length({exclude_events:Array(String)}) = 0 OR event NOT IN {exclude_events:Array(String)})
    ORDER BY
        _inserted_at, event
    SETTINGS optimize_aggregation_in_order=1
)

US, EU, DEV

data

CREATE OR REPLACE VIEW events_batch_export ON CLUSTER posthog AS (
    SELECT DISTINCT ON (team_id, event, cityHash64(events.distinct_id), cityHash64(events.uuid))
        team_id AS team_id,
        timestamp AS timestamp,
        event AS event,
        distinct_id AS distinct_id,
        toString(uuid) AS uuid,
        COALESCE(inserted_at, _timestamp) AS _inserted_at,
        created_at AS created_at,
        elements_chain AS elements_chain,
        toString(person_id) AS person_id,
        nullIf(properties, '') AS properties,
        nullIf(person_properties, '') AS person_properties,
        nullIf(JSONExtractString(properties, '$set'), '') AS set,
        nullIf(JSONExtractString(properties, '$set_once'), '') AS set_once
    FROM
        events
    PREWHERE
        COALESCE(events.inserted_at, events._timestamp) >= {interval_start:DateTime64}
        AND COALESCE(events.inserted_at, events._timestamp) < {interval_end:DateTime64}
    WHERE
        team_id = {team_id:Int64}
        AND events.timestamp >= {interval_start:DateTime64} - INTERVAL {lookback_days:Int32} DAY
        AND events.timestamp < {interval_end:DateTime64} + INTERVAL 1 DAY
        AND (length({include_events:Array(String)}) = 0 OR event IN {include_events:Array(String)})
        AND (length({exclude_events:Array(String)}) = 0 OR event NOT IN {exclude_events:Array(String)})
    ORDER BY
        _inserted_at, event
    SETTINGS optimize_aggregation_in_order=1
)

CREATE OR REPLACE VIEW events_batch_export_unbounded ON CLUSTER posthog AS (
    SELECT DISTINCT ON (team_id, event, cityHash64(events.distinct_id), cityHash64(events.uuid))
        team_id AS team_id,
        timestamp AS timestamp,
        event AS event,
        distinct_id AS distinct_id,
        toString(uuid) AS uuid,
        COALESCE(inserted_at, _timestamp) AS _inserted_at,
        created_at AS created_at,
        elements_chain AS elements_chain,
        toString(person_id) AS person_id,
        nullIf(properties, '') AS properties,
        nullIf(person_properties, '') AS person_properties,
        nullIf(JSONExtractString(properties, '$set'), '') AS set,
        nullIf(JSONExtractString(properties, '$set_once'), '') AS set_once
    FROM
        events
    PREWHERE
        COALESCE(events.inserted_at, events._timestamp) >= {interval_start:DateTime64}
        AND COALESCE(events.inserted_at, events._timestamp) < {interval_end:DateTime64}
    WHERE
        team_id = {team_id:Int64}
        AND (length({include_events:Array(String)}) = 0 OR event IN {include_events:Array(String)})
        AND (length({exclude_events:Array(String)}) = 0 OR event NOT IN {exclude_events:Array(String)})
    ORDER BY
        _inserted_at, event
    SETTINGS optimize_aggregation_in_order=1
)

CREATE OR REPLACE VIEW events_batch_export_unbounded ON CLUSTER posthog AS (
    SELECT DISTINCT ON (team_id, event, cityHash64(events.distinct_id), cityHash64(events.uuid))
        team_id AS team_id,
        timestamp AS timestamp,
        event AS event,
        distinct_id AS distinct_id,
        toString(uuid) AS uuid,
        COALESCE(inserted_at, _timestamp) AS _inserted_at,
        created_at AS created_at,
        elements_chain AS elements_chain,
        toString(person_id) AS person_id,
        nullIf(properties, '') AS properties,
        nullIf(person_properties, '') AS person_properties,
        nullIf(JSONExtractString(properties, '$set'), '') AS set,
        nullIf(JSONExtractString(properties, '$set_once'), '') AS set_once
    FROM
        events
    PREWHERE
        COALESCE(events.inserted_at, events._timestamp) >= {

…truncated. See the full SQL in the workflow logs.

MattBro added the skip-inkeep-docs label Jun 2, 2026

MattBro requested review from a team, fercgomes and rafaeelaudibert and removed request for a team June 2, 2026 17:42

assign-reviewers-posthog Bot assigned MattBro Jun 2, 2026

chore: ruff format the ceiling-rejection test

be47f99

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

MattBro marked this pull request as ready for review June 2, 2026 18:10

greptile-apps Bot reviewed Jun 2, 2026

MattBro mentioned this pull request Jun 2, 2026

feat: enrich oauth login failure telemetry for diagnosis PostHog/wizard#501

Merged

fercgomes approved these changes Jun 3, 2026

MattBro wants to merge 3 commits into master from matt/oauth-ceiling-reject-log

MattBro commented Jun 2, 2026 (edited)

greptile-apps Bot commented Jun 2, 2026

github-actions Bot commented Jun 2, 2026

greptile-apps Bot Jun 2, 2026

fercgomes Jun 3, 2026

MattBro Jun 3, 2026

fercgomes Jun 3, 2026

2 participants

github-actions Bot commented Jun 3, 2026

ClickHouse migration SQL per cloud environment
