Unify analytics tables into a single 'events' table #526
Conversation
Caution: Review failed. The pull request is closed.
📝 Walkthrough

Consolidates legacy ClickHouse tables into a unified `events` table.

Changes: Unified events migration & query surface
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant Mapper as Importer/Mapper
    participant Processor as DataImportProcessor
    participant Transformer as eventTransformer
    participant ClickHouse as ClickHouse (events)
    participant Analytics as AnalyticsService
    Mapper->>Processor: yield { type, data }
    Processor->>Transformer: transform({ type, data })
    Transformer-->>Processor: transformedRecord (event_name / error_* / meta)
    Processor->>ClickHouse: INSERT JSONEachRow INTO events
    ClickHouse-->>Processor: insert ack
    Analytics->>ClickHouse: SELECT ... FROM events WHERE type=...
    ClickHouse-->>Analytics: rows
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~65 minutes

Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
backend/apps/community/src/project/project.controller.ts (1)
1017-1028: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Incomplete cleanup leaves orphaned data after project deletion.
The migration drops all legacy tables (`analytics`, `customEV`, `errors`, `performance`, `captcha`) after copying their data into the unified `events` table with type values `'pageview'`, `'custom_event'`, `'error'`, `'performance'`, and `'captcha'`. However, the delete endpoint only removes `pageview` and `custom_event`, leaving `error`, `performance`, and `captcha` events orphaned.

Additionally, `error_statuses` cleanup is missing here but is present in `project.service.ts:removeDataFromClickhouse`.

Proposed fix to delete all event types and error_statuses:
```diff
   try {
     await deleteProjectSharesByProjectClickhouse(id)
     await clickhouse.command({
-      query: `ALTER TABLE events DELETE WHERE pid={pid:FixedString(12)} AND type IN ('pageview', 'custom_event')`,
+      query: `ALTER TABLE events DELETE WHERE pid={pid:FixedString(12)} AND type IN ('pageview', 'custom_event', 'error', 'performance', 'captcha')`,
       query_params: { pid: id },
     })
+    await clickhouse.command({
+      query: 'ALTER TABLE error_statuses DELETE WHERE pid = {pid:FixedString(12)}',
+      query_params: { pid: id },
+    })
     await deleteProjectRedis(id)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/community/src/project/project.controller.ts` around lines 1017-1028: the delete handler currently only removes 'pageview' and 'custom_event' from ClickHouse and misses cleaning up other migrated event types and error statuses; update the block that calls deleteProjectSharesByProjectClickhouse / clickhouse.command / deleteProjectRedis to also delete events with types 'error', 'performance', and 'captcha' (i.e. expand the ALTER TABLE ... DELETE WHERE type IN (...) to include 'error', 'performance', 'captcha') and add deletion of error status rows (match the logic in project.service.ts:removeDataFromClickhouse — delete from error_statuses where pid equals the project id) so all migrated legacy data is removed.

backend/apps/community/src/data-import/mappers/fathom.mapper.ts (1)
44-46: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Inconsistent error handling between cloud and community mappers.
The community `FathomMapper` throws generic `Error` (lines 44, 57, 217), while the cloud version throws `ImportError`. The cloud processor at `data-import.processor.ts` lines 202-205 checks `error instanceof ImportError` to provide user-friendly messages. With generic `Error`, users will see the generic "unexpected error" message instead of the descriptive CSV validation errors.

🔧 Proposed fix:
```diff
-import { ImportMapper, AnalyticsImportRow } from './mapper.interface'
+import { ImportMapper, ImportError, AnalyticsImportRow } from './mapper.interface'
```

Then update the throws:
```diff
-      throw new Error(
+      throw new ImportError(
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/community/src/data-import/mappers/fathom.mapper.ts` around lines 44-46: the FathomMapper currently throws generic Error instances (in FathomMapper where CSV validation and other checks throw at the three spots) which prevents the cloud processor from recognizing them; replace those throws with ImportError so the processor can render friendly messages, i.e., update each throw new Error(...) in the FathomMapper class to throw new ImportError(...) with the same message, and add/import the ImportError symbol at the top of the module (ensure the three locations in FathomMapper that currently throw Error are changed).

backend/apps/cloud/src/task-manager/task-manager.service.ts (1)
403-405: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep report goal matching case-insensitive like the dashboard.
The goal controllers use `ILIKE` for contains matches, but this helper now uses `LIKE`. That means emailed report conversions can undercount compared with `/goal/:id/stats` whenever the stored `pg` or `event_name` casing differs from the goal value.

Suggested fix:
```diff
   if (goal.matchType === GoalMatchType.CONTAINS) {
     params[paramKey] = `%${goalValue}%`
-    return { condition: `${column} LIKE {${paramKey}:String}`, params }
+    return { condition: `${column} ILIKE {${paramKey}:String}`, params }
   }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/cloud/src/task-manager/task-manager.service.ts` around lines 403-405: replace the SQL operator for contains matches to be case-insensitive: in the block that handles GoalMatchType.CONTAINS (where params[paramKey] = `%${goalValue}%` and you return { condition: `${column} LIKE {${paramKey}:String}`, params }), change `LIKE` to `ILIKE` so the condition becomes `${column} ILIKE {${paramKey}:String}`; leave column, paramKey and the param value construction unchanged.

backend/apps/cloud/src/experiment/experiment.controller.ts (1)
943-952: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard blank goal values before building the `ILIKE` predicate.

Line 946 and Line 1185 coerce a missing goal value to `''`, so a contains-match goal becomes `ILIKE '%%'` and every matching event type after exposure is counted as a conversion. That makes both the results table and the chart wrong for experiments linked to an empty-value goal.

Also applies to: 1182-1195
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/cloud/src/experiment/experiment.controller.ts` around lines 943-952: guard against empty goalValue before building the ILIKE predicate: where code sets eventType, matchColumn, goalValue and builds matchCondition based on experiment.goal.matchType, add a check that if goalValue is falsy/empty and matchType is a contains/ILIKE branch, set matchCondition to a false predicate (e.g., "1=0" or "FALSE") instead of using ILIKE concat('%', {goalValue}, '%'); apply the same guard in both places that build matchCondition (the block using eventType/matchColumn/goalValue and the duplicated block later) so empty goals don't become ILIKE '%%' and count all events.

backend/apps/cloud/src/analytics/analytics.service.ts (1)
6026-6034: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Scope the error-detail params query by `pid`.

Line 6030 builds the params subquery without a project filter, so the error detail page can aggregate parameter counts from other projects that happen to share the same `eid`.

Suggested fix:
```diff
-  `FROM events WHERE type = 'error' AND eid = {eid:FixedString(32)} AND created BETWEEN {groupFrom:String} AND {groupTo:String}`,
+  `FROM events WHERE pid = {pid:FixedString(12)} AND type = 'error' AND eid = {eid:FixedString(32)} AND created BETWEEN {groupFrom:String} AND {groupTo:String}`,
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/cloud/src/analytics/analytics.service.ts` around lines 6026 - 6034, The params subquery passed into groupErrorsByTimeBucket is missing a project filter, causing parameter counts to include events from other projects; update the query and where fragment used in the call inside analytics.service.ts (the groupErrorsByTimeBucket(...) invocation) to include "AND pid = {pid:FixedString(32)}" (e.g., change the main SQL string and the additional where clause argument from 'AND eid = {eid:FixedString(32)}' to 'AND pid = {pid:FixedString(32)} AND eid = {eid:FixedString(32)}'), and ensure the paramsData object passed into the same call contains the pid binding so the subquery is scoped to the correct project.
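As a quick way to sanity-check that every placeholder in such a query is actually bound, a small helper (hypothetical, not part of the codebase) could extract the parameter names from the ClickHouse `{name:Type}` syntax:

```typescript
// Hypothetical helper: list the named placeholders a ClickHouse query expects,
// so callers can verify the pid binding (and others) are actually supplied.
function placeholderNames(query: string): string[] {
  const names: string[] = []
  const re = /\{(\w+):[^}]+\}/g
  let match: RegExpExecArray | null
  while ((match = re.exec(query)) !== null) {
    names.push(match[1])
  }
  return names
}

// The corrected error-detail subquery from the suggestion above:
const subQuery =
  "FROM events WHERE pid = {pid:FixedString(12)} AND type = 'error' " +
  'AND eid = {eid:FixedString(32)} AND created BETWEEN {groupFrom:String} AND {groupTo:String}'

console.log(placeholderNames(subQuery)) // ['pid', 'eid', 'groupFrom', 'groupTo']
```

A check like this makes the missing-`pid` case fail loudly at call time instead of silently aggregating across tenants.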
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/apps/cloud/src/analytics/utils/transformers.ts`:
- Around line 179-180: The code currently sets "lineno: opts.lineno || null" and
"colno: opts.colno || null" which will convert valid 0 values to null; change
both uses to the nullish coalescing operator (opts.lineno ?? null and opts.colno
?? null) so 0 is preserved while still defaulting to null when undefined or
null; update the object properties "lineno" and "colno" where they are assigned
from opts to use ?? instead of ||.
In `@backend/apps/cloud/src/goal/goal.controller.ts`:
- Around line 335-362: In buildGoalMatchCondition, if goal.matchType is a
"contains" (i.e., not GoalMatchType.EXACT) and the goal.value is empty or only
whitespace, short-circuit and return a condition that never matches (e.g.,
`1=0`) instead of constructing an ILIKE '%…%' that becomes '%%'; specifically
update the branches handling GoalType.CUSTOM_EVENT (event_name ILIKE ...) and
the default page view branch (pg ILIKE ...) to check for empty/whitespace
goal.value before setting params.goalValue and, when empty, return { condition:
'1=0', params: {} } so queries don't count every row.
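A minimal sketch of that guard, with simplified stand-ins for the real `Goal` and enum types (the actual controller uses `GoalType`/`GoalMatchType` enums and named param keys):

```typescript
// Simplified stand-ins for the real GoalType / GoalMatchType enums.
type Goal = {
  type: 'CUSTOM_EVENT' | 'PAGE_VIEW'
  matchType: 'EXACT' | 'CONTAINS'
  value: string
}

function buildGoalMatchCondition(goal: Goal): {
  condition: string
  params: Record<string, string>
} {
  const column = goal.type === 'CUSTOM_EVENT' ? 'event_name' : 'pg'

  if (goal.matchType === 'EXACT') {
    return { condition: `${column} = {goalValue:String}`, params: { goalValue: goal.value } }
  }

  // CONTAINS: an empty or whitespace-only value would become ILIKE '%%',
  // which matches every row, so return a predicate that never matches.
  if (goal.value.trim() === '') {
    return { condition: '1=0', params: {} }
  }

  return {
    condition: `${column} ILIKE {goalValue:String}`,
    params: { goalValue: `%${goal.value}%` },
  }
}
```

Note that the guard lives only in the CONTAINS branch, so exact-match goals with an empty value keep their current behavior.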
In `@backend/apps/cloud/src/project/project.service.ts`:
- Around line 682-685: The unified delete predicate in the queries array
currently omits captcha events; update the ALTER TABLE events DELETE WHERE ...
type IN (...) clause (the string in the queries variable) to include 'captcha'
alongside 'pageview', 'custom_event', 'performance', and 'error' so captcha rows
are deleted in the same date-range cleanup.
In `@backend/apps/community/src/analytics/analytics.service.ts`:
- Around line 668-676: The PR currently maps legacy error column names only
inside getErrorsFilters; update getFiltersQuery and generateParamsQuery to reuse
a single mapping helper so public ERROR_COLUMNS like "name", "message", and
"filename" are translated to their real DB columns ("error_name",
"error_message", "error_filename") before being injected into SQL. Add a small
shared function (e.g., mapErrorColumn(type: string) or a constant errorColMap
exported locally) and call it from getErrorsFilters, getFiltersQuery, and
generateParamsQuery (and anywhere else that uses ERROR_COLUMNS) so every SQL
builder uses the mapped sqlCol instead of the public key.
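The shared helper could be as small as a lookup table; the names below mirror the review's suggestion and should be treated as a sketch, not the final API:

```typescript
// Map public error filter keys to the unified events-table columns.
const ERROR_COLUMN_MAP: Record<string, string> = {
  name: 'error_name',
  message: 'error_message',
  filename: 'error_filename',
}

// Shared by getErrorsFilters, getFiltersQuery and generateParamsQuery so every
// SQL builder resolves the same physical column for a public key.
function mapErrorColumn(column: string): string {
  return ERROR_COLUMN_MAP[column] ?? column
}
```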
In `@backend/apps/community/src/goal/goal.controller.ts`:
- Around line 255-282: In buildGoalMatchCondition, the "contains" branches
produce ILIKE '%%' when goal.value is empty, matching everything; add a guard at
the top of the function (or inside the non-EXACT branches) to detect empty
goal.value and return a non-matching predicate (e.g., condition: 'FALSE' or
'1=0' with params {}) instead of constructing the ILIKE concat; update both the
CUSTOM_EVENT contains branch and the default contains branch (references:
buildGoalMatchCondition, GoalType.CUSTOM_EVENT, GoalMatchType.EXACT) so empty
values do not become match-all queries.
In `@backend/migrations/clickhouse/2026_05_01_unify_events.js`:
- Around line 4-55: The bootstrap DB initializer defines an out-of-date events
table schema; update the events CREATE TABLE statement in initialise_database.js
to match the new canonical schema in this migration (columns like type, pid,
psid, profileId, meta.key/meta.value arrays, importID, event_name, eid,
error_name/error_message/stackTrace/lineno/colno/error_filename, timing fields
like dns/tls/conn/response/render/domLoad/pageLoad/ttfb and created DateTime
with the same ENGINE/PARTITION/ORDER BY). Replace the old column list and codecs
with the exact column names/types/NULLability and ENGINE/PARTITION/ORDER BY used
in this migration so fresh environments create an events table identical to the
source-of-truth created by 2026_05_01_unify_events.js.
- Around line 57-91: The migration is unsafe because plain INSERT ... SELECT
plus unconditional DROP can duplicate data on retry; instead perform an atomic
swap using a temporary events table: create ${dbName}.events_tmp with the same
schema as events (or use CREATE TABLE ... AS events), run the INSERT ... SELECT
statements into events_tmp (using the same SELECTs from analytics, customEV,
errors, performance, captcha), verify row counts (or ensure dedup by
importID/created if needed), then perform an atomic RENAME TABLE events TO
events_backup, events_tmp TO events (using ClickHouse RENAME TABLE to swap), and
only after successful swap drop the legacy tables (analytics, customEV, errors,
performance, captcha) and optionally drop events_backup; update the migration
code to use these table names (events_tmp, events_backup) and to not DROP legacy
tables until the swap completes.
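The retry-safe ordering described above can be sketched as a statement builder; the table names `events_tmp`/`events_backup` come from the review, and the real `INSERT ... SELECT` column mappings are elided here:

```typescript
// Build the unified-events migration statements in a retry-safe order:
// fill a temp table, swap atomically via RENAME, and only then drop legacy tables.
function buildUnifyStatements(dbName: string): string[] {
  const legacyTables = ['analytics', 'customEV', 'errors', 'performance', 'captcha']
  return [
    `CREATE TABLE ${dbName}.events_tmp AS ${dbName}.events`,
    // In the real migration each SELECT maps a legacy schema onto the unified columns.
    ...legacyTables.map(t => `INSERT INTO ${dbName}.events_tmp SELECT * FROM ${dbName}.${t}`),
    `RENAME TABLE ${dbName}.events TO ${dbName}.events_backup, ${dbName}.events_tmp TO ${dbName}.events`,
    ...legacyTables.map(t => `DROP TABLE IF EXISTS ${dbName}.${t}`),
  ]
}
```

Because the DROPs come strictly after the RENAME, a failure or retry at any earlier step leaves the original `events` table and the legacy tables intact.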
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 0d0b85bf-705e-434a-9922-65088a5195e5
📒 Files selected for processing (38)

- backend/apps/cloud/src/ai/ai.service.ts
- backend/apps/cloud/src/analytics/analytics.controller.ts
- backend/apps/cloud/src/analytics/analytics.service.ts
- backend/apps/cloud/src/analytics/utils/transformers.ts
- backend/apps/cloud/src/captcha/captcha.service.ts
- backend/apps/cloud/src/captcha/utils/transformers.ts
- backend/apps/cloud/src/data-import/data-import.processor.ts
- backend/apps/cloud/src/data-import/data-import.service.ts
- backend/apps/cloud/src/data-import/mappers/fathom.mapper.ts
- backend/apps/cloud/src/data-import/mappers/ga4.mapper.ts
- backend/apps/cloud/src/data-import/mappers/mapper.interface.ts
- backend/apps/cloud/src/data-import/mappers/plausible.mapper.ts
- backend/apps/cloud/src/data-import/mappers/simple-analytics.mapper.ts
- backend/apps/cloud/src/data-import/mappers/umami.mapper.ts
- backend/apps/cloud/src/experiment/experiment.controller.ts
- backend/apps/cloud/src/goal/goal.controller.ts
- backend/apps/cloud/src/project/project.controller.ts
- backend/apps/cloud/src/project/project.service.ts
- backend/apps/cloud/src/task-manager/task-manager.service.ts
- backend/apps/community/src/analytics/analytics.controller.ts
- backend/apps/community/src/analytics/analytics.service.ts
- backend/apps/community/src/analytics/utils/transformers.ts
- backend/apps/community/src/captcha/captcha.service.ts
- backend/apps/community/src/captcha/utils/transformers.ts
- backend/apps/community/src/data-import/data-import.processor.ts
- backend/apps/community/src/data-import/data-import.service.ts
- backend/apps/community/src/data-import/mappers/fathom.mapper.ts
- backend/apps/community/src/data-import/mappers/mapper.interface.ts
- backend/apps/community/src/data-import/mappers/plausible.mapper.ts
- backend/apps/community/src/data-import/mappers/simple-analytics.mapper.ts
- backend/apps/community/src/data-import/mappers/umami.mapper.ts
- backend/apps/community/src/goal/goal.controller.ts
- backend/apps/community/src/project/project.controller.ts
- backend/apps/community/src/project/project.service.ts
- backend/apps/community/src/user/user.controller.ts
- backend/meta/clickhouse/generate-dummy-data.js
- backend/migrations/clickhouse/2026_05_01_unify_events.js
- backend/migrations/clickhouse/initialise_database.js
💤 Files with no reviewable changes (2)
- backend/apps/cloud/src/captcha/utils/transformers.ts
- backend/apps/community/src/captcha/utils/transformers.ts
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
backend/apps/cloud/src/analytics/analytics.service.ts (1)
1259-1268: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Apply the legacy error-field remap in the main query builders too.

This remap fixes only `getErrorsFilters()`. The rest of the error query path still uses public names like `name`, `message`, and `filename` directly against `events` via `getFiltersQuery()` / `generateParams()`, but the unified table only exposes `error_name`, `error_message`, and `error_filename`. Any error filter or params request on those fields will still break after this migration.

Also applies to: 2883-2891
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/cloud/src/analytics/analytics.service.ts` around lines 1259-1268: the main query builders still reference public error field names (name, message, filename) but the unified table exposes legacy columns (error_name, error_message, error_filename); update the mapping logic in the query builders used by getFiltersQuery() and generateParams() (and the same branch pattern as the ev/ev:value handling where sqlColumn is assigned) to remap when column === 'name'|'message'|'filename' (and tag variants if applicable) to set sqlColumn to 'error_name'|'error_message'|'error_filename' so error filters and params use the correct unified-table columns (apply the same change pattern used in getErrorsFilters()).

backend/apps/community/src/analytics/analytics.service.ts (1)
4929-4934: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Restore the project filter in the error-details params subquery.

`groupErrorsByTimeBucket()` uses this `subQuery` to build the params payload. Line 4933 no longer constrains `pid`, so the params lists for a given `eid` can aggregate rows from other projects that hit the same error signature. That breaks tenant isolation and can leak unrelated error metadata into this response.

🔧 Minimal fix:
```diff
   const groupedChart = await this.groupErrorsByTimeBucket(
     timeBucket,
     groupFrom,
     groupTo,
-    `FROM events WHERE type = 'error' AND eid = {eid:FixedString(32)} AND created BETWEEN {groupFrom:String} AND {groupTo:String}`,
+    `FROM events WHERE pid = {pid:FixedString(12)} AND type = 'error' AND eid = {eid:FixedString(32)} AND created BETWEEN {groupFrom:String} AND {groupTo:String}`,
     'AND eid = {eid:FixedString(32)}',
     paramsData,
     safeTimezone,
     ChartRenderMode.PERIODICAL,
   )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/community/src/analytics/analytics.service.ts` around lines 4929 - 4934, The subquery passed to groupErrorsByTimeBucket is missing the project filter which allows rows from other projects to be included; update the call that assigns groupedChart (the groupErrorsByTimeBucket(...) invocation) to add the project constraint "AND pid = {pid:FixedString(32)}" into the subQuery string (the `FROM events WHERE ...` template) and also include the same "AND pid = {pid:FixedString(32)}" in the additional filter argument currently set to 'AND eid = {eid:FixedString(32)}' so the params payload includes pid and tenant isolation is preserved.
🧹 Nitpick comments (1)
backend/apps/cloud/src/project/project.service.ts (1)
899-915: ⚡ Quick win

Parameterize month bounds here for consistency and safety.
Lines 907-908 interpolate `monthStart`/`monthEnd` directly, while nearby queries use `query_params`. Switching to parameters keeps query construction uniform and avoids brittle string embedding.

Proposed diff:
```diff
-    const params = { pids: pidChunk }
+    const params = { pids: pidChunk, monthStart, monthEnd }
     const query = `
       SELECT
         countIf(type = 'pageview') AS traffic,
         countIf(type = 'custom_event') AS customEvents,
         countIf(type = 'captcha') AS captcha,
         countIf(type = 'error') AS errors
       FROM events
       WHERE pid IN ({pids:Array(FixedString(12))})
-        AND created BETWEEN '${monthStart}' AND '${monthEnd}'
+        AND created BETWEEN {monthStart:String} AND {monthEnd:String}
         AND type IN ('pageview', 'custom_event', 'captcha', 'error')
     `
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/cloud/src/project/project.service.ts` around lines 899 - 915, The SQL string in the query variable embeds monthStart/monthEnd directly; change it to use parameter placeholders and pass monthStart and monthEnd through the existing params object so the ClickHouse call remains uniform and safe. Update the query template in project.service.ts (the query constant) to reference parameter placeholders for the created BETWEEN bounds, and add monthStart and monthEnd to the params passed into clickhouse.query({ query, query_params: params }). Ensure the placeholder types match ClickHouse parameter types used elsewhere so clickhouse.query consumes them correctly.
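The parameterized shape can be sketched as follows; the `query_params` object shape is assumed from the `clickhouse.query({ query, query_params })` calls quoted elsewhere in this review, and the bound values are illustrative:

```typescript
// Sketch: month bounds travel as bound parameters rather than string interpolation.
const monthStart = '2026-05-01 00:00:00'
const monthEnd = '2026-05-31 23:59:59'
const pidChunk = ['abcdefghijkl']

const query_params = { pids: pidChunk, monthStart, monthEnd }
const query = `
  SELECT countIf(type = 'pageview') AS traffic
  FROM events
  WHERE pid IN ({pids:Array(FixedString(12))})
    AND created BETWEEN {monthStart:String} AND {monthEnd:String}
`
// await clickhouse.query({ query, query_params })
```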
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/apps/cloud/src/analytics/analytics.service.ts`:
- Around line 6271-6278: The current SQL joins "errors" to every "pageview" row
using aliases errors and analytics causing count(*) to multiply errorCount; fix
by aggregating the analytics/pageview side to one row per (pid, psid) before the
LEFT JOIN (e.g., replace the direct events AS analytics join with a subquery or
CTE that SELECTs pid, psid, COUNT(*) AS pageviews or a single marker WHERE
type='pageview' GROUP BY pid, psid) so the join does not duplicate errors, or
alternatively remove pageview rows from the join and compute error counts solely
from the errors side (keep the WHERE filters errors.pid, errors.type='error',
errors.eid as-is). Ensure references to errors.psid, errors.pid and
analytics.type='pageview' are adjusted to use the aggregated alias and use the
aggregated column (pageviews) instead of relying on count(*) after the join.
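The multiplication effect is easy to reproduce with plain arrays (a toy model of the join, not ClickHouse):

```typescript
// Two error rows in session s1, three pageview rows in the same session.
const errors = [
  { psid: 's1', eid: 'e1' },
  { psid: 's1', eid: 'e1' },
]
const pageviews = [{ psid: 's1' }, { psid: 's1' }, { psid: 's1' }]

// Naive join: each error row pairs with every pageview row, so count(*) = 2 * 3.
const naiveCount = errors.flatMap(e => pageviews.filter(p => p.psid === e.psid)).length

// Aggregate-first: collapse the pageview side to one row per session before joining.
const sessionsWithPageviews = new Set(pageviews.map(p => p.psid))
const errorCount = errors.filter(e => sessionsWithPageviews.has(e.psid)).length

console.log(naiveCount, errorCount) // 6 2
```

Pre-aggregating the pageview side to one row per `(pid, psid)` is the array-free SQL equivalent of building the `Set` above.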
In `@backend/apps/cloud/src/task-manager/task-manager.service.ts`:
- Around line 1999-2001: The inactivity probe currently ignores rows with type =
'error'; update the three SQL query strings (queryAnalytics, queryCaptcha,
queryCustomEvents) so their WHERE type condition includes 'error' as well (e.g.,
replace "type = 'pageview'" with "type IN ('pageview','error')", similarly for
'captcha' and 'custom_event') to ensure error activity is counted by the probe.
In `@backend/apps/community/src/analytics/analytics.service.ts`:
- Around line 4965-4971: queryTotalSessions is counting DISTINCT psid without
applying filtersQuery, causing the errorRate numerator (from queryErrorStats
which uses filtersQuery) and denominator to be from different scopes; update
queryTotalSessions to apply the same filtersQuery or derive the total session
set from the same filtered session population (e.g., include filtersQuery in the
WHERE clause or compute totalSessions by selecting DISTINCT psid from a subquery
that applies filtersQuery/joins used by queryErrorStats) and repeat the same fix
for the other total-session queries referenced near queryErrorStats so numerator
and denominator use the same filter scope.
- Around line 4397-4406: The session details query (querySessionDetails) only
selects rows with type = 'pageview' (and elsewhere falls back to
'custom_event'), so sessions that contain only 'error' events return empty
details; update the WHERE clause used by querySessionDetails (and the analogous
clause around lines 4487-4499) to include error events as well (e.g., allow type
IN ('pageview','error','custom_event') or add OR type = 'error') so the shared
device/location fields from error rows are returned for error-only sessions;
ensure you update any function or variable that builds/uses that SQL
(querySessionDetails, queryPages) to use the new filter.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 18858ba9-a16a-4b65-a7f1-577271c39c62
📒 Files selected for processing (11)

- backend/apps/cloud/src/analytics/analytics.service.ts
- backend/apps/cloud/src/analytics/utils/transformers.ts
- backend/apps/cloud/src/experiment/experiment.controller.ts
- backend/apps/cloud/src/goal/goal.controller.ts
- backend/apps/cloud/src/project/project.service.ts
- backend/apps/cloud/src/task-manager/task-manager.service.ts
- backend/apps/community/src/analytics/analytics.service.ts
- backend/apps/community/src/data-import/mappers/fathom.mapper.ts
- backend/apps/community/src/goal/goal.controller.ts
- backend/apps/community/src/project/project.controller.ts
- backend/migrations/clickhouse/2026_05_01_unify_events.js
✅ Files skipped from review due to trivial changes (1)
- backend/apps/cloud/src/analytics/utils/transformers.ts
🚧 Files skipped from review as they are similar to previous changes (5)
- backend/apps/community/src/project/project.controller.ts
- backend/migrations/clickhouse/2026_05_01_unify_events.js
- backend/apps/community/src/data-import/mappers/fathom.mapper.ts
- backend/apps/cloud/src/experiment/experiment.controller.ts
- backend/apps/cloud/src/goal/goal.controller.ts
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
backend/apps/cloud/src/task-manager/task-manager.service.ts (1)
390-405: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve `EXACT` empty-value matching here.

This early blank-value guard now diverges from `backend/apps/cloud/src/goal/goal.controller.ts:341-360`: exact custom-event goals with `value = ''` will never match in reports anymore, even though the main goal query path still allows them. Move the `1=0` shortcut into the `CONTAINS` branch so only the wildcard case is suppressed.

Suggested fix:
```diff
-  // If goal value is blank, never match anything (avoid LIKE '%%')
-  if (goalValue.trim() === '') {
-    return { condition: '1=0', params: {} }
-  }
-
   const params: Record<string, string> = {}
   const column = goal.type === GoalType.CUSTOM_EVENT ? 'event_name' : 'pg'

   if (goal.matchType === GoalMatchType.EXACT) {
     params[paramKey] = goalValue
     return { condition: `${column} = {${paramKey}:String}`, params }
   }

   if (goal.matchType === GoalMatchType.CONTAINS) {
+    // Avoid generating a wildcard that matches everything.
+    if (goalValue.trim() === '') {
+      return { condition: '1=0', params: {} }
+    }
     params[paramKey] = `%${goalValue}%`
     return { condition: `${column} ILIKE {${paramKey}:String}`, params }
   }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/cloud/src/task-manager/task-manager.service.ts` around lines 390 - 405, The current blank-value guard at the top (returning { condition: '1=0' } when goalValue.trim() === '') incorrectly prevents EXACT matches; instead, remove that early check and only suppress empty-values for wildcard matching by moving the blank-value guard into the CONTAINS branch: keep the EXISTING logic that builds params/column/paramKey and for GoalMatchType.CONTAINS, if goalValue.trim() === '' return { condition: '1=0', params: {} } else set params[paramKey] = `%${goalValue}%` and return the ILIKE condition; leave the EXACT branch (GoalMatchType.EXACT) unchanged so empty-string exact custom-event goals still match.backend/apps/community/src/analytics/analytics.service.ts (1)
5165-5184: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fall back to the error row for cc/br/os here.

analytics.type = 'pageview' can be missing for error-only sessions, so this list returns empty device/location fields even though unified error events already store them. Using COALESCE(any(analytics.cc), any(errors.cc)) (same for br/os) or aggregating directly from errors avoids dropping that context.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/community/src/analytics/analytics.service.ts` around lines 5165 - 5184, The SELECT pulls device/location fields from the joined alias analytics which can be NULL for error-only sessions; update the query in the querySessions SQL so cc, br and os fall back to the values on the errors row (e.g. use COALESCE(any(analytics.cc), any(errors.cc)) for cc, and analogously for br and os) or aggregate directly from errors when analytics.pageview is missing so error-only sessions retain device/location context.
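The fallback suggested above can be sketched as a tiny SELECT-fragment builder. This is a hedged illustration only: the aliases `analytics` and `errors` and the column names come from the review text, and `coalesceDeviceColumns` itself is a hypothetical helper, not code from the service.

```typescript
// Build SELECT fragments that prefer the pageview row (alias `analytics`)
// but fall back to the error row (alias `errors`) for each column.
// Helper name and aliases are illustrative assumptions.
const coalesceDeviceColumns = (cols: string[]): string =>
  cols
    .map(col => `COALESCE(any(analytics.${col}), any(errors.${col})) AS ${col}`)
    .join(', ')

// e.g. coalesceDeviceColumns(['cc', 'br', 'os']) yields one fragment per
// column, each preferring the pageview value over the error value.
```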
🧹 Nitpick comments (1)
backend/apps/cloud/src/task-manager/task-manager.service.ts (1)
1999-2075: ⚡ Quick win

Collapse the inactivity probe into one events scan.

Now that all activity lives in events, these three queries/loops are redundant. For inactive users this code scans the same PID chunk and date range up to three times, and error is counted in every pass anyway. A single type IN ('pageview', 'captcha', 'custom_event', 'error') query would keep the behavior and cut the ClickHouse work substantially.

Possible simplification

```diff
-    const queryAnalytics = `SELECT count() FROM events WHERE pid IN ({pids:Array(FixedString(12))}) AND type IN ('pageview', 'error') AND created BETWEEN {nineWeeksAgo:String} AND {now:String}`
-    const queryCaptcha = `SELECT count() FROM events WHERE pid IN ({pids:Array(FixedString(12))}) AND type IN ('captcha', 'error') AND created BETWEEN {nineWeeksAgo:String} AND {now:String}`
-    const queryCustomEvents = `SELECT count() FROM events WHERE pid IN ({pids:Array(FixedString(12))}) AND type IN ('custom_event', 'error') AND created BETWEEN {nineWeeksAgo:String} AND {now:String}`
+    const activityQuery = `SELECT count() FROM events WHERE pid IN ({pids:Array(FixedString(12))}) AND type IN ('pageview', 'captcha', 'custom_event', 'error') AND created BETWEEN {nineWeeksAgo:String} AND {now:String}`

-    let totalAnalytics = 0
-    let totalCaptcha = 0
-    let totalCustomEvents = 0
+    let totalActivity = 0

     for (let i = 0; i < pids.length; i += CHUNK_SIZE) {
       const pidChunk = pids.slice(i, i + CHUNK_SIZE)
       const queryParams = {
         pids: pidChunk,
         nineWeeksAgo,
         now,
       }

-      const { data: analyticsResult } = await clickhouse
+      const { data: activityResult } = await clickhouse
         .query({
-          query: queryAnalytics,
+          query: activityQuery,
           query_params: queryParams,
         })
         .then((resultSet) => resultSet.json<{ 'count()': number }>())

-      totalAnalytics += analyticsResult[0]['count()']
+      totalActivity += activityResult[0]['count()']

-      // Early return if we found activity
-      if (totalAnalytics > 0) {
-        return
-      }
-    }
-
-    for (let i = 0; i < pids.length; i += CHUNK_SIZE) {
-      ...
-    }
-
-    for (let i = 0; i < pids.length; i += CHUNK_SIZE) {
-      ...
-      if (totalCustomEvents > 0) {
+      if (totalActivity > 0) {
         return
       }
     }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/cloud/src/task-manager/task-manager.service.ts` around lines 1999 - 2075, Replace the three separate queries/loops (queryAnalytics, queryCaptcha, queryCustomEvents) that iterate pids in CHUNK_SIZE with a single events scan: build one query (e.g. queryEvents) using type IN ('pageview','captcha','custom_event','error') and the same {pids:Array(FixedString(12))}, {nineWeeksAgo:String}, {now:String} params, then loop over pids in CHUNK_SIZE once, call clickhouse.query(...).then(resultSet => resultSet.json<{ 'count()': number }>()), accumulate the single result (like totalEvents) and early-return as soon as totalEvents > 0; remove the redundant queryAnalytics/queryCaptcha/queryCustomEvents blocks and their totals so you only hit ClickHouse once per chunk.
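The single-scan idea can be reduced to a chunked probe with an early return. A minimal sketch, assuming an injected `runCount` callback in place of the actual ClickHouse client call; the query text mirrors the review comment and is not the service's real constant:

```typescript
// Hypothetical query text; placeholders use ClickHouse's {name:Type} syntax.
const ACTIVITY_QUERY =
  'SELECT count() FROM events WHERE pid IN ({pids:Array(FixedString(12))}) ' +
  "AND type IN ('pageview', 'captcha', 'custom_event', 'error') " +
  'AND created BETWEEN {nineWeeksAgo:String} AND {now:String}'

// `runCount` is a stand-in for the ClickHouse client call (an assumption).
async function hasRecentActivity(
  pids: string[],
  chunkSize: number,
  runCount: (sql: string, pids: string[]) => Promise<number>,
): Promise<boolean> {
  for (let i = 0; i < pids.length; i += chunkSize) {
    // One unified-events scan per chunk instead of three per-type queries.
    if ((await runCount(ACTIVITY_QUERY, pids.slice(i, i + chunkSize))) > 0) {
      return true // stop at the first chunk that shows activity
    }
  }
  return false
}
```

The early return preserves the existing short-circuit behavior while cutting the per-chunk work from three queries to one.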
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@backend/apps/cloud/src/task-manager/task-manager.service.ts`:
- Around line 390-405: The current blank-value guard at the top (returning {
condition: '1=0' } when goalValue.trim() === '') incorrectly prevents EXACT
matches; instead, remove that early check and only suppress empty-values for
wildcard matching by moving the blank-value guard into the CONTAINS branch: keep
the EXISTING logic that builds params/column/paramKey and for
GoalMatchType.CONTAINS, if goalValue.trim() === '' return { condition: '1=0',
params: {} } else set params[paramKey] = `%${goalValue}%` and return the ILIKE
condition; leave the EXACT branch (GoalMatchType.EXACT) unchanged so
empty-string exact custom-event goals still match.
In `@backend/apps/community/src/analytics/analytics.service.ts`:
- Around line 5165-5184: The SELECT pulls device/location fields from the joined
alias analytics which can be NULL for error-only sessions; update the query in
the querySessions SQL so cc, br and os fall back to the values on the errors row
(e.g. use COALESCE(any(analytics.cc), any(errors.cc)) for cc, and analogously
for br and os) or aggregate directly from errors when analytics.pageview is
missing so error-only sessions retain device/location context.
---
Nitpick comments:
In `@backend/apps/cloud/src/task-manager/task-manager.service.ts`:
- Around line 1999-2075: Replace the three separate queries/loops
(queryAnalytics, queryCaptcha, queryCustomEvents) that iterate pids in
CHUNK_SIZE with a single events scan: build one query (e.g. queryEvents) using
type IN ('pageview','captcha','custom_event','error') and the same
{pids:Array(FixedString(12))}, {nineWeeksAgo:String}, {now:String} params, then
loop over pids in CHUNK_SIZE once, call clickhouse.query(...).then(resultSet =>
resultSet.json<{ 'count()': number }>()), accumulate the single result (like
totalEvents) and early-return as soon as totalEvents > 0; remove the redundant
queryAnalytics/queryCaptcha/queryCustomEvents blocks and their totals so you
only hit ClickHouse once per chunk.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 12e983ac-69fc-4a33-b338-c9c4e9b496d9
📒 Files selected for processing (4)
- backend/apps/cloud/src/analytics/analytics.service.ts
- backend/apps/cloud/src/project/project.service.ts
- backend/apps/cloud/src/task-manager/task-manager.service.ts
- backend/apps/community/src/analytics/analytics.service.ts
✅ Files skipped from review due to trivial changes (2)
- backend/apps/cloud/src/project/project.service.ts
- backend/apps/cloud/src/analytics/analytics.service.ts
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
backend/apps/community/src/analytics/analytics.service.ts (3)
4788-4792: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Treat missing statuses as active in the error list.

The unresolved branch filters to 'active' and 'regressed' only. Fresh errors have no row in error_statuses, so they disappear from the list until someone manually changes their status. getErrorOverview() already treats NULL as active, so these two views will diverge.

🩹 Suggested fix

```diff
           ${
             parsedOptions?.showResolved
               ? ''
-              : "WHERE status.status = 'active' OR status.status = 'regressed'"
+              : "WHERE status.status = 'active' OR status.status = 'regressed' OR status.status IS NULL"
           }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/community/src/analytics/analytics.service.ts` around lines 4788 - 4792, The WHERE clause that filters unresolved errors currently only includes "status.status = 'active' OR status.status = 'regressed'" which excludes rows where status is NULL; update the clause used where parsedOptions?.showResolved is false so that missing statuses are treated as active (e.g., include "OR status.status IS NULL" or use COALESCE(status.status, 'active') to compare), aligning this behavior with getErrorOverview() and the parsedOptions?.showResolved branch.
3169-3203: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fix the cumulative error chart aliases.

generateErrorsAggregationQuery() now projects count and affectedUsers, but the cumulative branch still sums pageviews and uniques. Any caller that requests ChartRenderMode.CUMULATIVE will hit missing-column SQL here.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/community/src/analytics/analytics.service.ts` around lines 3169 - 3203, The cumulative branch of generateErrorsAggregationQuery still references pageviews and uniques which no longer exist; update the SELECT that wraps baseQuery (when mode === ChartRenderMode.CUMULATIVE) to sum the actual projected columns from baseQuery (sum(count) OVER (ORDER BY ${groupBy}) and sum(affectedUsers) OVER (ORDER BY ${groupBy})) and give them clear aliases (e.g., count_cumulative or keep count if you prefer) so the SQL uses the correct column names produced by baseQuery and aligns with callers expecting count and affectedUsers.
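The alias fix amounts to running-summing the columns the base query actually projects. A minimal sketch of such a wrapper; `baseQuery`, `groupBy`, and the helper name are placeholders, while the projected column names (count, affectedUsers) are the ones the review says the base query emits:

```typescript
// Wrap an aggregation subquery in cumulative window sums over its
// actual projected columns. Illustrative only; not the service's code.
const wrapCumulative = (baseQuery: string, groupBy: string): string => `
  SELECT
    ${groupBy},
    sum(count) OVER (ORDER BY ${groupBy}) AS count,
    sum(affectedUsers) OVER (ORDER BY ${groupBy}) AS affectedUsers
  FROM (${baseQuery})
`
```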
5699-5715: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Profile details should not require a pageview row.

This lookup still reads only type = 'pageview'. Profiles created entirely by custom events or errors will return an empty device/location block even though the unified events rows carry the same fields. Matching the broader session-details behavior here would avoid blank profile panels.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/community/src/analytics/analytics.service.ts` around lines 5699 - 5715, The query in queryDetails currently filters WHERE type = 'pageview', causing profiles without pageview rows to return empty device/location data; update the WHERE clause in the queryDetails SQL so it does not restrict to 'pageview' (e.g., remove the type = 'pageview' condition or replace it with a broader condition that matches the session-details query behavior such as allowing all types or an IN(...) set), keeping the pid = {pid:FixedString(12)} and profileId = {profileId:String} predicates intact so device/location fields are aggregated from any event type in the events table.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/apps/cloud/src/task-manager/task-manager.service.ts`:
- Line 1999: The SQL in queryEvents uses a date-only upper bound named now
(YYYY-MM-DD) which truncates to midnight and omits recent activity; update the
code that builds the query so the BETWEEN uses full timestamps (YYYY-MM-DD
HH:mm:ss) for both nineWeeksAgo and now, or change the SQL to compare the upper
bound to now() (e.g., use created BETWEEN {nineWeeksAgo:String} AND now()) to
include the current day's full time range; locate the queryEvents constant and
the variables that set nineWeeksAgo/now and adjust their formatting or the SQL
accordingly.
In `@backend/apps/community/src/analytics/analytics.service.ts`:
- Around line 4586-4610: The sessions seed query only selects rows with type IN
('pageview','custom_event') so sessions made solely of 'error' events are
omitted; update both occurrences where distinct_sessions_filtered is populated
(the customEVFilterApplied branch building primaryEventsSubquery and the
alternate branch around lines 4613-4628) to include 'error' in the type IN
clause (e.g., type IN ('pageview','custom_event','error') ) so error-only
sessions are included for session listing and downstream functions like
getSessionDetails will surface them.
- Around line 4965-4973: The pageview denominator currently uses
queryTotalSessions which applies the full filtersQuery (including error-only
predicates like name/message/filename) to type='pageview', collapsing counts;
change queryTotalSessions to count DISTINCT psid from a session-level subquery
that applies only traffic/session-scoped filters (exclude or strip error_*
predicates from filtersQuery) so sessions are retained even if their error rows
have error predicates; locate where filtersQuery is composed and either build a
new sessionFiltersQuery or filter out predicates referencing
name/message/filename before using it in queryTotalSessions (keep existing
events table and the DISTINCT psid logic but apply the session-scoped filter set
instead of the full filtersQuery).
---
Outside diff comments:
In `@backend/apps/community/src/analytics/analytics.service.ts`:
- Around line 4788-4792: The WHERE clause that filters unresolved errors
currently only includes "status.status = 'active' OR status.status =
'regressed'" which excludes rows where status is NULL; update the clause used
where parsedOptions?.showResolved is false so that missing statuses are treated
as active (e.g., include "OR status.status IS NULL" or use
COALESCE(status.status, 'active') to compare), aligning this behavior with
getErrorOverview() and the parsedOptions?.showResolved branch.
- Around line 3169-3203: The cumulative branch of generateErrorsAggregationQuery
still references pageviews and uniques which no longer exist; update the SELECT
that wraps baseQuery (when mode === ChartRenderMode.CUMULATIVE) to sum the
actual projected columns from baseQuery (sum(count) OVER (ORDER BY ${groupBy})
and sum(affectedUsers) OVER (ORDER BY ${groupBy})) and give them clear aliases
(e.g., count_cumulative or keep count if you prefer) so the SQL uses the correct
column names produced by baseQuery and aligns with callers expecting count and
affectedUsers.
- Around line 5699-5715: The query in queryDetails currently filters WHERE type
= 'pageview', causing profiles without pageview rows to return empty
device/location data; update the WHERE clause in the queryDetails SQL so it does
not restrict to 'pageview' (e.g., remove the type = 'pageview' condition or
replace it with a broader condition that matches the session-details query
behavior such as allowing all types or an IN(...) set), keeping the pid =
{pid:FixedString(12)} and profileId = {profileId:String} predicates intact so
device/location fields are aggregated from any event type in the events table.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 988ba2b3-c543-4d17-8a72-b290bca0fdc4
📒 Files selected for processing (3)
- backend/apps/cloud/src/analytics/analytics.controller.ts
- backend/apps/cloud/src/task-manager/task-manager.service.ts
- backend/apps/community/src/analytics/analytics.service.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/apps/cloud/src/analytics/analytics.controller.ts
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
backend/apps/cloud/src/analytics/analytics.service.ts (1)
2303-2324: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Filter duration_avg with the same event scope as analytics_counts.

In the period === 'all' branch, analytics_counts respects type and filtersQuery, but duration_avg currently averages every session in the project. That makes all-time birdseye sdur wrong whenever the request is filtered by page, country, referrer, or custom event.

Suggested fix

```diff
     duration_avg AS (
       SELECT avgOrNull(duration) as sdur
       FROM (
         SELECT
           psid,
           dateDiff('second', min(firstSeen), max(lastSeen)) as duration
         FROM sessions
         WHERE pid = {pid:FixedString(12)}
+          AND psid IN (
+            SELECT DISTINCT psid
+            FROM events
+            WHERE pid = {pid:FixedString(12)}
+              AND type = '${allTimeType}'
+              AND psid IS NOT NULL
+              ${filtersQuery}
+          )
         GROUP BY psid
       )
     )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/cloud/src/analytics/analytics.service.ts` around lines 2303 - 2324, The duration_avg subquery in queryAll is currently averaging all sessions for the project and ignores the same event scope used by analytics_counts; update the duration_avg logic in analytics.service.ts so it restricts sessions to the same scope by applying pid = {pid:FixedString(12)}, type = '${allTimeType}' and the ${filtersQuery} (or equivalently filter sessions by psid values from events that match that WHERE clause). Adjust the duration_avg CTE (and/or add a JOIN or IN-subquery against events) so sdur is computed only for sessions whose psid appears in events matching the analytics_counts filters.
backend/apps/cloud/src/analytics/analytics.controller.ts (1)
261-296: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Choose the all-time event type after parsing filters.

This branch always asks calculateTimeBucketForAllTime() for pageview. That breaks two cases introduced by the unified table: CAPTCHA requests (/captcha) should use captcha, and analytics requests with an ev/custom-event filter should use custom_event. Otherwise the computed all-time window can be too short and silently clip results.

Suggested fix

```diff
-  let newTimebucket = timeBucket
-  let allowedTumebucketForPeriodAll
-
-  let diff
-
-  if (period === 'all') {
-    const res = await this.analyticsService.calculateTimeBucketForAllTime(
-      pid,
-      'pageview',
-    )
+  const [filtersQuery, filtersParams, appliedFilters, customEVFilterApplied] =
+    this.analyticsService.getFiltersQuery(
+      filters,
+      isCaptcha ? DataType.CAPTCHA : DataType.ANALYTICS,
+    )
+
+  let newTimebucket = timeBucket
+  let allowedTumebucketForPeriodAll
+  let diff
+
+  if (period === 'all') {
+    const res = await this.analyticsService.calculateTimeBucketForAllTime(
+      pid,
+      isCaptcha
+        ? 'captcha'
+        : this.analyticsService.getAnalyticsEventType(customEVFilterApplied),
+    )
@@
-  const [filtersQuery, filtersParams, appliedFilters, customEVFilterApplied] =
-    this.analyticsService.getFiltersQuery(
-      filters,
-      isCaptcha ? DataType.CAPTCHA : DataType.ANALYTICS,
-    )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/cloud/src/analytics/analytics.controller.ts` around lines 261 - 296, The code calls calculateTimeBucketForAllTime(pid, 'pageview') before parsing filters, which forces pageview event type and can clip CAPTCHA or custom-event ranges; move the calculateTimeBucketForAllTime call to after getFiltersQuery(...) so you can read customEVFilterApplied and isCaptcha, then choose the event type dynamically: use 'captcha' when isCaptcha is true, 'custom_event' when customEVFilterApplied is true, otherwise 'pageview', and pass that eventType into calculateTimeBucketForAllTime (keep existing symbols: calculateTimeBucketForAllTime, getFiltersQuery, customEVFilterApplied, isCaptcha, and use the resulting diff/timeBucket as before).
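The event-type choice described above boils down to one ternary. A hedged sketch; the helper name `pickAllTimeEventType` is hypothetical, while the three type strings are the unified-table values named in the review:

```typescript
// Pick the events.type to use for the all-time bucket calculation.
// CAPTCHA wins outright; otherwise a custom-event filter selects
// 'custom_event', and plain traffic falls back to 'pageview'.
const pickAllTimeEventType = (
  isCaptcha: boolean,
  customEVFilterApplied: boolean,
): 'captcha' | 'custom_event' | 'pageview' =>
  isCaptcha ? 'captcha' : customEVFilterApplied ? 'custom_event' : 'pageview'
```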
♻️ Duplicate comments (1)
backend/apps/cloud/src/analytics/analytics.service.ts (1)
6294-6309: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep the pageview enrichment join inside the requested time window.

This aggregated join no longer multiplies errorCount, but it still pulls cc/br/os from all pageviews ever seen for the same pid/psid. Since psid is derived deterministically and can recur across sessions, any(cc/br/os) can come from outside the requested error window and show stale metadata here.

Suggested fix

```diff
       LEFT JOIN (
         SELECT
           pid,
           psid,
           any(cc) AS cc,
           any(br) AS br,
           any(os) AS os,
           count(*) AS pageviews
         FROM events
         WHERE pid = {pid:FixedString(12)}
           AND type = 'pageview'
           AND psid IS NOT NULL
+          AND created BETWEEN {groupFrom:String} AND {groupTo:String}
         GROUP BY pid, psid
       ) AS analytics
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/cloud/src/analytics/analytics.service.ts` around lines 6294 - 6309, The analytics LEFT JOIN currently aggregates any(cc)/any(br)/any(os) from all events for the pid/psid; restrict that aggregation to the same time window used for errors so the enriched cc/br/os come only from pageviews in the requested time range. Update the subquery in the analytics join (the SELECT from events grouped by pid, psid using any(cc)/any(br)/any(os)/count(*) ) to include the same timestamp filter as the outer errors query (e.g., the BETWEEN or >=/< conditions on the events time column) so the GROUP BY only considers pageviews within the requested window for pid and psid. Ensure the predicate uses the same parameters used by errors (the request window variables) and keep the join keys errors.psid = analytics.psid and errors.pid = analytics.pid.
🧹 Nitpick comments (1)
backend/apps/cloud/src/experiment/experiment.controller.ts (1)
942-973: ⚡ Quick win

Extract the goal-to-events SQL mapping into one helper.

eventType, matchColumn, blank-value handling, and metadata filter generation are now duplicated in both conversion paths. Keeping that in one private builder would reduce drift the next time the events schema changes.

Also applies to: 1183-1217
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/cloud/src/experiment/experiment.controller.ts` around lines 942 - 973, The logic that maps experiment.goal to SQL pieces (eventType, matchColumn, matchCondition/blank-value handling) and the metadata filter generation (metaCondition, metaParams built from metadataFilters) should be extracted into a single private helper (e.g., buildGoalEventConditions) that accepts the experiment.goal and returns an object with eventType, matchColumn, matchCondition, metaCondition, metaParams and goalValue; replace the duplicated blocks (the current code that computes eventType/matchColumn/matchCondition and the metadataFilters.forEach block) with calls to this helper in both places so both conversion paths reuse the same builder and avoid drift.
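One possible shape for the shared builder is sketched below. This is an assumption-laden illustration: `GoalLike`, `buildGoalEventConditions`, and the CONTAINS-style matching shown are hypothetical, while the column names (`event_name`, `pg`), the unified type values, and the blank-value guard follow the review discussion:

```typescript
// Hypothetical goal shape; the real entity has more fields.
interface GoalLike {
  type: 'CUSTOM_EVENT' | 'PAGE_VIEW'
  value: string | null
}

// Map a goal to the events-table pieces both conversion paths need.
function buildGoalEventConditions(goal: GoalLike, paramKey: string) {
  const eventType = goal.type === 'CUSTOM_EVENT' ? 'custom_event' : 'pageview'
  const matchColumn = goal.type === 'CUSTOM_EVENT' ? 'event_name' : 'pg'
  const value = (goal.value ?? '').toString()

  // A blank wildcard would match every row; suppress it up front.
  if (value.trim() === '') {
    return {
      eventType,
      matchCondition: '1=0',
      params: {} as Record<string, string>,
    }
  }

  return {
    eventType,
    matchCondition: `${matchColumn} ILIKE {${paramKey}:String}`,
    params: { [paramKey]: `%${value}%` },
  }
}
```

Both conversion paths would then call this one builder, so a schema change touches a single place.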
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/apps/cloud/src/analytics/analytics.service.ts`:
- Around line 2607-2640: The current customEVFilterApplied branch in
extractAnalyticsChartDataForScope incorrectly uses raw custom-event counts for
both visits and uniques and skips cumulative gap-filling; change it to call
this.extractChartData(result, xShifted) to obtain a baseline chartData (so
uniques and sdur come from extractChartData), then replace only chartData.visits
with the custom-event counts from
this.extractCustomEventsChartData(...)?._unknown_event (ensuring lengths match
by padding/truncating to xShifted length); keep chartData.uniques and
chartData.sdur intact and still run the ChartRenderMode.CUMULATIVE propagation
loop on chartData when mode === ChartRenderMode.CUMULATIVE before returning
chartData.
In `@backend/apps/cloud/src/experiment/experiment.controller.ts`:
- Around line 886-889: The call to
analyticsService.calculateTimeBucketForAllTime is hard-coded to 'pageview' but
must use the experiment's goal event type so the 'all' period selects the
correct bucket for custom-event experiments; in experiment.controller.ts replace
the literal 'pageview' passed to calculateTimeBucketForAllTime with the
experiment's goal event identifier (e.g., derive from experiment.goal or
experiment.goal.event/type used elsewhere in conversion queries) and fall back
to 'pageview' only if the goal event is absent, ensuring consistency with the
conversion query logic.
---
Outside diff comments:
In `@backend/apps/cloud/src/analytics/analytics.controller.ts`:
- Around line 261-296: The code calls calculateTimeBucketForAllTime(pid,
'pageview') before parsing filters, which forces pageview event type and can
clip CAPTCHA or custom-event ranges; move the calculateTimeBucketForAllTime call
to after getFiltersQuery(...) so you can read customEVFilterApplied and
isCaptcha, then choose the event type dynamically: use 'captcha' when isCaptcha
is true, 'custom_event' when customEVFilterApplied is true, otherwise
'pageview', and pass that eventType into calculateTimeBucketForAllTime (keep
existing symbols: calculateTimeBucketForAllTime, getFiltersQuery,
customEVFilterApplied, isCaptcha, and use the resulting diff/timeBucket as
before).
In `@backend/apps/cloud/src/analytics/analytics.service.ts`:
- Around line 2303-2324: The duration_avg subquery in queryAll is currently
averaging all sessions for the project and ignores the same event scope used by
analytics_counts; update the duration_avg logic in analytics.service.ts so it
restricts sessions to the same scope by applying pid = {pid:FixedString(12)},
type = '${allTimeType}' and the ${filtersQuery} (or equivalently filter sessions
by psid values from events that match that WHERE clause). Adjust the
duration_avg CTE (and/or add a JOIN or IN-subquery against events) so sdur is
computed only for sessions whose psid appears in events matching the
analytics_counts filters.
---
Duplicate comments:
In `@backend/apps/cloud/src/analytics/analytics.service.ts`:
- Around line 6294-6309: The analytics LEFT JOIN currently aggregates
any(cc)/any(br)/any(os) from all events for the pid/psid; restrict that
aggregation to the same time window used for errors so the enriched cc/br/os
come only from pageviews in the requested time range. Update the subquery in the
analytics join (the SELECT from events grouped by pid, psid using
any(cc)/any(br)/any(os)/count(*) ) to include the same timestamp filter as the
outer errors query (e.g., the BETWEEN or >=/< conditions on the events time
column) so the GROUP BY only considers pageviews within the requested window for
pid and psid. Ensure the predicate uses the same parameters used by errors (the
request window variables) and keep the join keys errors.psid = analytics.psid
and errors.pid = analytics.pid.
---
Nitpick comments:
In `@backend/apps/cloud/src/experiment/experiment.controller.ts`:
- Around line 942-973: The logic that maps experiment.goal to SQL pieces
(eventType, matchColumn, matchCondition/blank-value handling) and the metadata
filter generation (metaCondition, metaParams built from metadataFilters) should
be extracted into a single private helper (e.g., buildGoalEventConditions) that
accepts the experiment.goal and returns an object with eventType, matchColumn,
matchCondition, metaCondition, metaParams and goalValue; replace the duplicated
blocks (the current code that computes eventType/matchColumn/matchCondition and
the metadataFilters.forEach block) with calls to this helper in both places so
both conversion paths reuse the same builder and avoid drift.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6011634a-386b-4007-8e5f-3056f653a39b
📒 Files selected for processing (6)
- backend/apps/cloud/src/analytics/analytics.controller.ts
- backend/apps/cloud/src/analytics/analytics.service.ts
- backend/apps/cloud/src/experiment/experiment.controller.ts
- backend/apps/cloud/src/task-manager/task-manager.service.ts
- backend/apps/community/src/analytics/analytics.controller.ts
- backend/apps/community/src/analytics/analytics.service.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- backend/apps/community/src/analytics/analytics.controller.ts
- backend/apps/cloud/src/task-manager/task-manager.service.ts
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
backend/apps/community/src/analytics/analytics.service.ts (2)
5546-5588: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Seed profile aggregates from error rows too.

Both profile CTE builders still start from type IN ('pageview', 'custom_event'). After the unified-table migration, a profile or profile-session that only produced error rows disappears from getProfilesList()/getProfileSessionsList(), even though those responses already join error counts from events. Include error in the seed set so error-only profiles stay discoverable.

💡 Minimal fix

```diff
-            AND type IN ('pageview', 'custom_event')
+            AND type IN ('pageview', 'custom_event', 'error')
```

Apply the same change in both helper CTEs.
Also applies to: 5964-6001
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/community/src/analytics/analytics.service.ts` around lines 5546 - 5588, The seed CTE currently filters events with "type IN ('pageview', 'custom_event')" which omits profiles that only have 'error' rows; update buildProfilesListDataCTE (and the other analogous CTE builder around the 5964-6001 range) to include 'error' in the type list so the seed query selects type IN ('pageview','custom_event','error'), keeping existing use of profileTypeFilter, filtersQuery and scopedProfileFilter unchanged.
5728-5742: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't derive profile details from arbitrary event types.

This query now runs any(cc/rg/ct/os/...) across every events row for the profile. In the unified table that can pick a non-activity row and return nondeterministic or empty device/location data even when pageview/error rows have the right values. Restrict the source rows to the shared activity event types, and preferably use a deterministic aggregate like argMax(..., created).

💡 Suggested fix

```diff
       SELECT
-        any(cc) AS cc,
-        any(rg) AS rg,
-        any(ct) AS ct,
-        any(os) AS os,
-        any(osv) AS osv,
-        any(br) AS br,
-        any(brv) AS brv,
-        any(dv) AS dv,
-        any(lc) AS lc
+        argMax(cc, created) AS cc,
+        argMax(rg, created) AS rg,
+        argMax(ct, created) AS ct,
+        argMax(os, created) AS os,
+        argMax(osv, created) AS osv,
+        argMax(br, created) AS br,
+        argMax(brv, created) AS brv,
+        argMax(dv, created) AS dv,
+        argMax(lc, created) AS lc
       FROM events
       WHERE
         pid = {pid:FixedString(12)}
         AND profileId = {profileId:String}
+        AND type IN ('pageview', 'custom_event', 'error', 'captcha')
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/community/src/analytics/analytics.service.ts` around lines 5728 - 5742, The query in queryDetails pulls profile/device/location fields using any(...) over all events which can return nondeterministic values from non-activity rows; change the WHERE to only include the shared activity event types (e.g. add "AND type IN ('pageview','error','click','session')" or the actual activity type names used in your domain) and replace the any(...) aggregates with deterministic argMax(column, created) (for example argMax(cc, created) AS cc, argMax(rg, created) AS rg, etc.) so values are sourced from the most recent activity row per field; update the SELECT and WHERE in the queryDetails string accordingly.backend/apps/cloud/src/task-manager/task-manager.service.ts (1)
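The argMax projection list above is repetitive enough to generate. A small sketch, assuming a hypothetical helper name; the column names are the ones from the review:

```typescript
// Emit deterministic argMax projections (latest value by `created`)
// for a list of columns. Helper name is illustrative.
const argMaxColumns = (cols: string[]): string =>
  cols.map(col => `argMax(${col}, created) AS ${col}`).join(',\n        ')
```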
384-410: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Apply `goal.metadataFilters` to report conversions.

This helper/query path only carries the name/path predicate, so report conversions ignore metadata-constrained goals. A goal like “`signup` where `plan=premium`” will be counted for every matching `signup` event in report emails.

Suggested direction

```diff
-  private buildGoalMatchCondition(
+  private buildGoalMatchCondition(
     goal: Goal,
     paramKey: string,
-  ): { condition: string; params: Record<string, string> } {
+  ): {
+    condition: string
+    metaCondition: string
+    params: Record<string, string>
+  } {
     const goalValue = (goal.value ?? '').toString()
     const params: Record<string, string> = {}
     const column = goal.type === GoalType.CUSTOM_EVENT ? 'event_name' : 'pg'
+    let metaCondition = ''

     if (goal.matchType === GoalMatchType.EXACT) {
       params[paramKey] = goalValue
-      return { condition: `${column} = {${paramKey}:String}`, params }
+      // build metaCondition from goal.metadataFilters here
+      return {
+        condition: `${column} = {${paramKey}:String}`,
+        metaCondition,
+        params,
+      }
     }

     if (goal.matchType === GoalMatchType.CONTAINS) {
       if (goalValue.trim() === '') {
-        return { condition: '1=0', params: {} }
+        return { condition: '1=0', metaCondition: '', params: {} }
       }
       params[paramKey] = `%${goalValue}%`
-      return { condition: `${column} ILIKE {${paramKey}:String}`, params }
+      return {
+        condition: `${column} ILIKE {${paramKey}:String}`,
+        metaCondition,
+        params,
+      }
     }

     params[paramKey] = goalValue
-    return { condition: `match(${column}, {${paramKey}:String})`, params }
+    return {
+      condition: `match(${column}, {${paramKey}:String})`,
+      metaCondition,
+      params,
+    }
   }

 ...

-    const { condition, params: matchParams } = this.buildGoalMatchCondition(
+    const { condition, metaCondition, params: matchParams } =
+      this.buildGoalMatchCondition(
       goal,
       goalValueKey,
     )

 ...

       AND type = '${eventType}'
       AND ${condition}
+      ${metaCondition}
       AND created BETWEEN {groupFrom:String} AND {groupTo:String}
```

Also applies to: 443-470
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/cloud/src/task-manager/task-manager.service.ts` around lines 384 - 410, The goal matching helper buildGoalMatchCondition currently only emits a name/path predicate and ignores goal.metadataFilters; update buildGoalMatchCondition to append metadata filter predicates and params to the returned condition so metadata-constrained goals (e.g., signup where plan=premium) are enforced. Concretely: convert goal.metadataFilters into one or more SQL/QL predicate fragments (e.g., JSON/column equality or ILIKE depending on filter type), add those fragments joined with AND to the existing condition, and add their uniquely-named params to the params object (use unique keys derived from paramKey like `${paramKey}_metaN` to avoid collisions). Do the same fix for the other similar helper at the second occurrence (around lines 443-470) so both query paths apply goal.metadataFilters consistently. Ensure empty metadataFilters are a no-op and that exact/contains/regex match handling for the main value remains unchanged.backend/apps/cloud/src/analytics/analytics.controller.ts (1)
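To make the metadata-filter idea concrete, here is a minimal sketch of such a helper. Everything in it is an assumption for illustration — the `MetadataFilter` shape, the `meta.key`/`meta.value` array columns, and the `_metaKeyN`/`_metaValN` param scheme are not taken from the actual codebase:

```typescript
// Illustrative only — types, column names and param-key scheme are assumed.
interface MetadataFilter {
  key: string
  value: string
}

function buildMetaCondition(
  filters: MetadataFilter[],
  paramKey: string,
): { metaCondition: string; params: Record<string, string> } {
  const params: Record<string, string> = {}
  const clauses = filters.map((f, i) => {
    // Unique param names derived from paramKey avoid collisions across goals.
    const keyParam = `${paramKey}_metaKey${i}`
    const valParam = `${paramKey}_metaVal${i}`
    params[keyParam] = f.key
    params[valParam] = f.value
    return `AND meta.value[indexOf(meta.key, {${keyParam}:String})] = {${valParam}:String}`
  })
  // An empty filter list yields an empty string, i.e. a no-op.
  return { metaCondition: clauses.join(' '), params }
}
```

An empty `metadataFilters` array leaves the generated SQL untouched, which preserves the current behaviour for goals without metadata constraints.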
879-891: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Normalize both the query window and the x-axis in `getPerfChartData()`.

This path computes normalized `groupFrom`/`groupTo`, but then sends local bounds in `paramsData` and still calls `getPerfChartData()` with the raw `from`/`to` inputs. Preset periods like `today`/`7d` and non-UTC custom ranges can therefore query one window and render another.

🤖 Prompt for AI Agents
♻️ Duplicate comments (1)
backend/apps/community/src/analytics/analytics.service.ts (1)
4998-5006:⚠️ Potential issue | 🟠 Major | ⚡ Quick winCount the error-rate denominator from the same session population.
queryTotalSessionsstill hard-codestype = 'pageview', whilequeryErrorStatscounts affected sessions fromtype = 'error'. Error-only sessions therefore still never reach the denominator, soerrorRateis understated for filtered error cohorts and can even fall to0when the cohort has no pageviews. This is the same scope mismatch that was flagged earlier; it needs the denominator to come from a session seed/subquery, not raw pageview rows.💡 Suggested direction
- const queryTotalSessions = ` - SELECT count(DISTINCT psid) as totalSessions - FROM events - WHERE pid = {pid:FixedString(12)} - AND type = 'pageview' - AND created BETWEEN {groupFrom:String} AND {groupTo:String} - ${sessionFiltersQuery} - ` + const queryTotalSessions = ` + SELECT count() AS totalSessions + FROM ( + SELECT DISTINCT psid + FROM events + WHERE pid = {pid:FixedString(12)} + AND type IN ('pageview', 'custom_event', 'error') + AND created BETWEEN {groupFrom:String} AND {groupTo:String} + ${sessionFiltersQuery} + ) + `Also applies to: 5009-5026
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/community/src/analytics/analytics.service.ts` around lines 4998 - 5006, queryTotalSessions currently restricts the denominator to rows with type = 'pageview', but queryErrorStats counts sessions from type = 'error', causing a scope mismatch; update queryTotalSessions to derive its session population from the same session seed/subquery used by queryErrorStats (e.g., count(DISTINCT psid) from the session seed/subquery or by reusing the same JOIN/subquery logic and sessionFiltersQuery) so both numerator and denominator operate over the identical set of sessions (refer to queryTotalSessions, queryErrorStats, sessionFiltersQuery and the psid usage in the events table) and remove the hard-coded type = 'pageview' filter.
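The scope mismatch is easy to see with a toy model (illustrative TypeScript, not the service code): a session that only has error rows reaches the numerator but vanishes from a pageview-only denominator, so the two queries describe different populations.

```typescript
type Ev = { psid: string; type: 'pageview' | 'custom_event' | 'error' }

// Distinct sessions among events of the given types.
function sessionCount(events: Ev[], types: string[]): number {
  const ids = events.filter((e) => types.includes(e.type)).map((e) => e.psid)
  return new Set(ids).size
}

const evts: Ev[] = [
  { psid: 'A', type: 'pageview' },
  { psid: 'A', type: 'error' },
  { psid: 'B', type: 'error' }, // error-only session
]

const affected = sessionCount(evts, ['error']) // 2 sessions with errors
const pageviewDenom = sessionCount(evts, ['pageview']) // 1 — misses session B
const unifiedDenom = sessionCount(evts, ['pageview', 'custom_event', 'error']) // 2
```

With the pageview-only denominator the toy error rate is 2/1; drawing both counts from the same type set gives the sane 2/2.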
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@admin/src/db/clickhouse.ts`:
- Around line 37-46: PROJECT_ACTIVITY_EVENT_TYPES currently omits "performance",
causing project-level activity filters (used where PROJECT_ACTIVITY_EVENT_TYPES
and PROJECT_ACTIVITY_TYPE_LIST are referenced) to exclude performance-only
projects; add "performance" to the PROJECT_ACTIVITY_EVENT_TYPES array and
regenerate PROJECT_ACTIVITY_TYPE_LIST so the SQL string includes 'performance'
(ensure any callers using PROJECT_ACTIVITY_EVENT_TYPES or
PROJECT_ACTIVITY_TYPE_LIST — e.g., the places noted around
PROJECT_ACTIVITY_EVENT_TYPES mapping and usage — will now include performance
events in project activity counts).
In `@backend/apps/cloud/src/analytics/analytics.controller.ts`:
- Around line 729-731: paramsData is currently built with local
groupFrom/groupTo but the aggregation service expects UTC-normalized bounds (as
used by getData()); convert groupFrom and groupTo to the same UTC-normalized
bounds before putting them into paramsData so ClickHouse queries use UTC
windows. Locate where paramsData is constructed (symbols: paramsData, pid,
groupFrom, groupTo, filtersParams) and replace the raw groupFrom/groupTo with
their UTC-normalized equivalents (use the same normalization helper or logic
used by getData()); apply the same fix to the other occurrences referenced (the
block around lines with the same paramsData at 2737-2739).
In `@backend/apps/cloud/src/analytics/analytics.service.ts`:
- Around line 6104-6112: The denominator query (queryTotalSessions) currently
counts DISTINCT psid only for type = 'pageview', which mismatches the numerator
(affectedSessions derived from error rows) and can inflate errorRate; update
queryTotalSessions (the SQL string assigned to queryTotalSessions) to count
DISTINCT psid from the same session universe as affectedSessions by removing the
strict pageview filter or replacing it with the same type set used for
affectedSessions (e.g., include 'error' and 'custom_event' or no type filter),
while keeping pid, created BETWEEN and sessionFiltersQuery intact so both
numerator and denominator scan the same sessions.
In `@backend/apps/cloud/src/task-manager/task-manager.service.ts`:
- Around line 579-582: The activity-count SQL excludes type='performance',
causing performance-only projects to be treated as having no events; update the
query that currently filters "AND type IN ('pageview', 'custom_event', 'error',
'captcha')" (the SELECT count() AS totalEvents query in task-manager.service.ts)
to include 'performance' in the IN list, and make the same change for the other
occurrences around the 2001–2025 range so performance events are counted
consistently.
---
Outside diff comments:
In `@backend/apps/cloud/src/analytics/analytics.controller.ts`:
- Around line 879-891: The controller is normalizing the chart window into
groupFrom/groupTo but still passes the original raw from/to into
getPerfChartData, causing mismatched query vs x-axis; update the call so
paramsData contains the normalized bounds (use pid, groupFrom, groupTo,
...filtersParams) and pass the normalized groupFrom and groupTo (instead of raw
from/to) into analyticsService.getPerfChartData so the queried window and
rendered x-axis match; reference symbols: paramsData, pid, groupFrom, groupTo,
getPerfChartData, analyticsService.
In `@backend/apps/cloud/src/task-manager/task-manager.service.ts`:
- Around line 384-410: The goal matching helper buildGoalMatchCondition
currently only emits a name/path predicate and ignores goal.metadataFilters;
update buildGoalMatchCondition to append metadata filter predicates and params
to the returned condition so metadata-constrained goals (e.g., signup where
plan=premium) are enforced. Concretely: convert goal.metadataFilters into one or
more SQL/QL predicate fragments (e.g., JSON/column equality or ILIKE depending
on filter type), add those fragments joined with AND to the existing condition,
and add their uniquely-named params to the params object (use unique keys
derived from paramKey like `${paramKey}_metaN` to avoid collisions). Do the same
fix for the other similar helper at the second occurrence (around lines 443-470)
so both query paths apply goal.metadataFilters consistently. Ensure empty
metadataFilters are a no-op and that exact/contains/regex match handling for the
main value remains unchanged.
In `@backend/apps/community/src/analytics/analytics.service.ts`:
- Around line 5546-5588: The seed CTE currently filters events with "type IN
('pageview', 'custom_event')" which omits profiles that only have 'error' rows;
update buildProfilesListDataCTE (and the other analogous CTE builder around the
5964-6001 range) to include 'error' in the type list so the seed query selects
type IN ('pageview','custom_event','error'), keeping existing use of
profileTypeFilter, filtersQuery and scopedProfileFilter unchanged.
- Around line 5728-5742: The query in queryDetails pulls profile/device/location
fields using any(...) over all events which can return nondeterministic values
from non-activity rows; change the WHERE to only include the shared activity
event types (e.g. add "AND type IN ('pageview','error','click','session')" or
the actual activity type names used in your domain) and replace the any(...)
aggregates with deterministic argMax(column, created) (for example argMax(cc,
created) AS cc, argMax(rg, created) AS rg, etc.) so values are sourced from the
most recent activity row per field; update the SELECT and WHERE in the
queryDetails string accordingly.
---
Duplicate comments:
In `@backend/apps/community/src/analytics/analytics.service.ts`:
- Around line 4998-5006: queryTotalSessions currently restricts the denominator
to rows with type = 'pageview', but queryErrorStats counts sessions from type =
'error', causing a scope mismatch; update queryTotalSessions to derive its
session population from the same session seed/subquery used by queryErrorStats
(e.g., count(DISTINCT psid) from the session seed/subquery or by reusing the
same JOIN/subquery logic and sessionFiltersQuery) so both numerator and
denominator operate over the identical set of sessions (refer to
queryTotalSessions, queryErrorStats, sessionFiltersQuery and the psid usage in
the events table) and remove the hard-coded type = 'pageview' filter.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e740be50-8816-444a-aa17-b8ea5a490406
📒 Files selected for processing (11)
- admin/src/db/clickhouse.ts
- backend/apps/cloud/src/analytics/analytics.controller.ts
- backend/apps/cloud/src/analytics/analytics.service.ts
- backend/apps/cloud/src/experiment/experiment.controller.ts
- backend/apps/cloud/src/task-manager/task-manager.service.ts
- backend/apps/community/src/analytics/analytics.controller.ts
- backend/apps/community/src/analytics/analytics.service.ts
- backend/migrations/clickhouse/2026_05_01_unify_events.js
- web/app/api/api.server.ts
- web/app/pages/Project/tabs/Sessions/SessionDetailView.tsx
- web/app/pages/Project/tabs/Sessions/SessionsView.tsx
🚧 Files skipped from review as they are similar to previous changes (2)
- backend/migrations/clickhouse/2026_05_01_unify_events.js
- backend/apps/community/src/analytics/analytics.controller.ts
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/apps/community/src/analytics/analytics.service.ts (1)
1976-2003: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep all-time `sdur` on the same filtered cohort as the headline counts.

`analytics_counts` applies `${filtersQuery}` and `${allTimeType}`, but `duration_avg` still averages every session for the project. With page/country/custom-event filters, the returned `sdur` no longer describes the same population as `all`, `unique`, and `users`.

Proposed fix

```diff
   duration_avg AS (
     SELECT
       avgOrNull(duration) as sdur
     FROM (
       SELECT
         psid,
         dateDiff('second', min(firstSeen), max(lastSeen)) as duration
       FROM sessions
       WHERE pid = {pid:FixedString(12)}
+        AND psid IN (
+          SELECT DISTINCT psid
+          FROM events
+          WHERE
+            pid = {pid:FixedString(12)}
+            AND type = '${allTimeType}'
+            AND psid IS NOT NULL
+            ${filtersQuery}
+        )
       GROUP BY psid
     )
   )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/community/src/analytics/analytics.service.ts` around lines 1976 - 2003, The sdur in queryAll is computed in duration_avg over all sessions, but analytics_counts applies ${filtersQuery} and ${allTimeType}, so sdur must be restricted to the same cohort; change duration_avg to derive its cohort from the same events filter (e.g., use SELECT DISTINCT psid FROM events WHERE pid = {pid:FixedString(12)} AND type = '${allTimeType}' ${filtersQuery}) and then compute duration_avg from sessions WHERE psid IN (that cohort) so the computed sdur matches analytics_counts; update the query in queryAll to reference this cohort instead of aggregating all sessions.
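As a toy illustration of why the cohort matters (illustrative TypeScript, not the actual query): averaging over every session instead of the filtered cohort silently changes what `sdur` describes.

```typescript
type Session = { psid: string; duration: number }

// Average duration, optionally restricted to a cohort of session ids.
function avgDuration(sessions: Session[], cohort?: Set<string>): number | null {
  const rows = cohort ? sessions.filter((s) => cohort.has(s.psid)) : sessions
  if (rows.length === 0) return null
  return rows.reduce((sum, s) => sum + s.duration, 0) / rows.length
}

const sessions: Session[] = [
  { psid: 'A', duration: 10 },
  { psid: 'B', duration: 90 },
]

const scoped = avgDuration(sessions, new Set(['A'])) // only the filtered session
const unscoped = avgDuration(sessions) // every session in the project
```

If the page filter matched only session A, the headline counts describe A alone while the unscoped average still blends in B — exactly the divergence the comment flags.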
♻️ Duplicate comments (2)
backend/apps/cloud/src/analytics/analytics.service.ts (1)
2813-2814: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Normalize performance summary windows to UTC before querying.

`periodFormatted`/`now` can come from `getGroupFromTo()` in project-local time, but these queries compare `created` against them directly. On non-UTC projects, `/performance/birdseye` will shift both the current and previous windows by the timezone offset. Carry the UTC bounds through this path and derive `periodSubtracted` from those UTC values instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/cloud/src/analytics/analytics.service.ts` around lines 2813 - 2814, The time windows used in the performance queries (queryCurrent and queryPrevious) are using periodFormatted / now / periodSubtracted which may be in project-local time from getGroupFromTo(); normalize those bounds to UTC before building the SQL strings so events.created is compared against UTC-aligned ranges, and compute periodSubtracted from the UTC period bounds rather than the local-time ones; update the code paths that set periodFormatted, now, and periodSubtracted (the values passed into queryCurrent/queryPrevious and derived from getGroupFromTo()) so the UTC-converted timestamps are carried through into the SQL.
backend/apps/cloud/src/analytics/analytics.controller.ts (1)
817-821: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use UTC-normalized bounds for the performance query params here too.

These two endpoints still bind local `groupFrom`/`groupTo` into `paramsData`, while the performance service queries read those placeholders as raw ClickHouse bounds. Non-UTC projects will get shifted windows in both `/performance` and `/performance/chart`. Please build `paramsData` from `groupFromUTC`/`groupToUTC`, matching the analytics chart/data paths.

Also applies to: 885-887
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/cloud/src/analytics/analytics.controller.ts` around lines 817 - 821, The paramsData uses local groupFrom/groupTo which causes ClickHouse to use non-UTC bounds; change paramsData to bind the UTC-normalized values (groupFromUTC and groupToUTC) instead of groupFrom/groupTo so the performance queries (subQuery and the similar block around the other occurrence) use the same UTC window; update both the paramsData object near subQuery and the other paramsData at the later occurrence (references: subQuery, paramsData, groupFrom, groupTo, groupFromUTC, groupToUTC) to pass groupFromUTC/groupToUTC into params.
🧹 Nitpick comments (1)
backend/apps/cloud/src/task-manager/task-manager.service.ts (1)
2023-2047: ⚡ Quick win

Use an existence probe here instead of `count()`.

This job only needs to know whether any activity exists, but `count()` still scans the full 9-week slice for every chunk. On a Sunday pre-report cron, that is avoidable load. A `SELECT 1 ... LIMIT 1`/`EXISTS` query lets you short-circuit at the database too.

♻️ Proposed change

```diff
-    const queryEvents = `SELECT count() FROM events WHERE pid IN ({pids:Array(FixedString(12))}) AND type IN ('pageview', 'captcha', 'custom_event', 'error', 'performance') AND created BETWEEN {nineWeeksAgo:String} AND {now:String}`
-
-    // Process project IDs in chunks to avoid ClickHouse field value limit
-    let totalEvents = 0
+    const queryEvents = `SELECT 1 FROM events WHERE pid IN ({pids:Array(FixedString(12))}) AND type IN ('pageview', 'captcha', 'custom_event', 'error', 'performance') AND created BETWEEN {nineWeeksAgo:String} AND {now:String} LIMIT 1`

     for (let i = 0; i < pids.length; i += CHUNK_SIZE) {
       const pidChunk = pids.slice(i, i + CHUNK_SIZE)

       const queryParams = {
         pids: pidChunk,
         nineWeeksAgo,
         now,
       }

       const { data: eventsResult } = await clickhouse
         .query({
           query: queryEvents,
           query_params: queryParams,
         })
-        .then((resultSet) => resultSet.json<{ 'count()': number }>())
-
-      totalEvents += eventsResult[0]['count()']
-
-      // Early return if we found activity
-      if (totalEvents > 0) {
+        .then((resultSet) => resultSet.json<Record<string, number>>())
+
+      if (eventsResult.length > 0) {
         return
       }
     }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/cloud/src/task-manager/task-manager.service.ts` around lines 2023 - 2047, The current loop uses queryEvents with `SELECT count()` which scans the full 9-week slice for every pid chunk; change it to an existence probe (e.g., `SELECT 1 ... LIMIT 1` or `EXISTS(...)`) so ClickHouse can short-circuit per chunk. Update the query string referenced by queryEvents and the clickhouse.query handling in the loop to expect a single-row existence result (adjust the result parsing from `{ 'count()': number }` to a boolean/row-presence check) and keep the same early-return logic that checks totalEvents/row presence and returns immediately when any chunk reports activity; ensure identifiers pids, CHUNK_SIZE, nineWeeksAgo, now, clickhouse.query and totalEvents remain used as before.
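The short-circuit shape can be sketched as a pure function (the `hasAnyRow` callback stands in for a `SELECT 1 ... LIMIT 1` probe against ClickHouse; names are illustrative):

```typescript
// Returns as soon as one chunk reports activity, instead of summing counts
// across all chunks. `hasAnyRow` models an existence probe per pid chunk.
function anyChunkActive(
  pids: string[],
  chunkSize: number,
  hasAnyRow: (chunk: string[]) => boolean,
): boolean {
  for (let i = 0; i < pids.length; i += chunkSize) {
    if (hasAnyRow(pids.slice(i, i + chunkSize))) {
      return true // early exit: later chunks are never probed
    }
  }
  return false
}
```

For an active project the first matching chunk stops both the loop and the database scan, which is the whole point of the existence probe.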
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/apps/cloud/src/analytics/analytics.controller.ts`:
- Around line 2038-2042: The code currently derives the "all time" window by
calling this.analyticsService.calculateTimeBucketForAllTime(pid,
this.analyticsService.getAnalyticsEventType(customEVFilterApplied)), which only
uses pageviews or custom_event when filtered and can collapse when a project has
only custom_event/error rows; change calculateTimeBucketForAllTime calls (and
the similar usages in the other spots) to compute the max diff over the same
session-bearing event set used downstream (at minimum pageview, custom_event,
and error) or update calculateTimeBucketForAllTime to accept an explicit
eventTypes array and internally take the max diff across those event types so
the returned range is based on pageview, custom_event, and error collectively
rather than a single event type.
In `@backend/apps/cloud/src/analytics/analytics.service.ts`:
- Around line 5420-5431: The current avgDuration query (queryAvgDuration)
computes average session_duration only from events where type = 'pageview',
which returns 0 for profiles that only have custom_event or error rows; update
the logic to fall back to the sessions table when the pageview-based aggregate
is empty or replace the query to compute avgDuration directly from the sessions
table: run the existing pageview-based aggregation on events (using psid,
dateDiff(...) as session_duration) and if it yields NULL/empty, query sessions
(filtering on pid and profileId) to compute avg(session_last_seen -
session_first_seen) or use the sessions-based avg as the primary source; ensure
the code that reads avgDuration (the variable/queryAvgDuration result handling)
prefers the events-derived value but uses the sessions-derived value when the
events result is missing or zero.
In `@backend/apps/community/src/analytics/analytics.service.ts`:
- Around line 5692-5707: queryAvgDuration only computes avgDuration from events
filtered by type = 'pageview', so profiles returned by getProfilesList that only
have error/custom events never get a non-zero avgDuration even though the
sessions table contains timing data; update the logic inside queryAvgDuration
(or its surrounding flow) to fall back to session timings when no
pageview-derived duration exists — e.g., extend the query to UNION or LEFT JOIN
with the sessions-derived session_duration (from the sessions table or the
existing sessions computation) so that avgDuration uses pageview durations when
present and otherwise uses sessions.duration for psid/profileId combinations;
locate and modify the queryAvgDuration variable in analytics.service.ts and
adjust any downstream handling of avgDuration to accept the fallback value.
- Around line 4251-4258: getOnlineUserCount currently filters events with type
IN ('pageview', 'custom_event') and thus omits error-only sessions; update the
query in getOnlineUserCount to include 'error' in the type filter (e.g., type IN
('pageview','custom_event','error')) so sessions that only have recent error
events are counted as online; ensure the query string referenced in
analytics.service.ts (the const query used by getOnlineUserCount) is the one
updated.
- Around line 5201-5219: The LEFT JOIN to events AS analytics causes
multiplication of rows and inflates count(*) — remove the join and stop
referencing analytics fields; in the SELECT replace COALESCE(any(analytics.cc),
any(errors.cc)) / COALESCE(any(analytics.br), any(errors.br)) /
COALESCE(any(analytics.os), any(errors.os)) with any(errors.cc), any(errors.br),
any(errors.os) respectively, drop the LEFT JOIN ... AS analytics block and any
analytics.* references so the query aggregates only the error rows (keep
CAST(errors.psid), any(errors.profileId), min(errors.created),
max(errors.created), count(*) and GROUP BY errors.psid as-is).
---
Outside diff comments:
In `@backend/apps/community/src/analytics/analytics.service.ts`:
- Around line 1976-2003: The sdur in queryAll is computed in duration_avg over
all sessions, but analytics_counts applies ${filtersQuery} and ${allTimeType},
so sdur must be restricted to the same cohort; change duration_avg to derive its
cohort from the same events filter (e.g., use SELECT DISTINCT psid FROM events
WHERE pid = {pid:FixedString(12)} AND type = '${allTimeType}' ${filtersQuery})
and then compute duration_avg from sessions WHERE psid IN (that cohort) so the
computed sdur matches analytics_counts; update the query in queryAll to
reference this cohort instead of aggregating all sessions.
---
Duplicate comments:
In `@backend/apps/cloud/src/analytics/analytics.controller.ts`:
- Around line 817-821: The paramsData uses local groupFrom/groupTo which causes
ClickHouse to use non-UTC bounds; change paramsData to bind the UTC-normalized
values (groupFromUTC and groupToUTC) instead of groupFrom/groupTo so the
performance queries (subQuery and the similar block around the other occurrence)
use the same UTC window; update both the paramsData object near subQuery and the
other paramsData at the later occurrence (references: subQuery, paramsData,
groupFrom, groupTo, groupFromUTC, groupToUTC) to pass groupFromUTC/groupToUTC
into params.
In `@backend/apps/cloud/src/analytics/analytics.service.ts`:
- Around line 2813-2814: The time windows used in the performance queries
(queryCurrent and queryPrevious) are using periodFormatted / now /
periodSubtracted which may be in project-local time from getGroupFromTo();
normalize those bounds to UTC before building the SQL strings so events.created
is compared against UTC-aligned ranges, and compute periodSubtracted from the
UTC period bounds rather than the local-time ones; update the code paths that
set periodFormatted, now, and periodSubtracted (the values passed into
queryCurrent/queryPrevious and derived from getGroupFromTo()) so the
UTC-converted timestamps are carried through into the SQL.
---
Nitpick comments:
In `@backend/apps/cloud/src/task-manager/task-manager.service.ts`:
- Around line 2023-2047: The current loop uses queryEvents with `SELECT count()`
which scans the full 9-week slice for every pid chunk; change it to an existence
probe (e.g., `SELECT 1 ... LIMIT 1` or `EXISTS(...)`) so ClickHouse can
short-circuit per chunk. Update the query string referenced by queryEvents and
the clickhouse.query handling in the loop to expect a single-row existence
result (adjust the result parsing from `{ 'count()': number }` to a
boolean/row-presence check) and keep the same early-return logic that checks
totalEvents/row presence and returns immediately when any chunk reports
activity; ensure identifiers pids, CHUNK_SIZE, nineWeeksAgo, now,
clickhouse.query and totalEvents remain used as before.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d498430b-210c-4382-bd29-57170e3ce154
📒 Files selected for processing (5)
- admin/src/db/clickhouse.ts
- backend/apps/cloud/src/analytics/analytics.controller.ts
- backend/apps/cloud/src/analytics/analytics.service.ts
- backend/apps/cloud/src/task-manager/task-manager.service.ts
- backend/apps/community/src/analytics/analytics.service.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- admin/src/db/clickhouse.ts
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
backend/apps/cloud/src/analytics/analytics.service.ts (2)
5757-5773: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Include `error` rows when building profile sessions.

This CTE drops `error` events, so `getProfileSessionsList` omits sessions whose only activity is errors and can report a stale `lastActivity` for sessions that ended with an error. The downstream `error_counts` join only enriches already-selected sessions, so it does not fix the omission.

Suggested fix

```diff
   all_profile_events AS (
     SELECT
       CAST(psid, 'String') AS psidCasted,
       pid,
       profileId,
       cc,
       os,
       br,
       toTimeZone(created, {timezone:String}) AS created_tz
     FROM events
     WHERE pid = {pid:FixedString(12)}
-      AND type IN ('pageview', 'custom_event')
+      AND type IN ('pageview', 'custom_event', 'error')
       AND profileId = {profileId:String}
       AND psid IS NOT NULL
       AND created BETWEEN {groupFrom:String} AND {groupTo:String}
       ${scopedSessionFilter}
   )`
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/cloud/src/analytics/analytics.service.ts` around lines 5757 - 5773, The CTE all_profile_events currently filters out events with type 'error', causing getProfileSessionsList to miss sessions composed only of errors and to compute stale lastActivity; update the all_profile_events selection (used by getProfileSessionsList) to include 'error' in the type filter (e.g., change AND type IN ('pageview','custom_event') to include 'error') so error events are part of the session assembly and the downstream error_counts enrichment will correctly augment those sessions (ensure any logic that assumes non-error events still works with the expanded set).
5298-5317: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Include `error` rows in the driving profile CTE.

`all_profile_data` only scans `pageview` and `custom_event`, so profiles that only emitted errors disappear from `getProfilesList`, and mixed profiles can underreport `sessionsCount`/`lastSeen` when an error is the only or latest event in the window. `profile_errors` is joined later, but it cannot resurrect profiles that never make it into `profile_aggregated`.

Suggested fix

```diff
   all_profile_data AS (
     SELECT
       profileId,
       psid,
       cc,
       os,
       br,
       dv,
       created,
       if(type = 'pageview', 1, 0) AS isPageview,
       if(type = 'custom_event', 1, 0) AS isEvent
     FROM events
     WHERE pid = {pid:FixedString(12)}
-      AND type IN ('pageview', 'custom_event')
+      AND type IN ('pageview', 'custom_event', 'error')
       AND created BETWEEN {groupFrom:String} AND {groupTo:String}
       AND profileId IS NOT NULL
       AND profileId != ''
       ${profileTypeFilter}
       ${scopedProfileFilter}
   )`
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/cloud/src/analytics/analytics.service.ts` around lines 5298 - 5317, The all_profile_data CTE currently filters only 'pageview' and 'custom_event', which excludes profiles that emitted only errors; update the CTE (all_profile_data) to include 'error' rows by adding 'error' to the type filter (e.g., type IN ('pageview','custom_event','error')) and add an isError flag (e.g., if(type = 'error', 1, 0) AS isError) so driving-profile aggregation / sessionsCount / lastSeen can account for error-only profiles while leaving existing isPageview/isEvent logic intact and keeping subsequent joins (like profile_errors) compatible.
backend/apps/community/src/analytics/analytics.service.ts (1)
5635-5646: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep `errorsCount` on the same filter scope as the rest of the profile row.

`profile_aggregated` comes from `all_profile_data`, so `pageviewsCount`, `eventsCount`, `sessionsCount`, and first/last seen already respect `filtersQuery`/`scopedProfileFilter`. `profile_errors` drops that scope and counts every error for the profile in the date range, which inflates `errorsCount` on filtered profile lists. Reuse the same scoped CTE here (for example with an `isError` flag) or apply the same scoped filter in this CTE.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/community/src/analytics/analytics.service.ts` around lines 5635 - 5646, profile_errors is counting all profile errors across the date range but ignores the scoped filters used by profile_aggregated/all_profile_data (filtersQuery/scopedProfileFilter), which inflates errorsCount; update the SQL so errorsCount is computed within the same scope by either reusing the all_profile_data CTE (e.g., add an isError boolean to all_profile_data and aggregate it there) or by applying the same filters/scopedProfileFilter inside the profile_errors CTE so that profile_errors respects the same filtered profile set as profile_aggregated.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/apps/cloud/src/task-manager/task-manager.service.ts`:
- Around line 430-432: The regex branch currently sets params[paramKey] =
goalValue and always returns appendMetadataFilters(`match(${column},
{${paramKey}:String})`), but empty strings form a valid regex and match
everything; update the branch to first guard for empty goalValue (e.g., if
(!goalValue || goalValue === '') ) and in that case return
appendMetadataFilters('false') (matching the CONTAINS branch behavior);
otherwise set params[paramKey] = goalValue and return
appendMetadataFilters(`match(${column}, {${paramKey}:String})`).
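The empty-pattern guard this prompt describes can be sketched as a small helper. The function name, the `QueryParams` shape, and the omission of the real `appendMetadataFilters` wrapper are assumptions taken from the review text, not the actual task-manager code:

```typescript
// Hypothetical sketch: guard the REGEX branch against empty patterns.
// An empty string is a valid regex that matches every row, so an empty
// goalValue must fall through to a never-matching filter, mirroring the
// CONTAINS branch described in the review.
type QueryParams = Record<string, string>

function buildRegexGoalFilter(
  column: string,
  paramKey: string,
  goalValue: string | undefined,
  params: QueryParams,
): string {
  if (!goalValue) {
    // Same behaviour as the CONTAINS branch: match nothing.
    return 'false'
  }
  params[paramKey] = goalValue
  return `match(${column}, {${paramKey}:String})`
}
```

Wrapping the returned fragment with the real `appendMetadataFilters` helper is left out here, since that helper is only known from the prompt above.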
In `@backend/apps/community/src/analytics/analytics.service.ts`:
- Around line 4389-4431: The query overwrites the raw UTC created timestamp with
toTimeZone(...) which shifts arithmetic used later (duration fallback,
live-status, chart bounds); modify the events_with_meta CTE to select both the
original UTC timestamp (e.g., created_utc AS created) and a display-local
timestamp (e.g., created_local = toTimeZone(created, {timezone:String})), keep
the current toTimeZone result only for display, and update downstream code in
queryPages and any uses of created (duration fallback, live-status checks, chart
bounds) to use created_utc for arithmetic or pass created_utc into safeTimezone
when a timezone-aware value is required. Ensure references include the
events_with_meta CTE and functions/logic named queryPages, duration fallback,
live-status checks, and chart bounds so you update all places that perform time
calculations.
- Around line 2275-2306: The custom-event branch in
extractAnalyticsChartDataForScope is fabricating uniques and sdur by copying
bucketed count() series; instead, update the custom-event path so
generateCustomEventsAggregationQuery produces bucketed countDistinct(psid) (for
uniques/visits) and aggregated session duration (e.g., sum or avg of
session_duration) per time bucket, then parse those fields in
extractAnalyticsChartDataForScope (via extractCustomEventsChartData) instead of
reusing the raw count series; alternatively, if you prefer not to change
queries, stop returning visits/uniques/sdur for customEVFilterApplied in
extractAnalyticsChartDataForScope to avoid misleading data (modify
generateCustomEventsAggregationQuery, extractCustomEventsChartData, and
extractAnalyticsChartDataForScope accordingly).
---
Outside diff comments:
In `@backend/apps/cloud/src/analytics/analytics.service.ts`:
- Around line 5757-5773: The CTE all_profile_events currently filters out events
with type 'error', causing getProfileSessionsList to miss sessions composed only
of errors and to compute stale lastActivity; update the all_profile_events
selection (used by getProfileSessionsList) to include 'error' in the type filter
(e.g., change AND type IN ('pageview','custom_event') to include 'error') so
error events are part of the session assembly and the downstream error_counts
enrichment will correctly augment those sessions (ensure any logic that assumes
non-error events still works with the expanded set).
- Around line 5298-5317: The all_profile_data CTE currently filters only
'pageview' and 'custom_event', which excludes profiles that emitted only errors;
update the CTE (all_profile_data) to include 'error' rows by adding 'error' to
the type filter (e.g., type IN ('pageview','custom_event','error')) and add an
isError flag (e.g., if(type = 'error', 1, 0) AS isError) so driving-profile
aggregation / sessionsCount / lastSeen can account for error-only profiles while
leaving existing isPageview/isEvent logic intact and keeping subsequent joins
(like profile_errors) compatible.
In `@backend/apps/community/src/analytics/analytics.service.ts`:
- Around line 5635-5646: profile_errors is counting all profile errors across
the date range but ignores the scoped filters used by
profile_aggregated/all_profile_data (filtersQuery/scopedProfileFilter), which
inflates errorsCount; update the SQL so errorsCount is computed within the same
scope by either reusing the all_profile_data CTE (e.g., add an isError boolean
to all_profile_data and aggregate it there) or by applying the same
filters/scopedProfileFilter inside the profile_errors CTE so that profile_errors
respects the same filtered profile set as profile_aggregated.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 987f7e04-bf59-4b18-9765-029a0310511d
📒 Files selected for processing (4)
- backend/apps/cloud/src/analytics/analytics.controller.ts
- backend/apps/cloud/src/analytics/analytics.service.ts
- backend/apps/cloud/src/task-manager/task-manager.service.ts
- backend/apps/community/src/analytics/analytics.service.ts
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/apps/cloud/src/analytics/analytics.service.ts (1)
5465-5478: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Keep profile details off non-visitor event types.
This aggregate now reads `any(...)` from every row in `events`. After the table unification, `performance`/`captcha` rows can win the arbitrary `any()` selection and return blank or wrong `cc`/`os`/`br`/`dv` data for the profile details pane.

💡 Suggested change

```diff
 const queryDetails = `
   SELECT
     any(cc) AS cc, any(rg) AS rg, any(ct) AS ct,
     any(os) AS os, any(osv) AS osv,
     any(br) AS br, any(brv) AS brv,
     any(dv) AS dv, any(lc) AS lc
   FROM events
   WHERE pid = {pid:FixedString(12)}
     AND profileId = {profileId:String}
+    AND type IN ('pageview', 'custom_event', 'error')
 `
```
Verify each finding against the current code and only fix it if needed. In `@backend/apps/cloud/src/analytics/analytics.service.ts` around lines 5465 - 5478, The aggregate currently selects any(...) across all event rows which lets non-visitor types (e.g., performance/captcha) win and corrupt profile detail fields; update the query in analytics.service.ts (the SQL block that begins with "SELECT any(cc) AS cc, any(rg) AS rg, ... FROM events WHERE pid = {pid:FixedString(12)} AND profileId = {profileId:String}") to restrict rows to visitor events only — e.g., add a predicate like "AND eventType IN (...visitor event types...)" or "AND is_visitor = 1" (use the actual column/enum name used in the events table) so only visitor/pageview-type events contribute to any(...).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/apps/cloud/src/analytics/analytics.service.ts`:
- Around line 6341-6355: The analytics subquery currently filters WHERE type =
'pageview', so sessions with only 'error' or 'custom_event' rows return NULL
cc/br/os; modify the subquery (alias analytics) to aggregate cc, br, os across
all relevant event types (e.g., include type IN
('pageview','error','custom_event') or remove the type filter) while still
computing pageviews separately using a conditional count (e.g., countIf(type =
'pageview') AS pageviews) so metadata comes from any event row but pageview
counts remain accurate.
- Around line 6115-6121: queryTotalSessions currently counts distinct psid
across all events, which pulls in non-visitor session rows and dilutes
errorRate; update the query to restrict the denominator to the same
visitor/session-scoped events used for the error numerator by applying the same
visitor session filters (i.e., add the event-type/scope filter(s) used for error
rows to queryTotalSessions or incorporate the visitor-specific clause from
sessionFiltersQuery into the SELECT), ensuring the queryTotalSessions string
(identifier: queryTotalSessions) only counts psid from visitor/session events so
numerator and denominator match.
---
Outside diff comments:
In `@backend/apps/cloud/src/analytics/analytics.service.ts`:
- Around line 5465-5478: The aggregate currently selects any(...) across all
event rows which lets non-visitor types (e.g., performance/captcha) win and
corrupt profile detail fields; update the query in analytics.service.ts (the SQL
block that begins with "SELECT any(cc) AS cc, any(rg) AS rg, ... FROM events
WHERE pid = {pid:FixedString(12)} AND profileId = {profileId:String}") to
restrict rows to visitor events only — e.g., add a predicate like "AND eventType
IN (...visitor event types...)" or "AND is_visitor = 1" (use the actual
column/enum name used in the events table) so only visitor/pageview-type events
contribute to any(...).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 5779a907-ff71-46c2-9184-5cbdf1c0b1bd
📒 Files selected for processing (4)
- backend/apps/cloud/src/analytics/analytics.service.ts
- backend/apps/cloud/src/task-manager/task-manager.service.ts
- backend/apps/community/src/analytics/analytics.service.ts
- backend/apps/community/src/analytics/interfaces/index.ts
✅ Files skipped from review due to trivial changes (1)
- backend/apps/cloud/src/task-manager/task-manager.service.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/apps/community/src/analytics/analytics.service.ts
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
backend/apps/cloud/src/analytics/analytics.service.ts (2)
4916-4923: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Don't source session details from the first mixed event row.
After expanding this query to `type IN ('pageview', 'custom_event', 'error')`, `ORDER BY created ASC LIMIT 1` can lock onto an early error row with sparse/null `dv`/`br`/`os`/... fields and hide richer metadata from later visitor rows in the same session.

Suggested direction

```diff
-  SELECT
-    dv, br, brv, os, osv, lc, ref, so, me, ca, te, co, cc, rg, ct, profileId
+  SELECT
+    anyIf(dv, dv IS NOT NULL AND dv != '') AS dv,
+    anyIf(br, br IS NOT NULL AND br != '') AS br,
+    anyIf(brv, brv IS NOT NULL AND brv != '') AS brv,
+    anyIf(os, os IS NOT NULL AND os != '') AS os,
+    anyIf(osv, osv IS NOT NULL AND osv != '') AS osv,
+    anyIf(lc, lc IS NOT NULL AND lc != '') AS lc,
+    anyIf(ref, ref IS NOT NULL AND ref != '') AS ref,
+    anyIf(so, so IS NOT NULL AND so != '') AS so,
+    anyIf(me, me IS NOT NULL AND me != '') AS me,
+    anyIf(ca, ca IS NOT NULL AND ca != '') AS ca,
+    anyIf(te, te IS NOT NULL AND te != '') AS te,
+    anyIf(co, co IS NOT NULL AND co != '') AS co,
+    anyIf(cc, cc IS NOT NULL AND cc != '') AS cc,
+    anyIf(rg, rg IS NOT NULL AND rg != '') AS rg,
+    anyIf(ct, ct IS NOT NULL AND ct != '') AS ct,
+    anyIf(profileId, profileId IS NOT NULL AND profileId != '') AS profileId
   FROM events
   WHERE pid = {pid:FixedString(12)}
     AND type IN ('pageview', 'custom_event', 'error')
     AND psid IS NOT NULL
     AND toString(psid) = {psid:String}
-  ORDER BY created ASC
-  LIMIT 1;
```
Verify each finding against the current code and only fix it if needed. In `@backend/apps/cloud/src/analytics/analytics.service.ts` around lines 4916 - 4923, The query in analytics.service.ts that selects session details by ordering only on created ASC can pick an early error row with sparse metadata; update the SQL in that query (the FROM events ... WHERE pid = {pid:FixedString(12)} ... toString(psid) = {psid:String} block) to prefer richer visitor rows by ordering first by event type priority (e.g., pageview first, custom_event next, error last) and then by created ASC, or alternatively add a condition to prefer rows with non-null device/browser/os fields when choosing LIMIT 1; implement the type-priority ORDER BY (CASE WHEN type='pageview' THEN 0 WHEN type='custom_event' THEN 1 ELSE 2 END), created ASC so the selected row contains the fuller dv/br/os metadata.
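The type-priority ordering proposed above (pageview first, then custom event, then error, ties broken by creation time) can also be sketched outside SQL as a comparator. The `SessionEventRow` shape and function name here are illustrative assumptions, not the service's actual types:

```typescript
// Hypothetical sketch of the CASE-based ORDER BY suggested above:
// richer visitor rows win over sparse error rows within a session.
interface SessionEventRow {
  type: 'pageview' | 'custom_event' | 'error'
  created: number // epoch seconds
}

const TYPE_PRIORITY: Record<SessionEventRow['type'], number> = {
  pageview: 0,
  custom_event: 1,
  error: 2,
}

function pickSessionDetailsRow(
  rows: SessionEventRow[],
): SessionEventRow | undefined {
  // Equivalent of: ORDER BY (CASE type ... END), created ASC LIMIT 1
  return [...rows].sort(
    (a, b) =>
      TYPE_PRIORITY[a.type] - TYPE_PRIORITY[b.type] || a.created - b.created,
  )[0]
}
```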
6010-6019: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Keep error occurrence stats scoped to the selected range.
`queryFirstLastSeen` is the only query in this method that ignores `groupFrom`/`groupTo`. That makes the detail payload mix all-time `count`/`first_seen`/`last_seen` with range-limited metadata and chart data.

Suggested fix

```diff
 const queryFirstLastSeen = `
   SELECT
     max(created) AS last_seen,
     min(created) AS first_seen,
     count() AS count
   FROM events
   WHERE pid = {pid:FixedString(12)}
     AND type = 'error'
-    AND eid = {eid:FixedString(32)};
+    AND eid = {eid:FixedString(32)}
+    AND created BETWEEN {groupFrom:String} AND {groupTo:String};
 `
```
Verify each finding against the current code and only fix it if needed. In `@backend/apps/cloud/src/analytics/analytics.service.ts` around lines 6010 - 6019, The queryFirstLastSeen SQL ignores the selected time window; update the WHERE clause in queryFirstLastSeen to scope results to groupFrom/groupTo (the same range used for charts/metadata). Add conditions on created using the existing parameters (e.g. AND created >= {groupFrom:DateTime} AND created <= {groupTo:DateTime} or BETWEEN {groupFrom:DateTime} AND {groupTo:DateTime}), keeping the existing pid and eid filters so count/first_seen/last_seen are computed only within the selected range.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/apps/cloud/src/analytics/analytics.service.ts`:
- Around line 4833-4842: The SQL building error metadata fabricates zero/empty
tuples for missing fields; update the metadata construction in
analytics.service.ts so the IF/conditional branch for error events only emits
tuples for fields that are present and non-empty/non-zero and then apply
arrayFilter to remove any empty or "0" values. Concretely, change the error
branch (the IF/when building tuple('message', ...), tuple('lineno', ...),
tuple('colno', ...), tuple('filename', ...)) to produce NULL or omit tuples when
error_message, lineno, colno, or error_filename are absent (or produce empty
strings for message/filename but NULL for numeric fields) and then use
arrayFilter on arrayZip(meta_key, meta_value) (or wrap the error tuples in
arrayFilter(x -> x.2 != '' AND x.2 != '0', ...)) so only real metadata entries
remain in the metadata column.
In `@backend/apps/community/src/analytics/analytics.service.ts`:
- Around line 2296-2311: The custom-event branch in the analytics service is
returning visits from uniques, which hides repeated events per session; update
the branch that runs when customEVFilterApplied to set visits from the event
counts (use customEventsData._count or fallback Array(_size(xShifted)).fill(0))
while keeping uniques from customEventsData._uniques (or _unknown_event) and
sdur from customEventsData._sdur; locate this logic around
extractCustomEventsChartData, customEventsData, customEVFilterApplied and
replace the visits assignment accordingly.
---
Outside diff comments:
In `@backend/apps/cloud/src/analytics/analytics.service.ts`:
- Around line 4916-4923: The query in analytics.service.ts that selects session
details by ordering only on created ASC can pick an early error row with sparse
metadata; update the SQL in that query (the FROM events ... WHERE pid =
{pid:FixedString(12)} ... toString(psid) = {psid:String} block) to prefer richer
visitor rows by ordering first by event type priority (e.g., pageview first,
custom_event next, error last) and then by created ASC, or alternatively add a
condition to prefer rows with non-null device/browser/os fields when choosing
LIMIT 1; implement the type-priority ORDER BY (CASE WHEN type='pageview' THEN 0
WHEN type='custom_event' THEN 1 ELSE 2 END), created ASC so the selected row
contains the fuller dv/br/os metadata.
- Around line 6010-6019: The queryFirstLastSeen SQL ignores the selected time
window; update the WHERE clause in queryFirstLastSeen to scope results to
groupFrom/groupTo (the same range used for charts/metadata). Add conditions on
created using the existing parameters (e.g. AND created >= {groupFrom:DateTime}
AND created <= {groupTo:DateTime} or BETWEEN {groupFrom:DateTime} AND
{groupTo:DateTime}), keeping the existing pid and eid filters so
count/first_seen/last_seen are computed only within the selected range.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e1a66c75-7f77-4062-957a-bae4785e1f6f
📒 Files selected for processing (2)
- backend/apps/cloud/src/analytics/analytics.service.ts
- backend/apps/community/src/analytics/analytics.service.ts
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
backend/apps/cloud/src/analytics/analytics.service.ts (2)
6407-6420: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Reject empty `eids` before generating `IN (...)`.
When `eids` is empty, this builds `eid IN ()`, which turns a client validation error into a ClickHouse syntax failure.

Suggested fix

```diff
 async validateEIDs(eids: string[], pid: string) {
+  if (_isEmpty(eids)) {
+    throw new UnprocessableEntityException(
+      'At least one error ID must be provided',
+    )
+  }
+
   const params = _reduce(
     eids,
     (acc, curr, index) => ({
       ...acc,
       [`e_${index}`]: curr,
```
Verify each finding against the current code and only fix it if needed. In `@backend/apps/cloud/src/analytics/analytics.service.ts` around lines 6407 - 6420, The validateEIDs function builds an SQL fragment "eid IN (...)" which produces invalid SQL when the input eids array is empty; add an early guard at the top of validateEIDs to reject empty eids (e.g., throw a validation/BadRequest error or return a clear zero/empty result) before constructing params and the query so you never generate "IN ()". Ensure the check references the eids parameter and stops execution prior to the params/_map/_reduce usage and query string construction.
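The fail-fast guard suggested above can be sketched as follows. The placeholder syntax mirrors the `{e_N:FixedString(32)}` named parameters from the diff, but the function name and plain `Error` type are illustrative, not the actual service code:

```typescript
// Hypothetical sketch: refuse to build `eid IN ()` for an empty list.
function buildEidInClause(eids: string[]): {
  clause: string
  params: Record<string, string>
} {
  if (eids.length === 0) {
    // `eid IN ()` is a ClickHouse syntax error; surface a validation
    // error to the client before any SQL is generated.
    throw new Error('At least one error ID must be provided')
  }
  const params: Record<string, string> = {}
  const placeholders = eids.map((eid, index) => {
    params[`e_${index}`] = eid
    return `{e_${index}:FixedString(32)}`
  })
  return { clause: `eid IN (${placeholders.join(', ')})`, params }
}
```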
1249-1255: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Scope entry/exit-page filters to the current time window.
These subqueries compute `argMin`/`argMax(pg, created)` over all historical pageviews for a reusable `psid`, so a session can be included/excluded based on an older visit's first/last page instead of the requested range. Add the same `created BETWEEN {groupFrom:String} AND {groupTo:String}` constraint here.

Suggested fix

```diff
 const subQueryForPages = isContains
-  ? `SELECT psid FROM (SELECT psid, ${pageSelector} as page FROM events WHERE pid = {pid:FixedString(12)} AND type = 'pageview' GROUP BY psid) WHERE page ILIKE concat('%', {${param}:String}, '%')`
-  : `SELECT psid FROM (SELECT psid, ${pageSelector} as page FROM events WHERE pid = {pid:FixedString(12)} AND type = 'pageview' GROUP BY psid) WHERE page = {${param}:String}`
+  ? `SELECT psid FROM (SELECT psid, ${pageSelector} as page FROM events WHERE pid = {pid:FixedString(12)} AND type = 'pageview' AND psid IS NOT NULL AND created BETWEEN {groupFrom:String} AND {groupTo:String} GROUP BY psid) WHERE page ILIKE concat('%', {${param}:String}, '%')`
+  : `SELECT psid FROM (SELECT psid, ${pageSelector} as page FROM events WHERE pid = {pid:FixedString(12)} AND type = 'pageview' AND psid IS NOT NULL AND created BETWEEN {groupFrom:String} AND {groupTo:String} GROUP BY psid) WHERE page = {${param}:String}`
```
Verify each finding against the current code and only fix it if needed. In `@backend/apps/cloud/src/analytics/analytics.service.ts` around lines 1249 - 1255, The subQueryForPages currently selects psid based on page values across all history; restrict it to the current time window by adding the created BETWEEN {groupFrom:String} AND {groupTo:String} condition inside the inner SELECT's WHERE clause so the argMin/argMax(page, created) logic used by pageSelector only considers events within the requested range; update the subQueryForPages construction (the templates built where pageSelector, isContains, param are used) so the inner SELECT WHERE includes "AND created BETWEEN {groupFrom:String} AND {groupTo:String}" and keep the rest of the logic that appends to query (psid IN/NOT IN ...) unchanged.backend/apps/community/src/analytics/analytics.service.ts (3)
5660-5717: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Expose `errorsCount` in profile details too.
`getProfilesList()` now surfaces `errorsCount`, but `getProfileDetails()` still only loads pageview/custom-event counts. Opening an error-only profile will show sessions/timing data while the activity counters stay at zero.

Proposed fix

```diff
 const queryEvents = `
   SELECT count() AS eventsCount
   FROM events
   WHERE pid = {pid:FixedString(12)}
     AND type = 'custom_event'
     AND profileId = {profileId:String}
 `
+
+const queryErrors = `
+  SELECT count() AS errorsCount
+  FROM events
+  WHERE pid = {pid:FixedString(12)}
+    AND type = 'error'
+    AND profileId = {profileId:String}
+`
@@
 const [
   sessionCountResult,
   avgDurationResult,
   pageviewsResult,
   eventsResult,
+  errorsResult,
   detailsResult,
 ] = await Promise.all([
@@
   clickhouse
     .query({ query: queryEvents, query_params: params })
     .then((resultSet) => resultSet.json()),
+  clickhouse
+    .query({ query: queryErrors, query_params: params })
+    .then((resultSet) => resultSet.json()),
   clickhouse
     .query({ query: queryDetails, query_params: params })
     .then((resultSet) => resultSet.json()),
 ])
@@
 const pageviews = (pageviewsResult.data[0] || {}) as Record<string, any>
 const events = (eventsResult.data[0] || {}) as Record<string, any>
+const errors = (errorsResult.data[0] || {}) as Record<string, any>
 const details = (detailsResult.data[0] || {}) as Record<string, any>
@@
   sessionsCount: sessionCount.sessionsCount || 0,
   pageviewsCount: pageviews.pageviewsCount || 0,
   eventsCount: events.eventsCount || 0,
+  errorsCount: errors.errorsCount || 0,
   firstSeen: sessionCount.firstSeen,
   lastSeen: sessionCount.lastSeen,
```

Also applies to: 5750-5865
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/community/src/analytics/analytics.service.ts` around lines 5660 - 5717, getProfileDetails currently aggregates pageviews/events but omits errorsCount, causing error-only profiles to show zeros; update the aggregation in getProfileDetails (the same aggregation logic used in getProfilesList/profile_aggregated) to SUM(isError) AS errorsCount, include errorsCount in the SELECT that builds the profile detail record, and map that field into the returned profile details object so the API returns errorsCount alongside pageviewsCount, sessionsCount and eventsCount (also apply the same addition where similar aggregations occur in the nearby code paths referenced around getProfileDetails).
4698-4770: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Keep a UTC timestamp for `isLive` calculations.
`created_for_grouping` is already localized here, but `getSessionsList()` later compares `lastActivity` against ClickHouse `now()`. For non-UTC viewers that shifts the 2-minute live window by the timezone offset, so stale sessions can stay "live" for hours or fresh ones can flip offline early.

Proposed fix

```diff
 return `
   SELECT
     CAST(psid, 'String') AS psidCasted,
     pid,
     cc,
     os,
     br,
+    created AS created_utc,
     toTimeZone(created, {timezone:String}) AS created_for_grouping
   FROM events
   WHERE pid = {pid:FixedString(12)}
     AND type IN ('pageview', 'custom_event', 'error')
@@
 WITH distinct_sessions_filtered AS (
   SELECT
     psidCasted,
     pid,
     any(cc) AS cc,
     any(os) AS os,
     any(br) AS br,
     min(created_for_grouping) AS sessionStart,
-    max(created_for_grouping) AS lastActivity
+    max(created_for_grouping) AS lastActivity,
+    max(created_utc) AS lastActivityUtc
   FROM (${primaryEventsSubquery}) AS primary_events
   GROUP BY psidCasted, pid
 ),
@@
-  if(dateDiff('second', dsf.lastActivity, now()) < ${LIVE_SESSION_THRESHOLD_SECONDS}, 1, 0) AS isLive,
+  if(dateDiff('second', dsf.lastActivityUtc, now()) < ${LIVE_SESSION_THRESHOLD_SECONDS}, 1, 0) AS isLive,
```

Also applies to: 4821-4836
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/community/src/analytics/analytics.service.ts` around lines 4698 - 4770, The isLive calculation uses ClickHouse now() (server timezone) while created_for_grouping/lastActivity may be localized, causing timezone-shifted live windows; update the isLive expression in the query (and the similar block used by getSessionsList) to compare UTC timestamps by wrapping both sides with toTimeZone(...,'UTC') (e.g. use dateDiff('second', toTimeZone(dsf.lastActivity,'UTC'), toTimeZone(now(),'UTC')) < LIVE_SESSION_THRESHOLD_SECONDS) so the live-window logic is evaluated in UTC consistently.
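The same live-window rule, expressed over UTC instants only, shows why the fix works; the 120-second threshold is assumed from the `LIVE_SESSION_THRESHOLD_SECONDS` name and the "2-minute live window" wording above:

```typescript
// Hypothetical sketch: compute isLive from UTC instants so the result
// does not shift with the viewer's display timezone. Date.getTime()
// returns milliseconds since the UTC epoch regardless of locale.
const LIVE_SESSION_THRESHOLD_SECONDS = 120 // assumed 2-minute window

function isSessionLive(lastActivityUtc: Date, nowUtc: Date): boolean {
  const ageSeconds = (nowUtc.getTime() - lastActivityUtc.getTime()) / 1000
  return ageSeconds < LIVE_SESSION_THRESHOLD_SECONDS
}
```

Comparing a `toTimeZone()`-localized value against `now()` instead would skew the age by the full timezone offset, which is exactly the bug described above.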
5082-5099: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Ignore error rows that do not belong to a session.
These queries count/group by `psid` without excluding nulls. In ClickHouse that can turn "error with no session id" into a synthetic affected session, skewing `affectedSessions`, `errorRate`, and the affected-sessions list total.

Proposed fix

```diff
 const queryErrorStats = `
   SELECT
     count(*) as totalErrors,
     count(DISTINCT eid) as uniqueErrors,
     count(DISTINCT psid) as affectedSessions,
     count(DISTINCT profileId) as affectedUsers
   FROM events AS errors
@@
   WHERE pid = {pid:FixedString(12)}
     AND type = 'error'
+    AND psid IS NOT NULL
     AND created BETWEEN {groupFrom:String} AND {groupTo:String}
     ${filtersQuery}
     ${resolvedFilter}
 `
@@
 const queryCount = `
   SELECT count(DISTINCT psid) as total
   FROM events
   WHERE pid = {pid:FixedString(12)}
     AND type = 'error'
+    AND psid IS NOT NULL
     AND eid = {eid:FixedString(32)}
     AND created BETWEEN {groupFrom:String} AND {groupTo:String}
 `
@@
   FROM events AS errors
   WHERE errors.pid = {pid:FixedString(12)}
     AND errors.type = 'error'
+    AND errors.psid IS NOT NULL
     AND errors.eid = {eid:FixedString(32)}
     AND errors.created BETWEEN {groupFrom:String} AND {groupTo:String}
```

Also applies to: 5265-5289
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/community/src/analytics/analytics.service.ts` around lines 5082 - 5099, The query currently counts DISTINCT psid and groups by psid without excluding null session IDs, which can create synthetic affected sessions; update the SQL in queryErrorStats to add a session filter (e.g., "AND psid IS NOT NULL" or "AND psid != ''" depending on your schema) to the WHERE clause so that affectedSessions/count(DISTINCT psid) and any grouping by psid exclude nulls, and apply the same change to the other error-related query that computes affected sessions (the similar query around the later error-count block referenced in the review).
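The effect of the `psid IS NOT NULL` guard can be illustrated with a small in-memory version of the aggregate; the row shape and function name are hypothetical:

```typescript
// Hypothetical sketch: count affected sessions only over real session ids.
// Without the guard, NULL/empty psid values collapse into a synthetic
// "session" and inflate the affected-sessions metric.
interface ErrorEventRow {
  psid: string | null
}

function countAffectedSessions(rows: ErrorEventRow[]): number {
  const sessions = new Set<string>()
  for (const row of rows) {
    if (row.psid) {
      // Skip NULL and '' session ids, mirroring the SQL predicate.
      sessions.add(row.psid)
    }
  }
  return sessions.size
}
```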
♻️ Duplicate comments (1)
backend/apps/cloud/src/analytics/analytics.service.ts (1)
2231-2240: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
`getTotalSessionsForReport` still drops non-pageview sessions.
This helper now undercounts sessions after the unified-events refactor, because it only looks at `type = 'pageview'`. Any report path using it will miss sessions that emitted only `custom_event` and/or `error` rows in the window.

Suggested fix

```diff
 const query = `
   SELECT
     pid,
     uniqExact(psid) as totalSessions
   FROM events
   WHERE pid IN {pids:Array(FixedString(12))}
-    AND type = 'pageview'
+    AND type IN ('pageview', 'custom_event', 'error')
+    AND psid IS NOT NULL
     AND created BETWEEN {groupFrom:String} AND {groupTo:String}
   GROUP BY pid
 `
```
Verify each finding against the current code and only fix it if needed. In `@backend/apps/cloud/src/analytics/analytics.service.ts` around lines 2231 - 2240, The query in getTotalSessionsForReport currently filters WHERE type = 'pageview' and thus misses sessions that only emitted custom_event or error rows; update the query string (variable name query) to either remove the type = 'pageview' filter entirely or broaden it to include the unified event types (e.g. type IN ('pageview','custom_event','error')) so uniqExact(psid) counts all sessions in the window rather than only pageviews.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@backend/apps/cloud/src/analytics/analytics.service.ts`:
- Around line 6407-6420: The validateEIDs function builds an SQL fragment "eid
IN (...)" which produces invalid SQL when the input eids array is empty; add an
early guard at the top of validateEIDs to reject empty eids (e.g., throw a
validation/BadRequest error or return a clear zero/empty result) before
constructing params and the query so you never generate "IN ()". Ensure the
check references the eids parameter and stops execution prior to the
params/_map/_reduce usage and query string construction.
- Around line 1249-1255: The subQueryForPages currently selects psid based on
page values across all history; restrict it to the current time window by adding
the created BETWEEN {groupFrom:String} AND {groupTo:String} condition inside the
inner SELECT's WHERE clause so the argMin/argMax(page, created) logic used by
pageSelector only considers events within the requested range; update the
subQueryForPages construction (the templates built where pageSelector,
isContains, param are used) so the inner SELECT WHERE includes "AND created
BETWEEN {groupFrom:String} AND {groupTo:String}" and keep the rest of the logic
that appends to query (psid IN/NOT IN ...) unchanged.
In `@backend/apps/community/src/analytics/analytics.service.ts`:
- Around line 5660-5717: getProfileDetails currently aggregates pageviews/events
but omits errorsCount, causing error-only profiles to show zeros; update the
aggregation in getProfileDetails (the same aggregation logic used in
getProfilesList/profile_aggregated) to SUM(isError) AS errorsCount, include
errorsCount in the SELECT that builds the profile detail record, and map that
field into the returned profile details object so the API returns errorsCount
alongside pageviewsCount, sessionsCount and eventsCount (also apply the same
addition where similar aggregations occur in the nearby code paths referenced
around getProfileDetails).
- Around line 4698-4770: The isLive calculation uses ClickHouse now() (server
timezone) while created_for_grouping/lastActivity may be localized, causing
timezone-shifted live windows; update the isLive expression in the query (and
the similar block used by getSessionsList) to compare UTC timestamps by wrapping
both sides with toTimeZone(...,'UTC') (e.g. use dateDiff('second',
toTimeZone(dsf.lastActivity,'UTC'), toTimeZone(now(),'UTC')) <
LIVE_SESSION_THRESHOLD_SECONDS) so the live-window logic is evaluated in UTC
consistently.
- Around line 5082-5099: The query currently counts DISTINCT psid and groups by
psid without excluding null session IDs, which can create synthetic affected
sessions; update the SQL in queryErrorStats to add a session filter (e.g., "AND
psid IS NOT NULL" or "AND psid != ''" depending on your schema) to the WHERE
clause so that affectedSessions/count(DISTINCT psid) and any grouping by psid
exclude nulls, and apply the same change to the other error-related query that
computes affected sessions (the similar query around the later error-count block
referenced in the review).
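An illustrative version of the error-stats query with that session filter in place; whether `psid` is `Nullable` or an empty string in the real schema is assumed here:

```typescript
// Error-stats query with null session IDs excluded, so affectedSessions
// no longer counts synthetic sessions. Schema details are assumptions.
const errorStatsQuery = `
  SELECT count(DISTINCT psid) AS affectedSessions
  FROM events
  WHERE pid = {pid:FixedString(12)}
    AND type = 'error'
    AND psid IS NOT NULL
    AND created BETWEEN {groupFrom:String} AND {groupTo:String}
`
```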
---
Duplicate comments:
In `@backend/apps/cloud/src/analytics/analytics.service.ts`:
- Around line 2231-2240: The query in getTotalSessionsForReport currently
filters WHERE type = 'pageview' and thus misses sessions that only emitted
custom_event or error rows; update the query string (variable name query) to
either remove the type = 'pageview' filter entirely or broaden it to include the
unified event types (e.g. type IN ('pageview','custom_event','error')) so
uniqExact(psid) counts all sessions in the window rather than only pageviews.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: edb591a1-4f69-4422-b3a5-8d7973a022ba
📒 Files selected for processing (2)
backend/apps/cloud/src/analytics/analytics.service.ts
backend/apps/community/src/analytics/analytics.service.ts
Actionable comments posted: 1
♻️ Duplicate comments (1)
backend/apps/cloud/src/analytics/analytics.service.ts (1)
2231-2239: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Count report sessions from the unified visitor event set.
This helper still treats `pageview` as the only session-bearing event. After the merge, projects can have sessions represented only by `custom_event`/`error` rows, so any report metric built on this helper will undercount sessions and overstate per-session ratios.
Suggested change
```diff
 SELECT
   pid,
   uniqExact(psid) as totalSessions
 FROM events
 WHERE pid IN {pids:Array(FixedString(12))}
-  AND type = 'pageview'
+  AND type IN ('pageview', 'custom_event', 'error')
+  AND psid IS NOT NULL
   AND created BETWEEN {groupFrom:String} AND {groupTo:String}
 GROUP BY pid
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/cloud/src/analytics/analytics.service.ts` around lines 2231 - 2239, The query in analytics.service.ts that computes totalSessions currently restricts session counting to type = 'pageview', which misses sessions represented by other events; instead, count unique psid values across any event that carries a session id. Update the query string (the block that defines totalSessions using uniqExact(psid)) by removing the AND type = 'pageview' filter and replacing it with a condition that selects rows with a present/non-empty psid (e.g., psid IS NOT NULL AND psid != '' or an equivalent has/length check), so uniqExact(psid) reflects sessions from pageview, custom_event, error, etc.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/apps/cloud/src/analytics/analytics.service.ts`:
- Around line 2931-2935: The lookup query for filter values currently hardcodes
scope to 'pageview' for shared fields and uses the local variable `type` to
decide between custom_event vs pageview, which makes countries/browsers/version
missing when the active analytics scope is `custom_event`; update the
filter-value lookup logic (the query construction around the `type === 'ev' ?
... : ...` expression in analytics.service.ts and the similar block at the later
location) to accept the active analytics scope parameter (or a scope variable)
and use it when building the WHERE clause, or switch the FROM set to the unified
visitor events table so shared fields are selected from both `pageview` and
`custom_event` rows — ensure the query uses the passed scope (e.g., `type =
<activeScope>` or no type filter if using unified set) so dropdowns surface
values for the selected analytics scope.
---
Duplicate comments:
In `@backend/apps/cloud/src/analytics/analytics.service.ts`:
- Around line 2231-2239: The query in analytics.service.ts that computes
totalSessions currently restricts session counting to type = 'pageview', which
misses sessions represented by other events; instead, count unique psid values
across any event that carries a session id. Update the query string (the block
that defines totalSessions using uniqExact(psid)) by removing the AND type =
'pageview' filter and replacing it with a condition that selects rows with a
present/non-empty psid (e.g., psid IS NOT NULL AND psid != '' or an equivalent
has/length check), so uniqExact(psid) reflects sessions from pageview,
custom_event, error, etc.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ff954b80-b31a-49b9-b97f-6669b62dceee
📒 Files selected for processing (2)
backend/apps/cloud/src/analytics/analytics.service.ts
backend/apps/community/src/analytics/analytics.service.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/apps/community/src/analytics/analytics.service.ts
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/apps/cloud/src/analytics/analytics.service.ts (1)
6503-6520: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Use `Number(data[0]?.count ?? 0)` to extract the single count row.
The query returns exactly one row with `count(*)`. The current `reduce` operation will concatenate the count as a string if ClickHouse materializes it that way, corrupting the cached value. Extract and convert explicitly instead:
Suggested fix
```diff
 const { data } = await clickhouse
   .query({
     query,
   })
   .then((resultSet) => resultSet.json())
-// @ts-expect-error
-const events = data.reduce((total, row) => total + row.count, 0) as number
+const events = Number(data[0]?.count ?? 0)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/cloud/src/analytics/analytics.service.ts` around lines 6503 - 6520, The events aggregation uses data.reduce which can concatenate string counts; replace that logic in the block after the clickhouse.query(...) call so that events is assigned Number(data[0]?.count ?? 0) (ensure conversion to a number) instead of using the reduce; update the variable assignment where events is defined (near the clickhouse query result handling and the events constant) to explicitly read data[0].count and coerce with Number to avoid string concatenation.
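The failure mode that comment describes is easy to reproduce in plain TypeScript: ClickHouse's JSON output can serialise UInt64 counts as strings, and `+` then concatenates. A minimal sketch (the row shape is an assumption):

```typescript
// ClickHouse JSON output may return UInt64 counts as strings.
const rows: Array<{ count: number | string }> = [{ count: '42' }]

// The old reduce silently concatenates: 0 + '42' yields the string '042'.
const viaReduce = rows.reduce((total: any, row) => total + row.count, 0 as any)

// Explicit extraction converts once and defaults safely on empty results.
const viaNumber = Number(rows[0]?.count ?? 0)
```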
♻️ Duplicate comments (1)
backend/apps/cloud/src/analytics/analytics.service.ts (1)
2931-2935: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift
Keep filter-value lookups aligned with the active analytics scope.
Lines 2933-2935 and 2977-2984 still hardcode traffic filter values/version lookups to `type = 'pageview'`. In custom-event analytics, sessions represented only by `custom_event` rows will never surface countries, browsers, or versions here, so those dropdowns stay incomplete. Please thread the active analytics scope into these helpers, or widen shared-field lookups to the unified visitor event set.
Also applies to: 2977-2984
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/cloud/src/analytics/analytics.service.ts` around lines 2931 - 2935, The filter-value lookup is restricted to only pageview rows; update the query construction around the query variable (and the similar helper used later) to include the active analytics scope instead of hardcoding type = 'pageview'—either by parameterizing the WHERE clause with the incoming type/scope (e.g., use type === 'ev' ? "type = 'custom_event'" : "type = 'pageview'" as part of the WHERE) or by widening the lookup to the unified visitor event set (e.g., WHERE type IN ('pageview','custom_event')) so countries/browsers/versions surface for custom_event sessions; apply the same change to the other helper referenced in the comment to keep both lookups aligned.
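Either option from that comment can be sketched in a few lines; the `type === 'ev'` ternary mirrors the existing expression the review mentions, and everything else is an illustrative assumption:

```typescript
// Option 1: thread the active scope into the WHERE clause, mirroring
// the existing `type === 'ev'` ternary mentioned in the review.
function filterValuesTypeClause(type: string): string {
  return type === 'ev' ? "type = 'custom_event'" : "type = 'pageview'"
}

// Option 2: widen shared-field lookups to the unified visitor set.
const unifiedClause = "type IN ('pageview', 'custom_event')"

// Hypothetical countries lookup built with the scoped clause.
const countriesQuery = `
  SELECT DISTINCT cc
  FROM events
  WHERE pid = {pid:FixedString(12)}
    AND ${filterValuesTypeClause('ev')}
`
```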
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/apps/community/src/analytics/analytics.service.ts`:
- Around line 1268-1270: The subQueryForPages currently selects psid from
pageview events without restricting events to the report time window, so include
the groupFrom/groupTo filters in the events WHERE clause used by
subQueryForPages (the inner SELECT inside subQueryForPages that uses
pageSelector and param). Modify the inner WHERE in the SELECT psid,
${pageSelector} as page FROM events ... to add AND timestamp >=
{groupFrom:DateTime} AND timestamp <= {groupTo:DateTime} (or the project's
equivalent datetime placeholders) so entryPage/exitPage matching only considers
events within the requested time window; keep using the same symbols isContains,
pageSelector, param and pid placeholders.
- Around line 2296-2312: The custom-event branch returns early and doesn't
forward-fill cumulative gaps, so when mode === CUMULATIVE the
visits/uniques/sdur arrays can remain zeros for empty buckets; modify the branch
handling customEVFilterApplied in analytics.service.ts (around
extractCustomEventsChartData, variables visits, uniques, sdur, and xShifted) to,
after computing those arrays, check if the current query mode is CUMULATIVE and
if so iterate each array left-to-right filling each index with the previous
non-null value (i.e., cumulative += current; replace zeros/nulls with the
running total) to ensure lengths match Array(_size(xShifted)) and the series
remains monotonic before returning the object.
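The forward-fill that prompt asks for is a left-to-right running total; a sketch (the helper name is hypothetical, array names like `visits` follow the prompt):

```typescript
// Running total per bucket: empty buckets carry the previous value
// forward, so the cumulative series stays monotonic and keeps the
// same length as the x-axis.
function toCumulative(series: number[]): number[] {
  let running = 0
  return series.map(value => {
    running += value
    return running
  })
}

const visits = [2, 0, 0, 3, 0]
const cumulativeVisits = toCumulative(visits)
// cumulativeVisits is [2, 2, 2, 5, 5]
```

The same transform would be applied to `uniques` and `sdur` before the custom-event branch returns.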
---
Outside diff comments:
In `@backend/apps/cloud/src/analytics/analytics.service.ts`:
- Around line 6503-6520: The events aggregation uses data.reduce which can
concatenate string counts; replace that logic in the block after the
clickhouse.query(...) call so that events is assigned Number(data[0]?.count ??
0) (ensure conversion to a number) instead of using the reduce; update the
variable assignment where events is defined (near the clickhouse query result
handling and the events constant) to explicitly read data[0].count and coerce
with Number to avoid string concatenation.
---
Duplicate comments:
In `@backend/apps/cloud/src/analytics/analytics.service.ts`:
- Around line 2931-2935: The filter-value lookup is restricted to only pageview
rows; update the query construction around the query variable (and the similar
helper used later) to include the active analytics scope instead of hardcoding
type = 'pageview'—either by parameterizing the WHERE clause with the incoming
type/scope (e.g., use type === 'ev' ? "type = 'custom_event'" : "type =
'pageview'" as part of the WHERE) or by widening the lookup to the unified
visitor event set (e.g., WHERE type IN ('pageview','custom_event')) so
countries/browsers/versions surface for custom_event sessions; apply the same
change to the other helper referenced in the comment to keep both lookups
aligned.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6115b7e7-cba1-4c47-81e1-ec6aeed30c4f
📒 Files selected for processing (2)
backend/apps/cloud/src/analytics/analytics.service.ts
backend/apps/community/src/analytics/analytics.service.ts
Changes
Community Edition support
Database migrations
Documentation
Summary by CodeRabbit
Refactor
Bug Fixes
Chores
New Features