added login analytics page#284
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughAdds an admin-only Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Client
participant Controller as Controller<br/>`get_product_login_stats`
participant Auth as Auth Helpers<br/>`get_current_user`/`check_is_admin`
participant Service as ProductAnalysisService
participant DB as Database
Client->>Controller: GET /product-analysis/stats/login?start_date&end_date
Controller->>Auth: get_current_user()
Auth-->>Controller: current_user
Controller->>Auth: check_is_admin(current_user)
Auth-->>Controller: is_admin (true/false)
alt not admin
Controller-->>Client: 401 Unauthorized (error payload)
else admin
Controller->>Service: get_login_stats(start_date, end_date)
Service->>DB: execute_query(SQL with timezone conversion, filters, aggregation, excluded emails)
DB-->>Service: rows
Service-->>Controller: results
Controller-->>Client: 200 OK { "login_stats": results }
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~18 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (4)
wavefront/server/modules/product_analysis_module/product_analysis_module/product_analysis_service.py (3)
53-64: Operational: ensure an index supports this query at scale.The CTE filters
product_analyticsonevent_name = 'user_login'and acreated_atrange. Without an appropriate index this becomes a sequential scan on a hot, ever-growing table. A composite partial/non-partial index, e.g.:CREATE INDEX IF NOT EXISTS idx_product_analytics_event_created_at ON product_analytics (event_name, created_at); -- or, since only 'user_login' is queried here: CREATE INDEX IF NOT EXISTS idx_product_analytics_login_created_at ON product_analytics (created_at) WHERE event_name = 'user_login';would make this query bounded by the time window rather than the full table. Worth confirming/adding via a migration alongside this PR.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@wavefront/server/modules/product_analysis_module/product_analysis_module/product_analysis_service.py` around lines 53 - 64, The query's CTE login_events filters product_analytics by event_name = 'user_login' and a created_at date range, which will cause full scans on a large table; add a DB migration to create an index to support this filter—either a composite index on (event_name, created_at) or a partial index on created_at WHERE event_name = 'user_login'—and ensure the migration uses CREATE INDEX IF NOT EXISTS with the chosen name (e.g., idx_product_analytics_event_created_at or idx_product_analytics_login_created_at) so the login_events CTE can use the index for range scans.
44-50: Reading env on every request — consider caching.
os.getenv('PRODUCT_ANALYTICS_EXCLUDED_EMAILS')is parsed on every request. Reading once at module load (or via your config layer) avoids repeated string splitting and makes the configured value testable/auditable. Minor, take or leave.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@wavefront/server/modules/product_analysis_module/product_analysis_module/product_analysis_service.py` around lines 44 - 50, The code currently calls os.getenv('PRODUCT_ANALYTICS_EXCLUDED_EMAILS') and splits it on every request (variables excluded_emails_raw and excluded_emails); change this to read-and-parse the environment once and reuse the result by moving the getenv+split logic to module load (e.g., define a module-level EXCLUDED_ANALYTICS_EMAILS list) or encapsulate it in a memoized getter (e.g., get_excluded_emails()) so subsequent requests reuse the parsed list; ensure the new symbol name references PRODUCT_ANALYTICS_EXCLUDED_EMAILS and is easily importable/testable.
56-64: HardcodedAsia/Kolkatatimezone embedded in SQL.The reporting timezone is hardcoded in two places. If/when reporting needs to support other regions, both occurrences must be updated. Consider sourcing it from config/env (e.g.,
PRODUCT_ANALYTICS_REPORT_TZ) and binding it as a parameter so the query stays timezone-agnostic.- pa.created_at AT TIME ZONE 'UTC' AT TIME ZONE 'Asia/Kolkata' AS created_at_ist + pa.created_at AT TIME ZONE 'UTC' AT TIME ZONE :report_tz AS created_at_local…and bind
report_tzfrom env (default'Asia/Kolkata'for backwards compatibility).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@wavefront/server/modules/product_analysis_module/product_analysis_module/product_analysis_service.py` around lines 56 - 64, The SQL embeds a hardcoded 'Asia/Kolkata' timezone in two places (used to compute created_at_ist from pa.created_at); replace both hardcoded occurrences with a bound parameter (e.g., :report_tz) and load that value from env/config (env var PRODUCT_ANALYTICS_REPORT_TZ, defaulting to 'Asia/Kolkata') in the code that constructs the query in product_analysis_service.py; ensure the query binds report_tz for both timezone conversions and keep the alias created_at_ist and filters (e.g., BETWEEN :start_date AND :end_date) unchanged so behavior remains the same when the TZ is overridden.wavefront/server/modules/product_analysis_module/product_analysis_module/controllers/product_anaysis_controllers.py (1)
121-128: Nit: 401 vs 403 for non-admin authenticated users.When
user_idis present butcheck_is_admin(...)returnsFalse, the user is authenticated but not authorized — semantically that's403 Forbidden, not401 Unauthorized. I notice the siblingget_product_analysis(lines 85–92) does the same thing, so this preserves consistency; flagging only because both could be tightened in a follow-up.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@wavefront/server/modules/product_analysis_module/product_analysis_module/controllers/product_anaysis_controllers.py` around lines 121 - 128, The current auth check in the controller uses status.HTTP_401_UNAUTHORIZED for all failures; change it to return 401 only when user_id is missing and 403 (status.HTTP_403_FORBIDDEN) when get_current_user yields a user_id but check_is_admin(user_role_id) returns False. Update the block around get_current_user(...) and check_is_admin(...) in product_anaysis_controllers.py so that JSONResponse uses status.HTTP_401_UNAUTHORIZED when user_id is falsy, and uses status.HTTP_403_FORBIDDEN when user_id is present but user_role (from check_is_admin) is falsy.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@wavefront/server/modules/product_analysis_module/product_analysis_module/controllers/product_anaysis_controllers.py`:
- Around line 109-136: Add a hard cap on the requested date span and optionally
forbid future end dates: compute the span as (end_date - start_date).days and if
it exceeds a defined MAX_DATE_RANGE_DAYS (e.g., 366) return a JSONResponse with
status_code=status.HTTP_400_BAD_REQUEST using
response_formatter.buildErrorResponse('date range exceeds maximum allowed
span'); likewise, check if end_date > date.today() and return 400 with an
appropriate error message; place these checks after the existing start_date <=
end_date check and reference get_current_user, check_is_admin, and
response_formatter to locate the controller logic.
In
`@wavefront/server/modules/product_analysis_module/product_analysis_module/product_analysis_service.py`:
- Around line 67-78: The COUNT aggregates in the SQL (producing
total_login_count and unique_login_days) are wrapped in COALESCE
unnecessarily—remove the COALESCE(COUNT(...), 0) wrappers and use COUNT(...)
directly for total_login_count and COUNT(DISTINCT ...::date) for
unique_login_days; also review whether the LEFT JOIN of "user" u to login_events
l (and grouping by u.email) is intentional—if you only want users who logged in,
change the LEFT JOIN to an INNER JOIN (or aggregate directly from login_events)
to avoid returning all non-deleted/non-excluded users with zero counts.
---
Nitpick comments:
In
`@wavefront/server/modules/product_analysis_module/product_analysis_module/controllers/product_anaysis_controllers.py`:
- Around line 121-128: The current auth check in the controller uses
status.HTTP_401_UNAUTHORIZED for all failures; change it to return 401 only when
user_id is missing and 403 (status.HTTP_403_FORBIDDEN) when get_current_user
yields a user_id but check_is_admin(user_role_id) returns False. Update the
block around get_current_user(...) and check_is_admin(...) in
product_anaysis_controllers.py so that JSONResponse uses
status.HTTP_401_UNAUTHORIZED when user_id is falsy, and uses
status.HTTP_403_FORBIDDEN when user_id is present but user_role (from
check_is_admin) is falsy.
In
`@wavefront/server/modules/product_analysis_module/product_analysis_module/product_analysis_service.py`:
- Around line 53-64: The query's CTE login_events filters product_analytics by
event_name = 'user_login' and a created_at date range, which will cause full
scans on a large table; add a DB migration to create an index to support this
filter—either a composite index on (event_name, created_at) or a partial index
on created_at WHERE event_name = 'user_login'—and ensure the migration uses
CREATE INDEX IF NOT EXISTS with the chosen name (e.g.,
idx_product_analytics_event_created_at or
idx_product_analytics_login_created_at) so the login_events CTE can use the
index for range scans.
- Around line 44-50: The code currently calls
os.getenv('PRODUCT_ANALYTICS_EXCLUDED_EMAILS') and splits it on every request
(variables excluded_emails_raw and excluded_emails); change this to
read-and-parse the environment once and reuse the result by moving the
getenv+split logic to module load (e.g., define a module-level
EXCLUDED_ANALYTICS_EMAILS list) or encapsulate it in a memoized getter (e.g.,
get_excluded_emails()) so subsequent requests reuse the parsed list; ensure the
new symbol name references PRODUCT_ANALYTICS_EXCLUDED_EMAILS and is easily
importable/testable.
- Around line 56-64: The SQL embeds a hardcoded 'Asia/Kolkata' timezone in two
places (used to compute created_at_ist from pa.created_at); replace both
hardcoded occurrences with a bound parameter (e.g., :report_tz) and load that
value from env/config (env var PRODUCT_ANALYTICS_REPORT_TZ, defaulting to
'Asia/Kolkata') in the code that constructs the query in
product_analysis_service.py; ensure the query binds report_tz for both timezone
conversions and keep the alias created_at_ist and filters (e.g., BETWEEN
:start_date AND :end_date) unchanged so behavior remains the same when the TZ is
overridden.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d8e9611f-49f7-41aa-befa-aa32d670514f
📒 Files selected for processing (2)
wavefront/server/modules/product_analysis_module/product_analysis_module/controllers/product_anaysis_controllers.pywavefront/server/modules/product_analysis_module/product_analysis_module/product_analysis_service.py
| SELECT | ||
| u.email, | ||
| COALESCE(COUNT(l.created_at_ist), 0) AS total_login_count, | ||
| COALESCE(COUNT(DISTINCT l.created_at_ist::date), 0) AS unique_login_days | ||
| FROM "user" u | ||
| LEFT JOIN login_events l ON u.email = l.email | ||
| WHERE | ||
| u.deleted = FALSE | ||
| AND u.email <> ALL(CAST(:excluded_emails AS text[])) | ||
| GROUP BY u.email | ||
| ORDER BY u.email | ||
| """ |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
PostgreSQL does COUNT() ever return NULL aggregate
💡 Result:
No. PostgreSQL’s COUNT aggregate does not return NULL. It returns a bigint count of rows: COUNT(*) counts input rows; COUNT(expression) counts input rows where expression is not NULL. When there are no input rows (e.g., no matching rows in a group), COUNT returns 0 rather than NULL. This contrasts with other aggregates like SUM/AVG/array_agg, which can return NULL when there are no input rows selected.
Citations:
- 1: https://www.postgresql.org/docs/current/functions-aggregate.html
- 2: https://www.postgresql.org/docs/9.6/functions-aggregate.html
- 3: https://www.postgresql.org/docs/16/functions-aggregate.html
Remove redundant COALESCE wrappers around COUNT — COUNT never returns NULL.
The query uses COALESCE(COUNT(...), 0) but COUNT aggregates always return 0 (never NULL) when there are no matching rows in a group. The COALESCE is unnecessary and should be removed.
Separately: confirm whether returning every user (even with total_login_count = 0) is intended. The LEFT JOIN from "user" includes all non-deleted, non-excluded users in the result set. If the endpoint is meant to show only users who actually logged in within the date range, consider switching to an INNER JOIN or aggregating directly over login_events to avoid O(total_users) payload inflation.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@wavefront/server/modules/product_analysis_module/product_analysis_module/product_analysis_service.py`
around lines 67 - 78, The COUNT aggregates in the SQL (producing
total_login_count and unique_login_days) are wrapped in COALESCE
unnecessarily—remove the COALESCE(COUNT(...), 0) wrappers and use COUNT(...)
directly for total_login_count and COUNT(DISTINCT ...::date) for
unique_login_days; also review whether the LEFT JOIN of "user" u to login_events
l (and grouping by u.email) is intentional—if you only want users who logged in,
change the LEFT JOIN to an INNER JOIN (or aggregate directly from login_events)
to avoid returning all non-deleted/non-excluded users with zero counts.
Summary by CodeRabbit