
VER-304: Optimize get_snippets filters and searching to fix timeout issues#57

Merged
quancao-ea merged 10 commits into main from fix/optimize-get-snippets-filters on Mar 2, 2026

Conversation

@rajivsinclair
Contributor

@rajivsinclair rajivsinclair commented Jan 29, 2026

Summary

Comprehensive database performance optimization fixing timeout errors and improving query efficiency across the application.

Performance Improvements

| Function/Query | Before | After | Improvement |
| --- | --- | --- | --- |
| get_snippets (starredBy) | >30s timeout | ~1.5ms | 99.99%+ |
| get_snippets (labeledBy) | >30s timeout | ~50ms | 99.8%+ |
| get_snippets (upvotedBy) | ~6.7s | ~48ms | 99.3% |
| get_snippets (state filter) | ~134ms | ~50ms | 63% |
| get_snippets (source filter) | ~134ms | ~50ms | 63% |
| get_filtering_options | 3.7s avg | ~0.7ms | 99.98% |
| get_trending_topics | 3.3s avg | ~50ms | 98.5% |
| get_recording_filter_options | 1.2s avg | ~410ms | 65% |
| fetch_a_snippet_that_has_no_embedding | 373ms avg | ~98ms | 74% |
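As a sanity check on the improvement column, the percentages follow from a simple reduction formula (a hypothetical helper; the before/after timings are the ones reported in the table):

```python
def improvement_pct(before_ms: float, after_ms: float) -> float:
    """Percentage reduction in execution time."""
    return (before_ms - after_ms) / before_ms * 100

# get_filtering_options: 3.7s avg -> ~0.7ms
print(round(improvement_pct(3700, 0.7), 2))  # 99.98
# get_trending_topics: 3.3s avg -> ~50ms
print(round(improvement_pct(3300, 50), 1))   # 98.5
```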

Changes Made

1. Query Pattern Optimizations

  • get_snippets filters: Changed EXISTS subqueries to CTEs + JOINs for starredBy, labeledBy, upvotedBy
  • get_snippets state/source: Changed IN subqueries to CTEs + JOINs (same pattern)
  • get_trending_topics: Changed NOT EXISTS to LEFT JOIN with null check
  • fetch_a_snippet_that_has_no_embedding: Changed to NOT IN with hash pattern
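As a rough sketch of the CTE + JOIN rewrite (table and column names taken from the commit message below; the full function applies several such filters at once), the starredBy case goes from a per-row EXISTS to a pre-filtered ID set:

```sql
-- Before: EXISTS is evaluated once per row of the 119k+ row snippets table
SELECT s.*
FROM snippets s
WHERE EXISTS (
    SELECT 1 FROM user_star_snippets uss
    WHERE uss.snippet = s.id AND uss."user" = current_user_id
);

-- After: pre-compute the matching snippet ids on the much smaller
-- user_star_snippets table, then JOIN back to snippets
WITH starred_snippet_ids AS (
    SELECT DISTINCT uss.snippet
    FROM user_star_snippets uss
    WHERE uss."user" = current_user_id
)
SELECT s.*
FROM snippets s
JOIN starred_snippet_ids ssi ON ssi.snippet = s.id;
```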

2. Caching

  • Created filter_options_cache materialized view for states/sources/languages
  • Updated get_filtering_options and get_recording_filter_options to use cache
  • Added refresh_filter_options_cache() function for maintenance
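A minimal sketch of the cache pattern, assuming a simple (option_type, option_value) layout and the column names `state`/`radio_station` on audio_files (neither is spelled out in this PR summary):

```sql
CREATE MATERIALIZED VIEW IF NOT EXISTS filter_options_cache AS
SELECT 'state' AS option_type, state AS option_value
FROM audio_files GROUP BY state
UNION ALL
SELECT 'source', radio_station
FROM audio_files GROUP BY radio_station;
-- (plus an analogous branch for languages)

CREATE INDEX IF NOT EXISTS idx_filter_options_cache_type
    ON filter_options_cache (option_type);

-- Maintenance hook called after new stations/states/languages are added
CREATE OR REPLACE FUNCTION refresh_filter_options_cache()
RETURNS void LANGUAGE sql AS $$
    REFRESH MATERIALIZED VIEW filter_options_cache;
$$;
```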

3. Index Improvements

Added indexes:

  • idx_comment_reactions_comment_id
  • idx_comments_room_id
  • idx_snippets_stage_1_llm_response
  • idx_user_roles_role
  • idx_user_hide_snippets_snippet
  • idx_snippets_processed_recorded_at (partial index)
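Illustrative DDL for a few of the additions above; the indexed columns and the partial-index predicate are assumptions inferred from the index names:

```sql
CREATE INDEX IF NOT EXISTS idx_comment_reactions_comment_id
    ON comment_reactions (comment_id);

CREATE INDEX IF NOT EXISTS idx_user_hide_snippets_snippet
    ON user_hide_snippets (snippet);

-- Partial index: only covers the processed rows most queries touch,
-- keeping the index small
CREATE INDEX IF NOT EXISTS idx_snippets_processed_recorded_at
    ON snippets (recorded_at DESC)
    WHERE status = 'Processed';
```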

Removed unused indexes (~9.5MB saved):

  • idx_audio_files_radio_station (9.5MB) - uses cache now
  • user_hide_snippets_user_idx (16KB) - use snippet index instead
  • idx_label_upvotes_snippet_label_upvoted_by (16KB) - unused composite

Kept sorting indexes (used by get_snippets ORDER BY options):

  • idx_snippets_comment_count, idx_snippets_upvote_count, idx_snippets_like_count, idx_snippets_user_last_activity

4. Maintenance

  • Disabled completed TSV backfill cron job (was running every minute doing nothing)

Migrations Applied

  1. 20260129_optimize_get_snippets_filters.sql - CTE+JOIN pattern for starred/labeled/upvoted filters
  2. 20260129_optimize_filter_functions.sql - Materialized view cache
  3. 20260129_optimize_recording_filter_and_indexes.sql - FK indexes
  4. 20260129_cleanup_unused_indexes.sql - Remove truly unused indexes
  5. 20260130_optimize_state_source_filters.sql - CTE+JOIN pattern for state/source filters

Cache Maintenance

Call refresh_filter_options_cache() after adding new radio stations, states, or languages.

Monitoring

Supabase Logs Explorer

regexp_contains(event_message, 'duration|statement timeout')

pg_stat_statements

SELECT query, calls, round(mean_exec_time::numeric, 2) as avg_ms
FROM pg_stat_statements WHERE calls > 10 ORDER BY mean_exec_time DESC;

Test Plan

  • All filter benchmarks verified
  • Cache populated correctly (183 entries)
  • No timeout errors in recent logs
  • All migrations applied successfully
  • Sorting indexes verified as needed and restored
  • State/source filter optimization applied

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added dislike count tracking for snippets alongside existing like counts.
  • Performance

    • Optimized snippet filtering and search queries for faster results.
    • Implemented caching system for filter options to reduce database load.
    • Introduced optimized trending topics retrieval with improved filter support.
  • Chores

    • Removed deprecated filtering function.
    • Cleaned up unused database indexes.

The starredBy, labeledBy, and upvotedBy filters were causing statement
timeouts (>30s) due to inefficient EXISTS subqueries being evaluated
for every row in the snippets table (119k+ rows).

Root cause:
- EXISTS subqueries like `EXISTS (SELECT 1 FROM user_star_snippets WHERE snippet = s.id)`
  were evaluated for each of the 119k snippets
- This is extremely slow because PostgreSQL scans the large snippets table first,
  then evaluates the EXISTS condition for each row

Solution:
- Changed from EXISTS subqueries to JOINs with pre-filtered CTEs
- Pre-compute the filtered snippet IDs in CTEs (which operate on much smaller tables)
- Then JOIN against these pre-filtered IDs instead of using EXISTS

Performance improvements:
- starredBy filter: timeout (>30s) -> <1s
- labeledBy filter: timeout (>30s) -> <1s
- upvotedBy filter: 6.7s -> <1s
- politicalSpectrum filter: already uses index, no change needed

Technical details:
- Added filter detection flags: has_starred_filter, has_labeled_filter, has_upvoted_filter
- Added pre-filtered CTEs: starred_snippet_ids, labeled_snippet_ids, upvoted_snippet_ids
- Renamed upvoted filter variables to filter_upvoted_by_me/filter_upvoted_by_others
  to avoid column reference ambiguity with label_summary CTE
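In PL/pgSQL, the filter-detection flags look roughly like this (the starredBy assignments are a sketch mirroring the upvotedBy assignments quoted elsewhere in this review; p_filter is assumed to be a jsonb object keyed by filter name):

```sql
-- Sketch only: flag detection runs once, before the main query
has_starred_filter := p_filter ? 'starredBy';
starred_by_me      := has_starred_filter AND p_filter->'starredBy' ? 'by_me';
starred_by_others  := has_starred_filter AND p_filter->'starredBy' ? 'by_others';
```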

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings January 29, 2026 19:47
@gemini-code-assist
Contributor

Summary of Changes

Hello @rajivsinclair, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the performance of the get_snippets PostgreSQL function by resolving timeout issues encountered when applying certain filters. The core problem stemmed from EXISTS subqueries that were inefficiently evaluated against a large dataset. The solution involves refactoring the query to use JOINs with pre-filtered CTEs, which drastically reduces the processing time for filters like "Starred by me/others," "Labeled by me/others," and "Upvoted by me/others," making the application more responsive and preventing statement timeouts.

Highlights

  • Performance Optimization: Addressed critical timeout issues in the get_snippets function, reducing query times from over 30 seconds to under 1 second for starredBy, labeledBy, and upvotedBy filters.
  • Query Refactoring: Replaced inefficient EXISTS subqueries with optimized JOIN operations against pre-filtered Common Table Expressions (CTEs) to improve query plan efficiency.
  • Filter Logic Enhancement: Introduced dedicated boolean flags (has_starred_filter, starred_by_me, etc.) to conditionally apply filter logic, making the query more adaptive and performant.
  • Variable Renaming: Renamed upvotedBy filter variables to filter_upvoted_by_me and filter_upvoted_by_others to prevent column ambiguity within CTEs.

@coderabbitai

coderabbitai Bot commented Jan 29, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Replaces per-row EXISTS/NOT EXISTS checks with precomputed CTEs and JOINs across multiple get_snippets implementations; adds a materialized filter options cache with refresh function; introduces index cleanup and new/updated indexes; enforces auth in filtering functions and updates snippet dislike_count handling.

Changes

  • get_snippets implementations (supabase/database/sql/get_snippets_function.sql, supabase/migrations/20260129_optimize_get_snippets_filters.sql, supabase/migrations/20260130_optimize_state_source_filters.sql): Rewrote get_snippets to use pre-filter CTEs (starred_snippet_ids, labeled_snippet_ids, upvoted_snippet_ids, state_filtered_audio_ids, source_filtered_audio_ids), compute filter-detection flags, replace per-row EXISTS with JOINs, apply language-aware text projections and multi-field search, compute combined upvote/like counts for ordering, and return enriched JSON snippet objects with audio metadata and user flags. Public signature unchanged.
  • Filter options, trending & cache (supabase/database/sql/get_filtering_options_function.sql, supabase/migrations/20260129_optimize_filter_functions.sql): Added filter_options_cache materialized view, index idx_filter_options_cache_type, and refresh_filter_options_cache(); refactored get_filtering_options to read states/sources/languages from the cache and preserve label pagination; rewrote get_trending_topics to use join-based hidden-snippet exclusion and added auth checks.
  • Indexing & recording filters (supabase/migrations/20260129_optimize_recording_filter_and_indexes.sql, supabase/migrations/20260129_cleanup_unused_indexes.sql): Dropped legacy get_recording_filter_options(), added several indexes (if not exists) to support joins/queries, and added a migration to drop unused indexes with documented rationale.
  • Snippet like/dislike update (supabase/database/sql/update_snippet_like_count.sql): Extended the snippet like-count update to also compute and set dislike_count (counts of user_like_snippets.value = -1) alongside existing like_count.
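The like/dislike extension described above could be sketched as follows (the statement shape and the parameter name p_snippet_id are assumptions; only the user_like_snippets.value = 1/-1 semantics come from the summary):

```sql
UPDATE snippets s
SET like_count = (SELECT COUNT(*) FROM user_like_snippets uls
                  WHERE uls.snippet = s.id AND uls.value = 1),
    dislike_count = (SELECT COUNT(*) FROM user_like_snippets uls
                     WHERE uls.snippet = s.id AND uls.value = -1)
WHERE s.id = p_snippet_id;  -- p_snippet_id: hypothetical parameter name
```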

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes


Suggested reviewers

  • nhphong

Poem

🐰 I swapped slow EXISTS for CTE rows,
I hopped through JOINs where the wild data grows,
Indexes hum, caches snug in a heap,
Snippets now fetch with a lighter leap,
A carrot code-hop — fast queries, sweet!

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

  • Description Check: Passed. Check skipped; CodeRabbit's high-level summary is enabled.
  • Docstring Coverage: Passed. No functions found in the changed files to evaluate docstring coverage.
  • Title check: Passed. The title clearly and specifically summarizes the main objective: optimizing get_snippets filters and search functionality to resolve timeout issues, which is the core focus of the PR.


Contributor

@ellipsis-dev ellipsis-dev Bot left a comment


Important

Looks good to me! 👍

Reviewed everything up to c74cc4c in 44 seconds. Click for details.
  • Reviewed 993 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 0 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.

Workflow ID: wflow_jc1eRc5h7oRXMSHV

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@supabase/database/sql/get_snippets_function.sql`:
- Line 357: The calculation total_pages := CEIL(total_count::FLOAT / page_size)
can divide by zero when page_size is 0; add input validation at the start of the
function (e.g., check the page_size parameter) and raise a clear exception if
page_size <= 0 or alternatively use NULLIF(page_size,0) in the division to avoid
the error; update the function that declares page_size/total_pages (references
to page_size, total_count, total_pages, CEIL) to include this validation or the
NULLIF-based guard before computing total_pages.
- Around line 306-327: The JSON field name for whether the current user upvoted
a label is inconsistent; update the migration to use the same name used in this
function: change any occurrences of filter_upvoted_by_me to upvoted_by_me so the
label_summary CTE and the jsonb_build_object in snippets_with_labels both emit
'upvoted_by_me' (i.e., ensure the migration's SELECT/alias and
jsonb_build_object key match the label_summary/upvoted_by_me naming).

In `@supabase/migrations/20260129_optimize_get_snippets_filters.sql`:
- Around line 303-324: The label JSON uses the wrong field name: change the
boolean alias and JSON key from filter_upvoted_by_me to upvoted_by_me;
specifically update the SELECT in the label_summary CTE (the expression "lu.id
IS NOT NULL AS filter_upvoted_by_me") to alias as upvoted_by_me, and update the
jsonb_build_object call inside the labels aggregation (currently building
'filter_upvoted_by_me') to use 'upvoted_by_me' so the label JSON produced by
snippets_with_labels matches get_snippets_function.sql and frontend
expectations.
🧹 Nitpick comments (2)
supabase/database/sql/get_snippets_function.sql (2)

295-302: Consider using COALESCE for upvote_count as well.

Line 297 handles NULL for like_count with COALESCE, but upvote_count could also be NULL. For consistency and to avoid potential NULL arithmetic issues:

♻️ Suggested improvement
-                WHEN p_order_by = 'upvotes' THEN s.upvote_count + COALESCE(s.like_count, 0)
+                WHEN p_order_by = 'upvotes' THEN COALESCE(s.upvote_count, 0) + COALESCE(s.like_count, 0)

352-352: Hardcoded dislike_count always returns 0.

This field is hardcoded to 0. If dislike tracking is not implemented, consider either removing this field from the response or adding a comment explaining the intent.

Comment thread supabase/database/sql/get_snippets_function.sql Outdated
) INTO result
FROM snippets_with_labels s;

total_pages := CEIL(total_count::FLOAT / page_size);


⚠️ Potential issue | 🟡 Minor

Potential division by zero if page_size is 0.

If a caller passes page_size = 0, this line will raise a division-by-zero error. Consider adding input validation at the start of the function or using NULLIF.

🛡️ Proposed fix with input validation

Add near the beginning of the function (after line 44):

    IF page_size <= 0 THEN
        RAISE EXCEPTION 'page_size must be greater than 0';
    END IF;

Comment on lines +303 to +324
    label_summary AS (
        SELECT
            l.id,
            CASE WHEN p_language = 'spanish' THEN l.text_spanish ELSE l.text END AS text,
            sl.upvote_count,
            lu.id IS NOT NULL AS filter_upvoted_by_me,
            sl.snippet AS snippet_id
        FROM snippet_labels sl
        JOIN labels l ON l.id = sl.label
        LEFT JOIN label_upvotes lu ON lu.snippet_label = sl.id AND lu.upvoted_by = current_user_id
        WHERE sl.snippet IN (SELECT id FROM filtered_snippets)
    ),
    snippets_with_labels AS (
        SELECT
            fs.*,
            COALESCE(ld.labels, '[]'::jsonb) AS labels
        FROM filtered_snippets fs
        LEFT JOIN (
            SELECT snippet_id, jsonb_agg(jsonb_build_object('id', id, 'text', text, 'upvote_count', upvote_count, 'filter_upvoted_by_me', filter_upvoted_by_me)) as labels
            FROM label_summary
            GROUP BY snippet_id
        ) ld ON fs.id = ld.snippet_id


⚠️ Potential issue | 🟠 Major

Incorrect field name filter_upvoted_by_me in label JSON output.

The label_summary CTE (line 308) and JSON aggregation (line 321) use filter_upvoted_by_me, which is the name of a filter flag variable, not a descriptive name for whether the user upvoted this label. The main function file (get_snippets_function.sql) correctly uses upvoted_by_me.

This will cause the API to return labels with the field name filter_upvoted_by_me instead of the expected upvoted_by_me, potentially breaking frontend consumers.

🐛 Proposed fix to use consistent naming
     label_summary AS (
         SELECT
             l.id,
             CASE WHEN p_language = 'spanish' THEN l.text_spanish ELSE l.text END AS text,
             sl.upvote_count,
-            lu.id IS NOT NULL AS filter_upvoted_by_me,
+            lu.id IS NOT NULL AS upvoted_by_me,
             sl.snippet AS snippet_id
         FROM snippet_labels sl
         JOIN labels l ON l.id = sl.label
         LEFT JOIN label_upvotes lu ON lu.snippet_label = sl.id AND lu.upvoted_by = current_user_id
         WHERE sl.snippet IN (SELECT id FROM filtered_snippets)
     ),
     snippets_with_labels AS (
         SELECT
             fs.*,
             COALESCE(ld.labels, '[]'::jsonb) AS labels
         FROM filtered_snippets fs
         LEFT JOIN (
-            SELECT snippet_id, jsonb_agg(jsonb_build_object('id', id, 'text', text, 'upvote_count', upvote_count, 'filter_upvoted_by_me', filter_upvoted_by_me)) as labels
+            SELECT snippet_id, jsonb_agg(jsonb_build_object('id', id, 'text', text, 'upvote_count', upvote_count, 'upvoted_by_me', upvoted_by_me)) as labels
             FROM label_summary
             GROUP BY snippet_id
         ) ld ON fs.id = ld.snippet_id
     )


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c74cc4c67e


Comment thread supabase/migrations/20260129_optimize_get_snippets_filters.sql Outdated
Comment on lines +320 to +322
LEFT JOIN (
SELECT snippet_id, jsonb_agg(jsonb_build_object('id', id, 'text', text, 'upvote_count', upvote_count, 'filter_upvoted_by_me', filter_upvoted_by_me)) as labels
FROM label_summary


P2: Preserve label field name upvoted_by_me

The migration changes the labels JSON to emit filter_upvoted_by_me instead of the previously returned upvoted_by_me. Any client code that reads labels[].upvoted_by_me will now see the field missing and treat all labels as not upvoted by the user. This is a breaking API change that doesn’t appear to be accompanied by client updates in this commit.



Copilot AI left a comment


Pull request overview

This PR optimizes the get_snippets database function to fix timeout issues when filtering by starred/labeled/upvoted status. The optimization replaces slow EXISTS subqueries with JOIN-based queries using pre-filtered CTEs, reducing query execution time from >30 seconds (timeout) to <1 second for the affected filters.

Changes:

  • Replaced EXISTS subqueries with LEFT JOIN pattern using pre-filtered CTEs for starred/labeled/upvoted filters
  • Added filter detection boolean flags to enable conditional CTE execution
  • Renamed upvoted filter variables to avoid column name conflicts in the label_summary CTE

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
supabase/migrations/20260129_optimize_get_snippets_filters.sql Migration that applies the optimized function to production database
supabase/database/sql/get_snippets_function.sql Source SQL file that should mirror the migration but has critical inconsistency

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

CASE WHEN p_language = 'spanish' THEN l.text_spanish ELSE l.text END AS text,
sl.upvote_count,
lu.id IS NOT NULL AS upvoted_by_me,

Copilot AI Jan 29, 2026


The column name in the label_summary CTE is inconsistent between the source SQL file and the migration file. The migration file correctly uses 'filter_upvoted_by_me' (line 308, 321), while the source file uses 'upvoted_by_me' (line 311, 324). This means the source SQL file has not been updated to match the migration that was already applied to production.

The PR description mentions that this rename was necessary "to avoid column reference ambiguity with the label_summary CTE", so the source file should use the same 'filter_upvoted_by_me' naming as the migration.

Comment thread supabase/database/sql/get_snippets_function.sql Outdated
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request significantly improves query performance by optimizing the get_snippets function, replacing inefficient EXISTS subqueries with JOINs on pre-filtered CTEs. However, a critical security issue was identified: the use of SECURITY DEFINER without an explicit search_path could lead to search path hijacking. This should be remediated by adding a SET search_path clause to the function definition in both the migration and the source SQL file. Additionally, further enhancements are needed to address code duplication in result counting and fetching logic, simplify boolean logic within new filter CTEs, investigate a potential regression where dislike_count is hardcoded to 0, and improve clarity for a variable alias in the migration script. Addressing these points will make the new function more maintainable, robust, and secure.

filter_upvoted_by_me := has_upvoted_filter AND p_filter->'upvotedBy' ? 'by_me';
filter_upvoted_by_others := has_upvoted_filter AND p_filter->'upvotedBy' ? 'by_others';

-- Get count using optimized query with JOINs instead of EXISTS
Contributor


Severity: high

There is significant code duplication between the query to get total_count (lines 72-171) and the query to fetch the paginated data (lines 173-355). Both parts define the same CTEs (starred_snippet_ids, labeled_snippet_ids, upvoted_snippet_ids) and have nearly identical filtering logic in the WHERE clause.

This duplication makes the function difficult to maintain, as any change to the filtering logic must be applied in two places.

Consider refactoring to define the filtered set of snippets once in a CTE, and then reuse that CTE for both counting and pagination. This would look something like this:

WITH
  -- ... your filter ID CTEs (starred_snippet_ids, etc.) ...
  
  filtered_snippets_base AS (
    SELECT s.* -- select all columns needed for ordering and final output
    FROM snippets s
    -- ... all your JOINs and WHERE clauses ...
  ),
  
  count_cte AS (
    SELECT COUNT(*) as total FROM filtered_snippets_base
  ),

  paginated_snippets AS (
    SELECT *
    FROM filtered_snippets_base
    ORDER BY ...
    LIMIT page_size
    OFFSET page * page_size
  )
  
  -- Then use these CTEs to get the count and build the final result set.

This approach would make the function more DRY (Don't Repeat Yourself) and easier to manage.

Comment thread supabase/database/sql/get_snippets_function.sql Outdated
page_size INTEGER,
p_order_by text,
p_search_term text DEFAULT ''
) RETURNS jsonb SECURITY DEFINER AS $$
Contributor


Severity: medium (security)

The function get_snippets is defined with SECURITY DEFINER but does not set a search_path. This is a security risk in PostgreSQL because it allows the function to be hijacked if an attacker can create objects in a schema that is part of the caller's search_path. For example, an attacker could create a malicious auth.uid() function in their own schema to impersonate other users or bypass security checks.

Recommendation: Add SET search_path = public, extensions, auth; (or the appropriate schemas) to the function definition. This should also be applied to the source file supabase/database/sql/get_snippets_function.sql.

) RETURNS jsonb SECURITY DEFINER SET search_path = public, extensions, auth AS $$

Comment on lines +77 to +81
WHERE has_starred_filter AND (
(starred_by_me AND starred_by_others) OR
(starred_by_me AND NOT starred_by_others AND uss."user" = current_user_id) OR
(starred_by_others AND NOT starred_by_me AND uss."user" != current_user_id)
)
Contributor


Severity: medium

The boolean logic in this WHERE clause is correct but can be simplified for better readability and maintainability. The same simplification can be applied to the labeled_snippet_ids and upvoted_snippet_ids CTEs.

A simpler equivalent is:

WHERE has_starred_filter AND (
    (starred_by_me AND uss."user" = current_user_id) OR
    (starred_by_others AND uss."user" != current_user_id)
)

This revised logic is more concise and easier to understand at a glance.

        WHERE has_starred_filter AND (
            (starred_by_me AND uss."user" = current_user_id) OR
            (starred_by_others AND uss."user" != current_user_id)
        )

Comment thread supabase/migrations/20260129_optimize_get_snippets_filters.sql Outdated
Additional performance optimizations identified via pg_stat_statements:

1. get_filtering_options: 3.7s avg -> <10ms
   - Root cause: DISTINCT queries on 1M+ row audio_files table
   - Solution: Created materialized view (filter_options_cache) to cache
     states, sources, and languages
   - Added refresh_filter_options_cache() function for updates

2. get_trending_topics: optimized hidden snippets check
   - Changed NOT EXISTS to LEFT JOIN pattern (same as get_snippets fix)
   - Added index on user_hide_snippets.snippet

Performance summary for all optimizations in this PR:
- starredBy filter: timeout (>30s) -> <1s
- labeledBy filter: timeout (>30s) -> <1s
- upvotedBy filter: 6.7s -> <1s
- get_filtering_options: 3.7s -> <10ms

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@ellipsis-dev ellipsis-dev Bot left a comment

Important

Looks good to me! 👍

Reviewed 34a19c3 in 34 seconds. Click for details.
  • Reviewed 436 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 0 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.

Workflow ID: wflow_dWc5kEW3pjvrcJ4j

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

@rajivsinclair
Contributor Author

Additional Optimizations Added

In the second commit, I found and fixed additional slow functions using pg_stat_statements:

get_filtering_options (3.7s avg → <10ms)

Root cause: SELECT DISTINCT queries on the 1M+ row audio_files table to get states and radio stations.

Solution: Created a materialized view filter_options_cache that caches the distinct values. The cache contains:

  • 14 states
  • 59 radio station sources
  • 110 languages

Note: Call SELECT refresh_filter_options_cache(); when new radio stations or states are added.
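For reference, the cache described above can be sketched roughly as follows. The names filter_options_cache, refresh_filter_options_cache, and the option_type/label/value columns come from this PR; the exact SELECTs, and any extra columns the real view carries, are assumptions:

```sql
-- Sketch of the filter-options cache (assumption: the actual migration may differ).
-- One row per distinct option, tagged by option_type.
CREATE MATERIALIZED VIEW IF NOT EXISTS filter_options_cache AS
SELECT 'states' AS option_type,
       location_state AS label,
       location_state AS value
FROM public.audio_files
WHERE location_state IS NOT NULL
GROUP BY location_state
UNION ALL
SELECT 'sources',
       COALESCE(radio_station_name || ' - ' || radio_station_code, radio_station_code),
       radio_station_code
FROM public.audio_files
WHERE radio_station_code IS NOT NULL
GROUP BY radio_station_code, radio_station_name
UNION ALL
SELECT 'languages',
       language->>'primary_language',
       language->>'primary_language'
FROM public.snippets
WHERE language IS NOT NULL
GROUP BY language->>'primary_language';

-- Maintenance hook: re-run after new stations, states, or languages appear.
CREATE OR REPLACE FUNCTION refresh_filter_options_cache() RETURNS void AS $$
BEGIN
    REFRESH MATERIALIZED VIEW filter_options_cache;
END;
$$ LANGUAGE plpgsql;
```

Reads then become cheap equality lookups on option_type instead of DISTINCT scans over the 1M+ row tables.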

get_trending_topics (optimized)

Root cause: Same NOT EXISTS pattern for hidden snippets check.

Solution: Changed to LEFT JOIN pattern (same technique as get_snippets).
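The rewrite follows the standard anti-join pattern; a minimal sketch, with table and column names taken from this PR and the surrounding query simplified:

```sql
-- Before (assumption: simplified): correlated NOT EXISTS evaluated per snippet row
-- SELECT s.id FROM snippets s
-- WHERE NOT EXISTS (
--     SELECT 1 FROM user_hide_snippets uhs
--     WHERE uhs.snippet = s.id AND uhs."user" = current_user_id
-- );

-- After: LEFT JOIN with a NULL check, which the planner can execute
-- as a single hash anti-join instead of a per-row subquery.
SELECT s.id
FROM snippets s
LEFT JOIN user_hide_snippets uhs
       ON uhs.snippet = s.id
      AND uhs."user" = current_user_id
WHERE uhs.snippet IS NULL;
```

The new idx_user_hide_snippets_snippet index supports the join condition on uhs.snippet.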

Summary of All Optimizations

| Function/Filter | Before | After |
| --- | --- | --- |
| starredBy filter | timeout (>30s) | <1s |
| labeledBy filter | timeout (>30s) | <1s |
| upvotedBy filter | 6.7s | <1s |
| get_filtering_options | 3.7s avg | <10ms |
| get_trending_topics | 3.3s avg | improved |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@supabase/database/sql/get_filtering_options_function.sql`:
- Around line 60-88: The three SELECTs that read from filter_options_cache
(assigning INTO states, sources, languages) must reference the view with an
explicit schema to avoid search_path hijacking in this SECURITY DEFINER
function; update the three FROM clauses that currently use filter_options_cache
to use public.filter_options_cache so these queries (the ones building jsonb_agg
into the variables states, sources, and languages) consistently match other
schema-qualified references like public.labels and public.audio_files.

In `@supabase/migrations/20260129_optimize_filter_functions.sql`:
- Around line 85-91: Add input validation for p_label_page_size to guard against
zero or negative values: at the start of the routine (before calculating
total_pages or using LIMIT) check if p_label_page_size <= 0 and RAISE EXCEPTION
with a clear message (e.g., "p_label_page_size must be positive");
alternatively, normalize the value using GREATEST(p_label_page_size, 1) where
total_pages is computed and where LIMIT is applied so division by zero and
unexpected LIMIT behavior are prevented. Reference p_label_page_size,
total_labels/total_pages and the later LIMIT usage to ensure both the
CEIL(total_labels::FLOAT / p_label_page_size) calculation and the query LIMIT
are protected.
- Around line 214-227: The hidden_snippets CTE is currently returning snippets
hidden by any user; restrict it to the current user by filtering
user_hide_snippets with the current_user_id (same approach used for
user_star_snippets and user_like_snippets). Update the hidden_snippets CTE (or
the LEFT JOIN to it in filtered_snippets) to include AND
user_hide_snippets."user" = current_user_id so only snippets hidden by the
requesting user are excluded from filtered_snippets.
- Around line 64-68: The SECURITY DEFINER functions get_filtering_options and
get_trending_topics use unqualified object names and must lock down their
search_path to avoid hijacking; update each CREATE OR REPLACE FUNCTION header to
include "SET search_path = public, pg_catalog" (e.g., change the function header
so the SECURITY DEFINER declaration includes the SET search_path clause) so the
function runs with a fixed search_path and still references the same unqualified
names (filter_options_cache, auth.uid(), snippets, audio_files,
user_hide_snippets, snippet_labels, labels).
🧹 Nitpick comments (1)
supabase/migrations/20260129_optimize_filter_functions.sql (1)

55-60: Consider REFRESH ... CONCURRENTLY to avoid blocking reads.

REFRESH MATERIALIZED VIEW takes an exclusive lock; if refresh can overlap with reads, concurrent refresh keeps reads flowing. This requires a unique index that matches row uniqueness.

♻️ Possible adjustment (if uniqueness is guaranteed)
+CREATE UNIQUE INDEX IF NOT EXISTS idx_filter_options_cache_unique
+    ON filter_options_cache(option_type, value, label, secondary_value);
@@
-    REFRESH MATERIALIZED VIEW filter_options_cache;
+    REFRESH MATERIALIZED VIEW CONCURRENTLY filter_options_cache;

Comment on lines +60 to +88
+    -- Fetch states from cached view (fast!)
     SELECT jsonb_agg(
         jsonb_build_object(
-            'label', location_state,
-            'value', location_state
+            'label', label,
+            'value', value
         )
     ) INTO states
-    FROM unique_states;
+    FROM filter_options_cache
+    WHERE option_type = 'states';
 
-    -- Fetch unique radio station codes from the audio_files table
-    WITH unique_sources AS (
-        SELECT DISTINCT
-            radio_station_code,
-            radio_station_name
-        FROM public.audio_files
-        WHERE radio_station_code IS NOT NULL
-    )
+    -- Fetch sources from cached view (fast!)
     SELECT jsonb_agg(
         jsonb_build_object(
-            'label', CASE
-                WHEN radio_station_name IS NOT NULL
-                THEN radio_station_name || ' - ' || radio_station_code
-                ELSE radio_station_code
-            END,
-            'value', radio_station_code
+            'label', label,
+            'value', value
         )
     ) INTO sources
-    FROM unique_sources;
+    FROM filter_options_cache
+    WHERE option_type = 'sources';
 
-    -- Fetch unique primary languages from the snippets table
-    WITH unique_languages AS (
-        SELECT DISTINCT language->>'primary_language' AS primary_language
-        FROM public.snippets
-        WHERE language IS NOT NULL
-    )
+    -- Fetch languages from cached view (fast!)
     SELECT jsonb_agg(
         jsonb_build_object(
-            'label', primary_language,
-            'value', primary_language
+            'label', label,
+            'value', value
         )
     ) INTO languages
-    FROM unique_languages;
+    FROM filter_options_cache
+    WHERE option_type = 'languages';

⚠️ Potential issue | 🟠 Major


Schema‑qualify filter_options_cache to public.filter_options_cache in this SECURITY DEFINER function.

The function inconsistently qualifies schema names: it explicitly uses public.labels and public.audio_files elsewhere but leaves filter_options_cache unqualified on lines 67, 77, and 87. In SECURITY DEFINER functions, unqualified object names can be hijacked through search_path manipulation (including pg_temp). Qualify the view consistently to prevent that risk.

Proposed fix
-    FROM filter_options_cache
+    FROM public.filter_options_cache
     WHERE option_type = 'states';
 
-    FROM filter_options_cache
+    FROM public.filter_options_cache
     WHERE option_type = 'sources';
 
-    FROM filter_options_cache
+    FROM public.filter_options_cache
     WHERE option_type = 'languages';
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-- Fetch states from cached view (fast!)
SELECT jsonb_agg(
jsonb_build_object(
'label', label,
'value', value
)
) INTO states
FROM public.filter_options_cache
WHERE option_type = 'states';
-- Fetch sources from cached view (fast!)
SELECT jsonb_agg(
jsonb_build_object(
'label', label,
'value', value
)
) INTO sources
FROM public.filter_options_cache
WHERE option_type = 'sources';
-- Fetch languages from cached view (fast!)
SELECT jsonb_agg(
jsonb_build_object(
'label', label,
'value', value
)
) INTO languages
FROM public.filter_options_cache
WHERE option_type = 'languages';

Comment on lines +64 to +68
CREATE OR REPLACE FUNCTION get_filtering_options (
p_language TEXT DEFAULT 'english',
p_label_page INT DEFAULT 0,
p_label_page_size INT DEFAULT 5
) RETURNS jsonb SECURITY DEFINER AS $$

⚠️ Potential issue | 🟠 Major


Lock down search_path for both SECURITY DEFINER functions.

These functions reference unqualified objects; with SECURITY DEFINER, that's vulnerable to search_path hijacking (including pg_temp). Add SET search_path = public, pg_catalog to both function definitions.

Applies to:

  • get_filtering_options() (line 68): references unqualified filter_options_cache and auth.uid()
  • get_trending_topics() (line 173): heavily uses unqualified table references (snippets, audio_files, user_hide_snippets, snippet_labels, labels)
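A sketch of the header change for get_filtering_options (signature taken from this PR; the body here is a placeholder, not the real function):

```sql
-- Sketch: pin the search_path in the function header so SECURITY DEFINER
-- execution cannot be hijacked via the caller's search_path.
CREATE OR REPLACE FUNCTION get_filtering_options (
    p_language TEXT DEFAULT 'english',
    p_label_page INT DEFAULT 0,
    p_label_page_size INT DEFAULT 5
) RETURNS jsonb
SECURITY DEFINER
SET search_path = public, pg_catalog
AS $$
BEGIN
    -- real body elided; unqualified names (filter_options_cache, labels, ...)
    -- now resolve only against public and pg_catalog
    RETURN '{}'::jsonb;
END;
$$ LANGUAGE plpgsql;
```

The same SET search_path clause applies to the get_trending_topics header.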

Comment on lines +85 to +91
-- Fetch total number of labels
SELECT COUNT(*) INTO total_labels
FROM public.labels;

-- Calculate total pages
total_pages := CEIL(total_labels::FLOAT / p_label_page_size);


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

find . -name "*20260129*optimize_filter_functions.sql" -o -name "*.sql" | head -20

Repository: PublicDataWorks/verdad

Length of output: 1182


🏁 Script executed:

cd supabase/migrations 2>/dev/null && ls -la | grep -E "20260129|optimize" || echo "Directory not found, searching from root"

Repository: PublicDataWorks/verdad

Length of output: 247


🏁 Script executed:

find . -type f -name "*.sql" | grep -i "filter\|label" | head -20

Repository: PublicDataWorks/verdad

Length of output: 588


🏁 Script executed:

cat -n supabase/migrations/20260129_optimize_filter_functions.sql | head -120

Repository: PublicDataWorks/verdad

Length of output: 4452


🏁 Script executed:

cat -n supabase/migrations/20260129_optimize_filter_functions.sql | sed -n '85,150p'

Repository: PublicDataWorks/verdad

Length of output: 2286


🏁 Script executed:

grep -n "p_label_page_size" supabase/migrations/20260129_optimize_filter_functions.sql

Repository: PublicDataWorks/verdad

Length of output: 291


🏁 Script executed:

grep -r "p_label_page_size" supabase/ --include="*.sql"

Repository: PublicDataWorks/verdad

Length of output: 953


🏁 Script executed:

cat -n supabase/migrations/20260129_optimize_filter_functions.sql | sed -n '64,84p'

Repository: PublicDataWorks/verdad

Length of output: 777


🏁 Script executed:

cat -n supabase/database/sql/get_filtering_options_function.sql | head -100

Repository: PublicDataWorks/verdad

Length of output: 3747


Guard against p_label_page_size <= 0 (division by zero).

The parameter accepts any INT value, including zero and negatives. At line 90, CEIL(total_labels::FLOAT / p_label_page_size) will fail with a division-by-zero error, and the LIMIT clause at line 106 would behave unexpectedly with an invalid page size. Add input validation:

Proposed guard
     current_user_id := auth.uid();
     IF current_user_id IS NULL THEN
         RAISE EXCEPTION 'Only logged-in users can call this function';
     END IF;
+
+    IF p_label_page_size <= 0 THEN
+        RAISE EXCEPTION 'p_label_page_size must be > 0';
+    END IF;
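Alternatively, the value can be normalized instead of rejected; a sketch (assumption: silently clamping an invalid page size is acceptable for this API):

```sql
-- Clamp rather than raise; plpgsql allows reassigning an IN parameter.
p_label_page_size := GREATEST(p_label_page_size, 1);
total_pages := CEIL(total_labels::FLOAT / p_label_page_size);
```

Either way, both the total_pages computation and the later LIMIT are protected.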

Additional optimizations identified via pg_stat_statements and Supabase Database Advisor:

1. get_recording_filter_options: 1.2s avg → ~410ms (65% improvement)
   - Uses filter_options_cache for states/sources/languages
   - Labels subquery still requires JOIN but is acceptable

2. Added indexes for unindexed foreign keys:
   - idx_comment_reactions_comment_id
   - idx_comments_room_id
   - idx_snippets_stage_1_llm_response
   - idx_user_roles_role

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@ellipsis-dev ellipsis-dev Bot left a comment

Important

Looks good to me! 👍

Reviewed 8aa07ac in 30 seconds. Click for details.
  • Reviewed 75 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 0 draft comments. View those below.

Workflow ID: wflow_yz8meOipEINCubtZ


@coderabbitai coderabbitai Bot left a comment
Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@supabase/migrations/20260129_optimize_recording_filter_and_indexes.sql`:
- Line 46: The function is declared with "LANGUAGE plpgsql SECURITY DEFINER" but
lacks an explicit search_path, which allows an attacker to shadow objects like
filter_options_cache and labels; update the function body (the block ending with
"$$ LANGUAGE plpgsql SECURITY DEFINER;") to set a safe search_path at the start
(for example add "PERFORM set_config('search_path','public,pg_temp', true);" or
"SET LOCAL search_path = public, pg_temp;" as the first statements) so all
subsequent references (filter_options_cache, labels, etc.) resolve to the
intended schemas before any privileged operations run.
- Around line 29-42: The jsonb_agg of label objects may not respect the ORDER BY
inside the inner derived table, so change the aggregation to include ordering:
move or add ORDER BY l.text inside the jsonb_agg call that builds the 'labels'
field (i.e., ensure jsonb_agg(jsonb_build_object(... ) ORDER BY l.text) is used
on the SELECT that references alias l), and similarly add an ORDER BY inside the
jsonb_agg for the radio_stations aggregation to guarantee stable ordering;
update the SELECT that builds these aggregates (the labels aggregation and the
radio_stations aggregation) to use jsonb_agg(... ORDER BY <column>) referencing
l.text and the radio station sort key respectively.
🧹 Nitpick comments (1)
supabase/migrations/20260129_optimize_recording_filter_and_indexes.sql (1)

33-41: The labels query doesn't use the cache; verify this is intentional.

States, radio_stations, and languages are fetched from the filter_options_cache materialized view, but labels still performs a 3-way JOIN on live tables. If labels change frequently, this may be intentional. However, if this query is still slow, consider adding labels to the cache or pre-computing them.

Also, the LIMIT 100 lacks a deterministic ordering tie-breaker—if multiple labels share the same text, which 100 are returned is non-deterministic.
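Both points can be addressed together; a sketch (l.text comes from this PR's label queries, using l.id as the tie-breaker column is an assumption):

```sql
-- Deterministic aggregation order plus a stable LIMIT tie-breaker.
SELECT jsonb_agg(
           jsonb_build_object('id', l.id, 'text', l.text)
           ORDER BY l.text            -- ordering inside the aggregate, not just the subquery
       )
FROM (
    SELECT id, text
    FROM public.labels
    ORDER BY text, id                 -- id breaks ties, making the LIMIT deterministic
    LIMIT 100
) l;
```

Without the ORDER BY inside jsonb_agg, PostgreSQL does not guarantee that aggregation preserves the subquery's ordering.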

) INTO result;
RETURN result;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;

⚠️ Potential issue | 🟠 Major

SECURITY DEFINER without SET search_path is a security risk.

Functions marked SECURITY DEFINER run with the privileges of the function owner. Without explicitly setting search_path, an attacker could manipulate the search path to substitute malicious objects for filter_options_cache, labels, etc.

🔒 Proposed fix
 END;
-$$ LANGUAGE plpgsql SECURITY DEFINER;
+$$ LANGUAGE plpgsql SECURITY DEFINER SET search_path = public;
📝 Committable suggestion


Suggested change
$$ LANGUAGE plpgsql SECURITY DEFINER;
$$ LANGUAGE plpgsql SECURITY DEFINER SET search_path = public;

Additional optimizations:
1. Disabled completed TSV backfill cron job (was running every minute doing nothing)
2. Optimized fetch_a_snippet_that_has_no_embedding: 373ms → 98ms (74% faster)
3. Added idx_snippets_processed_recorded_at partial index
4. Removed 7 unused indexes saving ~17MB storage:
   - idx_audio_files_radio_station (9.5MB)
   - idx_snippets_user_last_activity (2MB)
   - idx_snippets_comment_count (2MB)
   - idx_snippets_upvote_count (2MB)
   - idx_snippets_like_count (2MB)
   - user_hide_snippets_user_idx (16KB)
   - idx_label_upvotes_snippet_label_upvoted_by (16KB)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@ellipsis-dev ellipsis-dev Bot left a comment

Important

Looks good to me! 👍

Reviewed 05ecffa in 39 seconds. Click for details.
  • Reviewed 38 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 0 draft comments. View those below.

Workflow ID: wflow_9HhY0diaiyx9xic4


The following indexes were incorrectly removed - they ARE used:
- idx_snippets_comment_count (p_order_by = 'comments')
- idx_snippets_upvote_count (p_order_by = 'upvotes')
- idx_snippets_like_count (p_order_by = 'upvotes')
- idx_snippets_user_last_activity (p_order_by = 'activities')

They showed 0 scans because users may rarely use these sort options,
but they should be kept for when they do.

Only safe removals (~9.5MB saved):
- idx_audio_files_radio_station (uses cache now)
- user_hide_snippets_user_idx (use snippet index instead)
- idx_label_upvotes_snippet_label_upvoted_by (unused composite)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@ellipsis-dev ellipsis-dev Bot left a comment

Important

Looks good to me! 👍

Reviewed 44cb67a in 34 seconds. Click for details.
  • Reviewed 53 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 0 draft comments. View those below.

Workflow ID: wflow_30lO2tNMgtAMPK4V


Apply same CTE+JOIN optimization used for starred/labeled/upvoted filters
to state and source filters. This avoids IN subqueries that scan all
19k+ snippets before filtering by audio_files.location_state.

Expected improvement: ~134ms -> <50ms for state filter queries

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
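The commit message above describes swapping an IN subquery for a CTE plus JOIN; a hedged sketch of that rewrite (location_state and audio_files come from this PR; the s.audio_file column and p_states parameter names are assumptions):

```sql
-- Before (assumption: simplified): IN subquery re-evaluated against all 19k+ snippets
-- SELECT s.id FROM snippets s
-- WHERE s.audio_file IN (
--     SELECT id FROM audio_files WHERE location_state = ANY(p_states)
-- );

-- After: resolve the small set of matching audio files once, then JOIN.
WITH state_filtered_audio_ids AS (
    SELECT id
    FROM audio_files
    WHERE location_state = ANY(p_states)
)
SELECT s.id
FROM snippets s
JOIN state_filtered_audio_ids sfa ON s.audio_file = sfa.id;
```

The same pattern applies to the source filter on radio_station_code.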
@ellipsis-dev ellipsis-dev Bot left a comment

Important

Looks good to me! 👍

Reviewed c875082 in 44 seconds. Click for details.
  • Reviewed 576 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 0 draft comments. View those below.

Workflow ID: wflow_bFoCVdn23MIc5qY7


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@supabase/migrations/20260130_optimize_state_source_filters.sql`:
- Line 380: Add input validation to guard against division-by-zero: immediately
after the authentication check (around where the function starts), check the
page_size parameter and raise an exception if page_size <= 0 so the later
computation total_pages := CEIL(total_count::FLOAT / page_size) cannot divide by
zero; reference the page_size parameter and the total_pages assignment when
applying this change.
- Around line 108-119: The migration references a non-existent index
idx_audio_files_location_state used by the state_filtered_audio_ids CTE to speed
up filtering; add a CREATE INDEX for idx_audio_files_location_state on
audio_files(location_state) before the CTE (or, if the optimization is not
intended, update the comment to remove the reference), ensuring the index name
exactly matches idx_audio_files_location_state and targets the audio_files
table's location_state column so the state filter can use it.
- Around line 88-106: The labeled_snippet_ids CTE is incorrectly using the
upvoter column (lu.upvoted_by) to decide who "labeled" a snippet; change those
predicates to use the label creator column sl.applied_by instead. Specifically,
in the labeled_snippet_ids CTE replace conditions that check lu.upvoted_by =
current_user_id and lu.upvoted_by != current_user_id with sl.applied_by =
current_user_id and sl.applied_by != current_user_id respectively, and apply the
same replacement in the second occurrence of the labeled_snippet_ids CTE
elsewhere in the diff.
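Applying that fix, the corrected CTE would look roughly like this (sl.applied_by per the review above; note that no label_upvotes join is needed to determine who applied a label, so this is a sketch rather than the final diff):

```sql
labeled_snippet_ids AS (
    SELECT DISTINCT sl.snippet
    FROM snippet_labels sl
    WHERE has_labeled_filter AND (
        (labeled_by_me AND labeled_by_others) OR
        (labeled_by_me AND NOT labeled_by_others AND sl.applied_by = current_user_id) OR
        (labeled_by_others AND NOT labeled_by_me AND sl.applied_by != current_user_id)
    )
)
```

The upvoted_snippet_ids CTE keeps its label_upvotes join, since lu.upvoted_by is the correct column for that filter.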
🧹 Nitpick comments (2)
supabase/migrations/20260130_optimize_state_source_filters.sql (1)

186-227: Consider consolidating duplicated CTEs to reduce maintenance burden.

The five filter CTEs (starred_snippet_ids, labeled_snippet_ids, upvoted_snippet_ids, state_filtered_audio_ids, source_filtered_audio_ids) are duplicated verbatim between the count query (lines 79-119) and the data query (lines 187-227). This creates ~80 lines of duplication and increases the risk of divergence during future modifications.

While PostgreSQL doesn't share CTEs across separate statements, consider refactoring to a single WITH block that computes both count and paginated data:

WITH
  -- filter CTEs defined once
  ...,
  base_filtered AS (...),
  count_result AS (SELECT COUNT(*) FROM base_filtered),
  paginated AS (SELECT * FROM base_filtered ORDER BY ... LIMIT ... OFFSET ...)
SELECT ...

This is optional—the current approach may yield better query plans in some cases.

supabase/database/sql/get_snippets_function.sql (1)

1-22: Ensure migration and canonical function files stay synchronized.

This file (get_snippets_function.sql) and the migration (20260130_optimize_state_source_filters.sql) both define get_snippets. Any future changes must be applied to both files to avoid drift. Consider documenting this relationship or using a build process that generates the migration from the canonical file.

Comment on lines +88 to +106
labeled_snippet_ids AS (
SELECT DISTINCT sl.snippet
FROM snippet_labels sl
JOIN label_upvotes lu ON lu.snippet_label = sl.id
WHERE has_labeled_filter AND (
(labeled_by_me AND labeled_by_others) OR
(labeled_by_me AND NOT labeled_by_others AND lu.upvoted_by = current_user_id) OR
(labeled_by_others AND NOT labeled_by_me AND lu.upvoted_by != current_user_id)
)
),
upvoted_snippet_ids AS (
SELECT DISTINCT sl.snippet
FROM snippet_labels sl
JOIN label_upvotes lu ON lu.snippet_label = sl.id
WHERE has_upvoted_filter AND (
(filter_upvoted_by_me AND filter_upvoted_by_others) OR
(filter_upvoted_by_me AND NOT filter_upvoted_by_others AND lu.upvoted_by = current_user_id) OR
(filter_upvoted_by_others AND NOT filter_upvoted_by_me AND lu.upvoted_by != current_user_id)
)

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash

# Get the migration file name and read it
echo "=== Reading the migration file ==="
fd "20260130_optimize_state_source_filters.sql" -x cat -n

echo ""
echo "=== Searching for snippet_labels table definition ==="
rg -n "CREATE TABLE.*snippet_labels" --type sql -A 25

echo ""
echo "=== Searching for label_upvotes table definition ==="
rg -n "CREATE TABLE.*label_upvotes" --type sql -A 20

Repository: PublicDataWorks/verdad

Length of output: 20613


🏁 Script executed:

#!/bin/bash

echo "=== Searching for all migration files that might define snippet_labels ==="
fd "\.sql$" -x grep -l "snippet_labels" {} \;

echo ""
echo "=== Looking for snippet_labels and label_upvotes table definitions ==="
rg "snippet_labels|label_upvotes" --type sql -B 2 | head -100

echo ""
echo "=== Check if snippet_labels has a created_by or user column ==="
rg "created_by|user.*snippet_labels" --type sql -C 3

Repository: PublicDataWorks/verdad

Length of output: 15097


🏁 Script executed:

#!/bin/bash

echo "=== Complete snippet_labels table definition ==="
rg "CREATE TABLE.*snippet_labels" --type sql -A 30

echo ""
echo "=== Complete label_upvotes table definition ==="
rg "CREATE TABLE.*label_upvotes" --type sql -A 30

echo ""
echo "=== Check snippet_labels columns ==="
rg "snippet_labels" --type sql | grep -E "created_by|upvoted_by|applied_by" | head -20

Repository: PublicDataWorks/verdad

Length of output: 827


🏁 Script executed:

#!/bin/bash

echo "=== Full snippet_labels table definition from remote_schema ==="
rg "CREATE TABLE.*snippet_labels" --type sql -A 50 supabase/migrations/20241029135348_remote_schema.sql

echo ""
echo "=== Full label_upvotes table definition from remote_schema ==="
rg "CREATE TABLE.*label_upvotes" --type sql -A 50 supabase/migrations/20241029135348_remote_schema.sql

echo ""
echo "=== Search for 'applied_by' to understand snippet_labels structure ==="
rg "applied_by" --type sql -B 5 -A 2 supabase/migrations/20241029135348_remote_schema.sql | head -60

Repository: PublicDataWorks/verdad

Length of output: 1890


🏁 Script executed:

#!/bin/bash

echo "=== Search for label_upvotes definition with broader pattern ==="
rg "label_upvotes" --type sql supabase/migrations/20241029135348_remote_schema.sql -B 5 -A 10 | head -80

echo ""
echo "=== Search for CREATE TABLE around line 2000-3000 ==="
cat supabase/migrations/20241029135348_remote_schema.sql | grep -n "label_upvotes" | head -20

Repository: PublicDataWorks/verdad

Length of output: 5662


Fix labeled_snippet_ids CTE to filter by label creator, not upvoter.

The labeled_snippet_ids CTE (lines 88-97) currently filters by lu.upvoted_by, which checks who upvoted the label. To match the semantic intent of "labeledBy", it should filter by sl.applied_by to check who applied the label to the snippet.

Change lines 94-95 from:

(labeled_by_me AND NOT labeled_by_others AND lu.upvoted_by = current_user_id) OR
(labeled_by_others AND NOT labeled_by_me AND lu.upvoted_by != current_user_id)

To:

(labeled_by_me AND NOT labeled_by_others AND sl.applied_by = current_user_id) OR
(labeled_by_others AND NOT labeled_by_me AND sl.applied_by != current_user_id)

Repeat the same fix at lines 202-203 in the second occurrence of labeled_snippet_ids.

🤖 Prompt for AI Agents
In `@supabase/migrations/20260130_optimize_state_source_filters.sql` around lines
88 - 106, The labeled_snippet_ids CTE is incorrectly using the upvoter column
(lu.upvoted_by) to decide who "labeled" a snippet; change those predicates to
use the label creator column sl.applied_by instead. Specifically, in the
labeled_snippet_ids CTE replace conditions that check lu.upvoted_by =
current_user_id and lu.upvoted_by != current_user_id with sl.applied_by =
current_user_id and sl.applied_by != current_user_id respectively, and apply the
same replacement in the second occurrence of the labeled_snippet_ids CTE
elsewhere in the diff.

) INTO result
FROM snippets_with_labels s;

total_pages := CEIL(total_count::FLOAT / page_size);

⚠️ Potential issue | 🟡 Minor

Add validation to prevent division by zero when page_size is 0.

If a caller passes page_size = 0, line 380 will raise a division-by-zero error. Add input validation early in the function.

🛡️ Proposed fix with input validation

Add after the authentication check (around line 41):

    IF page_size <= 0 THEN
        RAISE EXCEPTION 'page_size must be greater than 0';
    END IF;
🤖 Prompt for AI Agents
In `@supabase/migrations/20260130_optimize_state_source_filters.sql` at line 380,
Add input validation to guard against division-by-zero: immediately after the
authentication check (around where the function starts), check the page_size
parameter and raise an exception if page_size <= 0 so the later computation
total_pages := CEIL(total_count::FLOAT / page_size) cannot divide by zero;
reference the page_size parameter and the total_pages assignment when applying
this change.
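Outside SQL, the same guard-plus-ceiling-division logic looks like this (a minimal sketch; the function name is hypothetical):

```python
import math

def compute_total_pages(total_count: int, page_size: int) -> int:
    """Mirror of total_pages := CEIL(total_count::FLOAT / page_size),
    with the validation the review proposes added up front."""
    if page_size <= 0:
        raise ValueError("page_size must be greater than 0")
    return math.ceil(total_count / page_size)
```

Raising early turns a cryptic division-by-zero deep in the query into an actionable input error at the function boundary.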

The update_snippet_like_count() function now tracks both likes
(value=1) and dislikes (value=-1) separately, updating the
dislike_count column alongside the existing like_count when
user_like_snippets changes.

Removes the legacy get_recording_filter_options() function, which has
zero callers in the codebase and no usage in production logs. The
frontend uses get_filtering_options() instead. Also adds missing
foreign key indexes identified by Supabase Database Advisor to improve
JOIN performance.
- Consolidate two separate queries (count + data) into a single CTE chain,
  eliminating duplicate filter CTEs that ran twice per call
- Use lightweight filtered_snippets CTE (IDs + sort columns only) for count
  and pagination, then fetch heavy columns for ~20 paginated rows only
- Fix dislike_count: use denormalized column instead of hardcoded 0

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (2)
supabase/migrations/20260129_optimize_recording_filter_and_indexes.sql (1)

4-4: Schema-qualify the dropped function for deterministic migrations.

The unqualified DROP FUNCTION depends on search_path and should explicitly target public.get_recording_filter_options(). While IF EXISTS provides safety, schema qualification ensures deterministic execution across different database configurations.

🔧 Proposed change
-DROP FUNCTION IF EXISTS get_recording_filter_options();
+DROP FUNCTION IF EXISTS public.get_recording_filter_options();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@supabase/migrations/20260129_optimize_recording_filter_and_indexes.sql` at
line 4, Replace the unqualified DROP FUNCTION with a schema-qualified drop to
make migrations deterministic: change the DROP of get_recording_filter_options()
to explicitly target public.get_recording_filter_options() (keeping the IF
EXISTS) so the migration does not depend on search_path resolution; update the
statement around the existing DROP FUNCTION IF EXISTS
get_recording_filter_options() accordingly.
supabase/database/sql/update_snippet_like_count.sql (1)

4-19: Consider consolidating the two COUNT subqueries into a single scan.

Both subqueries scan user_like_snippets for the same snippet. Using conditional aggregation reduces this to one table access per trigger invocation.

♻️ Proposed optimization using conditional aggregation
     UPDATE snippets
-    SET
-        like_count = (
-            SELECT COUNT(*)
-            FROM user_like_snippets
-            WHERE snippet = COALESCE(NEW.snippet, OLD.snippet)
-            AND value = 1
-        ),
-        dislike_count = (
-            SELECT COUNT(*)
-            FROM user_like_snippets
-            WHERE snippet = COALESCE(NEW.snippet, OLD.snippet)
-            AND value = -1
-        ),
+    SET
+        like_count = sub.like_count,
+        dislike_count = sub.dislike_count,
         user_last_activity = NOW()
-    WHERE id = COALESCE(NEW.snippet, OLD.snippet);
+    FROM (
+        SELECT
+            COUNT(*) FILTER (WHERE value = 1) AS like_count,
+            COUNT(*) FILTER (WHERE value = -1) AS dislike_count
+        FROM user_like_snippets
+        WHERE snippet = COALESCE(NEW.snippet, OLD.snippet)
+    ) sub
+    WHERE id = COALESCE(NEW.snippet, OLD.snippet);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@supabase/database/sql/update_snippet_like_count.sql` around lines 4 - 19,
Replace the two separate COUNT subqueries in update_snippet_like_count.sql with
a single aggregated scan over user_like_snippets using conditional aggregation:
compute like_count and dislike_count in one SELECT (e.g., SUM(CASE WHEN value =
1 THEN 1 ELSE 0 END) and SUM(CASE WHEN value = -1 THEN 1 ELSE 0 END)) for the
snippet identified by COALESCE(NEW.snippet, OLD.snippet) and assign those
results to snippets.like_count and snippets.dislike_count while still updating
user_last_activity = NOW(); keep references to NEW/OLD.snippet and the snippets
table unchanged.
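The behavior of the single-scan rewrite can be sanity-checked outside Postgres. A minimal in-memory SQLite sketch (table and column names copied from the PR, sample data invented; SUM(CASE ...) stands in for Postgres's COUNT(*) FILTER, which older SQLite versions lack):

```python
import sqlite3

# Toy schema mirroring user_like_snippets; the rows are made-up test data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_like_snippets (snippet INTEGER, value INTEGER);
    INSERT INTO user_like_snippets VALUES (1, 1), (1, 1), (1, -1), (2, 1);
""")

# One scan produces both counts, instead of two correlated COUNT subqueries.
like_count, dislike_count = conn.execute(
    """
    SELECT
        COALESCE(SUM(CASE WHEN value = 1 THEN 1 ELSE 0 END), 0),
        COALESCE(SUM(CASE WHEN value = -1 THEN 1 ELSE 0 END), 0)
    FROM user_like_snippets
    WHERE snippet = ?
    """,
    (1,),
).fetchone()
```

For snippet 1 this yields like_count 2 and dislike_count 1 from a single pass over the table, which is the saving the review comment is after.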
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@supabase/database/sql/get_snippets_function.sql`:
- Around line 136-143: The LEFT JOIN to user_hide_snippets (alias uhs) is
missing a per-user filter so any user's hide will hide the snippet for everyone;
modify the JOIN condition in get_snippets_function.sql to include the current
user identifier (e.g., add "AND uhs.user_id = <current_user_param>" using the
function's user param such as p_user_id or equivalent) so uhs only matches hides
by the requesting user, and apply the same change at the other occurrence noted
around line 270.
- Around line 96-115: The labeled_snippet_ids CTE is using label_upvotes
(lu.upvoted_by) instead of the label applicant field, causing "labeled by me" to
mirror "upvoted by me"; change the logic in labeled_snippet_ids to check
snippet_labels.applied_by (sl.applied_by) against current_user_id and
remove/avoid relying on lu.upvoted_by for those labeled_by_* conditions (you can
drop the join to label_upvotes or keep it only if needed for other checks), and
update the WHERE clause to use (labeled_by_me AND NOT labeled_by_others AND
sl.applied_by = current_user_id) and (labeled_by_others AND NOT labeled_by_me
AND sl.applied_by != current_user_id) while preserving has_labeled_filter and
DISTINCT sl.snippet.

---

Nitpick comments:
In `@supabase/database/sql/update_snippet_like_count.sql`:
- Around line 4-19: Replace the two separate COUNT subqueries in
update_snippet_like_count.sql with a single aggregated scan over
user_like_snippets using conditional aggregation: compute like_count and
dislike_count in one SELECT (e.g., SUM(CASE WHEN value = 1 THEN 1 ELSE 0 END)
and SUM(CASE WHEN value = -1 THEN 1 ELSE 0 END)) for the snippet identified by
COALESCE(NEW.snippet, OLD.snippet) and assign those results to
snippets.like_count and snippets.dislike_count while still updating
user_last_activity = NOW(); keep references to NEW/OLD.snippet and the snippets
table unchanged.

In `@supabase/migrations/20260129_optimize_recording_filter_and_indexes.sql`:
- Line 4: Replace the unqualified DROP FUNCTION with a schema-qualified drop to
make migrations deterministic: change the DROP of get_recording_filter_options()
to explicitly target public.get_recording_filter_options() (keeping the IF
EXISTS) so the migration does not depend on search_path resolution; update the
statement around the existing DROP FUNCTION IF EXISTS
get_recording_filter_options() accordingly.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between c875082 and 08a45e7.

📒 Files selected for processing (3)
  • supabase/database/sql/get_snippets_function.sql
  • supabase/database/sql/update_snippet_like_count.sql
  • supabase/migrations/20260129_optimize_recording_filter_and_indexes.sql

Comment on lines +96 to 115
labeled_snippet_ids AS (
SELECT DISTINCT sl.snippet
FROM snippet_labels sl
JOIN label_upvotes lu ON lu.snippet_label = sl.id
WHERE has_labeled_filter AND (
(labeled_by_me AND labeled_by_others) OR
(labeled_by_me AND NOT labeled_by_others AND lu.upvoted_by = current_user_id) OR
(labeled_by_others AND NOT labeled_by_me AND lu.upvoted_by != current_user_id)
)
),
upvoted_snippet_ids AS (
SELECT DISTINCT sl.snippet
FROM snippet_labels sl
JOIN label_upvotes lu ON lu.snippet_label = sl.id
WHERE has_upvoted_filter AND (
(filter_upvoted_by_me AND filter_upvoted_by_others) OR
(filter_upvoted_by_me AND NOT filter_upvoted_by_others AND lu.upvoted_by = current_user_id) OR
(filter_upvoted_by_others AND NOT filter_upvoted_by_me AND lu.upvoted_by != current_user_id)
)
),

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check table structures for snippet_labels and label_upvotes
echo "=== Full Database Schema (SQL files) ==="
find . -name "*.sql" -type f | head -20

echo ""
echo "=== Searching for snippet_labels table definition ==="
rg -n "CREATE TABLE.*snippet_labels" --type sql -A 25

echo ""
echo "=== Searching for label_upvotes table definition ==="
rg -n "CREATE TABLE.*label_upvotes" --type sql -A 25

Repository: PublicDataWorks/verdad

Length of output: 1332


🏁 Script executed:

#!/bin/bash
echo "=== All SQL files in database directory ==="
find ./supabase/database -name "*.sql" -type f | sort

echo ""
echo "=== Search for 'snippet_labels' mentions ==="
rg "snippet_labels" --type sql -B 2 -A 2 | head -60

echo ""
echo "=== Search for 'label_upvotes' mentions ==="
rg "label_upvotes" --type sql -B 2 -A 2 | head -60

echo ""
echo "=== Check for migrations directory ==="
find ./supabase/database -type d | sort

Repository: PublicDataWorks/verdad

Length of output: 13528


🏁 Script executed:

#!/bin/bash
echo "=== snippet_labels table definition ==="
rg "create table.*snippet_labels" --type sql -A 15 supabase/migrations/

echo ""
echo "=== label_upvotes table definition ==="
rg "create table.*label_upvotes" --type sql -A 15 supabase/migrations/

echo ""
echo "=== Check applied_by column in snippet_labels ==="
rg "applied_by" supabase/migrations/ --type sql -B 2 -A 2

Repository: PublicDataWorks/verdad

Length of output: 5524


🏁 Script executed:

#!/bin/bash
echo "=== Extract labeled_snippet_ids and upvoted_snippet_ids from get_snippets_function.sql ==="
sed -n '96,115p' supabase/database/sql/get_snippets_function.sql

echo ""
echo "=== Check how labeled_by_me and filter_upvoted_by_me are defined ==="
rg "labeled_by_me|filter_upvoted_by_me" supabase/database/sql/get_snippets_function.sql -B 2 -A 2

Repository: PublicDataWorks/verdad

Length of output: 2702


Fix labeled_snippet_ids CTE to filter by label applicant, not upvote endorser.

Both labeled_snippet_ids and upvoted_snippet_ids CTEs currently use identical logic, joining snippet_labels with label_upvotes and filtering on lu.upvoted_by. The schema distinguishes between label application (snippet_labels.applied_by) and upvote endorsement (label_upvotes.upvoted_by). The labeled_snippet_ids CTE should filter snippets where the current user applied the label (sl.applied_by = current_user_id), not where they upvoted it. Currently, both filters are semantically equivalent, causing the "labeled by me" filter to behave identically to "upvoted by me."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@supabase/database/sql/get_snippets_function.sql` around lines 96 - 115, The
labeled_snippet_ids CTE is using label_upvotes (lu.upvoted_by) instead of the
label applicant field, causing "labeled by me" to mirror "upvoted by me"; change
the logic in labeled_snippet_ids to check snippet_labels.applied_by
(sl.applied_by) against current_user_id and remove/avoid relying on
lu.upvoted_by for those labeled_by_* conditions (you can drop the join to
label_upvotes or keep it only if needed for other checks), and update the WHERE
clause to use (labeled_by_me AND NOT labeled_by_others AND sl.applied_by =
current_user_id) and (labeled_by_others AND NOT labeled_by_me AND sl.applied_by
!= current_user_id) while preserving has_labeled_filter and DISTINCT sl.snippet.

@PublicDataWorks PublicDataWorks deleted a comment from coderabbitai Bot Feb 28, 2026
@PublicDataWorks PublicDataWorks deleted a comment from coderabbitai Bot Feb 28, 2026
@PublicDataWorks PublicDataWorks deleted a comment from coderabbitai Bot Feb 28, 2026
@PublicDataWorks PublicDataWorks deleted a comment from coderabbitai Bot Feb 28, 2026
@quancao-ea quancao-ea changed the title Optimize get_snippets filters to fix timeout issues VER-304: Optimize get_snippets filters to fix timeout issues Mar 2, 2026
@linear

linear Bot commented Mar 2, 2026

@quancao-ea quancao-ea changed the title VER-304: Optimize get_snippets filters to fix timeout issues VER-304: Optimize get_snippets filters and searching to fix timeout issues Mar 2, 2026
@quancao-ea quancao-ea merged commit 94e0225 into main Mar 2, 2026
2 checks passed
@quancao-ea quancao-ea deleted the fix/optimize-get-snippets-filters branch March 17, 2026 02:42