Fix candidate search ordering and test schema by michaelmwu · Pull Request #150 · 508-dev/508-workflows

michaelmwu · 2026-03-04T09:47:57Z

Description

Aggregate discord member rows before joining and order by coalesced member status.
Align candidate search E2E test schema with migrations (constraints, timestamps, triggers, indexes).

Related Issue

N/A

How Has This Been Tested?

Summary by CodeRabbit

New Features
- Enhanced candidate search ranking to prioritize skill strength and preferred-match metrics alongside member status for improved result ordering.
Tests
- Updated test infrastructure and database schema with new fields for tracking timestamps and member information.

coderabbitai · 2026-03-04T09:48:17Z

Walkthrough

The PR refactors the candidate search query to aggregate Discord user information into a new CTE (dm_agg) and expands result ordering criteria. It simultaneously updates test database schemas to include new metadata columns (created_at, updated_at, is_member, github_username, seniority, etc.) with auto-update triggers, removes the UNIQUE constraint on crm_contact_id, and extends fixture logic to populate the new fields.
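To illustrate why aggregating before the join matters, here is a minimal sketch using SQLite from Python as a stand-in for PostgreSQL. The table layout, sample rows, and the `role_count` column are assumptions made for the example, not the project's actual schema or query. It shows how a direct join fans out to one row per Discord row, while a pre-aggregated `dm_agg` CTE keeps one row per person.

```python
import sqlite3

# Hypothetical, simplified schema: one discord_members row per (user, role).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, discord_user_id TEXT);
CREATE TABLE discord_members (discord_user_id TEXT, role TEXT);
INSERT INTO people VALUES (1, 'Ada', 'u1'), (2, 'Lin', 'u2');
INSERT INTO discord_members VALUES
    ('u1', 'backend'), ('u1', 'devops'), ('u2', 'design');
""")

# Direct join: Ada appears twice because she has two discord_members rows.
direct_join = conn.execute("""
    SELECT p.name, dm.role
    FROM people p
    LEFT JOIN discord_members dm ON dm.discord_user_id = p.discord_user_id
""").fetchall()

# Aggregate first, then join: exactly one row per person.
aggregated_join = conn.execute("""
    WITH dm_agg AS (
        SELECT discord_user_id, COUNT(DISTINCT role) AS role_count
        FROM discord_members
        GROUP BY discord_user_id
    )
    SELECT p.name, COALESCE(dm_agg.role_count, 0)
    FROM people p
    LEFT JOIN dm_agg ON dm_agg.discord_user_id = p.discord_user_id
    ORDER BY p.id
""").fetchall()

print(len(direct_join), aggregated_join)
```

With duplicate rows removed before the join, any ORDER BY applied afterwards ranks people rather than person-and-role pairs.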

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Candidate Search Query Logic `packages/shared/src/five08/candidate_search.py` | Introduces a new `dm_agg` CTE to pre-aggregate Discord user metadata (discord_user_id, discord_username, display_name, roles) from discord_members. Replaces direct join to discord_members with join to dm_agg. Extends ORDER BY from just `p.is_member` to include `required_skill_score` and `preferred_matched` for broader ranking criteria. |
| Test Database Schemas & Fixtures `tests/integration/test_candidate_search_e2e.py` | Updates `people` table schema: removes UNIQUE constraint from crm_contact_id, adds created_at/updated_at (TIMESTAMPTZ with auto-update triggers), contact_type, address_city, github_username, is_member, seniority, linkedin, latest_resume_id, latest_resume_name, and sync_status CHECK constraint. Updates `discord_members` table with roles JSONB default and timestamp columns. Adds indexes on people (email, discord_user_id, skills GIN, is_member, seniority, address_country) and discord_members. Introduces CREATE_UPDATED_AT_FUNCTION and CREATE_UPDATED_AT_TRIGGER for auto-timestamp updates. Updates `_insert` helper to populate new columns with Jsonb wrappers. |
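The auto-update trigger idea can be sketched as follows. This is an illustration only: it uses SQLite from Python rather than PostgreSQL, and the trigger name `trg_people_updated_at` and the reduced `people` table are hypothetical. In PostgreSQL the same effect is usually achieved with a trigger function that assigns `NEW.updated_at` in a BEFORE UPDATE trigger; whether the test file does exactly that is not shown in this review.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (
    id INTEGER PRIMARY KEY,
    name TEXT,
    updated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);

-- Hypothetical trigger: refresh updated_at whenever a row changes.
-- SQLite does not re-fire this trigger for its own UPDATE because
-- recursive triggers are disabled by default.
CREATE TRIGGER trg_people_updated_at
AFTER UPDATE ON people
FOR EACH ROW
BEGIN
    UPDATE people SET updated_at = CURRENT_TIMESTAMP WHERE id = NEW.id;
END;
""")

# Seed a row with a deliberately old timestamp so the change is visible.
conn.execute(
    "INSERT INTO people (id, name, updated_at) "
    "VALUES (1, 'Ada', '2000-01-01 00:00:00')"
)
before = conn.execute("SELECT updated_at FROM people WHERE id = 1").fetchone()[0]

conn.execute("UPDATE people SET name = 'Ada L.' WHERE id = 1")
after = conn.execute("SELECT updated_at FROM people WHERE id = 1").fetchone()[0]

print(before, "->", after)
```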

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

fix: candidate role search SQL #149: Modifies candidate_search.py to refactor how discord_members data is joined and used with COALESCE for discord member fields, overlapping with the dm_agg CTE introduction.
feat: add job match extraction and ranked candidate search #142: Changes candidate_search.py ranking and query logic including ORDER BY criteria, directly related to the expanded sorting in this PR.
feat: loose candidate matching with Discord role sync #148: Updates candidate_search.py to change discord_members/discord_roles joins and introduces aggregated fields, parallel to the dm_agg CTE approach here.

Poem

🐰 A CTE blooms where Discord dwells,
Aggregating tales and bells,
With timestamps auto-ticking true,
The schema dances, fresh and new—
Sorting by skill and favor's gleam! ✨

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Fix candidate search ordering and test schema' accurately reflects the main changes: ordering logic updates in candidate_search.py and schema alignment in the test integration file. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

packages/shared/src/five08/candidate_search.py (1)

195-200: ⚠️ Potential issue | 🟠 Major

SQL ranking is still being overridden downstream.

This ORDER BY is not final because Line 264 re-sorts in Python with a different precedence (it omits timezone_matched and uses Discord role count instead of the SQL binary match), so candidate order can still diverge from the intended SQL ranking.

💡 Suggested alignment fix

```diff
-    # Secondary sort: preserve primary SQL ranking, break ties with seniority alignment.
-    results.sort(
-        key=lambda c: (
-            not c.is_member,
-            -len(c.matched_required_skills),
-            -len(c.matched_discord_roles),
-            -c.required_skill_score,
-            -len(c.matched_preferred_skills),
-            -c.seniority_score,
-        )
-    )
+    preferred_tz_set = set(preferred_timezones)
+    results.sort(
+        key=lambda c: (
+            not c.is_member,
+            -(1 if preferred_tz_set and c.timezone in preferred_tz_set else 0),
+            -len(c.matched_required_skills),
+            -(1 if c.matched_discord_roles else 0),
+            -c.required_skill_score,
+            -len(c.matched_preferred_skills),
+            -c.seniority_score,
+        )
+    )
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/shared/src/five08/candidate_search.py` around lines 195 - 200, The
SQL ORDER BY you added (is_member, timezone_matched, required_matched,
discord_role_matched, required_skill_score, preferred_matched) is being
overridden by the Python re-sort later (around line 264); update the downstream
sort used in the function in this module so it preserves the same precedence and
semantics as the SQL ORDER BY (or remove the Python re-sort entirely).
Specifically, ensure the Python comparator/key uses the same tuple ordering:
is_member, timezone_matched (include this boolean), required_matched,
discord_role_matched as a binary match (not a count), required_skill_score,
preferred_matched — so the final ordering matches the SQL ranking.
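A small, self-contained sketch of the divergence described above. The `Candidate` dataclass and both key functions are hypothetical simplifications, not the module's real types. The sketch compares a count-based key that ignores the timezone match with a key that follows the SQL precedence and treats the Discord role match as binary.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    # Hypothetical, reduced candidate record for illustration only.
    name: str
    is_member: bool
    timezone_matched: bool
    matched_required_skills: list[str] = field(default_factory=list)
    matched_discord_roles: list[str] = field(default_factory=list)


def count_based_key(c: Candidate):
    # Mirrors the old Python re-sort: no timezone term, role *count*.
    return (
        not c.is_member,
        -len(c.matched_required_skills),
        -len(c.matched_discord_roles),
    )


def sql_aligned_key(c: Candidate):
    # Mirrors the SQL precedence: timezone match first, role match as a flag.
    return (
        not c.is_member,
        not c.timezone_matched,
        -len(c.matched_required_skills),
        not c.matched_discord_roles,
    )


candidates = [
    Candidate("many_roles", True, False, ["python"], ["backend", "devops", "data"]),
    Candidate("tz_match", True, True, ["python"], ["backend"]),
]

by_count = [c.name for c in sorted(candidates, key=count_based_key)]
by_sql = [c.name for c in sorted(candidates, key=sql_aligned_key)]
print(by_count, by_sql)
```

The two keys rank the same pair in opposite orders, which is the kind of mismatch the comment warns about.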

🧹 Nitpick comments (2)

tests/integration/test_candidate_search_e2e.py (1)

58-66: Remove the duplicate people(discord_user_id) index.

CONSTRAINT uq_people_discord_user_id UNIQUE (discord_user_id) (Line 54) already creates a btree index, so idx_people_discord_user_id duplicates it and adds unnecessary setup/write overhead.

🧹 Suggested cleanup

```diff
 _CREATE_PEOPLE_INDEXES = [
     "CREATE INDEX IF NOT EXISTS idx_people_email ON people (email)",
     "CREATE INDEX IF NOT EXISTS idx_people_email_508 ON people (email_508)",
-    "CREATE INDEX IF NOT EXISTS idx_people_discord_user_id ON people (discord_user_id)",
     "CREATE INDEX IF NOT EXISTS idx_people_skills ON people USING gin (skills)",
     "CREATE INDEX IF NOT EXISTS idx_people_is_member ON people (is_member)",
     "CREATE INDEX IF NOT EXISTS idx_people_seniority ON people (seniority)",
     "CREATE INDEX IF NOT EXISTS idx_people_address_country ON people (address_country)",
 ]
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/integration/test_candidate_search_e2e.py` around lines 58 - 66, The
_CREATE_PEOPLE_INDEXES list contains a redundant index entry
"idx_people_discord_user_id" on people(discord_user_id) that duplicates the
btree index already created by the unique constraint CONSTRAINT
uq_people_discord_user_id UNIQUE (discord_user_id); remove the "CREATE INDEX IF
NOT EXISTS idx_people_discord_user_id ON people (discord_user_id)" element from
_CREATE_PEOPLE_INDEXES to avoid duplicate index creation and extra write
overhead.
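The point that a unique constraint already brings its own index can be checked quickly. The sketch below uses SQLite from Python as a convenient stand-in (an assumption of this example); PostgreSQL likewise backs a UNIQUE constraint with a unique btree index, though the index naming differs. The reduced `people` table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE people (
        id INTEGER PRIMARY KEY,
        discord_user_id TEXT,
        CONSTRAINT uq_people_discord_user_id UNIQUE (discord_user_id)
    )
""")

# No CREATE INDEX was issued, yet the table already has an index
# that enforces the unique constraint on discord_user_id.
index_names = [row[1] for row in conn.execute("PRAGMA index_list(people)")]
print(index_names)
```

Adding a second, explicit index on the same column would therefore only duplicate what the constraint provides.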

packages/shared/src/five08/candidate_search.py (1)

109-112: Make dm_agg.roles ordering deterministic.

jsonb_agg(DISTINCT role) has unspecified element order; adding ORDER BY role avoids unstable role ordering in results.

♻️ Suggested query tweak

```diff
-                COALESCE(
-                    jsonb_agg(DISTINCT role) FILTER (WHERE role IS NOT NULL),
-                    '[]'::jsonb
-                ) AS roles
+                COALESCE(
+                    jsonb_agg(DISTINCT role ORDER BY role) FILTER (WHERE role IS NOT NULL),
+                    '[]'::jsonb
+                ) AS roles
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/shared/src/five08/candidate_search.py` around lines 109 - 112, The
roles aggregation is nondeterministic because jsonb_agg(DISTINCT role) does not
guarantee element order; update the aggregate expression (the COALESCE wrapping
jsonb_agg(DISTINCT role) FILTER (WHERE role IS NOT NULL) AS roles) to include a
deterministic ordering (e.g., add ORDER BY role inside the jsonb_agg call) so
dm_agg.roles are returned in a stable, predictable order.
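For comparison, the semantics of the suggested aggregate (drop NULLs, deduplicate, sort, fall back to an empty list) can be written in a few lines of Python. The `normalize_roles` helper is hypothetical and not part of the codebase. Note that Python sorts strings by code point, whereas PostgreSQL's `ORDER BY` follows the column collation, so the two orders may differ for non-ASCII role names.

```python
from typing import Iterable, Optional


def normalize_roles(roles: Iterable[Optional[str]]) -> list[str]:
    """Deduplicate roles, drop missing values, and return them in sorted order."""
    return sorted({role for role in roles if role is not None})


# The same input in a different arrival order yields the same output.
first = normalize_roles(["devops", None, "backend", "devops"])
second = normalize_roles(["backend", "devops", None])
print(first, second)
```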

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ec63434a-3cb3-45ca-9d8d-791e9ba5681d

📥 Commits

Reviewing files that changed from the base of the PR and between f071694 and 11805f1.

📒 Files selected for processing (2)

packages/shared/src/five08/candidate_search.py
tests/integration/test_candidate_search_e2e.py

Fix candidate search ordering

11805f1

michaelmwu temporarily deployed to test with GitHub Actions on March 4, 2026 09:48 (deployment now inactive)

coderabbitai Bot reviewed Mar 4, 2026

michaelmwu merged commit 82a5f9b into main Mar 4, 2026
5 checks passed

michaelmwu deleted the michaelmwu/candidate-search-ordering branch March 4, 2026 09:56

This was referenced Mar 4, 2026

Add match scoring and resume download dropdown #151

Merged

Improve location ranking for /match-candidates #172

Merged

michaelmwu merged 1 commit into main from michaelmwu/candidate-search-ordering
