Fix candidate search ordering and test schema#150
Conversation
📝 WalkthroughWalkthroughThe PR refactors the candidate search query to aggregate Discord user information into a new CTE (dm_agg) and expands result ordering criteria. It simultaneously updates test database schemas to include new metadata columns (created_at, updated_at, is_member, github_username, seniority, etc.) with auto-update triggers, removes the UNIQUE constraint on crm_contact_id, and extends fixture logic to populate the new fields. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/shared/src/five08/candidate_search.py (1)
195-200:⚠️ Potential issue | 🟠 MajorSQL ranking is still being overridden downstream.
This
ORDER BYis not final because Line 264 re-sorts in Python with a different precedence (it omitstimezone_matchedand uses Discord role count instead of the SQL binary match), so candidate order can still diverge from the intended SQL ranking.💡 Suggested alignment fix
- # Secondary sort: preserve primary SQL ranking, break ties with seniority alignment. - results.sort( - key=lambda c: ( - not c.is_member, - -len(c.matched_required_skills), - -len(c.matched_discord_roles), - -c.required_skill_score, - -len(c.matched_preferred_skills), - -c.seniority_score, - ) - ) + preferred_tz_set = set(preferred_timezones) + results.sort( + key=lambda c: ( + not c.is_member, + -(1 if preferred_tz_set and c.timezone in preferred_tz_set else 0), + -len(c.matched_required_skills), + -(1 if c.matched_discord_roles else 0), + -c.required_skill_score, + -len(c.matched_preferred_skills), + -c.seniority_score, + ) + )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/shared/src/five08/candidate_search.py` around lines 195 - 200, The SQL ORDER BY you added (is_member, timezone_matched, required_matched, discord_role_matched, required_skill_score, preferred_matched) is being overridden by the Python re-sort later (around line 264); update the downstream sort used in the function in this module so it preserves the same precedence and semantics as the SQL ORDER BY (or remove the Python re-sort entirely). Specifically, ensure the Python comparator/key uses the same tuple ordering: is_member, timezone_matched (include this boolean), required_matched, discord_role_matched as a binary match (not a count), required_skill_score, preferred_matched — so the final ordering matches the SQL ranking.
🧹 Nitpick comments (2)
tests/integration/test_candidate_search_e2e.py (1)
58-66: Remove the duplicatepeople(discord_user_id)index.
CONSTRAINT uq_people_discord_user_id UNIQUE (discord_user_id)(Line 54) already creates a btree index, soidx_people_discord_user_idduplicates it and adds unnecessary setup/write overhead.🧹 Suggested cleanup
_CREATE_PEOPLE_INDEXES = [ "CREATE INDEX IF NOT EXISTS idx_people_email ON people (email)", "CREATE INDEX IF NOT EXISTS idx_people_email_508 ON people (email_508)", - "CREATE INDEX IF NOT EXISTS idx_people_discord_user_id ON people (discord_user_id)", "CREATE INDEX IF NOT EXISTS idx_people_skills ON people USING gin (skills)", "CREATE INDEX IF NOT EXISTS idx_people_is_member ON people (is_member)", "CREATE INDEX IF NOT EXISTS idx_people_seniority ON people (seniority)", "CREATE INDEX IF NOT EXISTS idx_people_address_country ON people (address_country)", ]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/integration/test_candidate_search_e2e.py` around lines 58 - 66, The _CREATE_PEOPLE_INDEXES list contains a redundant index entry "idx_people_discord_user_id" on people(discord_user_id) that duplicates the btree index already created by the unique constraint CONSTRAINT uq_people_discord_user_id UNIQUE (discord_user_id); remove the "CREATE INDEX IF NOT EXISTS idx_people_discord_user_id ON people (discord_user_id)" element from _CREATE_PEOPLE_INDEXES to avoid duplicate index creation and extra write overhead.packages/shared/src/five08/candidate_search.py (1)
109-112: Makedm_agg.rolesordering deterministic.
jsonb_agg(DISTINCT role)has unspecified element order; addingORDER BY roleavoids unstable role ordering in results.♻️ Suggested query tweak
- COALESCE( - jsonb_agg(DISTINCT role) FILTER (WHERE role IS NOT NULL), - '[]'::jsonb - ) AS roles + COALESCE( + jsonb_agg(DISTINCT role ORDER BY role) FILTER (WHERE role IS NOT NULL), + '[]'::jsonb + ) AS roles🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/shared/src/five08/candidate_search.py` around lines 109 - 112, The roles aggregation is nondeterministic because jsonb_agg(DISTINCT role) does not guarantee element order; update the aggregate expression (the COALESCE wrapping jsonb_agg(DISTINCT role) FILTER (WHERE role IS NOT NULL) AS roles) to include a deterministic ordering (e.g., add ORDER BY role inside the jsonb_agg call) so dm_agg.roles are returned in a stable, predictable order.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@packages/shared/src/five08/candidate_search.py`:
- Around line 195-200: The SQL ORDER BY you added (is_member, timezone_matched,
required_matched, discord_role_matched, required_skill_score, preferred_matched)
is being overridden by the Python re-sort later (around line 264); update the
downstream sort used in the function in this module so it preserves the same
precedence and semantics as the SQL ORDER BY (or remove the Python re-sort
entirely). Specifically, ensure the Python comparator/key uses the same tuple
ordering: is_member, timezone_matched (include this boolean), required_matched,
discord_role_matched as a binary match (not a count), required_skill_score,
preferred_matched — so the final ordering matches the SQL ranking.
---
Nitpick comments:
In `@packages/shared/src/five08/candidate_search.py`:
- Around line 109-112: The roles aggregation is nondeterministic because
jsonb_agg(DISTINCT role) does not guarantee element order; update the aggregate
expression (the COALESCE wrapping jsonb_agg(DISTINCT role) FILTER (WHERE role IS
NOT NULL) AS roles) to include a deterministic ordering (e.g., add ORDER BY role
inside the jsonb_agg call) so dm_agg.roles are returned in a stable, predictable
order.
In `@tests/integration/test_candidate_search_e2e.py`:
- Around line 58-66: The _CREATE_PEOPLE_INDEXES list contains a redundant index
entry "idx_people_discord_user_id" on people(discord_user_id) that duplicates
the btree index already created by the unique constraint CONSTRAINT
uq_people_discord_user_id UNIQUE (discord_user_id); remove the "CREATE INDEX IF
NOT EXISTS idx_people_discord_user_id ON people (discord_user_id)" element from
_CREATE_PEOPLE_INDEXES to avoid duplicate index creation and extra write
overhead.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ec63434a-3cb3-45ca-9d8d-791e9ba5681d
📒 Files selected for processing (2)
packages/shared/src/five08/candidate_search.pytests/integration/test_candidate_search_e2e.py
Description
Related Issue
How Has This Been Tested?
Summary by CodeRabbit
New Features
Tests