feat(shared): reuse resume extractor in bot and worker by michaelmwu · Pull Request #113 · 508-dev/508-workflows

michaelmwu · 2026-03-03T04:00:35Z

Description

Introduced a shared resume extraction service in packages/shared/src/five08/resume_extractor.py to normalize candidate fields, emit confidence, and provide LLM + heuristic fallback extraction.
Updated Discord bot CRM resume workflows to use that shared extractor for resume matching payloads, name inference, and inferred contact creation fields.
Refactored worker resume intake and resume profile processors to use the shared extractor and removed duplicated extraction logic.
Added bug-tracking filtering and normalized skill attr handling for strengths, website link arrays, and phone/email flow updates.
Validation passed pre-commit hooks (ruff, ruff format, mypy), with no runtime tests executed in this pass.

Related Issue

How Has This Been Tested?

Pre-commit checks (ruff, ruff format, mypy) passed on commit.

Summary by CodeRabbit

Release Notes

New Features

AI-powered resume extraction now captures comprehensive candidate profiles including skills, GitHub usernames, LinkedIn URLs, location, and experience level
Improved skill management with automatic normalization, deduplication, and intelligent filtering of irrelevant terms

Configuration

Added OpenAI integration settings to support advanced resume processing

coderabbitai · 2026-03-03T04:00:59Z

Warning

Rate limit exceeded

@michaelmwu has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 4 minutes and 26 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 01609db and 398a953.

📒 Files selected for processing (6)

apps/worker/src/five08/worker/crm/intake_form_processor.py
apps/worker/src/five08/worker/crm/processor.py
apps/worker/src/five08/worker/crm/resume_profile_processor.py
packages/shared/src/five08/resume_extractor.py
tests/unit/test_intake_form_processor.py
tests/unit/test_resume_profile_processor.py

📝 Walkthrough

Walkthrough

This PR introduces OpenAI-driven resume profile extraction and structured profile processing across the Discord bot and worker services. It adds a new shared ResumeProfileExtractor module supporting both LLM-based and heuristic extraction, refactors resume processing pipelines to use canonicalized skills and normalized seniority levels, integrates OpenAI configuration parameters, and updates skill and intake form processors to consume structured profile data.

Changes

Cohort / File(s)	Summary
OpenAI Configuration `apps/discord_bot/src/five08/discord_bot/config.py`	Adds three new settings fields: `openai_api_key`, `openai_base_url`, and `openai_model` (default "gpt-4o-mini") to support external LLM integration.
Resume Extraction Infrastructure `packages/shared/src/five08/resume_extractor.py`	New module introducing `ResumeProfileExtractor` with LLM-backed and heuristic resume parsing, `ResumeExtractedProfile` data model, normalization utilities for name/email/phone/seniority/skills, and DISALLOWED_SKILLS constant.
Discord CRM Resume Integration `apps/discord_bot/src/five08/discord_bot/cogs/crm.py`	Integrates `ResumeProfileExtractor`, introduces profile-based hint extraction via new `_extract_resume_profile` and `_extract_resume_contact_hints` methods, expands contact payload with emails, GitHub/LinkedIn URLs, skills, seniority, and country fields.
Worker Resume Processing `apps/worker/src/five08/worker/crm/resume_profile_processor.py`	Refactors to use `ResumeProfileExtractor`, adds comprehensive helper methods for skill coercion, deduplication, website/email normalization, and seniority mapping; enhances `apply_profile_updates` with validation and batch update logic.
Skills & Seniority Handling `apps/worker/src/five08/worker/crm/intake_form_processor.py`, `apps/worker/src/five08/worker/crm/skills_extractor.py`, `apps/worker/src/five08/worker/crm/processor.py`	Updates seniority map to include "principal" → "staff"; switches to canonicalized skill handling; adds `SkillsExtractor` integration to `EspoCRMClient`; introduces `ContactSkillsProcessor` class.
Data Model Alignment `apps/worker/src/five08/worker/models.py`	Aliases `ResumeExtractedProfile` to shared `SharedResumeExtractedProfile` implementation, removing duplicate local definition.
Test Coverage `tests/unit/test_crm.py`, `tests/unit/test_intake_form_processor.py`, `tests/unit/test_resume_profile_processor.py`, `tests/unit/test_skills_extractor.py`, `tests/unit/test_worker_processor.py`	Adds tests for skills normalization, seniority mapping, resume profile field serialization, website link deduplication, and email/skills merging behavior.

Sequence Diagram(s)

sequenceDiagram
    participant DCBot as Discord Bot
    participant CRMCog as CRM Cog
    participant ResExtractor as ResumeProfileExtractor
    participant OpenAI as OpenAI API
    participant ResProcProc as ResumeProfileProcessor
    participant EspoCRM as EspoCRM Client

    DCBot->>CRMCog: Upload resume file
    CRMCog->>ResExtractor: extract(file_content)
    ResExtractor->>ResExtractor: Check cache
    alt Cache miss
        ResExtractor->>OpenAI: Request profile extraction (LLM)
        OpenAI-->>ResExtractor: JSON profile response
    else Cache hit
        ResExtractor-->>ResExtractor: Return cached profile
    end
    Note over ResExtractor: Heuristic fallback if LLM unavailable
    ResExtractor-->>CRMCog: ResumeExtractedProfile
    
    CRMCog->>CRMCog: _extract_resume_contact_hints(profile)
    Note over CRMCog: Extract emails, GitHub, LinkedIn,<br/>phone, skills, seniority, country
    CRMCog->>CRMCog: _build_resume_create_contact_payload(hints)
    CRMCog-->>EspoCRM: Create contact with structured data

    rect rgba(100, 150, 200, 0.5)
        Note over ResProcProc: Background: Profile Processing
        ResProcProc->>ResProcProc: apply_profile_updates(profile)
        ResProcProc->>ResProcProc: Normalize & deduplicate skills
        ResProcProc->>ResProcProc: Coerce seniority level
        ResProcProc->>ResProcProc: Merge website links & emails
        ResProcProc-->>EspoCRM: Update contact fields
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

feat: add Google Forms intake webhook #74 — Both PRs modify IntakeFormProcessor for resume/profile extraction and seniority handling; overlapping changes to skills and form-level attribute canonicalization.
feat: add mailbox resume intake with candidate targeting #38 — Mailbox resume intake that ingests resumes into the ResumeProfileProcessor and CRM pipeline; directly consumes the new ResumeProfileExtractor functionality.
feat: async resume extraction and CRM apply flow #35 — Implements overlapping async resume extraction and CRM apply flow, modifying the same ResumeProfileProcessor, worker API endpoints, and ResumeProfileExtractor integration.

Poem

🐰 A rabbit hops through profiles new,
With skills extracted, shiny true!
From OpenAI to heuristics grand,
Resume data, neatly planned!
Deduplicated, canonicalized fast,
No more messy parsing—at last! 🎉

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 38.61% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title directly and concisely describes the main change: introducing a shared resume extractor that is reused across the bot and worker applications.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch michaelmwu/resume-skills

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

apps/worker/src/five08/worker/crm/resume_profile_processor.py (1)

356-381: ⚠️ Potential issue | 🟠 Major

Move the empty-update guard after Discord link enrichment.

The early return at Line 356 short-circuits link-only apply requests before cDiscordUserID/cDiscordUsername are added.

Suggested fix

-            if not sanitized_updates:
-                return ResumeApplyResult(
-                    contact_id=contact_id,
-                    updated_fields=[],
-                    success=False,
-                    error="No valid profile fields provided",
-                )
-
             link_applied = False
             if link_discord:
                 discord_user_id = str(link_discord.get("user_id", "")).strip()
                 discord_username = str(link_discord.get("username", "")).strip()
                 if discord_user_id and discord_username:
                     sanitized_updates["cDiscordUserID"] = discord_user_id
                     sanitized_updates["cDiscordUsername"] = (
                         f"{discord_username} (ID: {discord_user_id})"
                     )
                     link_applied = True

             if not sanitized_updates:
                 return ResumeApplyResult(
                     contact_id=contact_id,
                     updated_fields=[],
                     success=False,
                     error="No valid profile fields provided",
                 )

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@apps/worker/src/five08/worker/crm/resume_profile_processor.py` around lines
356 - 381, The current early-return that checks sanitized_updates and returns a
ResumeApplyResult happens before enriching sanitized_updates with Discord info,
which prevents link-only updates from applying; move the empty-update guard (the
conditional that returns ResumeApplyResult when sanitized_updates is false) to
after the Discord enrichment block that uses link_discord and sets
sanitized_updates["cDiscordUserID"] and sanitized_updates["cDiscordUsername"]
(and link_applied) so that Discord-only requests will be applied; update
references in resume_profile_processor.py around the sanitized_updates check,
the link_discord handling, and the ResumeApplyResult creation accordingly.

🧹 Nitpick comments (2)

apps/worker/src/five08/worker/crm/intake_form_processor.py (1)

413-416: Keep cSkillAttrs payload shape consistent across form and resume paths.

Form updates write {"skill": {"strength": N}}, while resume updates currently write {"skill": N}. Unifying this avoids schema drift in CRM payload consumers.

♻️ Suggested consistency patch

-    def _parse_profile_skill_attrs(self, profile: Any) -> dict[str, int]:
+    def _parse_profile_skill_attrs(self, profile: Any) -> dict[str, dict[str, int]]:
         raw_attrs = getattr(profile, "skill_attrs", {})
         if not isinstance(raw_attrs, dict):
             return {}
-        parsed: dict[str, int] = {}
+        parsed: dict[str, dict[str, int]] = {}
         for raw_skill, raw_payload in raw_attrs.items():
             normalized_name = self.skills_extractor.canonicalize_skill(str(raw_skill))
             if not normalized_name:
                 continue
             if isinstance(raw_payload, dict):
                 raw_payload = raw_payload.get("strength")
             try:
-                parsed[normalized_name] = max(1, min(5, int(raw_payload)))
+                parsed[normalized_name] = {
+                    "strength": max(1, min(5, int(raw_payload)))
+                }
             except Exception:
                 continue
         return parsed

Also applies to: 421-436

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@apps/worker/src/five08/worker/crm/intake_form_processor.py` around lines 413
- 416, The cSkillAttrs payload written by the form path must match the nested
shape used by the resume path; update the logic around
_parse_profile_skill_attrs and where updates["cSkillAttrs"] is set so each skill
value is an object like {"strength": N} rather than a bare number—ensure
updates["cSkillAttrs"] = json.dumps(profile_attrs) builds profile_attrs with
values as dicts (e.g., {"skill": {"strength": value}}) and apply the same
transformation in the resume-path code block covering the logic duplicated
around lines referenced (the other block at 421-436) so both form and resume
produce identical shapes; keep updates["skills"] = sorted(profile_attrs.keys())
unchanged.

tests/unit/test_skills_extractor.py (1)

117-125: Extend this test to also cover the hyphenated variant.

The extractor now disallows "bug-tracking" too; asserting that here will fully lock this regression case.

✅ Suggested test tweak

-    result = extractor._normalize_extracted_payload(
-        skills_value=["Bug Tracking", "Python", "bugtracking", "Code Review"],
+    result = extractor._normalize_extracted_payload(
+        skills_value=[
+            "Bug Tracking",
+            "Python",
+            "bugtracking",
+            "bug-tracking",
+            "Code Review",
+        ],
         skill_attrs_value=None,
         confidence=0.8,
         source="model",
     )

     assert "bug tracking" not in result.skills
     assert "bugtracking" not in result.skills
+    assert "bug-tracking" not in result.skills
     assert result.skills == ["python"]

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/unit/test_skills_extractor.py` around lines 117 - 125, Add an assertion
to ensure the hyphenated token is also filtered: extend the existing assertions
after creating the extractor result (referencing result.skills and the
skills_value list used to build it) to assert that "bug-tracking" is not in
result.skills, and keep the existing checks that "bug tracking" and
"bugtracking" are absent and that result.skills == ["python"].

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/shared/src/five08/resume_extractor.py`:
- Around line 136-166: _normailze_seniority currently returns the string
"unknown" for unmapped or empty inputs which is truthy and prevents the fallback
_infer_seniority_from_resume from running in extract; update
_normalize_seniority to (1) add explicit mappings for "principal" and "principal
engineer" (return a distinct "principal" value) and (2) return None (not
"unknown") for empty or otherwise-unmapped inputs so extract can fall back to
_infer_seniority_from_resume; apply the same change to the other normalization
occurrence referenced around lines 396-400.
- Around line 647-657: The heuristic parsing after the heading fails when the
matched position is followed by a newline because tail.splitlines()[0] returns
an empty string; change the logic in the block using line_start/tail/first_line
so you skip leading blank lines (e.g., strip leading whitespace or iterate to
the first non-empty line) before splitting into comma-separated items and
calling _normalize_skill_payload; ensure you still set attrs to
{skill.casefold(): DEFAULT_SKILL_STRENGTH} when skills exist but attrs is None.

In `@tests/unit/test_resume_profile_processor.py`:
- Around line 80-82: Change the multiline return annotation "-> (None)" to a
single-line "-> None" on the test function signature for
test_extract_profile_proposal_merges_and_serializes_website_and_skill_attrs (and
any other test functions that currently use "-> (None)" such as the similar test
near the end of the file). Edit the def lines so the return annotation is on the
same line as the def and use -> None without parentheses to satisfy ruff format
--check.

In `@tests/unit/test_worker_processor.py`:
- Around line 38-41: The current assertion assumes positional args and exact
order; change it to inspect the mock's call args
(processor.espocrm_client.update_contact_skills.call_args), extract the contact
id and the skills either from kwargs or from positional args, assert the contact
id equals "contact-1", and assert the skills using an order-insensitive
comparison (e.g., compare sets) so the test accepts any valid skill ordering
while still ensuring the same elements were passed.

---

Outside diff comments:
In `@apps/worker/src/five08/worker/crm/resume_profile_processor.py`:
- Around line 356-381: The current early-return that checks sanitized_updates
and returns a ResumeApplyResult happens before enriching sanitized_updates with
Discord info, which prevents link-only updates from applying; move the
empty-update guard (the conditional that returns ResumeApplyResult when
sanitized_updates is false) to after the Discord enrichment block that uses
link_discord and sets sanitized_updates["cDiscordUserID"] and
sanitized_updates["cDiscordUsername"] (and link_applied) so that Discord-only
requests will be applied; update references in resume_profile_processor.py
around the sanitized_updates check, the link_discord handling, and the
ResumeApplyResult creation accordingly.

---

Nitpick comments:
In `@apps/worker/src/five08/worker/crm/intake_form_processor.py`:
- Around line 413-416: The cSkillAttrs payload written by the form path must
match the nested shape used by the resume path; update the logic around
_parse_profile_skill_attrs and where updates["cSkillAttrs"] is set so each skill
value is an object like {"strength": N} rather than a bare number—ensure
updates["cSkillAttrs"] = json.dumps(profile_attrs) builds profile_attrs with
values as dicts (e.g., {"skill": {"strength": value}}) and apply the same
transformation in the resume-path code block covering the logic duplicated
around lines referenced (the other block at 421-436) so both form and resume
produce identical shapes; keep updates["skills"] = sorted(profile_attrs.keys())
unchanged.

In `@tests/unit/test_skills_extractor.py`:
- Around line 117-125: Add an assertion to ensure the hyphenated token is also
filtered: extend the existing assertions after creating the extractor result
(referencing result.skills and the skills_value list used to build it) to assert
that "bug-tracking" is not in result.skills, and keep the existing checks that
"bug tracking" and "bugtracking" are absent and that result.skills ==
["python"].

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2f926dd and 01609db.

📒 Files selected for processing (13)

apps/discord_bot/src/five08/discord_bot/cogs/crm.py
apps/discord_bot/src/five08/discord_bot/config.py
apps/worker/src/five08/worker/crm/intake_form_processor.py
apps/worker/src/five08/worker/crm/processor.py
apps/worker/src/five08/worker/crm/resume_profile_processor.py
apps/worker/src/five08/worker/crm/skills_extractor.py
apps/worker/src/five08/worker/models.py
packages/shared/src/five08/resume_extractor.py
tests/unit/test_crm.py
tests/unit/test_intake_form_processor.py
tests/unit/test_resume_profile_processor.py
tests/unit/test_skills_extractor.py
tests/unit/test_worker_processor.py

coderabbitai · 2026-03-03T04:06:53Z

+def _normalize_seniority(value: Any) -> str | None:
+    if not isinstance(value, str):
+        return None
+    normalized = value.strip().lower()
+    if not normalized:
+        return "unknown"
+    if normalized in {"jr", "junior", "entry", "entry-level", "entry level"}:
+        return "junior"
+    if normalized in {"intern", "internship"}:
+        return "junior"
+    if normalized in {"mid-level", "midlevel", "mid", "intermediate"}:
+        return "midlevel"
+    if normalized in {"staff", "staff+", "staff and beyond"}:
+        return "staff"
+    if normalized in {
+        "senior",
+        "sr",
+        "sr. engineer",
+        "lead",
+        "lead engineer",
+        "lead engineer/tech lead",
+    }:
+        return "senior"
+    if "lead" in normalized and ("engineer" in normalized or "lead" == normalized):
+        return "senior"
+    if "staff" in normalized:
+        return "staff"
+    if normalized.startswith("sr "):
+        return "senior"
+    return "unknown"
+


⚠️ Potential issue | 🟠 Major

Seniority normalization currently collapses principal titles and can block fallback inference.

_normalize_seniority does not map "principal"/"principal engineer" and returns "unknown" for unmapped values. In extract(), that truthy "unknown" prevents _infer_seniority_from_resume(...) from running.

🐛 Suggested fix

def _normalize_seniority(value: Any) -> str | None: @@ - if normalized in {"staff", "staff+", "staff and beyond"}: + if normalized in { + "staff", + "staff+", + "staff and beyond", + "principal", + "principal engineer", + }: return "staff" @@ - return "unknown" + return None

Also applies to: 396-400

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/shared/src/five08/resume_extractor.py` around lines 136 - 166, _normailze_seniority currently returns the string "unknown" for unmapped or empty inputs which is truthy and prevents the fallback _infer_seniority_from_resume from running in extract; update _normalize_seniority to (1) add explicit mappings for "principal" and "principal engineer" (return a distinct "principal" value) and (2) return None (not "unknown") for empty or otherwise-unmapped inputs so extract can fall back to _infer_seniority_from_resume; apply the same change to the other normalization occurrence referenced around lines 396-400.

coderabbitai · 2026-03-03T04:06:53Z

+        line_start = match.end()
+        tail = resume_text[line_start : line_start + 500]
+        first_line = tail.splitlines()[0] if tail else ""
+        if first_line:
+            skills, attrs = _normalize_skill_payload(
+                [item.strip() for item in first_line.split(",") if item.strip()],
+                None,
+            )
+            if skills and not attrs:
+                attrs = {skill.casefold(): DEFAULT_SKILL_STRENGTH for skill in skills}
+            return skills, attrs


⚠️ Potential issue | 🟠 Major

Heuristic skills parsing drops common Skills: sections followed by a newline.

After the heading match, tail usually starts with a newline, so splitlines()[0] is empty and no skills are parsed.

🐛 Suggested fix

- first_line = tail.splitlines()[0] if tail else "" - if first_line: + first_non_empty = next( + (line.strip() for line in tail.splitlines() if line.strip()), + "", + ) + if first_non_empty: skills, attrs = _normalize_skill_payload( - [item.strip() for item in first_line.split(",") if item.strip()], + [item.strip() for item in first_non_empty.split(",") if item.strip()], None, )

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

line_start = match.end()

tail = resume_text[line_start : line_start + 500]

first_line = tail.splitlines()[0] if tail else ""

if first_line:

skills, attrs = _normalize_skill_payload(

[item.strip() for item in first_line.split(",") if item.strip()],

None,

)

if skills and not attrs:

attrs = {skill.casefold(): DEFAULT_SKILL_STRENGTH for skill in skills}

return skills, attrs

line_start = match.end()

tail = resume_text[line_start : line_start + 500]

first_non_empty = next(

(line.strip() for line in tail.splitlines() if line.strip()),

"",

)

if first_non_empty:

skills, attrs = _normalize_skill_payload(

[item.strip() for item in first_non_empty.split(",") if item.strip()],

None,

)

if skills and not attrs:

attrs = {skill.casefold(): DEFAULT_SKILL_STRENGTH for skill in skills}

return skills, attrs

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/shared/src/five08/resume_extractor.py` around lines 647 - 657, The heuristic parsing after the heading fails when the matched position is followed by a newline because tail.splitlines()[0] returns an empty string; change the logic in the block using line_start/tail/first_line so you skip leading blank lines (e.g., strip leading whitespace or iterate to the first non-empty line) before splitting into comma-separated items and calling _normalize_skill_payload; ensure you still set attrs to {skill.casefold(): DEFAULT_SKILL_STRENGTH} when skills exist but attrs is None.

coderabbitai · 2026-03-03T04:06:53Z

+def test_extract_profile_proposal_merges_and_serializes_website_and_skill_attrs() -> (
+    None
+):


⚠️ Potential issue | 🟠 Major

Fix these function signatures to pass ruff format --check.

The current multiline -> (None) annotations are causing the formatter gate to fail.

Suggested fix

-def test_extract_profile_proposal_merges_and_serializes_website_and_skill_attrs() -> ( - None -): +def test_extract_profile_proposal_merges_and_serializes_website_and_skill_attrs() -> None: @@ -def test_apply_profile_updates_appends_resume_email_as_primary_emailAddressData() -> ( - None -): +def test_apply_profile_updates_appends_resume_email_as_primary_emailAddressData() -> None:

Also applies to: 313-315

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/unit/test_resume_profile_processor.py` around lines 80 - 82, Change the multiline return annotation "-> (None)" to a single-line "-> None" on the test function signature for test_extract_profile_proposal_merges_and_serializes_website_and_skill_attrs (and any other test functions that currently use "-> (None)" such as the similar test near the end of the file). Edit the def lines so the return annotation is on the same line as the def and use -> None without parentheses to satisfy ruff format --check.

coderabbitai · 2026-03-03T04:06:53Z

+    processor.espocrm_client.update_contact_skills.assert_called_once_with(
+        "contact-1",
+        ["python", "redis", "fastapi", "docker"],
+    )


⚠️ Potential issue | 🟠 Major

Make this assertion kwargs-aware and order-insensitive.

CI already reports this call arrives via kwargs and with a different (but valid) skill order.

Suggested fix

- processor.espocrm_client.update_contact_skills.assert_called_once_with( - "contact-1", - ["python", "redis", "fastapi", "docker"], - ) + processor.espocrm_client.update_contact_skills.assert_called_once() + _, kwargs = processor.espocrm_client.update_contact_skills.call_args + assert kwargs["contact_id"] == "contact-1" + assert set(kwargs["skills"]) == {"python", "redis", "fastapi", "docker"}

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

processor.espocrm_client.update_contact_skills.assert_called_once_with(

"contact-1",

["python", "redis", "fastapi", "docker"],

)

processor.espocrm_client.update_contact_skills.assert_called_once()

_, kwargs = processor.espocrm_client.update_contact_skills.call_args

assert kwargs["contact_id"] == "contact-1"

assert set(kwargs["skills"]) == {"python", "redis", "fastapi", "docker"}

🧰 Tools

🪛 GitHub Actions: Tests

[error] 38-38: CI failure in pytest: test_process_contact_skills_merges_and_updates did not call update_contact_skills with expected argument order. Actual call: update_contact_skills(contact_id='contact-1', skills=['python', 'redis', 'docker', 'fastapi']); expected: update_contact_skills('contact-1', ['python', 'redis', 'fastapi', 'docker']). Command: 'uv run pytest tests/ -v --tb=short'.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/unit/test_worker_processor.py` around lines 38 - 41, The current assertion assumes positional args and exact order; change it to inspect the mock's call args (processor.espocrm_client.update_contact_skills.call_args), extract the contact id and the skills either from kwargs or from positional args, assert the contact id equals "contact-1", and assert the skills using an order-insensitive comparison (e.g., compare sets) so the test accepts any valid skill ordering while still ensuring the same elements were passed.

michaelmwu added 9 commits March 3, 2026 10:55

feat: share resume extraction and update resume linking

8b08297

fix(worker): normalize resume skills before apply

826846c

Normalize CRM skills values to lowercase

1dfb19b

Normalize resume contact updates for skills, websites, and email arrays

5e0e87b

fix(resume): normalize and dedupe skills/seniority for reprocess

d2ca2d1

fix(resume): normalize seniority to known buckets

628d277

fix(resume): map principal to staff seniority bucket

13a6f4f

Resolve merge with main

35703d4

refactor: unify resume and intake skill extraction

01609db

michaelmwu had a problem deploying to test March 3, 2026 04:00 — with GitHub Actions Failure

Handle social links as cSocialLinks Url Multiple

a72add7

michaelmwu had a problem deploying to test March 3, 2026 04:03 — with GitHub Actions Failure

michaelmwu added 2 commits March 3, 2026 12:04

Populate cWebsiteLink from resume extracts in intake path

34bda32

Store cResumeLastProcessed in UTC datetime format

6889070

michaelmwu had a problem deploying to test March 3, 2026 04:05 — with GitHub Actions Failure

Normalize cRoles values to lowercase no-space enums

2ebbe5c

michaelmwu had a problem deploying to test March 3, 2026 04:06 — with GitHub Actions Failure

coderabbitai Bot reviewed Mar 3, 2026

View reviewed changes

Normalize resume extraction and apply skill/website handling

02c973d

michaelmwu had a problem deploying to test March 3, 2026 04:13 — with GitHub Actions Failure

fix(worker): resolve resume profile CI regressions

398a953

michaelmwu temporarily deployed to test March 3, 2026 04:16 — with GitHub Actions Inactive

michaelmwu merged commit 5d49f9b into main Mar 3, 2026
5 checks passed

michaelmwu deleted the michaelmwu/resume-skills branch March 3, 2026 04:17

This was referenced Mar 3, 2026

fix worker startup side effects and DRY crm normalization #140

Merged

feat: add job match extraction and ranked candidate search #142

Merged

Improve resume CRM inference and Espo client integration #168

Merged

This was referenced Mar 18, 2026

refactor CRM resume processing flow #206

Merged

Add member agreement Docuseal flow and role grant #215

Merged

coderabbitai Bot mentioned this pull request Mar 26, 2026

Improve resume parsing with website reparsing #219

Merged

coderabbitai Bot mentioned this pull request May 14, 2026

[codex] Add provider model request profiles #278

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(shared): reuse resume extractor in bot and worker#113

feat(shared): reuse resume extractor in bot and worker#113
michaelmwu merged 15 commits into
mainfrom
michaelmwu/resume-skills

michaelmwu commented Mar 3, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Mar 3, 2026 •

edited

Loading

Rate limit exceeded

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Mar 3, 2026

Uh oh!

coderabbitai Bot Mar 3, 2026

Uh oh!

coderabbitai Bot Mar 3, 2026

Uh oh!

coderabbitai Bot Mar 3, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

-    processor.espocrm_client.update_contact_skills.assert_called_once_with(
-        "contact-1",
-        ["python", "redis", "fastapi", "docker"],
-    )
+    processor.espocrm_client.update_contact_skills.assert_called_once()
+    _, kwargs = processor.espocrm_client.update_contact_skills.call_args
+    assert kwargs["contact_id"] == "contact-1"
+    assert set(kwargs["skills"]) == {"python", "redis", "fastapi", "docker"}

Conversation

michaelmwu commented Mar 3, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Related Issue

How Has This Been Tested?

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai Bot commented Mar 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Mar 3, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Mar 3, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Mar 3, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Mar 3, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

michaelmwu commented Mar 3, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Mar 3, 2026 •

edited

Loading