Fix resume contact inference and LinkedIn field handling by michaelmwu · Pull Request #130 · 508-dev/508-workflows

michaelmwu · 2026-03-03T09:54:15Z

Description

Resume-based contact creation now shows the backend error text + status directly in the Discord failure message.
LinkedIn field usage is now centralized via _configured_linkedin_field and used for search/create/update flows.
Resume inference failure for no-match now includes parsed name/email so users can verify the candidate identity before creating.
LinkedIn update embeds now read from the same configured field used in update payloads.

Related Issue

N/A

How Has This Been Tested?

uv run pytest tests/unit/test_crm.py -k "search_contacts_by_field_uses_configured_linkedin_field or build_resume_parsed_identity_summary_includes_name_and_email or upload_resume_no_matching_inferred_contact_shows_name_and_email or update_contact_uses_configured_linkedin_field or test_resume_create_contact_view_logs_create_failure or test_upload_resume_link_user_shows_confirm_then_creates_contact"
uv run pytest tests/unit/test_resume_extractor.py -k "split_name"

Summary by CodeRabbit

New Features
- Automatic first/last name extraction for resumes and configurable LinkedIn field usage.
- Concise parsed-identity summaries shown when resume inference is ambiguous.
- Enhanced error messages include detailed status and error info.
Improvements
- Dynamic field selection in contact lookups and more robust deduplication.
- Name fields reliably populated across resume create/link flows; clearer prompts.
Tests
- Expanded unit tests covering name-splitting, resume flows, and LinkedIn-field behavior.

coderabbitai · 2026-03-03T09:54:35Z

📥 Commits

Reviewing files that changed from the base of the PR and between cf56227 and d7666b3.

📒 Files selected for processing (4)

apps/discord_bot/src/five08/discord_bot/cogs/crm.py
packages/shared/src/five08/resume_extractor.py
tests/unit/test_crm.py
tests/unit/test_resume_extractor.py

📝 Walkthrough

Added configurable LinkedIn-field handling in the CRM cog and expanded resume extraction with robust name-splitting (LLM + heuristic + fallbacks). Name fields (first/last) are surfaced in extracted profiles and used across resume-to-CRM payloads; error messages for resume create/upload include richer details.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Resume Name Extraction `packages/shared/src/five08/resume_extractor.py` | Introduced first_name/last_name on ResumeExtractedProfile; added split_name with LLM-based `_split_name_with_llm`, heuristic `_split_name_heuristically`, normalization and placeholder handling; updated prompts/output schema and both extraction flows to populate first_name/last_name. |
| CRM Cog & Payloads `apps/discord_bot/src/five08/discord_bot/cogs/crm.py` | Added module+class helpers for configurable LinkedIn field (`_configured_linkedin_field`), `_populate_name_fields`, and `_build_resume_parsed_identity_summary`. Replaced hard-coded `cLinkedInUrl` uses with configurable field, populated firstName/lastName in resume/contact payloads, and enriched resume create/upload error details (error_detail, status_note). |
| Tests — CRM `tests/unit/test_crm.py` | Added/updated tests to assert configured LinkedIn field is used in searches and updates, verify firstName/lastName population from resume extraction (including single-name fallback), and check identity-summary and error messaging in resume flows. |
| Tests — Resume Extractor `tests/unit/test_resume_extractor.py` | Added unit tests for split_name behavior: preferring LLM when available, falling back to heuristic on failure, single-token name handling, numeric-token filtering, hint/placeholder behavior, and integration of mocked LLM responses. |

Sequence Diagram(s)

sequenceDiagram
    participant Client as Resume Upload Client
    participant Extractor as ResumeProfileExtractor
    participant LLM as LLM Service
    participant Heuristic as Heuristic Parser
    participant CRM as CRM Contact Handler

    Client->>Extractor: extract(file_content, name_hints)
    Extractor->>Extractor: parse_resume(file_content)
    Extractor->>Extractor: split_name(full_name, hints)
    alt hints present
        Extractor->>Extractor: use hints → (firstName, lastName)
    else no hints
        Extractor->>LLM: request name split
        LLM-->>Extractor: (firstName, lastName) or error
        alt LLM succeeded
            Extractor->>Extractor: return LLM result
        else LLM failed
            Extractor->>Heuristic: parse_common_formats(full_name)
            Heuristic-->>Extractor: (firstName, lastName) with fallbacks
        end
    end
    Extractor->>Client: ResumeExtractedProfile(firstName, lastName, ...)
    Client->>CRM: create_contact_from_resume(profile)
    CRM->>CRM: _populate_name_fields(payload, source_name)
    CRM->>CRM: payload[_configured_linkedin_field()] = linkedin_url
    CRM-->>Client: contact created or enriched error_detail/status_note

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

fix: show resume lookup identifiers and create prospects #125: Modifies the same CRM resume-inference flows and helper methods for resume identity summaries and contact payloads.
feat(shared): reuse resume extractor in bot and worker #113: Overlaps on resume extraction/name-splitting helpers and CRM resume handling used across bot workflows.
fix: unify resume extraction updates #116: Makes complementary changes to resume extraction and CRM linking (name-splitting and LinkedIn-field handling overlap).

Poem

🐰 Hop, I nibble names with care,

First and last found from thin air,
LLM hints, heuristics too,
LinkedIn field now chosen true,
Contacts sprout where resumes go — hooray!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 75.44% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main changes: fixing resume contact inference (with parsed name/email display) and centralizing LinkedIn field handling across search, create, and update flows. |

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (1)

apps/discord_bot/src/five08/discord_bot/cogs/crm.py (1)

1541-1545: Harden configured LinkedIn field normalization.

If crm_linkedin_field is blank/whitespace, downstream payload/search keys become invalid. Consider normalizing with .strip() and falling back to the default key.

💡 Proposed hardening

     @staticmethod
     def _configured_linkedin_field() -> str:
         """Return the configured field for LinkedIn profile values."""
-        return str(getattr(settings, "crm_linkedin_field", "cLinkedInUrl"))
+        configured = str(getattr(settings, "crm_linkedin_field", "cLinkedInUrl")).strip()
+        return configured or "cLinkedInUrl"

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@apps/discord_bot/src/five08/discord_bot/cogs/crm.py` around lines 1541 -
1545, The _configured_linkedin_field() currently returns
settings.crm_linkedin_field without normalization which can produce
blank/whitespace keys; update it to read getattr(settings, "crm_linkedin_field",
"cLinkedInUrl"), .strip() the result, and if the stripped value is empty use the
default "cLinkedInUrl" so downstream payload/search keys remain valid (make
changes in the _configured_linkedin_field function).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/discord_bot/src/five08/discord_bot/cogs/crm.py`:
- Around line 3415-3419: The fallback name replacement can leave stale
firstName/lastName values in payload because _populate_name_fields calls
split_name which short-circuits when those keys exist; after setting
payload["name"] = self._fallback_contact_name_for_discord_user(user) (when
parsed_name is falsy or "Resume Candidate"), remove or overwrite any existing
payload["firstName"] and payload["lastName"] before calling
self._populate_name_fields so split_name recomputes names from the new
source_name, or alternatively call _populate_name_fields with a flag to force
recompute; update the logic around _fallback_contact_name_for_discord_user,
payload["name"], and the invocation of _populate_name_fields accordingly.
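The stale-field problem described in this finding can be illustrated with a minimal sketch. The helper names here (`populate_name_fields`, `apply_fallback_name`) are hypothetical stand-ins, and the whitespace-based split is a simplification rather than the project's `split_name`; the only point is that stale `firstName`/`lastName` values are dropped before the name parts are recomputed.

```python
from typing import Any, Dict


def populate_name_fields(payload: Dict[str, Any], source_name: str) -> None:
    """Hypothetical stand-in: fills firstName/lastName only when they are missing."""
    if payload.get("firstName") and payload.get("lastName"):
        return  # short-circuit: existing values are trusted as-is
    first, _, last = source_name.strip().partition(" ")
    payload["firstName"] = first
    payload["lastName"] = last or "Unknown"  # illustrative fallback only


def apply_fallback_name(payload: Dict[str, Any], fallback_name: str) -> None:
    """Replace the display name and force the name parts to be recomputed."""
    payload["name"] = fallback_name
    # Drop stale parts so the short-circuit above cannot keep the old values.
    payload.pop("firstName", None)
    payload.pop("lastName", None)
    populate_name_fields(payload, fallback_name)
```

Without the two `pop` calls, a payload that already carried `Resume`/`Candidate` would keep those parts even after `name` was replaced.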

In `@packages/shared/src/five08/resume_extractor.py`:
- Around line 991-997: The code accepts placeholder hint values because
_normalize_name_part(first_name_hint) and _normalize_name_part(last_name_hint)
are treated as valid; update the logic to detect and treat placeholders as
missing by introducing a check (or helper like _is_placeholder) against common
placeholders (e.g., "unknown", "n/a", "na", "none", "-", empty string) after
calling _normalize_name_part for first_name_hint and last_name_hint, clear those
values when they match placeholders, and only return (first_name, last_name) if
both are still non-placeholder; keep normalized_full_name = _normalize_name
unchanged for fallback parsing.
- Around line 1052-1059: The code currently falls back to returning full_name as
the first name (return split_first or full_name), which causes wrong outputs
when only lastName is present; update the return logic in the block using
_normalize_name_part and parsed (the variables split_first, split_last and
full_name) so you never substitute full_name for a missing split_first. Instead:
if both split_first and split_last are missing, return None; otherwise return
split_first (or a non-full-name fallback such as a
SINGLE_NAME_FALLBACK_FIRST_NAME constant or None) and split_last or
SINGLE_NAME_FALLBACK_LAST_NAME—do not use full_name as a fallback for first
name. Ensure changes are applied where split_first, split_last and full_name are
referenced (function/method that contains _normalize_name_part usage).
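One way to read the placeholder-hint finding above is sketched below. The placeholder set is taken from the examples in the comment; `is_placeholder` and `usable_hints` are hypothetical names, and the real code would normalize through `_normalize_name_part` rather than a plain `strip()`.

```python
from typing import Optional, Tuple

# Placeholder values listed in the review comment; extend as needed.
PLACEHOLDER_NAME_PARTS = {"", "unknown", "n/a", "na", "none", "-"}


def is_placeholder(value: Optional[str]) -> bool:
    """Treat missing values and common placeholder strings as 'no hint'."""
    if value is None:
        return True
    return value.strip().casefold() in PLACEHOLDER_NAME_PARTS


def usable_hints(
    first_hint: Optional[str], last_hint: Optional[str]
) -> Optional[Tuple[str, str]]:
    """Return the hints only when both are real values, else None."""
    if is_placeholder(first_hint) or is_placeholder(last_hint):
        return None  # caller should fall back to parsing the full name
    return first_hint.strip(), last_hint.strip()
```

Returning `None` rather than a partial pair keeps the decision to fall back to full-name parsing in one place.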


ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 114b3ee and 7861434.

📒 Files selected for processing (4)

apps/discord_bot/src/five08/discord_bot/cogs/crm.py
packages/shared/src/five08/resume_extractor.py
tests/unit/test_crm.py
tests/unit/test_resume_extractor.py

Copilot

Pull request overview

This PR improves resume-driven contact creation and update flows by adding first/last name handling to resume extraction, centralizing which CRM field stores LinkedIn URLs, and enhancing Discord-facing error and inference messages for better operator visibility.

Changes:

Add first/last name extraction/splitting to ResumeProfileExtractor, and propagate those fields into CRM contact creation payloads.
Centralize LinkedIn field selection via _configured_linkedin_field() and use it across search/create/update paths.
Improve Discord failure/no-match messaging to include backend error/status and parsed candidate identity (name/email), with corresponding unit tests.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

| File | Description |
|---|---|
| `packages/shared/src/five08/resume_extractor.py` | Adds `first_name`/`last_name` fields and implements `split_name` with LLM/heuristic fallback. |
| `apps/discord_bot/src/five08/discord_bot/cogs/crm.py` | Populates CRM `firstName`/`lastName`, centralizes LinkedIn field usage, and improves user-facing failure/no-match messages. |
| `tests/unit/test_crm.py` | Extends unit coverage for configured LinkedIn field usage, first/last name population, and improved messaging. |
| `tests/unit/test_resume_extractor.py` | Adds unit tests for `split_name` behavior and fallback paths. |

Copilot · 2026-03-03T10:01:16Z

+    @staticmethod
+    def _configured_linkedin_field() -> str:
+        """Return the configured field for LinkedIn profile values."""
+        return str(getattr(settings, "crm_linkedin_field", "cLinkedInUrl"))


_configured_linkedin_field() uses str(getattr(...)), which will turn None into the literal string 'None' (and will also accept an empty string). That can produce invalid Espo field names and break search/update payloads. Consider validating the setting (must be a non-empty str), falling back to "cLinkedInUrl" when invalid, and reusing this helper everywhere LinkedIn field naming is needed (e.g. _field_label currently reads settings.crm_linkedin_field directly).

Suggested change

-        return str(getattr(settings, "crm_linkedin_field", "cLinkedInUrl"))
+        value = getattr(settings, "crm_linkedin_field", None)
+        if isinstance(value, str):
+            value = value.strip()
+            if value:
+                return value
+        return "cLinkedInUrl"
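To make the `None` pitfall concrete, here is a standalone sketch of the validated lookup. It takes the settings object as a parameter so it can be exercised in isolation, which is an assumption for illustration; the actual helper in the PR is a static method that reads the module-level `settings`.

```python
from types import SimpleNamespace

DEFAULT_LINKEDIN_FIELD = "cLinkedInUrl"


def configured_linkedin_field(settings: object) -> str:
    """Return a non-empty field name, falling back to the default when invalid."""
    value = getattr(settings, "crm_linkedin_field", None)
    if isinstance(value, str):
        value = value.strip()
        if value:
            return value
    return DEFAULT_LINKEDIN_FIELD


# The unvalidated form turns None into the literal string "None":
broken = str(getattr(SimpleNamespace(crm_linkedin_field=None), "crm_linkedin_field", "cLinkedInUrl"))
fixed = configured_linkedin_field(SimpleNamespace(crm_linkedin_field=None))
```

Here `broken` is the string `"None"`, while `fixed` is the default field name.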

Copilot · 2026-03-03T10:01:16Z

+    async def test_search_contacts_by_field_uses_configured_linkedin_field(
+        self, crm_cog
+    ):
+        """Search-by-field uses the configured LinkedIn field name."""


This test name/docstring says it verifies use of the configured LinkedIn field, but it actually hard-codes field="cLinkedIn" and doesn’t exercise _configured_linkedin_field(). Consider renaming/rewording the test to reflect what it asserts (select list includes the requested field and excludes the default) or patch _configured_linkedin_field() and call the code path that uses it.

Suggested change

-    async def test_search_contacts_by_field_uses_configured_linkedin_field(
-        self, crm_cog
-    ):
-        """Search-by-field uses the configured LinkedIn field name."""
+    async def test_search_contacts_by_field_includes_requested_field_and_excludes_default(
+        self, crm_cog
+    ):
+        """Search-by-field includes the requested LinkedIn field and excludes the legacy default."""

Copilot · 2026-03-03T10:01:17Z

+        return (
+            first_name or DEFAULT_FALLBACK_FIRST_NAME,
+            last_name or inferred_last or SINGLE_NAME_FALLBACK_LAST_NAME,
+        )


split_name returns SINGLE_NAME_FALLBACK_LAST_NAME when full_name is missing, even though DEFAULT_FALLBACK_LAST_NAME is defined. This makes the “no name at all” fallback inconsistent with _split_name_heuristically (which returns Resume/Candidate) and leaves DEFAULT_FALLBACK_LAST_NAME effectively unused. Consider returning (DEFAULT_FALLBACK_FIRST_NAME, DEFAULT_FALLBACK_LAST_NAME) when normalized_full_name is falsy and no usable hints were provided.

Copilot · 2026-03-03T10:01:17Z

+        if normalized_full_name:
+            inferred = None
+            if self.client is not None:
+                try:
+                    inferred = self._split_name_with_llm(normalized_full_name)
+                except Exception:
+                    inferred = None
+            if inferred is None:
+                inferred = self._split_name_heuristically(normalized_full_name)


split_name will make an additional LLM request whenever self.client is set and name hints are missing. Since extract() already calls the LLM, this can add extra latency/cost (and it can also be triggered for placeholder names like "Resume Candidate"). Consider skipping the second LLM call (use heuristics instead) or gating it to only run when the name looks like a real person name and/or when the upstream LLM output omitted firstName/lastName unexpectedly.
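A rough sketch of the gating idea follows, under stated assumptions: `split_with_optional_llm` is a hypothetical free function, the splitters are passed in as callables, and "looks like a real person name" is approximated as "not a known placeholder and has at least two tokens". The PR's actual `split_name` is a method on the extractor and may gate differently.

```python
from typing import Callable, Optional, Tuple

NameParts = Tuple[str, str]

# Assumption: placeholder full names that should never trigger an LLM call.
PLACEHOLDER_FULL_NAMES = {"resume candidate"}


def split_with_optional_llm(
    full_name: str,
    heuristic_split: Callable[[str], NameParts],
    llm_split: Optional[Callable[[str], Optional[NameParts]]] = None,
) -> NameParts:
    """Use the LLM splitter only for plausible multi-token names; otherwise use heuristics."""
    normalized = " ".join(full_name.split())
    looks_real = (
        normalized.casefold() not in PLACEHOLDER_FULL_NAMES and " " in normalized
    )
    if llm_split is not None and looks_real:
        try:
            inferred = llm_split(normalized)
        except Exception:
            inferred = None  # any LLM failure falls through to the heuristic
        if inferred is not None:
            return inferred
    return heuristic_split(normalized)
```

Single-token and placeholder names skip the extra request entirely, which addresses the latency and cost concern without removing the LLM path for ambiguous multi-part names.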

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/discord_bot/src/five08/discord_bot/cogs/crm.py`:
- Around line 1391-1393: The exception text assigned to error_detail (and the
similar usage at lines 1417–1418) must be bounded and sanitized before being
logged or sent to Discord: replace control characters and problematic
markdown/backticks, collapse excessive whitespace, truncate to a safe length
(e.g., 1900 characters to leave room for surrounding text like status_note), and
then use that sanitized_error variable in logger.exception and in any Discord
message payload; update the references to use sanitized_error (and keep
status_note unchanged) so long raw exceptions cannot exceed Discord limits or
break message formatting.
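The sanitization steps listed in this finding could look roughly like the sketch below. `sanitize_error_detail` is a hypothetical helper name, and the 1900-character cap is the value suggested in the comment (Discord caps message content at 2000 characters); the PR's actual implementation may differ.

```python
import re

MAX_ERROR_DETAIL_CHARS = 1900  # leaves headroom under Discord's 2000-character limit


def sanitize_error_detail(raw: object, limit: int = MAX_ERROR_DETAIL_CHARS) -> str:
    """Make exception text safe to log and embed in a Discord message."""
    text = str(raw)
    # Replace control characters (including newlines and tabs) with spaces.
    text = re.sub(r"[\x00-\x1f\x7f]", " ", text)
    # Neutralize backticks so the text cannot open or close a code span.
    text = text.replace("`", "'")
    # Collapse runs of whitespace left behind by the replacements.
    text = re.sub(r"\s+", " ", text).strip()
    if len(text) > limit:
        # Reserve one character for the ellipsis marker.
        text = text[: limit - 1].rstrip() + "…"
    return text
```

The same sanitized string would then be used both in the log call and in the Discord payload, with `status_note` appended unchanged.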

In `@packages/shared/src/five08/resume_extractor.py`:
- Around line 1095-1143: The heuristic splitter _split_name_heuristically
currently treats "Doe, Jane" as ["Doe,", "Jane"] and returns first="Doe," which
is wrong; update _split_name_heuristically to detect comma-delimited "Last,
First" (or "Last, First Middle") cases early: if full_name contains a comma,
split on the first comma, trim both sides, treat the left side as last and the
right side as the remaining given name(s), normalize both with
_normalize_name_part, apply existing prefix/suffix handling
(NAME_PREFIXES/NAME_SUFFIXES) to the right/left parts as appropriate, and then
return (first_normalized_or_fallback, last_normalized_or_fallback) so that "Doe,
Jane" yields first="Jane" and last="Doe" while preserving existing fallback
logic (SINGLE_NAME_FALLBACK_LAST_NAME, DEFAULT_FALLBACK_FIRST_NAME,
DEFAULT_FALLBACK_LAST_NAME).
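The comma-delimited case can be sketched in isolation as follows. `split_comma_delimited_name` is a hypothetical helper; it omits the prefix/suffix handling (`NAME_PREFIXES`/`NAME_SUFFIXES`) and the fallback constants mentioned above, and it returns `None` so a caller can continue with the existing heuristic path.

```python
from typing import Optional, Tuple


def split_comma_delimited_name(full_name: str) -> Optional[Tuple[str, str]]:
    """Return (first, last) for 'Last, First [Middle]' input, else None."""
    if "," not in full_name:
        return None
    # Split on the first comma only: left side is the family name.
    last_part, _, given_part = full_name.partition(",")
    last = " ".join(last_part.split())
    given = " ".join(given_part.split())
    if not last or not given:
        return None  # e.g. "Doe," has no given name to extract
    return given, last
```

With this check placed early in the heuristic splitter, `"Doe, Jane"` yields first `Jane` and last `Doe` instead of treating `"Doe,"` as the first name.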

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7861434 and cf56227.

📒 Files selected for processing (4)

apps/discord_bot/src/five08/discord_bot/cogs/crm.py
packages/shared/src/five08/resume_extractor.py
tests/unit/test_crm.py
tests/unit/test_resume_extractor.py

🚧 Files skipped from review as they are similar to previous changes (2)

tests/unit/test_crm.py
tests/unit/test_resume_extractor.py

michaelmwu temporarily deployed to test March 3, 2026 09:54 — with GitHub Actions Inactive

michaelmwu requested a review from Copilot March 3, 2026 09:55

Copilot started reviewing on behalf of michaelmwu March 3, 2026 09:55

coderabbitai Bot reviewed Mar 3, 2026

Copilot AI reviewed Mar 3, 2026

michaelmwu had a problem deploying to test March 3, 2026 10:29 — with GitHub Actions Failure

coderabbitai Bot reviewed Mar 3, 2026

michaelmwu had a problem deploying to test March 3, 2026 10:37 — with GitHub Actions Failure

michaelmwu had a problem deploying to test March 3, 2026 10:41 — with GitHub Actions Failure

michaelmwu temporarily deployed to test March 3, 2026 10:42 — with GitHub Actions Inactive

michaelmwu added 7 commits March 3, 2026 18:43

Fix resume contact creation name fields and feedback

d9e6d8e

Fix resume contact inference and linkedin field handling

6b9595d

Fix resume split-name fallback for contact creation

79b96ee

Address LinkedIn config and resume name extraction review findings

34f570c

Fix discord display name handling when discriminator invalid

ab231e2

Sanitize resume contact creation errors for Discord

54432a3

Use Discord username for cDiscordUsername in contact linking flows

cfbd971

michaelmwu force-pushed the michaelmwu/fix-ci-contact-name branch from 74e0c54 to cfbd971 Compare March 3, 2026 10:43

michaelmwu temporarily deployed to test March 3, 2026 10:43 — with GitHub Actions Inactive

Handle comma-delimited names in heuristic split

d7666b3

michaelmwu temporarily deployed to test March 3, 2026 10:46 — with GitHub Actions Inactive

michaelmwu merged commit ec84c15 into main Mar 3, 2026
5 checks passed

michaelmwu deleted the michaelmwu/fix-ci-contact-name branch March 3, 2026 10:47

This was referenced Mar 3, 2026

fix: stabilize resume contact inference parsing #133

Merged

feat: extend update-contact fields #174

Merged

Clarify empty LLM fallback reason #186

Merged

This was referenced Mar 18, 2026

refactor CRM resume processing flow #206

Merged

fix: support LinkedIn URL contact lookup #214

Merged
