Skip to content

Fix resume contact inference and LinkedIn field handling#130

Merged
michaelmwu merged 8 commits into
mainfrom
michaelmwu/fix-ci-contact-name
Mar 3, 2026
Merged

Fix resume contact inference and LinkedIn field handling#130
michaelmwu merged 8 commits into
mainfrom
michaelmwu/fix-ci-contact-name

Conversation

@michaelmwu
Copy link
Copy Markdown
Member

@michaelmwu michaelmwu commented Mar 3, 2026

Description

  • Resume-based contact creation now shows the backend error text + status directly in the Discord failure message.
  • LinkedIn field usage is now centralized via _configured_linkedin_field and used for search/create/update flows.
  • Resume inference failure for no-match now includes parsed name/email so users can verify the candidate identity before creating.
  • LinkedIn update embeds now read from the same configured field used in update payloads.

Related Issue

  • N/A

How Has This Been Tested?

  • uv run pytest tests/unit/test_crm.py -k "search_contacts_by_field_uses_configured_linkedin_field or build_resume_parsed_identity_summary_includes_name_and_email or upload_resume_no_matching_inferred_contact_shows_name_and_email or update_contact_uses_configured_linkedin_field or test_resume_create_contact_view_logs_create_failure or test_upload_resume_link_user_shows_confirm_then_creates_contact"
  • uv run pytest tests/unit/test_resume_extractor.py -k "split_name"

Summary by CodeRabbit

  • New Features

    • Automatic first/last name extraction for resumes and configurable LinkedIn field usage.
    • Concise parsed-identity summaries shown when resume inference is ambiguous.
    • Enhanced error messages include detailed status and error info.
  • Improvements

    • Dynamic field selection in contact lookups and more robust deduplication.
    • Name fields reliably populated across resume create/link flows; clearer prompts.
  • Tests

    • Expanded unit tests covering name-splitting, resume flows, and LinkedIn-field behavior.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 3, 2026

Warning

Rate limit exceeded

@michaelmwu has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 3 minutes and 9 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between cf56227 and d7666b3.

📒 Files selected for processing (4)
  • apps/discord_bot/src/five08/discord_bot/cogs/crm.py
  • packages/shared/src/five08/resume_extractor.py
  • tests/unit/test_crm.py
  • tests/unit/test_resume_extractor.py
📝 Walkthrough

Walkthrough

Added configurable LinkedIn-field handling in the CRM cog and expanded resume extraction with robust name-splitting (LLM + heuristic + fallbacks). Name fields (first/last) are surfaced in extracted profiles and used across resume-to-CRM payloads; error messages for resume create/upload include richer details.

Changes

Cohort / File(s) Summary
Resume Name Extraction
packages/shared/src/five08/resume_extractor.py
Introduced first_name/last_name on ResumeExtractedProfile; added split_name with LLM-based _split_name_with_llm, heuristic _split_name_heuristically, normalization and placeholder handling; updated prompts/output schema and both extraction flows to populate first_name/last_name.
CRM Cog & Payloads
apps/discord_bot/src/five08/discord_bot/cogs/crm.py
Added module+class helpers for configurable LinkedIn field (_configured_linkedin_field), _populate_name_fields, and _build_resume_parsed_identity_summary. Replaced hard-coded cLinkedInUrl uses with configurable field, populated firstName/lastName in resume/contact payloads, and enriched resume create/upload error details (error_detail, status_note).
Tests — CRM
tests/unit/test_crm.py
Added/updated tests to assert configured LinkedIn field is used in searches and updates, verify firstName/lastName population from resume extraction (including single-name fallback), and check identity-summary and error messaging in resume flows.
Tests — Resume Extractor
tests/unit/test_resume_extractor.py
Added unit tests for split_name behavior: preferring LLM when available, falling back to heuristic on failure, single-token name handling, numeric-token filtering, hint/placeholder behavior, and integration of mocked LLM responses.

Sequence Diagram(s)

sequenceDiagram
    participant Client as Resume Upload Client
    participant Extractor as ResumeProfileExtractor
    participant LLM as LLM Service
    participant Heuristic as Heuristic Parser
    participant CRM as CRM Contact Handler

    Client->>Extractor: extract(file_content, name_hints)
    Extractor->>Extractor: parse_resume(file_content)
    Extractor->>Extractor: split_name(full_name, hints)
    alt hints present
        Extractor->>Extractor: use hints → (firstName, lastName)
    else no hints
        Extractor->>LLM: request name split
        LLM-->>Extractor: (firstName, lastName) or error
        alt LLM succeeded
            Extractor->>Extractor: return LLM result
        else LLM failed
            Extractor->>Heuristic: parse_common_formats(full_name)
            Heuristic-->>Extractor: (firstName, lastName) with fallbacks
        end
    end
    Extractor->>Client: ResumeExtractedProfile(firstName, lastName, ...)
    Client->>CRM: create_contact_from_resume(profile)
    CRM->>CRM: _populate_name_fields(payload, source_name)
    CRM->>CRM: payload[_configured_linkedin_field()] = linkedin_url
    CRM-->>Client: contact created or enriched error_detail/status_note
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Poem

🐰 Hop, I nibble names with care,

First and last found from thin air,
LLM hints, heuristics too,
LinkedIn field now chosen true,
Contacts sprout where resumes go — hooray!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 75.44% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main changes: fixing resume contact inference (with parsed name/email display) and centralizing LinkedIn field handling across search, create, and update flows.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch michaelmwu/fix-ci-contact-name

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
apps/discord_bot/src/five08/discord_bot/cogs/crm.py (1)

1541-1545: Harden configured LinkedIn field normalization.

If crm_linkedin_field is blank/whitespace, downstream payload/search keys become invalid. Consider normalizing with .strip() and falling back to the default key.

💡 Proposed hardening
     `@staticmethod`
     def _configured_linkedin_field() -> str:
         """Return the configured field for LinkedIn profile values."""
-        return str(getattr(settings, "crm_linkedin_field", "cLinkedInUrl"))
+        configured = str(getattr(settings, "crm_linkedin_field", "cLinkedInUrl")).strip()
+        return configured or "cLinkedInUrl"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/discord_bot/src/five08/discord_bot/cogs/crm.py` around lines 1541 -
1545, The _configured_linkedin_field() currently returns
settings.crm_linkedin_field without normalization which can produce
blank/whitespace keys; update it to read getattr(settings, "crm_linkedin_field",
"cLinkedInUrl"), .strip() the result, and if the stripped value is empty use the
default "cLinkedInUrl" so downstream payload/search keys remain valid (make
changes in the _configured_linkedin_field function).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/discord_bot/src/five08/discord_bot/cogs/crm.py`:
- Around line 3415-3419: The fallback name replacement can leave stale
firstName/lastName values in payload because _populate_name_fields calls
split_name which short-circuits when those keys exist; after setting
payload["name"] = self._fallback_contact_name_for_discord_user(user) (when
parsed_name is falsy or "Resume Candidate"), remove or overwrite any existing
payload["firstName"] and payload["lastName"] before calling
self._populate_name_fields so split_name recomputes names from the new
source_name, or alternatively call _populate_name_fields with a flag to force
recompute; update the logic around _fallback_contact_name_for_discord_user,
payload["name"], and the invocation of _populate_name_fields accordingly.

In `@packages/shared/src/five08/resume_extractor.py`:
- Around line 991-997: The code accepts placeholder hint values because
_normalize_name_part(first_name_hint) and _normalize_name_part(last_name_hint)
are treated as valid; update the logic to detect and treat placeholders as
missing by introducing a check (or helper like _is_placeholder) against common
placeholders (e.g., "unknown", "n/a", "na", "none", "-", empty string) after
calling _normalize_name_part for first_name_hint and last_name_hint, clear those
values when they match placeholders, and only return (first_name, last_name) if
both are still non-placeholder; keep normalized_full_name = _normalize_name
unchanged for fallback parsing.
- Around line 1052-1059: The code currently falls back to returning full_name as
the first name (return split_first or full_name), which causes wrong outputs
when only lastName is present; update the return logic in the block using
_normalize_name_part and parsed (the variables split_first, split_last and
full_name) so you never substitute full_name for a missing split_first. Instead:
if both split_first and split_last are missing, return None; otherwise return
split_first (or a non-full-name fallback such as a
SINGLE_NAME_FALLBACK_FIRST_NAME constant or None) and split_last or
SINGLE_NAME_FALLBACK_LAST_NAME—do not use full_name as a fallback for first
name. Ensure changes are applied where split_first, split_last and full_name are
referenced (function/method that contains _normalize_name_part usage).

---

Nitpick comments:
In `@apps/discord_bot/src/five08/discord_bot/cogs/crm.py`:
- Around line 1541-1545: The _configured_linkedin_field() currently returns
settings.crm_linkedin_field without normalization which can produce
blank/whitespace keys; update it to read getattr(settings, "crm_linkedin_field",
"cLinkedInUrl"), .strip() the result, and if the stripped value is empty use the
default "cLinkedInUrl" so downstream payload/search keys remain valid (make
changes in the _configured_linkedin_field function).

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 114b3ee and 7861434.

📒 Files selected for processing (4)
  • apps/discord_bot/src/five08/discord_bot/cogs/crm.py
  • packages/shared/src/five08/resume_extractor.py
  • tests/unit/test_crm.py
  • tests/unit/test_resume_extractor.py

Comment thread apps/discord_bot/src/five08/discord_bot/cogs/crm.py
Comment thread packages/shared/src/five08/resume_extractor.py
Comment thread packages/shared/src/five08/resume_extractor.py
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves resume-driven contact creation and update flows by adding first/last name handling to resume extraction, centralizing which CRM field stores LinkedIn URLs, and enhancing Discord-facing error and inference messages for better operator visibility.

Changes:

  • Add first/last name extraction/splitting to ResumeProfileExtractor, and propagate those fields into CRM contact creation payloads.
  • Centralize LinkedIn field selection via _configured_linkedin_field() and use it across search/create/update paths.
  • Improve Discord failure/no-match messaging to include backend error/status and parsed candidate identity (name/email), with corresponding unit tests.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File Description
packages/shared/src/five08/resume_extractor.py Adds first_name/last_name fields and implements split_name with LLM/heuristic fallback.
apps/discord_bot/src/five08/discord_bot/cogs/crm.py Populates CRM firstName/lastName, centralizes LinkedIn field usage, and improves user-facing failure/no-match messages.
tests/unit/test_crm.py Extends unit coverage for configured LinkedIn field usage, first/last name population, and improved messaging.
tests/unit/test_resume_extractor.py Adds unit tests for split_name behavior and fallback paths.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@staticmethod
def _configured_linkedin_field() -> str:
"""Return the configured field for LinkedIn profile values."""
return str(getattr(settings, "crm_linkedin_field", "cLinkedInUrl"))
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_configured_linkedin_field() uses str(getattr(...)), which will turn None into the literal string 'None' (and will also accept an empty string). That can produce invalid Espo field names and break search/update payloads. Consider validating the setting (must be a non-empty str), falling back to "cLinkedInUrl" when invalid, and reusing this helper everywhere LinkedIn field naming is needed (e.g. _field_label currently reads settings.crm_linkedin_field directly).

Suggested change
return str(getattr(settings, "crm_linkedin_field", "cLinkedInUrl"))
value = getattr(settings, "crm_linkedin_field", None)
if isinstance(value, str):
value = value.strip()
if value:
return value
return "cLinkedInUrl"

Copilot uses AI. Check for mistakes.
Comment thread tests/unit/test_crm.py Outdated
Comment on lines +1582 to +1585
async def test_search_contacts_by_field_uses_configured_linkedin_field(
self, crm_cog
):
"""Search-by-field uses the configured LinkedIn field name."""
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test name/docstring says it verifies use of the configured LinkedIn field, but it actually hard-codes field="cLinkedIn" and doesn’t exercise _configured_linkedin_field(). Consider renaming/rewording the test to reflect what it asserts (select list includes the requested field and excludes the default) or patch _configured_linkedin_field() and call the code path that uses it.

Suggested change
async def test_search_contacts_by_field_uses_configured_linkedin_field(
self, crm_cog
):
"""Search-by-field uses the configured LinkedIn field name."""
async def test_search_contacts_by_field_includes_requested_field_and_excludes_default(
self, crm_cog
):
"""Search-by-field includes the requested LinkedIn field and excludes the legacy default."""

Copilot uses AI. Check for mistakes.
Comment on lines +1016 to +1019
return (
first_name or DEFAULT_FALLBACK_FIRST_NAME,
last_name or inferred_last or SINGLE_NAME_FALLBACK_LAST_NAME,
)
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

split_name returns SINGLE_NAME_FALLBACK_LAST_NAME when full_name is missing, even though DEFAULT_FALLBACK_LAST_NAME is defined. This makes the “no name at all” fallback inconsistent with _split_name_heuristically (which returns Resume/Candidate) and leaves DEFAULT_FALLBACK_LAST_NAME effectively unused. Consider returning (DEFAULT_FALLBACK_FIRST_NAME, DEFAULT_FALLBACK_LAST_NAME) when normalized_full_name is falsy and no usable hints were provided.

Copilot uses AI. Check for mistakes.
Comment on lines +1000 to +1008
if normalized_full_name:
inferred = None
if self.client is not None:
try:
inferred = self._split_name_with_llm(normalized_full_name)
except Exception:
inferred = None
if inferred is None:
inferred = self._split_name_heuristically(normalized_full_name)
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

split_name will make an additional LLM request whenever self.client is set and name hints are missing. Since extract() already calls the LLM, this can add extra latency/cost (and it can also be triggered for placeholder names like "Resume Candidate"). Consider skipping the second LLM call (use heuristics instead) or gating it to only run when the name looks like a real person name and/or when the upstream LLM output omitted firstName/lastName unexpectedly.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/discord_bot/src/five08/discord_bot/cogs/crm.py`:
- Around line 1391-1393: The exception text assigned to error_detail (and the
similar usage at lines 1417–1418) must be bounded and sanitized before being
logged or sent to Discord: replace control characters and problematic
markdown/backticks, collapse excessive whitespace, truncate to a safe length
(e.g., 1900 characters to leave room for surrounding text like status_note), and
then use that sanitized_error variable in logger.exception and in any Discord
message payload; update the references to use sanitized_error (and keep
status_note unchanged) so long raw exceptions cannot exceed Discord limits or
break message formatting.

In `@packages/shared/src/five08/resume_extractor.py`:
- Around line 1095-1143: The heuristic splitter _split_name_heuristically
currently treats "Doe, Jane" as ["Doe,", "Jane"] and returns first="Doe," which
is wrong; update _split_name_heuristically to detect comma-delimited "Last,
First" (or "Last, First Middle") cases early: if full_name contains a comma,
split on the first comma, trim both sides, treat the left side as last and the
right side as the remaining given name(s), normalize both with
_normalize_name_part, apply existing prefix/suffix handling
(NAME_PREFIXES/NAME_SUFFIXES) to the right/left parts as appropriate, and then
return (first_normalized_or_fallback, last_normalized_or_fallback) so that "Doe,
Jane" yields first="Jane" and last="Doe" while preserving existing fallback
logic (SINGLE_NAME_FALLBACK_LAST_NAME, DEFAULT_FALLBACK_FIRST_NAME,
DEFAULT_FALLBACK_LAST_NAME).

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7861434 and cf56227.

📒 Files selected for processing (4)
  • apps/discord_bot/src/five08/discord_bot/cogs/crm.py
  • packages/shared/src/five08/resume_extractor.py
  • tests/unit/test_crm.py
  • tests/unit/test_resume_extractor.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/unit/test_crm.py
  • tests/unit/test_resume_extractor.py

Comment thread apps/discord_bot/src/five08/discord_bot/cogs/crm.py Outdated
Comment thread packages/shared/src/five08/resume_extractor.py
@michaelmwu michaelmwu force-pushed the michaelmwu/fix-ci-contact-name branch from 74e0c54 to cfbd971 Compare March 3, 2026 10:43
@michaelmwu michaelmwu merged commit ec84c15 into main Mar 3, 2026
5 checks passed
@michaelmwu michaelmwu deleted the michaelmwu/fix-ci-contact-name branch March 3, 2026 10:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants