Fix resume contact inference and LinkedIn field handling by michaelmwu · Pull Request #129 · 508-dev/508-workflows

michaelmwu · 2026-03-03T09:49:32Z

Description

Resume-based contact creation now shows the backend error text + status directly in the Discord failure message.
LinkedIn field usage is now centralized via _configured_linkedin_field and used for search/create/update flows.
Resume inference failure for no-match now includes parsed name/email so users can verify the candidate identity before creating.
LinkedIn update embeds now read from the same configured field used in update payloads.

Related Issue

N/A

How Has This Been Tested?

uv run pytest tests/unit/test_crm.py -k "search_contacts_by_field_uses_configured_linkedin_field or build_resume_parsed_identity_summary_includes_name_and_email or upload_resume_no_matching_inferred_contact_shows_name_and_email or update_contact_uses_configured_linkedin_field or test_resume_create_contact_view_logs_create_failure or test_upload_resume_link_user_shows_confirm_then_creates_contact"
uv run pytest tests/unit/test_resume_extractor.py -k "split_name"

Summary by CodeRabbit

Release Notes

New Features
- Configurable LinkedIn field handling for contact management
- Separate first and last name extraction from resumes with intelligent parsing capabilities
- Enhanced resume parsing with improved identity information and summary display

Improvements
- More robust error messaging during resume processing with additional contextual details
- Consistent and deduplicated contact field handling across search and update workflows

coderabbitai · 2026-03-03T09:49:51Z

Caution

Review failed

Pull request was closed or merged during review

Walkthrough

This PR centralizes LinkedIn field configuration, implements structured name parsing with LLM and heuristic fallbacks, and refactors the CRM payload construction to consistently populate first and last names from extracted resume data. The resume extractor now exports split_name() for name decomposition, and the CRM cog applies these extracted names across multiple payload-building paths.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Resume Extraction Name Parsing `packages/shared/src/five08/resume_extractor.py` | Added `split_name()` public method with LLM and heuristic name-splitting logic. Extended `ResumeExtractedProfile` with `first_name` and `last_name` fields. Updated extraction flows to populate both fields alongside `name` via new splitting helpers. |
| CRM LinkedIn Field Configurability `apps/discord_bot/src/five08/discord_bot/cogs/crm.py` | Introduced `_configured_linkedin_field()` staticmethod to centralize LinkedIn field name (default `cLinkedInUrl`). Replaced hardcoded LinkedIn field references across multiple call sites including payload builders, contact search, and update paths. |
| CRM Name Field Population `apps/discord_bot/src/five08/discord_bot/cogs/crm.py` | Added `_populate_name_fields()` helper to extract and set `firstName` and `lastName` from a source name. Integrated into resume-based payload builders (`_build_resume_create_contact_payload`, `_build_contact_payload_for_link_user`, `_infer_contact_from_resume`) and various resume workflows. |
| CRM Resume Processing Enhancements `apps/discord_bot/src/five08/discord_bot/cogs/crm.py` | Added `_build_resume_parsed_identity_summary()` to generate user-friendly summaries from resume data. Enhanced error messaging with detailed error and status information. Refactored `_search_contacts_by_field()` with dynamic field selection and deduplication logic. |
| Test Coverage `tests/unit/test_crm.py`, `tests/unit/test_resume_extractor.py` | Added tests validating configured LinkedIn field usage, first/last name population across payloads, resume identity summary formatting, and `split_name` behavior with LLM fallback and single-token name handling. |

Sequence Diagram(s)

sequenceDiagram
    participant Resume as Resume File
    participant Extractor as Resume Extractor
    participant LLM as LLM Service
    participant Heuristic as Heuristic Parser
    participant CRM as CRM Cog

    Resume->>Extractor: extract(file_content)
    Extractor->>Extractor: _build_prompt() & call LLM
    Extractor->>LLM: Parse name, firstName, lastName
    LLM-->>Extractor: Parsed identity + names
    
    alt LLM Success
        Extractor->>Extractor: split_name(full_name, hints from LLM)
        Extractor->>LLM: _split_name_with_llm(full_name)
        LLM-->>Extractor: firstName, lastName
    else LLM Failure
        Extractor->>Heuristic: _split_name_heuristically(full_name)
        Heuristic-->>Extractor: firstName, lastName (with prefix/suffix handling)
    end
    
    Extractor-->>CRM: ResumeExtractedProfile(name, first_name, last_name, ...)
    CRM->>CRM: _populate_name_fields(payload, source_name)
    CRM->>CRM: _build_resume_parsed_identity_summary(file_content)
    CRM-->>CRM: Payload with firstName, lastName, LinkedIn field configured
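
The heuristic branch of the diagram could look roughly like the sketch below. This page does not show the PR's actual `_split_name_heuristically`, so the function name, the prefix and suffix sets, and the value of `SINGLE_NAME_FALLBACK_LAST_NAME` are all assumptions made for illustration.

```python
# Hypothetical sketch of a heuristic splitter; not the PR's implementation.
PREFIXES = {"dr", "mr", "mrs", "ms", "prof"}
SUFFIXES = {"jr", "sr", "ii", "iii", "iv", "phd", "md"}
SINGLE_NAME_FALLBACK_LAST_NAME = "Unknown"  # assumed placeholder value


def _norm(token: str) -> str:
    """Lowercase a token and drop edge punctuation so 'Dr.' matches 'dr'."""
    return token.lower().strip(".,")


def split_name_heuristically(full_name: str) -> tuple[str, str]:
    """Split a full name into (first, last), ignoring titles and suffixes."""
    tokens = full_name.split()
    # Drop leading honorifics such as "Dr." and trailing suffixes such as "Jr.".
    while tokens and _norm(tokens[0]) in PREFIXES:
        tokens = tokens[1:]
    while tokens and _norm(tokens[-1]) in SUFFIXES:
        tokens = tokens[:-1]
    if not tokens:
        return full_name.strip(), SINGLE_NAME_FALLBACK_LAST_NAME
    if len(tokens) == 1:
        # Single-token names get a placeholder last name.
        return tokens[0], SINGLE_NAME_FALLBACK_LAST_NAME
    # Treat the first token as the given name and the rest as the family name.
    return tokens[0], " ".join(tokens[1:])
```

Under these assumptions, `split_name_heuristically("Dr. Ada Lovelace")` yields `("Ada", "Lovelace")`, and a single-token name such as `"Cher"` is paired with the placeholder last name.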

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

fix: show resume lookup identifiers and create prospects #125 — Modifies resume-inference and contact-creation flows in CRMCog, overlapping on payload construction and user-facing resume lookup behavior.
feat(shared): reuse resume extractor in bot and worker #113 — Updates shared resume extractor integration and first/last name field handling across CRM payload paths; directly related to name extraction changes.
fix: confirm link_user contact creation #88 — Modifies resume-to-contact payload logic and contact linking methods with similar refactoring of field mapping patterns.

Poem

🐰 Names split swift as carrot sticks,
LinkedIn fields now dance like tricks,
First meets Last in PayloadLand—
Resume magic, paws at hand! 🥕

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately reflects the main changes in the PR: fixing resume contact inference logic and making LinkedIn field handling configurable across multiple code paths. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 82.61% which is sufficient. The required threshold is 80.00%. |


Copilot

Pull request overview

This PR improves the Discord CRM resume workflow by (1) enriching resume-derived identity details (first/last name + parsed identity summary) and (2) centralizing LinkedIn custom-field selection so search/create/update flows consistently use the same CRM field.

Changes:

Add name splitting (LLM + heuristic fallback) and propagate firstName/lastName into resume extraction results and CRM create payloads.
Centralize LinkedIn field selection via _configured_linkedin_field() and use it across search/create/update + update embed rendering.
Improve Discord UX on failures: include backend error + status on contact-create failures, and include parsed name/email on “no matching contact” inference.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

| File | Description |
|---|---|
| `packages/shared/src/five08/resume_extractor.py` | Adds first/last name fields and new `split_name()` logic (LLM + heuristic) and updates the LLM prompt/parse pipeline. |
| `apps/discord_bot/src/five08/discord_bot/cogs/crm.py` | Uses configured LinkedIn field for search/create/update, populates `firstName`/`lastName` in create payloads, and enhances user-facing error/identity messages. |
| `tests/unit/test_crm.py` | Adds/updates unit coverage for configured LinkedIn field usage, name-field propagation, improved failure messages, and parsed identity summary behavior. |
| `tests/unit/test_resume_extractor.py` | Adds unit coverage for the new name-splitting behavior (LLM preference + heuristic fallback + single-token handling). |


Copilot · 2026-03-03T09:56:22Z

+            inferred = None
+            if self.client is not None:
+                try:
+                    inferred = self._split_name_with_llm(normalized_full_name)
+                except Exception:
+                    inferred = None
+            if inferred is None:
+                inferred = self._split_name_heuristically(normalized_full_name)


split_name() will attempt an LLM call whenever self.client is configured. In extract(), you already performed an LLM completion immediately before calling split_name(), so this can result in a second OpenAI request if firstName/lastName aren’t returned. Consider defaulting to heuristic splitting in this situation (or adding a flag to disable the extra LLM call) to reduce latency/cost and avoid doubling failure modes.

Suggested change

-            inferred = None
-            if self.client is not None:
-                try:
-                    inferred = self._split_name_with_llm(normalized_full_name)
-                except Exception:
-                    inferred = None
-            if inferred is None:
-                inferred = self._split_name_heuristically(normalized_full_name)
+            inferred: tuple[str, str] | None = None
+            # Prefer heuristic splitting first to avoid unnecessary LLM calls.
+            inferred = self._split_name_heuristically(normalized_full_name)
+            # If the heuristic could not split the name, fall back to the LLM when available.
+            if inferred is None and self.client is not None:
+                try:
+                    inferred = self._split_name_with_llm(normalized_full_name)
+                except Exception:
+                    inferred = None

Copilot · 2026-03-03T09:56:22Z

+        return split_first or full_name, split_last or SINGLE_NAME_FALLBACK_LAST_NAME
+


When the name-split LLM returns only one of firstName/lastName, _split_name_with_llm() falls back to full_name as the missing part. This can produce a firstName containing spaces (e.g., "Ada Lovelace") or duplicate data across fields. Consider falling back to the heuristic splitter to fill missing parts (or at least using the first/last token) when the model returns incomplete output.

Suggested change

-        return split_first or full_name, split_last or SINGLE_NAME_FALLBACK_LAST_NAME
+        # If the model returned only one of first/last, use the heuristic splitter
+        # to fill in the missing part based on the full name, avoiding duplicate
+        # or multi-token first names.
+        if not split_first or not split_last:
+            heuristic_first, heuristic_last = self._split_name_heuristically(full_name)
+            split_first = split_first or heuristic_first
+            split_last = split_last or heuristic_last
+        return split_first, split_last

Copilot · 2026-03-03T09:56:22Z

+    @staticmethod
+    def _configured_linkedin_field() -> str:
+        """Return the configured field for LinkedIn profile values."""
+        return str(getattr(settings, "crm_linkedin_field", "cLinkedInUrl"))


_configured_linkedin_field() reads settings.crm_linkedin_field, but the bot Settings model doesn’t declare this field (so Pydantic Settings won’t load it from env/config and it will always fall back). Also, if the attribute were ever present but None, str(None) would yield the literal string "None". Consider adding an explicit crm_linkedin_field: str = "cLinkedInUrl" to the Settings schema (and validating/stripping it here, falling back when empty).

Suggested change

-        return str(getattr(settings, "crm_linkedin_field", "cLinkedInUrl"))
+        default_field = "cLinkedInUrl"
+        raw_value = getattr(settings, "crm_linkedin_field", None)
+        if not isinstance(raw_value, str):
+            return default_field
+        value = raw_value.strip()
+        return value or default_field
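
The `str(None)` pitfall raised in this comment can be reproduced with a stand-in settings object. The sketch below uses `types.SimpleNamespace` in place of the bot's real Settings model so that it is self-contained; the function names are hypothetical, and it only illustrates the normalization, not how the field would be declared in the schema.

```python
from types import SimpleNamespace

DEFAULT_LINKEDIN_FIELD = "cLinkedInUrl"


def naive_linkedin_field(settings: object) -> str:
    """Mirror of the original one-liner: coerces whatever is stored to str."""
    return str(getattr(settings, "crm_linkedin_field", DEFAULT_LINKEDIN_FIELD))


def configured_linkedin_field(settings: object) -> str:
    """Return a stripped, non-empty field name, otherwise the default."""
    raw_value = getattr(settings, "crm_linkedin_field", None)
    if not isinstance(raw_value, str):
        return DEFAULT_LINKEDIN_FIELD
    return raw_value.strip() or DEFAULT_LINKEDIN_FIELD


# Stand-in for a settings object whose attribute exists but is None.
unset = SimpleNamespace(crm_linkedin_field=None)
print(naive_linkedin_field(unset))       # prints "None"
print(configured_linkedin_field(unset))  # prints "cLinkedInUrl"
```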

Copilot · 2026-03-03T09:56:23Z

+
+        return (
+            f"\nParsed contact details: name=`{parsed_name}`, email=`{primary_email}`"
+        )


The parsed name/email are interpolated directly into inline-code backticks. If the extracted name/email contains a backtick, it can break formatting and potentially hide/alter surrounding text. Consider sanitizing (e.g., replace/backslash-escape backticks and truncate to a safe length) before embedding user/LLM-derived values into Discord messages.
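
One way to act on this, as a minimal sketch: replace backticks, collapse whitespace, and truncate before interpolating. The helper names `sanitize_inline_code` and `parsed_identity_summary`, the 100-character limit, and the `"unknown"` placeholder are assumptions for illustration, not part of the PR. Backticks are swapped for apostrophes here rather than backslash-escaped, on the assumption that an escape would not be honored inside an inline code span.

```python
def sanitize_inline_code(value: str, max_length: int = 100) -> str:
    """Make a value safe to place between single backticks in a message."""
    # A backtick would close the code span early, so swap it for an apostrophe.
    cleaned = value.replace("`", "'")
    # Collapse newlines and repeated whitespace so the span stays on one line.
    cleaned = " ".join(cleaned.split())
    if len(cleaned) > max_length:
        cleaned = cleaned[: max_length - 1] + "…"
    return cleaned or "unknown"


def parsed_identity_summary(parsed_name: str, primary_email: str) -> str:
    """Build the summary line using sanitized values only."""
    name = sanitize_inline_code(parsed_name)
    email = sanitize_inline_code(primary_email)
    return f"\nParsed contact details: name=`{name}`, email=`{email}`"
```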

Copilot · 2026-03-03T09:56:23Z

            await interaction.followup.send(
-                "⚠️ Could not create a contact from this resume. "
+                f"⚠️ Could not create a contact from this resume: `{error_detail}`{status_note}. "
                "Please provide `search_term` or `link_user`.",
                ephemeral=True,


The user-facing failure message embeds error_detail = str(exc) verbatim. For EspoAPIError raised by the shared client, this may include boilerplate (e.g., "Wrong request, status code is …") and can duplicate the separately-appended status note, making the message noisy. Consider special-casing EspoAPIError to extract/display just the server reason (and escape/truncate any backticks/newlines) while keeping the full exception details only in logs/audit metadata.
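
A rough sketch of the special-casing described here follows. The attributes of `EspoAPIError` are not visible on this page, so `BackendError` below is a hypothetical stand-in with assumed `reason` and `status_code` attributes and an invented message format; the 200-character cap is likewise an assumption.

```python
from typing import Optional


class BackendError(Exception):
    """Hypothetical stand-in for the shared client's API error type."""

    def __init__(self, reason: str, status_code: Optional[int] = None) -> None:
        # Invented boilerplate format, used only to show why str(exc) is noisy.
        super().__init__(f"request failed ({status_code}): {reason}")
        self.reason = reason
        self.status_code = status_code


def format_create_failure(exc: Exception) -> str:
    """Build the user-facing message; full details belong in logs instead."""
    if isinstance(exc, BackendError):
        # Show only the server reason and mention the status exactly once.
        detail = exc.reason
        status_note = f" (status {exc.status_code})" if exc.status_code else ""
    else:
        detail = str(exc)
        status_note = ""
    # Keep the detail on one line, free of backticks, and reasonably short.
    detail = " ".join(detail.replace("`", "'").split())[:200] or "unknown error"
    return (
        f"⚠️ Could not create a contact from this resume: `{detail}`{status_note}. "
        "Please provide `search_term` or `link_user`."
    )
```

With this shape, the status code appears once in the message, and the raw exception text can still be logged separately.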

Copilot · 2026-03-03T09:56:23Z

+        first_name, last_name = self.resume_extractor.split_name(
+            full_name=source_name,
+            first_name_hint=str(payload.get("firstName", "")).strip() or None,
+            last_name_hint=str(payload.get("lastName", "")).strip() or None,
+        )


_populate_name_fields() calls resume_extractor.split_name(), which will invoke an additional LLM call whenever the OpenAI client is configured. In the resume-upload path you already may have done an LLM extraction, so this can introduce an extra API request (latency/cost/failure surface) just to split the name. Consider using heuristic-only splitting here, or plumbing through extracted first_name/last_name hints from the resume profile to avoid triggering another model call.
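
Both options mentioned here (passing hints through, or disabling the model call) can be sketched in one function. This is not the PR's `split_name()`: the `llm_splitter` callable and the `allow_llm` flag are hypothetical parameters, and the whitespace-based fallback is a deliberately simple placeholder for the real heuristic.

```python
from typing import Callable, Optional


def split_name(
    full_name: str,
    first_name_hint: Optional[str] = None,
    last_name_hint: Optional[str] = None,
    llm_splitter: Optional[Callable[[str], tuple[str, str]]] = None,
    allow_llm: bool = True,
) -> tuple[str, str]:
    """Return (first, last), calling the model only when it is really needed."""
    # Both hints present: nothing to infer, so no model call is made.
    if first_name_hint and last_name_hint:
        return first_name_hint, last_name_hint
    inferred = None
    if allow_llm and llm_splitter is not None:
        try:
            inferred = llm_splitter(full_name)
        except Exception:
            inferred = None
    if inferred is None:
        # Placeholder heuristic: first token vs. the remaining tokens.
        tokens = full_name.split()
        if len(tokens) > 1:
            inferred = (tokens[0], " ".join(tokens[1:]))
        else:
            inferred = (full_name.strip(), "")
    # A single hint still wins over the inferred value for its own field.
    return first_name_hint or inferred[0], last_name_hint or inferred[1]
```

A caller in the resume-upload path could pass the already-extracted `first_name` and `last_name` as hints, or set `allow_llm=False`, so that no second request is issued.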

michaelmwu added 2 commits March 3, 2026 17:14

Fix resume contact creation name fields and feedback

16afc3a

Fix resume contact inference and linkedin field handling

a3d3623

michaelmwu had a problem deploying to test March 3, 2026 09:49 — with GitHub Actions Failure

michaelmwu requested a review from Copilot March 3, 2026 09:50

Copilot started reviewing on behalf of michaelmwu March 3, 2026 09:50

michaelmwu closed this Mar 3, 2026

Copilot AI reviewed Mar 3, 2026


michaelmwu deleted the michaelmwu/fix-resume-contact branch March 3, 2026 10:48

michaelmwu wants to merge 2 commits into main from michaelmwu/fix-resume-contact
