Fix resume contact inference and LinkedIn field handling#130
Conversation
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (4)
📝 WalkthroughWalkthroughAdded configurable LinkedIn-field handling in the CRM cog and expanded resume extraction with robust name-splitting (LLM + heuristic + fallbacks). Name fields (first/last) are surfaced in extracted profiles and used across resume-to-CRM payloads; error messages for resume create/upload include richer details. Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Resume Upload Client
participant Extractor as ResumeProfileExtractor
participant LLM as LLM Service
participant Heuristic as Heuristic Parser
participant CRM as CRM Contact Handler
Client->>Extractor: extract(file_content, name_hints)
Extractor->>Extractor: parse_resume(file_content)
Extractor->>Extractor: split_name(full_name, hints)
alt hints present
Extractor->>Extractor: use hints → (firstName, lastName)
else no hints
Extractor->>LLM: request name split
LLM-->>Extractor: (firstName, lastName) or error
alt LLM succeeded
Extractor->>Extractor: return LLM result
else LLM failed
Extractor->>Heuristic: parse_common_formats(full_name)
Heuristic-->>Extractor: (firstName, lastName) with fallbacks
end
end
Extractor->>Client: ResumeExtractedProfile(firstName, lastName, ...)
Client->>CRM: create_contact_from_resume(profile)
CRM->>CRM: _populate_name_fields(payload, source_name)
CRM->>CRM: payload[_configured_linkedin_field()] = linkedin_url
CRM-->>Client: contact created or enriched error_detail/status_note
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
apps/discord_bot/src/five08/discord_bot/cogs/crm.py (1)
1541-1545: Harden configured LinkedIn field normalization.If
crm_linkedin_fieldis blank/whitespace, downstream payload/search keys become invalid. Consider normalizing with.strip()and falling back to the default key.💡 Proposed hardening
`@staticmethod` def _configured_linkedin_field() -> str: """Return the configured field for LinkedIn profile values.""" - return str(getattr(settings, "crm_linkedin_field", "cLinkedInUrl")) + configured = str(getattr(settings, "crm_linkedin_field", "cLinkedInUrl")).strip() + return configured or "cLinkedInUrl"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/discord_bot/src/five08/discord_bot/cogs/crm.py` around lines 1541 - 1545, The _configured_linkedin_field() currently returns settings.crm_linkedin_field without normalization which can produce blank/whitespace keys; update it to read getattr(settings, "crm_linkedin_field", "cLinkedInUrl"), .strip() the result, and if the stripped value is empty use the default "cLinkedInUrl" so downstream payload/search keys remain valid (make changes in the _configured_linkedin_field function).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/discord_bot/src/five08/discord_bot/cogs/crm.py`:
- Around line 3415-3419: The fallback name replacement can leave stale
firstName/lastName values in payload because _populate_name_fields calls
split_name which short-circuits when those keys exist; after setting
payload["name"] = self._fallback_contact_name_for_discord_user(user) (when
parsed_name is falsy or "Resume Candidate"), remove or overwrite any existing
payload["firstName"] and payload["lastName"] before calling
self._populate_name_fields so split_name recomputes names from the new
source_name, or alternatively call _populate_name_fields with a flag to force
recompute; update the logic around _fallback_contact_name_for_discord_user,
payload["name"], and the invocation of _populate_name_fields accordingly.
In `@packages/shared/src/five08/resume_extractor.py`:
- Around line 991-997: The code accepts placeholder hint values because
_normalize_name_part(first_name_hint) and _normalize_name_part(last_name_hint)
are treated as valid; update the logic to detect and treat placeholders as
missing by introducing a check (or helper like _is_placeholder) against common
placeholders (e.g., "unknown", "n/a", "na", "none", "-", empty string) after
calling _normalize_name_part for first_name_hint and last_name_hint, clear those
values when they match placeholders, and only return (first_name, last_name) if
both are still non-placeholder; keep normalized_full_name = _normalize_name
unchanged for fallback parsing.
- Around line 1052-1059: The code currently falls back to returning full_name as
the first name (return split_first or full_name), which causes wrong outputs
when only lastName is present; update the return logic in the block using
_normalize_name_part and parsed (the variables split_first, split_last and
full_name) so you never substitute full_name for a missing split_first. Instead:
if both split_first and split_last are missing, return None; otherwise return
split_first (or a non-full-name fallback such as a
SINGLE_NAME_FALLBACK_FIRST_NAME constant or None) and split_last or
SINGLE_NAME_FALLBACK_LAST_NAME—do not use full_name as a fallback for first
name. Ensure changes are applied where split_first, split_last and full_name are
referenced (function/method that contains _normalize_name_part usage).
---
Nitpick comments:
In `@apps/discord_bot/src/five08/discord_bot/cogs/crm.py`:
- Around line 1541-1545: The _configured_linkedin_field() currently returns
settings.crm_linkedin_field without normalization which can produce
blank/whitespace keys; update it to read getattr(settings, "crm_linkedin_field",
"cLinkedInUrl"), .strip() the result, and if the stripped value is empty use the
default "cLinkedInUrl" so downstream payload/search keys remain valid (make
changes in the _configured_linkedin_field function).
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
apps/discord_bot/src/five08/discord_bot/cogs/crm.pypackages/shared/src/five08/resume_extractor.pytests/unit/test_crm.pytests/unit/test_resume_extractor.py
There was a problem hiding this comment.
Pull request overview
This PR improves resume-driven contact creation and update flows by adding first/last name handling to resume extraction, centralizing which CRM field stores LinkedIn URLs, and enhancing Discord-facing error and inference messages for better operator visibility.
Changes:
- Add first/last name extraction/splitting to
ResumeProfileExtractor, and propagate those fields into CRM contact creation payloads. - Centralize LinkedIn field selection via
_configured_linkedin_field()and use it across search/create/update paths. - Improve Discord failure/no-match messaging to include backend error/status and parsed candidate identity (name/email), with corresponding unit tests.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
packages/shared/src/five08/resume_extractor.py |
Adds first_name/last_name fields and implements split_name with LLM/heuristic fallback. |
apps/discord_bot/src/five08/discord_bot/cogs/crm.py |
Populates CRM firstName/lastName, centralizes LinkedIn field usage, and improves user-facing failure/no-match messages. |
tests/unit/test_crm.py |
Extends unit coverage for configured LinkedIn field usage, first/last name population, and improved messaging. |
tests/unit/test_resume_extractor.py |
Adds unit tests for split_name behavior and fallback paths. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| @staticmethod | ||
| def _configured_linkedin_field() -> str: | ||
| """Return the configured field for LinkedIn profile values.""" | ||
| return str(getattr(settings, "crm_linkedin_field", "cLinkedInUrl")) |
There was a problem hiding this comment.
_configured_linkedin_field() uses str(getattr(...)), which will turn None into the literal string 'None' (and will also accept an empty string). That can produce invalid Espo field names and break search/update payloads. Consider validating the setting (must be a non-empty str), falling back to "cLinkedInUrl" when invalid, and reusing this helper everywhere LinkedIn field naming is needed (e.g. _field_label currently reads settings.crm_linkedin_field directly).
| return str(getattr(settings, "crm_linkedin_field", "cLinkedInUrl")) | |
| value = getattr(settings, "crm_linkedin_field", None) | |
| if isinstance(value, str): | |
| value = value.strip() | |
| if value: | |
| return value | |
| return "cLinkedInUrl" |
| async def test_search_contacts_by_field_uses_configured_linkedin_field( | ||
| self, crm_cog | ||
| ): | ||
| """Search-by-field uses the configured LinkedIn field name.""" |
There was a problem hiding this comment.
This test name/docstring says it verifies use of the configured LinkedIn field, but it actually hard-codes field="cLinkedIn" and doesn’t exercise _configured_linkedin_field(). Consider renaming/rewording the test to reflect what it asserts (select list includes the requested field and excludes the default) or patch _configured_linkedin_field() and call the code path that uses it.
| async def test_search_contacts_by_field_uses_configured_linkedin_field( | |
| self, crm_cog | |
| ): | |
| """Search-by-field uses the configured LinkedIn field name.""" | |
| async def test_search_contacts_by_field_includes_requested_field_and_excludes_default( | |
| self, crm_cog | |
| ): | |
| """Search-by-field includes the requested LinkedIn field and excludes the legacy default.""" |
| return ( | ||
| first_name or DEFAULT_FALLBACK_FIRST_NAME, | ||
| last_name or inferred_last or SINGLE_NAME_FALLBACK_LAST_NAME, | ||
| ) |
There was a problem hiding this comment.
split_name returns SINGLE_NAME_FALLBACK_LAST_NAME when full_name is missing, even though DEFAULT_FALLBACK_LAST_NAME is defined. This makes the “no name at all” fallback inconsistent with _split_name_heuristically (which returns Resume/Candidate) and leaves DEFAULT_FALLBACK_LAST_NAME effectively unused. Consider returning (DEFAULT_FALLBACK_FIRST_NAME, DEFAULT_FALLBACK_LAST_NAME) when normalized_full_name is falsy and no usable hints were provided.
| if normalized_full_name: | ||
| inferred = None | ||
| if self.client is not None: | ||
| try: | ||
| inferred = self._split_name_with_llm(normalized_full_name) | ||
| except Exception: | ||
| inferred = None | ||
| if inferred is None: | ||
| inferred = self._split_name_heuristically(normalized_full_name) |
There was a problem hiding this comment.
split_name will make an additional LLM request whenever self.client is set and name hints are missing. Since extract() already calls the LLM, this can add extra latency/cost (and it can also be triggered for placeholder names like "Resume Candidate"). Consider skipping the second LLM call (use heuristics instead) or gating it to only run when the name looks like a real person name and/or when the upstream LLM output omitted firstName/lastName unexpectedly.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/discord_bot/src/five08/discord_bot/cogs/crm.py`:
- Around line 1391-1393: The exception text assigned to error_detail (and the
similar usage at lines 1417–1418) must be bounded and sanitized before being
logged or sent to Discord: replace control characters and problematic
markdown/backticks, collapse excessive whitespace, truncate to a safe length
(e.g., 1900 characters to leave room for surrounding text like status_note), and
then use that sanitized_error variable in logger.exception and in any Discord
message payload; update the references to use sanitized_error (and keep
status_note unchanged) so long raw exceptions cannot exceed Discord limits or
break message formatting.
In `@packages/shared/src/five08/resume_extractor.py`:
- Around line 1095-1143: The heuristic splitter _split_name_heuristically
currently treats "Doe, Jane" as ["Doe,", "Jane"] and returns first="Doe," which
is wrong; update _split_name_heuristically to detect comma-delimited "Last,
First" (or "Last, First Middle") cases early: if full_name contains a comma,
split on the first comma, trim both sides, treat the left side as last and the
right side as the remaining given name(s), normalize both with
_normalize_name_part, apply existing prefix/suffix handling
(NAME_PREFIXES/NAME_SUFFIXES) to the right/left parts as appropriate, and then
return (first_normalized_or_fallback, last_normalized_or_fallback) so that "Doe,
Jane" yields first="Jane" and last="Doe" while preserving existing fallback
logic (SINGLE_NAME_FALLBACK_LAST_NAME, DEFAULT_FALLBACK_FIRST_NAME,
DEFAULT_FALLBACK_LAST_NAME).
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
apps/discord_bot/src/five08/discord_bot/cogs/crm.pypackages/shared/src/five08/resume_extractor.pytests/unit/test_crm.pytests/unit/test_resume_extractor.py
🚧 Files skipped from review as they are similar to previous changes (2)
- tests/unit/test_crm.py
- tests/unit/test_resume_extractor.py
74e0c54 to
cfbd971
Compare
Description
_configured_linkedin_fieldand used for search/create/update flows.Related Issue
How Has This Been Tested?
uv run pytest tests/unit/test_crm.py -k "search_contacts_by_field_uses_configured_linkedin_field or build_resume_parsed_identity_summary_includes_name_and_email or upload_resume_no_matching_inferred_contact_shows_name_and_email or update_contact_uses_configured_linkedin_field or test_resume_create_contact_view_logs_create_failure or test_upload_resume_link_user_shows_confirm_then_creates_contact"uv run pytest tests/unit/test_resume_extractor.py -k "split_name"Summary by CodeRabbit
New Features
Improvements
Tests