refactor CRM resume processing flow by michaelmwu · Pull Request #206 · 508-dev/508-workflows

michaelmwu · 2026-03-18T22:30:45Z

Description

Move resume extract/apply logic into shared code so the Discord bot can run synchronous resume flows directly while the worker reuses the same processor for background paths.
Share the resume processing models, reuse parsed resume hints, and DRY the bot-side resume validation and create-contact payload building.
Simplify CRM field handling to the concrete Espo fields and formats we actually use, including cLinkedIn, cOnboarder, cOnboardingState, Espo date/datetime parsing, and PDF/DOCX-only resume uploads.

Related Issue

None.

How Has This Been Tested?

uv run pytest tests/unit/test_resume_profile_processor.py tests/unit/test_crm.py tests/unit/test_worker_mailbox_resume_ingest.py
uv run ruff check apps/worker/src/five08/worker/models.py apps/worker/src/five08/worker/crm/resume_profile_processor.py packages/shared/src/five08/resume_processing_models.py packages/shared/src/five08/resume_profile_processor.py apps/discord_bot/src/five08/discord_bot/cogs/crm.py tests/unit/test_crm.py tests/unit/test_resume_profile_processor.py

Summary by CodeRabbit

New Features
- Resume upload validation: accept only PDF/DOCX up to 10 MB with clear user-facing errors and audit logging.
- Improved skills extraction that returns confidence and per-skill strength indicators.
Improvements
- Faster, direct resume processing for quicker profile previews and CRM updates.
- Standardized LinkedIn handling and clearer onboarding/status wording across UIs and messages.

coderabbitai · 2026-03-18T22:31:04Z

Warning

Rate limit exceeded

@michaelmwu has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 2 minutes and 19 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fa9a9b80-83fd-4a73-9205-5068db36ecf5

📥 Commits

Reviewing files that changed from the base of the PR and between a1cf65c and 86b4e1c.

📒 Files selected for processing (1)

packages/shared/src/five08/resume_skills_extractor.py

📝 Walkthrough

Walkthrough

Consolidates resume processing into shared packages: adds shared models, document/ and skills processors, and a ResumeProfileProcessor; simplifies worker to a thin wrapper; and refactors the Discord CRMCog to call processors directly with new validation, constants, and hints propagation.

Changes

Cohort / File(s)	Summary
Shared library — models & processors `packages/shared/src/five08/resume_processing_models.py`, `packages/shared/src/five08/resume_profile_processor.py`, `packages/shared/src/five08/resume_skills_extractor.py`, `packages/shared/src/five08/resume_document_processor.py`	Adds typed Pydantic models, a ResumeProcessorConfig and ResumeProfileProcessor (extract/apply flows, optional DB persistence), a SkillsExtractor (LLM + heuristic fallback), and a DocumentProcessor (validation, extraction, caching).
Worker integration `apps/worker/src/five08/worker/crm/resume_profile_processor.py`, `apps/worker/src/five08/worker/models.py`, `apps/worker/src/five08/worker/config.py`	Replaces large local processor with a thin subclass delegating to shared ResumeProfileProcessor; worker models aliased to shared models; removed configurable crm_linkedin_field; default allowed_file_types tightened to `pdf,docx`.
Discord bot CRM cog `apps/discord_bot/src/five08/discord_bot/cogs/crm.py`	Rewires from queued jobs to direct in-process calls: adds `_create_resume_profile_processor`, `_extract_resume_profile_direct`, `_apply_resume_profile_direct`, `_validate_resume_attachment`, resume constants (allowed extensions, max size, date formats, `LINKEDIN_FIELD`), and threads hints through inference/summary builders.
Intake & people sync `apps/worker/src/five08/worker/crm/intake_form_processor.py`, `apps/worker/src/five08/worker/crm/people_sync.py`	Introduces `LINKEDIN_FIELD = "cLinkedIn"` and replaces dynamic `settings.crm_linkedin_field` usage with the public constant.
Tests `tests/unit/test_crm.py`, `tests/unit/test_resume_profile_processor.py`, `tests/unit/test_skills_extractor.py`, `tests/unit/test_worker_config.py`	Updates and adds tests for field renames (onboarding, LinkedIn), direct extract/apply flows, resume validation (file types/size), hint propagation, ResumeProcessorConfig behavior, skills heuristic extraction, and worker defaults.

Sequence Diagram

sequenceDiagram
    participant Bot as Discord Bot (CRMCog)
    participant Proc as ResumeProfileProcessor (Shared)
    participant Extractor as ResumeProfileExtractor (Shared)
    participant Skills as SkillsExtractor (Shared)
    participant CRM as EspoClient (CRM)
    participant DB as Postgres

    rect rgba(100,200,150,0.5)
    Note over Bot,DB: Resume Extraction Flow (Direct, In‑Process)
    Bot->>Proc: extract_profile_proposal(contact_id, attachment_id, filename)
    Proc->>CRM: fetch contact details
    Proc->>Proc: download & validate attachment
    Proc->>Extractor: extract text & parsed fields
    Proc->>Skills: extract skills from text
    Proc->>Proc: normalize/merge with CRM data
    Proc-->>Bot: return ResumeExtractionResult
    end

    rect rgba(100,150,200,0.5)
    Note over Bot,DB: Resume Application Flow (Direct, In‑Process)
    Bot->>Proc: apply_profile_updates(contact_id, updates, link_discord?)
    Proc->>CRM: bulk update contact fields
    alt Bulk update succeeds
        Proc->>CRM: verify updated fields
    else Bulk update fails
        Proc->>CRM: apply per-field with error collection
    end
    Proc->>DB: persist processing run (idempotent, if configured)
    Proc-->>Bot: return ResumeApplyResult
    end

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly Related PRs

feat(shared): reuse resume extractor in bot and worker #113 — Adds/adopts the same shared resume extraction/processing code used by this change.
fix worker startup side effects and DRY crm normalization #140 — Introduces shared ResumeProfileProcessor and resume models overlapping this consolidation.
Fix resume contact inference and LinkedIn field handling #130 — Modifies CRM cog resume inference and LinkedIn field handling related to the new LINKEDIN_FIELD constant.

Poem

🐰 I hopped through code with nimble paws,

Shared processors stitched without a pause.
Bot and worker now share the same trail,
Resumes parsed clean — no more a long tail.
Carrots for tests, and cheers for the merge!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 57.55% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'refactor CRM resume processing flow' accurately summarizes the main change: restructuring resume processing to move logic into shared code and simplify field handling.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch michaelmwu/dry-crm-constants

📝 Coding Plan

Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Copilot

Pull request overview

Refactors the CRM resume extraction/apply workflow into shared code so both the Discord bot and worker reuse the same processor, while also simplifying CRM field handling to the concrete Espo fields and formats currently in use.

Changes:

Introduces shared resume processing models and a shared ResumeProfileProcessor (plus skills/doc processing helpers).
Updates worker to use a compatibility shim that delegates to the shared processor.
Updates Discord bot CRM cog to run extract/apply directly (threaded), tightens resume upload validation (PDF/DOCX), and simplifies onboarding fields/date formatting.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
tests/unit/test_resume_profile_processor.py	Adjusts tests around configurable LinkedIn field behavior in the worker processor.
tests/unit/test_crm.py	Updates tests for onboarding field names/date formatting and stricter resume upload type validation; adds coverage for reusing inferred hints.
packages/shared/src/five08/resume_skills_extractor.py	Adds shared skills extraction (LLM + heuristic fallback).
packages/shared/src/five08/resume_profile_processor.py	Adds shared end-to-end resume extract/apply processor and configuration.
packages/shared/src/five08/resume_processing_models.py	Adds shared Pydantic models for resume processing results and change tracking.
packages/shared/src/five08/resume_document_processor.py	Adds shared document validation/text extraction with caching.
apps/worker/src/five08/worker/models.py	Re-exports shared resume processing models to keep worker API/types consistent.
apps/worker/src/five08/worker/crm/resume_profile_processor.py	Replaces worker implementation with a thin shim over the shared processor.
apps/discord_bot/src/five08/discord_bot/cogs/crm.py	Switches from backend-job flow to direct shared processor calls; simplifies onboarding fields; adds centralized resume attachment validation; reuses parsed hints.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+    allowed_file_extensions: set[str] = field(
+        default_factory=lambda: {"pdf", "docx", "txt"}
+    )


+                return ResumeApplyResult(
+                    contact_id=contact_id,
+                    updated_fields=sorted(updated_fields),
+                    updated_values={
+                        field: approved_updates[field]
+                        for field in sorted(updated_fields)
+                        if field in approved_updates
+                    },
+                    link_discord_applied=link_applied,
+                    success=True,
+                )
+
+            if updated_fields:
+                return ResumeApplyResult(
+                    contact_id=contact_id,
+                    updated_fields=sorted(updated_fields),
+                    updated_values={
+                        field: approved_updates[field]
+                        for field in sorted(updated_fields)
+                        if field in approved_updates
+                    },
+                    link_discord_applied=link_applied,
+                    success=False,
+                    error="; ".join(batch_errors)
+                    if batch_errors
+                    else "Some fields did not persist after update",
+                )
+


+RESUME_ALLOWED_EXTENSIONS = frozenset({".pdf", ".docx"})
+RESUME_ALLOWED_EXTENSIONS_LABEL = "PDF or DOCX"
+RESUME_MAX_FILE_SIZE_BYTES = 10 * 1024 * 1024
+RESUME_MAX_FILE_SIZE_MB = RESUME_MAX_FILE_SIZE_BYTES // (1024 * 1024)


+        detected: set[str] = set()
+        for token in token_matches:
+            canonical = self._normalize_skill_name(token)
+            if canonical in COMMON_SKILLS and canonical not in DISALLOWED_SKILLS:
+                detected.add(canonical)
+


coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (1)

packages/shared/src/five08/resume_skills_extractor.py (1)

86-107: Consider adding a timeout for the OpenAI API call.

The LLM call lacks an explicit timeout. While worker-level timeouts may apply, adding a request-level timeout would provide more predictable failure modes.

💡 Optional: Add timeout parameter

             response = self.client.chat.completions.create(
                 model=self.model,
                 messages=[
                     {
                         "role": "system",
                         "content": (
                             "You extract professional skills from resumes for a CRM. "
                             "Focus on white-collar skills for product development orgs: "
                             "engineering, product, data, design, growth, and marketing. "
                             "Return JSON only, no prose. "
                             "Normalize skills to concise canonical names, lowercase. "
                             "Provide a strength from 1-5 when known, where 5 is strongest. "
                             "If uncertain, you may omit it or leave it blank. "
                             "Bias 3 for simple mentions, 4-5 for recent/current project usage, "
                             "and 1-2 for weak, outdated, or minimal exposure."
                         ),
                     },
                     {"role": "user", "content": prompt},
                 ],
                 temperature=0.1,
                 max_tokens=1200,
+                timeout=30.0,
             )

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/shared/src/five08/resume_skills_extractor.py` around lines 86 - 107,
The OpenAI call in self.client.chat.completions.create lacks a request-level
timeout; update the call to include a timeout parameter (e.g.,
timeout=self.request_timeout or a literal like timeout=30) so the request fails
fast instead of relying solely on worker-level timeouts, and add a configurable
attribute (e.g., self.request_timeout) on the enclosing class or method
signature so callers can set it; ensure you pass the timeout alongside
temperature and max_tokens in the same call and handle any timeout exceptions
where response is consumed.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/discord_bot/src/five08/discord_bot/cogs/crm.py`:
- Line 7993: The command help text still reads "Resume file to upload (PDF, DOC,
or DOCX)" but the validator now rejects DOC files; update the help/description
string for the command parameter named "file" (the argument whose value is
currently set to file="Resume file to upload (PDF, DOC, or DOCX)") to reflect
the actual allowed types (e.g., "Resume file to upload (PDF or DOCX)"). Ensure
you only change that description string so it matches the validator behavior.
- Around line 4737-4745: The audit write in the resume validation failure path
currently calls self._audit_command(...) which can raise and abort the Discord
command; change this to use the best-effort helper self._audit_command_safe(...)
(or wrap the call in a try/except that swallows/logs audit errors) when logging
the failure (preserving interaction, action, result=failure_result and metadata
with "filename" and "reason":"invalid_file_type"); do the same replacement for
the other nearby resume-validation audit call that uses _audit_command so audit
failures do not propagate.
- Around line 66-70: The bot duplicates resume validation rules; replace the
hard-coded RESUME_ALLOWED_EXTENSIONS, RESUME_ALLOWED_EXTENSIONS_LABEL,
RESUME_MAX_FILE_SIZE_BYTES and RESUME_MAX_FILE_SIZE_MB with values derived from
the shared processor config so rules never drift: import or call
ResumeProcessorConfig.from_settings(...) (or the shared config provider used
elsewhere) and set the bot's allowed extensions, label and max file size from
its properties (convert bytes->MB for RESUME_MAX_FILE_SIZE_MB and format the
label from the extensions list) so the bot-side checks use the exact same rules
as ResumeProcessorConfig; update all other duplicated blocks (the other
occurrences noted) to follow the same pattern.

In `@packages/shared/src/five08/resume_profile_processor.py`:
- Around line 680-694: The block that returns a ResumeApplyResult when
updated_fields is truthy is unreachable because an earlier return already
handles the same condition; remove this duplicate if updated_fields: return
ResumeApplyResult(...) block (which references updated_fields, approved_updates,
link_discord_applied, batch_errors) to eliminate dead code and keep only the
intended single return path for updated_fields in resume_profile_processor.py.

---

Nitpick comments:
In `@packages/shared/src/five08/resume_skills_extractor.py`:
- Around line 86-107: The OpenAI call in self.client.chat.completions.create
lacks a request-level timeout; update the call to include a timeout parameter
(e.g., timeout=self.request_timeout or a literal like timeout=30) so the request
fails fast instead of relying solely on worker-level timeouts, and add a
configurable attribute (e.g., self.request_timeout) on the enclosing class or
method signature so callers can set it; ensure you pass the timeout alongside
temperature and max_tokens in the same call and handle any timeout exceptions
where response is consumed.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c07a903f-2104-4ada-b52f-7bce7ebe217d

📥 Commits

Reviewing files that changed from the base of the PR and between 77f44c4 and a357680.

📒 Files selected for processing (9)

apps/discord_bot/src/five08/discord_bot/cogs/crm.py
apps/worker/src/five08/worker/crm/resume_profile_processor.py
apps/worker/src/five08/worker/models.py
packages/shared/src/five08/resume_document_processor.py
packages/shared/src/five08/resume_processing_models.py
packages/shared/src/five08/resume_profile_processor.py
packages/shared/src/five08/resume_skills_extractor.py
tests/unit/test_crm.py
tests/unit/test_resume_profile_processor.py

Copilot

Pull request overview

Refactors the resume extraction/apply flow so both the Discord bot (synchronous path) and the worker (background path) share the same resume processor/config/models, while also tightening CRM field expectations (e.g., cLinkedIn, cOnboarder, cOnboardingState) and restricting resume uploads to PDF/DOCX.

Changes:

Introduces shared resume processing primitives (ResumeProcessorConfig, ResumeProfileProcessor, typed result models) under packages/shared.
Updates the worker to use a compatibility shim that delegates to the shared processor.
Updates the Discord bot to run extract/apply directly (no backend job enqueue/polling), adds centralized resume attachment validation, and adjusts onboarding/LinkedIn field handling.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.

Show a summary per file

File	Description
tests/unit/test_skills_extractor.py	Adds coverage for shared heuristic multi-word skill detection.
tests/unit/test_resume_profile_processor.py	Adds tests for shared config clamping + partial-apply warning behavior.
tests/unit/test_crm.py	Updates bot behavior tests for direct extract/apply, onboarding field changes, upload validation, and hint reuse.
packages/shared/src/five08/resume_skills_extractor.py	New shared skills extractor with LLM + heuristic fallback.
packages/shared/src/five08/resume_profile_processor.py	New shared end-to-end resume extract/apply processor and config.
packages/shared/src/five08/resume_processing_models.py	New shared Pydantic models for extraction/apply payloads/results.
packages/shared/src/five08/resume_document_processor.py	New shared document processor for PDF/DOCX validation + text extraction.
apps/worker/src/five08/worker/models.py	Switches worker models to re-export shared resume processing models.
apps/worker/src/five08/worker/crm/resume_profile_processor.py	Replaces worker implementation with shim subclassing shared processor.
apps/worker/src/five08/worker/config.py	Restricts allowed resume types to `pdf,docx`.
apps/discord_bot/src/five08/discord_bot/cogs/crm.py	Moves bot resume flows to direct shared processor calls, adds resume validation helper, and simplifies CRM field assumptions.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+            crm_update_payload = dict(approved_updates)
+            crm_update_payload["cResumeLastProcessed"] = datetime.now(
+                tz=timezone.utc
+            ).strftime("%Y-%m-%d %H:%M:%S")


coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/discord_bot/src/five08/discord_bot/cogs/crm.py`:
- Around line 66-69: Replace the hard-coded LINKEDIN_FIELD constant with a value
read from the shared processor config (use the existing config key
crm_linkedin_field) and fall back to "cLinkedIn" if not present; specifically,
remove or override the top-level LINKEDIN_FIELD constant and instead initialize
the field name from the processor/shared config where EspoClient or CRM
operations reference LINKEDIN_FIELD (search for the symbol LINKEDIN_FIELD and
usages in crm-related functions/classes), then update all occurrences (including
the other noted locations) to use the runtime-configured field name so LinkedIn
search/update/label behavior respects crm_linkedin_field overrides.

In `@packages/shared/src/five08/resume_profile_processor.py`:
- Around line 737-746: The code returns attempted updates in
updated_fields/updated_values even when no writes succeeded; change the
ResumeApplyResult construction in the branch where no updates succeeded to set
updated_fields to an empty list and updated_values to an empty dict (instead of
using approved_updates), while keeping contact_id, link_discord_applied
(link_applied), success=False and error built from batch_errors; ensure
ResumeApplyResult(...) uses
sorted(approved_updates.keys())/dict(approved_updates) only in the successful
path where at least one write actually succeeded.
- Around line 629-637: The early return when approved_updates is empty prevents
the subsequent Discord-link enrichment from running, which rejects link-only
submissions; update the function containing the approved_updates check (the
block that returns a ResumeApplyResult with success=False and error="No valid
profile fields provided") to either remove or defer that return so that the
Discord enrichment logic that sets cDiscordUserID and cDiscordUsername still
executes when approved_updates is empty, and ensure the final ResumeApplyResult
reflects any discord-only updates before returning.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3089bfe2-a08e-403e-8179-bb417ebf7fd4

📥 Commits

Reviewing files that changed from the base of the PR and between a357680 and 246e2d2.

📒 Files selected for processing (8)

apps/discord_bot/src/five08/discord_bot/cogs/crm.py
apps/worker/src/five08/worker/config.py
packages/shared/src/five08/resume_processing_models.py
packages/shared/src/five08/resume_profile_processor.py
packages/shared/src/five08/resume_skills_extractor.py
tests/unit/test_crm.py
tests/unit/test_resume_profile_processor.py
tests/unit/test_skills_extractor.py

coderabbitai · 2026-03-18T23:44:36Z

+ESPO_DATE_FORMAT = "%Y-%m-%d"
+ESPO_DATETIME_FORMAT = "%Y-%m-%d %H:%M:%S"
+LINKEDIN_FIELD = "cLinkedIn"
 EspoClient = espo.EspoClient


⚠️ Potential issue | 🟠 Major

Avoid hard-coding the LinkedIn CRM field name.

Line 68 pins LINKEDIN_FIELD to "cLinkedIn", but the shared processor config supports crm_linkedin_field overrides. This can break LinkedIn search/update/label behavior in deployments using a different CRM field.

🔧 Proposed fix

-LINKEDIN_FIELD = "cLinkedIn" +LINKEDIN_FIELD = ( + str(getattr(settings, "crm_linkedin_field", "cLinkedIn")).strip() or "cLinkedIn" +)

Also applies to: 1695-1697, 5171-5172, 5354-5355, 7196-7197

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@apps/discord_bot/src/five08/discord_bot/cogs/crm.py` around lines 66 - 69, Replace the hard-coded LINKEDIN_FIELD constant with a value read from the shared processor config (use the existing config key crm_linkedin_field) and fall back to "cLinkedIn" if not present; specifically, remove or override the top-level LINKEDIN_FIELD constant and instead initialize the field name from the processor/shared config where EspoClient or CRM operations reference LINKEDIN_FIELD (search for the symbol LINKEDIN_FIELD and usages in crm-related functions/classes), then update all occurrences (including the other noted locations) to use the runtime-configured field name so LinkedIn search/update/label behavior respects crm_linkedin_field overrides.

coderabbitai · 2026-03-18T23:44:36Z

+            return ResumeApplyResult(
+                contact_id=contact_id,
+                updated_fields=sorted(approved_updates.keys()),
+                updated_values=dict(approved_updates),
+                link_discord_applied=link_applied,
+                success=False,
+                error="; ".join(batch_errors)
+                if batch_errors
+                else "No fields were updated",
+            )


⚠️ Potential issue | 🟠 Major

Do not report failed writes as updated fields.

This branch runs when no field update succeeded, but it currently returns updated_fields/updated_values populated from attempted inputs. That misleads callers into treating failed updates as applied.

Suggested fix

return ResumeApplyResult( contact_id=contact_id, - updated_fields=sorted(approved_updates.keys()), - updated_values=dict(approved_updates), + updated_fields=[], + updated_values={}, link_discord_applied=link_applied, success=False, error="; ".join(batch_errors) if batch_errors else "No fields were updated", )

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/shared/src/five08/resume_profile_processor.py` around lines 737 - 746, The code returns attempted updates in updated_fields/updated_values even when no writes succeeded; change the ResumeApplyResult construction in the branch where no updates succeeded to set updated_fields to an empty list and updated_values to an empty dict (instead of using approved_updates), while keeping contact_id, link_discord_applied (link_applied), success=False and error built from batch_errors; ensure ResumeApplyResult(...) uses sorted(approved_updates.keys())/dict(approved_updates) only in the successful path where at least one write actually succeeded.

Copilot

Pull request overview

This PR refactors the resume extraction/apply workflow into shared modules so both the Discord bot and the worker can reuse the same processor, while also standardizing EspoCRM field usage (notably cLinkedIn, cOnboarder, cOnboardingState) and tightening resume upload validation to PDF/DOCX only.

Changes:

Introduces shared resume processing models + processor + skills extractor in packages/shared, and updates the worker to use a compatibility shim.
Updates the Discord bot to run resume extract/apply flows directly (sync via asyncio.to_thread) and reuses parsed resume hints in failure summaries.
Simplifies CRM field handling and updates tests to reflect new onboarding/linkedin fields and new resume validation rules.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
tests/unit/test_worker_config.py	Updates worker defaults tests to reflect removal of configurable LinkedIn field mapping.
tests/unit/test_skills_extractor.py	Adds coverage for shared heuristic skill extraction detecting multi-word phrases.
tests/unit/test_resume_profile_processor.py	Adds shared config tests, removes configurable LinkedIn field tests, and expands apply-path behavior tests (aliases, link-only, partial success).
tests/unit/test_crm.py	Updates bot CRM tests for new onboarding fields, Espo datetime formatting, resume validation (PDF/DOCX-only), hint reuse, and direct extract/apply paths.
packages/shared/src/five08/resume_skills_extractor.py	Adds shared skills extractor with LLM + heuristic fallback.
packages/shared/src/five08/resume_profile_processor.py	Adds shared end-to-end resume extraction + CRM apply workflow and persistence of processing runs.
packages/shared/src/five08/resume_processing_models.py	Adds shared Pydantic models for extract/apply results and change/skip reporting.
packages/shared/src/five08/resume_document_processor.py	Adds shared document text extraction/validation + hashing/caching.
apps/worker/src/five08/worker/models.py	Re-exports shared resume processing models instead of duplicating them in worker models.
apps/worker/src/five08/worker/crm/resume_profile_processor.py	Replaces worker processor implementation with a thin shim over the shared processor/config.
apps/worker/src/five08/worker/crm/people_sync.py	Standardizes LinkedIn field handling to `cLinkedIn`.
apps/worker/src/five08/worker/crm/intake_form_processor.py	Standardizes intake payload LinkedIn field mapping to `cLinkedIn`.
apps/worker/src/five08/worker/config.py	Removes configurable LinkedIn field and restricts allowed resume types to `pdf,docx`.
apps/discord_bot/src/five08/discord_bot/cogs/crm.py	Switches resume flows to direct shared processing, standardizes onboarding/linkedin fields, adds resume attachment validation helper, and reuses parsed hints in summaries.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+        self.allowed_extensions = {ext.strip().lower() for ext in allowed_extensions}
+        self.max_file_size = max(1, int(max_file_size_mb)) * 1024 * 1024
+        self._content_cache: dict[str, str] = {}
+


+    def _parse_strength(self, value: Any) -> int | None:
+        raw: Any = value
+        if isinstance(value, dict):
+            raw = value.get("strength")
+        if raw is None:
+            return None
+        if isinstance(raw, str) and not raw.strip():
+            return None
+        try:
+            numeric = int(float(raw))
+        except Exception:
+            return None
+        if numeric < 1 or numeric > 5:
+            return None
+        return numeric
+
+    def _parse_skill_with_strength(self, value: str) -> tuple[str, int | None]:
+        raw = value.strip()
+        match = re.match(r"^(.*)\(\s*(\d*)\s*\)\s*$", raw)
+        if match is None:
+            return self._normalize_skill_name(raw), None
+
+        base = match.group(1).strip()
+        parsed_strength = self._parse_strength(match.group(2))
+        if not base:
+            return "", None
+        normalized_base = self._normalize_skill_name(base)
+        if not normalized_base:
+            return "", None
+        if parsed_strength is None:
+            return normalized_base, None
+        return normalized_base, parsed_strength
+


refactor crm resume processing

a357680

Copilot AI review requested due to automatic review settings March 18, 2026 22:30

michaelmwu temporarily deployed to test March 18, 2026 22:30 — with GitHub Actions Inactive

Copilot AI reviewed Mar 18, 2026

View reviewed changes

coderabbitai Bot reviewed Mar 18, 2026

View reviewed changes

address PR review feedback

b386615

michaelmwu temporarily deployed to test March 18, 2026 22:44 — with GitHub Actions Inactive

fix resume validation audit handling

1d968de

Copilot AI review requested due to automatic review settings March 18, 2026 22:45

michaelmwu temporarily deployed to test March 18, 2026 22:45 — with GitHub Actions Inactive

Copilot AI reviewed Mar 18, 2026

View reviewed changes

Comment thread packages/shared/src/five08/resume_profile_processor.py Outdated

Comment on lines +659 to +662

crm_update_payload = dict(approved_updates)

crm_update_payload["cResumeLastProcessed"] = datetime.now(

tz=timezone.utc

).strftime("%Y-%m-%d %H:%M:%S")

update fallback resume timestamp

246e2d2

michaelmwu temporarily deployed to test March 18, 2026 23:37 — with GitHub Actions Inactive

coderabbitai Bot reviewed Mar 18, 2026

View reviewed changes

simplify linkedin field handling

a1cf65c

Copilot AI review requested due to automatic review settings March 18, 2026 23:51

michaelmwu temporarily deployed to test March 18, 2026 23:52 — with GitHub Actions Inactive

Copilot AI reviewed Mar 18, 2026

View reviewed changes

remove unused skill parsing helpers

86b4e1c

michaelmwu temporarily deployed to test March 19, 2026 00:02 — with GitHub Actions Inactive

michaelmwu merged commit 28bdde5 into main Mar 19, 2026
4 checks passed

michaelmwu deleted the michaelmwu/dry-crm-constants branch March 19, 2026 00:03

coderabbitai Bot mentioned this pull request Mar 24, 2026

Add member agreement Docuseal flow and role grant #215

Merged

Conversation

michaelmwu commented Mar 18, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Related Issue

How Has This Been Tested?

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Mar 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Sequence Diagram

Estimated Code Review Effort

Possibly Related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Mar 18, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Mar 18, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

michaelmwu commented Mar 18, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Mar 18, 2026 •

edited

Loading