fix: improve resume location parsing and editing by michaelmwu · Pull Request #180 · 508-dev/508-workflows

michaelmwu · 2026-03-08T21:44:50Z

Description

Tighten CRM/resume location parsing so that country values must be valid countries, city/state fields reject obvious garbage, and two-part locations like `Nanzih, Kaohsiung City` are repaired as city plus region instead of being assigned a bad country.
Add a resume-review Edit Location modal for City, State / Region, Country, and Timezone, and render grouped human-readable location summaries in both the preview and applied-updates embeds.
Expand unit coverage for normalization, resume extraction, resume confirmation UI, and non-US location parsing.

Related Issue

N/A

How Has This Been Tested?

uv run pytest tests/unit/test_crm_normalization.py tests/unit/test_resume_extractor.py tests/unit/test_crm.py -q

Summary by CodeRabbit

New Features
- Added a location editor (city, state, country, timezone) and a button to open it.
- Resume/profile previews now show Location as a single, consolidated line when related fields change.
- Improved normalization and parsing for ambiguous two-part locations.
Bug Fixes
- Better validation distinguishing cities, states, and countries; more robust timezone handling.
Tests
- Added tests covering location editing, grouping, parsing, and preview behavior.

coderabbitai · 2026-03-08T21:45:07Z


ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f5786c91-e0e5-4c3f-97cd-8df86ee99500

📥 Commits

Reviewing files that changed from the base of the PR and between f4a312b and 3cd379e.

📒 Files selected for processing (4)

apps/discord_bot/src/five08/discord_bot/cogs/crm.py
packages/shared/src/five08/crm_normalization.py
tests/unit/test_crm.py
tests/unit/test_crm_normalization.py

📝 Walkthrough

Walkthrough

Adds end-to-end location editing: UI modal/button, normalization utilities, resume-extraction heuristics for city/state/country/timezone, and tests; integrates location rendering/collapsing into the CRM update confirmation view and preview generation.

Changes

| Cohort / File(s) | Summary |
|---|---|
| CRM UI & View `apps/discord_bot/src/five08/discord_bot/cogs/crm.py` | Added `ResumeEditLocationModal` and `ResumeEditLocationButton`; extended `ResumeUpdateConfirmationView` with `_LOCATION_FIELDS`, `_is_location_field`, `_has_location_updates`, `_location_component`, `_format_location_summary`; updated applied/preview rendering and collapsing to show Location as a single logical field. |
| Normalization library `packages/shared/src/five08/crm_normalization.py` | Introduced extensive location normalization: US state maps, country canonical/alias maps, and helpers (`_normalize_location_text`, `_location_lookup_key`, `_is_plausible_location_phrase`, `_title_case_location`); refactored `normalize_city`, `normalize_state`, and `normalize_country` to use these helpers. |
| Resume extractor `packages/shared/src/five08/resume_extractor.py` | Updated `_parse_location_candidate` to treat two-part locations as city+state when second token isn't a country; added prompt constraints to prefer real country names, treat ambiguous two-part tokens as city/state, and exclude tech tokens from location fields. |
| Unit tests — CRM `tests/unit/test_crm.py` | Added tests for location UI (modal pre-population, submission/normalization), location-button visibility, and grouping/collapsing of location subfields in the confirmation view. |
| Unit tests — Normalization `tests/unit/test_crm_normalization.py` | Exposed/added `normalize_state` to public API and added tests covering state normalization, country/city disambiguation, and invalid-token rejection. |
| Unit tests — Resume extractor `tests/unit/test_resume_extractor.py` | Added/updated tests for two-part city/region parsing, repair of invalid country data, and updated expectations for normalized state values. |

Sequence Diagram

sequenceDiagram
    autonumber
    actor User
    participant DiscordUI as "Discord UI\n(Button)"
    participant Modal as "Location Modal\n(Form)"
    participant Normalizer as "Normalization\n(crm_normalization)"
    participant CRM as "CRM View\n(ResumeUpdateConfirmationView)"
    participant DB as "Database"

    User->>DiscordUI: Click Edit Location
    DiscordUI->>Modal: Open (pre-populate values)
    User->>Modal: Submit city/state/country/timezone
    Modal->>Normalizer: normalize_city(input)
    Normalizer-->>Modal: normalized_city
    Modal->>Normalizer: normalize_state(input)
    Normalizer-->>Modal: normalized_state
    Modal->>Normalizer: normalize_country(input)
    Normalizer-->>Modal: normalized_country
    Modal->>Normalizer: normalize_timezone(input)
    Normalizer-->>Modal: normalized_timezone
    Modal->>CRM: Return proposed_updates (location fields)
    CRM->>CRM: _has_location_updates()
    CRM->>CRM: _format_location_summary()
    CRM->>DB: Persist proposed_updates
    CRM-->>User: Display Proposed Changes with consolidated "Location" line
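
The consolidated "Location" line at the end of the diagram can be illustrated with a minimal sketch. This is not the PR's `_format_location_summary`; the function names, the `addressCountry` key, and the `City, Region, Country (Timezone)` output shape are assumptions made for illustration (`addressCity`, `addressState`, and `cTimezone` are the field names mentioned elsewhere in this thread).

```python
from typing import Any, Mapping

# Assumed CRM field names; addressCountry is a guess by analogy with addressCity/addressState.
LOCATION_FIELDS = ("addressCity", "addressState", "addressCountry")


def location_component(value: Any) -> str:
    """Return a trimmed string, treating None and the literals 'none'/'null' as empty."""
    text = str(value or "").strip()
    return "" if text.casefold() in {"none", "null"} else text


def format_location_summary(values: Mapping[str, Any]) -> str:
    """Join the non-empty place parts with commas and append the timezone in parentheses."""
    parts = (location_component(values.get(field)) for field in LOCATION_FIELDS)
    place = ", ".join(part for part in parts if part)
    timezone = location_component(values.get("cTimezone"))
    if place and timezone:
        return f"{place} ({timezone})"
    return place or timezone
```

Grouping at render time, as sketched here, would leave the underlying update payload as separate fields, so only the embed text changes.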

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

feat: extend update-contact fields #174: Overlaps CRM location parsing and UI plumbing; likely modifies the same _parse_location_input and update-contact location flows.
Improve resume CRM inference and Espo client integration #168: Adds similar location handling and normalize_state changes in resume extraction and normalization libraries.
Suggest cTimezone during profile processing from location data #159: Introduces timezone heuristics and related prompt/parse adjustments that intersect with location/timezone inference in this PR.

Poem

🐰 I hopped to a modal, tidy and bright,
City, State, Country — all snug in a line,
Normalized hops, timestamps kept light,
Preview sings "Location" — compact and fine,
Rabbits cheer for tidy data tonight! 🥕✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 55.38% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the primary changes: improving resume location parsing and adding location editing UI. |


Copilot

Pull request overview

This PR tightens resume/CRM location normalization (rejecting invalid countries and obvious garbage in city/state) and adds Discord UI affordances to review/edit location fields before applying updates to the CRM.

Changes:

Strengthen location normalization: canonical-country validation, US state abbreviation expansion, and plausibility checks for city/state text.
Improve resume location extraction to better repair two-part non-country locations (city + region).
Add a Discord “Edit Location” modal and group location fields into a single human-readable “Location” line in preview/applied-updates embeds, with expanded unit coverage.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.


| File | Description |
|---|---|
| `packages/shared/src/five08/crm_normalization.py` | Adds canonical country validation + aliases, US state abbreviation expansion, and plausibility checks for city/state. |
| `packages/shared/src/five08/resume_extractor.py` | Updates deterministic parsing to treat 2-part non-country locations as city+region and strengthens LLM prompt constraints. |
| `apps/discord_bot/src/five08/discord_bot/cogs/crm.py` | Introduces Edit Location modal/button; groups location fields into a single “Location” summary in preview + applied updates. |
| `tests/unit/test_crm_normalization.py` | Adds/updates normalization tests for country/state/city plausibility and US state expansion. |
| `tests/unit/test_resume_extractor.py` | Adds extraction test ensuring invalid LLM location fields are overridden by deterministic parsing; updates state expectations. |
| `tests/unit/test_crm.py` | Adds tests for grouped location rendering and the new Edit Location modal/button behavior. |


Copilot · 2026-03-08T21:51:24Z

packages/shared/src/five08/crm_normalization.py

+
+
+def _normalize_location_text(value: str) -> str:
+    return re.sub(r"\s+", " ", value).strip(" ,")


`_normalize_location_text` currently strips spaces and commas but not trailing periods. As a result, `normalize_country` rejects common resume punctuation such as `Taiwan.` or `Germany.`, and higher-level location parsing may then fail for inputs like `City, Region, Taiwan.`. Consider stripping terminal punctuation (at least `.`) during normalization so canonical country matching still works.

Suggested change

-    return re.sub(r"\s+", " ", value).strip(" ,")
+    # Collapse internal whitespace and strip common trailing punctuation/spaces.
+    return re.sub(r"\s+", " ", value).strip(" ,.")
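
To see the effect of the suggestion in isolation, here is a small self-contained sketch. The two-entry `CANONICAL` table and `lookup_country` are hypothetical stand-ins for the PR's country maps and `normalize_country`; only the `re.sub(...).strip(" ,.")` expression is taken from the suggestion itself.

```python
import re
from typing import Optional

# Hypothetical stand-in for the PR's canonical country table.
CANONICAL = {"taiwan": "Taiwan", "germany": "Germany"}


def normalize_location_text(value: str) -> str:
    """Collapse whitespace runs, then trim spaces, commas, and periods from both ends."""
    return re.sub(r"\s+", " ", value).strip(" ,.")


def lookup_country(value: str) -> Optional[str]:
    """Look up a country after normalization; returns None for unknown names."""
    return CANONICAL.get(normalize_location_text(value).casefold())
```

With this change `lookup_country("Taiwan.")` resolves, while interior periods such as the one in `St. Louis` are untouched. One trade-off to keep in mind: a value that legitimately ends in a period, such as `Washington D.C.`, loses its final period as well.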

Copilot · 2026-03-08T21:51:25Z

packages/shared/src/five08/crm_normalization.py

+    letters_only = re.sub(r"[^A-Za-z]", "", normalized)
+    if len(letters_only) <= 2:


`normalize_state` uses `letters_only = re.sub(r"[^A-Za-z]", "", normalized)` to reject short tokens. This unintentionally rejects non-ASCII region names that otherwise pass `_is_plausible_location_phrase` (e.g., Chinese/Japanese/Korean region strings), because `letters_only` becomes empty and the function returns `None`. Consider counting Unicode letters instead (e.g., based on `str.isalpha()`), so the “short token” guard doesn’t exclude valid non-Latin locations.

Suggested change

-    letters_only = re.sub(r"[^A-Za-z]", "", normalized)
-    if len(letters_only) <= 2:
+    letter_count = sum(1 for ch in normalized if ch.isalpha())
+    if letter_count <= 2:
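
The difference between the two counting strategies can be checked with a short sketch. The helper names below are illustrative and do not appear in the PR; the two expressions inside them are the ones from the suggested change.

```python
import re


def ascii_letter_count(text: str) -> int:
    """Count only A-Z/a-z, as the original guard effectively does."""
    return len(re.sub(r"[^A-Za-z]", "", text))


def unicode_letter_count(text: str) -> int:
    """Count every character that Python's str.isalpha() treats as a letter."""
    return sum(1 for ch in text if ch.isalpha())
```

For `高雄市` the ASCII count is 0 while the Unicode count is 3, so only the second version lets the name through the `<= 2` guard. Note that two-character names such as `東京` would still be rejected by that threshold under either approach, which may or may not be intended.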

Copilot · 2026-03-08T21:51:25Z

packages/shared/src/five08/resume_extractor.py

    if len(parts) == 2:
        if _looks_like_state_region(region):
            return city, _normalize_state(region), None
        country = _normalize_country(region)
        if country:


In `_parse_location_candidate`, the `_looks_like_state_region(region)` branch runs before trying `_normalize_country(region)`. Since `_looks_like_state_region` returns true for any 2-letter uppercase token, inputs like `London, UK` or `San Francisco, US` will be treated as a “state” and short-circuited, causing country parsing to fail (it may return `(city, None, None)` with the new, stricter `normalize_state`). Consider attempting `_normalize_country(region)` first, or tightening `_looks_like_state_region` to only match known US state abbreviations (instead of `[A-Z]{2}`).
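
A minimal sketch of the "country first" ordering proposed above, under stated assumptions: the lookup tables are tiny hypothetical subsets, and `parse_two_part` is an illustrative name rather than the PR's `_parse_location_candidate`.

```python
from typing import Optional, Tuple

# Hypothetical subsets; the real tables in the PR are much larger.
US_STATE_ABBREVIATIONS = {"CA": "California", "NY": "New York"}
COUNTRY_ALIASES = {"uk": "United Kingdom", "us": "United States", "taiwan": "Taiwan"}

Location = Tuple[Optional[str], Optional[str], Optional[str]]


def parse_two_part(location: str) -> Location:
    """Split 'City, Region' and return (city, state, country)."""
    parts = [part.strip() for part in location.split(",")]
    if len(parts) != 2 or not all(parts):
        return None, None, None
    city, region = parts
    # Try the country lookup first so 'UK' or 'US' is not mistaken for a state.
    country = COUNTRY_ALIASES.get(region.casefold())
    if country:
        return city, None, country
    # Only known US abbreviations expand to a state name.
    state = US_STATE_ABBREVIATIONS.get(region.upper())
    if state:
        return city, state, None
    # Otherwise keep the second part as a free-form region.
    return city, region, None
```

Checking countries first does not remove every ambiguity: a token like `CA` is both a US state abbreviation and the ISO code for Canada, so whichever table contains it decides the result.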

coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (3)

tests/unit/test_crm_normalization.py (1)
55-61: Add the same garbage-string rejection check for normalize_state.

The PR tightens both city and state parsing, but this block only proves the long bullet-prefixed garbage value is rejected for normalize_city. If normalize_state starts accepting the same junk, this file will miss it.
Suggested assertion
     assert normalize_state("CA") == "California"
     assert normalize_state("Kaohsiung City") == "Kaohsiung City"
+    assert (
+        normalize_state("○ A Python Django Api Handles Account Creation And Management")
+        is None
+    )
     assert normalize_state("Js") is None
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/test_crm_normalization.py` around lines 55 - 61, The test
currently verifies that normalize_city rejects the long bullet-prefixed garbage
string but does not assert the same for normalize_state; add an analogous
assertion that normalize_state("○ A Python Django Api Handles Account Creation
And Management") is None so the test covers state normalization rejecting the
same junk. Locate the assertions block containing normalize_city and
normalize_state in tests/unit/test_crm_normalization.py and add the new
assertion next to the existing normalize_city check to ensure normalize_state
behavior is validated.
tests/unit/test_resume_extractor.py (1)
759-826: Also assert timezone backfill in this repaired-location path.

This test proves the invalid country is replaced and the region is repaired, but it never checks the Taiwan timezone inference that similar extractor tests already cover. A regression could keep address_city/state/country correct while dropping result.timezone.
Suggested assertion
     assert result.address_city == "Nanzih"
     assert result.address_state == "Kaohsiung City"
     assert result.address_country == "Taiwan"
+    assert result.timezone == "UTC+08:00"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/test_resume_extractor.py` around lines 759 - 826, The test
test_extract_discards_invalid_country_and_repairs_current_location_region
currently verifies address_city/state/country but omits timezone; update this
test (which uses ResumeProfileExtractor and assigns result =
extractor.extract(...)) to also assert the inferred timezone (expecting
"Asia/Taipei") on result.timezone to ensure timezone backfill is present when
the invalid country is replaced and the region is repaired.
tests/unit/test_crm.py (1)
669-693: Cover non-canonical manual edits in the location modal submit test.

This case only feeds already-normalized city/state/country values, so it mainly proves passthrough plus timezone normalization. A regression that skips normalize_country/normalize_state in the new modal path would still pass.

Consider a companion submit case with mixed-case or abbreviated inputs, plus an invalid country value, and assert the payload is normalized or rejected as intended.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/test_crm.py` around lines 669 - 693, Add a companion async test
(e.g., test_edit_location_modal_submit_normalizes_or_rejects) that exercises
ResumeEditLocationModal.on_submit using mixed-case/abbreviated inputs (e.g.,
"nAnZiH", "ca", "uSa" or "US") and an invalid country value to cover
non-canonical manual edits; ensure the test sets the values of
modal.city_input, modal.state_input, modal.country_input, and
modal.timezone_input, calls await modal.on_submit(mock_interaction), and then
asserts that ResumeUpdateConfirmationView.proposed_updates contains values
normalized via normalize_country/normalize_state (or that the modal
rejects/handles the invalid country as the code path dictates) and that
mock_interaction.response.send_message was called (or not) according to the
rejection behavior so the normalization and validation paths are covered.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/discord_bot/src/five08/discord_bot/cogs/crm.py`:
- Around line 1153-1154: The ResumeEditLocationButton is only added when
self._has_location_updates(proposed_updates) is true, making the location editor
unreachable if the parser returns no location; change the logic so the button is
added whenever location data is missing or incomplete in proposed_updates (i.e.,
when not self._has_location_updates(proposed_updates) or when specific location
fields like city/state/country/timezone are absent) so users can open the modal
to add a new location; update the condition around
add_item(ResumeEditLocationButton()) in the same block to check for
missing/incomplete location fields rather than only existing updates.
- Around line 954-964: The modal currently only uses
normalize_timezone(raw_timezone) and thus rejects abbreviation inputs like
"PST"/"EST"; update the timezone parsing to mirror _parse_location_input by
first trying normalize_timezone(raw_timezone) and if that returns falsy, check
_LOCATION_TIMEZONE_ABBREV_MAP for a mapped value (or call the same helper used
by _parse_location_input) so timezone accepts both offset-style and abbreviation
inputs; adjust the timezone variable assignment and the invalid_fields check
(referencing raw_timezone and timezone) accordingly.

In `@packages/shared/src/five08/crm_normalization.py`:
- Around line 90-317: The country lookup rejects legitimate names because
_location_lookup_key() only strips periods and the canonical/alias maps
(_CANONICAL_COUNTRY_NAMES, _COUNTRY_CANONICAL_MAP, _COUNTRY_ALIASES) lack many
variants (e.g., "Democratic Republic of the Congo", accented forms like "Côte
d’Ivoire"). Fix by (1) updating _CANONICAL_COUNTRY_NAMES to include missing
canonical names (e.g., "Democratic Republic of the Congo") and common official
variants, (2) enhancing _location_lookup_key() to perform robust Unicode
normalization (unicodedata.normalize to NFKD + strip combining marks), normalize
apostrophes/quotes (curly to straight), remove punctuation and extra whitespace
and then casefold, and (3) rebuild _COUNTRY_CANONICAL_MAP and _COUNTRY_ALIASES
keys using that normalized lookup key so alias entries (add accented and
punctuation variants) map correctly to the canonical values; update any
reference code that uses _COUNTRY_CANONICAL_MAP/_COUNTRY_ALIASES to call the
improved _location_lookup_key() before lookup.
- Around line 425-444: The fallback is too permissive and lacks non-US aliases;
update normalize_state() and the shared fallback logic by first checking a
curated map of non-US region abbreviations/names (e.g.,
CANADA_PROVINCE_ABBREVIATIONS, OTHER_REGION_NAMES) after
_US_STATE_ABBREVIATIONS/_US_STATE_NAMES and return canonical values if found,
then tighten _is_plausible_location_phrase (reduce max_words to ~3, max_length
to ~40) and add a short stopword list (e.g.,
"engineer","developer","javascript","senior","jr","sr","intern") to reject
job/tech phrases; also require letters_only length >= 3 before accepting and
calling _title_case_location(normalized). Apply the same stricter checks in
normalize_city()-related paths so 1–2 letter tokens (like "JS") are not
title-cased while known region abbreviations are still preserved via the curated
alias maps.


ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2a50a859-4f68-4423-b9a3-c05af581bbe8

📥 Commits

Reviewing files that changed from the base of the PR and between b48c508 and 9b90746.

📒 Files selected for processing (6)

apps/discord_bot/src/five08/discord_bot/cogs/crm.py
packages/shared/src/five08/crm_normalization.py
packages/shared/src/five08/resume_extractor.py
tests/unit/test_crm.py
tests/unit/test_crm_normalization.py
tests/unit/test_resume_extractor.py


Copilot

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.


Copilot · 2026-03-08T22:02:44Z

apps/discord_bot/src/five08/discord_bot/cogs/crm.py

+        raw_city = (self.city_input.value or "").strip()
+        raw_state = (self.state_input.value or "").strip()
+        raw_country = (self.country_input.value or "").strip()
+        raw_timezone = (self.timezone_input.value or "").strip()
+


ResumeEditLocationModal.on_submit treats user-entered strings like "None"/"null" as real values (e.g., normalize_city("None") returns "None"), so submitting the modal can write the literal string "None" into addressCity/addressState rather than clearing the field. Consider pre-normalizing inputs with the same logic as _location_component (treat case-insensitive "none"/"null" as empty) before calling normalize_city/normalize_state, or make the normalization helpers reject these sentinel values.

Suggested change

-        raw_city = (self.city_input.value or "").strip()
-        raw_state = (self.state_input.value or "").strip()
-        raw_country = (self.country_input.value or "").strip()
-        raw_timezone = (self.timezone_input.value or "").strip()
+        def _clean_location_input(value: str | None) -> str:
+            """Normalize raw user location input before CRM normalization.
+            Treats case-insensitive 'none'/'null' as empty so those values clear
+            existing fields instead of being stored literally.
+            """
+            cleaned = (value or "").strip()
+            if cleaned.lower() in {"none", "null"}:
+                return ""
+            return cleaned
+
+        raw_city = _clean_location_input(self.city_input.value)
+        raw_state = _clean_location_input(self.state_input.value)
+        raw_country = _clean_location_input(self.country_input.value)
+        raw_timezone = _clean_location_input(self.timezone_input.value)

Copilot · 2026-03-08T22:02:45Z

apps/discord_bot/src/five08/discord_bot/cogs/crm.py

+    city_input: discord.ui.TextInput = discord.ui.TextInput(
+        label="City",
+        required=False,
+        max_length=100,
+    )
+    state_input: discord.ui.TextInput = discord.ui.TextInput(
+        label="State / Region",
+        required=False,
+        max_length=100,
+    )
+    country_input: discord.ui.TextInput = discord.ui.TextInput(
+        label="Country",
+        required=False,
+        max_length=100,
+    )
+    timezone_input: discord.ui.TextInput = discord.ui.TextInput(
+        label="Timezone",
+        required=False,
+        max_length=100,
+    )


The location modal allows up to 100 characters per field, but normalize_city/normalize_state currently reject values over 40 chars (via _is_plausible_location_phrase(..., max_length=40)). This mismatch will cause the UI to accept input that the backend will always mark invalid. Consider aligning max_length on the TextInputs with the normalization constraints (or loosening the normalization max_length if 100 is intentional).

Copilot · 2026-03-08T22:02:45Z

apps/discord_bot/src/five08/discord_bot/cogs/crm.py

+    @classmethod
+    def _has_location_updates(cls, values: dict[str, Any]) -> bool:
+        return any(values.get(field) for field in cls._LOCATION_FIELDS)
+


ResumeUpdateConfirmationView._has_location_updates is introduced but appears unused in the module. If it’s not needed, consider removing it to keep the view API minimal; otherwise, wire it into the call sites that need it (e.g., deciding whether to render grouped location output).

Suggested change

-    @classmethod
-    def _has_location_updates(cls, values: dict[str, Any]) -> bool:
-        return any(values.get(field) for field in cls._LOCATION_FIELDS)

Copilot · 2026-03-08T22:02:45Z

packages/shared/src/five08/crm_normalization.py

+def _normalize_location_text(value: str) -> str:
+    return re.sub(r"\s+", " ", value).strip(" ,.")
+
+
+def _location_lookup_key(value: str) -> str:
+    normalized = _normalize_location_text(value)
+    normalized = normalized.replace("’", "'").replace("‘", "'")
+    normalized = unicodedata.normalize("NFKD", normalized)
+    normalized = "".join(
+        ch for ch in normalized if not unicodedata.combining(ch)
+    ).casefold()
+    collapsed = "".join(ch if ch.isalnum() else " " for ch in normalized)
+    return re.sub(r"\s+", " ", collapsed).strip()
+
+
+_COUNTRY_CANONICAL_MAP: dict[str, str] = {
+    _location_lookup_key(value): value for value in _CANONICAL_COUNTRY_NAMES
+}
+_COUNTRY_ALIASES: dict[str, str] = {
+    _location_lookup_key(key): value for key, value in _RAW_COUNTRY_ALIASES.items()
+}
+
+
+def _is_plausible_location_phrase(
+    value: str,
+    *,
+    max_words: int,
+    max_length: int,
+) -> bool:
+    if not value or len(value) > max_length:
+        return False
+    tokens = [
+        token.casefold()
+        for token in re.findall(r"[^\W\d_][^\W\d_'-]*", value, flags=re.UNICODE)
+    ]
+    if not tokens or len(tokens) > max_words:
+        return False
+    if any(token in _LOCATION_STOPWORDS for token in tokens):
+        return False
+    for ch in value:
+        if ch.isalpha() or ch in {" ", "-", "'", ".", "(", ")"}:
+            continue
+        return False


normalize_city/normalize_state currently reject strings containing typographic apostrophes (e.g., “’” / “‘”) because _is_plausible_location_phrase only allows ASCII ', while _location_lookup_key already normalizes these for countries. This can cause valid city/region names with curly quotes to be treated as invalid. Consider normalizing curly quotes in _normalize_location_text (or allowing them in _is_plausible_location_phrase) so city/state validation matches the country normalization behavior.
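
A short sketch of the first option (normalizing curly quotes before validation). The helper names are illustrative and not part of the PR; the allowed-punctuation set mirrors the character check shown in the diff above.

```python
# Map typographic apostrophes (U+2019, U+2018) to the ASCII apostrophe.
CURLY_TO_STRAIGHT = str.maketrans({"\u2019": "'", "\u2018": "'"})

# Same punctuation the character loop in _is_plausible_location_phrase accepts.
ALLOWED_PUNCTUATION = {" ", "-", "'", ".", "(", ")"}


def straighten_apostrophes(value: str) -> str:
    """Replace curly single quotes with straight ones."""
    return value.translate(CURLY_TO_STRAIGHT)


def has_only_allowed_chars(value: str) -> bool:
    """True when every character is a letter or allowed punctuation."""
    return all(ch.isalpha() or ch in ALLOWED_PUNCTUATION for ch in value)
```

A name typed as `Coeur d’Alene` fails the character check as-is and passes once `straighten_apostrophes` has been applied, which would bring city/state validation in line with how `_location_lookup_key` already treats country names.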

fix resume location parsing and editing

9b90746

Copilot AI review requested due to automatic review settings March 8, 2026 21:44

michaelmwu temporarily deployed to test March 8, 2026 21:44 — with GitHub Actions Inactive

Copilot started reviewing on behalf of michaelmwu March 8, 2026 21:45 View session

Copilot AI reviewed Mar 8, 2026

View reviewed changes

coderabbitai bot reviewed Mar 8, 2026

View reviewed changes

fix location normalization edge cases

f4a312b

michaelmwu temporarily deployed to test March 8, 2026 21:55 — with GitHub Actions Inactive

fix location editor edge cases

3cd379e

Copilot AI review requested due to automatic review settings March 8, 2026 21:58

michaelmwu temporarily deployed to test March 8, 2026 21:58 — with GitHub Actions Inactive

Copilot started reviewing on behalf of michaelmwu March 8, 2026 21:58 View session

michaelmwu merged commit 9fc4f98 into main Mar 8, 2026
6 checks passed

michaelmwu deleted the michaelmwu/location-parse-fix branch March 8, 2026 21:59

Copilot AI reviewed Mar 8, 2026

View reviewed changes

This was referenced Mar 8, 2026

fix: refresh resume extract previews #181

Merged

fix: tighten role inference and allow cRoles editing #187

Merged



