fix: improve URL highlighting precision#302
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a52198a1b3
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
domain/models.ts
Outdated
| export const URL_HIGHLIGHT_PATTERN = | ||
| "\\b(?:https?:\\/\\/|www\\.)[^\\s<>\"'`]+(?<![.,;:!?])"; | ||
| export const IPV4_HIGHLIGHT_PATTERN = | ||
| `(?<![\\d.])(?<![Vv])(?<!ver\\s)(?<!version\\s)(?<!Version\\s)(?:${STRICT_IPV4_OCTET_PATTERN}\\.){3}${STRICT_IPV4_OCTET_PATTERN}(?![\\d.])`; |
There was a problem hiding this comment.
Restore boundary checks around IPv4 regex
The new IPV4_HIGHLIGHT_PATTERN dropped the old \b...\b word-boundary protection and now only excludes adjacent digits/dots, so it will match IP-like substrings inside alphanumeric tokens (for example, abc192.168.0.1def now highlights 192.168.0.1). This is a precision regression introduced by this change and contradicts the goal of reducing false positives in logs/output text.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Addressed in 8462eee. I tightened the IPv4 pattern to reject adjacent word characters and dots on both sides, so it no longer matches inside alphanumeric tokens or partially inside longer dotted strings while still preserving the version-string exclusions.
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8462eee71a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
domain/models.ts
Outdated
| export const URL_HIGHLIGHT_PATTERN = | ||
| "\\b(?:https?:\\/\\/|www\\.)[^\\s<>\"'`]+(?<![.,;:!?])"; | ||
| export const IPV4_HIGHLIGHT_PATTERN = | ||
| `(?<![\\w.])(?<![Vv])(?<!ver\\s)(?<!version\\s)(?:${STRICT_IPV4_OCTET_PATTERN}\\.){3}${STRICT_IPV4_OCTET_PATTERN}(?![\\w.])`; |
There was a problem hiding this comment.
Restrict version-prefix guard to standalone words
The new IPv4 regex suppresses valid addresses whenever they are preceded by any text ending in ver , not just version markers. Because of (?<!ver\s), strings like server 10.0.0.1 no longer match (the 4 chars before 10 are ver ), so common log lines lose IP highlighting after this change. This is a functional regression from the previous pattern and affects normal terminal output, not just edge cases.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Addressed in 5839206. The version-prefix exclusion now only applies when ver or version appears as a standalone word before the dotted sequence, so normal text like server 10.0.0.1 highlights again while ver 1.2.3.4 and version 1.2.3.4 remain suppressed.
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5839206bb9
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
domain/models.ts
Outdated
| const STRICT_IPV4_OCTET_PATTERN = '(?:25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)'; | ||
|
|
||
| export const URL_HIGHLIGHT_PATTERN = | ||
| "\\b(?:https?:\\/\\/|www\\.)[^\\s<>\"'`]+(?<![.,;:!?])"; |
There was a problem hiding this comment.
Exclude closing delimiters from URL highlight regex
The new URL pattern still consumes trailing ) and ], so common text like (https://example.com) or [https://example.com] gets highlighted with the closing delimiter included, which is a regression in matching precision. Because this rule is now part of the default global highlights, users will frequently see over-highlighting in markdown/log contexts unless the character class or trailing exclusion also handles bracket/parenthesis delimiters.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Addressed in c521477. The URL pattern now backtracks past trailing ), ], }, and sentence punctuation, so surrounding delimiters like (https://example.com) and [https://example.com] no longer get highlighted while plain URLs and query strings still match.
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c521477325
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
domain/models.ts
Outdated
| const STRICT_IPV4_OCTET_PATTERN = '(?:25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)'; | ||
|
|
||
| export const URL_HIGHLIGHT_PATTERN = | ||
| "\\b(?:https?:\\/\\/|www\\.)[^\\s<>\"'`]+(?<![.,;:!?\\])}])"; |
There was a problem hiding this comment.
Keep closing bracket in IPv6 URL highlights
The new URL regex always strips a trailing ] via the negative lookbehind set, so valid bracketed IPv6 URLs are highlighted incompletely (for example, http://[2001:db8::1] is matched as http://[2001:db8::1). In terminal output that includes IPv6 endpoints, this reduces highlight accuracy and can break downstream link handling because the rendered span is no longer the real URL.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Addressed in b11182a. I split the URL matcher so bracketed IPv6 hosts are handled by a dedicated branch that preserves the host-closing ], while the general URL branch still trims wrapper delimiters like trailing ) and ] in markdown/log contexts.
|
@codex review |
|
Codex Review: Didn't find any major issues. Can't wait for the next one! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
Summary
Validation
Partially addresses #294