Skip to content

fix: improve URL highlighting precision#302

Merged
binaricat merged 5 commits intomainfrom
feat/url-highlight-ip-precision
Mar 9, 2026
Merged

fix: improve URL highlighting precision#302
binaricat merged 5 commits intomainfrom
feat/url-highlight-ip-precision

Conversation

@binaricat
Copy link
Owner

Summary

  • expand the default address highlight rule to cover URLs as well as IP and MAC values
  • tighten IPv4 matching to avoid common false positives such as version strings and invalid dotted numbers
  • normalize stored terminal settings so existing users automatically receive the updated default highlight patterns

Validation

  • npm run lint
  • npm run build
  • manual regex spot-checks for URL matching and version-string false positives

Partially addresses #294

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a52198a1b3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

domain/models.ts Outdated
export const URL_HIGHLIGHT_PATTERN =
"\\b(?:https?:\\/\\/|www\\.)[^\\s<>\"'`]+(?<![.,;:!?])";
export const IPV4_HIGHLIGHT_PATTERN =
`(?<![\\d.])(?<![Vv])(?<!ver\\s)(?<!version\\s)(?<!Version\\s)(?:${STRICT_IPV4_OCTET_PATTERN}\\.){3}${STRICT_IPV4_OCTET_PATTERN}(?![\\d.])`;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Restore boundary checks around IPv4 regex

The new IPV4_HIGHLIGHT_PATTERN dropped the old \b...\b word-boundary protection and now only excludes adjacent digits/dots, so it will match IP-like substrings inside alphanumeric tokens (for example, abc192.168.0.1def now highlights 192.168.0.1). This is a precision regression introduced by this change and contradicts the goal of reducing false positives in logs/output text.

Useful? React with 👍 / 👎.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 8462eee. I tightened the IPv4 pattern to reject adjacent word characters and dots on both sides, so it no longer matches inside alphanumeric tokens or partially inside longer dotted strings while still preserving the version-string exclusions.

@binaricat
Copy link
Owner Author

@codex review

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8462eee71a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

domain/models.ts Outdated
export const URL_HIGHLIGHT_PATTERN =
"\\b(?:https?:\\/\\/|www\\.)[^\\s<>\"'`]+(?<![.,;:!?])";
export const IPV4_HIGHLIGHT_PATTERN =
`(?<![\\w.])(?<![Vv])(?<!ver\\s)(?<!version\\s)(?:${STRICT_IPV4_OCTET_PATTERN}\\.){3}${STRICT_IPV4_OCTET_PATTERN}(?![\\w.])`;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Restrict version-prefix guard to standalone words

The new IPv4 regex suppresses valid addresses whenever they are preceded by any text ending in ver , not just version markers. Because of (?<!ver\s), strings like server 10.0.0.1 no longer match (the 4 chars before 10 are ver ), so common log lines lose IP highlighting after this change. This is a functional regression from the previous pattern and affects normal terminal output, not just edge cases.

Useful? React with 👍 / 👎.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 5839206. The version-prefix exclusion now only applies when ver or version appears as a standalone word before the dotted sequence, so normal text like server 10.0.0.1 highlights again while ver 1.2.3.4 and version 1.2.3.4 remain suppressed.

@binaricat
Copy link
Owner Author

@codex review

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5839206bb9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

domain/models.ts Outdated
const STRICT_IPV4_OCTET_PATTERN = '(?:25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)';

export const URL_HIGHLIGHT_PATTERN =
"\\b(?:https?:\\/\\/|www\\.)[^\\s<>\"'`]+(?<![.,;:!?])";

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Exclude closing delimiters from URL highlight regex

The new URL pattern still consumes trailing ) and ], so common text like (https://example.com) or [https://example.com] gets highlighted with the closing delimiter included, which is a regression in matching precision. Because this rule is now part of the default global highlights, users will frequently see over-highlighting in markdown/log contexts unless the character class or trailing exclusion also handles bracket/parenthesis delimiters.

Useful? React with 👍 / 👎.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in c521477. The URL pattern now backtracks past trailing ), ], }, and sentence punctuation, so surrounding delimiters like (https://example.com) and [https://example.com] no longer get highlighted while plain URLs and query strings still match.

@binaricat
Copy link
Owner Author

@codex review

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c521477325

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

domain/models.ts Outdated
const STRICT_IPV4_OCTET_PATTERN = '(?:25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)';

export const URL_HIGHLIGHT_PATTERN =
"\\b(?:https?:\\/\\/|www\\.)[^\\s<>\"'`]+(?<![.,;:!?\\])}])";

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep closing bracket in IPv6 URL highlights

The new URL regex always strips a trailing ] via the negative lookbehind set, so valid bracketed IPv6 URLs are highlighted incompletely (for example, http://[2001:db8::1] is matched as http://[2001:db8::1). In terminal output that includes IPv6 endpoints, this reduces highlight accuracy and can break downstream link handling because the rendered span is no longer the real URL.

Useful? React with 👍 / 👎.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in b11182a. I split the URL matcher so bracketed IPv6 hosts are handled by a dedicated branch that preserves the host-closing ], while the general URL branch still trims wrapper delimiters like trailing ) and ] in markdown/log contexts.

@binaricat
Copy link
Owner Author

@codex review

@chatgpt-codex-connector
Copy link

Codex Review: Didn't find any major issues. Can't wait for the next one!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@binaricat binaricat merged commit 1286975 into main Mar 9, 2026
@binaricat binaricat mentioned this pull request Mar 9, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant