✨ Allow plugins to register inline terminator characters #391
Agent-Logs-Url: https://github.com/executablebooks/markdown-it-py/sessions/e9a49254-6b3a-4ecc-9b57-84f6df4e6ccd Co-authored-by: chrisjsewell <2997570+chrisjsewell@users.noreply.github.com>
    for name, rule2 in _rules2:
        self.ruler2.push(name, rule2)
    # Characters that stop the text rule, allowing other inline rules to fire.
    self._terminator_chars: set[str] = set(_DEFAULT_TERMINATORS)
Can we remove this performance degradation in the non-extended path by making the terminator chars additive to the defaults? Then we don't need to build this set every time.
Done in 6100aeb. The default regex is now compiled once at module level as _default_terminator_re and assigned directly in __init__ — no set() copy, no per-instance rebuild. __init__ initialises _extra_terminator_chars as an empty set, and add_terminator_char only rebuilds the regex when a character genuinely outside _DEFAULT_TERMINATORS is added.
Agent-Logs-Url: https://github.com/executablebooks/markdown-it-py/sessions/913c6f4f-7b8b-496d-9626-f0e5b7c8bc0a Co-authored-by: chrisjsewell <2997570+chrisjsewell@users.noreply.github.com>
    # Pre-compiled regex for the default terminator set. Shared across all ParserInline
    # instances that have not had extra chars added, so __init__ pays no allocation cost.
    _default_terminator_re: re.Pattern[str] = re.compile(
Make this a cached function, as before, so that we can lazily compute it once.
Done in d0d8d46. _default_terminator_re is now a @functools.cache-decorated function — the regex is compiled at most once (on the first ParserInline instantiation) and the cached result is shared across all unextended instances.
Agent-Logs-Url: https://github.com/executablebooks/markdown-it-py/sessions/04e84ce1-b921-4475-9f26-c0b296e29e33 Co-authored-by: chrisjsewell <2997570+chrisjsewell@users.noreply.github.com>
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## master #391 +/- ##
=======================================
Coverage 95.80% 95.81%
=======================================
Files 64 64
Lines 3457 3463 +6
=======================================
+ Hits 3312 3318 +6
Misses 145 145
@copilot fix pre-commit
Agent-Logs-Url: https://github.com/executablebooks/markdown-it-py/sessions/1c71001c-d7bd-4b35-8682-9c0afb71b1a9 Co-authored-by: chrisjsewell <2997570+chrisjsewell@users.noreply.github.com>
The inline text rule used a hardcoded, unexpandable set of terminator characters, forcing plugins that need to trigger on non-terminator characters (e.g. "w" for GFM "www." autolinks) to resort to core-rule post-processing workarounds.

Changes

- parser_inline.py: Moves the terminator set onto ParserInline as _terminator_chars (a set[str] seeded from _DEFAULT_TERMINATORS) with a pre-compiled terminator_re: re.Pattern[str] attribute. Exposes add_terminator_char(ch) to extend the set; the regex is rebuilt eagerly only when a genuinely new character is added, keeping zero per-call overhead in the hot path.
- rules_inline/text.py: Drops the module-level _TerminatorChars set and @functools.cache-decorated factory. The text rule now reads state.md.inline.terminator_re directly.
- docs/contributing.md: Updates the "Why is my inline rule not executed?" FAQ to document the new API.

Usage

Fully backward-compatible: the default terminator set is unchanged.
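
To illustrate why registering a character matters, the sketch below mimics how a text rule might consume plain text up to the next terminator. It is a simplified stand-in, not the library's implementation: build_re and text_run_end are hypothetical helpers, and DEFAULTS is an illustrative subset of terminators chosen for the example.

```python
from __future__ import annotations

import re

# Illustrative subset of terminators (assumption, not the library's full list).
DEFAULTS = "\n*_`[]\\<&"


def build_re(chars: str) -> re.Pattern[str]:
    """Build a character-class pattern matching any one of the given chars."""
    return re.compile("[" + re.escape(chars) + "]")


def text_run_end(src: str, pos: int, terminator_re: re.Pattern[str]) -> int:
    """Return the index where a text rule would stop consuming plain text."""
    match = terminator_re.search(src, pos)
    return match.start() if match else len(src)


src = "visit www.example.com"

# With the defaults, nothing in src is a terminator, so the whole string is
# swallowed as one text run and no other inline rule is tried at "www".
default_end = text_run_end(src, 0, build_re(DEFAULTS))

# With "w" registered, the text run stops right before "www", giving an
# autolink-style rule the chance to run at that position.
extended_end = text_run_end(src, 0, build_re(DEFAULTS + "w"))
```

The trade-off is that every "w" in the input now interrupts the text run, which is presumably why the character is opt-in rather than part of the defaults.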