-
Notifications
You must be signed in to change notification settings - Fork 871
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Flexible inline #629
Flexible inline #629
Conversation
Create a new inline Pattern2 class that uses a more effecient pattern. Derive all inline classes from Pattern2, but leave the legacy Pattern, SimpleTextPattern, SimpleTagPattern, SubstituteTagPattern, and DoubleTagPattern for backwards compatibility.
Use the more appropriate `InlineProcessor` name for new classes opposed to the legacy `Pattern2`.
Allow additional analysis by passing in data buffer into handleMatch. This allows us to give a simple regex for a start token, and then we can tokenize the content if we like inside handleMatch. If we don't find what we are looking for, we can reject the match as if we found nothing. You must return your element and the bounds of the handled characters.
Initial handling of recursive round brackets in link's href.
Clean up some variable naming, and perfrom comon actions on returned text before returning it. Also remove code that turns off spaces in URL to allow images to behave like links with spaces in its URL.
For consistency, rename Pattern classes to InlineProcessor.
Handle titles in a more sane way
Remove unused variable and update docstring describing functionality in better detail.
Figured I'd get some discussion going on this and start moving this forward. |
Seems pypy is failing for the recently discussed issue Future warnings with elementtree:
While this is a separate issue, I might take care of it here as well. |
Yeah, we keep fixing these, and then every time we do more work on ElementTree related code, we introduce them. So I made sure a test fails to stop us from reintroducing the warnings. I consider these warnings a reason to not accept a PR, so yeah, they should be addressed (which I see you did). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still need to take a closer look at what some of the code is actually doing, but here are a few general organization issues I spotted. Overall it looks good. Thanks for doing the work on this.
Oh and while it may seem obvious, I should note that the docs need to be updated for these changes. Specially, the Extension API docs. |
Makes sense. It's a pretty big change. I figured it'd take a little bit to work through the review. |
Yeah, I didn't make changes to the docs yet only because I wanted to make sure we were all in agreement on the new processors behavior. Figured I'd wait until we were good with the code. |
Merge legacy test misc/image.txt with new (renamed) inline/images.py.
Describe the new InlineProcessor and its usage in the documents.
Went ahead and addressed the test changes suggested, and provided some API documentation. I'll address spelling issues if Travis notifies of any. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the documentation. I noted a couple minor issues. Otherwise looks good
Fix unnecessary plural usage of start, and rephrase processor return sentences.
@facelessuser, if you rebase this against master (resolving merge conflicts), I think we can accept this. |
Cool, I'll pull the latest changes into this pull request and resolve any conflicts. I've pulled in most of the changes already, so imagine it should go fairly smooth. |
Branch is synced up. |
…a#3630) The gist-rst pattern crashes Python 3.11, because its pattern tries to set the multiline flag - it starts with `(?m)` - but Markdown's `Pattern` class `__init__`, which is called by our `GistPattern.__init__`, 'wraps' the pattern with additional capture groups before compiling it. This causes the multiline flag not to be at the start of the 'wrapped' pattern. From Python 3.6 to Python 3.10 flags not being at the start of the pattern was deprecated and triggered a warning when trying to compile it; in Python 3.11 it's an error. Markdown 3.0.0 added a preferred `InlineProcessor` class: Python-Markdown/markdown#629 which does not do this wrapping. It requires the subclass to implement `handleMatch`, which we already do. Our `handleMatch` uses named capture groups, so the change in the number of capture groups in the compiled expression shouldn't matter. So simply switching to inheriting from `InlineProcessor` instead of `Pattern` should solve the problem. We already required Markdown 3.0.0, so this does not mean we require a newer Markdown than before. Signed-off-by: Adam Williamson <awilliam@redhat.com>
…a#3630) The gist-rst pattern crashes Python 3.11, because its pattern tries to set the multiline flag - it starts with `(?m)` - but Markdown's `Pattern` class `__init__`, which is called by our `GistPattern.__init__`, 'wraps' the pattern with additional capture groups before compiling it. This causes the multiline flag not to be at the start of the 'wrapped' pattern. From Python 3.6 to Python 3.10 flags not being at the start of the pattern was deprecated and triggered a warning when trying to compile it; in Python 3.11 it's an error. Markdown 3.0.0 added a preferred `InlineProcessor` class: Python-Markdown/markdown#629 which does not do this wrapping. It requires the subclass to implement `handleMatch`, which we already do. Our `handleMatch` uses named capture groups, so the change in the number of capture groups in the compiled expression shouldn't matter. So simply switching to inheriting from `InlineProcessor` instead of `Pattern` (and slightly adjusting the signature and return values to match what's expected of `InlineProcessor` subclasses) should solve the problem. We already required Markdown 3.0.0, so this does not mean we require a newer Markdown than before. Signed-off-by: Adam Williamson <awilliam@redhat.com>
Add new InlineProcessor class that handles inline processing much better and allows for more flexibility. This adds new InlineProcessors that no longer utilize unnecessary pretext and posttext captures. New class can accept the buffer that is being worked on and manually process the text without regex and return new replacement bounds. This helps us to handle links in a better way and handle nested brackets and logic that is too much for regular expression. The refactor also allows image links to have links/paths with spaces like links. Ref #551, #613, #590, #161.