Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flexible inline #629

Merged
merged 29 commits into from Jan 18, 2018
Merged

Flexible inline #629

merged 29 commits into from Jan 18, 2018

Conversation

facelessuser
Copy link
Collaborator

Add new InlineProcessor class that handles inline processing much better and allows for more flexibility. This adds new InlineProcessors that no longer utilize unnecessary pretext and posttext captures. New class can accept the buffer that is being worked on and manually process the text without regex and return new replacement bounds. This helps us to handle links in a better way and handle nested brackets and logic that is too much for regular expression. The refactor also allows image links to have links/paths with spaces like links. Ref #551, #613, #590, #161.

Create a new inline Pattern2 class that uses a more effecient pattern.  Derive all inline classes from Pattern2, but leave the legacy Pattern, SimpleTextPattern, SimpleTagPattern, SubstituteTagPattern, and DoubleTagPattern for backwards compatibility.
Use the more appropriate `InlineProcessor` name for new classes opposed to the legacy `Pattern2`.
Allow additional analysis by passing in data buffer into handleMatch.  This allows us to give a simple regex for a start token, and then we can tokenize the content if we like inside handleMatch.  If we don't find what we are looking for, we can reject the match as if we found nothing. You must return your element and the bounds of the handled characters.
Initial handling of recursive round brackets in link's href.
Clean up some variable naming, and perfrom comon actions on returned text before returning it.  Also remove code that turns off spaces in URL to allow images to behave like links with spaces in its URL.
For consistency, rename Pattern classes to InlineProcessor.
Handle titles in a more sane way
Remove unused variable and update docstring describing functionality in better detail.
@facelessuser
Copy link
Collaborator Author

Figured I'd get some discussion going on this and start moving this forward.

@facelessuser
Copy link
Collaborator Author

Seems pypy is failing for the recently discussed issue Future warnings with elementtree:

FutureWarning: The behavior of this method will change in future versions.  Use specific 'len(elem)' or 'elem is not None' test instead.

While this is a separate issue, I might take care of it here as well.

@waylan
Copy link
Member

waylan commented Jan 13, 2018

Seems pypy is failing for the recently discussed issue Future warnings with elementtree:

Yeah, we keep fixing these, and then every time we do more work on ElementTree related code, we introduce them. So I made sure a test fails to stop us from reintroducing the warnings. I consider these warnings a reason to not accept a PR, so yeah, they should be addressed (which I see you did).

Copy link
Member

@waylan waylan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still need to take a closer look at what some of the code is actually doing, but here are a few general organization issues I spotted. Overall it looks good. Thanks for doing the work on this.

tests/misc/image.html Outdated Show resolved Hide resolved
tests/misc/image.txt Outdated Show resolved Hide resolved
tests/test_syntax/inline/test_advanced_images.py Outdated Show resolved Hide resolved
tests/test_syntax/inline/test_advanced_links.py Outdated Show resolved Hide resolved
@waylan
Copy link
Member

waylan commented Jan 13, 2018

Oh and while it may seem obvious, I should note that the docs need to be updated for these changes. Specially, the Extension API docs.

@facelessuser
Copy link
Collaborator Author

facelessuser commented Jan 13, 2018

Makes sense. It's a pretty big change. I figured it'd take a little bit to work through the review.

@facelessuser
Copy link
Collaborator Author

Yeah, I didn't make changes to the docs yet only because I wanted to make sure we were all in agreement on the new processors behavior. Figured I'd wait until we were good with the code.

Merge legacy test misc/image.txt with new (renamed) inline/images.py.
Describe the new InlineProcessor and its usage in the documents.
@facelessuser
Copy link
Collaborator Author

facelessuser commented Jan 13, 2018

Went ahead and addressed the test changes suggested, and provided some API documentation. I'll address spelling issues if Travis notifies of any.

Copy link
Member

@waylan waylan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the documentation. I noted a couple minor issues. Otherwise looks good

docs/extensions/api.md Outdated Show resolved Hide resolved
docs/extensions/api.md Outdated Show resolved Hide resolved
Fix unnecessary plural usage of start, and rephrase processor return sentences.
@waylan
Copy link
Member

waylan commented Jan 17, 2018

@facelessuser, if you rebase this against master (resolving merge conflicts), I think we can accept this.

@facelessuser
Copy link
Collaborator Author

Cool, I'll pull the latest changes into this pull request and resolve any conflicts. I've pulled in most of the changes already, so imagine it should go fairly smooth.

@facelessuser
Copy link
Collaborator Author

Branch is synced up.

@waylan waylan merged commit d18c3d0 into Python-Markdown:master Jan 18, 2018
AdamWill added a commit to AdamWill/nikola that referenced this pull request Jun 21, 2022
…a#3630)

The gist-rst pattern crashes Python 3.11, because its pattern
tries to set the multiline flag - it starts with `(?m)` - but
Markdown's `Pattern` class `__init__`, which is called by our
`GistPattern.__init__`, 'wraps' the pattern with additional
capture groups before compiling it. This causes the multiline
flag not to be at the start of the 'wrapped' pattern. From
Python 3.6 to Python 3.10 flags not being at the start of the
pattern was deprecated and triggered a warning when trying to
compile it; in Python 3.11 it's an error.

Markdown 3.0.0 added a preferred `InlineProcessor` class:
Python-Markdown/markdown#629
which does not do this wrapping. It requires the subclass to
implement `handleMatch`, which we already do. Our `handleMatch`
uses named capture groups, so the change in the number of
capture groups in the compiled expression shouldn't matter. So
simply switching to inheriting from `InlineProcessor` instead
of `Pattern` should solve the problem.

We already required Markdown 3.0.0, so this does not mean we
require a newer Markdown than before.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
AdamWill added a commit to AdamWill/nikola that referenced this pull request Jun 21, 2022
…a#3630)

The gist-rst pattern crashes Python 3.11, because its pattern
tries to set the multiline flag - it starts with `(?m)` - but
Markdown's `Pattern` class `__init__`, which is called by our
`GistPattern.__init__`, 'wraps' the pattern with additional
capture groups before compiling it. This causes the multiline
flag not to be at the start of the 'wrapped' pattern. From
Python 3.6 to Python 3.10 flags not being at the start of the
pattern was deprecated and triggered a warning when trying to
compile it; in Python 3.11 it's an error.

Markdown 3.0.0 added a preferred `InlineProcessor` class:
Python-Markdown/markdown#629
which does not do this wrapping. It requires the subclass to
implement `handleMatch`, which we already do. Our `handleMatch`
uses named capture groups, so the change in the number of
capture groups in the compiled expression shouldn't matter. So
simply switching to inheriting from `InlineProcessor` instead
of `Pattern` (and slightly adjusting the signature and return
values to match what's expected of `InlineProcessor` subclasses)
should solve the problem.

We already required Markdown 3.0.0, so this does not mean we
require a newer Markdown than before.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants