-
Notifications
You must be signed in to change notification settings - Fork 862
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add SmartyPants extension as part of Python-Markdown #12
Comments
I'm not completely opposed to this, but what's the benefit of:
over:
The latter works today and requires less typing. Of course, I realize in some more complex situations (like using smartypants with the codehilite extension) things may not work as well. Therefore, I could see a smartypants extension which implemented all the various features as inline patterns inserted into the parser. However, this is not high enough on my priority list to devote the time to. Of course, patches and/or merge requests are always welcome. |
That doesn't work so nicely on the command line, or with other extensions (as you noted). |
Regarding the command line, markdown outputs to stdout, so any decent Another issue with making smartypants an extension is how to make it I'm not saying its not worth doing, just that it will be more work \X/ /-\ `/ |_ /-\ || |
Based on the reasons stated previously, I will not be implementing this. If and when someone else writes the extension, they can do a merge request and I'll reconsider. Until then, I'm closing this. |
Here is such an extension: https://bitbucket.org/jeunice/mdx_smartypants Anyone can use the code directly. I would be happy to put it in whatever format or license would make it most palatable to the Python-Markdown team to include. Smartypants seems like such a natural fit with Markdown. That it's not part of the core is just odd. |
Aside from the named html entities addition, how is this better that Either way, the smartypants lib is still needed. And I'm "Meh" to the named html entities addition. In other words, does this improve the markdown lib enough to justify an additional maintenance load? As I stated previously, I might be interested if smartypants we re-implemented as an extension using the extension API (inlinepatterns etc). If all the extension is doing is providing a wrapper, then why hide the fact in an extension? Oh and it is not part of the "core" because smartypants.pl (the original) is not part of markdown.pl (the original markdown implementation). Of course, it can be an extension (which is not the "core") which even ships with the "core" if there is real value in doing so. I just don't see that value yet. In any event, I've added the extension to the wiki |
It’s better IMO because it yields a simple, highly-useful function right It’s certainly possible to import a couple of separate namedentities and I note that several extensions already bundled with markdown—Meta and HTML Publishers and web designers use enhanced-typography glyphs all the time. Trying to reinvent a mature, well-accepted wheel like smartypants or do
Smartypants out-of-the-box improves markdown along those axes. If it makes jse On Mon, May 7, 2012 at 3:37 PM, Waylan Limberg <
|
I'd prefer smartypants functions to work right out-of-the-box with Python-Markdown, but in lieu of that, I've submitted the package to PyPI (http://pypi.python.org/pypi/mdx_smartypants/) so that users can auto-install it with |
I see some benefits in extension approach:
|
Ah, and another benefit will be ability to use it in Python-Markdown’s own docs. I’m willing to implement and maintain this extension as part of Python-Markdown if @waylan agrees. |
@mitya57 to be clear, the benefits you mention do not apply to the existing implementation @jonathaneunice linked to above. However, an extension that reimplemented smartypants using Python-Markdown's extension API (probably a few different inline patterns) would be acceptable to me. I realize that is what you were referring to, but I just wanted to make it clear that I was questioning the value of @jonathaneunice's implementation, not smartypants in general. So, yeah, I'm all for a smartypants extension if it is done right. I just don't have the time to devote to it. |
WIP version available in my smarty-extension branch (suggestions for a better name?). |
@waylan can you please review my implementation (linked above)? Please don't merge it yet, as there is no documentation and branch is not clean, but let me know if you do/don't like the code :) Things that need to be done:
|
Took a quick glance and generally looks good. A few concerns though. I find is strange that you are monkeypatching the Pattern class rather than using a subclass. In fact, for the dashes and ellipses a single subclass would easily work. And IMO the code would be a little easier to read. Remember that any extensions included with the standard library should be a model of how extensions work. I'd rather see: emDashesPattern = SubstituteTextPattern(r'(?<!-)---(?!-)', mdash) In fact, the SubstituteTextPattern (or whatever better name you come up with) could probably eventually go right in inlinpatterns.py for use by any extension. Regarding the many quote patterns, I wonder if a subclass could handle them as well - rather than a factory function. I'm undecided about that. But I'd like to see what is possible. Curious why you chose to use tables in your tests. I would think horizontal rules would be a more interesting test. Especially some of the more interesting patterns people use (try 3 dashes, space, 2 dashes, space, 3 dashes ...). And that is all supported int he standard library. |
Oh, one more thing, according to the documentation, smartypants is supposed to use HTML entities in the output. Shouldn't we be doing the same? |
Done now.
These are now also
Done.
Please see my previous comment. Actually, I don't see any advantages of not using unicode now (the original smartypants was written back in 2004, probably there were some advantages at that point). |
Updated the branch with configuration, docs and license headers. This is not yet ready for merging because it seems that -<a href="http://example.com">Link</a>\u2019s test</p>
+�\x03\u2019s test</p> The only way to solve this I can come up with is cloning each regexp into two: one with |
@mitya57 I have a few observations: It appears that you forgot to commit the documentation. I see the links to the file, but the smarty.txt file is missing. Also it appears that either your master branch or smarty-extension branch is not up-to-date. If it was, then Github's compare feature would be helpful in seeing a snapshot of all your changes. Not a big deal, but it would be helpful. I reviewed the entities issue. The problem is that the serializer is escaping any html entities. In fact, ElementTree apparently has no support for them. We handle them by storing them in the htmlStash. If we use entities here, that is what we would need to do. The pattern class would probably need to be a subclass of HtmlPattern and should mark the stashed entities as safe so they still work in "safe_mode" ( I think we should stick with entities. Some people use weird encodings in their servers/browsers/file systems, and have little or no understanding of the issues, let alone how to change the settings. This is especially true with English users who only ever work with text using ASCII characters. As ASCII converts to unicode, they don't ever notice a problem. If we start adding in random unicode chars, weird things might happen. I didn't check, maybe all the smartypants chars are ASCII anyway, but HTMLentities serve this purpose just fine. Might as well stick with them. Finally, regarding your failing test; I was going to suggest running your pattern earlier, but it needs to run after "escape" at least, and that uses STX and ETX. Can you give me a simple example of the problem outside of the markdown codebase. I'm not sure I understand how the problem is related to |
@waylan Thanks for the review!
Will commit tomorrow (it's on a different machine).
Master updated.
Thanks for the suggestions, switched to entities and it works!
This seems to not happen outside markdown codebase. It seems that return value of |
@mitya57 I just made some inline comments on commit be77195. Regarding And it just occured to me what your problem most likely is with STX and ETX. Your Pattern class needs to be a subclass of HtmlPattern. Then, before passing any text into the htmlStash, you need to pass the text to Also, as STX and ETX act as start and stop deliminators, make sure your regex isn't matching only part of an existing placeholder. If you break an existing placeholder up, it will never be able to be swapped back out later. As your regexes appear to only make reference to the STX and ETX chars as non-matches, I don't think this is a problem, but I haven't taken the time to actually go over the regular expressions you are using - so I thought it might be worth mentioning. Hope this helps. |
@waylan Thanks for the comments, fixed 3 of 4 and commented on the 4th.
I was not passing output of Most SmartyPants regexps match a quote, a character before it and/or a character after it. It breaks when one of those characters is STX/ETX. |
Sorry that it took too long, I was busy with some university work. Now I've finally managed to fix the STX/ETX issue by using lookbehind experssions. The remaining problem is that quotes before/after inline markup are not recognized properly, i.e. a single quote after a link or emphasized text becomes an opening one instead of a closing one. Any thoughts about that? If @waylan is OK with having that "bug", I'll make a clean branch and propose a formal pull request. |
@mitya57 I suppose the reason for the quote-after-inline-markup bug is a result of the fact that inline patterns provide no information about where the text comes from (tag, text, tail, etc). With that info, you could have different behavior if the text was from the elm.text, or elm.tail etc. This shortcoming of the API has always annoyed me. I'm open to suggestions that won't break existing extensions. |
Actually, the original SmartyPants library suffers from the same issue, so we don't regress here. I've now created pull request #231 for your review. |
I'd rather not have to type "—" when I blog using Pelican, which uses this library. It looks like there might've been a fix above years ago, but has it never been merged? |
@reagle a "smarty" extension was added to Python-Markdown in version 2.4 over four years ago. You just need to tell Markdown to use the extension. |
Okay, thank you. |
This is a feature request. It'd be nice if there was a built-in (batteries included) extension to implement SmartyPants quoting by turning on a simple extension.
I notice that someone is already using SmartyPants with Markdown for Python, though not as an extension:
http://byrneswoder.com/blog/one-secret-to-generating-clean-html-from-text/
The text was updated successfully, but these errors were encountered: