Hyperlinks with double underscores don't render correctly #625
Comments
This is one of the many reasons I'm researching PEG-based grammars for parsing inline markup. The problem we're facing here is that there's a limit to how much context a regular expression can see...and it messes up when the markup looks perfectly valid from a pattern-matching perspective. Most of the lightweight markup languages out there have this problem...until they make the move to a grammar-based parser. I'm really looking forward to this improvement in Asciidoctor because it's going to make inline formatting much more predictable...and we can get access to it in the AST.

Fortunately, AsciiDoc provides many different ways to control substitution to work around issues like this one. I'll present the solutions in the order that I recommend using them (as not all solutions are good practice).

Solution A ::

:link-with-underscores: http://www.migrations.fr/la_guerre__de__sept__ans.htm

This URL has repeating underscores {link-with-underscores} but AsciiDoc won't process them.

This works because quotes are substituted before attributes, so the URL remains "hidden" while the text in the line is being formatted (strong, emphasis, monospace, etc).

Solution B ::

Another way to solve formatting glitches is to explicitly specify the formatting you want to have applied to a span of text using the inline pass macro. If you want to display a URL, and have it be completely preserved, you can put it inside a pass macro and enable only macros (which is what substitutes links).

This URL has repeating underscores pass:macros[http://www.migrations.fr/la_guerre__de__sept__ans.htm] but AsciiDoc won't process them.

This works because the pass macro removes the content from the line of text while substitutions are performed, applies the explicit substitutions to that text while it's on the sidelines, then restores it to the original location.

Solution C and D ::

The final two solutions I'll mention are related, but I don't recommend using them. It's possible to escape individual characters or a range of characters inside the URL. You can isolate the part of the URL causing problems using the double dollar escape:

This URL has repeating underscores http://www.migrations.fr/$$la_guerre__de__sept__ans$$.htm but AsciiDoc won't process them.

Like the pass macro, it pulls the text out during substitution, but it doesn't offer a way to apply substitutions to that text. You tend to use double dollar when you want to prevent the processor from detecting a URL, like:

This URL won't be recognized by the processor $$http://www.migrations.fr/la_guerre__de__sept__ans.htm$$

It's also possible to escape the underscores:

This URL won't be recognized by the processor http://www.migrations.fr/la\_guerre__de__sept__ans.htm

However, escaping is not consistent between AsciiDoc and Asciidoctor (mostly because Ruby 1.8.7, which we still support, doesn't have look-behind capabilities in the regex engine).
I think you'll be the most happy with Solution A. It's best practice to pull all your links into attributes anyway, and by doing so you get the bonus that they aren't mangled.
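To make Solution A concrete, here is a minimal sketch of a whole document that defines the URL once in the header and then reuses it in the body. The attribute name and URL come from the example above; the document title and the link text are placeholders made up for this illustration, and the second paragraph assumes the processor's default substitution order (attributes resolved before macros), so treat it as a starting point rather than a guaranteed recipe.

```asciidoc
= Links with Underscores
:link-with-underscores: http://www.migrations.fr/la_guerre__de__sept__ans.htm

// Bare reference: the URL is filled in after quotes have already been processed.
This URL has repeating underscores {link-with-underscores} but AsciiDoc won't process them.

// Same attribute with link text attached (the text is a placeholder).
See {link-with-underscores}[this page about the war] for background.
```

Because the URL lives in one place, a change to it only has to be made in the header.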
Thank you so much for your thorough reply. I'm learning a lot just by
I'm intrigued by your last statement about pulling links into attributes.
-- Chuck
That's what we like to hear. We learn from each other, as questions almost
We like to put them under the document title. That's the convention we're
https://gist.github.com/mojavelinux/6519908/raw/1cf543e3078d212de2f7362e45d3e006cfdcca2e/gs-messaging-jms.adoc
Notice how attributes can build on other attributes.
It's also possible to define the links for a section underneath the section
In Asciidoctor 1.5.0 (or later) I'm planning to allow document attributes
...of course, you can always put the links in an include file, then just

= Document Title
Author Name
include::asciidoc-settings[]
include::link-definitions[]

content
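As a rough sketch of what "attributes can build on other attributes" looks like when the definitions sit under the document title, something along these lines should work. The attribute names `site-url` and `seven-years-war` are hypothetical, chosen only for this illustration; the URL is the one from the original question.

```asciidoc
= Document Title
Author Name
:site-url: http://www.migrations.fr
:seven-years-war: {site-url}/la_guerre__de__sept__ans.htm

The full link is assembled from two attributes: {seven-years-war}
```

The same two attribute entries could just as well live in the include file mentioned above; the only requirement assumed here is that `site-url` is defined before the entry that references it.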
There are lots of options, but as with all things, pick the way that works
-Dan
Dan Allen | http://google.com/profiles/dan.j.allen
I went with option A and it worked great! Thanks again!
Excellent!
Thank you very much, @mojavelinux. Your double dollar escape solution also worked for http://www.google.com/~example@gmail.com. For asciidoc, we need to escape it like this:
Great to hear! Note that these techniques are now documented in the user manual. See http://asciidoctor.org/docs/user-manual/#complex-urls.
Thanks, @neontapir and @mojavelinux! Ran into this very issue today with a link that includes underscores, and Solution A works great.
I found while including a hyperlink with double underscores in an AsciiDoc file that they are not rendered correctly. To reproduce, I created a document named link.asc whose sole line is:
When I render it with Asciidoctor using
...the line in question becomes:
Notice that the href attribute is incorrect and the double underscores have been interpreted as italics. I expect the line to render as:
AsciiDoc does not render this line correctly either, by the way. I originally posted this question in the Nabble discussion forum.