Pandoc not converting mediawiki links with empty query parameter value properly #4068

Closed
outofcontrol opened this issue Nov 14, 2017 · 4 comments

@outofcontrol

Version: pandoc 2.0.1.1 - 2.0.2 (possibly more)

Bug (reproducible): pandoc misparses a URL whose query string contains a parameter with an empty value.

The bug shows up when converting from MediaWiki to GitHub Flavored Markdown (gfm). To reproduce it, paste this sample into the online pandoc converter:

[https://domain.com/script.php?a=1&b=2&c=&d=4 open productname bugs]

The result should be:

[open productname bugs](https://domain.com/script.php?a=1&b=2&c=&d=4)

Instead, when a parameter value is empty (...c=&d...), pandoc cuts the URL short and produces this:

\[<https://domain.com/script.php?a=1&b=2&c>=\&d=4 open productname bugs\]
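
For comparison, here is a minimal sketch of how a general-purpose URL library treats the same link. It uses Python's standard urllib.parse purely as an illustration (it is not part of pandoc), and it suggests that c= is an ordinary parameter with an empty value rather than the end of the URL:

```python
from urllib.parse import urlsplit, parse_qsl

# The sample URL from the report above.
url = "https://domain.com/script.php?a=1&b=2&c=&d=4"

# Split off the query string, then parse it while keeping empty values.
query = urlsplit(url).query
params = parse_qsl(query, keep_blank_values=True)

print(params)
# [('a', '1'), ('b', '2'), ('c', ''), ('d', '4')]
```

Without keep_blank_values=True, parse_qsl drops the empty c parameter, so the flag matters when checking this case.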
@mb21
Collaborator

mb21 commented Nov 14, 2017

There's indeed a bug there, but the problem is the &c= which isn't followed by a value. Minimal example:

$ echo '[http://domain.com?a= open productname bugs]' | pandoc -f mediawiki
<p>[<a href="http://domain.com?a" class="uri">http://domain.com?a</a>= open productname bugs]</p>

@mb21 mb21 changed the title from "Pandoc not converting mediawiki links with empty parameters properly" to "Pandoc not converting mediawiki links with empty query parameter value properly" Nov 14, 2017
@mb21
Collaborator

mb21 commented Nov 14, 2017

This should probably be fixed in the uri parser, which stops at the empty value even outside of MediaWiki links:

> runParser uri () "" "http://domain.com?a=&b=1"
Right ("http://domain.com?a","http://domain.com?a")

@outofcontrol
Author

In the same vein as the above issue, here is another link that breaks pandoc's conversion from MediaWiki to other formats:

[http://domain.com?a=. open productname bugs]

There may be other cases.

@jgm
Owner

jgm commented Nov 15, 2017

The uri parser has a heuristic designed for excluding punctuation that follows bare URIs in text.
This should definitely be refined to give better results for the examples above.
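
To make the trade-off concrete, here is a rough Python sketch of the kind of heuristic described above, together with one possible refinement. Both functions (naive_bare_uri and refined_bare_uri) and the TRAILING set are hypothetical names invented for this illustration. This is not pandoc's code and not necessarily what the eventual fix does. It assumes the input text starts with the URI.

```python
# Hypothetical illustration only; not pandoc's implementation.
TRAILING = ".,;:!?"  # assumed set of sentence punctuation to strip


def naive_bare_uri(text: str) -> str:
    """Keep a punctuation character only if an alphanumeric follows it."""
    idx = text.find("://")
    end = idx + 3 if idx >= 0 else 0  # skip past the scheme separator
    while end < len(text):
        ch = text[end]
        if ch.isspace():
            break
        # Punctuation followed by more punctuation, whitespace, or the
        # end of input is treated as "not part of the URI".
        if not ch.isalnum() and not text[end + 1:end + 2].isalnum():
            break
        end += 1
    return text[:end]


def refined_bare_uri(text: str) -> str:
    """Take everything up to whitespace, then strip trailing punctuation."""
    parts = text.split(maxsplit=1)
    token = parts[0] if parts else ""
    return token.rstrip(TRAILING)


print(naive_bare_uri("http://domain.com?a=&b=1"))
# http://domain.com?a
print(refined_bare_uri("http://domain.com?a=&b=1"))
# http://domain.com?a=&b=1
print(refined_bare_uri("http://domain.com?a=. open productname bugs"))
# http://domain.com?a=
```

The naive version reproduces the truncation reported in this thread because = followed by & looks like trailing punctuation. The refined version only trims characters from the assumed TRAILING set at the very end of the token, so an empty query value survives.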

@jgm jgm closed this as completed in 508aab0 Nov 15, 2017