Pandoc drops “unknown” HTML elements when converting to markdown #1756

Closed
dullroar opened this issue Nov 13, 2014 · 2 comments

@dullroar

Consider the following simple HTML:

<!DOCTYPE html>
<body>
<p>Test
  <object height="355" width="425">
    <param name="movie" value="http://www.youtube.com/v/DKk9rv2hUfA&amp;rel=1">
    <param name="wmode" value="transparent">
    <embed height="355" src="http://www.youtube.com/v/DKk9rv2hUfA&amp;rel=1" type="application/x-shockwave-flash" width="425">
  </object>
</p>
</body>

I want to convert that to markdown and have the elements that have no markdown equivalent (object, etc.) passed through unchanged as HTML. However, when I run it through pandoc (v1.13.1 on Windows) with the following command line:

pandoc --from=html --to=markdown --output=C:\Temp\test.md C:\Temp\test.html

...the only output I get in test.md is:

Test

Note: I have already seen an earlier question and answer on this topic, but when I try --parse-raw, pandoc simply passes all of the HTML through as raw HTML, which is not what I want. In the StackExchange thread I started about this, I was told that --parse-raw is indeed the right option and should work, but that the embed element seems to trigger a bug that causes everything to be carried over as raw HTML. It was suggested that I post a bug report, so here I am.

@jgm
Owner

jgm commented Nov 17, 2014

The issue is that pandoc is treating the "embed" element as block-level, so it interrupts the paragraph. This can be fixed easily.
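
To illustrate the diagnosis above, here is a toy sketch of how classifying a tag as block-level can cut a paragraph short. It is not pandoc's parser (pandoc is written in Haskell), and the names `split_paragraph` and `block_tags` are hypothetical, chosen only for this example. It assumes a paragraph is a flat list of text and tag tokens.

```python
# Toy model only: this is NOT pandoc's parser (pandoc is written in Haskell).
# split_paragraph and block_tags are hypothetical names for illustration.

def split_paragraph(tokens, block_tags):
    """Group tokens into ("para", [...]) and ("block", name) chunks.

    A token is either plain text or a tag name in angle brackets,
    e.g. "<embed>". A tag listed in block_tags ends the current paragraph.
    """
    chunks = []
    inline_run = []
    for token in tokens:
        is_tag = token.startswith("<") and token.endswith(">")
        name = token[1:-1] if is_tag else None
        if is_tag and name in block_tags:
            # A block-level element closes whatever paragraph is open.
            if inline_run:
                chunks.append(("para", inline_run))
                inline_run = []
            chunks.append(("block", name))
        else:
            inline_run.append(token)
    if inline_run:
        chunks.append(("para", inline_run))
    return chunks


tokens = ["Test", "<object>", "<embed>", "</object>"]

# With "embed" classified as block-level, the paragraph is cut in two.
interrupted = split_paragraph(tokens, block_tags={"embed"})

# With "embed" treated as inline, everything stays in one paragraph.
intact = split_paragraph(tokens, block_tags=set())
```

In the first call the opening and closing `object` tags end up in different chunks, which is roughly the kind of breakage that treating `embed` as block-level can cause. In the second call the whole run stays together as one paragraph.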

@jgm jgm closed this as completed in 4aadcd5 Nov 17, 2014
@dullroar
Author

Excellent! Thanks for the quick response, especially given your "day job" teaching plus the CommonMark work, etc.
