Pandoc Self-Contained HTML files with embedded CSS links #2121

Closed
kmader opened this issue Apr 27, 2015 · 11 comments

kmader commented Apr 27, 2015

I have an RMarkdown report that is exported to HTML (via Pandoc). A few lines in it add the relevant CSS files to the report; specifically, the line

<link rel="stylesheet" href="leaflet.css" />

causes pandoc to produce the error message

pandoc: Could not fetch html-demo/#default#VML

This happens because, to make the output self-contained, pandoc tries to resolve this rule in the CSS file

.lvml {
    behavior: url(#default#VML);
    display: inline-block;
    position: absolute;
    }

which does not refer to a physical file.

Pandoc should ignore such url() values in CSS files (the issue was initially posted in rmarkdown: rstudio/rmarkdown#418).
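
To make the request concrete, here is a minimal sketch of a predicate that decides whether a url() value names something that could be fetched and embedded. The name isEmbeddable is hypothetical and not part of pandoc. Treating values that start with # (such as #default#VML) and values that are already data: URIs as "leave alone" is an assumption made for illustration, not necessarily how pandoc handles them.

```haskell
import Data.Char (isSpace, toLower)
import Data.List (dropWhileEnd, isPrefixOf)

-- | Hypothetical helper (not part of pandoc): decide whether a CSS
-- url(...) value names an external resource that could be embedded.
-- Assumption: fragment-only values and data: URIs are left untouched.
isEmbeddable :: String -> Bool
isEmbeddable raw =
  case trim raw of
    ""      -> False                                   -- nothing to fetch
    ('#':_) -> False                                   -- e.g. "#default#VML"
    u       -> not ("data:" `isPrefixOf` map toLower u) -- already inlined
  where
    -- Strip surrounding whitespace before inspecting the value.
    trim = dropWhileEnd isSpace . dropWhile isSpace
```

With this sketch, isEmbeddable "leaflet.css" holds, while isEmbeddable "#default#VML" does not.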

nkalvi commented Apr 27, 2015

@kmader Would you please post a complete example along with your Pandoc's version and the complete command?

Owner

jgm commented Apr 27, 2015

@nkalvi - although this is usually important to have, I think in this
case it is already clear enough what the problem is and how to solve it.

+++ nkalvi [Apr 27 15 06:40 ]:

@kmader Would you please post a complete example along with your Pandoc's version and the complete command?

nkalvi commented Apr 27, 2015

Thanks @jgm - I realized it after a few minutes.

Author

kmader commented Apr 27, 2015

In my case (which is probably not an entirely proper use of the pandoc tools), the
<link rel="stylesheet" href="leaflet.css" type="text/css" /> element is inside the md file rather than being specified with pandoc's -c argument.
This is because it is not a stylesheet for the entire document but one included for a component or widget within the document.

@mpickering
Collaborator

Surely you mean to use the --self-contained option?

Author

kmader commented Apr 27, 2015

Thanks @nkalvi @mpickering, here is a better working example:

  • pandocTest.md
# Hi
<link rel="stylesheet" href="pandocTest.css" />
# Bye
  • pandocTest.css
.lvml {
    behavior: url(#default#VML);
    display: inline-block;
    position: absolute;
    }

Executed command

pandoc pandocTest.md --self-contained -o pandocTest.html

Error message: pandoc: Could not find data file ./

Owner

jgm commented Apr 27, 2015 via email

nkalvi commented Apr 27, 2015

Sorry for not noticing it involved self-contained; Matt, thanks for pointing it out.

@jgm @mpickering would this be an acceptable fix?
I feel that checking for a URI reference, instead of just a leading #, may be more future-proof.

In Text.Pandoc.SelfContained

import Network.URI (isURI, escapeURIString, URI(..), parseURI, isURIReference)

cssURLs :: MediaBag -> Maybe String -> FilePath -> ByteString
        -> IO ByteString
cssURLs media sourceURL d orig =
  case B.breakSubstring "url(" orig of
       (x,y) | B.null y  -> return orig
             | otherwise -> do
                  let (u,v) = B.breakSubstring ")" $ B.drop 4 y
                  let url = toString
                          $ case B.take 1 u of
                                 "\"" -> B.takeWhile (/='"') $ B.drop 1 u
                                 "'"  -> B.takeWhile (/='\'') $ B.drop 1 u
                                 _    -> u
                  let url' = if isURI url
                                then url
                                else d </> url
                  rest <- cssURLs media sourceURL d v
                  if not (isURIReference url') -- Ignore behavior: url(#default#VML) etc.
                    then
                      return $ x `B.append` "url(" `B.append` (fromString url) `B.append` rest
                    else do
                      (raw, mime) <- getRaw media sourceURL "" url'
                      let enc = fromString $ makeDataURI mime raw
                      return $ x `B.append` "url(" `B.append` enc `B.append` rest
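
For readers who want to try the idea without pandoc's internals, the traversal above can be sketched as a pure function over String. Everything here is an assumption for illustration: rewriteCssUrls is a hypothetical name, the rewrite argument stands in for the getRaw and makeDataURI steps, and the check used is the simple leading-# test rather than isURIReference.

```haskell
import Data.List (isPrefixOf)

-- | Pure sketch of the url(...) traversal. The rewrite function is a
-- stand-in for fetching and encoding a resource (hypothetical; the code
-- in the comment above performs IO instead).
rewriteCssUrls :: (String -> String) -> String -> String
rewriteCssUrls rewrite = go
  where
    go [] = []
    go s@(c:cs)
      | "url(" `isPrefixOf` s =
          let (inner, rest) = break (== ')') (drop 4 s)
              url           = unquote inner
              new | "#" `isPrefixOf` url = inner        -- keep as written
                  | otherwise            = rewrite url  -- embed the resource
          in "url(" ++ new ++ go rest
      | otherwise = c : go cs

    -- Drop one level of surrounding single or double quotes, if present.
    unquote ('"':xs)  = takeWhile (/= '"') xs
    unquote ('\'':xs) = takeWhile (/= '\'') xs
    unquote xs        = xs
```

Applied to the rule from pandocTest.css, this leaves url(#default#VML) unchanged, while other references are passed to rewrite.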

nkalvi commented Apr 27, 2015

@kmader You may already be aware of the --include-in-header option. I'm not sure whether it would be a feasible temporary workaround in your case (to include portions that result in error).

`-H` *FILE*, `--include-in-header=`*FILE*

:   Include contents of *FILE*, verbatim, at the end of the header.
    This can be used, for example, to include special
    CSS or javascript in HTML documents.  This option can be used
    repeatedly to include multiple files in the header.  They will be
    included in the order specified.  Implies `--standalone`.

Owner

jgm commented May 2, 2015

@nkalvi, isURIReference doesn't test for what we need to test for. Haddocks say:
"Test if string contains a valid URI reference (an absolute or relative URI with optional fragment identifier)."

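The point can be checked with a small experiment, sketched below under the assumption that the network-uri package is installed. isURIReference is the real function from Network.URI; the sample list and the names samples and report are illustrative only. The function answers whether a string is syntactically a URI reference, which is a different question from whether it names a resource that can be fetched.

```haskell
import Network.URI (isURIReference)

-- Sample url(...) values; the list is illustrative, not taken from pandoc.
samples :: [String]
samples =
  [ "leaflet.css"   -- relative path to a stylesheet
  , "#foo"          -- fragment only: syntactically valid, but no file to fetch
  , "#default#VML"  -- the value from this issue
  , "my file.css"   -- could exist on disk, but contains an unescaped space
  ]

-- | Pair each sample with the result of isURIReference.
report :: [(String, Bool)]
report = [ (s, isURIReference s) | s <- samples ]
```

Running mapM_ print report in GHCi shows which values pass the syntactic test. A fragment-only value such as #foo is accepted even though there is nothing to download, and a file name containing a space is rejected even though the file may exist.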
jgm closed this as completed in 9b2f645 May 2, 2015
jgm added a commit that referenced this issue May 2, 2015
This gives better results when we have, e.g. multiple paragraphs.
Note that tags aren't allowed in these fields.

Closes #2121.
Owner

jgm commented May 2, 2015

(Sorry - that last commit should not have referenced this issue.)
