File transclusion #4

Closed
michal-h21 opened this issue Nov 20, 2016 · 8 comments
Labels: feature request, lua (Related to the Lua interface and implementation)

Comments

@michal-h21

The recently released iA Writer has an interesting extension to Markdown, which they call content blocks. It enables easy inclusion of images, tables, source code, and child documents. For example:

/Savings Account.csv 'Recent Transactions'
/Example.swift
/Lorem Ipsum.txt

would create a table from the file Savings Account.csv, include Example.swift as source code, and include the contents of Lorem Ipsum.txt processed as Markdown.

More details can be found in this post; there is also an in-depth description.

I guess it wouldn't be easy to support, for example, syntax highlighting and CSV inclusion in all TeX engines, but I like the idea. What do you think?

@Witiko
Owner

Witiko commented Nov 21, 2016

I am quite fond of the idea of replacing the Markdown image tag with something more seamless. It is a simple matter to add experimental support for converting

/Savings Account.csv 'Recent Transactions'
/Example.swift
/Lorem Ipsum.txt
http://example.com/minard.jpg (Napoleon’s disastrous Russian campaign of 1812)
/Flowchart.png "Engineering Flowchart"

into

\markdownRendererContentBlockLocal{Savings Account.csv}{Recent Transactions}
\markdownRendererContentBlockLocal{Example.swift}{}
\markdownRendererContentBlockLocal{Lorem Ipsum.txt}{}
\markdownRendererContentBlockOnline{http://example.com/minard.jpg}{Napoleon’s disastrous Russian campaign of 1812}
\markdownRendererContentBlockLocal{Flowchart.png}{Engineering Flowchart}

but the real question is how much pre-processing should be done by lunamark. Should we produce tables from known formats? If so, we will need to add support for a table extension first. Should we download content referenced by URLs or should this be left to TeX formats?

@michal-h21
Author

I think that lunamark should do basic detection of content types, as there is a limited number of supported image types, tables can only be contained in CSV files, and the supported languages are defined in Languages.json.

So, the result could be something like:

\markdownRendererContentBlockCsv{Savings Account.csv}{Recent Transactions}
\markdownRendererContentBlockSyntaxHighlight{Example.swift}{swift}{}
\markdownRendererContentBlock{Lorem Ipsum.txt}{}
\markdownRendererContentBlockImage{Flowchart.png}{Engineering Flowchart}

Regarding tables, I think that CSV parsers exist for all formats and engines, so it shouldn't be necessary to convert them to Markdown tables. I am not sure how well these parsers conform to the CSV standard, though.

Online references can only be images. I am not sure whether it's possible to download content from the internet in engines other than LuaTeX, so it should probably be downloaded by lunamark. Maybe the URLs can be hashed and the images saved in the temp directory, which is created anyway. The hashes can be checked against the temp filenames to ensure that each image is downloaded only once.

@Witiko
Owner

Witiko commented Nov 23, 2016

I think that lunamark should do basic detection of content types, as there is a limited number of
supported image types, tables can only be contained in CSV files, and the supported languages are
defined in Languages.json.

It might be a good idea to load all the Languages.json files found by kpathsea and combine them in a cascading manner. This way, the user may create their own Languages.json file in the current folder or in ~/texmf without completely overriding the Languages.json file provided as a part of the TeX distribution.

Regarding tables, I think that CSV parsers exist for all formats and engines, so it shouldn't be
necessary to convert them to Markdown tables. I am not sure how well these parsers conform to the
CSV standard, though.

True, but it might still make more sense to parse the CSV and produce table renderer macros specific to the markdown package. This way, tables will be typeset consistently regardless of whether they are input as a content block or as a markdown table. Passing the CSV file directly to TeX should be enough for the initial experiments, though.
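
For instance, a content block renderer along the lines of the macros proposed above could simply delegate to an existing LaTeX CSV package such as csvsimple. The following is only a sketch assuming csvsimple is available; it is not what the markdown package would ship:

\usepackage{csvsimple}
% Hypothetical renderer following the signature proposed above; the title
% argument #2 is ignored in this sketch. \csvautotabular typesets the CSV
% file as a tabular, using its first line as the header row.
\def\markdownRendererContentBlockCsv#1#2{%
  \csvautotabular{#1}}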

Online references can only be images. I am not sure whether it's possible to download content from
the internet in engines other than LuaTeX, so it should probably be downloaded by lunamark.
Maybe the URLs can be hashed and the images saved in the temp directory, which is created
anyway. The hashes can be checked against the temp filenames to ensure that each image is
downloaded only once.

We will need to produce the cache filenames by hashing the URLs, since that is the only information we have at our disposal while offline.

@Witiko
Owner

Witiko commented Mar 18, 2017

I apologize for the wait; I did not have the time to work on the feature until now. The feature-file-transclusion branch should contain all the necessary code.

With the contentBlocks option, all types of content blocks are transformed into \markdownRendererContentBlock{suffix}{raw URL}{URL}{title}, which is rendered as a table by the default LaTeX and ConTeXt renderer prototypes if suffix is csv and transcluded otherwise.

With the contentBlockPolymorphism option, we may also get:

  • \markdownRendererInputFencedCode{raw URL}{infostring} if there exists a suffix: infostring entry in Languages.json, and
  • \markdownRendererImage{URL}{raw URL}{URL}{title} if suffix is either png, jpg, jpeg, gif, tif, or tiff.

An example LaTeX document:

\documentclass{article}
\begin{filecontents*}{scientists.md}
Foo bar
\end{filecontents*}
\begin{filecontents*}{scientists.csv}
name,surname,age
Albert,Einstein,133
Marie,Curie,145
Thomas,Edison,165
\end{filecontents*}
\usepackage[contentBlocks,contentBlockPolymorphism]{markdown}
\begin{document}
\begin{markdown}
/scientists.md
/scientists.csv
\end{markdown}
\end{document}

[screenshot of the typeset example output]

(UPDATE: Note that the current version renders the table as a float instead.)

I considered doing URL fetching, but decided against it in the end, as it seems difficult to provide an implementation that would be both simple and robust. The users may implement URL fetching themselves:

\documentclass{article}
\usepackage[contentBlocks,contentBlockPolymorphism]{markdown}
\usepackage{graphicx}
\begingroup
\catcode`\@=11
\catcode`\%=12
\catcode`\^^A=14
\global\def\markdownRendererImage#1#2#3#4{^^A
  \immediate\write18{^^A
    if printf '%s' "#3" | grep -q ^http; then
      OUTPUT="$(printf '%s' "#3" | md5sum | cut -d' ' -f1).^^A
              $(printf '%s' "#3" | sed 's/.*[.]//')";
      if ! [ -e "$OUTPUT" ]; then
        wget -O- '#3' > "$OUTPUT";
      fi;
      printf '%s%%' "$OUTPUT" > \jobname.fetched;
    else
      printf '%s%%' "#3" > \jobname.fetched;
    fi}^^A
  {\everyeof={\noexpand}^^A
   \edef\filename{\@@input"\jobname.fetched" }^^A
   \includegraphics[width=\textwidth]{\filename}}}
\endgroup
\begin{document}
\begin{markdown}
  https://cloud.githubusercontent.com/assets/603082/24072692/17f79436-0beb-11e7-8da9-6bfbd6e1ec7d.png
  ![Martin Scharrer's example image](example-image)
\end{markdown}
\end{document}

[screenshot of the typeset example output]

If some package provided a \fetch#1 command that would expand to #1 for local files and unknown protocols, but would fetch files and return local filenames for known protocols (while also taking care of caching), then that would be pretty great and I would be willing to add such a package to markdown.tex as an external dependency to provide automatic URL fetching.
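
Under that assumption, hooking such a hypothetical \fetch command into the image renderer could be as simple as the following sketch (no such package exists at the moment, so \fetch and its semantics are assumptions here):

\def\markdownRendererImage#1#2#3#4{%
  % \fetch is hypothetical: it would expand to a local filename, downloading
  % and caching the file first whenever #3 is a URL with a known protocol.
  \includegraphics[width=\textwidth]{\fetch{#3}}}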

I will merge and release by next weekend. In the meantime, I will be grateful for any feedback, especially regarding compatibility with iA Writer. I don't own any Apple devices and the Android version of the editor does not seem to implement file transclusion, so my opportunities for testing are limited.

@michal-h21
Author

michal-h21 commented Mar 20, 2017

That's great, thanks!

I think that URL fetching isn't so important; it wouldn't be easy to implement in a portable way.

I've tried Markdown, CSV, image, and code inclusion, and it works perfectly. The only issue I found was that Lua isn't included in Languages.json. When I added it, I got a listings package error, because the package supports several Lua versions and you must choose which one. I fixed that with:

\lstset{
 defaultdialect=[5.2]Lua
}

in the document preamble.

@Witiko
Owner

Witiko commented Mar 20, 2017

Note that all Languages.json files found by kpathsea are loaded. As a result, you may maintain your local Languages.json file without altering the Languages.json file that comes along with the distribution. I suppose I should document this in the interface documentation, since it is a useful feature that is now buried in the implementation.
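
For example, a document-local Languages.json (sketched here with filecontents*, and assuming the same suffix-to-infostring mapping as the upstream file) could supply the missing Lua entry from the previous comment without touching the distributed file:

\begin{filecontents*}{Languages.json}
{
  "lua": "lua"
}
\end{filecontents*}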

Currently the Languages.json file is identical to the one from iainc/Markdown-Content-Blocks. I guess it would be convenient to distribute various mappings that are directly compatible with packages such as listings, minted, or with the ConTeXt highlighter. Then again, that's a lot of different Languages.json files to maintain, so perhaps it would be best to persuade the package maintainers to distribute their own mappings. I haven't really thought this through or reviewed the situation yet; if the language names remain relatively constant across the various highlighters, then this might easily be overkill.

Last thing on my mind: the contentBlockPolymorphism option is most likely going away before the release. It just seems superfluous, as we can transform the input to:

  • \markdownRendererContentBlock{suffix}{raw URL}{URL}{title},
  • \markdownRendererContentBlockOnlineImage{suffix}{raw URL}{URL}{title}, and
  • \markdownRendererContentBlockCode{suffix}{language}{raw URL}{URL}{title}

and let the user decide whether they want to delegate the contentBlockOnlineImage renderer to the image renderer and the contentBlockCode renderer to the inputFencedCode renderer (the default behavior), or whether they want to strip away the extra information and delegate both the contentBlockOnlineImage and contentBlockCode renderers to the contentBlock renderer (this might be useful if you are completely overriding the semantics of content blocks for your document).
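
For the latter option, the delegation could look roughly like the following sketch, which strips the extra arguments and falls back to the plain content block renderer:

% Treat online image content blocks as plain content blocks.
\def\markdownRendererContentBlockOnlineImage#1#2#3#4{%
  \markdownRendererContentBlock{#1}{#2}{#3}{#4}}
% Drop the language argument and treat code content blocks as plain
% content blocks as well.
\def\markdownRendererContentBlockCode#1#2#3#4#5{%
  \markdownRendererContentBlock{#1}{#3}{#4}{#5}}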

@Witiko
Owner

Witiko commented Mar 24, 2017

Also, the Languages.json file is currently CC BY-SA-licensed, so I can't include it in the release. The release version will therefore either ship without a Languages.json file, requiring the user to supply one themselves, or it might ship with a number of different Languages.json files as suggested above.

@michal-h21
Author

It also seems that many languages are missing from Languages.json: I've extracted a list of 90 languages supported by the listings package, and only 25 of them were found there. Even major ones like PHP or C seem to be missing. I guess that is because for those languages the name and the extension are equal.

Anyway, it seems like a good idea to use different JSON files for each particular syntax highlighter, as they may use different names for the same language.

Witiko closed this as completed in 22cd86b Mar 27, 2017
Witiko added the lua label Apr 14, 2018