Syntax for specifying image size #261

Closed
jgm opened this Issue Jun 10, 2011 · 110 comments

@jgm
Owner
jgm commented Jun 10, 2011

Googlecode Issue #29

There is currently no way to specify the size of an image.
For some ideas, see http://code.google.com/p/pandoc/wiki/ImageSizes.

@jgm
Owner
jgm commented Jun 10, 2011

Comment 2 by schm...@gmx.de, Nov 10, 2010
You can enter the dimensions of an image in the header of the JPG or PNG files. LaTeX and the DocBook processors will use that size. In HTML output you can set the size for every image in a separate CSS file.

@candlerb

Duplicate of #61 ?

@seanfarley

I'd like to say for the record that all of the advice for image manipulation (see above) doesn't help at all when dealing with vector image formats (svg, TikZ, pdf, etc.). This issue is quite the wrench.

@pckujawa

@jgm mentions that

You can enter the dimensions of an image in the header of the JPG or PNG files.

I'm not sure what that means. Where does one specify the header of one of these image files? I assume this isn't in reference to the bytes of the image file itself.

@jgm
Owner
jgm commented Apr 22, 2013

The DPI is stored in metadata in the header of the image files.
You can view and manipulate it with most image manipulation programs
(e.g. Gimp, Photoshop). I like to use ImageMagick command line tools,
myself.

convert -units PixelsPerInch my.jpg -density 300 my.new.jpg
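
To make the "DPI in the header" point concrete, here is a minimal sketch that reads the pixel dimensions and the density from a PNG file's IHDR and pHYs chunks. The function name png_size_and_dpi is made up for this illustration, it does not verify chunk checksums, and it is not how pandoc itself reads images.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
METRES_PER_INCH = 0.0254


def png_size_and_dpi(data: bytes):
    """Return (width_px, height_px, dpi_x, dpi_y) for PNG bytes.

    Hypothetical helper for illustration. The dpi values are None when the
    file has no pHYs chunk or the chunk does not use metres as its unit.
    """
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    width = height = dpi_x = dpi_y = None
    pos = 8
    while pos + 8 <= len(data):
        # Each chunk is: 4-byte length, 4-byte type, body, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"IHDR":
            width, height = struct.unpack(">II", body[:8])
        elif chunk_type == b"pHYs":
            per_unit_x, per_unit_y, unit = struct.unpack(">IIB", body)
            if unit == 1:  # 1 means "pixels per metre"
                dpi_x = per_unit_x * METRES_PER_INCH
                dpi_y = per_unit_y * METRES_PER_INCH
        elif chunk_type in (b"IDAT", b"IEND"):
            break  # pHYs must precede the image data, so stop here
        pos += 12 + length
    return width, height, dpi_x, dpi_y
```

The physical width in inches is then width_px / dpi_x, so a 1500 pixel wide image stored at 300 DPI would be placed 5 inches wide.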
@pckujawa

Ah, so you were referring to the bytes in the image file itself. That
method doesn't let me specify a width that is larger than, for example, the
\maxwidth of the latex page, but I guess that is by design (otherwise
things overflow badly).

For anyone else using latex, I did find that \includegraphics does not work
as expected because the default template has the following remapping:

$if(graphics)$
\usepackage{graphicx}
% We will generate all images so they have a width \maxwidth. This means
% that they will get their normal width if they fit onto the page, but
% are scaled down if they would overflow the margins.
\makeatletter
\def\maxwidth{\ifdim\Gin@nat@width>\linewidth\linewidth
\else\Gin@nat@width\fi}
\makeatother
\let\Oldincludegraphics\includegraphics
\renewcommand{\includegraphics}[1]{\Oldincludegraphics[width=\maxwidth]{#1}}
$endif$

So if you want to use latex and \includegraphics[width=...], you need to
comment out the \renewcommand. Just curious, is there no way to pass
arguments into the [] part of a macro in latex? If there is, why not allow
users to pass in their own [width=...] without having to edit the template?
I'm a newbie to latex macros, so maybe I just don't understand their
complexity.

Thanks for pandoc!
Pat

@adityam
adityam commented Apr 24, 2013

From what I understand, there is no consensus on markdown syntax to specify the width of an image. For that reason, width specifications are not supported.

@adityam
adityam commented Apr 24, 2013

Similar to HTML+CSS, ConTeXt allows you to specify the dimensions of individual figures. You can create a figures.tex file:

\defineexternalfigure[filename.ext][width=..., height=...]
\defineexternalfigure[filename.ext][width=..., height=...]

Then generate a ConTeXt file using

pandoc -t context -s output.tex

and compile it using

context --environment=figures.tex output.tex
@pckujawa

@adityam Thanks, good to know.

@nichtich
Contributor

Just a note: if image size should be specified in Pandoc Markdown, one must be able to specify at least two kinds of sizes: first the size in cm or inch for PDF, DocBook, Word etc. and second the size in pixels for HTML, EPUB and related output formats. For bitmap image files both can also be specified in the image files but vector image files usually contain no pixel dimensions.

@mb21
Collaborator
mb21 commented Jul 22, 2013

A TeX hack to get floating images in ConTeXt is to redefine \externalfigure:

\let\externalfigureOrig\externalfigure
\def\externalfigure[#1]{\placefigure[right]{}{\externalfigureOrig[#1]}}
@adityam
adityam commented Jul 22, 2013

This issue is about changing figure size, and not its floating behavior. By default,

 $pandoc -t context 
 ![Caption](cow.pdf)

gives

\placefigure[here,nonumber]{Caption}{\externalfigure[cow.pdf]}

Do you want to change the floating behavior (from here which floats only when necessary) to right? If so, a better way might be to modify pandoc to generate

\placefigure{Caption}{\externalfigure[cow.pdf]}

and set the default floating behaviour using \setupfloat[figure][location={....}].

@jgm
Owner
jgm commented Oct 21, 2013

I just discovered that in HTML you can do

<img style="width: 3cm;"...>

This resolves the problem about how to convert between pixels and other measurements; we'll just let the browser handle that.

@timtylin
Contributor

Note that in almost all browsers, 1 in (and equivalently 2.54 cm) is simply shorthand for 96 pixels (http://www.quirksmode.org/blog/archives/2012/11/the_css_physica.html).
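
As a rough sketch of that fixed ratio, the snippet below converts a few CSS absolute lengths to CSS pixels. The table and the function name css_length_to_px are invented for this example; the ratios themselves (1in = 96px = 2.54cm = 72pt = 6pc) come from the CSS definition of absolute units.

```python
import re

# CSS pixels per absolute unit, derived from 1in = 96px.
CSS_PX_PER_UNIT = {
    "px": 1.0,
    "in": 96.0,
    "cm": 96.0 / 2.54,
    "mm": 96.0 / 25.4,
    "pt": 96.0 / 72.0,
    "pc": 16.0,
}

_LENGTH = re.compile(r"\s*(\d+(?:\.\d+)?)\s*(px|in|cm|mm|pt|pc)\s*")


def css_length_to_px(length: str) -> float:
    """Convert a CSS absolute length such as '3cm' to CSS pixels."""
    match = _LENGTH.fullmatch(length)
    if match is None:
        raise ValueError(f"unsupported CSS length: {length!r}")
    return float(match.group(1)) * CSS_PX_PER_UNIT[match.group(2)]
```

With this, css_length_to_px("3cm") comes out at roughly 113 CSS pixels, regardless of the physical resolution of the screen.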

@jgm
Owner
jgm commented Oct 21, 2013

Here is what multimarkdown does:

[image]: http://path.to/image "Image title" width=400px height=400px
[link]:  http://path.to/link.html "Some Link" class=external
     style="border: solid black 1px;"

I like the basic idea. I'd like to allow something similar in inline images and links:

![image](http://path.to/image "Image title" width=4cm)
[link](http://path.to/link.html "some link" class=external)

Some questions and potential difficulties:

  1. What units to allow? cm, em, and in will work for both LaTeX and HTML (using styles). px is more problematic.
  2. There is some argument for reusing the existing attribute syntax (used now in headers and code blocks) for consistency. That would suggest something like
[image]: http://path.to/image "Image title" {width=400px height=400px}
[link]:  http://path.to/link.html "some link" {.external style="border: solid black 1px;"}

and in inline form,

![image](http://path.to/image "title" {width=400px height=400px})

or should it be

![image](http://path.to/image "title"){width=400px height=400px}
  3. Another advantage of the curly-bracket form is that it avoids some parsing difficulties, especially when no title is present. Currently spaces are allowed in the URL part of a markdown link, so
[link](http://path.to/link.html class=external)

would be parsed as a link to the URL http://path.to/link.html%20class=external, and the multimarkdown format would break this existing behavior. Requiring curly brackets for attributes gives the parser a clear signal.

@jgm
Owner
jgm commented Oct 21, 2013

Is this hard coded in the browsers, or is it sensitive to the system
DPI setting?

On linux, at least, you can change the system DPI:
https://wiki.archlinux.org/index.php/Xorg#Setting_DPI_manually

@mb21
Collaborator
mb21 commented Oct 21, 2013

I'm all for

![image](http://path.to/image "title"){width=400px height=400px class=myClass}

So we have each type of bracket besides the others [...](...){...} instead of [...](...{...}...). Also we're free to add additional attributes.

@ambs
ambs commented Oct 21, 2013

👍 for [...](...){...}

@timtylin
Contributor

@jgm: As far as I can tell, this convention is part of the official CSS specification and does not obey variations in OS or device. I think there's currently no reliable method to have perfect physical sizing control short of probing for the actual physical device, and I'm not sure ultimately how useful it is anyways.

IMO one of the best practices is to use widths relative to the container. In LaTeX I've always used something like width=0.5\textwidth and in HTML that corresponds to width=50% (defaults to be relative to container width). I don't have any idea whether the other formats support this kind of thing, but it would make the most sense to me.

@ambs
ambs commented Oct 21, 2013

I kind of agree with @eVITAERC. I do that always, too, for LaTeX. But although LaTeX has a fixed page width, that is not true for HTML. But we can always support both (relative or absolute).

@nichtich
Contributor

@mb21 and @ambs: So how would you expect to specify the size of an image for both LaTeX/PDF and HTML/EPUB formats? The original image is unlikely to have 96 DPI. This would not work, would it?:

![image](http://path.to/image "title"){width=400px height=400px width=5cm height=5cm}

It's usual to create PDF and HTML from the same Markdown source. If the proposed syntax cannot support both of them at the same time, one must manually change the image file, as is needed now. A possible solution with limited usability is to support a "dpi" parameter. For instance, 5cm at 203 DPI is roughly 400px:

![image](http://path.to/image "title"){width=400px height=400px dpi=203}
![image](http://path.to/image "title"){width=5cm height=5cm dpi=203}
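
The arithmetic behind the dpi parameter can be checked with a short sketch. The helper names cm_to_px and px_to_cm are made up for this example and are not part of any proposal in this thread.

```python
CM_PER_INCH = 2.54


def cm_to_px(cm: float, dpi: float) -> float:
    """Number of pixels that span `cm` centimetres at `dpi` pixels per inch."""
    return cm / CM_PER_INCH * dpi


def px_to_cm(px: float, dpi: float) -> float:
    """Physical length in centimetres of `px` pixels at `dpi` pixels per inch."""
    return px / dpi * CM_PER_INCH
```

For example, cm_to_px(5, 203) is just under 400 pixels, while the same 5cm at 96 DPI is only about 189 pixels, which is why a single pixel value cannot serve both outputs without knowing the DPI.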
@timtylin
Contributor

@ambs: HTML at least has the container width that you can almost always count on. Having a width relative to the container width is also ideologically closer to the current almost de-facto system of rendering on a column/grid system for CSS (think of percentages like laying out a picture across 100 columns).

On the other hand, absolute sizes like pixels will test your resolve to live when you want to make it work for high-DPI handheld devices and 4K desktop displays.

@dashed
dashed commented Oct 21, 2013

I like @nichtich's suggestion to have separate dimension specifications for HTML and LaTeX (pdf output). You do not want to tie a pixel to an actual physical measurement. This is in my opinion the best way to go.

Other than the default case:

![image](path/to/image "title")

One may optionally include dimensions for either HTML or LaTeX:

  1. ![image](/path/to/image "title"){width=400px height=400px}
  2. ![image](path/to/image "title"){rwidth=5cm rheight=5cm}
  3. ![image](path/to/image "title"){width=400px height=400px rwidth=5cm rheight=5cm}

rheight and rwidth may refer to 'real' height and width for physical outputs.

If the dimensions for either HTML or LaTeX are omitted, they're assumed to be the default and handled by the browser or the LaTeX pdf compiler accordingly.

Other cases may include that, if you only include rheight or rwidth, the aspect ratio is kept. I think this is the default behaviour in LaTeX without having to specify keepaspectratio.

How do we separate attributes for either LaTeX and HTML?

For instance,
![image](path/to/image "title"){width=400px, height=400px, rwidth=5cm, rheight=5cm, style="border:5px solid red;", class="imagestyle", angle=180}

Does anyone know if LaTeX throws an error if you just dump the attributes into:
\includegraphics[dump_attributes_here]{path/to/image}

@jgm
Owner
jgm commented Oct 21, 2013

I don't see the need for the complexity of separate measures for
different formats, if specifications like "50%" or "5cm" will work
for both. We can just disallow px measurements, or discourage them.

@uvtc
uvtc commented Oct 22, 2013

![...](...){...} would mesh nicely with:

  • the fenced code block attribute syntax: ~~~{#my-code .haskell} ...
  • header attribute syntax: ## Section {#some-id}
  • the sometimes-proposed span syntax like [...]{...}
  • the sometimes-proposed fenced div syntax, possibly like ^^^{#some-id .some-class}...
@uvtc
uvtc commented Oct 22, 2013

Added my comment about the div syntax (mentioned in prev comment here) to the end of issue #168.

@dashed
dashed commented Oct 22, 2013

@jgm If that's the case, would you at least allow the style attribute for HTML? That way, style="height:40px;width:25px;" becomes an alternative for those who want to specify the height and width using px.

@mb21
Collaborator
mb21 commented Oct 22, 2013

Personally, I usually want to use a width for TeX, and assign a class for HTML (so I can set width, floating and responsive design properties with CSS). I don't want to use percentages of the parent because scaling images in browsers by anything other than multiples of two usually produces aliasing artifacts.

@timtylin
Contributor

@mb21 This may be getting a bit off-topic, but I think there may be some potential to introduce style-sheet-like behavior in LaTeX by automatically custom-defining a figure macro for each image (named by identifier). Nothing concrete in my head yet, just throwing it out there.

LaTeX has always been a little bit unsatisfying to me because it's supposed to provide stylesheets to the TeX-style control character paradigm, but many of the current "de-facto" packages don't really work like pure stylesheets.

@adityam
adityam commented Oct 29, 2013

@eVITAERC You mean something like the ConTeXt solution that I proposed earlier in this thread?

@timtylin
Contributor

@adityam Similar, except for something compatible with existing LaTeX-based workflows. The main idea is to decouple specific layout-based information from syntactic information.

@mb21
Collaborator
mb21 commented Oct 29, 2013

@eVITAERC yes, I think that would be great. I'm no TeX guru, so is there a way to assign the same identifier to multiple images or figures (like an HTML class) in LaTeX and/or ConTeXt? Using unique identifiers (as suggested by @adityam) works as well, but is potentially quite a bit more cumbersome.

@adityam
adityam commented Oct 29, 2013

@mb21 Yes, in ConTeXt you can assign the same identifier to multiple images. See context-wiki for details.

In practise, this works in the same way as CSS classes.

@mb21
Collaborator
mb21 commented Oct 29, 2013

Oh, that's great. So ![image](file.jpg){class=myClass} could be translated to <img src="file.jpg" class="myClass" /> in HTML and \externalfigure[file.jpg][myClass] in ConTeXt. Output formats that don't support the notion of a class or non-unique identifier would just leave it out.

@mb21
Collaborator
mb21 commented Nov 14, 2013

Having thought about it a bit more, I propose the following syntax.

Headers

as already specified in the README
# My header {#id .class key=value key=value}

Images

as discussed here and in issues #170 and #813
![caption](image.png){#id .class}

which can be extended to
![caption](image.png){#id .class width=10cm width=600px} or ![caption](image.png){#id .class width=50%}

The HTML writer would ignore all cm-values and the TeX writers ignore the px-values. Both honor the %-values, so when using % you can use only one width key. Analogous for height.
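
A minimal sketch of that selection rule, assuming the attributes arrive as a list of (key, value) pairs in source order. The function name pick_width and the writer labels "html" and "tex" are invented for this illustration.

```python
def pick_width(attributes, writer):
    """Return the first width value the given writer would honor, or None.

    attributes: list of (key, value) pairs; repeated keys are allowed.
    writer: "html" (ignores cm values) or "tex" (ignores px values).
    Percentages are honored by both.
    """
    if writer not in ("html", "tex"):
        raise ValueError(f"unknown writer: {writer!r}")
    ignored_unit = "cm" if writer == "html" else "px"
    for key, value in attributes:
        if key == "width" and not value.endswith(ignored_unit):
            return value
    return None
```

With {width=10cm width=600px}, the HTML side would pick 600px and the TeX side 10cm; with {width=50%} both pick 50%.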

@aaren
aaren commented Dec 9, 2013

@mb21

  1. why would you not comma separate the attributes inside {...}?

  2. What happens when you don't want to specify a class? e.g.

    ![caption](image.png){#id width=50%}
    

    What happens here? Are you suggesting that class should always be the second un-named argument?

@mb21
Collaborator
mb21 commented Dec 9, 2013

@aaren

  1. It's space separated since we use that syntax already for headers.
  2. No, class is simply the argument starting with a dot, again as already used in the headers. If you don't specify a class the resulting image won't have any and you cannot style it with CSS or ConTeXt.
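
To illustrate the space-separated form, here is a small sketch of a parser for such an attribute block. It is not pandoc's actual parser (which is written in Haskell); parse_attributes is a hypothetical name, and shlex is used only as a convenient way to keep quoted values together.

```python
import shlex


def parse_attributes(block: str):
    """Parse '{#id .class key=value}' into (identifier, classes, pairs)."""
    text = block.strip()
    if not (text.startswith("{") and text.endswith("}")):
        raise ValueError("attribute block must be wrapped in curly brackets")
    identifier, classes, pairs = "", [], []
    # shlex.split keeps quoted values such as key="value 1" in one token.
    for token in shlex.split(text[1:-1]):
        if token.startswith("#"):
            identifier = token[1:]
        elif token.startswith("."):
            classes.append(token[1:])
        elif "=" in token:
            key, _, value = token.partition("=")
            pairs.append((key, value))
        else:
            raise ValueError(f"unrecognized attribute token: {token!r}")
    return identifier, classes, pairs
```

Because the class is recognized by its leading dot rather than by position, {#id width=50%} parses to an id, an empty class list and one key-value pair.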
@mb21
Collaborator
mb21 commented Mar 11, 2014

While there doesn’t appear to be consensus on how to explicitly specify image widths and heights, at least in HTML and ConTeXt the issue could be mostly avoided by using: ![caption](image.png){#id .class}

So now we’d have to change the definitions from...

  • Link [Inline] Target to Link Attr [Inline] Target and
  • Image [Inline] Target to Image Attr [Inline] Target.

This would solve #170. And while we are making breaking changes to Pandoc’s data format, we should probably add Attr also to Table (to solve #813) and maybe even to BlockQuote and Quoted (you never know).

When we see how people use these optional id and class attributes, we might develop a better understanding of which additional key-value pairs like width and height are then still needed. These could then be added relatively easily without changing Pandoc’s data format yet again.

@seanfarley

When will this issue finally move forward? When will we get support for SVG (or any vector graphics)?

@jgm
Owner
jgm commented Mar 19, 2014

@seanfarley: This issue is about image size, not SVG. SVG already works fine in output formats that support it (such as HTML).

@seanfarley

Image size is very much related. If the graphic is generated in, say, TikZ then there is no way to control how large it appears in HTML. Usually, HTML, HTML slides, LaTeX, and beamer all have different coordinates, so having an image that can scale to each one is the only real way to include graphics.

@bilderbuchi added a commit to openframeworks/ofBook that referenced this issue Mar 20, 2014
Scale all images to 70% in pdf output.
This saves about 20 pages (or 5%). Most figures improve with it, with a 
couple of exceptions, where the base material is of low resolution already.
There is no mechanism yet to prescribe image width in markdown or 
pandoc, cf. jgm/pandoc#261.
8636348
@bilderbuchi

So, is there already a clear way forward on this?

@mb21
Collaborator
mb21 commented Jul 9, 2014

While we're at it, we might consider the implications of the new responsive images HTML5 standard as well. As of July 2014 it's only available in beta builds of browsers, but the Picturefill polyfill is already available. The EPUB3 standard is likely to adopt this as well.

This article explains it quite well. But basically there are three ways to specify responsive images:

  1. Provide an alternative image for high-resolution (e.g. 'retina') displays:

    <img srcset="small.jpg 1x, large.jpg 2x"
       src="small.jpg"
       alt="A rad wolf" />
    
  2. Tell the browser the width (w) in pixel of different image files, as well as the sizes the image will be rendered at under different conditions (e.g. min-width of the screen). The browser will then pick the file it deems best suited:

    <img srcset="large.jpg  1024w,
          medium.jpg 640w,
          small.jpg  320w"
       sizes="(min-width: 36em) 33.3vw,
          100vw"
       src="small.jpg"
       alt="A rad wolf" />
    
  3. Finally, if you want to offer different image file formats (like SVG or WebP), or do art direction (like using cropped images on smaller screens), then you'll have to use the picture element:

    <picture>
       <source media="(min-width: 36em)"
          srcset="large.jpg  1024w,
             medium.jpg 640w,
             small.jpg  320w"
          sizes="33.3vw" />
       <source srcset="cropped-large.jpg 2x,
             cropped-small.jpg 1x" />
       <img src="small.jpg" alt="A rad wolf" />
    </picture>
    

    Note that the img within the picture is not only a fall-back but is always needed: The source elements just provide alternative sources for the img element.

My proposal (see above posts) was to include arbitrary attributes in pandoc images:

![caption](image.png){#id .class key1="value 1" key2="value 2"}

This would support 1. and 2. but would fail at natively specifying all the information required to generate the picture element. But as it will be only needed in HTML and EPUB anyways, I don't see the harm in using markdown with inline HTML, which is stripped in e.g. LaTeX and actually already works as expected:

paragraph one

<picture>
    <source srcset="large.jpg" media="(min-width: 1024px)">
    <source srcset="cropped.jpg">
    ![](large.jpg)
</picture>
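
For cases 1 and 2, passing arbitrary attributes through to the img element could look roughly like the sketch below. The function img_tag is a made-up name for this illustration and does not reflect pandoc's HTML writer.

```python
from html import escape


def img_tag(src, alt, attributes):
    """Render an <img> element, passing extra (key, value) pairs through as is."""
    parts = [
        f'src="{escape(src, quote=True)}"',
        f'alt="{escape(alt, quote=True)}"',
    ]
    # Extra attributes such as srcset or sizes are only escaped, never interpreted.
    parts += [f'{key}="{escape(value, quote=True)}"' for key, value in attributes]
    return "<img " + " ".join(parts) + " />"
```

Calling img_tag("small.jpg", "A rad wolf", [("srcset", "small.jpg 1x, large.jpg 2x")]) would yield the markup from case 1 on a single line.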

Thoughts?

@xapple
xapple commented Jul 23, 2014

In my opinion, this is the single most important issue that pandoc is facing at the moment. Not being able to specify image size in markdown syntax is a real show stopper for many users. Why has this issue been waiting for so many years? Has development stopped?

@mb21
Collaborator
mb21 commented Jul 26, 2014

Okay, I guess @jgm is reluctant to move forward on this until there is a solution for all use-cases. So I’ll give it another shot.

As proposed above, the syntax for inline images is:

![caption](image.png){#id .class key1="val 1" key2="val 2"}

and for reference images:

![caption]

[caption]: image.png {#id .class key1="val 1" key2="val 2"}
  • For HTML/EPUB all attributes are passed through “as is”. This provides elegant support for all future responsive image properties (e.g. srcset, sizes) as well as a consistent way to specify the plain old HTML attributes width and height. So if you write ![](image.png){width=300 height=200}, you’ll get <img src="image.png" width="300" height="200">. (This is also useful to instruct the browser to reserve a 300x200 pixel box until the image is loaded to prevent a jarring reflow when the image arrives, but this could also be done with CSS.)
  • The other writers ignore attributes that are not supported by their output format.
  • The width and height attributes need special treatment. When used without a unit, the number’s unit is in pixels (to keep the syntax consistent with passing through HTML attributes). However, one of the following units can be used: px, cm, inch and %.
    • Pixels are converted to cm/inch for output in page-based formats like LaTeX and vice versa for output in HTML. Pandoc could introduce a --dpi option, but the default would be 96dpi.
    • If % is used, the result would be for example <img src="file.jpg" style="width: 50%;" /> or \includegraphics[width=0.5\textwidth]{file.jpg} (LaTeX) and finally \externalfigure[file.jpg][width=0.5\textwidth] (ConTeXt).
  • In formats like HTML and ConTeXt that have a notion of a class, ![](file.jpg){.myClass} will result in <img src="file.jpg" class="myClass" /> and \externalfigure[file.jpg][myClass] respectively. (Note that the complete ConTeXt syntax is \externalfigure[file.jpg][width=8cm, height=1cm][myClass] and the class has always lower precedence, whereas CSS has always higher precedence than the HTML attributes.)
  • When no attributes are specified, the behaviour is obviously the same as now (DPI image metadata, for example, can still be used).
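
The width rules in the list above could be sketched as follows. All names here (parse_dimension, html_width_attribute, latex_width) are hypothetical, the dpi parameter stands in for the suggested --dpi option with its 96dpi default, and non-pixel units are handed to the browser as a CSS style rather than converted.

```python
import re

_DIMENSION = re.compile(r"(\d+(?:\.\d+)?)\s*(px|cm|inch|%)?")


def parse_dimension(text):
    """Split '8cm' into (8.0, 'cm'); a bare number is taken to be pixels."""
    match = _DIMENSION.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"cannot parse dimension: {text!r}")
    return float(match.group(1)), match.group(2) or "px"


def html_width_attribute(text):
    """Return the (attribute, value) pair an HTML writer might emit."""
    value, unit = parse_dimension(text)
    if unit == "px":
        return ("width", f"{value:g}")  # plain HTML width attribute
    css_unit = "in" if unit == "inch" else unit
    return ("style", f"width: {value:g}{css_unit};")


def latex_width(text, dpi=96):
    """Return the value for \\includegraphics[width=...]."""
    value, unit = parse_dimension(text)
    if unit == "%":
        return f"{value / 100:g}\\textwidth"
    if unit == "px":
        return f"{value / dpi:g}in"  # pixels become inches via the DPI
    if unit == "inch":
        return f"{value:g}in"
    return f"{value:g}cm"
```

So width=50% would give style="width: 50%;" in HTML and width=0.5\textwidth in LaTeX, while width=300 stays a plain width="300" in HTML and becomes 3.125in in LaTeX at the default 96dpi.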

This covers all use-cases I can think of. Thoughts, anyone? @jgm, do you still have concerns? If so, what are they?

@dubiousjim

I've implemented something like this (the Reader side only) on my local copy of Pandoc. I'll submit it for consideration in a few days. I implemented a lighter-weight syntax than described above. In my view, having ![...](...){...} or ![...](...{...}) for inline images is just too bulky. I'd prefer for the syntax to allow the extra metadata only in reference keys, that is you'd have to say:

text ... ![caption][key] more text...

[key]: url "title" width=... height=... #id .class1 .class2

Also, for the time being, I'm not doing any reinterpretation of the attribute values. If you say width=blah, then I pass blah on to the Writer. If you want px or cm or %, then just say so.

Experimentally, I am checking for the presence of href=url and similar attributes on the image, and if they're found, then they (and any #id and .class info) get put on an <a> element that wraps the image. Maybe something similar could be done someday for <picture> elements, but I haven't tried to do that now.

@bilderbuchi

fwiw, I think @mb21's proposal sounds like a solid solution.

@denten
denten commented Jul 29, 2014

Can I suggest scaling large images to page width by default? Right now compiling into .docx or .odt pushes large images outside of the printable margin. Whatever the solution to specifying width, the default behavior, without any specification, should be more sane.

@ambs
ambs commented Aug 18, 2014

While we wait for pandoc to support image scaling natively, does anybody have a filter I can use to produce LaTeX and HTML5 outputs with scaled images?

@mofosyne
mofosyne commented Sep 9, 2014

For your information, a similar discussion on generic attribute syntax is in http://talk.commonmark.org/t/guide-for-syntax-extensions/

Here is my suggestion so far

Embedded Media

!mediaType[description](url){#myId .myClass key=val key2="val 2"}
  • assumed to be image if mediaType left blank

  • syntactic sugar ( content of () handled by mediaType handler/extension):

    ![Kittens With Mittens](file.mp4 "video title" 80x10 )
    

    is equivalent to typing:

    !video[Kittens With Mittens](file.mp4){title="video title" width=80 height=10}
    
@MetaMemoryT

I really like the proposal @mb21 made: #261 (comment)

Would it get merged @jgm if someone implemented it?

Can we get a go-ahead to implement this?

If LaTeX and HTML support meta-data about images, why can't pandoc?

@jgm
Owner
jgm commented Oct 29, 2014

I think it's a good proposal. It would require very intrusive changes, however -- it's not a little thing. The Image inline type in pandoc-types would need to be changed to include attributes. Any code that case matches on inline elements, including ALL of the writers, would need to be changed. There would also need to be changes in Builder, Walk, etc.

@nichtich
Contributor

Tables and blockquotes might also get attributes when the AST model is changed anyway. For backwards compatibility a pre-filter and post-filter can be applied to map to/from the old model, so existing filters don't fail.
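
One direction of such a mapping could be sketched as below. This assumes the JSON form of the AST represents an image as {"t": "Image", "c": [inlines, target]} in the old model and would become {"t": "Image", "c": [attr, inlines, target]} in the new one; that new shape, and the function name strip_image_attrs, are assumptions made for this illustration.

```python
def strip_image_attrs(node):
    """Recursively map new-style Image nodes back to the old two-field form.

    Assumed shapes (not confirmed by the discussion above):
      old: {"t": "Image", "c": [inlines, target]}
      new: {"t": "Image", "c": [attr, inlines, target]}
    """
    if isinstance(node, list):
        return [strip_image_attrs(item) for item in node]
    if isinstance(node, dict):
        content = node.get("c")
        if node.get("t") == "Image" and isinstance(content, list) and len(content) == 3:
            _attr, inlines, target = content
            # Drop the attributes; the caption may itself contain images.
            return {"t": "Image", "c": [strip_image_attrs(inlines), target]}
        return {key: strip_image_attrs(value) for key, value in node.items()}
    return node
```

An existing filter could then run on the stripped tree unchanged; the reverse direction would have to re-insert an empty attribute triple such as ["", [], []] for every image.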

@MetaMemoryT

I think the next step would be to estimate the total time to implement this; perhaps this could be crowd-funded. It looks like the only obstacle that @jgm has mentioned is the time to implement it. I think documenting the behavior and writing unit tests is important for such large changes, as is writing pre-filter and post-filter compatibility functions.

Here is a very rough estimate of the work required (Please provide your opinion if you disagree):
h = hour

1h to update pandoc-types library
12h - 2h for each Writer that needs to change behavior, (and write unit tests for the new behavior) (HTML, EPUB, LaTeX, ConTeXt, Markdown, Docx)
6h to document how each writer behaves when metadata is added for images
1h to update all other writers to just punt on the new AST
1h to update AST api (pandoc-types / Text / Pandoc / Walk.hs)
2h to write backward compatible helper functions for AST api
2h to implement Markdown parser of the attributes


25h total so far. Again, please mention anything I might have missed.

@mpickering
Collaborator

I think that estimate is a bit high. I could do this quite quickly if
people provided suitable tests for the relevant formats.

The biggest worry is that these sorts of changes break backwards compatibility.

@mb21
Collaborator
mb21 commented Oct 30, 2014

I think it's great that this is moving forward, maybe for 1.14.0. I can help add utility functions to Text.Pandoc.ImageSize and upgrade the writers, but I'm not sure I'm up to digging through all the internals.

An open question is whether only images should get attributes. While we are making breaking changes to Pandoc’s data format, we should consider adding Attr also to Link (to solve #170 in the future), Table (for #813) and maybe even to BlockQuote and Quoted (you never know, or would that go too far?). I'm not suggesting to change the Markdown reader for all these immediately (although there's been discussion about that on CommonMark), just thinking ahead.

@mpickering what kind of tests were you thinking about? You can obviously always construct cases that parsed differently before the change. It's debatable whether newlines should be allowed in attribute blocks (kramdown, markdown extra don't), but I guess the most important thing is to remain consistent within pandoc (to header attributes), so just reusing attributes :: MarkdownParser Attr should sidestep such concerns for now. Finally, feel free to look at my take on a generic attributes spec–this may or may not be helpful.

@timtylin
Contributor

I've actually done most of the modifications necessary for Scholdoc (see also scholdoc-types), including modifications to the tests to make them pass. I've removed a lot of the writers for Scholdoc, but if you go back a few commits you can find all of the current Pandoc writers that I've modified to build correctly with the new Image data constructor that includes an Attr.

@timtylin
Contributor

I don't currently have time to extract a clean diff for just the change to Image, but perhaps it can be a helpful starting point.

Like what a lot of people said already, the biggest problem is that this would currently break all the existing filters that target images. Some additional magic can happen in the filter handling code for ones that you give to Pandoc directly, but anyone doing a * -> native/JSON -> * would have their existing code broken.

@mb21
Collaborator
mb21 commented Nov 7, 2014

@timtylin Thanks, I've looked at the diff between the two repos, but apart from the changes needed to make the new type definition parse, not much appears to be implemented regarding image dimensions?

@timtylin
Contributor
timtylin commented Nov 7, 2014

@mb21 Ah yes, I was more referring to the modifications needed to the pandoc-types, as well as the modifications needed to the test cases.

As for the actual implementation: Scholdoc itself is not really a full Pandoc fork, but it's just meant to facilitate everyday academic writing using markdown. As such, it only does a very limited set of conversions: markdown_scholarly -> HTML/LaTeX/Docx. Furthermore, only the width attribute, specifically relative width in terms of %, is really "officially supported" for all output formats. The assumption is that most people really only need control on one dimension due to fixed aspected ratios, and the relative width to container seems to be the most natural.

Having said that, you'll find implementations for the correct behavior of width % mostly in the Figure functions of the LaTeX output writer, where I parse the % and set it to be a relative fraction of \hsize. Alas, I don't have a complete implementation for Docx yet that has the same behavior, but I have an undergrad student working on it part time. If the user specifies the width in terms of other units (px, inch, etc), we just use the trivial implementation in all writers.
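The percentage handling described above can be sketched roughly as follows. This is a Python illustration, not Scholdoc's actual Haskell code; `latex_width` and `includegraphics` are hypothetical names, and passing other units through unchanged is my reading of "the trivial implementation".

```python
import re


def latex_width(width):
    """Translate a width attribute such as "50%" or "3in" into a LaTeX length."""
    width = width.strip()
    match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*%", width)
    if match:
        # A percentage becomes a fraction of \hsize, e.g. "50%" -> 0.5\hsize
        fraction = float(match.group(1)) / 100.0
        return "%g\\hsize" % fraction
    # Any other unit is passed through untouched (assumed trivial handling)
    return width


def includegraphics(path, width=None):
    """Build an \\includegraphics command, with an optional width."""
    if width is None:
        return "\\includegraphics{%s}" % path
    return "\\includegraphics[width=%s]{%s}" % (latex_width(width), path)
```

For example, `includegraphics("fig.pdf", "25%")` would yield `\includegraphics[width=0.25\hsize]{fig.pdf}`.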

@aaren
aaren commented Nov 7, 2014

@timtylin have you documented the changes that you have made in scholdoc anywhere? It would be good to get an idea of the extensions that you've made.

@timtylin
Contributor
timtylin commented Nov 7, 2014

@aaren I have written up a draft of the extended Markdown syntax that Scholdoc understands, that I'm calling "ScholarlyMarkdown" for now. It's kinda in soft-launch; I've somehow extorted my labmates to use it for everyday paper writing, but because I think it needs some more work (especially on the Docx output front) I haven't publicized it too much.

@storrgie
storrgie commented Dec 5, 2014

@timtylin Can pandoc parse this specification?

@timtylin
Contributor
timtylin commented Dec 5, 2014

@storrgie If you're only interested in HTML & LaTeX/PDF output then you can try my very specialized fork of Pandoc, but otherwise no.

@storrgie
storrgie commented Dec 5, 2014

@timtylin I was looking through it, I should have read a bit more before asking the question. I'm under duress to finish up a document right now but I plan to look at scholdoc as soon as I'm finished because it seems to be what I've been looking for.

@mpickering
Collaborator

@mb21 If you feel comfortable going ahead with this implementation then go for it! If you need any help then ping me.

I think the only thing to be resolved is what the type of the new Image constructor should be. Any thoughts? Maybe Image Attr Inline Target would be suitable?

@mb21
Collaborator
mb21 commented Dec 10, 2014

@mpickering Okay, I've started and it is going better than I expected :) Will post a pull request soon.

@mb21
Collaborator
mb21 commented Dec 11, 2014

@aaren mentioned that the LaTeX writer should render an id on a figure as a \label{}.

If anyone knows of any other output formats that support the notion of an id or class on images, please let me know. I know already of ConTeXt's identifiers that act like classes if I'm not mistaken (although one image cannot have multiple classes), as well as the classes in DocBook (the role attribute), MediaWiki, Textile and RST (which also has an id-like name parameter).

@Jmuccigr
Contributor

Just bumping up against this problem today and wondering how far along an implementation is?

Thanks.

@jgm
Owner
jgm commented May 12, 2015

+++ John Muccigrosso [May 11 15 10:33 ]:

Just bumping up against this problem today and wondering how far an
implementation is?

I have an implementation on a branch in this repository, but
it's not planned for the next release (1.14). We'll
probably have this in the release after; it's a big breaking
change.

@jgm
Owner
jgm commented May 12, 2015

Yes that's it.

+++ Alberto Leal [May 11 15 18:47 ]:

@jgm Is it this branch? https://github.com/jgm/pandoc/commits/new-image-attributes

@frumbert
frumbert commented Jun 2, 2015

Also looking for this change; I'm using the Windows binary. It's pretty similar to the format I already use with my custom parser (e.g. ![image caption](image.jpg {width: "500px"})) - this format just pushes the attrib map outside rather than inside the link grouping. Also, in my implementation, if the output format is HTML5, any custom "unknown" attributes are prefixed with "data-", so that it maintains as much of the input document as possible - such that {width: "500px", id: foo, cake: "lie"} becomes width="500px" id="foo" data-cake="lie".
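The "data-" prefixing described above could look roughly like this in Python. This is a sketch only: the set of "known" attribute names is an assumption chosen for illustration, and `html_attributes` is a hypothetical helper, not part of pandoc or of the custom parser mentioned.

```python
from html import escape

# Assumed set of attributes that are emitted as-is on an <img> element
KNOWN_ATTRIBUTES = {"id", "class", "width", "height", "title", "alt", "src"}


def html_attributes(attrs):
    """Render a dict of attributes, prefixing unknown names with "data-"."""
    parts = []
    for key, value in attrs.items():
        if key in KNOWN_ATTRIBUTES or key.startswith("data-"):
            name = key
        else:
            name = "data-" + key
        parts.append('%s="%s"' % (name, escape(str(value), quote=True)))
    return " ".join(parts)
```

With the attribute map from the comment, `html_attributes({"width": "500px", "id": "foo", "cake": "lie"})` gives `width="500px" id="foo" data-cake="lie"`.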

I also had it [attrib map] implemented on standard hyperlinks, is that part of this change/branch?

@RobTrew
RobTrew commented Jun 15, 2015

pandoc.org/installing.html suggests [for Windows | OS X] that

There is a package installer at pandoc’s download page

I can't (2015-06-15) see installers at https://github.com/jgm/pandoc/releases/tag/1.14.0.3 ...

Should I perhaps be looking elsewhere in space, or in time ?

@nkalvi
nkalvi commented Jun 15, 2015

https://github.com/jgm/pandoc/releases

A bit inconvenient - hopefully fixed soon 😄

@jasoncmcg

At first I thought this was a problem with Pandoc because I resized images and all it did was make the images look worse in the PDF (created from a markdown doc). However, as I was reading through here, I saw a mention of using imagemagick to change the DPI. I used Irfanview, went to the image menu and selected Information. I made the DPI extremely high (by multiplying by 10), clicked the 'change' button and noted the 'Print Size' down below. Pandoc will use the print size as dictated by the image DPI. Worked great.

@KurtPfeifle

Good it worked for you!

But be aware...

At first I thought this was a problem with Pandoc because I resized images and all it did was make the images look worse in the PDF (created from a markdown doc). However, as I was reading through here, I saw a mention of using imagemagick to change the DPI. I used Irfanview, went to the image menu and selected Information. I made the DPI extremely high (by multiplying by 10), clicked the 'change' button and noted the 'Print Size' down below. Pandoc will use the print size as dictated by the image DPI.

No. It's not _Pandoc_ which makes use of and applies the DPI settings to the output.

Most likely, you produced a PDF? In that case it was the LaTeX PDF generating engine which read the DPI metadata from the image and applied it when producing the PDF. Pandoc does not know anything about that!

Also, your results may change if you switch between the different --latex-engine=pdflatex|lualatex|xelatex options.

Lastly, it will not work for other output formats (as far as I'm aware, none of these changes its behavior depending on the DPI setting of the image). You _are_ aware that this setting doesn't change the actual pixel data, but is only a setting in a metadata field of the image, which can be read and reported to you by a tool like exiftool, right?

Worked great.

Don't rely on it. Don't assume it will stay the same over releases (of LaTeX, xelatex, pdflatex, lualatex).

@jgm
Owner
jgm commented Jun 26, 2015

+++ Kurt Pfeifle [Jun 25 15 17:21 ]:

No. It's not Pandoc which makes use and applies the DPI settings to the
output.

Most likely, you produced a PDF? In that case it was the LaTeX PDF
generating engine which read the DPI metadata from the image and
applied it when producing the PDF. Pandoc does not know anything about
that!

Actually, pandoc does check the DPI information embedded
in an image, and uses this to figure out how to size images
in some output formats.

@jasoncmcg

Yes, you are absolutely correct, Kurt.
I am using MiKTeX (default config) with Pandoc on Windows 8.
Here is the line I use to make the conversion:

pandoc -o "example.pdf" -S -s --toc "example.md" 

Also, the same adjustment works great when creating a docx file.

However, and this may be where you are leading, when I created an html file, the icons were indeed blown back up very large. So, I guess I could understand if you want to force icon sizes for generated web-based documentation.

Note: I am mostly using Pandoc to create polished docs for documentation out of markdown.

@KurtPfeifle

On Fri, Jun 26, 2015 at 8:17 PM, John MacFarlane wrote:

Actually, pandoc does check the DPI information embedded
in an image, and uses this to figure out how to size images
in some output formats.

Ah! Thanks for the info!

Is it documented somewhere, which list of output formats this applies to?!

@Jason: sorry for my potentially misleading statements then.

@jgm
Owner
jgm commented Jun 26, 2015

+++ Kurt Pfeifle [Jun 26 15 12:19 ]:

Ah! Thanks for the info!
Is it documented somewhere, which list of output formats this applies
to?!

ODT, Docx, RTF (since these formats require specification of
a size in non-pixel dimensions).
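The arithmetic behind the DPI trick discussed above is simple: the physical size is the pixel count divided by the DPI. A minimal Python sketch follows; the function names are hypothetical, and the conversion uses 72 points per inch (PostScript points) as an assumption about the target unit, not a statement about what each writer emits.

```python
def print_size_inches(width_px, height_px, dpi):
    """Physical size implied by pixel dimensions and a DPI value."""
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    return width_px / dpi, height_px / dpi


def inches_to_points(inches):
    """Convert inches to points, assuming 72 points per inch."""
    return inches * 72.0
```

Multiplying the DPI by 10, as in the Irfanview example, divides the print size by 10: a 720 x 360 pixel image is 10 x 5 inches at 72 DPI and 1 x 0.5 inches at 720 DPI, while the pixel data stays the same.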

@gantech
gantech commented Aug 14, 2015

Sorry for the noob comment. Still in progress right?

@linquize

ping

@jgm
Owner
jgm commented Aug 15, 2015

Still in progress, there's a PR that adds this.
I will probably merge it before long, but since
it is a breaking change with pandoc-types, I may
do another release in the 1.15 series first.

+++ gantech [Aug 14 15 05:57 ]:

Sorry for the noob comment. Still in progress right?


@01AutoMonkey

I basically can't use markdown right now because there isn't "support for defining image size", so looking forward to this feature!

@flurin
flurin commented Sep 8, 2015

This would be really interesting, as it would be quite trivial to write filters that could do things with these attributes as well (including other files comes to mind).

@crsh
crsh commented Oct 21, 2015

👍

@legrostdg

Is there still something preventing #2351 from being merged (except that it now has conflicts with the current master)?

@jgm
Owner
jgm commented Oct 21, 2015

+++ legrostdg [Oct 21 15 07:23 ]:

Is there still something preventing [1]#2351 from being merged (except
that it now has conflicts with the current master)?

Mainly that it requires changes to pandoc-types and
everything that depends on this...it's a big API change.
I think we should be able to do it soon, though.

@tompollard tompollard referenced this issue in tompollard/phd_thesis_markdown Oct 22, 2015
Closed

ability to scale images in markdown #15

@nichtich
Contributor

Where is the best place to get to know and anticipate the coming changes? I bet the change can also affect custom pandoc filters so authors and developers should be pointed to a good description what to take care of.


@jgm jgm added a commit that closed this issue Nov 20, 2015
@jgm Merge branch 'new-image-attributes' of https://github.com/mb21/pandoc
…into mb21-new-image-attributes

* Bumped version to 1.16.
* Added Attr field to Link and Image.
* Added `common_link_attributes` extension.
* Updated readers for link attributes.
* Updated writers for link attributes.
* Updated tests
* Updated stack.yaml to build against unreleased versions of
  pandoc-types and texmath.
* Fixed various compiler warnings.

Closes #261.

TODO:

* Relative (percentage) image widths in docx writer.
* ODT/OpenDocument writer (untested, same issue about percentage widths).
* Update pandoc-citeproc.
244cd56
@jgm jgm closed this in 244cd56 Nov 20, 2015
@dashed
dashed commented Nov 20, 2015

👍 Slick. Thanks @jgm!

@legrostdg

\o/

@signalwerk

Awesome! Thx!

@ambs
ambs commented Nov 20, 2015

WEEEEEEEEE!!!

@katrinleinweber

Merci! ☀️

@ivoanjo
ivoanjo commented Nov 20, 2015

This is huge, thanks! 👍

@frumbert

Cool, but this won't be available in a full release until 1.16, right? I can't create a windows version of this branch using either stack or cabal.

@jgm
Owner
jgm commented Nov 23, 2015

+++ TIm St.Clair [Nov 22 15 17:41 ]:

Cool, but this won't be available in a full release until 1.16, right?
I can't create a windows version of this branch using either stack or
cabal.

What happens on Windows when you try with stack?

@frumbert

Windows 2003 R2 Sp2 32-Bit, on VirtualBox; git, node, cygwin, miktex, haskell platform, cabal, etc all installed and updated ok. (ghc version 7.8.3)

$ stack install
Downloading lts-3.13 build plan ...FailedConnectionException2 "raw.githubusercontent.com" 443 True user error (cryptonite: random: cannot get any source of entropy on this system)

And when compiling via cabal, I also find that cryptonite can't be installed (cryptonite: random: cannot get any source of entropy on this system).

@jgm
Owner
jgm commented Nov 23, 2015

https://hackage.haskell.org/package/cryptonite
says it supports Windows >= 8.

Is Windows 2003 < Windows 8? If so, that's your problem.

@frumbert

slaps forehead
compiling happily on my win10 box...

i really have to trash that old thing and stop using it.

@Wind4Greg

I really want this feature but can't get Pandoc to build with stack on Windows 10. Followed the instructions from https://github.com/jgm/pandoc/wiki/Installing-the-development-version-of-pandoc, downloaded the latest stack from http://docs.haskellstack.org/en/stable/install_and_upgrade.html#windows. But I get a weird error seemingly unrelated to Pandoc:

    C:\Users\Greg\AppData\Local\Temp\stack5940\HTTP-4000.2.21\Network\HTTP\Proxy.hs:85:22:
        Not in scope: ‘liftM’
        Perhaps you meant ‘liftM2’ (imported from Control.Monad)

Which has something to do with a Haskell HTTP library. Is there any other way to get this feature? Or is there something about Haskell on Windows that I'm overlooking?

@jgm
Owner
jgm commented Dec 8, 2015

Someone has reported the same issue on pandoc-discuss.
I haven't had a chance to build yet on a Windows machine or
investigate further.

@Wind4Greg

Thanks. It's a Windows/Haskell issue; using a VM, I was able to build it on Ubuntu and see first-hand this really nice feature. I'll watch pandoc-discuss for the Windows fix.
Cheers.

@mindv0rtex

Is there an estimate of when this feature will reach a release?

@mb21
Collaborator
mb21 commented Mar 31, 2016

Image dimensions have been in 1.16 for a while already...

@mindv0rtex

Oops, my bad!

@ssinari ssinari referenced this issue in Rapporter/pander Apr 14, 2016
Open

Figure and table referencing #258

@mofosyne

How do you use it btw? I can't seem to find instructions on using it in the manual.

@KurtPfeifle

@mofosyne:

Search for link_attributes.

See here: http://pandoc.org/README.html#images
