Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Add support for capturing svg, png, and jpeg and _repr_mimebundle_ #1138

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

oscargus
Copy link
Contributor

@oscargus oscargus commented May 13, 2023

Closes #1136

However, I have a doubt that this is a bit "too hacky" as it now checks for both str and bytes to check that a valid representation is obtained. I guess that there is a reason for the explicit str-check? So adding bytes may break things. I was thinking about adding some support to tell if bytes is OK for a given repr (it is required for png), but didn't really find a good way. It is of course possible to drop support for png and this would go away. (It would also be easy to add support for jpeg.)

Also, these are now directly embedded in the source, leading to that they will not show up as thumbnails. I guess that the same thing happens for objects that only have a _html_repr_ and are not scraped explicitly? Not a big deal for my use case, but would of course be nice even there.

A minor thing, which I sort of realize the reason for, is that these show up at the top. This of course makes sense when it comes to plots, but as I'd like to illustrate the output object in a graphical way, not so much. Easy to solve on my side though. (And I think it only affects the first "cell".)

Should these new representations be on or off by default? I guess having them on may lead to some surprising outputs...

Finally, tests and docs are missing, but I wanted to get some feedback on the bytes-issue before going into that. Edit: I think this is probably a better approach to the previous bytes-issue, but still not sure if embedded is the way to go.

(And, yes, not that many lines were needed, at least not with this approach.)

Copy link
Contributor

@lucyleeow lucyleeow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good and reasonably simple addition from first glance. Just need test and doc if we're happy to go ahead with adding support for this. Thanks!

sphinx_gallery/gen_rst.py Outdated Show resolved Hide resolved
sphinx_gallery/gen_rst.py Outdated Show resolved Hide resolved
@oscargus
Copy link
Contributor Author

Thanks! I'll try to add tests and documentation.

@oscargus oscargus changed the title Add support for capturing svg and png ENH: Add support for capturing svg and png May 15, 2023
@oscargus oscargus force-pushed the repr_svg_png branch 2 times, most recently from c827db0 to e2a1592 Compare May 15, 2023 10:51
@oscargus
Copy link
Contributor Author

This still needs tests for PNG and JPG output, but I cannot really figure out a good way to generate it without adding dependencies... Will try to generate a minimal binary file and play around a bit.

@larsoner
Copy link
Contributor

You could make custom classes with the _repr_* methods that wrap to a little Pillow call to dump some reasonable bytes.

@oscargus
Copy link
Contributor Author

oscargus commented May 17, 2023

I'm now thinking that it may make sense to save the figures to file (one of my use cases is to capture GraphViz graphs which can be quite big, so it is nice to be able to "Open image in new tab" which requires a separate file).

I know that there are thoughts of too many configuration parameters already, but would it make sense to add another one here?
One can possibly merge it with #1134? So that the same parameter determines if all images are embedded (no thumbnails) or saved as separate files (thumbnails)?

It may be that this can be added on later, depending on how hard it is...

@larsoner
Copy link
Contributor

larsoner commented May 17, 2023

I would start out by writing the files to disk just like current scrapers, then adding them as .. img-sg:: or whatever. Let's see if that's good enough. Then if we want to embed these as base64-encoded images or whatever we can do that separately.

@oscargus oscargus changed the title ENH: Add support for capturing svg and png ENH: Add support for capturing svg, png, and jpeg May 18, 2023
@oscargus oscargus force-pushed the repr_svg_png branch 7 times, most recently from 2e10238 to 19d569e Compare May 19, 2023 11:40
@oscargus
Copy link
Contributor Author

OK, so now it writes files. Also, I added support for _repr_mimebundle_ in a way transparent for the user. Some objects, like GraphViz graphs, only support this. But asking for _repr_svg_ will automatically look for an SVG in the MIME bundle (if the method exists etc).

My main problem now is how to test all this. The tests for embedding were quite straightforward as one can simply check the resulting HTML. However, I will need to look more into how to possibly test these ones.

@oscargus oscargus changed the title ENH: Add support for capturing svg, png, and jpeg ENH: Add support for capturing svg, png, and jpeg and _repr_mimebundle_ May 19, 2023
@oscargus
Copy link
Contributor Author

oscargus commented May 19, 2023

The content of _repr_mimebundle_ takes precedence if it contains the correct type. This makes sense to avoid generating the same content twice. Also:

https://github.com/scikit-learn/scikit-learn/blob/86541f2b3bc8a96264e265cb810cf79858544340/sklearn/base.py#L636-L638

It is hard to say how many more packages can be supported by this. A quick search got ~1200 hits for _repr_html_ compared to ~200 for _repr_svg_ and ~150 for _repr_mimebundle_.

However, as I personally rely on SymPy and GraphViz quite a bit, it is a real game changer for me. :-)

@oscargus
Copy link
Contributor Author

Another issue, I realize from the tests, is that SVGs are not supported by LaTeX out of the box. Of course it is a bit expected, but I wonder if one should add support for converting SVGs to PDFs? Or mention, e.g., https://pypi.org/project/sphinxcontrib-svg2pdfconverter/ in the documentation?

@oscargus oscargus force-pushed the repr_svg_png branch 7 times, most recently from 6c1901e to 243069d Compare May 19, 2023 14:26
@oscargus
Copy link
Contributor Author

oscargus commented May 19, 2023

Managed to get thumbnails etc working...

(Edit: well, not me, but calling the right function and voilà.)

@larsoner
Copy link
Contributor

Error on CircleCI:

! LaTeX Error: Unknown graphics extension: .svg.

One way to avoid this would be to wrap the SVG output in .. only:: html. But maybe we should use the sphinxcontrib- package you recommend instead? It would be okay to add this to the test requirements file (assuming the module is well maintained)

@larsoner
Copy link
Contributor

... but I should add that the code otherwise looks good, and this output shows things nicely:

https://output.circle-artifacts.com/output/job/776959c5-ed38-4544-be3a-1e6a9930c298/artifacts/0/tiny_html/auto_examples/plot_mime_bundle.html

So once we can make CIs happy we should be good

@oscargus oscargus force-pushed the repr_svg_png branch 3 times, most recently from 246a7a1 to 1d282e5 Compare May 23, 2023 10:59
@oscargus
Copy link
Contributor Author

oscargus commented May 23, 2023

I've tried to make it work, but to no avail. My theory is that the svg2pdf converter is not working since the sg image is an imgsgnode and not an img at that stage. Hence, I tried with using .. image:: rather than .. sg-image::. Then, the thumbnails still works, but no image as the rst-code is kept as is. So while the build nowt passes, it is not correct.

@oscargus oscargus force-pushed the repr_svg_png branch 4 times, most recently from b04e172 to 989934d Compare May 23, 2023 11:47
@oscargus
Copy link
Contributor Author

This is the current output and source to the test errors:

https://output.circle-artifacts.com/output/job/ee02670a-3e05-444c-9029-0b0c7b60c191/artifacts/0/tiny_html/auto_examples/plot_mime_bundle.html

As seen, the .. image:: is interpreted as a continuation of the code block. I have (currently) no idea how/where to get around this, so any ideas are appreciated.

@larsoner
Copy link
Contributor

I've tried to make it work, but to no avail. My theory is that the svg2pdf converter is not working since the sg image is an imgsgnode and not an img at that stage. Hence, I tried with using .. image:: rather than .. sg-image::. Then, the thumbnails still works, but no image as the rst-code is kept as is. So while the build nowt passes, it is not correct.

For now then let's just detect if it's SVG and wrap it in a .. only:: html block. If it's an issue with img-sg or whatever then it's affecting more than just this PR, so we can/should tackle it separately.

@@ -353,7 +353,7 @@ def save_figures(block, block_vars, gallery_conf):
return all_rst


def figure_rst(figure_list, sources_dir, fig_titles='', srcsetpaths=None):
def figure_rst(figure_list, sources_dir, fig_titles='', srcsetpaths=None, sg_image=True):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove sg_image for now

@larsoner
Copy link
Contributor

As seen, the .. image:: is interpreted as a continuation of the code block. I have (currently) no idea how/where to get around this, so any ideas are appreciated.

Locally you could check this, but let's use CircleCI instead. The RST files are actually saved to disk and you can look:

You can see the problematic line:

 .. image:: /auto_examples/images/sphx_glr_plot_mime_bundle_001.svg

In particular note the leading space

@oscargus oscargus force-pushed the repr_svg_png branch 3 times, most recently from 9fe1bd4 to d044ca4 Compare June 3, 2023 11:11
@oscargus
Copy link
Contributor Author

oscargus commented Jun 3, 2023

I got it fully working using .. sg-image:: for SVG. There is a special template for SVGs without srcset so .. image-sg:: will directly use the parent method here:

if node['srcset'] is None:
self.visit_image(node)
return

Had to install pipwin and from there cairocffi to get the Windows tests passing. But now it seems to work properly:
https://output.circle-artifacts.com/output/job/9efaf816-edc7-46d7-b119-4c6b4f319928/artifacts/0/tiny_html/auto_examples/index.html

Not sure if this is an acceptable solution for now or if it may break something else?

Copy link
Contributor

@larsoner larsoner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few minor comments otherwise LGTM!

@@ -353,7 +353,8 @@ def save_figures(block, block_vars, gallery_conf):
return all_rst


def figure_rst(figure_list, sources_dir, fig_titles='', srcsetpaths=None):
def figure_rst(figure_list, sources_dir, fig_titles='', srcsetpaths=None,
svg_image=False):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might as well add these because it's easy

Suggested change
svg_image=False):
*, svg_image=False):

excluded_mime_types = {MIME_MAPPING[repr] for repr in
MIME_MAPPING_REPR.difference(capture_repr_set)}
mimebundle = {}
if included_mime_types and hasattr(___, '_repr_mimebundle_'):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This hasattr is unnecessary, the try below would catch the AttributeError below anyway. Might as well remove it

Suggested change
if included_mime_types and hasattr(___, '_repr_mimebundle_'):
if included_mime_types:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you are correct. I think the idea was to try to catch a slightly more narrow class of exceptions, where AttributeError wasn't one of them, but never managed to figure out what a suitable subset might be.

Comment on lines +502 to +507
# Used for SVGs
SVG_IMAGE = """
.. image:: /%s
:alt: %s
:class: sphx-glr-single-img
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DRY / this appears to be a duplicate of SINGLE_IMAGE below, it would be better just to use that directly (and fix the leading space in it), or below it write

SVG_IMAGE = SINGLE_IMAGE  # alias

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except for that initial space. Since it said it was kept for backwards compatibility I wasn't sure if the space should still be there to not break anything. But maybe the space is always an issue?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think the space is a bug and can safely be removed

Copy link
Contributor

@larsoner larsoner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few minor comments otherwise LGTM!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support more _repr_*_
3 participants