Export a png #23

Closed
harrylojames opened this issue Oct 20, 2023 · 11 comments
Labels
enhancement New feature or request

Comments

@harrylojames
Contributor

It would be great to be able to export a PNG that includes the title, subtitle and figcaption, as is the case when using the three dots to export a PNG here.

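from pyobsplot import Obsplot, Plot

op = Obsplot(renderer="jsdom")

# Requested behaviour: write the plot, including its title, to a PNG file
op(
    {
        "marks": [
            Plot.tickX([0, 5, 10, 15])
        ],
        "title": "Title test",
    },
    path="test.png"
)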

Split out from this #21

@juba
Owner

juba commented Oct 20, 2023

PNG export would indeed be really useful, but it is much harder to implement than SVG export, because you need a complete HTML renderer (such as a browser) to convert to a bitmap image file.

Maybe I'll try to see how other libraries implement this.

@harrylojames
Contributor Author

Observable uses this. Happy to try adding a function to pyobsplot-js if you think that would make sense.

@juba
Owner

juba commented Dec 5, 2023

Oh, I didn't know about this, it seems really promising! Where did you see that Observable uses it?

If you want to try to integrate it in pyobsplot, of course don't hesitate. Maybe the difficulty will be to make it work in both the jsdom and widget renderers, but it is definitely worth a try.

@juba added the enhancement label Dec 5, 2023
@harrylojames
Contributor Author

Great - I'll give that a go. It was mentioned in a discussion here.

@harrylojames
Contributor Author

Unfortunately, the limitation you described also applies to html-to-image: "it is much harder to implement than SVG export, because you need a complete HTML renderer (such as a browser) to convert to a bitmap image file." In Observable notebooks the figure has already been rendered in the browser, so exporting to PNG via html-to-image is possible there.

I've had a look at what other visualisation libraries do, and they all appear to take a similar approach, as described here. A simple approach that worked but proved limited was using the Python package html2image. The issue I had was getting the correct width and height. If the HTML contains just an SVG tag, there's no issue: we can grab its width and height attributes. However, if there is a title, legend, caption, etc., then the SVG's height is smaller than the full figure and the exported image gets cut off.
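To illustrate the sizing problem, here is a minimal sketch that reads the `width` and `height` attributes of each `<svg>` tag using only the standard library. The markup and the `svg_sizes` helper are made up for illustration and are not pyobsplot or html2image code.

```python
from html.parser import HTMLParser


class SvgSizeParser(HTMLParser):
    """Collect the width/height attributes of every <svg> start tag."""

    def __init__(self):
        super().__init__()
        self.sizes = []

    def handle_starttag(self, tag, attrs):
        if tag == "svg":
            a = dict(attrs)
            self.sizes.append((int(a["width"]), int(a["height"])))


def svg_sizes(html):
    """Return a list of (width, height) tuples, one per <svg> tag."""
    parser = SvgSizeParser()
    parser.feed(html)
    return parser.sizes


# Hypothetical markup: a bare SVG, and the same SVG wrapped in a figure
bare = '<svg width="640" height="400"></svg>'
wrapped = (
    "<figure><h2>Title test</h2>"
    '<svg width="640" height="400"></svg>'
    "<figcaption>A caption</figcaption></figure>"
)

print(svg_sizes(bare))     # the SVG size is the size of the whole image
print(svg_sizes(wrapped))  # same numbers, although title and caption need extra height
```

Both calls report the same size, which is why a screenshot sized from the SVG attributes alone ends up too short once a title or caption is present.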

Bokeh has an implementation that determines the width and height dynamically, which we could try to borrow when the figure is an HTML object (export_to_png and dynamically estimate width & height)?

Not ideal: there would also need to be an additional docs section similar to this.

Let me know what you think.

@wirhabenzeit
Contributor

To include Observable plots in a scientific paper, I also needed to export to PNG, PDF or SVG (a single SVG including title, caption, legend, etc.). For my needs I got the best results using typst; see a few examples in my repo. This approach worked much better for me than anything I tried using headless browser exports. The resulting figures have tight, configurable margins without cutting off the title or caption. Supported formats are pdf, png, svg and typ.tar.gz (a compressed directory of the .typ file with the individual SVGs, mainly for debugging purposes).

The entire code for the export logic is in obsplot.py; the additional external dependencies are typst and bs4, both available from pip. The process is basically exporting to HTML and parsing the output. Since Observable exports discrete legends as multiple separate SVGs, this step requires a little extra logic. The export is quite fast, thanks to the speed of typst.
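The parsing step could look roughly like the sketch below. It uses the standard library's `xml.etree.ElementTree` instead of bs4 so that it runs without extra dependencies. The `FIGURE` markup is a simplified, hypothetical stand-in for real Plot output (no attributes beyond width and height, no SVG namespaces), and `split_figure` is an illustrative name.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup. Real output carries more attributes and
# SVG namespaces, which this sketch does not handle.
FIGURE = """
<figure>
  <h2>Title</h2>
  <div>
    <span><svg width="15" height="15"></svg>A</span>
    <span><svg width="15" height="15"></svg>B</span>
  </div>
  <svg width="640" height="400"></svg>
  <figcaption>Caption</figcaption>
</figure>
"""


def split_figure(html):
    """Separate a figure into title, legend swatches, plots and caption."""
    figure = ET.fromstring(html.strip())
    legends = []
    for div in figure.findall("div"):  # direct <div> children hold legends
        # Each swatch is an <svg> followed by its label (the element's tail text)
        legends.append(
            [(svg.get("width"), (svg.tail or "").strip()) for svg in div.iter("svg")]
        )
    # Direct <svg> children of the figure are the plots themselves
    plots = [(svg.get("width"), svg.get("height")) for svg in figure.findall("svg")]
    return {
        "title": figure.findtext("h2"),
        "legends": legends,
        "plots": plots,
        "caption": figure.findtext("figcaption"),
    }


parts = split_figure(FIGURE)
print(parts)
```

Keeping the legend swatches separate from the top-level plots is what makes it possible to lay them out individually later on.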

Of course, such a reverse-engineering approach might not be ideal for everyone. In particular, my script probably does not yet cover all Observable features (title, subtitle, caption, discrete legends and continuous legends are supported, but something else may be missing). Themes are also not supported so far, although that would be an easy addition. If there is interest here despite these drawbacks, I'd be happy to open a pull request.

@juba
Owner

juba commented Apr 9, 2024

That is very interesting, I didn't know that typst could do this type of thing and that it could be installed directly via pip.

I'll take a look at your code, many thanks for taking the time to describe the way you did this and to share your code.

@wirhabenzeit
Contributor

I added this functionality to obsplot.py with the code below, which is largely copied from ObsplotJsdomCreator. To get the raw output from ObsplotJsdom.plot, I had to redefine this method as def plot(self, raw=False) -> Union[str, SVG, HTML], where raw=True just returns a string. There is probably a better way to do this without copying the ObsplotJsdomCreator code.

Another issue I could not solve is making this work with Plot.auto. For the parsing to work, it is important to always pass the figure: True option to Observable Plot, which guarantees that the resulting plot is wrapped in a <figure>. I tried to do so with

if "figure" not in spec:
    spec["figure"] = True

but this fails for Plot.auto.
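One conceivable workaround, sketched here purely as an assumption and not tested against pyobsplot or Plot.auto, would be to leave the spec untouched and wrap the returned markup on the Python side whenever it is not already a `<figure>`. The `ensure_figure` helper is hypothetical.

```python
def ensure_figure(html: str) -> str:
    """Wrap the markup in a <figure> element unless it already starts with one.

    Hypothetical helper: it only inspects the start of the string and does
    not validate the markup in any way.
    """
    stripped = html.strip()
    if stripped.startswith("<figure"):
        return stripped
    return f"<figure>{stripped}</figure>"


print(ensure_figure('<svg width="640" height="400"></svg>'))
print(ensure_figure("<figure><svg></svg></figure>"))
```

With this, the parsing code could always look for a top-level `<figure>`, whether or not the figure option was accepted.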

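# Extra imports needed by this class, on top of those already present in
# obsplot.py for ObsplotJsdomCreator (os, shutil, signal, subprocess, typing)
import tempfile
from pathlib import Path

import typst
from bs4 import BeautifulSoup
from IPython.display import Image, display


class ObsplotTypstCreator(ObsplotCreator):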
    """
    Typst renderer Creator class.
    """

    def __init__(
        self,
        theme: str = DEFAULT_THEME,
        default: Optional[dict] = None,
        debug: bool = False,  # noqa: FBT001, FBT002
    ) -> None:
        super().__init__(theme, default, debug)
        self._proc = None
        self.start_server()

    def __call__(self, *args, dpi: int = 100, margin: int = 4, font_size: int = 12, font: str = "SF Pro Display", **kwargs) -> None:
        """
        Method called when an instance is called.
        """
        if self._proc is not None and self._proc.poll() is not None:
            msg = "Server has ended, please recreate your plot generator object."
            raise RuntimeError(msg)
        path = None
        if "path" in kwargs:
            path = kwargs["path"]
            del kwargs["path"]
        spec = self.get_spec(*args, **kwargs)
        if "figure" not in spec:
            spec["figure"] = True
        res = ObsplotJsdom(
            spec,
            port=self._port,
            theme=self._theme,
            default=self._default,
            debug=self._debug,
        ).plot(raw=True)
        if path is None:
            with tempfile.NamedTemporaryFile(suffix=".png") as f:
                self.render_typst(res, f.name, margin, font_size, font, dpi)
                return display(Image(filename=f.name))
        else:
            self.render_typst(res, path, margin, font_size, font, dpi)

    def start_server(self):
        """
        Start http node plot generator server.
        """
        if self._proc is not None:
            if self._proc.poll() is None:
                # If proc already running, do nothing
                return
        # Check for node executable
        npx = shutil.which("npx")
        if not npx:
            msg = "npx executable has not been found."
            raise RuntimeError(msg)
        # Run node script with JSON spec as input
        try:
            p = Popen(
                ["npx", f"pyobsplot@{MIN_NPM_VERSION}"],  # noqa: S607
                stdin=None,
                stdout=PIPE,
                stderr=PIPE,
                encoding="Utf8",
                # Use shell=True if we are on Windows. Otherwise PATH
                # is not parsed and npx is not found.
                shell=os.name == "nt",  # noqa: S603
                start_new_session=True,
            )
        except SubprocessError as e:
            # Popen itself failed, so `p` is unbound and has no stderr to read
            msg = f"Can't start server: {e}"
            raise RuntimeError(msg) from e
        # read back OS selected port from stdout
        try:
            port = p.stdout.readline()  # type: ignore
            self._port = int(port.strip())
        except ValueError as e:
            err = p.stderr.read()  # type: ignore
            msg = f"Server not started: {err}"
            # Chain from the caught exception, not from the ValueError class
            raise ValueError(msg) from e
        # store Popen process
        self._proc = p

    def close(self):
        """
        Stop http node plot generator server.
        """
        if self._proc is not None:
            os.killpg(os.getpgid(self._proc.pid), signal.SIGTERM)

    @staticmethod
    def shift_svg(svg):
        soup = BeautifulSoup(str(svg), "xml")
        svg = soup.svg
        if "viewBox" in svg.attrs:
            x, y, width, height = map(int, svg.attrs["viewBox"].split())
            if x != 0 or y != 0:
                g = soup.new_tag("g", transform=f"translate({-x}, {-y})")
                g.extend(svg.contents)
                svg.clear()
                svg.append(g)
                svg.attrs["viewBox"] = f"0 0 {width} {height}"
        return str(svg)

    def render_typst(self, html: str, path: str, margin: int, font_size: int, font: str, dpi: int) -> None:
        """
        Render HTML to svg/pdf/png using jsdom+typst.

        Args:
            html (str): HTML content.
            path (str): path to output file.
        """
        path_obj = Path(path)
        ext = "".join(path_obj.suffixes)
        stem = str(path_obj.name).removesuffix("".join(path_obj.suffixes))

        with tempfile.TemporaryDirectory() as tmpdirname:
            soup = BeautifulSoup(html, "xml")
            figure = soup.find("figure", recursive=False)
            swatches = []
            plots = []
            for i, swatch in enumerate(figure.find_all("div", recursive=False)):
                new_swatch = []
                for j, svg in enumerate(swatch.find_all("svg", recursive=True)):
                    with open(f"{tmpdirname}/{stem}_{i}_{j}.svg", "w") as f:
                        f.write(ObsplotTypstCreator.shift_svg(str(svg)))
                    new_swatch.append(
                        {"file": f"{stem}_{i}_{j}.svg", "width": svg.attrs["width"], "height": svg.attrs["height"], "text": svg.next_sibling}
                    )
                swatches.append(new_swatch)
            for i, svg in enumerate(figure.find_all("svg", recursive=False)):
                with open(f"{tmpdirname}/{stem}_{i}.svg", "w") as f:
                    f.write(ObsplotTypstCreator.shift_svg(str(svg)))
                plots.append({"file": f"{stem}_{i}.svg", "width": svg.attrs["width"], "height": svg.attrs["height"]})
            max_width = max(int(svg["width"]) for svg in plots)
            typeset = (
                f'#set text(\nfont: "{font}",\nsize: {font_size}pt\n)\n'
                + f"#set page(\nwidth: {max_width+2*margin}pt,\nheight: auto,\nmargin: (x: {margin}pt, y: {margin}pt),\n)\n"
            )
            if title := figure.find("h2"):
                typeset += f"= {title.text}"
            if subtitle := figure.find("h3"):
                typeset += f"\n{subtitle.text}"
            typeset += "\n\n"
            for swatch in swatches:
                typeset += "#{\nset align(horizon)\nstack(\n  dir: ltr,\n  spacing: 10pt,\n"
                for el in swatch:
                    typeset += f'  image("{el["file"]}", width: {el["width"]}pt),\n'
                    typeset += f'  "{el["text"]}",\n'
                typeset += ")}\n\n"
            typeset += "#v(-10pt)\n".join([f'#image("{plot["file"]}", width: {plot["width"]}pt)\n' for plot in plots])

            if caption := figure.find("figcaption"):
                typeset += f"\n{caption.text}"

            with open(f"{tmpdirname}/{stem}.typ", "w") as f:
                f.write(typeset)

            typst.compile(f"{tmpdirname}/{stem}.typ", output=path, ppi=dpi, format=ext[1:])
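As a rough, standalone illustration of the markup that render_typst assembles, the sketch below builds a Typst document string for a title, a list of SVG files and a caption. `build_typst` is a hypothetical simplification (no legends, no font selection), not the code above.

```python
def build_typst(plots, title=None, caption=None, margin=4, font_size=12):
    """Return Typst markup that stacks SVG images under an optional title.

    `plots` is a list of dicts with "file" (path to an SVG) and "width"
    (in points). Title and caption are inserted verbatim, so characters
    that are special in Typst (such as # or *) are not escaped here.
    """
    page_width = max(p["width"] for p in plots) + 2 * margin
    blocks = [
        f"#set text(size: {font_size}pt)",
        f"#set page(width: {page_width}pt, height: auto, margin: {margin}pt)",
    ]
    if title:
        blocks.append(f"= {title}")
    blocks += [f'#image("{p["file"]}", width: {p["width"]}pt)' for p in plots]
    if caption:
        blocks.append(caption)
    return "\n\n".join(blocks) + "\n"


markup = build_typst(
    [{"file": "plot_0.svg", "width": 640}],
    title="Title test",
    caption="A caption",
)
print(markup)
```

Setting the page height to auto lets Typst size the page to its content, which is what avoids the cut-off titles and captions seen with fixed-size screenshots. The resulting string can be written to a .typ file next to the SVGs and compiled with typst.compile, as in the last line of the class above.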

@harrylojames
Contributor Author

Very excited about the prospect of this addition and would love to see a pull request! Are there any blockers to making it happen? If we aren't sure yet that it supports all permutations, would a short-term way forward be to mark the functionality as experimental?

I'm going to try the code above and will report back if something doesn't work.

@juba
Owner

juba commented May 5, 2024

Sorry, I've been busy and haven't worked much on pyobsplot recently, but I will definitely take a look at this. I just realised that py-typst doesn't even require installing typst separately, so it could be very handy.

@juba
Owner

juba commented May 24, 2024

Experimental PNG support added to development version.

@juba closed this as completed May 24, 2024