In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
from IPython.display import display

from penai.hierarchy_generation.inference import HierarchyInferencer
from penai.hierarchy_generation.utils import (
    InteractiveHTMLHierarchyVisualizer,
    InteractiveSVGHierarchyVisualizer,
)
from penai.llm.llm_model import RegisteredLLM
from penai.registries.projects import SavedPenpotProject
from penai.utils.ipython import IFrameFromSrc
from penai.utils.vis import (
    DesignElementVisualizer,
    ShapeHighlighter,
)

# Hierarchy Generation

In this notebook we will demonstrate how to automatically infer a hierarchy of shapes with semantic shape descriptions for a Penpot project with vision language models (VLMs).

First, we will load an example project and select a frame (also called a board) from a page for hierarchy generation. The current approach works at the frame/board level for two reasons: it reduces the number of shapes in a single prompt, and boards in Penpot typically act as logical separations of sub-designs within a single page, so they can serve as a point of reference for the LLM.

Note that the hierarchy generation works better for some files and designs than for others. If a design inherently has a clear hierarchical structure, our inference algorithm will do a good job of transferring this visual information into a formal structure. In cases with little inherent structure, e.g. a grid of icons, the generated hierarchies might be flat or carry little information.

In [None]:
project = SavedPenpotProject.MATERIAL_DESIGN_3.load(pull=True)
cover_page = project.get_main_file().get_page_by_name("Cover")

Next, we perform two important steps: removal of invisible elements and bounding box derivation. The first is important because invisible shapes, such as pure group elements that don't correspond to any visible elements, can't be visually recognized by the VLM. The bounding box derivation is necessary to construct "snippets" of rendered elements, each of which is provided separately to guide the hierarchy generation.

In [None]:
cover_page.svg.remove_elements_with_no_visible_content()
cover_page.svg.retrieve_and_set_view_boxes_for_shape_elements()
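
To give an intuition for what these two steps do, here is a minimal, self-contained sketch using only the Python standard library. It is not penai's implementation, which may work quite differently: the names `prune_invisible` and `union_bbox`, the list of "visible" tags, and the `(x, y, width, height)` box format are all assumptions made for illustration. The sketch also ignores attributes such as `display="none"`.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Assumption: only these SVG tags count as "visible" content in this sketch.
VISIBLE_TAGS = {
    f"{{{SVG_NS}}}{name}"
    for name in ("rect", "circle", "ellipse", "line", "path",
                 "polygon", "polyline", "text", "image")
}


def prune_invisible(element):
    """Remove descendants without visible content.

    Returns True if `element` itself is visible or still has children left.
    """
    for child in list(element):
        if not prune_invisible(child):
            element.remove(child)
    return element.tag in VISIBLE_TAGS or len(element) > 0


def union_bbox(boxes):
    """Return the smallest (x, y, width, height) box containing all boxes."""
    x_min = min(x for x, _, _, _ in boxes)
    y_min = min(y for _, y, _, _ in boxes)
    x_max = max(x + w for x, _, w, _ in boxes)
    y_max = max(y + h for _, y, _, h in boxes)
    return (x_min, y_min, x_max - x_min, y_max - y_min)


svg_source = (
    f'<svg xmlns="{SVG_NS}">'
    '<g id="empty-group"><g id="nested-empty"/></g>'
    '<g id="button"><rect width="10" height="5"/></g>'
    "</svg>"
)
root = ET.fromstring(svg_source)
prune_invisible(root)

# Only the group that actually contains a visible shape survives.
remaining_ids = [g.get("id") for g in root.iter(f"{{{SVG_NS}}}g")]
print(remaining_ids)

# The bounding box of a group is the union of its children's boxes.
group_bbox = union_bbox([(0, 0, 10, 5), (5, 5, 10, 10)])
print(group_bbox)
```

A box like `group_bbox` could then be used as a view box to crop the rendering down to a single element, which is the idea behind the "snippets" mentioned above.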

Finally, we retrieve the "Cover" board, which is the only frame in this document and covers the whole page.

In [5]:
cover_frame = cover_page.svg.get_shape_by_name("Cover")

To perform the hierarchy generation, we first instantiate a `HierarchyInferencer` object with an LLM of our choice. In a later cell, the prepared shape is passed to its `infer_shape_hierarchy()` method (or to the more verbose `infer_shape_hierarchy_impl()` variant):

In [6]:
# The ShapeHighlighter is used to create annotated visualizations of single shapes
shape_visualizer = ShapeHighlighter()

# The DesignElementVisualizer is used to visualize a design element (e.g. a primitive, group, etc.) within
# its context in the design document.
# It will use the ShapeHighlighter to derive visualizations for the different shapes that make up the design element.
design_element_visualizer = DesignElementVisualizer(shape_visualizer=shape_visualizer)

hierarchy_inference = HierarchyInferencer(
    shape_visualizer=design_element_visualizer,
    model=RegisteredLLM.GPT4O
)

If the inference call in the cell below finishes without errors, the hierarchy has been derived successfully. The underlying code validates the AI response: it checks that the response format is correct (i.e. syntactically valid JSON) and that the generated hierarchy is valid, i.e. all shapes are covered and no shape appears more than once.
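
The kind of validation described here can be sketched as follows. This is an illustrative example only and not the check that penai performs: the JSON schema (nodes with `description`, an optional `shape_id`, and `children`) and the function names `collect_shape_ids` and `validate_hierarchy` are hypothetical.

```python
import json


def collect_shape_ids(node):
    """Collect all shape ids referenced in a (hypothetical) hierarchy node."""
    ids = [node["shape_id"]] if "shape_id" in node else []
    for child in node.get("children", []):
        ids.extend(collect_shape_ids(child))
    return ids


def validate_hierarchy(response_text, expected_ids):
    """Parse the response and check that it covers every shape exactly once."""
    try:
        hierarchy = json.loads(response_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"response is not valid JSON: {exc}") from exc

    found = collect_shape_ids(hierarchy)
    duplicates = sorted({i for i in found if found.count(i) > 1})
    missing = sorted(set(expected_ids) - set(found))
    unknown = sorted(set(found) - set(expected_ids))
    if duplicates or missing or unknown:
        raise ValueError(
            f"invalid hierarchy: duplicates={duplicates}, "
            f"missing={missing}, unknown={unknown}"
        )
    return hierarchy


# A well-formed response that covers both expected shapes exactly once.
response = json.dumps({
    "description": "Card",
    "children": [
        {"description": "Title text", "shape_id": "a"},
        {"description": "Background", "shape_id": "b"},
    ],
})
hierarchy = validate_hierarchy(response, {"a", "b"})
print(hierarchy["description"])
```

Raising an exception on any violation matches the behaviour described above: a call that returns normally implies a well-formed, complete hierarchy.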

We now run the inference. Afterwards, the `InteractiveSVGHierarchyVisualizer` utility class lets us visualize the generated hierarchy interactively within this notebook:

In [None]:
# We will use the infer_shape_hierarchy_impl() method as it provides all the meta-data
# for the prompt, including the used visualizations and the prompt itself.
output = hierarchy_inference.infer_shape_hierarchy_impl(cover_frame)

## Optional: Display Prompt

Run the following cell to display the prompt that was used to generate the hierarchy.

In [None]:
html = output.conversation.display_html()

In [8]:
hierarchy = output.hierarchy
hierarchy_svg_visualizer = InteractiveSVGHierarchyVisualizer(hierarchy, cover_frame)
# Uncomment to display the raw SVG directly (requires `from IPython.display import SVG`):
# display(SVG(data=hierarchy_svg_visualizer.svg.to_string()))

In [None]:
hierarchy_html_visualizer = InteractiveHTMLHierarchyVisualizer(
    hierarchy, svg=hierarchy_svg_visualizer.svg
)
display(IFrameFromSrc(hierarchy_html_visualizer.html_content, width=1200, height=900))