In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
from IPython.display import display

from penai.hierarchy_generation.inference import HierarchyInferencer
from penai.hierarchy_generation.vis import (
    InteractiveHTMLHierarchyVisualizer,
    InteractiveSVGHierarchyVisualizer,
)
from penai.llm.llm_model import RegisteredLLM
from penai.registries.projects import SavedPenpotProject
from penai.utils.ipython import IFrameFromSrc

# Hierarchy Generation

This notebook demonstrates how to use vision language models (VLMs) to automatically infer a hierarchy of shapes, together with semantic shape descriptions, for a Penpot project.

First, we load an example project and select a frame (also called a board) from one of its pages for hierarchy generation. The current approach operates at the frame/board level for two reasons: it limits the number of shapes in a single prompt, and boards in Penpot typically act as logical separations of sub-designs within a page, so they serve as a natural point of reference for the LLM.

Note that hierarchy generation works better for some files and designs than for others. If a design has a clear, inherently hierarchical structure, the inference algorithm usually transfers this visual information into a formal structure well. For designs with little inherent structure, e.g. a grid of icons, the generated hierarchies may be flat or carry little information.
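To make "flat" concrete, the following minimal sketch measures the nesting depth of a hierarchy. It assumes the hierarchy is available as plain nested dictionaries with `id`, `description`, and `children` keys, which is the shape of the JSON response shown further below. The function name `hierarchy_depth` and both example trees are hypothetical and not part of `penai`.

```python
def hierarchy_depth(node):
    """Return the number of levels in the tree rooted at `node` (a leaf has depth 1)."""
    children = node.get("children", [])
    if not children:
        return 1
    return 1 + max(hierarchy_depth(child) for child in children)


# Hypothetical example: a grid of icons yields a flat tree (root plus leaves only).
icon_grid = {
    "id": "1",
    "description": "Icon grid",
    "children": [
        {"id": "2", "description": "Icon", "children": []},
        {"id": "3", "description": "Icon", "children": []},
        {"id": "4", "description": "Icon", "children": []},
    ],
}

# Hypothetical example: a card with a grouped header yields a deeper tree.
card = {
    "id": "1",
    "description": "Card",
    "children": [
        {
            "id": "2",
            "description": "Header group",
            "children": [{"id": "3", "description": "Title text", "children": []}],
        },
    ],
}

print(hierarchy_depth(icon_grid))  # 2
print(hierarchy_depth(card))  # 3
```

A depth of 2 means every shape hangs directly below the root, which is the flat case described above.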

In [4]:
project = SavedPenpotProject.MATERIAL_DESIGN_3.load(pull=True)
cover_page = project.get_main_file().get_page_by_name("Cover")

Scanning remote paths in penpot/data/raw/designs/Material Design 3: 100%|██████████| 36/36 [00:00<00:00, 532.68it/s]
force pulling (bytes): 0it [00:00, ?it/s]


Next, we perform two important preparation steps: removing invisible elements and deriving bounding boxes. The first is necessary because invisible shapes, such as pure group elements that do not correspond to any visible content, cannot be recognized visually by the VLM. The second is needed to construct "snippets" of the rendered elements, each of which is provided separately to guide the hierarchy generation.

In [5]:
cover_page.svg.remove_elements_with_no_visible_content()
cover_page.svg.retrieve_and_set_view_boxes_for_shape_elements()

Setting view boxes: 100%|██████████| 163/163 [00:03<00:00, 52.87it/s] 


Finally, we retrieve the "Cover" board, which is the only frame in this document and covers the whole page.

In [6]:
cover_frame = cover_page.svg.get_shape_by_name("Cover")

To perform the hierarchy generation, we instantiate a `HierarchyInferencer` object with an LLM of our choice (the cell below relies on the default) and pass the prepared shape to its `infer_shape_hierarchy()` method:

In [7]:
hierarchy_inference = HierarchyInferencer()
hierarchy = hierarchy_inference.infer_shape_hierarchy(cover_frame)

70it [00:14,  4.94it/s]
Scanning remote paths in penpot/data/cache/llm_responses_cache.local.sqlite: : 0it [00:00, ?it/s]
No files found in remote storage under path: data/cache/llm_responses_cache.local.sqlite
pulling (bytes): 0it [00:00, ?it/s]


```json
{
  "id": "1",
  "description": "Main container rectangle",
  "children": [
    {
      "id": "2",
      "description": "Footer text displaying version information",
      "children": []
    },
    {
      "id": "3",
      "description": "Left sidebar circle container",
      "children": [
        {
          "id": "42",
          "description": "Conversion statistics rectangle",
          "children": [
            {
              "id": "46",
              "description": "Conversion label text",
              "children": []
            },
            {
              "id": "45",
              "description": "Conversion value text",
              "children": []
            },
            {
              "id": "44",
              "description": "Conversion target text",
              "children": []
            },
            {
              "id": "43",
              "description": "Conversion bar chart path",
              "children": []
            }
          ]
        },
```

If the cell above finishes without errors, the hierarchy has been derived successfully. The underlying code validates the model response: it checks that the response format is correct (i.e. syntactically valid JSON) and that the generated hierarchy is valid, i.e. every shape is covered and no shape appears more than once.
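The checks described above can be illustrated with a minimal sketch. This is not the validation code used by `HierarchyInferencer`. It only assumes that the response is a JSON tree with `id` and `children` keys, as in the output above. The names `collect_ids` and `validate_hierarchy` are hypothetical.

```python
import json
from collections import Counter


def collect_ids(node):
    """Return the IDs of `node` and all of its descendants, depth first."""
    ids = [node["id"]]
    for child in node.get("children", []):
        ids.extend(collect_ids(child))
    return ids


def validate_hierarchy(response_text, expected_ids):
    """Parse the response and raise ValueError if it is not a valid hierarchy.

    `expected_ids` is the set of shape IDs that must each appear exactly once.
    """
    try:
        root = json.loads(response_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Response is not valid JSON: {exc}") from exc

    counts = Counter(collect_ids(root))
    duplicates = sorted(shape_id for shape_id, n in counts.items() if n > 1)
    if duplicates:
        raise ValueError(f"Duplicate shape IDs: {duplicates}")

    missing = sorted(set(expected_ids) - set(counts))
    if missing:
        raise ValueError(f"Shapes not covered by the hierarchy: {missing}")

    unknown = sorted(set(counts) - set(expected_ids))
    if unknown:
        raise ValueError(f"Unknown shape IDs in the hierarchy: {unknown}")

    return root


# Hypothetical response covering exactly the two expected shapes.
response = (
    '{"id": "1", "description": "Main container", "children": ['
    '{"id": "2", "description": "Footer text", "children": []}]}'
)
root = validate_hierarchy(response, expected_ids={"1", "2"})
print(root["description"])  # Main container
```

Raising an exception on the first violation mirrors the behavior described above, where a cell that finishes without errors implies a valid hierarchy.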

Finally, we can use the `InteractiveSVGHierarchyVisualizer` utility class to visualize the generated hierarchy interactively within this notebook:

In [None]:
hierarchy_svg_visualizer = InteractiveSVGHierarchyVisualizer(hierarchy, cover_frame)
# display(SVG(data=hierarchy_svg_visualizer.svg.to_string()))

In [53]:
hierarchy_html_visualizer = InteractiveHTMLHierarchyVisualizer(
    hierarchy, svg=hierarchy_svg_visualizer.svg
)
display(IFrameFromSrc(hierarchy_html_visualizer.html_content, width=1200, height=900))