# SVG File Parsing for Bézier Curve Extraction

The goal of this section is to build a parser for our SVG files that makes it easier to extract the waypoints of each Bézier curve, so that we can use them to create a trajectory for our SketchBot.

## An Overview of SVG Files

SVG files describe vector graphics using paths. The paths are defined through the <path> element in SVG, and the 'd' attribute within the <path> element contains a string representing a series of commands and parameters that define the path.

The path data consists of commands such as:

M (Move To): Move the current point to a new position.
L (Line To): Draw a straight line from the current point to a new position.
C (Cubic Bézier Curve To): Draw a cubic Bézier curve.
Q (Quadratic Bézier Curve To): Draw a quadratic Bézier curve.
Z (Close Path): Close the current path by connecting the current point to the starting point.

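Each command family has several variants, and which ones appear depends on the SVG generator. Uppercase letters take absolute coordinates and lowercase letters take coordinates relative to the current point. H/h and V/v are horizontal and vertical line shortcuts, and S/s and T/t are the smooth forms of the cubic and quadratic curves, which reuse a reflected control point from the previous segment. The full set is listed below: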

MoveTo: M , m.
LineTo: L , l , H , h , V , v.
Cubic Bézier Curve: C , c , S , s.
Quadratic Bézier Curve: Q , q , T , t.
Elliptical Arc Curve: A , a.
ClosePath: Z , z.
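
To make the command structure concrete, here is a minimal sketch of splitting a `d` string into command letters and their numeric arguments. It is an illustration only, not the parser used later in this notebook; the names `tokenize_path`, `COMMAND_RE` and `NUMBER_RE` and the example string are hypothetical.

```python
import re

# One command letter followed by everything up to the next command letter.
# 'e'/'E' are not SVG commands, so exponents such as 1e-3 stay inside the arguments.
COMMAND_RE = re.compile(r"([MmLlHhVvCcSsQqTtAaZz])([^MmLlHhVvCcSsQqTtAaZz]*)")
# Signed decimal number with an optional exponent.
NUMBER_RE = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")


def tokenize_path(d):
    """Split a path 'd' string into (command letter, [numbers]) pairs."""
    tokens = []
    for letter, args in COMMAND_RE.findall(d):
        numbers = [float(n) for n in NUMBER_RE.findall(args)]
        tokens.append((letter, numbers))
    return tokens


# Hypothetical path: absolute MoveTo and cubic, relative LineTo, then ClosePath.
example = "M10,20C30,40 50,60 70,80l5-5Z"
for command, numbers in tokenize_path(example):
    # Z/z takes no coordinates, so the absolute/relative distinction does not matter for it.
    kind = "absolute" if command.isupper() else "relative"
    print(command, kind, numbers)
```

Note that `l5-5` is split into `5` and `-5`: SVG allows a sign to act as a separator, which is one reason a plain whitespace split is not enough for arbitrary files.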

## Our Parser

This notebook takes an SVG file as input and parses its path elements into a representation of the commands that makes it easier to extract control points and generate Bézier curves than the raw file format does. I wrote it because I could not find a Python library that parsed the SVG files we generated from the contours in a way that made every command easy to extract.

In [None]:
!pip install svg.path

In [4]:
#imports
import xml.etree.ElementTree as ET
import numpy as np
import matplotlib.pyplot as plt
from svg.path import parse_path

In [5]:

def parse_svg(svg_file):
    '''
    Input: SVG_file containing line drawing of object
    Output: all the path data in a list of strings containing the commands for each (preparsing)
    '''
    tree = ET.parse(svg_file)
    root = tree.getroot()

    # Define the SVG namespace
    svg_namespace = {'svg': 'http://www.w3.org/2000/svg'}

    # Find all path elements in the SVG
    path_elements = root.findall('.//svg:path', namespaces=svg_namespace)
    print(path_elements)

    all_path_data = []
    # Iterate through each path element and extract the 'd' attribute (path data)
    for path_element in path_elements:
        path_data = path_element.get('d')
        all_path_data.append(path_data)

    return all_path_data
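
The namespace mapping in `parse_svg` matters because ElementTree stores SVG tags under their full namespace URI. The sketch below shows this on a small in-memory document; the SVG text is a made-up example, and `ET.fromstring` is used only so the snippet does not need a file on disk.

```python
import xml.etree.ElementTree as ET

SVG_NS = {"svg": "http://www.w3.org/2000/svg"}

# Hypothetical document: one nested path, one top-level path, one path without 'd'.
svg_text = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<g><path d="M0,0L10,0"/></g>'
    '<path d="M5,5C6,6 7,7 8,8"/>'
    '<path/>'
    '</svg>'
)

root = ET.fromstring(svg_text)

# With the namespace prefix, every <path> is found, including nested ones.
with_namespace = [p.get("d") for p in root.findall(".//svg:path", SVG_NS)]

# Without it, the search matches nothing, because the real tag is '{namespace}path'.
without_namespace = root.findall(".//path")

# A <path> with no 'd' attribute yields None, so filter before string processing.
usable = [d for d in with_namespace if d is not None]

print(with_namespace)
print(without_namespace)
print(usable)
```

Filtering out `None` is a small safeguard: the reparsing functions below call string methods on each entry and would fail on a path element that has no `d` attribute.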

In [None]:
def generate_reparsed_path(path_data):
    '''
    USE THIS FUNCTION IF WORKING WITH SVG FILE FROM CONTOURS
    Input: 
    path_data (List): path data strings from the SVG file (output of parse_svg); normally just one string, but a list is supported
    
    Output: 
    reparsed_path_data (Dict): a single dictionary, accumulated over all input paths, with the following entries
        'path string' (String): the reparsed path with commands properly spaced, suitable for replotting with matplotlib
        'MoveTo commands' (List): all the individual MoveTo (M) and LineTo (L) commands
        'cubic commands' (List): all the individual cubic Bézier curve (C) commands
        'quadratic commands' (List): all the individual quadratic Bézier curve (Q) commands
    Only the absolute commands M, L, C and Q are extracted; all other commands are skipped.
    '''
    #list of all possible path commands 
    path_commands = ['M', 'm', 'L' , 'l' , 'H' , 'h' , 'V' , 'v', 'C', 'c', 'S', 's', 'Q', 'q', 'T', 't', 'A', 'a', 'Z', 'z']
    reparsed_path_data = {}

    re_parsed_path = ''
    move_to_commands = []
    cubic_commands = []
    quadratic_commands = []

    all_points = []

    for data in path_data:
        print(data)
        #due to the nature of the way SVG is formatted, we must reformat to be able to split better in python
        for command in path_commands:
            data = data.replace(command, ' '+command+' ') #to later parse and extract points from split on single character

        data = data.split()

        
        for i in range(len(data)): #iterate through the tokens and extract each command with its waypoints
            if data[i] in ('M', 'L'): #MoveTo or LineTo: one "x,y" token follows
                move_to_commands.append(data[i] + ' ' + data[i+1])
                re_parsed_path += data[i] + ' ' + data[i+1] + ' '
            
            elif data[i] == 'C': #cubic Bézier: three "x,y" tokens follow (two control points and the end point)
                cubic_commands.append(data[i] + ' ' + data[i+1] + ', ' + data[i+2] + ', ' + data[i+3])
                re_parsed_path += data[i] +' ' + data[i+1] + ',' + data[i+2] + ',' + data[i+3] + ' '
            
            elif data[i] == 'Q': #quadratic Bézier: two "x,y" tokens follow (one control point and the end point)
                quadratic_commands.append(data[i] + ' ' + data[i+1] + ', ' +  data[i+2])
                re_parsed_path += data[i] + ' ' + data[i+1] + ', ' +  data[i+2] + ' '
            #all other commands (relative forms, H, V, S, T, A, Z) are skipped
    reparsed_path_data = {'path string': re_parsed_path, 'MoveTo commands': move_to_commands, 'cubic commands':cubic_commands, 'quadratic commands': quadratic_commands}
    
    return reparsed_path_data



In [6]:
def generate_reparsed_path_SAT(path_data):
    '''
    USE THIS FUNCTION IF WORKING WITH SAT FILE FROM TS
    Input: 
    path_data (List): path data strings from the SVG file (output of parse_svg); normally just one string, but a list is supported
    
    Output (returned as a tuple): 
    reparsed_path_data (Dict): a single dictionary, accumulated over all input paths, with the following entries
        'path string' (String): the reparsed path with commands properly spaced, suitable for replotting with matplotlib
        'MoveTo commands' (List): all the individual MoveTo (M) and LineTo (L) commands
        'cubic commands' (List): all the individual cubic Bézier curve (C) commands
        'quadratic commands' (List): all the individual quadratic Bézier curve (Q) commands
    all_moveto_points (List): [x, y] pairs (as strings) for every MoveTo and LineTo command
    '''
    #list of all possible path commands 
    path_commands = ['M', 'm', 'L' , 'l' , 'H' , 'h' , 'V' , 'v', 'C', 'c', 'S', 's', 'Q', 'q', 'T', 't', 'A', 'a', 'Z', 'z']
    reparsed_path_data = {}

    re_parsed_path = ''
    move_to_commands = []
    cubic_commands = []
    quadratic_commands = []

    all_moveto_points = []

    for data in path_data:

        #due to the nature of the way SVG is formatted, we must reformat to be able to split better in python
        for command in path_commands:
            data = data.replace(command, ' '+command+' ') #to later parse and extract points from split on single character

        data = data.split()
        #print(data)
        
        for i in range(len(data)): #iterate through the path and extract each command and their corresponding waypoints
            if 'M' in data[i] or 'L' in data[i]: #single point if MoveTo or LineTo command
                move_to_commands.append(data[i] + ' ' + data[i+1] + ' ' + data[i+2])  ##changing because different output of svg formatting for SAT file
                re_parsed_path += data[i] + ' ' + data[i+1] + ' ' + data[i+2] + ' '
                all_moveto_points.append([data[i+1], data[i+2]])
            
            elif 'C' in data[i]: #extract 3 points since cubic bezier curve
                temp = ''
                for x in range(7):
                    temp += data[i+x] + ' '
                cubic_commands.append(temp)
                re_parsed_path += temp
            
            elif 'Q' in data[i]: #extract 2 points since quadratic bezier curve
                ##print('Q FOUND') for debugging
                temp = ''
                for x in range(5):
                    temp += data[i+x] + ' '
                quadratic_commands.append(temp)
                re_parsed_path += temp
            #elif 
    reparsed_path_data = {'path string': re_parsed_path, 'MoveTo commands': move_to_commands, 'cubic commands':cubic_commands, 'quadratic commands': quadratic_commands}
    
    return reparsed_path_data, all_moveto_points
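
Both functions return commands as strings, so a further step is needed to get numeric waypoints. The sketch below is one possible way to do that, assuming command strings shaped like the ones built above (comma-separated for the contour file, space-separated for the SAT file). The helper name `command_to_points` and the sample values are hypothetical.

```python
import re

# Signed decimal number with an optional exponent.
NUMBER_RE = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")


def command_to_points(command_string):
    """Return (command letter, [(x, y), ...]) for one reparsed command string."""
    letter = command_string.strip()[0]
    values = [float(v) for v in NUMBER_RE.findall(command_string)]
    if len(values) % 2 != 0:
        raise ValueError(f"odd number of coordinates in {command_string!r}")
    # Pair consecutive values as (x, y).
    points = list(zip(values[0::2], values[1::2]))
    return letter, points


# Contour-style formatting (as produced by generate_reparsed_path).
print(command_to_points("C 372.45,461.56, 369.0,459.78, 368.88,458.13"))
# SAT-style formatting (as produced by generate_reparsed_path_SAT).
print(command_to_points("C 344.1 353.7 344.2 353.1 344.3 352.9 "))
```

The points returned for a `C` or `Q` command are the control points and the end point only. The start point of each curve is the current point, i.e. the end point of the previous command, so it has to be carried over when the curve is evaluated.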


### Example with generated bunny SVG file 

In [None]:
#first we will load in the SVG file and extract the raw data from the path elements using parse_svg
path_data = parse_svg('out.svg')

#next we will generate the dictionaries containing all the commands and their associated waypoints
reparsed_data_original = generate_reparsed_path(path_data)
print('Path String:', reparsed_data_original['path string'])
print('MoveTo Commands', reparsed_data_original['MoveTo commands'])
print('Cubic Commands', reparsed_data_original['cubic commands'])
print('Quadratic Commands', reparsed_data_original['quadratic commands'])

[<Element '{http://www.w3.org/2000/svg}path' at 0x7f775e139a90>]
M377.0,462.3259077834862C372.4508362174215,461.5636073017164 369.00167639662396,459.780403309666 368.88151738159695,458.12867965644034C368.51124680613566,453.0388853747129 370.6172712188231,450.6830585972762 379.27175148277325,446.5060262950045C384.8767190179249,443.8008227780429 385.53202006121245,443.16589581475694 385.81996628246884,440.1614031934495C386.1768174094542,436.4379420094322 384.0503699835436,434.0858022378292 380.2785133808091,434.03180337187047C376.007841861576,433.9706633437067 367.0715037962185,429.1168070288138 359.12348218104216,422.54124862452863C352.34014186282104,416.9292545210606 350.7955272180784,415.06943029652734 349.0554333035657,410.4186125613211C346.60015520983853,403.8562938221238 346.3804322079433,394.61373112628195 348.5126138431632,387.5851228856037C349.33264473692043,384.8819404726857 349.8902738060185,382.5766698884931 349.75178955227,382.4622993651757C349.6133052985215,382.347928841858

### With the reparsed data, we can now replot the path

In the following cell, we regenerate the curves with matplotlib by sampling points along the path parsed by svg.path.

In [None]:
def plot_svg_path(svg_path_data, num_points=2000):
    path = parse_path(svg_path_data)
    ##print(path) for debugging
    points = [path.point(i / num_points) for i in range(num_points + 1)]
    
    x, y = zip(*[(point.real, point.imag) for point in points])
    
    plt.plot(x, y, color='blue')
    plt.xlabel('X')
    plt.ylabel('Y')
    plt.title('SVG Path')
    plt.show()

#here we can directly use the path string we generated above (reparsed_data_original comes from the bunny example cell)
svg_path_data = reparsed_data_original['path string']
plot_svg_path(svg_path_data)
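
The plotting cell relies on svg.path to evaluate the curves. For trajectory generation it can be useful to sample a single Bézier segment directly from its control points. The following is a minimal sketch using De Casteljau's algorithm, which works for both quadratic and cubic segments; the function names and the example control points are hypothetical.

```python
def de_casteljau(control_points, t):
    """Evaluate a Bézier curve of any degree at parameter t in [0, 1]."""
    points = list(control_points)
    # Repeatedly interpolate between neighbouring points until one point remains.
    while len(points) > 1:
        points = [
            ((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
            for (x0, y0), (x1, y1) in zip(points, points[1:])
        ]
    return points[0]


def sample_bezier(control_points, num_points=5):
    """Return num_points waypoints evenly spaced in t, including both end points."""
    if num_points < 2:
        raise ValueError("num_points must be at least 2")
    return [
        de_casteljau(control_points, i / (num_points - 1))
        for i in range(num_points)
    ]


# Hypothetical cubic: start point, two control points, end point.
cubic = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
waypoints = sample_bezier(cubic, num_points=5)
for point in waypoints:
    print(point)
```

The control point list must start with the current point (the end of the previous command), followed by the points stored in the `C` or `Q` command. Even spacing in `t` is not even spacing in arc length, so a robot following these waypoints at a constant rate will move faster along some parts of the curve than others.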


## Further Processing: Medial Axis Transform (MAT) / Symmetry Axis Transform (SAT)

### Aiming to simplify multiple curves in SVG path to one for Trajectory

The notion of skeleton was introduced by H. Blum as a result of the Medial Axis Transform (MAT) or Symmetry Axis Transform (SAT). The MAT determines the closest boundary point(s) for each point in an object. An inner point belongs to the skeleton if it has at least two closest boundary points.

The SAT is similar to the MAT (and indeed uses it as input) but is often more practical as it removes insignificant MAT branches based on the local image scale. The culling severity of branches is controlled by a parameter called s.
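
The skeleton criterion stated above (an inner point with at least two closest boundary points) can be illustrated on the simplest possible shape, an axis-aligned rectangle, where the closest boundary point on each edge is easy to compute. This is a toy sketch of the definition only; it is not the tool used to produce the SAT file in this notebook, it does not implement the SAT pruning, and the function names are hypothetical.

```python
def closest_edges(x, y, width, height, tol=1e-9):
    """Names of the rectangle edges nearest to the interior point (x, y).

    The rectangle spans [0, width] x [0, height].
    """
    distances = {"left": x, "right": width - x, "bottom": y, "top": height - y}
    nearest = min(distances.values())
    return sorted(name for name, d in distances.items() if abs(d - nearest) <= tol)


def on_medial_axis(x, y, width, height):
    """True if the interior point has at least two closest boundary points."""
    return len(closest_edges(x, y, width, height)) >= 2


# A 4 x 2 rectangle: the skeleton is a horizontal segment plus four diagonal branches.
for point in [(2.0, 1.0), (1.0, 1.0), (0.5, 0.5), (1.0, 0.5)]:
    print(point, closest_edges(*point, 4.0, 2.0), on_medial_axis(*point, 4.0, 2.0))
```

For a general vector shape the boundary is made of curves rather than four straight edges, so the nearest boundary points have to be found numerically, which is why a dedicated tool is used for the bunny.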

<img src="Screenshot 2023-12-04 at 12.32.58.png" width="" align="" />

<img src="Screenshot 2023-12-04 at 12.29.44.png" width="" align="" />

In the two renderings above, the first shows the original SVG contour image generated in the Contour Pipeline notebook. In the second, the grey silhouette is the original SVG shape and the blue curves represent the SAT.

Potentially helpful: https://stackoverflow.com/questions/29921826/how-do-i-calculate-the-medial-axis-for-a-2d-vector-shape

We can access the standalone generated SVG of the MAT drawing here

Let us now process the curves of this modified SVG:

In [8]:
path_data = parse_svg('extracted_sat.svg')
#generate_reparsed_path_SAT returns a tuple: index 0 is the dictionary of commands, index 1 is the list of MoveTo/LineTo points
reparsed_data = generate_reparsed_path_SAT(path_data)
print('Path String:', reparsed_data[0]['path string'])
print('MoveTo Commands', reparsed_data[0]['MoveTo commands'])
print('Cubic Commands', reparsed_data[0]['cubic commands'])
print('Quadratic Commands', reparsed_data[0]['quadratic commands'])

print("Move to points")
print(reparsed_data[1])

#svg_path_data = reparsed_data[0]['path string']
#plot_svg_path(svg_path_data)

08542572810643 353.8447804685775', 'M 344.1022613813328 353.75543827148414', 'M 344.2068688876273 353.17651287438906', 'M 344.23828413362185 352.9929957330003', 'M 344.3730314523948 352.1491413627606', 'M 344.3851741709509 352.0709281294025', 'M 344.4673936977529 351.53591256739884', 'L 344.54610718359834 351.01792998590463', 'M 344.54610718359834 351.01792998590463', 'M 344.6235305391345 350.5169111942921', 'M 344.635788459114 350.4386515181362', 'M 344.7705128122261 349.6014152797995', 'M 344.7766009654611 349.5646752206124', 'M 344.9063331911448 348.8075110528813', 'M 344.9296515529016 348.67693688450936', 'M 345.0456170865428 348.05429591494806', 'M 345.08535101381955 347.85145257282034', 'M 345.19087361915774 347.3391211153556', 'M 345.2461422413329 347.08597212719275', 'M 345.3448034061988 346.6593807592111', 'M 345.41418668333387 346.3781377164831', 'M 345.51035217819407 346.01259179082734', 'M 345.59133759125484 345.72532584341695', 'M 345.69075459929553 345.3965151902213', 'M 

## Comparing Pre-MAT and Post-MAT Skeletonization

<img src="/pre_skeletonization_plot.png" width="" align="" />

<img src="/post_skeletonization_plot.png" width="" align="" />


### An improved plot of our pre- and post-processed bunny, converted from SVG path to Matplotlib using svgpath2mpl

<img src="/pre_processed_bunny.png" width="" align="" />

<img src="/post_processed_bunny.png" width="" align="" />

In [None]:
### CODE TO GENERATE THE ABOVE PLOTS 
#better matplot plotting from SVG Path 
#taken from https://github.com/nvictus/svgpath2mpl
%matplotlib inline
import matplotlib as mpl
import matplotlib.pyplot as plt
from IPython.display import HTML, SVG
import numpy as np
from svgpath2mpl import parse_path

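#NOTE: this parse_path (from svgpath2mpl) shadows the svg.path parse_path imported earlier
svg_path_data_pre = reparsed_data_original['path string']
svg_path_data = reparsed_data[0]['path string'] #post-SAT path string; index 0 is the dictionary returned by generate_reparsed_path_SAT
path_post = parse_path(svg_path_data)
path_pre = parse_path(svg_path_data_pre)
#only path_post is drawn below; pass path_pre to PathPatch instead to get the pre-processed plot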

fig, ax = plt.subplots(nrows=1, ncols=1)
patch = mpl.patches.PathPatch(
    path_post, 
    edgecolor='blue', 
    linewidth=1)
patch.set_transform(ax.transData);
ax.add_patch(patch);
ax.set_aspect(1);
ax.set_xlim(300, 550);
ax.set_ylim(480, 300);
