# SVG → PDF 1:1 Vector Conversion (Batch)

This notebook converts every `.svg` in a given folder into a **vector** `.pdf` with the **same dimensions** and file base name (e.g., `a.svg → a.pdf`) saved **in the same folder**.

Two backends are supported:
- **CairoSVG** (Python library; it relies on the native Cairo library through `cairocffi`). Recommended for a simple, fully in-notebook workflow.
- **Inkscape** CLI. Useful if you already have Inkscape installed; robust with complex SVGs.

The notebook auto-selects a backend based on availability; set `BACKEND` below to override the choice.


In [1]:
from pathlib import Path
import sys, shutil, subprocess, traceback
import xml.etree.ElementTree as ET

# Check for CairoSVG. The import can also fail with OSError when the native
# Cairo library is missing, hence the broad except.
try:
    import cairosvg  # type: ignore
    HAVE_CAIROSVG = True
except Exception:
    HAVE_CAIROSVG = False

# Check for Inkscape binary
INKSCAPE = shutil.which("inkscape")

print(f"CairoSVG available: {HAVE_CAIROSVG}")
print(f"Inkscape on PATH: {INKSCAPE is not None} ({INKSCAPE if INKSCAPE else ''})")


CairoSVG available: True
Inkscape on PATH: False ()


## Choose backend

- Leave `BACKEND = "auto"` to auto-select: `cairosvg` if installed, otherwise `inkscape` if found.
- Or set explicitly to `"cairosvg"` or `"inkscape"`.


In [2]:
# Options: 'auto', 'cairosvg', or 'inkscape'
BACKEND = "auto"

def pick_backend():
    if BACKEND == "cairosvg":
        if not HAVE_CAIROSVG:
            raise RuntimeError("CairoSVG requested but not installed. Install with: pip install cairosvg")
        return "cairosvg"
    if BACKEND == "inkscape":
        if INKSCAPE is None:
            raise RuntimeError("Inkscape requested but not found on PATH. Install Inkscape or add to PATH.")
        return "inkscape"
    # auto
    if HAVE_CAIROSVG:
        return "cairosvg"
    if INKSCAPE is not None:
        return "inkscape"
    raise RuntimeError("No converter available. Install CairoSVG (pip install cairosvg) or Inkscape.")
    
ACTIVE_BACKEND = pick_backend()
print("Using backend:", ACTIVE_BACKEND)


Using backend: cairosvg


## Set the folder containing `.svg` files

Edit the `INPUT_DIR` line below. Use absolute paths or `~` for your home directory.


In [3]:
# EDIT THIS: path to the folder that contains .svg files
INPUT_DIR = Path('/Users/zhijiang/Desktop/University/Bio Lab/TSARnew/newFeatureDemo/demoImgs').expanduser()

# Safety checks
if not INPUT_DIR.exists():
    raise FileNotFoundError(f"INPUT_DIR does not exist: {INPUT_DIR}")
if not INPUT_DIR.is_dir():
    raise NotADirectoryError(f"INPUT_DIR is not a directory: {INPUT_DIR}")

print("Input directory:", INPUT_DIR)


Input directory: /Users/zhijiang/Desktop/University/Bio Lab/TSARnew/newFeatureDemo/demoImgs


## Utilities

We read basic SVG attributes to warn about missing `width`/`height` or `viewBox`, which can affect page extents.
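
Because `width` and `height` usually carry units, comparing them with the PDF page size means converting to points first. The following is a minimal sketch of that conversion, not part of the notebook's pipeline: the helper name `svg_length_to_pt` is hypothetical, and it assumes the CSS reference of 96 px per inch. Relative units such as `%` or `em` are reported as `None` because they cannot be resolved without more context.

```python
import re

# Absolute CSS units expressed in PDF points (1 pt = 1/72 in).
# Assumption: unitless values and "px" use 96 px per inch.
_PT_PER_UNIT = {
    "": 0.75, "px": 0.75, "pt": 1.0, "pc": 12.0,
    "in": 72.0, "cm": 72.0 / 2.54, "mm": 72.0 / 25.4,
}

# A number (optionally with an exponent) followed by an optional unit suffix.
_LENGTH_RE = re.compile(
    r"^\s*([+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?)\s*([a-zA-Z%]*)\s*$"
)

def svg_length_to_pt(value):
    """Return the length in PDF points, or None if absent, malformed, or relative."""
    if not value:
        return None
    match = _LENGTH_RE.match(value)
    if match is None:
        return None
    number, unit = float(match.group(1)), match.group(2).lower()
    factor = _PT_PER_UNIT.get(unit)
    return None if factor is None else number * factor
```

For example, `svg_length_to_pt("210mm")` gives roughly 595.28, the width of an A4 page in points.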


In [4]:
def svg_size_info(svg_path: Path):
    try:
        # Stream-parse and stop at the first start event, which is the root <svg>.
        # width/height/viewBox are un-namespaced attributes, so plain keys work.
        _, root = next(ET.iterparse(svg_path, events=("start",)))
        width = root.attrib.get("width")
        height = root.attrib.get("height")
        viewBox = root.attrib.get("viewBox")
        return width, height, viewBox
    except Exception:
        # Unreadable or malformed file: report every attribute as missing
        return None, None, None


## Converters

Two implementations that aim for **1:1 vector** output. Neither applies intentional rasterization or lossy compression.


In [5]:
def convert_with_cairosvg(svg_path: Path, pdf_path: Path):
    import cairosvg  # local import to ensure helpful error if missing
    # scale=1.0 keeps original size. Background None keeps transparency if present.
    cairosvg.svg2pdf(url=str(svg_path), write_to=str(pdf_path), scale=1.0, background_color=None)

def inkscape_version_string():
    if INKSCAPE is None: 
        return ""
    try:
        out = subprocess.run([INKSCAPE, "--version"], capture_output=True, text=True, check=False)
        return out.stdout.strip()
    except Exception:
        return ""

def convert_with_inkscape(svg_path: Path, pdf_path: Path, text_to_path: bool = False, area="page"):
    if INKSCAPE is None:
        raise RuntimeError("Inkscape binary not found")
    # area: 'page' preserves the SVG page size, 'drawing' fits to drawing bounds
    area_flag = "--export-area-page" if area == "page" else "--export-area-drawing"
    ttpath = ["--export-text-to-path"] if text_to_path else []

    ver = inkscape_version_string()
    if ver.startswith("Inkscape 0."):
        # Legacy 0.9x CLI: -z runs without the GUI, --export-pdf names the output.
        # The area and text-to-path flags are passed here too so they are not ignored.
        cmd = [INKSCAPE, "-z", area_flag, *ttpath,
               f"--export-pdf={pdf_path}", str(svg_path)]
    else:
        # Inkscape 1.x
        cmd = [INKSCAPE, str(svg_path),
               "--export-type=pdf",
               f"--export-filename={pdf_path}",
               area_flag, *ttpath,
               "--export-overwrite"]
    subprocess.run(cmd, check=True)


## Run conversion

- Scans only the **top level** of `INPUT_DIR` for `*.svg`.
- Writes `name.pdf` next to `name.svg`. Overwrites existing PDFs.
- Prints warnings if an SVG lacks `width`/`height` and `viewBox`.
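
`convert_folder` uses `input_dir.glob("*.svg")`, which stays at the top level and, on case-sensitive filesystems, skips files named `.SVG`. If subfolders or mixed-case extensions matter, the scan could be swapped for something like the sketch below. The function name `find_svgs` and its `recursive` parameter are hypothetical and not part of this notebook.

```python
from pathlib import Path

def find_svgs(input_dir, recursive=False):
    """Return sorted .svg files under input_dir, matching the suffix case-insensitively."""
    # rglob("*") walks every subfolder; iterdir() lists the top level only.
    candidates = input_dir.rglob("*") if recursive else input_dir.iterdir()
    return sorted(p for p in candidates if p.is_file() and p.suffix.lower() == ".svg")
```

Because each PDF is written next to its SVG via `with_suffix(".pdf")`, a recursive scan would also write PDFs into the subfolders.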


In [6]:
from typing import List

def convert_folder(input_dir: Path, backend: str, use_text_to_path: bool = False, area: str = "page") -> List[Path]:
    svgs = sorted(input_dir.glob("*.svg"))
    if not svgs:
        print("No .svg files found in:", input_dir)
        return []
    out_pdfs = []
    for i, svg in enumerate(svgs, 1):
        pdf = svg.with_suffix(".pdf")
        w, h, vb = svg_size_info(svg)
        # Page extents need either width+height or a viewBox; warn when neither is present.
        if not ((w and h) or vb):
            print(f"[warn] {svg.name}: missing width+height and viewBox. Check for potential cropping or scaling.")
        try:
            if backend == "cairosvg":
                convert_with_cairosvg(svg, pdf)
            elif backend == "inkscape":
                convert_with_inkscape(svg, pdf, text_to_path=use_text_to_path, area=area)
            else:
                raise ValueError(f"Unknown backend: {backend}")
            print(f"[{i}/{len(svgs)}] {svg.name} → {pdf.name}")
            out_pdfs.append(pdf)
        except Exception as e:
            print(f"[error] {svg.name}: {e}")
            traceback.print_exc()
    return out_pdfs

# Execute
generated = convert_folder(INPUT_DIR, ACTIVE_BACKEND, use_text_to_path=False, area="page")
print(f"Done. Generated {len(generated)} PDF(s).")


[1/4] beta CuCl2.svg → beta CuCl2.pdf
[2/4] beta PF74.svg → beta PF74.pdf
[3/4] gam CuCl2.svg → gam CuCl2.pdf
[4/4] gam PF74.svg → gam PF74.pdf
Done. Generated 4 PDF(s).


## Notes on 1:1 vector fidelity

- **Vector kept**: Both CairoSVG and Inkscape preserve vector shapes and paths. If your SVG embeds raster images, those remain raster in the PDF. Effects with no direct PDF equivalent, such as SVG filters, may be rasterized or dropped by the converter.
- **No lossy compression**: Neither converter downsamples or rasterizes vector content. PDF content streams may still be compressed losslessly, as the PDF spec allows.
- **Page extents**: To avoid cropping, ensure the SVG has either `width`+`height` **or** a proper `viewBox`. This notebook warns if missing.
- **Fonts**: If portability matters, consider Inkscape with `--export-text-to-path` to convert text to outlines. Set `use_text_to_path=True` above.
- **Exact page vs fit-to-drawing**: This notebook uses Inkscape's `--export-area-page` by default to preserve the original page size. Use `area="drawing"` to fit the PDF to the artwork bounds.
- **Install tips**: 
  - CairoSVG: `pip install cairosvg`
  - Inkscape: https://inkscape.org (ensure `inkscape` is on your PATH)
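
To see in advance which files the "Vector kept" caveat applies to, one option is to count `<image>` elements before converting. This is a rough sketch with a hypothetical helper name, `count_embedded_images`. It assumes that any `<image>` element is worth flagging, although such an element can also reference another SVG rather than a bitmap.

```python
import xml.etree.ElementTree as ET

def count_embedded_images(svg_source):
    """Count <image> elements in an SVG given as a path or a binary file-like object."""
    count = 0
    for _, el in ET.iterparse(svg_source, events=("start",)):
        # Tags look like '{http://www.w3.org/2000/svg}image'; compare the local name only.
        if el.tag.rsplit("}", 1)[-1] == "image":
            count += 1
    return count
```

A nonzero count means the resulting PDF will likely contain raster content alongside the vector paths.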
