# Imports

We use *io* for in-memory byte streams, *cairosvg* for SVG rasterization, *pdf2image* for PDF rasterization, and *PIL* (Pillow) for image processing:

In [None]:
import io

from cairosvg import svg2png
from pdf2image import convert_from_path
from PIL import Image

# Synthetic data generation

First we load an example document and rasterize its pages to images. We use a low DPI (dots per inch), since the extra detail of a higher resolution is not expected to improve detection accuracy:

In [None]:
pdf_file = '../data/documents/ak10900_selostus.pdf'

pdf_page = convert_from_path(pdf_file, dpi=50)
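
To get a feel for what the DPI setting means in pixels, here is a rough sketch of the arithmetic. It assumes an A4 page (210 x 297 mm), which the document may or may not use, and `page_size_px` is a hypothetical helper, not part of *pdf2image*. The sizes reported by the rasterizer can differ by a pixel because of rounding:

```python
MM_PER_INCH = 25.4


def page_size_px(width_mm, height_mm, dpi):
    """Approximate raster size in pixels of a page rendered at the given DPI."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px, height_px


# Assumption: the page is A4 (210 x 297 mm)
A4_MM = (210, 297)

low = page_size_px(*A4_MM, dpi=50)
high = page_size_px(*A4_MM, dpi=200)

print('50 dpi: ', low, '(px)')
print('200 dpi:', high, '(px)')
# Quadrupling the DPI multiplies the pixel count by roughly 16
print('Pixel count ratio:', (high[0] * high[1]) / (low[0] * low[1]))
```

Under these assumptions, rasterizing at 50 DPI keeps each page to a few hundred pixels per side, which keeps memory use and processing time low.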

`convert_from_path` returns a list with one PIL image per page, so we can display a page by indexing into the list:

In [None]:
print('Image size:', pdf_page[0].size, '(px)')
pdf_page[0]

And here's one of the pages that may originally have contained a signature:

In [None]:
print('Image size:', pdf_page[36].size, '(px)')
pdf_page[36]

Next we'll load the signature of a former US head of state:

In [None]:
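signature_url = 'https://upload.wikimedia.org/wikipedia/commons/4/46/George_HW_Bush_Signature.svg'
# svg2png returns the PNG as bytes; wrap them in BytesIO so PIL can open them like a file
signature = Image.open(io.BytesIO(svg2png(url=signature_url)))
# Downscale both dimensions by a factor of 4 to be more in line with the document
signature = signature.resize((signature.width // 4, signature.height // 4))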
print('Image size:', signature.size, '(px)')
signature
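
The fixed factor of 4 above was chosen for this particular image. As an alternative sketch, the target size could be derived from the page width while preserving the aspect ratio. The helper `scale_to_width` and the example sizes below are hypothetical and only illustrate the arithmetic; the resulting tuple could then be passed to `Image.resize`:

```python
def scale_to_width(size, target_width):
    """Return (width, height) scaled to target_width, preserving the aspect ratio."""
    width, height = size
    if width <= 0 or height <= 0 or target_width <= 0:
        raise ValueError('all dimensions must be positive')
    scale = target_width / width
    # Keep at least one pixel of height for very flat images
    return target_width, max(1, round(height * scale))


# Hypothetical sizes, for illustration only
signature_size = (600, 200)
page_width = 413

# Make the signature span a quarter of the page width
new_size = scale_to_width(signature_size, page_width // 4)
print('Scaled size:', new_size, '(px)')
```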

Finally, we paste the signature onto the target document page at the correct location:

In [None]:
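# Work on a copy so the original page image stays untouched
page = pdf_page[36].copy()
# Paste with the top-left corner at (x, y) = (130, 360); passing the
# signature as the mask uses its alpha channel, so the background stays visible
page.paste(signature, (130, 360), signature)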
page
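
The position above was picked by hand. If many synthetic pages are needed, and if the pasted region should be recorded as a bounding box label (both assumptions, the notebook does not say so), a position could be sampled at random. A minimal sketch follows, with `random_paste_box` and the example sizes being hypothetical:

```python
import random


def random_paste_box(page_size, patch_size, rng):
    """Sample a (left, top, right, bottom) box so the patch lies fully on the page."""
    page_w, page_h = page_size
    patch_w, patch_h = patch_size
    if patch_w > page_w or patch_h > page_h:
        raise ValueError('patch does not fit on the page')
    # randint is inclusive on both ends, so the patch may touch the page edge
    left = rng.randint(0, page_w - patch_w)
    top = rng.randint(0, page_h - patch_h)
    return left, top, left + patch_w, top + patch_h


# Hypothetical page and signature sizes in pixels
rng = random.Random(0)  # seeded for reproducibility
box = random_paste_box((413, 585), (103, 34), rng)
print('Bounding box:', box)
```

The first two values of the box give the top-left corner that would be passed to `paste`, and the full box describes where the signature ends up on the page.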