# Identifying Similar Graphic Design Templates

# The Aim

Several companies are springing up that offer ready-made graphic design templates,
in categories such as business cards, invitations and presentations. These kinds of templates
are all similar in a sense, but there is lots of variation between them that makes some
likable to one user, and unlikable to another. For example, all birthday invitations
have similar words on them, _"You're invited to my birthday ... it's on YYYY-MM-DD"_,
but there are many different layouts, colours and graphics that make each one personalised.

With more and more templates being created, it is easy to get overwhelmed, close the browser and
give up on trying to find one you like. Can this experience be improved? Is it possible
to offer up suggestions of what a user may like based on what they click on?
Rather than a user clicking through Page 1 up to Page 100, could they go down a rabbit-hole
of suggested templates, slowly refining it down until they find the one they want?

The aim of this little project is to see if we can offer up similar graphic design templates
based on an input template. There has been work done in the past on finding similar songs
based on an input song ([link](https://github.com/spotify/annoy)) and on finding similar
photographs ([link](https://www.kaggle.com/abhikjha/fastai-hooks-and-image-similarity-search)).
Can we extend that same approach to design templates?

# The Method

The basic principle that both methods listed above use is to generate a list of numeric values that describes
each object. Finding "similar" templates is then about selecting the templates that have the lowest overall difference
between their numeric features and the original object's. Another way to think of this is plotting each object as a point in multi-dimensional space, and selecting the other objects that are "closest" to it.
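
As a minimal sketch of the "closest in multi-dimensional space" idea, the snippet below ranks templates by straight-line (Euclidean) distance between feature vectors. The template names and the 3-value vectors are made up for illustration, and Euclidean distance is just one reasonable choice of "overall difference"; the tools linked above may use other metrics or approximate search.

```python
import math


def most_similar(query_id, features, k=3):
    """Return the k template ids whose feature vectors are closest to query_id's."""
    query = features[query_id]
    distances = [
        (math.dist(query, vector), other_id)
        for other_id, vector in features.items()
        if other_id != query_id  # don't recommend the template to itself
    ]
    distances.sort()  # smallest distance first
    return [other_id for _, other_id in distances[:k]]


# Hypothetical toy vectors; real feature vectors would be much longer.
features = {
    "floral_wedding": [0.9, 0.1, 0.2],
    "floral_birthday": [0.8, 0.2, 0.3],
    "minimal_graduation": [0.1, 0.9, 0.1],
    "bold_bbq": [0.2, 0.3, 0.9],
}

print(most_similar("floral_wedding", features, k=2))
```

With only a few hundred templates a brute-force scan like this is fast enough; an approximate index only starts to matter for much larger catalogues.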

For our project we narrowed the focus to one specific type of graphic design template: **invitations**.

There are different ways to construct a "Numeric Feature Generator" <sup>TM</sup> for invitations.

The first is to think about different metrics that would define a graphic design. For example, they could be broken into three categories: the text used, the graphics added and the overall layout. The text might be large/small, sans-serif/script. The graphics might be vector-based/hand-drawn, floral/industrial. The layout might be minimal/crowded. With a set of metrics, you could label each of your designs with that information. That labelling would be hard without access to the individual elements of each design. If you could pull it off, the benefit is that these numbers are very explainable to a person off the street.

I didn't have access to this kind of info so had to make do with something else. I tried to let a machine learn the key features of a design and see if it was _good enough_ for finding similar graphics. This approach involves training a model on the images with the task of classifying invitations by type: weddings, birthdays, BBQs and graduations. At the very last stage before the model outputs its prediction, it generates a 500-length list of numeric features that it weighs up to make its final decision. Ignoring the final prediction and pulling out this 500-length vector for each invitation gives a usable list of numeric features.
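
To make the "ignore the prediction, keep the vector before it" idea concrete, here is a toy sketch in plain Python. `TinyClassifier` is a hypothetical stand-in with random weights and a single hidden layer, not the model used in this project; the only point is that the classification head sits on top of a feature vector that can be read off on its own.

```python
import random


def dense_relu(inputs, weights, biases):
    """One fully connected layer followed by a ReLU activation."""
    return [
        max(0.0, sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


class TinyClassifier:
    """Toy stand-in for a trained image classifier (random weights, made-up sizes)."""

    def __init__(self, n_inputs, n_features, n_classes, seed=0):
        rng = random.Random(seed)
        self.hidden_w = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                         for _ in range(n_features)]
        self.hidden_b = [0.0] * n_features
        self.head_w = [[rng.uniform(-1, 1) for _ in range(n_features)]
                       for _ in range(n_classes)]
        self.head_b = [0.0] * n_classes

    def features(self, pixels):
        """Activations of the last layer before the classification head."""
        return dense_relu(pixels, self.hidden_w, self.hidden_b)

    def predict(self, pixels):
        """Class scores; the head only ever sees the feature vector."""
        feats = self.features(pixels)
        return [sum(w * f for w, f in zip(row, feats)) + b
                for row, b in zip(self.head_w, self.head_b)]


model = TinyClassifier(n_inputs=12, n_features=500, n_classes=4)
pixels = [i / 12 for i in range(12)]  # stand-in for a flattened image

vector = model.features(pixels)  # what we keep for similarity search
scores = model.predict(pixels)   # what we ignore
print(len(vector), len(scores))
```

In a real deep-learning framework the same thing is usually done by hooking or truncating a pre-trained network so that it returns the activations feeding its final layer.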

# The Data

I needed to get a set of images of invitations. A well-worded Google search and a web scrape meant I was able to
get around 150 images per category (bbq, wedding, birthday, graduation). Google search results are noisy, so restricting the query to a single site, e.g. `"birthday invitations site:canva.com"`, did the job. The benefit of this is that we get the image labelling for free; all images downloaded in each search can be matched to their category.
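
A rough sketch of how the "labelling for free" works is below. It assumes each search's downloads were saved as `.jpg` files in a folder named after the category; that folder layout, `build_query` and `label_images` are all illustrative names, not the actual scraping code.

```python
from pathlib import Path

CATEGORIES = ["bbq", "wedding", "birthday", "graduation"]


def build_query(category, site="canva.com"):
    """Search string for one category, restricted to a single site."""
    return f"{category} invitations site:{site}"


def label_images(root):
    """Map each image path to its category, assuming a root/<category>/<image>.jpg layout."""
    labels = {}
    for category in CATEGORIES:
        folder = Path(root, category)
        if not folder.is_dir():
            continue  # nothing downloaded for this category yet
        for path in sorted(folder.glob("*.jpg")):
            labels[path] = category
    return labels


print(build_query("birthday"))
```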

# Model

- Leverage a pre-trained image model.
- Train it further on our data.
- To keep the processing simple, have one model output a vector covering a range of metrics.
This way the nodes are shared across our outputs. To improve results, if needed,
we can split the metrics out into separate models.
- Because the cost of labelling the data is high, we would want to
use semi-supervised learning.

## Outputs

Colours

- Colour palettes
- Total number of colours present
- Hue
- Saturation
- Contrast
- Monochrome indicator

Layout

- White space between elements

Text

- serifness (sans serif -> serif -> script)
- kerning
- monospaced

Graphics

- floralness
- busyness
- realness (from 0 for a definite vector graphic up to
1 for something that looks hand-drawn or painted)

Some of these outputs are deterministic and wouldn't need to be fed through the model.
We can split the above into those that are aggregations on the image (colour, hue, saturation),
and those that are about the semantics of the image. With finer-grained data,
more and more of this wishlist of features can be found for "free". For example, if we had
info on the individual elements of a design, we could have a smaller model that runs just
on the text, and one just on the graphics. Without that, we have to work with the JPEG render
of the design.
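
The deterministic outputs can be computed straight from the pixels. The sketch below uses only the standard library and takes a plain list of `(r, g, b)` tuples; how the pixels are read from the JPEG is left out. The definitions are my own assumptions for illustration: contrast is taken as the range of brightness, and an image counts as monochrome when its mean saturation falls below an arbitrary threshold. Mean hue is omitted because hue is an angle and a plain average would be misleading.

```python
import colorsys


def colour_summary(pixels, monochrome_threshold=0.05):
    """Summarise a list of (r, g, b) pixels with channel values in 0-255."""
    hsv = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255) for r, g, b in pixels]
    saturations = [s for _, s, _ in hsv]
    brightness = [v for _, _, v in hsv]
    mean_saturation = sum(saturations) / len(saturations)
    return {
        "distinct_colours": len(set(pixels)),
        "mean_saturation": mean_saturation,
        # Assumed definition: spread between the darkest and brightest pixel.
        "contrast": max(brightness) - min(brightness),
        # Assumed definition: almost no colour saturation anywhere.
        "monochrome": mean_saturation < monochrome_threshold,
    }


greys = [(0, 0, 0), (128, 128, 128), (255, 255, 255)]
print(colour_summary(greys))
```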

# Where to from here

- The approach does well on large layout effects (flower border, large numbers, photo frame). It does not do well with _style_ (fonts, humorous/serious).
- Attempt with more data. In a production setting you would imagine a company would run it on _all_ of their designs. There's only so much I can recommend if I only have 400 to choose from.