# Similar.AI Labelling Guide

For the dataset provided, we ask that Similar.ai label only the individuals identified by a red bounding box. For these individuals, we already have some upper body labels and regions to localise when training our model. This will enhance the metric learning process for mapping Similar.ai labels to existing StreetStyle labels. So far we have provided 23,908 images for labelling. For example, in the image below, although there are two people present, we ask that you give labels only for the person on the right:

In [1]:
from IPython.display import Image
Image(filename='0b42f3eec6ab6882ba261425646642b7_581641884225517952_10084109.jpg',width=250,height=200) 

<IPython.core.display.Image object>

For some images, the bounding boxes provided by StreetStyle are not valid, and have no relation to the individuals in the image. These can be ignored entirely. For example:

In [2]:
Image(filename='0c88bbd00d1d126878f23a3be9b74116_895234284077234413_361472861.jpg',width=200,height=200) 

<IPython.core.display.Image object>

In cases where there is an overlap (like below) and more than one person is inside a bounding box, it is perfectly fine to ignore that image to avoid confusion.

In [3]:
Image(filename='0ad305537fa885c4d0148f19b977caf3_898762546448839374_476155139.jpg',width=200,height=200) 

<IPython.core.display.Image object>

In general, please skip any image where it is unclear whom the bounding box refers to, where the image quality is bad, or where no relevant Similar.ai labels apply.

**Labels required** 

We have labels for most of the upper body clothing types. What we need is item type labelling for the lower body, colour and market labels, and some miscellaneous item labels. To reduce how many labels you need to apply to each image, we'll just use colour labels and infer the item types present. So if you assign the label ```dress_blue``` to an image, we'll assume there is a ```dress``` and that it is ```blue```; you don't need to also include the label ```dress```.

So, for example, let's look at the labels for this picture:
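The inference described above can be sketched in a few lines of Python. This is only a minimal sketch, not our actual pipeline: the helper name `split_label` is hypothetical, and it assumes that item type names contain no underscore, so everything after the first underscore is read as the colour.

```python
def split_label(label):
    """Split a combined label into (item_type, colour).

    Assumption: the item type never contains an underscore, so the
    first underscore separates the item type from the colour.
    A label without an underscore has no colour (None).
    """
    item_type, separator, colour = label.partition("_")
    return item_type, (colour if separator else None)


print(split_label("dress_blue"))  # ('dress', 'blue')
print(split_label("dress"))       # ('dress', None)
```

Splitting on the first underscore rather than the last keeps a colour value that itself contains underscores in one piece.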

In [6]:
Image(filename='8c5663f4fd701e5d5c6b08e4821af07f_761585291335355200_1146068377.jpg',width=250,height=200) 

<IPython.core.display.Image object>

First, we only want to look at the person with the bounding box; we're going to ignore everybody else. We already have upper body labels for this image (for example, color=black, item_type=sweater), so we need the lower body labels. For this image, we could have ```[market_woman, jeans_blue, glove_black, bag_black]```.

Any label omitted, e.g. ```belt```, we assume to mean ```belt_none```, so you don't have to worry about labelling things that aren't there.

Taking labels from your Notion document:

We lack labels for the item types: ```skirt```, ```hosiery```, ```trousers```, ```lingerie```, ```lounge/sleepwear```, ```swimwear```, ```onesies```, ```belts```, ```hair accessories```, ```jewellery```, ```shoes```, ```bags```.
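As an illustration of how we would treat omitted labels on our side, here is a minimal sketch. The function name `fill_missing` and the short `EXPECTED_ITEM_TYPES` tuple are hypothetical and chosen only to match the example image above; the real list of item types is longer.

```python
# Illustrative subset of item types only (an assumption, not the full list).
EXPECTED_ITEM_TYPES = ("belt", "bag", "glove", "jeans")


def fill_missing(labels, item_types=EXPECTED_ITEM_TYPES):
    """Append '<item>_none' for every expected item type that was not labelled."""
    # The item type is the part of each label before the first underscore.
    present = {label.partition("_")[0] for label in labels}
    missing = [f"{item}_none" for item in item_types if item not in present]
    return list(labels) + missing


labels = ["market_woman", "jeans_blue", "glove_black", "bag_black"]
completed = fill_missing(labels)
print(completed)
# ['market_woman', 'jeans_blue', 'glove_black', 'bag_black', 'belt_none']
```

Because the gap is filled afterwards, you only ever need to label what is actually visible in the image.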

We have no labels for market.

We have the same colour labels as you do, so just use the colours you have listed with the addition of a ```more_than_one_colour``` for multicoloured items.

We already have labels for everything else listed in the Notion document that isn't listed above, so you don't have to spend time on those (unless you have a lot of time and want to help improve our upper body labels).

Some items may not require a colour label, e.g. jewellery. If you want to omit the colour and just include, e.g., a ```jewellery_watch``` label, that's fine, and you can do that for any other item type you don't want colours for (e.g. if you don't care about glove colour, feel free to just label ```glove```).
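To show how labels with and without a colour could be told apart, here is a rough sketch. The function name `interpret` and the `KNOWN_COLOURS` set are hypothetical: the set contains only colour values that appear in this guide, whereas the full colour list lives in the Notion document and is not reproduced here.

```python
# Assumption: a small stand-in for the full colour list in the Notion document.
KNOWN_COLOURS = {"black", "blue", "more_than_one_colour"}


def interpret(label, colours=KNOWN_COLOURS):
    """Return the item type plus either a colour or a non-colour detail."""
    item, _, rest = label.partition("_")
    if not rest:
        # Bare item type, e.g. 'glove': present, colour not recorded.
        return {"item": item, "colour": None, "detail": None}
    if rest in colours:
        return {"item": item, "colour": rest, "detail": None}
    # Suffix is not a colour, e.g. 'jewellery_watch'.
    return {"item": item, "colour": None, "detail": rest}


print(interpret("glove"))
print(interpret("jewellery_watch"))
print(interpret("bag_more_than_one_colour"))
```

Under this reading, a bare ```glove``` still counts as the item being present; only the colour is left unknown.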


The StreetStyle dataset additionally includes a ```clothing_pattern``` label (solid, graphics, striped, floral, plaid and spotted). We don't need to work with this right now, but if you want to include it (for the item types we don't have labels for, as listed above), that can also work.