# Part 3: Label the Style Dataset

*Yilin Lyu*

*yl3832@columbia.edu*

This Jupyter notebook describes the automatic process for labeling the entries in **Style-Grid view.csv**.

### The work was done in three steps:

1) Create two **dictionaries** for color and body tags. 

2) Use **Resonance_code to retrieve information** from the dictionaries and label the Style Name. 

3) **Save** the tags to a csv file.

There are **4252** new labels created in the **style** dataset. 

In [2]:
import pandas as pd
import os

In [3]:
style = pd.read_csv("Style-Grid view.csv")
style.head(2)

Unnamed: 0,Style Name,IMAGE,Resonance_code,TAGS: Material,TAGS: Color,TAGS: Body,id
0,🛡 The Tea Length Tunic - Petal Poet in Stret...,,TK-6011 SGT19 PANSYP,"Chief Silk, Weave Woven Stretch, GSM 130-160, ...","Contrast High, Bright Medium, Genre Floral, Fi...",,recCPenP3CfKvzeJD
1,🛡 The Sleeveless Classic Blouse - Cora's Cafe...,,TK-3018 SGT19 JPWALL,"Chief Silk, Weave Woven Stretch, GSM 130-160, ...","Contrast Medium, Bright Medium, Genre Floral, ...",,rec9rh22rgAsvYerk


In [4]:
color = pd.read_csv("Colors-Grid_view_labeled.csv")
color.head(2)

Unnamed: 0.1,Unnamed: 0,Color_Name_,Color_Code,IMAGE,TAG: Color,id
0,0,Cut Paper,CUTPYS,SCzyoVwFSVqHCDkZXVYt_Cut%20Paper.png (https://...,"Primary Purple, Secondary Red, Field Large, Co...",rec0016dbT64i8wpT
1,1,White w/Thin Navy Vertical Stripe,WHITUI,fcsx6bXR7CHzNK8gtpPI_Screen%20Shot%202018-08-2...,"Primary white, Secondary black, Bright Medium...",rec01n5wINvMeGI9S


In [5]:
body = pd.read_csv("Body-Grid_view_labeled.csv")
body.head(2)

Unnamed: 0.1,Unnamed: 0,Body Name,IMAGE,Product Tags: Body,Core Body Number,TAG: Body,id
0,0,ZIP HOODIE,EH2KTK9Qeer0hHk66aon_Screen%20Shot%202018-01-0...,,CM-3003,"Use Day, Use Weekend, Use Work, Fit Relaxed, D...",recS4t29J73BYrznq
1,1,Zip BaseBall Jacket,uZNESfIgQLu2lhCVJy2X_Screen%20Shot%202017-12-2...,,CM-4001,"Use Day, Use Work, Body Top 3, Length Medium, ...",recLupRsEop46s4ME


## How to tag the style dataset?

Notice that we can add tags to the **style** dataset based on the **Resonance_code** of each row.

A **Resonance_code** includes three parts: *body_tag_code* + *material_tag_code* + *color_tag_code*.

For example, a garment with the Resonance_code CM-3003 SGT19 WHITUI is:
      
      ZIP HOODIE (CM-3003) with White w/Thin Navy Vertical Stripe (WHITUI)

Therefore, we can now label the style using our previous results. 

Styles that have no code or no image cannot be labeled at this stage.
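As a minimal sketch of the code structure described above, the helper below splits a Resonance_code into its three parts. The function name `split_resonance_code` is hypothetical, and the sketch assumes the parts are separated by whitespace and contain no spaces themselves; the notebook itself slices by fixed position (first 7 and last 6 characters) instead.

```python
def split_resonance_code(code):
    """Split 'BODY MATERIAL COLOR' into a (body, material, color) tuple.

    Returns None when the value is not a string made of exactly three
    whitespace-separated parts, for example a missing code.
    """
    if not isinstance(code, str):
        return None
    parts = code.split()
    if len(parts) != 3:
        return None
    body_code, material_code, color_code = parts
    return body_code, material_code, color_code


print(split_resonance_code("CM-3003 SGT19 WHITUI"))  # ('CM-3003', 'SGT19', 'WHITUI')
print(split_resonance_code(float("nan")))            # None
```

Returning None for anything that is not a well-formed code makes it easy to skip rows that cannot be labeled.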

## 1) Create two dictionaries for color and body tags

In [6]:
# Build two lookup dictionaries from the previously labeled tables:
#   color_dict maps a Color_Code to its color tags
#   body_dict maps a Core Body Number to its body tags
Color_Code_list = list(color["Color_Code"])
Body_Code_list = list(body["Core Body Number"])

color_dict = {}
for color_code in Color_Code_list:
    # If a code appears in several rows, keep the tags of its first row
    color_tags = color[color["Color_Code"] == color_code]["TAG: Color"]
    if len(color_tags) > 0:
        color_dict[color_code] = color_tags.iloc[0]
    else:
        color_dict[color_code] = ''

body_dict = {}
for body_code in Body_Code_list:
    body_tags = body[body["Core Body Number"] == body_code]["TAG: Body"]
    if len(body_tags) > 0:
        body_dict[body_code] = body_tags.iloc[0]
    else:
        body_dict[body_code] = ''
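The same kind of lookup can be built without an explicit loop. The sketch below is one possible alternative, not the notebook's method: it uses a small made-up table in place of Colors-Grid_view_labeled.csv, keeps the first row for each code, and turns missing tags into empty strings (an assumption about the desired behavior).

```python
import pandas as pd

# Toy stand-in for the color table; the values are illustrative only.
color = pd.DataFrame({
    "Color_Code": ["AAA111", "BBB222", "AAA111"],
    "TAG: Color": ["Primary Red", None, "Primary Blue"],
})

color_dict = (
    color.drop_duplicates(subset="Color_Code", keep="first")  # first row per code
         .set_index("Color_Code")["TAG: Color"]               # code -> tags
         .fillna("")                                          # missing tags -> ''
         .to_dict()
)
print(color_dict)  # {'AAA111': 'Primary Red', 'BBB222': ''}
```

The body dictionary would follow the same pattern with the "Core Body Number" and "TAG: Body" columns.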

## 2) Use Resonance_code to retrieve information from the dictionaries and label the Style Name

In [7]:
Resonance_Code_list = list(style['Resonance_code'])

In [8]:
count = 0
body_col = style.columns.get_loc("TAGS: Body")
color_col = style.columns.get_loc("TAGS: Color")
for idx, Resonance_code in enumerate(Resonance_Code_list):
    # Skip missing (NaN) or too-short codes
    if isinstance(Resonance_code, str) and len(Resonance_code) > 5:
        body_code = Resonance_code[0:7]    # first 7 characters, e.g. "TK-6011"
        color_code = Resonance_code[-6:]   # last 6 characters, e.g. "PANSYP"
        # Fall back to an empty string when a code is not in a dictionary
        body_tag = body_dict.get(body_code, '')
        color_tags = color_dict.get(color_code, '')
        # Write each row at its own position with a single .iloc call, so rows
        # that share a code are all labeled and no chained assignment is used
        style.iloc[idx, body_col] = body_tag
        style.iloc[idx, color_col] = color_tags
        count = count + 1
print("there are " + str(count) + " new labels created")

there are 4252 new labels created
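For larger tables, the same labeling step could be expressed with vectorized string slicing and `Series.map`. The sketch below assumes the same fixed positions as the loop (first 7 characters for the body code, last 6 for the color code) and uses made-up rows and dictionaries purely for illustration.

```python
import pandas as pd

# Toy stand-ins for the style table and the two dictionaries.
style = pd.DataFrame({
    "Resonance_code": ["CM-3003 SGT19 WHITUI", None, "XX-0000 SGT19 NOCODE"],
    "TAGS: Color": ["", "", ""],
    "TAGS: Body": ["", "", ""],
})
body_dict = {"CM-3003": "Use Day, Fit Relaxed"}
color_dict = {"WHITUI": "Primary white, Secondary black"}

# Treat missing codes as empty strings so they fail the length check.
codes = style["Resonance_code"].fillna("")
has_code = codes.str.len() > 5

# Codes that are absent from a dictionary map to NaN, then to ''.
body_tags = codes.str[:7].map(body_dict).fillna("")
color_tags = codes.str[-6:].map(color_dict).fillna("")

style.loc[has_code, "TAGS: Body"] = body_tags[has_code]
style.loc[has_code, "TAGS: Color"] = color_tags[has_code]
print("there are", int(has_code.sum()), "new labels created")
```

As in the loop, the count covers every row with a usable code, including rows whose codes are not found in either dictionary.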


In [9]:
style.head(2)

Unnamed: 0,Style Name,IMAGE,Resonance_code,TAGS: Material,TAGS: Color,TAGS: Body,id
0,🛡 The Tea Length Tunic - Petal Poet in Stret...,,TK-6011 SGT19 PANSYP,"Chief Silk, Weave Woven Stretch, GSM 130-160, ...","Contrast High, Bright Medium, Genre Floral, Fi...","Use Day, Use Weekend, Use Work, Fit Relaxed, D...",recCPenP3CfKvzeJD
1,🛡 The Sleeveless Classic Blouse - Cora's Cafe...,,TK-3018 SGT19 JPWALL,"Chief Silk, Weave Woven Stretch, GSM 130-160, ...","Contrast Medium, Bright Medium, Genre Floral, ...","Use Day, Use Work, Detail Soft, Use Weekend, F...",rec9rh22rgAsvYerk


## 3) Save the tags to csv file

In [10]:
style.to_csv("Style-Grid_view_labeled.csv")
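The "Unnamed: 0" columns in the previews above most likely come from saving a DataFrame together with its index, which `to_csv` does by default. The sketch below uses a toy frame and an in-memory buffer to show the difference that `index=False` makes; whether to drop the index here depends on what later notebooks expect.

```python
import io
import pandas as pd

df = pd.DataFrame({"Style Name": ["A", "B"], "TAGS: Body": ["x", "y"]})

# Default: the index is written as an extra, unnamed first column.
buf = io.StringIO()
df.to_csv(buf)
buf.seek(0)
with_index_cols = list(pd.read_csv(buf).columns)
print(with_index_cols)     # ['Unnamed: 0', 'Style Name', 'TAGS: Body']

# index=False: only the data columns are written.
buf = io.StringIO()
df.to_csv(buf, index=False)
buf.seek(0)
without_index_cols = list(pd.read_csv(buf).columns)
print(without_index_cols)  # ['Style Name', 'TAGS: Body']
```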

### Label results

Every sample that has a Resonance_code is labeled, and the labeled dataset is saved as **Style-Grid_view_labeled.csv**.

There are **4252** new labels created in the **style** dataset. 
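Because a code that is missing from a dictionary produces an empty tag, the number of processed rows can be larger than the number of rows that actually received tags. The sketch below, using made-up rows and a hypothetical helper `is_blank`, shows one way to count how many rows ended up with empty body or color tags.

```python
import pandas as pd

# Toy stand-in for the labeled style table.
labeled = pd.DataFrame({
    "Resonance_code": ["CM-3003 SGT19 WHITUI", "XX-0000 SGT19 NOCODE", None],
    "TAGS: Color": ["Primary white", "", None],
    "TAGS: Body": ["Use Day", "", None],
})

def is_blank(series):
    """True where a tag cell is missing or contains only whitespace."""
    return series.fillna("").astype(str).str.strip().eq("")

missing_body = is_blank(labeled["TAGS: Body"])
missing_color = is_blank(labeled["TAGS: Color"])
summary = {
    "rows": len(labeled),
    "missing_body": int(missing_body.sum()),
    "missing_color": int(missing_color.sum()),
    "fully_labeled": int((~missing_body & ~missing_color).sum()),
}
print(summary)
# {'rows': 3, 'missing_body': 2, 'missing_color': 2, 'fully_labeled': 1}
```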

## Thank you!