In [1]:
import pandas as pd

from model_benchmark_utils import *

In [2]:
pd.set_option('display.max_rows', 100)
pd.set_option('display.max_columns', 10)

## Answer quality functions

In [3]:
def uniform(answer, description):
    return 1.0

In [4]:
def preferred_length(answer, prompt):
    """Return 1.0 for answers of 5 or 6 space-separated words, 0.1 otherwise."""
    split_answer = answer.split(' ')
    word_count_factor = 1.0 if len(split_answer) in [5, 6] else 0.1
    return word_count_factor

In [5]:
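def most_common_words(answer, prompt):
    """Return the share of answer words that also occur in the prompt.

    Comparison is case-insensitive. Words are split on single spaces,
    so punctuation stays attached to the word it follows.
    """
    answer_word_list = answer.upper().split(" ")
    prompt_word_list = prompt.upper().split(" ")
    number_of_common_words = len(set(prompt_word_list) & set(answer_word_list))

    return number_of_common_words / float(len(answer_word_list))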
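As a quick sanity check, the three quality functions can be applied to one answer. The sketch below repeats the definitions so it runs on its own; the description is the "Wooden stool." row from the tables further down, and the `scores` dictionary is only an illustrative helper.

```python
def uniform(answer, description):
    return 1.0


def preferred_length(answer, prompt):
    split_answer = answer.split(' ')
    return 1.0 if len(split_answer) in [5, 6] else 0.1


def most_common_words(answer, prompt):
    answer_word_list = answer.upper().split(" ")
    prompt_word_list = prompt.upper().split(" ")
    number_of_common_words = len(set(prompt_word_list) & set(answer_word_list))
    return number_of_common_words / float(len(answer_word_list))


description = "Wooden stool."
answer = "WOODEN STOOL"

# Score the same answer with every quality function.
scores = {
    fn.__name__: fn(answer, description)
    for fn in (uniform, preferred_length, most_common_words)
}
print(scores)
# {'uniform': 1.0, 'preferred_length': 0.1, 'most_common_words': 0.5}
```

`most_common_words` returns 0.5 rather than 1.0 here because the prompt word is `STOOL.` (with the trailing full stop), which does not match `STOOL`.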


## model-1 performance evaluation

This was the first model trained. No training-data processing was involved apart from wrapping descriptions and names in meta markers. The model was trained for 4600 steps on a single GPU; the average loss at that point was 0.19. Since the next iteration achieved better results with an average loss of 0.35, model-1 may already have been overfitted at this point, or otherwise broken. The TensorBoard graph, and most probably the model itself, is messed up from multiple training attempts with different training parameters. Because of that, it was a throwaway model, used to figure out the whole process of training and generating answers for submission.

The submission was generated with a heuristic that favoured answers sharing the most words with the description.
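A minimal sketch of how such a heuristic could order candidate answers. It assumes the candidates for one description are available as a list of strings. `rank_answers` and the candidate list are hypothetical illustrations, not part of `model_benchmark_utils`.

```python
def most_common_words(answer, prompt):
    answer_word_list = answer.upper().split(" ")
    prompt_word_list = prompt.upper().split(" ")
    number_of_common_words = len(set(prompt_word_list) & set(answer_word_list))
    return number_of_common_words / float(len(answer_word_list))


def rank_answers(candidates, description, quality_fn):
    """Return candidates sorted from highest to lowest quality score.

    Hypothetical helper: sorted() is stable, so candidates with equal
    scores keep their original relative order.
    """
    return sorted(candidates, key=lambda a: quality_fn(a, description), reverse=True)


description = "Wooden door knobs (set of 2)."
# Made-up candidate names, in the order a model might have generated them.
candidates = ["WOODEN STOOL", "CERAMIC DOOR KNOBS", "WOODEN DOOR KNOBS"]

ranked = rank_answers(candidates, description, most_common_words)
print(ranked)
# ['WOODEN DOOR KNOBS', 'CERAMIC DOOR KNOBS', 'WOODEN STOOL']
```

Because the score is normalised by the answer length, a short answer made up entirely of description words scores as high as a longer one.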

The model scored 0.14 in the challenge test, which is slightly above the tested maximum potential of 0.126 (see below).

In [8]:
model_1_answers, model_1_score = evaluate_all_test_sets_of_model(1, most_common_words)

Best answers:

In [9]:
model_1_answers.sort_values(by='score', ascending=False).head(10)

Unnamed: 0,name,description,answers,score
34,WOVEN BANANA LEAF BASKET,Woven banana leaf basket with contrast handles.,WOVEN BANANA LEAF BASKET WITH CONTRAST HANDLES...,1.0
87,CHRISTMAS PARACHUTE DECORATION,Christmas parachute decoration.,"CHRISTMAS MOTHER TREE DECORATION, CHRISTMAS CH...",1.0
43,GREEN IRON WATERING CAN,Green iron watering can with a brass sprinkler...,GREEN IRON WATERING CAN WITH BRASS SYNTHETIC H...,1.0
95,METALLIC-EFFECT SKINNY JEANS,Metallic-effect skinny jeans featuring a five-...,"SOFT DESIGN LIGHTWEIGHT 9.5 ML / 3.38 OZ, THE ...",1.0
54,LARGE ROUND BAMBOO BASKET,Large round bamboo basket with woven rattan sh...,"LARGE ROUND BAMBOO BASKET, LARGE ROUND BAMBOO ...",1.0
49,BLUE HOLE 100 ML,ZARA BLUE HOLE EDT 100 ML (3.4 FL. OZ). <br/><...,"BLUE HOLE 100 ML / 3.38 OZ, BLUE HOLE 8.34ML /...",1.0
18,STAR PRINT FLAT SHEET,Flat sheet in 200 thread count cotton percale ...,"STAR-SHAPED SHEET WITH HEMSTITCHING, ELEPHANT ...",1.0
17,WOODEN STOOL,Wooden stool.,"WOODEN STOOL, TWO-TONE WOODEN STOOL, WARM WOOD...",1.0
76,ZARA ROSE EDT 90 ML,ZARA ROSE EDT 90 ML (3.04 FL. OZ). <br/><br/>F...,ZARA ROSE EDT 90 ML / 3.04 OZ + ZARA ROSE EDT ...,1.0
99,WOODEN DOOR KNOBS,Wooden door knobs (set of 2).,"WOODEN DOOR KNOBS (SET OF 2), WOODEN DOOR KNOB...",1.0


Worst answers:

In [10]:
model_1_answers.sort_values(by='score').head(10)

Unnamed: 0,name,description,answers,score
0,SMILE T-SHIRT,Long sleeve T-shirt with a round neckline and ...,"PLUSH T-SHIRT WITH AN EMBOSSED TEXT, BLACK STR...",0.0
31,CHECKED PINAFORE DRESS,Pinafore dress with a round neckline and strap...,"ESSENCE PINAFORE DRESS WITH FEATURING FRONT, L...",0.0
30,RUSTIC TROUSERS,High-waist trousers with an elastic waistband.,"HIGH-WAIST TROUSERS, HIGH-WAIST TEXTURED TROUS...",0.0
29,SHIRRED WAIST TROUSERS,High-waist trousers with an elastic waist. Fea...,"TROUSERS WITH CONTRAST WAIST, HOUNDSTOOTH WOOL...",0.0
28,ROUND FLOOR CUSHION,Round padded floor cushion with a star print. ...,"STAR PRINT ROUND PADDED FLOOR CUSHION, ROUND P...",0.0
27,FLOWERS HAIRBAND,Headband with bow appliqué and a floral print.,"BELT BAG WITH FLORAL PRINT, FOAM BAND WITH FLO...",0.0
26,RELAXED FIT HI-RISE FLARED JEANS,Faded high-waist jeans with a five-pocket desi...,"Z1975 HIGH-WAIST JEANS WITH FEATURING WAIST, F...",0.0
25,COMBINED HERRINGBONE BRACES,Adjustable elastic braces. Metal brace clips w...,"ADJUSTABLE LEATHER SCARF, ADJUSTABLE METAL BRA...",0.0
24,FLOWING PRINTED BERMUDA SHORTS,High-waist Bermuda shorts with an elastic wais...,"LEOPARD PRINT BERMUDA SHORTS, ELASTIC WAISTBAN...",0.0
23,TECHNICAL OVERSHIRT,Relaxed fit overshirt made of technical fabric...,"RESEARCH TECHNICAL MOUSEIRT, RUBBERISED OVERSH...",0.0


Tested maximum potential (answers in perfect order):

In [11]:
model_potential(model_1_answers)

0.126
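The implementation of `model_potential` lives in `model_benchmark_utils` and is not shown here. One plausible reading, consistent with the 1.0/0.0 values in the `score` column, is that a row counts as a hit when the true name appears anywhere among the generated answers, and that the potential is the share of such rows. The sketch below illustrates that assumption with made-up, shortened answer lists; `hit_rate_anywhere` and `hit_rate_first` are hypothetical names.

```python
def hit_rate_anywhere(rows):
    """Share of rows whose true name appears anywhere in the candidate list.

    Assumed meaning of "maximum potential": the best a perfect
    re-ordering of the candidates could achieve.
    """
    hits = sum(1 for row in rows if row["name"] in row["answers"])
    return hits / len(rows)


def hit_rate_first(rows):
    """Share of rows whose first candidate is exactly the true name."""
    hits = sum(1 for row in rows if row["answers"] and row["answers"][0] == row["name"])
    return hits / len(rows)


# Illustrative rows only; the answer lists are shortened and partly made up.
rows = [
    {"name": "WOODEN STOOL", "answers": ["WOODEN STOOL", "TWO-TONE WOODEN STOOL"]},
    {"name": "STRAW BEACH VISOR", "answers": ["BEACH VISOR", "STRAW BEACH VISOR"]},
    {"name": "SMILE T-SHIRT", "answers": ["PRINTED T-SHIRT", "FLORAL PRINT T-SHIRT"]},
]

potential = hit_rate_anywhere(rows)  # 2 of 3 rows contain the true name
top1 = hit_rate_first(rows)          # only 1 of 3 rows has it in first place
print(potential, top1)
```

The gap between the two numbers is what a better ordering heuristic could recover.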

Tested score of unsorted answers:

In [12]:
model_1_score

0.08087961104174585

## model-2 performance evaluation

The next attempt focused on utilizing both GPUs and optimizing training time. Enabling the multi_gpu flag put more memory pressure on the first card, while the second card's memory allocation was only 60%. To decrease memory allocation on the first card I enabled the use_memory_saving_gradients option, which also forced me to disable the accumulate_gradients option. The learning rate was increased from the default 0.0001 to 0.0002.

The model was trained on 2 GPUs with a batch size of 10. After 1400 steps the average loss was 0.35.
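The settings described above can be summarised as a small configuration object. This is only a sketch: the field names follow the options named in the text, but the `TrainingConfig` class, its default batch size, and the convention that `accumulate_gradients=1` means "no accumulation" are assumptions, not the training script's actual interface.

```python
from dataclasses import dataclass


@dataclass
class TrainingConfig:
    """Hypothetical container for the training options named in the text."""
    multi_gpu: bool = False
    use_memory_saving_gradients: bool = False
    accumulate_gradients: int = 1   # assumed: 1 means accumulation is disabled
    learning_rate: float = 0.0001   # default learning rate stated in the text
    batch_size: int = 1             # assumed default

    def __post_init__(self):
        # Mirrors the constraint described above: memory-saving gradients
        # could not be combined with gradient accumulation.
        if self.use_memory_saving_gradients and self.accumulate_gradients > 1:
            raise ValueError(
                "use_memory_saving_gradients requires accumulate_gradients to be disabled"
            )


# Settings reported for model-2.
model_2_config = TrainingConfig(
    multi_gpu=True,
    use_memory_saving_gradients=True,
    accumulate_gradients=1,
    learning_rate=0.0002,
    batch_size=10,
)
print(model_2_config)
```

Encoding the constraint in `__post_init__` makes an invalid combination fail at construction time rather than partway through a training run.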

In [13]:
model_2_answers, model_2_score = evaluate_all_test_sets_of_model(2, most_common_words)

Best answers:

In [14]:
model_2_answers.sort_values(by='score', ascending=False).head(10)

Unnamed: 0,name,description,answers,score
93,GOLD LEAF TEALIGHT HOLDER,Glass tealight holder with gold leaf detail.,"TEALIGHT HOLDER WITH LEAF DETAIL, GREEN TEALIG...",1.0
62,TOP WITH RHINESTONE STRAPS,Top with a straight neckline. Thin metal strap...,"GEM BUTTON TOP TRF, LACE TOP WITH STRAIGHT NEC...",1.0
42,BEAR BACKPACK,Playful backpack with decorative teddy bear an...,"BEAR BACKPACK, ANIMAL PRINT BACKPACK, BEAR BAC...",1.0
96,LEAF PRINT SWIMMING TRUNKS,Swimming trunks with a leaf print design and a...,"LINEN BUSTIER SWIMMING TRUNKS, INFUSED SPARKLY...",1.0
29,ROUND WOODEN JEWELLERY BOX,Round wooden jewellery box with a mirror insid...,"ROUND WOODEN JEWELLERY BOX, LARGE MIRRORED WOR...",1.0
52,MOUSE FLEECE BLANKET,Fleece blanket featuring a polka dot backgroun...,"MOUSE FLEECE BLANKET, ANIMAL AND MOUSE © DISNE...",1.0
49,STRAW BEACH VISOR,Straw beach visor.,"BEACH VISOR, STRAW BEACH VISOR, STRAW SNEAKERS...",1.0
4,GREENISH GLASS SOAP DISH,Greenish glass soap dish.,"GREENISH GLASS DISPENSER, GREENISH GLASS DISH,...",1.0
58,BOX PLEAT MINI SKIRT,High­waist mini skirt with a short-style linin...,"BOX PLEAT MINI SKIRT, BOX PLEAT SKIRT, BOX PLE...",1.0
92,TUBEROSE 10ML,ZARA TUBEROSE EDT 10ML (0.34 FL. OZ). <br/><br...,"TUBEROSE 10ML, TUBEROSE EDT 10 ML / 0.34 OZ, T...",1.0


Worst answers:

In [15]:
model_2_answers.sort_values(by='score').head(10)

Unnamed: 0,name,description,answers,score
0,SMILE T-SHIRT,Long sleeve T-shirt with a round neckline and ...,"PRINTED T-SHIRT, FLORAL PRINT T-SHIRT, FOOTBAL...",0.0
62,JACQUARD BLAZER,Blazer with a round collar and long sleeves. F...,"FAUX LEATHER BLAZER, TAFFETA CHECK SUIT BLAZER...",0.0
63,TRAVELER SUIT TROUSERS,"Trousers made of textured bi-stretch, easy-to-...","EASY-TO-IRON TECHNICAL TROUSERS, SLIM FIT TROU...",0.0
64,FAMILY T-SHIRT,Short sleeve T-shirt with a round neck and slo...,"ANIMAL PRINT STRIPE T-SHIRT, LONGLINE SLOGAN T...",0.0
65,SPARKLY DRESS,Dress featuring a straight neckline and ruffle...,"RUFFLED SHIRT DRESS, SHINY TEXTURED DRESS WITH...",0.0
66,POPLIN TOP WITH CUTWORK EMBROIDERY TRF,Crop top with a wide straight neckline and sho...,"LACE ORGANZA TOP, CUT OUT MINI CROP TOP, PRINT...",0.0
68,STRIPE PRINT TOP,"Top with an adjustable buttoned high neck, lon...","FAUX LEATHER TOP, PLEATED TOP, ORGANZA PRINT T...",0.0
69,FUR CUSHION COVER,Faux fur cushion cover. Cushion filling not in...,"FAUX FUR QUILDREN'S CUSHION COVER, FAUX FUR CU...",0.0
70,PRINTED MIDI DRESS TRF,Dress with a round neck and long sleeves. Feat...,"PLEATED DRESS, SATEEN DRESS TRF, CHECK DRESS, ...",0.0
60,SUEDE FLAT MULES WITH JUTE SOLE,Sand-coloured leather flat mules. Split suede ...,"SQUARE-TOE LEATHER SHOES/SUEDE HEEL SHOT, LEAT...",0.0


Tested maximum potential:

In [16]:
model_potential(model_2_answers)

0.424

Tested score of unsorted answers:

In [17]:
model_2_score

0.23222571384786073

## model-2.1 performance evaluation

Model 2.1 is model 2 fine-tuned further. After another 1200 steps the average loss was 0.247, and the tested score of unsorted answers improved from 0.232 to 0.293.

In [18]:
model_2_1_answers, model_2_1_score = evaluate_all_test_sets_of_model('2.1', most_common_words)

Best answers:

In [19]:
model_2_1_answers.sort_values(by='score', ascending=False).head(10)


Worst answers:

In [20]:
model_2_1_answers.sort_values(by='score').head(10)


Tested maximum potential:

In [21]:
model_potential(model_2_1_answers)

0.58

Tested score of unsorted answers:

In [22]:
model_2_1_score

0.29293636154047076

## model-2.2 performance evaluation

In [24]:
model_2_2_answers, model_2_2_score = evaluate_all_test_sets_of_model('2.2', uniform)

In [25]:
model_2_2_answers.sort_values(by='score', ascending=False).head(50)


In [26]:
model_potential(model_2_2_answers)

0.81125

In [27]:
model_2_2_score

0.5653414335647854