# Item Recommender
---
This document demonstrates how to recommend items to Style Lend renters. We compute pairwise similarities between all items from the item description text stored in the stylelend.com database. We then make recommendations based on each renter's size records, brand preferences, and clothing-type preferences, all extracted from their renting history.

### Outline
---

1. [**Introduction**](#intro)
2. [**Tokenized Item Descriptions**](#splitting)

    - [Leave-k-out](#leave-k-out)
    - [Leave-one-out](#leave-one-out)
    - [M-fold](#m-fold)    
    - [Validation Sets](#validation)    
    - [Summary: Holdout recommendations and best practices](#holdout summary)    
<br>
3. [**Brand Co-occurrence Network**](#metrics)

    - [Error Metrics](#error)
    - [Classification Metrics](#class)
    - [Rank Metrics](#rank)     
<br>

4. [**Recommender**](#metrics)

    - [Error Metrics](#error)
    - [Classification Metrics](#class)
    - [Rank Metrics](#rank) 

In [2]:
from sqlalchemy import create_engine
from sqlalchemy_utils import database_exists, create_database
import psycopg2
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
import sys
sys.path.append('../script')
import recommender as rd
from sklearn.feature_extraction.text import CountVectorizer


In [5]:
# Example: recommend items similar to a selected item
df = rd.load_item_data()
cr = rd.customized_recommender()

select_item = df.index[6]
cr.filter_recommendation(select_item)
cr.recommend_items()

| id | description | item_type | size | brand | size_number | tokens |
|---|---|---|---|---|---|---|
| 6798 | | | | | | |
| 2980 | Ladylike but with a modern attitude, Helmut La... | bags | One-Size | Helmut Lang | | modern bunk pod pocket room main look access p... |
| 1462 | Very dark blue leather crossbody bag from Long... | accessories | One-Size | Longchamp | | dark bag |
| 6235 | Chanel black quilted caviar leather Grand Shop... | bags | One-Size | Chanel | | black grand shop tot bag front logo half moon ... |
| 5821 | Perfect for summer! | accessories | One-Size | Chanel | | perfect sum |
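
The filtering step behind a call like `cr.filter_recommendation` is not shown in this notebook. As a rough sketch of the idea only, the cell below keeps candidate items that match a renter's sizes, brands, and item types, then ranks them by similarity to the selected item. The `catalogue` table, the `renter` profile, the `similarity` column, and the `recommend` helper are all hypothetical and do not come from the `recommender` module.

```python
import pandas as pd

# Hypothetical catalogue (illustrative values only, not Style Lend data).
# "similarity" stands in for the similarity of each item to the selected item.
catalogue = pd.DataFrame(
    {
        "item_type": ["bags", "dresses", "dresses", "accessories"],
        "size": ["One-Size", "S", "M", "One-Size"],
        "brand": ["Chanel", "Helmut Lang", "Chanel", "Longchamp"],
        "similarity": [0.9, 0.7, 0.8, 0.4],
    },
    index=pd.Index([11, 12, 13, 14], name="id"),
)

# Hypothetical renter profile, as it might be extracted from renting history
renter = {
    "sizes": {"S", "One-Size"},
    "brands": {"Chanel", "Helmut Lang"},
    "item_types": {"bags", "dresses"},
}


def recommend(catalogue, renter, top_n=3):
    """Keep items matching the renter's sizes, brands and item types,
    then return the ids of the top_n most similar items."""
    mask = (
        catalogue["size"].isin(renter["sizes"])
        & catalogue["brand"].isin(renter["brands"])
        & catalogue["item_type"].isin(renter["item_types"])
    )
    ranked = catalogue[mask].sort_values("similarity", ascending=False)
    return ranked.head(top_n).index.tolist()


recommended_ids = recommend(catalogue, renter)
```

A hard filter like this is the simplest option; a softer variant could instead down-weight the similarity of items outside the renter's preferred brands rather than dropping them.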


In [6]:
items

[2598,
 1462,
 6235,
 5821,
 2305,
 7079,
 2306,
 2631,
 5823,
 7081,
 2638,
 1459,
 5825,
 5826,
 5828,
 5668,
 2478,
 5816,
 8040,
 5815,
 6365,
 1263,
 3279,
 2263,
 2759,
 3572,
 545,
 3187,
 3213,
 551,
 2534,
 6315,
 931,
 7266,
 5682,
 1568,
 1278,
 851,
 854,
 3184,
 253,
 3610,
 5879,
 3032,
 946,
 724,
 2957,
 1280,
 1275,
 2452,
 2011,
 5484,
 2125,
 930,
 5892,
 2956,
 547,
 4434,
 6267,
 1279,
 1281,
 1277,
 6586,
 3099,
 929,
 543,
 6588,
 5902,
 6830,
 2277,
 4451,
 5708,
 5709,
 2954,
 1525,
 6265,
 2632,
 3256,
 836,
 3069,
 5655,
 5715,
 5714,
 2289,
 6449,
 2476,
 1705,
 7800,
 3020,
 6844,
 6604,
 5890,
 5893,
 6847,
 5734,
 3265,
 3573,
 3574,
 7811,
 6850,
 1290,
 1469,
 6852,
 1623,
 2907,
 1300,
 6853,
 3319,
 6854,
 6856,
 6859,
 6857,
 476,
 2222,
 6864,
 6865,
 4029,
 3491,
 6866,
 944,
 3503,
 3508,
 6869,
 6870,
 3228,
 2578,
 3517,
 3526,
 3528,
 2630,
 6871,
 3504,
 5479,
 6411,
 3518,
 2717,
 6413,
 2668,
 4614,
 2716,
 2009,
 6877,
 6876,
 3014,
 834,
 

# Example Code

In [3]:
# Example: tokenize item descriptions and save the result
df = rd.load_item_data()
df = rd.description_tokenizer(df)
df.to_csv("../source_data/item_df_n_tokens.csv")
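
The internals of `rd.description_tokenizer` are not shown here, so the following is only a minimal sketch of how item-to-item similarity could be computed from description text, using the `CountVectorizer` imported at the top of the notebook together with cosine similarity. The toy descriptions, item ids, and the names `toy_items` and `item_similarity` are made up for illustration.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy item descriptions (made up for illustration; not Style Lend data)
toy_items = pd.DataFrame(
    {
        "description": [
            "black quilted leather tote bag",
            "dark blue leather crossbody bag",
            "floral silk summer dress",
        ]
    },
    index=pd.Index([101, 102, 103], name="id"),
)

# Bag-of-words counts per description, dropping common English stop words
vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(toy_items["description"])

# Pairwise cosine similarity between items, labelled by item id
item_similarity = pd.DataFrame(
    cosine_similarity(counts), index=toy_items.index, columns=toy_items.index
)

# Most similar other item to item 101 (excluding the item itself)
most_similar = item_similarity.loc[101].drop(101).idxmax()
```

In this toy data the two bags share the words "leather" and "bag", so they score higher with each other than either does with the dress.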

In [3]:
# Example: build the brand co-occurrence matrix and save it
bs = rd.brand_similarity()
brand_matrix = bs.brand_cooccur_matrix(thres=5)
brand_matrix.to_csv("../source_data/brand_matrix.csv")

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  my_data['brand_code'] = le.transform(my_data['brand'])
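
How `brand_cooccur_matrix` builds its matrix is not shown in this notebook. One plausible reading, sketched below under that assumption, is to count for every pair of brands how many renters rented both, and to zero out pairs whose count falls below a threshold. The `rentals` table and the names `indicator`, `cooccur`, and `thres` (borrowed from the call above, with an assumed meaning) are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical renting history: one row per (renter, brand) rental
rentals = pd.DataFrame(
    {
        "renter_id": [1, 1, 2, 2, 2, 3, 3],
        "brand": [
            "Chanel", "Longchamp",
            "Chanel", "Longchamp", "Helmut Lang",
            "Chanel", "Helmut Lang",
        ],
    }
)

# Renter x brand indicator matrix: 1 if the renter ever rented the brand
indicator = (pd.crosstab(rentals["renter_id"], rentals["brand"]) > 0).astype(int)

# Brand x brand counts: number of renters who rented both brands
counts = indicator.to_numpy().T @ indicator.to_numpy()
np.fill_diagonal(counts, 0)  # ignore a brand co-occurring with itself

# Assumed meaning of the threshold: drop pairs seen fewer than `thres` times
thres = 2
counts[counts < thres] = 0

cooccur = pd.DataFrame(counts, index=indicator.columns, columns=indicator.columns)
```

The resulting symmetric matrix can be read as a weighted adjacency matrix, which is the form a graph library expects when drawing a brand network.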


In [6]:
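# Example of brand network analysis
import networkx as nx
import warnings
warnings.filterwarnings('ignore')

# Assumption: brand_matrix (built in the previous cell) is a square DataFrame
# whose index holds the brand names.
A = brand_matrix.values
G = nx.from_numpy_array(A)  # from_numpy_matrix was removed in NetworkX 3.0
pos = nx.circular_layout(G)

nx.draw(G, pos=pos, node_size=13)

# Label each node with its brand name
labels = {i: brand for i, brand in enumerate(brand_matrix.index)}

nx.draw_networkx_labels(G, pos, labels, font_size=10, font_color='r')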