## Recommendation App
- [Importing libraries](#import)
- [Fetching data frames and models](#fetching)
- [Normalization](#norm)
- [Recommendation walkthrough](#walkthrough)
  - [Finding cluster function](#findingCluster)
  - [Combining features](#combination)

<a id="import"></a>
### Imports

In [104]:
import pandas as pd
import pickle
import numpy as np
from sklearn.metrics import pairwise_distances

from pymongo import MongoClient

In [105]:
client = MongoClient('ec2-34-198-179-91.compute-1.amazonaws.com', 27017)
db = client.fletcher
dress_col = db.rtr_dresses
cur_dress = dress_col.find()

<a id="fetching"></a>
### Fetching the related data frames and models

In [3]:
df_general = pd.read_csv('../data/dress_features.csv', index_col=0)

In [78]:
df_body = pd.read_csv('../data/dress_features_bt.csv', index_col=0)

<a id="norm"></a>
### Normalization
- `df_body` does not need to be normalized: we only list the dresses in order of their score.
- `df_general` is normalized per column (min-max scaling), so that every column ranges from 0 to 1.
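
As a quick illustration of per-column min-max scaling, here is a minimal sketch on a toy table. The values and the dress names (`dress_a`, ...) are made up for the example and are not taken from the real data.

```python
import pandas as pd

# Toy feature table (made-up values, not the real dress data).
toy = pd.DataFrame(
    {"back": [-0.4, 0.1, 0.5], "pockets": [0.0, 0.01, 0.02]},
    index=["dress_a", "dress_b", "dress_c"],
)

# Column-wise min-max scaling: (x - min) / (max - min).
col_min = toy.min()
col_range = toy.max() - col_min
toy_norm = (toy - col_min) / col_range

print(toy_norm)
```

After scaling, the smallest value in each column is 0 and the largest is 1, regardless of the original range of the column.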

In [16]:
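def normalize(df):
    """Min-max scale every column to [0, 1]; also return each column's min and range."""
    mapping = {}
    df_norm = df.copy()
    for col in df.columns:
        # Use col_min / col_range so the built-ins max() and min() are not shadowed.
        col_min = df[col].min()
        col_range = df[col].max() - col_min
        # Note: a constant column has col_range == 0 and would produce NaN here.
        df_norm[col] = (df[col] - col_min) / col_range
        mapping[col] = {'min': col_min, 'col_range': col_range}
    return df_norm, mapping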

In [17]:
df_norm, mapping = normalize(df_general)

In [19]:
df_norm.head(10)

url,back,bra,color,material,sequins_polar,sequins_unpolar,wedding,pockets
https://www.renttherunway.com/shop/designers/alexis/red_leona_dress,0.490403,0.367929,0.681637,0.397041,0.265116,0.015512,0.297983,9.6e-05
https://www.renttherunway.com/shop/designers/allison_parris/cobalt_marilyn_gown,0.580244,0.270334,0.574516,0.361258,0.381998,0.012793,0.596208,0.464545
https://www.renttherunway.com/shop/designers/badgley_mischka/award_winner_gown,0.611154,0.350834,0.59636,0.32952,0.310286,0.281476,0.302956,0.000126
https://www.renttherunway.com/shop/designers/badgley_mischka/curves_for_days_gown,0.525334,0.389743,0.539367,0.338634,0.459688,0.00089,0.35125,7.6e-05
https://www.renttherunway.com/shop/designers/badgley_mischka/evergreen_sequin_dress,0.567095,0.49744,0.644953,0.284895,0.350572,0.575796,0.097662,0.000775
https://www.renttherunway.com/shop/designers/badgley_mischka/fifth_avenue_showstopper_dress,0.494343,0.439424,0.560989,0.302429,0.319496,0.471619,0.296763,0.003644
https://www.renttherunway.com/shop/designers/badgley_mischka/forbidden_territory_gown,0.559083,0.483927,0.325848,0.297377,0.413565,0.030063,0.238648,0.000727
https://www.renttherunway.com/shop/designers/badgley_mischka/garden_of_sequins_dress,0.465413,0.404489,0.571204,0.421545,0.238415,0.346876,0.423181,1.1e-05
https://www.renttherunway.com/shop/designers/badgley_mischka/glitz_gown,0.489033,0.428355,0.580295,0.264312,0.31674,0.301871,0.23751,0.001017
https://www.renttherunway.com/shop/designers/badgley_mischka/ivy_gown,0.419384,0.496229,0.648871,0.36353,0.296689,0.562428,0.345795,0.00024


In [20]:
mapping

{'back': {'col_range': 0.89583333333333337, 'min': -0.39583333333333337},
 'bra': {'col_range': 0.42010582010582009, 'min': 0.0},
 'color': {'col_range': 0.80833333333333346, 'min': -0.083333333333333329},
 'material': {'col_range': 0.99041666666666661, 'min': -0.07166666666666667},
 'pockets': {'col_range': 0.027574314277116586, 'min': 0.0},
 'sequins_polar': {'col_range': 0.7414783950617283,
  'min': -0.15131172839506174},
 'sequins_unpolar': {'col_range': 0.013589843511914119, 'min': 0.0},
 'wedding': {'col_range': 0.010429103797054551, 'min': 0.00012572456065779416}}
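
The stored `min` and `col_range` make it possible to scale a new raw value the same way, or to map a normalized value back to its original scale. The sketch below assumes a mapping with the same shape as the output above; `toy_mapping`, `scale`, and `unscale` are hypothetical names, and the numbers are rounded for readability.

```python
# Hypothetical mapping entry with the same shape as the output of normalize().
toy_mapping = {"back": {"min": -0.4, "col_range": 0.9}}

def scale(col, value, mapping):
    """Apply the stored min-max parameters to a raw value."""
    params = mapping[col]
    return (value - params["min"]) / params["col_range"]

def unscale(col, normed, mapping):
    """Invert scale(): recover the raw value from a normalized one."""
    params = mapping[col]
    return normed * params["col_range"] + params["min"]

raw = 0.05
normed = scale("back", raw, toy_mapping)
restored = unscale("back", normed, toy_mapping)
print(normed, restored)
```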

<a id='walkthrough'></a>
### Recommendation Walkthrough
1. The user enters their body information.
2. Assign the user to a cluster.
3. Rank the dresses from 1 to 10.
4. Get the user's preferences.
5. Take the columns the user cares about and assign each of them a value of 1.
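
Steps 4 and 5 can be sketched as follows. The feature names come from the normalized table above; the `preferences` list is a hypothetical user input, and turning names into positional indices is an assumption about how the `columns` argument used later in this notebook might be built.

```python
import pandas as pd

# Feature columns as they appear in the normalized table above.
feature_cols = pd.Index(
    ["back", "bra", "color", "material",
     "sequins_polar", "sequins_unpolar", "wedding", "pockets"]
)

# Hypothetical user preferences: the features this user cares about.
preferences = ["pockets", "back"]

# Positional indices, suitable for DataFrame.iloc[:, columns].
columns = [feature_cols.get_loc(name) for name in preferences]

# The "ideal" dress scores 1 on every selected feature.
target = [1] * len(columns)

print(columns, target)
```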

In [27]:
# Context managers make sure the files are closed after loading.
with open('../tools/clustering_model.sav', 'rb') as f:
    cluster_model = pickle.load(f)
with open('../data/cluster_mapping.pkl', 'rb') as f:
    cluster_mapping = pickle.load(f)

In [83]:
df_body = df_body.fillna(0)

<a id="findingCluster"></a>
#### Finding cluster

In [51]:
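num_cols = ['age', 'usually_wears', 'pregnant', 'weight', 'upper_bust', 'under_bust', 'height_in']
def cluster(age=None, usually_wears=None, pregnant=None, weight=None, upper_bust=None, under_bust=None, height=None, body_type=None):
    """Predict the cluster for one user; missing measurements default to the midpoint 0.5."""
    body_info = [age, usually_wears, pregnant, weight, upper_bust, under_bust, height]
    norm_vals = []
    for col, val in zip(num_cols, body_info):
        col_map = cluster_mapping[col]
        # Test for None explicitly so that a legitimate 0 (e.g. pregnant=0) is not treated as missing.
        if val is None:
            normed = 0.5
        else:
            normed = (val - col_map['min']) / col_map['col_range']
        if 'weight' in col_map:
            normed = normed * col_map['weight']
        norm_vals.append(normed)

    # One-hot encode the body type (6 categories); None or a negative value leaves it all zeros.
    body_type_array = np.zeros(6)
    if body_type is not None and body_type >= 0:
        body_type_array[body_type] = 1
    features = np.append(norm_vals, body_type_array)
    # predict() expects a 2-D array of shape (n_samples, n_features).
    return cluster_model.predict(features.reshape(1, -1))[0]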
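
To make the shape of the model input concrete, here is a self-contained sketch of the same idea: normalize the numeric fields, one-hot encode the body type, and pass a single-row 2-D array to `predict`. Everything in it is a stand-in: `toy_mapping`, `ToyCentroidModel`, and `to_features` are hypothetical, since the real `cluster_mapping` and `cluster_model` live in pickle files that are not shown here.

```python
import numpy as np

# Hypothetical stand-in for cluster_mapping (two fields only, made-up values).
toy_mapping = {
    "age": {"min": 18, "col_range": 50},
    "weight": {"min": 90, "col_range": 160, "weight": 2.0},
}

class ToyCentroidModel:
    """Minimal nearest-centroid model exposing a scikit-learn style predict(X)."""

    def __init__(self, centroids):
        self.centroids = np.asarray(centroids, dtype=float)

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        if X.ndim != 2:
            raise ValueError("expected a 2-D array of shape (n_samples, n_features)")
        # Distance from every sample to every centroid, then pick the closest.
        dists = np.linalg.norm(X[:, None, :] - self.centroids[None, :, :], axis=2)
        return dists.argmin(axis=1)

def to_features(values, mapping, n_body_types=3, body_type=None):
    """Normalize numeric fields and append a one-hot body-type vector."""
    normed = []
    for col, val in values.items():
        params = mapping[col]
        x = 0.5 if val is None else (val - params["min"]) / params["col_range"]
        normed.append(x * params.get("weight", 1.0))
    one_hot = np.zeros(n_body_types)
    if body_type is not None:
        one_hot[body_type] = 1
    return np.append(normed, one_hot)

features = to_features({"age": 43, "weight": None}, toy_mapping, body_type=1)
model = ToyCentroidModel([[0, 0, 1, 0, 0], [0.5, 1, 0, 1, 0]])
label = model.predict(features.reshape(1, -1))[0]
print(features, label)
```

The `reshape(1, -1)` call turns the flat feature vector into one row, which is the layout scikit-learn estimators expect for a single sample.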

<a id="combination"></a>
#### Combining general and body specific recommender

In [79]:
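def get_dress_weights(cluster):
    # Select the row for this cluster, turn it into one row per dress, and sort by score (best first).
    return df_body[df_body.index == cluster].transpose().sort_values(by=cluster, ascending=False)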
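
The selection and transpose can be traced on a toy table. The sketch assumes, as the code above suggests, that `df_body` has one row per cluster and one column per dress; the scores and dress names are made up.

```python
import pandas as pd

# Made-up cluster-by-dress score table: one row per cluster, one column per dress.
toy_body = pd.DataFrame(
    {"dress_a": [3.0, 9.0], "dress_b": [8.0, 0.0], "dress_c": [5.0, 4.0]},
    index=[0, 1],
)

cluster_id = 0
# Select the cluster's row, flip it to one row per dress, sort best first.
weights = toy_body.loc[[cluster_id]].T.sort_values(by=cluster_id, ascending=False)
print(weights)
```

The result is a single-column frame indexed by dress and named after the cluster id, which is why the cluster id can later be used as a column label.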
    

In [117]:
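def get_general_df(df, columns):
    # Copy the selection so that adding 'dist' does not write into a view of df.
    cur_df = df.iloc[:, columns].copy()
    # Euclidean distance from each dress to the "ideal" point (1 on every selected feature).
    # pairwise_distances expects 2-D inputs, hence the nested list; ravel() flattens the (n, 1) result.
    cur_df['dist'] = pairwise_distances(cur_df, [[1] * len(columns)]).ravel()
    return cur_df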
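
With its default metric, `pairwise_distances` computes Euclidean distances, so the `dist` column is the distance of each dress to the all-ones point. A minimal NumPy sketch of the same computation on made-up data:

```python
import numpy as np
import pandas as pd

# Made-up normalized features for three dresses.
toy_norm = pd.DataFrame(
    {"back": [1.0, 0.0, 0.4], "pockets": [1.0, 0.0, 1.0]},
    index=["dress_a", "dress_b", "dress_c"],
)

# Ideal point: 1 on every selected feature.
target = np.ones(toy_norm.shape[1])

# Euclidean distance of every row to the target.
dist = np.sqrt(((toy_norm.to_numpy() - target) ** 2).sum(axis=1))
result = toy_norm.assign(dist=dist)
print(result)
```

A dress that scores 1 on every selected feature has distance 0, and the largest possible distance for `k` selected features in [0, 1] is `sqrt(k)`.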

In [111]:
def get_recommendations(cluster, columns, n):
    df_cur = get_dress_weights(cluster)
    df_gen = get_general_df(df_norm, columns)
    main_df = df_cur.join(df_gen)
    main_df['total'] = main_df['dist'] * main_df[cluster]
    
    rec = []
    for url in main_df.sort_values('total', ascending=False).head(n).index:
        dress = dress_col.find_one({'url': url})
        rec.append((dress['dress_name'], dress['designer_name'], dress['img_link']))
    return rec
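
The scoring step can be traced on toy data without a database. In the sketch below the weights and distances are made up. Since `dist` is a distance to the ideal point, a smaller value means a closer match to the preferences, so the example also shows a hypothetical variant (`total_sim`) that converts the distance into a similarity before weighting. That variant is an assumption for comparison, not what the function above computes.

```python
import pandas as pd

# Made-up cluster weights (column label = cluster id) and distances per dress.
weights = pd.DataFrame({0: [8.0, 5.0, 3.0]}, index=["dress_b", "dress_c", "dress_a"])
general = pd.DataFrame({"dist": [1.2, 0.2, 0.0]}, index=["dress_b", "dress_c", "dress_a"])

cluster_id = 0
# Join on the shared dress index.
main_df = weights.join(general)

# Score as computed in get_recommendations: distance times cluster weight.
main_df["total"] = main_df["dist"] * main_df[cluster_id]

# Hypothetical variant: similarity in (0, 1] times cluster weight.
main_df["total_sim"] = main_df[cluster_id] / (1.0 + main_df["dist"])

top_by_total = main_df.sort_values("total", ascending=False).index[0]
top_by_sim = main_df.sort_values("total_sim", ascending=False).index[0]
print(main_df)
print(top_by_total, top_by_sim)
```

On this toy data the two scores pick different dresses, which makes it easy to check which behaviour is intended.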

In [121]:
df_norm.to_csv('../data/dress_features_norm.csv')

In [127]:
# Use a context manager so the file is flushed and closed after writing.
with open('../data/pref_columns.pkl', 'wb') as f:
    pickle.dump(df_norm.columns, f)
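
A consumer of the pickled columns can load them back with `pickle.load`. The round trip below is a sketch that writes to a temporary directory instead of `../data/`, with a shortened, made-up column list.

```python
import os
import pickle
import tempfile

import pandas as pd

# Shortened stand-in for df_norm.columns.
cols = pd.Index(["back", "bra", "color"])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "pref_columns.pkl")  # temporary location for the demo
    with open(path, "wb") as f:
        pickle.dump(cols, f)
    with open(path, "rb") as f:
        loaded = pickle.load(f)

# Plain list of preference names, e.g. for building a form in the app.
pref_names = list(loaded)
print(pref_names)
```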