# BBL #1

## [Learning Embeddings for Product Size Recommendations](https://bit.ly/2Z88rff)


## Size recommendations

### State of the Art


* 8 Romain Guigourès, Yuen King Ho, Evgenii Koriagin, Abdul-Saboor Sheikh, Urs Bergmann, and Reza Shirvany (Zalando). 2018. [A Hierarchical Bayesian Model for Size Recommendation in Fashion](https://rguigoures.github.io/pdf/hierarchical-bayesian-model_final.pdf)

* 13 Rishabh Misra, Mengting Wan, and Julian McAuley. 2018. [Decomposing Fit Semantics for Product Size Recommendation in Metric Spaces](https://cseweb.ucsd.edu/~jmcauley/pdfs/recsys18e.pdf)
  * learns a latent space for customers and products by applying ordinal regression.
  * A fitness score is computed for each purchase, and size ordering is enforced based on the customer’s feedback on the purchased size (i.e. too small, too big or a good fit). To handle class imbalance, metric learning techniques transform the data into a space where purchases of the same class are closer together and purchases of different classes are separated by a margin.

* 18 Vivek Sembium, Rajeev Rastogi, Atul Saroop, and Srujana Merugu. 2017. [Recommending Product Sizes to Customers](https://cseweb.ucsd.edu/classes/fa17/cse291-b/reading/p243-sembium.pdf)
  * Ordinal regression problem
  * customer and product true sizes are learned by taking their differences and feeding them into a linear model
  * proposes a solution for determining whether an article of a certain size would be a good fit, too large, or too small for a certain customer, using the purchase history.
  * This is achieved by iteratively deducing the true sizes for customers and products, fitting a linear function based on the difference in sizes, and performing ordinal regression on the output of the function to get the loss.
  * Extra features are simply included by addition to the linear function.
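
The steps above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the weight `w`, the two thresholds, the sign convention (a high score means the product is too small) and the all-threshold hinge loss are assumptions chosen for the example.

```python
# Ordered classes: a low score means the product is too large for the customer,
# a high score means it is too small (sign convention assumed for illustration).
CLASSES = ["large", "fit", "small"]


def fit_score(customer_size, product_size, w=1.0, extra=0.0):
    """Linear function of the difference between the two true sizes.

    `extra` stands in for additional features, which are simply added.
    """
    return w * (customer_size - product_size) + extra


def predict_fit(score, thresholds=(-0.5, 0.5)):
    """Map a score to an ordinal class using two ordered thresholds."""
    b1, b2 = thresholds
    if score < b1:
        return "large"
    if score > b2:
        return "small"
    return "fit"


def ordinal_hinge_loss(score, label, thresholds=(-0.5, 0.5)):
    """All-threshold hinge loss: one binary hinge term per threshold."""
    c = CLASSES.index(label)
    loss = 0.0
    for k, b in enumerate(thresholds):
        # The score should lie above threshold k when the true class is above it.
        sign = 1.0 if c > k else -1.0
        loss += max(0.0, 1.0 - sign * (score - b))
    return loss


score = fit_score(customer_size=10.0, product_size=8.0)
print(predict_fit(score), ordinal_hinge_loss(score, "small"))
```

In a full training loop, the customer and product true sizes would be updated alternately to reduce this loss over the purchase history.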

* 19 Vivek Sembium, Rajeev Rastogi, Lavanya Tekumalla, and Atul Saroop. 2018. [Bayesian Models for Product Size Recommendations](http://delivery.acm.org/10.1145/3190000/3186149/p679-sembium.pdf)
  * extends the work of [18] with a Bayesian logit and probit regression model with ordinal categories
  * The posterior distribution over customer and product true sizes is based on mean-field variational inference with Pólya-Gamma augmentation.
  
The Bayesian approach allows the use of priors for handling data sparsity and the computation of confidence intervals for dealing with noisy data. Both [18] and [19] generate ordinal categorical variables based on explicit customer feedback on returned items (e.g. too small, too big or no return).

---

* 4 [Size Recommendation System for Fashion E-Commerce](https://kddfashion2017.mybluemix.net/final_submissions/ML4Fashion_paper_8.pdf) 
  * learns latent product features using Word2Vec and feeds them into a Gradient Boosting classifier along with additional product features (e.g. physical measurements, colour, etc.).
  * However, additional product features are often difficult to obtain.

* 20 Shreya Singh, G. Mohammed Abdulla, Sumit Borar, and Sagar Arora. 2018. [Footwear Size Recommendation System](https://arxiv.org/pdf/1806.11423.pdf)
  * extends [4] to the specific case of footwear size recommendations and also proposes a probabilistic graphical approach that exploits brand similarities.


## Size recommendations


### Zalando

[A Hierarchical Bayesian Model for Size Recommendation in Fashion](https://rguigoures.github.io/pdf/hierarchical-bayesian-model_final.pdf)

* Represent articles as a combination of brand, usage, size, and fit
* A neural network is then trained to learn a latent vector describing each article defined as the combination of features mentioned before
* Customer vector representation is obtained by aggregating over purchased articles
* Finally, a gradient boosted classifier predicts the fit of an article to a customer: it learns the joint probability of a customer purchasing a given product size and the resulting return status being either too small, too big or no return

## Size recommendations

### Asos Approach

**_Neural Collaborative Filtering approach_**

Collaborative filtering uses customer-product interactions and is based on the assumption that customers buying similar products have similar tastes. This principle naturally translates into the size and fit domain as "customers with similar body shapes tend to buy clothes in similar sizes".

* learn a latent space at a product size level instead of at a product level i.e. we have a different vector for every possible size of a product
* asymmetric framework so that users are not represented explicitly, but as the aggregate of the product vectors with which they have interacted
* train different models for each product category (tops, bottoms or shoes), so all trained embeddings belong to the same category and the learned latent space represents the same body part
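
A minimal sketch of the asymmetric idea, assuming hypothetical two-dimensional embeddings keyed by `(product, size)`. Mean aggregation and dot-product scoring are illustrative choices here, not necessarily what the paper's network does.

```python
# Hypothetical product-size embeddings for a single category (e.g. bottoms).
embeddings = {
    ("jeans_a", "S"): [1.0, 0.0],
    ("jeans_a", "M"): [0.0, 1.0],
    ("jeans_b", "S"): [0.9, 0.1],
    ("jeans_b", "M"): [0.1, 0.9],
    ("jeans_c", "S"): [0.8, 0.2],
    ("jeans_c", "M"): [0.2, 0.8],
}


def user_vector(history):
    """Represent a user as the mean of the product-size vectors they bought."""
    vectors = [embeddings[key] for key in history]
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def recommend_size(history, product, sizes):
    """Score every size of `product` against the user vector, return the best."""
    u = user_vector(history)
    scores = {s: dot(u, embeddings[(product, s)]) for s in sizes}
    return max(scores, key=scores.get), scores


history = [("jeans_a", "M"), ("jeans_b", "M")]
best, scores = recommend_size(history, "jeans_c", ["S", "M"])
print(best, scores)
```

Because users have no parameters of their own, a new customer can be scored as soon as they have at least one purchase in the category.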



## Size recommendations

### Asos Approach

__*Transfer from Brand to Product*__

Modelling product-size combinations instead of just products makes the interaction matrix ~10 times sparser than for product recommendations.

* Assume that each brand has consistent sizes
* Learn latent representations _V<sub>b<sub>s</sub></sub>_ for every combination of brand *b* and size _s_
* Transfer this to a product level: V<sub>p<sub>s</sub></sub> = V<sub>b<sub>s</sub></sub>, ∀p<sub>s</sub> ∈ b<sub>s</sub>

Impacts
* Improves generalisation
* Boosts performance
* Leads to faster convergence
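
The transfer step can be illustrated with a small sketch. The brand, product and size names below are hypothetical, and whether the copied vectors are fine-tuned afterwards is not specified in these notes.

```python
# Hypothetical brand-size vectors, learned in a first training stage.
brand_size_vectors = {
    ("brand_x", "M"): [0.2, 0.7],
    ("brand_x", "L"): [0.6, 0.3],
}

# Hypothetical mapping from each product to its brand.
product_brand = {"tee_1": "brand_x", "tee_2": "brand_x"}


def init_product_size_vectors(product_brand, brand_size_vectors):
    """Give every product-size the vector of its brand-size (V_ps = V_bs)."""
    vectors = {}
    for product, brand in product_brand.items():
        for (b, size), vec in brand_size_vectors.items():
            if b == brand:
                # Copy so that later product-level updates do not alter the brand vector.
                vectors[(product, size)] = list(vec)
    return vectors


product_size_vectors = init_product_size_vectors(product_brand, brand_size_vectors)
print(product_size_vectors[("tee_1", "M")])
```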



## Multiple Personas detection

_Goal: Reduce noise in recommendations_

### State of the Art

* Explicitly use different user profiles (Netflix)
* Use of empirically determined thresholds on the range of purchased sizes
* Statistical models
* Filter out users where the mean and standard deviation of the purchased sizes exceed a category-level threshold
* Hierarchical clustering method where clusters are iteratively merged as long as the standard deviation does not exceed an empirical threshold
* Persona distribution learned from a Dirichlet distribution
* Gaussian kernel density estimation approach refined to a Gaussian mixture model
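
As an illustration of the threshold-based hierarchical clustering idea, here is a minimal sketch. The merge rule (always merge the adjacent pair with the smallest resulting standard deviation) and the threshold value `max_std=1.5` are assumptions for the example, not taken from a specific paper.

```python
import statistics


def cluster_sizes(sizes, max_std=1.5):
    """Merge adjacent size clusters while the merged std stays under max_std."""
    clusters = [[s] for s in sorted(sizes)]
    while len(clusters) > 1:
        best_i, best_std = None, None
        # Find the adjacent pair whose merge has the smallest standard deviation.
        for i in range(len(clusters) - 1):
            std = statistics.pstdev(clusters[i] + clusters[i + 1])
            if best_std is None or std < best_std:
                best_i, best_std = i, std
        if best_std > max_std:
            break  # any further merge would exceed the empirical threshold
        clusters[best_i:best_i + 2] = [clusters[best_i] + clusters[best_i + 1]]
    return clusters


# Purchased sizes of one account; each remaining cluster is a candidate persona.
personas = cluster_sizes([8, 8, 10, 10, 16, 18, 16])
print(personas)
```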



## Multiple Personas detection

### Asos approach

* Gaussian Mixture Model
* Assumes that one user is centred around __one__ core size
* Each persona is represented by a Gaussian distribution, whose mean µ corresponds to the persona’s core size
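
A minimal sketch of fitting a one-dimensional Gaussian mixture to one account's purchased sizes with plain-Python EM. The number of personas, the initialisation and the `min_std` floor are assumptions for the example; in practice a library implementation such as scikit-learn's `GaussianMixture` would be the usual choice.

```python
import math


def gaussian_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))


def fit_gmm_1d(sizes, n_personas=2, n_iter=50, min_std=0.25):
    """Fit a 1-D Gaussian mixture with EM; each component is one persona."""
    lo, hi = min(sizes), max(sizes)
    # Initialise the means evenly across the observed size range.
    if n_personas == 1:
        means = [(lo + hi) / 2]
    else:
        means = [lo + (hi - lo) * k / (n_personas - 1) for k in range(n_personas)]
    stds = [max((hi - lo) / (2 * n_personas), min_std)] * n_personas
    weights = [1.0 / n_personas] * n_personas

    for _ in range(n_iter):
        # E-step: responsibility of each persona for each purchase.
        resp = []
        for x in sizes:
            p = [w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weight, mean (core size) and spread per persona.
        for k in range(n_personas):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(sizes)
            means[k] = sum(r[k] * x for r, x in zip(resp, sizes)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, sizes)) / nk
            stds[k] = max(math.sqrt(var), min_std)  # floor avoids collapse to zero
    return weights, means, stds


weights, means, stds = fit_gmm_1d([8, 8, 10, 10, 16, 16, 18])
print([round(m, 2) for m in means])  # the means play the role of core sizes
```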

## Multiple Personas detection

### Asos approach

