# Intro to Machine Learning: Recommender - Naive Engine with Clustering

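We use a clustering algorithm such as $k$-means to group users with similar tastes. Recommendations then come only from a user's own cluster, which reduces the search space and should make the suggestions more relevant.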

## Outline

* [Loading Data](#Loading-Data)
* [Creating a Pivot Table](#Creating-a-Pivot-Table)
* [Clustering](#Clustering)
* [Implementing a Recommender](#Implementing-a-Recommender)

## Loading Data

In [None]:
import pandas as pd

In [None]:
wine_ratings = pd.read_csv('data/wine-reviews.csv')

In [None]:
wine_ratings.head()

---

## Creating a Pivot Table

In [None]:
wine_ratings_pivoted = wine_ratings.pivot(index='username', columns='wine', values='rating')
wine_ratings_pivoted

In [None]:
wine_ratings_pivoted = wine_ratings_pivoted.fillna(0)
wine_ratings_pivoted
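
To see what the pivot step does, here is a minimal sketch on a tiny, made-up ratings table. The usernames, wines, and ratings are hypothetical and are not taken from `data/wine-reviews.csv`. It assumes the long format has one row per (username, wine) pair, which `pivot` requires.

```python
import pandas as pd

# Hypothetical long-format ratings: one row per (username, wine) pair.
toy_ratings = pd.DataFrame({
    'username': ['ana', 'ana', 'ben', 'ben', 'cy'],
    'wine': ['Wine A', 'Wine B', 'Wine A', 'Wine C', 'Wine B'],
    'rating': [5.0, 3.0, 4.0, 2.0, 1.0],
})

# Wide format: one row per user, one column per wine.
toy_pivoted = toy_ratings.pivot(index='username', columns='wine', values='rating')

# (user, wine) pairs with no rating come out as NaN; fill them with 0.
toy_pivoted = toy_pivoted.fillna(0)
print(toy_pivoted)
```

Note that filling with 0 makes "not rated" indistinguishable from a rating of 0, so the rest of this notebook reads 0 as "never tasted".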

---

## Clustering

In [None]:
# import
from sklearn import cluster

# instantiate
k = 3
kmeans = cluster.KMeans(n_clusters=k, random_state=1)

# fit
kmeans.fit(wine_ratings_pivoted)

In [None]:
kmeans.cluster_centers_

In [None]:
kmeans.labels_

In [None]:
# predict
kmeans.predict([wine_ratings_pivoted.loc['teus', :]])

In [None]:
# my ratings for the 11 wines, in the same column order as the pivot table (0 = not tasted)
my_taste = [5.0, 0.0, 1.0, 0.0, 4.0, 1.0, 2.0, 4.0, 0.0, 5.0, 5.0]

We can assign my taste vector to one of the clusters with `predict`.

In [None]:
kmeans.predict([my_taste])
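
As a sanity check on what `predict` does, the sketch below fits $k$-means on a small, made-up ratings matrix and confirms that `predict` assigns a new vector to the cluster whose centroid is nearest. The matrix `X`, the vector `new_user`, and the name `toy_kmeans` are illustrative assumptions, not part of the wine data.

```python
import numpy as np
from sklearn import cluster

# Hypothetical ratings matrix: 6 users x 3 wines, with two obvious taste groups.
X = np.array([
    [5.0, 4.0, 0.0],
    [4.0, 5.0, 0.0],
    [5.0, 5.0, 1.0],
    [0.0, 1.0, 5.0],
    [1.0, 0.0, 4.0],
    [0.0, 0.0, 5.0],
])

toy_kmeans = cluster.KMeans(n_clusters=2, n_init=10, random_state=1).fit(X)

new_user = np.array([4.0, 4.0, 0.0])  # hypothetical taste vector

# Euclidean distance from the new user to every centroid.
distances = np.linalg.norm(toy_kmeans.cluster_centers_ - new_user, axis=1)
nearest = int(np.argmin(distances))

# predict() should agree with the nearest-centroid rule.
predicted = int(toy_kmeans.predict([new_user])[0])
print(nearest, predicted)
```

The cluster numbers themselves are arbitrary; only the grouping they describe is meaningful.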

We label each user by adding a new column "label" to the data frame.

In [None]:
wine_ratings_pivoted['label'] = kmeans.labels_

In [None]:
wine_ratings_pivoted
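
Adding `label` to the ratings frame makes it one column wider than a taste vector, which matters for any later step that indexes the wine columns. One possible alternative, sketched here on made-up data, keeps the labels in a separate Series aligned on the user index. The names `toy_pivoted`, `toy_labels`, and `user_labels` are hypothetical.

```python
import pandas as pd

# Hypothetical pivoted ratings and cluster labels (stand-ins for the real ones).
toy_pivoted = pd.DataFrame(
    [[5.0, 0.0], [4.0, 1.0], [0.0, 5.0]],
    index=['ana', 'ben', 'cy'],
    columns=['Wine A', 'Wine B'],
)
toy_labels = [0, 0, 1]

# Keep the labels in a separate Series aligned on the user index,
# so the ratings frame still contains only wine columns.
user_labels = pd.Series(toy_labels, index=toy_pivoted.index, name='label')

# Users in cluster 0, selected without touching the ratings columns.
same_cluster = toy_pivoted[user_labels == 0]
print(same_cluster)
```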

List the users who fall into the same cluster as my taste vector.

In [None]:
my_taste = [5.0, 0.0, 1.0, 0.0, 4.0, 1.0, 2.0, 4.0, 0.0, 5.0, 5.0]
pred = kmeans.predict([my_taste])
pred

In [None]:
wine_ratings_pivoted[wine_ratings_pivoted.label == pred[0]]

---

## Implementing a Recommender

### Find the wines from Pepe's list that I have never tasted

In [None]:
# drop 'label' so the boolean mask lines up with the 11 wine columns
wine_ratings_pivoted.drop(columns='label').loc['pepe', [r == 0 for r in my_taste]]

We should then recommend "Rosseau Chambertin 2001", because Pepe has tasted it and rated it highly.

### Find the wines from Yasset's list that I have never tasted

In [None]:
# drop 'label' so the boolean mask lines up with the 11 wine columns
wine_ratings_pivoted.drop(columns='label').loc['yasset', [r == 0 for r in my_taste]]

Here there is nothing to recommend, since Yasset has not tried any of the wines listed above.
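
The manual checks above can be folded into one helper. The following is a rough sketch rather than a definitive implementation: the function name `recommend`, the threshold `min_rating`, and the toy data are all assumptions made for illustration. It treats 0 as "not tasted" for both me and my peers, and suggests the untasted wines that peers in my cluster rated highly on average.

```python
import pandas as pd

def recommend(my_ratings, peer_ratings, min_rating=4.0):
    """Return wines I have not tasted (0) whose mean peer rating is >= min_rating."""
    untasted = my_ratings[my_ratings == 0].index
    # Treat a peer's 0 as "not rated" so it does not drag the mean down.
    peer_means = peer_ratings[untasted].replace(0, float('nan')).mean()
    good = peer_means[peer_means >= min_rating]
    return good.sort_values(ascending=False)

# Hypothetical data: my ratings and the ratings of two users in my cluster.
wines = ['Wine A', 'Wine B', 'Wine C', 'Wine D']
me = pd.Series([5.0, 0.0, 0.0, 0.0], index=wines)
peers = pd.DataFrame(
    [[5.0, 4.0, 0.0, 2.0],
     [4.0, 5.0, 0.0, 0.0]],
    index=['peer1', 'peer2'],
    columns=wines,
)

result = recommend(me, peers)
print(result)
```

With the real data, `peer_ratings` would be the rows of the pivot table that share my cluster, with the `label` column dropped.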