# Intro to Machine Learning: Recommender - User-Based Collaborative Filtering

## Outline

* [Loading Data](#Loading-Data)
* [Creating a Pivot Table](#Creating-a-Pivot-Table)
* [Measuring Similarity](#Measuring-Similarity)
* [Implementing a Recommender](#Implementing-a-Recommender)

## Loading Data

In [None]:
import pandas as pd

In [None]:
wine_ratings = pd.read_csv('data/wine-reviews.csv')

In [None]:
wine_ratings.head()

---

## Creating a Pivot Table

In [None]:
wine_ratings_pivoted = wine_ratings.pivot(index='wine', columns='username', values='rating').fillna(0)

In [None]:
wine_ratings_pivoted
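
To make the reshaping concrete, here is a minimal sketch of the same `pivot(...).fillna(0)` call on a tiny made-up table. The usernames, wines, and ratings are hypothetical and do not come from `wine-reviews.csv`.

```python
import pandas as pd

# Hypothetical long-format ratings: one row per (user, wine) pair.
toy_ratings = pd.DataFrame({
    'username': ['ana', 'ana', 'ben', 'ben', 'cy'],
    'wine': ['A', 'B', 'A', 'C', 'B'],
    'rating': [5, 3, 4, 2, 1],
})

# Wines become rows, users become columns; missing pairs are filled with 0.
toy_pivoted = toy_ratings.pivot(index='wine', columns='username', values='rating').fillna(0)
print(toy_pivoted)
```

Note that after `fillna(0)` a wine a user never rated looks the same as a rating of 0, which is worth keeping in mind when the columns are compared later.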

---

## Measuring Similarity

In [None]:
from sklearn.metrics.pairwise import cosine_similarity

In [None]:
cosine_similarity([wine_ratings_pivoted.carlos], [wine_ratings_pivoted.jadianes])
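
For intuition, cosine similarity is the dot product of two vectors divided by the product of their lengths. The sketch below computes it from that definition in plain Python. The helper name `cosine` and the sample vectors are illustrative assumptions, not part of the notebook's data.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length rating vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for a user with no ratings
    return dot / (norm_u * norm_v)

# Hypothetical rating vectors over three wines.
same_taste = cosine([5, 3, 0], [5, 3, 0])   # identical ratings
no_overlap = cosine([5, 0, 0], [0, 4, 0])   # no wine rated by both users
print(same_taste, no_overlap)
```

Identical rating vectors score close to 1, while users with no wine in common score 0.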

In [None]:
wine_ratings_pivoted.columns

Build a similarity vector for Mari that records how similar her ratings are to every user's ratings.

In [None]:
similarity_vector = {}
for each in wine_ratings_pivoted.columns:
    similarity = cosine_similarity([wine_ratings_pivoted[each]], [wine_ratings_pivoted['mari']])[0][0]
    similarity_vector[each] = similarity

In [None]:
similarity_vector

Sort the users by their similarity to Mari, most similar first.

In [None]:
# Sort usernames by their similarity score, highest first.
sorted(similarity_vector, key=similarity_vector.get, reverse=True)
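
Because the loop above also compares Mari with herself, she scores 1.0 and ends up at or near the top of the sorted list. A minimal sketch of one way to drop the target user and keep only the closest neighbours follows. The function `top_neighbours` and the scores in `toy_similarity` are hypothetical stand-ins for `similarity_vector`.

```python
# Hypothetical similarity scores, shaped like similarity_vector above.
toy_similarity = {'mari': 1.0, 'carlos': 0.62, 'john': 0.35, 'ana': 0.0}

def top_neighbours(similarities, target, k=2):
    """Return the k users most similar to target, excluding target."""
    others = {user: score for user, score in similarities.items() if user != target}
    return sorted(others, key=others.get, reverse=True)[:k]

neighbours = top_neighbours(toy_similarity, 'mari')
print(neighbours)
```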

---

## Implementing a Recommender

Should we recommend "JL Chave Hermitage 2001" to Mari?

In [None]:
wine_ratings_pivoted.loc['JL Chave Hermitage 2001', :]

We pick Carlos and John since they have rated this wine.

In [None]:
ratings = wine_ratings_pivoted.loc['JL Chave Hermitage 2001', ['carlos', 'john']]
ratings

Calculate the weighted mean of the ratings

$\{x_1, x_2, \dots , x_n\},$

where each $x_i$ is one user's rating of the wine and $w_i$ is the corresponding non-negative weight (here, that user's similarity to Mari):

$\bar{x} = \frac{ \sum\limits_{i=1}^n w_i x_i}{\sum\limits_{i=1}^n w_i}$

In [None]:
numerator = (ratings['carlos'] * similarity_vector['carlos']) + (ratings['john'] * similarity_vector['john'])
denominator = similarity_vector['carlos'] + similarity_vector['john']

In [None]:
weighted_mean = numerator / denominator

In [None]:
weighted_mean
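
The two-user calculation above can be generalised to any number of raters. The sketch below is one possible way to do that, not the notebook's own method. The function name `predict_rating`, the choice to treat a rating of 0 as "not rated", and the toy numbers are all assumptions made for illustration.

```python
def predict_rating(ratings, similarities):
    """Similarity-weighted mean of the ratings from users who rated the item."""
    # Assumption: a rating of 0 means "not rated", as in the pivot table.
    raters = [u for u, r in ratings.items() if r > 0 and u in similarities]
    numerator = sum(ratings[u] * similarities[u] for u in raters)
    denominator = sum(similarities[u] for u in raters)
    if denominator == 0:
        return None  # no usable neighbours, so no prediction
    return numerator / denominator

# Hypothetical ratings of one wine and similarities to the target user.
toy_item_ratings = {'carlos': 4.0, 'john': 2.0, 'ana': 0.0}
toy_similarities = {'carlos': 0.75, 'john': 0.25, 'ana': 0.5}

predicted = predict_rating(toy_item_ratings, toy_similarities)
print(predicted)  # (4.0 * 0.75 + 2.0 * 0.25) / (0.75 + 0.25) = 3.5
```

The prediction is pulled towards the rating of the more similar user, which is the point of weighting by similarity rather than taking a plain average.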