# Introduction
This notebook shows how to use the framework in the Colab environment. The example is the same one we built in the remote class: we load the **movielens** dataset, set up preprocessing steps such as **normalization** and **text processing**, and then use the **ItemKNN** model provided by **Lenskit** through its **predict_for_user**, **predict**, and **recommend** methods.

# Setting up the environment
In this section we clone the repository, install the dependencies, and make sure the Colab environment can find the project.

In [None]:
!git clone https://github.com/lucasnatali98/hybrid_recommender_framework.git

In [None]:
!pip install -r /content/hybrid_recommender_framework/requirements.txt

In [44]:
import sys
# Put the cloned repository on the import path so its modules can be found.
sys.path.insert(0, '/content/hybrid_recommender_framework')

In [46]:
%cd /content/hybrid_recommender_framework

/content/hybrid_recommender_framework
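
To confirm that the environment can actually find the project, a quick check like the one below may help. It is only a sketch: `is_importable` is a hypothetical helper (not part of the framework) that asks Python whether a top-level package can be located on the current import path.

```python
import importlib.util

def is_importable(package_name):
    """Return True if Python can locate the top-level package on the import path."""
    return importlib.util.find_spec(package_name) is not None

# A standard-library module is always found.
print(is_importable("json"))
# True only when the cloned repository is reachable from the import path.
print(is_importable("hybrid_recommender_framework"))
```

If the second check prints `False`, the imports in the next section will most likely fail as well, so it is worth re-running the cells above first.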


# Importing the modules needed for the example

In [47]:
from hybrid_recommender_framework.src.preprocessing.normalize import NormalizeProcessing
from hybrid_recommender_framework.src.data.movielens import MovieLens
from hybrid_recommender_framework.src.preprocessing.text import TextProcessing
from hybrid_recommender_framework.src.recommenders.item_knn import LenskitItemKNN
import numpy as np

# Loading the dataset and preparing it for use

In [48]:
movielens = MovieLens({
    'proportion': "ml-latest-small"
})

In [49]:
ratings = movielens.ratings
# The timestamp column is not used in this example.
ratings.drop(columns=['timestamp'], inplace=True)
movies = movielens.items

In [50]:
ratings.head()

Unnamed: 0,user,item,rating
0,1,1,4.0
1,1,3,4.0
2,1,6,4.0
3,1,47,5.0
4,1,50,5.0


In [51]:
movies.head()

Unnamed: 0,movieId,title,genres
0,1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy
1,2,Jumanji (1995),Adventure|Children|Fantasy
2,3,Grumpier Old Men (1995),Comedy|Romance
3,4,Waiting to Exhale (1995),Comedy|Drama|Romance
4,5,Father of the Bride Part II (1995),Comedy


# Instantiating the preprocessing steps

In [52]:
normalize_processing = NormalizeProcessing({
    'norm': 'l2',
    'column_to_apply': "rating",
    'axis': 0
})
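
To make the `'norm': 'l2'` and `'axis': 0` settings more concrete, here is a minimal sketch of column-wise L2 normalization in plain Python. It assumes that `NormalizeProcessing` behaves like a standard L2 normalizer, dividing every value in the column by the column's Euclidean norm; the notebook does not show its internals, so `l2_normalize_column` is a hypothetical helper and not the framework's API.

```python
import math

def l2_normalize_column(values):
    """Divide every value by the Euclidean (L2) norm of the whole column."""
    norm = math.sqrt(sum(v * v for v in values))
    if norm == 0:
        return list(values)  # all-zero column: nothing to scale
    return [v / norm for v in values]

# The first five ratings shown in the ratings preview above.
ratings_column = [4.0, 4.0, 4.0, 5.0, 5.0]
normalized = l2_normalize_column(ratings_column)

# After normalization the squared values sum to (approximately) 1.
print(sum(v * v for v in normalized))
```

Under this assumption, normalizing along axis 0 rescales the column as a whole, so the relative order of the ratings is preserved.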


In [53]:
text_processing = TextProcessing({
    'column_to_apply': 'genres',
    'remove_stop_words': True,
    'tokenize_words': True
})

In [54]:
# Replace the '|' genre separator with a space before tokenizing.
itr = {
    "|": " "
}
movies_processed = text_processing.pre_processing(movies, items_to_replace=itr)

In [55]:
movies_processed

Unnamed: 0,movieId,title,genres
0,1,Toy Story (1995),"[Adventure, Animation, Children, Comedy, Fantasy]"
1,2,Jumanji (1995),"[Adventure, Children, Fantasy]"
2,3,Grumpier Old Men (1995),"[Comedy, Romance]"
3,4,Waiting to Exhale (1995),"[Comedy, Drama, Romance]"
4,5,Father of the Bride Part II (1995),[Comedy]
...,...,...,...
9737,193581,Black Butler: Book of the Atlantic (2017),"[Action, Animation, Comedy, Fantasy]"
9738,193583,No Game No Life: Zero (2017),"[Animation, Comedy, Fantasy]"
9739,193585,Flint (2017),[Drama]
9740,193587,Bungo Stray Dogs: Dead Apple (2018),"[Action, Animation]"
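
The effect of the replacement map and the tokenization step on the `genres` column can be sketched in a few lines of plain Python. This is an illustration only: `tokenize_genres` is a hypothetical helper, and the real `TextProcessing` class may do more (for example, stop-word removal), which is not reproduced here.

```python
def tokenize_genres(genres, items_to_replace):
    """Apply each replacement to the string, then split it on whitespace."""
    for old, new in items_to_replace.items():
        genres = genres.replace(old, new)
    return genres.split()

itr = {"|": " "}
print(tokenize_genres("Adventure|Children|Fantasy", itr))
# ['Adventure', 'Children', 'Fantasy']
```

One consequence of splitting on whitespace in this sketch is that a value containing spaces, such as MovieLens's `(no genres listed)`, would be broken into several tokens.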


In [56]:
item_knn = LenskitItemKNN({
    'maxNumberNeighbors': 10,
})

In [57]:
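# Item and user IDs taken straight from the ratings, one entry per rating (IDs repeat).
items = ratings['item'].values
users = ratings['user'].values

# Use the first user ID as the example user.
unique_users = np.unique(users)
user = unique_users[0]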

# Training the model: ItemKNN

In [58]:
item_knn.fit(ratings)
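
For intuition about what an item-based KNN model learns, the sketch below implements a simplified textbook version in plain Python: the predicted rating is the similarity-weighted mean of the user's ratings on the `k` items most similar to the target. All names here (`item_vectors`, `cosine`, `predict_rating`) and the toy data are assumptions made for illustration; Lenskit's implementation differs in its details, so this is not a reproduction of `LenskitItemKNN`.

```python
import math
from collections import defaultdict

def item_vectors(ratings):
    """Group (user, item, rating) triples into {item: {user: rating}}."""
    vectors = defaultdict(dict)
    for user, item, rating in ratings:
        vectors[item][user] = rating
    return dict(vectors)

def cosine(a, b):
    """Cosine similarity between two sparse rating vectors."""
    dot = sum(a[u] * b[u] for u in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predict_rating(ratings, user, target, k=10):
    """Similarity-weighted mean of the user's ratings on the k nearest items."""
    vectors = item_vectors(ratings)
    target_vec = vectors.get(target, {})
    neighbors = []
    for item, vec in vectors.items():
        if item != target and user in vec:
            sim = cosine(target_vec, vec)
            if sim > 0:
                neighbors.append((sim, vec[user]))
    top = sorted(neighbors, reverse=True)[:k]  # k most similar rated items
    total = sum(sim for sim, _ in top)
    return sum(sim * r for sim, r in top) / total if total else None

toy = [  # made-up (user, item, rating) triples
    (1, "a", 5.0), (1, "b", 4.0),
    (2, "a", 4.0), (2, "b", 5.0), (2, "c", 2.0),
    (3, "a", 1.0), (3, "c", 5.0),
]
print(predict_rating(toy, user=1, target="c", k=10))
```

In this sketch `k` plays the same role as `maxNumberNeighbors` in the cell above: it caps how many similar items contribute to each prediction.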

## Predictions for a single user

In [59]:
predict_to_user = item_knn.predict_for_user(user, items)
# Keep only the items for which the model produced a prediction.
predict_to_user = predict_to_user[predict_to_user.notna()]
predict_to_user

item
1         4.490579
3         4.332590
6         4.871042
47        4.513436
50        4.582301
            ...   
166534    4.125020
168248    5.216757
168250    4.045101
168252    5.117748
170875    3.225782
Length: 96710, dtype: float64
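
Because `items` is taken directly from the ratings column, each item ID appears once per rating. That is likely why the result above has tens of thousands of rows rather than one row per distinct movie. A minimal sketch of deduplicating the candidate list first, using a made-up sample and a hypothetical `unique_items` helper:

```python
def unique_items(item_ids):
    """Return each item ID once, in ascending order."""
    return sorted(set(item_ids))

items_with_repeats = [1, 3, 6, 47, 50, 1, 3, 47]  # made-up sample with repeats
candidates = unique_items(items_with_repeats)
print(candidates)
# [1, 3, 6, 47, 50]
```

In the notebook itself, the same idea could be applied by passing `np.unique(items)` to `predict_for_user`, in the same way `np.unique(users)` is already used for the user IDs.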

# Predictions for multiple users

In [60]:
predict = item_knn.predict(ratings[['user', 'item']])
predict

0         4.490579
1         4.332590
2         4.871042
3         4.513436
4         4.582301
            ...   
100831    3.365822
100832    4.834298
100833    4.321853
100834    4.920219
100835    2.958027
Name: prediction, Length: 100836, dtype: float64
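
Since these predictions are made for the same user and item pairs that appear in `ratings`, a quick sanity check is to compare them with the observed ratings. The sketch below computes the root mean squared error over the first five rows shown in this notebook. The `rmse` helper is hypothetical, and the result only describes the fit on data the model was trained on, not its accuracy on unseen ratings.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error between two equally long sequences."""
    if len(observed) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# First five observed ratings and the first five predictions from the outputs above.
observed = [4.0, 4.0, 4.0, 5.0, 5.0]
predicted = [4.490579, 4.332590, 4.871042, 4.513436, 4.582301]
print(rmse(observed, predicted))
```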

# Recommending 10 items for each user

In [61]:
recommend_to_users = item_knn.recommend(unique_users, 10)
recommend_to_users

Unnamed: 0,user,item,score,algorithm_name
0,1,97866,6.054254,LenskitItemKNN
1,1,142115,5.917685,LenskitItemKNN
2,1,117192,5.868294,LenskitItemKNN
3,1,147382,5.847610,LenskitItemKNN
4,1,55167,5.824215,LenskitItemKNN
...,...,...,...,...
6095,610,3673,5.195032,LenskitItemKNN
6096,610,2511,5.187238,LenskitItemKNN
6097,610,389,5.168704,LenskitItemKNN
6098,610,4298,5.167129,LenskitItemKNN
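
Conceptually, a top-N recommendation keeps the N highest-scoring items for each user. The sketch below shows that selection step in plain Python on made-up scores; `top_n_per_user` is a hypothetical helper, and it does not reproduce how `recommend` chooses candidate items or computes scores.

```python
import heapq
from collections import defaultdict

def top_n_per_user(scored, n):
    """Keep the n highest-scoring (item, score) pairs for each user."""
    by_user = defaultdict(list)
    for user, item, score in scored:
        by_user[user].append((item, score))
    return {
        user: heapq.nlargest(n, pairs, key=lambda pair: pair[1])
        for user, pairs in by_user.items()
    }

scored = [  # made-up (user, item, score) triples
    (1, 10, 4.2), (1, 11, 5.1), (1, 12, 3.9),
    (2, 10, 2.5), (2, 13, 4.8),
]
print(top_n_per_user(scored, n=2))
# {1: [(11, 5.1), (10, 4.2)], 2: [(13, 4.8), (10, 2.5)]}
```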
