# Tutorial: Implementing a Hybrid Recommender

MAN 3160 - Recommender Systems


## Import Libraries

In [1]:
import numpy as np
import json
import requests
import heapq
import math
import matplotlib.pyplot as plt
from sklearn.metrics import pairwise_distances
from sklearn.decomposition import PCA
from io import BytesIO
import pickle
import pandas as pd
import time
import scipy.sparse as sparse
import implicit

We download data that has been computed in advance:
- each user's transactions/interactions
- transactions held out to evaluate the model
- description embeddings computed with BERT
- description embeddings computed with BERT-large
- book data with title, description, publication year, and other fields.

In [2]:
import urllib.request
urllib.request.urlretrieve("https://www.dropbox.com/s/57tel5zqopkssrh/books.csv?dl=1", "books.csv")
urllib.request.urlretrieve("https://www.dropbox.com/s/zpnnoy1i8ljf9fg/goodreads_bert_embeddings.npy?dl=1", "goodreads_bert_embeddings.npy")
urllib.request.urlretrieve("https://www.dropbox.com/s/a8hcc9w30y7r3jl/goodreads_bert_large_embeddings.npy?dl=1", "goodreads_bert_large_embeddings.npy")
urllib.request.urlretrieve("https://www.dropbox.com/s/dqeqpsr0vdvmcy0/goodreads_past_interactions.json?dl=1", "goodreads_past_interactions.json")
urllib.request.urlretrieve("https://www.dropbox.com/s/rjtzhmb2zbpp30q/goodreads_test_interactions.json?dl=1", "goodreads_test_interactions.json")

('goodreads_test_interactions.json',
 <http.client.HTTPMessage at 0x230847cc400>)

# Load additional data

In [3]:
df_books = pd.read_csv('books.csv', sep=',')
df_books.head()

Unnamed: 0,book_id,goodreads_book_id,best_book_id,work_id,books_count,isbn,isbn13,authors,original_publication_year,original_title,...,work_ratings_count,work_text_reviews_count,ratings_1,ratings_2,ratings_3,ratings_4,ratings_5,image_url,small_image_url,book_desc
0,1,2767052,2767052,2792775,272,439023483,9780439000000.0,Suzanne Collins,2008.0,The Hunger Games,...,4942365,155254,66715,127936,560092,1481305,2706317,https://images.gr-assets.com/books/1447303603m...,https://images.gr-assets.com/books/1447303603s...,Winning will make you famous. Losing means cer...
1,2,3,3,4640799,491,439554934,9780440000000.0,"J.K. Rowling, Mary GrandPré",1997.0,Harry Potter and the Philosopher's Stone,...,4800065,75867,75504,101676,455024,1156318,3011543,https://images.gr-assets.com/books/1474154022m...,https://images.gr-assets.com/books/1474154022s...,Harry Potter's life is miserable. His parents ...
2,3,41865,41865,3212258,226,316015849,9780316000000.0,Stephenie Meyer,2005.0,Twilight,...,3916824,95009,456191,436802,793319,875073,1355439,https://images.gr-assets.com/books/1361039443m...,https://images.gr-assets.com/books/1361039443s...,About three things I was absolutely positive.F...
3,4,2657,2657,3275794,487,61120081,9780061000000.0,Harper Lee,1960.0,To Kill a Mockingbird,...,3340896,72586,60427,117415,446835,1001952,1714267,https://images.gr-assets.com/books/1361975680m...,https://images.gr-assets.com/books/1361975680s...,The unforgettable novel of a childhood in a sl...
4,5,4671,4671,245494,1356,743273567,9780743000000.0,F. Scott Fitzgerald,1925.0,The Great Gatsby,...,2773745,51992,86236,197621,606158,936012,947718,https://images.gr-assets.com/books/1490528560m...,https://images.gr-assets.com/books/1490528560s...,Alternate Cover Edition ISBN: 0743273567 (ISBN...


In [4]:
# dictionary mapping each user id to the ids of books the user interacted with in the past
with open('goodreads_past_interactions.json') as f:
    user_interactions = json.load(f)

# dictionary mapping each user id to the ids of books held out for testing the model
with open('goodreads_test_interactions.json') as f:
    user_interactions_test = json.load(f)

In [6]:
user_interactions['1']

[258,
 268,
 3638,
 1796,
 867,
 2738,
 4691,
 916,
 11,
 3889,
 136,
 6665,
 35,
 60,
 148,
 10,
 4,
 57,
 1521,
 70,
 103,
 36,
 119,
 13,
 66,
 2002,
 43,
 287,
 1041,
 67,
 46,
 22,
 115,
 31,
 16,
 256,
 273,
 378,
 329,
 98,
 216,
 1176,
 140,
 1310,
 414,
 85,
 219,
 177,
 102,
 95,
 225,
 76,
 100,
 171,
 485,
 325,
 498,
 323,
 72,
 496,
 1030,
 1055,
 2770,
 1187,
 2535,
 3294,
 4893,
 2133,
 262,
 437,
 421,
 901,
 212]

In [7]:
# dictionaries mapping a row index of the embedding matrices to a book_id and vice versa
idx2bookid = {i: id_ for i, id_ in enumerate(df_books.book_id)}
bookid2idx = {id_:i for i, id_ in enumerate(df_books.book_id)}

# Load pre-trained features: BERT and BERT-large

This section works with embeddings produced by the pre-trained language models BERT and BERT-large, which convert text into vectors.

Bidirectional Encoder Representations from Transformers (BERT) is an NLP (Natural Language Processing) technique developed at Google and published in 2018 by Jacob Devlin and colleagues.

Google has used BERT since 2019 to better understand user queries in its search engine.

It comes in two versions:
- **BERT:** 12 layers, 12 attention heads, and 110 million parameters. It produces 768-dimensional vectors.
- **BERT-large:** 24 layers, 16 attention heads, and 340 million parameters. It produces 1024-dimensional vectors.

![BERT and BERT-large](http://jalammar.github.io/images/bert-base-bert-large.png)

![BERT and BERT-large architecture](http://jalammar.github.io/images/bert-base-bert-large-encoders.png)

Here the input texts are the book titles together with their descriptions, and we will compare recommendation results using BERT and BERT-large. For this tutorial the feature vectors were computed in advance and saved as NumPy files. They are loaded into memory below.

For more details on the BERT language model, see the following paper:
- [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/pdf/1810.04805.pdf)

In [8]:
bert_featmat = np.load('goodreads_bert_embeddings.npy', allow_pickle=True)
bert_large_featmat = np.load('goodreads_bert_large_embeddings.npy', allow_pickle=True)

In [9]:
bert_featmat.shape

(4287, 768)

In [10]:
bert_large_featmat.shape

(4287, 1024)

In [11]:
bert_featmat[5]

array([ 4.84995842e-01,  3.47123146e-02,  1.82838421e-02,  4.71778587e-02,
       -9.59259123e-02,  4.44397539e-01, -1.40853271e-01, -1.73830017e-02,
       -3.44723985e-02,  5.03976084e-02, -1.25274509e-01,  2.33933926e-01,
        1.72393486e-01, -4.44924459e-02,  6.76755458e-02, -9.40277800e-02,
        7.49101043e-02,  4.60708678e-01, -2.49723822e-01, -1.68240890e-01,
       -9.21016037e-02, -6.15491830e-02,  9.05511603e-02,  3.28459591e-02,
       -4.04576123e-01, -1.64132327e-01,  1.70409709e-01,  2.54194383e-02,
        1.03816532e-01, -3.57919652e-03,  7.22740144e-02, -2.62293249e-01,
        4.91619343e-04, -2.49186531e-02,  1.27423316e-01,  5.59403673e-02,
        1.01800971e-01,  4.02531624e-02, -2.25695863e-01, -4.02649213e-03,
       -3.45016539e-01,  1.14103138e-01, -1.20682888e-01,  1.04268208e-01,
       -3.60891335e-02,  1.34674117e-01,  5.88834062e-02,  2.11443812e-01,
       -9.84561443e-02, -1.35545731e-01,  1.21868595e-01, -1.37343898e-01,
       -7.76119754e-02, -

## Dimensionality reduction (PCA)

In [12]:
# Project the embeddings onto their first 20 principal components
pca20_bert_featmat = PCA(n_components=20).fit_transform(bert_featmat)
pca20_bert_large_featmat = PCA(n_components=20).fit_transform(bert_large_featmat)

In [13]:
pca20_bert_featmat.shape

(4287, 20)
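
To judge how much information 20 components retain, one option is to inspect the explained variance ratio of the fitted PCA. The following is a minimal sketch on synthetic data: the random matrix `X` is only a stand-in for `bert_featmat`, and its shape is an assumption chosen for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for bert_featmat (assumed shape: 200 items, 50 dimensions).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))

# Keep the fitted PCA object so its statistics can be inspected afterwards.
pca = PCA(n_components=20)
X20 = pca.fit_transform(X)

# Fraction of the total variance kept by the 20 components (between 0 and 1).
retained = float(pca.explained_variance_ratio_.sum())
print(X20.shape, round(retained, 3))
```

With the real embeddings, a low retained fraction would suggest trying more components before comparing BERT against BERT-large.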

# Similar document retrieval 

In this section we use the loaded vectors to build an information retrieval (search) system that supports different distance metrics.

We look for similar books according to the vector representation (BERT) of their title and description.
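
The choice of metric matters because Euclidean distance is sensitive to vector magnitude, while cosine distance only compares direction. The sketch below illustrates the difference on toy vectors; the helper names `euclidean_distance` and `cosine_distance` are hypothetical and are not part of the tutorial's code.

```python
import numpy as np

def euclidean_distance(a, b):
    """Straight-line distance between two vectors."""
    return float(np.linalg.norm(a - b))

def cosine_distance(a, b):
    """One minus the cosine similarity of two vectors."""
    return float(1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

a = np.array([1.0, 2.0, 3.0])
b = 10 * a                      # same direction as a, much larger magnitude
c = np.array([3.0, 2.0, 1.0])   # similar magnitude to a, different direction

# Euclidean distance considers b far from a; cosine distance considers it identical.
print(euclidean_distance(a, b), cosine_distance(a, b))
print(euclidean_distance(a, c), cosine_distance(a, c))
```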


In [14]:
# format results: show full column contents without truncation
pd.set_option('display.max_colwidth', None)


In [15]:
# Find the topk books closest to a query book, given its book_id (a random book if query_id is None)
def find_similar_books(embedding, query_id=None, metric='euclidean', topk=10):
    
    n = embedding.shape[0]
    
    if query_id is None:
        query_i = np.random.randint(n)
        query_id = idx2bookid[query_i]
    
    else:
        query_i = bookid2idx[query_id]
        
    
    distances = pairwise_distances(embedding[query_i].reshape(1,-1), embedding, metric=metric)
    heap = []
    for i in range(n):            
        if len(heap) < topk:
            heapq.heappush(heap, (-distances[0][i], i))
        else:
            heapq.heappushpop(heap, (-distances[0][i], i))

    heap.sort(reverse=True)
    rec_ids = [idx2bookid[i] for _,i in heap]
    
    return rec_ids
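
The heap loop above selects the `topk` smallest distances. An alternative way to express the same selection is with `np.argpartition`, sketched below. The helper name `topk_smallest` is hypothetical, and the order of tied distances may differ from the heap version.

```python
import numpy as np

def topk_smallest(distances, topk):
    """Return the indices of the topk smallest distances, closest first."""
    distances = np.asarray(distances)
    topk = min(topk, distances.shape[0])
    # argpartition moves the topk smallest values into the first topk slots (unordered).
    candidates = np.argpartition(distances, topk - 1)[:topk]
    # Sort only those candidates by their distance.
    return candidates[np.argsort(distances[candidates])]

d = np.array([0.9, 0.0, 0.4, 0.7, 0.1])
print(topk_smallest(d, 3))
```

This avoids a Python-level loop over all items, which can help when the catalog is large.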

In [16]:
# books similar to book_id 3 (Twilight, goodreads_book_id 41865) using Euclidean distance; the metric can be changed to "cosine"
similar_books = find_similar_books(bert_featmat, query_id = 3, metric = 'euclidean', topk=10 )
similar_books

[3, 2908, 3115, 2303, 7334, 7235, 5721, 3510, 9552, 9696]

In [17]:
df_books[df_books.book_id.isin(similar_books)][['book_id', 'original_title', 'book_desc', 'authors']]

Unnamed: 0,book_id,original_title,book_desc,authors
2,3,Twilight,"About three things I was absolutely positive.First, Edward was a vampire.Second, there was a part of him—and I didn't know how dominant that part might be—that thirsted for my blood.And third, I was unconditionally and irrevocably in love with him.In the first book of the Twilight Saga, internationally bestselling author Stephenie Meyer introduces Bella Swan and Edward Cullen, a pair of star-crossed lovers whose forbidden relationship ripens against the backdrop of small-town suspicion and a mysterious coven of vampires. This is a love story with bite.",Stephenie Meyer
1423,2303,Bloody Bones,"In Laurell K. Hamilton's ""New York Times"" bestselling novels, Anita Blake, vampire hunter and animator, takes a bite out of crime-of the supernatural kind. But even someone who deals with death on a daily basis can be unnerved by its power... When Branson, Missouri, is hit with a death wave-four unsolved murders-it doesn't take an expert to realize that all is not well. But luckily for the locals, Anita is an expert-in just the kinds of preternatural goings-on that have everyone spooked. And she's got an ""in"" with just the kind of creature who can make sense of the slayings: a sexy master vampire known as Jean Claude.",Laurell K. Hamilton
1708,2908,"Severed Heads, Broken Hearts","Robyn Schneider's The Beginning of Everything is a witty and heart-wrenching teen novel that will appeal to fans of books by John Green and Ned Vizzini, novels such as The Perks of Being a Wallflower, and classics like The Great Gatsby and The Catcher in the Rye.Varsity tennis captain Ezra Faulkner was supposed to be homecoming king, but that was before—before his girlfriend cheated on him, before a car accident shattered his leg, and before he fell in love with unpredictable new girl Cassidy Thorpe.As Kirkus Reviews said in a starred review, ""Schneider takes familiar stereotypes and infuses them with plenty of depth. Here are teens who could easily trade barbs and double entendres with the characters that fill John Green's novels.""Funny, smart, and including everything from flash mobs to blanket forts to a poodle who just might be the reincarnation of Jay Gatsby, The Beginning of Everything is a refreshing contemporary twist on the classic coming-of-age novel—a heart-wrenching story about how difficult it is to play the part that people expect, and how new beginnings can stem from abrupt and tragic endings.",Robyn Schneider
1808,3115,A Hunger Like No Other,"In New York Times and USA TODAY bestselling author Kresley Cole’s sizzling series, a fierce werewolf and a bewitching vampire become unlikely soul mates whose passion will test the boundaries of life and death.After enduring years of torture from the vampire horde, Lachlain MacRieve, leader of the Lykae Clan, is enraged to find the predestined mate he’s waited millennia for is a vampire. Or partly one. Emmaline Troy is a small, ethereal half Valkyrie/half vampire, who somehow begins to soothe the fury burning within him.Sheltered Emmaline finally sets out to uncover the truth about her deceased parents—until a powerful Lykae claims her as his mate and forces her back to his ancestral Scottish castle. There, her fear of the Lykae—and their notorious dark desires—ebbs as he begins a slow, wicked seduction to sate her own dark cravings.Yet when an ancient evil from her past resurfaces, will their desire deepen into a love that can bring a proud warrior to his knees and turn a gentle beauty into the fighter she was born to be?",Kresley Cole
1999,3510,"Cerulean Sins (Anita Blake, Vampire Hunter, #11)","Cerulean Sins, the eleventh entry in the hugely-popular Anita Blake series, finds everyone’s favorite vampire hunter keeping house and kicking butt.Anita Blake is trying to get her life back to “normal” after a break-up with her werewolf lover. She has settled into a pattern of domesticity, which means that the new man in her life, the leopard shapeshifter Micah, has no problem sharing her with Jean-Claude, Master Vampire of the City. Things are as peaceful as they ever get for someone who raises the dead, when Jean-Claude receives an unexpected and unwelcome visitor: Musette, the very beautiful, very twisted representative of the European Council of Vampires. Anita soon finds herself caught up in a dangerous game of vampire power politics.To add to her troubles, she is asked to consult on a series of brutal killings, which seem to be the work of something un-human. The investigation leads her to Cerulean Sins, a vampire-run establishment that deals in erotic videos, videos that cater to very specific tastes. Anita knows one creature of the night who has such interests — Jean-Claude’s visitor. But if Anita brings Musette down, the repercussions could cost her everything she holds dear.Once a sworn enemy of all monsters, Anita is now the human consort of both Master Vampire Jean Claude and leopard shapeshifter Micah. When a centuries-old vampire hits St. Louis, Anita finds herself needing all the dark forces her passion can muster to save the ones she loves.Anita Blake returns to find hell hath no fury like a vampire scorned.",Laurell K. Hamilton
2903,5721,No Humans Involved,"In her acclaimed Women of the Otherworld series, bestselling author Kelley Armstrong creates a present day in which humans unwittingly coexist with werewolves, witches, and other supernatural beings. Now, in this spellbinding new novel, a beautiful necromancer who can see ghosts must come to terms with her power—and with an evil she never thought possible.\r\r\r\r\r\r\nIt’s the most anticipated reality television event of the season: three spiritualists gathered together in one house to raise the ghost of Marilyn Monroe. For celebrity medium Jaime Vegas, it is to be her swan song—one last publicity blast for a celebrity on the wrong side of forty. But unlike her colleagues, who are more show than substance, Jaime is the real thing. Reluctant to upstage her fellow spiritualists, Jaime tries to suppress her talents, as she has done her entire life. But there is something lurking in the maze of gardens behind the house: a spirit without a voice. And it won’t let go until somehow Jaime hears its terrible story. For the first time in her life, Jaime Vegas understands what humans mean when they say they are haunted. Distraught, Jaime looks to fellow supernatural Jeremy Danvers for help.As the touches and whispers from the garden grow more frantic, Jaime and Jeremy embark on an investigation into a Los Angeles underworld of black magic and ritual sacrifice. When events culminate in a psychic showdown, Jaime must use the darkest power she has to defeat a shocking enemy—one whose malicious force comes from the last realm she expected. . . .In a world whose surface resembles our own, Kelley Armstrong delivers a stunning alternate reality, one where beings of the imagination live, love, and fight a never-ending battle between good and evil.From the Hardcover edition.",Kelley Armstrong
3423,7235,Afterburn,"#1 New York Times bestselling author Sylvia Day, America’s premier author of provocative fiction, delivers the debut novel from Cosmo Red-Hot Reads from Harlequin.The realization that Jax still affected me so strongly was a jagged pill to swallow. He’d only been part of my life for five short weeks two years ago. But now he was back. Walking into a deal I’d worked hard to close. And God, he was magnificent. His eyes were a brown so dark they were nearly black. Thickly lashed, they were relentless in their intensity. Had I really thought they were soft and warm? There was nothing soft about Jackson Rutledge. He was a hard and jaded man, cut from a ruthless cloth.In that moment I understood how badly I wanted to unravel the mystery of Jax. Bad enough that I didn’t mind how much it was going to cost me...",Sylvia Day
3459,7334,Graffiti Moon,"Lucy is in love with Shadow, a mysterious graffiti artist.Ed thought he was in love with Lucy, until she broke his nose.Dylan loves Daisy, but throwing eggs at her probably wasn't the best way to show it.Jazz and Leo are slowly encircling each other.An intense and exhilarating 24 hours in the lives of four teenagers on the verge: of adulthood, of HSC, of finding out just who they are, and who they want to be.A lyrical new YA novel from the award-winning author of \r\r\r\r\r\r\nChasing Charlie Duskin\r\r\r\r\r\r\n and the Gracie Faltrain series.",Cath Crowley
4152,9552,The Last Werewolf,"Here is a powerful, definitive new version of the werewolf legend—mesmerising and incredibly sexy. In Jake, Glen Duncan has given us a werewolf for the twenty-first century—a man whose deeds can only be described as monstrous but who is in some magical way deeply human.Meet Jake. A bit on the elderly side (he turns 201 in March), but otherwise in the pink of health. The nonstop sex and exercise he's still getting probably contribute to that, as does his diet: unusual amounts of flesh and blood (at least some from friends and relatives). Jake, of course, is a werewolf, and with the death of his colleague he has now become the only one of his kind. This depresses Jake to the point that he's been contemplating suicide. Yet there are powerful forces who for very different reasons want - and have the power - to keep Jake alive. Here is a powerful new version of the werewolf legend - mesmerizing and undeniably sexy, and with moments of violence so elegantly wrought they dazzle rather than repel. But perhaps its most remarkable achievement is to make the reader feel sympathy for a man who can only be described as a monster - and in doing so, remind us what it means to be human. One of the most original, audacious, and terrifying novels in years.",Glen Duncan
4198,9696,Perfect Shadow,"Discover the origins of Durzo Blint in this original novella set in the world of Brent Weeks' New York Times bestselling Night Angel trilogy.""I got a bit of prophecy,"" the old assassin said. ""Not enough to be useful, you know. Just glimpses. My wife dead, things like that to keep me up late at night. I had this vision that I was going to be killed by forty men, all at once. But now that you're here, I see they're all you. Durzo Blint.""Durzo Blint? Gaelan had never even heard the name.***Gaelan Starfire is a farmer, happy to be a husband and a father; a careful, quiet, simple man. He's also an immortal, peerless in the arts of war. Over the centuries, he's worn many faces to hide his gift, but he is a man ill-fit for obscurity, and all too often he's become a hero, his very names passing into legend: Acaelus Thorne, Yric the Black, Hrothan Steelbender, Tal Drakkan, Rebus Nimble.But when Gaelan must take a job hunting down the world's finest assassins for the beautiful courtesan-and-crimelord Gwinvere Kirena, what he finds may destroy everything he's ever believed in.Word count: ~17,000",Brent Weeks


# Content-based recommendation

In [19]:
def recommend(embedding, user_id=None, topk=10, metric='cosine'):
    
    user_id = str(user_id)
    
    # past transactions of the user
    trx = user_interactions[user_id]
    n = embedding.shape[0]
    # start from an infinite distance for every item (also valid if the user has no transactions)
    distances = np.full(n, np.inf)
    
    # iterate over the user's past transactions
    for t in trx:
        query_i = bookid2idx[t]
        
        # each item keeps its distance to the closest item the user interacted with
        distances = np.minimum(distances, pairwise_distances(
                embedding[query_i].reshape(1,-1), embedding, metric=metric).reshape(-1))

    # keep the topk items with the smallest distance, skipping items the user already consumed
    trx_set = set(trx)
    heap = []
    for i in range(n):
        if idx2bookid[i] in trx_set:
            continue
        if len(heap) < topk:
            heapq.heappush(heap, (-distances[i], i))
        else:
            heapq.heappushpop(heap, (-distances[i], i))
    
    # the heap stores negated distances, so a descending sort ranks items from closest to farthest
    heap.sort(reverse=True)
    recommended_ids = [idx2bookid[i] for _,i in heap]
    
    return recommended_ids
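
The core idea of `recommend` is that each candidate item is scored by its distance to the nearest item in the user's history. The sketch below walks through that aggregation on toy data. The embeddings and the consumed list are assumed values, and Euclidean distance is used for simplicity even though `recommend` defaults to cosine.

```python
import numpy as np

# Toy 2-D embeddings for five items (assumed values for illustration).
emb = np.array([
    [0.0, 0.0],   # item 0 (consumed)
    [0.1, 0.0],   # item 1, close to item 0
    [5.0, 5.0],   # item 2, far from everything
    [2.0, 2.0],   # item 3 (consumed)
    [2.0, 2.5],   # item 4, close to item 3
])
consumed = [0, 3]

# Distance from every consumed item to every item: shape (2, 5).
diffs = emb[consumed][:, None, :] - emb[None, :, :]
dist = np.linalg.norm(diffs, axis=2)

# An item's score is its distance to the nearest consumed item.
min_dist = dist.min(axis=0)

# Exclude consumed items, then rank the rest by ascending distance.
min_dist[consumed] = np.inf
ranking = np.argsort(min_dist)[: len(emb) - len(consumed)]
print(ranking)
```

Item 1 ranks first because it sits next to a consumed item, while item 2 ranks last because it is far from both.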

In [20]:
# recommendation for user id = 50101, using BERT embeddings reduced to 20 dimensions
user_id = '50101'
rec = recommend(pca20_bert_featmat, user_id=user_id, topk=15)
rec 

[4509,
 5002,
 4376,
 964,
 2292,
 7937,
 5126,
 9473,
 5244,
 390,
 6219,
 7602,
 7913,
 6865,
 2796]

# Hybrid recommendation - Cascade
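
In this section the cascade works in two stages: the content-based recommender first produces a candidate list, and a collaborative filtering model (ALS) then re-ranks only those candidates. The sketch below shows that control flow on toy scores. The function `cascade` and the score arrays are hypothetical and stand in for the real content distances and ALS scores.

```python
import numpy as np

def cascade(content_distance, cf_score, n_candidates, n_final):
    """Two-stage cascade: filter by content distance, then re-rank by CF score."""
    content_distance = np.asarray(content_distance)
    cf_score = np.asarray(cf_score)
    # Stage 1: keep the n_candidates items closest in content space.
    candidates = np.argsort(content_distance)[:n_candidates]
    # Stage 2: order only those candidates by descending collaborative score.
    order = np.argsort(-cf_score[candidates])
    return candidates[order][:n_final]

content_distance = [0.9, 0.1, 0.3, 0.2, 0.8]
cf_score = [5.0, 0.2, 0.9, 0.5, 4.0]
result = cascade(content_distance, cf_score, n_candidates=3, n_final=2)
print(result)
```

Items 0 and 4 have the highest collaborative scores but never reach the final list, because the first stage already discarded them. That is the defining property of a cascade: the second model can only reorder what the first one lets through.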

In [22]:
user_items = {}
itemset = set()

for uid in user_interactions.keys():
    for iid in user_interactions[uid]:
        if int(uid) not in user_items:
            user_items[int(uid)] = []

        user_items[int(uid)].append(iid)
        itemset.add(iid)

itemset = np.sort(list(itemset))

sparse_matrix = np.zeros((len(user_items), len(itemset)))

for i, items in enumerate(user_items.values()):
    sparse_matrix[i] = np.isin(itemset, items, assume_unique=True).astype(int)

user_item_matrix = sparse.csr_matrix(sparse_matrix)

user_ids = {key: i for i, key in enumerate(user_items.keys())}
items_ids = {key: i for i, key in enumerate(itemset)}
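
The cell above first allocates a dense users-by-items array and only then converts it to CSR, which can use a lot of memory for large datasets. One possible alternative is to build the sparse matrix directly from (row, column) coordinates, as sketched below. The toy `interactions` dictionary is assumed data, and the sketch assumes no user lists the same item twice, since duplicate coordinates would be summed.

```python
import numpy as np
import scipy.sparse as sparse

# Toy interactions (assumed data): user id -> list of item ids.
interactions = {"7": [10, 30], "9": [20], "12": [10, 20, 30]}

# Map user ids and item ids to consecutive row and column positions.
user_index = {int(u): i for i, u in enumerate(interactions)}
all_items = sorted({it for items in interactions.values() for it in items})
item_index = {it: j for j, it in enumerate(all_items)}

# Collect one (row, col) coordinate per interaction.
rows, cols = [], []
for u, items in interactions.items():
    for it in items:
        rows.append(user_index[int(u)])
        cols.append(item_index[it])

# Build the CSR matrix without ever materializing the dense array.
data = np.ones(len(rows), dtype=np.float32)
matrix = sparse.csr_matrix((data, (rows, cols)),
                           shape=(len(user_index), len(item_index)))
print(matrix.toarray())
```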

In [23]:
items_ids

{1: 0,
 2: 1,
 3: 2,
 4: 3,
 5: 4,
 6: 5,
 7: 6,
 8: 7,
 10: 8,
 11: 9,
 12: 10,
 13: 11,
 14: 12,
 16: 13,
 17: 14,
 18: 15,
 19: 16,
 20: 17,
 22: 18,
 23: 19,
 24: 20,
 25: 21,
 26: 22,
 27: 23,
 28: 24,
 29: 25,
 30: 26,
 31: 27,
 34: 28,
 35: 29,
 36: 30,
 37: 31,
 38: 32,
 39: 33,
 40: 34,
 41: 35,
 43: 36,
 44: 37,
 46: 38,
 48: 39,
 49: 40,
 50: 41,
 51: 42,
 52: 43,
 53: 44,
 56: 45,
 57: 46,
 58: 47,
 59: 48,
 60: 49,
 61: 50,
 64: 51,
 65: 52,
 66: 53,
 67: 54,
 69: 55,
 70: 56,
 71: 57,
 72: 58,
 73: 59,
 74: 60,
 76: 61,
 77: 62,
 79: 63,
 80: 64,
 82: 65,
 83: 66,
 85: 67,
 87: 68,
 88: 69,
 89: 70,
 90: 71,
 91: 72,
 92: 73,
 93: 74,
 94: 75,
 95: 76,
 96: 77,
 97: 78,
 98: 79,
 99: 80,
 100: 81,
 102: 82,
 103: 83,
 104: 84,
 105: 85,
 107: 86,
 108: 87,
 110: 88,
 112: 89,
 113: 90,
 114: 91,
 115: 92,
 116: 93,
 117: 94,
 118: 95,
 119: 96,
 120: 97,
 122: 98,
 123: 99,
 124: 100,
 125: 101,
 126: 102,
 127: 103,
 130: 104,
 134: 105,
 135: 106,
 136: 107,
 138: 108,


In [24]:
bookid2idx

{1: 0,
 2: 1,
 3: 2,
 4: 3,
 5: 4,
 6: 5,
 7: 6,
 8: 7,
 10: 8,
 11: 9,
 12: 10,
 13: 11,
 14: 12,
 16: 13,
 17: 14,
 18: 15,
 19: 16,
 20: 17,
 22: 18,
 23: 19,
 24: 20,
 25: 21,
 26: 22,
 27: 23,
 28: 24,
 29: 25,
 30: 26,
 31: 27,
 34: 28,
 35: 29,
 36: 30,
 37: 31,
 38: 32,
 39: 33,
 40: 34,
 41: 35,
 43: 36,
 44: 37,
 46: 38,
 48: 39,
 49: 40,
 50: 41,
 51: 42,
 52: 43,
 53: 44,
 56: 45,
 57: 46,
 58: 47,
 59: 48,
 60: 49,
 61: 50,
 64: 51,
 65: 52,
 66: 53,
 67: 54,
 69: 55,
 70: 56,
 71: 57,
 72: 58,
 73: 59,
 74: 60,
 76: 61,
 77: 62,
 79: 63,
 80: 64,
 82: 65,
 83: 66,
 85: 67,
 87: 68,
 88: 69,
 89: 70,
 90: 71,
 91: 72,
 92: 73,
 93: 74,
 94: 75,
 95: 76,
 96: 77,
 97: 78,
 98: 79,
 99: 80,
 100: 81,
 102: 82,
 103: 83,
 104: 84,
 105: 85,
 107: 86,
 108: 87,
 110: 88,
 112: 89,
 113: 90,
 114: 91,
 115: 92,
 116: 93,
 117: 94,
 118: 95,
 119: 96,
 120: 97,
 122: 98,
 123: 99,
 124: 100,
 125: 101,
 126: 102,
 127: 103,
 130: 104,
 134: 105,
 135: 106,
 136: 107,
 138: 108,


In [25]:
model_als = implicit.als.AlternatingLeastSquares()
model_als.fit(user_item_matrix)

100%|██████████████████████████████████████████████████████████████████████████████████| 15/15 [00:39<00:00,  2.62s/it]


## Final recommendation

In [26]:
user_id = '965'
rec = recommend(pca20_bert_featmat, user_id=user_id, topk=30)
rec

[4509,
 5126,
 9473,
 5244,
 390,
 6219,
 7602,
 7913,
 6865,
 2796,
 1003,
 7701,
 9941,
 1595,
 4606,
 10,
 941,
 1663,
 6250,
 3349,
 9785,
 6265,
 2847,
 7032,
 1075,
 2166,
 9983,
 5161,
 5256,
 2056]

In [27]:
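# Stage 2 (cascade): ALS re-ranks only the content-based candidates in rec; this assumes bookid2idx matches items_ids.
user_idx = user_ids[965]
recommendations = model_als.recommend(userid=user_idx, user_items=user_item_matrix[user_idx], N=10,
                                      items=[bookid2idx[r] for r in rec])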

In [28]:
[idx2bookid[r] for r in recommendations[0]]

[2056, 1075, 1003, 2166, 941, 2847, 5161, 4509, 5256, 3349]
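
The test interactions loaded earlier (`user_interactions_test`) could be used to score a recommendation list like this one. The sketch below computes precision@k and recall@k for a single user. The helper `precision_recall_at_k` is hypothetical and the ids are toy values, not results from the dataset.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Precision@k and recall@k for one user's ranked recommendation list."""
    top = recommended[:k]
    hits = len(set(top) & set(relevant))
    precision = hits / k if k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Toy example (assumed ids): two of the five recommended items are relevant.
recommended = [5, 8, 3, 9, 1]
relevant = [3, 1, 7]
p, r = precision_recall_at_k(recommended, relevant, k=5)
print(p, r)
```

Averaging these values over all users in the test set would give one number per model, which makes the BERT, BERT-large, and cascade variants comparable.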