# Content-based book recommendation system

# Specification

Project: Book recommendation system

Objective:
- Predict the books a user might like based on their preferences (collaborative filtering) and on similarities with other books (content-based recommendation)

Techniques used:
- Collaborative filtering (memory-based / model-based)
- Content-based recommendation (TF-IDF; embeddings: Word2Vec, GloVe, etc.)
- Models: KNN, SVD, NMF, etc.
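
As a rough illustration of the memory-based flavour of collaborative filtering listed above, the sketch below finds the nearest neighbour of a user with cosine similarity. The rating matrix is made-up toy data, and the helper names (`cosine_similarity`, `most_similar_user`) are hypothetical; this is not the method the project ends up using.

```python
import numpy as np

# Toy user x book rating matrix (hypothetical values, 0 = not rated).
ratings_matrix = np.array(
    [
        [5, 4, 0, 1],
        [4, 5, 0, 1],
        [1, 0, 5, 4],
    ],
    dtype=float,
)


def cosine_similarity(a, b):
    """Cosine similarity between two rating vectors (0.0 if either is all zeros)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


def most_similar_user(matrix, user_idx):
    """Index of the user whose rating vector is closest to that of `user_idx`."""
    sims = [
        cosine_similarity(matrix[user_idx], row) if i != user_idx else -1.0
        for i, row in enumerate(matrix)
    ]
    return int(np.argmax(sims))


neighbor = most_similar_user(ratings_matrix, 0)
```

Books rated highly by the neighbour but not yet rated by the target user would then be candidate recommendations. A model-based approach (SVD, NMF) would instead learn latent factors from the same matrix.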

Steps:

Dataset
- Use: https://www.kaggle.com/datasets/arashnic/book-recommendation-dataset?select=Users.csv
- Test other book recommendation datasets

Preprocessing
- Data cleaning (missing values, duplicates)
- Feature engineering (genres, authors, average ratings)
- Text vectorization (titles, descriptions) with TF-IDF, Word2Vec, GloVe, etc.
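
To make the text-vectorization step more concrete, here is a minimal pure-Python sketch of TF-IDF applied to a few titles from the dataset. It assumes whitespace tokenization and a smoothed IDF variant; the helper names `tokenize` and `tfidf_vectors` are hypothetical, and a real pipeline would more likely rely on a library vectorizer.

```python
import math
from collections import Counter

titles = [
    "Classical Mythology",
    "Clara Callan",
    "Decision in Normandy",
    "The Mummies of Urumchi",
]


def tokenize(text):
    """Lowercase and split on whitespace (assumption: no stemming, no stop words)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return (vocabulary, list of TF-IDF vectors), one vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of documents containing each token.
    doc_freq = Counter(tok for doc in tokenized for tok in set(doc))
    # Smoothed inverse document frequency (one common variant).
    idf = {tok: math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1 for tok in vocab}
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[tok] / len(doc) * idf[tok] for tok in vocab])
    return vocab, vectors


vocab, vectors = tfidf_vectors(titles)
```

Each title becomes a vector over the shared vocabulary, so two books can be compared with cosine similarity for content-based recommendation.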

In [58]:
import pandas as pd
import numpy as np
from surprise import Dataset, Reader, SVD

In [59]:
# Folder containing the Kaggle CSV files
path = "./datasets/"

books = pd.read_csv(path + "Books.csv")
users = pd.read_csv(path + "Users.csv")
ratings = pd.read_csv(path + "Ratings.csv")
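
ISBNs can start with zeros or end with `X`, so it may be safer to load them as strings. The sketch below shows the idea on a tiny in-memory stand-in for `Books.csv`; the two-row sample and the name `books_sample` are assumptions made for illustration, and whether the real file needs this depends on how pandas infers the column.

```python
import io

import pandas as pd

# Tiny in-memory stand-in for Books.csv (illustrative sample, not the real file).
sample_csv = io.StringIO(
    "ISBN,Book-Title,Year-Of-Publication\n"
    "0195153448,Classical Mythology,2002\n"
    "0002005018,Clara Callan,2001\n"
)

# Forcing ISBN to str keeps leading zeros; other columns are still inferred.
books_sample = pd.read_csv(sample_csv, dtype={"ISBN": str})
```

The same `dtype={"ISBN": str}` argument could be passed when reading `Books.csv` and `Ratings.csv` so that later joins on ISBN compare like with like.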

In [60]:
books.head()

Unnamed: 0,ISBN,Book-Title,Book-Author,Year-Of-Publication,Publisher,Image-URL-S,Image-URL-M,Image-URL-L
0,195153448,Classical Mythology,Mark P. O. Morford,2002,Oxford University Press,http://images.amazon.com/images/P/0195153448.0...,http://images.amazon.com/images/P/0195153448.0...,http://images.amazon.com/images/P/0195153448.0...
1,2005018,Clara Callan,Richard Bruce Wright,2001,HarperFlamingo Canada,http://images.amazon.com/images/P/0002005018.0...,http://images.amazon.com/images/P/0002005018.0...,http://images.amazon.com/images/P/0002005018.0...
2,60973129,Decision in Normandy,Carlo D'Este,1991,HarperPerennial,http://images.amazon.com/images/P/0060973129.0...,http://images.amazon.com/images/P/0060973129.0...,http://images.amazon.com/images/P/0060973129.0...
3,374157065,Flu: The Story of the Great Influenza Pandemic...,Gina Bari Kolata,1999,Farrar Straus Giroux,http://images.amazon.com/images/P/0374157065.0...,http://images.amazon.com/images/P/0374157065.0...,http://images.amazon.com/images/P/0374157065.0...
4,393045218,The Mummies of Urumchi,E. J. W. Barber,1999,W. W. Norton & Company,http://images.amazon.com/images/P/0393045218.0...,http://images.amazon.com/images/P/0393045218.0...,http://images.amazon.com/images/P/0393045218.0...


In [61]:
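def show_image(url, width=100):
    """Return an HTML <img> tag for a cover URL from the dataset."""
    return f'<img src="{url}" width="{width}">'

def url_to_img(df, func):
    """Format the three image-URL columns with `func` so covers render inline."""
    # Styler.format does not escape HTML by default (escape=None), so the tags are rendered as images.
    return df.style.format({'Image-URL-S': func, 'Image-URL-M': func, 'Image-URL-L': func})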
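
If some image-URL cells turn out to be missing, the formatter would produce an `<img>` tag pointing at `nan`. A possible guard is sketched below; `show_image_safe` is a hypothetical variant, and the example URL is a placeholder rather than one from the dataset.

```python
import math


def show_image_safe(url, width=100):
    """Hypothetical variant of show_image that returns '' for missing URLs."""
    # pandas represents missing values as float NaN; None is handled as well.
    if url is None or (isinstance(url, float) and math.isnan(url)):
        return ""
    return f'<img src="{url}" width="{width}">'


html = show_image_safe("https://example.com/cover.jpg")  # placeholder URL
empty = show_image_safe(float("nan"))
```

It can be passed wherever `show_image` is used as a formatter.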

In [62]:
# Keep every row that has at least one missing value (each row appears once)
books_nan = books[books.isna().any(axis=1)]

books_nan

Unnamed: 0,ISBN,Book-Title,Book-Author,Year-Of-Publication,Publisher,Image-URL-S,Image-URL-M,Image-URL-L
118033,0751352497,A+ Quiz Masters:01 Earth,,1999,Dorling Kindersley,http://images.amazon.com/images/P/0751352497.0...,http://images.amazon.com/images/P/0751352497.0...,http://images.amazon.com/images/P/0751352497.0...
128890,193169656X,Tyrant Moon,Elaine Corvidae,2002,,http://images.amazon.com/images/P/193169656X.0...,http://images.amazon.com/images/P/193169656X.0...,http://images.amazon.com/images/P/193169656X.0...
129037,1931696993,Finders Keepers,Linnea Sinclair,2001,,http://images.amazon.com/images/P/1931696993.0...,http://images.amazon.com/images/P/1931696993.0...,http://images.amazon.com/images/P/1931696993.0...
187689,9627982032,The Credit Suisse Guide to Managing Your Perso...,,1995,Edinburgh Financial Publishing,http://images.amazon.com/images/P/9627982032.0...,http://images.amazon.com/images/P/9627982032.0...,http://images.amazon.com/images/P/9627982032.0...


Rows to check:

- 118033
- 128890
- 129037
- 187689
- 209540
- 220731
- 221678

In [63]:
books_nan.style.format({'Image-URL-S':show_image, 'Image-URL-M':show_image, 'Image-URL-L':show_image})

Unnamed: 0,ISBN,Book-Title,Book-Author,Year-Of-Publication,Publisher,Image-URL-S,Image-URL-M,Image-URL-L
118033,0751352497,A+ Quiz Masters:01 Earth,,1999,Dorling Kindersley,,,
128890,193169656X,Tyrant Moon,Elaine Corvidae,2002,,,,
129037,1931696993,Finders Keepers,Linnea Sinclair,2001,,,,
187689,9627982032,The Credit Suisse Guide to Managing Your Personal Wealth,,1995,Edinburgh Financial Publishing,,,


In [64]:
books.duplicated().sum()

0
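
`books.duplicated()` only flags rows that are identical in every column. It may also be worth checking for repeated ISBNs, since two entries for the same book can differ slightly in title or publisher. The sketch below illustrates the difference on made-up data: the second row is a fabricated near-duplicate, not a row from the dataset.

```python
import pandas as pd

# Illustrative sample: the second row is a fabricated near-duplicate.
sample = pd.DataFrame(
    {
        "ISBN": ["0751352497", "0751352497", "193169656X"],
        "Book-Title": [
            "A+ Quiz Masters:01 Earth",
            "A+ Quiz Masters: 01 Earth",
            "Tyrant Moon",
        ],
    }
)

# Full-row comparison misses the near-duplicate; comparing on ISBN catches it.
full_row_dupes = int(sample.duplicated().sum())
isbn_dupes = int(sample.duplicated(subset="ISBN").sum())

# Keep the first occurrence of each ISBN.
deduplicated = sample.drop_duplicates(subset="ISBN", keep="first")
```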