SarahMestiri/RecommenderSystems

Recommender Systems

Introduction

This repository contains resources about Recommender Systems as well as my RecSys projects trying out different methods.

Did you know that a recommendation engine may know you better than your best friend?

Amazon, Netflix, Zalando, and many other companies give each customer a personalized experience through recommendations that fit the customer's interests. If you are a software engineer, Amazon will suggest programming books; if you are a new mom, it will suggest baby toys, diapers, etc. The magic of recommender systems goes further than that: they not only provide you with what you need, but also surprise you with things you didn't know you would like!

How is this unique "You" leveraged by recommender systems?

Let's discover that by learning more about one of the most famous methods for recommender systems: Collaborative Filtering (CF)!

Note: If you want to get an idea of the different types of recommender systems, check this article: 5 Types of Recommenders.

Collaborative Filtering

The basic idea behind collaborative filtering is that people who agreed in their evaluation patterns in the past are likely to agree again in the future.

Memory-based Collaborative Filtering

Memory-based collaborative filtering is one of the earliest methods developed for collaborative filtering. It is also called neighborhood-based collaborative filtering. We can distinguish two primary types of neighborhood methods:

  1. User-based collaborative filtering:
    Similar users are assumed to give similar ratings to the same item.
  2. Item-based collaborative filtering:
    Similar items are assumed to receive similar ratings from the same user.

Neighborhood-based methods can be viewed as generalizations of k-nearest neighbor classifiers (KNN).

Although both types (user-based and item-based) produce similar results, item-based CF is often preferred because it relies on the user's own ratings. It is also more stable under change: new users are added more often than new items, so a user-based approach needs more frequent similarity calculations and updates of the ratings matrix.
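To make the item-based idea concrete, here is a minimal sketch in NumPy. The function names, the toy ratings matrix, and the choice of cosine similarity with `0` standing for "not rated" are assumptions made for illustration; real libraries handle missing ratings and normalization more carefully.

```python
import numpy as np

def item_similarity(ratings):
    """Cosine similarity between item columns (0 is treated as 'not rated')."""
    norms = np.linalg.norm(ratings, axis=0)
    norms[norms == 0] = 1.0  # avoid dividing by zero for items nobody rated
    normalized = ratings / norms
    return normalized.T @ normalized

def predict_item_based(ratings, user, item, k=2):
    """Predict a rating as the similarity-weighted mean of the user's own
    ratings on the k items most similar to the target item."""
    sim = item_similarity(ratings)
    rated = np.flatnonzero(ratings[user])   # items this user has rated
    rated = rated[rated != item]
    if rated.size == 0:
        return 0.0                          # nothing to base a prediction on
    top = rated[np.argsort(sim[item, rated])[::-1][:k]]
    weights = sim[item, top]
    if weights.sum() == 0:
        return float(ratings[user, top].mean())
    return float(weights @ ratings[user, top] / weights.sum())

# Hypothetical toy data: rows are users, columns are items.
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

# User 0 disliked item 3, which is the item most similar to item 2,
# so the predicted rating for item 2 comes out low.
print(predict_item_based(ratings, user=0, item=2))
```

A user-based variant would follow the same pattern with similarities computed between rows instead of columns.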

Model-based Collaborative Filtering

Model-based collaborative filtering methods are more advanced and are generally considered more effective than memory-based methods. They are space-efficient and can deliver accurate results even when the ratings matrix is sparse.

Some of the most accurate recommender systems are built on model-based techniques, and latent factor models are among the most effective of them.

Latent Factor Models

Latent factor models use dimensionality reduction to estimate the ratings matrix. They exploit the high correlation between the rows and columns of the ratings matrix: because of this redundancy, the matrix can be approximated well by a low-rank matrix, and the fully specified low-rank approximation can be obtained even from a small subset of the original ratings.

A good explanation of the matrix factorization that latent factor models use to reduce the dimension of the matrix is here, by Nicolas Hug, developer of the Surprise library.
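The low-rank idea can be sketched with a small matrix factorization trained by stochastic gradient descent on the known ratings only. This is an illustrative sketch, not the method of any particular library: the function names, the hyperparameters, and the toy matrix (with `0` meaning "not rated") are all assumptions.

```python
import numpy as np

def factorize(ratings, n_factors=2, n_epochs=500, lr=0.05, reg=0.02, seed=0):
    """Approximate `ratings` by P @ Q.T, fitting only the observed entries."""
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    P = rng.normal(scale=0.1, size=(n_users, n_factors))  # user factors
    Q = rng.normal(scale=0.1, size=(n_items, n_factors))  # item factors
    users, items = np.nonzero(ratings)                    # observed ratings
    for _ in range(n_epochs):
        for u, i in zip(users, items):
            err = ratings[u, i] - P[u] @ Q[i]
            p_u = P[u].copy()  # keep the old value for the item update
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * p_u - reg * Q[i])
    return P, Q

def rmse_on_known(ratings, P, Q):
    """Root mean squared error over the observed (non-zero) entries."""
    mask = ratings > 0
    diff = (ratings - P @ Q.T)[mask]
    return float(np.sqrt(np.mean(diff ** 2)))

# Hypothetical toy data: rows are users, columns are items.
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

P, Q = factorize(ratings)
print(rmse_on_known(ratings, P, Q))  # training error on the known ratings
print(np.round(P @ Q.T, 1))          # dense estimate, including the gaps
```

The product `P @ Q.T` is fully specified, so it also provides estimates for the entries that were missing in the original matrix.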

Building a recommender system using collaborative filtering (with Resources)

To build a recommender system, you can choose from a large set of libraries, each with its own advantages and drawbacks. You can find a good Quora answer about the topic here:

Start with a small dataset, then scale up and compare the results. Starting small also matters because a big dataset can trigger a MemoryError, as happened to me when I used the whole Book-Crossing dataset. To scale to larger datasets, I would suggest using Spark, as it has strong performance.
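A quick back-of-the-envelope calculation shows why a full ratings matrix can exhaust memory. The sizes below are hypothetical round numbers, not the actual Book-Crossing figures, and the sparse estimate assumes a simple coordinate layout (one value plus a row and a column index per rating).

```python
def dense_bytes(n_users, n_items, bytes_per_value=8):
    """Memory for a dense users x items matrix of 64-bit floats."""
    return n_users * n_items * bytes_per_value

def sparse_bytes(n_ratings, bytes_per_value=8, bytes_per_index=4):
    """Memory when only the known ratings are stored (value + row + column)."""
    return n_ratings * (bytes_per_value + 2 * bytes_per_index)

# Hypothetical dataset size, chosen only for illustration.
n_users, n_items, n_ratings = 100_000, 100_000, 1_000_000

print(f"dense:  {dense_bytes(n_users, n_items) / 1e9:.1f} GB")  # dense:  80.0 GB
print(f"sparse: {sparse_bytes(n_ratings) / 1e6:.1f} MB")        # sparse: 16.0 MB
```

Under these assumptions, storing only the known ratings is several thousand times smaller than the dense matrix, which is one reason to begin with a small sample or a sparse representation.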

Available Datasets

Reference

Note

I originally gave this as a talk at Data Natives 2017. Slides are here
