Imagine that you are the founder of an ambitious new startup, "Social Network: Friends", which helps people find new friends by comparing their tastes in films. Each user who registers on the portal is asked to complete a very simple questionnaire. It lists N films, and for each film the user gives one of two ratings: 1 if they like the film, and 0 otherwise (they do not like it or have not watched it).

After filling out the questionnaire, the user receives a list of the users who suit them best in terms of cinematic compatibility.

### Assignment 10.2

Your task is to write the friendadviser class and implement the following methods in it:


1. The fit(self, R) method takes an M×N matrix R as input, where M is the number of registered users of your social network and N is the number of films in the questionnaire. The matrix element r_ij is the rating that user i gave to film j in the questionnaire.

2. _sim(u1, u2) computes the similarity (PMI) of users u1 and u2 from their rating vectors. We recommend using the "truncated" version of PMI, which is called score in the lecture.

3. U_idx(u0, alpha) finds the set of registered users (namely, their indices) whose similarity to the new user u0 is at least alpha.

4. find_friends(u0, how_many) finds new friends for user u0, as many as the how_many argument specifies. The output should be an array with the indices of these friends. For convenience, return the indices in descending order of similarity.
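
For reference, one way to relate PMI to a count-based score is sketched below. It assumes that the probability of a user liking a film is estimated as the fraction of the N films that user likes, and that the "truncated" score simply drops the logarithm and the constant factor N. The lecture's exact definition is not reproduced here, so the symbols n_1, n_2 and n_12 are assumptions introduced for illustration.

$$
\mathrm{PMI}(u_1, u_2) = \log \frac{p(u_1, u_2)}{p(u_1)\, p(u_2)} = \log \frac{N\, n_{12}}{n_1\, n_2},
\qquad
\mathrm{score}(u_1, u_2) = \frac{n_{12}}{n_1\, n_2}
$$

Here n_1 and n_2 are the numbers of films liked by u1 and u2, and n_12 is the number of films liked by both.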

#### Solution

In [1]:
import numpy as np

In [2]:
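class friendadviser(object):
    def fit(self, R):
        # R is the M x N binary rating matrix: one row per user, one column per film
        self.R = R
        self.n_users = R.shape[0]
        self.n_items = R.shape[1]
        return self

    def _sim(self, u1, u2):
        # "truncated" PMI score: films liked by both users divided by the
        # product of the numbers of films each user likes
        nx, ny = u1.sum(), u2.sum()
        n_xy = (u1 * u2).sum()
        # the small constant avoids division by zero for users with no liked films
        return n_xy / ((nx + 1.e-6) * (ny + 1.e-6))

    def U_idx(self, u0, alpha):
        # similarity of u0 to every registered user
        sim = np.array([self._sim(u0, self.R[i, :]) for i in range(self.n_users)])
        # user indices sorted by descending similarity
        idx = np.argsort(sim)[::-1]
        # keep only the users whose similarity is at least alpha
        ind = np.where(sim[idx] >= alpha)[0]
        return idx[ind]

    def find_friends(self, u0, how_many):
        # alpha=0 keeps every user, so the top how_many of the full ranking are returned
        idx = self.U_idx(u0, alpha=0.)
        return idx[:how_many]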

An example:

In [3]:
u1 = [1, 0, 0, 1]
u2 = [1, 1, 0, 0]
u3 = [0, 1, 1, 1]
u4 = [0, 0, 0, 0]
u5 = [1, 0, 0, 1]

u0 = np.array([1, 1, 0, 0])

X = np.array([u1, u2, u3, u4, u5])

fa = friendadviser().fit(X)

In [4]:
fa.U_idx(u0, 0.2)

array([1, 4, 0])
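
This result can be checked by hand, reading _sim as n_xy / (n_x · n_y) and ignoring the 1e-6 term, which only matters for u4 (an empty rating vector, so the score is 0). The new user u0 likes 2 films, so, writing s for the score:

$$
\begin{aligned}
s(u_0, u_1) &= \frac{1}{2 \cdot 2} = 0.25, &
s(u_0, u_2) &= \frac{2}{2 \cdot 2} = 0.5, &
s(u_0, u_3) &= \frac{1}{2 \cdot 3} \approx 0.167, \\
s(u_0, u_4) &= 0, &
s(u_0, u_5) &= \frac{1}{2 \cdot 2} = 0.25.
\end{aligned}
$$

Only u2, u5 and u1 (row indices 1, 4 and 0) reach the threshold 0.2. Indices 0 and 4 are tied at 0.25, so their relative order depends only on how the reversed argsort happens to break ties.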

In [5]:
fa.find_friends(u0, 2)

array([1, 4])
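
The per-user loop in U_idx can also be expressed as a single matrix-vector product. The sketch below is one possible variant, not part of the assignment: score_all is a hypothetical helper, it assumes the same 1e-6 regularizer as _sim, and it uses a stable sort on the negated scores so that ties are broken by lower index. It reuses the toy matrix from the example above.

```python
import numpy as np

def score_all(R, u0, eps=1e-6):
    """Hypothetical helper: truncated-PMI score of u0 against every row of R."""
    R = np.asarray(R, dtype=float)
    u0 = np.asarray(u0, dtype=float)
    n_xy = R @ u0            # films liked by both u0 and each registered user
    n_x = u0.sum()           # films liked by u0
    n_y = R.sum(axis=1)      # films liked by each registered user
    return n_xy / ((n_x + eps) * (n_y + eps))

# same toy data as in the example above
R = np.array([[1, 0, 0, 1],
              [1, 1, 0, 0],
              [0, 1, 1, 1],
              [0, 0, 0, 0],
              [1, 0, 0, 1]])
u0 = np.array([1, 1, 0, 0])

scores = score_all(R, u0)
# descending similarity; kind="stable" keeps tied users in index order
order = np.argsort(-scores, kind="stable")
# keep only users whose score is at least 0.2
friends = order[scores[order] >= 0.2]
print(friends)
```

With this tie-breaking rule the users above the 0.2 threshold come out as indices 1, 0, 4: the same set that U_idx returned, with the tied pair 0 and 4 in the other order.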