Recommender Service API
Collaborative filtering techniques (e.g. (Goldberg et al., 1992), (Resnick et al., 1994), (Shardanand & Maes, 1995), (Hill et al., 1995), etc.) predict a user’s affinity for items on the basis of the ratings that other users have given to these items in the past. Making a recommendation in such systems therefore consists of two steps: using the user’s past ratings to find people with similar tastes (or items whose rating patterns are similar to those of the items the user has rated), and then extrapolating the user’s future ratings from the ratings of those neighbours. User information in a collaborative system consists of a vector of items and their associated ratings, so finding similar users translates into finding similar vectors. The main advantages of collaborative techniques are:
- Cross-genre niche identification. Collaborative filtering has proven to be very effective at thinking outside the box.
- Domain independence. Domain knowledge is not needed (e.g. the same algorithm that rates movies can be used to recommend any other kind of item).
- The quality of its results improves over time, and implicit user feedback is sufficient.
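The idea that finding similar users translates into finding similar rating vectors can be sketched as follows. This is a minimal illustration, not the service's actual algorithm: the user names, item identifiers, ratings, and the choice of cosine similarity with a similarity-weighted average are all assumptions made for the example.

```python
from math import sqrt

# Hypothetical rating vectors: user -> {item: rating}. All values are made up.
ratings = {
    "alice": {"ro1": 5, "ro2": 3, "ro3": 4},
    "bob":   {"ro1": 4, "ro2": 2, "ro3": 5, "ro4": 4},
    "carol": {"ro1": 1, "ro2": 5, "ro4": 2},
}

def cosine_similarity(a, b):
    """Cosine similarity of two sparse rating vectors, restricted to co-rated items."""
    common = set(a) & set(b)
    if not common:
        return 0.0  # no shared items, so no evidence of similar taste
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)

def predict(user, item, ratings):
    """Extrapolate a rating as the similarity-weighted mean of other users' ratings."""
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        num += sim * other_ratings[item]
        den += sim
    # None signals that no similar user has rated the item.
    return num / den if den else None

print(predict("alice", "ro4", ratings))
```

In this toy data Alice's vector is closer to Bob's than to Carol's, so the predicted rating for `ro4` falls between Carol's 2 and Bob's 4 and is pulled towards Bob's. A user with an empty vector gets `None`, because there is nothing to compare against.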
The main disadvantage is that the quality of the results depends on a large historical data set, which causes:
- Cold-start problems.
- New User. When a new user arrives at the system, there is not enough rating information to sketch the user’s preferences, and there might also be a lack of information about the user itself. Both situations must be tackled by the recommender system.
- New Item. Every time a new Research Object is created, the recommender system must be able to recommend it to any user of the system who might be interested in it. Unlike the new user problem, a lack of information about the Research Object is less likely, since we assume that the information about the Research Object is accessible following the Linked Data principles [1]. Nevertheless, there is still a problem in estimating the reputation that the user community gives to the Research Object.
- Gray sheep problem. This problem is related to the new user problem. Some users are mere observers in a social scenario; they neither rate items nor provide any other means to extract their tastes from their social interactions. Therefore, the system does not have enough information about them, as in the case of new users.
- The sparsity problem. The sparsity problem typically occurs in systems with a large number of items, in which many items are rated by only a few users and many users have rated only a few items. Items rated by just a few users are unlikely to be recommended, no matter how high their reputation might be. The recommender system should minimize this situation as much as possible.
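The problems listed above can be made concrete by measuring how empty a rating matrix is. The following is a rough sketch under assumed data: the users, items, ratings, and the `min_raters` threshold are hypothetical and chosen only to illustrate the effect.

```python
from collections import Counter

# Hypothetical catalogue and ratings; "u4" is an observer who never rates anything.
catalogue = ["ro1", "ro2", "ro3", "ro4", "ro5"]
ratings = {
    "u1": {"ro1": 5, "ro2": 3},
    "u2": {"ro1": 4},
    "u3": {"ro1": 2, "ro3": 5},
    "u4": {},
}

def sparsity(ratings, catalogue):
    """Fraction of user-item cells that hold no rating."""
    cells = len(ratings) * len(catalogue)
    if cells == 0:
        return 1.0
    filled = sum(len(user_ratings) for user_ratings in ratings.values())
    return 1.0 - filled / cells

def rarely_rated(ratings, catalogue, min_raters=2):
    """Items with fewer than min_raters ratings, including items never rated."""
    counts = Counter(item for r in ratings.values() for item in r)
    return [item for item in catalogue if counts[item] < min_raters]

def silent_users(ratings):
    """Users with no ratings at all (new users or pure observers)."""
    return sorted(user for user, r in ratings.items() if not r)

print(sparsity(ratings, catalogue))      # 5 of 20 cells are filled
print(rarely_rated(ratings, catalogue))  # candidates for the new item / sparsity problems
print(silent_users(ratings))             # candidates for the new user / gray sheep problems
```

Here only `ro1` has enough raters to be recommended with any confidence; the remaining items and the silent user are exactly the cases a purely collaborative approach handles poorly.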
Content-based recommender systems (e.g. (Belkin and Croft, 1992), (Lang, 1995), (Schafer et al., 1999), etc.) make use of information retrieval and filtering techniques. A content-based recommender tries to infer the items a user will be interested in on the basis of the features of the objects that the user rated in the past. These object features include the keywords that define the object, a summary of its content, etc. Content-based techniques have similar advantages to collaborative filtering approaches (without the ability to detect cross-genre niches), and they do not exhibit the new item problem. Nonetheless, they still rely on a large historical data set.
Content-based recommenders recommend items based upon:
- A description of the content of the item (i.e. ROs or resources)
- A profile of the user’s interest
- The user’s profile is embodied by a set of keywords that have been previously proposed by the user (and the tags that the user has assigned)
- The RO (or resource) description, being of great importance:
- The title and description (see the Content-Req)
- The tags that have been applied to the item by the user community (see the Reputation-Req)
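The matching between a keyword-based user profile and an item description can be sketched as below. This is only an illustration under stated assumptions: the dictionary fields (`title`, `description`, `tags`), the sample Research Objects, and the scoring rule (the fraction of profile keywords found in the item's terms) are hypothetical and not taken from the service itself.

```python
import re

def tokenize(text):
    """Lower-case a string and split it into a set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def item_terms(item):
    """Terms describing an item: its title, its description, and its community tags."""
    return (tokenize(item["title"])
            | tokenize(item["description"])
            | tokenize(" ".join(item["tags"])))

def score(profile_keywords, item):
    """Fraction of the user's profile keywords that appear in the item's terms."""
    profile = tokenize(" ".join(profile_keywords))
    if not profile:
        return 0.0
    return len(profile & item_terms(item)) / len(profile)

def recommend(profile_keywords, items):
    """Item ids with a non-zero score, best match first."""
    scored = [(score(profile_keywords, item), item["id"]) for item in items]
    return [item_id for s, item_id in sorted(scored, reverse=True) if s > 0]

# Hypothetical Research Objects and user profile.
items = [
    {"id": "ro1", "title": "Gene expression workflow",
     "description": "A workflow for microarray analysis",
     "tags": ["genomics", "workflow"]},
    {"id": "ro2", "title": "Galaxy cluster survey",
     "description": "Photometric catalogue of galaxies",
     "tags": ["astronomy"]},
]
profile = ["genomics", "workflow", "microarray"]

print(recommend(profile, items))
```

Because the score depends only on the item's own description and the active user's keywords, a newly created item can be matched immediately, without waiting for other users to rate it.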
The advantages of content-based recommendation algorithms are:
- No new item problem!
- Only the ratings provided by the active user are used to build her own profile; no data on other users is needed
- New user handling: the system still doesn’t have a well-formed user profile. Nevertheless, this technique doesn’t rely on statistical information; it only needs the user to provide a small set of keywords that represent