Recommender Service API
Collaborative filtering techniques (e.g. (Goldberg et al., 1992), (Resnick et al., 1994), (Shardanand & Maes, 1995), (Hill et al., 1995), etc.) predict a user's affinity for items on the basis of the ratings that other users have given to these items in the past. Making a recommendation in such systems therefore consists of finding people with tastes similar to the user's (or items with rating patterns similar to those of the items the user has rated) by means of their past ratings, and then extrapolating the user's future ratings from the ratings of those neighbours. User information in a collaborative system consists of a vector of items and their associated ratings; finding similar users translates into finding similar vectors. The main advantages of collaborative techniques are:
- Cross-genre niche identification. Collaborative filtering has proven to be very effective at thinking outside the box
- Domain independence. Domain knowledge is not needed (e.g. the same algorithm that rates movies can be used to recommend any other kind of item)
- The quality of its results improves over time, and implicit user feedback is sufficient
Their main disadvantage is that quality depends on a large historical data set, which causes:
- Cold-start problems.
  - New User. When a new user arrives at the system, there is not enough rating information to sketch the user's preferences, and there may also be a lack of information about the user itself. Both situations must be tackled by the recommender system.
  - New Item. Every time a new Research Object is created, the recommender system must be able to recommend it to any user of the system who might be interested in it. Unlike the new user problem, a lack of information about the Research Object is less likely, since we assume that information about the Research Object is accessible following the Linked Data principles [1]. Nevertheless, there remains the problem of estimating the reputation that the user community assigns to the Research Object.
- Gray sheep problem. This problem is related to the new user problem. Some users are mere observers in a social scenario; they neither rate items nor provide any other means of extracting their tastes from their social interactions. Therefore, the system does not have enough information about them, as in the case of new users.
- The sparsity problem. This typically occurs in systems with a large number of items, in which many items are rated by only a few users and many users have rated only a few items. Items rated by just a few users are unlikely to be recommended, no matter how high their reputation might be. The recommender system should minimize this situation as much as possible.
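The neighbour-based prediction described at the start of this section can be illustrated with a minimal sketch. It assumes ratings are stored as one sparse vector per user and uses cosine similarity with a similarity-weighted average; the function names, the sample data and the choice of similarity measure are illustrative assumptions, not the Recommender Service implementation.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two sparse rating vectors (dict: item -> rating)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def predict_rating(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`.

    Returns None when no similar user has rated the item, which is how
    the cold-start and sparsity problems show up in this sketch.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        if sim > 0:
            weighted_sum += sim * other_ratings[item]
            weight_total += sim
    if weight_total == 0:
        return None
    return weighted_sum / weight_total


# Hypothetical ratings: user -> {item -> rating}
ratings = {
    "alice": {"wf1": 5, "wf2": 4},
    "bob": {"wf1": 5, "wf2": 4, "wf3": 2},
    "carol": {"wf4": 3},
}
```

In this toy data set, `carol` shares no rated items with anyone, so no prediction can be made for her, and `wf4` cannot be predicted for anyone else. The sketch returns `None` in both cases.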
Content-based recommender systems (e.g. (Belkin and Croft, 1992), (Lang, 1995), (Schafer et al., 1999), etc.) make use of information retrieval and filtering techniques. A content-based recommender tries to infer a user's future items of interest on the basis of the features of the objects that the user rated in the past. These object features are descriptive elements such as the keywords that define the object, a summary of its content, etc. Content-based techniques have advantages similar to those of collaborative filtering approaches (without the ability to detect cross-genre niches), and they do not exhibit the new item problem. Nonetheless, they still rely on a large historical data set.
Content-based recommenders recommend items based upon:
- A description of the content of the item (i.e. ROs or resources)
- A profile of the user’s interest
- The user’s profile is embodied by a set of keywords that have been previously proposed by the user (and the tags the user has assigned)
- The RO (or resource) description, in which the following are of great importance:
- The title and description (see the Content-Req)
- The tags that have been applied to the item by the user community (see the Reputation-Req)
The advantages of content-based recommendation algorithms are:
- No new item problem!
- Only the ratings provided by the active user are used to build her own profile; no data on other users is needed
- The new user handling problem remains a disadvantage, as the system still doesn’t have a well-formed user profile. Nevertheless, this technique doesn’t rely on statistical information; it just needs the user to provide a small set of keywords that represent
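The keyword-based matching described in this section can be sketched as follows. The scoring rule (fraction of profile keywords found in the item's title, description or tags), the function names and the item dictionaries are assumptions made for illustration; the service's actual scoring is not specified here.

```python
def content_score(profile_keywords, title, description, tags):
    """Fraction of the user's profile keywords found in the item's text or tags."""
    profile = {k.lower() for k in profile_keywords}
    if not profile:
        return 0.0
    # Naive whitespace tokenisation; punctuation handling is left out of this sketch.
    words = set((title + " " + description).lower().split())
    item_terms = words | {t.lower() for t in tags}
    return len(profile & item_terms) / len(profile)


def recommend(profile_keywords, items, max_results=10):
    """Return (score, uri) pairs with a non-zero score, best first."""
    scored = [
        (content_score(profile_keywords, it["title"], it["description"], it["tags"]), it["uri"])
        for it in items
    ]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return ranked[:max_results]


# Hypothetical items and user profile
items = [
    {"uri": "http://example.org/wf/1", "title": "Protein names",
     "description": "Simple SADI example", "tags": ["sadi", "taverna"]},
    {"uri": "http://example.org/wf/2", "title": "KEGG pathways",
     "description": "Pathway lookup", "tags": ["kegg"]},
]
profile = ["sadi", "taverna", "spreadsheet"]
```

Only the active user's keywords are consulted, so a newly created item can be scored as soon as it has a title, a description or tags.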
The inference of new recommendations is made by means of the constrained spreading activation mechanism. Spreading activation techniques, initially defined by (Quillian, 1968) and (Collins and Loftus, 1975), have been well studied in the Information Retrieval field. Upon activation of a number of specific nodes, their activation is spread iteratively to adjacent nodes until some termination criterion is met. In the concrete case of the recommendation inference engine, activation equates to recommending an item with a given strength.
Following the approach presented in (Crestani, 1997), we adopt a constrained approach that introduces:
- Distance constraints.
- Path constraints
- Fan-out constraints
- Resource aggregation handling
- Research Object evolution handling
- No cold start problems
- Can include features that are not present in the items (e.g. Research Object, resource, etc.)
- Static behaviour. The propagation of new recommendations is constrained by the relations defined among concepts in the ontologies used. Such relations are not easily or frequently changed.
- Knowledge engineering required. The use of the constrained spreading activation mechanism assumes the pre-existence of a formally and explicitly defined model of the domain.
The recommendations inference engine addresses the requirements (Evolution-Req), (Repurposeable-Req), (Model-Req) and (Cold-Req).
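A minimal sketch of spreading activation with a distance constraint and a fan-out constraint follows. The graph representation, the decay factor, the default limits and the node names are assumptions chosen for illustration; path constraints (restricting which relation types may be followed) are omitted for brevity.

```python
from collections import deque


def spread_activation(graph, seeds, max_distance=2, max_fan_out=3, decay=0.5):
    """Spread activation from seed nodes through a graph.

    graph: dict mapping a node to the list of its adjacent nodes.
    seeds: dict mapping each initially activated node to its activation.
    Returns a dict mapping every reached node to its activation,
    read here as a recommendation strength.
    """
    activation = dict(seeds)
    queue = deque((node, 0) for node in seeds)
    while queue:
        node, distance = queue.popleft()
        if distance >= max_distance:
            continue  # distance constraint: stop spreading beyond this many hops
        neighbours = graph.get(node, [])
        if len(neighbours) > max_fan_out:
            continue  # fan-out constraint: highly connected nodes do not spread
        incoming = activation[node] * decay
        for neighbour in neighbours:
            # Only propagate when it raises the neighbour's activation,
            # which also guarantees termination on cyclic graphs.
            if incoming > activation.get(neighbour, 0.0):
                activation[neighbour] = incoming
                queue.append((neighbour, distance + 1))
    return activation


# Hypothetical graph: a tag linked to workflows, which link to further workflows
graph = {
    "tag:sadi": ["wf1", "wf2"],
    "wf1": ["wf3"],
    "wf3": ["wf4"],
    "hub": ["a", "b", "c", "d"],
}
```

With the default limits, activating `tag:sadi` reaches `wf1`, `wf2` and `wf3` but not `wf4`, which lies three hops away. Activating `hub` spreads nowhere because it has more than three neighbours.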
Recommender systems are inherently vertical: they are configured to provide recommendations in a single, specific domain. We need a means of tailoring recommendations to each research community that wishes to make use of the recommender system in the future.
We address this tailoring when we combine the recommendations obtained with different recommendation algorithms. In state-of-the-art hybrid recommendation systems (see (Burke, 2002) for a survey of such techniques), the combination decision is usually:
- Hardcoded
- Implicit
- An explicit declarative way of expressing such policies.
- A combiner that detects when these policies are applicable and enacts them.
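One way to picture an explicit policy table together with a combiner that applies the first matching policy is sketched below. The policy structure, the condition functions, the weights and the `ratings_count` context key are all hypothetical; only the technique names are taken from the service's response format.

```python
# Hypothetical policy table: each policy has a name, an applicability test
# over a context dictionary, and a weight per recommendation technique.
POLICIES = [
    {
        "name": "new_user",
        "applies": lambda ctx: ctx["ratings_count"] == 0,
        "weights": {"technique_keyword_content_based": 1.0,
                    "technique_collaborative": 0.0},
    },
    {
        "name": "default",
        "applies": lambda ctx: True,
        "weights": {"technique_keyword_content_based": 0.5,
                    "technique_collaborative": 0.5},
    },
]


def combine(recommendations, ctx, policies=POLICIES):
    """Pick the first applicable policy and weight each recommendation by it.

    Returns the policy name and a list of (resource, score) pairs,
    best first, leaving out resources whose combined score is zero.
    """
    policy = next(p for p in policies if p["applies"](ctx))
    combined = {}
    for rec in recommendations:
        weight = policy["weights"].get(rec["usedtechnique"], 0.0)
        combined[rec["resource"]] = combined.get(rec["resource"], 0.0) + weight * rec["strength"]
    ranked = sorted(((r, s) for r, s in combined.items() if s > 0),
                    key=lambda pair: pair[1], reverse=True)
    return policy["name"], ranked


# Hypothetical recommendations produced by two different techniques
recs = [
    {"resource": "wf1", "usedtechnique": "technique_collaborative", "strength": 4.0},
    {"resource": "wf2", "usedtechnique": "technique_keyword_content_based", "strength": 3.0},
]
```

Keeping the policies in a data structure rather than in the combiner's control flow means a research community could, in principle, swap in its own table without touching the combiner.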
The Recommender Service provides a set of recommendations of scientific resources such as myExperiment files and workflows, and research papers. The recipients of such recommendations must be myExperiment users, since the data used to create them is based on the user's myExperiment profile and uploaded data. The interface is a REST API that can be used as follows:
HTTP: GET PATH/recommender/recommendations/recommendationSet/user/{userId}/{itemType}{?max}
Where:
- PATH: the path where the Recommender Service is deployed
- userId: the myExperiment id of the user.
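As a rough illustration, the template above could be expanded by hand as in the following sketch. The function name, the example host and the `workflows` item type value are assumptions; a full URI Template library could be used instead.

```python
from urllib.parse import quote


def filtered_recommendations_uri(path, user_id, item_type, max_results=None):
    """Expand PATH/recommender/recommendations/recommendationSet/user/{userId}/{itemType}{?max}."""
    uri = (
        f"{path}/recommender/recommendations/recommendationSet/user/"
        f"{quote(str(user_id), safe='')}/{quote(item_type, safe='')}"
    )
    if max_results is not None:
        # The optional {?max} part becomes a query parameter.
        uri += f"?max={int(max_results)}"
    return uri


# Hypothetical deployment path and item type
example = filtered_recommendations_uri("http://service.example.org", 2, "workflows", 5)
```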
C: GET /recommender HTTP/1.1
C: Host: service.example.org
C: Accept: application/xml

S: HTTP/1.1 200 OK
S: Content-Type: application/xml
S:
S: <recommender>
S:   <filteredrecommendationsset>
S:     /recommender/recommendations/recommendationSet/user/{userId}/{itemType}{?max}
S:   </filteredrecommendationsset>
S:   <recommendationsset>
S:     /recommender/recommendations/recommendationSet/user/{userId}{?max}
S:   </recommendationsset>
S:   <recommendationcontext>
S:     /recommender/contexts/recommendationContext{?user,resource,keyword}
S:   </recommendationcontext>
S:   <contextualizedrecommendationsset>
S:     /recommender/recommendations/contextualizedRecommendationsSet{?user,type,max}
S:   </contextualizedrecommendationsset>
S: </recommender>
The client parses the service document, extracts the URI template for the recommender service and assembles the URI for the desired recommendations set:

C: GET /recommender/recommendations/recommendationsSet/user/2 HTTP/1.1
C: Host: service.example.org
C: Accept: application/xml

S: HTTP/1.1 200 OK
S: Content-Type: application/xml
S:
S: <?xml version="1.0"?>
S: <recommendationsset>
S:   <recommendation>
S:     <explanation>
S:       The workflow entitled Get names of proteins similar to RNA binding proteins (Simple example SADI workflow)
S:       (URI:http://www.myexperiment.org/workflow.xml?id=2127) is recommended to you since you used the following tags: sadi,
S:       taverna, spreadsheet; and they partially describe its content
S:     </explanation>
S:     <itemtype>
S:       item_type_workflow
S:     </itemtype>
S:     <resource>
S:       http://www.myexperiment.org/workflows/2127
S:     </resource>
S:     <strength>
S:       5.0
S:     </strength>
S:     <title>
S:       Get names of proteins similar to RNA binding proteins (Simple example SADI workflow)
S:     </title>
S:     <usedtechnique>
S:       technique_keyword_content_based
S:     </usedtechnique>
S:   </recommendation>
S:
S:   <recommendation>
S:     ...
S:   </recommendation>
S: </recommendationsset>
The client can also assemble the URI for creating the desired recommendation context for a later use of the contextualized recommender:
C: PUT /recommender/contexts/recommendationContext?user=http://www.myexperiment.org/user.xml?id=2
C:     &resource=http://www.myexperiment.org/workflow.xml?id=16
C:     &resource=http://www.myexperiment.org/workflow.xml?id=1583 HTTP/1.1
C: Host: service.example.org
C: Accept: application/xml

S: HTTP/1.1 200 OK
S: Content-Type: application/xml
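A client might assemble the context URI from the recommendationContext template as in this sketch. The function name and the base URI are assumptions. Unlike the transcript above, the sketch percent-encodes the parameter values, on the assumption that the service decodes them in the usual way.

```python
from urllib.parse import urlencode


def recommendation_context_uri(base, user_uri, resources=(), keywords=()):
    """Build /recommender/contexts/recommendationContext{?user,resource,keyword}.

    `resource` and `keyword` may each be repeated 0..N times.
    """
    params = [("user", user_uri)]
    params += [("resource", r) for r in resources]
    params += [("keyword", k) for k in keywords]
    return f"{base}/recommender/contexts/recommendationContext?{urlencode(params)}"


# Values taken from the example request; the base URI is hypothetical
context_uri = recommendation_context_uri(
    "http://service.example.org",
    "http://www.myexperiment.org/user.xml?id=2",
    resources=[
        "http://www.myexperiment.org/workflow.xml?id=16",
        "http://www.myexperiment.org/workflow.xml?id=1583",
    ],
)
```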
After creating the recommendation context, the client can request the recommendations obtained using the provided context:

C: GET /recommender/recommendations/contextualizedRecommendationsSet?user=http://www.myexperiment.org/user.xml?id=2 HTTP/1.1
C: Host: service.example.org
C: Accept: application/xml

S: HTTP/1.1 200 OK
S: Content-Type: application/xml
S:
S: <?xml version="1.0"?>
S: <recommendationsset>
S:   <recommendation>
S:     <explanation>
S:       The workflow entitled Pathways and Gene annotations for Arabidopsis affy data(URI:http://www.myexperiment.org
S:       /workflow.xml?id=726) is recommended to you since you selected the resources
S:       (<a href="http://www.myexperiment.org/workflow.xml?id=16," target="_blank">http://www.myexperiment.org/workflow.xml?id=1583</a>) with similar components
S:     </explanation>
S:     <itemtype>item_type_workflow</itemtype>
S:     <resource>http://www.myexperiment.org/workflows/726</resource>
S:     <strength>5.0</strength>
S:     <title>Pathways and Gene annotations for Arabidopsis affy data</title>
S:     <usedtechnique>technique_group_content_based</usedtechnique>
S:   </recommendation>
S:   ...
S:   <recommendation>
S:     <explanation>
S:       The workflow entitled KEGG pathways common to both QTL and microarray based investigations
S:       (URI:http://www.myexperiment.org/workflow.xml?id=13) is recommended to you since you selected the resources
S:       (<a href="http://www.myexperiment.org/workflow.xml?id=16," target="_blank">http://www.myexperiment.org/workflow.xml?id=1583</a>) with
S:       similar components
S:     </explanation>
S:     <itemtype>item_type_workflow</itemtype>
S:     <resource>http://www.myexperiment.org/workflows/13</resource>
S:     <strength>1.498285</strength>
S:     <title>KEGG pathways common to both QTL and microarray based investigations</title>
S:     <usedtechnique>technique_group_content_based</usedtechnique>
S:   </recommendation>
S: </recommendationsset>
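A client could turn such a response into a ranked list as in the sketch below. It uses Python's standard `xml.etree.ElementTree` module; the function name and the abbreviated sample document are illustrative, and the element names follow the responses shown in the transcripts.

```python
import xml.etree.ElementTree as ET


def parse_recommendations(xml_text):
    """Parse a <recommendationsset> document into dicts sorted by strength."""
    root = ET.fromstring(xml_text)
    results = []
    for rec in root.iter("recommendation"):
        results.append({
            "title": (rec.findtext("title") or "").strip(),
            "resource": (rec.findtext("resource") or "").strip(),
            "itemtype": (rec.findtext("itemtype") or "").strip(),
            "usedtechnique": (rec.findtext("usedtechnique") or "").strip(),
            # float() tolerates the surrounding whitespace seen in the responses
            "strength": float(rec.findtext("strength", default="0")),
        })
    return sorted(results, key=lambda r: r["strength"], reverse=True)


# Abbreviated sample based on the response format (explanations omitted)
SAMPLE = """
<recommendationsset>
  <recommendation>
    <itemtype>item_type_workflow</itemtype>
    <resource>http://www.myexperiment.org/workflows/13</resource>
    <strength>1.498285</strength>
    <title>KEGG pathways common to both QTL and microarray based investigations</title>
    <usedtechnique>technique_group_content_based</usedtechnique>
  </recommendation>
  <recommendation>
    <itemtype>item_type_workflow</itemtype>
    <resource>
      http://www.myexperiment.org/workflows/726
    </resource>
    <strength> 5.0 </strength>
    <title>Pathways and Gene annotations for Arabidopsis affy data</title>
    <usedtechnique>technique_group_content_based</usedtechnique>
  </recommendation>
</recommendationsset>
"""
```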
Link relations

- recommendationsset: The set of recommendations for the user identified as user (the integer that represents the user in myExperiment). Its cardinality may be restricted to a maximum number (max).
  /recommender/recommendations/recommendationSet/user/{userId}{?max}
- filteredrecommendationsset: The set of recommendations for the user identified as userID of the item type itemType (i.e. workflows, files, users, packs). Its cardinality may be restricted to a maximum number (max).
  /recommender/recommendations/recommendationSet/user/{user}/{type}{?max}
- recommendationcontext: The recommendation context must be set up when the user is interested in receiving recommendations based on a group of myExperiment resources or keywords. The recommendation context is composed of the set of resources (0..N resources defined by the resource query parameter), the set of keywords (0..N keywords defined by the keyword query parameter), and the URI of the user that is associated with the context (user query parameter).
  /recommender/contexts/recommendationContext{?user,resource,keyword}
- contextualizedrecommendationsset: The set of contextualized recommendations for the user identified as user (user query parameter) of items of a type (type query parameter) (i.e. workflows, files, users, packs). Its cardinality may be restricted to a maximum number (max query parameter).
  /recommender/recommendations/contextualizedRecommendationsSet{?user,type,max}

HTTP methods

The service description is obtained in response to an HTTP GET to a Recommender Service URI.
The Recommender Service responds to an HTTP GET with the results of a recommendationsSet, using the URI defined by expanding the template provided by the service description.
Resources and formats

A RecommendationsSet represents a set of recommendations for a given user:

<recommendationsset>
  <recommendation>
    <explanation>A user-oriented description of why the recommendation is made to the user</explanation>
    <itemtype>
      [item_type_workflow|item_type_file|item_type_pack|item_type_user]
    </itemtype>
    <resource>The URL of the recommended item</resource>
    <strength>A real number ranging from 0 to 5 that represents the relevance of the recommendation</strength>
    <title>The title or name of the recommended item</title>
    <usedtechnique>
      [technique_keyword_content_based|technique_social|technique_collaborative|technique_inferred|technique_group_content_based]
    </usedtechnique>
  </recommendation>
  ...
</recommendationsset>

The recommendationContext resource contains the group of resources and keywords that are considered when providing contextualized recommendations for a given user:

<recommendationcontext>
  <resource>resource URI 0</resource>
  ..
  <resource>resource URI N</resource>
  <keyword>keyword 0</keyword>
  ..
  <keyword>keyword N</keyword>
  <useruri>The user associated with the context</useruri>
</recommendationcontext>

Cache considerations

RecommendationsSets that do not depend on the user context are precalculated and cached.
Security considerations

The recommendations provided by the Recommender Service are based on publicly available data, and the service is read-only. There are no privacy or unintended data modification risks.

$BASEURI/wakeup GET

This operation initializes the Recommender Service and provides textual information about its state. The response is a message similar to this:
The Recommender Service has been initialized! It is available at: http://localhost:8015/recommender
It took 58.487to initialize the recommender system
of recommendations 6345
of inferred recommendations 225
of users> 6914
of workflows> 1759
of files> 856
of users with at least one rating 88
of users that have uploaded a workflow 278
of users with at least one favourite workflow 74
of favourited workflows (they may be repeated) 168
of users with at least one favourite and rating 20
of users that have received a recommendation 608
of items that have been recommended 1046
of recommendations by collaborative filtering algorithm 177
of recommendations by content based algorithm 4369
of recommendations by social network algorithm 1799
of users that have uploaded a file 120
of users with at least one favourite file 6
of favourited files (they may be repeated) 7
of file ratings 17
of workflow ratings 132
of tags 3325
of users with at least one tag 285
of the average tag per user 0.4809083
of the average tag per user that has tags 11.666667
of packs 370