# Recommender Systems and Knowledge-based Selections

This notebook provides a brief introduction to the main categories of Recommendation Systems.  It goes on to show the unique value of Knowledge-based recommenders and how they can be implemented.  These systems are often overlooked because of the unwarranted notion that they are specific to the application domain.  However, they have many generalizable concepts and offer a richer level of analysis than more superficial systems.

# Types of Recommenders

Succinctly, Recommendation Systems focus on providing users with items they may be interested in selecting.  Information may come from a variety of sources and be combined to create user characteristic profiles and item feature groups.  Machine learning techniques are often used to transform and manipulate the data, then make predictions from it for purposes of grouping, scoring, and ranking.

There are three different categories of recommender systems:

I. _Collaborative Systems_ make use of similarities among users and among items.  Amazon book recommendations are a good example.  These systems use calculations to determine:

* similarity between users
* similarity between items
* how to describe new users and items for which no information is available

Questions concerning these systems include:

* How do we find users with similar tastes to the user for whom we need a recommendation?
* How do we measure similarity?
* What should we do with new users, for whom a buying history is not yet available?
* How do we deal with new items that nobody has bought yet?
* What if we have only a few ratings that we can exploit?
* What other techniques besides looking for similar users can we use for making a prediction about whether a certain user will like an item?
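
One common way to approach the first two questions is to compare users by the cosine similarity of their rating vectors.  The sketch below is only an illustration: the `ratings` data, the user and item names, and the helper functions are all hypothetical, and similarity is computed over co-rated items only, which is one of several possible design choices.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: rating}. All names and values are made up.
ratings = {
    "alice": {"book_a": 5, "book_b": 3, "book_c": 4},
    "bob": {"book_a": 4, "book_b": 2, "book_c": 5, "book_d": 4},
    "carol": {"book_a": 1, "book_b": 5, "book_d": 2},
}


def cosine_similarity(u, v):
    """Cosine similarity computed over the items both users have rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0  # no overlap, so no evidence of similarity
    dot = sum(u[i] * v[i] for i in common)
    norm_u = sqrt(sum(u[i] ** 2 for i in common))
    norm_v = sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)


def most_similar_user(target, ratings):
    """Return the other user whose ratings are closest to the target's."""
    others = (name for name in ratings if name != target)
    return max(others, key=lambda name: cosine_similarity(ratings[target], ratings[name]))


neighbor = most_similar_user("alice", ratings)
# Suggest items the neighbor has rated that the target user has not seen yet.
suggestions = [item for item in ratings[neighbor] if item not in ratings["alice"]]
```

A new user with no ratings has no co-rated items, so every similarity is 0.0 here; this is the cold-start problem raised in the questions above.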



II. _Content-based Systems_ use information from within users' previous selections.  Netflix movie recommendations, for example, draw on movies within a similar genre and on users' historical ratings.  Calculations consider:

* similarity within item genres
* creating and continuously updating user profiles

Questions concerning these systems include:

* How can systems automatically acquire and continuously improve user profiles?
* How do we determine which items match, or are at least similar to or compatible with, a user’s interests?
* What techniques can be used to automatically extract or learn the item descriptions to reduce manual annotation?
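
As a minimal sketch of content-based matching, a user profile can be built as the rating-weighted average of the genre vectors of previously rated items, and unseen items can then be scored against that profile.  The genres, movies, ratings, and function names below are hypothetical and chosen only for illustration.

```python
# Hypothetical genre indicator vectors, one position per entry in GENRES.
GENRES = ["action", "comedy", "drama"]
items = {
    "movie_a": [1, 0, 0],
    "movie_b": [1, 1, 0],
    "movie_c": [0, 0, 1],
    "movie_d": [0, 1, 0],
}
history = {"movie_a": 5, "movie_c": 1}  # the user's past ratings (made up)


def build_profile(history, items):
    """Rating-weighted average of the genre vectors of rated items."""
    total = sum(history.values())
    profile = [0.0] * len(GENRES)
    for item, rating in history.items():
        for k, flag in enumerate(items[item]):
            profile[k] += rating * flag / total
    return profile


def score(profile, features):
    """Dot product between the user profile and an item's genre vector."""
    return sum(p * f for p, f in zip(profile, features))


profile = build_profile(history, items)
unseen = [name for name in items if name not in history]
ranked = sorted(unseen, key=lambda name: score(profile, items[name]), reverse=True)
```

Re-running `build_profile` whenever a new rating arrives is the simplest form of the continuous profile updating mentioned above.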



III. _Knowledge-based Systems_ are different from the first two in that they focus on specific domains where user history is limited.  Characteristics of these systems are that data may need to be obtained directly from the user and that item features are objective.  Models may also be based on Subject Matter Experts' (SME) understanding of user needs.  The design of such systems is based on:

* categories and types of information obtained about items
* obtaining user preferences

Questions concerning these systems include:

* What kinds of domain knowledge can be represented in a knowledge base?
* What mechanisms can be used to select and rank the items based on the user’s characteristics?
* How do we acquire the user profile in domains in which no purchase history is available, and how can we take the customer’s explicit preferences into account?
* Which interaction patterns can be used in interactive recommender systems?
* Finally, in which dimensions can we personalize the dialog to maximize the precision of the preference elicitation process?


Sometimes, a fourth category is described: a _Hybrid_ of the others.  In practice, most applications of recommendation systems evolve into hybrid systems in at least some respects.

Questions concerning these systems include:

* Which techniques can be combined, and what are the prerequisites for a given combination?
* Should proposals be calculated for two or more systems sequentially, or do other hybridization designs exist?
* How should the results of different techniques be weighted, and can the weights be determined dynamically?
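
One simple hybridization design is a weighted combination, in which each recommender scores the items independently and the scores are averaged with fixed weights.  The sketch below assumes that scores are already on a comparable scale and that a missing score counts as 0.0; the recommender names, scores, and weights are hypothetical.

```python
def weighted_hybrid(score_lists, weights):
    """Combine per-item scores from several recommenders by weighted average."""
    total_weight = sum(weights.values())
    all_items = set().union(*(scores.keys() for scores in score_lists.values()))
    combined = {}
    for item in all_items:
        combined[item] = sum(
            weights[name] * scores.get(item, 0.0)  # assumption: missing score = 0.0
            for name, scores in score_lists.items()
        ) / total_weight
    return combined


# Hypothetical scores from two recommenders for three items.
scores = {
    "collaborative": {"p1": 0.9, "p2": 0.2},
    "knowledge": {"p1": 0.4, "p2": 0.8, "p3": 0.6},
}
weights = {"collaborative": 1.0, "knowledge": 3.0}

combined = weighted_hybrid(scores, weights)
best = max(combined, key=combined.get)
```

Determining the weights dynamically would mean replacing the fixed `weights` dictionary with values learned from feedback, which this sketch does not attempt.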


# Additional Topics in the Field

Additional topics within the field of Recommender Systems include:
 
* explanation of the recommendations provided
* evaluation techniques for improving system performance

Questions concerning these topics include: 

* How can a recommender system explain its proposals while increasing the user’s confidence in the system?
* How does the recommendation strategy affect the way recommendations can be explained?
* Can explanations be used to convince a user that the proposals made by the system are “fair” or unbiased?
* How can recommender systems be evaluated using experiments on historical datasets?
* What metrics are applicable for different evaluation goals?
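
Two metrics commonly used when evaluating ranked recommendations against historical data are precision and recall at a cut-off `k`.  The sketch below assumes that the set of relevant items is known from held-out selections; the item identifiers are hypothetical.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k


def recall_at_k(recommended, relevant, k):
    """Fraction of all relevant items that appear in the top-k recommendations."""
    if not relevant:
        return 0.0
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / len(relevant)


# Hypothetical ranked output of a recommender and held-out relevant items.
recommended = ["p4", "p7", "p1", "p9"]
relevant = {"p4", "p1", "p5"}

precision = precision_at_k(recommended, relevant, k=3)
recall = recall_at_k(recommended, relevant, k=3)
```

Precision rewards short, accurate lists, while recall rewards coverage of everything the user would have selected, so the choice depends on the evaluation goal.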

# Knowledge-based Selections

For Collaborative or Content-based systems to work effectively, selections must be frequent and should preferably come from a large number of users.  This is not the case for items that are: i) not bought often, ii) not rated often, or iii) complex, multi-faceted products.  Here, the reasons for a selection far outnumber the selections themselves and the users searching.

Recommendations are made based on:

* _similarities_ of user requirements and items
* explicit _recommendation rules_ 

In addition, because of the complexity of the decision, these systems are often called _conversational_.  Users often need personalized interaction while providing information, and an explanation after the recommendation is made.

Three basic types of Knowledge-based systems and their basis for item selection are:

* constraint-based: explicitly defined recommendation rules
* case-based: item similarity measures
* hybrid: uses components of both techniques


### Constraint-based method

The domain of possible solutions is determined by two different sets of variables (V = Vc ∪ Vprod), one describing potential customer requirements and the other describing product properties, or domain of possibility.  For example, the customer property _max-price_ in Vc is the maximum price a customer is willing to pay, and the product property _mpix_ in Vprod denotes the possible resolutions of a digital camera.

Three different sets of constraints (C = Cr ∪ Cf ∪ Cprod) define which items should be recommended to a customer in which situation.  Cr - compatibility constraints, Cf - filter conditions, Cprod - product constraints (entire product item).  Example Cr: if large-size photoprints are required, the maximal accepted price must be higher than 200.  Example Cf: if large-size photoprints, then resolutions greater than 5 mpix required.  Example Cprod: p1:{price:148, mpix:8.0,...}. 

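Each solution to the _constraint satisfaction problem (CSP)_ (V = Vc ∪ Vprod, D, C = Cr ∪ Cf ∪ Cprod ∪ REQ), where D is the set of domains of the variables and REQ is the set of concrete customer requirements, corresponds to a consistent recommendation.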
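
The example constraints above can be sketched as plain Python predicates.  This is a simplified illustration, not a general CSP solver: product `p1` comes from the text, while `p2`, `p3`, the requirement keys (`large_prints`, `max_price`), and the rule that a product's price must not exceed `max_price` are assumptions added for the example.

```python
# Cprod: the product catalogue. p1 is from the text; p2 and p3 are hypothetical.
products = {
    "p1": {"price": 148, "mpix": 8.0},
    "p2": {"price": 182, "mpix": 5.0},
    "p3": {"price": 278, "mpix": 10.0},
}


def satisfies_compatibility(req):
    """Cr: large-size photoprints require a maximum accepted price above 200."""
    if req.get("large_prints"):
        return req["max_price"] > 200
    return True


def passes_filters(req, product):
    """Cf: large-size photoprints require a resolution greater than 5 mpix."""
    if req.get("large_prints") and not product["mpix"] > 5:
        return False
    return True


def recommend(req):
    """Return the products consistent with the requirements REQ."""
    if not satisfies_compatibility(req):
        return []  # the requirements themselves are inconsistent
    return [
        name
        for name, product in products.items()
        # Assumption: a product's price must not exceed the customer's max_price.
        if product["price"] <= req["max_price"] and passes_filters(req, product)
    ]


consistent = recommend({"large_prints": True, "max_price": 250})
inconsistent = recommend({"large_prints": True, "max_price": 150})
```

In the second call the requirements violate Cr, so no recommendation is consistent regardless of the catalogue.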


### Utility-based method

Stuff about utility-based method ...

### Case-based method

If we cannot match user requirements exactly, then we must use similarity measures that describe the extent to which the items do match.  In this context, sim(p,r) expresses, for each item attribute value φr(p), its distance to the customer requirement r ∈ REQ.  For example, φmpix(p1) = 8.0.
```
similarity(p, REQ) = ( Sum_r∈REQ (w_r ∗ sim(p,r)) ) / ( Sum_r∈REQ (w_r) )
```
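where w_r is the importance weight of requirement r, and the local similarity sim(p,r) depends on the type of attribute: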
```
MoreIsBetter:  sim(p,r) = ( φr(p) − min(r) ) / ( max(r) − min(r) )
LessIsBetter:  sim(p,r) = ( max(r) − φr(p) ) / ( max(r) − min(r) )
CloseIsBetter: sim(p,r) = 1 − ( | φr(p) − r | / ( max(r) − min(r) ) )
```
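
The formulas above translate directly into code.  In this sketch, product `p1` (price 148, 8.0 mpix) comes from the earlier example, while the attribute ranges, weights, and the `kind` / `target` keys of the requirement dictionaries are hypothetical choices made for illustration.

```python
def more_is_better(value, lo, hi):
    return (value - lo) / (hi - lo)


def less_is_better(value, lo, hi):
    return (hi - value) / (hi - lo)


def close_is_better(value, target, lo, hi):
    return 1 - abs(value - target) / (hi - lo)


def similarity(product, requirements):
    """Weighted average of the local similarities sim(p, r) over all r in REQ."""
    weighted_sum = 0.0
    weight_total = 0.0
    for attr, req in requirements.items():
        value, lo, hi = product[attr], req["min"], req["max"]
        if req["kind"] == "more":
            sim = more_is_better(value, lo, hi)
        elif req["kind"] == "less":
            sim = less_is_better(value, lo, hi)
        else:  # "close": similarity to an explicit target value
            sim = close_is_better(value, req["target"], lo, hi)
        weighted_sum += req["weight"] * sim
        weight_total += req["weight"]
    return weighted_sum / weight_total


p1 = {"price": 148, "mpix": 8.0}
# Hypothetical requirements: ranges and weights are assumptions.
requirements = {
    "price": {"kind": "less", "weight": 2.0, "min": 100, "max": 300},
    "mpix": {"kind": "more", "weight": 1.0, "min": 5.0, "max": 10.0},
}
score = similarity(p1, requirements)
```

With these assumed values, the price similarity is 0.76 and the resolution similarity is 0.6, so the weighted score is (2 × 0.76 + 0.6) / 3, roughly 0.71.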

Case-based systems often use _critiquing_, collecting feedback from users in order to improve recommendations.
 

### Application

Obtaining information from users can be a difficult task.  It can easily spill over into other fields, such as item-response theory and psychometrics.  However, recommender systems should resist the temptation to go in these directions.  Questions should be few and short.  Additional information can be obtained from Subject Matter Experts, who investigate user needs and adjust the recommender through their own user interface.

Additional information may also come from external data sources.  This can lead the Knowledge-based system to evolve with Content-based profiling and become a Hybrid system.  This happens particularly with item category, genre, and keyword information.

# Implementation

A recommendation can be viewed as a data filtering task, or _conjunctive query_, on a database with a set of selection criteria.  `σ[mpix≥10,price<300](P)` is such a conjunctive query on the database table P (which represents Vprod as columns / attributes and Cprod as records / entries), where σ represents the selection operator and `[mpix ≥ 10, price < 300]` the corresponding selection criteria.

Steps to implement a conjunctive query:

* determine user
  + domain of needs (Vc)
  + constraints (Cr)
  + priority of product attributes
* create table of products P (Cprod x Vprod)
* naive solution
  + remove product records that do not meet Vc, Cr: `σ[mpix≥10,price<300](P) = {p1,p2,p3,p4,p5,p6,p7}`
  + sort the remaining records by the prioritized attributes and keep the leading candidates: `{p4,p7}`
  + the solution is the top record of the sorted table: `{p4}`
* complex solution (no solution exists for naive method: `σ[mpix≥10,price<300](P) = {}`)
  + perform the naive method as far as possible
  + use case-based and utility-based methods for similarity measures and weighting
  + ...
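
The naive solution can be sketched in a few lines of plain Python.  The product table below is hypothetical (only `p1` matches the earlier example), the priority order (lower price first, then higher resolution) is an assumption, and the function names are illustrative.

```python
# Hypothetical product table P: one record per product (Cprod), one key per attribute (Vprod).
products = [
    {"id": "p1", "mpix": 8.0, "price": 148},
    {"id": "p2", "mpix": 10.0, "price": 278},
    {"id": "p3", "mpix": 12.0, "price": 249},
    {"id": "p4", "mpix": 10.0, "price": 199},
    {"id": "p5", "mpix": 12.0, "price": 320},
]


def conjunctive_query(table, criteria):
    """σ[criteria](table): keep the records that satisfy every predicate."""
    return [row for row in table if all(pred(row) for pred in criteria)]


def recommend_top(table, criteria):
    """Filter, sort by prioritized attributes, and return the top record's id."""
    matches = conjunctive_query(table, criteria)
    # Assumed priority: lower price first, then higher resolution.
    ranked = sorted(matches, key=lambda row: (row["price"], -row["mpix"]))
    return ranked[0]["id"] if ranked else None


criteria = [lambda row: row["mpix"] >= 10, lambda row: row["price"] < 300]
matches = conjunctive_query(products, criteria)
best = recommend_top(products, criteria)
```

When `recommend_top` returns `None`, no product meets every criterion, which is the situation where the similarity measures of the case-based method become necessary.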

# Reference: Literature

### Books

__Used extensively__

* Recommender Systems: An Introduction
* Machine Learning Applications in Recommender Systems

__General understanding__

* Building a Recommendation System with R
* Probabilistic Approaches to Recommendations
* Recommender Systems: A Handbook



### Additional Sources

__Projects__

* [recommendation system python notebooks](https://github.com/amitkaps/recommendation)
* [discussion of distance metrics](http://www.benfrederickson.com/distance-metrics/)
* [Surprise Scikit project for recommender systems](http://surpriselib.com/)
* [Entree System source data](https://github.com/KinGoverm/ai/tree/master/dataset/entree)
* [Entree System project information](https://kdd.ics.uci.edu/databases/entree/entree.data.html)

__Papers__

* [papers and articles on recommender systems](https://github.com/daicoolb/RecommenderSystem-Paper)
* [overview of recommender systems](http://www.cis.upenn.edu/~ungar/CF/)
* [wikipedia: recommendation systems](https://en.wikipedia.org/wiki/Recommender_system)
* [wikipedia: knowledge-based](https://en.wikipedia.org/wiki/Knowledge-based_recommender_system)