# pypbl
## Preference-based learning with Python

## What is preference-based learning?

Suppose we know a set of preferences obtained from pairwise comparisons:

* **item a** ≻ **item b**
* **item b** ≻ **item d**

What can we say about **item c**?
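
One way to make this question concrete is to assume that each item is described by numeric features and that preferences follow a weighted-sum utility. The sketch below is illustrative only: the feature values are made up, the linear utility is an assumption, and the coarse grid search stands in for proper statistical inference. It keeps every candidate weight vector that agrees with the two known preferences, then checks how often **item c** would beat **item b** under those weights.

```python
import itertools

# Hypothetical items, each described by two made-up feature values.
items = {
    "a": (0.9, 0.2),
    "b": (0.6, 0.4),
    "c": (0.7, 0.9),
    "d": (0.1, 0.5),
}

# Known strict preferences as (preferred, other) pairs.
preferences = [("a", "b"), ("b", "d")]


def utility(weights, features):
    """Weighted-sum utility (an assumption made for this sketch)."""
    return sum(w * x for w, x in zip(weights, features))


def consistent(weights):
    """True if the weights reproduce every known preference."""
    return all(
        utility(weights, items[p]) > utility(weights, items[q])
        for p, q in preferences
    )


# Candidate weight vectors on a coarse grid over [-1, 1] x [-1, 1].
grid = [i / 10 for i in range(-10, 11)]
candidates = [w for w in itertools.product(grid, repeat=2) if consistent(w)]

# Fraction of consistent weight vectors under which c is preferred to b.
c_over_b = sum(
    utility(w, items["c"]) > utility(w, items["b"]) for w in candidates
) / len(candidates)

print(f"{len(candidates)} consistent weight vectors")
print(f"c preferred to b under {c_over_b:.0%} of them")
```

The more comparisons we collect, the smaller the set of consistent weights becomes, and the more confidently unseen items can be placed in the ordering.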


## Use cases

### Recommender system
We wish to suggest new items that are of interest based on an individual's personal preferences.

For example, recommending a car make and model based on subjective comparison of different car attributes.

### Decision support
A decision maker has to weigh multiple objectives when making a decision. Preference-based learning can be used to learn how the decision maker trades off different objectives or KPIs.

For example, designing air traffic avoidance algorithms based on subject matter expertise.

### Parameter inference
We wish to infer parameters of a model with only binary information.

For example, rating Xbox player skill level based on historical wins and losses between players.
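
To illustrate how a numeric skill parameter can be recovered from nothing but win/loss outcomes, here is a minimal Elo-style update. This is a generic sketch, not the method used by pypbl or by any particular gaming platform; the player names, starting rating, and `k` factor are all hypothetical.

```python
def expected_score(rating_a, rating_b):
    """Probability that player A beats player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update(ratings, winner, loser, k=32):
    """Shift rating points from loser to winner after one match."""
    surprise = 1.0 - expected_score(ratings[winner], ratings[loser])
    ratings[winner] += k * surprise
    ratings[loser] -= k * surprise


# Hypothetical players, all starting from the same rating.
ratings = {"alice": 1500.0, "bob": 1500.0, "carol": 1500.0}

# Binary observations only: (winner, loser).
matches = [("alice", "bob"), ("alice", "carol"), ("bob", "carol")]

for winner, loser in matches:
    update(ratings, winner, loser)

for player, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{player}: {rating:.1f}")
```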

## Cars example

In [1]:
import pandas as pd
from pypbl.priors import Normal, Exponential
from pypbl.elicitation import BayesPreference

In [2]:
data = pd.read_csv('../examples/data/mtcars.csv')
data.set_index('model', inplace=True)
data.head(10)

| model | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Mazda RX4 | 21.0 | 6 | 160.0 | 110 | 3.9 | 2.62 | 16.46 | 0 | 1 | 4 | 4 |
| Mazda RX4 Wag | 21.0 | 6 | 160.0 | 110 | 3.9 | 2.875 | 17.02 | 0 | 1 | 4 | 4 |
| Datsun 710 | 22.8 | 4 | 108.0 | 93 | 3.85 | 2.32 | 18.61 | 1 | 1 | 4 | 1 |
| Hornet 4 Drive | 21.4 | 6 | 258.0 | 110 | 3.08 | 3.215 | 19.44 | 1 | 0 | 3 | 1 |
| Hornet Sportabout | 18.7 | 8 | 360.0 | 175 | 3.15 | 3.44 | 17.02 | 0 | 0 | 3 | 2 |
| Valiant | 18.1 | 6 | 225.0 | 105 | 2.76 | 3.46 | 20.22 | 1 | 0 | 3 | 1 |
| Duster 360 | 14.3 | 8 | 360.0 | 245 | 3.21 | 3.57 | 15.84 | 0 | 0 | 3 | 4 |
| Merc 240D | 24.4 | 4 | 146.7 | 62 | 3.69 | 3.19 | 20.0 | 1 | 0 | 4 | 2 |
| Merc 230 | 22.8 | 4 | 140.8 | 95 | 3.92 | 3.15 | 22.9 | 1 | 0 | 4 | 2 |
| Merc 280 | 19.2 | 6 | 167.6 | 123 | 3.92 | 3.44 | 18.3 | 1 | 0 | 4 | 4 |


In [3]:
# instantiate based on the cars data set and add priors for each parameter
p = BayesPreference(data=data)
p.set_priors([
    Exponential(1),  # MPG - high miles per gallon is preferred
    Normal(),        # number of cylinders
    Normal(),        # displacement
    Exponential(2),  # horsepower - high horsepower is preferred
    Normal(),        # rear axle ratio
    Normal(),        # weight
    Exponential(-3), # quarter mile time - low time (high acceleration) is preferred
    Normal(),        # engine type
    Normal(),        # transmission type
    Normal(),        # number of gears
    Normal()         # number of carburetors
])
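
The priors encode what we believe about the sign of each weight before any comparison is made. The sketch below illustrates that idea with the standard library only. It does not use pypbl, and it assumes (based on the comments in the cell above) that a positive `Exponential` rate restricts a weight to positive values, a negative rate restricts it to negative values, and `Normal()` allows either sign. The helper names are hypothetical.

```python
import random


def sample_exponential(rate, rng):
    """Draw from an exponential prior; the sign of `rate` picks the side.

    Assumption for this sketch: a negative rate mirrors the distribution
    onto negative weights.
    """
    draw = rng.expovariate(abs(rate))
    return draw if rate > 0 else -draw


def sample_normal(rng, mu=0.0, sigma=1.0):
    """Draw from a normal prior, which allows weights of either sign."""
    return rng.gauss(mu, sigma)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

mpg_weights = [sample_exponential(1, rng) for _ in range(1000)]
qsec_weights = [sample_exponential(-3, rng) for _ in range(1000)]
cyl_weights = [sample_normal(rng) for _ in range(1000)]

print("mpg weights  >= 0:", all(w >= 0 for w in mpg_weights))
print("qsec weights <= 0:", all(w <= 0 for w in qsec_weights))
print("cyl weights take both signs:",
      min(cyl_weights) < 0 < max(cyl_weights))
```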

In [4]:
# add some preferences and infer the weights for each parameter
p.add_strict_preference('Pontiac Firebird', 'Fiat 128')
p.add_strict_preference('Mazda RX4', 'Mazda RX4 Wag')
p.add_indifferent_preference('Merc 280', 'Merc 280C')

In [5]:
p.infer_weights(method='mean')

for a, b in zip(data.columns.values.tolist(), p.weights.tolist()):
    print('{}: {}'.format(a, b))

mpg: 0.27852693540675383
cyl: 0.08272223276383758
disp: 0.07007245887358331
hp: 0.24794371519982736
drat: 0.10353548831998018
wt: -0.48303527363928866
qsec: -0.772206194376317
vs: -0.15284689007178323
am: -0.34280443907059455
gear: -0.3038784436505787
carb: 0.03569086077650192


In [6]:
# rank all the items and highlight the top five
p.rank().head(5)

| model | utility |
|---|---|
| Camaro Z28 | 0.025455 |
| Duster 360 | 0.008062 |
| Hornet Sportabout | -0.106598 |
| Pontiac Firebird | -0.149798 |
| Dodge Challenger | -0.188468 |
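
To give a feel for how inferred weights can turn into a ranking, the sketch below scores three cars from the table above using rounded values of the `mpg`, `hp`, and `qsec` weights printed earlier. It assumes a weighted sum over min-max normalized features, which may differ from what pypbl does internally, so the numbers are not expected to reproduce the utilities shown above.

```python
# Three rows and three columns taken from the data shown earlier.
cars = {
    "Mazda RX4": {"mpg": 21.0, "hp": 110, "qsec": 16.46},
    "Datsun 710": {"mpg": 22.8, "hp": 93, "qsec": 18.61},
    "Duster 360": {"mpg": 14.3, "hp": 245, "qsec": 15.84},
}

# Rounded weights from the inference output above.
weights = {"mpg": 0.279, "hp": 0.248, "qsec": -0.772}


def min_max_normalize(rows):
    """Rescale every column to [0, 1] (an assumption of this sketch)."""
    columns = list(next(iter(rows.values())))
    lows = {c: min(r[c] for r in rows.values()) for c in columns}
    highs = {c: max(r[c] for r in rows.values()) for c in columns}
    return {
        name: {c: (r[c] - lows[c]) / (highs[c] - lows[c]) for c in columns}
        for name, r in rows.items()
    }


normalized = min_max_normalize(cars)

# Utility of each car as the weighted sum of its normalized features.
utilities = {
    name: sum(weights[c] * value for c, value in features.items())
    for name, features in normalized.items()
}

# Highest utility first.
ranking = sorted(utilities.items(), key=lambda kv: kv[1], reverse=True)
for name, value in ranking:
    print(f"{name}: {value:.3f}")
```

With these weights, the strongly negative `qsec` weight rewards a short quarter mile time, which is why fast cars tend to rise to the top.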


In [7]:
# suggest a new pair based on maximum variance
p.suggest(method='max_variance')

('Camaro Z28', 'Maserati Bora')
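
The idea behind a maximum-variance suggestion can be sketched as follows: given samples of plausible weight vectors, pick the pair of items whose utility difference varies most across those samples, since that is the comparison we are least sure about. This is one plausible reading of `max_variance`, not a description of pypbl's implementation; the items and the sampled weights below are hypothetical.

```python
import itertools
import random
import statistics

# Hypothetical items with two made-up features each.
items = {
    "A": (0.0, 0.0),
    "B": (0.1, 0.1),
    "C": (1.0, 1.0),
}

# Stand-in for posterior samples of the weights (seeded for repeatability).
rng = random.Random(42)
weight_samples = [
    (rng.gauss(0.5, 0.2), rng.gauss(-0.3, 0.2)) for _ in range(500)
]


def utility(weights, features):
    """Weighted-sum utility (an assumption made for this sketch)."""
    return sum(w * x for w, x in zip(weights, features))


def difference_variance(first, second):
    """Variance of the utility difference across all weight samples."""
    diffs = [
        utility(w, items[first]) - utility(w, items[second])
        for w in weight_samples
    ]
    return statistics.pvariance(diffs)


# Suggest the pair whose outcome is most uncertain.
suggested = max(
    itertools.combinations(items, 2),
    key=lambda pair: difference_variance(*pair),
)
print(suggested)
```

Asking the user about the most uncertain pair tends to shrink the set of plausible weights faster than asking about a pair whose outcome is already predictable.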