In [1]:
import os
import sys

# Add the project's src directory to the import path so the framework modules can be imported
module_path = os.path.abspath('../../src')
if module_path not in sys.path:
    sys.path.append(module_path)
    
import pandas as pd

# Critiquing

At the heart of a critiquing recommender lie **critiques**. Given an **anchor** (or reference) item, the user asks for a recommendation that is based on that item **but** differs in one or more **aspects**. This feedback is what is called a **critique**.

In [4]:
from framework.property import NumericalProperty, CategoricalProperty, BooleanProperty
from framework.parser.pandas_parser import parse_dataframe_as_items

data = pd.read_csv("../../../Datasets/Temp/train.csv")
data = data.set_index("Id")
sample = data.sample(50)

mapping = {
    "YrSold": NumericalProperty,
    "LotFrontage": NumericalProperty,
    "PoolArea": NumericalProperty,
    "Fireplaces": NumericalProperty
}

items = parse_dataframe_as_items(sample, mapping)

Each parsed item looks like this:

In [3]:
item = items[0]
print(item)

<id:439, properties:{'YrSold': '2007.0', 'LotFrontage': '40.0', 'PoolArea': '0.0', 'Fireplaces': '1.0'}>


## Unit critique types
**Unit critiques** form the basis of all other critique types. A unit critique targets a single property and states how that property needs to change. Every currently implemented type is documented below with an example.

Currently, unit critique types only support *enums* as attributes. This mainly facilitates the automated evaluation of a critiquing recommender: the evaluation needs to be able to create every possible critique, which would be impossible if the parameter range were not finite.
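The following is a minimal sketch of why enum-valued parameters keep the critique space finite. It does not use the framework; `Direction` and `all_critiques` are hypothetical names introduced only for illustration, and the property names are taken from the mapping above.

```python
from enum import Enum
from itertools import product


class Direction(Enum):
    """Hypothetical enum parameter of a critique type (illustration only)."""
    SMALLER = "smaller"
    GREATER = "greater"


def all_critiques(property_names, parameter_enum):
    """Return every (property, parameter) pair that could be critiqued."""
    # Iterating an Enum class yields its members, so the product is finite.
    return list(product(property_names, parameter_enum))


properties = ["YrSold", "LotFrontage", "PoolArea", "Fireplaces"]
critiques = all_critiques(properties, Direction)
print(len(critiques))  # 4 properties x 2 directions = 8
```

With a free numeric parameter (for example "at least x more") there would be no such finite list to iterate over during evaluation.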

### NotCritique

<!-- TODO later add using similar items, not just all items -->
For example, let's take an attribute whose value is shared by many other items:

In [4]:
attr_freq = {}
for attr_name, attr_val in item.iter_props():
    attr_freq[attr_name] = 0
    # Check how often the exact same attribute value occurs in the dataset
    for other_item in items:
        if item != other_item:
            if other_item.get_prop(attr_name) == attr_val:
                attr_freq[attr_name] += 1
                
attr_freq = {k: v for k, v in attr_freq.items() if v > 0}
print(attr_freq)

{'YrSold': 11, 'PoolArea': 48, 'Fireplaces': 22}


Now pick a random attribute from those whose value also occurs in other items:

In [5]:
import random as rng
attr = rng.choice(list(attr_freq))
attr_count = attr_freq[attr]
print(attr, ":", attr_count)

Fireplaces : 22


When a `NotCritique` is created for this attribute value and incorporated into the feedback loop, items with the identical attribute value are no longer shown:

In [6]:
from framework.critique.unit import NotCritique
crit = NotCritique(item[attr])

items_after_crit = list(other_item for other_item in items if other_item.id != item.id and crit.passes_critique(other_item[attr]))
print(len(items_after_crit))

27
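The result is consistent with the frequency count: 22 of the 49 other items share the anchor's `Fireplaces` value, so 27 remain. To make the filtering rule explicit, here is a stand-alone sketch that assumes `passes_critique` simply tests for inequality. `SketchNotCritique` is a hypothetical stand-in, not the framework's `NotCritique`, and the values are made up.

```python
class SketchNotCritique:
    """Illustrative stand-in; the framework's NotCritique may differ."""

    def __init__(self, rejected_value):
        self.rejected_value = rejected_value

    def passes_critique(self, value):
        # A candidate passes only if it differs from the rejected value.
        return value != self.rejected_value


fireplaces = [1.0, 0.0, 1.0, 2.0, 0.0]  # made-up candidate values
sketch_crit = SketchNotCritique(1.0)    # anchor has one fireplace
remaining = [v for v in fireplaces if sketch_crit.passes_critique(v)]
print(remaining)  # [0.0, 2.0, 0.0]
```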


### DirectionalCritique

A **directional critique** can be applied to any property type that supports ordering comparisons, not just equality. It critiques a property so that all subsequent recommendations must have either a **larger** or a **smaller** value for that property.

The example uses a property for which other items exist with both smaller and larger values. First, count those items for each property:

In [7]:
prop_distr = {}
for prop_name, prop_val in item.iter_props():
    prop_distr[prop_name] = {"smaller": 0, "larger": 0}
    for other_item in items:
        if other_item != item:
            if other_item[prop_name] < item[prop_name]:
                prop_distr[prop_name]["smaller"] += 1
            elif other_item[prop_name] > item[prop_name]:
                prop_distr[prop_name]["larger"] += 1
print(prop_distr)

{'YrSold': {'smaller': 11, 'larger': 27}, 'LotFrontage': {'smaller': 3, 'larger': 38}, 'PoolArea': {'smaller': 0, 'larger': 1}, 'Fireplaces': {'smaller': 23, 'larger': 4}}


Next, randomly pick a property for which both smaller and larger values occur:

In [8]:
both_prop_distr = list(k for k,v in prop_distr.items() if v["smaller"] > 0 and v["larger"] > 0)
chosen_prop = rng.choice(both_prop_distr)
print(chosen_prop)

YrSold


When a directional critique is made for each direction, the number of items that pass each critique equals the corresponding count shown above:

In [10]:
from framework.critique.unit.directional_critique import DirectionalCritique, DirectionalCritiqueDirections


crit_smaller = DirectionalCritique(item[chosen_prop], DirectionalCritiqueDirections.SMALLER)
crit_larger = DirectionalCritique(item[chosen_prop], DirectionalCritiqueDirections.GREATER)

crit_items_smaller = list(other_item for other_item in items if item != other_item and crit_smaller.passes_critique(other_item[chosen_prop]))
crit_items_larger = list(other_item for other_item in items if item != other_item and crit_larger.passes_critique(other_item[chosen_prop]))

print("Smaller: {}, larger: {}".format(len(crit_items_smaller), len(crit_items_larger)))

Smaller: 11, larger: 27
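Note that 11 + 27 = 38, not 49: the remaining 11 items share the anchor's `YrSold` value and pass neither critique, which matches the earlier frequency count. The sketch below illustrates this with strict comparisons. `SketchDirection` and `SketchDirectionalCritique` are hypothetical stand-ins written for this example, not the framework's classes, and the values are made up.

```python
from enum import Enum


class SketchDirection(Enum):
    """Hypothetical direction enum (illustration only)."""
    SMALLER = "smaller"
    GREATER = "greater"


class SketchDirectionalCritique:
    """Illustrative stand-in; the framework's DirectionalCritique may differ."""

    def __init__(self, anchor_value, direction):
        self.anchor_value = anchor_value
        self.direction = direction

    def passes_critique(self, value):
        # Strict comparison: values equal to the anchor pass neither direction.
        if self.direction is SketchDirection.SMALLER:
            return value < self.anchor_value
        return value > self.anchor_value


years = [2006, 2007, 2007, 2008, 2010]  # made-up candidate values
smaller = SketchDirectionalCritique(2007, SketchDirection.SMALLER)
greater = SketchDirectionalCritique(2007, SketchDirection.GREATER)
n_smaller = sum(smaller.passes_critique(y) for y in years)
n_greater = sum(greater.passes_critique(y) for y in years)
print(n_smaller, n_greater)  # 1 2
```

The two candidates equal to the anchor value (2007) are excluded by both critiques, just as the 11 equal items are in the run above.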
