In [1]:
import pandas as pd
import pd_explain
import warnings
warnings.filterwarnings("ignore")



# Runtime disclaimer

Since this is a proof of concept, the recommender is not optimized for performance.\
Running the `recommend` method may take a long time in this notebook (from 20 seconds to just over a minute).

# Basic Usage

The query recommender can automatically recommend queries to the user based on the data they provide.\
These queries are the ones that are most likely to be of interest to the user.

In [2]:
# Load the adult dataset
adults = pd.read_csv(r"..\Examples\Datasets\adult.csv")
adults.head()

Unnamed: 0,age,workclass,fnlwgt,education,education-num,marital-status,occupation,relationship,race,sex,capital-gain,capital-loss,hours-per-week,native-country,label
0,39,State-gov,77516,Bachelors,13,Never-married,Adm-clerical,Not-in-family,White,Male,2174,0,40,United-States,<=50K
1,50,Self-emp-not-inc,83311,Bachelors,13,Married-civ-spouse,Exec-managerial,Husband,White,Male,0,0,13,United-States,<=50K
2,38,Private,215646,HS-grad,9,Divorced,Handlers-cleaners,Not-in-family,White,Male,0,0,40,United-States,<=50K
3,53,Private,234721,11th,7,Married-civ-spouse,Handlers-cleaners,Husband,Black,Male,0,0,40,United-States,<=50K
4,28,Private,338409,Bachelors,13,Married-civ-spouse,Prof-specialty,Wife,Black,Female,0,0,40,Cuba,<=50K


The recommender automatically selects attributes that are likely to generate interesting queries,
then generates queries based on these attributes.\
\
Currently, automatic attribute selection uses a single correlation-based metric, which scores an attribute higher the more correlated it is with the other attributes. This can be extended to include other metrics.\
\
Currently, only filter queries are supported, but the recommender can be extended to support other types of queries.\
\
Each type of query has its own recommender object, and calling the `recommend` method on a dataframe runs all of the enabled recommenders (more on how to enable and disable recommenders later).\
Each type of recommendation is shown in its own tab.
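
To make the correlation-based selection concrete, here is a minimal sketch under stated assumptions. The exact metric used by the recommender is not described in this notebook, so the sketch assumes the score is the mean absolute Pearson correlation of a numeric attribute with every other attribute. `pearson` and `rank_attributes` are hypothetical helpers, not part of `pd_explain`, and categorical attributes would need to be encoded first.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no correlation signal
    return cov / sqrt(var_x * var_y)


def rank_attributes(columns, top_k):
    """Return the top_k column names by mean absolute correlation (assumed metric)."""
    scores = {}
    for name, values in columns.items():
        others = [v for other, v in columns.items() if other != name]
        if not others:
            scores[name] = 0.0  # a single column has nothing to correlate with
            continue
        scores[name] = sum(abs(pearson(values, v)) for v in others) / len(others)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy data for illustration only; these values are not taken from the adult dataset.
toy = {
    "age": [25, 32, 47, 51, 62],
    "education-num": [9, 10, 13, 13, 16],
    "hours-per-week": [40, 20, 45, 38, 40],
}
top_attributes = rank_attributes(toy, top_k=2)
print(top_attributes)
```

In this toy data, `age` and `education-num` move together closely, so they outrank `hours-per-week`.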

In [3]:
# Recommend queries
adults.recommend()

HTML(value='\n                <style>\n                .jupyter-widgets.widget-tab > .p-TabBar {\n            …

HTML(value='\n        <style>\n        .jupyter-widgets.widget-tab > .p-TabBar .p-TabBar-tab {\n            fl…

Tab(children=(Tab(children=(Tab(children=(Output(),), selected_index=0, titles=('label == >50K',)), Tab(childr…

# Advanced Usage

## Configuration

The recommender can be configured on both a global and per-recommender basis.
First, we will show how to configure the global settings.

In [4]:
from pd_explain.recommenders import (
    get_global_recommender_config_settings,
    set_global_recommender_config_settings,
    enable_recommenders,
    disable_recommenders,
    get_global_config_info,
)

To get the current global settings, use the `get_global_recommender_config_settings` function.

In [5]:
get_global_recommender_config_settings()

{'Engine settings': {'Enabled recommenders': {'FilterRecommender'},
  'Disabled recommenders': set()},
 'FilterRecommender': {'attributes': None,
  'top_k_attributes': 3,
  'top_k_recommendations': 1,
  'top_k_explanations': 4,
  'num_bins': 10}}

The global settings are currently set to the default values:
- All recommenders are enabled. Right now, there is only one recommender implemented, for recommending filter queries.
- Each recommender has its own default configuration. Each recommender configuration has a `config_info` property that describes its settings, and the `get_global_config_info` function returns the descriptions of all settings at once.

In [6]:
get_global_config_info()

{'Engine settings': {'Enabled recommenders': 'The recommenders that are currently enabled.',
  'Disabled recommenders': 'The recommenders that are currently disabled.'},
 'FilterRecommender': {'num_bins': 'The number of bins to use when binning the data to generate recommendations.The higher the number, the more recommendation candidates will be generated, but it will also be slower to compute.',
  'attributes': 'The attributes to recommend queries for. If None, the recommender will automatically select the attributes.',
  'top_k_attributes': 'The maximum number of attributes to recommend queries for.',
  'top_k_recommendations': 'The maximum number of recommendations to return for each attribute.',
  'top_k_explanations': 'The maximum number of explanations to provide for each recommendation.'}}
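
The `num_bins` description suggests that candidate filters come from binning each attribute. The sketch below illustrates that idea for a numeric attribute, assuming quantile-based cut points; the binning strategy the recommender actually uses is not stated in this notebook. `filter_candidates` is a hypothetical helper, not part of `pd_explain`, and the ages are made-up values.

```python
from statistics import quantiles


def filter_candidates(attribute, values, num_bins):
    """Build one '<=' filter per bin edge, assuming quantile-based binning."""
    # quantiles() returns num_bins - 1 cut points that split the data into num_bins groups.
    cut_points = quantiles(values, n=num_bins)
    # Repeated values in the data can produce duplicate cut points, so keep unique ones.
    unique_cuts = sorted(set(cut_points))
    return [f"{attribute} <= {cut}" for cut in unique_cuts]


ages = [19, 22, 25, 31, 38, 39, 45, 50, 53, 61]
candidates = filter_candidates("age", ages, num_bins=4)
for candidate in candidates:
    print(candidate)
```

More bins give more cut points and therefore more candidate filters to evaluate, which matches the trade-off between coverage and runtime mentioned in the setting's description.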

We can modify the global settings using the `set_global_recommender_config_settings` function.

In [7]:
set_global_recommender_config_settings({'FilterRecommender': {'top_k_recommendations': 3, 'top_k_attributes': 2}}, apply_to_existing_recommenders=True)

The above command sets `top_k_recommendations` to 3 and `top_k_attributes` to 2 for the `FilterRecommender`. All other settings remain the same.\
The parameter `apply_to_existing_recommenders` determines whether the changes should be applied to existing recommender objects. It is set to `True` by default. Setting it to `False` will only apply the changes to new recommender objects.\
We can see the changes by getting the global settings again, and by running the `recommend` method again.
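
The partial update described above behaves like a nested dictionary merge: only the keys that are passed in change. The sketch below shows one way such a merge can work; `merge_settings` is a hypothetical helper written for illustration, not the library's implementation.

```python
from copy import deepcopy


def merge_settings(current, updates):
    """Return a copy of `current` with `updates` applied key by key."""
    merged = deepcopy(current)
    for key, value in updates.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Recurse so that sibling keys in the nested dict are preserved.
            merged[key] = merge_settings(merged[key], value)
        else:
            merged[key] = value
    return merged


defaults = {
    "FilterRecommender": {
        "attributes": None,
        "top_k_attributes": 3,
        "top_k_recommendations": 1,
        "top_k_explanations": 4,
        "num_bins": 10,
    }
}
updated = merge_settings(
    defaults,
    {"FilterRecommender": {"top_k_recommendations": 3, "top_k_attributes": 2}},
)
print(updated["FilterRecommender"])
```

Because the merge works on a deep copy, `defaults` itself is left unchanged.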

In [8]:
get_global_recommender_config_settings()

{'Engine settings': {'Enabled recommenders': {'FilterRecommender'},
  'Disabled recommenders': set()},
 'FilterRecommender': {'attributes': None,
  'top_k_attributes': 2,
  'top_k_recommendations': 3,
  'top_k_explanations': 4,
  'num_bins': 10}}

In [9]:
adults.recommend()

HTML(value='\n                <style>\n                .jupyter-widgets.widget-tab > .p-TabBar {\n            …

HTML(value='\n        <style>\n        .jupyter-widgets.widget-tab > .p-TabBar .p-TabBar-tab {\n            fl…

Tab(children=(Tab(children=(Tab(children=(Output(), Output()), selected_index=0, titles=('label == <=50K', 'la…

Likewise, we can enable and disable recommenders globally using the `enable_recommenders` and `disable_recommenders` functions.

In [10]:
disable_recommenders(['FilterRecommender'])

In [11]:
adults.recommend()

HTML(value='No recommenders enabled. Unable to provide recommendations.')

In [12]:
enable_recommenders(['FilterRecommender'])

The recommender can also be configured locally, on the recommender object of an individual dataframe.

In [13]:
adults.recommender.recommender_configurations

{'Enabled recommenders': ['FilterRecommender'],
 'Disabled recommenders': [],
 'FilterRecommender': {'attributes': None,
  'top_k_attributes': 2,
  'top_k_recommendations': 3,
  'top_k_explanations': 4,
  'num_bins': 10}}

In [14]:
adults.recommender.recommender_configurations = {'FilterRecommender': {'top_k_recommendations': 2}}

In [15]:
adults.recommender.recommender_configurations

{'Enabled recommenders': ['FilterRecommender'],
 'Disabled recommenders': [],
 'FilterRecommender': {'attributes': None,
  'top_k_attributes': 2,
  'top_k_recommendations': 2,
  'top_k_explanations': 4,
  'num_bins': 10}}

Unlike a change to the global settings, this affects only this dataframe's recommender object. The global settings and all other recommender objects are unchanged.
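
The interaction between global settings, local settings, and `apply_to_existing_recommenders` can be pictured with a toy model. Everything in the sketch below (`ToyRecommender`, `set_global`, `apply_to_existing`) is a hypothetical stand-in meant to mirror the behavior described in this notebook; it is not how `pd_explain` is implemented.

```python
class ToyRecommender:
    """Hypothetical stand-in for a recommender; not the pd_explain class."""

    global_config = {"top_k_recommendations": 1}
    instances = []  # every recommender object created so far

    def __init__(self):
        # A new object starts from a copy of the current global settings.
        self.config = dict(ToyRecommender.global_config)
        ToyRecommender.instances.append(self)


def set_global(updates, apply_to_existing=True):
    """Update the global settings, optionally pushing them to existing objects."""
    ToyRecommender.global_config.update(updates)
    if apply_to_existing:
        for recommender in ToyRecommender.instances:
            recommender.config.update(updates)


first = ToyRecommender()
set_global({"top_k_recommendations": 3})    # also reaches `first`
first.config["top_k_recommendations"] = 2   # local change, affects `first` only
second = ToyRecommender()                   # copies the global value
set_global({"top_k_recommendations": 5}, apply_to_existing=False)
third = ToyRecommender()                    # only new objects see the last change

for name, rec in [("first", first), ("second", second), ("third", third)]:
    print(name, rec.config["top_k_recommendations"])
```

The design choice this illustrates is that each object holds its own copy of the settings, so a local edit never leaks into the global defaults.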

## Selecting Attributes

The user can specify which attributes they are interested in, instead of letting the recommender select them automatically.
There are two ways to do this:

The first is to specify the attributes when calling the `recommend` method.
This will only affect the current call, and will apply to all of the enabled recommenders.

In [16]:
adults.recommend(attributes=['age', 'education'])

HTML(value='\n                <style>\n                .jupyter-widgets.widget-tab > .p-TabBar {\n            …

HTML(value='\n        <style>\n        .jupyter-widgets.widget-tab > .p-TabBar .p-TabBar-tab {\n            fl…

Tab(children=(Tab(children=(Tab(children=(Output(), Output()), selected_index=0, titles=('age <= 22.0', 'age <…

The second is to use the configuration settings to specify the attributes.\
This can be done globally or locally, and can be done on a per-recommender basis.\
Additionally, this will apply to all future calls to the `recommend` method, unless the attributes are specified in the method call.\
The attributes can be set back to `None` to let the recommender automatically select them.
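
Taken together, the two approaches imply an order of precedence: attributes passed to the call win, then attributes stored in the configuration, and automatic selection is the fallback. The sketch below expresses that order; `resolve_attributes` and `auto_select` are hypothetical names used for illustration and are not part of `pd_explain`.

```python
def resolve_attributes(call_attributes, configured_attributes, auto_select):
    """Pick the attributes to use, following the precedence described above."""
    if call_attributes is not None:
        return list(call_attributes)        # explicit argument to the call
    if configured_attributes is not None:
        return list(configured_attributes)  # value stored in the configuration
    return auto_select()                    # fall back to automatic selection


def auto_select():
    # Placeholder for automatic selection; returns fixed names for the demo.
    return ["label", "relationship"]


from_call = resolve_attributes(["age", "education"], ["fnlwgt", "marital-status"], auto_select)
from_config = resolve_attributes(None, ["fnlwgt", "marital-status"], auto_select)
from_auto = resolve_attributes(None, None, auto_select)
print(from_call, from_config, from_auto)
```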

In [17]:
adults.recommender.recommender_configurations

{'Enabled recommenders': ['FilterRecommender'],
 'Disabled recommenders': [],
 'FilterRecommender': {'attributes': None,
  'top_k_attributes': 2,
  'top_k_recommendations': 2,
  'top_k_explanations': 4,
  'num_bins': 10}}

In [18]:
adults.recommender.recommender_configurations = {'FilterRecommender': {'attributes': ['fnlwgt', 'marital-status']}}

In [19]:
adults.recommender.recommender_configurations

{'Enabled recommenders': ['FilterRecommender'],
 'Disabled recommenders': [],
 'FilterRecommender': {'attributes': ['fnlwgt', 'marital-status'],
  'top_k_attributes': 2,
  'top_k_recommendations': 2,
  'top_k_explanations': 4,
  'num_bins': 10}}

In [20]:
adults.recommend()

HTML(value='\n                <style>\n                .jupyter-widgets.widget-tab > .p-TabBar {\n            …

HTML(value='\n        <style>\n        .jupyter-widgets.widget-tab > .p-TabBar .p-TabBar-tab {\n            fl…

Tab(children=(Tab(children=(Tab(children=(Output(), Output()), selected_index=0, titles=('fnlwgt <= 65738.2', …