# Customizing Lens

Lens strives to give you sensible defaults and to run the proper assessments automatically whenever possible. However, you will often want to change, select, or extend its functionality. Lens is intended to be an _extensible framework_ that can accommodate your own analysis plan.

In this document we will describe a number of ways you can customize Lens:
* Setting up CredoModels
* Selecting which assessments to run
* Parameterizing assessments
* Incorporating new metrics

### Find the code
This notebook can be found on [GitHub](https://github.com/credo-ai/credoai_lens/blob/develop/docs/notebooks/lens_customization.ipynb).

### Imports & Setup

In [2]:
# model and df are defined by this script
%run training_script.py
import credoai.lens as cl

In [4]:
# set up model and data artifacts
credo_model = cl.CredoModel(name='credit_default_classifier',
                            model=model)

credo_data = cl.CredoData(name='UCI-credit-default',
                          data=df,
                          sensitive_feature_keys=['SEX'],
                          label_key='target'
                          )

# specify the metrics that will be used by the Fairness and Performance assessment
metrics = ['precision_score', 'recall_score', 'equal_opportunity']
assessment_plan = {'Fairness': {'metrics': metrics},
                   'Performance': {'metrics': metrics}}

## Setting up CredoModel

The first step in using Lens is creating a CredoModel. Most kinds of models can be wrapped in a CredoModel for assessment. The setup cell above shows the basic pattern: pass a name and a trained model to `cl.CredoModel`.

## Selecting Assessments
Lens has a number of assessments available, each of which works with different kinds of models or datasets. By default, Lens automatically runs every assessment whose prerequisites are met. However, you can instead specify a list of assessments, and Lens will only use those.

In [19]:
from credoai.assessment import *
lens = cl.Lens(model=credo_model,
               data=credo_data,
               assessment_plan=assessment_plan,
               assessments=[FairnessAssessment, # <- new argument
                            DatasetFairnessAssessment] 
              )

Even when you select assessments, Lens only runs the ones whose requirements your model and data meet. In other words, the `assessments` argument sets which assessments Lens has access to, not which ones it is guaranteed to run.

## Parameterizing Assessments

Now that we can select assessments, how about customizing them? There are two places where assessments can be customized:
1. when their underlying module is initialized
2. when they are run (which runs the underlying module).

### Customizing initialization
To customize the module at initialization we use something we've already seen: the `assessment_plan`! The parameters that can be passed at this stage are the same parameters passed to the assessment's `init_module` function.

The only one we've seen so far is the "metrics" argument that can be passed to the PerformanceAssessment and FairnessAssessment, but other assessments may be parameterized in different ways.

In [14]:
assessment = PerformanceAssessment()
assessment.init_module?

Signature: assessment.init_module(*, model, data, metrics=None, ignore_sensitive=True)
Docstring:
Initializes the performance module

Parameters
------------
model : CredoModel, optional
data : CredoData, optional
metrics : List-like
    list of metric names as string or list of Metrics (credoai.metrics.Metric).
    Metric strings should in list returned by credoai.metrics.list_metrics.
    Note for performance parity metrics like 
    "false negative rate parity" just list "false negative rate". Parity metrics
    are calculated automatically if the performance metric is supplied
ignore_sensitive : bool
    Whether to ignore the sensitive_feature of CredoData (thus preventing calculation
    of disaggregated performance). Generally used when Lens is also runnin

### Customizing how assessments are run

The other way to parameterize assessments is to pass arguments to each assessment's `run` function. These kwargs are passed to `lens.run_assessments`, which in turn passes them to the assessment's initialized module.

For instance, the `Fairness` assessment initializes `mod.Fairness`, whose `run` function takes a `method` parameter that controls how fairness scores are calculated. The default is "between_groups", but we can change it like so:

In [24]:
run_kwargs = {'Fairness': {'method': 'to_overall'}}
lens.run_assessments(assessment_kwargs = run_kwargs)

INFO:absl:Running assessment-DatasetFairness
INFO:absl:Running assessment-Fairness


<credoai.lens.Lens at 0x29a46fd60>
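To make the two `method` options more concrete, here is a minimal, self-contained sketch of what "between_groups" and "to_overall" could mean for a single metric (recall). It assumes the options follow the semantics of Fairlearn's `MetricFrame.difference`, which uses the same option names; whether Lens delegates to it unchanged is an assumption. The helper names `recall` and `recall_disparity` are hypothetical and not part of Lens.

```python
from collections import defaultdict


def recall(y_true, y_pred):
    """Fraction of actual positives that were predicted positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == 1]
    return sum(preds_for_positives) / len(preds_for_positives)


def recall_disparity(y_true, y_pred, groups, method="between_groups"):
    """Hypothetical helper: summarize recall differences across groups."""
    # Split labels and predictions by sensitive-feature group.
    by_group = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        by_group[g][0].append(t)
        by_group[g][1].append(p)
    scores = {g: recall(ts, ps) for g, (ts, ps) in by_group.items()}

    if method == "between_groups":
        # Largest gap between any two groups.
        return max(scores.values()) - min(scores.values())
    if method == "to_overall":
        # Largest gap between any group and the whole population.
        overall = recall(y_true, y_pred)
        return max(abs(s - overall) for s in scores.values())
    raise ValueError(f"unknown method: {method}")


# Toy data: group A has recall 3/4, group B has recall 1/2, overall is 4/6.
y_true = [1, 1, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 1]
groups = ["A"] * 4 + ["B"] * 4

between = recall_disparity(y_true, y_pred, groups, method="between_groups")
to_overall = recall_disparity(y_true, y_pred, groups, method="to_overall")
print(between, to_overall)
```

Under these assumptions, "to_overall" never reports a larger disparity than "between_groups", because the overall score always lies between the best and worst group scores.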

## Module Specific Customization - Custom Metrics for Fairness Base

Each module has different parameterization options, as discussed above. The FairnessModule takes a set of metrics to calculate on the model and data. Many metrics are supported out of the box and can be referenced by string. However, you can also create custom metrics, which lets you calculate any metric that takes in a `y_true` and some kind of prediction.

Custom metrics are incorporated by creating a `Metric` object. A `Metric` is a lightweight wrapper class that defines a few characteristics of the custom function needed by Lens.

**Example: Confidence Intervals**

We will create custom metrics that report the lower and upper bounds of the 95% confidence interval for the true positive rate.

Confidence intervals are not generically supported. However, they can be computed for metrics derived from a confusion matrix using the Wilson score interval. A convenience function called `confusion_wilson` is supplied, which returns an array of the [lower, upper] bounds for the metric.
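For readers unfamiliar with the Wilson interval, the following is a minimal standalone sketch of the standard formula applied to the true positive rate. It is not the implementation of `confusion_wilson`; the function names `wilson_interval` and `tpr_wilson_bounds` are hypothetical and only the Python standard library is used.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, trials, confidence=0.95):
    """Return (lower, upper) Wilson score bounds for a binomial proportion."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = successes / trials
    denom = 1 + z**2 / trials
    center = (p_hat + z**2 / (2 * trials)) / denom
    half_width = (
        z * sqrt(p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)) / denom
    )
    return center - half_width, center + half_width


def tpr_wilson_bounds(y_true, y_pred, confidence=0.95):
    """Hypothetical helper: Wilson bounds on TPR = TP / (TP + FN)."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    actual_pos = sum(1 for t in y_true if t == 1)
    return wilson_interval(true_pos, actual_pos, confidence)


# Toy data: 10 actual positives, 8 of them predicted correctly (TPR = 0.8).
y_true = [1] * 10 + [0] * 5
y_pred = [1] * 8 + [0] * 2 + [0] * 5
lower, upper = tpr_wilson_bounds(y_true, y_pred)
print(f"95% Wilson interval for TPR: [{lower:.3f}, {upper:.3f}]")
```

With only ten positive examples the interval is wide (roughly 0.49 to 0.94), which is why reporting the bounds alongside the point estimate can be informative for small subgroups.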

Wrapping the `confusion_wilson` function in a `Metric` allows us to pass it as a metric to the FairnessModule.

In [None]:
from credoai.metrics.credoai_metrics import confusion_wilson
from credoai.metrics import Metric

# define a wrapper function that returns the lower bound of the true positive rate
def lower_bound_tpr(y_true, y_pred):
    return confusion_wilson(y_true, y_pred, metric='true_positive_rate', confidence=0.95)[0]

# and another that returns the upper bound
def upper_bound_tpr(y_true, y_pred):
    return confusion_wilson(y_true, y_pred, metric='true_positive_rate', confidence=0.95)[1]

# wrap the functions in Metric objects so Lens can use them
lower_metric = Metric(name = 'lower_bound_tpr', 
                      metric_category = "binary_classification",
                      fun = lower_bound_tpr)

upper_metric = Metric(name = 'upper_bound_tpr', 
                      metric_category = "binary_classification",
                      fun = upper_bound_tpr)

In [None]:
from copy import deepcopy

# Note: Remember to instantiate the assessment!
selected_assessments = [FairnessAssessment()]
# copy the assessment plan defined during setup so the original is not modified
init_kwargs = deepcopy(assessment_plan)
# custom Metric objects and metric name strings can be mixed in one list
init_kwargs['Fairness']['metrics'] = [lower_metric, 'tpr', upper_metric]
custom_lens = cl.Lens(model=credo_model,
                      data=credo_data,
                      assessments=selected_assessments,
                      assessment_plan=init_kwargs)
custom_lens.run_assessments().get_results()

## Understanding assessment requirements

An assessment's prerequisites can be queried by calling its `get_requirements` function. These indicate the set of features or functions your model and data must provide in order for the assessment to be run.

Requirements are listed separately for the model and the data. Each requirement is either a single requirement or a tuple. If it is a tuple, only one of the requirements within the tuple must be met. For instance, the FairnessAssessment needs *either* `predict_proba` OR `predict`. See `credoai.assessment.credo_assessment.AssessmentRequirements` for more.

In [4]:
from credoai.assessment import FairnessAssessment
assessment = FairnessAssessment()
assessment.get_requirements()

{'model_requirements': [('predict_proba', 'predict')],
 'data_requirements': ['X', 'y', 'sensitive_features']}

You can also get the requirements for all assessments.

In [5]:
from credoai.assessment import get_assessment_requirements
get_assessment_requirements()

{'DatasetFairnessAssessment': {'model_requirements': [],
  'data_requirements': ['X', 'y', 'sensitive_features']},
 'FairnessAssessment': {'model_requirements': [('predict_proba', 'predict')],
  'data_requirements': ['X', 'y', 'sensitive_features']},
 'NLPEmbeddingBiasAssessment': {'model_requirements': ['embedding_fun'],
  'data_requirements': []},
 'NLPGeneratorAssessment': {'model_requirements': ['generator_fun'],
  'data_requirements': []},
 'PerformanceAssessment': {'model_requirements': [('predict_proba', 'predict')],
  'data_requirements': ['X', 'y']}}
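The either/or rule for tuple requirements can be sketched in a few lines of plain Python. This is only an illustration of the matching logic described above, not how Lens implements it: the helpers `requirement_met` and `meets_all` and the `PredictOnlyModel` class are hypothetical, and only the model requirements from the output above are checked (data requirements are ignored).

```python
def requirement_met(obj, requirement):
    """A tuple requirement is satisfied if any one of its names is present."""
    options = requirement if isinstance(requirement, tuple) else (requirement,)
    return any(getattr(obj, name, None) is not None for name in options)


def meets_all(obj, requirements):
    """Every entry in the list must be satisfied."""
    return all(requirement_met(obj, req) for req in requirements)


class PredictOnlyModel:
    """Toy model exposing `predict` but not `predict_proba`."""

    def predict(self, X):
        return [0 for _ in X]


# Model requirements copied from the get_assessment_requirements() output.
model_requirements = {
    "DatasetFairnessAssessment": [],
    "FairnessAssessment": [("predict_proba", "predict")],
    "NLPEmbeddingBiasAssessment": ["embedding_fun"],
    "NLPGeneratorAssessment": ["generator_fun"],
    "PerformanceAssessment": [("predict_proba", "predict")],
}

toy_model = PredictOnlyModel()
runnable = [
    name
    for name, reqs in model_requirements.items()
    if meets_all(toy_model, reqs)
]
print(runnable)
```

Here the toy model satisfies the `('predict_proba', 'predict')` tuple through `predict` alone, so the fairness and performance assessments qualify, while the two NLP assessments do not.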

## Creating New Modules & Assessments

WIP section!