# Lens FAQ
This document answers some of the most common functionality questions you may have.

**A note on customization**

Lens strives to give you sensible defaults and automatically do the proper assessments whenever possible. However, there are many times when you'll want to change, select, or extend the functionality. Lens is intended to be an _extensible framework_ that can accommodate your own analysis plan.

**Find the code**

This notebook can be found on [GitHub](https://github.com/credo-ai/credoai_lens/blob/develop/docs/notebooks/lens_faq.ipynb).

**Imports and Setup**

In [1]:
# model and df are defined by this script
%run training_script.py
import credoai.lens as cl

# specify the metrics that will be used by the Fairness and Performance assessment
metrics = ['precision_score', 'recall_score', 'equal_opportunity']
assessment_plan = {'Fairness': {'metrics': metrics},
                   'Performance': {'metrics': metrics}}

## How do I get my model working with Lens?

The first step in using Lens is creating a `CredoModel`. Most kinds of models can be wrapped in a `CredoModel` for assessment. `CredoModel`'s primary attribute is its `config` which is a dictionary reflecting its functionality. This config is what determines how the model will be interacted with, and thus which assessments can be used.

The simplest case is when Lens can infer your model's functionality and use it to define the `config`. Below is an example using a scikit-learn model.

In [2]:
print('frameworks:', cl.CredoModel.supported_frameworks())
credo_model = cl.CredoModel(name='my_model',
                            model=model)
# configuration automatically inferred from model
credo_model.config

frameworks: ('sklearn', 'xgboost')


{'predict': <bound method ForestClassifier.predict of RandomForestClassifier()>,
 'predict_proba': <function credoai.artifacts.CredoModel._sklearn_style_config.<locals>.predict_proba(X)>}

**Defining the config manually**

While this config is inferred when working with a supported model (e.g., sklearn, above), you can also define it manually. For instance, you can define the same CredoModel like below:


In [3]:
config = {'predict_proba': model.predict_proba,
          'predict':  model.predict}
credo_model = cl.CredoModel(name='my_model',
                            model_config=config)

This is a much more generic approach. A CredoModel is just a collection of functions that conform to some function spec. For instance, the "predict" function above must conform to the function signature of the `predict` function used in sklearn models.
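
As a rough illustration of what "conforming to a function spec" means, the sketch below wraps a hand-written rule in sklearn-style callables. `ThresholdRule` and its `score` method are hypothetical names invented for this example. The shapes follow the sklearn convention: `predict` returns one label per row, and `predict_proba` returns one row of class probabilities per sample. Plain lists keep the sketch dependency-free; in practice you would likely return NumPy arrays.

```python
# Hypothetical stand-in for a model that does not follow the sklearn API
class ThresholdRule:
    def __init__(self, threshold):
        self.threshold = threshold

    def score(self, rows):
        # Toy "probability" of the positive class: first feature, clipped to [0, 1]
        return [min(max(row[0], 0.0), 1.0) for row in rows]


rule = ThresholdRule(threshold=0.5)


def predict(X):
    # sklearn-style: one label per row of X
    return [int(p >= rule.threshold) for p in rule.score(X)]


def predict_proba(X):
    # sklearn-style: one [P(class 0), P(class 1)] row per sample
    return [[1.0 - p, p] for p in rule.score(X)]


# Same shape as the config dictionary passed to CredoModel above
config = {'predict': predict, 'predict_proba': predict_proba}

X_demo = [[0.2], [0.9]]
print(config['predict'](X_demo))
print(config['predict_proba'](X_demo))
```

Any object can be adapted this way, as long as the callables accept `X` and return outputs of the expected shape.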

<br>

**Using precomputed values**

A common use case is assessing *pre-computed* predictions. In that case you don't need Lens to perform inference; it can use the predictions you've already generated.

Below is an example of such a case using the same example. Note that the `predict` still needs to take in an `X` variable to maintain the appropriate function signature. In this case, however, X is completely ignored and the predictions are used.

In [4]:
# precomputed predictions
predictions = model.predict(X)
probs = model.predict_proba(X)
# light wrapping
config = {'predict': lambda X: predictions,
          'predict_proba': lambda X: probs}
credo_model = cl.CredoModel(name='my_model_name',
                            model_config=config)
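
Because `X` is ignored, nothing stops the stored predictions from drifting out of step with the data you later assess against. A small guard like the one below can catch that early. `make_precomputed_config` is a hypothetical helper, not part of Lens, and the row-for-row alignment requirement is an assumption that follows from how metrics compare predictions with labels.

```python
def make_precomputed_config(predictions, n_rows):
    """Build a CredoModel-style config that returns stored predictions.

    Hypothetical helper (not part of Lens): it checks that there is one
    stored prediction per data row before wrapping them.
    """
    if len(predictions) != n_rows:
        raise ValueError(
            f"expected {n_rows} predictions, got {len(predictions)}"
        )
    # X is accepted to keep the sklearn-style signature, then ignored
    return {'predict': lambda X: predictions}


stored = [0, 1, 1, 0]
config = make_precomputed_config(stored, n_rows=4)

# Any argument yields the same stored predictions
same = config['predict'](None) is stored
print(same)
```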

## How do I get my datasets working with Lens?

`CredoData` is the dataset equivalent of `CredoModel`. It can be passed to Lens as "data" (the validation data to assess the model against) or as "training_data" (which will not be used to evaluate the model, but will be assessed itself).

CredoData expects a dataframe that includes all the features used to train the model (and to potentially call the model's `predict` function), as well as the target. Optionally, a sensitive feature key can be passed as well.

In [5]:
# set up the data artifact
credo_data = cl.CredoData(name='my_dataset_name',
                          X=X_test,
                          y=y_test,
                          sensitive_features=sensitive_features_test
                          )

## How do I get assessment results from Lens?

Running assessments isn't very helpful if you can't view them! 

* You can get results by calling `lens.get_results()`
* You can visualize results by calling `lens.display_results()`

All results will be dictionaries or pandas objects.

**Note**

If you want to export the assessments to Credo AI's Governance App, check out the [Connecting with Governance App](https://credoai-lens.readthedocs.io/en/latest/notebooks/governance_integration.html) tutorial for directions.


In [6]:
lens = cl.Lens(model=credo_model,
               data=credo_data,
               assessment_plan=assessment_plan).run_assessments()

INFO:absl:Initializing Assessment (DatasetEquity)
INFO:absl:Initializing Assessment (DatasetFairness)
INFO:absl:Initializing Assessment (DatasetProfiling)
INFO:absl:Initializing Assessment (Fairness) with kwargs: {'metrics': ['precision_score', 'recall_score', 'equal_opportunity']}
INFO:absl:Initializing Assessment (ModelEquity)
INFO:absl:fairness metric, equal_opportunity, unused by PerformanceModule
INFO:absl:Initializing Assessment (Performance) with kwargs: {'metrics': ['precision_score', 'recall_score', 'equal_opportunity']}
INFO:absl:Running assessment: DatasetEquity
INFO:absl:Running assessment: DatasetFairness
INFO:absl:Running assessment: DatasetProfiling


Summarize dataset:   0%|          | 0/5 [00:00<?, ?it/s]

INFO:absl:Running assessment: Fairness
INFO:absl:Running assessment: ModelEquity
INFO:absl:Running assessment: Performance


In [7]:
results = lens.get_results()

Results are organized in a hierarchical dictionary. The first level describes the artifacts (data and/or models used for the assessment). The three possible artifacts are:

1. training (for training data)
2. validation (for validation data)
3. model

The first level will have keys like "validation_model" or "training_validation_model", indicating which artifacts each group of assessments was run on. In this case, we didn't pass a training dataset, and no assessments were run on the model alone, so the results cover assessments run on the validation dataset alone or on the validation dataset together with the model.

In [8]:
results.keys()

dict_keys(['validation', 'validation_model'])

The next level holds the assessment names. In this case, the "Fairness", "ModelEquity", and "Performance" assessments were automatically run on the validation dataset and model.

In [9]:
results['validation_model'].keys()

dict_keys(['Fairness', 'ModelEquity', 'Performance'])

Finally, there are keys related to the different results created by each assessment.

In [10]:
results['validation_model']['Fairness'].keys()

dict_keys(['SEX'])
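
To get a quick overview of a nested results dictionary, you can walk it and list the key paths. The sketch below uses a mock dictionary whose keys mirror the outputs shown above. The leaf values are placeholders rather than real Lens results, the contents of the "validation" branch are left empty because they are not shown above, and `key_paths` is a hypothetical helper.

```python
def key_paths(nested, depth, prefix=()):
    """Yield tuples of keys down to `depth` levels of a nested dict."""
    for key, value in nested.items():
        path = prefix + (key,)
        # Recurse only into non-empty dicts while depth remains
        if depth > 1 and isinstance(value, dict) and value:
            yield from key_paths(value, depth - 1, path)
        else:
            yield path


# Mock structure mirroring the keys above; leaves are placeholders
mock_results = {
    'validation': {},
    'validation_model': {
        'Fairness': {'SEX': None},
        'ModelEquity': {},
        'Performance': {},
    },
}

paths = list(key_paths(mock_results, depth=2))
for path in paths:
    print(' -> '.join(path))
```

Raising `depth` to 3 would also list the per-assessment result keys, such as `SEX` for the Fairness assessment.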

## How can I choose which assessments to run?
Lens has a number of assessments available, each of which works with different kinds of models or datasets. **By default, Lens will automatically run every assessment that has its prerequisites met.**

However, you can instead specify a list of assessments and Lens will only select from those. Even when you select assessments, Lens will only run the ones that work with your model and data. 

In [11]:
# specifying a set of assessments to use

from credoai.assessment import FairnessAssessment, DatasetFairnessAssessment
lens = cl.Lens(model=credo_model,
               data=credo_data,
               assessment_plan=assessment_plan,
               assessments=[FairnessAssessment, # <- new argument
                            DatasetFairnessAssessment] 
              )

INFO:absl:Initializing Assessment (DatasetFairness)
INFO:absl:Initializing Assessment (Fairness) with kwargs: {'metrics': ['precision_score', 'recall_score', 'equal_opportunity']}


### List all assessments and their names

To list all assessments that are available, as well as their names (needed for creating assessment plans), use the helper function `get_assessment_names`. 

In [12]:
from credoai.assessment.utils import get_assessment_names
get_assessment_names()

{'DatasetEquityAssessment': 'DatasetEquity',
 'DatasetFairnessAssessment': 'DatasetFairness',
 'DatasetProfilingAssessment': 'DatasetProfiling',
 'FairnessAssessment': 'Fairness',
 'ModelEquityAssessment': 'ModelEquity',
 'NLPEmbeddingBiasAssessment': 'NLPEmbeddingBias',
 'NLPGeneratorAssessment': 'NLPGenerator',
 'PerformanceAssessment': 'Performance',
 'PrivacyAssessment': 'Privacy',
 'SecurityAssessment': 'Security'}
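
Since assessment plans are keyed by these short names, a typo in a key is easy to miss. The sketch below checks a plan against the mapping returned above. `unknown_plan_keys` is a hypothetical helper, and the mapping is copied by hand (abbreviated) from the output rather than fetched from Lens. How Lens itself reacts to an unknown key is not covered here.

```python
# Copied by hand from the get_assessment_names() output above (abbreviated)
assessment_names = {
    'DatasetFairnessAssessment': 'DatasetFairness',
    'FairnessAssessment': 'Fairness',
    'PerformanceAssessment': 'Performance',
}


def unknown_plan_keys(plan, names):
    """Return plan keys that are not known assessment names (hypothetical helper)."""
    valid = set(names.values())
    return sorted(key for key in plan if key not in valid)


plan = {'Fairness': {'metrics': ['precision_score']},
        'Performence': {'metrics': ['recall_score']}}  # deliberate typo

bad_keys = unknown_plan_keys(plan, assessment_names)
print(bad_keys)
```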

## How do I get documentation on the assessments I am running?

Assessment documentation is somewhat complex. Assessments wrap modules, which themselves have documentation. Normally, you don't have to worry about the module itself, except if you are creating your own assessments or want to use the modules directly. The main exception here is if you want to configure how you run the assessment (see this [section](#How-do-I-configure-my-assessments?)).

For the assessments, you may be interested in what parameters you can pass to their initialization (which is passed using the `assessment_plan` parameter when running Lens).

Below are a number of aspects of an assessment you may need to query. The next few sections expand on these aspects of assessments.

In [13]:
# # Uncomment this section to run!
# # Different aspects of documentation you may be interested in
# # These are all for the fairness assessment

# # what parameters can be passed to the initialization?
# FairnessAssessment.init_module?

# # what requirements are needed? 
# # (This is normally included in the assessment's base documentation)
# FairnessAssessment().get_requirements()

# # what does the module require? 
# # This is often similar to the parameters passed to assessment initialization
# FairnessAssessment().module?

## Understanding assessment requirements

The prerequisites can be queried by calling the `get_requirements` function on an assessment. These indicate the set of features or functions your model and data must provide in order for the assessment to be run.

These requirements are specified separately for the model and the data. Each requirement is either a single requirement or a tuple. If it is a tuple, only one of the requirements within the tuple must be met. For instance, the FairnessAssessment needs *either* `predict_proba` OR `predict`. See `credoai.assessment.credo_assessment.AssessmentRequirements` for more.

In [14]:
from credoai.assessment import FairnessAssessment
assessment = FairnessAssessment()
assessment.get_requirements()

{'model_requirements': [('predict_proba', 'predict')],
 'data_requirements': ['X', 'y', 'sensitive_features'],
 'training_data_requirements': []}
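
The "any one of a tuple" rule can be sketched as follows. This illustrates the rule described above and is not Lens's actual implementation; `meets_requirements` is a hypothetical function.

```python
def meets_requirements(requirements, available):
    """Check a requirement list against the set of available names.

    A plain string must be present; a tuple is satisfied if at least
    one of its members is present.
    """
    for req in requirements:
        options = req if isinstance(req, tuple) else (req,)
        if not any(option in available for option in options):
            return False
    return True


# Model requirements of the FairnessAssessment, as shown above
fairness_model_reqs = [('predict_proba', 'predict')]

only_predict = meets_requirements(fairness_model_reqs, {'predict'})
no_functions = meets_requirements(fairness_model_reqs, set())
print(only_predict, no_functions)
```

A model that exposes only `predict` satisfies the tuple, while a model with neither function does not.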

You can also get the requirements for all assessments.

In [15]:
from credoai.assessment import get_assessment_requirements
get_assessment_requirements()

{'DatasetEquityAssessment': {'model_requirements': [],
  'data_requirements': ['y', 'sensitive_features'],
  'training_data_requirements': []},
 'DatasetFairnessAssessment': {'model_requirements': [],
  'data_requirements': ['X', 'y', 'sensitive_features'],
  'training_data_requirements': []},
 'DatasetProfilingAssessment': {'model_requirements': [],
  'data_requirements': ['X', 'y'],
  'training_data_requirements': []},
 'FairnessAssessment': {'model_requirements': [('predict_proba', 'predict')],
  'data_requirements': ['X', 'y', 'sensitive_features'],
  'training_data_requirements': []},
 'ModelEquityAssessment': {'model_requirements': ['predict'],
  'data_requirements': ['sensitive_features'],
  'training_data_requirements': []},
 'NLPEmbeddingBiasAssessment': {'model_requirements': ['embedding_fun'],
  'data_requirements': [],
  'training_data_requirements': []},
 'NLPGeneratorAssessment': {'model_requirements': ['generator_fun'],
  'data_requirements': [],
  'training_data_require

You can see which assessments are usable for a particular combination of artifacts by passing them to the helper function `get_usable_assessments`. Lens does this under the hood.

In [16]:
from credoai.assessment import get_usable_assessments
get_usable_assessments(credo_model=credo_model, 
                       credo_data=credo_data)

{'Fairness': <credoai.assessment.assessments.FairnessAssessment at 0x290a2e4f0>,
 'ModelEquity': <credoai.assessment.assessments.ModelEquityAssessment at 0x290b149a0>,
 'Performance': <credoai.assessment.assessments.PerformanceAssessment at 0x290a2e3d0>}

## How do I configure my assessments?

Now that we can select assessments, how about configuring them? 

To configure the assessment's underlying module we use something we've already seen before - the `assessment plan`! The parameters that can be passed at this stage are the same parameters passed to the assessment's `init_module` function. 

The only one we've seen so far is the "metrics" argument that can be passed to the PerformanceAssessment and FairnessAssessment, but other assessments may be configured in different ways.

You can see the possible arguments by looking at the documentation of the assessment's module. For instance, below we see that the `FairnessModule` used by the `FairnessAssessment` can take the argument `method`.


In [17]:
from credoai.assessment import FairnessAssessment
assessment = FairnessAssessment()
assessment.module?

Init signature:
assessment.module(
    metrics,
    sensitive_features,
    y_true,
    y_pred,
    y_prob=None,
    method='between_groups',
)
Docstring:
Fairness module for Credo AI. Handles any metric that can be
calculated on a set of ground truth labels and predictions,
e.g., binary classification, multiclass classification, regression.

This module takes in a set of metrics  and provides functionality to:
- calculate the metrics
- create disaggregated metrics

Parameters
----------
metrics : List-like
    list of metric names as string or list of Metrics (credoai.metrics.Me

In [18]:
# changing the assessment plan to configure the Fairness Module's method argument
assessment_plan = {'Performance': {'metrics': metrics},
                   'Fairness': {'method': 'to_overall'}}

## What metrics are available?

Each assessment has different configuration options, as discussed above. Some assessments take a set of metrics as their configuration (the FairnessAssessment and PerformanceAssessment).

Many metrics are supported out-of-the-box. These metrics can be referenced by string.

In [19]:
# all out-of-the-box supported metrics can be accessed by calling list_metrics
from credoai.metrics import list_metrics
metrics = list_metrics()

BINARY_CLASSIFICATION
	accuracy_score, average_precision_score,
	balanced_accuracy_score, f1_score, fallout_rate,
	false_discovery_rate, false_negative_rate,
	false_omission_rate, false_positive_rate, fdr,
	fnr, fpr, hit_rate,
	matthews_correlation_coefficient, miss_rate,
	overprediction, precision, precision_score,
	recall, recall_score, roc_auc_score,
	selection_rate, sensitivity, specificity, tnr,
	tpr, true_negative_rate, true_positive_rate,
	underprediction

FAIRNESS
	demographic_parity,
	demographic_parity_difference,
	demographic_parity_ratio, disparate_impact,
	equal_opportunity, equal_opportunity_difference,
	equalized_odds, equalized_odds_difference,
	statistical_parity

REGRESSION
	MAE, MSD, MSE, RMSE, d2_tweedie_score,
	explained_variance_score, max_error,
	mean_absolute_error,
	mean_absolute_percentage_error,
	mean_gamma_deviance, mean_pinball_loss,
	mean_poisson_deviance, mean_squared_deviation,
	mean_squared_error, mean_squared_log_error,
	median_absolute_error, r2, r2_s

Under the hood, each metric is wrapped in a `Metric` class. `Metric` is a lightweight wrapper class that defines a few characteristics of the underlying function needed by Lens.

This class defines a canonical name for Lens, synonyms, a metric category, the function, and whether the metric takes probabilities or categorical predictions. The metric category defines the expected function signature, as described in `Metric`'s documentation.

For instance, below is the false positive rate metric.

In [20]:
from credoai.metrics import BINARY_CLASSIFICATION_METRICS
BINARY_CLASSIFICATION_METRICS['false_positive_rate']

Metric(name='false_positive_rate', metric_category='BINARY_CLASSIFICATION', fun=<function false_positive_rate at 0x28cb0daf0>, takes_prob=False, equivalent_names={'false_positive_rate', 'fallout_rate', 'fpr'})
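
The `equivalent_names` field suggests how several string references can resolve to a single metric. The sketch below mimics that with a minimal stand-in class. `SimpleMetric` and `resolve` are hypothetical and do not reproduce the Lens implementation, and the toy metric function assumes binary 0/1 labels.

```python
from dataclasses import dataclass, field
from typing import Callable


def false_positive_rate(y_true, y_pred):
    # FP / (FP + TN): share of samples with true label 0 predicted as 1
    preds_for_negatives = [p for t, p in zip(y_true, y_pred) if t == 0]
    return sum(preds_for_negatives) / len(preds_for_negatives)


@dataclass
class SimpleMetric:
    """Hypothetical, stripped-down stand-in for Lens's Metric class."""
    name: str
    fun: Callable
    equivalent_names: set = field(default_factory=set)

    def __post_init__(self):
        # The canonical name is always one of the accepted names
        self.equivalent_names = set(self.equivalent_names) | {self.name}


def resolve(name, metrics):
    """Return the metric whose accepted names include `name`."""
    for metric in metrics:
        if name in metric.equivalent_names:
            return metric
    raise KeyError(name)


fpr_metric = SimpleMetric('false_positive_rate', false_positive_rate,
                          {'fallout_rate', 'fpr'})

metric = resolve('fpr', [fpr_metric])
value = metric.fun([0, 0, 1, 1], [1, 0, 1, 0])
print(metric.name, value)
```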

## How do I use my own custom metrics?

Custom metrics can be created by using the `Metric` class.  

**Example: Confidence Intervals**

We will create custom metrics that reflect the lower and upper bounds of the 95% confidence interval on the true positive rate.

Confidence intervals are not supported by default. However, they can be derived for some metrics using the Wilson confidence interval. We will use a convenience function called `confusion_wilson`, which returns an array of [lower, upper] bounds for metrics like the true positive rate.

Wrapping the wilson function in a `Metric` allows us to use it in Lens.

In [21]:
from credoai.metrics.credoai_metrics import confusion_wilson
from credoai.metrics import Metric

# define partial functions for the true positive rate lower bound
def lower_bound_tpr(y_true, y_pred):
    return confusion_wilson(y_true, y_pred, metric='true_positive_rate', confidence=0.95)[0]

# and upper bound
def upper_bound_tpr(y_true, y_pred):
    return confusion_wilson(y_true, y_pred, metric='true_positive_rate', confidence=0.95)[1]

# wrap the functions in Metric objects
lower_metric = Metric(name = 'lower_bound_tpr', 
                      metric_category = "binary_classification",
                      fun = lower_bound_tpr)

upper_metric = Metric(name = 'upper_bound_tpr', 
                      metric_category = "binary_classification",
                      fun = upper_bound_tpr)
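
For reference, the Wilson score interval for a proportion can be computed directly. The sketch below uses the standard textbook formula. Whether `confusion_wilson` uses exactly this form internally is an assumption, and `wilson_interval` is a hypothetical name.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval (lower, upper) for a binomial proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    # Two-sided critical value of the standard normal distribution
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# For a true positive rate: successes = true positives, n = actual positives
lower, upper = wilson_interval(successes=90, n=100)
print(f"TPR 0.90, 95% interval: [{lower:.4f}, {upper:.4f}]")
```

Unlike the simpler normal-approximation interval, the Wilson interval stays within [0, 1] even when the observed rate is close to 0 or 1, which matters for rates like the one reported in this notebook.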

In [22]:
from credoai.assessment import PerformanceAssessment

assessment_plan = {'Performance': {'metrics': [lower_metric, 'tpr', upper_metric]}}
lens = cl.Lens(model=credo_model,
               data=credo_data,
               assessments=[PerformanceAssessment],
               assessment_plan=assessment_plan)
lens.run_assessments().get_results()['validation_model']['Performance']['overall_performance']

INFO:absl:Initializing Assessment (Performance) with kwargs: {'metrics': [Metric(name='lower_bound_tpr', metric_category='BINARY_CLASSIFICATION', fun=<function lower_bound_tpr at 0x29528e280>, takes_prob=False, equivalent_names={'lower_bound_tpr'}), 'tpr', Metric(name='upper_bound_tpr', metric_category='BINARY_CLASSIFICATION', fun=<function upper_bound_tpr at 0x29528e310>, takes_prob=False, equivalent_names={'upper_bound_tpr'})]}
INFO:absl:Running assessment: Performance


| | value | subtype |
|---|---|---|
| lower_bound_tpr | 0.995532 | overall_performance |
| tpr | 0.997137 | overall_performance |
| upper_bound_tpr | 0.998166 | overall_performance |
