# AUA, DS 229 – MLOps
### Week 2 – Responsible Machine Learning with Error Analysis

- [Responsible Machine Learning with Error Analysis](https://techcommunity.microsoft.com/t5/ai-machine-learning-blog/responsible-machine-learning-with-error-analysis/ba-p/2141774)
- [Responsible AI dashboard: A one-stop shop for operationalizing Responsible AI in practice](https://techcommunity.microsoft.com/t5/ai-machine-learning-blog/responsible-ai-dashboard-a-one-stop-shop-for-operationalizing/ba-p/3030944)
- [AI Show: Live Responsible AI Dashboard: One-stop shop for operationalizing RAI in practice](https://techcommunity.microsoft.com/t5/video-hub/ai-show-live-responsible-ai-dashboard-one-stop-shop-for/ba-p/3060153)
- [Getting started](https://github.com/microsoft/responsible-ai-toolbox/blob/main/notebooks/responsibleaidashboard/getting-started.ipynb)
- [Take a Tour: Responsible AI Toolbox](https://github.com/microsoft/responsible-ai-toolbox/blob/main/notebooks/responsibleaidashboard/tour.ipynb)

# Notebook 01 – Regression Error Analysis

This notebook demonstrates how to use the Responsible AI Toolbox to perform error analysis for a regression task. It walks through the API calls necessary to create a widget that guides a visual analysis of the data.

In [None]:
# !pip install --upgrade pip
# !pip install interpret-community 
# !pip install raiwidgets

## Launch Responsible AI Toolbox

The following section examines the code necessary to create the dataset. It then generates insights using the `responsibleai` API that can be visually analyzed.

In [None]:
import pandas as pd
from sklearn.model_selection import train_test_split
import zipfile

First, load the apartment dataset and separate the target column from the features. Then, split the datapoints into training and test sets and fit the preprocessing pipeline on the training part only. Finally, assemble separate DataFrames, with named columns and the target included, for the training data and the test data.

In [None]:
from raiutils.dataset import fetch_dataset
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.compose import ColumnTransformer


def split_label(dataset, target_feature):
    X = dataset.drop([target_feature], axis=1)
    y = dataset[[target_feature]]
    return X, y


def clean_data(X, y, target_feature):
    features = X.columns.values.tolist()
    classes = y[target_feature].unique().tolist()
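    # Only int64 columns are treated as numerical and only object columns as
    # categorical; columns of any other dtype (e.g. float64) are dropped by
    # the ColumnTransformer, whose default is remainder='drop'.
    pipe_cfg = {
        'num_cols': X.dtypes[X.dtypes == 'int64'].index.values.tolist(),
        'cat_cols': X.dtypes[X.dtypes == 'object'].index.values.tolist(),
    }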
    num_pipe = Pipeline([
        ('num_imputer', SimpleImputer(strategy='median')),
        ('num_scaler', StandardScaler())
    ])
    cat_pipe = Pipeline([
        ('cat_imputer', SimpleImputer(strategy='constant', fill_value='?')),
        # Note: scikit-learn >= 1.2 renames `sparse` to `sparse_output`; `sparse` was removed in 1.4.
        ('cat_encoder', OneHotEncoder(handle_unknown='ignore', sparse=False))
    ])
    feat_pipe = ColumnTransformer([
        ('num_pipe', num_pipe, pipe_cfg['num_cols']),
        ('cat_pipe', cat_pipe, pipe_cfg['cat_cols'])
    ])
    X = feat_pipe.fit_transform(X)
    
    print("Categorical columns:", pipe_cfg['cat_cols'])
    print("Numerical columns:", pipe_cfg['num_cols'])
    return X, feat_pipe, features, classes

In [None]:
target_feature = 'SalePriceK'
categorical_features = []

outdirname = 'responsibleai.12.28.21'
zipfilename = outdirname + '.zip'

fetch_dataset('https://publictestdatasets.blob.core.windows.net/data/' + zipfilename, zipfilename)

with zipfile.ZipFile(zipfilename, 'r') as unzip:
    unzip.extractall('.')

In [None]:
# 1) Reading the data and splitting into train/test.
all_data = pd.read_csv('apartments-train.csv')
all_data = all_data.drop(['Sold_HigherThan_Median','SalePrice'], axis=1)
X, y = split_label(all_data, target_feature)
X_train_original, X_test_original, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=7)

# 2) Preprocessing train/test parts (median imputation and standard scaling of numerical columns,
#    constant imputation and one-hot encoding of categorical features).
X_train, feat_pipe, features, classes = clean_data(X_train_original, y_train, target_feature)
y_train = y_train[target_feature].to_numpy()
X_test = feat_pipe.transform(X_test_original)
y_test = y_test[target_feature].to_numpy()

# 3) Train/test dataframe construction.
train_data = X_train_original.copy()
train_data[target_feature] = y_train
test_data = X_test_original.copy()
test_data[target_feature] = y_test

In [None]:
test_data.head()

### Create Data Insights

In [None]:
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, mean_absolute_error

In [None]:
print(f"Train data shape: {train_data.shape}")
print(f"Test data shape: {test_data.shape}")
print(f"Target: {target_feature}")

sns.displot(data=train_data, x=target_feature)
plt.show()

In [None]:
def ignore_target(df):
    """Selects all columns but target."""
    return df.loc[:, df.columns != target_feature]


# Fitting a Random Forest Regressor model.
model = RandomForestRegressor().fit(X=ignore_target(train_data), y=train_data[target_feature])
y_pred = model.predict(ignore_target(test_data))

mse = mean_squared_error(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
r2 = model.score(ignore_target(test_data), y_test)
print(f"MSE: {round(mse, 2)}\nMAE: {round(mae, 2)}\nR^2: {round(r2, 2)}")
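The three scores printed above can also be reproduced by hand, which helps when interpreting them in the dashboard. The sketch below uses small made-up arrays (the values are hypothetical and not taken from the apartment dataset) and plain NumPy.

```python
import numpy as np

# Toy values (hypothetical, in thousands), not taken from the apartment dataset.
y_true = np.array([100.0, 150.0, 200.0, 250.0])
y_hat = np.array([110.0, 140.0, 210.0, 230.0])

residuals = y_true - y_hat
mse_manual = float(np.mean(residuals ** 2))             # mean squared error
mae_manual = float(np.mean(np.abs(residuals)))          # mean absolute error
ss_res = float(np.sum(residuals ** 2))                  # residual sum of squares
ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))   # total sum of squares
r2_manual = 1.0 - ss_res / ss_tot                       # coefficient of determination

print(f"MSE: {mse_manual:.2f}\nMAE: {mae_manual:.2f}\nR^2: {r2_manual:.3f}")
```

MSE penalizes large errors more heavily than MAE, so a few badly mispredicted apartments can move MSE a lot while MAE changes little.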

In [None]:
from raiwidgets import ResponsibleAIDashboard
from responsibleai import RAIInsights

To use the Responsible AI Dashboard, initialize a `RAIInsights` object, to which the different components can then be added.

`RAIInsights` accepts the model, the training dataset, the test dataset, the name of the target feature, the task type string, and a list of categorical feature names as its arguments.

In [None]:
rai_insights = RAIInsights(model, train_data, test_data, target_feature, 'regression',
                           categorical_features=categorical_features)
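In this notebook `categorical_features` is an empty list. For a dataset that does contain non-numeric columns, one possible way to build the list is to inspect the column dtypes. The sketch below uses a made-up frame; the `Neighborhood` column and all `toy_*` names are hypothetical. In that situation the model would also need to accept such columns directly, for example by wrapping the preprocessing and the regressor in one `Pipeline`.

```python
import pandas as pd

# Hypothetical toy frame; 'Neighborhood' is an invented string column.
toy = pd.DataFrame({
    "GrLivArea": [1500, 2100, 900],
    "Neighborhood": ["North", "South", "North"],
    "SalePriceK": [200.0, 310.5, 120.0],
})
toy_target = "SalePriceK"

# Treat every non-numeric feature column as categorical.
toy_categorical = [
    col for col in toy.columns
    if col != toy_target and not pd.api.types.is_numeric_dtype(toy[col])
]
print(toy_categorical)
```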

Add the desired components, then compute the insights on the test set.

In [None]:
rai_insights.explainer.add()  # Interpretability.
rai_insights.error_analysis.add()  # Error Analysis.
rai_insights.causal.add(treatment_features=['OverallCond', 'OverallQual', 'Fireplaces', 'GarageCars', 
                                            'ScreenPorch'])  # Causal insights.

rai_insights.compute()  # Compute insights.

Finally, visualize and explore the model insights. Use the resulting widget or follow the link to view this in a new tab.

In [None]:
ResponsibleAIDashboard(rai_insights, port=6009)

#### **Tasks:**
<div class="alert alert-success">

✅ Investigate **Error analysis** tab:
  - Chart **Tree map** is used to identify common failure patterns (notice that MAE is significantly higher for *GrLivArea > 2671*).
  - Chart **Heat map** is used to focus on combinations of features (visualize 'GrLivArea' vs 'YearBuilt', what can you infer?).
    
✅ Investigate **Model overview** tab:
  - Chart **Dataset cohorts** is used to analyze your model performance on different dataset cohorts.
  - Chart **Feature cohorts** is used to do a comparative analysis across sensitive groups (analyze 'YearBuilt' and derive insights).

✅ Investigate **Data analysis** tab:
  - Chart **Table view** is used to inspect the dataset together with the predictions.
  - Chart **Chart view** is used to analyze dataset statistics across different groups:
      - Use **Individual datapoints** subsection to construct 2D plots, e.g. construct 'Predicted Y' vs 'True Y' plot.
      - Use **Aggregate plots** subsection to visualize against different attributes, e.g. 'Error' vs 'YearBuilt' and 'GrLivArea'. 
    
✅ Investigate **Feature importances** tab:
  - Chart **Aggregate feature importance** is used to visualize top-k decisive features.
  - Chart **Individual feature importance** is used for more granular analysis:
      - Use **Feature importance plot** to select up to 5 datapoints and visualize which features were important for those particular predictions. This can be valuable for fairness analysis: if features like neighborhood, gender, or skin color turn out to be critical for your model, is that a problem?
      - Use **Individual conditional expectation (ICE) plot** to investigate how changing a feature value from a minimum value to a maximum value impacts the prediction on the selected data instance. Derive insights for 'YearBuilt' and 'GrLivArea'.
    
✅ Investigate **Causal analysis** tab:
  - Chart **Aggregate causal effects** is used to answer "what-if" questions. In particular, for each treatment variable passed to the API, it shows how much the outcome (here, the sale price) is expected to change, on average, if the variable is increased by one unit.
  - Chart **Individual causal what-if** does the same as above for a specific data point.
  - Chart **Treatment policy** suggests the future interventions that are expected to give the biggest boost in the price of a house. Derive insights for the treatment variable 'ScreenPorch'. Below you can find the gains for the suggested policies relative to the baseline of not making any change.
    
</div>
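The tree map and heat map in the **Error analysis** tab both boil down to comparing an error metric across slices of the data. A minimal sketch of that idea with pandas is shown below. It runs on synthetic data in which the error is deliberately made larger for *GrLivArea > 2671*; the frame `demo`, its values, and the helper columns are assumptions for illustration, not results from the apartment dataset or the dashboard's own implementation.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for test data with prediction errors (all values hypothetical).
rng = np.random.default_rng(0)
n = 400
demo = pd.DataFrame({
    "GrLivArea": rng.integers(600, 4000, size=n),
    "YearBuilt": rng.integers(1900, 2011, size=n),
})
# Simulate absolute errors that are larger for big living areas.
noise_scale = np.where(demo["GrLivArea"] > 2671, 40.0, 10.0)
demo["abs_error"] = np.abs(rng.normal(0.0, noise_scale))

# Tree-map style view: MAE inside vs. outside a single cohort.
demo["cohort"] = np.where(demo["GrLivArea"] > 2671,
                          "GrLivArea > 2671", "GrLivArea <= 2671")
cohort_mae = demo.groupby("cohort")["abs_error"].mean()
print(cohort_mae.round(2))

# Heat-map style view: MAE on a grid of two binned features.
demo["area_bin"] = pd.cut(demo["GrLivArea"], bins=4)
demo["year_bin"] = pd.cut(demo["YearBuilt"], bins=4)
grid = (
    demo.groupby(["area_bin", "year_bin"], observed=False)["abs_error"]
    .mean()
    .unstack()
)
print(grid.round(2))
```

With the real data, `abs_error` would be the absolute difference between `y_test` and `y_pred`, and the cohort boundaries would come from what the tree map highlights.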

See this [developer blog](https://techcommunity.microsoft.com/t5/ai-machine-learning-blog/responsible-ai-dashboard-a-one-stop-shop-for-operationalizing/ba-p/3030944) (Decision Making Flow section) to learn more about this use case and how to use the dashboard to debug your housing price prediction model.
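The ICE plot from the **Feature importances** task can be approximated by hand: take one data instance, sweep a single feature over a grid of values while holding the others fixed, and record the prediction at each step. The sketch below illustrates this with NumPy. `toy_predict` is a hypothetical stand-in for `model.predict`, and the two-feature rows are invented for illustration.

```python
import numpy as np


def toy_predict(X):
    """Hypothetical stand-in for model.predict: price from area and year."""
    area, year = X[:, 0], X[:, 1]
    return 0.1 * area + 0.5 * (year - 1900)


def ice_curve(predict, row, feature_idx, grid):
    """Predictions for one row as a single feature sweeps over `grid`."""
    repeated = np.tile(row, (len(grid), 1))   # one copy of the row per grid value
    repeated[:, feature_idx] = grid           # overwrite only the swept feature
    return predict(repeated)


# Two invented instances with columns (GrLivArea, YearBuilt).
X_demo = np.array([[1500.0, 1960.0], [2800.0, 2005.0]])

# Sweep GrLivArea from its minimum to its maximum observed value.
area_grid = np.linspace(X_demo[:, 0].min(), X_demo[:, 0].max(), num=5)
curve = ice_curve(toy_predict, X_demo[0], feature_idx=0, grid=area_grid)
print(curve)
```

A flat curve suggests the model barely uses the feature for that instance, while a steep or irregular curve points to strong or non-monotonic dependence worth checking against domain knowledge.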

### References
- Responsible AI Toolbox [[Source]](https://github.com/microsoft/responsible-ai-toolbox/blob/main/notebooks/responsibleaidashboard/responsibleaidashboard-housing-decision-making.ipynb) 