# Beyond Single Feature Importance with ICECREAM: Cloud Experiment

In this experiment (see also Section 5.2 of the paper), we apply ICECREAM to the Cloud Computing Application defined in the paper and appendix.

The notebook is meant to enable the user to reproduce our results. It is split into four sections:

- Setup: This section imports the required modules and defines a few utility functions.

- Preparation: This section generates a synthetic dataset from a `generating_causal_network` with given parameters. It then calculates and stores the ground truth, and fits a new causal network to the generated samples, so that only the structure of the network needs to be known. The generated data is stored in a folder (the default is `/data`). The `/original_data` folder already contains the samples used for the results in the paper, so this section can be skipped.

- Calculation of scores: This section calculates the minimal full-explanation coalitions with ICECREAM and the anomaly scores of the baseline methods.
  
  **WARNING: For large numbers of samples (e.g., the 10_000_000 samples that were used in the paper), this can take a very long time!**

- Analysis: This section loads the data from the previous sections and performs the same analysis as shown in the paper.

## Setup

In [None]:
from datetime import datetime
import os
import itertools

import pickle
from timeit import default_timer as timer
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from tqdm import tqdm

from dowhy.gcm.anomaly import attribute_anomalies

from explain import CausalNetwork, expected_value, noisify
from explain.explanations.causal_explanation import exact_explanation_score, find_minimum_size_coalitions
from causal_models import CloudServiceErrorModel, CloudServiceErrorRootModel, CloudServiceErrorConditionalModel

In [None]:
tqdm.pandas()

In [None]:
def subsets(s, include_empty_set=True):
    return map(set, itertools.chain.from_iterable(itertools.combinations(s, r) for r in range(0 if include_empty_set else 1, len(s) + 1)))

def find_ground_truth_minimum_size_coalitions(causal_network, sample, target, ground_truth_error_nodes, *, threshold=0.99999):
    coalitions = [coalition for coalition in subsets(ground_truth_error_nodes, include_empty_set=False) if expected_value(exact_explanation_score(causal_network, pd.Series(sample), target, coalition)) >= threshold]
    min_size = min(map(len, coalitions))
    return [coalition for coalition in coalitions if len(coalition) == min_size]
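
The utility functions above enumerate candidate coalitions and keep only the smallest ones that pass a threshold check. The following sketch shows that selection logic on its own. The `explains` predicate is a hypothetical stand-in for the thresholded `exact_explanation_score` call and is not part of the `explain` package.

```python
import itertools


def subsets(s, include_empty_set=True):
    """Yield every subset of `s` as a set, ordered by increasing size."""
    start = 0 if include_empty_set else 1
    return map(set, itertools.chain.from_iterable(
        itertools.combinations(s, r) for r in range(start, len(s) + 1)))


def explains(coalition):
    # Hypothetical rule (assumption for illustration only): the error is
    # explained once 'auth' appears together with 'api' or with 'product'.
    return {'auth', 'api'} <= coalition or {'auth', 'product'} <= coalition


error_nodes = {'auth', 'api', 'product'}

# Keep every non-empty subset that passes the check ...
candidates = [c for c in subsets(error_nodes, include_empty_set=False) if explains(c)]
# ... and then only those of minimum size.
min_size = min(map(len, candidates))
minimal = [c for c in candidates if len(c) == min_size]

print(sorted(sorted(c) for c in minimal))
```

Because all candidates of the minimum size are kept, a sample can have several ground-truth coalitions, which is why the later analysis compares against a list of sets rather than a single set.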

## Preparation

First, we create a causal network for generating the ground truth. From this network, we draw samples, separate the error samples, and calculate the ground truth. All data generated in this section is stored in a folder so that it can be loaded later.

In [None]:
# Set folder for data storage and loading (pre-calculated data is stored at `/original_data`)
folder = 'data'
os.makedirs(folder, exist_ok=True)

In [None]:
# Create generating causal network

order_db = CloudServiceErrorRootModel(p=0.05) # X_7
customer_db = CloudServiceErrorRootModel(p=0.01) # X_4
shipping = CloudServiceErrorRootModel(p=0.03) # X_6
product_db = CloudServiceErrorRootModel(p=0.02) # X_1
order = CloudServiceErrorConditionalModel(parent_signature=[CloudServiceErrorModel], t=1, p=0.01) # X_8
auth = CloudServiceErrorConditionalModel(parent_signature=[CloudServiceErrorModel], t=1, p=0.02) # X_5
product = CloudServiceErrorConditionalModel(parent_signature=[CloudServiceErrorModel] * 3, t=2, p=0.01) # X_3
caching = CloudServiceErrorConditionalModel(parent_signature=[CloudServiceErrorModel], t=1, p=0.01) # X_2
api = CloudServiceErrorConditionalModel(parent_signature=[CloudServiceErrorModel] * 4, t=3, p=0.01) # X_9
www = CloudServiceErrorConditionalModel(parent_signature=[CloudServiceErrorModel] * 2, t=2, p=0.0) # Y

generating_causal_network = CausalNetwork(
    {'order_db': order_db, 'customer_db': customer_db, 'shipping': shipping, 'product_db': product_db,
        'order': (order, ['order_db']), 'auth': (auth, ['customer_db']),
        'product': (product, ['customer_db', 'shipping', 'caching']), 'caching': (caching, ['product_db']),
        'api': (api, ['customer_db', 'order', 'auth', 'product']), 'www': (www, ['auth', 'api'])
    }
)

nodes = list(generating_causal_network.nodes)
noise = noisify(nodes)

In [None]:
# Create samples

number_of_samples = 10_000_000
noise_samples = generating_causal_network.draw_noise_samples(number_of_samples)[noise]

samples_number_of_errors = noise_samples.astype(int).sum(axis=1)
samples_number_of_errors.groupby(samples_number_of_errors).count()

In [None]:
# Store compressed samples and error samples to save space
print(f'Storing compressed samples...')
ground_truth_compressed = list(noise_samples.progress_apply(lambda row: sum(int(value) * 2 ** index for index, value in enumerate(reversed(row))), axis=1, result_type='reduce'))

with open(f'{folder}/compressed.pkl', 'wb') as f:
    pickle.dump(ground_truth_compressed, f, protocol=4)
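
The compression above packs each row of ten binary noise values into a single integer, with the first column as the most significant bit; the notebook later decodes it with a shift and mask. The sketch below shows the round trip on a plain Python list. The helper names `pack_bits` and `unpack_bits` are illustrative and do not exist in the notebook.

```python
def pack_bits(bits):
    """Encode a sequence of 0/1 values as an integer (first value = highest bit)."""
    return sum(int(value) * 2 ** index for index, value in enumerate(reversed(bits)))


def unpack_bits(number, width=10):
    """Decode an integer back into `width` bits, most significant bit first."""
    return [(number >> (width - 1 - n)) & 1 for n in range(width)]


row = [0, 1, 0, 0, 0, 0, 0, 0, 1, 1]
packed = pack_bits(row)
restored = unpack_bits(packed)

print(packed, restored)
```

With ten nodes, `width - 1 - n` equals the `9 - n` shift used in the decoding cells, so the decoder must use the same column order as the encoder.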

In [None]:
print(f'Recovering full samples...')
samples = generating_causal_network.from_noise(noise_samples).astype(str)

print(f'Calculating error samples...')
error_samples = samples[samples['www'] == '1'].copy()

error_samples['sample'] = error_samples[nodes].to_dict('records')
error_samples['error_nodes'] = error_samples.progress_apply(lambda row: {node for node in generating_causal_network.nodes if row[f'_{node}'] == '1'}, axis=1)
error_samples['minimal_coalitions'] = error_samples.progress_apply(lambda row: find_ground_truth_minimum_size_coalitions(generating_causal_network, row['sample'], 'www', row['error_nodes']), axis=1, result_type='reduce')

print(f'Storing error samples...')
error_samples.to_pickle(f'{folder}/error_samples.pkl', protocol=4)

print('Done!')

In [None]:
# Recover full samples using original causal network
print('Reading compressed samples...')
ground_truth_compressed = pd.read_pickle(f'{folder}/compressed.pkl')
print('Decompressing samples...')
ground_truth = [[(number >> (9 - n)) & 1 for n in range(10)] for number in ground_truth_compressed]

print('Recovering full samples...')
noise_samples = pd.DataFrame(ground_truth, columns=noise)
samples = generating_causal_network.from_noise(noise_samples).astype(str)

# Create causal network with dummy parameters and fit to samples
print('Fitting causal network...')
order_db = CloudServiceErrorRootModel(p=0.0)
customer_db = CloudServiceErrorRootModel(p=0.0)
shipping = CloudServiceErrorRootModel(p=0.0)
product_db = CloudServiceErrorRootModel(p=0.0)
order = CloudServiceErrorConditionalModel(parent_signature=[CloudServiceErrorModel], t=0, p=0.0)
auth = CloudServiceErrorConditionalModel(parent_signature=[CloudServiceErrorModel], t=0, p=0.0)
product = CloudServiceErrorConditionalModel(parent_signature=[CloudServiceErrorModel] * 3, t=0, p=0.0)
caching = CloudServiceErrorConditionalModel(parent_signature=[CloudServiceErrorModel], t=0, p=0.0)
api = CloudServiceErrorConditionalModel(parent_signature=[CloudServiceErrorModel] * 4, t=0, p=0.0)
www = CloudServiceErrorConditionalModel(parent_signature=[CloudServiceErrorModel] * 2, t=0, p=0.0)

causal_network = CausalNetwork(
    {'order_db': order_db, 'customer_db': customer_db, 'shipping': shipping, 'product_db': product_db,
        'order': (order, ['order_db']), 'auth': (auth, ['customer_db']),
        'product': (product, ['customer_db', 'shipping', 'caching']), 'caching': (caching, ['product_db']),
        'api': (api, ['customer_db', 'order', 'auth', 'product']), 'www': (www, ['auth', 'api'])
    }
)

causal_network.fit(samples)

with open(f'{folder}/causal_network.pkl', 'wb') as f:
    pickle.dump(causal_network, f)

## Calculation of scores

Now, we load the fitted causal network and the error samples, and calculate the scores (explanation score and anomaly scores) using ICECREAM and the baseline methods.

In [None]:
# Uncomment to manually set folder
# folder = 'data'

In [None]:
# Calculate and store scores

explanation_score_threshold = 0.9998
num_rca_distribution_samples = 1_000

with open(f'{folder}/causal_network.pkl', 'rb') as f:
    causal_network = pickle.load(f)
    nodes = list(causal_network.nodes)
    noise = noisify(nodes)

# Only load observation columns from error samples, not the ground truth
with open(f'{folder}/error_samples.pkl', 'rb') as f:
    error_samples = pickle.load(f)[nodes]

print(f'Calculating explanation scores...')
explanation_score_result = [find_minimum_size_coalitions(causal_network, sample, "www", threshold=explanation_score_threshold) for _, sample in tqdm(error_samples.iterrows())]
with open(f'{folder}/explanation_score.pkl', 'wb') as f:
    pickle.dump(explanation_score_result, f, protocol=4)

print(f'Calculating IT RCA scores...')
anomaly_scores = pd.DataFrame(attribute_anomalies(causal_network, 'www', error_samples, attribute_mean_deviation=False, num_distribution_samples=num_rca_distribution_samples))
with open(f'{folder}/outlier_rca.pkl', 'wb') as f:
    pickle.dump(anomaly_scores, f, protocol=4)

print(f'Calculating mean deviation RCA scores...')
mean_deviation_scores = pd.DataFrame(attribute_anomalies(causal_network, 'www', error_samples, attribute_mean_deviation=True, num_distribution_samples=num_rca_distribution_samples))
with open(f'{folder}/mean_deviation_rca.pkl', 'wb') as f:
    pickle.dump(mean_deviation_scores, f, protocol=4)

print(f'Calculating traversal RCA scores...')
def anomaly_traversal(causal_graph, anomaly_nodes):
    return {node for node in anomaly_nodes if not set(anomaly_nodes) & set(causal_graph.predecessors(node))}

traversal_rca_nodes = list(error_samples[nodes].apply(lambda x: x == '1').apply(lambda x: list(error_samples.columns[x.values]), axis=1))
traversal_rca_result = [anomaly_traversal(causal_network.graph, error_nodes) for error_nodes in traversal_rca_nodes]
with open(f'{folder}/traversal_rca.pkl', 'wb') as f:
    pickle.dump(traversal_rca_result, f, protocol=4)

print(f'Calculating alternative traversal RCA scores...')
def anomaly_traversal_2(causal_graph, target, anomaly_nodes):
    nodes, anormal_nodes = {target}, set()

    while nodes:
        node = nodes.pop()
        if node in anomaly_nodes:
            if not (anomaly_nodes & set(causal_graph.predecessors(node))):
                anormal_nodes.add(node)
            else:
                nodes.update(anomaly_nodes & set(causal_graph.predecessors(node)))

    return anormal_nodes

traversal_rca_2_result = [anomaly_traversal_2(causal_network.graph, 'www', set(error_nodes)) for error_nodes in traversal_rca_nodes]
with open(f'{folder}/traversal_rca_2.pkl', 'wb') as f:
    pickle.dump(traversal_rca_2_result, f, protocol=4)

print('Done!')
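
The two traversal baselines differ in which anomalous nodes they consider: the simple variant reports every anomalous node without anomalous parents, while the backtracking variant only follows anomalous parents starting from the target. The sketch below reproduces both on the graph structure defined in this notebook, using a plain parent dictionary instead of `causal_network.graph`. The function names and the set of anomalous nodes are hypothetical and chosen only to show the difference.

```python
# Parent lists copied from the CausalNetwork definition in this notebook.
PARENTS = {
    'order_db': [], 'customer_db': [], 'shipping': [], 'product_db': [],
    'order': ['order_db'], 'auth': ['customer_db'],
    'product': ['customer_db', 'shipping', 'caching'], 'caching': ['product_db'],
    'api': ['customer_db', 'order', 'auth', 'product'], 'www': ['auth', 'api'],
}


def simple_traversal(parents, anomaly_nodes):
    """Return all anomalous nodes that have no anomalous parent."""
    anomaly_nodes = set(anomaly_nodes)
    return {node for node in anomaly_nodes if not anomaly_nodes & set(parents[node])}


def backtracking_traversal(parents, target, anomaly_nodes):
    """Walk upstream from `target` along anomalous nodes only."""
    anomaly_nodes = set(anomaly_nodes)
    to_visit, root_causes = {target}, set()
    while to_visit:
        node = to_visit.pop()
        if node in anomaly_nodes:
            anomalous_parents = anomaly_nodes & set(parents[node])
            if anomalous_parents:
                to_visit.update(anomalous_parents)
            else:
                root_causes.add(node)
    return root_causes


# Hypothetical observation: 'shipping' is anomalous but 'product' is not.
anomalous = {'shipping', 'auth', 'api', 'www'}
simple_result = simple_traversal(PARENTS, anomalous)
backtracking_result = backtracking_traversal(PARENTS, 'www', anomalous)

print(simple_result, backtracking_result)
```

In this example the simple variant also reports `shipping`, although no chain of anomalous nodes connects it to `www`; the backtracking variant never reaches it.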

## Analysis

In [None]:
# Uncomment to manually set folder
# folder = 'data'

In [None]:
with open(f'{folder}/causal_network.pkl', 'rb') as f:
    causal_network = pickle.load(f)
    nodes = list(causal_network.nodes)
    noise = noisify(nodes)

# Load compressed data, decode into dataframe and load error samples
print(f'Loading compressed samples...')
ground_truth_compressed = pd.read_pickle(f'{folder}/compressed.pkl')
ground_truth = [[(number >> (9 - n)) & 1 for n in range(10)] for number in tqdm(ground_truth_compressed)]

print(f'Recovering full samples...')
noise_samples = pd.DataFrame(ground_truth, columns=noise)
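# Note: `generating_causal_network` is created in the Preparation section, so that cell must be run first
samples = generating_causal_network.from_noise(noise_samples).astype(str)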

normal_samples = samples[samples['www'] == '0'].copy()

# This time, load full error samples (including ground truth) for analysis
print('Loading error samples...')
with open(f'{folder}/error_samples.pkl', 'rb') as f:
    error_samples = pickle.load(f)

print('Done!')

In [None]:
# Create table with number of samples by number of errors
normal_samples_number_of_errors = normal_samples[noise].astype(int).sum(axis=1)
error_samples_number_of_errors = error_samples[noise].astype(int).sum(axis=1)

table = pd.concat([normal_samples_number_of_errors.groupby(normal_samples_number_of_errors).count(), error_samples_number_of_errors.groupby(error_samples_number_of_errors).count()], axis=1).fillna(0).astype(int)
table.columns = ['Y=0', 'Y=1']
table

In [None]:
# Read raw results and create result dataframe

outlier_rca_absolute_threshold = 0.15
outlier_rca_cumulative_threshold = 0.95
mean_deviation_rca_absolute_threshold = 0.15
mean_deviation_rca_cumulative_threshold = 0.95

with open(f'{folder}/error_samples.pkl', 'rb') as f:
    error_samples = pickle.load(f)

with open(f'{folder}/causal_network.pkl', 'rb') as f:
    causal_network = pickle.load(f)
    nodes = list(causal_network.nodes)
    noise = noisify(nodes)

with open(f'{folder}/explanation_score.pkl', 'rb') as f:
    explanation_score_result = pickle.load(f)

with open(f'{folder}/outlier_rca.pkl', 'rb') as f:
    anomaly_scores = pickle.load(f)

outlier_rca_result = list(anomaly_scores.apply(lambda row: set(anomaly_scores.columns[row >= outlier_rca_absolute_threshold]) & (set(nodes)), axis=1))

outlier_rca_result_2 = []
for index, row in anomaly_scores.iterrows():
    scores = row.sort_values(ascending=False).cumsum()
    outlier_rca_result_2.append(set(scores.index[:(scores >= outlier_rca_cumulative_threshold * row.sum()).argmax() + 1]))

with open(f'{folder}/mean_deviation_rca.pkl', 'rb') as f:
    mean_deviation_scores = pickle.load(f)

mean_rca_result = list(mean_deviation_scores.apply(
    lambda row: set(mean_deviation_scores.columns[row >= mean_deviation_rca_absolute_threshold]) & (set(causal_network.nodes)),
    axis=1))

mean_rca_result_2 = []
for index, row in mean_deviation_scores.iterrows():
    scores = row.sort_values(ascending=False).cumsum()
    mean_rca_result_2.append(set(scores.index[:(scores >= mean_deviation_rca_cumulative_threshold * row.sum()).argmax() + 1]))

with open(f'{folder}/traversal_rca.pkl', 'rb') as f:
    traversal_rca_result = pickle.load(f)

with open(f'{folder}/traversal_rca_2.pkl', 'rb') as f:
    traversal_rca_result_2 = pickle.load(f)

result = error_samples.copy()
result['num_errors'] = result['error_nodes'].apply(len)
result['error_nodes_union'] = result['minimal_coalitions'].apply(lambda minimal_coalitions: set().union(*minimal_coalitions))

result['explanation_score_minimal_coalitions'] = explanation_score_result
result['explanation_score_error_nodes'] = result['explanation_score_minimal_coalitions'].apply(lambda minimal_coalitions: set().union(*minimal_coalitions))
result['outlier_rca_error_nodes'] = outlier_rca_result
result['outlier_rca_2_error_nodes'] = outlier_rca_result_2
result['mean_rca_error_nodes'] = mean_rca_result
result['mean_rca_2_error_nodes'] = mean_rca_result_2
result['traversal_rca_error_nodes'] = traversal_rca_result
result['traversal_rca_2_error_nodes'] = traversal_rca_result_2

with open(f'{folder}/result.pkl', 'wb') as f:
    pickle.dump(result, f, protocol=4)
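
The cell above turns per-node attribution scores into sets of error nodes in two ways: by an absolute threshold, and by taking the highest-ranked nodes until their cumulative score reaches a fraction of the total. A minimal sketch of both rules on a plain dictionary follows. The helper names and the scores are made up for illustration, and the cumulative rule assumes non-negative scores, so it is not a drop-in replacement for the pandas code.

```python
def select_absolute(scores, threshold):
    """Keep every node whose score reaches the absolute threshold."""
    return {node for node, score in scores.items() if score >= threshold}


def select_cumulative(scores, fraction):
    """Keep the top-ranked nodes until they cover `fraction` of the total score."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    target = fraction * sum(scores.values())
    selected, running = set(), 0.0
    for node, score in ranked:
        selected.add(node)
        running += score
        if running >= target:
            break
    return selected


# Made-up attribution scores for a single error sample.
scores = {'auth': 0.6, 'api': 0.3, 'product': 0.07, 'order': 0.03}

absolute_nodes = select_absolute(scores, 0.15)
cumulative_nodes = select_cumulative(scores, 0.95)

print(absolute_nodes, cumulative_nodes)
```

Here the absolute rule keeps two nodes, while the cumulative rule needs a third node to reach 95% of the total, so the two variants can disagree on the same scores.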

In [None]:
# Uncomment to start with stored results 
# with open(f'{folder}/result.pkl', 'rb') as f:
#     result = pickle.load(f)

In [None]:
# First metric: Accuracy

accuracy = result.copy()

accuracy['explanation_score_correct'] = accuracy.apply(lambda row: all(coalition in row['minimal_coalitions'] for coalition in row['explanation_score_minimal_coalitions']), axis=1)
accuracy['outlier_rca_correct'] = accuracy.apply(
    lambda row: row['outlier_rca_error_nodes'] in row['minimal_coalitions'], axis=1)
accuracy['outlier_rca_2_correct'] = accuracy.apply(
    lambda row: row['outlier_rca_2_error_nodes'] in row['minimal_coalitions'], axis=1)
accuracy['mean_rca_correct'] = accuracy.apply(
    lambda row: row['mean_rca_error_nodes'] in row['minimal_coalitions'], axis=1)
accuracy['mean_rca_2_correct'] = accuracy.apply(
    lambda row: row['mean_rca_2_error_nodes'] in row['minimal_coalitions'], axis=1)
accuracy['traversal_rca_correct'] = accuracy.apply(
    lambda row: row['traversal_rca_error_nodes'] in row['minimal_coalitions'], axis=1)
accuracy['traversal_rca_2_correct'] = accuracy.apply(
    lambda row: row['traversal_rca_2_error_nodes'] in row['minimal_coalitions'], axis=1)

accuracy_columns = ['explanation_score_correct', 'outlier_rca_correct', 'outlier_rca_2_correct',
                        'mean_rca_correct', 'mean_rca_2_correct', 'traversal_rca_correct', 'traversal_rca_2_correct']

ax = accuracy.groupby('num_errors')[accuracy_columns].mean().plot(marker='o', ylabel='Accuracy',
                                                                            xlabel='Number of original errors',
                                                                            xticks=list(accuracy.groupby(
                                                                                'num_errors').groups.keys()))
markers = ['s','o', 'D', 'P', '^', 'v', 'x']
for line, marker in zip(ax.get_lines(), markers):
    line.set_marker(marker)

ax.legend(['ICECREAM (ours)', 'IT-RCA-i', 'IT-RCA-c', 'Mean-RCA-i', 'Mean-RCA-c', 'Simple Traversal RCA', 'Backtracking Traversal RCA'])

plt.savefig(f'{folder}/accuracy.pdf', bbox_inches='tight')
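
The accuracy cell uses two slightly different notions of correctness: the ICECREAM result is a list of coalitions and counts as correct if every returned coalition is a ground-truth minimal coalition, whereas each baseline returns one node set that must equal one of the minimal coalitions. The sketch below isolates these two checks. The function names and the example data are hypothetical.

```python
def coalitions_correct(predicted_coalitions, minimal_coalitions):
    """All predicted coalitions must appear among the ground-truth coalitions."""
    return all(coalition in minimal_coalitions for coalition in predicted_coalitions)


def node_set_correct(predicted_nodes, minimal_coalitions):
    """A single predicted node set must equal one ground-truth coalition."""
    return predicted_nodes in minimal_coalitions


# Made-up ground truth with two minimal coalitions of equal size.
ground_truth = [{'auth', 'api'}, {'auth', 'product'}]

checks = {
    'subset_of_truth': coalitions_correct([{'auth', 'api'}], ground_truth),
    'wrong_coalition': coalitions_correct([{'auth'}], ground_truth),
    'exact_node_set': node_set_correct({'auth', 'product'}, ground_truth),
    'union_of_truth': node_set_correct({'auth', 'api', 'product'}, ground_truth),
    'empty_prediction': coalitions_correct([], ground_truth),
}

print(checks)
```

Two properties follow directly from the Python semantics: a baseline that returns the union of several minimal coalitions is counted as incorrect, and `all` over an empty list is `True`, so an empty list of coalitions would pass the first check.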

In [None]:
# Second metric: Confusion (true positives, false negatives, false positives)

confusion = result.loc[result['num_errors'] > 2].copy()

fig, axs = plt.subplots(2, 3, figsize=(9, 6))

methods = ['explanation_score', 'outlier_rca', 'outlier_rca_2', 'mean_rca', 'mean_rca_2', 'traversal_rca']
method_names = ['ICECREAM (ours)', 'IT-RCA-i', 'IT-RCA-c', 'Mean-RCA-i', 'Mean-RCA-c', 'Traversal RCA']

for ax, method, method_name in zip(axs.reshape(6), methods, method_names):
    data = np.array([confusion.apply(lambda row: len(row['error_nodes_union'] & row[f'{method}_error_nodes']), axis=1).mean(),
        confusion.apply(lambda row: len(row['error_nodes_union'] - row[f'{method}_error_nodes']), axis=1).mean(),
        confusion.apply(lambda row: len(row[f'{method}_error_nodes'] - row['error_nodes_union']),axis=1).mean()
    ])
    ax.pie(data, colors=['#009E73', '#0072B2', '#E69F00'])
    ax.set_title(method_name)

fig.legend(['True positives', 'False negatives', 'False positives'], loc='center', bbox_to_anchor=(0.5, 0.1), ncols=3)
plt.subplots_adjust(wspace=0)
plt.savefig(f'{folder}/confusion-geq-3.pdf', bbox_inches='tight')

In [None]:
confusion = result.loc[result['num_errors'] > 2].copy()

fig, axs = plt.subplots(1, 2, figsize=(9, 6))

for ax, method, method_name in zip(axs, ['traversal_rca', 'traversal_rca_2'], ['Simple Traversal RCA', 'Backtracking Traversal RCA']):
    data = np.array([confusion.apply(lambda row: len(row['error_nodes_union'] & row[f'{method}_error_nodes']), axis=1).mean(),
        confusion.apply(lambda row: len(row['error_nodes_union'] - row[f'{method}_error_nodes']), axis=1).mean(),
        confusion.apply(lambda row: len(row[f'{method}_error_nodes'] - row['error_nodes_union']),axis=1).mean()
    ])
    ax.pie(data, colors=['#009E73', '#0072B2', '#E69F00'])
    ax.set_title(method_name)

fig.legend(['True positives', 'False negatives', 'False positives'], loc='center', bbox_to_anchor=(0.5, 0.2), ncols=3)
plt.subplots_adjust(wspace=0)
plt.savefig(f'{folder}/confusion-geq-3-traversal.pdf', bbox_inches='tight')
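
Each pie chart above shows three per-sample averages computed with set operations: the size of the intersection (true positives), of the ground truth minus the prediction (false negatives), and of the prediction minus the ground truth (false positives). The sketch below computes the same quantities without pandas. The helper names and the two example samples are made up for illustration.

```python
def confusion_counts(true_nodes, predicted_nodes):
    """Count true positives, false negatives and false positives for one sample."""
    return {
        'tp': len(true_nodes & predicted_nodes),
        'fn': len(true_nodes - predicted_nodes),
        'fp': len(predicted_nodes - true_nodes),
    }


def mean_confusion(pairs):
    """Average the per-sample counts over (true_nodes, predicted_nodes) pairs."""
    counts = [confusion_counts(true_nodes, predicted) for true_nodes, predicted in pairs]
    return {key: sum(c[key] for c in counts) / len(counts) for key in ('tp', 'fn', 'fp')}


# Two made-up samples: one missed node, then one spurious node.
pairs = [
    ({'auth', 'api'}, {'auth'}),
    ({'product_db'}, {'product_db', 'caching'}),
]
averages = mean_confusion(pairs)

print(averages)
```

Since the averages are taken over counts rather than rates, samples with more ground-truth error nodes contribute more to each slice of the chart.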