<img src="https://comet.ml/images/logo_comet_light.png" width="200px"/>

# Comet.ml Confusion Matrix

*This page is available as an executable or viewable **Jupyter Notebook**:* <br/>
<a href="https://mybinder.org/v2/gh/comet-ml/comet-examples/master?filepath=notebooks%2FComet-Confusion-Matrix.ipynb" target="_parent"><img align="left" src="https://mybinder.org/badge_logo.svg"></a>
<a href="https://nbviewer.jupyter.org/github/comet-ml/comet-examples/blob/master/notebooks/Comet-Confusion-Matrix.ipynb" target="_parent"><img align="right" src="https://raw.githubusercontent.com/jupyter/design/master/logos/Badges/nbviewer_badge.png" width="109" height="20"></a>
<br/>
<hr/>
Comet.ml can generate a variety of visualizations, including line charts, scatter charts, bar charts, and histograms. This notebook explores Comet's confusion matrix chart.

## Setup

The first thing we'll do in this notebook tutorial is install **comet_ml** and the other packages that we'll need for this demonstration: **keras**, **tensorflow**, and **numpy**.

First, comet_ml (you may want to do this slightly differently on your computer):

In [None]:
%pip install --upgrade --upgrade-strategy eager --user comet_ml 

And now tensorflow, keras, and numpy:

In [None]:
%pip install --upgrade --upgrade-strategy eager --user keras tensorflow numpy

As the output may suggest, if anything got updated, it might be a good idea to restart the kernel and continue from here.

### Comet Configuration

To run the following experiments, you'll need to set your COMET_API_KEY. The easiest way to do this is to set the value in a cell like this:

```python
import comet_ml

comet_ml.save(api_key="...")
```
where you replace the ...'s with your key.

You can get your COMET_API_KEY under your quickstart link (replace YOUR_USERNAME with your Comet.ml username):

https://www.comet.ml/YOUR_USERNAME/quickstart
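
Comet can also pick the key up from the `COMET_API_KEY` environment variable. The sketch below only checks whether that variable is set before any experiment is created. The helper name `comet_api_key_is_set` is made up for this illustration and is not part of comet_ml. A key saved by other means (for example in a config file) would not be detected by this check.

```python
import os


def comet_api_key_is_set(environ=os.environ):
    """Return True if COMET_API_KEY is present and non-empty.

    Hypothetical helper for this tutorial, not a comet_ml function.
    """
    return bool(environ.get("COMET_API_KEY", "").strip())


if not comet_api_key_is_set():
    print("COMET_API_KEY is not set in the environment.")
```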



## Example 1: Simple Confusion Matrix

First, we will create an experiment:

In [1]:
from comet_ml import Experiment

We're not interested at the moment in logging environment details or the code and related items, so we won't log those:

In [2]:
experiment = Experiment(project_name="confusion-matrix", log_env_details=False, log_code=False)

COMET INFO: Experiment is live on comet.ml https://staging.comet.ml/testuser/confusion-matrix/582c05687fee4e4a94c7901e0827dfe5



As a simple example, let's consider that we have these six patterns that are our output targets (desired output):

In [3]:
desired_output = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
 ]

Imagine that this is a classification task where each target (desired output) is composed of three output values, with one unit "on" (set to 1) and the others "off" (set to 0). This is sometimes called a "one-hot" representation and is a common way of representing categories. There are 6 patterns, 2 for each category.
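
As a minimal sketch of the one-hot idea, the same six targets can be generated from plain category indices. The helper name `one_hot` is made up for this illustration.

```python
def one_hot(category, num_categories):
    """Return a list with a 1 at position `category` and 0 elsewhere."""
    return [1 if i == category else 0 for i in range(num_categories)]


# Category indices for the six patterns: two each of categories 0, 1, and 2.
categories = [0, 1, 2, 0, 1, 2]
desired_output = [one_hot(c, 3) for c in categories]

for row in desired_output:
    print(row)
```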

Now, let's make up some sample data that a model might produce. Let's say initially that the output is fairly random and doesn't even add up to 1 for each row. This may be unrealistic, as many classification tasks use an error/loss metric based on [cross entropy](http://www.cse.unsw.edu.au/~billw/cs9444/crossentropy.html), which would push the sum of each row's values closer to 1. That might be desirable, but it is not required for our example here.

In [4]:
actual_output = [
    [0.1, 0.5, 0.4],
    [0.2, 0.2, 0.3],
    [0.7, 0.4, 0.5],
    [0.3, 0.8, 0.3],
    [0.0, 0.5, 0.3],
    [0.1, 0.5, 0.5],
 ]
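
For comparison, models trained with cross entropy commonly end in a softmax layer, which rescales each row so that it sums to 1. The sketch below applies a hand-written softmax to one of the rows above. It is only an illustration and is not something Comet requires. Note that the position of the largest value, and therefore the predicted category, is unchanged.

```python
import math


def softmax(row):
    """Rescale a row of scores into positive values that sum to 1."""
    largest = max(row)
    # Subtracting the largest value first keeps exp() numerically stable.
    exps = [math.exp(value - largest) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


row = [0.2, 0.2, 0.3]  # second row of actual_output; sums to 0.7
normalized = softmax(row)
print(normalized, sum(normalized))
```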

Our goal now is to visualize how much the model mixes up the categories. That is, we'd like to see the Confusion Matrix comparing all categories against each other. We can do that easily by simply logging it with the experiment:

In [5]:
experiment.log_confusion_matrix(desired_output, actual_output);

That's it! We can now end the experiment and take a look at the resulting matrix:

In [6]:
experiment.end()

COMET INFO: ----------------------------
COMET INFO: Comet.ml Experiment Summary:
COMET INFO:   Data:
COMET INFO:     url: https://staging.comet.ml/testuser/confusion-matrix/582c05687fee4e4a94c7901e0827dfe5
COMET INFO:   Uploads:
COMET INFO:     confusion-matrix: 1
COMET INFO: ----------------------------
COMET INFO: Uploading stats to Comet before program termination (may take several seconds)


In [7]:
experiment.display(tab="confusion-matrices")
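
To see what the chart is counting, the matrix for this example can be rebuilt by hand. The sketch below assumes that the category of each row is the index of its largest value, with ties going to the first such index. Comet's own tie-breaking may differ. The helper names `argmax` and `confusion_matrix` are made up for this illustration.

```python
def argmax(row):
    """Index of the largest value; ties go to the first occurrence."""
    return row.index(max(row))


def confusion_matrix(desired, actual, num_categories):
    """Count patterns by [desired category][predicted category]."""
    matrix = [[0] * num_categories for _ in range(num_categories)]
    for target_row, output_row in zip(desired, actual):
        matrix[argmax(target_row)][argmax(output_row)] += 1
    return matrix


desired_output = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
actual_output = [
    [0.1, 0.5, 0.4],
    [0.2, 0.2, 0.3],
    [0.7, 0.4, 0.5],
    [0.3, 0.8, 0.3],
    [0.0, 0.5, 0.3],
    [0.1, 0.5, 0.5],
]

matrix = confusion_matrix(desired_output, actual_output, 3)
for row in matrix:
    print(row)
```

Under these assumptions only one of the six patterns lands on the diagonal, so the made-up model output is correct one time in six.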

## Example 2: Log Confusion Matrices During Learning

In [8]:
from comet_ml.utils import ConfusionMatrix

from tensorflow.keras.callbacks import Callback
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.utils import to_categorical

from keras.datasets import mnist


Using TensorFlow backend.


In [9]:
num_classes = 10

# the data, shuffled and split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)
x_train = x_train.astype("float32")
x_test = x_test.astype("float32")
x_train /= 255
x_test /= 255

# convert class vectors to binary class matrices
y_train = to_categorical(y_train, num_classes)
y_test = to_categorical(y_test, num_classes)

In [10]:
def create_model():
    model = Sequential()
    model.add(Dense(128, activation="sigmoid", input_shape=(784,)))
    model.add(Dense(128, activation="sigmoid"))
    model.add(Dense(128, activation="sigmoid"))
    model.add(Dense(10, activation="softmax"))
    model.compile(
        loss="categorical_crossentropy", optimizer=RMSprop(), metrics=["accuracy"]
    )
    return model

In [11]:
class ConfusionMatrixCallback(Callback):
    def __init__(self, experiment, inputs, targets):
        super().__init__()
        self.experiment = experiment
        self.inputs = inputs
        self.targets = targets

    def on_epoch_end(self, epoch, logs=None):
        predicted = self.model.predict(self.inputs)
        self.experiment.log_confusion_matrix(
            self.targets,
            predicted,
            title="Confusion Matrix, Epoch #%d" % (epoch + 1),
            file_name="confusion-matrix-%03d.json" % (epoch + 1),
        )
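
The callback gives each epoch its own zero-padded file name. One likely benefit, shown in the sketch below, is that alphabetical order then matches epoch order. Whether Comet relies on that ordering is an assumption here. The epoch numbers are arbitrary.

```python
epochs = (0, 1, 2, 10)

# Same format string as in the callback above.
padded = ["confusion-matrix-%03d.json" % epoch for epoch in epochs]
print(sorted(padded))

# Without zero padding, epoch 10 sorts before epoch 2.
unpadded = ["confusion-matrix-%d.json" % epoch for epoch in epochs]
print(sorted(unpadded))
```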

In [12]:
experiment = Experiment(project_name="confusion-matrix", log_env_details=False, log_code=False)

COMET INFO: Experiment is live on comet.ml https://staging.comet.ml/testuser/confusion-matrix/e0d1d179cc0e4066b0557b2f39ffce2f



Before any training:

In [13]:
model = create_model()

y_predicted = model.predict(x_test)

experiment.log_confusion_matrix(
    y_test,
    y_predicted,
    step=0,
    title="Confusion Matrix, Epoch #0",
    file_name="confusion-matrix-%03d.json" % 0,
);

In [14]:
callback = ConfusionMatrixCallback(experiment, x_test, y_test)

model.fit(
    x_train,
    y_train,
    batch_size=120,
    epochs=5,
    callbacks=[callback],
    validation_data=(x_test, y_test),
)

COMET INFO: Ignoring automatic log_parameter('verbose') because 'keras:verbose' is in COMET_LOGGING_PARAMETERS_IGNORE
COMET INFO: Ignoring automatic log_parameter('do_validation') because 'keras:do_validation' is in COMET_LOGGING_PARAMETERS_IGNORE


Train on 60000 samples, validate on 10000 samples
Epoch 1/5


COMET INFO: Ignoring automatic log_metric('batch_batch') because 'keras:batch_batch' is in COMET_LOGGING_METRICS_IGNORE
COMET INFO: Ignoring automatic log_metric('batch_size') because 'keras:batch_size' is in COMET_LOGGING_METRICS_IGNORE


Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x7ff9496acd30>

In [15]:
experiment.end()

COMET INFO: ----------------------------
COMET INFO: Comet.ml Experiment Summary:
COMET INFO:   Data:
COMET INFO:     url: https://staging.comet.ml/testuser/confusion-matrix/e0d1d179cc0e4066b0557b2f39ffce2f
COMET INFO:   Metrics [count] (min, max):
COMET INFO:     accuracy [5]                : (0.7760833501815796, 0.963866651058197)
COMET INFO:     batch_accuracy [250]        : (0.07500000298023224, 0.9750000238418579)
COMET INFO:     batch_loss [250]            : (0.02728164941072464, 2.5384292602539062)
COMET INFO:     epoch_duration [5]          : (5.349212733999593, 6.861876343999029)
COMET INFO:     loss [5]                    : (0.12004035719856619, 0.7737466164529324)
COMET INFO:     step                        : 2920
COMET INFO:     val_accuracy [5]            : (0.9052000045776367, 0.9639000296592712)
COMET INFO:     val_loss [5]                : (0.1247768894834444, 0.32307542380690574)
COMET INFO:     validate_batch_accuracy [45]: (0.880081295967102, 1.0)
COMET INFO:     val

In [16]:
experiment.display(tab="confusion-matrices")

## Example 3: Reuse ConfusionMatrix instance

In [17]:
def index_to_example(index):
    image_array = x_test[index]
    image_name = "confusion-matrix-%05d.png" % index
    results = experiment.log_image(
        image_array, name=image_name, image_shape=(28, 28, 1)
    )
    # Return sample, assetId (index is added automatically)
    return {"sample": image_name, "assetId": results["imageId"]}


class ConfusionMatrixCallback(Callback):
    def __init__(self, experiment, inputs, targets, cm):
        super().__init__()
        self.experiment = experiment
        self.inputs = inputs
        self.targets = targets
        self.cm = cm

    def on_epoch_end(self, epoch, logs=None):
        predicted = self.model.predict(self.inputs)
        self.cm.compute_matrix(self.targets, predicted)
        self.experiment.log_confusion_matrix(
            matrix=self.cm,
            title="Confusion Matrix, Epoch #%d" % (epoch + 1),
            file_name="confusion-matrix-%03d.json" % (epoch + 1),
        )

In [18]:
experiment = Experiment(project_name="confusion-matrix", log_env_details=False, log_code=False)

COMET INFO: Experiment is live on comet.ml https://staging.comet.ml/testuser/confusion-matrix/d9f38aee643b4599a1cd2e25b6337ded



In [19]:
model = create_model()

In [20]:
# Before any training:
y_predicted = model.predict(x_test)

In [21]:
CM = ConfusionMatrix(index_to_example_function=index_to_example)

In [23]:
CM.compute_matrix(y_test, y_predicted)

In [24]:
CM.display()

   A                Confusion Matrix            
   c               Predicted Category           
   t       0   1   2   3   4   5   6   7   8   9
   u   0   0   0   0   0   0   0   0   0 980   0
   a   1   0   0   0   0   0   0   0   0 113   0
   l   2   0   0   0   0   0   0   0   0 103   0
       3   0   0   0   0   0   0   0   0 101   0
   C   4   0   0   0   0   0   0   0   0 982   0
   a   5   0   0   0   0   0   0   0   0 892   0
   t   6   0   0   0   0   0   0   0   0 958   0
   e   7   0   0   0   0   0   0   0   0 102   0
   g   8   0   0   0   0   0   0   0   0 974   0
   o   9   0   0   0   0   0   0   0   0 100   0
   r
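
In the display above, every count sits in the column for predicted category 8: the untrained network assigns all test images to the same class. A confusion matrix also gives the accuracy directly, as the diagonal total divided by the grand total. The sketch below shows that calculation on a small made-up matrix with the same "single column" shape. The counts and the helper name `accuracy_from_matrix` are hypothetical.

```python
def accuracy_from_matrix(matrix):
    """Fraction of examples on the diagonal (predicted == actual)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


# Hypothetical 3-category matrix, [actual][predicted]:
# every example is predicted as category 2.
untrained = [
    [0, 0, 5],
    [0, 0, 4],
    [0, 0, 6],
]

accuracy = accuracy_from_matrix(untrained)
print(accuracy)
```

Only the examples whose actual category happens to be the one the model always predicts count as correct.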


In [25]:
experiment.log_confusion_matrix(
    matrix=CM,
    step=0,
    title="Confusion Matrix, Epoch #0",
    file_name="confusion-matrix-%03d.json" % 0,
);

{'web': 'https://staging.comet.ml/api/asset/download?assetId=8cfb31ffafb24946892d1e97bb50f971&experimentKey=d9f38aee643b4599a1cd2e25b6337ded',
 'api': 'https://staging.comet.ml/api/rest/v1/asset/get-asset?assetId=8cfb31ffafb24946892d1e97bb50f971&experimentKey=d9f38aee643b4599a1cd2e25b6337ded',
 'assetId': '8cfb31ffafb24946892d1e97bb50f971'}

In [26]:
callback = ConfusionMatrixCallback(experiment, x_test, y_test, CM)

model.fit(
    x_train,
    y_train,
    batch_size=120,
    epochs=5,
    callbacks=[callback],
    validation_data=(x_test, y_test),
)

Train on 60000 samples, validate on 10000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x7ff94976c400>

In [27]:
experiment.end()

COMET INFO: ----------------------------
COMET INFO: Comet.ml Experiment Summary:
COMET INFO:   Data:
COMET INFO:     url: https://staging.comet.ml/testuser/confusion-matrix/d9f38aee643b4599a1cd2e25b6337ded
COMET INFO:   Metrics [count] (min, max):
COMET INFO:     accuracy [5]                : (0.7754833102226257, 0.9641166925430298)
COMET INFO:     batch_accuracy [250]        : (0.0833333358168602, 0.9666666388511658)
COMET INFO:     batch_loss [250]            : (0.046925898641347885, 2.69844913482666)
COMET INFO:     epoch_duration [5]          : (8.152320004999638, 37.434222419000434)
COMET INFO:     loss [5]                    : (0.12025839633122086, 0.7795354133248329)
COMET INFO:     step                        : 2920
COMET INFO:     val_accuracy [5]            : (0.9118000268936157, 0.9631999731063843)
COMET INFO:     val_loss [5]                : (0.12855259421095253, 0.3052644800543785)
COMET INFO:     validate_batch_accuracy [45]: (0.8884146213531494, 1.0)
COMET INFO:     va

In [29]:
experiment.display(tab="confusion-matrices")

Data structure/asset type: "confusion-matrix"

JSON format:

```
{
    "version": 1,
    "title": "Confusion Matrix",
    "labels": <List of N strings>,
    "matrix": <N x N list of lists of integers, [ROW][COL] order>,
    "rowLabel": "Actual Category",
    "columnLabel": "Predicted Category",
    "maxSamplesPerCell": 25,
    "sampleMatrix": <N x N list of list of cells, or null>,
    "type": "integer" | "string" | "link" | "image",
}
```

The matrix holds the integer confusion counts in [ROW][COL] order. The
labels apply to the rows (top to bottom) and to the columns (left to
right). The cells on the diagonal (top-left to bottom-right) are the
"correct" cells, where the predicted and actual categories are the same.
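
As a rough illustration of the format above, the sketch below assembles a small document by hand and serializes it with the standard `json` module. The labels and counts are made up, and setting `"type"` to `"integer"` with no samples is an assumption. In practice `log_confusion_matrix` produces this asset for you.

```python
import json

labels = ["cat", "dog", "bird"]  # hypothetical labels
matrix = [
    [5, 1, 0],
    [2, 3, 1],
    [0, 0, 4],
]  # hypothetical counts in [ROW][COL] order: [actual][predicted]

document = {
    "version": 1,
    "title": "Confusion Matrix",
    "labels": labels,
    "matrix": matrix,
    "rowLabel": "Actual Category",
    "columnLabel": "Predicted Category",
    "maxSamplesPerCell": 25,
    "sampleMatrix": None,  # serialized as null: no examples attached
    "type": "integer",
}

# Basic shape check: N labels and an N x N matrix.
n = len(labels)
assert len(matrix) == n and all(len(row) == n for row in matrix)

serialized = json.dumps(document, indent=4)
print(serialized)
```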

If the sampleMatrix is null, there are no examples.

A cell can be null, or a dictionary/object with keys/values:

* index: an integer representing example number
* sample: string
* assetId: string or null (null if the sample is not an asset)

When type == "integer", the cell is a list of associated example
numbers (usually the index, but it can be anything).

Otherwise:

* when type == "string", sample is the string to show
* when type == "link", sample is a URL
* when type == "image", sample is the name of the asset, and assetId is its asset ID
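
The sketch below builds one example entry for each of the non-integer types, using the `index`, `sample`, and `assetId` keys listed above. All values, including the asset ID, are made up, and the helper name `make_example` is hypothetical.

```python
def make_example(index, sample, asset_id=None):
    """Build one example entry with the keys index, sample, and assetId.

    assetId stays None (JSON null) when the sample is not a logged asset.
    """
    return {"index": index, "sample": sample, "assetId": asset_id}


# All values below are made up for illustration.
string_example = make_example(7, "a seven written with a crossbar")
link_example = make_example(7, "https://example.com/samples/7")
image_example = make_example(7, "confusion-matrix-00007.png", asset_id="0123abcd")

for example in (string_example, link_example, image_example):
    print(example)
```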

When the asset is logged, it is logged with:

* step
* file_name ("confusion-matrix.json")
* asset_type ("confusion-matrix")

We hope that this gives you some ideas of how you can use the Comet Confusion Matrix! If you have questions or comments, feel free to visit the [Comet issue tracker](https://github.com/comet-ml/issue-tracking) and leave us a note.