## Bidisagreements visualisation matrix

This notebook demonstrates the `agreements` module of the `disagree` library, which supports the qualitative assessment of bidisagreements: data instances to which the annotators assigned exactly two distinct labels. The module's only class is `agreements.BiDisagreements`.

In [3]:
import sys
import pandas as pd

from disagree import agreements
from disagree.agreements import BiDisagreements

First we set up a dictionary of annotated data, which we convert to a dataframe for use in this library:

In [4]:
test_annotations = {"a": [None, None, None, None, None, "dog", "ant", "cat", "dog", "cat", "cat", "cow", "cow", None, "cow"],
                    "b": ["cat", None, "dog", "cat", "cow", "cow", "ant", "cow", None, None, None, None, None, None, None],
                    "c": [None, None, "dog", "cat", "cow", "ant", "ant", None, "dog", "cat", "cat", "cow", "cow", None, "ant"]}
df = pd.DataFrame(test_annotations)
print(df) 

       a     b     c
0   None   cat  None
1   None  None  None
2   None   dog   dog
3   None   cat   cat
4   None   cow   cow
5    dog   cow   ant
6    ant   ant   ant
7    cat   cow  None
8    dog  None   dog
9    cat  None   cat
10   cat  None   cat
11   cow  None   cow
12   cow  None   cow
13  None  None  None
14   cow  None   ant


Initialise the instance of `BiDisagreements`:

In [5]:
bidis = BiDisagreements(df)

We can get a summary of how many data instances had no disagreement, one disagreement (a bidisagreement), two disagreements (a tridisagreement), or more.

In [6]:
bidis.agreements_summary()

Number of instances with:
No disagreement: 9
Bidisagreement: 2
Tridisagreement: 1
More disagreements: 0


(9, 2, 1, 0)

This shows that there are 9 instances for which all the annotators who labelled them agree. In 2 instances the annotators chose two different labels, and in 1 instance all three annotators chose different labels. No instance has more than three conflicting labels, which is expected because this example has only 3 annotators.
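To make these counts concrete, here is a minimal pandas sketch that recomputes them by counting the distinct labels in each row. It is not the library's implementation. It assumes that rows with fewer than two labels (such as row 0, which only annotator `b` labelled) are left out of the "no disagreement" count, which is what the figure of 9 above implies.

```python
import pandas as pd

test_annotations = {"a": [None, None, None, None, None, "dog", "ant", "cat", "dog", "cat", "cat", "cow", "cow", None, "cow"],
                    "b": ["cat", None, "dog", "cat", "cow", "cow", "ant", "cow", None, None, None, None, None, None, None],
                    "c": [None, None, "dog", "cat", "cow", "ant", "ant", None, "dog", "cat", "cat", "cow", "cow", None, "ant"]}
df = pd.DataFrame(test_annotations)

n_labels = df.count(axis=1)      # labels given per row (None is ignored)
n_distinct = df.nunique(axis=1)  # distinct labels per row (None is ignored)

# Assumption: a row needs at least two labels to count as an agreement.
summary = (
    int(((n_distinct == 1) & (n_labels >= 2)).sum()),  # no disagreement
    int((n_distinct == 2).sum()),                      # bidisagreement
    int((n_distinct == 3).sum()),                      # tridisagreement
    int((n_distinct > 3).sum()),                       # more disagreements
)
print(summary)  # (9, 2, 1, 0)
```

The two bidisagreements are rows 7 and 14, and the tridisagreement is row 5.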

If you just want to look at the bidisagreements visually, you can return a matrix representing them and plot it however you like. Element $(i, j)$ is the number of bidisagreements between label $i$ and label $j$.

In [7]:
mat = bidis.agreements_matrix()
mat_normalised = bidis.agreements_matrix(normalise=True)

In [8]:
print("Bidisagreements Matrix")
print(mat)
print()
print("Normalised Bidisagreements Matrix")
print(mat_normalised)

Bidisagreements Matrix
[[0. 0. 0. 2.]
 [0. 0. 0. 0.]
 [0. 0. 0. 2.]
 [2. 0. 2. 0.]]

Normalised Bidisagreements Matrix
[[0.  0.  0.  0.5]
 [0.  0.  0.  0. ]
 [0.  0.  0.  0.5]
 [0.5 0.  0.5 0. ]]


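As the `agreements_summary` method showed, there were two bidisagreements. The non-zero entries of the matrix are at positions $(0, 3)$ and $(2, 3)$, mirrored at $(3, 0)$ and $(3, 2)$ because the matrix is symmetric. So one bidisagreement is between labels 0 and 3, and the other is between labels 2 and 3. Note that each of these entries is 2 rather than 1 in the output above, which suggests that each bidisagreement is counted twice.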
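The matrix refers to labels by index rather than by name. To see which label names are involved, one option is to count the label pairs directly from the dataframe. The sketch below uses only pandas and the standard library, not the `disagree` API, and the name `pair_counts` is purely illustrative.

```python
from collections import Counter

import pandas as pd

test_annotations = {"a": [None, None, None, None, None, "dog", "ant", "cat", "dog", "cat", "cat", "cow", "cow", None, "cow"],
                    "b": ["cat", None, "dog", "cat", "cow", "cow", "ant", "cow", None, None, None, None, None, None, None],
                    "c": [None, None, "dog", "cat", "cow", "ant", "ant", None, "dog", "cat", "cat", "cow", "cow", None, "ant"]}
df = pd.DataFrame(test_annotations)

pair_counts = Counter()
for _, row in df.iterrows():
    labels = sorted(set(row.dropna()))  # distinct labels given to this instance
    if len(labels) == 2:                # exactly two distinct labels: a bidisagreement
        pair_counts[tuple(labels)] += 1

for pair, count in pair_counts.items():
    print(pair, count)
```

For this data the two pairs are `('cat', 'cow')` from row 7 and `('ant', 'cow')` from row 14, each occurring once.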

At this small scale the matrix is not very useful, but when you have tens of thousands of labels it can quickly show where large disagreements are coming from. Once you can pinpoint the source of the disagreement, you can go about modifying the annotation schema and/or the label types.
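One way to view the matrix is as a heatmap. The following is a minimal matplotlib sketch, not part of the `disagree` library. To stay self-contained it hard-codes the values printed above; in the notebook you would pass the array returned by `bidis.agreements_matrix()` instead. The axes show label indices only.

```python
import matplotlib.pyplot as plt
import numpy as np

# Values copied from the printed bidisagreements matrix above.
mat = np.array([[0., 0., 0., 2.],
                [0., 0., 0., 0.],
                [0., 0., 0., 2.],
                [2., 0., 2., 0.]])

fig, ax = plt.subplots()
im = ax.imshow(mat, cmap="viridis")  # one coloured cell per label pair

ticks = np.arange(mat.shape[0])
ax.set_xticks(ticks)
ax.set_yticks(ticks)
ax.set_xlabel("Label index j")
ax.set_ylabel("Label index i")
ax.set_title("Bidisagreements between label pairs")
fig.colorbar(im, ax=ax, label="Number of bidisagreements")

plt.show()
```

With many labels, the brightest cells mark the label pairs that annotators confuse most often.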

Addressing these issues is essential to building datasets that machine learning algorithms can rely on. If your annotations are fraught with disagreements, any machine learning model trained on them will be unreliable.