Note that the data list needs to contain the same number of triples for each
individual coder, containing category values for the same set of items.
Alpha (Krippendorff 1980)
Kappa (Cohen 1960)
S (Bennet, Albert and Goldstein 1954)
Pi (Scott 1955)
TODO: Describe handling of multiple coders and missing data
In theory, Krippendorff's alpha CAN handle missing data (coders that do not assign an annotation to an item).
However, the documentation states:
Note that the data list needs to contain the same number of triples for each
individual coder, containing category values for the same set of items.
So, if we omit the triple for a missing annotation, will the metrics still be computed correctly? Or do we have to put a placeholder such as np.nan or None in the labels field of the triple?
Something like this?
("coder1", "item1", None)
I am particularly interested in the multi-label case with the MASI distance and with labels defined in a frozenset(['l1','l2']).
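To make the question concrete, here is a minimal, standard-library-only sketch of how Krippendorff's alpha can be computed when some triples are simply omitted. It is not NLTK's implementation: `masi_distance` and `krippendorff_alpha` below are hypothetical helpers written for this illustration, and the MASI weights (1, 0.67, 0.33, 0) are an assumption. Items that end up with fewer than two labels are ignored, since they cannot contribute any pairable values.

```python
from collections import defaultdict
from itertools import permutations


def masi_distance(a, b):
    """Sketch of the MASI distance between two label sets.

    The monotonicity weights 1 / 0.67 / 0.33 / 0 are assumed here.
    """
    inter = len(a & b)
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty sets are treated as identical
    if a == b:
        m = 1.0
    elif inter == min(len(a), len(b)):
        m = 0.67  # one set is a subset of the other
    elif inter > 0:
        m = 0.33  # overlapping, but neither contains the other
    else:
        m = 0.0  # disjoint
    return 1.0 - (inter / union) * m


def krippendorff_alpha(triples, distance=masi_distance):
    """Alpha over (coder, item, label) triples; absent triples count as missing."""
    by_item = defaultdict(list)
    for _coder, item, label in triples:
        by_item[item].append(label)

    # Only items with at least two labels contain pairable values.
    units = [labels for labels in by_item.values() if len(labels) >= 2]
    n = sum(len(labels) for labels in units)
    if n < 2:
        raise ValueError("need at least one item labelled by two coders")

    # Observed disagreement: ordered pairs within each item, weighted by 1/(m_u - 1).
    observed = 0.0
    for labels in units:
        pair_sum = sum(distance(x, y) for x, y in permutations(labels, 2))
        observed += pair_sum / (len(labels) - 1)
    observed /= n

    # Expected disagreement: ordered pairs over all pairable values pooled together.
    pooled = [label for labels in units for label in labels]
    expected = sum(distance(x, y) for x, y in permutations(pooled, 2)) / (n * (n - 1))
    if expected == 0:
        return 1.0  # only one distinct label value was ever used
    return 1.0 - observed / expected


# coder3 did not annotate item2, so that triple is simply left out.
data = [
    ("coder1", "item1", frozenset(["l1"])),
    ("coder2", "item1", frozenset(["l1"])),
    ("coder3", "item1", frozenset(["l1"])),
    ("coder1", "item2", frozenset(["l2"])),
    ("coder2", "item2", frozenset(["l2"])),
]
alpha = krippendorff_alpha(data)
print(alpha)
```

Under this formulation the missing triple needs no placeholder at all. Whether NLTK's `alpha()` behaves the same way on such input is exactly what this issue is asking.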
It seems that if None is used, it is treated as an ordinary label value rather than being ignored, as one might expect. Please have a look at #2865 for some more discussion on the topic. That said, I'm unsure whether omitting these triples results in the correct computation: I'm not very familiar with the agreement module, nor with the corresponding line of research.
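If None really is counted as just another category, one possible workaround is to strip placeholder triples before building the data list. The sketch below uses only the standard library; `is_missing` and `drop_missing` are hypothetical helper names, not part of NLTK. It makes no claim about whether the metrics computed on the filtered list are then correct, which remains the open question here.

```python
def is_missing(label):
    """Treat None and float NaN (including np.nan) as 'no annotation'."""
    # NaN is the only float that is not equal to itself.
    return label is None or (isinstance(label, float) and label != label)


def drop_missing(triples):
    """Return only the (coder, item, label) triples that carry a real label."""
    return [(coder, item, label) for coder, item, label in triples
            if not is_missing(label)]


raw = [
    ("coder1", "item1", None),                     # placeholder for a missing label
    ("coder2", "item1", frozenset(["l1", "l2"])),
    ("coder3", "item1", float("nan")),             # another common placeholder
]
cleaned = drop_missing(raw)
print(cleaned)
```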
Documentation does not specify how to handle missing values:
nltk/nltk/metrics/agreement.py
Lines 35 to 44 in e4444c9