Question - Agreement Krippendorff's alpha handling missing values #2732

Open
gbmarc1 opened this issue Jun 16, 2021 · 1 comment

gbmarc1 commented Jun 16, 2021

Documentation does not specify how to handle missing values:

Note that the data list needs to contain the same number of triples for each
individual coder, containing category values for the same set of items.
Alpha (Krippendorff 1980)
Kappa (Cohen 1960)
S (Bennet, Albert and Goldstein 1954)
Pi (Scott 1955)
TODO: Describe handling of multiple coders and missing data

In theory, Krippendorff's alpha CAN handle missing data (coders that do not assign an annotation to an item).
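To make that concrete, here is a minimal pure-Python sketch of Krippendorff's alpha that tolerates missing annotations by simply leaving the triple out. It is not NLTK's implementation and says nothing about how `AnnotationTask` behaves; the names `krippendorff_alpha`, `nominal_distance` and `jaccard` are made up for illustration. Items with fewer than two labels are not pairable and are ignored; every other item contributes its ordered label pairs, weighted by 1 / (m - 1), where m is the number of labels on that item.

```python
from collections import defaultdict
from itertools import permutations


def nominal_distance(a, b):
    """0 if the two labels are identical, 1 otherwise."""
    return 0.0 if a == b else 1.0


def jaccard(a, b):
    """Simple set distance for multi-label (frozenset) annotations."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def krippendorff_alpha(triples, distance=nominal_distance):
    """Alpha over (coder, item, label) triples; a missing label is an absent triple."""
    by_item = defaultdict(list)
    for _coder, item, label in triples:
        by_item[item].append(label)

    # Only items annotated by at least two coders can be paired.
    units = [labels for labels in by_item.values() if len(labels) >= 2]
    values = [label for labels in units for label in labels]
    n = len(values)
    if n < 2:
        raise ValueError("need at least two pairable labels")

    # Observed disagreement: within-item ordered pairs, weighted by 1 / (m - 1).
    observed = sum(
        sum(distance(a, b) for a, b in permutations(labels, 2)) / (len(labels) - 1)
        for labels in units
    ) / n

    # Expected disagreement: ordered pairs over all pairable labels.
    expected = sum(distance(a, b) for a, b in permutations(values, 2)) / (n * (n - 1))
    if expected == 0:
        raise ValueError("all pairable labels are identical; alpha is undefined")
    return 1.0 - observed / expected


data = [
    ("coder1", "item1", "x"), ("coder2", "item1", "x"),
    ("coder1", "item2", "y"), ("coder2", "item2", "y"), ("coder3", "item2", "y"),
    ("coder1", "item3", "x"), ("coder2", "item3", "y"),
    ("coder1", "item4", "x"),  # only one coder labelled item4; it is ignored
]
print(krippendorff_alpha(data))  # approximately 0.5
```

For the multi-label case, the same function accepts frozenset labels together with a set distance such as `jaccard` above; a MASI distance with the same two-argument shape (for example `nltk.metrics.distance.masi_distance`) could be passed in the same way.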

However, the documentation states:

Note that the data list needs to contain the same number of triples for each
individual coder, containing category values for the same set of items.

So, if we omit the triple for a missing annotation, will the metrics be computed correctly? Or do we have to put a placeholder such as np.nan or None in the labels field of the triple?

Something like this?

("coder1", "item1", None)

I am particularly interested in the multi-label case with the MASI distance, where each label is a frozenset such as frozenset(['l1','l2']).

@tomaarsen
Member

It seems that if None is used, it is treated as a regular label value rather than being ignored, as one might expect. Please have a look at #2865 for more discussion on the topic. That said, I'm unsure whether omitting these triples results in the correct computation: I'm not very familiar with the agreement module, nor with the corresponding line of research.
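If placeholders are already present in the data and one wants to try the omission route, a small filter along these lines could strip them before the triples are handed to any agreement computation. This is only a sketch: `is_missing` and `drop_missing` are hypothetical helper names, and nothing in this thread confirms that the agreement module gives the correct result once the triples are omitted.

```python
import math


def is_missing(label):
    """Treat None and float NaN (which includes np.nan) as 'no annotation'."""
    return label is None or (isinstance(label, float) and math.isnan(label))


def drop_missing(triples):
    """Return only the (coder, item, label) triples that carry a real label."""
    return [triple for triple in triples if not is_missing(triple[2])]


raw = [
    ("coder1", "item1", None),
    ("coder2", "item1", frozenset(["l1", "l2"])),
    ("coder3", "item1", float("nan")),
]
cleaned = drop_missing(raw)
```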
