Krippendorff's alpha coefficient

Overview

The alpha coefficient is a chance-adjusted index of the reliability of categorical measurements. It estimates chance agreement using an average-distribution-based approach. Like Scott's pi coefficient, it assumes that observers have conspired to meet a shared "quota" for each category. However, unlike pi, it also applies a correction for sample size, and therefore yields a slightly higher reliability score than the pi coefficient, especially when the reliability experiment includes few items.
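
For example, with two raters the sample-size correction amounts to the following relationship between alpha and Scott's pi computed from the same ratings, where n is the number of items:

\alpha = \pi + \frac{1 - \pi}{2n}

so the advantage of alpha over pi shrinks toward zero as the number of items grows.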

MATLAB Functions

  • mALPHAK %Calculates alpha using vectorized formulas

Simplified Formulas

Use these formulas with two raters and two (dichotomous) categories:


p_o = \frac{n_{11} + n_{22}}{n}

m_1 = \frac{n_{1+} + n_{+1}}{2}

m_2 = \frac{n_{2+} + n_{+2}}{2}

p_c = \left(\frac{m_1}{n}\right)^2 + \left(\frac{m_2}{n}\right)^2

\alpha = \frac{\left(1 - \frac{1}{2n}\right)p_o + \frac{1}{2n} - p_c}{1 - p_c}


n_11 is the number of items both raters assigned to category 1

n_22 is the number of items both raters assigned to category 2

n is the total number of items

n_1+ is the number of items rater 1 assigned to category 1

n_2+ is the number of items rater 1 assigned to category 2

n_+1 is the number of items rater 2 assigned to category 1

n_+2 is the number of items rater 2 assigned to category 2

Contingency Table

                      Rater 2
                Category 1   Category 2   Total
Rater 1
  Category 1       n_11         n_12       n_1+
  Category 2       n_21         n_22       n_2+
  Total            n_+1         n_+2       n
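
As a concrete illustration, here is a short MATLAB sketch that applies the simplified formulas to a hypothetical contingency table (the counts are made up for this example and are not from any real study):

```matlab
% Hypothetical counts (rows = rater 1, columns = rater 2):
%              Cat 1   Cat 2
%   Cat 1        40       5
%   Cat 2        10      45
n11 = 40; n12 = 5; n21 = 10; n22 = 45;
n = n11 + n12 + n21 + n22;                 % total number of items = 100

p_o = (n11 + n22) / n;                     % observed agreement = 0.8500
m1 = ((n11 + n12) + (n11 + n21)) / 2;      % mean count for category 1 = 47.5
m2 = ((n21 + n22) + (n12 + n22)) / 2;      % mean count for category 2 = 52.5
p_c = (m1 / n)^2 + (m2 / n)^2;             % chance agreement = 0.50125
alpha = ((1 - 1/(2*n))*p_o + 1/(2*n) - p_c) / (1 - p_c);
fprintf('alpha = %.4f\n', alpha);          % prints alpha = 0.7008
```

For comparison, Scott's pi for the same table is (0.8500 - 0.50125) / (1 - 0.50125) ≈ 0.6992, slightly lower than alpha, as described in the overview.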

Generalized Formulas

Use these formulas with multiple raters, multiple categories, and any weighting scheme:


r^*_{ik} = \sum_{l=1}^{q} w_{kl}\, r_{il}

p'_o = \frac{1}{n'} \sum_{i=1}^{n'} \sum_{k=1}^{q} \frac{r_{ik}\left(r^*_{ik} - 1\right)}{\bar{r}\left(r_i - 1\right)}

\bar{r} = \frac{1}{n'} \sum_{i=1}^{n'} r_i

\epsilon_n = \frac{1}{n'\bar{r}}

p_o = p'_o\left(1 - \epsilon_n\right) + \epsilon_n

\pi_k = \frac{1}{n'} \sum_{i=1}^{n'} \frac{r_{ik}}{\bar{r}}

p_c = \sum_{k=1}^{q} \sum_{l=1}^{q} w_{kl}\, \pi_k \pi_l

\alpha = \frac{p_o - p_c}{1 - p_c}


q is the total number of categories

w_kl is the weight associated with two raters assigning an item to categories k and l

r_il is the number of raters that assigned item i to category l

n' is the number of items that were coded by two or more raters

r_ik is the number of raters that assigned item i to category k

r_i is the number of raters that assigned item i to any category
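
For illustration, the following MATLAB sketch implements the generalized formulas directly (it is a simplified stand-in, not the mALPHAK function itself, and the name alphak_sketch is made up for this example). It assumes the ratings have already been tallied into an n-by-q matrix whose (i, k) entry is the number of raters who assigned item i to category k, and that w is a q-by-q symmetric weight matrix (the identity matrix for unweighted, nominal agreement).

```matlab
function alpha = alphak_sketch(r_ik, w)
% Sketch of the generalized alpha formulas above (not the mALPHAK function).
% r_ik : n-by-q matrix; r_ik(i,k) = number of raters assigning item i to category k
% w    : q-by-q symmetric weight matrix (identity matrix for nominal agreement)

r_i = sum(r_ik, 2);                        % number of raters per item
keep = r_i >= 2;                           % keep only items coded by two or more raters
r_ik = r_ik(keep, :);
r_i = r_i(keep);
nprime = size(r_ik, 1);                    % n'
rbar = mean(r_i);                          % average raters per retained item
epsilon_n = 1 / (nprime * rbar);

rstar_ik = r_ik * w';                      % rstar_ik(i,k) = sum over l of w(k,l) * r_ik(i,l)
pprime_o = sum(sum(r_ik .* (rstar_ik - 1) ./ (rbar .* (r_i - 1)))) / nprime;
p_o = pprime_o * (1 - epsilon_n) + epsilon_n;   % observed agreement with small-sample correction

pi_k = sum(r_ik, 1)' ./ (nprime * rbar);   % average proportion of ratings in each category
p_c = sum(sum(w .* (pi_k * pi_k')));       % chance agreement
alpha = (p_o - p_c) / (1 - p_c);
end
```

For example, alpha = alphak_sketch(r_ik, eye(size(r_ik, 2))) computes the unweighted (nominal) coefficient.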

References

  1. Krippendorff, K. (1970). Estimating the reliability, systematic error and random error of interval data. Educational and Psychological Measurement, 30(1), 61–70.
  2. Krippendorff, K. (1980). Content analysis: An introduction to its methodology. Newbury Park, CA: Sage Publications.
  3. Hayes, A. F., & Krippendorff, K. (2007). Answering the call for a standard reliability measure for coding data. Communication Methods and Measures, 1(1), 77–89.
  4. Gwet, K. L. (2014). Handbook of inter-rater reliability: The definitive guide to measuring the extent of agreement among raters (4th ed.). Gaithersburg, MD: Advanced Analytics.