It is a measure of inter-rater reliability, and there is a Julia package that computes it: https://discourse.julialang.org/t/ann-krippendorff-jl/54261
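To get a feel for what such a measure computes, here is a minimal sketch, assuming the measure in question is Krippendorff's alpha (as the linked package name suggests) and that the ratings are nominal categories. It is plain Python written from the standard coincidence-matrix definition; it is not the API of the Julia package, and the function name `krippendorff_alpha_nominal` is made up for illustration.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data (illustrative sketch).

    `units` is a list of lists: each inner list holds the ratings that one
    unit (item) received from the raters, with None marking a missing rating.
    """
    # Coincidence matrix: every ordered pair of ratings within a unit
    # contributes 1 / (m - 1), where m is the number of ratings in that unit.
    coincidences = Counter()
    for ratings in units:
        values = [r for r in ratings if r is not None]
        m = len(values)
        if m < 2:
            continue  # a unit with fewer than two ratings cannot be paired
        for a, b in permutations(values, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    # Marginal totals per category and the overall number of pairable ratings.
    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    if n == 0:
        return float("nan")  # no pairable ratings at all

    # Nominal metric: any two different categories count as full disagreement.
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        return float("nan")  # only one category was used; alpha is undefined

    return 1.0 - observed / expected


if __name__ == "__main__":
    # Two raters, four items; they agree on two items and disagree on two.
    print(krippendorff_alpha_nominal([[1, 1], [1, 2], [2, 2], [2, 1]]))
```

A value of 1 means perfect agreement and values near 0 mean agreement no better than chance. Other data types (ordinal, interval, ratio) need a different disagreement metric, which this sketch does not cover.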