crowdkit.aggregation.classification.dawid_skene.OneCoinDawidSkene
OneCoinDawidSkene(
self,
n_iter: int = 100,
tol: float = 1e-05
)
The one-coin Dawid-Skene aggregation model works the same way as the original Dawid-Skene model based on the EM algorithm, except for how the workers' errors are calculated at the M-step of the algorithm.
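For the one-coin model, a worker confusion (error) matrix is parameterized by a single parameter: the worker's skill (accuracy).
In other words, the worker assigns the correct label with a probability equal to their skill, regardless of what the true label is, and otherwise assigns an incorrect label at random.
The model parameters are estimated with the Expectation-Maximization (EM) algorithm: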
- E-step. Estimates the true task label probabilities using the specified workers' responses, the prior label probabilities, and the workers' error probability matrix.
- M-step. Calculates each worker's skill as their accuracy according to the current label probabilities. Then estimates the workers' error probability matrix by assigning the worker skills to the error matrix row by row.
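The two steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Crowd-Kit's implementation: the function names `one_coin_m_step` and `one_coin_e_step`, the dictionary-based data layout, and the toy responses are all hypothetical. The sketch assumes that with K labels a worker with skill s answers correctly with probability s and spreads the remaining 1 - s evenly over the other K - 1 labels.

```python
from collections import defaultdict
from math import prod


def one_coin_m_step(responses, probas, labels):
    """Hypothetical M-step: one skill per worker, then a K x K error matrix."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for task, worker, label in responses:
        # Expected correctness of this response under the current label probabilities.
        totals[worker] += probas[task][label]
        counts[worker] += 1
    skills = {w: totals[w] / counts[w] for w in totals}

    k = len(labels)
    # errors[worker][true][observed]: skill on the diagonal,
    # (1 - skill) / (K - 1) everywhere else in the row.
    errors = {
        w: {
            true: {obs: s if obs == true else (1.0 - s) / (k - 1) for obs in labels}
            for true in labels
        }
        for w, s in skills.items()
    }
    return skills, errors


def one_coin_e_step(responses, priors, errors, labels):
    """Hypothetical E-step: posterior label probabilities for every task."""
    by_task = defaultdict(list)
    for task, worker, label in responses:
        by_task[task].append((worker, label))
    probas = {}
    for task, answers in by_task.items():
        joint = {
            true: priors[true] * prod(errors[w][true][obs] for w, obs in answers)
            for true in labels
        }
        norm = sum(joint.values())  # assumed non-zero in this toy example
        probas[task] = {true: joint[true] / norm for true in labels}
    return probas


# Toy data (made up for illustration): (task, worker, label) triples.
labels = ["cat", "dog"]
responses = [
    ("t1", "w1", "cat"), ("t1", "w2", "cat"), ("t1", "w3", "dog"),
    ("t2", "w1", "dog"), ("t2", "w2", "dog"), ("t2", "w3", "dog"),
]
# Initial label probabilities taken from vote shares.
probas = {"t1": {"cat": 2 / 3, "dog": 1 / 3}, "t2": {"cat": 0.0, "dog": 1.0}}

skills, errors = one_coin_m_step(responses, probas, labels)
priors = {l: sum(p[l] for p in probas.values()) / len(probas) for l in labels}
new_probas = one_coin_e_step(responses, priors, errors, labels)
```

Because every row of a worker's error matrix is derived from the same skill value, the M-step only has to estimate one number per worker instead of a full K x K matrix.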
Y. Zhang, X. Chen, D. Zhou, and M. I. Jordan. Spectral methods meet EM: A provably optimal algorithm for crowdsourcing. Journal of Machine Learning Research. Vol. 17, (2016), 1-44.
https://doi.org/10.48550/arXiv.1406.3824
Parameters | Type | Description |
---|---|---|
n_iter | int | The maximum number of EM iterations. |
tol | float | The tolerance stopping criterion for iterative methods with a variable number of steps. The algorithm converges when the loss change is less than the tol parameter. |
labels_ | Optional[Series] | The task labels. |
probas_ | Optional[DataFrame] | The probability distributions of task labels. |
priors_ | Optional[Series] | The prior label distribution. |
errors_ | Optional[DataFrame] | The workers' error matrices. |
skills_ | Optional[Series] | The workers' skills. |
loss_history_ | List[float] | A list of loss values during training. |
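How `n_iter`, `tol`, and `loss_history_` interact can be sketched as a generic loop. This is an assumption-laden illustration rather than the library's code: `run_until_converged` and `step` are hypothetical names, and the loss values are made up.

```python
def run_until_converged(step, n_iter=100, tol=1e-5):
    """Call step() up to n_iter times; stop once the loss change is below tol."""
    loss_history = []
    previous = None
    for _ in range(n_iter):
        loss = step()  # one EM iteration returning the current loss
        loss_history.append(loss)
        if previous is not None and abs(loss - previous) < tol:
            break  # converged before reaching the iteration cap
        previous = loss
    return loss_history


# Made-up loss values standing in for successive EM iterations.
values = iter([-10.0, -5.0, -4.0, -3.999999, -3.9])
history = run_until_converged(lambda: next(values), n_iter=100, tol=1e-5)
```

In this sketch the loop stops at the fourth value, because the change from the third value is about 1e-6, which is below the tolerance.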
Examples:
from crowdkit.aggregation import OneCoinDawidSkene
from crowdkit.datasets import load_dataset
df, gt = load_dataset('relevance-2')
# Run at most 100 EM iterations.
ocds = OneCoinDawidSkene(n_iter=100)
result = ocds.fit_predict(df)
Method | Description |
---|---|
fit | Fits the model to the training data with the EM algorithm. |