Let:

* $workers(s)$: all workers that annotated sentence $s$;
* $sentences(i)$: all sentences annotated by worker $i$;
* $WorkVec(i, s)$: the annotations of worker $i$ on sentence $s$, as a binary vector with one component per relation;
* $SentVec(s) = \sum_{i \in workers(s)} WorkVec(i,s)$, where $s$ is a sentence.
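
As a minimal sketch of these definitions, the Python below builds $workers(s)$, $sentences(i)$ and $SentVec(s)$ from a toy annotation table. The dictionary layout, the worker and sentence identifiers, and the helper names `build_indexes` and `sent_vec` are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: (worker, sentence) -> binary vector over 3 relations.
work_vec = {
    ("w1", "s1"): [1, 0, 0],
    ("w2", "s1"): [1, 1, 0],
    ("w3", "s1"): [0, 1, 0],
    ("w1", "s2"): [0, 0, 1],
    ("w2", "s2"): [0, 0, 1],
}

def build_indexes(work_vec):
    """Return workers(s) and sentences(i) as dictionaries of sets."""
    workers = defaultdict(set)
    sentences = defaultdict(set)
    for (i, s) in work_vec:
        workers[s].add(i)
        sentences[i].add(s)
    return workers, sentences

def sent_vec(work_vec, workers, s):
    """SentVec(s): component-wise sum of all worker vectors for sentence s."""
    vectors = [work_vec[(i, s)] for i in workers[s]]
    return [sum(column) for column in zip(*vectors)]

workers, sentences = build_indexes(work_vec)
s1_vec = sent_vec(work_vec, workers, "s1")
```

Here `s1_vec` counts, per relation, how many workers selected it in sentence `s1`.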

## Sentence Quality Score (SQS)

The sentence quality score $SQS(s)$ is computed as the average cosine similarity between all worker vectors for a given sentence $s$, weighted by the worker quality ($WQS$) and relation quality ($RQS$). The goal is to capture the degree of agreement in annotating the sentence. Through the weighted average, workers and relations with lower quality will have less of an impact on the final score.

$$ SQS(s) = \frac{\sum_{i, j \in workers(s)} Wcos(WorkVec(i,s), WorkVec(j,s)) \; WQS(i) \; WQS(j)}{\sum_{i, j \in workers(s)} WQS(i) \; WQS(j)}, \; i \neq j.$$
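
The formula above can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the function names, the toy vectors, and the choice to return 0 when the denominator vanishes are all hypothetical. The similarity function is passed in as a parameter; the default is the plain cosine, which corresponds to setting every $RQS(r)$ to 1.

```python
import math
from itertools import permutations

def cosine(v1, v2):
    """Plain cosine similarity; equals Wcos when every RQS(r) is 1."""
    num = sum(a * b for a, b in zip(v1, v2))
    den = math.sqrt(sum(a * a for a in v1) * sum(b * b for b in v2))
    return num / den if den > 0 else 0.0  # assumption: zero vectors score 0

def sqs(vectors, wqs, sim=cosine):
    """vectors: worker id -> binary vector for one sentence; wqs: worker id -> WQS."""
    num = den = 0.0
    for i, j in permutations(vectors, 2):  # all ordered pairs with i != j
        weight = wqs[i] * wqs[j]
        num += sim(vectors[i], vectors[j]) * weight
        den += weight
    return num / den if den > 0 else 0.0  # assumption: 0 when no pair has weight

vectors = {"w1": [1, 0, 0], "w2": [1, 0, 0], "w3": [0, 1, 0]}
uniform = sqs(vectors, {"w1": 1.0, "w2": 1.0, "w3": 1.0})
discounted = sqs(vectors, {"w1": 1.0, "w2": 1.0, "w3": 0.0})
```

With equal worker quality, the dissenting worker `w3` pulls the score down to one third. Setting its quality to 0 removes it from the average, and the score becomes 1.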


### Weighted Cosine

To weight the metrics by relation quality, we compute $Wcos$, a weighted version of the cosine similarity. This metric is only applicable to closed tasks, where relation quality can be calculated across sentences. For open-ended tasks, we set the relation quality to 1 and calculate the regular cosine similarity.

$$ Wcos(vec_1, vec_2) = \frac{\sum_{r} vec_1(r) \; vec_2(r) \; RQS(r)}{\sqrt{(\sum_{r} vec_1^2(r) \; RQS(r)) \; (\sum_{r} vec_2^2(r) \; RQS(r))}} .$$ 
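
A minimal Python sketch of $Wcos$ is given below. The function name `wcos`, the `rqs=None` default for the open-ended case, and the convention of returning 0 for a zero vector are assumptions for illustration.

```python
import math

def wcos(vec1, vec2, rqs=None):
    """Weighted cosine; rqs=None treats every relation quality as 1."""
    if rqs is None:
        rqs = [1.0] * len(vec1)
    num = sum(a * b * q for a, b, q in zip(vec1, vec2, rqs))
    norm1 = sum(a * a * q for a, q in zip(vec1, rqs))
    norm2 = sum(b * b * q for b, q in zip(vec2, rqs))
    den = math.sqrt(norm1 * norm2)
    return num / den if den > 0 else 0.0  # assumption: return 0 for a zero vector

plain = wcos([1, 0, 0], [1, 1, 0])
weighted = wcos([1, 0, 0], [1, 1, 0], rqs=[1.0, 0.25, 1.0])
```

Down-weighting the second relation, on which the two vectors disagree, raises the similarity from about 0.71 to about 0.89.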


## Worker Quality Score (WQS)

The worker quality score $WQS(i)$ for a given worker $i$ is the product of two separate metrics: the worker-worker agreement $WWA(i)$ and the worker-sentence agreement $WSA(i)$.

$$ WQS(i) = WSA(i) \; WWA(i) .$$

### Worker-Sentence Agreement

The worker-sentence agreement $WSA(i)$ is the average cosine similarity between the annotations of a worker $i$ and the aggregated annotations of all other workers on the sentences that worker $i$ annotated, weighted by the sentence and relation quality. It measures how much a worker agrees with the crowd on a per-sentence basis. Through the weighted average, sentences and relations with lower quality will have less of an impact on the final score.

$$ WSA(i) = \frac{\sum_{s \in sentences(i)} Wcos(WorkVec(i,s), SentVec(s) - WorkVec(i, s)) \; SQS(s)}{\sum_{s \in sentences(i)} SQS(s)} .$$
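
The following Python sketch evaluates the $WSA$ formula on toy data. The dictionary layout, the identifiers, the example $SQS$ values, and the zero-denominator fallbacks are assumptions made for illustration.

```python
import math

def wcos(v1, v2, rqs):
    """Weighted cosine similarity with per-relation weights rqs."""
    num = sum(a * b * q for a, b, q in zip(v1, v2, rqs))
    den = math.sqrt(sum(a * a * q for a, q in zip(v1, rqs))
                    * sum(b * b * q for b, q in zip(v2, rqs)))
    return num / den if den > 0 else 0.0  # assumption: zero vectors score 0

def wsa(i, work_vec, sqs, rqs):
    """work_vec: (worker, sentence) -> binary vector; sqs: sentence -> SQS."""
    num = den = 0.0
    for (w, s), own in work_vec.items():
        if w != i:
            continue
        # SentVec(s) - WorkVec(i, s): sum of the other workers' vectors on s.
        others = [v for (w2, s2), v in work_vec.items() if s2 == s and w2 != i]
        rest = [sum(col) for col in zip(*others)] if others else [0] * len(own)
        num += wcos(own, rest, rqs) * sqs[s]
        den += sqs[s]
    return num / den if den > 0 else 0.0  # assumption: 0 when all SQS are 0

# Hypothetical toy data.
work_vec = {
    ("w1", "s1"): [1, 0, 0], ("w2", "s1"): [1, 1, 0], ("w3", "s1"): [0, 1, 0],
    ("w1", "s2"): [0, 0, 1], ("w2", "s2"): [0, 0, 1],
}
wsa_w1 = wsa("w1", work_vec, sqs={"s1": 0.5, "s2": 1.0}, rqs=[1.0, 1.0, 1.0])
```

Worker `w1` matches the rest of the crowd exactly on `s2` and only partially on `s1`; because `s1` has the lower sentence quality, the partial match counts for less in the average.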

### Worker-Worker Agreement

The worker-worker agreement $WWA(i)$ is the average cosine similarity between the annotations of a worker $i$ and those of every other worker who annotated the same sentences as worker $i$, weighted by the worker and relation qualities. The metric indicates whether there are consistently like-minded workers, which is useful for identifying communities of thought. Through the weighted average, workers and relations with lower quality will have less of an impact on the final score of the given worker.

$$ WWA(i) = \frac{ \sum_{s \in sentences(i)} \sum_{j \in workers(s), \, j \neq i} Wcos(WorkVec(i, s), WorkVec(j, s)) \; WQS(j) \; SQS(s) }{ \sum_{s \in sentences(i)} \sum_{j \in workers(s), \, j \neq i} WQS(j) \; SQS(s) } .$$
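
A Python sketch of the $WWA$ formula follows, summing over every sentence of worker $i$ and every other worker on that sentence. The data layout, identifiers, and zero-denominator fallbacks are assumptions for illustration.

```python
import math

def wcos(v1, v2, rqs):
    """Weighted cosine similarity with per-relation weights rqs."""
    num = sum(a * b * q for a, b, q in zip(v1, v2, rqs))
    den = math.sqrt(sum(a * a * q for a, q in zip(v1, rqs))
                    * sum(b * b * q for b, q in zip(v2, rqs)))
    return num / den if den > 0 else 0.0  # assumption: zero vectors score 0

def wwa(i, work_vec, wqs, sqs, rqs):
    """Average pairwise agreement of worker i with co-workers, weighted by WQS(j) SQS(s)."""
    num = den = 0.0
    for (w, s), own in work_vec.items():
        if w != i:
            continue
        for (j, s2), other in work_vec.items():
            if s2 != s or j == i:
                continue
            weight = wqs[j] * sqs[s]
            num += wcos(own, other, rqs) * weight
            den += weight
    return num / den if den > 0 else 0.0  # assumption: 0 when no weighted pair exists

# Hypothetical toy data.
work_vec = {
    ("w1", "s1"): [1, 0, 0], ("w2", "s1"): [1, 1, 0], ("w3", "s1"): [0, 1, 0],
    ("w1", "s2"): [0, 0, 1], ("w2", "s2"): [0, 0, 1],
}
ones = {"s1": 1.0, "s2": 1.0}
wwa_uniform = wwa("w1", work_vec, {"w1": 1.0, "w2": 1.0, "w3": 1.0}, ones, [1.0] * 3)
wwa_discounted = wwa("w1", work_vec, {"w1": 1.0, "w2": 1.0, "w3": 0.0}, ones, [1.0] * 3)
```

In this toy example `w1` never agrees with `w3`, so giving `w3` a quality of 0 raises the agreement score of `w1`.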


## Relation Quality Score (RQS)

The relation quality score $RQS(r)$ measures the agreement in selecting a relation $r$ over all the sentences it appears in. It is therefore only applicable to closed tasks, where the same relation set is used for all sentences/units. It is based on $P_r(i | j)$, the probability that if a worker $j$ annotates relation $r$ in a sentence, worker $i$ will also annotate it. $RQS(r)$ is the weighted average of $P_r(i | j)$ over all possible pairs of workers. Through the weighted average, workers and sentences with lower quality will have less of an impact on the final score of the relation.

$$ RQS(r) = \frac{ \sum_{i,j} WQS(i) \; WQS(j) \; P_r(i | j) }{ \sum_{i,j} WQS(i) \; WQS(j) }, i \neq j .$$

$$ P_r(i | j) = \frac{ \sum_{s \in sentences(i) \cap sentences(j) } SQS(s) \; WorkVec(i, s)(r) \; WorkVec(j, s)(r) }{ \sum_{s \in sentences(i) \cap sentences(j) } SQS(s) \; WorkVec(j, s)(r) } . $$
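
The two formulas above can be sketched in Python as follows. The text does not say how to treat a pair for which the denominator of $P_r(i | j)$ is zero (worker $j$ never selected $r$ on a shared sentence); this sketch assumes such pairs are skipped, which is a choice made here for illustration. The data layout and names are likewise hypothetical.

```python
from itertools import permutations

def p_r(i, j, r, work_vec, sqs):
    """P_r(i | j) over the sentences shared by i and j; None if undefined."""
    num = den = 0.0
    for (w, s), vec_j in work_vec.items():
        if w != j or (i, s) not in work_vec:
            continue
        num += sqs[s] * work_vec[(i, s)][r] * vec_j[r]
        den += sqs[s] * vec_j[r]
    return num / den if den > 0 else None

def rqs(r, work_vec, wqs, sqs):
    """Weighted average of P_r(i | j) over ordered worker pairs with i != j."""
    worker_ids = sorted({w for (w, _) in work_vec})
    num = den = 0.0
    for i, j in permutations(worker_ids, 2):
        p = p_r(i, j, r, work_vec, sqs)
        if p is None:  # assumption: skip pairs where P_r(i | j) is undefined
            continue
        weight = wqs[i] * wqs[j]
        num += weight * p
        den += weight
    return num / den if den > 0 else 0.0  # assumption: 0 when no pair is defined

# Hypothetical toy data; relations are indexed 0, 1, 2.
work_vec = {
    ("w1", "s1"): [1, 0, 0], ("w2", "s1"): [1, 1, 0], ("w3", "s1"): [0, 1, 0],
    ("w1", "s2"): [0, 0, 1], ("w2", "s2"): [0, 0, 1],
}
wqs = {"w1": 1.0, "w2": 1.0, "w3": 1.0}
sqs = {"s1": 1.0, "s2": 1.0}
rqs_0 = rqs(0, work_vec, wqs, sqs)
rqs_2 = rqs(2, work_vec, wqs, sqs)
```

Relation 2 is selected unanimously wherever it appears, so it scores higher than relation 0, which `w3` does not select in `s1`.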


## Sentence-Relation Score (SRS)

The sentence-relation score $SRS(s, r)$ calculates the likelihood that relation $r$ is expressed in sentence $s$. It is the ratio of the number of workers that picked relation $r$ over all workers that annotated the sentence, weighted by the worker quality.

$$ SRS(s, r) = \frac{ \sum_{i \in workers(s)} WorkVec(i,s)(r) \; WQS(i) }{ \sum_{i \in workers(s)} WQS(i) }. $$
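
As a short Python sketch of this ratio, under the assumption of the same hypothetical data layout (a dictionary keyed by worker and sentence) and a fallback of 0 when all worker qualities are 0:

```python
def srs(s, r, work_vec, wqs):
    """Quality-weighted share of workers on sentence s who selected relation r."""
    num = den = 0.0
    for (i, s2), vec in work_vec.items():
        if s2 != s:
            continue
        num += vec[r] * wqs[i]
        den += wqs[i]
    return num / den if den > 0 else 0.0  # assumption: 0 when total quality is 0

# Hypothetical toy data; relations are indexed 0, 1, 2.
work_vec = {
    ("w1", "s1"): [1, 0, 0],
    ("w2", "s1"): [1, 1, 0],
    ("w3", "s1"): [0, 1, 0],
}
srs_uniform = srs("s1", 0, work_vec, {"w1": 1.0, "w2": 1.0, "w3": 1.0})
srs_weighted = srs("s1", 0, work_vec, {"w1": 1.0, "w2": 1.0, "w3": 0.5})
```

Two of three workers selected relation 0, giving a score of 2/3 with equal qualities. Halving the quality of `w3`, who did not select it, raises the score to 0.8.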