<img align="right" src="images/tf-small.png" width="90"/>
<img align="right" src="images/etcbc.png" width="100"/>


# Creating a Coreference-annotated Corpus for Biblical Hebrew

#### An Analysis of Inter-annotator Agreement for Coreference Resolution Annotations in the Psalms and Beyond

## 1. Introduction

This notebook demonstrates and analyses the calculation of inter-annotator agreement (IAA) for the annotation of coreference information in the Hebrew Bible, specifically the Psalms. The Psalms, consisting of 150 poems in Ancient Hebrew, have been chosen as the annotation corpus because they are the focus of my PhD research: "Who is who in the Psalms?"

The Psalms, and some comparison texts from Genesis, Numbers and Isaiah, have been annotated according to a fixed set of rules: an annotation scheme. A brief explanation of what data is annotated, the process, and the accompanying annotation tools and resources can be found in the [coreference annotation](https://github.com/cmerwich/participant-analysis/tree/master/annotation) notebooks. Why comparison texts have been annotated is explained below.

A vital part of the annotation process is checking the annotations for inter-annotator agreement. A website on [corpus linguistics](https://corpuslinguisticmethods.wordpress.com/2014/01/15/what-is-inter-annotator-agreement) puts it nicely: 

> "Inter-annotator agreement is a measure of how well two (or more) annotators can make the same annotation decision for a certain category."

From an IAA measure at least three things can be derived: 

* the *reliability* of the annotation scheme or guidelines that describe the category that is being annotated. Do the annotators understand the scheme and can they apply it *independently* and *consistently*? 
* the *reliability* of the annotation process, which is a necessary condition for 
* the *correctness* of the resulting annotations. 

The goal is to create a coreference annotation method for the Hebrew Bible that can be used, adapted and corrected by others. In order to do that, we need to know to what extent the annotated information is correct. Producing reliable, consistent and correct coreference annotations is key to developing solid computer-assisted analyses of participants in the Psalms and other Hebrew Bible books.


### 1.1 Setting up IAA

The scope and financial means of the PhD project do not allow for setting up a large scale IAA process with a team of annotators ranging in level of expertise. Two annotators were available for the current project. In the calculations I therefore function as annotator **A** and a fellow PhD candidate, Gyusang Jin, as annotator **B**. 

10 texts out of 150 Psalms were chosen randomly for annotator **B** with Python's `random` module since the whole Psalms corpus was already annotated by **A**:

```python
import random

# randint is inclusive on both ends, so the upper bound is 150 (Psalm 150)
for i in range(10):
    print(random.randint(1, 150))
```

The output was 11, 88, 101, 17, 70, 138, 20, 32, 67, 129. After training in the annotation guidelines, **B** started annotating. The 10 resulting annotation files of **B**, together with **A**'s annotation files, form the input for the IAA algorithm. The annotation files are found under [A](chris_A) and [B](gyus_B).
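
A loop of independent `randint` draws samples with replacement, so in principle it can return the same Psalm twice. A minimal sketch of drawing ten distinct Psalm numbers instead, using the standard library's `random.sample`; the seed value is purely illustrative and is not the one behind the selection listed above:

```python
import random

# Hypothetical seed, only to make this illustration repeatable.
rng = random.Random(2019)

# Draw 10 distinct Psalm numbers from 1..150 (sampling without replacement).
chosen = rng.sample(range(1, 151), 10)
print(sorted(chosen))
```

Because `range(1, 151)` stops before 151, every draw is a valid Psalm number.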

The Psalms corpus consists of poetry written in Biblical Hebrew. The annotator therefore not only has to deal with the difficulties of understanding texts written in a different genre and language; the texts also originated in a different place and time. A question that comes up is whether it is harder to annotate poetic texts for coreference than, for example, narrative texts.

To enable some comparison between the annotation of different genres, Numbers 8-10, which are narrative texts, were also annotated for coreference by **A** and **B**. The processing of the Numbers annotations, however, is a work in progress and will be included as soon as possible.


### 1.2 Some Theory and the IAA Algorithm

There are all kinds of algorithms for IAA measures. [NLTK](https://www.nltk.org/api/nltk.metrics.html) has implemented some of these agreement metrics. [Artstein and Poesio](https://dl.acm.org/citation.cfm?id=1479206) wrote an informative article on the mathematics of IAA in 2008. I have decided to implement my own algorithm that uses SciPy's [linear sum assignment function](https://docs.scipy.org/doc/scipy-0.18.1/reference/generated/scipy.optimize.linear_sum_assignment.html). This gives me more control over the algorithm's output.

Calculating IAA for coreference annotations in the Hebrew Bible is cast as an assignment problem. A classic assignment problem is e.g. to assign jobs *{p, q, r, s}* to workers *{a, b, c, d}* so as to minimize the total cost. [Here](http://csclab.murraystate.edu/~bob.pilgrim/445/munkres.html) you can find a nice explanation of this 'Kuhn-Munkres algorithm'. [Wikipedia](https://en.wikipedia.org/wiki/Hungarian_algorithm) is also helpful. 

The implementation of IAA for our case is as follows. Coreference resolution is the task of finding all expressions that refer to the same entity in a text. 

* A referring expression is called a **mention**. 
* An entity can be called a **class**, or **C**. A **class** is a set that contains two or more mentions that refer to the same entity. 
* A mention that refers to an entity that no other mention refers to is called a **singleton**, or **S**. A **singleton** is a set that contains one **mention**. A **singleton** set can also contain all singletons from one text. 

Comparing annotations of `annotator A` and `annotator B` then is done with set calculations. Given the Venn diagram (Source: [wikipedia](https://en.wikipedia.org/wiki/Set_theory)) below, for IAA we are interested in:

<img align="center" src="images/venn_a_intersect_b.png" width="220"/>

* $A \setminus B$ = Left, or blue
* $A \cap B$ = Middle, or purple
* $B \setminus A$ = Right, or pink
* $A \Delta B$ = Symmetric difference of A and B
* $\delta(A, B)$ = delta of A and B

To illustrate, in the following example

| $\setminus$ | A  | $\cap$ | B | $\setminus$ |
| ---- | ---- | ---- | ---- | ---- |
| 0 | C1 | 4 | C1 | 1 |  
| 1 | C2 | 7 | C2 | 1 |
| 8 | C3 | 0 | C3 | 0 |
| 3 | S  | 17 | S  | 2 |  


annotator **A** and annotator **B** have 4 mentions in common in *C1*, and **B** has 1 mention more in *C1* than **A**. **A** and **B** have 7 mentions in common in *C2*, and each has annotated 1 mention that the other has not. For *S*, **A** and **B** have an intersection of 17. For **A** the relative complement, or difference, is 3; for **B** it is 2.

It is also possible that **A** or **B** has formed a class the other has not: the *C3* class is an example of that.  

The symmetric difference is the set of annotations that belong to **A** or **B** but not to their intersection (Middle). Symmetric difference is defined for sets **A**, **B** as

$$A \Delta B = (A \setminus B) \cup (B \setminus A)$$

So the symmetric distance between **A** and **B** for *C1*, i.e. the size of $A \Delta B$, is $0+1 = 1$. For *S* it is $3+2 = 5$.

The $\delta$ (delta) of two sets is the size of their symmetric difference divided by the size of their union:

$$\delta(A, B) = \frac{|A \Delta B|}{|A \cup B|}$$

The delta of *S* for example is: $(3+2)/(3+17+2) \approx 0.227 $.
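
The *S* row of the example table can be reproduced with Python's built-in set operations. This is a minimal sketch in which integers stand in for mentions; the identifiers are hypothetical and chosen only so that the set sizes match the table:

```python
# Hypothetical mention identifiers: integers stand in for annotated mentions.
singletons_a = set(range(0, 20))   # 20 singletons marked by A
singletons_b = set(range(3, 22))   # 19 singletons marked by B

left = singletons_a - singletons_b       # A \ B
middle = singletons_a & singletons_b     # A ∩ B
right = singletons_b - singletons_a      # B \ A
sym_diff = singletons_a ^ singletons_b   # A Δ B

# delta = |A Δ B| / |A ∪ B|
delta = len(sym_diff) / len(singletons_a | singletons_b)
print(len(left), len(middle), len(right), len(sym_diff), round(delta, 3))
# prints: 3 17 2 5 0.227
```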

To return to the assignment problem: the annotations of **A** and **B** can be represented as $n$ and $k$ nodes, respectively, in two disjoint and independent sets $U$ and $V$ of a bipartite graph (source: [wikipedia](https://en.wikipedia.org/wiki/Bipartite_graph)). In this bipartite graph every edge connects a node in $U$ to one in $V$:

<img align="center" src="images/bipartite-graph.png" width="220"/> 

In the [IAA algorithm](iaa.py), **A** and **B** (or $U$ and $V$) are matched by the function `match()`, using a distance function `distance()` that calculates the symmetric distance. The minimum-cost matching is computed with the aforementioned SciPy function `linear_sum_assignment`. The results of the matching are stored in array $r$. The rest of the functions are helper functions that print the calculations to plain text files with extension `.iaa`. These `.iaa` files are found [here](iaa-files).
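
To make the matching step concrete, here is a minimal sketch of a minimum-cost matching between two sets of classes, with the size of the symmetric difference as cost. It is not the code of `iaa.py`: the class contents, the integer mention identifiers and all variable names are assumptions made for illustration. Only `scipy.optimize.linear_sum_assignment` is the function named above.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical classes: each class is a set of mention identifiers.
classes_a = {"C1": {1, 2, 3}, "C2": {4, 5}, "C3": {6, 7, 8}}
classes_b = {"C1": {4, 5, 9}, "C2": {1, 2, 3}}

names_a = list(classes_a)
names_b = list(classes_b)

# Cost matrix: size of the symmetric difference for every (A class, B class) pair.
cost = np.array([[len(classes_a[a] ^ classes_b[b]) for b in names_b]
                 for a in names_a])

# Rectangular matrices are allowed: the surplus class of A stays unmatched.
rows, cols = linear_sum_assignment(cost)
matches = {names_a[r]: names_b[c] for r, c in zip(rows, cols)}
unpaired_a = [a for a in names_a if a not in matches]
total = int(cost[rows, cols].sum())

print(matches, unpaired_a, total)
# prints: {'C1': 'C2', 'C2': 'C1'} ['C3'] 1
```

Note that the class labels play no role in the matching: **A**'s *C1* is paired with **B**'s *C2* because their contents coincide, which is the same effect seen in the `.iaa` tables further on.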

## 2. Executing the Code

To enable researchers to develop their own annotation process, with different annotators working in different file locations, I have decided to work with a [`Makefile`](Makefile). A `Makefile` does all of the file handling in the terminal and prevents dependency hell when the annotators decide to change their files after the IAA calculations have been done: with the command `make` the calculations are easily redone. Some instructions:

* make an "iaa"-folder (or give it some other name) somewhere on your computer and place the `iaa.py`, `acc.py` and `Makefile` in that same folder;
* change the file locations for the `.ann` files under NU_A, PS_A etc. in the `Makefile`. Make sure there are **A** and **B** locations for the files of the two different annotators; 
* go to the "iaa"-folder in your terminal and give the command `make`;
* `iaa.py` will do its work: the IAA measures are written per Hebrew Bible book to separate text files, together with one total IAA measure for all compared texts. All files are stored in the "iaa"-folder. 

Jupyter notebooks allow lots of cool magic. Another possibility is to run the shell command from within this notebook by prefixing it with `!`:

```python
! make
```
--TO DO: give total_psalms etc. .iaa extension 

In [11]:
# Run when you have matched the file locations in the Makefile with the file locations on your PC

! make 

sort -k 7n,7 total_numbers total_psalms
Psalms_138.iaa	-	9	62	10	19	0.2346
Psalms_088.iaa	-	25	121	25	50	0.2924
Psalms_011.iaa	-	12	49	12	24	0.3288
Psalms_129.iaa	-	9	36	9	18	0.3333
Psalms_070.iaa	-	11	34	10	21	0.3818
Psalms_032.iaa	-	21	71	26	47	0.3983
Psalms_020.iaa	-	18	55	19	37	0.4022
Psalms_017.iaa	-	38	107	42	80	0.4278
Psalms_101.iaa	-	19	45	20	39	0.4643
Psalms_067.iaa	-	20	42	21	41	0.494
Numbers_001.iaa	-	0	0	5	5	1.0
Numbers_008.iaa	-	244	0	7	251	1.0
Numbers_012.iaa	-	0	0	143	143	1.0
python3 acc.py total_numbers total_psalms
total_numbers	-	244	0	155	399	1.0
total_psalms	-	182	622	194	376	0.3768


## 3. IAA analysis
Here follows an analysis of the calculated IAA measures. For a categorisation of the various disagreement types I use an adapted version of the description from the [OntoNotes CoNLL-2012 shared task](https://dl.acm.org/citation.cfm?id=2391181.2391183), p. 8. 

* *Annotator Error*: a plain mistake by an annotator. This is also the catch-all category for errors that do not fit in the other categories. 
* *Genuine Ambiguity*: genuinely ambiguous cases where annotators interpret the referent(s) differently.  
* *Generics*: one annotator considers a mention generic and the other annotator does not. This is a hard category, since the context of Hebrew Bible texts is far away in time and often unclear. 
* *Guidelines*: the guidelines should be clearer on this example. 
* *Referents*: each annotator thought this was referring to two completely different things. 
* *NP*: one annotator did not mark this nominal phrase. 
* *Suffix*: one annotator did not mark this suffix. 
* *Verb*: one annotator did not mark this verb. 
* *Independent Personal Pronoun*: one annotator did not mark this IPP. 
* *Named Entity*: can be person, measurement unit, people, place, demonstrative personal pronoun. One annotator did not mark this NE.  
* *Demonstrative Pronoun*: one annotator did not mark this DP. 
* *Appositive*: one annotator did not mark this appositive. 

These categories could prove insufficient after the analysis. If this is the case they will be adjusted accordingly. 

### 3.1 Totals
Let's pull in the IAA measures for `total_psalms.iaa` with a shell command, and sort them by the seventh column ($\delta$) in ascending order.

In [12]:
! sort -k 7n,7 iaa-files/total_psalms.iaa

Psalms_138.iaa	-	9	62	10	19	0.2346
Psalms_088.iaa	-	25	121	25	50	0.2924
Psalms_011.iaa	-	12	49	12	24	0.3288
Psalms_129.iaa	-	9	36	9	18	0.3333
Psalms_070.iaa	-	11	34	10	21	0.3818
Psalms_032.iaa	-	21	71	26	47	0.3983
Psalms_020.iaa	-	18	55	19	37	0.4022
Psalms_017.iaa	-	38	107	42	80	0.4278
Psalms_101.iaa	-	19	45	20	39	0.4643
Psalms_067.iaa	-	20	42	21	41	0.494


We also accumulate the IAA measures of all texts in `total_psalms.iaa` with `acc.py`. 

`Lt` stands for Left total, `Mt` for Middle total etc. 

In [13]:
from acc import print_total

name, Lt, Mt, Rt, Dt, dt = print_total('iaa-files/total_psalms.iaa')

iaa-files/total_psalms.iaa	-	182	622	194	376	0.3768


The IAA $\delta$ of all Psalms annotated by **A** and **B** is $0.3768$. We'll get back to that value in a bit. 
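
The printed total is consistent with summing the L, M, R and D columns over all texts and dividing the summed D by the summed union L + M + R. The sketch below reproduces the figures under that assumption; it is not the code of `acc.py`, and the function name `accumulate` is hypothetical.

```python
# Per-text rows (L, M, R, D), copied from the sorted output above.
rows = [
    (9, 62, 10, 19),     # Psalms_138
    (25, 121, 25, 50),   # Psalms_088
    (12, 49, 12, 24),    # Psalms_011
    (9, 36, 9, 18),      # Psalms_129
    (11, 34, 10, 21),    # Psalms_070
    (21, 71, 26, 47),    # Psalms_032
    (18, 55, 19, 37),    # Psalms_020
    (38, 107, 42, 80),   # Psalms_017
    (19, 45, 20, 39),    # Psalms_101
    (20, 42, 21, 41),    # Psalms_067
]


def accumulate(rows):
    """Sum L, M, R and D over all rows and derive the overall delta."""
    left, middle, right, diff = (sum(col) for col in zip(*rows))
    delta = diff / (left + middle + right)   # |A Δ B| / |A ∪ B|
    return left, middle, right, diff, round(delta, 4)


print(accumulate(rows))
# prints: (182, 622, 194, 376, 0.3768)
```

The overall $\delta$ is therefore a ratio of sums, weighted by text size, and not the mean of the ten per-text $\delta$ values.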


Great, now we make it neat. Let's put all the values per text and the total in a pandas dataframe in which the '-' column is dropped, and again sort the values by $\delta$ (column `d`) in ascending order:

In [14]:
import pandas as pd

tot_column_names=['-','L', 'M', 'R', 'D', 'd']
tot_data_types={'-': str, 'L': int, 'M': int, 'R': int, 'D': int, 'd': float}

ps_df = pd.read_table('iaa-files/total_psalms.iaa', 
                           delim_whitespace=True, 
                           names=tot_column_names,
                           dtype=tot_data_types
                          ).drop(columns='-').sort_values(by='d')

df = pd.DataFrame([[Lt, Mt, Rt, Dt, dt]],
                  index=['total_psalms'],
                  columns=['L', 'M', 'R', 'D', 'd']
                 )

# DataFrame.append was removed in pandas 2.0; pd.concat is the equivalent
tot_ps_df = pd.concat([ps_df, df])

tot_ps_df

Unnamed: 0,L,M,R,D,d
Psalms_138.iaa,9,62,10,19,0.2346
Psalms_088.iaa,25,121,25,50,0.2924
Psalms_011.iaa,12,49,12,24,0.3288
Psalms_129.iaa,9,36,9,18,0.3333
Psalms_070.iaa,11,34,10,21,0.3818
Psalms_032.iaa,21,71,26,47,0.3983
Psalms_020.iaa,18,55,19,37,0.4022
Psalms_017.iaa,38,107,42,80,0.4278
Psalms_101.iaa,19,45,20,39,0.4643
Psalms_067.iaa,20,42,21,41,0.494


The psalms dataframe (`ps_df`) is built up of the IAA calculations per text. The columns are named L(eft) and R(ight) for the relative complements, M(iddle) for the intersection, D for the symmetric difference $\Delta$, and d for $\delta$.

The $\delta$ is a value $0 \leq \delta \leq 1$ where $0$ denotes total inter-annotator agreement and $1$ total inter-annotator *dis*agreement. Considering the values of $\delta$ in the `total_psalms` dataframe, the question arises which $\delta$ threshold a text must meet before we can speak of agreement. In other words, here we encounter the problem of interpreting the resulting values. To quote Artstein and Poesio:

> "Unfortunately, deciding what counts as an adequate level of agreement for a specific purpose is still little more than a black art: [a]s we will see, different levels of agreement may be appropriate for resource building and for more linguistic purposes." 

[Artstein and Poesio, 2008](https://dl.acm.org/citation.cfm?id=1479206), p.576

To put it another way: though IAA is important, it is hard to interpret. This nevertheless does not withhold them from concluding, in relation to [Krippendorff's](https://en.wikipedia.org/wiki/Krippendorff%27s_alpha) $\alpha$, an agreement coefficient for multiple coders where 1 signifies perfect agreement, that:

> "only values above 0.8 ensured an annotation of reasonable quality. We therefore feel that if a threshold needs to be set, 0.8 is a good value." 

[Artstein and Poesio, 2008](https://dl.acm.org/citation.cfm?id=1479206), p.591

Taking into account some of the difficulties that the corpus poses, however (see the remarks under §1.1 'Setting up IAA'), I suggest we start doing some black art ourselves. I set a 'study' threshold for individual texts at $\delta = 1/3$: texts with $\delta \geq 1/3$, i.e. with low agreement, are selected for annotation analysis. 

In [15]:
ps_df.loc[(ps_df['d'] >= 1/3)]

Unnamed: 0,L,M,R,D,d
Psalms_070.iaa,11,34,10,21,0.3818
Psalms_032.iaa,21,71,26,47,0.3983
Psalms_020.iaa,18,55,19,37,0.4022
Psalms_017.iaa,38,107,42,80,0.4278
Psalms_101.iaa,19,45,20,39,0.4643
Psalms_067.iaa,20,42,21,41,0.494


### 3.2 Psalms_067.iaa 

Starting with `Psalms_067.iaa` with $\delta = 0.4940$, the file can be loaded in a pandas dataframe for clarity. 

In [16]:
column_names=('ann_A','ann_B', 'L', 'M', 'R', 'D', 'd') 
data_types={'ann_A': str, 'ann_B': str ,'L': int, 'M': int, 'R': int, 'D': int, 'd': float}

ps067_df = pd.read_table('iaa-files/Psalms_067.iaa', 
                           delim_whitespace=True, 
                           names=column_names,
                           dtype=data_types
                          )
ps067_df

Unnamed: 0,ann_A,ann_B,L,M,R,D,d
0,C1,C5,0,6,0,0,0.0
1,C3,C3,2,3,0,2,0.4
2,C4,C2,1,12,7,8,0.4
3,C5,C4,9,12,0,9,0.4286
4,C6,C6,0,2,0,0,0.0
5,C7,C1,2,0,9,11,1.0
6,S,S,1,7,5,6,0.4615
7,C2,-,5,0,0,5,1.0


Recall that the IAA algorithm matches the *C* and *S* sets of annotations from **A** and **B** optimally. That is why a class of **A** can be matched with a class of **B** that carries a different number. A difference in class numbers does not mean that the classes have been identified as different entities. For indices 0 and 4

In [17]:
ps067_df.loc[[0,4]]

Unnamed: 0,ann_A,ann_B,L,M,R,D,d
0,C1,C5,0,6,0,0,0.0
4,C6,C6,0,2,0,0,0.0


a $\delta$ of 0 means that there is complete agreement between **A** and **B** on which mentions refer to a certain entity. Indices 5 and 7

In [18]:
ps067_df.loc[[5,7]]

Unnamed: 0,ann_A,ann_B,L,M,R,D,d
5,C7,C1,2,0,9,11,1.0
7,C2,-,5,0,0,5,1.0


indicate that there is complete disagreement between **A** and **B** on which mentions refer to a certain entity.

* *C7* and *C1* are matched, but **A** and **B** have found that different mentions refer to that entity. 
* **A** has found an extra class: **A** has seven classes and **B** six, so one of **A**'s classes (*C2*) is left without a match, which is shown as '-'. 

For the sets with $1/4 \leq \delta < 1$:

In [19]:
ps067_df.loc[[1,2,3,6]]

Unnamed: 0,ann_A,ann_B,L,M,R,D,d
1,C3,C3,2,3,0,2,0.4
2,C4,C2,1,12,7,8,0.4
3,C5,C4,9,12,0,9,0.4286
6,S,S,1,7,5,6,0.4615


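* on index 1, **A** connects 2 mentions to his *C3* that **B** does not connect to his *C3*.
* on index 2, **B** connects 7 mentions to his *C2* that **A** does not connect to his *C4*.
* on index 3, **A** connects 9 mentions to his *C5* that **B** does not connect to his *C4*.
* in *S*, **A** and **B** have 7 mentions in common, but **B** marks 5 more mentions as singletons and **A** 1 more.  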

Now that we have a general overview of the differences between the classes, let's take a closer look at the specific mentions that each annotator has included in each class. Since the optimal matching between **A**'s and **B**'s annotations is always the same, unless changes have been made in the annotations, it is possible to retrieve the actual words of the annotations. In the cell below the function `retrieve_ann`, imported from `retrieve_iaa.py`, runs the comparison again and prints per class matching:

1. on the first line, between hyphens, the comparison of a class with IAA measures;
2. on the second line the annotations of **A** in ETCBC's [transcription](https://annotation.github.io/text-fabric/Writing/Hebrew.html) of Biblical Hebrew. The annotations have also been made on the transcribed texts. `un A` indicates unpaired classes of **A** that have not been matched with **B**;
3. on the third line the annotations of **B** in ETCBC's transcription. `un B` indicates unpaired classes of **B** that have not been matched with **A**;
4. on the fourth and fifth lines, the specific differences (i.e. `diff`) in annotations between **A** and **B**, if they exist. If there are no differences these lines will not be printed. The notation is: the annotator, **A** or **B**, followed by `diff`, then by a word in brackets together with an index number that points to the position of that word in the annotator's list of annotations above. An example:

`A	['JSWR', 'LBB']`

`B	['JSWR', 'LBB <QC']`

`A diff	[('LBB', 1)]`	

`B diff	[('LBB <QC', 1)]`

Meaning: **A** has one annotation that **B** does not have, and the other way around. **A**'s differing annotation can be found at index 1 (all indices start at 0) in **A**'s annotations above. **B**'s differing annotation can also be found at index 1 in **B**'s annotations above. If only one annotator has differing annotations, an empty list is shown for the other, like so:

`A diff	[('>RY', 0), ('>RY', 1)]`	

`B diff	[]`
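
The `diff` notation can be illustrated with a small sketch. It assumes that mentions are compared on their text span, so that the same word at a different position counts as a different mention; that assumption, the `(start, end, word)` tuples, the offsets and the function name `diff_with_index` are all hypothetical and do not describe the internals of `retrieve_ann`.

```python
def diff_with_index(own, other):
    """Return (word, index) for each mention in `own` whose span is absent from `other`."""
    other_spans = {(start, end) for start, end, _ in other}
    return [
        (word, i)
        for i, (start, end, word) in enumerate(own)
        if (start, end) not in other_spans
    ]


# Hypothetical character offsets; only the words come from the example above.
ann_a = [(0, 4, "JSWR"), (10, 13, "LBB")]
ann_b = [(0, 4, "JSWR"), (10, 17, "LBB <QC")]

print("A diff", diff_with_index(ann_a, ann_b))   # A diff [('LBB', 1)]
print("B diff", diff_with_index(ann_b, ann_a))   # B diff [('LBB <QC', 1)]
```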

In [20]:
from retrieve_iaa import retrieve_ann

retrieve_ann('chris_A/Psalms_067.ann', 'gyus_B/Psalms_067.ann')

ann_A	ann_B	L	M	R	D	d	

------------------------------------------------------------
C1	C5	0	6	0	0	0.0
------------------------------------------------------------
A	['NW', 'NW', 'NW', 'NW', 'NW', 'NW']	

B	['NW', 'NW', 'NW', 'NW', 'NW', 'NW']	

------------------------------------------------------------
C3	C3	2	3	0	2	0.4
------------------------------------------------------------
A	['>RY', '>RY', '>RY', 'NTNH', 'H']	

B	['>RY', 'NTNH', 'H']	

A diff	[('>RY', 0), ('>RY', 1)]	

B diff	[]	

------------------------------------------------------------
C4	C2	1	12	7	8	0.4
------------------------------------------------------------
A	['JWDW', '<MJM', 'JWDW', '<MJM', 'KL', 'M', '<MJM', 'JWDW', '<MJM', 'JWDW', '<MJM', 'KL', 'M']	

B	['KL&GWJM', 'JWDW', '<MJM', 'JWDW', '<MJM', 'KL', 'M', 'JFMXW', 'JRNNW', 'L>MJM', '<MJM', 'L>MJM', 'XM', 'WDW', '<MJM', 'JWDW', '<MJM', 'KL', 'M']	

A diff	[('JWDW', 7)]	

B diff	[('KL&GWJM', 0), ('JFMXW', 7), ('JRNNW', 8), ('L>MJM', 9), ('L>MJM', 11), ('XM', 12

1. Starting with C3 - C3 $\delta = 0.4$: 
    * **A** added two mentions of '>RY' more than **B**. 
    * One of these occurrences of '>RY' in **A**'s class is an *Annotator Error*: it occurs in v. 7, is undetermined, and refers to 'land' that yields harvest rather than to 'the earth' as the other mentions do. 
    * **B** missed a determined mention of '>RY': one *Annotator Error*. As a result, that mention wrongly ended up as a singleton in **B**'s *S* set. 
    -----------------------------------------------------------
1. C4 - C2, $\delta = 0.4$: 
    * 'XM' added by **B** indicates one *Annotator Error*. In v. 5 the verb 'TNX' occurs with suffix M. 'TNX' refers to '>LHJM', 'M' to '<MJM'. The mentions were not annotated correctly. 
    * 'WDW' added by **B** indicates one *Annotator Error*. In v. 6 the verb 'JWDW' occurs. **B** forgot to annotate the letter *yod*. 
    * The rest of the differences in mentions added to the **A** and **B** classes is the result of an interpretation issue, thus categorised as 5x *Genuine Ambiguity*. The question is if '<MJM', 'L>MJM' and 'KL&GWJM' all refer to the same 'peoples'/'nations' or not. This interpretation question also causes the difference in the number of classes and the number of singletons. Where **A** sees one class of '<MJM' (*C4*), one class of 'L>MJM' (*C2*, unpaired), and a singleton in KL&GWJM; **B** adds all these mentions in one class (*C2*). 
    -----------------------------------------------------------
1. C5 - C4 $\delta \approx 0.4286$: 
    * Compared to **B**'s class, **A**'s class has a greater number of mentions; the mentions refer to '>LHJM', God. **B**'s class *C1* contains mentions that also refer to '>LHJM'. Since **B**'s *C1* contains '>LHJM', this is probably a linking error made in brat. It can therefore be classified as nine *Annotator Error*s. 
    -----------------------------------------------------------
1. C7 - C1 $\delta = 1$:
    * **B**'s second '>LHJM' class is then matched with **A**'s *C7* which contains the title of Psalm 67. The annotation guidelines prescribe that the titles of Psalms are annotated as self-contained units, and that they are not connected to coreference chains in the text that follows. While both **A** and **B** have followed that guideline, **B** has missed the apposition relation between 'CJR' and 'MZMWR'. According to the guidelines this relation should be annotated, this is thus categorised as one *Appositive* error.


The matching of the remaining classes has already been discussed in relation to the aforementioned classes. 


#### Conclusion Psalm 67
The relatively low IAA measure can mostly be attributed to 13 *Annotator Error*s, 5 *Genuine Ambiguity* cases and 1 *Appositive* error. The *Annotator Error*s and the *Appositive* error can easily be corrected; the *Genuine Ambiguity* cases have to be left as they are.

### 3.3 Psalms_101.iaa 

`Psalms_101.iaa` has the second-lowest agreement, with $\delta = 0.4643$. Following the same analysis procedure as with `Psalms_067.iaa`, we first make and inspect a table of the IAA measures per class. Then we retrieve the actual words of the annotations to take a closer look at the differences in annotations. 

In [21]:
ps101_df = pd.read_table('iaa-files/Psalms_101.iaa', 
                           delim_whitespace=True, 
                           names=column_names,
                           dtype=data_types
                          )
ps101_df

Unnamed: 0,ann_A,ann_B,L,M,R,D,d
0,C1,C1,0,3,0,0,0.0
1,C2,C2,1,20,1,2,0.0909
2,C3,C4,1,1,1,2,0.6667
3,C4,C5,2,1,1,3,0.75
4,C5,C3,2,0,2,4,1.0
5,C8,C6,1,2,0,1,0.3333
6,C10,C7,0,2,0,0,0.0
7,C11,C8,0,2,0,0,0.0
8,S,S,6,14,15,21,0.6
9,C9,-,2,0,0,2,1.0


The dataframe `ps101_df` immediately shows some striking differences in annotations. **A** and **B** agree completely or almost completely on four classes (indices 0, 1, 6 and 7), with $\delta$ at or close to 0. For the remaining classes the disagreement is 0.3333 or greater. We leave the classes with complete agreement out of consideration and look at the rest, selected by filtering on $\delta \geq 0.01$, and then retrieve all annotations:

In [22]:
ps101_df.loc[(ps101_df['d'] >= 0.01)]

Unnamed: 0,ann_A,ann_B,L,M,R,D,d
1,C2,C2,1,20,1,2,0.0909
2,C3,C4,1,1,1,2,0.6667
3,C4,C5,2,1,1,3,0.75
4,C5,C3,2,0,2,4,1.0
5,C8,C6,1,2,0,1,0.3333
8,S,S,6,14,15,21,0.6
9,C9,-,2,0,0,2,1.0
10,C6,-,2,0,0,2,1.0
11,C7,-,2,0,0,2,1.0


In [23]:
retrieve_ann('chris_A/Psalms_101.ann', 'gyus_B/Psalms_101.ann')

ann_A	ann_B	L	M	R	D	d	

------------------------------------------------------------
C1	C1	0	3	0	0	0.0
------------------------------------------------------------
A	['K', 'JHWH', 'TBW>']	

B	['K', 'JHWH', 'TBW>']	

------------------------------------------------------------
C2	C2	1	20	1	2	0.0909
------------------------------------------------------------
A	['>CJRH', '>ZMRH', '>FKJLH', 'J', '>THLK', 'J', 'J', '>CJT', 'J', 'FN>TJ', 'J', 'NJ', '>D<', '>YMJT', '>WKL', 'J', 'J', 'NJ', 'J', 'J', '>YMJT']	

B	['>CJRH', '>ZMRH', '>FKJLH', 'J', '>THLK', 'J', 'J', '>CJT', 'J', 'FN>TJ', 'J', 'J', '>D<', '>YMJT', '>WKL', 'J', 'J', 'NJ', 'J', 'J', '>YMJT']	

A diff	[('NJ', 11)]	

B diff	[('J', 11)]	

------------------------------------------------------------
C3	C4	1	1	1	2	0.6667
------------------------------------------------------------
A	['LBB', 'JSWR']	

B	['LBB <QC', 'JSWR']	

A diff	[('LBB', 0)]	

B diff	[('LBB <QC', 0)]	

------------------------------------------------------------
C4	C

Let's start at index 1 in the `ps101_df` table and work our way down to 11. 

1. C2 - C2 $\delta \approx 0.0909$: 
    * Both **A** and **B** annotated 11 occurrences of a first person singular (1Csg). They differ in their annotations of the different forms of 1Csg: J or NJ. Of the word MIM.EN.IJ, 'from me', in v. 101:4 **A** annotated NJ, where **B** annotated J. According to the `annotation_resources` and the annotation guidelines J is an annotation mistake, since ETCBC's analysed form is (M(N&M&N+(NJ and NJ should be annotated. It is therefore categorised as one *Annotator Error*. 
    -----------------------------------------------------------   
1. C3 - C4 $\delta \approx 0.6667$: 
    * Where **A** has annotated 'heart' ('LBB'), **B** has also annotated the adjective: 'crooked heart' ('LBB <QC'). Though the annotation guidelines indeed allow for the annotation of adjectives as mentions, they are not coreferenced with other referents by themselves. Conversely, **A** annotates 'DRK' ('way', see point 5 below) without the adjective 'TMJM' ('complete'), but annotates 'TMJM' as a separate mention, which is incorrect, and adds it to his *S* class. **B** has not annotated 'TMJM' as a mention. The annotation guidelines therefore need to be revised. These two errors are categorised as *Guidelines*. 
    -----------------------------------------------------------
1. C4 - C5 $\delta = 0.75$: 
    * In 101:5 **A** incorrectly annotated 'W' instead of 'HW', which **B** annotated correctly. Both 'W' and 'HW' are suffix forms of 3Msg. The `annotation_resources` and the annotation guidelines indicate for R;<;HW ('his neighbour') that the suffix (R<=/+HW) 'HW' should be annotated. This error is categorised as one *Annotator Error*. As a consequence 'R<H' is added to **A**'s *S* class instead of **B**'s correct 'R<'.  
    * In the same verse **B**, in contrast to **A**, did not add the participle 'MLWCNJ' ('he who slanders') to his class, so it ended up as a singleton in *S*. The same goes for the second 'W', which **A** coreferred to this class and **B** did not, making it a singleton. Though 'MLWCNJ' is determined as a participle (Msg) and does not have a person feature that refers to the ensuing 3Msg suffixes, they do corefer. These two errors are categorised as *Annotator Error*. 
    -----------------------------------------------------------    
1. C5 - C3 $\delta = 1.0$: 
    * In this match **A** corefers 'LBB' ('heart') with 'W' in 101:5. As the suffix 'W' is attached to an object marker ('>T'), the 3Msg refers to 'he who slanders' (cf. point 3 (C4 - C5)), not to 'heart'. This is an *Annotator Error*. **B** has not coreferred 'W' and 'LBB', hence they are added to his *S* class. 
    * **B** corefers 'JDBQ' with 'DBR&BLJ<L' in 101:3. The question is if 'JDBQ', (verb, 3Msg, to cling to), and 'DBR&BLJ<L' (NP, 'word of wickedness', both words are Msg) corefer. 'DBR&BLJ<L' occurs in a clause that does not directly precede 'JDBQ'. 'JDBQ' is preceded by the clause 'I hate the doing of (the) faithless'. It is therefore more probable that 'JDBQ' corefers as follows: 'the doing of the faithless, it/he does not cling to me'. Both **A** and **B** have made an *Annotator Error*, which can be counted as two. For **A** that means that 'JDBQ' and 'DBR&BLJ<L' are added to his *S* class. 
    -----------------------------------------------------------
1. C8 - C6 $\delta \approx 0.3333$: 
    * In 101:6 **A** corefers the complement 'DRK' with 'HW>' (IPP, 3Msg, 'he') and 'JCRT' (verb, 3Msg, 'to serve'). If **A** were to be followed, 'he' ('HW>') would refer to 'the way' ('DRK'), and the way would 'serve': 'the way, he, he, serves'. This is not correct since 'he' ('HW>') is subject of 'JCRT' 'he serves'. This error is categorised as one *Annotator Error*. 
    -----------------------------------------------------------
1. S - S $\delta = 0.6$: 
    * Some of the annotated singletons of **A** and **B** have already been discussed in relation to the matched classes. There are however a few differences that are worth looking into. 
    * **B**'s 'WMCPV' corresponds to **A**'s 'MCPV', but the former has incorrectly annotated the 'W' ('and'): it is therefore one *Annotator Error*. 
    * **A** and **B** have annotated three occurrences of '<JN' (substantive, determined by a suffix 1Csg, 'eye') in 101:3, 101:6 and 101:7 differently. **A** correctly coreferences two occurrences of the determined substantive making it an unmatched class (*C6*). **A** also misses one occurrence ('<JN', 15, in *S*) hence making it one *Annotator Error*. **B** annotates all mentions, 2x ('NGD <JN', 7), 1x ('<JN', 17), but does not corefer them. These are three *Annotator Error*s. Furthermore **B** annotates 'NGD <JN' in 101:3 and 101:7. Strictly speaking the annotation guidelines allow for the annotation of 'NGD', meaning 'counterpart' or 'before', but this substantive seems to function more as a preposition. The difference in translation is 'the counterpart of my eyes' versus 'before my eyes'. **B** has chosen the former, **A** the latter. These two errors are categorised as *Guidelines*. The guidelines should be adjusted. 
    * Whereas **B** has added 'QRB BJT' to his *S* class, **A** adds the two occurrences to a separate class (*C9*). 'QRB BJT', 'the interior of the house', occurs in 101:2 and 101:7 and both NPs are determined by a suffix 1Csg. They should therefore be coreferenced in a class as **A** has done. **B** has made two *Annotator Error*s. 
    * Though **A** has coreferenced one occurrence of 'DRK' to his *C8* class, he has also missed an occurrence that **B** has added to his *S* class. 'DRK' occurs in 101:2 and 101:5. **A** has made an *NP* error. 
    * Lastly, **A** annotates an additional and unpaired *C7* class with 'N>MNJ&>RY' and 'HLK'. It is however not 'N>MNJ&>RY' ('faithful in the land') that should be coreferenced with 'HLK' (participle, Msg, 'he who walks'), but 'HW>' and 'JCRT' from the *C8 - C6* match, making it: 'he who walks, he, he, serves'. This error is categorised as one *Annotator Error*. 


#### Conclusion Psalm 101

The low IAA measure is a result of 17 *Annotator Error*s and 5 *Guidelines* errors. These can all be easily corrected.

### 3.4 Psalms_017.iaa 

`Psalms_017.iaa` has an IAA measure of $\delta \approx 0.4278$. Let's see how that measure has come to be. 

In [24]:
ps017_df = pd.read_table('iaa-files/Psalms_017.iaa',
                         sep=r'\s+',  # whitespace-delimited; replaces the deprecated delim_whitespace=True
                         names=column_names,
                         dtype=data_types
                        )
ps017_df

Unnamed: 0,ann_A,ann_B,L,M,R,D,d
0,C1,C11,9,27,0,9,0.25
1,C2,C1,0,2,0,0,0.0
2,C3,C2,0,2,0,0,0.0
3,C4,C3,0,2,0,0,0.0
4,C5,C4,0,2,0,0,0.0
5,C6,C5,7,20,2,9,0.3103
6,C7,C10,2,0,2,4,1.0
7,C9,C7,0,12,3,3,0.2
8,C10,C9,2,2,1,3,0.6
9,C11,C8,2,0,5,7,1.0


As in the previous analyses we study all classes, or sets, of which $\delta$ is not 0. The annotation choice an annotator makes for one class or mention influences the way other classes are built up. The differences in annotations between the classes therefore need to be studied in relation to each other. For `ps017_df` let's zoom in on the classes with an IAA measure of 0.2 or greater: 

In [25]:
ps017_df.loc[(ps017_df['d'] >= 0.2)]

Unnamed: 0,ann_A,ann_B,L,M,R,D,d
0,C1,C11,9,27,0,9,0.25
5,C6,C5,7,20,2,9,0.3103
6,C7,C10,2,0,2,4,1.0
7,C9,C7,0,12,3,3,0.2
8,C10,C9,2,2,1,3,0.6
9,C11,C8,2,0,5,7,1.0
10,C12,C6,2,0,10,12,1.0
11,C13,C13,0,2,4,4,0.6667
12,C14,C12,3,2,0,3,0.6
13,S,S,6,34,15,21,0.3818
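
The per-row values in the table above can be checked by hand. The sketch below assumes that `L` and `R` count the mentions found only in **A**'s or only in **B**'s class, that `M` counts the shared mentions, and that `d` is `D` divided by `L + M + R`. This reading is inferred from the numbers shown here, not taken from the code that produced the `.iaa` files, and `class_distance` is a hypothetical helper name.

```python
# Rows copied from the filtered Psalm 17 table: (ann_A, ann_B, L, M, R, D, d).
rows = [
    ("C1", "C11", 9, 27, 0, 9, 0.25),
    ("C6", "C5", 7, 20, 2, 9, 0.3103),
    ("C7", "C10", 2, 0, 2, 4, 1.0),
    ("C9", "C7", 0, 12, 3, 3, 0.2),
    ("C10", "C9", 2, 2, 1, 3, 0.6),
    ("C11", "C8", 2, 0, 5, 7, 1.0),
    ("C12", "C6", 2, 0, 10, 12, 1.0),
    ("C13", "C13", 0, 2, 4, 4, 0.6667),
    ("C14", "C12", 3, 2, 0, 3, 0.6),
    ("S", "S", 6, 34, 15, 21, 0.3818),
]


def class_distance(left_only, matched, right_only):
    """Assumed per-row measure: differing mentions over all mentions in the pair."""
    total = left_only + matched + right_only
    return (left_only + right_only) / total


# Recompute d for every row, rounded to the four decimals used in the table.
recomputed = {
    (a, b): round(class_distance(l, m, r), 4)
    for a, b, l, m, r, _d_count, _d in rows
}

# Collect rows where D != L + R or where the recomputed d differs from the table.
mismatches = [
    (a, b)
    for a, b, l, m, r, d_count, d in rows
    if d_count != l + r or abs(recomputed[(a, b)] - d) > 1e-9
]
print(mismatches)  # an empty list means every row fits the assumed formula
```

Under this reading a row with `M = 0` always has $\delta = 1.0$, which is why the three fully disjoint matchings stand out in the table.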


In [26]:
retrieve_ann('chris_A/Psalms_017.ann', 'gyus_B/Psalms_017.ann')

ann_A	ann_B	L	M	R	D	d	

------------------------------------------------------------
C1	C11	9	27	0	9	0.25
------------------------------------------------------------
A	['CM<H', 'JHWH', 'HQCJBH', 'H>ZJNH', 'K', 'K', 'BXNT', 'PQDT', 'YRPT', 'TMY>', 'K', 'K', 'K', 'T<N', '>L', 'HV', 'K', 'CM<', 'HPLH', 'K', 'K', 'CMR', 'K', 'TSTJR', 'QWMH', 'JHWH', 'QDMH', 'HKRJ<', 'PLVH', 'K', 'K', 'JHWH', 'K', 'TML>', 'K', 'K']	

B	['CM<H', 'JHWH', 'HQCJBH', 'H>ZJNH', 'K', 'K', 'BXNT', 'PQDT', 'YRPT', 'TMY>', 'K', 'K', 'K', 'T<N', '>L', 'QWMH', 'JHWH', 'QDMH', 'HKRJ<', 'PLVH', 'K', 'K', 'JHWH', 'K', 'TML>', 'K', 'K']	

A diff	[('HV', 15), ('K', 16), ('CM<', 17), ('HPLH', 18), ('K', 19), ('K', 20), ('CMR', 21), ('K', 22), ('TSTJR', 23)]	

B diff	[]	

------------------------------------------------------------
C2	C1	0	2	0	0	0.0
------------------------------------------------------------
A	['MCPV', 'JY>']	

B	['MCPV', 'JY>']	

------------------------------------------------------------
C3	C2	0	2	0	0	0.

1. C1 - C11 $\delta = 0.25$: 
    * **A** and **B** recognise a 'JHWH' ('God') class in their *C1* and *C11* classes respectively. **A** has added nine more mentions to his class than **B**. These 9 mentions plus 'MWCJ<' form **B**'s *C6* class, which is matched with **A**'s *C12* further down. **A** adds 'MWCJ<' in 17:7 (NP vocative, Msg, 'saviour'/'helper') to his singleton set. The difference in mentions indicated by **A**'s diff for this *C1* - *C11* matching seems to be caused by a linking error made in brat by **B**. All mentions (verbs: 'HV', 'CM<', 'HPLH', 'CMR', 'TSTJR'; suffixes 'K') refer in 2Msg to God. **A** forgot to link 'MWCJ<' to this class: one *Annotator Error*. **B**'s linking error is counted as nine *Annotator Error*s. That makes 10 *Annotator Error*s in total. 
    -----------------------------------------------------------
1. C6 - C5 $\delta \approx 0.3103$: 
    * **A** adds seven mentions to his *C6* class that **B** misses. **B**'s diff indicates that he has added two mentions ('DWD' and 'TJ') that **A** misses. Starting with those, 'DWD' (nomen proprium, 'David') belongs to the title of Ps 17. As was the case with Ps 67, according to the guidelines titles of Psalms are annotated for coreference but are not connected to coreference information in the rest of the Psalm. 'TJ' is supposed to be the suffix 1Csg 'J'; it is connected to '>MRT' ('word') in 17:6, but **B** has erroneously also annotated the 'T'. **B**'s 'DWD' and 'TJ' are 2 *Annotator Error*s. This also explains two of the differences in **A**'s and **B**'s *S*, namely: 'DWD' and '>MRT'. 
    * For the remaining differences, the seven mentions in **A**'s *C6*, there are two explanations. The first one is that the difference is a result of *Genuine Ambiguity*. In the *C11* - *C8* match, **B**'s *C8* agrees with **A**'s seven mentions, except for '>FB<H' (verb, 1Csg, 'to be sated') and the aforementioned (T)'J' suffix 1Csg. While **B** explicitly identifies his *C5*, which contains 1Csg mentions, as 'David', **A** leaves the 1Csg mentions class unidentified. Possibly, **B** started a new coreference class because of the sudden occurrence of the suffix 'NW' (1Cpl) in 17:11. Strangely, **B** adds '>XZH' (verb, 1Csg, 'to see') in 17:15 to his *C8* class, but leaves out '>FB<H' in the same verse, which is then added to *S*. In terms of consistency '>FB<H', 'I am sated', should also have been added to this class. The second explanation - the difference is a result of an *Annotator Error* - is therefore more probable. **B** did not link all the 1Csg mentions correctly. We count 6 *Annotator Error*s. 
    -----------------------------------------------------------
1. C7 - C10 $\delta = 1.0$ || and C10 - C9 $\delta = 0.6$: 
    * These two matchings are discussed in relation to each other. 
    * In 17:7 **A** incorrectly links 'XWSJM' to 'MTQWMMJM'. 'XWSJM' (participle, Mpl, 'to seek refuge'), translated as 'they/those who seek refuge', is not the same referent as 'MTQWMMJM' (participle, Mpl, 'to arise'), 'they/those who arise'. The translation of this part of 17:7 would be: 'Oh you helper [God] for those who seek refuge from those who arise against your [God] right hand'. 'XWSJM' and 'MTQWMMJM' therefore end up in **A**'s *S* class. We categorise and count 2 *Annotator Error*s. 
    * **A** and **B** differ on their understanding of the lion metaphor in 17:12 with translation: 'His likeness is of a lion, he longs to tear; and like a young lion who sits in a hiding place'. **B** annotates two classes. **B** links 'KPJR' ('young lion') to 'JCB' (participle, Msg, 'to sit') in 17:12 in a separate class *C10*. **A** links these two mentions to '>RJH' ('lion') and 'JKSWP' (verb, 3Msg, 'to long') in one *C10* class: '>RJH', 'JKSWP', 'KPJR', 'JCB'. **B** adds 'VRWP' (infinitive cst, 'to tear') to his *C9*. According to the annotation guidelines infinitives are not annotated unless there is lexical identity with the same verb elsewhere. 'To tear' does not occur elsewhere in Psalm 17; 'VRWP' is therefore 1 *Annotator Error*. The question is then: do 'the lion' and 'the young lion' corefer or not, are they the same referents? Though a young lion is still a lion, it is probable in this text that the two lions are not the same. Therefore: 1 case of *Referents*. 
    ------------------------------------------------------------
1. C9 - C7 $\delta = 0.2$ || and C8 unpaired **A**: 
    * **A** is right in linking 'PNJ RC<JM' and 'CDW' in 17:9 in the unpaired class, but incorrectly links 'RC<' from 17:13 to this class. 'PNJ RC<JM' (NP, Mpl, 'faces of the wicked') does not corefer with 'RC<' (NP, Msg, 'a wicked'), which has a singular form. 'CDW' (verb, 3Cpl, 'to despoil') corefers with the plural of 'PNJ RC<JM'. This is counted as 1 *Annotator Error*. 
    * **B** links 'PNJ RC<JM' and 'CDW' together with 'NVWT' (infinitive, 'to extend') to his *C7*. Since infinitives are not annotated as mentions, 'NVWT' is categorised as 1 *Annotator Error*. 
    * Where **A** sees the 'faces of the wicked' as a separate class from the '>JB' (participle, Mpl, 'be hostile'/'enemies') class (*C9*) that starts in 17:9, **B** considers 'PNJ RC<JM' and '>JB' as belonging to one class (*C7*). Whether 'the faces of the wicked' and 'the enemies' refer to the same entity is hard to say. Therefore, these differences between **A** and **B** are categorised as 2 cases of *Genuine Ambiguity*, one for **A** and one for **B**. 
    ------------------------------------------------------------
1. C11 - C8 $\delta = 1.0$: 
    * Leaving **B**'s *C8* out of consideration, since it has already been discussed under point 2, **A** distinguishes a 'NPC' (Fsg, 'soul') class. **B** has annotated them as two singletons. The first occurrence of 'NPC' in 17:9 is an undetermined NP, but clearly refers to the soul of the I person that is being threatened by enemies. This can be inferred from the three suffixes 1Csg that occur in this verse. The 'I', the 'NPC' or self is under threat. The second occurrence of 'NPC' in 17:13 is a determined NP with a 1Csg suffix. Here the same I person asks God to save him ('my soul') from 'the sword' (i.e. a threat) of the enemy. These two differences are categorised as 2 cases of *Genuine Ambiguity*. 
    ------------------------------------------------------------
1. C12 - C6 $\delta = 1.0$ || C13 - C13 $\delta \approx 0.6667$ || and C14 - C12 $\delta = 0.6$: 
    * The annotation choices made by **A** and **B** in 17:14 are best understood when discussing the three matchings in relation to each other. **A**'s *C12* consists of 2x 'MTJM' (substantive, Mpl, 'man'); **B**'s *C6* has already been discussed under point 1 above. **A** distinguishes a separate *C13* class with two suffixes 3Mpl ('M'). **B** connects one occurrence of 'MTJM' to the two 3Mpl suffixes together with 'HNJXW' (verb, 3Cpl, 'to settle'), 'M', and 'HM' (both 3Mpl suffixes). **B** thus identifies the 3Mpl mentions as 'a man/a people' ('MTJM'). **B** correctly connects 'M' and 'M' with 'MTJM', but forgets to connect the second 'MTJM', which is thus added to his singletons. **A** missing the link between his *C12* and *C13* is 1 *Annotator Error*. **B** forgetting the second 'MTJM' is also 1 *Annotator Error*. That makes 2 *Annotator Error*s. 
    * As mentioned **B** adds 'HNJXW', 'M' and 'HM' to his *C13*. **A** however considers 'JFB<W', 'BNJM', 'HNJXW', 'M' and 'HM' as one class (*C14*) of which 'BNJM' (substantive NP, Mpl, 'children') is the subject. **B** sees a separate class in 'JFB<W' and 'BNJM' (*C12*). The subject of 'HNJXW' is, as **A** has annotated, 'BNJM' and not 'MTJM'. This is categorised as one *Referents* error. In addition, **B** does not link his *C12* to his *C13*, which is categorised as 1 *Annotator Error*. 
    ------------------------------------------------------------
1. C15 unpaired **A**: 
    * Of the four occurrences of 'PNJ' (substantive, Mpl, 'face') - 17:2, 17:15, PNJ+K, suffix 2Msg; 17:9, 'PNJ RC<JM'; 17:13, PNJ+W, suffix 3Msg - **A** has annotated the two PNJ mentions that have the 2Msg suffix. They both refer to the face of God. **B** adds the two mentions to his singletons. These two misses are categorised as 2 *Annotator Error*s. 
    ------------------------------------------------------------
1. S - S $\delta \approx 0.3818$: 
    * Most of the differences between **A**'s and **B**'s annotations have been discussed; a few mentions are still left open, however. 
    * In **A**'s diff 'LJLH', '>ZN' and '>JCWN BT&<JN' are of interest. 'LJLH' is a time phrase, and should be annotated. '>ZN' ('ear') matches **B**'s erroneously annotated 'ZN': he forgot the aleph ('>'). The same goes for '>JCWN BT&<JN' ('pupil of the eye'), which matches **B**'s 'JCWN BT&<JN': here the aleph is also missing. That makes 3 *Annotator Error*s. 
    * In **B**'s diff 'TMK', 'MMTQWMMJM', 'YDQ' and 'HQJY' are of interest. 'TMK' and 'HQJY' are infinitives, 'MMTQWMMJM' is the same as **A**'s 'MTQWMMJM' in his *C7*. **B** has incorrectly annotated an extra 'M', which is a preposition. That leaves the NP 'YDQ', which **A** has incorrectly not annotated as a mention. That makes 4 *Annotator Error*s. 


#### Conclusion Psalm 17

The low IAA measure of 0.4278 is a result of 36 *Annotator Error*s, 2 cases of *Referents* and 4 cases of *Genuine Ambiguity*. The *Annotator Error*s and *Referents* errors can be corrected; the cases of *Genuine Ambiguity* are a matter of interpretation and cannot. 

### 3.5 Psalms_020.iaa 

`Psalms_020.iaa` has an IAA measure of $\delta \approx 0.4022$. As can be seen below, none of the matched classes are identical, so the differences between all classes will be studied. 

In [31]:
ps020_df = pd.read_table('iaa-files/Psalms_020.iaa',
                         sep=r'\s+',  # whitespace-delimited; replaces the deprecated delim_whitespace=True
                         names=column_names,
                         dtype=data_types
                        )
ps020_df

Unnamed: 0,ann_A,ann_B,L,M,R,D,d
0,C1,C1,0,10,1,1,0.0909
1,C2,C3,1,10,2,3,0.2308
2,C3,C5,2,12,13,15,0.5556
3,C4,C2,1,1,1,2,0.6667
4,C7,C4,0,3,2,2,0.4
5,S,S,1,19,0,1,0.05
6,C5,-,2,0,0,2,1.0
7,C6,-,11,0,0,11,1.0
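
The overall measure for Psalm 20 can be reproduced from the rows above. The sketch below assumes that the psalm-level $\delta$ pools all rows, including the unpaired classes, as the sum of `D` over the sum of `L + M + R`. That pooling rule is an assumption: it is inferred from the fact that it gives the reported 0.4022, and is not taken from the code that wrote the `.iaa` file.

```python
# Rows copied from the Psalm 20 table: (ann_A, ann_B, L, M, R).
# "-" marks a class of A that has no counterpart in B.
ps020_rows = [
    ("C1", "C1", 0, 10, 1),
    ("C2", "C3", 1, 10, 2),
    ("C3", "C5", 2, 12, 13),
    ("C4", "C2", 1, 1, 1),
    ("C7", "C4", 0, 3, 2),
    ("S", "S", 1, 19, 0),
    ("C5", "-", 2, 0, 0),
    ("C6", "-", 11, 0, 0),
]

# Assumed pooling: differing mentions (L + R) over all mentions (L + M + R).
total_diff = sum(l + r for _a, _b, l, _m, r in ps020_rows)
total_mentions = sum(l + m + r for _a, _b, l, m, r in ps020_rows)
overall_delta = total_diff / total_mentions

print(total_diff, total_mentions, round(overall_delta, 4))  # 37 92 0.4022
```

Because the rows are pooled rather than averaged, large classes such as *C3* - *C5* and the unpaired *C6* weigh more heavily than small ones.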


In [27]:
retrieve_ann('chris_A/Psalms_020.ann', 'gyus_B/Psalms_020.ann')

ann_A	ann_B	L	M	R	D	d	

------------------------------------------------------------
C1	C1	0	10	1	1	0.0909
------------------------------------------------------------
A	['K', 'K', 'K', 'K', 'K', 'K', 'K', 'K', 'K', 'K']	

B	['K', 'K', 'K', 'K', 'K', 'K', 'K', 'K', 'K', 'K', 'K']	

A diff	[]	

B diff	[('K', 10)]	

------------------------------------------------------------
C2	C3	1	10	2	3	0.2308
------------------------------------------------------------
A	['NRNNH', 'NW', 'NDGL', '>NXNW', 'NW', 'NZKJR', '>NXNW', 'QMNW', 'NT<WDD', 'NW', 'QR>NW']	

B	['NRNNH', 'NW', 'NDGL', '>NXNW', 'NW', 'NZKJR', '>NXNW', 'QMNW', 'NT<WDD', 'NW', 'QR>', 'NW']	

A diff	[('QR>NW', 10)]	

B diff	[('QR>', 11), ('NW', 12)]	

------------------------------------------------------------
C3	C5	2	12	13	15	0.5556
------------------------------------------------------------
A	['J<N', 'JHWH', 'JFGB', 'CM05 >LHJ J<QB', 'JCLX', 'JS<D', 'JZKR', 'JDCNH', 'JTN', 'JML>', 'K', 'CM&>LHJ', 'JML>', 'JHWH']	

B	['J<N', 'JHWH'

1. C1 - C1 $\delta \approx 0.0909$: 
    * **A** and **B** both distinguish a 'K' (suffix, 2Msg, 'you') class.  **A** annotates one 'K' less than **B**, and adds it to his *C3*. **A**'s *C3* is a 'JHWH', God, class. Though the text is a bit ambiguous - it could refer to 'JHWH' - the 'K' should have been added to *C1*. This is categorised as 1 *Referents* error. 
    -----------------------------------------------------------
1. C2 - C3 $\delta \approx 0.2308$: 
    * In 20:10 **A** has annotated 'QR>NW' where **B** has 'QR>' and 'NW'. 'QR>NW' is an infinitive construct, 'to call' and functions as a predicate with subject suffix. The annotation guidelines state that infinitives with a subject suffix are marked as one mention. This is therefore noted as 2 *Annotator Error*s.    
    -----------------------------------------------------------
1. C3 - C5 $\delta \approx 0.5556$ || C4 - C2 $\delta \approx 0.6667$ || and *C6* unpaired **A**: 
    * **B** has annotated a larger class than **A**. The diff of the latter indicates two differences: 'JTN' and 'K'. 'K' has already been discussed. 'JTN' in 20:5 is a verb, 3Msg of 'to give'; **B** has forgotten to annotate the Yod ('J'), which results in the annotation of 'TN'. This is 1 *Annotator Error*. 
    * Except for 'TN', as noted, and 'MLK', **B**'s diff is much larger and corresponds to **A**'s unpaired *C6*. 'MLK' is dealt with below. 
    * **B** annotates in 20:7 'HWCJ<', **A** 'HWCJ'. 'HWCJ<' (verb, 2Msg, 'to help') is the right form, **A** has forgotten to annotate the Ayin ('<'). This is 1 *Annotator Error*. 
    * **B** annotates 'J<NH' in 20:7 ('J<NH', 17), the correct annotation of the mention should be 'J<N' (verb, 3Msg, 'to answer') however. The 'H' belongs to the suffix 'HW', 3Msg. This also explains **B**'s difference with **A** in their *C4* - *C2* matching. **B**'s 'W' should be 'HW' there. These are 2 *Annotator Error*s. 
    * Both **A** and **B** have found that their *C3* and *C5* respectively can be identified as 'YHWH'. Since **A**'s unpaired *C6* also contains the name 'JHWH', **A** has probably made a linking error. This can be counted as 1 *Annotator Error*. 
    -----------------------------------------------------------
1. C7 - C4 $\delta = 0.4$ || and *C5* unpaired **A**: 
    * **B** has added '>LH' 2x to his *C4*, where **A** has added 2x '>LH' to a separate class *C5*. '>LH' (Cpl, 'these') in 20:8 both function as subject in a demonstrative pronoun phrase, hence **B**'s correct annotation of these mentions in one class. The demonstratives refer to those who trust in chariots and horses, but not in the name of God. 2 *Annotator Error*s.
    -----------------------------------------------------------
1. S - S $\delta = 0.05$: 
    * **A** has annotated 'MLK' (substantive, 'king') as singleton, and **B** has not. **B** has added 'MLK' to his *C5*, which is the 'YHWH' class. **A** finds 'MLK' does not refer to God. It can be classified as 1 *Referents* error. 


#### Conclusion Psalm 20

The low IAA measure is a result of 2 *Referents* errors and 9 *Annotator Error*s; there are no cases of *Genuine Ambiguity*. That means that if the errors are corrected **A** and **B** agree on who is who in this text. 
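
The totals in this conclusion can be re-derived from the per-matching counts in the discussion of Psalm 20. The snippet below is only a bookkeeping sketch: the list `ps020_findings` is transcribed by hand from the bullets above and is not produced by any of the notebook's functions.

```python
from collections import Counter

# (matching, category, count), transcribed from the Psalm 20 discussion.
ps020_findings = [
    ("C1 - C1", "Referents", 1),                   # 'K' placed in A's C3
    ("C2 - C3", "Annotator Error", 2),             # 'QR>NW' split into 'QR>' and 'NW'
    ("C3 - C5", "Annotator Error", 1),             # 'TN' annotated for 'JTN'
    ("C3 - C5", "Annotator Error", 1),             # 'HWCJ' annotated for 'HWCJ<'
    ("C3 - C5 / C4 - C2", "Annotator Error", 2),   # 'J<NH' and 'W'
    ("C3 - C5", "Annotator Error", 1),             # linking error in A's unpaired C6
    ("C7 - C4", "Annotator Error", 2),             # 2x '>LH' in a separate class
    ("S - S", "Referents", 1),                     # 'MLK'
]

# Sum the counts per error category.
tally = Counter()
for _matching, category, count in ps020_findings:
    tally[category] += count

print(tally["Annotator Error"], tally["Referents"], tally["Genuine Ambiguity"])  # 9 2 0
```

Keeping the counts in such a list makes it easy to re-check a conclusion when an individual categorisation is revised.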

### 3.6 Psalms_032.iaa 

In [None]:
retrieve_ann('chris_A/Psalms_032.ann', 'gyus_B/Psalms_032.ann')

### 3.7 Psalms_070.iaa 

In [None]:
retrieve_ann('chris_A/Psalms_070.ann', 'gyus_B/Psalms_070.ann')