<a href="https://colab.research.google.com/github/polyankaglade/autoshaving_project_2020/blob/main/Trying_Berts_for_coreference.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### From the original author of the notebook:

This notebook runs the coreference resolution model described in ["SpanBERT: Improving Pre-training by Representing and Predicting Spans"](https://arxiv.org/pdf/1907.10529.pdf) by Mandar Joshi, Danqi Chen, Yinhan Liu, Daniel S. Weld, Luke Zettlemoyer, Omer Levy, and released here: https://github.com/mandarjoshi90/coref

This Colab is by me, Jonathan K. Kummerfeld. My website is www.jkk.name

Note:
- This code does not handle text with multiple speakers; for that, you will need to adjust the data preparation process.
- Occasionally I get a bug where either an assertion about the size of the input mask fails or a sequence is being assigned to an array element. It appears to be inconsistent across runs, so I'm not sure what is going on.
- The default model is not the best one. I chose it because it is much faster to download.

If you have suggestions, please contact me at jkummerf@umich.edu

# Configuration

First, specify your input. If you are just playing with this, edit the provided text. If you want to run on a larger file:

1. Upload a file.
2. Set the filename.

In [12]:
import pandas as pd

In [15]:
df = pd.read_csv('https://raw.githubusercontent.com/google-research-datasets/gap-coreference/master/gap-validation.tsv', sep='\t')
df.head(3)

Unnamed: 0,ID,Text,Pronoun,Pronoun-offset,A,A-offset,A-coref,B,B-offset,B-coref,URL
0,validation-1,He admitted making four trips to China and pla...,him,256,Jose de Venecia Jr,208,False,Abalos,241,False,http://en.wikipedia.org/wiki/Commission_on_Ele...
1,validation-2,"Kathleen Nott was born in Camberwell, London. ...",She,185,Ellen,110,False,Kathleen,150,True,http://en.wikipedia.org/wiki/Kathleen_Nott
2,validation-3,"When she returns to her hotel room, a Liberian...",his,435,Jason Scott Lee,383,False,Danny,406,True,http://en.wikipedia.org/wiki/Hawaii_Five-0_(20...


In [17]:
from collections import Counter

In [18]:
Counter([len(t.split('.')) for t in df.Text]).most_common()

[(4, 174),
 (5, 106),
 (3, 95),
 (6, 36),
 (7, 15),
 (2, 12),
 (8, 6),
 (11, 4),
 (9, 3),
 (10, 1),
 (14, 1),
 (1, 1)]
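
The counts above come from splitting on every period, which is only a rough proxy for the number of sentences. `str.split('.')` returns one more piece than there are periods, so a text ending in a period yields an empty trailing piece, and periods inside abbreviations add further pieces. A minimal sketch of the effect, using a made-up example string (an assumption for illustration, not a row from the dataset):

```python
# Hypothetical example text; not taken from the GAP data.
text = "Dr. Smith was born in London. She moved to Paris."

pieces = text.split('.')
print(pieces)
# The last piece is empty because the text ends with a period,
# and the period in "Dr." produces an extra split.
print(len(pieces))

# Dropping empty pieces removes the trailing artifact,
# but the abbreviation is still counted as a sentence boundary.
non_empty = [p for p in pieces if p.strip()]
print(len(non_empty))
```

The text has two sentences, yet both counts are higher, which is why a dedicated sentence tokenizer is the safer choice for measuring document length.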

In [22]:
import nltk
nltk.download('punkt')

from nltk.tokenize import sent_tokenize

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!


In [23]:
df['sentences'] = df.Text.apply(sent_tokenize)
df.head(3)

Unnamed: 0,ID,Text,Pronoun,Pronoun-offset,A,A-offset,A-coref,B,B-offset,B-coref,URL,sentences
0,validation-1,He admitted making four trips to China and pla...,him,256,Jose de Venecia Jr,208,False,Abalos,241,False,http://en.wikipedia.org/wiki/Commission_on_Ele...,[He admitted making four trips to China and pl...
1,validation-2,"Kathleen Nott was born in Camberwell, London. ...",She,185,Ellen,110,False,Kathleen,150,True,http://en.wikipedia.org/wiki/Kathleen_Nott,"[Kathleen Nott was born in Camberwell, London...."
2,validation-3,"When she returns to her hotel room, a Liberian...",his,435,Jason Scott Lee,383,False,Danny,406,True,http://en.wikipedia.org/wiki/Hawaii_Five-0_(20...,"[When she returns to her hotel room, a Liberia..."


In [26]:
df['len'] = df['sentences'].apply(len)
df['len'].value_counts()

3     183
4     119
2     114
5      19
1      15
10      1
9       1
8       1
7       1
Name: len, dtype: int64

In [None]:
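# Sentence-count distribution for the first 100 rows.
# Calling value_counts() on the whole frame would also try to count
# the list-valued 'sentences' column, so select the 'len' column first.
df['len'].iloc[:100].value_counts()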

Next, specify the data type and model:

In [260]:
genre = "nw"
# The Ontonotes data for training the model contains text from several sources
# of very different styles. You need to specify the most suitable one out of:
# "bc": broadcast conversation
# "bn": broadcast news
# "mz": magazine
# "nw": newswire
# "pt": Bible text
# "tc": telephone conversation
# "wb": web data

model_name = "bert_base"
# The fine-tuned model to use. Options are:
# bert_base
# spanbert_base
# bert_large
# spanbert_large
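
Since both settings are plain strings, a typo only surfaces much later, during download or prediction. A small sanity check can catch it early. This is a sketch only: `check_config` is a hypothetical helper, and the allowed values are copied from the comments in the cell above.

```python
# Allowed values, copied from the comments in the configuration cell.
VALID_GENRES = {"bc", "bn", "mz", "nw", "pt", "tc", "wb"}
VALID_MODELS = {"bert_base", "spanbert_base", "bert_large", "spanbert_large"}

def check_config(genre, model_name):
    """Hypothetical helper: raise ValueError on an unknown genre or model."""
    if genre not in VALID_GENRES:
        raise ValueError(
            f"Unknown genre {genre!r}; expected one of {sorted(VALID_GENRES)}")
    if model_name not in VALID_MODELS:
        raise ValueError(
            f"Unknown model {model_name!r}; expected one of {sorted(VALID_MODELS)}")
    return genre, model_name

print(check_config("nw", "bert_base"))
```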

# System Installation
Get the code:

In [3]:
! git clone https://github.com/mandarjoshi90/coref.git
%cd coref

Cloning into 'coref'...
remote: Enumerating objects: 3, done.
remote: Counting objects: 100% (3/3), done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 731 (delta 0), reused 0 (delta 0), pack-reused 728
Receiving objects: 100% (731/731), 4.17 MiB | 14.29 MiB/s, done.
Resolving deltas: 100% (439/439), done.
/content/coref


Temporary hack to fix a requirement (pending pull request)

In [4]:
! cat requirements.txt | sed 's/MarkupSafe==1.0/MarkupSafe==1.1.1/' > tmp
! mv tmp requirements.txt

Set some environment variables: `data_dir` is used by the system itself, and `CHOSEN_MODEL` lets the shell commands below refer to the model chosen above.

In [261]:
import os
os.environ['data_dir'] = "."
os.environ['CHOSEN_MODEL'] = model_name
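
The reason this works is that child processes inherit the environment of the Python process, so a shell started afterwards can expand `$CHOSEN_MODEL`. The sketch below shows the mechanism with `subprocess`. It assumes a POSIX `sh` is available, and that notebook `!` commands behave like such child shells.

```python
import os
import subprocess

os.environ['CHOSEN_MODEL'] = "bert_base"

# The child shell inherits os.environ from this process,
# so $CHOSEN_MODEL expands to the value set above.
result = subprocess.run(
    ["sh", "-c", "echo $CHOSEN_MODEL"],
    capture_output=True, text=True, check=True,
)
shell_value = result.stdout.strip()
print(shell_value)
```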

Run setup. Some incompatibility warnings do appear, but I still find that everything installs and runs. I also specifically request TensorFlow 2 and then uninstall it, to make sure we have a clean setup.

In [6]:
%tensorflow_version 2.x
! pip uninstall -y tensorflow
! pip install -r requirements.txt --log install-log.txt -q
! ./setup_all.sh

Uninstalling tensorflow-2.4.0:
  Successfully uninstalled tensorflow-2.4.0

Get the fine-tuned BERT model specified above.

In [262]:
! ./download_pretrained.sh $CHOSEN_MODEL

Downloading bert_base
--2020-12-22 20:27:12--  http://nlp.cs.washington.edu/pair2vec/bert_base.tar.gz
Resolving nlp.cs.washington.edu (nlp.cs.washington.edu)... 128.208.3.120, 2607:4000:200:12::78
Connecting to nlp.cs.washington.edu (nlp.cs.washington.edu)|128.208.3.120|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1593421271 (1.5G) [application/x-gzip]
Saving to: ‘./bert_base.tar.gz’


2020-12-22 20:27:34 (69.0 MB/s) - ‘./bert_base.tar.gz’ saved [1593421271/1593421271]

bert_base/
bert_base/checkpoint
bert_base/events.out.tfevents.1551148826.learnfair0213
bert_base/events.out.tfevents.1551148825.learnfair0213
bert_base/model.max.ckpt.index
bert_base/stdout.log
bert_base/bert_config.json
bert_base/vocab.txt
bert_base/model.max.ckpt.data-00000-of-00001
bert_base/events.out.tfevents.1551148806.learnfair2008


# Data Preparation and Prediction

Process the data to be in the required input format.

In [27]:
from bert import tokenization
import json

def make_line(text):
    data = {
        'doc_key': genre,
        'sentences': [["[CLS]"]],
        'speakers': [["[SPL]"]],
        'clusters': [],
        'sentence_map': [0],
        'subtoken_map': [0],
    }

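    # Determine the maximum segment length: find the config block for the
    # chosen model in experiments.conf, then read its max_segment_len value.
    max_segment = None
    with open('experiments.conf') as conf:
        for line in conf:
            if line.startswith(model_name):
                max_segment = True
            elif line.strip().startswith("max_segment_len"):
                if max_segment:
                    max_segment = int(line.strip().split()[-1])
                    break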

    tokenizer = tokenization.FullTokenizer(vocab_file="cased_config_vocab/vocab.txt", do_lower_case=False)
    subtoken_num = 0
    for sent_num, line in enumerate(text):
        raw_tokens = line.split()
        tokens = tokenizer.tokenize(line)
        if len(tokens) + len(data['sentences'][-1]) >= max_segment:
            data['sentences'][-1].append("[SEP]")
            data['sentences'].append(["[CLS]"])
            data['speakers'][-1].append("[SPL]")
            data['speakers'].append(["[SPL]"])
            data['sentence_map'].append(sent_num - 1)
            data['subtoken_map'].append(subtoken_num - 1)
            data['sentence_map'].append(sent_num)
            data['subtoken_map'].append(subtoken_num)

        ctoken = raw_tokens[0]
        cpos = 0
        for token in tokens:
            data['sentences'][-1].append(token)
            data['speakers'][-1].append("-")
            data['sentence_map'].append(sent_num)
            data['subtoken_map'].append(subtoken_num)
            
            if token.startswith("##"):
                token = token[2:]
            if len(ctoken) == len(token):
                subtoken_num += 1
                cpos += 1
                if cpos < len(raw_tokens):
                    ctoken = raw_tokens[cpos]
            else:
                ctoken = ctoken[len(token):]

    data['sentences'][-1].append("[SEP]")
    data['speakers'][-1].append("[SPL]")
    data['sentence_map'].append(sent_num - 1)
    data['subtoken_map'].append(subtoken_num - 1)

    return data
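
Each record built by `make_line` keeps several parallel structures in step: one `sentence_map` and one `subtoken_map` entry per subtoken (including `[CLS]` and `[SEP]`), and a `speakers` list shaped like `sentences`. The sketch below checks those invariants on a hand-built record. `check_record` is a hypothetical helper, and the record assumes the tokenizer splits "ZTE" into `Z` and `##TE`.

```python
def check_record(data):
    """Hypothetical helper: verify the shape invariants of one input record."""
    n_subtokens = sum(len(segment) for segment in data['sentences'])
    # One map entry per subtoken, counting [CLS] and [SEP].
    assert len(data['sentence_map']) == n_subtokens
    assert len(data['subtoken_map']) == n_subtokens
    # Speakers mirror the segment structure of the sentences.
    assert ([len(s) for s in data['speakers']]
            == [len(s) for s in data['sentences']])
    # Every segment is wrapped in [CLS] ... [SEP].
    for segment in data['sentences']:
        assert segment[0] == "[CLS]" and segment[-1] == "[SEP]"
    return n_subtokens

# Hand-built record for the single sentence "ZTE officials".
# The final [SEP] is mapped to sent_num - 1, which is -1 here
# because the document has only one sentence (index 0).
record = {
    'doc_key': "nw",
    'sentences': [["[CLS]", "Z", "##TE", "officials", "[SEP]"]],
    'speakers': [["[SPL]", "-", "-", "-", "[SPL]"]],
    'clusters': [],
    'sentence_map': [0, 0, 0, 0, -1],
    'subtoken_map': [0, 0, 0, 1, 1],
}
print(check_record(record))
```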

In [257]:
test_data = [make_line(t) for t in df.sentences[:100]]

In [258]:
print(test_data[0]['sentences'][0])

['[CLS]', 'He', 'admitted', 'making', 'four', 'trips', 'to', 'China', 'and', 'playing', 'golf', 'there', '.', 'He', 'also', 'admitted', 'that', 'Z', '##TE', 'officials', ',', 'whom', 'he', 'says', 'are', 'his', 'golf', 'b', '##udd', '##ies', ',', 'hosted', 'and', 'paid', 'for', 'the', 'trips', '.', 'Jose', 'de', 'V', '##ene', '##cia', 'III', ',', 'son', 'of', 'House', 'Speaker', 'Jose', 'de', 'V', '##ene', '##cia', 'Jr', ',', 'alleged', 'that', 'A', '##bal', '##os', 'offered', 'him', 'US', '$', '10', 'million', 'to', 'withdraw', 'his', 'proposal', 'on', 'the', 'N', '##B', '##N', 'project', '.', '[SEP]']
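
The `##` pieces in this output are WordPiece continuations, and `subtoken_map` records which whitespace-separated token each piece came from. The sketch below isolates that alignment step for a single sentence. `align_subtokens` is a hypothetical name, the pieces are written by hand rather than produced by the tokenizer, and the logic follows the character-length bookkeeping in `make_line`.

```python
def align_subtokens(raw_tokens, pieces):
    """Map each wordpiece to the index of its whitespace-separated token."""
    subtoken_map = []
    word_index = 0
    position = 0
    remaining = raw_tokens[0]  # unconsumed part of the current raw token
    for piece in pieces:
        subtoken_map.append(word_index)
        if piece.startswith("##"):
            piece = piece[2:]
        if len(remaining) == len(piece):
            # The piece finishes the current raw token: move to the next one.
            word_index += 1
            position += 1
            if position < len(raw_tokens):
                remaining = raw_tokens[position]
        else:
            # Consume the matched prefix and keep going within the same token.
            remaining = remaining[len(piece):]
    return subtoken_map

raw_tokens = "ZTE officials,".split()
pieces = ["Z", "##TE", "officials", ","]
print(align_subtokens(raw_tokens, pieces))  # [0, 0, 1, 1]
```

Note that punctuation attached to a word ("officials,") shares that word's index, because the raw tokens come from a plain `split()`.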


In [259]:
with open("test.in.json", 'w') as out:
    for d in test_data:
        json.dump(d, out, sort_keys=True)
        out.write('\n')

#! cat test.in.json

Run Prediction

In [263]:
! GPU=0 python predict.py $CHOSEN_MODEL test.in.json test2.out.txt

  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
  np_resource = np.dtype([("resource", np.ubyte, 1)])
  from ._conv import register_converters as _register_converters
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
  np_resource = np.dtype([("resource", np.ubyte, 1)])
W1222 20:30:43.574110 139691296565120 deprecation_wrapper.py:119] From /content/coref/coref_ops.py:11: The name tf.NotDifferentiable is deprecated. Please use tf.no_gradient instead.

W1222 20:30:43.599173 139691296565120 deprecation_wrapper.py:119] From /content/coref/bert/optimization.py:87: The name tf.train.Opti

# Output Handling

Finally, we do a little processing to get the output to have the same token indices as our input.

In [264]:
def convert_mention(output, mention, comb_text):
    start = output['subtoken_map'][mention[0]]
    end = output['subtoken_map'][mention[1]] + 1
    nmention = (start, end)
    mtext = ''.join(' '.join(comb_text[mention[0]:mention[1]+1]).split(" ##"))
    return (nmention, mtext)

def get_clusters(output):
    comb_text = [word for sentence in output['sentences'] for word in sentence]
    seen = set()
    #print('Clusters:')
    clusters = []
    for cluster in output['predicted_clusters']:
        mapped = []
        for mention in cluster:
            seen.add(tuple(mention))
            mapped.append(convert_mention(output, mention, comb_text))
        clusters.append(mapped)

    return clusters

# print('\nMentions:')
# for mention in output['top_spans']:
#     if tuple(mention) in seen:
#         continue
#     print(convert_mention(mention), end=",\n")
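
To make the index conversion concrete, here is a toy run of the same mention-conversion logic on a hand-built output. The `toy_output` dictionary is an assumption for illustration and contains only the two keys the function reads. Mentions are inclusive subtoken index pairs. The result is a half-open token span plus the mention text with its wordpieces rejoined.

```python
def convert_mention(output, mention, comb_text):
    # Map inclusive subtoken indices to a half-open span of token indices.
    start = output['subtoken_map'][mention[0]]
    end = output['subtoken_map'][mention[1]] + 1
    # Join the pieces with spaces, then glue "##" continuations back on.
    pieces = comb_text[mention[0]:mention[1] + 1]
    mtext = ''.join(' '.join(pieces).split(" ##"))
    return ((start, end), mtext)

# Hand-built stand-in for one line of model output.
toy_output = {
    'sentences': [["[CLS]", "Z", "##TE", "officials", "[SEP]"]],
    'subtoken_map': [0, 0, 0, 1, 1],
}
comb_text = [word for sentence in toy_output['sentences'] for word in sentence]

print(convert_mention(toy_output, [1, 2], comb_text))  # ((0, 1), 'ZTE')
print(convert_mention(toy_output, [1, 3], comb_text))  # ((0, 2), 'ZTE officials')
```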

In [265]:
with open('test2.out.txt') as f:
    lines = f.readlines()

In [266]:
outputs = [json.loads(l) for l in lines]

In [267]:
print(outputs[0]['sentences'])

[['[CLS]', 'He', 'admitted', 'making', 'four', 'trips', 'to', 'China', 'and', 'playing', 'golf', 'there', '.', 'He', 'also', 'admitted', 'that', 'Z', '##TE', 'officials', ',', 'whom', 'he', 'says', 'are', 'his', 'golf', 'b', '##udd', '##ies', ',', 'hosted', 'and', 'paid', 'for', 'the', 'trips', '.', 'Jose', 'de', 'V', '##ene', '##cia', 'III', ',', 'son', 'of', 'House', 'Speaker', 'Jose', 'de', 'V', '##ene', '##cia', 'Jr', ',', 'alleged', 'that', 'A', '##bal', '##os', 'offered', 'him', 'US', '$', '10', 'million', 'to', 'withdraw', 'his', 'proposal', 'on', 'the', 'N', '##B', '##N', 'project', '.', '[SEP]']]


In [268]:
predictions = [get_clusters(output) for output in outputs]

In [269]:
clusters = [[[m[1] for m in cluster] for cluster in clusters] for clusters in predictions]
clusters[1]

[['Kathleen Nott', 'Her', 'her', 'Kathleen', 'She'],
 ['London', 'London', 'London']]

In [270]:
test_df = df.iloc[:100].copy()

In [271]:
test_df['clusters'] = clusters

# Comparison (better: see the [notebook](https://github.com/polyankaglade/autoshaving_project_2020/blob/main/Compare%20predictions%20to%20GAP.ipynb))

In [208]:
def flatten(cluster):
    clus_words = []
    for ent in cluster:
        clus_words.extend(ent.split())
    return ' '.join(clus_words)

In [247]:
test_df.clusters[1]

[['Kathleen Nott', 'Her', 'her', 'Kathleen', 'She'],
 ['London', 'London', 'London']]

In [209]:
flatten(test_df.clusters[1][0])

'Kathleen Nott Her her Kathleen She'

In [272]:
def compare(pronoun, a, a_value, b, b_value, clusters):

    #clusters = [flatten(c) for c in clusters]

    positives = []
    negatives = []
    if a_value:
        positives.append([pronoun, a])
    else:
        negatives.append([pronoun, a])

    if b_value:
        positives.append([pronoun, b])
    else:
        negatives.append([pronoun, b])

    
    tp_clusters = []
    tp_gold = []

    fp_clusters = []
    fp_gold = []

    tn_clusters = []
    fn_clusters = []

    # for cluster in clusters:

    #     for p in positives:
    #         if p[0] and p[1] in cluster:
    #             tp_clusters.append(cluster)
    #     for n in negatives:
    #         if n[0] and n[1] in cluster:
    #             fp_clusters.append(cluster)

    for p in positives:
        for cluster in clusters:
            if p[0] in cluster and p[1] in cluster:
                tp_clusters.append(cluster)
                tp_gold.append(p)

    for p in positives:
        if p not in tp_gold:
            fn_clusters.append(p)

    for n in negatives:
        for cluster in clusters:
            if n[0] in cluster and n[1] in cluster:
                fp_clusters.append(cluster)
                fp_gold.append(n)

    for n in negatives:
        if n not in fp_gold:
            tn_clusters.append(n)

    return tp_clusters, fp_clusters, tn_clusters, fn_clusters
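
Note that `compare` tests membership in a list of mention strings, so a pronoun and a name count as linked only if both appear as exact, case-sensitive cluster members. The sketch below isolates that test. `pair_in_cluster` is a hypothetical helper, and the mention strings are taken from outputs shown in this notebook.

```python
def pair_in_cluster(pronoun, name, cluster):
    """Same test as in compare(): both strings must be exact cluster members."""
    return pronoun in cluster and name in cluster

cluster_a = ['Kathleen Nott', 'Her', 'her', 'Kathleen', 'She']
print(pair_in_cluster('She', 'Kathleen', cluster_a))  # True
print(pair_in_cluster('She', 'Ellen', cluster_a))     # False
print(pair_in_cluster('she', 'Kathleen', cluster_a))  # False: case-sensitive

# The predicted mention is longer and tokenized differently than the gold
# name, so the exact match fails even though the same person is meant.
cluster_b = ['stage performer Jos * Alvarez', 'he']
print(pair_in_cluster('he', 'Jos* Alvarez', cluster_b))  # False
```

Because of this, some of the false negatives reported below may reflect span or tokenization mismatches rather than wrong coreference decisions.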

In [273]:
res = test_df.apply(lambda x: compare(x['Pronoun'], x['A'], x['A-coref'], x['B'], x['B-coref'], x['clusters']), axis=1).values

In [274]:
def clusters_to_string(clusters):
    return '; '.join([', '.join([m for m in c]) for c in clusters])

In [275]:
for i, n in enumerate(['tp', 'fp', 'tn', 'fn']):

    test_df[n] = [r[i] for r in res]
    test_df[f'{n}_count'] = test_df[n].apply(len)
    test_df[f'str_{n}'] = test_df[n].apply(clusters_to_string)

In [276]:
test_df['str_clusters'] = test_df.clusters.apply(clusters_to_string)

In [277]:
test_df[['Pronoun', 'A', 'A-coref', 'B', 'B-coref', 'tp_count', 'fp_count', 'tn_count', 'fn_count']]

Unnamed: 0,Pronoun,A,A-coref,B,B-coref,tp_count,fp_count,tn_count,fn_count
0,him,Jose de Venecia Jr,False,Abalos,False,0,0,2,0
1,She,Ellen,False,Kathleen,True,1,0,1,0
2,his,Jason Scott Lee,False,Danny,True,1,0,1,0
3,he,Reucassel,True,Debnam,False,1,0,1,0
4,she,Finch Hatton,False,Beryl Markham,True,0,0,1,1
...,...,...,...,...,...,...,...,...,...
95,he,Fred Ziffel,False,Drucker,True,1,0,1,0
96,her,Seema,False,Shalini,False,0,2,0,0
97,she,Branton,False,Heloise,False,0,0,2,0
98,his,Hibbert,True,Christopher Robin,False,1,0,1,0


In [278]:
len(test_df[test_df['fp_count'] + test_df['fn_count'] > 0])

46

In [279]:
len(test_df[test_df['tp_count'] + test_df['tn_count'] == 2])

54
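
The two counts above are row-level: 46 rows contain at least one error and 54 rows have both pairs judged correctly. The per-row counts can also be pooled into pair-level precision, recall, and F1. The sketch below does this for only the first five rows of the table shown above, so the numbers illustrate the arithmetic and are not a result for the full sample. It is also not the official GAP scorer.

```python
# (tp_count, fp_count, tn_count, fn_count) for rows 0-4 of the table above.
rows = [
    (0, 0, 2, 0),
    (1, 0, 1, 0),
    (1, 0, 1, 0),
    (1, 0, 1, 0),
    (0, 0, 1, 1),
]

# Pool the counts over all rows.
tp = sum(r[0] for r in rows)
fp = sum(r[1] for r in rows)
tn = sum(r[2] for r in rows)
fn = sum(r[3] for r in rows)

# Guard against division by zero when a denominator is empty.
precision = tp / (tp + fp) if tp + fp else 0.0
recall = tp / (tp + fn) if tp + fn else 0.0
f1 = (2 * precision * recall / (precision + recall)
      if precision + recall else 0.0)

print(tp, fp, tn, fn)
print(precision, recall, round(f1, 3))
```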

In [280]:
falses = test_df[test_df['fp_count'] + test_df['fn_count'] > 0]
falses[['Pronoun', 'A', 'A-coref', 'B', 'B-coref', 'str_clusters', 'str_fp', 'str_fn', 'str_tp', 'str_tn']]

Unnamed: 0,Pronoun,A,A-coref,B,B-coref,str_clusters,str_fp,str_fn,str_tp,str_tn
4,she,Finch Hatton,False,Beryl Markham,True,"Karen Blixen, her, her; her husband, Finch Hat...",,"she, Beryl Markham",,"she, Finch Hatton"
5,he,James Randi,False,Jos* Alvarez,True,"stage performer Jos * Alvarez, he",,"he, Jos* Alvarez",,"he, James Randi"
7,his,Colin,False,Jake Burns,True,"He, He, he, Colin; the Belfast Sunday News, th...",,"his, Jake Burns",,"his, Colin"
8,he,Scott,False,Cowan,True,"F . Scott Fitzgerald ' s, Fitzgerald, his, Sco...","F . Scott Fitzgerald ' s, Fitzgerald, his, Sco...","he, Cowan",,
9,her,Beverley Callard,True,Liz,False,"her, Liz, her, Beverley Callard, Liz","her, Liz, her, Beverley Callard, Liz",,"her, Liz, her, Beverley Callard, Liz",
10,he,Ioannis Mamouris,False,Kallergis,True,"This particular government, Greece, Greece; th...",,"he, Kallergis",,"he, Ioannis Mamouris"
12,her,Queen,True,Crystal,False,"Princess Luminous, her, her, Princess Luminous...",,"her, Queen",,"her, Crystal"
13,his,Dan Dailey,False,Michael Kidd,True,"dancer / choreographer Michael Kidd, his",,"his, Michael Kidd",,"his, Dan Dailey"
18,his,Dwight,False,Andy,True,"Andy, Dwight ' s, Dwight ' s, Dwight, Andy, th...","Andy, Dwight ' s, Dwight ' s, Dwight, Andy, th...",,"Andy, Dwight ' s, Dwight ' s, Dwight, Andy, th...",
19,his,Morris,False,David W. Taylor,True,"Rear Admiral David W . Taylor, his",,"his, David W. Taylor",,"his, Morris"


In [281]:
def make_pos_neg(a, a_value, b, b_value):
    positives = []
    negatives = []
    if a_value:
        positives.append(a)
    else:
        negatives.append(a)

    if b_value:
        positives.append(b)
    else:
        negatives.append(b)

    return ', '.join(positives), ', '.join(negatives)

In [282]:
p_n = test_df.apply(lambda x: make_pos_neg(x['A'], x['A-coref'], x['B'], x['B-coref']), axis=1).values

In [283]:
test_df['positives'] = [x[0] for x in p_n]
test_df['negatives'] = [x[1] for x in p_n]
test_df.head()

Unnamed: 0,ID,Text,Pronoun,Pronoun-offset,A,A-offset,A-coref,B,B-offset,B-coref,URL,sentences,len,clusters,tp,tp_count,str_tp,fp,fp_count,str_fp,tn,tn_count,str_tn,fn,fn_count,str_fn,str_clusters,positives,negatives
0,validation-1,He admitted making four trips to China and pla...,him,256,Jose de Venecia Jr,208,False,Abalos,241,False,http://en.wikipedia.org/wiki/Commission_on_Ele...,[He admitted making four trips to China and pl...,3,"[[He, He, he, his], [four trips to China, the ...",[],0,,[],0,,"[[him, Jose de Venecia Jr], [him, Abalos]]",2,"him, Jose de Venecia Jr; him, Abalos",[],0,,"He, He, he, his; four trips to China, the trip...",,"Jose de Venecia Jr, Abalos"
1,validation-2,"Kathleen Nott was born in Camberwell, London. ...",She,185,Ellen,110,False,Kathleen,150,True,http://en.wikipedia.org/wiki/Kathleen_Nott,"[Kathleen Nott was born in Camberwell, London....",3,"[[Kathleen Nott, Her, her, Kathleen, She], [Lo...","[[Kathleen Nott, Her, her, Kathleen, She]]",1,"Kathleen Nott, Her, her, Kathleen, She",[],0,,"[[She, Ellen]]",1,"She, Ellen",[],0,,"Kathleen Nott, Her, her, Kathleen, She; London...",Kathleen,Ellen
2,validation-3,"When she returns to her hotel room, a Liberian...",his,435,Jason Scott Lee,383,False,Danny,406,True,http://en.wikipedia.org/wiki/Hawaii_Five-0_(20...,"[When she returns to her hotel room, a Liberia...",3,"[[she, her, her, She, she, Angela], [the fligh...","[[Danny, his]]",1,"Danny, his",[],0,,"[[his, Jason Scott Lee]]",1,"his, Jason Scott Lee",[],0,,"she, her, her, She, she, Angela; the flight, t...",Danny,Jason Scott Lee
3,validation-4,"On 19 March 2007, during a campaign appearance...",he,333,Reucassel,300,True,Debnam,325,False,http://en.wikipedia.org/wiki/Craig_Reucassel,"[On 19 March 2007, during a campaign appearanc...",2,"[[the then opposition leader Peter Debnam, Deb...","[[the then opposition leader Peter Debnam, Deb...",1,"the then opposition leader Peter Debnam, Debna...",[],0,,"[[he, Debnam]]",1,"he, Debnam",[],0,,"the then opposition leader Peter Debnam, Debna...",Reucassel,Debnam
4,validation-5,"By this time, Karen Blixen had separated from ...",she,427,Finch Hatton,290,False,Beryl Markham,328,True,http://en.wikipedia.org/wiki/Denys_Finch_Hatton,"[By this time, Karen Blixen had separated from...",4,"[[Karen Blixen, her, her], [her husband, Finch...",[],0,,[],0,,"[[she, Finch Hatton]]",1,"she, Finch Hatton","[[she, Beryl Markham]]",1,"she, Beryl Markham","Karen Blixen, her, her; her husband, Finch Hat...",Beryl Markham,Finch Hatton


In [284]:
export = ['Text', 'Pronoun', 'A', 'A-coref', 'B', 'B-coref',
          'positives', 'negatives',
          'str_clusters', 'str_fp', 'str_fn', 'str_tp', 'str_tn',
          'tp_count', 'fp_count', 'tn_count', 'fn_count']
export_df = test_df[export]
export_df.to_csv('results2.tsv', sep='\t')