# Fine-tuning a spaCy model on the biomedical dataset "BLURB"

#### Fairly good performance so far, but the hyperparameters still need tuning: the "accuracy vs. efficiency" option, batch size, maximum number of epochs, etc.


## Dataset


#### https://huggingface.co/datasets/EMBO/BLURB

JNLPBA
The BioNLP / JNLPBA Shared Task 2004 involves identifying and classifying technical terms that refer to concepts of interest to biologists in the domain of molecular biology. The task was organized by the GENIA Project, based on the annotations of the GENIA Term corpus (version 3.02). Corpus format: **The JNLPBA corpus is distributed in IOB format, with each line containing a single token and its tag, separated by a tab character. Sentences are separated by blank lines.**
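
To make the corpus format concrete, here is a minimal sketch of reading such tab-separated IOB text into sentences. `parse_iob` is a hypothetical helper written for illustration and the sample string is hand-written; the notebook itself loads the already-tokenized version from Hugging Face and does not need this step.

```python
def parse_iob(text):
    """Parse tab-separated IOB text into a list of (tokens, tags) sentences."""
    sentences = []
    tokens, tags = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if tokens:
                sentences.append((tokens, tags))
                tokens, tags = [], []
            continue
        token, tag = line.split("\t")
        tokens.append(token)
        tags.append(tag)
    # Flush the last sentence if the text does not end with a blank line.
    if tokens:
        sentences.append((tokens, tags))
    return sentences


# Hand-written sample in the format described above (assumption: one token<TAB>tag per line).
sample = "IL-2\tB-DNA\ngene\tI-DNA\nexpression\tO\n\nCD28\tB-protein\n"
sentences = parse_iob(sample)
print(sentences)
```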


### Import Spacy

In [1]:
import spacy

In [3]:
!python -m spacy info


spaCy version    3.7.4                         
Location         C:\Users\rajar\anaconda3\lib\site-packages\spacy
Platform         Windows-10-10.0.19041-SP0     
Python version   3.8.5                         
Pipelines                                      



### Load dataset

In [4]:
import datasets 
blurb = datasets.load_dataset("EMBO/BLURB", "JNLPBA") 
blurb

DatasetDict({
    train: Dataset({
        features: ['id', 'tokens', 'ner_tags'],
        num_rows: 18608
    })
    validation: Dataset({
        features: ['id', 'tokens', 'ner_tags'],
        num_rows: 1940
    })
    test: Dataset({
        features: ['id', 'tokens', 'ner_tags'],
        num_rows: 4261
    })
})

In [9]:
type(blurb['train'])

datasets.arrow_dataset.Dataset

In [6]:
print(blurb['train'][1])

{'id': '1', 'tokens': ['IL-2', 'gene', 'expression', 'and', 'NF-kappa', 'B', 'activation', 'through', 'CD28', 'requires', 'reactive', 'oxygen', 'production', 'by', '5-lipoxygenase', '.'], 'ner_tags': [7, 8, 0, 0, 1, 2, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0]}


In [7]:
blurb['train'].features['ner_tags']

Sequence(feature=ClassLabel(names=['O', 'B-protein', 'I-protein', 'B-cell_type', 'I-cell_type', 'B-cell_line', 'I-cell_line', 'B-DNA', 'I-DNA', 'B-RNA', 'I-RNA'], id=None), length=-1, id=None)

In [8]:
# Collect the labels and create a mapping between numbers and labels.
label_lst = blurb['train'].features['ner_tags'].feature.names
id_to_label = {id: label for id, label in enumerate(label_lst)}
id_to_label

{0: 'O',
 1: 'B-protein',
 2: 'I-protein',
 3: 'B-cell_type',
 4: 'I-cell_type',
 5: 'B-cell_line',
 6: 'I-cell_line',
 7: 'B-DNA',
 8: 'I-DNA',
 9: 'B-RNA',
 10: 'I-RNA'}

**The dataset is an Arrow dataset. <br>
Each row holds a list of individual words (tokens) and their corresponding ner_tags (labels). <br>
It carries no character offsets, i.e. no start and end position for each token. <br><br>
To build spaCy training docs, we first bring the data into a text-plus-entities form like the one below: <br>
a list of dicts {'text': "given text", 'entities': [list of (start, end, label) tuples]}** <br>
{'text': 'IL-2 gene expression and NF-kappa B activation through CD28 requires reactive oxygen production by 5-lipoxygenase .',
  'entities': [(0, 4, 'B-DNA'),
   (5, 9, 'I-DNA'),
   (25, 33, 'B-protein'),
   (34, 35, 'I-protein'),
   (55, 59, 'B-protein'),
   (99, 113, 'B-protein')]}
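
Note that the format above keeps every B-/I- token as its own entity. An alternative, which this notebook does not use, would be to merge each B-/I- run into a single entity-level span such as `(0, 9, 'DNA')`. A minimal sketch of that idea, assuming tokens are joined with single spaces as in the text above; `bio_to_spans` is a hypothetical helper, not part of spaCy or of this notebook:

```python
def bio_to_spans(tokens, tags):
    """Join tokens with single spaces and merge B-/I- runs into (start, end, label) spans."""
    spans = []
    current = None  # [start, end, label] of the entity being built, if any
    offset = 0
    for token, tag in zip(tokens, tags):
        start, end = offset, offset + len(token)
        if tag.startswith("I-") and current is not None and current[2] == tag[2:]:
            # Continuation of the open entity: extend its end offset.
            current[1] = end
        else:
            # Anything else closes the open entity (if there is one).
            if current is not None:
                spans.append(tuple(current))
                current = None
            if tag != "O":
                # A B- tag (or a stray I- tag) opens a new entity.
                current = [start, end, tag[2:]]
        offset = end + 1  # account for the single space between tokens
    if current is not None:
        spans.append(tuple(current))
    return " ".join(tokens), spans


tokens = ['IL-2', 'gene', 'expression', 'and', 'NF-kappa', 'B', 'activation', 'through',
          'CD28', 'requires', 'reactive', 'oxygen', 'production', 'by', '5-lipoxygenase', '.']
tags = ['B-DNA', 'I-DNA', 'O', 'O', 'B-protein', 'I-protein', 'O', 'O',
        'B-protein', 'O', 'O', 'O', 'O', 'O', 'B-protein', 'O']
text, spans = bio_to_spans(tokens, tags)
print(spans)
# [(0, 9, 'DNA'), (25, 35, 'protein'), (55, 59, 'protein'), (99, 113, 'protein')]
```

With merged spans, the scores would be reported per entity type (protein, DNA, ...) instead of per B-/I- tag; whether that trains better on this data has not been tested here.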

### Data Preparation
We convert the Arrow dataset to the format spaCy needs.
First we convert the dataset to a pandas DataFrame for some preprocessing, and
then derive the list-of-dicts format shown above.


In [11]:
print(id_to_label)

{0: 'O', 1: 'B-protein', 2: 'I-protein', 3: 'B-cell_type', 4: 'I-cell_type', 5: 'B-cell_line', 6: 'I-cell_line', 7: 'B-DNA', 8: 'I-DNA', 9: 'B-RNA', 10: 'I-RNA'}


In [12]:
import pyarrow as pa
import pandas as pd

# Given a list of ner_tags in numbers, return the list of corresponding label strings.
def give_ner_labels(ner_lst):
    ret_lst = []
    for id in ner_lst:
        ret_lst.append(id_to_label[id])
    return(ret_lst)


# Function to
# 1. Join the individual words in the 'tokens' column of each
#     row of the dataframe into a single space-separated sentence.
# 2. Return a list of (start, end, label) character offsets, one per labelled word/token.

def join_and_track_indices(row):
    
    # Sentence string to be returned.
    result_string = ""
    # A list of (start, end, label) of each word/token.
    labl_index_lst = []
    
    current_index = 0

    for word, label in zip(row.tokens, row.label_lst):
        if result_string:
            result_string += " "
            current_index += 1

        start_index = current_index
        result_string += word
        end_index = current_index + len(word) - 1
        
        # Tokens tagged 'O' (outside any entity) are not entities, so we skip them.
        if label != 'O':
            labl_index_lst.append((start_index, end_index+1, label))
            
        current_index = end_index + 1
        
    # Return the combined sentence string and label index lst
    return result_string, labl_index_lst


# Method to create {'text': "given text", 'entities': [List of entities]} 
#  from 'text' and 'entities' column of each row of dataframe, and append it to a data list.
def build_data(row, data):
    temp_dict =  {}
    
    temp_dict['text'] = row['text']
    temp_dict['entities'] = row['entities']
    
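    # Keep only sentences with at least one entity. Sentences without any
    # entity are dropped, so fewer examples than dataset rows are converted.
    if len(temp_dict['entities']) != 0:
        data.append(temp_dict)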

        
# Method to convert a given arrow dataset 
#  to a list of dict {'text': "given text", 'entities': [List of entities]} needed for Spacy
def convert_arrow_to_spacy(arrow_ds):
    
    ret_data = []
    
    ### convert from Huggingface dataset to Pandas DF
    #https://stackoverflow.com/questions/71102654/huggingface-datasets-convert-a-dataset-to-pandas-and-then-convert-it-back
    df = arrow_ds.to_pandas()
    
    df['label_lst'] = df['ner_tags'].apply(give_ner_labels)
    df = df[['tokens','label_lst']]

    df2 = df.copy()
    df2[['text', 'entities']] = df2.apply(join_and_track_indices, axis=1, \
                                                        result_type = 'expand')
    df2 = df2[["text", "entities"]]

    df2.apply(lambda row: build_data(row, ret_data), axis=1)
    return ret_data


training_data =  convert_arrow_to_spacy(blurb['train'])
validation_data =  convert_arrow_to_spacy(blurb['validation'])
test_data = convert_arrow_to_spacy(blurb['test'])

#### Sanity-check the prepared data samples

In [13]:
training_data[0:3]

[{'text': 'IL-2 gene expression and NF-kappa B activation through CD28 requires reactive oxygen production by 5-lipoxygenase .',
  'entities': [(0, 4, 'B-DNA'),
   (5, 9, 'I-DNA'),
   (25, 33, 'B-protein'),
   (34, 35, 'I-protein'),
   (55, 59, 'B-protein'),
   (99, 113, 'B-protein')]},
 {'text': 'Activation of the CD28 surface receptor provides a major costimulatory signal for T cell activation resulting in enhanced production of interleukin-2 ( IL-2 ) and cell proliferation .',
  'entities': [(18, 22, 'B-protein'),
   (23, 30, 'I-protein'),
   (31, 39, 'I-protein'),
   (136, 149, 'B-protein'),
   (152, 156, 'B-protein')]},
 {'text': 'In primary T lymphocytes we show that CD28 ligation leads to the rapid intracellular formation of reactive oxygen intermediates ( ROIs ) which are required for CD28 -mediated activation of the NF-kappa B / CD28-responsive complex and IL-2 expression .',
  'entities': [(3, 10, 'B-cell_type'),
   (11, 12, 'I-cell_type'),
   (13, 24, 'I-cell_type'),
   (38,

In [14]:
validation_data[0:3]

[{'text': "Disruption of the human SCL locus by `` illegitimate '' V- ( D ) -J recombinase activity .",
  'entities': [(18, 23, 'B-DNA'), (24, 27, 'I-DNA'), (28, 33, 'I-DNA')]},
 {'text': 'A fusion complementary DNA in the T cell line HSB-2 elucidates a provocative mechanism for the disruption of the putative hematopoietic transcription factor SCL .',
  'entities': [(2, 8, 'B-DNA'),
   (9, 22, 'I-DNA'),
   (23, 26, 'I-DNA'),
   (34, 35, 'B-cell_line'),
   (36, 40, 'I-cell_line'),
   (41, 45, 'I-cell_line'),
   (46, 51, 'I-cell_line'),
   (122, 135, 'B-protein'),
   (136, 149, 'I-protein'),
   (150, 156, 'I-protein'),
   (157, 160, 'B-protein')]},
 {'text': "The fusion cDNA results from an interstitial deletion between a previously unknown locus , SIL ( SCL interrupting locus ) , and the 5 ' untranslated region of SCL .",
  'entities': [(4, 10, 'B-DNA'),
   (11, 15, 'I-DNA'),
   (75, 82, 'B-DNA'),
   (83, 88, 'I-DNA'),
   (91, 94, 'B-DNA'),
   (97, 100, 'B-DNA'),
   (101, 113, 'I-DNA'),

In [15]:
test_data[0:3]

[{'text': 'Number of glucocorticoid receptors in lymphocytes and their sensitivity to hormone action .',
  'entities': [(10, 24, 'B-protein'),
   (25, 34, 'I-protein'),
   (38, 49, 'B-cell_type')]},
 {'text': 'The study demonstrated a decreased level of glucocorticoid receptors ( GR ) in peripheral blood lymphocytes from hypercholesterolemic subjects , and an elevated level in patients with acute myocardial infarction .',
  'entities': [(44, 58, 'B-protein'),
   (59, 68, 'I-protein'),
   (71, 73, 'B-protein'),
   (79, 89, 'B-cell_type'),
   (90, 95, 'I-cell_type'),
   (96, 107, 'I-cell_type')]},
 {'text': 'In the lymphocytes with a high GR number , dexamethasone inhibited [ 3H ] -thymidine and [ 3H ] -acetate incorporation into DNA and cholesterol , respectively , in the same manner as in the control cells .',
  'entities': [(7, 18, 'B-cell_type'),
   (31, 33, 'B-protein'),
   (190, 197, 'B-cell_type'),
   (198, 203, 'I-cell_type')]}]

In [16]:
test_data[1]

{'text': 'The study demonstrated a decreased level of glucocorticoid receptors ( GR ) in peripheral blood lymphocytes from hypercholesterolemic subjects , and an elevated level in patients with acute myocardial infarction .',
 'entities': [(44, 58, 'B-protein'),
  (59, 68, 'I-protein'),
  (71, 73, 'B-protein'),
  (79, 89, 'B-cell_type'),
  (90, 95, 'I-cell_type'),
  (96, 107, 'I-cell_type')]}

In [17]:
print(test_data[1]['text'][44:58], test_data[1]['text'][59:68])
print(test_data[1]['text'][79:89], test_data[1]['text'][90:95], test_data[1]['text'][96:107])

glucocorticoid receptors
peripheral blood lymphocytes
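
The spot check above can be extended to every entity in an example. Below is a minimal sketch, assuming the single-space joining used in the data preparation step; `misaligned_entities` is a hypothetical helper and the sample is the first test example shown earlier.

```python
def misaligned_entities(example):
    """Return the (start, end, label) tuples that do not cover exactly one space-delimited token."""
    text = example["text"]
    bad = []
    for start, end, label in example["entities"]:
        piece = text[start:end]
        # A well-formed offset starts right after a space (or at 0) and ends right before one.
        starts_on_boundary = start == 0 or text[start - 1] == " "
        ends_on_boundary = end == len(text) or text[end] == " "
        if not piece or " " in piece or not (starts_on_boundary and ends_on_boundary):
            bad.append((start, end, label))
    return bad


sample = {
    "text": "Number of glucocorticoid receptors in lymphocytes and their sensitivity to hormone action .",
    "entities": [(10, 24, "B-protein"), (25, 34, "I-protein"), (38, 49, "B-cell_type")],
}
print(misaligned_entities(sample))  # an empty list means all offsets line up
```

Running this over `training_data`, `validation_data` and `test_data` before building the DocBin would flag offset bugs early, instead of relying on the "Skipping entity" message during conversion.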


### Example to understand spacy.tokens.doc.Doc and spacy.tokens.span.Span

In [19]:
# This downloads the spaCy model into the environment's site-packages (here: Users-><username>->anaconda3->Lib->site-packages)
!python -m spacy download en_core_web_sm

Collecting en-core-web-sm==3.7.1
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.7.1/en_core_web_sm-3.7.1-py3-none-any.whl (12.8 MB)
     --------------------------------------- 12.8/12.8 MB 13.3 MB/s eta 0:00:00





Installing collected packages: en-core-web-sm
Successfully installed en-core-web-sm-3.7.1
[+] Download and installation successful
You can now load the package via spacy.load('en_core_web_sm')


In [20]:
nlp = spacy.load("en_core_web_sm")
nlp



<spacy.lang.en.English at 0x238bf1a9430>

In [21]:
# Pass a text to the spacy model to get a 'doc' type object.
doc = nlp("Donad Trump was President of USA")

In [22]:
type(doc)

spacy.tokens.doc.Doc

In [29]:
doc.ents

(Donad Trump, USA)

In [37]:
for entity in doc.ents:
    print(entity)
    print(type(entity))
    print(entity.start_char, entity.end_char, entity.label_)
    print()

Donad Trump
<class 'spacy.tokens.span.Span'>
0 11 PERSON

USA
<class 'spacy.tokens.span.Span'>
29 32 GPE



In [35]:
from spacy import displacy
displacy.render(doc, style="ent", jupyter=True)

#### How to convert our text examples to spaCy Doc objects with entities as spaCy Spans

In [44]:
training_data[0]

{'text': 'IL-2 gene expression and NF-kappa B activation through CD28 requires reactive oxygen production by 5-lipoxygenase .',
 'entities': [(0, 4, 'B-DNA'),
  (5, 9, 'I-DNA'),
  (25, 33, 'B-protein'),
  (34, 35, 'I-protein'),
  (55, 59, 'B-protein'),
  (99, 113, 'B-protein')]}

In [45]:
from spacy.util import filter_spans

example = training_data[0]
text = example['text']
labels = example['entities']

nlp = spacy.blank("en") # create a blank English pipeline (tokenizer only, no trained components)
doc = nlp.make_doc(text) 
ents = []
    
for start, end, label in labels:
    span = doc.char_span(start, end, label=label, alignment_mode="contract")
        
    if span is None:
        #print("Skipping entity {} {} {}".format(text, start, label))
        print("Skipping entity")
    else:
        ents.append(span)
    
#There are some entity span overlaps, i.e., the indices of some entities overlap. 
#spaCy provides a utility method filter_spans to deal with this.
filtered_ents = filter_spans(ents)
doc.ents = filtered_ents         

In [46]:
doc

IL-2 gene expression and NF-kappa B activation through CD28 requires reactive oxygen production by 5-lipoxygenase .

In [47]:
type(doc)

spacy.tokens.doc.Doc

In [48]:
for entity in doc.ents:
    print(entity)
    print(type(entity))
    print(entity.start_char, entity.end_char, entity.label_)
    print()

IL-2
<class 'spacy.tokens.span.Span'>
0 4 B-DNA

gene
<class 'spacy.tokens.span.Span'>
5 9 I-DNA

NF-kappa
<class 'spacy.tokens.span.Span'>
25 33 B-protein

B
<class 'spacy.tokens.span.Span'>
34 35 I-protein

CD28
<class 'spacy.tokens.span.Span'>
55 59 B-protein

5-lipoxygenase
<class 'spacy.tokens.span.Span'>
99 113 B-protein



### Convert all of training_data to spaCy Doc objects and store them in a DocBin, the format spaCy trains from

In [49]:
# Convert the training_data to DocBin format that Spacy needs to train
from spacy.tokens import DocBin

#Tqdm is a popular Python library that provides a simple and convenient way to add 
# progress bars to loops and iterable objects
from tqdm import tqdm

nlp = spacy.blank("en") # create a blank English pipeline (tokenizer only, no trained components)
train_doc_bin = DocBin()
valid_doc_bin = DocBin()

In [50]:
from spacy.util import filter_spans

# doc_bin
#   doc
#     entities
#       span

def build_doc_bin(data, doc_bin):
    for example in tqdm(data): 
        text = example['text']
        labels = example['entities']
        doc = nlp.make_doc(text) 
        ents = []
    
        for start, end, label in labels:
        
            span = doc.char_span(start, end, label=label, alignment_mode="contract")
        
            if span is None:
                #print("Skipping entity {} {} {}".format(text, start, label))
                print("Skipping entity")
            else:
                ents.append(span)
    
        #There are some entity span overlaps, i.e., the indices of some entities overlap. 
        #spaCy provides a utility method filter_spans to deal with this.

        filtered_ents = filter_spans(ents)
        #print(filtered_ents)
        doc.ents = filtered_ents 
        doc_bin.add(doc)
    
build_doc_bin(training_data, train_doc_bin)
build_doc_bin(validation_data, valid_doc_bin)

# Save the training and validation doc_bins locally as spacy files
train_doc_bin.to_disk("train.spacy") 
valid_doc_bin.to_disk("validation.spacy") 

100%|██████████| 15150/15150 [00:03<00:00, 4476.97it/s]
100%|██████████| 1514/1514 [00:00<00:00, 4658.32it/s]


### Fine-tune Spacy Model

#### Prepare the config file for model training

In [17]:
# Pradip's method

# 1. Go here, select "ner" and "accuracy" (model will be larger and slower) and create base_config.cfg 
# https://spacy.io/usage/training#quickstart

# 2. Then run this
#!python -m spacy init fill-config base_config.cfg config.cfg

# 3. Then Set batch_size=100, max_epochs = 10 in config.cfg to limit the number of epochs, 
#      otherwise it will run to full optimisation and will take long time.


# CLI Alternate method -
# https://www.kaggle.com/code/mohsinsial/training-medical-ner-using-spacy
# https://www.youtube.com/watch?v=p_7hJvl7P2A&t=17s

# Got better results with the below CLI method with optimize efficiency 
#     and batch_size = 100, max_epochs = 100 in config.cfg

# 1. Run the below cli
!python -m spacy init config config.cfg --lang en --pipeline ner --optimize efficiency --force
#!python -m spacy init config config.cfg --lang en --pipeline ner --optimize accuracy --force

# 2. Edit batch_size = 100, max_epochs = 100 in config.cfg


[i] Generated config template specific for your use case
- Language: en
- Pipeline: ner
- Optimize for: efficiency
- Hardware: CPU
- Transformer: None
[+] Auto-filled config with all values
[+] Saved config
config.cfg
You can now add your data and train your pipeline:
python -m spacy train config.cfg --paths.train ./train.spacy --paths.dev ./dev.spacy




#### Train the spacy model using training data and validation data

In [45]:
!python -m spacy train config.cfg --output ./ --paths.train ./train.spacy \
    --paths.dev ./validation.spacy

[i] Saving to output directory: .
[i] Using CPU
[+] Initialized pipeline
[i] Pipeline: ['tok2vec', 'ner']
[i] Initial learn rate: 0.001
E    #       LOSS TOK2VEC  LOSS NER  ENTS_F  ENTS_P  ENTS_R  SCORE 
---  ------  ------------  --------  ------  ------  ------  ------
  0       0          0.00     49.43    0.32    0.85    0.20    0.00
  0     200        342.11   5914.50   33.77   54.63   24.44    0.34
  0     400        497.25   5073.36   60.28   64.59   56.51    0.60
  0     600        453.30   5397.60   65.84   70.16   62.03    0.66
  0     800        570.01   6046.21   67.75   69.99   65.65    0.68
  0    1000        749.73   7634.96   70.12   72.27   68.11    0.70
  0    1200        917.49   8494.01   71.06   72.02   70.12    0.71
  0    1400       1123.18   9724.16   73.14   77.37   69.34    0.73
  0    1600       1560.12  11486.37   75.86   77.30   74.48    0.76
  1    1800       1961.66  13370.28   75.



In [46]:
# https://github.com/explosion/spaCy/discussions/11150
# https://support.prodi.gy/t/which-hyperparameters-are-recommended-for-tuning-on-a-custom-spacy-ner-model/5803

# Try with batch_size = 500 instead of 100 - No improvement
# Try with dropout = 0.3 instead of 0.1 - No improvement
# Try with learn_rate = 0.005 instead of 0.001 - No improvement

#!python -m spacy train config.cfg --output ./ --paths.train ./train.spacy \
#    --paths.dev ./validation.spacy

### Test the fine-tuned model and evaluate metrics on test data

#### Load the fine-tuned model

In [47]:
import spacy
nlp_ner = spacy.load("model-best")

#### Take a few text samples from the test dataset and predict the entities

In [48]:
from spacy import displacy

test_examples = test_data[40:45]
#test_examples = test_data[10:15]

for example in test_examples:
    doc = nlp_ner(example['text'])
    print("Predicted Labels :")
    displacy.render(doc, style="ent", jupyter=True)
    
    print("True Labels :")
    
    for start, end, label in example['entities']:
        print(example['text'][start:end], label)
        
    print("\n")

Predicted Labels :


True Labels :
nuclear B-protein
proteins I-protein
OCTA B-DNA
element I-DNA


Predicted Labels :


True Labels :
NF-A1 B-protein
NF-A2 B-protein
B B-cell_type
cells I-cell_type
NF-A2 B-protein
Ig B-DNA
promoter I-DNA
NF-A1 B-protein
OCTA B-DNA
element I-DNA


Predicted Labels :


True Labels :
B-cell B-cell_line
line I-cell_line
BJA-B B-cell_line
NF-A2 B-protein
Ig B-DNA
promoters I-DNA


Predicted Labels :


True Labels :
HeLa B-cell_line
cells I-cell_line
Ig B-DNA
promoter I-DNA


Predicted Labels :


True Labels :
Ig B-DNA
promoter I-DNA




In [49]:
#test_data[0:5]

#### Evaluation metrics on test data 
https://stackoverflow.com/questions/52856057/is-there-a-way-with-spacys-ner-to-calculate-metrics-per-entity-type

In [50]:
# Method 1

from spacy.training.example import Example

examples = []

for example in test_data:
    ent_dict = {}
    #print(example['text'])
    ent_dict['entities'] = example['entities']
    #print(ent_dict)
    doc = nlp_ner.make_doc(example['text'])
    examples.append(Example.from_dict(doc, ent_dict))

nlp_ner.evaluate(examples)
#examples

{'token_acc': 1.0,
 'token_p': 1.0,
 'token_r': 1.0,
 'token_f': 1.0,
 'ents_p': 0.7490529811634061,
 'ents_r': 0.7443791254125413,
 'ents_f': 0.7467087396218607,
 'ents_per_type': {'B-protein': {'p': 0.6878677055175704,
   'r': 0.8537596210775606,
   'f': 0.7618879887284256},
  'I-protein': {'p': 0.793999104343932,
   'r': 0.7427733556765815,
   'f': 0.7675324675324675},
  'B-cell_type': {'p': 0.7796508127633955,
   'r': 0.6741280583029672,
   'f': 0.7230597431602456},
  'I-cell_type': {'p': 0.8609022556390977,
   'r': 0.6890672016048145,
   'f': 0.7654596100278551},
  'B-DNA': {'p': 0.7049180327868853,
   'r': 0.6515151515151515,
   'f': 0.6771653543307087},
  'B-cell_line': {'p': 0.5484460694698354, 'r': 0.6, 'f': 0.5730659025787964},
  'I-DNA': {'p': 0.799404761904762,
   'r': 0.7506987143655673,
   'f': 0.7742865379071778},
  'I-cell_line': {'p': 0.6893305439330544,
   'r': 0.6663296258847321,
   'f': 0.6776349614395887},
  'B-RNA': {'p': 0.6605504587155964,
   'r': 0.610169491525

In [51]:
# Method 2 

from spacy.scorer import Scorer

examples1 = []

for example in test_data:
    ent_dict = {}
    #print(example['text'])
    ent_dict['entities'] = example['entities']
    #print(ent_dict)
    doc = nlp_ner(example['text'])
    examples1.append(Example.from_dict(doc, ent_dict))

scorer = Scorer()
scores = scorer.score(examples1)
scores

{'token_acc': 1.0,
 'token_p': 1.0,
 'token_r': 1.0,
 'token_f': 1.0,
 'sents_p': None,
 'sents_r': None,
 'sents_f': None,
 'tag_acc': None,
 'pos_acc': None,
 'morph_acc': None,
 'morph_micro_p': None,
 'morph_micro_r': None,
 'morph_micro_f': None,
 'morph_per_feat': None,
 'dep_uas': None,
 'dep_las': None,
 'dep_las_per_type': None,
 'ents_p': 0.7490529811634061,
 'ents_r': 0.7443791254125413,
 'ents_f': 0.7467087396218607,
 'ents_per_type': {'B-protein': {'p': 0.6878677055175704,
   'r': 0.8537596210775606,
   'f': 0.7618879887284256},
  'I-protein': {'p': 0.793999104343932,
   'r': 0.7427733556765815,
   'f': 0.7675324675324675},
  'B-cell_type': {'p': 0.7796508127633955,
   'r': 0.6741280583029672,
   'f': 0.7230597431602456},
  'I-cell_type': {'p': 0.8609022556390977,
   'r': 0.6890672016048145,
   'f': 0.7654596100278551},
  'B-DNA': {'p': 0.7049180327868853,
   'r': 0.6515151515151515,
   'f': 0.6771653543307087},
  'B-cell_line': {'p': 0.5484460694698354, 'r': 0.6, 'f': 0.5

In [52]:
# Method 3

# The CLI below reads test.spacy; uncomment these lines on the first run to create it.
#test_doc_bin = DocBin()
#build_doc_bin(test_data, test_doc_bin)
#test_doc_bin.to_disk("test.spacy") 

!spacy evaluate ./model-best ./test.spacy  

[i] Using CPU

TOK     100.00
NER P   74.91 
NER R   74.44 
NER F   74.67 
SPEED   31838 


                  P       R       F
B-protein     68.79   85.38   76.19
I-protein     79.40   74.28   76.75
B-cell_type   77.97   67.41   72.31
I-cell_type   86.09   68.91   76.55
B-DNA         70.49   65.15   67.72
B-cell_line   54.84   60.00   57.31
I-DNA         79.94   75.07   77.43
I-cell_line   68.93   66.63   67.76
B-RNA         66.06   61.02   63.44
I-RNA         75.13   77.54   76.32
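
To help read the P / R / F columns above: as far as I understand, entity scoring counts a predicted span as correct only if its start, end and label all match a gold span exactly. A minimal sketch of that calculation for a single example follows. `prf` is a hypothetical helper, the predicted spans are made up for illustration, and spaCy itself aggregates the counts over the whole corpus rather than per example.

```python
def prf(gold, predicted):
    """Precision, recall and F1 over exact (start, end, label) matches."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Gold spans from the first test example; the predictions are hypothetical.
gold = [(10, 24, "B-protein"), (25, 34, "I-protein"), (38, 49, "B-cell_type")]
predicted = [(10, 24, "B-protein"), (38, 49, "B-cell_line")]  # second span has the wrong label
p, r, f = prf(gold, predicted)
print(round(p, 2), round(r, 2), round(f, 2))  # 0.5 0.33 0.4
```

Because the labels here are per-token B-/I- tags, the rows in the table are scored per tag rather than per whole entity mention.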





In [53]:
!spacy evaluate ./model-best ./validation.spacy  

[i] Using CPU

TOK     100.00
NER P   79.62 
NER R   78.29 
NER F   78.95 
SPEED   33237 


                  P       R       F
B-DNA         76.48   67.68   71.81
I-DNA         81.83   81.83   81.83
B-protein     76.99   83.45   80.09
I-protein     83.53   80.34   81.90
B-cell_line   78.30   62.25   69.36
I-cell_line   88.95   68.10   77.14
B-cell_type   72.38   78.71   75.41
I-cell_type   70.77   80.19   75.19
I-RNA         89.47   84.07   86.69
B-RNA         87.36   71.03   78.35



