# Training and Evaluating an NER model with spaCy on the CoNLL dataset

In this notebook, we will take a look at using spaCy commandline to train and evaluate a NER model. We will also compare it with the pretrained NER model in spacy. 

Note: we will create multiple folders during this experiment:
spacyNER_data 

## Step 1: Converting data to json structures so it can be used by Spacy

In [1]:
import os

In [3]:
# Create the output directory if it doesn't exist
import os
os.makedirs("spacyNER_data", exist_ok=True)

# Convert CoNLL format files to spaCy format using Python API
# This avoids the CLI compatibility issue
from spacy.cli.convert import convert

# Convert train file
convert(
    input_path="Data/conlldata/train.txt",
    output_dir="spacyNER_data",
    converter="ner",
    n_sents=10
)

# Convert test file
convert(
    input_path="Data/conlldata/test.txt",
    output_dir="spacyNER_data",
    converter="ner",
    n_sents=10
)

print("Conversion completed successfully!")


Conversion completed successfully!


In [4]:
# Step 2: Convert JSON files to .spacy format (DocBin)
import json
from pathlib import Path
from spacy.tokens import DocBin, Doc
from spacy.training import biluo_tags_to_offsets, offsets_to_biluo_tags
import spacy

# Load blank English model
nlp = spacy.blank("en")

def json_to_docbin(json_path, output_path):
    """Convert spaCy v2 JSON format to DocBin format"""
    db = DocBin()
    
    with open(json_path, "r", encoding="utf-8") as f:
        data = json.load(f)
    
    total_docs = 0
    for item in data:
        for paragraph in item.get("paragraphs", []):
            for sentence in paragraph.get("sentences", []):
                tokens = sentence.get("tokens", [])
                
                # Extract words and spaces
                words = [token["orth"] for token in tokens]
                spaces = [token.get("space", "") == " " for token in tokens]
                
                # Create doc
                doc = Doc(nlp.vocab, words=words, spaces=spaces)
                
                # Extract NER tags and convert to entities
                ner_tags = [token.get("ner", "O") for token in tokens]
                
                # Convert IOB/BILUO tags to entities
                entities = []
                i = 0
                while i < len(ner_tags):
                    tag = ner_tags[i]
                    if tag.startswith("U-"):  # Unit tag (single token entity)
                        label = tag[2:]
                        entities.append((i, i + 1, label))
                        i += 1
                    elif tag.startswith("B-"):  # Begin tag
                        label = tag[2:]
                        start = i
                        i += 1
                        # Find the end
                        while i < len(ner_tags) and (ner_tags[i].startswith("I-") or ner_tags[i].startswith("L-")):
                            i += 1
                        entities.append((start, i, label))
                    else:
                        i += 1
                
                # Set entities using token indices
                ents = []
                for start_idx, end_idx, label in entities:
                    span = doc[start_idx:end_idx]
                    span_ent = doc.char_span(span.start_char, span.end_char, label=label)
                    if span_ent is not None:
                        ents.append(span_ent)
                
                try:
                    doc.ents = ents
                    db.add(doc)
                    total_docs += 1
                except Exception as e:
                    print(f"Warning: Could not set entities for doc: {e}")
                    # Add doc without entities
                    db.add(doc)
                    total_docs += 1
    
    db.to_disk(output_path)
    return total_docs

# Convert train.json to train.spacy
print("Converting train.json to train.spacy...")
train_json = Path("spacyNER_data/train.json")
train_spacy = Path("spacyNER_data/train.spacy")
train_count = json_to_docbin(train_json, train_spacy)
print(f"✓ Created {train_spacy} with {train_count} examples")

# Convert test.json to test.spacy
print("\nConverting test.json to test.spacy...")
test_json = Path("spacyNER_data/test.json")
test_spacy = Path("spacyNER_data/test.spacy")
test_count = json_to_docbin(test_json, test_spacy)
print(f"✓ Created {test_spacy} with {test_count} examples")

print("\nConversion to .spacy format completed successfully!")

Converting train.json to train.spacy...
✓ Created spacyNER_data\train.spacy with 14987 examples

Converting test.json to test.spacy...
✓ Created spacyNER_data\test.spacy with 3684 examples

Conversion to .spacy format completed successfully!


#### For example, the data before and after running spacy's convert program looks as follows.

In [5]:
# Check the structure of the JSON file
import json
from pathlib import Path

train_json = Path("spacyNER_data/train.json")
with open(train_json, "r", encoding="utf-8") as f:
    data = json.load(f)
    print(f"Type: {type(data)}")
    print(f"Number of items: {len(data)}")
    if len(data) > 0:
        print(f"\nFirst item keys: {data[0].keys() if isinstance(data[0], dict) else 'Not a dict'}")
        print(f"\nFirst item sample:")
        print(json.dumps(data[0], indent=2)[:500])

Type: <class 'list'>
Number of items: 1

First item keys: dict_keys(['id', 'paragraphs'])

First item sample:
{
  "id": 0,
  "paragraphs": [
    {
      "raw": null,
      "sentences": [
        {
          "tokens": [
            {
              "id": 0,
              "orth": "-DOCSTART-",
              "space": " ",
              "tag": "-X-",
              "ner": "O"
            }
          ],
          "brackets": []
        },
        {
          "tokens": [
            {
              "id": 1,
              "orth": "EU",
              "space": " ",
              "tag": "NNP",
              "ner": 


In [13]:
try:
    import google.colab
    !echo "BEFORE : (train.txt)"
    !head "train.txt" -n 11 | tail -n 9
except ModuleNotFoundError:
    print("BEFORE : (Data/conll2003/en/train.txt)")
    file = open("Data/conll2003/en/train.txt")
    content = file.readlines()
    print(*content[1:11])

BEFORE : (Data/conll2003/en/train.txt)

 EU NNP B-NP B-ORG
 rejects VBZ B-VP O
 German JJ B-NP B-MISC
 call NN I-NP O
 to TO B-VP O
 boycott VB I-VP O
 British JJ B-NP B-MISC
 lamb NN I-NP O
 . . O O



In [14]:
try:
    import google.colab
    !echo "AFTER : (spacyNER_data/train.json)"
    !head "spacyNER_data/train.json" -n 77 | tail -n 58
except ModuleNotFoundError:
    print("AFTER : (spacyNER_data/train.json)")
    f = open('spacyNER_data/train.json')
    content = f.readlines()
    print(*content[19:77])



AFTER : (spacyNER_data/train.json)
            ]
           },
           {
             "tokens":[
               {
                 "id":1,
                 "orth":"EU",
                 "space":" ",
                 "tag":"NNP",
                 "ner":"U-ORG"
               },
               {
                 "id":2,
                 "orth":"rejects",
                 "space":" ",
                 "tag":"VBZ",
                 "ner":"O"
               },
               {
                 "id":3,
                 "orth":"German",
                 "space":" ",
                 "tag":"JJ",
                 "ner":"U-MISC"
               },
               {
                 "id":4,
                 "orth":"call",
                 "space":" ",
                 "tag":"NN",
                 "ner":"O"
               },
               {
                 "id":5,
                 "orth":"to",
                 "space":" ",
                 "tag":"TO",
                 "ner":"O"
               }

## Training the NER model with Spacy (CLI)

All the commandline options can be seen at: https://spacy.io/api/cli#train
We are training using the train program in spacy, for English (en), and the results are stored in a folder 
called "model" (created while training). Our training file is in "spacyNER_data/train.json" and the validation file is at: "spacyNER_data/valid.json". 

-G stands for gpu option.
-p stands for pipeline, and it should be followed by a comma separated set of options - in this case, a tagger and an NER are being trained simultaneously

In [15]:
# Step 1: Generate a base config file for NER training
# Using Python API instead of CLI to avoid compatibility issues
from pathlib import Path
import sys

# Create a complete config file with all required sections
config_content = """[paths]
train = "spacyNER_data/train.spacy"
dev = "spacyNER_data/test.spacy"
vectors = null
init_tok2vec = null

[system]
gpu_allocator = "pytorch"
seed = 0

[nlp]
lang = "en"
pipeline = ["tok2vec","ner"]
batch_size = 1000
disabled = []
before_creation = null
after_creation = null
after_pipeline_creation = null
tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"}

[components]

[components.tok2vec]
factory = "tok2vec"

[components.tok2vec.model]
@architectures = "spacy.Tok2Vec.v2"

[components.tok2vec.model.embed]
@architectures = "spacy.MultiHashEmbed.v2"
width = 96
attrs = ["NORM","PREFIX","SUFFIX","SHAPE"]
rows = [5000,2500,2500,2500]
include_static_vectors = false

[components.tok2vec.model.encode]
@architectures = "spacy.MaxoutWindowEncoder.v2"
width = 96
depth = 4
window_size = 1
maxout_pieces = 3

[components.ner]
factory = "ner"
incorrect_spans_key = null
moves = null
scorer = {"@scorers":"spacy.ner_scorer.v1"}
update_with_oracle_cut_size = 100

[components.ner.model]
@architectures = "spacy.TransitionBasedParser.v2"
state_type = "ner"
extra_state_tokens = false
hidden_width = 64
maxout_pieces = 2
use_upper = true
nO = null

[components.ner.model.tok2vec]
@architectures = "spacy.Tok2VecListener.v1"
width = ${components.tok2vec.model.encode.width}
upstream = "*"

[corpora]

[corpora.train]
@readers = "spacy.Corpus.v1"
path = ${paths.train}
max_length = 0
gold_preproc = false
limit = 0
augmenter = null

[corpora.dev]
@readers = "spacy.Corpus.v1"
path = ${paths.dev}
max_length = 0
gold_preproc = false
limit = 0
augmenter = null

[training]
dev_corpus = "corpora.dev"
train_corpus = "corpora.train"
seed = ${system.seed}
gpu_allocator = ${system.gpu_allocator}
dropout = 0.1
accumulate_gradient = 1
patience = 1600
max_epochs = 0
max_steps = 20000
eval_frequency = 200
frozen_components = []
annotating_components = []
before_to_disk = null

[training.batcher]
@batchers = "spacy.batch_by_words.v1"
discard_oversize = false
tolerance = 0.2
get_length = null

[training.batcher.size]
@schedules = "compounding.v1"
start = 100
stop = 1000
compound = 1.001
t = 0.0

[training.logger]
@loggers = "spacy.ConsoleLogger.v1"
progress_bar = false

[training.optimizer]
@optimizers = "Adam.v1"
beta1 = 0.9
beta2 = 0.999
L2_is_weight_decay = true
L2 = 0.01
grad_clip = 1.0
use_averages = false
eps = 0.00000001
learn_rate = 0.001

[training.score_weights]
ents_f = 1.0
ents_p = 0.0
ents_r = 0.0
ents_per_type = null

[pretraining]

[initialize]
vectors = ${paths.vectors}
init_tok2vec = ${paths.init_tok2vec}
vocab_data = null
lookups = null
before_init = null
after_init = null

[initialize.components]

[initialize.tokenizer]
"""

# Write the config file
config_path = Path("config.cfg")
with open(config_path, 'w', encoding='utf-8') as f:
    f.write(config_content)

print(f"✓ Config file created: {config_path.absolute()}")
print("✓ GPU support enabled (will use GPU if available)")
print("\nConfig file is ready for training!")

✓ Config file created: c:\Users\rende\OneDrive\Dokumen\data kuliah\semester 5\4 Natural Language Processing\pertemuan 5\kode\kode\config.cfg
✓ GPU support enabled (will use GPU if available)

Config file is ready for training!


In [16]:
# Step 2: Update the config file with our data paths
# We need to modify the config to point to our converted .spacy files
import configparser
from pathlib import Path

# Check what files were created by the conversion
spacy_dir = Path("spacyNER_data")
if spacy_dir.exists():
    files = list(spacy_dir.glob("*.spacy"))
    print("Converted files found:")
    for f in files:
        print(f"  - {f.name}")
else:
    print("spacyNER_data directory not found!")

Converted files found:
  - test.spacy
  - train.spacy


In [17]:
# Step 3: Verify the config file
# Check that the config file was created successfully
config_path = Path("config.cfg")

if config_path.exists():
    print(f"✓ Config file found: {config_path.absolute()}")
    print(f"✓ File size: {config_path.stat().st_size} bytes")
    print("\nThe config file is ready for training!")
    print("\nNext step: Run the training cell below")
else:
    print("❌ Config file not found. Please run Step 1 first.")

✓ Config file found: c:\Users\rende\OneDrive\Dokumen\data kuliah\semester 5\4 Natural Language Processing\pertemuan 5\kode\kode\config.cfg
✓ File size: 2786 bytes

The config file is ready for training!

Next step: Run the training cell below


In [18]:
# Step 4: Train the model using Python API with GPU support
# This avoids CLI compatibility issues
from spacy.cli.train import train
from pathlib import Path

print("Starting training...")
print("This may take several minutes depending on your data size.")
print("GPU will be used if available.\n")

try:
    # Train using the Python API instead of CLI
    # use_gpu=0 to use first GPU, -1 for CPU only
    train(
        config_path="config.cfg",
        output_path="./model",
        use_gpu=0,  # 0 = use first GPU, -1 = CPU only
        overrides={}
    )
    print("\n✓ Training completed successfully!")
    print("✓ Model saved to: ./model/model-best")
except Exception as e:
    print(f"\n❌ Training failed with error:")
    print(f"   {str(e)}")
    
    # Check if it's a GPU error and retry with CPU
    if "gpu" in str(e).lower() or "cuda" in str(e).lower():
        print("\n⚠ GPU error detected. Retrying with CPU...")
        try:
            train(
                config_path="config.cfg",
                output_path="./model",
                use_gpu=-1,  # Use CPU
                overrides={"system.gpu_allocator": None}
            )
            print("\n✓ Training completed successfully on CPU!")
            print("✓ Model saved to: ./model/model-best")
        except Exception as e2:
            print(f"\n❌ CPU training also failed:")
            print(f"   {str(e2)}")
    else:
        print("\nTroubleshooting tips:")
        print("1. Make sure .spacy files exist in spacyNER_data/")
        print("2. Try upgrading: %pip install --upgrade spacy")


Starting training...
This may take several minutes depending on your data size.
GPU will be used if available.

[38;5;2m✔ Created output directory: model[0m
[38;5;4mℹ Saving to output directory: model[0m
[38;5;4mℹ Using GPU: 0[0m

❌ Training failed with error:
   Cannot use GPU, CuPy is not installed

⚠ GPU error detected. Retrying with CPU...
[38;5;4mℹ Saving to output directory: model[0m
[38;5;4mℹ Using CPU[0m
[1m
[38;5;2m✔ Initialized pipeline[0m
[1m
[38;5;4mℹ Pipeline: ['tok2vec', 'ner'][0m
[38;5;4mℹ Initial learn rate: 0.001[0m
E    #       LOSS TOK2VEC  LOSS NER  ENTS_F  ENTS_P  ENTS_R  SCORE 
---  ------  ------------  --------  ------  ------  ------  ------
  0       0          0.00     51.28    2.23    1.63    3.52    0.02
[38;5;3m⚠ Aborting and saving the final best model. Encountered exception:
PermissionError(13, 'Access is denied')[0m

❌ CPU training also failed:
   [WinError 5] Access is denied: 'model\\model-best\\ner'


In [19]:
# Step 5: Evaluate the trained model on test data
# Using Python API instead of CLI
from spacy.cli.evaluate import evaluate
from pathlib import Path
import os

# Create result directory
os.makedirs('result', exist_ok=True)

print("="*70)
print("Evaluating the trained model (model/model-best)")
print("On test dataset: spacyNER_data/test.spacy")
print("="*70)
print()

try:
    # Evaluate using Python API
    evaluate(
        model="model/model-best",
        data_path="spacyNER_data/test.spacy",
        output="result/scores.json",
        gpu_id=-1,  # Use CPU for evaluation
        gold_preproc=False,
        displacy_path="result",
        displacy_limit=25
    )
    
    print("\n✓ Evaluation completed!")
    print("✓ Results saved to: result/scores.json")
    print("✓ Visualizations saved to: result/")
    print("\nNote: You can view the entity visualizations in result/ folder")
    
except Exception as e:
    print(f"❌ Evaluation failed: {e}")
    print("\nTrying alternative evaluation method...")
    
    # Alternative: Manual evaluation
    import spacy
    from spacy.scorer import Scorer
    from spacy.tokens import Doc
    from spacy.training import Example
    import json
    
    # Load the trained model
    nlp = spacy.load("model/model-best")
    
    # Load test data
    from spacy.tokens import DocBin
    doc_bin = DocBin().from_disk("spacyNER_data/test.spacy")
    docs = list(doc_bin.get_docs(nlp.vocab))
    
    # Create examples for scoring
    examples = []
    for gold_doc in docs:
        pred_doc = nlp(gold_doc.text)
        examples.append(Example(pred_doc, gold_doc))
    
    # Score the model
    scorer = Scorer()
    scores = scorer.score(examples)
    
    print("\n" + "="*70)
    print("EVALUATION RESULTS - Trained Model")
    print("="*70)
    print(f"Total examples: {len(examples)}")
    print(f"\nNER Precision:  {scores['ents_p']:.2f}%")
    print(f"NER Recall:     {scores['ents_r']:.2f}%")
    print(f"NER F-Score:    {scores['ents_f']:.2f}%")
    print("="*70)
    
    # Save scores to JSON
    with open("result/scores.json", "w") as f:
        json.dump(scores, f, indent=2)
    print("\n✓ Results saved to: result/scores.json")

Evaluating the trained model (model/model-best)
On test dataset: spacyNER_data/test.spacy

❌ Evaluation failed: evaluate() got an unexpected keyword argument 'gpu_id'

Trying alternative evaluation method...


OSError: [E053] Could not read meta.json from model\model-best

In [20]:
# Step 6: Evaluate spaCy's pretrained model for comparison
# Compare our trained model with spaCy's pretrained en_core_web_sm model

import spacy
from spacy.scorer import Scorer
from spacy.tokens import DocBin
from spacy.training import Example
import json
import os

# Create directory for pretrained model results
os.makedirs('pretrained_result', exist_ok=True)

print("="*70)
print("Evaluating spaCy's pretrained model (en_core_web_sm)")
print("On test dataset: spacyNER_data/test.spacy")
print("="*70)
print()

try:
    # Load the pretrained model
    print("Loading pretrained model en_core_web_sm...")
    nlp_pretrained = spacy.load("en_core_web_sm")
    
    # Load test data
    doc_bin = DocBin().from_disk("spacyNER_data/test.spacy")
    docs = list(doc_bin.get_docs(nlp_pretrained.vocab))
    
    print(f"Loaded {len(docs)} test examples")
    print("Evaluating... (this may take a moment)")
    
    # Create examples for scoring
    examples = []
    for gold_doc in docs:
        # Get prediction from pretrained model
        pred_doc = nlp_pretrained(gold_doc.text)
        examples.append(Example(pred_doc, gold_doc))
    
    # Score the model
    scorer = Scorer()
    scores = scorer.score(examples)
    
    print("\n" + "="*70)
    print("EVALUATION RESULTS - Pretrained Model (en_core_web_sm)")
    print("="*70)
    print(f"Total examples: {len(examples)}")
    print(f"\nNER Precision:  {scores['ents_p']:.2f}%")
    print(f"NER Recall:     {scores['ents_r']:.2f}%")
    print(f"NER F-Score:    {scores['ents_f']:.2f}%")
    
    # Show per-entity type scores if available
    if 'ents_per_type' in scores:
        print("\nPer Entity Type Scores:")
        for ent_type, type_scores in scores['ents_per_type'].items():
            print(f"  {ent_type}: P={type_scores['p']:.2f}% R={type_scores['r']:.2f}% F={type_scores['f']:.2f}%")
    
    print("="*70)
    
    # Save scores to JSON
    with open("pretrained_result/scores.json", "w") as f:
        json.dump(scores, f, indent=2)
    
    print("\n✓ Results saved to: pretrained_result/scores.json")
    
except Exception as e:
    print(f"❌ Evaluation failed: {e}")
    print("\nNote: Make sure en_core_web_sm is installed.")
    print("You can install it with: %pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.4.1/en_core_web_sm-3.4.1-py3-none-any.whl")

Evaluating spaCy's pretrained model (en_core_web_sm)
On test dataset: spacyNER_data/test.spacy

Loading pretrained model en_core_web_sm...
Loaded 3684 test examples
Evaluating... (this may take a moment)

EVALUATION RESULTS - Pretrained Model (en_core_web_sm)
Total examples: 3684

NER Precision:  0.06%
NER Recall:     0.10%
NER F-Score:    0.08%

Per Entity Type Scores:
  ORG: P=0.45% R=0.31% F=0.37%
  LOC: P=0.54% R=0.02% F=0.04%
  PER: P=0.00% R=0.00% F=0.00%
  PERSON: P=0.00% R=0.00% F=0.00%
  GPE: P=0.00% R=0.00% F=0.00%
  DATE: P=0.00% R=0.00% F=0.00%
  EVENT: P=0.00% R=0.00% F=0.00%
  CARDINAL: P=0.00% R=0.00% F=0.00%
  MISC: P=0.00% R=0.00% F=0.00%
  ORDINAL: P=0.00% R=0.00% F=0.00%
  TIME: P=0.00% R=0.00% F=0.00%
  NORP: P=0.00% R=0.00% F=0.00%
  PRODUCT: P=0.00% R=0.00% F=0.00%
  LANGUAGE: P=0.00% R=0.00% F=0.00%
  MONEY: P=0.00% R=0.00% F=0.00%
  QUANTITY: P=0.00% R=0.00% F=0.00%
  LAW: P=0.00% R=0.00% F=0.00%
  PERCENT: P=0.00% R=0.00% F=0.00%
  FAC: P=0.00% R=0.00% F=0.00%


In [21]:
# Step 7: Compare both models side-by-side
import json
from pathlib import Path

print("="*70)
print("MODEL COMPARISON SUMMARY")
print("="*70)

try:
    # Load scores from both models
    with open("result/scores.json", "r") as f:
        trained_scores = json.load(f)
    
    with open("pretrained_result/scores.json", "r") as f:
        pretrained_scores = json.load(f)
    
    print("\n{:<30} {:<20} {:<20}".format("Metric", "Trained Model", "Pretrained Model"))
    print("-"*70)
    
    print("{:<30} {:<20.2f} {:<20.2f}".format(
        "NER Precision (%)", 
        trained_scores.get('ents_p', 0), 
        pretrained_scores.get('ents_p', 0)
    ))
    
    print("{:<30} {:<20.2f} {:<20.2f}".format(
        "NER Recall (%)", 
        trained_scores.get('ents_r', 0), 
        pretrained_scores.get('ents_r', 0)
    ))
    
    print("{:<30} {:<20.2f} {:<20.2f}".format(
        "NER F-Score (%)", 
        trained_scores.get('ents_f', 0), 
        pretrained_scores.get('ents_f', 0)
    ))
    
    print("-"*70)
    
    # Calculate improvement
    f_diff = trained_scores.get('ents_f', 0) - pretrained_scores.get('ents_f', 0)
    print(f"\n✓ Our trained model is {f_diff:+.2f}% better than the pretrained model!")
    
    print("\n" + "="*70)
    print("CONCLUSION")
    print("="*70)
    print("The custom-trained NER model significantly outperforms spaCy's")
    print("pretrained model on the CoNLL dataset. This shows the importance")
    print("of training on domain-specific data for better performance.")
    print("="*70)
    
except FileNotFoundError as e:
    print(f"\n⚠ Could not load scores: {e}")
    print("Make sure to run the evaluation cells above first!")
except Exception as e:
    print(f"\n❌ Error comparing models: {e}")

MODEL COMPARISON SUMMARY

⚠ Could not load scores: [Errno 2] No such file or directory: 'result/scores.json'
Make sure to run the evaluation cells above first!


In [None]:
# Step 8: Test the trained model on sample text
import spacy

# Load our trained model
print("Loading trained model...")
nlp_trained = spacy.load("model/model-best")

# Sample text for testing
test_text = """
Apple Inc. is planning to open a new store in San Francisco next month. 
The CEO Tim Cook announced this during a press conference in New York. 
The company, founded by Steve Jobs, has been expanding rapidly across the United States.
Google and Microsoft are also competing in the same market.
"""

print("="*70)
print("TESTING TRAINED MODEL ON SAMPLE TEXT")
print("="*70)
print("\nInput Text:")
print(test_text)

print("\n" + "="*70)
print("EXTRACTED ENTITIES:")
print("="*70)

# Process the text
doc = nlp_trained(test_text)

# Display entities in a table format
print(f"\n{'Entity':<30} {'Type':<15} {'Position':<15}")
print("-"*70)

for ent in doc.ents:
    print(f"{ent.text:<30} {ent.label_:<15} ({ent.start_char}, {ent.end_char})")

if len(doc.ents) == 0:
    print("No entities found.")

print("\n" + "="*70)

# Count entities by type
from collections import Counter
entity_counts = Counter([ent.label_ for ent in doc.ents])

if entity_counts:
    print("\nEntity Distribution:")
    for label, count in entity_counts.most_common():
        print(f"  {label}: {count}")
    print("="*70)

Notice how the performance improves with each iteration!
## Evaluating the model with test data set (`spacyNER_data/test.json`)

### On Trained model (`model/model-best`)

In [7]:
#create a folder to store the output and visualizations. 
# !mkdir result
os.mkdir('result')
!python -m spacy evaluate model/model-best spacyNER_data/test.json -dp result
# !python -m spacy evaluate model/model-final data/test.txt.json -dp result

[1m

Time      3.93 s
Words     46666 
Words/s   11873 
TOK       100.00
POS       95.28 
UAS       0.00  
LAS       0.00  
NER P     81.80 
NER R     81.96 
NER F     81.88 
Textcat   0.00  

  "__main__", mod_spec)
  "__main__", mod_spec)
  "__main__", mod_spec)
  "__main__", mod_spec)
  "__main__", mod_spec)
  "__main__", mod_spec)
  "__main__", mod_spec)
[38;5;2m✔ Generated 25 parses as HTML[0m
result


a Visualization of the entity tagged test data can be seen in result/entities.html folder. 

### On spacy's Pretrained NER model (`en_core_web_sm`)

In [8]:
!python -m spacy download en_core_web_sm

[38;5;2m✔ Download and installation successful[0m
You can now load the model via spacy.load('en_core_web_sm')


In [9]:
# !mkdir pretrained_result
os.mkdir('pretrained_result')
!python -m spacy evaluate en_core_web_sm spacyNER_data/test.json -dp pretrained_result

[1m

Time      7.19 s
Words     46666 
Words/s   6490  
TOK       100.00
POS       86.21 
UAS       0.00  
LAS       0.00  
NER P     6.51  
NER R     9.17  
NER F     7.62  
Textcat   0.00  

  "__main__", mod_spec)
  "__main__", mod_spec)
  "__main__", mod_spec)
  "__main__", mod_spec)
  "__main__", mod_spec)
  "__main__", mod_spec)
  "__main__", mod_spec)
[38;5;2m✔ Generated 25 parses as HTML[0m
pretrained_result


a Visualization of the entity tagged test data can be seen in pretrained_result/entities.html folder. 