# Training a neural network model

This chapter covers how to update spaCy's statistical models with your own examples: first the basics of how training works, then how to write a training loop from scratch.

In [8]:
import os
import spacy

model_path = os.path.join(os.getcwd(), 'models/en_core_web_md')
nlp = spacy.load(model_path)

## Why train a custom model?

- Better results on your specific domain
- Learn classification schemes specifically for your problem
- Essential for text classification
- Very useful for named entity recognition
- Less critical for part-of-speech tagging and dependency parsing

## How training works

1. Initialize the model weights randomly with nlp.begin_training
2. Predict a few examples with the current weights by calling nlp.update
3. Compare prediction with true labels
4. Calculate how to change weights to improve predictions
5. Update weights slightly
6. Go back to step 2 and repeat with the next batch of examples.

![Training](images/training.png)
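The six steps above are not specific to spaCy. As a minimal sketch, the same cycle can be written out for a toy model that fits a straight line `y = w * x + b`. The function name `train_linear`, the toy data, and the hyperparameters are assumptions made for illustration; spaCy's own models are neural networks and are updated through `nlp.update`, not through code like this.

```python
import random

def train_linear(examples, n_iter=500, batch_size=2, learn_rate=0.05, seed=0):
    """Fit y = w * x + b with mini-batch gradient descent (illustrative only)."""
    rng = random.Random(seed)
    data = list(examples)
    # 1. Initialize the weights randomly
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    for _ in range(n_iter):
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # 2. Predict a few examples with the current weights
            preds = [w * x + b for x, _ in batch]
            # 3. Compare the predictions with the true labels
            errors = [pred - y for pred, (_, y) in zip(preds, batch)]
            # 4. Calculate how to change the weights:
            #    gradient of the mean squared error over the batch
            grad_w = 2 * sum(e * x for e, (x, _) in zip(errors, batch)) / len(batch)
            grad_b = 2 * sum(errors) / len(batch)
            # 5. Update the weights slightly
            w -= learn_rate * grad_w
            b -= learn_rate * grad_b
            # 6. Go back to step 2 with the next batch
    return w, b

# Toy data generated from y = 2x + 1
TOY_DATA = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = train_linear(TOY_DATA)
print(round(w, 2), round(b, 2))
```

The learning rate controls how "slightly" the weights move in step 5: too large and the updates overshoot, too small and training needs many more iterations.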

In [1]:
("iPhone X is coming", {'entities': [(0, 8, 'GADGET')]})

('iPhone X is coming', {'entities': [(0, 8, 'GADGET')]})

In [3]:
("I need a new phone! Any tips?", {'entities': []})

('I need a new phone! Any tips?', {'entities': []})

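Goal: teach the model to generalize from examples like these rather than memorize them. Examples with no entities, like the second one, matter as well: they show the model contexts in which nothing should be labeled.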
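In the examples above, each entity is a `(start, end, label)` tuple of character offsets into the text, with an exclusive end as in Python slicing. Wrong offsets are an easy mistake to make when annotating by hand, so it can help to slice the text and look at what each span actually covers. The helper below is a hypothetical sketch (the name `check_example` is not part of spaCy):

```python
def check_example(example):
    """Return the (substring, label) pairs covered by one training example."""
    text, annotations = example
    spans = []
    for start, end, label in annotations['entities']:
        # Offsets are character positions; end is exclusive, as in slicing
        if not 0 <= start < end <= len(text):
            raise ValueError(f"Span ({start}, {end}) is outside the text {text!r}")
        spans.append((text[start:end], label))
    return spans

print(check_example(("iPhone X is coming", {'entities': [(0, 8, 'GADGET')]})))
# [('iPhone X', 'GADGET')]
print(check_example(("I need a new phone! Any tips?", {'entities': []})))
# []
```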

## The training data

- Examples of what we want the model to predict in context
- Update an existing model: a few hundred to a few thousand examples
- Train a new category: a few thousand to a million examples
- Usually created manually by human annotators
- Can be semi-automated
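One way to semi-automate annotation is to pre-label texts with simple rules and have a human review the result. The sketch below uses a plain regular expression over a list of known phrases as a stand-in for a rule-based matcher; the function name `annotate` and the phrase list are assumptions for illustration. Exact string matching misses unseen spellings and can mislabel ambiguous words, so the output should be treated as a draft.

```python
import re

def annotate(texts, phrases, label):
    """Build (text, {'entities': [...]}) examples by exact phrase matching."""
    # Try longer phrases first so "iPhone X" wins over "iPhone"
    alternatives = sorted(phrases, key=len, reverse=True)
    pattern = re.compile('|'.join(re.escape(p) for p in alternatives))
    examples = []
    for text in texts:
        # m.start() and m.end() are character offsets with an exclusive end
        entities = [(m.start(), m.end(), label) for m in pattern.finditer(text)]
        examples.append((text, {'entities': entities}))
    return examples

TEXTS = ["How to preorder the iPhone X", "I need a new phone! Any tips?"]
for example in annotate(TEXTS, ["iPhone X", "iPhone"], 'GADGET'):
    print(example)
# ('How to preorder the iPhone X', {'entities': [(20, 28, 'GADGET')]})
# ('I need a new phone! Any tips?', {'entities': []})
```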

## Training loop steps

- Loop for a number of iterations.
- Shuffle the training data.
- Divide the data into batches.
- Update the model for each batch.
- Save the updated model.
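The shuffle and batch steps can be sketched without spaCy to show what happens to the data in each iteration. The `minibatch` helper here is a simplified stand-in written for this example, not the implementation of `spacy.util.minibatch`, and `run_epochs` and its `update` callback are hypothetical names. Saving the model is left out because it depends on the library.

```python
import random

def minibatch(items, size=2):
    """Yield consecutive batches of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def run_epochs(data, update, n_iter=3, size=2, seed=0):
    """Call update(batch) for every batch in every iteration."""
    rng = random.Random(seed)
    data = list(data)
    # Loop for a number of iterations
    for _ in range(n_iter):
        # Shuffle so the batches differ from one iteration to the next
        rng.shuffle(data)
        # Divide the data into batches and update once per batch
        for batch in minibatch(data, size=size):
            update(batch)

# Record the batches instead of training a model
seen = []
run_epochs(range(5), seen.append)
print(len(seen))  # 3 iterations x 3 batches = 9
```

Every item appears exactly once per iteration; only the order and grouping change.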

In [10]:
# Example loop
import random

TRAINING_DATA = [
    ("How to preorder the iPhone X", {'entities': [(20, 28, 'GADGET')]})
    # And many more examples...
]

# Loop for 10 iterations
for i in range(10):
    # Shuffle the training data
    random.shuffle(TRAINING_DATA)
    # Create batches and iterate over them
    for batch in spacy.util.minibatch(TRAINING_DATA):
        # Split the batch into texts and annotations
        texts = [text for text, annotation in batch]
        annotations = [annotation for text, annotation in batch]
        # Update the model
        nlp.update(texts, annotations)

# Save the model (path_to_model is a placeholder for an output directory of your choice)
nlp.to_disk(path_to_model)

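Run against the pretrained model loaded at the start of the chapter, this loop fails in nlp.update, because that model's entity recognizer has no GADGET label yet. A new label has to be added with add_label before training, as the next section does for a blank pipeline.

KeyError: "[E022] Could not find a transition with the name 'B-GADGET' in the NER model."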

## Set up a pipeline from scratch

In [13]:
# Start with blank English model
nlp = spacy.blank('en')
# Create blank entity recognizer and add it to the pipeline
ner = nlp.create_pipe('ner')
nlp.add_pipe(ner)
# Add a new label
ner.add_label('GADGET')

# Start the training
nlp.begin_training()
# Train for 10 iterations
for itn in range(10):
    random.shuffle(TRAINING_DATA)
    # Divide examples into batches
    for batch in spacy.util.minibatch(TRAINING_DATA, size=2):
        texts = [text for text, annotation in batch]
        annotations = [annotation for text, annotation in batch]
        # Update the model
        nlp.update(texts, annotations)