# Binder for custom-ner-de

This binder enables you to train, evaluate, and apply a custom NER model using spaCy. The only prerequisite is that you know how to annotate documents in Transkribus and export them. That's it!

## Train a custom NER model using spaCy with German texts annotated in Transkribus

Enter the URL of an annotated PAGE XML ZIP file exported from Transkribus (for example, a public link to a file on SWITCHdrive):

In [1]:
ZIP_URL = "https://drive.switch.ch/index.php/s/nIF5agoktDJP3li/download"  # change this sample URL to your own URL
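Before training, it can help to confirm that the archive behind the URL really contains XML files. The following is a minimal sketch using only the standard library. The helper name `list_page_xml` is hypothetical and not part of `custom_ner_de`, and it assumes that the Transkribus export stores its pages as `.xml` members of the ZIP file.

```python
import io
import zipfile
from typing import List


def list_page_xml(zip_bytes: bytes) -> List[str]:
    """Return the sorted names of all .xml members in a ZIP archive.

    Hypothetical helper for illustration; the archive is passed as raw bytes.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.lower().endswith(".xml")
        )
```

The bytes could be obtained with `urllib.request.urlopen(ZIP_URL).read()`. An empty result suggests that the link does not point to a PAGE XML export.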

Optional inputs:

In [2]:
WORD_REMOVE = None  # list of words to be removed from the list of entities (false positives), e.g., WORD_REMOVE = ["Händeklatschen", "Salpeter"]
PERSON_NAMES = None  # list of persons to be added to the model, e.g., PERSON_NAMES = ["Max Mustermann", "Ada Lovelace"]
LOCATION_NAMES = None  # list of locations to be added to the model, e.g., LOCATION_NAMES = ["Basel", "Mittelerde"]
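Each optional input is either `None` or a list of strings, as the examples in the comments show. A common slip is to pass a bare string such as `"Basel"` instead of `["Basel"]`. The sketch below checks the shape of a value before training. The function `check_optional_list` is a hypothetical helper, not part of `custom_ner_de`, and whether the library performs a similar check itself is not known from this notebook.

```python
from typing import List, Optional


def check_optional_list(name: str, value: Optional[List[str]]) -> None:
    """Raise if value is neither None nor a list of non-empty strings.

    Hypothetical helper for illustration only.
    """
    if value is None:
        return
    if not isinstance(value, list):
        raise TypeError(
            f"{name} must be None or a list of strings, "
            f"got {type(value).__name__}"
        )
    for item in value:
        if not isinstance(item, str) or not item.strip():
            raise ValueError(f"{name} contains an invalid entry: {item!r}")
```

For example, `check_optional_list("LOCATION_NAMES", ["Basel", "Mittelerde"])` passes silently, while `check_optional_list("LOCATION_NAMES", "Basel")` raises a `TypeError`.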

Train the model (this can take some time):

In [3]:
from custom_ner_de.client import Client
my_client = Client()
my_client.train_model(zip_url=ZIP_URL,
                      word_remove=WORD_REMOVE,
                      person_names=PERSON_NAMES,
                      location_names=LOCATION_NAMES,
                      epochs=1)

Downloading Zip file... 

AssertionError: 

Save the model to the `/custom-ner-de/user_output/models/` directory:

In [None]:
my_client.save_model()

If you want to keep the model, you must download it from this directory to your local machine (Binder will reset after you close your browser).
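Since a model is saved as a directory, packing it into a single ZIP file first can make the download easier. This is a minimal sketch using the standard library. The helper `archive_directory` and the archive name used in the usage note are assumptions for illustration and are not part of `custom_ner_de`.

```python
import shutil
from pathlib import Path


def archive_directory(source_dir: str, archive_base: str) -> str:
    """Pack source_dir into <archive_base>.zip and return the archive path.

    Hypothetical helper; archive_base should lie outside source_dir so the
    archive does not end up inside itself.
    """
    source = Path(source_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"No such directory: {source}")
    return shutil.make_archive(archive_base, "zip", root_dir=str(source))
```

A call such as `archive_directory("/custom-ner-de/user_output/models", "/custom-ner-de/user_output/models_backup")` would then produce `models_backup.zip` next to the models directory, ready to download from the file browser.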


## Evaluate a custom NER model

Optional inputs:

In [None]:
MODEL_PATH = None  # complete path to custom NER model directory (if no input is provided, the previously trained model is loaded)

Evaluate model:

In [None]:
my_client.evaluate_model(model_path=MODEL_PATH)
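NER models are commonly evaluated with precision, recall, and F1 over entity spans. Whether `evaluate_model` reports exactly these figures is an assumption, since the notebook does not show its output. The sketch below only illustrates how the three scores relate to each other. The function `prf` and the `(start, end, label)` span representation are hypothetical.

```python
from typing import Set, Tuple

# A span is (start offset, end offset, entity label); illustrative only.
Span = Tuple[int, int, str]


def prf(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for exact-match entity spans."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

A predicted span counts as correct here only if both its boundaries and its label match a gold span exactly. This is a strict criterion, so partial overlaps are scored as errors.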

## Apply the custom NER model to new German texts transcribed in Transkribus

Enter the URL of a plain text file (for example, a public link to a file on SWITCHdrive) to which the custom NER model should be applied:

In [None]:
TEXT_URL = "https://drive.switch.ch/index.php/s/4eBBIImulcOfMf7/download"  # change this sample URL to your own URL

Optional inputs:

In [None]:
MODEL_PATH = None  # complete path to custom NER model directory (if no input is provided, the previously trained model is loaded)

Apply the model to the text:

In [None]:
my_client.apply_model(text_url=TEXT_URL,
                      model_path=MODEL_PATH)

Display the result:

In [None]:
my_client.result

Save the result to the `/custom-ner-de/user_output/results/` directory:

In [None]:
my_client.save_result2csv()

If you want to keep the result, you must download it from this directory to your local machine (Binder will reset after you close your browser).
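To check a saved result before downloading it, the CSV file can be previewed with the standard library. This is a sketch under assumptions: the helper `preview_csv` is hypothetical, and the file is assumed to be UTF-8 encoded and comma-separated with a header row, none of which is confirmed by this notebook.

```python
import csv
from typing import List, Tuple


def preview_csv(path: str, n_rows: int = 5) -> Tuple[List[str], List[List[str]]]:
    """Return the header and up to n_rows data rows of a CSV file.

    Hypothetical helper; assumes UTF-8 and comma as the delimiter.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # empty list if the file has no rows
        rows = [row for _, row in zip(range(n_rows), reader)]
    return header, rows
```

The path to pass is the file that `save_result2csv()` wrote to `/custom-ner-de/user_output/results/`. Its exact name is not stated here, so look it up in the file browser first.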