## Example notebook
In this notebook we show three ways to use the spaCy annotator:

- annotation without a spaCy model
- annotation with a spaCy model
- annotation with a spaCy model and an EntityRuler

Enjoy :)

In [None]:
# python -m pip install -e .

In [None]:
import sys
sys.path.append('../')

In [None]:
import pandas as pd
import spacy_annotator as spa

## Import data

In [None]:
df = pd.DataFrame({
    "text": [
        "New york is lovely, Milan is nice, but london is amazing!",
        "Stockholm is too cold. Ingrid Bergman says so."
    ]})

df

## Annotation _without_ spaCy model
Basic use of the spaCy annotator: the user enters the entities for each label manually.

In [None]:
annotator = spa.Annotator(labels=["GPE", "PERSON"])

In [None]:
annotator.instructions

In [None]:
df_labels = annotator.annotate(df=df, col_text="text")

### Inspect output

In [None]:
df_labels
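
To help read the output, the sketch below shows how character-offset entity spans can be built and checked by hand. It assumes each annotation is a `(start, end, label)` triple over the raw text, which is the convention spaCy uses for NER training data; the exact column layout of `df_labels` may differ. The helper `find_spans` is hypothetical and not part of `spacy_annotator`.

```python
import re


def find_spans(text, entities):
    """Return sorted (start, end, label) character spans for each entity string.

    `entities` maps a label to a list of surface strings, e.g. {"GPE": ["Milan"]}.
    Matching is literal and case-insensitive.
    """
    spans = []
    for label, names in entities.items():
        for name in names:
            for match in re.finditer(re.escape(name), text, flags=re.IGNORECASE):
                spans.append((match.start(), match.end(), label))
    return sorted(spans)


text = "Stockholm is too cold. Ingrid Bergman says so."
spans = find_spans(text, {"GPE": ["Stockholm"], "PERSON": ["Ingrid Bergman"]})

# Slicing the text with each span should give back the entity string.
for start, end, label in spans:
    print(label, repr(text[start:end]))
```

Slicing the text with the offsets is a quick sanity check that a span covers exactly the intended characters.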

## Annotation _with_ spaCy model
Use a small, medium, or large spaCy model, or even a transformer model, to pre-label your data.

In [None]:
import spacy

In [None]:
nlp = spacy.load("en_core_web_trf")

In [None]:
annotator = spa.Annotator(labels=["GPE", "PERSON"], model=nlp)

In [None]:
df_labels = annotator.annotate(df=df, col_text="text", shuffle=True)

### Inspect output

In [None]:
df_labels

## Annotation _with_ spaCy model _and_ EntityRuler
Use a combination of a spaCy model and entity ruler patterns to label entities that even a large model might miss.

In [None]:
patterns = [
    {"label": "GPE", "pattern": "london"}, # this one isn't picked up by "ner"
    {"label": "GPE", "pattern": "Stockholm"},
    {"label": "PERSON", "pattern": "Humphrey Bogart"},
]

In [None]:
ruler = nlp.add_pipe("entity_ruler", config={"phrase_matcher_attr": "LOWER"}, before="ner")
ruler.add_patterns(patterns)
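
To see the ruler on its own, here is a minimal sketch that uses a blank English pipeline instead of the transformer model, so no model download is needed. The names `nlp_demo` and `ruler_demo` are chosen for illustration and keep the `nlp` object above untouched. With `phrase_matcher_attr` set to `"LOWER"`, phrase patterns are matched case-insensitively.

```python
import spacy

# Blank pipeline: tokenizer only, no statistical "ner" component.
nlp_demo = spacy.blank("en")

# Same ruler configuration as above, without `before="ner"`
# because this pipeline has no "ner" component.
ruler_demo = nlp_demo.add_pipe("entity_ruler", config={"phrase_matcher_attr": "LOWER"})
ruler_demo.add_patterns([{"label": "GPE", "pattern": "london"}])

doc = nlp_demo("New york is lovely, Milan is nice, but london is amazing!")
found = [(ent.text, ent.label_) for ent in doc.ents]
print(found)
```

In the full pipeline, placing the ruler before `"ner"` means the statistical component keeps the ruler's spans and predicts entities around them.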

In [None]:
annotator = spa.Annotator(labels=["GPE", "PERSON"], model=nlp)

In [None]:
df_labels = annotator.annotate(df=df, col_text="text", shuffle=True)

### Inspect output and save the annotations in .spacy format for training a spaCy 3 pipeline

In [None]:
df_labels

In [None]:
# saves to the current working directory with the default name 'annotations.spacy'
annotator.to_spacy(df_labels)

In [None]:
# saves to the current working directory with the specified name or path
annotator.to_spacy(df_labels, "spacy_labels.spacy")

In [None]:
# saves to a specified directory
# raw string, so the backslashes in the Windows path are not treated as escape sequences
annotator.to_spacy(df_labels, r"C:\pick_your_directory\spacy_labels.spacy")
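
Backslashes in Windows paths can be misread as escape sequences, and hard-coded separators do not carry over to other operating systems. As one possible alternative, the sketch below builds the output path with `pathlib`. The directory name `annotations_out` is hypothetical, and the commented-out call assumes `to_spacy` accepts the path as a string, as in the cells above.

```python
from pathlib import Path

# Hypothetical output directory, created if it does not exist yet.
output_dir = Path("annotations_out")
output_dir.mkdir(parents=True, exist_ok=True)

# The "/" operator joins path parts with the separator of the current OS.
output_path = output_dir / "spacy_labels.spacy"
print(output_path)

# annotator.to_spacy(df_labels, str(output_path))
```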

In [None]:
#fin