# Annotating News Headlines
<img src="./screencast.gif"/>

We will load a sample set of news headlines, tag parts of speech using `spaCy`, and build an annotator for marking records that require a second check.

In [None]:
import json
with open('./data.json', 'r') as fp:
    data = [x['text'] for x in json.load(fp)]
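
The loading cell above assumes `data.json` holds a list of objects that each carry a `text` field. A minimal sketch of that layout, using made-up headlines and a temporary file rather than the real dataset:

```python
import json
import os
import tempfile

# Hypothetical sample records; the real data.json may carry extra fields.
sample = [
    {"text": "Local council approves new bike lanes"},
    {"text": "Storm delays flights across the region"},
]

# Write the sample to a temporary file so the sketch does not touch data.json.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.json")
    with open(path, "w") as fp:
        json.dump(sample, fp)

    # Same loading pattern as the cell above: keep only the headline text.
    with open(path, "r") as fp:
        headlines = [x["text"] for x in json.load(fp)]

print(headlines)
```

If a record lacks the `text` key, the list comprehension raises a `KeyError`, so it is worth checking the file's structure before loading a larger dataset.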

We will use spaCy's small English model (`en_core_web_sm`) for portability.

In [None]:
import spacy
nlp = spacy.load('en_core_web_sm')

In [None]:
# Run every headline through the pipeline; nlp.pipe processes the texts in batches
annotated_data = list(nlp.pipe(data))

In [None]:
datum = annotated_data[0]  # parsed Doc for the first headline
word = datum[2]  # third token of that headline

In [None]:
word.pos_

Using `ipymarkup`, we will display the words and their corresponding parts of speech.

In [None]:
from IPython.display import display
from ipymarkup import BoxLabelMarkup as Markup, Span


def display_record(record):
    # token.idx is the token's character offset within the headline text
    spans = [Span(token.idx, token.idx + len(token), token.pos_) for token in record]
    markup = Markup(record.text, spans)
    display(markup)


display_record(annotated_data[0])
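
`display_record` relies on each token's character offset (`idx`) and its length to place the labels. The sketch below reproduces that arithmetic without spaCy, using a hypothetical `FakeToken` stand-in and hand-assigned tags, to show that every `(start, stop)` pair slices the token back out of the headline:

```python
from collections import namedtuple

# Hypothetical stand-in for a spaCy token: text, character offset, POS tag.
FakeToken = namedtuple("FakeToken", ["text", "idx", "pos_"])

headline = "Markets rally today"
# Tags are assigned by hand for illustration, not produced by a model.
tokens = [
    FakeToken("Markets", 0, "NOUN"),
    FakeToken("rally", 8, "VERB"),
    FakeToken("today", 14, "NOUN"),
]

# Same arithmetic as display_record: start offset, stop offset, label.
spans = [(t.idx, t.idx + len(t.text), t.pos_) for t in tokens]

for start, stop, label in spans:
    print(headline[start:stop], label)
```

The stop offset is exclusive, which matches Python slicing and is why `idx + len(token)` needs no adjustment.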

## Assemble our annotator
Now we can assemble our checker using `ipyannotate`. For this simple task we only need `Ok` and `Check` options, although `ipyannotate` is flexible enough to build more complex annotators.

In [None]:
from ipyannotate import annotate
annotation = annotate(annotated_data, display=display_record)
annotation

In [None]:
annotation.tasks
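
Once some headlines have been labeled, the flagged ones can be pulled out of the task list. The sketch below is only an illustration under stated assumptions: it uses a hypothetical `FakeTask` stand-in with `output` and `value` attributes, and assumes the chosen label is stored as the string `"Check"`. Both should be verified against the objects that `annotation.tasks` actually returns.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeTask:
    """Hypothetical stand-in for an annotation task (attribute names assumed)."""
    output: str                  # the record shown to the annotator
    value: Optional[str] = None  # the chosen label, or None if not yet labeled


# Made-up tasks for illustration only.
tasks = [
    FakeTask("Markets rally today", "Ok"),
    FakeTask("Storm delays flights", "Check"),
    FakeTask("Council approves bike lanes"),
]

# Records flagged for a second look.
needs_check = [t.output for t in tasks if t.value == "Check"]
# Records the annotator has not reached yet.
unlabeled = [t.output for t in tasks if t.value is None]

print(needs_check)
print(unlabeled)
```

Keeping unlabeled records separate from flagged ones avoids treating a skipped headline as if it had passed review.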