Jupyter setup: Make the output more readable.

In [None]:
%%html
<style>
.rendered_html {
    font-size: 30px;
}
td {
    font-size: 20px;
}
</style>

Jupyter setup: Import [pandas](https://pandas.pydata.org/) for table formatting.

In [None]:
import pandas as pd
table = pd.DataFrame

For this demo, both [spaCy](https://spacy.io/) and [neuralcoref](https://huggingface.co/coref/) must be installed, along with neuralcoref's `en_coref_lg` model.

In [None]:
import spacy
nlp = spacy.load('en_coref_lg')

Run the full spaCy pipeline on some sample text.

In [None]:
doc = nlp(u'''Dr. Phil visited
    China. She liked the country a lot.''')

Print out each of the sentences.

In [None]:
[sent.text for sent in doc.sents]

Print out each of the tokens in each sentence.

In [None]:
[[token.text for token in sent]
 for sent in doc.sents]

Print out a table of tokens and their lemmas.

In [None]:
table([[token.text, token.lemma_]
       for token in doc])

Print the shape of a sample word embedding.

In [None]:
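# Look tokens up by text rather than by position: the line break in the
# sample text yields a whitespace token, which shifts positional indices.
by_text = {token.text: token for token in doc}
visited = by_text['visited']
china = by_text['China']
country = by_text['country']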

visited.vector.shape

Print some embedding similarities.

In [None]:
sim = china.similarity
sim(visited), sim(country), sim(nlp("India")[0])
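
By default, spaCy's `similarity` is the cosine similarity of the two word vectors. As a minimal sketch of that computation, assuming made-up 3-dimensional vectors (the `toy_*` names and their values are hypothetical and are not taken from the `en_coref_lg` model):

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical toy vectors, chosen only for illustration.
toy_china = [1.0, 2.0, 0.0]
toy_india = [2.0, 4.0, 0.0]    # same direction as toy_china
toy_visited = [0.0, 0.0, 3.0]  # orthogonal to toy_china

same_direction = cosine_similarity(toy_china, toy_india)
orthogonal = cosine_similarity(toy_china, toy_visited)
```

Vectors pointing the same way score close to 1 regardless of their length, and orthogonal vectors score 0, which is why related words such as two country names tend to score higher than an unrelated verb.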

Print out a table of tokens and their part-of-speech tags.

In [None]:
table([[token.text, token.pos_]
       for token in doc])

Print out a table of named entities and their types.

In [None]:
table([[entity.text, entity.label_]
       for entity in doc.ents])

Use spaCy's visualizer to display the named entities.

In [None]:
from spacy import displacy  # explicit import; `import spacy` alone may not load the submodule
displacy.render(doc, style='ent', jupyter=True)

Print out a table of tokens and their grammatical heads.

In [None]:
table([[token.text, token.head.text]
       for token in doc])
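
Following `head` links repeatedly leads from any token to the root of its sentence, and spaCy marks the root by making it its own head. The sketch below illustrates that walk in plain Python. The head indices are written by hand for the second sample sentence and are an assumption, not parser output; `tokens` and `path_to_root` are hypothetical names.

```python
# Hand-written heads for "She liked the country a lot." (not parser output).
# Each entry is (token text, index of its head); the root points to itself,
# mirroring the convention spaCy uses for `token.head`.
tokens = [
    ("She", 1),
    ("liked", 1),    # root: its head is itself
    ("the", 3),
    ("country", 1),
    ("a", 5),
    ("lot", 1),
    (".", 1),
]

def path_to_root(index):
    """Follow head links from a token until the root is reached."""
    path = [tokens[index][0]]
    while tokens[index][1] != index:
        index = tokens[index][1]
        path.append(tokens[index][0])
    return path

the_path = path_to_root(2)  # start at "the"
```

With these hand-assigned heads, the walk from "the" passes through "country" and stops at "liked".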

Use spaCy's visualizer to display the dependency tree of each sentence.

In [None]:
from spacy import displacy  # explicit import; `import spacy` alone may not load the submodule
displacy.render(list(doc.sents), style='dep', jupyter=True)

Print out a table of tokens and, for each coreference cluster a token belongs to, the text of the cluster's main mention.

In [None]:
table([[token.text,
        [cluster.main.text
         for cluster in token._.coref_clusters]]
       for token in doc])
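
To make the shape of that table concrete, here is a plain-Python sketch of the token-to-main-mention lookup. Everything in it is an assumption for illustration: the token list and clusters are written by hand (not neuralcoref output), the whitespace token is left out, and the first mention of each cluster simply stands in for the main one.

```python
# Hand-written tokens and clusters for the sample text (not model output).
tokens = ["Dr.", "Phil", "visited", "China", ".",
          "She", "liked", "the", "country", "a", "lot", "."]

# Each cluster is a list of mentions; a mention is a (start, end) token
# range with an exclusive end. Simplification: the first mention is "main".
clusters = [
    [(0, 2), (5, 6)],   # "Dr. Phil", "She"
    [(3, 4), (7, 9)],   # "China", "the country"
]

def main_mentions(tokens, clusters):
    """Map each token index to the main-mention texts of its clusters."""
    result = {i: [] for i in range(len(tokens))}
    for mentions in clusters:
        start, end = mentions[0]
        main_text = " ".join(tokens[start:end])
        for m_start, m_end in mentions:
            for i in range(m_start, m_end):
                result[i].append(main_text)
    return result

rows = [[tokens[i], mains]
        for i, mains in main_mentions(tokens, clusters).items()]
```

Tokens outside every mention get an empty list, which matches the empty cells that the table above shows for tokens with no coreference cluster.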