In [1]:
import spacy

from spacy.lang.pt import Portuguese
from spacy import displacy

In [2]:
text = "O rato roeu a roupa do rei de Roma."

In [3]:
nlp = Portuguese()

## Tokenization

In [4]:
doc = nlp(text)

for token in doc:
    print(token.text)
    

O
rato
roeu
a
roupa
do
rei
de
Roma
.


## Part of Speech (POS) Tagging

A word’s part of speech defines its function within a sentence. A noun, for example, identifies an object. 
An adjective describes an object. A verb describes action. Identifying and tagging each word’s part of 
speech in the context of a sentence is called Part-of-Speech Tagging, or POS Tagging.

For this we need spaCy's `pt_core_news_sm` model, which contains the vocabulary and grammatical 
information required for the analysis. We load the model with `spacy.load()`, run the text through it, 
and loop over the tokens of the resulting `doc`, reading each word's part of speech from `.pos_`.

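In [None]:
# The blank Portuguese() pipeline has no tagger, so load the trained model first.
nlp = spacy.load('pt_core_news_sm')

text = "O rato roeu a roupa do rei de Roma."

doc = nlp(text)

# Print each token next to its coarse-grained part-of-speech tag.
for token in doc:
    print(token.text, token.pos_)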

## Named Entity Recognition (NER)

In [5]:
nlp = spacy.load('pt_core_news_sm')

In [6]:
text = "O Fantástico teve acesso exclusivo à investigação da Aeronáutica que apontou que o sargento Manoel Silva Rodrigues, preso na Espanha em junho com 39 kg de cocaína, entrou no avião ainda desligado e não passou a bagagem pelos procedimentos de segurança previstos. O militar estava na comitiva presidencial que levava o presidente Jair Bolsonaro – que estava em outra aeronave – ao encontro do G20 no Japão."

doc = nlp(text)

In [7]:
displacy.render(doc, style="ent", jupyter=True)
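
The displaCy view only renders inside a notebook, so it can help to have the same entities as plain text. The sketch below is one possible way to do that; `entity_rows` and `print_entities` are hypothetical helper names introduced here, not part of spaCy. They rely only on attributes that spaCy entity spans expose (`text`, `label_`, `start_char`, `end_char`) and assume `doc` was produced by a pipeline with an NER component, such as `pt_core_news_sm`.

```python
def entity_rows(doc):
    """Return (text, label, start_char, end_char) for every entity in doc.ents.

    Hypothetical helper: `doc` is assumed to be a spaCy Doc (or any object
    whose `.ents` items expose the same four attributes).
    """
    return [(ent.text, ent.label_, ent.start_char, ent.end_char) for ent in doc.ents]


def print_entities(doc):
    """Print one line per entity: label, character span, entity text."""
    for text, label, start, end in entity_rows(doc):
        print(f"{label:<6} {start:>4}-{end:<4} {text}")
```

In the notebook, calling `print_entities(doc)` on the `doc` from the cell above should list the same spans that displaCy highlights; which spans and labels appear depends on the model version.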

## Dependency Parsing

Dependency parsing is a language processing technique that helps us determine the meaning 
of a sentence by analyzing how it is constructed and how the individual words relate to each other.

Consider, for example, the sentence “Bill throws the ball.” We have two nouns (Bill and ball) 
and one verb (throws). But we can’t just look at these words individually, or we may end up thinking that 
the ball is throwing Bill! To understand the sentence correctly, we need to look at the word order 
and sentence structure, not just the words and their parts of speech.
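
To make the "Bill throws the ball" example concrete, here is a minimal sketch that uses a hand-annotated parse rather than parser output. The `PARSE` table and the `dependents` helper are assumptions made for illustration; the labels follow the Universal Dependencies convention (`nsubj` for subject, `obj` for object), and the root points to itself, which mirrors how spaCy exposes `token.dep_` and `token.head`.

```python
# Hand-annotated dependency parse of "Bill throws the ball."
# Each row: (index, word, dependency label, index of the head word).
# This is an illustrative annotation, not the output of a parser.
PARSE = [
    (0, "Bill",   "nsubj", 1),
    (1, "throws", "ROOT",  1),  # the root is its own head
    (2, "the",    "det",   3),
    (3, "ball",   "obj",   1),
    (4, ".",      "punct", 1),
]


def dependents(parse, head_index, label):
    """Return the words attached to `head_index` with dependency `label`."""
    return [
        word
        for index, word, dep, head in parse
        if head == head_index and dep == label and index != head_index
    ]


# Find the main verb, then read off who does the throwing and what is thrown.
root_index = next(index for index, _, dep, _ in PARSE if dep == "ROOT")
verb = PARSE[root_index][1]
subjects = dependents(PARSE, root_index, "nsubj")
objects = dependents(PARSE, root_index, "obj")

print(f"{subjects[0]} --{verb}--> {objects[0]}")
```

The parts of speech alone say only that there are two nouns and a verb; the `nsubj` and `obj` arcs are what record that Bill is the thrower and the ball is the thing thrown.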

In [8]:
nlp = spacy.load('pt_core_news_sm')

In [12]:
text = "O rato roeu a roupa do rei de Roma."

doc = nlp(text)

In [15]:
for chunk in doc.noun_chunks:
    print(chunk.text, chunk.root.text, chunk.root.dep_, chunk.root.head.text)

In [16]:
displacy.render(doc, style="dep", jupyter=True)