* Prefix: Character(s) at the beginning of a token, e.g. $ ( " { [
* Suffix: Character(s) at the end of a token, e.g. km ) , . ! "
* Infix: Character(s) in between, e.g. - -- / ...
* Exception: Special-case rule that either splits a string into several tokens or prevents a token from being split when the punctuation rules are applied.
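
To make the four rule types concrete, here is a minimal pure-Python sketch of how they can interact. It is only an illustration under simplifying assumptions, not spaCy's actual implementation: the names `PREFIXES`, `SUFFIXES`, `INFIX_RE`, `EXCEPTIONS`, `split_chunk` and `toy_tokenize` are hypothetical, and the rule sets contain just the handful of characters listed above.

```python
import re

# Toy rule sets (assumptions for illustration; spaCy's real rules are far larger).
PREFIXES = ('$', '(', '"', '{', '[')
SUFFIXES = ('km', ')', ',', '.', '!', '"')
INFIX_RE = re.compile(r'(--|\.\.\.|-|/)')
# Exceptions either keep a string whole or split it in a fixed way.
EXCEPTIONS = {"L.A.": ["L.A."], "U.S.": ["U.S."], "We're": ["We", "'re"]}

def split_chunk(chunk):
    """Split one whitespace-delimited chunk into tokens."""
    leading, trailing = [], []
    # Peel prefixes and suffixes until an exception matches or nothing applies.
    while chunk and chunk not in EXCEPTIONS:
        prefix = next((p for p in PREFIXES if chunk.startswith(p)), None)
        suffix = next((s for s in SUFFIXES if chunk.endswith(s)), None)
        if prefix:
            leading.append(prefix)
            chunk = chunk[len(prefix):]
        elif suffix:
            trailing.insert(0, suffix)
            chunk = chunk[:-len(suffix)]
        else:
            break
    if chunk in EXCEPTIONS:
        middle = list(EXCEPTIONS[chunk])
    else:
        # Split what is left on infixes, keeping the infix itself as a token.
        middle = [piece for piece in INFIX_RE.split(chunk) if piece]
    return leading + middle + trailing

def toy_tokenize(text):
    """Split on whitespace first, then apply the rules to each chunk."""
    tokens = []
    for chunk in text.split():
        tokens.extend(split_chunk(chunk))
    return tokens

print(toy_tokenize('"We\'re moving to L.A.!"'))
# ['"', 'We', "'re", 'moving', 'to', 'L.A.', '!', '"']
```

Without the `L.A.` entry in `EXCEPTIONS`, the suffix rule would strip the final period and yield `L.A` followed by `.`, which is the kind of over-splitting that exception rules exist to prevent.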

In [3]:
import spacy
nlp = spacy.load('en_core_web_sm')

In [2]:
mystring = '"We\'re moving to L.A.!"'

In [3]:
print(mystring)

"We're moving to L.A.!"


In [4]:
doc = nlp(mystring)

In [5]:
for token in doc:
    print(token.text)

"
We
're
moving
to
L.A.
!
"


In [6]:
doc2 = nlp("We're here to help! Send snail-mail, email support@oursite.com or visit us at http://www.oursite.com!")

In [8]:
for t in doc2:
    print(t)

We
're
here
to
help
!
Send
snail
-
mail
,
email
support@oursite.com
or
visit
us
at
http://www.oursite.com
!


In [9]:
doc3 = nlp("A 5km NYC cab ride costs $10.30")

In [10]:
for t in doc3:
    print(t)

A
5
km
NYC
cab
ride
costs
$
10.30


In [11]:
doc4 = nlp("Let's visit St.Louis in the U.S. next year.")

In [12]:
for t in doc4:
    print(t)

Let
's
visit
St
.
Louis
in
the
U.S.
next
year
.


In [13]:
len(doc4)

12

In [14]:
doc4.vocab

<spacy.vocab.Vocab at 0x2afd7e89f48>

In [None]:
len(doc4.vocab)  # number of entries (lexemes) in the vocabulary of the model we loaded

57852

In [16]:
doc5 = nlp("It is better to give than receive.")

In [17]:
doc5[0]

It

In [18]:
doc5[4:8]

give than receive.

In [None]:
doc5[0] = 'test'  # assignment is not supported

TypeError: 'spacy.tokens.doc.Doc' object does not support item assignment

In [20]:
doc6 = nlp('Apple to build a Hong Kong factory for $6 million')

In [22]:
for t in doc6:
    print(t.text, end=' |')

Apple |to |build |a |Hong |Kong |factory |for |$ |6 |million |

In [26]:
for entity in doc6.ents:
    print(entity)
    print(entity.label_)
    print(str(spacy.explain(entity.label_)))
    print('\n')

Apple
ORG
Companies, agencies, institutions, etc.


Hong Kong
GPE
Countries, cities, states


$6 million
MONEY
Monetary values, including unit




In [27]:
doc9 = nlp('Autonomous cars shift insurance liability toward manufacturers.')

In [28]:
for chunk in doc9.noun_chunks:
    print(chunk)

Autonomous cars
insurance liability
manufacturers


In [1]:
from spacy import displacy

In [4]:
doc = nlp("Apple is going to build a U.K. factory for $6 million.")

In [9]:
displacy.render(doc, style='dep', jupyter=True, options={'distance':100})

In [10]:
doc = nlp("Over the last quarter Apple sold nearly 20 thousand iPods for a profit of $6 million")

In [12]:
displacy.render(doc, style='ent', jupyter=True)

In [None]:
doc = nlp("This is a sentence.")
displacy.serve(doc, style='dep')  # blocks while serving; open http://127.0.0.1:5000/ in a browser


    Serving on port 5000...
    Using the 'dep' visualizer



127.0.0.1 - - [09/Nov/2024 10:00:02] "GET / HTTP/1.1" 200 3057
127.0.0.1 - - [09/Nov/2024 10:00:02] "GET /favicon.ico HTTP/1.1" 200 3057



    Shutting down server on port 5000.

