# Rule-Based Matching

In [1]:
# Rule-based matching: we define a rule (a pattern) and search the document for every span that matches it.
# The matches help us understand what the document conveys and let us build queries that extract information from it.

In [2]:
import spacy

In [3]:
nlp = spacy.load('en_core_web_sm')

In [4]:
type(nlp)

spacy.lang.en.English

## Accessing the Text

In [5]:
# To pass multi-line text to nlp(), wrap it in triple quotes ('''...''').
# With single quotes ('...'), the whole text must fit on one line, as in previous sessions.

In [6]:
doc = nlp('''ChatGPT (Chat Generative Pre-trained Transformer)[1] is a chatbot launched by OpenAI in November 2022. It is built on top of OpenAI's GPT-3 family of large language models, and is fine-tuned (an approach to transfer learning)[2] with both supervised and reinforcement learning techniques.

ChatGPT was launched as a prototype on November 30, 2022, and quickly garnered attention for its detailed responses and articulate answers across many domains of knowledge. Its uneven factual accuracy was identified as a significant drawback.[3] Following the release of ChatGPT, OpenAI was valued at $29 billion.[4]

Training

Pioneer Building, San Francisco, headquarters of OpenAI

Sam Altman, CEO of OpenAI
ChatGPT was fine-tuned on top of GPT-3.5 using supervised learning as well as reinforcement learning.[5] Both approaches used human trainers to improve the model's performance. In the case of supervised learning, the model was provided with conversations in which the trainers played both sides: the user and the AI assistant. In the reinforcement step, human trainers first ranked responses that the model had created in a previous conversation. These rankings were used to create 'reward models' that the model was further fine-tuned on using several iterations of Proximal Policy Optimization (PPO).[6][7] Proximal Policy Optimization algorithms present a cost-effective benefit to trust region policy optimization algorithms; they negate many of the computationally expensive operations with faster performance.[8][9] The models were trained in collaboration with Microsoft on their Azure supercomputing infrastructure.

In addition, OpenAI continues to gather data from ChatGPT users that could be used to further train and fine-tune ChatGPT. Users are allowed to upvote or downvote the responses they receive from ChatGPT; upon upvoting or downvoting, they can also fill out a text field with additional feedback.[10][11][12]

Features and limitations

Cropped screenshot of a conversation with ChatGPT, December 30, 2022
Although the core function of a chatbot is to mimic a human conversationalist, ChatGPT is versatile. For example, it has the ability to write and debug computer programs; to compose music, teleplays, fairy tales, and student essays; to answer test questions (sometimes, depending on the test, at a level above the average human test-taker);[13] to write poetry and song lyrics;[14] to emulate a Linux system; to simulate an entire chat room; to play games like tic-tac-toe; and to simulate an ATM.[15] ChatGPT's training data includes man pages and information about Internet phenomena and programming languages, such as bulletin board systems and the Python programming language.[15]

In comparison to its predecessor, InstructGPT, ChatGPT attempts to reduce harmful and deceitful responses.[16] In one example, whereas InstructGPT accepts the premise of the prompt "Tell me about when Christopher Columbus came to the US in 2015" as being truthful, ChatGPT acknowledges the counterfactual nature of the question and frames its answer as a hypothetical consideration of what might happen if Columbus came to the U.S. in 2015, using information about Columbus' voyages and facts about the modern world – including modern perceptions of Columbus' actions.[6]

Unlike most chatbots, ChatGPT remembers previous prompts given to it in the same conversation; journalists have suggested that this will allow ChatGPT to be used as a personalized therapist.[17] To prevent offensive outputs from being presented to and produced from ChatGPT, queries are filtered through OpenAI's company-wide moderation API,[18][19] and potentially racist or sexist prompts are dismissed.[6][17]

ChatGPT suffers from multiple limitations. OpenAI acknowledged that ChatGPT "sometimes writes plausible-sounding but incorrect or nonsensical answers".[6] This behavior is common to large language models and is called hallucination.[20] The reward model of ChatGPT, designed around human oversight, can be over-optimized and thus hinder performance, otherwise known as Goodhart's law.[21] ChatGPT has limited knowledge of events that occurred after 2021. According to the BBC, as of December 2022 ChatGPT is not allowed to "express political opinions or engage in political activism".[22] Yet, research suggests that ChatGPT exhibits a pro-environmental, left-libertarian orientation when prompted to take a stance on political statements from two established voting advice applications.[23] In training ChatGPT, human reviewers preferred longer answers, irrespective of actual comprehension or factual content.[6] Training data also suffers from algorithmic bias, which may be revealed when ChatGPT responds to prompts including descriptors of people. In one instance, ChatGPT generated a rap indicating that women and scientists of color were inferior to white and male scientists.[24][25]

Service
ChatGPT was launched on November 30, 2022, by San Francisco-based OpenAI, the creator of DALL·E 2 and Whisper. The service was launched as initially free to the public, with plans to monetize the service later.[26] By December 4, OpenAI estimated ChatGPT already had over one million users.[10] CNBC wrote on December 15, 2022, that the service "still goes down from time to time".[27] The service works best in English, but is also able to function in some other languages, to varying degrees of success.[14] Unlike some other recent high-profile advances in AI, as of December 2022, there is no sign of an official peer-reviewed technical paper about ChatGPT.[28]

According to OpenAI guest researcher Scott Aaronson, OpenAI is working on a tool to attempt to watermark its text generation systems so as to combat bad actors using their services for academic plagiarism or for spam.[29][30] The New York Times relayed in December 2022 that the next version of GPT, GPT-4, has been "rumored" to be launched sometime in 2023.[17]

Reception and implications
Positive reactions
ChatGPT was met in December 2022 with generally positive reviews; The New York Times labeled it "the best artificial intelligence chatbot ever released to the general public".[31] Samantha Lock of The Guardian noted that it was able to generate "impressively detailed" and "human-like" text.[32] Technology writer Dan Gillmor used ChatGPT on a student assignment, and found its generated text was on par with what a good student would deliver and opined that "academia has some very serious issues to confront".[33] Alex Kantrowitz of Slate magazine lauded ChatGPT's pushback to questions related to Nazi Germany, including the claim that Adolf Hitler built highways in Germany, which was met with information regarding Nazi Germany's use of forced labor.[34]

In The Atlantic's "Breakthroughs of the Year" for 2022, Derek Thompson included ChatGPT as part of "the generative-AI eruption" that "may change our mind about how we work, how we think, and what human creativity really is".[35]

Kelsey Piper of the Vox website wrote that "ChatGPT is the general public's first hands-on introduction to how powerful modern AI has gotten, and as a result, many of us are [stunned]" and that ChatGPT is "smart enough to be useful despite its flaws".[36] Paul Graham of Y Combinator tweeted that "The striking thing about the reaction to ChatGPT is not just the number of people who are blown away by it, but who they are. These are not people who get excited by every shiny new thing. Clearly, something big is happening."[37] Elon Musk wrote that "ChatGPT is scary good. We are not far from dangerously strong AI".[36] Musk paused OpenAI's access to a Twitter database pending a better understanding of OpenAI's plans, stating that "OpenAI was started as open-source and non-profit. Neither is still true."[38][39] Musk had co-founded OpenAI in 2015, in part to address existential risk from artificial intelligence, but had resigned in 2018.[39]


Google CEO Sundar Pichai upended the work of numerous internal groups in response to the threat of disruption by ChatGPT.[40]
In December 2022, Google internally expressed alarm at the unexpected strength of ChatGPT and the newly discovered potential of large language models to disrupt the search engine business, and CEO Sundar Pichai "upended" and reassigned teams within multiple departments to aid in its artificial intelligence products, according to The New York Times.[40] The Information reported on January 3, 2023 that Microsoft Bing was planning to add optional ChatGPT functionality into its public search engine, possibly around March 2023.[41][42]

Stuart Cobbe, a chartered accountant in England & Wales, decided to the test the ChatGPT chatbot by entering questions from a sample exam paper on the ICAEW website and then entering its answers back into the online test. ChatGPT scored 42% which, while below the 55% pass mark, was considered a reasonable attempt.[43]''')

In [7]:
doc

ChatGPT (Chat Generative Pre-trained Transformer)[1] is a chatbot launched by OpenAI in November 2022. It is built on top of OpenAI's GPT-3 family of large language models, and is fine-tuned (an approach to transfer learning)[2] with both supervised and reinforcement learning techniques.

ChatGPT was launched as a prototype on November 30, 2022, and quickly garnered attention for its detailed responses and articulate answers across many domains of knowledge. Its uneven factual accuracy was identified as a significant drawback.[3] Following the release of ChatGPT, OpenAI was valued at $29 billion.[4]

Training

Pioneer Building, San Francisco, headquarters of OpenAI

Sam Altman, CEO of OpenAI
ChatGPT was fine-tuned on top of GPT-3.5 using supervised learning as well as reinforcement learning.[5] Both approaches used human trainers to improve the model's performance. In the case of supervised learning, the model was provided with conversations in which the trainers played both sides: the

## Tokenization

In [8]:
for token in doc:
    print(token.text)

ChatGPT
(
Chat
Generative
Pre
-
trained
Transformer)[1
]
is
a
chatbot
launched
by
OpenAI
in
November
2022
.
It
is
built
on
top
of
OpenAI
's
GPT-3
family
of
large
language
models
,
and
is
fine
-
tuned
(
an
approach
to
transfer
learning)[2
]
with
both
supervised
and
reinforcement
learning
techniques
.



ChatGPT
was
launched
as
a
prototype
on
November
30
,
2022
,
and
quickly
garnered
attention
for
its
detailed
responses
and
articulate
answers
across
many
domains
of
knowledge
.
Its
uneven
factual
accuracy
was
identified
as
a
significant
drawback.[3
]
Following
the
release
of
ChatGPT
,
OpenAI
was
valued
at
$
29
billion.[4
]



Training



Pioneer
Building
,
San
Francisco
,
headquarters
of
OpenAI



Sam
Altman
,
CEO
of
OpenAI


ChatGPT
was
fine
-
tuned
on
top
of
GPT-3.5
using
supervised
learning
as
well
as
reinforcement
learning.[5
]
Both
approaches
used
human
trainers
to
improve
the
model
's
performance
.
In
the
case
of
supervised
learning
,
the
model
was
provided
with
conversations
in
whi

In [10]:
print(len(doc))

1607


## Number of Sentences in the doc

In [11]:
sent_count = 0
for sent in doc.sents:
    sent_count = sent_count+1
    print(sent_count,':',sent)
print('\n The Total No. of Sentences:',sent_count)

1 : ChatGPT (Chat Generative Pre-trained Transformer)[1] is a chatbot launched by OpenAI in November 2022.
2 : It is built on top of OpenAI's GPT-3 family of large language models, and is fine-tuned (an approach to transfer learning)[2] with both supervised and reinforcement learning techniques.


3 : ChatGPT was launched as a prototype on November 30, 2022, and quickly garnered attention for its detailed responses and articulate answers across many domains of knowledge.
4 : Its uneven factual accuracy was identified as a significant drawback.[3]
5 : Following the release of ChatGPT, OpenAI was valued at $29 billion.[4]

Training

Pioneer Building, San Francisco, headquarters of OpenAI

Sam Altman, CEO of OpenAI
ChatGPT was fine-tuned on top of GPT-3.5 using supervised learning as well as reinforcement learning.[5]
6 : Both approaches used human trainers to improve the model's performance.
7 : In the case of supervised learning, the model was provided with conversations in which the tr
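When only the count is needed, the loop above can be collapsed into one expression. The sketch below is a stand-in under stated assumptions: it uses a blank English pipeline with spaCy's rule-based `sentencizer` and a short made-up text (the names `nlp_demo` and `demo_doc` are illustrative), so it runs without `en_core_web_sm`. The same `sum(...)` expression should apply to the notebook's `doc`.

```python
import spacy

# Assumption: a blank pipeline plus the rule-based sentencizer stands in for
# en_core_web_sm, so this example needs no model download.
nlp_demo = spacy.blank("en")
nlp_demo.add_pipe("sentencizer")

demo_doc = nlp_demo(
    "ChatGPT is a chatbot. It was launched by OpenAI. Users can rate its responses."
)

# doc.sents is a generator, so count by iterating instead of calling len() on it.
sentence_count = sum(1 for _ in demo_doc.sents)
print("Total number of sentences:", sentence_count)
```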

## POS Tagging 

In [12]:
for token in doc:
    print(token.text,'===>',token.pos_)

ChatGPT ===> PROPN
( ===> PUNCT
Chat ===> PROPN
Generative ===> PROPN
Pre ===> PROPN
- ===> ADJ
trained ===> VERB
Transformer)[1 ===> SPACE
] ===> PUNCT
is ===> AUX
a ===> DET
chatbot ===> NOUN
launched ===> VERB
by ===> ADP
OpenAI ===> PROPN
in ===> ADP
November ===> PROPN
2022 ===> NUM
. ===> PUNCT
It ===> PRON
is ===> AUX
built ===> VERB
on ===> ADP
top ===> NOUN
of ===> ADP
OpenAI ===> PROPN
's ===> PART
GPT-3 ===> PROPN
family ===> NOUN
of ===> ADP
large ===> ADJ
language ===> NOUN
models ===> NOUN
, ===> PUNCT
and ===> CCONJ
is ===> AUX
fine ===> ADV
- ===> PUNCT
tuned ===> VERB
( ===> PUNCT
an ===> DET
approach ===> NOUN
to ===> PART
transfer ===> VERB
learning)[2 ===> NUM
] ===> PUNCT
with ===> ADP
both ===> CCONJ
supervised ===> ADJ
and ===> CCONJ
reinforcement ===> NOUN
learning ===> NOUN
techniques ===> NOUN
. ===> PUNCT


 ===> SPACE
ChatGPT ===> PROPN
was ===> AUX
launched ===> VERB
as ===> ADP
a ===> DET
prototype ===> NOUN
on ===> ADP
November ===> PROPN
30 ===> NUM
, ==

as ===> ADP
a ===> DET
personalized ===> ADJ
therapist.[17 ===> NOUN
] ===> PUNCT
To ===> PART
prevent ===> VERB
offensive ===> ADJ
outputs ===> NOUN
from ===> ADP
being ===> AUX
presented ===> VERB
to ===> ADP
and ===> CCONJ
produced ===> VERB
from ===> ADP
ChatGPT ===> PROPN
, ===> PUNCT
queries ===> NOUN
are ===> AUX
filtered ===> VERB
through ===> ADP
OpenAI ===> PROPN
's ===> PART
company ===> NOUN
- ===> PUNCT
wide ===> ADJ
moderation ===> NOUN
API,[18][19 ===> PROPN
] ===> PUNCT
and ===> CCONJ
potentially ===> ADV
racist ===> ADJ
or ===> CCONJ
sexist ===> ADJ
prompts ===> NOUN
are ===> AUX
dismissed.[6][17 ===> PROPN
] ===> PUNCT


 ===> SPACE
ChatGPT ===> VERB
suffers ===> VERB
from ===> ADP
multiple ===> ADJ
limitations ===> NOUN
. ===> PUNCT
OpenAI ===> PROPN
acknowledged ===> VERB
that ===> SCONJ
ChatGPT ===> NOUN
" ===> PUNCT
sometimes ===> ADV
writes ===> VERB
plausible ===> ADJ
- ===> PUNCT
sounding ===> VERB
but ===> CCONJ
incorrect ===> ADJ
or ===> CCONJ
nonsensical ===

the ===> DET
reaction ===> NOUN
to ===> ADP
ChatGPT ===> PROPN
is ===> AUX
not ===> PART
just ===> ADV
the ===> DET
number ===> NOUN
of ===> ADP
people ===> NOUN
who ===> PRON
are ===> AUX
blown ===> VERB
away ===> ADV
by ===> ADP
it ===> PRON
, ===> PUNCT
but ===> CCONJ
who ===> PRON
they ===> PRON
are ===> AUX
. ===> PUNCT
These ===> PRON
are ===> AUX
not ===> PART
people ===> NOUN
who ===> PRON
get ===> AUX
excited ===> VERB
by ===> ADP
every ===> DET
shiny ===> ADJ
new ===> ADJ
thing ===> NOUN
. ===> PUNCT
Clearly ===> ADV
, ===> PUNCT
something ===> PRON
big ===> ADJ
is ===> AUX
happening ===> VERB
. ===> PUNCT
"[37 ===> PUNCT
] ===> PUNCT
Elon ===> PROPN
Musk ===> PROPN
wrote ===> VERB
that ===> SCONJ
" ===> PUNCT
ChatGPT ===> NOUN
is ===> AUX
scary ===> ADJ
good ===> NOUN
. ===> PUNCT
We ===> PRON
are ===> AUX
not ===> PART
far ===> ADV
from ===> ADP
dangerously ===> ADV
strong ===> ADJ
AI".[36 ===> PROPN
] ===> PUNCT
Musk ===> PROPN
paused ===> VERB
OpenAI ===> PROPN
's ===> PA

## NER

In [18]:
for ent in doc.ents:
    print(ent.text,'===>',ent.label_)

OpenAI ===> GPE
November 2022 ===> DATE
OpenAI ===> GPE
GPT-3 ===> PRODUCT
November 30, 2022 ===> DATE
ChatGPT ===> ORG
OpenAI ===> GPE
29 ===> MONEY
San Francisco ===> GPE
OpenAI

Sam Altman ===> ORG
OpenAI
ChatGPT ===> ORG
AI ===> ORG
first ===> ORDINAL
Proximal Policy Optimization ===> ORG
Microsoft ===> ORG
OpenAI ===> GPE
feedback.[10][11][12 ===> CARDINAL
ChatGPT ===> ORG
December 30, 2022 ===> DATE
ChatGPT ===> ORG
Python ===> GPE
InstructGPT ===> ORG
one ===> CARDINAL
InstructGPT ===> ORG
Christopher Columbus ===> PERSON
US ===> GPE
2015 ===> DATE
Columbus ===> GPE
U.S. ===> GPE
2015 ===> DATE
Columbus ===> GPE
Columbus ===> GPE
OpenAI ===> GPE
OpenAI ===> GPE
Goodhart ===> ORG
2021 ===> DATE
BBC ===> ORG
December 2022 ===> DATE
two ===> CARDINAL
one ===> CARDINAL
November 30, 2022 ===> DATE
San Francisco ===> GPE
OpenAI ===> GPE
2 ===> CARDINAL
Whisper ===> PERSON
December 4 ===> DATE
OpenAI ===> GPE
one million ===> CARDINAL
CNBC ===> ORG
December 15, 2022 ===> DATE
time".[27

## Rule-Based Matching

In [20]:
doc 

ChatGPT (Chat Generative Pre-trained Transformer)[1] is a chatbot launched by OpenAI in November 2022. It is built on top of OpenAI's GPT-3 family of large language models, and is fine-tuned (an approach to transfer learning)[2] with both supervised and reinforcement learning techniques.

ChatGPT was launched as a prototype on November 30, 2022, and quickly garnered attention for its detailed responses and articulate answers across many domains of knowledge. Its uneven factual accuracy was identified as a significant drawback.[3] Following the release of ChatGPT, OpenAI was valued at $29 billion.[4]

Training

Pioneer Building, San Francisco, headquarters of OpenAI

Sam Altman, CEO of OpenAI
ChatGPT was fine-tuned on top of GPT-3.5 using supervised learning as well as reinforcement learning.[5] Both approaches used human trainers to improve the model's performance. In the case of supervised learning, the model was provided with conversations in which the trainers played both sides: the

#### Important types of matching:
    
    1) Token Matching
    
    2) Phrase Matching 
    
    3) Entity Matching 
    
    4) Combination of all the above 

#### How to do matching:

    1) Create an object/instance of the Matcher class
    
    2) Define a pattern that describes what we are looking for 
    
    3) Add the pattern to the Matcher object 
    
    4) Pass the doc to the Matcher object 
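Because every example in this notebook repeats the same four steps, they can be wrapped in a small helper. This is only a sketch: `find_matches`, `nlp_demo`, and `demo_doc` are hypothetical names that do not appear in the notebook, and a blank English pipeline is assumed so the example runs without a trained model.

```python
import spacy
from spacy.matcher import Matcher


def find_matches(nlp, doc, name, pattern):
    """Return the text of every span in `doc` that matches `pattern`."""
    matcher = Matcher(nlp.vocab)      # 1) create the Matcher
    matcher.add(name, [pattern])      # 3) add the pattern (step 2 is done by the caller)
    return [
        doc[start:end].text           # 4) pass the doc to the Matcher
        for _match_id, start, end in matcher(doc)
    ]


# Assumption: a blank pipeline is enough because the pattern only uses token text.
nlp_demo = spacy.blank("en")
demo_doc = nlp_demo("ChatGPT is a chatbot. Many people use ChatGPT daily.")
demo_pattern = [{"TEXT": "ChatGPT"}]  # 2) define the pattern
hits = find_matches(nlp_demo, demo_doc, "DEMO", demo_pattern)
print(len(hits), hits)
```

A fresh `Matcher` is created on every call, which keeps the helper simple; for many patterns over many documents, building the matcher once and reusing it would avoid repeated setup.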

## Token Matching 

In [21]:
from spacy.matcher import Matcher

#### 1) Create a Matcher Object

In [23]:
matcher_1 = Matcher(nlp.vocab)

#### 2) Define the pattern: a list of dictionaries, one dictionary per token

In [22]:
pattern_1 = [{'TEXT':'ChatGPT'}]

pattern_1

[{'TEXT': 'ChatGPT'}]

#### As you can see above, pattern_1 is a list containing a single dictionary; each dictionary describes one token

#### 3) Add the pattern to the Matcher

In [24]:
matcher_1.add('Pattern_1',[pattern_1])

#### 4) Pass the doc to the Matcher object

In [25]:
match_1 = matcher_1(doc)

In [26]:
print(len(match_1))

38


In [30]:
for match_id, start, end in match_1:
    span = doc[start:end]
    print(span.text)

ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
ChatGPT
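Each match is a `(match_id, start, end)` tuple: `start` and `end` are token indices into the doc, and `match_id` is the hash of the name given to `matcher.add`. The sketch below shows how the name can be recovered through `nlp.vocab.strings`. It assumes a blank English pipeline and a short made-up text so it runs without a trained model; `nlp_demo`, `demo_doc`, and `rows` are illustrative names.

```python
import spacy
from spacy.matcher import Matcher

# Assumption: a blank pipeline suffices because the pattern only checks token text.
nlp_demo = spacy.blank("en")
demo_doc = nlp_demo("ChatGPT is a chatbot. ChatGPT was launched by OpenAI.")

matcher = Matcher(nlp_demo.vocab)
matcher.add("Pattern_1", [[{"TEXT": "ChatGPT"}]])

rows = []
for match_id, start, end in matcher(demo_doc):
    rule_name = nlp_demo.vocab.strings[match_id]  # hash -> the name passed to add()
    rows.append((rule_name, start, end, demo_doc[start:end].text))

for row in rows:
    print(row)
```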


## Phrase Matching

### Looking for 'ChatGPT is'

In [33]:
matcher_2 = Matcher(nlp.vocab)
pattern_2 = [{'TEXT':'ChatGPT'},
            {'TEXT':'is'}]
matcher_2.add('Pattern_2',[pattern_2])
match_2 = matcher_2(doc)

print("Number of occurrences of 'ChatGPT is':",len(match_2))
print('\n')
for match_id, start, end in match_2:
    span = doc[start:end]
    print(span.text)

Number of occurrences of 'ChatGPT is': 6


ChatGPT is
ChatGPT is
ChatGPT is
ChatGPT is
ChatGPT is
ChatGPT is
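The example above matches a phrase by listing one dictionary per token in a `Matcher`. spaCy also ships a separate `PhraseMatcher` class that takes the phrases themselves as `Doc` objects, which may be more convenient for long term lists. The following is a minimal sketch on a made-up sentence with a blank English pipeline (both are assumptions); `attr="LOWER"` makes the comparison case-insensitive.

```python
import spacy
from spacy.matcher import PhraseMatcher

# Assumption: a blank pipeline is enough, since only token text is compared.
nlp_demo = spacy.blank("en")
demo_doc = nlp_demo("ChatGPT is versatile. Critics note that chatgpt is not always accurate.")

# attr="LOWER" compares lowercased token text instead of the exact text.
phrase_matcher = PhraseMatcher(nlp_demo.vocab, attr="LOWER")
phrase_matcher.add("CHATGPT_IS", [nlp_demo.make_doc("ChatGPT is")])

found = [demo_doc[start:end].text for _match_id, start, end in phrase_matcher(demo_doc)]
print(found)
```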


## Looking for 'language model/s'

#### Matching on the lemma `language` covers language, languages, etc.


In [36]:
matcher_3 = Matcher(nlp.vocab)
pattern_3 = [{'LEMMA':'language'},
            {'LEMMA':'model'}]
matcher_3.add('Pattern_3',[pattern_3])
match_3 = matcher_3(doc)

print(len(match_3))
print('\n')
for match_id, start, end in match_3:
    span = doc[start:end]
    print(span.text)

3


language models
language models
language models
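To see why `LEMMA` is used here instead of the exact token text, the sketch below compares the two on a tiny hand-built `Doc`. The lemmas are assigned by hand (an assumption made so the example runs without a trained lemmatizer); in the notebook they come from `en_core_web_sm`. The helper `run` and the variable names are illustrative.

```python
import spacy
from spacy.matcher import Matcher
from spacy.tokens import Doc

vocab = spacy.blank("en").vocab

# Assumption: lemmas are supplied manually instead of by a trained pipeline.
words = ["large", "language", "models", "and", "one", "language", "model"]
lemmas = ["large", "language", "model", "and", "one", "language", "model"]
demo_doc = Doc(vocab, words=words, lemmas=lemmas)


def run(pattern):
    """Match a single pattern against demo_doc and return the matched texts."""
    matcher = Matcher(vocab)
    matcher.add("DEMO", [pattern])
    return [demo_doc[start:end].text for _match_id, start, end in matcher(demo_doc)]


# Exact text only finds the singular form.
text_hits = run([{"TEXT": "language"}, {"TEXT": "model"}])
# Lemmas find both the singular and the plural form.
lemma_hits = run([{"LEMMA": "language"}, {"LEMMA": "model"}])
print(text_hits)
print(lemma_hits)
```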


## Looking for digits 

In [39]:
matcher_4 = Matcher(nlp.vocab)
pattern_4 = [{'IS_DIGIT':True}]
matcher_4.add('Pattern_4',[pattern_4])
match_4 = matcher_4(doc)

print(len(match_4))
print('\n')
for match_id, start, end in match_4:
    span = doc[start:end]
    print(span.text)

26


2022
30
2022
29
30
2022
2015
2015
2021
2022
30
2022
2
4
15
2022
2022
2022
2022
2022
2015
2022
3
2023
42
55
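`IS_DIGIT` only accepts tokens made up entirely of digits, so number words such as "one" or "million" are skipped. If those should be included, the `LIKE_NUM` attribute is a looser alternative. The sketch below contrasts the two on a sentence adapted from the text, using a blank English pipeline (an assumption; no trained model is required for these lexical attributes).

```python
import spacy
from spacy.matcher import Matcher

# Assumption: a blank English pipeline; IS_DIGIT and LIKE_NUM are lexical attributes.
nlp_demo = spacy.blank("en")
demo_doc = nlp_demo("OpenAI was valued at $29 billion and had one million users by December 4")


def run(pattern):
    """Match a single pattern against demo_doc and return the matched texts."""
    matcher = Matcher(nlp_demo.vocab)
    matcher.add("DEMO", [pattern])
    return [demo_doc[start:end].text for _match_id, start, end in matcher(demo_doc)]


digit_hits = run([{"IS_DIGIT": True}])  # digit-only tokens
num_hits = run([{"LIKE_NUM": True}])    # also number words and formatted numbers
print("IS_DIGIT:", digit_hits)
print("LIKE_NUM:", num_hits)
```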


## Looking for words like [formed, created, launched, ...]

In [41]:
matcher_5 = Matcher(nlp.vocab)
pattern_5 = [{'LEMMA':{'IN':['create','form','launch','establish','introduce','demonstrate']}}]
matcher_5.add('Pattern_5',[pattern_5])
match_5 = matcher_5(doc)

print(len(match_5))
print('\n')
for match_id, start, end in match_5:
    span = doc[start:end]
    print(span.text)

8


launched
launched
created
create
established
launched
launched
launched
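The `IN` set lets one token slot accept several alternative words. A related tool is the `OP` key, which makes a token slot optional or repeatable. As a sketch, the pattern below matches both "fine-tuned" (which spaCy splits into three tokens, as the tokenization output earlier shows) and "fine tuned". The sentence is made up and a blank English pipeline is assumed.

```python
import spacy
from spacy.matcher import Matcher

# Assumption: a blank pipeline; the pattern only uses lexical attributes.
nlp_demo = spacy.blank("en")
demo_doc = nlp_demo("The model is fine-tuned with human feedback and later fine tuned again.")

pattern = [
    {"LOWER": "fine"},
    {"TEXT": "-", "OP": "?"},  # "?" makes the hyphen token optional
    {"LOWER": "tuned"},
]
matcher = Matcher(nlp_demo.vocab)
matcher.add("FINE_TUNED", [pattern])

hits = [demo_doc[start:end].text for _match_id, start, end in matcher(demo_doc)]
print(hits)
```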


## Entity Matching

### Occurrences of GPE type

In [42]:
matcher_6 = Matcher(nlp.vocab)
pattern_6 = [{'ENT_TYPE':'GPE'}]
matcher_6.add('Pattern_6',[pattern_6])
match_6 = matcher_6(doc)

print(len(match_6))
print('\n')
for match_id, start, end in match_6:
    span = doc[start:end]
    print(span.text)

30


OpenAI
OpenAI
OpenAI
San
Francisco
OpenAI
Python
US
Columbus
U.S.
Columbus
Columbus
OpenAI
OpenAI
San
Francisco
OpenAI
OpenAI
AI
OpenAI
OpenAI
Nazi
Germany
Germany
OpenAI
OpenAI
OpenAI
OpenAI
New
York
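In the output above, multi-token entities come back one token at a time ("San", "Francisco"), because the pattern describes a single token. One way to get whole entities, sketched below, is to repeat the token with `OP: "+"` and keep only the longest match with `greedy="LONGEST"`. The entity is assigned by hand on a blank pipeline (an assumption made so the example runs without a trained NER model), and adjacent entities of the same type would be merged into one span by this pattern.

```python
import spacy
from spacy.matcher import Matcher

nlp_demo = spacy.blank("en")
demo_doc = nlp_demo("Pioneer Building, San Francisco, headquarters of OpenAI")

# Assumption: the GPE entity is set manually; a trained pipeline would normally do this.
phrase = "San Francisco"
start_char = demo_doc.text.index(phrase)
demo_doc.ents = [demo_doc.char_span(start_char, start_char + len(phrase), label="GPE")]

matcher = Matcher(nlp_demo.vocab)
# "+" matches one or more consecutive GPE tokens; LONGEST drops the shorter overlaps.
matcher.add("GPE_SPAN", [[{"ENT_TYPE": "GPE", "OP": "+"}]], greedy="LONGEST")

gpe_spans = [demo_doc[start:end].text for _match_id, start, end in matcher(demo_doc)]
print(gpe_spans)
```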


### Occurrences of PERSON type

In [43]:
matcher_7 = Matcher(nlp.vocab)
pattern_7 = [{'ENT_TYPE':'PERSON'}]
matcher_7.add('Pattern_7',[pattern_7])
match_7 = matcher_7(doc)

print(len(match_7))
print('\n')
for match_id, start, end in match_7:
    span = doc[start:end]
    print(span.text)

23


Christopher
Columbus
Whisper
Scott
Aaronson
GPT-4
Dan
Gillmor
Alex
Kantrowitz
Hitler
Derek
Thompson
Paul
Graham
Sundar
Pichai
Sundar
Pichai
Microsoft
Bing
Stuart
Cobbe
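Token-level and entity-level constraints can also be combined in one pattern, which is the fourth matching type listed earlier. The sketch below looks for "PERSON, CEO of ORG" on a line taken from the text. The entity labels are assigned by hand through a hypothetical `mark` helper on a blank pipeline; with `en_core_web_sm` the result depends on the NER labels, and in the NER output above "Sam Altman" and "OpenAI" were not labelled PERSON and ORG.

```python
import spacy
from spacy.matcher import Matcher

nlp_demo = spacy.blank("en")
demo_doc = nlp_demo("Sam Altman, CEO of OpenAI")


def mark(doc, phrase, label):
    """Hypothetical helper: labelled entity span for the first occurrence of `phrase`."""
    start = doc.text.index(phrase)
    return doc.char_span(start, start + len(phrase), label=label)


# Assumption: entities are assigned manually instead of by a trained NER component.
demo_doc.ents = [mark(demo_doc, "Sam Altman", "PERSON"), mark(demo_doc, "OpenAI", "ORG")]

pattern = [
    {"ENT_TYPE": "PERSON", "OP": "+"},  # one or more PERSON tokens
    {"IS_PUNCT": True, "OP": "?"},      # optional comma
    {"LOWER": "ceo"},
    {"LOWER": "of"},
    {"ENT_TYPE": "ORG", "OP": "+"},     # one or more ORG tokens
]
matcher = Matcher(nlp_demo.vocab)
matcher.add("PERSON_CEO_OF_ORG", [pattern], greedy="LONGEST")

hits = [demo_doc[start:end].text for _match_id, start, end in matcher(demo_doc)]
print(hits)
```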


## Tokens having length >= 15

In [45]:
matcher_8 = Matcher(nlp.vocab)
pattern_8 = [{'LENGTH':{'>=':15}}]
matcher_8.add('Pattern_8',[pattern_8])
match_8 = matcher_8(doc)

print(len(match_8))
print('\n')
for match_id, start, end in match_8:
    span = doc[start:end]
    print(span.text)

8


computationally
performance.[8][9
feedback.[10][11][12
conversationalist
dismissed.[6][17
hallucination.[20
applications.[23
scientists.[24][25
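Most of the long tokens above are ordinary words fused with citation markers such as `[8][9]`, because the tokenizer does not split them off. One possible remedy, sketched here with a hypothetical `strip_citations` helper, is to remove the bracketed numbers with a regular expression before passing the text to `nlp`. Note that this would change the token and match counts shown elsewhere in this notebook.

```python
import re


def strip_citations(text):
    """Remove bracketed numeric citation markers such as [3] or [10][11]."""
    return re.sub(r"\[\d+\]", "", text)


raw = (
    "Its uneven factual accuracy was identified as a significant drawback.[3] "
    "Following the release of ChatGPT, OpenAI was valued at $29 billion.[4]"
)
clean = strip_citations(raw)
print(clean)
```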


## Tokens having length 2

In [46]:
matcher_9 = Matcher(nlp.vocab)
pattern_9 = [{'LENGTH':{'==':2}}]
matcher_9.add('Pattern_9',[pattern_9])
match_9 = matcher_9(doc)

print(len(match_9))
print('\n')
for match_id, start, end in match_9:
    span = doc[start:end]
    print(span.text)

256


is
by
in
It
is
on
of
's
of
is
an
to



as
on
30
of
as
of
at
29






of



of
on
of
as
as
to
's
In
of
in
AI
In
in
to
on
of
to
of
in
on



In
to
be
to
to
or
or






of
30
of
is
to
is
it
to
to
to
on
at
to
to
to
an
to
to
an
's
as



In
to
to
In
of
me
to
US
in
as
of
as
of
if
to
in
of



to
it
in
to
be
as
To
to
's
or



or
is
to
is
of
be
as
's
of
to
as
of
is
to
or
in
to
on
In
of
or
be
to
of
In
of
to



on
30
by
of
as
to
to
By
on
15
to
in
is
to
in
to
of
in
AI
as
of
is
no
of
an



to
is
on
to
to
so
as
to
or
in
of
to
be
in



in
it
to
of
it
to
on
on
to
of
's
to
to
in
's
of



In
's
of
as
of
AI
we
we



of
is
's
on
to
AI
as
of
us
is
to
be
of
to
is
of
by
it
by
is
is
We
's
to
of
's
as
is
co
in
in
to
in
of
in
to
of
by
In
at
of
of
to
to
in
to
on
to



in
to
by
on
42
55
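The blank lines in the output above are whitespace tokens: a run of two newlines between paragraphs is a token of length 2. If only visible tokens are wanted, a second attribute can be added to the same dictionary. The sketch below adds `IS_SPACE: False` and runs on a short made-up text with a blank English pipeline (both are assumptions).

```python
import spacy
from spacy.matcher import Matcher

# Assumption: a blank pipeline; LENGTH and IS_SPACE are lexical attributes.
nlp_demo = spacy.blank("en")
demo_doc = nlp_demo("It is\n\nan AI")


def run(pattern):
    """Match a single pattern against demo_doc and return the matched texts."""
    matcher = Matcher(nlp_demo.vocab)
    matcher.add("DEMO", [pattern])
    return [demo_doc[start:end].text for _match_id, start, end in matcher(demo_doc)]


with_whitespace = run([{"LENGTH": {"==": 2}}])
# Both conditions in one dictionary must hold for the same token.
visible_only = run([{"LENGTH": {"==": 2}, "IS_SPACE": False}])
print(len(with_whitespace), len(visible_only))
print(visible_only)
```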


## Looking for 'Musk'

In [50]:
matcher_10 = Matcher(nlp.vocab)
pattern_10 = [{'TEXT':'Musk'}]
matcher_10.add('Pattern_10',[pattern_10])
match_10 = matcher_10(doc)

print(len(match_10))
print('\n')
for match_id, start, end in match_10:
    span = doc[start:end]
    print(span.text)

3


Musk
Musk
Musk
