# spaCy Entity Ruler

spaCy offers multiple methods for performing rule-based NER. One such method is its EntityRuler.

The __EntityRuler__ is a spaCy factory that lets you define a set of patterns with corresponding labels. A __factory__ in spaCy is a set of classes and functions, preloaded in spaCy and referenced by a string name such as `"entity_ruler"`, that creates a pipeline component for a specific task.
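To make the idea of a factory concrete, the sketch below checks that `entity_ruler` is among the factory names registered on a pipeline and then builds the component from that name. It uses a blank English pipeline (`spacy.blank("en")`) and the variable name `blank_nlp`, both assumptions made here so that no trained model needs to be downloaded; the rest of this section uses `en_core_web_sm`.

```python
import spacy

# Assumption for illustration: a blank English pipeline (tokenizer only).
blank_nlp = spacy.blank("en")

# Factories are looked up by their string name.
has_ruler_factory = "entity_ruler" in blank_nlp.factory_names
print(has_ruler_factory)

# add_pipe builds the component from the factory name and returns it.
blank_ruler = blank_nlp.add_pipe("entity_ruler")
print(blank_nlp.pipe_names)
```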

In [1]:
import spacy

In [2]:
nlp = spacy.load("en_core_web_sm")

In [3]:
text = "Python is a high level programming language."

In [4]:
doc = nlp(text)

In [5]:
for ent in doc.ents:
    print(ent.text, ent.label_)

The EntityRuler is added to the spaCy pipeline as a new pipe with `nlp.add_pipe`, which returns the ruler so that the user can then give it a set of instructions (patterns).

In [6]:
# the entity ruler should come before the ner pipe to get priority
ruler = nlp.add_pipe("entity_ruler", before="ner")
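
The effect of `before="ner"` on component order can be inspected through `nlp.pipe_names`. The following is a minimal sketch on a blank pipeline named `order_nlp` (a hypothetical name); the untrained `ner` component is added only so there is something to position the ruler against, which is an assumption for illustration, and the pipeline is never run on text.

```python
import spacy

# Assumption for illustration: a blank pipeline with an untrained "ner"
# placeholder. It is only used to inspect component order, not to predict.
order_nlp = spacy.blank("en")
order_nlp.add_pipe("ner")

# Insert the ruler ahead of "ner" so its entities are set first.
order_nlp.add_pipe("entity_ruler", before="ner")

print(order_nlp.pipe_names)
```

spaCy's documentation also describes an `overwrite_ents` setting for the case where the ruler runs after `ner`; it is not used in this section.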

In [7]:
nlp.analyze_pipes()

{'summary': {'tok2vec': {'assigns': ['doc.tensor'],
   'requires': [],
   'scores': [],
   'retokenizes': False},
  'tagger': {'assigns': ['token.tag'],
   'requires': [],
   'scores': ['tag_acc'],
   'retokenizes': False},
  'parser': {'assigns': ['token.dep',
    'token.head',
    'token.is_sent_start',
    'doc.sents'],
   'requires': [],
   'scores': ['dep_uas',
    'dep_las',
    'dep_las_per_type',
    'sents_p',
    'sents_r',
    'sents_f'],
   'retokenizes': False},
  'attribute_ruler': {'assigns': [],
   'requires': [],
   'scores': [],
   'retokenizes': False},
  'lemmatizer': {'assigns': ['token.lemma'],
   'requires': [],
   'scores': ['lemma_acc'],
   'retokenizes': False},
  'entity_ruler': {'assigns': ['doc.ents', 'token.ent_type', 'token.ent_iob'],
   'requires': [],
   'scores': ['ents_f', 'ents_p', 'ents_r', 'ents_per_type'],
   'retokenizes': False},
  'ner': {'assigns': ['doc.ents', 'token.ent_iob', 'token.ent_type'],
   'requires': [],
   'scores': ['ents_f', 'ent

In [8]:
patterns = [
    {"label":"PLANG","pattern":"Python"}
]

In [9]:
ruler.add_patterns(patterns)

In [10]:
doc = nlp(text)
for ent in doc.ents:
    print(ent.text, ent.label_)

Python PLANG
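
The pattern above is a plain string, which matches the exact text. A pattern can also be written as a list of token dictionaries, which allows, for example, case-insensitive matching. The sketch below is a hedged illustration of that idea rather than part of the original notebook: the blank pipeline, the name `token_nlp`, the extra languages, and the sample sentence are all assumptions.

```python
import spacy

# Assumption for illustration: a blank English pipeline, so the ruler is
# the only component that sets entities.
token_nlp = spacy.blank("en")
token_ruler = token_nlp.add_pipe("entity_ruler")

token_ruler.add_patterns([
    # Phrase pattern: exact string match, as in the cell above.
    {"label": "PLANG", "pattern": "Python"},
    # Token patterns: one dict per token, compared on the lowercased text.
    {"label": "PLANG", "pattern": [{"LOWER": "java"}]},
    {"label": "PLANG", "pattern": [{"LOWER": "visual"}, {"LOWER": "basic"}]},
])

sample_doc = token_nlp("Python, JAVA and Visual Basic are programming languages.")
entities = [(ent.text, ent.label_) for ent in sample_doc.ents]
for ent_text, ent_label in entities:
    print(ent_text, ent_label)
```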
