# Custom Rule-Based Entity Recognition

The entity ruler is a spaCy pipeline component that lets you add named entities based on pattern dictionaries. This makes it easy to combine rule-based and statistical named entity recognition in a single model.

### Entity Patterns

Entity patterns are dictionaries with two keys:

'label', the label to assign to the entity if the pattern is matched, and

'pattern', the match pattern.

The entity ruler accepts two types of patterns:

> Phrase pattern, for exact string matches: {'label' : 'ORG', 'pattern' : 'APPLE'}

> Token pattern, with one dictionary describing each token: {'label' : 'GPE', 'pattern' : [{'LOWER' : 'san'}, {'LOWER' : 'francisco'}]}
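To make the difference between the two pattern types concrete, here is a minimal pure-Python sketch of how each one could be checked against a list of tokens. It is not spaCy's implementation (spaCy relies on its own Matcher and PhraseMatcher classes); the helpers `token_matches` and `find_matches` are hypothetical names introduced for illustration, tokenization is assumed to be a plain whitespace split, and only the `ORTH` and `LOWER` attributes are handled.

```python
def token_matches(token, spec):
    """Check one token string against one token-pattern dictionary."""
    for attr, value in spec.items():
        if attr == "ORTH":        # exact, case-sensitive text
            if token != value:
                return False
        elif attr == "LOWER":     # lowercase form of the token text
            if token.lower() != value:
                return False
        else:
            raise ValueError(f"attribute not covered by this sketch: {attr}")
    return True


def find_matches(tokens, patterns):
    """Return (start, end, label) spans for every pattern that matches."""
    matches = []
    for entry in patterns:
        label, pattern = entry["label"], entry["pattern"]
        if isinstance(pattern, str):
            # Phrase pattern: treat each word as an exact-text token spec.
            specs = [{"ORTH": word} for word in pattern.split()]
        else:
            # Token pattern: already a list of per-token dictionaries.
            specs = pattern
        if not specs:
            continue
        width = len(specs)
        for start in range(len(tokens) - width + 1):
            window = tokens[start:start + width]
            if all(token_matches(t, s) for t, s in zip(window, specs)):
                matches.append((start, start + width, label))
    return matches


patterns = [
    {"label": "ORG", "pattern": "APPLE"},
    {"label": "GPE", "pattern": [{"LOWER": "san"}, {"LOWER": "francisco"}]},
]
tokens = "APPLE opened an office in San Francisco".split()
print(find_matches(tokens, patterns))
```

In this sketch the phrase pattern only matches the exact string 'APPLE', while the token pattern matches 'San Francisco', 'san francisco', or 'SAN FRANCISCO', because `LOWER` compares the lowercased token text.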

### Using the Entity Ruler

The entity ruler is a pipeline component that is typically added via nlp.add_pipe. When the nlp object is called on a text, it finds matches in the doc and adds them as entities to doc.ents, using the pattern's label as the entity label.

In [34]:
import spacy
from spacy.pipeline import EntityRuler

In [35]:
nlp = spacy.load('en_core_web_sm')

In [36]:
ruler = EntityRuler(nlp)

In [37]:
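# One phrase pattern (exact string) and one token pattern (case-insensitive)
mypattern = [{'label' : 'ORG', 'pattern' : 'AESM'},
             {'label' : 'GPE', 'pattern' : [{'LOWER' : 'san'}, {'LOWER' : 'francisco'}]}
            ]

In [38]:
ruler.add_patterns(mypattern)

In [39]:
# spaCy v2 style: the component object is passed directly
# (spaCy v3 expects the string name 'entity_ruler' instead)
nlp.add_pipe(ruler)

In [40]:
doc = nlp("AESM is opening its first big office in San Francisco")

In [41]:
doc

AESM is opening its first big office in San Francisco

In [54]:
# Collect the text and label of every entity span in the doc
myents = [(ent.text, ent.label_) for ent in doc.ents]
myents

[('AESM', 'ORG'), ('first', 'ORDINAL'), ('San Francisco', 'GPE')]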
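The cells above pass the ruler object itself to nlp.add_pipe, which is the spaCy v2 calling convention. The following is a minimal sketch of the same idea under spaCy v3, where components are added by their registered string name. It assumes spaCy v3 is installed and uses a blank English pipeline (`spacy.blank("en")`, an assumption made so the sketch runs without downloading en_core_web_sm), so no statistical entities such as ('first', 'ORDINAL') are expected here.

```python
import spacy

# Assumption: spaCy v3. A blank pipeline has a tokenizer but no statistical NER.
nlp = spacy.blank("en")

# In v3, add_pipe takes the component's string name and returns the component.
ruler = nlp.add_pipe("entity_ruler")

patterns = [
    {"label": "ORG", "pattern": "AESM"},
    {"label": "GPE", "pattern": [{"LOWER": "san"}, {"LOWER": "francisco"}]},
]
ruler.add_patterns(patterns)

doc = nlp("AESM is opening its first big office in San Francisco")
ents = [(ent.text, ent.label_) for ent in doc.ents]
print(ents)
```

With a pretrained pipeline such as en_core_web_sm, the ruler can be placed ahead of the statistical recognizer with `nlp.add_pipe("entity_ruler", before="ner")`, so that the recognizer works around the spans the rules have already set.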