Find the full project [on GitHub](https://github.com/explosion/projects/tree/v3/tutorials/spanruler_restaurant_reviews) or read [the blog post](https://blog.victoriaslocum.com/post/spanruler-ner-data)!

In [10]:
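import locale
# Colab workaround: force UTF-8 so shell commands (the "!" lines) don't fail on encoding.
# Accept any arguments, because callers may pass do_setlocale.
locale.getpreferredencoding = lambda *args, **kwargs: "UTF-8"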

In [None]:
!pip install spacy
!pip install srsly

In [None]:
!python -m spacy project clone tutorials/spanruler_restaurant_reviews

In [None]:
%cd /content/spanruler_restaurant_reviews

In [None]:
!python -m spacy project assets

In [None]:
!ls assets/

In [None]:
!head assets/train_raw.iob
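
The `.iob` extension refers to IOB (inside, outside, beginning) tagging: a `B-` tag opens an entity, `I-` continues it, and `O` marks a token outside any entity. As a rough sketch of how such tags map to labeled spans, the helper below groups consecutive tags into `(start, end, label)` token ranges. The name `iob_to_spans` and the example tags are made up for illustration; they are not taken from the project files, and nothing is assumed here about the exact layout of `train_raw.iob`.

```python
from typing import List, Tuple


def iob_to_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Group IOB tags into (start, end, label) token spans; end is exclusive."""
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags):
        prefix, _, tag_label = tag.partition("-")
        # An I- tag continues the open span only if the label matches.
        if prefix == "I" and label == tag_label:
            continue
        # Anything else closes the span that is currently open.
        if label is not None:
            spans.append((start, i, label))
            start, label = None, None
        # B- opens a new span; a stray I- is treated like B- in this sketch.
        if prefix in ("B", "I"):
            start, label = i, tag_label
    if label is not None:
        spans.append((start, len(tags), label))
    return spans


# Hypothetical tags, for illustration only.
tokens = "find me a restaurant with at least 3 stars".split()
tags = ["O", "O", "O", "O", "O", "B-Rating", "I-Rating", "I-Rating", "I-Rating"]
for start, end, label in iob_to_spans(tags):
    print(label, tokens[start:end])
```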

In [7]:
import spacy
import srsly
from spacy import displacy
from pprint import pprint

In [None]:
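# read_jsonl yields records lazily, one per line of the file
train_data = srsly.read_jsonl("/content/spanruler_restaurant_reviews/assets/train_review.jsonl")

# Take only the first record instead of loading the whole file into a list
example_1 = next(iter(train_data))

pprint(example_1)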

Download model:

In [None]:
!gdown 1PCH7BA2JIToIP77cr7jT7I7ougFEigaB

In [None]:
!unzip /content/spanruler_restaurant_reviews/ner_review.zip
# Make sure the target directory exists before moving the model into it
!mkdir -p training
!mv ner_review training/ner_review

In [13]:
!rm -r /content/spanruler_restaurant_reviews/ner_review.zip

We provide the trained model in the cells above, so you don't have to run the training command below.

🔽 Training takes about 2 hours to run.

In [None]:
# !python -m spacy project run train-review

In [None]:
text = "find me a cheap chinese restaurant with at least 3 stars"

nlp = spacy.load("/content/spanruler_restaurant_reviews/training/ner_review/model-best")
doc = nlp(text)

displacy.render(doc, style="ent", jupyter=True)

In [None]:
nlp = spacy.blank("en")

# When candidate matches overlap, keep the longest one (the earliest on ties).
ruler = nlp.add_pipe(
    "span_ruler",
    config={"spans_filter": {"@misc": "spacy.first_longest_spans_filter.v1"}},
)
patterns = [
    {
        "label": "Rating",
        "pattern": [
            {"LOWER": "at", "OP": "?"},
            {"LOWER": "least", "OP": "?"},
            {"IS_DIGIT": True},
            # REGEX matches anywhere in the token, so anchor with ^ and $.
            {"LOWER": {"REGEX": "^stars?$"}},
            {"LOWER": {"REGEX": "^rat(ed|ing|ings)$"}, "OP": "?"},
        ],
    },
]
ruler.add_patterns(patterns)

doc = nlp("find me a restaurant with at least 3 stars")
# The SpanRuler writes to doc.spans["ruler"] by default, not to doc.ents.
print([(span.text, span.label_) for span in doc.spans["ruler"]])
# displacy.render(doc, style="span", options={"spans_key": "ruler"}, jupyter=True)
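
The `spacy.first_longest_spans_filter.v1` setting used above decides which candidate survives when several matches overlap, which happens here because the optional "at" and "least" tokens let the same pattern match "3 stars", "least 3 stars", and "at least 3 stars". The sketch below imitates that selection in plain Python on `(start, end)` token offsets. It illustrates the idea only and is not spaCy's implementation; `first_longest` and the candidate list are hypothetical, and the tokens come from a simple whitespace split.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive


def first_longest(spans: List[Span]) -> List[Span]:
    """Keep the longest spans, preferring the earliest on ties, and drop overlaps."""
    # Longest first; among equal lengths, the span that starts earlier wins.
    ordered = sorted(spans, key=lambda s: (s[1] - s[0], -s[0]), reverse=True)
    kept, seen = [], set()
    for start, end in ordered:
        # Skip a candidate if any of its tokens is already covered.
        if seen.isdisjoint(range(start, end)):
            kept.append((start, end))
            seen.update(range(start, end))
    return sorted(kept)


tokens = "find me a restaurant with at least 3 stars".split()
# Overlapping candidates that the optional "at"/"least" tokens could produce.
candidates = [(7, 9), (6, 9), (5, 9)]
for start, end in first_longest(candidates):
    print(" ".join(tokens[start:end]))
```

Only the longest candidate, "at least 3 stars", is kept; the two shorter ones are dropped because they share tokens with it.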

**Time to write your own rules!** See if you can write a pattern that matches both "less than 4 miles" and "less than 1 mile from here".

In [None]:
text_1 = "find me a chinese restaurant less than 4 miles"
text_2 = "where is a good indian restaurant less than 1 mile from here"

nlp = spacy.blank("en")

ruler = nlp.add_pipe("span_ruler", config={'spans_filter': {'@misc': 'spacy.first_longest_spans_filter.v1'}})
patterns = [{ 
   "label": "Location", 
   "pattern": [
          ...
    ], 
},]
ruler.add_patterns(patterns)

doc_1 = nlp(text_1)
print([(span.text, span.label_) for span in doc_1.spans["ruler"]])

doc_2 = nlp(text_2)
print([(span.text, span.label_) for span in doc_2.spans["ruler"]])

We can now combine our trained NER model with the SpanRuler. The rules we've written are in `scripts/rules_review.py`.

If you're ever unsure what a command does, add `--help` to the end of it.

In [None]:
!python -m spacy project run download

In [None]:
!python -m spacy project run prodigy-convert

In [None]:
!python -m spacy project run assemble-review

In [None]:
!python -m spacy project run evaluate-review

In [None]:
text = "where is the closest sushi bars to my zip code"

nlp = spacy.load("/content/spanruler_restaurant_reviews/models/ner_ruler_review")
doc = nlp(text)

displacy.render(doc, style="ent", jupyter=True)