# Wine Reviews Search Engine

Enter a query to see how well the search performs. Some samples to try:

- lots of tannins leading to a harsh, puckery feel in the mouth
- shiraz fruity plum
- fruity chardonnay with cherry flavors
- sweet citrus chardonnay
- dessert wine

In [1]:
# Setup search engine

import build.constants as C
import nmslib
import pandas as pd
import sqlite3 as sql
import time

from sentence_transformers import SentenceTransformer

# perf_counter() measures wall-clock time; process_time() counts only CPU time,
# so it would not include time spent waiting on disk reads.
start = time.perf_counter()
print(f"LOADING NMS index from {C.NMS_INDEX1}...")
index = nmslib.init(method="hnsw", space="cosinesimil")
index.loadIndex(C.NMS_INDEX1)

print(f"LOADING sentence transformer {C.SENTENCE_TRANSFORMER_MODEL_NAME}...")
model = SentenceTransformer(C.SENTENCE_TRANSFORMER_MODEL_NAME)

print(f"LOADING dataset from {C.SQLITE_DATASET} sqlite file...")
with sql.connect(C.SQLITE_DATASET) as c:
    df = pd.read_sql("select * from wine", c)
end = time.perf_counter()
print(f"INIT completed in {end-start:.2f} seconds")

def search(df, query: str) -> None:
    """Print the 20 reviews nearest to `query` in embedding space."""
    start = time.perf_counter()  # wall-clock timer for the query itself
    query_embeddings = model.encode(query, convert_to_tensor=True).cpu()
    ids, distances = index.knnQuery(query_embeddings, k=20)
    end = time.perf_counter()
    print(f"SEARCHED {df.shape[0]} reviews of {df.title.nunique()} wines "
          f"from {df.winery.nunique()} wineries in {(end-start)*1000:.2f}ms\n")

    # TODO: better Jupyter output
    # The ids returned by the index are used as row positions in df.
    for i, distance in zip(ids, distances):
        print(f"NAME: {df.winery.values[i]} {df.title.values[i]} "
              f"({df.country.values[i]})\n"
              f"REVIEW: {df.description.values[i]}\n"
              f"RANK: {df.points.values[i]} "
              f"DISTANCE: {distance:.2f}")

Your CPU supports instructions that this binary was not compiled to use: SSE3 SSE4.1 SSE4.2 AVX AVX2
For maximum performance, you can install NMSLIB from sources 
pip install --no-binary :all: nmslib


LOADING NMS index from /data/index.bin...
LOADING sentence transformer msmarco-distilbert-base-v4...
LOADING dataset from /data/wine.db sqlite file...
INIT completed in 2.61 seconds
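
The index is created with `space="cosinesimil"`, so the `DISTANCE` printed for each match should be a cosine distance, which as far as I understand NMSLIB's convention is one minus the cosine similarity. The sketch below shows that measure and an exact, brute-force version of the nearest-neighbour lookup that the HNSW index approximates. It is only an illustration: `cosine_distance` and `brute_force_knn` are hypothetical helper names, and the toy vectors are made up rather than taken from the wine dataset.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def brute_force_knn(query, vectors, k):
    """Exact k-nearest-neighbour search: score every vector, keep the k closest."""
    scored = [(cosine_distance(query, v), i) for i, v in enumerate(vectors)]
    scored.sort()  # smallest distance first
    ids = [i for _, i in scored[:k]]
    distances = [d for d, _ in scored[:k]]
    return ids, distances


# Toy 2-D "embeddings", made up for illustration only.
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
ids, distances = brute_force_knn([1.0, 0.0], vectors, k=2)
print(ids, [round(d, 2) for d in distances])
```

Under this definition a distance of 0 means the two vectors point in the same direction and 2 means they point in opposite directions, so smaller values indicate closer matches. Scoring every vector is linear in the number of reviews, which is the cost an approximate index such as HNSW is meant to avoid.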


In [5]:
search(df, "merlot cherry notes")

SEARCHED 100261 reviews of 99388 wines from 14975 wineries in 11.06ms

NAME: Harbes Family Vineyard Harbes Family Vineyard 2015 Dry Rosé (North Fork of Long Island) (US)
REVIEW: While tart red-cherry and red-apple notes are pleasant, there's an earthiness that persists throughout this dry Merlot rosé. It's light in body with a tangy finish.
RANK: 85 DISTANCE: 0.38
NAME: Le Petit Cochonnet Le Petit Cochonnet 2015 Merlot (Pays d'Oc) (France)
REVIEW: A very drinkable, light and easy to like Merlot, this offers ripe notes of black plum, cherry and berry on the nose and mouth, with a faint hint of milk chocolate in the background. The soft mouthfeel and fruity palate finish short, but clean.
RANK: 84 DISTANCE: 0.39
NAME: One Woman One Woman 2012 Estate Reserve Merlot (North Fork of Long Island) (US)
REVIEW: Bramble, violet and leather notes lend complexity to this fruity but elegantly composed Merlot. Ripe plum and cherry flavors are plump but pristine, brightened by crisp acidity and a smo