# Wine Reviews Search Engine

Try a few sample queries, such as the ones below, to see how well the search performs:

- lots of tannins leading to a harsh, puckery feel in the mouth
- shiraz fruity plum
- fruity chardonnay with cherry flavors
- sweet citrus chardonnay
- dessert wine

In [1]:
# Set up the search engine

import build.constants as C
import nmslib
import pandas as pd
import sqlite3 as sql
import time

from sentence_transformers import SentenceTransformer

# perf_counter() measures elapsed wall-clock time; process_time() only counts CPU time
start = time.perf_counter()
print(f"LOADING NMS index from {C.NMS_INDEX1}...")
index = nmslib.init(method="hnsw", space="cosinesimil")
index.loadIndex(C.NMS_INDEX1)

print(f"LOADING sentence transformer {C.SENTENCE_TRANSFORMER_MODEL_NAME}...")
model = SentenceTransformer(C.SENTENCE_TRANSFORMER_MODEL_NAME)

print(f"LOADING dataset from {C.SQLITE_DATASET} sqlite file...")
with sql.connect(C.SQLITE_DATASET) as c:
    df = pd.read_sql("select * from wine", c)
end = time.perf_counter()
print(f"INIT completed in {end-start:.2f} seconds")

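def search(df, query: str) -> None:
    """Print the 20 reviews whose embeddings are nearest to the query."""
    # perf_counter() measures elapsed wall-clock time, which is what the
    # latency message below reports; process_time() only counts CPU time.
    start = time.perf_counter()
    # encode() returns a NumPy array by default, which knnQuery accepts directly
    query_embedding = model.encode(query)
    ids, distances = index.knnQuery(query_embedding, k=20)
    end = time.perf_counter()
    print(f"SEARCHED {df.shape[0]} reviews of {df.title.nunique()} wines "
          f"from {df.winery.nunique()} wineries in {(end-start)*1000:.2f}ms\n")

    # TODO: better Jupyter output
    # ids are row positions in df; distances are cosine distances (lower is closer)
    for i, distance in zip(ids, distances):
        print(f"NAME: {df.winery.values[i]} {df.title.values[i]} "
              f"({df.country.values[i]})\n"
              f"REVIEW: {df.description.values[i]}\n"
              f"RANK: {df.points.values[i]} "
              f"DISTANCE: {distance:.2f}")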

Your CPU supports instructions that this binary was not compiled to use: SSE3 SSE4.1 SSE4.2 AVX AVX2
For maximum performance, you can install NMSLIB from sources 
pip install --no-binary :all: nmslib


LOADING NMS index from /data/index.bin...
LOADING sentence transformer msmarco-distilbert-base-v4...
LOADING dataset from /data/wine.db sqlite file...
INIT completed in 2.76 seconds
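
The index is created with `space="cosinesimil"`, so the `DISTANCE` values that `search` prints should be cosine distances: 1 minus the cosine similarity, where lower means a closer match. To make that concrete, here is a minimal brute-force sketch of the same kind of lookup in NumPy. The toy vectors and the helper names `cosine_distances` and `brute_force_knn` are made up for illustration and are not part of the notebook or of NMSLIB; the HNSW index returns approximate neighbours, so its ranking may differ slightly from an exact scan like this one.

```python
import numpy as np


def cosine_distances(query: np.ndarray, vectors: np.ndarray) -> np.ndarray:
    """Return 1 - cosine similarity between `query` and each row of `vectors`."""
    query_unit = query / np.linalg.norm(query)
    vectors_unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    return 1.0 - vectors_unit @ query_unit


def brute_force_knn(query: np.ndarray, vectors: np.ndarray, k: int):
    """Exact k-nearest-neighbour scan (hypothetical stand-in for an index query)."""
    distances = cosine_distances(query, vectors)
    ids = np.argsort(distances)[:k]  # smallest distance first
    return ids, distances[ids]


# Toy 2-D "embeddings" (invented data, not real sentence embeddings)
toy_vectors = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
toy_query = np.array([1.0, 0.0])

ids, distances = brute_force_knn(toy_query, toy_vectors, k=2)
for i, d in zip(ids, distances):
    print(f"row {i}: distance {d:.2f}")
```

A distance of 0 means the vectors point in the same direction, and 1 means they are orthogonal.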


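The `TODO` in `search` asks for better Jupyter output. One possible approach, sketched here under assumptions, is to collect the matches into a DataFrame so the notebook renders them as a table instead of printed text. `matches_to_frame` is a hypothetical helper, and the toy frame only mimics the column names the notebook reads (`winery`, `title`, `country`, `description`, `points`); its values are placeholders, not rows from the dataset.

```python
import pandas as pd


def matches_to_frame(df: pd.DataFrame, ids, distances) -> pd.DataFrame:
    """Select matched rows by position and attach each match's distance."""
    columns = ["winery", "title", "country", "description", "points"]
    # ids are row positions (as used with `.values[i]`), hence iloc
    matches = df.iloc[list(ids)][columns].copy()
    matches["distance"] = list(distances)
    return matches.reset_index(drop=True)


# Placeholder data standing in for the real `wine` table
toy_df = pd.DataFrame({
    "winery": ["Winery A", "Winery B", "Winery C"],
    "title": ["Wine A", "Wine B", "Wine C"],
    "country": ["Country A", "Country B", "Country C"],
    "description": ["Review A", "Review B", "Review C"],
    "points": [86, 89, 84],
})

toy_matches = matches_to_frame(toy_df, ids=[2, 0], distances=[0.10, 0.25])
print(toy_matches)
```

In a notebook, leaving `toy_matches` as the last expression in a cell would display it as an HTML table.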
In [4]:
search(df, "fruity desert wine")

SEARCHED 100261 reviews of 99388 wines from 14975 wineries in 10.12ms

NAME: Castello del Poggio Castello del Poggio NV Moscato (Asti) (Italy)
REVIEW: A delightful dessert wine, it carries a fruity fragrance of peach and tropical fruit. The frothy palate shows sweet peach and green melon accented with hints of sage. It finishes on a refreshing note. Pair this with sorbet or fruit tarts.
RANK: 86 DISTANCE: 0.40
NAME: Dr. Leimbrock Dr. Leimbrock 2015 Brauneberger Juffer Spätlese Riesling (Mosel) (Germany)
REVIEW: Exotic notes of saffron and dusty pollen mingle into ripe flavors of melon, peach and guava in this plush spicy wine. Medium sweet in style and softly textured, its lushness is offset by sprightly acidity and a touch of earthiness on the finish. Drink now through 2020.
RANK: 89 DISTANCE: 0.42
NAME: Rock Wall Rock Wall 2009 Westphall Ridge Vineyard Rockpile Road Zinfandel (Sonoma County) (US)
REVIEW: A somewhat rustic wine, edgy in tannins and not quite ripe yet jammy at the same