# 1. Import libraries


In [1]:
import pandas as pd
from qdrant_client import models, QdrantClient
from sentence_transformers import SentenceTransformer

# 2. Setup Authoritative Knowledge Base

Next, we need to set up the domain-specific (authoritative) knowledge base that will augment the LLM. This requires computing embeddings for the data and storing them in a vector database.

## Load and Inspect the data

In [2]:
df = pd.read_csv("data/top_wines.csv")
df.head()

|   | name | region | variety | rating | notes |
|---|------|--------|---------|--------|-------|
| 0 | 3 Rings Reserve Shiraz 2004 | Barossa Valley, Barossa, South Australia, Aust... | Red Wine | 96.0 | Vintage Comments : Classic Barossa vintage con... |
| 1 | Abreu Vineyards Cappella 2007 | Napa Valley, California | Red Wine | 96.0 | Cappella is a proprietary blend of two clones ... |
| 2 | Abreu Vineyards Cappella 2010 | Napa Valley, California | Red Wine | 98.0 | Cappella is one of the oldest vineyard sites i... |
| 3 | Abreu Vineyards Howell Mountain 2008 | Howell Mountain, Napa Valley, California | Red Wine | 96.0 | When David purchased this Howell Mountain prop... |
| 4 | Abreu Vineyards Howell Mountain 2009 | Howell Mountain, Napa Valley, California | Red Wine | 98.0 | As a set of wines, it is hard to surpass the f... |


In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1365 entries, 0 to 1364
Data columns (total 5 columns):
 #   Column   Non-Null Count  Dtype  
---  ------   --------------  -----  
 0   name     1365 non-null   object 
 1   region   1364 non-null   object 
 2   variety  1347 non-null   object 
 3   rating   1365 non-null   float64
 4   notes    1365 non-null   object 
dtypes: float64(1), object(4)
memory usage: 53.4+ KB


## Clean up Data

In [4]:
# Drop rows with a missing region or variety
df = df.dropna(subset=['region', 'variety'])

In [5]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 1347 entries, 0 to 1364
Data columns (total 5 columns):
 #   Column   Non-Null Count  Dtype  
---  ------   --------------  -----  
 0   name     1347 non-null   object 
 1   region   1347 non-null   object 
 2   variety  1347 non-null   object 
 3   rating   1347 non-null   float64
 4   notes    1347 non-null   object 
dtypes: float64(1), object(4)
memory usage: 63.1+ KB


## Create Records

Transform each wine row into a data record (i.e., a dictionary):

In [6]:
data = df.to_dict("records")

# Show what the first record looks like...
for k, v in data[0].items():
    print(f'{k}: {v}')

name: 3 Rings Reserve Shiraz 2004
region: Barossa Valley, Barossa, South Australia, Australia
variety: Red Wine
rating: 96.0
notes: Vintage Comments : Classic Barossa vintage conditions. An average wet Spring followed by extreme heat in early February. Occasional rainfall events kept the vines in good balance up to harvest in late March 2004. Very good quality coupled with good average yields. More than 30 months in wood followed by six months tank maturation of the blend prior to bottling, July 2007. 


## Vectorize Data

First, create an encoder that generates embeddings from the wine notes:

In [7]:
encoder = SentenceTransformer('all-MiniLM-L6-v2')  # Pretrained sentence-embedding model

Next, create an in-memory Qdrant database instance to store vectorized data:

In [8]:
qdrant = QdrantClient(":memory:")

In [9]:
# Create collection to store wines
qdrant.create_collection(
    collection_name="top_wines",
    vectors_config=models.VectorParams(
        size=encoder.get_sentence_embedding_dimension(),  # Vector size is determined by the embedding model
        distance=models.Distance.COSINE
    )
)

True
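To make the `Distance.COSINE` setting concrete, here is a minimal pure-Python sketch of cosine similarity, the measure behind the scores returned by the searches later in this notebook. The function name and the toy vectors are illustrative assumptions; Qdrant computes this internally and far more efficiently.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy 3-dimensional vectors (embeddings from all-MiniLM-L6-v2 have 384 dimensions)
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
print(same_direction, orthogonal)
```

Vectors pointing in the same direction score close to 1 regardless of their length, while unrelated (orthogonal) vectors score 0.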

Vectorize the wine notes and associate each vector with its corresponding wine record (i.e., the payload). Get a coffee; this may take a while:

In [10]:
qdrant.upload_points(
    collection_name="top_wines",
    points=[
        models.PointStruct(
            id=idx,
            vector=encoder.encode(record["notes"]).tolist(),
            payload=record,
        ) for idx, record in enumerate(data)
    ]
)
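The cell above encodes one note per call. `SentenceTransformer.encode` also accepts a list of strings, so one possible speed-up is to encode the notes in chunks. The sketch below shows that chunking pattern in isolation; `encode_in_batches` and `fake_encode` are hypothetical names, and the stand-in encoder exists only so the example runs without downloading a model.

```python
def encode_in_batches(texts, encode_fn, batch_size=64):
    """Encode texts in chunks; encode_fn maps a list of strings to a sequence of vectors."""
    vectors = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        for vec in encode_fn(batch):
            # NumPy rows expose .tolist(); plain sequences are copied with list()
            vectors.append(vec.tolist() if hasattr(vec, "tolist") else list(vec))
    return vectors

# Stand-in encoder for illustration only: maps each text to [length, word count]
def fake_encode(batch):
    return [[float(len(t)), float(len(t.split()))] for t in batch]

notes = ["dark cherry and oak", "crisp citrus", "smoke"]
vectors = encode_in_batches(notes, fake_encode, batch_size=2)
print(vectors)
```

With the notebook's objects, this would presumably be called as `encode_in_batches([r["notes"] for r in data], encoder.encode)`, and the resulting vectors paired with `data` by index when building the points.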

# 3. Demonstrate RAG in action



We need to run the LLM locally as a web server; by default, it listens at http://127.0.0.1:8080. Make sure the LLM web server is running. If you haven't already started it, open a terminal and run the following script:

```
./llm_server.py
```
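Before continuing, it can help to confirm that something is actually listening on that address. The following is a minimal sketch using only the standard library; `server_is_listening` is a hypothetical helper, and it only checks that the TCP port accepts connections, not that the LLM itself is ready to answer.

```python
import socket
from urllib.parse import urlparse

def server_is_listening(url, timeout=2.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        return False

if not server_is_listening("http://127.0.0.1:8080"):
    print("LLM server is not reachable; start ./llm_server.py first.")
```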

At this point, we are ready to take user input, use it to retrieve relevant records from our authoritative knowledge base, and pass the augmented prompt to the LLM. Start with the user prompt:

In [12]:
user_prompt = "Suggest me an amazing Malbec wine from Argentina"

Before engaging the LLM, search the vector database using the user's prompt as the query:

In [13]:
hits = qdrant.search(
    collection_name="top_wines",
    query_vector=encoder.encode(user_prompt).tolist(),
    limit=3
)
for hit in hits:
    print("score:", hit.score, "payload:", hit.payload)

score: 0.6377782412175261 payload: {'name': 'Catena Zapata Argentino Vineyard Malbec 2004', 'region': 'Argentina', 'variety': 'Red Wine', 'rating': 98.0, 'notes': '"The single-vineyard 2004 Malbec Argentino Vineyard spent 17 months in new French oak. Remarkably fragrant and complex aromatically, it offers up aromas of wood smoke, creosote, pepper, clove, black cherry, and blackberry. Made in a similar, elegant style, it is the most structured of the three single vineyard wines, needing a minimum of a decade of additional cellaring. It should easily prove to be a 25-40 year wine. It is an exceptional achievement in Malbec. When all is said and done, Catena Zapata is the Argentina winery of reference – the standard of excellence for comparing all others. The brilliant, forward-thinking Nicolas Catena remains in charge, with his daughter, Laura, playing an increasingly large role. The Catena Zapata winery is an essential destination for fans of both architecture and wine in Mendoza. It is

Extract search results to be passed to the LLM:

In [14]:
search_results = [hit.payload for hit in hits]
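The payloads are plain dictionaries, so they can be rendered as text before being handed to the LLM. As a hedged alternative to calling `str()` on the whole list, the sketch below formats each record on one line and trims long tasting notes to keep the prompt short. `format_context`, its `max_note_chars` parameter, and the sample records are assumptions for illustration; only the keys mirror the notebook's payloads.

```python
def format_context(records, max_note_chars=300):
    """Render wine records as a numbered plain-text list, trimming long notes."""
    lines = []
    for i, rec in enumerate(records, start=1):
        notes = rec.get("notes", "")
        if len(notes) > max_note_chars:
            notes = notes[:max_note_chars].rstrip() + "..."
        lines.append(
            f"{i}. {rec.get('name', 'Unknown')} ({rec.get('region', 'Unknown')}), "
            f"rated {rec.get('rating', 'n/a')}: {notes}"
        )
    return "\n".join(lines)

# Hypothetical records with the same keys as the notebook's payloads
sample = [
    {"name": "Wine A", "region": "Argentina", "rating": 98.0, "notes": "Smoky and dense."},
    {"name": "Wine B", "region": "Argentina", "rating": 96.0, "notes": "x" * 500},
]
print(format_context(sample, max_note_chars=20))
```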

Finally, send the augmented prompt to the local LLM using the OpenAI Python client:

In [16]:
llm_url = 'http://127.0.0.1:8080/v1'

from openai import OpenAI
client = OpenAI(
    base_url=llm_url,
    api_key="sk-no-key-required",  # Placeholder; the OpenAI client requires a value to be set
)
completion = client.chat.completions.create(
    model="LLaMA_CPP",
    messages=[
        {"role": "system", "content": "You are chatbot, a wine specialist. Your top priority is to help guide users into selecting amazing wine and guide them with their requests."},
        {"role": "user", "content": user_prompt},
        {"role": "assistant", "content": str(search_results)}
    ]
)
print(completion.choices[0].message)

ChatCompletionMessage(content="Sure, I can suggest you an amazing Malbec wine from Argentina, which is the Catena Zapata Adrianna Vineyard Malbec 2004. It is a single-vineyard wine from the Gualtallary district, which is known for producing some of the finest Malbecs in Argentina. The wine is inky purple with aromas of wood smoke, pencil lead, game, black cherry, and blackberry liqueur. It is full-flavored, yet remarkably light on its feet, with a medium to full-bodied structure. It is a fine test of one's ability to defer immediate gratification, and when all is said and done, Catena Zapata is the Argentina winery of reference – the standard of excellence for comparing all others. The brilliant, forward-thinking Nicolas Catena remains in charge, with his daughter, Laura, playing an increasingly large role. The Catena Zapata winery is an essential destination for fans of both architecture and wine in Mendoza. It is hard to believe, given the surge in popularity of Malbec in recent year
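The cell above passes the search results as an `assistant` message. Another common arrangement, shown here only as a sketch and not as this notebook's method, is to place the retrieved records inside the user turn together with an instruction to rely on them. `build_messages` and the instruction wording are assumptions; whether this improves answers depends on the model being served.

```python
def build_messages(system_prompt, user_prompt, context):
    """Return chat messages with retrieved context placed inside the user turn."""
    augmented = (
        "Use the following wine records to answer the request.\n\n"
        f"Records:\n{context}\n\n"
        f"Request: {user_prompt}"
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": augmented},
    ]

messages = build_messages(
    "You are a wine specialist.",
    "Suggest me an amazing Malbec wine from Argentina",
    "1. Example Malbec (Argentina), rated 98.0",  # placeholder context
)
print(messages[1]["content"])
```

The returned list has the same shape as the `messages` argument of `client.chat.completions.create`, so it could be passed there directly.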

## Clean Up

When you're done, stop the LLM web server launched earlier: go back to the terminal and press Ctrl-C.