<a href="https://colab.research.google.com/github/jsandino/wine-rag/blob/main/wine_rag.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 1. Set Up Environment

First, set up the environment: install the third-party dependencies, import the libraries, and download the model.

## Install Dependencies

In [None]:
!pip install qdrant-client==1.12.1
!pip install sentence-transformers==3.3.1
!pip install openai==1.11.1

## Import Libraries

In [6]:
import pandas as pd
from qdrant_client import models, QdrantClient
from sentence_transformers import SentenceTransformer

## Download the LLM

In [7]:
from pathlib import Path
import urllib.request

def download_llm():
  llm_path = Path("llm/mxbai-embed-large-v1-f16.llamafile")
  if not llm_path.is_file():
    Path("llm").mkdir(parents=True, exist_ok=True)
    url = "https://huggingface.co/Mozilla/mxbai-embed-large-v1-llamafile/resolve/main/mxbai-embed-large-v1-f16.llamafile"
    urllib.request.urlretrieve(url, llm_path)

download_llm()

# 2. Set Up Authoritative Knowledge Base

Next, set up the domain-specific (i.e., authoritative) knowledge base that will augment the LLM. This means computing an embedding for each document and storing the embeddings in a vector database.

## Load and Inspect the data

In [39]:
def load_wines():
  wine_path = Path("top_wines.csv")
  if not wine_path.is_file():
    url = "https://raw.githubusercontent.com/jsandino/wine-rag/refs/heads/main/data/top_wines.csv"
    urllib.request.urlretrieve(url, wine_path)
  return pd.read_csv(wine_path)

df = load_wines()
df.head()

|  | name | region | variety | rating | notes |
|---|---|---|---|---|---|
| 0 | 3 Rings Reserve Shiraz 2004 | Barossa Valley, Barossa, South Australia, Aust... | Red Wine | 96.0 | Vintage Comments : Classic Barossa vintage con... |
| 1 | Abreu Vineyards Cappella 2007 | Napa Valley, California | Red Wine | 96.0 | Cappella is a proprietary blend of two clones ... |
| 2 | Abreu Vineyards Cappella 2010 | Napa Valley, California | Red Wine | 98.0 | Cappella is one of the oldest vineyard sites i... |
| 3 | Abreu Vineyards Howell Mountain 2008 | Howell Mountain, Napa Valley, California | Red Wine | 96.0 | When David purchased this Howell Mountain prop... |
| 4 | Abreu Vineyards Howell Mountain 2009 | Howell Mountain, Napa Valley, California | Red Wine | 98.0 | As a set of wines, it is hard to surpass the f... |


In [36]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 1347 entries, 0 to 1364
Data columns (total 5 columns):
 #   Column   Non-Null Count  Dtype  
---  ------   --------------  -----  
 0   name     1347 non-null   object 
 1   region   1347 non-null   object 
 2   variety  1347 non-null   object 
 3   rating   1347 non-null   float64
 4   notes    1347 non-null   object 
dtypes: float64(1), object(4)
memory usage: 63.1+ KB


## Clean up Data

In [23]:
# Remove NA entries...
df = df[df['region'].notna()]
df = df[df['variety'].notna()]

In [18]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 1347 entries, 0 to 1364
Data columns (total 5 columns):
 #   Column   Non-Null Count  Dtype  
---  ------   --------------  -----  
 0   name     1347 non-null   object 
 1   region   1347 non-null   object 
 2   variety  1347 non-null   object 
 3   rating   1347 non-null   float64
 4   notes    1347 non-null   object 
dtypes: float64(1), object(4)
memory usage: 63.1+ KB


## Create Records

Transform each wine row into a record (i.e., a dictionary):

In [29]:
data = df.to_dict("records")

# Show what the first record looks like...
for k, v in data[0].items():
  print(f'{k}: {v}')

name: 3 Rings Reserve Shiraz 2004
region: Barossa Valley, Barossa, South Australia, Australia
variety: Red Wine
rating: 96.0
notes: Vintage Comments : Classic Barossa vintage conditions. An average wet Spring followed by extreme heat in early February. Occasional rainfall events kept the vines in good balance up to harvest in late March 2004. Very good quality coupled with good average yields. More than 30 months in wood followed by six months tank maturation of the blend prior to bottling, July 2007. 


## Vectorize Data

First, create an encoder that turns the wine notes into embeddings:

In [22]:
encoder = SentenceTransformer('all-MiniLM-L6-v2')  # Pretrained sentence-embedding model

Next, create an in-memory Qdrant database instance to store vectorized data:

In [20]:
qdrant = QdrantClient(":memory:")

In [None]:
# Create collection to store wines
qdrant.recreate_collection(
    collection_name="top_wines",
    vectors_config=models.VectorParams(
        size=encoder.get_sentence_embedding_dimension(),  # Vector size is determined by the embedding model
        distance=models.Distance.COSINE
    )
)
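
The collection above is configured with `Distance.COSINE`, so search results are ranked by the cosine of the angle between the query vector and each stored vector. The following is a minimal, dependency-free sketch of that metric, not Qdrant's actual implementation; the three-dimensional vectors and their labels are made up for illustration.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy "embeddings" (hypothetical values, far smaller than real model output)
query = [1.0, 0.0, 1.0]
candidates = {
    "same direction": [2.0, 0.0, 2.0],
    "partly related": [1.0, 1.0, 0.0],
    "unrelated": [0.0, 1.0, 0.0],
}

# Rank candidates from most to least similar to the query
ranked = sorted(
    candidates,
    key=lambda name: cosine_similarity(query, candidates[name]),
    reverse=True,
)
print(ranked)
```

Because the metric depends only on direction, the vector `[2.0, 0.0, 2.0]` scores the same as the query itself even though its magnitude differs.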

Vectorize the wine notes, and associate each vector with its corresponding wine record (i.e., the payload):

In [30]:
qdrant.upload_points(
    collection_name="top_wines",
    points=[
        models.PointStruct(
            id=idx,
            vector=encoder.encode(record["notes"]).tolist(),
            payload=record,
        ) for idx, record in enumerate(data)
    ]
)

# 3. Demonstrate RAG in action



The LLM must run locally as a server listening at http://127.0.0.1:8081:

In [56]:
# Uncomment the lines below to print a proxied URL for testing the LLM's web interface in a browser tab

# from google.colab.output import eval_js
# print(eval_js("google.colab.kernel.proxyPort(8081)"))

https://3r473mxaocy-496ff2e9c6d22116-8081-colab.googleusercontent.com/


In [None]:
# Known issue: the LLM launches successfully, but this cell then blocks execution of all other cells.

!chmod +x llm/mxbai-embed-large-v1-f16.llamafile
!llm/mxbai-embed-large-v1-f16.llamafile &
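
One possible workaround for the blocking cell, offered as an untested sketch rather than a confirmed fix, is to launch the server from Python with `subprocess.Popen`, redirect its output to a log file, and then poll the port until it accepts connections. The helper names `start_in_background` and `wait_for_port` and the default log file name `server.log` are hypothetical.

```python
import socket
import subprocess
import time

def start_in_background(command, log_path="server.log"):
    """Launch a shell command without blocking; its output goes to a log file."""
    with open(log_path, "w") as log_file:
        # The child keeps its own copy of the file descriptor after we close ours.
        process = subprocess.Popen(
            command, shell=True, stdout=log_file, stderr=subprocess.STDOUT
        )
    return process

def wait_for_port(host, port, timeout=60.0):
    """Poll until a TCP port accepts connections; return False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)
    return False
```

In this notebook, the calls would be along the lines of `server = start_in_background("llm/mxbai-embed-large-v1-f16.llamafile")` followed by `wait_for_port("127.0.0.1", 8081)`, assuming the llamafile has been made executable and actually listens on port 8081; any command-line options needed to select that port should be checked against the llamafile's own help output. Calling `server.terminate()` stops the server when it is no longer needed.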

At this point, we are ready to take user input, use it to retrieve relevant text from the authoritative knowledge base, and pass the augmented prompt to the LLM. Start with the user prompt:

In [40]:
user_prompt = "Suggest me an amazing Malbec wine from Argentina"

Before engaging the LLM, search the vector database using the user's prompt as search criteria:

In [44]:
hits = qdrant.search(
    collection_name="top_wines",
    query_vector=encoder.encode(user_prompt).tolist(),
    limit=3
)
for hit in hits:
  print("score:", hit.score, "payload:", hit.payload)

score: 0.6377782347562875 payload: {'name': 'Catena Zapata Argentino Vineyard Malbec 2004', 'region': 'Argentina', 'variety': 'Red Wine', 'rating': 98.0, 'notes': '"The single-vineyard 2004 Malbec Argentino Vineyard spent 17 months in new French oak. Remarkably fragrant and complex aromatically, it offers up aromas of wood smoke, creosote, pepper, clove, black cherry, and blackberry. Made in a similar, elegant style, it is the most structured of the three single vineyard wines, needing a minimum of a decade of additional cellaring. It should easily prove to be a 25-40 year wine. It is an exceptional achievement in Malbec. When all is said and done, Catena Zapata is the Argentina winery of reference – the standard of excellence for comparing all others. The brilliant, forward-thinking Nicolas Catena remains in charge, with his daughter, Laura, playing an increasingly large role. The Catena Zapata winery is an essential destination for fans of both architecture and wine in Mendoza. It is

Extract search results to be passed to the LLM:

In [53]:
search_results = [hit.payload for hit in hits]
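
The payloads are plain dictionaries, so they can also be rendered as readable text before being handed to the LLM. The sketch below shows one way to do that; `format_context`, its `max_notes_chars` parameter, and the sample record are hypothetical and not part of the original notebook.

```python
def format_context(records, max_notes_chars=400):
    """Render retrieved wine records as a numbered plain-text list."""
    lines = []
    for i, rec in enumerate(records, start=1):
        notes = str(rec.get("notes", ""))
        # Trim long tasting notes so the prompt stays compact
        if len(notes) > max_notes_chars:
            notes = notes[:max_notes_chars].rstrip() + "..."
        lines.append(
            f"{i}. {rec.get('name', 'unknown')} "
            f"({rec.get('region', 'unknown region')}), "
            f"rated {rec.get('rating', 'n/a')}: {notes}"
        )
    return "\n".join(lines)

# Made-up record with the same keys as the notebook's payloads
sample = [
    {
        "name": "Example Malbec 2010",
        "region": "Argentina",
        "variety": "Red Wine",
        "rating": 95.0,
        "notes": "Dark fruit and spice.",
    }
]
print(format_context(sample))
```

Applied to the real results, `format_context(search_results)` would produce one numbered line per hit, which tends to be easier for a model to read than the raw `str()` of a list of dictionaries.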

Finally, send the augmented prompt to the local LLM through the OpenAI Python client:

In [None]:
from openai import OpenAI
client = OpenAI(
    # llamafile exposes its OpenAI-compatible endpoints under the /v1 path
    base_url="http://127.0.0.1:8081/v1",
    api_key="sk-no-key-required"
)
completion = client.chat.completions.create(
    model="LLaMA_CPP",
    messages=[
        {"role": "system", "content": "You are a chatbot and wine specialist. Your top priority is to help users select amazing wines and to guide them with their requests."},
        {"role": "user", "content": user_prompt},
        {"role": "assistant", "content": str(search_results)}
    ]
)
print(completion.choices[0].message)
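
The cell above passes the retrieved records to the model as an assistant message. One alternative, sketched here as an assumption rather than as the notebook's method, is to place the retrieved context in the system message so the model treats it as reference material. The helper name `build_messages` and the instruction wording are hypothetical.

```python
def build_messages(system_prompt, user_prompt, context):
    """Build a chat message list with retrieved context in the system message."""
    grounded_system = (
        f"{system_prompt}\n\n"
        "Base your recommendations on the following wine records:\n"
        f"{context}"
    )
    return [
        {"role": "system", "content": grounded_system},
        {"role": "user", "content": user_prompt},
    ]

# Illustrative values only; in the notebook the context would come from the search results
messages = build_messages(
    "You are a wine specialist.",
    "Suggest me an amazing Malbec wine from Argentina",
    "1. Example Malbec 2010 (Argentina), rated 95.0",
)
for message in messages:
    print(message["role"], "->", message["content"])
```

The returned list can be passed as the `messages` argument of `client.chat.completions.create`. Whether this layout produces better answers than the assistant-message approach depends on the model and is worth comparing empirically.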