# Simple RAG Implementation

Based on [Alfredo Deza's GitHub Repository](https://github.com/alfredodeza/learn-retrieval-augmented-generation).

In this notebook we will build a simple RAG application based on a structured CSV file of wine ratings. We will:
* Load the small dataset.
* Encode one of the columns as vector embeddings.
* **R**etrieve some of the rows based on a query using semantic similarity.
* **A**ugment the prompt to the LLM with the retrieved data.
* **G**enerate a reply to the user's query based on the retrieved rows.

## Loading the Dataset

Since the data is in a simple, small and structured CSV file, we can load it using Pandas.

In [1]:
import warnings

# Suppress warnings
warnings.filterwarnings('ignore')


In [2]:
import pandas as pd

data = (
    pd
    .read_csv('data/top_rated_wines.csv')
    .query('variety.notna()')
    .reset_index(drop=True)
    .to_dict('records')
)
data[:2]

[{'name': '3 Rings Reserve Shiraz 2004',
  'region': 'Barossa Valley, Barossa, South Australia, Australia',
  'variety': 'Red Wine',
  'rating': 96.0,
  'notes': 'Vintage Comments : Classic Barossa vintage conditions. An average wet Spring followed by extreme heat in early February. Occasional rainfall events kept the vines in good balance up to harvest in late March 2004. Very good quality coupled with good average yields. More than 30 months in wood followed by six months tank maturation of the blend prior to bottling, July 2007. '},
 {'name': 'Abreu Vineyards Cappella 2007',
  'region': 'Napa Valley, California',
  'variety': 'Red Wine',
  'rating': 96.0,
  'notes': 'Cappella is a proprietary blend of two clones of Cabernet Sauvignon with Cabernet Franc, Petit Verdot and Merlot. The gravelly soil at Cappella produces fruit that is very elegant in structure. The resulting wine exhibits beautiful purity of fruit with fine grained and lengthy tannins. '}]

## Encode using Vector Embedding

We will use [Qdrant](https://qdrant.tech/), a popular open-source vector database, and [Sentence Transformers](https://sbert.net/), a popular library for encoding text into embedding vectors.

In [3]:
from qdrant_client import models, QdrantClient
from sentence_transformers import SentenceTransformer

# create the vector database client
qdrant = QdrantClient(":memory:") # Create in-memory Qdrant instance

# Create the embedding encoder
encoder = SentenceTransformer('all-MiniLM-L6-v2') # Model to create embeddings

In [4]:
# Create collection to store the wine rating data
qdrant.recreate_collection(
    collection_name="top_wines",
    vectors_config=models.VectorParams(
        size=encoder.get_sentence_embedding_dimension(), # Vector size is defined by used model
        distance=models.Distance.COSINE
    )
)

True

### Loading the data into the vector database

We will go over the `notes` column of the wine dataset, encode each note into an embedding vector, and store it in the vector collection that we created above. The data is indexed in the background as we load it, which allows quick retrieval later.

This step will take a few seconds (less than a minute on my laptop).

In [5]:
# vectorize!
qdrant.upload_points(
    collection_name="top_wines",
    points=[
        models.PointStruct(
            id=idx,
            vector=encoder.encode(doc["notes"]).tolist(),
            payload=doc
        ) for idx, doc in enumerate(data) # data is the variable holding all the wines
    ]
)

## **R**etrieve semantically relevant data based on user's query

Once the data is loaded into the vector database and the indexing process is done, we can start using our simple RAG system.

In [6]:
user_prompt = "Suggest me an amazing Malbec wine from Argentina"

### Encoding the user's query

We will use the same encoder that we used for the document data to encode the user's query.
Because the query and the wine notes then live in the same vector space, we can search for results based on semantic similarity.
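
To make "semantic similarity" concrete, here is a minimal sketch of the cosine similarity score, the distance metric the collection above was configured with. The function name and the toy 3-dimensional vectors are made up for illustration; real `all-MiniLM-L6-v2` embeddings have 384 dimensions, and Qdrant computes the score internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy "embeddings", invented for this example only.
query = [1.0, 0.0, 1.0]
similar_doc = [2.0, 0.0, 2.0]    # points in the same direction as the query
unrelated_doc = [0.0, 1.0, 0.0]  # orthogonal to the query

print(cosine_similarity(query, similar_doc))    # close to 1: very similar
print(cosine_similarity(query, unrelated_doc))  # 0: unrelated
```

Note that the score depends only on the direction of the vectors, not on their length, which is why `similar_doc` scores as a near-perfect match even though it is twice as long as the query.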

In [7]:
query_vector = encoder.encode(user_prompt).tolist()

### Search similar rows

We can now take the embedding of the user's query and use it to find the most similar rows in the vector database.

In [8]:
# Search time for awesome wines!

hits = qdrant.search(
    collection_name="top_wines",
    query_vector=query_vector,
    limit=3
)
for hit in hits:
    print(hit.payload, "score:", hit.score)

{'name': 'Catena Zapata Argentino Vineyard Malbec 2004', 'region': 'Argentina', 'variety': 'Red Wine', 'rating': 98.0, 'notes': '"The single-vineyard 2004 Malbec Argentino Vineyard spent 17 months in new French oak. Remarkably fragrant and complex aromatically, it offers up aromas of wood smoke, creosote, pepper, clove, black cherry, and blackberry. Made in a similar, elegant style, it is the most structured of the three single vineyard wines, needing a minimum of a decade of additional cellaring. It should easily prove to be a 25-40 year wine. It is an exceptional achievement in Malbec. When all is said and done, Catena Zapata is the Argentina winery of reference – the standard of excellence for comparing all others. The brilliant, forward-thinking Nicolas Catena remains in charge, with his daughter, Laura, playing an increasingly large role. The Catena Zapata winery is an essential destination for fans of both architecture and wine in Mendoza. It is hard to believe, given the surge i
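
Conceptually, the search above ranks the stored vectors by their similarity to the query vector and returns the best `limit` of them. The following is a brute-force sketch of that idea in plain Python, not Qdrant's actual implementation (a vector database typically relies on an index rather than scanning every vector). The `top_k` helper and the toy 2-dimensional points are hypothetical.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def top_k(query_vector, points, limit=3):
    """Return the `limit` best (score, payload) pairs, highest score first."""
    scored = (
        (cosine_similarity(query_vector, vector), payload)
        for vector, payload in points
    )
    return heapq.nlargest(limit, scored, key=lambda pair: pair[0])


# Toy collection of (vector, payload) pairs, invented for this example.
points = [
    ([1.0, 0.0], {"name": "wine A"}),
    ([0.0, 1.0], {"name": "wine B"}),
    ([1.0, 1.0], {"name": "wine C"}),
]

results = top_k([1.0, 0.1], points, limit=2)
for score, payload in results:
    print(payload["name"], "score:", round(score, 3))
```

With this toy data, "wine A" ranks first because its vector points in almost the same direction as the query, followed by "wine C".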

## **A**ugment the prompt to the LLM with retrieved data

In our simple example, we will take the top 3 results and pass them as-is in the prompt to the generation LLM.

In [9]:
# define a variable to hold the search results
search_results = [hit.payload for hit in hits]
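
This notebook passes the raw payloads to the LLM unchanged. As an optional alternative, the retrieved rows could first be rendered as a readable context block and placed next to the user's question. The sketch below assumes payloads with the same keys as the dataset; the helper names `format_context` and `build_messages`, the system prompt wording, and the example row are all hypothetical and not part of the original notebook.

```python
def format_context(rows):
    """Render retrieved payloads as a numbered plain-text list."""
    return "\n".join(
        f"{i}. {row['name']} ({row['region']}), rating {row['rating']}: {row['notes'].strip()}"
        for i, row in enumerate(rows, start=1)
    )


def build_messages(user_prompt, rows):
    """Place the retrieved rows next to the question in a single user message."""
    context = format_context(rows)
    return [
        {"role": "system",
         "content": "You are a wine specialist. Answer using only the wines listed in the context."},
        {"role": "user",
         "content": f"Context:\n{context}\n\nQuestion: {user_prompt}"},
    ]


# Made-up row with the same keys as the dataset, for illustration only.
example_rows = [
    {"name": "Example Malbec 2010", "region": "Mendoza, Argentina",
     "variety": "Red Wine", "rating": 95.0, "notes": "Dark fruit and spice. "},
]

messages = build_messages("Suggest me an amazing Malbec wine from Argentina", example_rows)
print(messages[1]["content"])
```

In the notebook, `build_messages(user_prompt, search_results)` would produce a list that can be passed as the `messages` argument of a chat completion call. Whether this layout gives better answers than passing `str(search_results)` directly is something to test rather than assume.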

## **G**enerate reply to the user's query

We will use one of the most popular generative AI LLMs from [OpenAI](https://platform.openai.com/docs/models). 

In [10]:
# Connect to the OpenAI-hosted LLM (the client reads the API key from the OPENAI_API_KEY environment variable)
from openai import OpenAI
client = OpenAI()
completion = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are chatbot, a wine specialist. Your top priority is to help guide users into selecting amazing wine and guide them with their requests."},
        {"role": "user", "content": user_prompt},
        {"role": "assistant", "content": str(search_results)}
    ]
)
print(completion.choices[0].message.content)

I have a few amazing Malbec wines from Argentina that you might like:

1. Catena Zapata Argentino Vineyard Malbec 2004 - This wine has a rating of 98 and offers complex aromas of wood smoke, pepper, black cherry, and blackberry. It is structured and elegant, needing some cellaring but will reward you with exceptional taste.

2. Bodega Colome Altura Maxima Malbec 2012 - This Malbec from Salta, Argentina, has a rating of 96. Winemaker Thibaut Delmotte has crafted a wine of distinction that embodies the traditional grape variety in a modern viticultural setting.

3. Catena Zapata Adrianna Vineyard Malbec 2004 - With a rating of 97, this Malbec from the Adrianna Vineyard offers aromas of wood smoke, game, black cherry, and blackberry liqueur. It is opulent and full-flavored, a true delight to experience.

These wines are highly rated and are sure to provide you with an amazing Malbec experience from Argentina. Enjoy!
