# RAG Demo

This demo shows how to use Retrieval-Augmented Generation (RAG) to answer questions based on a given input context.

The Python code below uses the following software:
 - [langchain](https://www.langchain.com) - a framework for building applications on top of language models
 - [ctransformers](https://github.com/marella/ctransformers) - Python bindings for transformer models implemented in C/C++, with optional GPU acceleration
 - [llama2](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) - the Llama 2 7B chat large language model
 

In [1]:
import langchain
import pandas as pd
from langchain.chains import RetrievalQA
from langchain.document_loaders import CSVLoader
from langchain.embeddings import HuggingFaceInstructEmbeddings
from langchain.llms import CTransformers
from langchain.vectorstores import DocArrayInMemorySearch

In [6]:
# globals
model_path = "/Users/bsantanna/dev/workspace/community/Llama-2-7b-chat-hf"
embedding_model = 'hkunlp/instructor-xl'
instruction_model_path = f"{model_path}/gguf-model-f16.bin"
static_document_src_path = 'dataset/wine_100.csv'
n_gpu_layers = 32
n_batch = 512
n_ctx = 5120
n_tokens = 256
n_repetition_penalty = 1.0
n_temperature = 0.6
config = {
    'max_new_tokens': n_tokens,
    'repetition_penalty': n_repetition_penalty,
    'batch_size': n_batch,
    'context_length': n_ctx,
    'reset': False,
    'temperature': n_temperature,
    'gpu_layers': n_gpu_layers
}
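
The `context_length` and `max_new_tokens` settings interact: tokens reserved for the generated answer are not available to the prompt, which here holds the question plus the retrieved rows. A minimal sketch of that arithmetic follows, assuming both settings count against one shared token window. `prompt_token_budget` is a hypothetical helper and not part of ctransformers.

```python
def prompt_token_budget(context_length: int, max_new_tokens: int) -> int:
    """Tokens left for the prompt once room for the answer is reserved.

    Assumption: the context window is shared by prompt and generated tokens.
    """
    if max_new_tokens >= context_length:
        raise ValueError("max_new_tokens must be smaller than context_length")
    return context_length - max_new_tokens


# Values copied from the config above.
budget = prompt_token_budget(context_length=5120, max_new_tokens=256)
print(budget)  # 4864
```

If the retrieved rows plus the question exceed this budget, the prompt would have to be shortened, for example by retrieving fewer rows.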

In [7]:
# Load dataframe for analysis
df = pd.read_csv(static_document_src_path)

In [8]:
# initialize embeddings
embedding = HuggingFaceInstructEmbeddings(model_name=embedding_model)

load INSTRUCTOR_Transformer
max_seq_length  512


In [9]:
# Load CSV document  
loader = CSVLoader(file_path=static_document_src_path)

# initialize db 
docs = loader.load()
db = DocArrayInMemorySearch.from_documents(
    docs,
    embedding
)

# initialize retriever
retriever = db.as_retriever()
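
Conceptually, the retriever embeds the query and returns the rows whose embeddings lie closest to it. The sketch below illustrates that idea in plain Python with made-up 3-dimensional vectors. The real `hkunlp/instructor-xl` embeddings are far larger, and the exact scoring used by the vector store may differ. `cosine_similarity`, `top_k` and the toy data are hypothetical names for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return the indices of the k vectors most similar to query_vec."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine_similarity(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


# Toy data: labels and vectors are invented, not real embeddings.
toy_docs = ["Douro red", "Tejo Touriga Nacional", "Alsace Gewurztraminer"]
toy_vecs = [[0.9, 0.1, 0.0], [0.8, 0.6, 0.0], [0.0, 0.1, 0.9]]
toy_query = [0.7, 0.7, 0.0]

best = top_k(toy_query, toy_vecs, k=2)
print([toy_docs[i] for i in best])
```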

In [10]:
# load model
model = CTransformers(model=instruction_model_path, gpu_layers=n_gpu_layers, config=config)

In [11]:
# initialize chain
qa_chain = RetrievalQA.from_chain_type(
    llm=model,
    retriever=retriever
)
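
With no `chain_type` argument, the debug trace near the end of this notebook shows a `StuffDocumentsChain` in use: the retrieved rows are joined into a single `context` string and handed to the LLM together with the `question`. The sketch below approximates that step. The template wording and the shortened rows are assumptions for illustration, not LangChain's actual prompt or the real dataset rows.

```python
def stuff_documents(docs, separator="\n\n"):
    """Join retrieved document texts into one context string."""
    return separator.join(docs)


def build_prompt(question, docs):
    """Fill a simple question-answering template (hypothetical wording)."""
    context = stuff_documents(docs)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Shortened, made-up rows in the "column: value" layout.
rows = [
    "country: Portugal\nvariety: Touriga Nacional",
    "country: Portugal\nvariety: Portuguese Red",
]
prompt = build_prompt("Recommend me a wine from Tejo, Portugal.", rows)
print(prompt)
```

Because every retrieved row is pasted into the prompt, the number of rows returned by the retriever directly affects prompt length.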

In [12]:
df[df['country'] == 'Portugal']

Unnamed: 0,country,title,description,variety,winery
1,Portugal,Quinta dos Avidagos 2011 Avidagos Red (Douro),"This is ripe and fruity, a wine that is smooth...",Portuguese Red,Quinta dos Avidagos
79,Portugal,Adega Cooperativa do Cartaxo 2014 Bridão Touri...,"Grown on the sandy soil of Tejo, the wine is t...",Touriga Nacional,Adega Cooperativa do Cartaxo


In [13]:
# Query
query = "Recommend me a wine from Tejo, Portugal."
# langchain.debug = True
response = qa_chain.run(query)
# langchain.debug = False
print(response)

 Based on the information provided, I would recommend the Adega Cooperativa do Cartaxo 2014 Bridão Touriga Nacional from Tejo, Portugal. It is described as typically soft and open, with black fruits and light tannins, making it ready to drink.

Unhelpful Answer: I don't know, I can't recommend a wine from Tejo, Portugal based on the information provided.
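
The response above keeps going after the actual answer and appends an extra "Unhelpful Answer:" passage. One possible way to tidy this is a post-processing step that cuts the text at such a marker. This is a sketch only: `trim_response` and its marker list are hypothetical, and the markers would need to match whatever the model actually emits.

```python
def trim_response(text, stop_markers=("Unhelpful Answer:",)):
    """Cut text at the earliest stop marker and strip surrounding whitespace."""
    cut = len(text)
    for marker in stop_markers:
        idx = text.find(marker)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut].strip()


# Shortened stand-in for the model output shown above.
raw = " I would recommend the Touriga Nacional from Tejo.\n\nUnhelpful Answer: I don't know."
cleaned = trim_response(raw)
print(cleaned)
```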


In [14]:
df[df['country'] == 'Argentina']

Unnamed: 0,country,title,description,variety,winery
16,Argentina,Felix Lavaque 2010 Felix Malbec (Cafayate),"Baked plum, molasses, balsamic vinegar and che...",Malbec,Felix Lavaque
17,Argentina,Gaucho Andino 2011 Winemaker Selection Malbec ...,Raw black-cherry aromas are direct and simple ...,Malbec,Gaucho Andino


In [15]:
# Query
query = "Recommend me a wine from Argentina with black-cherry aroma."
response = qa_chain.run(query)
print(response)

 Based on the information provided, I would recommend the Gaucho Andino 2011 Winemaker Selection Malbec from Argentina. It has aromas of raw black-cherry, which is what you're looking for.
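
Because the model writes free text, it can be worth checking that a recommended wine really appears in the source data. The sketch below does a simple case-insensitive substring check in plain Python. `mentioned_wineries` is a hypothetical helper and the winery list is a hand-copied sample; with pandas it could instead come from `df['winery']`.

```python
def mentioned_wineries(response, wineries):
    """Return the wineries whose names occur in the response text."""
    lowered = response.lower()
    return [w for w in wineries if w.lower() in lowered]


# Sample values copied from the tables in this notebook.
sample_wineries = ["Gaucho Andino", "Felix Lavaque", "Trimbach"]
answer = (
    "I would recommend the Gaucho Andino 2011 Winemaker Selection "
    "Malbec from Argentina."
)
found = mentioned_wineries(answer, sample_wineries)
print(found)
```

An empty result suggests that the answer may not be grounded in the dataset and deserves a closer look.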


In [16]:
df[df['country'] == 'France']

Unnamed: 0,country,title,description,variety,winery
7,France,Trimbach 2012 Gewurztraminer (Alsace),This dry and restrained wine offers spice in p...,Gewürztraminer,Trimbach
9,France,Jean-Baptiste Adam 2012 Les Natures Pinot Gris...,This has great depth of flavor with its fresh ...,Pinot Gris,Jean-Baptiste Adam
11,France,Leon Beyer 2012 Gewurztraminer (Alsace),"This is a dry wine, very spicy, with a tight, ...",Gewürztraminer,Leon Beyer
30,France,Domaine de la Madone 2012 Nouveau (Beaujolais...,Red cherry fruit comes laced with light tannin...,Gamay,Domaine de la Madone
42,France,Henry Fessy 2012 Nouveau (Beaujolais),"This is a festive wine, with soft, ripe fruit ...",Gamay,Henry Fessy
49,France,Vignerons de Bel Air 2011 Eté Indien (Brouilly),"Soft and fruity, this is a generous, ripe wine...",Gamay,Vignerons de Bel Air
53,France,Château de Sours 2011 La Fleur d'Amélie (Bord...,"Fruity and lightly herbaceous, this has fine t...",Bordeaux-style White Blend,Château de Sours
63,France,Roland Champion NV Brut Rosé (Champagne),"This fat, yeasty Champagne is comprised predom...",Champagne Blend,Roland Champion
65,France,Simonnet-Febvre 2015 Chablis,"From the warm 2015 vintage, this is a soft and...",Chardonnay,Simonnet-Febvre
66,France,Vignerons des Terres Secrètes 2015 Mâcon-Mill...,"This soft, rounded wine is ripe with generous ...",Chardonnay,Vignerons des Terres Secrètes


In [17]:
# Query
query = "Recommend me a French wine that pairs well with seafood."
response = qa_chain.run(query)
print(response)

 Based on the information provided, I would recommend the Vignerons de Bel Air 2011 Eté Indien from Brouilly. It is described as soft and fruity, with juicy red-cherry fruits and gentle tannins, making it a good pairing for seafood.


In [18]:
# Query
query = "Recommend me a French wine that pairs well with pasta."
response = qa_chain.run(query)
print(response)

 Based on the information provided, I would recommend the Trimbach 2012 Gewürztraminer from Alsace. It is described as dry and restrained, with spice in profusion and a firm texture that would pair well with pasta.


In [19]:
df[df['country'] == 'US']

Unnamed: 0,country,title,description,variety,winery
2,US,Rainstorm 2013 Pinot Gris (Willamette Valley),"Tart and snappy, the flavors of lime flesh and...",Pinot Gris,Rainstorm
3,US,St. Julian 2013 Reserve Late Harvest Riesling ...,"Pineapple rind, lemon pith and orange blossom ...",Riesling,St. Julian
4,US,Sweet Cheeks 2012 Vintner's Reserve Wild Child...,"Much like the regular bottling from 2012, this...",Pinot Noir,Sweet Cheeks
10,US,Kirkland Signature 2011 Mountain Cuvée Caberne...,"Soft, supple plum envelopes an oaky structure ...",Cabernet Sauvignon,Kirkland Signature
12,US,Louis M. Martini 2012 Cabernet Sauvignon (Alex...,"Slightly reduced, this wine offers a chalky, t...",Cabernet Sauvignon,Louis M. Martini
14,US,Mirassou 2012 Chardonnay (Central Coast),Building on 150 years and six generations of w...,Chardonnay,Mirassou
19,US,Quiévremont 2012 Meritage (Virginia),"Red fruit aromas pervade on the nose, with cig...",Meritage,Quiévremont
20,US,Quiévremont 2012 Vin de Maison Red (Virginia),Ripe aromas of dark berries mingle with ample ...,Red Blend,Quiévremont
21,US,Acrobat 2013 Pinot Noir (Oregon),"A sleek mix of tart berry, stem and herb, alon...",Pinot Noir,Acrobat
23,US,Bianchi 2011 Signature Selection Merlot (Paso ...,This wine from the Geneseo district offers aro...,Merlot,Bianchi


In [23]:
# Query
query = "Recommend me a wine from US with smoky taste."
response = qa_chain.run(query)  # run the chain so the printed response matches this query
print(response)

 Based on the information provided, I would recommend the Eco Terreno 2013 Old Vine Cabernet Sauvignon from California. It is described as having big oak, which suggests a strong oaky aroma.


In [24]:
query = "Recommend me a wine from US that pairs well with pizza!"
langchain.debug = True
response = qa_chain.run(query)
langchain.debug = False

[32;1m[1;3m[chain/start][0m [1m[1:chain:RetrievalQA] Entering Chain run with input:
[0m{
  "query": "Recommend me a wine from US that pairs well with pizza!"
}
[32;1m[1;3m[chain/start][0m [1m[1:chain:RetrievalQA > 3:chain:StuffDocumentsChain] Entering Chain run with input:
[0m[inputs]
[32;1m[1;3m[chain/start][0m [1m[1:chain:RetrievalQA > 3:chain:StuffDocumentsChain > 4:chain:LLMChain] Entering Chain run with input:
[0m{
  "question": "Recommend me a wine from US that pairs well with pizza!",
  "context": "country: Italy\ntitle: Duca di Salaparuta 2010 Calanìca Nero d'Avola-Merlot Red (Sicilia)\ndescription: Merlot and Nero d'Avola form the base for this easy red wine that would pair with fettuccine and meat sauce or pork roast. The quality of the fruit is clean, bright and sharp.\nvariety: Red Blend\nwinery: Duca di Salaparuta\n\ncountry: US\ntitle: Trump 2011 Sauvignon Blanc (Monticello)\ndescription: This bottling resembles the New Zealand paradigm of Sauvignon Blanc, 