# Implement the Retrieval for a Retrieval Augmented Generation (RAG) Use Case
Now that all our context information is stored in the SAP HANA Cloud vector store, we can start asking the LLM questions about SAP AI Services. This time the model will not answer purely from its own knowledge, that is, from what it learned during training. Instead, the retriever searches our vector database for relevant context and sends the matching text chunks to the LLM to read before it responds.
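
To make the retrieval step concrete, here is a minimal, library-free sketch of the idea. It is not how SAP HANA Cloud or LangChain implement it: the `embed` function is a toy bag-of-words stand-in for a real embedding model, and the chunks, the query, and the prompt wording are made-up placeholders.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, chunks, k=2):
    """Return the k chunks that are most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vec, embed(chunk)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, context_chunks):
    """Put the retrieved chunks in front of the question (hypothetical wording)."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Placeholder chunks, not taken from the workshop documents
chunks = [
    "The warehouse in Berlin ships orders on weekdays.",
    "Returns are accepted within thirty days of delivery.",
    "The cafeteria serves lunch from noon until two.",
]
query = "Within how many days are returns accepted?"

top_chunks = retrieve(query, chunks, k=2)
prompt = build_prompt(query, top_chunks)
print(prompt)
```

In the notebook, the embedding model, the similarity search, and the prompt assembly are handled by `OpenAIEmbeddings`, `HanaDB`, and `RetrievalQA` respectively.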

In [None]:
from gen_ai_hub.proxy.langchain.openai import ChatOpenAI
from gen_ai_hub.proxy.langchain.openai import OpenAIEmbeddings

from langchain.chains import RetrievalQA

from langchain_community.vectorstores.hanavector import HanaDB
from hdbcli import dbapi

import configparser
import variables

👉 SET the `EMBEDDING_TABLE` to `"EMBEDDINGS_SAP_AI_SERVICES_>add your name here<"` like in the previous exercise.

We are again connecting to our SAP HANA Cloud Vector Engine.

In [None]:
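# Read the connection details from the local .user.ini file
config = configparser.ConfigParser()
config.read('.user.ini')

# Open a connection to SAP HANA Cloud
connection = dbapi.connect(
    address=config.get('hana', 'url'),
    port=config.getint('hana', 'port'),  # read the port as an integer
    user=config.get('hana', 'user'),
    password=config.get('hana', 'passwd'),
    autocommit=True,
    sslValidateCertificate=False  # skips TLS certificate validation
)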

In [None]:
# Embedding model used to embed the incoming queries
embeddings = OpenAIEmbeddings(deployment_id=variables.EMBEDDING_DEPLOYMENT_ID)

# Connect to the table that already holds the document embeddings
db = HanaDB(
    embedding=embeddings, connection=connection, table_name=variables.EMBEDDING_TABLE
)

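In this step we define which LLM to use for answering the query. We also create a retriever on top of the vector store, which determines where the context is retrieved from. With `"k": 2`, the retriever returns the two most similar chunks for each query.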

In [None]:
# Define which model to use
chat_llm = ChatOpenAI(deployment_id=variables.LLM_DEPLOYMENT_ID)

# Create a retriever instance of the vector store
retriever = db.as_retriever(search_kwargs={"k": 2})

👉 Now, instead of sending the query directly to the LLM, you will create a `RetrievalQA` instance and hand over the LLM and the retriever that should be used during the retrieval process. You then send your query to the `RetrievalQA` chain, which calls the retriever for you.

👉 Try out different queries. You can ask anything you would like to know about the SAP AI Services.

In [None]:
# Create the QA chain that queries the LLM based on our custom documents
qa = RetrievalQA.from_llm(llm=chat_llm, retriever=retriever, return_source_documents=True)

# Send query
query = "What is the machine learning model behind the regression model template of Data Attribute Recommendation?"
# query = "What is the premium edition of Document Information Extraction?"
# query = "What does Blocks of 100 Documents for Premium Edition mean?"

print(qa.invoke(query))
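
Printing the raw result is hard to read because it also contains the retrieved source documents. The sketch below shows one way to format it, assuming the result is a dictionary with the keys `query`, `result`, and `source_documents`, and that each source document has a `page_content` attribute. Please verify these names against your installed LangChain version. `summarize_result` is a hypothetical helper, and `fake_result` is a placeholder that stands in for the output of `qa.invoke(query)` so the example runs without a database connection.

```python
from types import SimpleNamespace


def summarize_result(result, max_chars=80):
    """Format the answer and a short snippet of each source chunk."""
    lines = [
        f"Question: {result['query']}",
        f"Answer: {result['result']}",
        "Sources:",
    ]
    for i, doc in enumerate(result.get("source_documents", []), start=1):
        # Keep only the beginning of each chunk and flatten line breaks
        snippet = doc.page_content[:max_chars].replace("\n", " ")
        lines.append(f"  [{i}] {snippet}")
    return "\n".join(lines)


# Placeholder data standing in for the output of qa.invoke(query)
fake_result = {
    "query": "What is a placeholder?",
    "result": "A placeholder answer.",
    "source_documents": [
        SimpleNamespace(page_content="First placeholder chunk.", metadata={}),
        SimpleNamespace(page_content="Second placeholder chunk.", metadata={}),
    ],
}

summary = summarize_result(fake_result)
print(summary)
```

In the notebook you would call `print(summarize_result(qa.invoke(query)))` instead. Comparing the answer with the listed chunks is a quick way to check whether the retriever found relevant context.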

👉 Go back to [06-store-embeddings-hana](06-store-embeddings-hana.ipynb) and try out different chunk sizes or different values for overlap. Store these chunks in a different table by adding a new variable to [variables.py](variables.py), then run this notebook again using the newly created table.
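
As a reminder of what these two parameters control, here is a minimal character-based sketch of chunking with overlap. It is only an illustration: `chunk_text` is a hypothetical helper, and the text splitter used in the previous exercise may split on different boundaries.

```python
def chunk_text(text, chunk_size=100, overlap=20):
    """Split text into chunks of chunk_size characters that share overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be at least 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
small = chunk_text(sample, chunk_size=10, overlap=2)
large = chunk_text(sample, chunk_size=20, overlap=5)

print(small)
print(large)
```

Smaller chunks give the retriever more precise pieces to match against, while larger chunks carry more surrounding context per match. The overlap reduces the chance that a sentence is cut in half at a chunk boundary.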

[Next exercise](08-semantic-chunking.ipynb)