# LangChain Q&A with RAG

Important: Set your unique ID in the cell below. It is used to create a unique table name in HANA.

The unique ID must not contain spaces, including leading or trailing ones. Replace any spaces with underscores.
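
The rule above can also be enforced programmatically before the ID is used. The helper below is a hypothetical sketch (`normalize_unique_id` is not part of this notebook): it trims surrounding whitespace, replaces inner whitespace with underscores, and upper-cases the result, in line with the `.upper()` call applied to `YOUR_UNIQUE_ID`.

```python
import re


def normalize_unique_id(raw_id: str) -> str:
    """Hypothetical helper: trim, replace whitespace runs with '_', upper-case."""
    trimmed = raw_id.strip()
    if not trimmed:
        raise ValueError("The unique id must not be empty.")
    return re.sub(r"\s+", "_", trimmed).upper()


print(normalize_unique_id("  jane doe "))  # JANE_DOE
```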

In [27]:
YOUR_UNIQUE_ID = "<INSERT_YOUR_ID_HERE>".upper()

In [28]:
# Get HANA credentials
import json

try:
    # Load the key file from ../secrets/hana.json
    with open("../secrets/hana.json") as f:
        dbcreds = json.load(f)
    print("Found HANA credentials.")
except (OSError, json.JSONDecodeError):
    # Catch only a missing/unreadable file or invalid JSON instead of every exception
    print("Failed to load key file. Please make sure you have stored the file in a folder called 'secrets' located in the project root.")

Found HANA credentials.
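
The connection cell further down reads the keys `host`, `port`, `user`, and `password` from `dbcreds`. A minimal sketch of checking for them up front could look as follows; `missing_credential_keys` is a hypothetical helper and the example values are made up.

```python
REQUIRED_KEYS = ("host", "port", "user", "password")  # keys the connection cell reads


def missing_credential_keys(creds: dict) -> list:
    """Hypothetical helper: return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not creds.get(key)]


# Made-up example values, not real credentials
example_creds = {"host": "example.hana.invalid", "port": 443, "user": "DEMO"}
print(missing_credential_keys(example_creds))  # ['password']
```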


In [29]:
# Set up AI Core OpenAI Langchain Proxy connection using generative-ai-hub-sdk
from gen_ai_hub.proxy.langchain.openai import OpenAIEmbeddings
embed = OpenAIEmbeddings(proxy_model_name='text-embedding-3-small')

In [30]:
# Create a connection using hana-ml
from hana_ml import ConnectionContext

# cc = ConnectionContext(userkey='VDB_BETA', encrypt=True)  # when using a key from hdbuserstore
cc = ConnectionContext(
    address=dbcreds["host"],
    port=dbcreds["port"],
    user=dbcreds["user"],
    password=dbcreds["password"],
    encrypt=True,
)
connection = cc.connection
hana_schema = cc.get_current_schema()
print(cc.hana_version())
print(hana_schema)

4.00.000.00.1726574991 (fa/CE2024.28)
USR_59GP137CB6UCA1OVEH3DIIEMU


In [31]:
from langchain_community.vectorstores.hanavector import HanaDB

embeddings_table = f"{YOUR_UNIQUE_ID}_LANGCHAIN_RAG_CHAIN"

# Creates the table if it does not exist yet
db = HanaDB(
    embedding=embed, connection=connection, table_name=embeddings_table
)

In [32]:
# Delete already existing documents from the table
db.delete(filter={})

True

## Load a PDF document and add it to the table

Important: By default, this notebook downloads the document from the web. To use a local document instead, create a subfolder called data in the folder where this notebook is located and add the document to it.

Then rename the file to match the path in the cell below, or adjust the file path there. To use a different document, change the file path or web URL accordingly.

In [33]:
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

# local_file_path = "./data/genaihub_ai_core.pdf"
web_file_path = "https://help.sap.com/doc/c31b38b32a5d4e07a4488cb0f8bb55d9/CLOUD/en-US/f17fa8568d0448c685f2a0301061a6ee.pdf"

loader = PyPDFLoader(web_file_path, extract_images=False)
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=50)
text_chunks = loader.load_and_split(text_splitter)

print(text_chunks[0])

print(f"Number of document chunks: {len(text_chunks)}")

page_content='Service Guide | PUBLIC
2024-09-17
SAP AI Core© 2024 SAP SE or an SAP affiliate  company. All rights reserved.
THE BEST RUN' metadata={'source': 'https://help.sap.com/doc/c31b38b32a5d4e07a4488cb0f8bb55d9/CLOUD/en-US/f17fa8568d0448c685f2a0301061a6ee.pdf', 'page': 0}
Number of document chunks: 619
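
To build intuition for `chunk_size` and `chunk_overlap`, the sketch below splits a string into fixed-size windows in which each window repeats the last characters of the previous one. This is a deliberate simplification: `RecursiveCharacterTextSplitter` additionally tries to break at separators such as paragraph and line breaks, which the hypothetical `fixed_size_chunks` function does not attempt.

```python
def fixed_size_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list:
    """Simplified illustration: sliding window of chunk_size with chunk_overlap."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


print(fixed_size_chunks("abcdefghij", chunk_size=4, chunk_overlap=1))
# ['abcd', 'defg', 'ghij']
```

Each chunk shares one character with its predecessor here; with `chunk_overlap=50`, neighbouring chunks would share up to 50 characters in the same way.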


In [34]:
split_chunks = [text_chunks[i:i+100] for i in range(0, len(text_chunks), 100)]
# add the loaded document chunks
for chunks in split_chunks:
    db.add_documents(chunks)
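
The slicing above inserts the chunks in batches of at most 100 rather than in a single call, presumably to keep each embedding request small. The same pattern as a standalone sketch, with a hypothetical `in_batches` helper and integers standing in for the 619 document chunks:

```python
def in_batches(items: list, batch_size: int) -> list:
    """Split items into consecutive lists of at most batch_size elements."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


placeholder_chunks = list(range(619))  # stand-ins for the 619 document chunks
batches = in_batches(placeholder_chunks, 100)
print(len(batches), len(batches[-1]))  # 7 19
```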

In [35]:
# take a look at the table
hdf = cc.sql(f''' SELECT "VEC_TEXT", "VEC_META", TO_NVARCHAR("VEC_VECTOR") AS "VEC_VECTOR" FROM {embeddings_table} ''')
localdf = hdf.head(10).collect()
localdf

Unnamed: 0,VEC_TEXT,VEC_META,VEC_VECTOR
0,Service Guide | PUBLIC\n2024-09-17\nSAP AI Cor...,"{""source"": ""https://help.sap.com/doc/c31b38b32...","[-0.01243771,0.026832066,0.014955354,-0.018389..."
1,Content\n1 What Is SAP AI Core? .................,"{""source"": ""https://help.sap.com/doc/c31b38b32...","[-0.009583199,-0.007215989,0.065046825,-0.0143..."
2,4.1 Free Tier ...................................,"{""source"": ""https://help.sap.com/doc/c31b38b32...","[-0.0016880932,-0.009460846,0.06440146,-0.0158..."
3,Enable Cloud Foundry ............................,"{""source"": ""https://help.sap.com/doc/c31b38b32...","[0.011808621,0.016032327,0.063063875,-0.030212..."
4,Create an Application to Sync Y our Folders .....,"{""source"": ""https://help.sap.com/doc/c31b38b32...","[-0.00795204,0.042798124,0.046562128,-0.017425..."
5,7 .4 Manage Object Store Secrets ................,"{""source"": ""https://help.sap.com/doc/c31b38b32...","[0.005675242,0.031929195,0.04001284,-0.0357683..."
6,Update a Generic Secret .........................,"{""source"": ""https://help.sap.com/doc/c31b38b32...","[0.00043142107,-0.009738354,0.08119084,-0.0111..."
7,8.4 Orchestration ...............................,"{""source"": ""https://help.sap.com/doc/c31b38b32...","[0.025592165,-0.01682132,0.05037549,-0.0211309..."
8,Connect Y our Data ..............................,"{""source"": ""https://help.sap.com/doc/c31b38b32...","[0.015408695,0.010766252,0.040982418,-0.028313..."
9,"11.1 Security Features of Data, Data Flow, and...","{""source"": ""https://help.sap.com/doc/c31b38b32...","[0.0040854346,0.017747745,0.099012434,-0.01032..."


In [45]:
countdf = cc.sql(f''' SELECT COUNT(*) FROM {embeddings_table} ''').collect()
countdf

Unnamed: 0,COUNT(*)
0,619


## Raw similarity search

In [None]:
query = "What is the Orchestration Service?"
# Returns a list of (document, relevance_score) tuples
docs = db.similarity_search_with_relevance_scores(query, k=2, score_threshold=0.1, filter={})

for doc, score in docs:
    print("-" * 80)
    print(f"Relevance score: {score:.3f}")
    print("Source: ", doc.metadata)
    print(doc.page_content)
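
Conceptually, a similarity search embeds the query, scores it against the stored vectors, drops results below `score_threshold`, and returns the `k` best matches. The sketch below illustrates that idea with cosine similarity over made-up three-dimensional vectors. It is not how `HanaDB` works internally (the vector store presumably delegates the scoring to SAP HANA), and `top_k_by_cosine` is a hypothetical name.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equally long, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_by_cosine(query_vec, store, k=2, score_threshold=0.1):
    """Score every (text, vector) pair, filter by threshold, keep the k best."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in store]
    kept = [pair for pair in scored if pair[1] >= score_threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)[:k]


# Made-up texts and vectors, not real embeddings
toy_store = [
    ("orchestration", [1.0, 0.0, 0.0]),
    ("free tier", [0.0, 1.0, 0.0]),
    ("secrets", [0.6, 0.8, 0.0]),
]
for text, score in top_k_by_cosine([1.0, 0.0, 0.0], toy_store):
    print(f"{score:.2f}  {text}")
```

In this toy data, "free tier" is orthogonal to the query, so it falls below the threshold and is dropped before the top-`k` cut.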

## LangChain Q&A Chain with RAG

In [38]:
# Create a retriever instance of the vector store
retriever = db.as_retriever()

In [39]:
from langchain_core.prompts import PromptTemplate

prompt_template = """
You are an expert in SAP BTP services. You are provided multiple context items (help documentation) that are related to the prompt you have to answer.
Only use the following pieces of context to answer the question at the end. If you don't find an appropriate answer in the given sources, apologize and say you don't know.

'''
{context}
'''

Question: {question}
"""

PROMPT = PromptTemplate(
    template=prompt_template, input_variables=["context", "question"]
)
chain_type_kwargs = {"prompt": PROMPT}
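
At query time, the chain fills `{context}` with the retrieved chunks and `{question}` with the user question. A rough stand-in for that substitution using plain `str.format` is shown below; the shortened template, the sample chunks, and the blank-line separator between chunks are assumptions made for illustration.

```python
# Shortened version of the template, kept only to show the substitution
template = (
    "Only use the following pieces of context to answer the question.\n\n"
    "'''\n{context}\n'''\n\n"
    "Question: {question}\n"
)

# Made-up chunks standing in for retrieved documents
retrieved_chunks = ["Chunk about orchestration.", "Chunk about deployments."]

filled_prompt = template.format(
    context="\n\n".join(retrieved_chunks),
    question="What is the Orchestration Service?",
)
print(filled_prompt)
```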

In [63]:
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from gen_ai_hub.proxy.langchain.openai import ChatOpenAI

llm = ChatOpenAI(proxy_model_name="gpt-4o")
memory = ConversationBufferMemory(
    memory_key="chat_history", output_key="answer", return_messages=True
)
qa_chain = ConversationalRetrievalChain.from_llm(
    llm,
    db.as_retriever(
        search_type="similarity_score_threshold", 
        search_kwargs={"k": 5, "filter": {}, "score_threshold": 0.05}
    ),
    return_source_documents=True,
    memory=memory,
    verbose=False,
    combine_docs_chain_kwargs={"prompt": PROMPT},
)
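
`ConversationalRetrievalChain` first rewrites a follow-up question into a standalone question using the chat history and then retrieves documents for the rewritten question. The sketch below only shows how earlier turns might be flattened into such a condensing prompt. The prompt wording and the `build_condense_prompt` helper are assumptions, not the library's actual template, and the LLM call itself is omitted.

```python
def build_condense_prompt(chat_history, follow_up):
    """Hypothetical helper: flatten (question, answer) turns into one prompt."""
    lines = []
    for human, ai in chat_history:
        lines.append(f"Human: {human}")
        lines.append(f"Assistant: {ai}")
    history_text = "\n".join(lines)
    return (
        "Rephrase the follow-up question as a standalone question.\n\n"
        f"Chat history:\n{history_text}\n\n"
        f"Follow-up question: {follow_up}\n"
        "Standalone question:"
    )


# Abbreviated, made-up history entry for illustration
history = [("Which model providers are available?", "Anthropic, Amazon, and others.")]
condense_prompt = build_condense_prompt(history, "Elaborate on the second one you mentioned")
print(condense_prompt)
```

This is why a follow-up such as "Elaborate on the second one you mentioned" can be resolved: the history supplies the referent before retrieval happens.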

In [64]:
question = "What is the Orchestration Service?"

result = qa_chain.invoke({"question": question})
print("Answer from LLM:")
print("================")
print(result["answer"])

source_docs = result["source_documents"]
print("================")
print(f"Number of used source document chunks: {len(source_docs)}")
for doc in source_docs:
    print(f"Source: {doc.metadata}")  # access content through doc.page_content

Answer from LLM:
The Orchestration Service in SAP BTP is a feature that allows you to combine different modules into a pipeline that can be executed with a single API call. This service enables you to configure the details for each module and pass an orchestration configuration in JSON format with the request body. The response from one module in the pipeline can be used as the input for the next module, allowing for a streamlined and efficient workflow.

This service is particularly useful for deploying and managing AI models, and it can be accessed and configured using tools like Postman. After creating a deployment and obtaining a deployment URL, you can make calls to the orchestration service to perform various tasks, including configuring model and templating modules for inference calls.

For more detailed steps on how to use the Orchestration Service, including obtaining an authentication token and making orchestration calls, you can refer to the procedures outlined in the help d

In [42]:
question = "Which model providers are available in the genai hub?"

result = qa_chain.invoke({"question": question})
print(result["answer"])

The Generative AI Hub in SAP AI Core supports models from the following providers:

1. **Anthropic** (via AWS Bedrock)
2. **Amazon** (via AWS Bedrock)
3. **Azure OpenAI Service**
4. **Open source models** (hosted and accessed through SAP AI Core)
5. **GCP Vertex AI** (providing access to PaLM 2 and Gemini models from Google)
6. **Meta**
7. **Mistral AI**

For more detailed information, you can refer to the "Models and Scenarios in the Generative AI Hub" section on page 101.


In [43]:
question = "Elaborate on the second one you mentioned"

result = qa_chain.invoke({"question": question})
print(result["answer"])

Certainly! AWS Bedrock offers several models from Amazon, specifically under the "Titan" family. The Amazon models available via AWS Bedrock include:

1. **Titan Text Express**: This model is designed for text generation tasks. You can invoke this model using a JSON payload where you specify the input text and various configuration options such as `maxTokenCount`, `stopSequences`, `temperature`, and `topP`. For example:
   ```bash
   curl --location '$DEPLOYMENT_URL/invoke' \
       --header 'AI-Resource-Group: default' \
       --header 'Content-Type: application/json' \
       --header "Authorization: Bearer $AUTH_TOKEN" \
       --data '{
           "inputText": "Who am AI?",
           "textGenerationConfig": {
               "maxTokenCount": 10,
               "stopSequences": [],
               "temperature": 0,
               "topP": 1
           }
       }'
   ```

2. **Titan Text Lite**: Similar to the Titan Text Express, this model also focuses on text generation tasks and ca

In [48]:
question = "Which scenario and executable do I have to use when I want to use an LLM from the provider we just talked about?"

result = qa_chain.invoke({"question": question})
print(result["answer"])

To use an LLM from Amazon via AWS Bedrock, you can refer to the following models and their respective scenarios:

1. **Amazon Titan Embed Text**
2. **Amazon Titan Text Lite**
3. **Amazon Titan Text Express**

For these models, you can use the following executable IDs in your configuration:
- **aws-bedrock amazon--titan-embed-text1.2**
- **aws-bedrock amazon--titan-text-lite1**
- **aws-bedrock amazon--titan-text-express1**

Ensure you are using the correct executable ID from your workflow for the configuration. You should also verify that you have access to the scenario containing generative AI by sending a GET request to `{{apiurl}}/v2/lm/scenarios` and setting the Authorization header with Bearer `$TOKEN` and your resource group.

Here is an example curl command for invoking the Amazon Titan Text Express model:
```bash
curl --location '$DEPLOYMENT_URL/invoke' \
--header 'AI-Resource-Group: default' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $AUTH_TOK