The following code snippet is authored by:<br>
- Markus Fath https://github.com/fath-markus

Using LangChain's vector store plugin for the HANA Vector Engine
to store embeddings generated by AI Core.

Prerequisites:
- langchain >= 0.1.4
- generative-ai-hub-sdk 1.2.0
- hana-ml (used below to open the database connection)
- OpenAI ada deployment (text-embedding-ada-002) on AI Core

See:<br>
https://pypi.org/project/generative-ai-hub-sdk/<br>
https://python.langchain.com/docs/integrations/vectorstores/sap_hanavector<br>


In [10]:
import langchain
from langchain.text_splitter import CharacterTextSplitter

import langchain_community
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores.hanavector import HanaDB

from gen_ai_hub.proxy.langchain.openai import OpenAIEmbeddings

print('langchain version:', langchain.__version__)
print('langchain_community version:', langchain_community.__version__)
# How to get the Gen AI Hub SDK version?

langchain version: 0.1.4
langchain_community version: 0.0.16
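
The open question in the cell above can likely be answered with the standard library: `importlib.metadata.version` looks up an installed distribution's version by its PyPI name. A minimal sketch, assuming the SDK is installed under the distribution name `generative-ai-hub-sdk` (as listed in the prerequisites); `installed_version` is a hypothetical helper name, not part of the SDK.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Assumption: the SDK was installed from PyPI as "generative-ai-hub-sdk".
print('generative-ai-hub-sdk version:', installed_version('generative-ai-hub-sdk'))
```

The helper returns `None` instead of raising, so the cell still runs in an environment where the package is missing.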


In [11]:
# using langchain to read and split the doc
text_documents = TextLoader("data/state_of_the_union.txt").load()
text_splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=10)
text_chunks = text_splitter.split_documents(text_documents)
print(f"Number of document chunks: {len(text_chunks)}")

# using ai core to embed
embeddings = OpenAIEmbeddings(proxy_model_name='text-embedding-ada-002')


Number of document chunks: 88
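
To make `chunk_size` and `chunk_overlap` concrete, here is a simplified sketch of fixed-size chunking with overlap. It is not LangChain's implementation (`CharacterTextSplitter` first splits on a separator and then merges the pieces up to `chunk_size`); `sliding_chunks` is a hypothetical helper that only illustrates how consecutive chunks share `chunk_overlap` characters.

```python
def sliding_chunks(text, chunk_size=500, chunk_overlap=10):
    """Cut text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks


sample = "abcdefghij" * 3  # 30 characters of illustrative input
for chunk in sliding_chunks(sample, chunk_size=12, chunk_overlap=2):
    print(repr(chunk))
```

A small overlap such as the 10 characters used above keeps a little shared context across chunk boundaries without duplicating much of the text.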


In [19]:
# Creating a connection using hana-ml
from hana_ml import ConnectionContext
# cc = ConnectionContext(userkey='VDB_BETA', encrypt=True)
cc = ConnectionContext(
    address='[somehost].hanacloud.ondemand.com',
    port=443,
    user='[your user]',
    password='[your password]',
    encrypt=True
)
connection = cc.connection

print(cc.hana_version())
print(cc.get_current_schema())
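
Hardcoding the password in a notebook makes it easy to leak. One hedged alternative is to read the connection settings from environment variables and pass them on to `ConnectionContext`. The variable names (`HANA_ADDRESS`, `HANA_PORT`, `HANA_USER`, `HANA_PASSWORD`) and the helper `hana_settings_from_env` are assumptions made for this sketch, not names defined by hana-ml.

```python
import os


def hana_settings_from_env(env=None):
    """Collect ConnectionContext keyword arguments from environment variables."""
    env = os.environ if env is None else env
    required = ("HANA_ADDRESS", "HANA_USER", "HANA_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "address": env["HANA_ADDRESS"],
        "port": int(env.get("HANA_PORT", "443")),  # default to the port used above
        "user": env["HANA_USER"],
        "password": env["HANA_PASSWORD"],
        "encrypt": True,
    }


# Usage, assuming the variables are set and hana-ml is installed:
# from hana_ml import ConnectionContext
# cc = ConnectionContext(**hana_settings_from_env())
```

The commented-out `userkey` variant in the cell above serves the same purpose by keeping the credentials outside the notebook.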

In [13]:
# Creates the table if it does not exist yet
db = HanaDB(
    embedding=embeddings, connection=connection, table_name="OPENAI_SAMPLE"
)

In [14]:
# Delete already existing documents from the table
# db.delete(filter={})

# add the loaded document chunks
db.add_documents(text_chunks)

[]

In [22]:
# take a look at the table
hdf = cc.sql(''' SELECT "VEC_TEXT", "VEC_META", TO_NVARCHAR("VEC_VECTOR") AS "VEC_VECTOR" FROM "OPENAI_SAMPLE" ''')
df = hdf.head(5).collect()
df


,VEC_TEXT,VEC_META,VEC_VECTOR
0,"Madam Speaker, Madam Vice President, our First...","{""source"": ""data/state_of_the_union.txt""}","[-0.0087491,-0.012790698,-0.030803164,-0.01842..."
1,"Six days ago, Russia’s Vladimir Putin sought t...","{""source"": ""data/state_of_the_union.txt""}","[0.00020172076,-0.015397401,0.0011432898,-0.01..."
2,Groups of citizens blocking tanks with their b...,"{""source"": ""data/state_of_the_union.txt""}","[-0.008681937,-0.012989667,-0.002399831,-0.003..."
3,"Please rise if you are able and show that, Yes...","{""source"": ""data/state_of_the_union.txt""}","[-0.0020637077,-0.03406553,-0.0018372422,-0.02..."
4,The United States is a member along with 29 ot...,"{""source"": ""data/state_of_the_union.txt""}","[0.0018747073,-0.007185545,0.014091133,-0.0155..."


In [16]:
query = "What did the president say about Ketanji Brown Jackson"
docs = db.similarity_search(query, k=2)

for doc in docs:
    print("-" * 80)
    print(doc.page_content)

--------------------------------------------------------------------------------
One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. 

And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.
--------------------------------------------------------------------------------
As I said last year, especially to our younger transgender Americans, I will always have your back as your President, so you can be yourself and reach your God-given potential. 

While it often appears that we never agree, that isn’t true. I signed 80 bipartisan bills into law last year. From preventing government shutdowns to protecting Asian-Americans from still-too-common hate crimes to reforming military justice.
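
Conceptually, `similarity_search` embeds the query and returns the `k` stored chunks whose vectors are closest to it. The sketch below reproduces that ranking in plain Python on tiny made-up vectors. It assumes cosine similarity as the metric; the vectors, the texts, and the helper names `cosine_similarity` and `top_k` are illustrative only. In the notebook the comparison is done by the database rather than in Python.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, rows, k=2):
    """Return the k (score, text) pairs with the highest similarity to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Made-up 3-dimensional "embeddings"; real ada embeddings are far longer.
rows = [
    ("chunk about the Supreme Court", [0.9, 0.1, 0.0]),
    ("chunk about bipartisan bills", [0.6, 0.6, 0.1]),
    ("chunk about NATO", [0.0, 0.2, 0.9]),
]

for score, text in top_k([1.0, 0.2, 0.0], rows, k=2):
    print(f"{score:.3f}  {text}")
```

This also explains why the second hit above is only loosely related to the question: the search always returns the `k` nearest chunks, however distant the runner-up is.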
