### Oracle Vector DB wrapped as a llama-index custom Vector Store

* Inspired by: https://docs.llamaindex.ai/en/stable/examples/low_level/vector_store.html

In [1]:
import logging
import sys

from typing import List, Any, Optional, Dict, Tuple
from llama_index.vector_stores.types import (
    VectorStore,
    VectorStoreQuery,
    VectorStoreQueryResult,
)
from llama_index import StorageContext, VectorStoreIndex, ServiceContext
from llama_index.schema import TextNode, BaseNode, Document

import oci
import ads
from ads.llm import GenerativeAIEmbeddings, GenerativeAI
import oracledb

from config_private import COMPARTMENT_OCID, ENDPOINT

from oracle_vector_db import OracleVectorStore

In [2]:
# for debugging
# logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
# logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
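
A note on the two commented lines above: in a fresh interpreter, uncommenting both would attach two stdout handlers to the root logger (one from `basicConfig`, one from `addHandler`), so each record would likely print twice. A minimal sketch of an alternative that attaches a single handler is shown here; `enable_debug_logging` is a hypothetical helper name, not part of this notebook or of llama-index.

```python
import logging
import sys


def enable_debug_logging(level: int = logging.DEBUG) -> logging.Logger:
    """Route root-logger output to stdout through exactly one handler."""
    root = logging.getLogger()
    root.setLevel(level)

    # Attach a stdout handler only if one is not already present, so that
    # re-running the notebook cell does not duplicate every log line.
    already_attached = any(
        isinstance(h, logging.StreamHandler)
        and getattr(h, "stream", None) is sys.stdout
        for h in root.handlers
    )
    if not already_attached:
        root.addHandler(logging.StreamHandler(stream=sys.stdout))
    return root
```

Calling `enable_debug_logging()` once at the top of the notebook would be enough; calling it again is harmless.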

In [3]:
def load_oci_config(profile="DEFAULT"):
    # Read the OCI config file so we can connect to OCI with an API key.
    # Pass another profile name if you are not using the DEFAULT profile.
    oci_config = oci.config.from_file("~/.oci/config", profile)

    return oci_config

In [4]:
# setup
oci_config = load_oci_config()

# ADS needs the API-key auth object built from the OCI config
api_keys_config = ads.auth.api_keys(oci_config)

# English model; for other languages use the multilingual variant
MODEL_NAME = "cohere.embed-english-v3.0"

embed_model = GenerativeAIEmbeddings(
    compartment_id=COMPARTMENT_OCID,
    model=MODEL_NAME,
    auth=api_keys_config,
    # Optionally you can specify keyword arguments for the OCI client, e.g. service_endpoint.
    client_kwargs={
        "service_endpoint": ENDPOINT
    },
)

#### Using the wrapper for the DB Vector Store

In [5]:
v_store = OracleVectorStore(verbose=False)

In [6]:
query = "What is JSON Relational Duality?"

In [7]:
# embed the query using OCI GenAI
query_embedding = embed_model.embed_documents([query])[0]

# wrap the embedding in a llama-index query object
query_obj = VectorStoreQuery(
    query_embedding=query_embedding, similarity_top_k=5
)
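
The query object carries only an embedding and `similarity_top_k`; the store is expected to return the closest stored chunks. The notebook does not show how `OracleVectorStore` does this (presumably the search runs inside the database), so the block below is only a toy in-memory stand-in that illustrates the idea. `ToyVectorStore` is a hypothetical name, and the use of cosine similarity is an assumption.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Hypothetical in-memory store; NOT the OracleVectorStore implementation."""

    def __init__(self) -> None:
        # node id -> (embedding, text)
        self._rows: Dict[str, Tuple[List[float], str]] = {}

    def add(self, node_id: str, embedding: List[float], text: str) -> None:
        self._rows[node_id] = (embedding, text)

    def query(
        self, query_embedding: List[float], similarity_top_k: int = 5
    ) -> List[Tuple[float, str, str]]:
        # Brute force: score every stored vector, then keep the best k.
        scored = [
            (cosine_similarity(query_embedding, emb), node_id, text)
            for node_id, (emb, text) in self._rows.items()
        ]
        scored.sort(key=lambda row: row[0], reverse=True)
        return scored[:similarity_top_k]


store = ToyVectorStore()
store.add("a", [1.0, 0.0], "chunk about JSON")
store.add("b", [0.0, 1.0], "chunk about something else")
store.add("c", [1.0, 1.0], "chunk about JSON and more")

top = store.query([1.0, 0.0], similarity_top_k=2)
for score, node_id, text in top:
    print(f"{node_id}: {score:.3f} {text}")
```

A brute-force scan like this is fine for a handful of vectors; a real vector database would typically rely on an index instead.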

#### Use our Vector Store DB

In [8]:
%%time

q_result = v_store.query(query_obj)

CPU times: user 19.3 ms, sys: 5.29 ms, total: 24.6 ms
Wall time: 491 ms


In [9]:
for n, node_id, sim in zip(q_result.nodes, q_result.ids, q_result.similarities):
    print(f"Doc. id: {node_id}")
    print(f"Similarity: {-sim}")
    print(n.text)
    print("")

Doc. id: 1c0e1d15-c6ba-4d1d-89f9-11a497fe7bd0
Similarity: 0.605
2 Application Development JSON JSON-Relational Duality JSON Relational Duality Views are fully updatable JSON views over relational data. Data is still stored in relational tables in a highly efficient normalized format but can be accessed by applications in the form of JSON documents. Duality views provide you with game-changing flexibility and simplicity by overcoming the historical challenges developers have faced when building applications using relational or document models. Related Resources View Documentation JSON Schema JSON Schema-based validation is allowed with the SQL condition IS JSON and with a PL/SQL utility function. A JSON schema is a JSON document that specifies allowed properties (field names) and the corresponding allowed data types, and whether they are optional or mandatory. By default, JSON data is schemaless, providing flexibility. However, you may want to ensure that your JSON data contains particu

#### Integrate into the bigger RAG picture

In [10]:
llm_oci = GenerativeAI(
    compartment_id=COMPARTMENT_OCID,
    max_tokens=1024,
    # Optionally you can specify keyword arguments for the OCI client, e.g. service_endpoint.
    client_kwargs={
        "service_endpoint": ENDPOINT
    },
)

In [11]:
service_context = ServiceContext.from_defaults(llm=llm_oci, embed_model=embed_model)

In [12]:
index = VectorStoreIndex.from_vector_store(
    vector_store=v_store, service_context=service_context
)

In [13]:
query_engine = index.as_query_engine(similarity_top_k=5)
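
Conceptually, a query engine like the one above retrieves the top-k chunks for a question, places them in a prompt as context, and asks the LLM to answer from that context. The sketch below shows that flow in plain Python under stated assumptions: `build_rag_prompt`, `answer_with_rag`, and the prompt template are hypothetical and are not llama-index's actual internals or prompt text.

```python
from typing import Callable, List

# Hypothetical template, for illustration only.
PROMPT_TEMPLATE = (
    "Context information is below.\n"
    "---------------------\n"
    "{context}\n"
    "---------------------\n"
    "Using only the context above, answer the question.\n"
    "Question: {question}\n"
    "Answer: "
)


def build_rag_prompt(question: str, chunks: List[str]) -> str:
    """Join the retrieved chunks and place them in the prompt as context."""
    context = "\n\n".join(chunks)
    return PROMPT_TEMPLATE.format(context=context, question=question)


def answer_with_rag(
    question: str,
    retrieve: Callable[[str, int], List[str]],
    generate: Callable[[str], str],
    top_k: int = 5,
) -> str:
    """Retrieve top_k chunks, build the prompt, and pass it to the generator."""
    chunks = retrieve(question, top_k)
    return generate(build_rag_prompt(question, chunks))


# Usage with stand-in callables; a real setup would plug in the vector
# store for `retrieve` and the LLM for `generate`.
docs = ["alpha", "beta", "gamma"]
reply = answer_with_rag(
    "What is alpha?",
    retrieve=lambda q, k: docs[:k],
    generate=lambda prompt: f"(prompt of {len(prompt)} characters received)",
    top_k=2,
)
print(reply)
```

Keeping retrieval and generation as separate callables mirrors the split in this notebook, where the vector store and the LLM are configured independently.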

In [15]:
%%time

response = query_engine.query(query)

print(f"Question: {query}")
print(response.response)
print("")

Question: What is JSON Relational Duality?
JSON Relational Duality is a concept that allows for flexible and simple application development by using relational or document models. It provides a way to store data in a highly efficient, normalized format while still having the ability to access it through applications in the form of JSON documents. This is done by creating duality views, which are fully updatable JSON views over relational data. This allows for the joining of JSON data with non-JSON relational data, the generation of JSON documents from relational data, and the projection of JSON data into a relational format. It also provides JSON schema-based validation to ensure data contains mandatory fixed structures and typing along with other optional flexible components. 

Would you like to know more about any of the aforementioned concepts? 

CPU times: user 60 ms, sys: 7.36 ms, total: 67.3 ms
Wall time: 6.56 s
