![Banner](banner.png)

# Exercise 2: Working with a Retrieval Augmented Generation (RAG) model

This notebook introduces the concept of **Retrieval Augmented Generation (RAG)**, which combines the power of vector searches with text generation by Artificial Intelligence. The goal is to extract information from a document repository and augment it with a Generative AI model to provide contextualized answers.

### Aims of the notebook:
1. **Creating vectors for documents**: First, PDF documents are stored in the database, split into smaller blocks of text and converted into vectors using a pre-trained language model.
2. **Creating vector indexes**: By creating a vector index on the generated embeddings, an efficient similarity search is enabled.
3. **Retrieval of texts**: After vectorizing the document content, text snippets closest to the queries can be retrieved from the database.
4. **Integration of Generative AI**: The retrieved text snippets are combined with a Generative AI model to create an augmented response consisting of both the existing document content and AI-generated content.
5. **Creation of triggers and functions**: Automated processes for updating embeddings on new documents are implemented and tested.

### Data basis:
This notebook uses PDF documents as a basis. These are realistic documents, such as scientific reports or frequently asked questions (FAQs), and they are stored as **BLOB** data in the `RAG_TAB` table.

Each PDF document is divided into **text chunks**, with each chunk containing a maximum of 100 words. These chunks are stored as text in the `RAG_CHUNKS` table. In addition, a **vector embedding** is generated for each chunk, which is used for semantic similarity searches. These embeddings make it possible to find sections of text that are semantically closest to a search query.

The data stored in the tables includes:
- **ID**: A unique identifier for each document.
- **Data (BLOB)**: The content of the uploaded PDF files.
- **Chunk data**: Text snippets extracted from the PDF documents.
- **Embedding**: The vector embeddings computed for each text snippet.

This notebook shows how modern AI technologies can be used to efficiently search existing information from large sets of documents and augment it with generative models.

Have fun and good luck!

## Preparing the environment

Before the actual processing can begin, some basic preparations must be made. This section describes how to install the required libraries and set up the connection to the Oracle database.

### Installed packages:
1. `oracledb`: This package allows you to access Oracle databases from Python and execute SQL queries. 
2. `ipython-sql`: Allows direct use of SQL within a Jupyter notebook.
3. `pandas`: Used to display database queries in an easy-to-read DataFrame format.

### Oracle database setup:
- **Instant Client Initialization**: The Oracle Instant Client is initialized using the `oracledb.init_oracle_client()` function. The path to the required libraries is specified explicitly to ensure that the database connection works.
- **Connection to the database**: The connection to the database is established using two environment variables (`HOST_NAME` and `PDB_NAME`). These variables contain the host and the service name of the database (PDB - Pluggable Database), which are combined to form a connection string (`dsn`).
- **Connection confirmation**: If the connection has been successfully established, this is confirmed by the output of the connection details.
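
The connection string described above can be sketched as a small helper. This is only an illustration: the function name `build_dsn` and the placeholder host names are assumptions, not part of the workshop code. The check for missing variables is added because concatenating an unset variable (`None`) with a string would otherwise fail with a less helpful error.

```python
import os


def build_dsn(env=os.environ):
    """Combine HOST_NAME and PDB_NAME into a 'host/service' connection string."""
    # Collect every variable that is unset or empty so the error names all of them
    missing = [name for name in ("HOST_NAME", "PDB_NAME") if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variable(s): {', '.join(missing)}")
    return f"{env['HOST_NAME']}/{env['PDB_NAME']}"


# Example with placeholder values (hypothetical host and service names)
print(build_dsn({"HOST_NAME": "dbhost.example.com", "PDB_NAME": "pdb1.example.com"}))
```

The returned string can then be passed as `dsn` to `oracledb.connect()`, as in the cell below.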


In [1]:
!pip install oracledb 
!pip install ipython-sql
!pip install pandas



In [40]:
import oracledb
import pandas as pd
import os
import warnings
import time

warnings.filterwarnings('ignore')
pd.set_option('expand_frame_repr', False)
pd.options.display.max_colwidth = 800

d = '/home/jovyan/.jupyter/instantclient_23_5'
oracledb.init_oracle_client(lib_dir=d)
host = os.environ.get('HOST_NAME')
pdb = os.environ.get('PDB_NAME')
cs = host + '/' + pdb
print(cs)
# should be something like 'db23ai.subbb3fff175.quickcluster.oraclevcn.com/michael.subbb3fff175.quickcluster.oraclevcn.com'

connection = oracledb.connect(user='vector', password='vector', dsn=cs)
print(connection)

db23ai.subbb3fff175.quickcluster.oraclevcn.com/marcel.subbb3fff175.quickcluster.oraclevcn.com
<oracledb.Connection to vector@db23ai.subbb3fff175.quickcluster.oraclevcn.com/marcel.subbb3fff175.quickcluster.oraclevcn.com>


## Script 00: Creating the document table

This section creates the `RAG_TAB` table for saving documents in BLOB format. It contains:
- **id (number)**: Unique identifier.
- **data (blob)**: Saves the document contents.

In [11]:
sql = """
BEGIN
 EXECUTE IMMEDIATE('drop table if exists RAG_TAB purge');
 EXECUTE IMMEDIATE('create table RAG_TAB (id number, data blob)'); 
END;
"""

with connection.cursor() as cursor:     
    cursor.execute(sql)
    if cursor.warning :
        print(cursor.warning)
    else :
        print("SQL execution successful")

SQL execution successful


## Script 01: Inserting documents

In this step, two PDF documents are inserted into the `RAG_TAB` table:
- **Document 1**: *breast-cancer-facts-and-figures-2019-2020.pdf*
- **Document 2**: *Coronavirus-FAQ.pdf*

In [12]:
sql= """
BEGIN
  insert into RAG_TAB values(1, to_blob(bfilename('VEC_DUMP', 'breast-cancer-facts-and-figures-2019-2020.pdf')));  
  insert into RAG_TAB values(2, to_blob(bfilename('VEC_DUMP', 'Coronavirus-FAQ.pdf')));
  commit;
END;
"""

with connection.cursor() as cursor:     
    cursor.execute(sql)
    if cursor.warning :
        print(cursor.warning)
    else :
        print("SQL execution successful")

SQL execution successful


In [13]:
sql= """
select t.id, round(dbms_lob.getlength(t.data)/1024,2) "[kB]" from RAG_TAB t
"""
df = pd.read_sql(sql=sql, con=connection)
display(df)

| | ID | [kB] |
|---|---|---|
| 0 | 1 | 1360.9 |
| 1 | 2 | 104.94 |


## Script 02: Creating the `RAG_CHUNKS` table and inserting chunks

In this step, the `RAG_CHUNKS` table is created, which stores the split document texts (chunks) and their embeddings.

### Steps:
1. **Creating the `RAG_CHUNKS` table**:
 - **doc_id**: Document ID assigned to the original document.
 - **chunk_id**: Unique ID for each chunk.
 - **chunk_data**: The text of the chunk (max. 4000 characters).
 - **chunk_embedding**: Vector embedding of the chunk.

2. **Inserting the chunks and embeddings**:
 - The PDF data from `RAG_TAB` is broken down into smaller text blocks (chunks) (max. 100 words per chunk). 
 - For each chunk, a vector embedding is created with the model `minilml6_model` and inserted into the table `RAG_CHUNKS`.
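
To make the idea of word-based chunking concrete, here is a minimal Python sketch. It is a simplification and not what `dbms_vector_chain.utl_to_chunks` does internally: the database splits "recursively" and normalizes the text, whereas this hypothetical helper `chunk_by_words` simply cuts after every 100 whitespace-separated words with no overlap.

```python
def chunk_by_words(text, max_words=100):
    """Split text into consecutive chunks of at most max_words words (no overlap)."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


# A made-up text of 250 words yields two full chunks and one partial chunk
sample = "word " * 250
chunks = chunk_by_words(sample)
print([len(chunk.split()) for chunk in chunks])  # [100, 100, 50]
```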

### Model selection: minilml6_model

In exercise 2, we use the model **`minilml6_model`**, which enables a particularly efficient calculation of vector embeddings. This model was chosen for its compact architecture and fast computation without sacrificing accuracy. It is ideal for scenarios in which large data sets need to be processed and resources such as computing power and storage space need to be used efficiently.

### Model comparison

- `distiluse-base-multilingual-cased-v2`: Precise, multilingual (including German), computationally intensive - ideal for semantic similarity.
- `minilml6_model`: trained only for English texts, efficient, resource-saving - suitable for fast access to large amounts of data.

![title](img/end-to-end_pipeline_with_DBMS_VECTOR_CHAIN_package.png)

In [14]:
sql= """
begin
   execute immediate ('drop table if exists RAG_CHUNKS purge');
   execute immediate ('CREATE TABLE RAG_CHUNKS ('||
     '  doc_id          NUMBER, '||
     '  chunk_id        NUMBER, '||
     '  chunk_data      VARCHAR2(4000), '||
     '  chunk_embedding VECTOR )'
   );
end;
"""


with connection.cursor() as cursor:     
    cursor.execute(sql)
    if cursor.warning :
        print(cursor.warning)
    else :
        print("SQL execution successful")

SQL execution successful


In [15]:
sql= """
begin
   INSERT INTO RAG_CHUNKS
   SELECT dt.id                      doc_id,
       et.embed_id                chunk_id,
       et.embed_data              chunk_data,
       to_vector(et.embed_vector) chunk_embedding 
   FROM RAG_TAB dt,
     dbms_vector_chain.utl_to_embeddings(
         dbms_vector_chain.utl_to_chunks(dbms_vector_chain.utl_to_text(dt.data),json('{
             "by" : "words",
             "max" : "100",
             "overlap" : "0",
             "split" : "recursively",
             "language" : "american",
             "normalize":"all"}')),
         json('{"provider":"database", "model":"minilml6_model"}')) t,
         json_table(t.column_value, '$[*]' COLUMNS (embed_id NUMBER path '$.embed_id', embed_data VARCHAR2(4000) path '$.embed_data',
                embed_vector clob path '$.embed_vector')) et;

commit;
end;
"""

tic = time.perf_counter()
with connection.cursor() as cursor:     
    cursor.execute(sql)
    if cursor.warning :
        print(cursor.warning)
    else :
        print("SQL execution successful")
toc = time.perf_counter()
print(f"Chunking and vectorizing took {toc - tic:0.4f} seconds")

SQL execution successful
Chunking and vectorizing took 9.5504 seconds


## Script 03: Creating a compound trigger for `RAG_TAB`

In this step, a **compound trigger** is created to ensure that new entries in `RAG_TAB` are also automatically transferred to the `RAG_CHUNKS` table. The trigger prevents data from becoming inconsistent between the two tables.

### How the trigger works:
1. **AFTER EACH ROW**: After a new row has been inserted in `RAG_TAB`, the new blob (document content) and ID are saved.
2. **AFTER STATEMENT**: Once the insertion process is complete, the document is split into text chunks, a vector embedding is created for each chunk and this information is transferred to the `RAG_CHUNKS` table.


In [16]:
sql= """
-- Compound trigger to avoid mismatch in rag_tab and rag_chunks!
CREATE OR REPLACE TRIGGER INSERT_rag_tab
FOR INSERT ON rag_tab
COMPOUND TRIGGER 
the_new blob;
new_id number;
AFTER EACH ROW IS
BEGIN
 the_new := :new.data;
 new_id  := :new.id;
END AFTER EACH ROW;
AFTER STATEMENT IS
BEGIN
 INSERT INTO RAG_CHUNKS
 SELECT new_id                     doc_id,
        et.embed_id                chunk_id,
        et.embed_data              chunk_data,
        to_vector(et.embed_vector) chunk_embedding
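 -- Only the captured BLOB is chunked; an uncorrelated join with RAG_TAB would repeat every chunk once per table row
 FROM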
     dbms_vector_chain.utl_to_embeddings(
         dbms_vector_chain.utl_to_chunks(dbms_vector_chain.utl_to_text(the_new),json('{
             "by" : "words",
             "max" : "100",
             "overlap" : "0",
             "split" : "recursively",
             "language" : "american",
             "normalize":"all"}')),
         json('{"provider":"database", "model":"minilml6_model"}')) t,
     json_table(t.column_value, '$[*]' COLUMNS (embed_id NUMBER path '$.embed_id', embed_data VARCHAR2(4000) path '$.embed_data',
                embed_vector clob path '$.embed_vector')) et;
END AFTER STATEMENT;
END INSERT_rag_tab;
"""

with connection.cursor() as cursor:     
    cursor.execute(sql)
    if cursor.warning :
        print(cursor.warning)
    else :
        print("SQL execution successful")

SQL execution successful


## Script 03a: Inserting a new document in `RAG_TAB`

In this step, another document is inserted into the table `RAG_TAB`:

- **Document 3**: *breast-cancer-new.pdf*

As soon as the document has been inserted, the previously created trigger ensures that the document is split into text chunks, vector embeddings are created and these are transferred to the `RAG_CHUNKS` table.

In [17]:
sql="""
BEGIN
insert into RAG_TAB values(3, to_blob(bfilename('VEC_DUMP', 'breast-cancer-new.pdf')));
commit;
END; 
"""
tic = time.perf_counter()
with connection.cursor() as cursor:     
    cursor.execute(sql)
    if cursor.warning :
        print(cursor.warning)
    else :
        print("SQL execution successful")
toc = time.perf_counter()
print(f"Loading, chunking and vectorizing took {toc - tic:0.4f} seconds")

SQL execution successful
Loading, chunking and vectorizing took 13.0958 seconds


## Script 03b: Creation of a vector index on `RAG_CHUNKS`

In this step, a **vector index** is created on the `chunk_embedding` column of the `RAG_CHUNKS` table to improve the efficiency of the similarity search based on vector embeddings.

### Details:
1. **Index creation**: The index `IDX_RAG_CHUNKS_EMBED` is created with **cosine distance** as the metric and a target accuracy of 95%.
2. **Check the index**: An SQL query is executed to check the newly created index and ensure that it was created correctly.
3. **Retrieve the index parameters**: The parameters of the vector index are output in JSON format to view the structure and configuration of the index.

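Note: Depending on the database release, DML operations (e.g. INSERT, UPDATE) on the indexed table may no longer be permitted once an **HNSW** index has been created; this restriction applied to early Oracle Database 23ai releases. An **IVF index** can be used where DML is required.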
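
The distance metric can be illustrated with a few lines of plain Python. The sketch below assumes that cosine distance is defined as 1 minus cosine similarity; the helper names are hypothetical. It also shows why the `EUCLIDEAN_SQUARED` ordering used by the functions later in this notebook ranks chunks the same way as cosine distance, provided the embeddings have unit length (whether the model's output is normalized is an assumption that should be verified).

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def euclidean_squared(a, b):
    """Squared Euclidean distance of two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


# Made-up example vectors, scaled to unit length
a = normalize([1.0, 2.0, 3.0])
b = normalize([2.0, 1.0, 0.5])

# For unit vectors: |a - b|^2 = 2 - 2 * (a . b) = 2 * cosine_distance(a, b)
print(cosine_distance(a, b), euclidean_squared(a, b) / 2)
```

Note that a vector index is typically only considered by the optimizer when the query uses the same distance metric as the index.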

In [18]:
sql="""
create vector index IDX_RAG_CHUNKS_EMBED on RAG_CHUNKS(chunk_embedding) 
 organization INMEMORY NEIGHBOR GRAPH
 distance COSINE
 with target accuracy 95
"""

with connection.cursor() as cursor:     
    cursor.execute(sql)
    if cursor.warning :
        print(cursor.warning)
    else :
        print("SQL execution successful")

SQL execution successful


In [19]:
sql="""
SELECT INDEX_NAME, INDEX_TYPE, INDEX_SUBTYPE
FROM USER_INDEXES
WHERE INDEX_NAME='IDX_RAG_CHUNKS_EMBED'
"""

df = pd.read_sql(sql=sql, con=connection)
display(df)

| | INDEX_NAME | INDEX_TYPE | INDEX_SUBTYPE |
|---|---|---|---|
| 0 | IDX_RAG_CHUNKS_EMBED | VECTOR | INMEMORY_NEIGHBOR_GRAPH_HNSW |


In [20]:
sql="""
SELECT JSON_SERIALIZE(IDX_PARAMS returning varchar2 PRETTY) "IDX_Params"
FROM VECSYS.VECTOR$INDEX
where IDX_NAME = 'IDX_RAG_CHUNKS_EMBED'
"""

import json

with connection.cursor() as cursor:     
    cursor.execute(sql)
    res, =cursor.fetchone()
    json_obj = json.loads(res)
    pretty= json.dumps(json_obj, indent=4)
    print(pretty)

{
    "type": "HNSW",
    "num_neighbors": 32,
    "efConstruction": 300,
    "distance": "COSINE",
    "accuracy": 95,
    "vector_type": "FLOAT32",
    "vector_dimension": 384,
    "degree_of_parallelism": 1,
    "pdb_id": 4,
    "indexed_col": "CHUNK_EMBEDDING"
}


## Script 04: Functions for the integration of Generative AI (RAG)

This section presents two variants of the function `rag_with_genai_function` that combines Generative AI with semantic similarity search in the stored document chunks. **However, only variant 2 is used in the workshop.**

### Variant 1: Generative AI with Oracle Cloud Integration (documentation only)
This function uses **Oracle Generative AI** via the provided API. The steps include:
1. **Create a vector embedding**: An embedding is generated for the input.
2. **Search for similar chunks**: The most similar chunks from the saved PDF documents are determined based on the vector distance.
3. **Generative AI response**: The retrieved text chunk is combined with a response from Oracle Generative AI. The text is sent to the **cohere.command** model service via an API request.
4. **Return of the result**: The function returns the document chunk as well as the generative response.


### Variant 2: External Generative AI integration with SSL (workshop variant)
This function integrates **external Generative AI** via an HTTP API with SSL certificate. Only this variant is used in the workshop. The steps are:
1. **Vector embedding**: A vector embedding is created for the input.
2. **Search for chunks**: The most similar chunks from the `RAG_CHUNKS` table are retrieved.
3. **API request to external service**: The request is sent to an external Generative AI model (e.g. Llama3.1) via an API endpoint.
4. **Combine response**: The Generative AI response is combined with the document chunks and returned.

This variant shows how an external Generative AI service is integrated into the Retrieval Augmented Generation (RAG) process and is used in the workshop.
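
The request and response handling of variant 2 can be sketched in Python. This is not the PL/SQL used in the workshop; the helper names and the sample response are made up for illustration, and the response shape is only inferred from the `$.message.content` path that the function reads. Building the body with `json.dumps` has the advantage that quotes or line breaks in the user's question are escaped automatically, which plain string concatenation does not do.

```python
import json


def build_chat_payload(question, system_prompt, model="llama3.1:latest"):
    """Build the JSON request body for a chat endpoint."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
        ],
        "stream": False,
    }
    # json.dumps escapes quotes, backslashes and newlines inside the strings
    return json.dumps(payload)


def extract_answer(response_text):
    """Read the generated text from the response (same path as $.message.content)."""
    return json.loads(response_text)["message"]["content"]


body = build_chat_payload('Is "male breast cancer" rare?', "You are a medical doctor.")
print(body)

# Hypothetical response, reduced to the one field that is read
sample_response = '{"message": {"role": "assistant", "content": "Yes."}}'
print(extract_answer(sample_response))
```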

In [21]:
### variant 2
### accesses the internal GenAI service via an external name and SSL certificate

sql="""
CREATE OR REPLACE FUNCTION rag_with_genai_function( rag_input in varchar2) return varchar2
AS
    l_url             VARCHAR2(400) := 'http://olama.meinnetzwerk.com/api/chat';
    req               utl_http.req;
    resp              utl_http.resp;
    body              VARCHAR2(4000);
    buffer            VARCHAR2(8192);
    long_text         VARCHAR2(32767);
    query_vector      CLOB;
    cursor c1 is SELECT * FROM rag_chunks ORDER BY VECTOR_DISTANCE(CHUNK_EMBEDDING, query_vector, EUCLIDEAN_SQUARED) 
              FETCH FIRST 1 ROWS ONLY WITH TARGET ACCURACY 90; 
BEGIN
   -- in case you are behind a proxy, please uncomment the following line after setting the correct proxy address
   -- utl_http.set_proxy('http://username:passwd@192.168.22.33:5678');
   body := '{"model": "llama3.1:latest","messages": [{"role": "system", "content": "you are a lightly arrogant medical doctor"},' || 
                                                    '{"role": "user",   "content": "'|| rag_input ||'"}],' ||
	       '"stream": false, "options": {"use_mmap": true,"use_mlock": true,"num_thread": 8}}';
           
   req := utl_http.begin_request(l_url, 'POST', 'HTTP/1.1');
   utl_http.set_header(req, 'Accept', '*/*');
   utl_http.set_header(req, 'Content-Type', 'application/json');
   utl_http.set_header(req, 'Content-Length', length(body));
   utl_http.write_text(req, body);
   
   resp := utl_http.get_response(req);
   BEGIN
      utl_http.read_text(resp, buffer);
   EXCEPTION
      WHEN utl_http.end_of_body THEN
         utl_http.end_response(resp);
   END;
   buffer := json_value (buffer, '$.message.content' returning varchar2);

   SELECT vector_embedding(minilml6_model using rag_input as data) into query_vector;
   for row_1 in c1 loop
      long_text := '*** Internal PDF Search: \n\n'||row_1.chunk_data||' \n\n';
   end loop;

   long_text := long_text||'*** Generative AI Response:\n\n'||buffer;

   return (long_text);
END;
"""

with connection.cursor() as cursor:     
    cursor.execute(sql)
    if cursor.warning :
        print(cursor.warning)
    else :
        print("SQL execution successful")

SQL execution successful


## Script 05: Network configuration: Assignment of access rights

In this step, the network configuration is adjusted to ensure that the user `vector` has the necessary rights to make outbound network requests. This is required for API calls to external Generative AI services.

### SQL command:
- **DBMS_NETWORK_ACL_ADMIN.APPEND_HOST_ACE**: This command appends an access control entry (ACE) to the network access control list (ACL), giving the user `vector` permission to make network connections.
 - **host**: The host name. Here `'*'` allows connections to all hosts, which is convenient for a workshop but broader than necessary.
 - **privilege_list**: The list of authorizations. Here the privilege `connect` is granted so that the user `vector` can establish connections to external services.

In [22]:
sql="""
BEGIN
  DBMS_NETWORK_ACL_ADMIN.APPEND_HOST_ACE(
    host => '*',
    ace => xs$ace_type(privilege_list => xs$name_list('connect'),
                       principal_name => 'vector',
                       principal_type => xs_acl.ptype_db));
END;
"""

with connection.cursor() as cursor:     
    cursor.execute(sql)
    if cursor.warning :
        print(cursor.warning)
    else :
        print("SQL execution successful")

SQL execution successful


## Script 06: Creation of OCI access data for API usage

In this step, a **credential** is created for access to Oracle Cloud Infrastructure (OCI) services. This credential is required to use the Generative AI functions in OCI.

### SQL command:
- **Creation of a JSON object**: A JSON object is created that contains the necessary OCI access data:
 - **user_ocid**: The unique identifier of the user in OCI.
 - **tenancy_ocid**: The identifier of the tenancy (the OCI environment).
 - **compartment_ocid**: The identifier of the compartment in which the resources are located.
 - **private_key**: The private key required for authentication.
 - **fingerprint**: The fingerprint of the public key related to the private key.
  
- **Creation of the OCI credential**: The OCI credentials are created using the `dbms_vector_chain.create_credential` command and stored under the name `'OCI_CRED'`. These credentials are later used to authenticate to the OCI Generative AI service.
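
Since this step is documented without a code cell, the following is a rough sketch of how it could look from Python. Everything here is an assumption for illustration: the helper names are hypothetical, the OCID values are placeholders, and the parameter names `credential_name` and `params` of `dbms_vector_chain.create_credential` should be checked against the documentation of your database release.

```python
import json


def build_oci_credential_params(user_ocid, tenancy_ocid, compartment_ocid,
                                private_key, fingerprint):
    """Return the OCI access data as a JSON string."""
    return json.dumps({
        "user_ocid": user_ocid,
        "tenancy_ocid": tenancy_ocid,
        "compartment_ocid": compartment_ocid,
        "private_key": private_key,
        "fingerprint": fingerprint,
    })


def create_oci_credential(connection, params_json, name="OCI_CRED"):
    """Store the credential in the database (assumed call signature, see lead-in)."""
    plsql = """
    BEGIN
      dbms_vector_chain.create_credential(
        credential_name => :name,
        params          => json(:params));
    END;
    """
    with connection.cursor() as cursor:
        cursor.execute(plsql, name=name, params=params_json)


# Placeholder values only; real OCIDs and keys must never be committed to a notebook
params_json = build_oci_credential_params(
    user_ocid="<user-ocid>",
    tenancy_ocid="<tenancy-ocid>",
    compartment_ocid="<compartment-ocid>",
    private_key="<private-key>",
    fingerprint="<fingerprint>",
)
print(sorted(json.loads(params_json)))
```

Calling `create_oci_credential(connection, params_json)` would then create the credential with the name `OCI_CRED`.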

## Checking the stored access data (credentials)

In this step, a query is executed to check the **credentials** (access data) stored in the database.

### SQL query:
- **owner**: Displays the owner of the credentials (e.g. the user under which the credentials were created).
- **credential_name**: The name of the stored credential, in this case the credentials should be `OCI_CRED`, which was previously created.
- **enabled**: Indicates whether the saved credentials are enabled and available.


In [23]:
sql="""
select owner,credential_name, enabled from dba_credentials
"""

df = pd.read_sql(sql=sql, con=connection)
display(df)


| | OWNER | CREDENTIAL_NAME | ENABLED |
|---|---|---|---|
| 0 | VECTOR | OCI_CRED | True |


## Perform a vector search based on a search text

In this section, a semantic search is performed to find the embedded chunks that are most similar to the entered search text.


### Steps:
1. **Creation of an embedding for the search text**:
 - The entered search text (`suchtext`) is converted into a vector embedding using the `MINILML6_MODEL` model. This is saved in the variable `query_vector`.
   
2. **Search for the most similar chunks**:
 - The query searches the `RAG_CHUNKS` table and compares the stored embeddings of the chunks with the search text embedding using the **cosine distance**.
 - With `FETCH FIRST 2 PARTITIONS BY doc_id, 1 ROWS ONLY`, the single most similar chunk from each of the two best-matching documents is retrieved, with the results sorted by vector distance.

3. **Display results**:
 - The results of the query are displayed in a DataFrame and contain the **doc_id**, **chunk_id** and the corresponding **chunk_data** (text section).
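
The effect of fetching by partitions can be imitated in plain Python. The sketch below reflects one reading of the clause (the closest documents are chosen by their best chunk, then a limited number of rows is kept per document); the helper name `top_partitions` and all distances are made up for illustration.

```python
def top_partitions(rows, num_partitions=2, rows_per_partition=1):
    """rows: iterable of (doc_id, chunk_id, distance) tuples."""
    picked = {}
    # Walk through all chunks from the closest to the farthest
    for doc_id, chunk_id, distance in sorted(rows, key=lambda r: r[2]):
        if doc_id not in picked:
            if len(picked) == num_partitions:
                continue  # enough documents selected, ignore further ones
            picked[doc_id] = []
        if len(picked[doc_id]) < rows_per_partition:
            picked[doc_id].append((doc_id, chunk_id, distance))
    result = [row for group in picked.values() for row in group]
    return sorted(result, key=lambda r: r[2])


# Made-up distances for chunks of three documents
rows = [
    (1, 86, 0.21), (1, 87, 0.30),
    (2, 5, 0.80),
    (3, 47, 0.35), (3, 48, 0.60),
]
print(top_partitions(rows))  # [(1, 86, 0.21), (3, 47, 0.35)]
```

Document 2 does not appear in the result because its best chunk is farther away than the best chunks of documents 1 and 3.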


In [41]:
suchtext = input("Enter search text here: ")

sql1 = """
 SELECT vector_embedding(MINILML6_MODEL using :suchtext as data) from dual
"""

sql2 = """
SELECT doc_id, chunk_id, chunk_data
 FROM RAG_CHUNKS
 ORDER BY vector_distance(chunk_embedding , :query_vector, COSINE) 
 FETCH FIRST 2 PARTITIONS BY doc_id, 1 ROWS ONLY
"""

with connection.cursor() as cursor:     
    query_vector = cursor.var(oracledb.DB_TYPE_VECTOR)    
    cursor.execute(sql1, suchtext=suchtext)
    query_vector, =cursor.fetchone()
    
    df = pd.read_sql(sql2, params={'query_vector': query_vector}, con=connection)
    display(df)

Enter search text here: can men get breast cancer too ?


| | DOC_ID | CHUNK_ID | CHUNK_DATA |
|---|---|---|---|
| 0 | 1 | 86 | Male breast cancer \nBreast cancer in men is r... |
| 1 | 3 | 47 | prostate cancer, age, obesity and smoking. In ... |



## Generative AI response based on a search text: 
### Comparison of found document chunks with the general knowledge of the AI

In this section, the previously created function `rag_with_genai_function` is used to generate an extended answer that combines both relevant text sections from the stored documents and a Generative AI answer.

### Steps:
1. **Call the `rag_with_genai_function`**:
 - The entered search text (`suchtext`) is passed to the `rag_with_genai_function` function. This function searches for the most similar chunk in the table `RAG_CHUNKS`, based on the vector distance (the function orders by `EUCLIDEAN_SQUARED`, not by the cosine distance used for the index).
   
2. **Generative AI response**:
 - In addition to the semantic search for relevant chunks, a Generative AI response is generated via the external Generative AI service. This response is combined with the found text section.

In [25]:
sql="""
select rag_with_genai_function(:suchtext) from dual
"""

with connection.cursor() as cursor:     
    cursor.execute(sql, suchtext=suchtext)
    res, =cursor.fetchone()
    print(res)

*** Internal PDF Search: 

Male breast cancer 
Breast cancer in men is rare, accounting for less than 1% 
of breast cancer cases in the US. However, since 1975, the 
incidence rate has increased slightly, from 1.0 case per 
100,000 men during 1975-1979 to 1.2 cases per 100,000 
men during 2012-2016.

41

Men are more likely than women 
(51% versus 36%) to be diagnosed with advanced 
(regional- or distant-stage) breast cancer, 
8

which likely 

*** Generative AI Response:

Yes, men can indeed develop breast cancer. (I mean, it's not as common as in women, of course, but still.) Male breast cancer, also known as gynecomastia-induced carcinoma, accounts for about 1% of all breast cancer cases.

Now, I know what you're thinking: "But doctor, isn't male breast cancer extremely rare?" Ah, yes... well, relatively speaking. (Smirk) While it's true that men are much less likely to develop breast cancer than women, there is still a small but significant risk factor, especially for certain subpo

## Generative AI response based on a search text:
### RAG, retrieval augmented generation
In this section, the text sections found by vector search are passed to the model as a pre-filtered knowledge base.
The query may only be answered with the transmitted documents, not with the (perhaps questionable) general knowledge of the AI.

A function `real_rag_with_genai` is created, which is similar to the previous function `rag_with_genai_function`. However, the new function `real_rag_with_genai` does not output the two results for comparison. Instead, it passes the result of the vector search to the GenAI request as additional messages in the JSON body.
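
How the retrieved chunks end up in the request can be sketched in Python. This mirrors the structure of the PL/SQL function below but is not the workshop code: the helper name `build_rag_payload` is hypothetical and the example chunk is shortened. Line breaks in a chunk are collapsed to spaces, and `json.dumps` takes care of escaping any quotes in the chunk text.

```python
import json

SYSTEM_PROMPT = (
    "Your answers should begin with the phrase "
    "'According to the information found in my database'. "
    "Please answer the following question only with information "
    "given in the provided DOCUMENTS"
)


def build_rag_payload(question, chunks, model="llama3.1:latest"):
    """Build a chat request in which the retrieved chunks are the only knowledge base."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    for chunk in chunks:
        # Collapse line breaks and repeated whitespace inside the chunk
        text = " ".join(chunk.split())
        messages.append({"role": "system", "content": f"DOCUMENTS: {text}"})
    messages.append({"role": "user", "content": question})
    return json.dumps({"model": model, "messages": messages, "stream": False})


payload = build_rag_payload(
    "can men get breast cancer too ?",
    ["Male breast cancer \nBreast cancer in men is rare"],
)
print(payload)
```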

In [46]:
sql="""
CREATE OR REPLACE FUNCTION real_rag_with_genai( rag_input in varchar2) return varchar2
AS
    l_url             VARCHAR2(400) := 'http://olama.meinnetzwerk.com/api/chat';
    req               utl_http.req;
    resp              utl_http.resp;
    body1             VARCHAR2(1000);
    body2             VARCHAR2(1000);
    body3             VARCHAR2(1000);
    body4             VARCHAR2(1000);
    buffer            VARCHAR2(8192);
    long_text         VARCHAR2(32767);
    query_vector      CLOB;
    cursor c1 is SELECT * FROM rag_chunks ORDER BY VECTOR_DISTANCE(CHUNK_EMBEDDING, query_vector, EUCLIDEAN_SQUARED) 
              FETCH FIRST 1 ROWS ONLY WITH TARGET ACCURACY 90; 
BEGIN
   -- defining the right prompt for RAG: specify what data to use and how to answer   
   body1 := '{"model": "llama3.1:latest","messages": [{"role": "system", "content": " '||
   'Your answers should begin with the phrase ''According to the information found in my database''.' || 
   'Please answer the following question only with information given in the provided DOCUMENTS"} ,' ;
   body2 := '{"role": "system", "content": "DOCUMENTS: {{ Not much useful information }} "} , ' ;
   body3 := '{"role": "user",   "content": "'|| rag_input ||'"}],' ;
   body4 :=  '"stream": false, "options": {"use_mmap": true,"use_mlock": true,"num_thread": 8}}';
           
   -- now, take the found chunks from the vector search as input to the genAI enquiry
   SELECT vector_embedding(minilml6_model using rag_input as data) into query_vector;
   long_text := body1||body2 ;
   for row_1 in c1 loop
      long_text := long_text||'{"role": "system", "content": "DOCUMENTS: {{'||replace(row_1.chunk_data,chr(10),' ')||' }} "} , ';
   end loop;
   long_text := long_text||body3||body4;

   -- in case you are behind a proxy, please uncomment the following line after setting the correct proxy address
   -- utl_http.set_proxy('http://username:passwd@192.168.22.33:5678');
   req := utl_http.begin_request(l_url, 'POST', 'HTTP/1.1');
   utl_http.set_header(req, 'Accept', '*/*');
   utl_http.set_header(req, 'Content-Type', 'application/json');
   utl_http.set_header(req, 'Content-Length', length(long_text));
   utl_http.write_text(req, long_text);

   -- now, call the genAI with the users enquiry plus found information
   resp := utl_http.get_response(req);
   BEGIN
      utl_http.read_text(resp, buffer);
   EXCEPTION
      WHEN utl_http.end_of_body THEN
         utl_http.end_response(resp);
   END;
   buffer := json_value (buffer, '$.message.content' returning varchar2);

   return (buffer);
END;
"""

with connection.cursor() as cursor:     
    cursor.execute(sql)
    if cursor.warning :
        print(cursor.warning)
    else :
        print("SQL execution successful")

SQL execution successful


In [47]:
sql="""
select real_rag_with_genai(:suchtext) from dual
"""

with connection.cursor() as cursor:     
    cursor.execute(sql, suchtext=suchtext)
    res, =cursor.fetchone()
    print(res)

According to the information found in my database, yes, men can get breast cancer. In fact, it's stated that "Breast cancer in men is rare, accounting for less than 1% of breast cancer cases in the US."


## Summary

In this notebook, the combination of vector search and generative AI was presented. Documents were stored in the database, divided into text chunks and provided with vector embeddings. With the help of a vector index, relevant text sections could be found efficiently.

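The function `rag_with_genai_function` made it possible to retrieve semantically similar document content and compare it with a Generative AI response. The function `real_rag_with_genai` then went one step further and passed the retrieved chunks to the model as the only knowledge base it may use for its answer. This method provides a powerful solution for answering complex queries by combining the strengths of retrieval and Generative AI.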