## Introduction

[Video Walkthrough](https://www.youtube.com/watch?v=Obbn15rZfvQ&list=PLypX5sYuDqvrqsXTw876gGHosCKvK_7QS&index=13)

This notebook builds a Retrieval-Augmented Generation (RAG) pipeline with KDB.AI as the vector database and an OpenAI Large Language Model for generation. By the end of this tutorial, you will have chunked a set of essays, embedded the chunks with FastEmbed, stored and searched them in KDB.AI, and passed the retrieved chunks to the LLM to answer a question.

### Setup and Dependencies
Install `kdbai_client` and the other required libraries, then import the dependencies used throughout the notebook.

##### Install Required Libraries

In [None]:
# Install required libraries
!pip install llama-index fastembed kdbai_client openai



##### Import Dependencies

In [None]:
import os
from getpass import getpass
import kdbai_client as kdbai
import time
from llama_index.core import Document, SimpleDirectoryReader
from llama_index.core.node_parser import SentenceSplitter
import pandas as pd
from fastembed import TextEmbedding
import openai
import textwrap

##### Connect to KDB.AI

In [None]:
KDBAI_ENDPOINT = (
    os.environ["KDBAI_ENDPOINT"]
    if "KDBAI_ENDPOINT" in os.environ
    else input("KDB.AI endpoint: ")
)
KDBAI_API_KEY = (
    os.environ["KDBAI_API_KEY"]
    if "KDBAI_API_KEY" in os.environ
    else getpass("KDB.AI API key: ")
)

KDB.AI endpoint: https://cloud.kdb.ai/instance/wrve8kwshj
KDB.AI API key: ··········


##### Initialize Embedding Model

In [None]:
embedding_model = TextEmbedding()


### Data Preparation


##### Download Dataset
We'll use the Paul Graham Essay Dataset as our corpus.

In [None]:
!llamaindex-cli download-llamadataset PaulGrahamEssayDataset --download-dir ./data

100% 1/1 [00:00<00:00,  2.72it/s]
Successfully downloaded PaulGrahamEssayDataset to ./data


### Create a KDB.AI Session and Table

In [None]:
KDBAI_TABLE_NAME = "paul_graham"
session = kdbai.Session(endpoint=KDBAI_ENDPOINT, api_key=KDBAI_API_KEY)

# Drop existing table if it exists
try:
    session.table(KDBAI_TABLE_NAME).drop()
    time.sleep(5)
except kdbai.KDBAIException:
    pass

In [None]:
# Define table schema

schema = dict(
    columns=[
        dict(name="text", pytype="bytes"),
        dict(
            name="embedding",
            vectorIndex=dict(type="flat", metric="L2", dims=384),
        ),
    ]
)

table = session.create_table(KDBAI_TABLE_NAME, schema)
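
The schema above declares a `flat` index with the `L2` metric over 384-dimensional vectors. Conceptually, a flat index compares the query against every stored vector and keeps the closest ones. The sketch below illustrates that idea in plain Python on toy 3-dimensional vectors. `l2_distance` and `flat_search` are hypothetical helpers, not part of `kdbai_client`, and KDB.AI's internal implementation and exact distance scaling may differ.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def flat_search(query, vectors, n=3):
    """Brute-force search: score every stored vector, return the n closest."""
    scored = [(l2_distance(query, v), i) for i, v in enumerate(vectors)]
    scored.sort()  # smallest distance first
    return [(i, d) for d, i in scored[:n]]


# Toy 3-dimensional vectors standing in for 384-dimensional embeddings.
stored = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 3.0, 4.0]]
results = flat_search([1.0, 0.0, 0.0], stored, n=2)
print(results)  # [(1, 0.0), (0, 1.0)]
```

Because every vector is scored, this approach is exact but scales linearly with the number of rows, which is acceptable for a corpus of a few dozen chunks.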

##### Load and Parse Documents

In [None]:
node_parser = SentenceSplitter(chunk_size=500, chunk_overlap=100)
essays = SimpleDirectoryReader(input_dir="./data/source_files").load_data()
docs = node_parser.get_nodes_from_documents(essays)
len(docs)

44
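
To build intuition for `chunk_size` and `chunk_overlap`, here is a simplified sliding-window sketch. It is not the algorithm `SentenceSplitter` uses (that splitter counts tokens and prefers sentence boundaries); this hypothetical `chunk_words` helper counts whitespace-separated words only, purely to show how consecutive chunks share content.

```python
def chunk_words(text, chunk_size=500, chunk_overlap=100):
    """Split text into word windows of chunk_size that share chunk_overlap words."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
pieces = chunk_words(sample, chunk_size=4, chunk_overlap=1)
for piece in pieces:
    print(piece)
```

Each chunk starts with the last word of the previous one, so a sentence that straddles a boundary still appears in full in at least one neighbouring chunk when the overlap is large enough.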

##### Generate Embeddings

In [None]:
embedding_model = TextEmbedding()
documents = [doc.text for doc in docs]
embeddings = list(embedding_model.embed(documents))


In [None]:
print(documents[1])

So I'm not surprised I can't remember any programs I wrote, because they can't have done much. My clearest memory is of the moment I learned it was possible for programs not to terminate, when one of mine didn't. On a machine without time-sharing, this was a social as well as a technical error, as the data center manager's expression made clear.

With microcomputers, everything changed. Now you could have a computer sitting right in front of you, on a desk, that could respond to your keystrokes as it was running instead of just churning through a stack of punch cards and then stopping. [1]

The first of my friends to get a microcomputer built it himself. It was sold as a kit by Heathkit. I remember vividly how impressed and envious I felt watching him sitting in front of it, typing programs right into the computer.

Computers were expensive in those days and it took me years of nagging before I convinced my father to buy one, a TRS-80, in about 1980. The gold standard then was the Appl
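
Before inserting, it can help to confirm that every vector has the 384 dimensions declared in the table schema, since a mismatch would only surface at insert or search time. The helper below is a hypothetical sketch that works on any sequence of sequences; in the notebook it could be called as `check_dimensions(embeddings)`.

```python
def check_dimensions(vectors, expected_dims=384):
    """Return the indices of vectors whose length differs from expected_dims."""
    return [i for i, v in enumerate(vectors) if len(v) != expected_dims]


# Toy data: two well-formed vectors and one truncated vector.
toy_vectors = [[0.0] * 384, [0.1] * 384, [0.2] * 100]
bad = check_dimensions(toy_vectors)
print(bad)  # [2]
```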

##### Insert Data into KDB.AI Table

In [None]:
records_to_insert_with_embeddings = pd.DataFrame({
    "text": documents,
    "embedding": embeddings
})

In [None]:
table.insert(records_to_insert_with_embeddings)

True
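
A single `table.insert` call is fine for 44 chunks. For a larger corpus, one common approach is to insert in batches so that each request stays small. The generator below is a hypothetical sketch of that batching step on a plain list; the batch size of 20 is an arbitrary value chosen for illustration.

```python
def batches(items, batch_size):
    """Yield consecutive slices of items, each with at most batch_size elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


rows = list(range(44))  # stand-in for the 44 chunk records
sizes = [len(b) for b in batches(rows, batch_size=20)]
print(sizes)  # [20, 20, 4]
```

With a DataFrame, the same idea applies by slicing with `.iloc[start:start + batch_size]` and passing each slice to `table.insert`.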

### RAG Implementation

##### Define Query and Generate Embedding

In [None]:
query = "How do you decide what to work on?"

In [None]:
query_embedding = list(embedding_model.embed([query]))[0].tolist()

##### Perform Vector Search

In [None]:
search_results = session.table(KDBAI_TABLE_NAME).search([query_embedding], n=10)
search_results_df = search_results[0]

In [None]:
pd.set_option('display.max_colwidth', None)
print("Top Search Results Based on Query:", query)
df = pd.DataFrame(search_results_df)
df.head(2)

Top Search Results Based on Query: How do you decide what to work on?


Unnamed: 0,text,embedding,__nn_distance
0,"We knew this was coming, but it was still hard when it did.\n\nI kept working on YC till March, to help get that batch of startups through Demo Day, then I checked out pretty completely. (I still talk to alumni and to new startups working on things I'm interested in, but that only takes a few hours a week.)\n\nWhat should I do next? Rtm's advice hadn't included anything about that. I wanted to do something completely different, so I decided I'd paint. I wanted to see how good I could get if I really focused on it. So the day after I stopped working on YC, I started painting. I was rusty and it took a while to get back into shape, but it was at least completely engaging. [18]\n\nI spent most of the rest of 2014 painting. I'd never been able to work so uninterruptedly before, and I got to be better than I had been. Not good enough, but better. Then in November, right in the middle of a painting, I ran out of steam. Up till that point I'd always been curious to see how the painting I was working on would turn out, but suddenly finishing this one seemed like a chore. So I stopped working on it and cleaned my brushes and haven't painted since. So far anyway.\n\nI realize that sounds rather wimpy. But attention is a zero sum game. If you can choose what to work on, and you choose a project that's not the best one (or at least a good one) for you, then it's getting in the way of another project that is. And at 50 there was some opportunity cost to screwing around.\n\nI started writing essays again, and wrote a bunch of new ones over the next few months. I even wrote a couple that weren't about startups. Then in March 2015 I started working on Lisp again.\n\nThe distinctive thing about Lisp is that its core is a language defined by writing an interpreter in itself. It wasn't originally intended as a programming language in the ordinary sense. It was meant to be a formal model of computation, an alternative to the Turing machine. 
If you want to write an interpreter for a language in itself, what's the minimum set of predefined operators you need?","[-0.052486546, -0.037547894, 0.028428264, 0.002678531, 0.010856037, 0.01053419, -0.051966634, -0.015711516, 0.02491498, -0.037169933, 0.018490974, -0.004179084, -0.027491497, 0.01952365, 0.03600465, 0.011625322, -0.04530152, 0.0035254613, -0.030323114, -0.053392835, -0.038193867, -0.016321087, 0.013805427, -0.06196813, 0.04244484, 0.07093362, -0.03509179, -0.075523235, 0.009419392, -0.18652868, -0.012934241, 0.022660254, 0.0134734465, -0.0060935053, 0.031707864, 0.05483813, -0.059910905, 0.026537346, -0.021477206, 0.022409543, -0.0015035876, 0.025525393, -0.11022901, -0.032164246, 0.033352446, -0.018563112, -0.022256548, -0.04628748, 0.012898676, 0.029967379, -0.034196883, -0.053546622, 0.036861476, 0.050337255, -0.01440328, 0.02089807, 0.051477704, 0.05381708, 0.035958868, 0.011148146, -0.005329434, -0.023108445, -0.08358203, 0.039739255, -0.0154602025, 0.03253821, -0.006801028, -0.061871927, 0.050129626, 0.14079295, -0.0070907357, 0.019820116, -0.05166138, 0.06913528, 0.015589136, -0.04536847, 0.017244404, 0.010174067, 0.0384817, -0.024191776, 0.028073508, -0.008786634, 0.026387988, 0.015315114, -0.011378572, 0.01482849, -0.031060888, 0.0444626, 0.040066842, 0.0493386, 0.05094562, 0.020320805, 0.005881617, 0.013353571, -0.028635636, -0.042912528, -0.0044636712, 0.022763168, -0.019849092, 0.49997365, ...]",0.750161
1,"I hoped to lure Robert into working on it with me, but there I ran into a hitch. Robert was now a postdoc at MIT, and though he'd made a lot of money the last time I'd lured him into working on one of my schemes, it had also been a huge time sink. So while he agreed that it sounded like a plausible idea, he firmly refused to work on it.\n\nHmph. Well, I'd do it myself then. I recruited Dan Giffin, who had worked for Viaweb, and two undergrads who wanted summer jobs, and we got to work trying to build what it's now clear is about twenty companies and several open source projects worth of software. The language for defining applications would of course be a dialect of Lisp. But I wasn't so naive as to assume I could spring an overt Lisp on a general audience; we'd hide the parentheses, like Dylan did.\n\nBy then there was a name for the kind of company Viaweb was, an ""application service provider,"" or ASP. This name didn't last long before it was replaced by ""software as a service,"" but it was current for long enough that I named this new company after it: it was going to be called Aspra.\n\nI started working on the application builder, Dan worked on network infrastructure, and the two undergrads worked on the first two services (images and phone calls). But about halfway through the summer I realized I really didn't want to run a company - especially not a big one, which it was looking like this would have to be. I'd only started Viaweb because I needed the money. Now that I didn't need money anymore, why was I doing this? If this vision had to be realized as a company, then screw the vision. I'd build a subset that could be done as an open source project.\n\nMuch to my surprise, the time I spent working on this stuff was not wasted after all. 
After we started Y Combinator, I would often encounter startups working on parts of this new architecture, and it was very useful to have spent so much time thinking about it and even trying to write some of it.\n\nThe subset I would build as an open source project was the new Lisp, whose parentheses I now wouldn't even have to hide.","[-0.06683652, -0.029170737, -0.051824465, -0.017708223, 0.016255112, -0.038584303, 0.004476572, -0.020616354, 0.026657112, -0.03683246, 0.02232789, 0.030152095, 0.024562534, -0.024379557, 0.080725335, -0.01326071, -0.052012756, -0.0306641, -0.01662159, 0.010331471, 0.034236412, -0.0405334, -0.002090824, -0.072778344, -0.02403441, 0.08795552, 0.0073284702, -0.07628283, 0.014052818, -0.17661175, -0.011885215, 0.016272064, 0.029111236, -0.0038097566, 0.0074332557, 0.043599308, -0.0034821727, -0.016008407, -0.055728395, 0.004959276, -0.004255187, -0.004253083, -0.011360369, 0.011034045, 0.0061246776, -0.027602559, -0.0053156177, -0.014020872, -0.045290228, 0.028560318, -0.055169746, -0.017281296, -0.010271299, 0.049023762, -0.033473283, 0.024389677, 0.008034339, 0.036835313, -0.036669653, -0.0056001716, 0.017473731, -0.012990314, -0.06698649, 0.031132068, 0.014796759, 0.04288554, -0.04848914, -0.05004304, 0.024222052, 0.105335176, 0.02729237, 0.01853294, -0.053454213, 0.060757145, 0.03820476, 0.0522973, -0.0112534445, 0.045833707, 0.040847186, -0.026113003, 0.020456789, -0.007908949, 0.0103222495, 0.018824141, -0.038614195, 0.017026488, 0.021087913, 0.02945828, 0.04862186, -0.00991343, 0.024734724, -0.005127729, 0.035253156, -0.013549324, -0.0076618753, -0.01507674, -0.012505747, 0.029462999, -0.029663675, 0.467665, ...]",0.814416
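
The `__nn_distance` column holds the distance between the query vector and each returned row, so with the `L2` metric smaller values mean closer matches. One way to keep weak matches out of the prompt is to drop rows beyond a cutoff. The sketch below uses plain dictionaries; `filter_by_distance` is a hypothetical helper, and both the third row and the `0.8` cutoff are made-up values for illustration.

```python
def filter_by_distance(rows, max_distance):
    """Keep rows whose __nn_distance is at or below max_distance, closest first."""
    kept = [r for r in rows if r["__nn_distance"] <= max_distance]
    return sorted(kept, key=lambda r: r["__nn_distance"])


# The first two distances match the rows displayed above; the third is invented.
rows = [
    {"text": "chunk A", "__nn_distance": 0.814416},
    {"text": "chunk B", "__nn_distance": 0.750161},
    {"text": "chunk C", "__nn_distance": 1.2},
]
close = filter_by_distance(rows, max_distance=0.8)
print([r["text"] for r in close])  # ['chunk B']
```

A sensible cutoff depends on the embedding model and the corpus, so it is worth inspecting real distances before choosing one.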


##### RAG Function Definition

In [None]:
# Set up the OpenAI API key (read from the environment, otherwise prompt for it)
if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass("OpenAI API key: ")


def RAG(retrieved_data, prompt):
    """Answer `prompt` with the LLM, using only the retrieved text chunks as context."""
    # Build a single user message: instruction, query, then the retrieved context
    messages = "Answer the following query in three sentences based on the context and only the context: " + "\n"
    messages += prompt + "\n"
    if len(retrieved_data) > 0:
        messages += "Context: " + "\n"
        for data in retrieved_data:
            messages += data + "\n"
    openai.api_key = os.environ["OPENAI_API_KEY"]
    response = openai.chat.completions.create(
        model="gpt-4-turbo-preview",
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": messages},
                ],
            },
        ],
        max_tokens=300,  # cap the length of the generated answer
    )
    content = response.choices[0].message.content
    return content
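
The `RAG` function concatenates every retrieved chunk into the prompt. With `n=10` chunks of roughly 500 tokens each this fits comfortably, but a larger `n` or chunk size could exceed the model's context window. The sketch below separates prompt assembly into a hypothetical `build_prompt` helper that stops adding chunks once a character budget is reached. The budget counts characters rather than tokens, and the default of 6000 is an arbitrary assumption, so treat it as a rough guard only.

```python
def build_prompt(query, chunks, max_chars=6000):
    """Assemble the instruction, the query, and as many chunks as fit in max_chars."""
    header = (
        "Answer the following query in three sentences based on the context "
        "and only the context:\n" + query + "\n"
    )
    parts = []
    used = len(header)
    for chunk in chunks:
        if used + len(chunk) + 1 > max_chars:
            break  # stop at the first chunk that would exceed the budget
        parts.append(chunk)
        used += len(chunk) + 1  # +1 for the newline that follows each chunk
    if not parts:
        return header  # no context fits (or none was retrieved)
    return header + "Context:\n" + "\n".join(parts) + "\n"


example = build_prompt("What is Lisp?", ["first chunk", "second chunk"])
print(example)
```

Keeping prompt assembly separate from the API call also makes it possible to inspect exactly what is sent to the model without spending tokens.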

##### Execute RAG Pipeline

In [None]:
# Utility Function for Text Wrapping

def print_wrapped(text, width=80):
    wrapper = textwrap.TextWrapper(width=width)
    word_list = wrapper.wrap(text=text)
    for line in word_list:
        print(line)

In [None]:
print("Query:", query)

print_wrapped(RAG(search_results_df["text"], query))

Query: How do you decide what to work on?
The person decides what to work on by following their interests and the urge to
try something completely different, as evidenced by their shift from working
intensively at YC to painting and later writing essays and working on Lisp. They
are guided by the desire to dive deeply into subjects or projects that capture
their curiosity, even if that means deviating from their previous path or
professional expertise. The decision-making process involves a mix of self-
reflection on past choices, seeking new challenges, and the willingness to
explore areas where they can be truly independent and potentially make lasting
contributions.


### Drop Table To Conserve Resources

In [None]:
table.drop()