<a href="https://colab.research.google.com/github/hadagarcia/llm-zoomcamp/blob/main/workshop/LLM_zoomcamp_RAG_homework.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Workshop RAG using dlt and LanceDB - Homework

# Create an up-to-date RAG with dlt and LanceDB

We will create an LLM chatbot that has the latest knowledge of the employee handbook of a fictional company. We will be able to chat with it about specific policies such as PTO and working from home.

To build this, we would need to do three things:
1. The company policies exist in a [Notion Page](https://dlthub.notion.site/Employee-handbook-669c2a1e04044465811c8ca22977685d). We will need to first extract the text from these pages.
2. Once extracted, we will want to embed them into vectors and then store them in a vector database.
3. This will allow us to create our RAG: a function that would accept a user question, match it to the information stored in the vector database, and then send the question + relevant information as input to the LLM.

We will be using the following OSS tools for this:
1. dlt for data ingestion:  
  1. dlt can easily connect to any REST API source (like Notion)
  2. It also has integrations with vector databases, like LanceDB.
  3. It also makes it easy to plug in functionality like incremental loading.
2. LanceDB as a vector database:
  1. LanceDB is an open-source vector database that is very easy to use and integrate into Python workflows
  2. It is in-process and serverless (like DuckDB), which makes querying and retrieval very efficient
3. Ollama for RAG:
  1. Ollama is open-source and allows you to easily run LLMs locally

**Note on running this notebook**: We are going to download and use a local Ollama instance for the RAG, so preferably select the **T4 GPU** in the runtime when starting this notebook (Runtime > Change runtime type > Hardware accelerator > T4 GPU).

You can also use the default CPU if you run into technical issues, but LLM responses may then be slower (about 2 minutes per response).

## Part 1: Create a Notion -> LanceDB pipeline using dlt

### 1. Install requirements

To create a Notion -> LanceDB pipeline, we need to install:
1. dlt with lancedb extras
2. sentence-transformers: we need to use an embedding model to vectorize and store data inside LanceDB. For this we choose the open-source model "sentence-transformers/all-MiniLM-L6-v2".

In [1]:
%%capture
!pip install dlt[lancedb]==0.5.1a0
!pip install sentence-transformers

### 2. Create a dlt project with rest_api source and lancedb destination

We now create a dlt project using the command `dlt init <source> <destination>`.

This downloads all the modules required for the dlt source (rest api, in this case) into the local directory. See the side panel for the directory structure created.

What is the dlt rest api source?

It is a dlt source that allows you to connect to any REST API endpoint using a declarative configuration. You can:
- pass the endpoints that you want to connect to,
- define the relation between the endpoints
- define how you want to handle pagination and authentication
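To make the "relation between the endpoints" more concrete, here is a minimal sketch in plain Python of what a resolved parameter does conceptually: each record returned by a parent endpoint fills a placeholder in the child endpoint's path. The function `expand_child_paths` and the sample records are hypothetical and are not part of dlt. The real source handles this internally.

```python
def expand_child_paths(parent_records, path_template, param_name, field):
    """Build one child path per parent record by filling the placeholder."""
    return [
        path_template.format(**{param_name: record[field]})
        for record in parent_records
    ]


# Hypothetical parent records, shaped like the results of a "search" endpoint
pages = [{"id": "page-1"}, {"id": "page-2"}]

child_paths = expand_child_paths(
    pages, "blocks/{page_id}/children", param_name="page_id", field="id"
)
print(child_paths)  # ['blocks/page-1/children', 'blocks/page-2/children']
```

One request is then made per generated path, which is how the content of every page found by the parent endpoint ends up being fetched.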

In [2]:
!yes | dlt init rest_api lancedb

Looking up the init scripts in https://github.com/dlt-hub/verified-sources.git...
Cloning and configuring a verified source rest_api (Generic API Source)
Do you want to proceed? [Y/n]: 
Verified source rest_api was added to your project!
* See the usage examples and code snippets to copy from rest_api_pipeline.py
* Add credentials for lancedb and other secrets in ./.dlt/secrets.toml
* requirements.txt was created. Install it with:
pip3 install -r requirements.txt
* Read https://dlthub.com/docs/walkthroughs/create-a-pipeline for more information


### 3. Add API credentials

To access APIs, databases, or any third-party applications, one might need to specify relevant credentials.



In [3]:
import os
from google.colab import userdata

os.environ["SOURCES__REST_API__NOTION__API_KEY"] = userdata.get("SOURCES__REST_API__NOTION__API_KEY")

os.environ["DESTINATION__LANCEDB__EMBEDDING_MODEL_PROVIDER"] = "sentence-transformers"
os.environ["DESTINATION__LANCEDB__EMBEDDING_MODEL"] = "all-MiniLM-L6-v2"

os.environ["DESTINATION__LANCEDB__CREDENTIALS__URI"] = ".lancedb"
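The environment variable names above follow dlt's naming convention: the dotted config key is upper-cased and its sections are joined with double underscores. The small helper below is a hypothetical illustration of that mapping (it is not a dlt function); it shows why the pipeline code can read the Notion key as `sources.rest_api.notion.api_key`.

```python
def config_key_to_env_var(key: str) -> str:
    """Turn a dotted config key into the matching environment variable name."""
    return key.replace(".", "__").upper()


env_name = config_key_to_env_var("sources.rest_api.notion.api_key")
print(env_name)  # SOURCES__REST_API__NOTION__API_KEY
```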

### 4. Write the pipeline code

**Note**: We first go over the code step by step before putting it into runnable cells

1. Import necessary modules (run this cell)

In [4]:
import dlt
from rest_api import RESTAPIConfig, rest_api_source

from dlt.sources.helpers.rest_client.paginators import BasePaginator, JSONResponsePaginator
from dlt.sources.helpers.requests import Response, Request

from dlt.destinations.adapters import lancedb_adapter

### 5. Run the pipeline

In [5]:
from datetime import datetime, timezone

class PostBodyPaginator(BasePaginator):
    def __init__(self):
        super().__init__()
        self.cursor = None

    def update_state(self, response: Response) -> None:
        # Assuming the API returns an empty list when no more data is available
        if not response.json():
            self._has_next_page = False
        else:
            self.cursor = response.json().get("next_cursor")
            if self.cursor is None:
                self._has_next_page = False

    def update_request(self, request: Request) -> None:
        if request.json is None:
            request.json = {}

        # Add the cursor to the request body
        request.json["start_cursor"] = self.cursor

@dlt.resource(name="employee_handbook")
def rest_api_notion_resource():
    notion_config: RESTAPIConfig = {
        "client": {
            "base_url": "https://api.notion.com/v1/",
            "auth": {
                "token": dlt.secrets["sources.rest_api.notion.api_key"]
            },
            "headers": {
                "Content-Type": "application/json",
                "Notion-Version": "2022-06-28"
            }
        },
        "resources": [
            {
                "name": "search",
                "endpoint": {
                    "path": "search",
                    "method": "POST",
                    "paginator": PostBodyPaginator(),
                    "json": {
                        "query": "homework",  # Query only the pages that contain the string "homework" [hidden page = "Homework: Employee handbook"]
                        "sort": {
                            "direction": "ascending",
                            "timestamp": "last_edited_time"
                        }
                    },
                    "data_selector": "results"
                }
            },
            {
                "name": "page_content",
                "endpoint": {
                    "path": "blocks/{page_id}/children",
                    "paginator": JSONResponsePaginator(),
                    "params": {
                        "page_id": {
                            "type": "resolve",
                            "resource": "search",
                            "field": "id"
                        }
                    },
                }
            }
        ]
    }

    yield from rest_api_source(notion_config,name="employee_handbook")

def extract_page_content(response):
    block_id = response["id"]
    last_edited_time = response["last_edited_time"]
    block_type = response.get("type", "Not paragraph")
    if block_type != "paragraph":
        content = ""
    else:
        try:
            content = response["paragraph"]["rich_text"][0]["plain_text"]
        except IndexError:
            content = ""
    return {
        "block_id": block_id,
        "block_type": block_type,
        "content": content,  # Plain text of the paragraph's first rich-text segment (empty for non-paragraph blocks)
        "last_edited_time": last_edited_time,
        "inserted_at_time": datetime.now(timezone.utc)
    }

@dlt.resource(
    name="employee_handbook",
    write_disposition="merge",
    primary_key="block_id",
    columns={"last_edited_time":{"dedup_sort":"desc"}}
    )
def rest_api_notion_incremental(
    last_edited_time = dlt.sources.incremental("last_edited_time", initial_value="2024-06-26T08:16:00.000Z",primary_key=("block_id"))
):
    # last_value = last_edited_time.last_value
    # print(last_value)

    for block in rest_api_notion_resource.add_map(extract_page_content):
        # Skip blocks without text so that empty rows are not embedded and stored
        if not block["content"]:
            continue
        yield block

def load_notion() -> None:
    pipeline = dlt.pipeline(
        pipeline_name="company_policies",
        destination="lancedb",
        dataset_name="notion_pages",
        # full_refresh=True
    )

    load_info = pipeline.run(
        lancedb_adapter(
            rest_api_notion_incremental,
            embed="content" # We need to embed only the content of the page
        ),
        table_name="homework", #employee_handbook
        write_disposition="merge"
    )
    print(load_info)

load_notion()

_dlt_loads
[{'name': 'load_id', 'data_type': 'text', 'nullable': False}, {'name': 'schema_name', 'data_type': 'text', 'nullable': True}, {'name': 'status', 'data_type': 'bigint', 'nullable': False}, {'name': 'inserted_at', 'data_type': 'timestamp', 'nullable': False}, {'name': 'schema_version_hash', 'data_type': 'text', 'nullable': True}]
_dlt_version
[{'name': 'version', 'data_type': 'bigint', 'nullable': False}, {'name': 'engine_version', 'data_type': 'bigint', 'nullable': False}, {'name': 'inserted_at', 'data_type': 'timestamp', 'nullable': False}, {'name': 'schema_name', 'data_type': 'text', 'nullable': False}, {'name': 'version_hash', 'data_type': 'text', 'nullable': False}, {'name': 'schema', 'data_type': 'text', 'nullable': False}]
homework
[{'name': 'block_id', 'nullable': False, 'primary_key': True, 'data_type': 'text'}, {'name': 'block_type', 'data_type': 'text', 'nullable': True}, {'name': 'content', 'x-lancedb-embed': True, 'data_type': 'text', 'nullable': True}, {'dedup_so

Error while fetching `HF_TOKEN` secret value from your vault: 'Requesting secret HF_TOKEN timed out. Secrets can only be fetched when running from the Colab UI.'.
You are not authenticated with the Hugging Face Hub in this notebook.
If the error persists, please let us know by opening an issue on GitHub (https://github.com/huggingface/huggingface_hub/issues/new).


_dlt_pipeline_state
[{'name': 'version', 'data_type': 'bigint', 'nullable': False}, {'name': 'engine_version', 'data_type': 'bigint', 'nullable': False}, {'name': 'pipeline_name', 'data_type': 'text', 'nullable': False}, {'name': 'state', 'data_type': 'text', 'nullable': False}, {'name': 'created_at', 'data_type': 'timestamp', 'nullable': False}, {'name': 'version_hash', 'data_type': 'text', 'nullable': True}, {'name': '_dlt_load_id', 'data_type': 'text', 'nullable': False}, {'name': '_dlt_id', 'data_type': 'text', 'nullable': False, 'unique': True}]
UPLOAD
Pipeline company_policies load step completed in 27.37 seconds
1 load package(s) were loaded to destination LanceDB and into dataset notion_pages
The LanceDB destination used <dlt.destinations.impl.lancedb.configuration.LanceDBCredentials object at 0x79bca49ff370> location to store data
Load package 1721758723.7220953 is LOADED and contains no failed jobs
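The `PostBodyPaginator` used in the pipeline implements cursor pagination through the request body. As a rough, library-free sketch of the same loop: the response carries a `next_cursor`, which is sent back as `start_cursor` in the next request until no cursor is returned. The `FAKE_PAGES` data and the `fake_post` function are hypothetical stand-ins for the Notion API.

```python
# Hypothetical in-memory "API": each response has results and a next_cursor
FAKE_PAGES = {
    None: {"results": ["a", "b"], "next_cursor": "c1"},
    "c1": {"results": ["c", "d"], "next_cursor": "c2"},
    "c2": {"results": ["e"], "next_cursor": None},
}


def fake_post(body):
    """Stand-in for a POST request; picks the page by start_cursor."""
    return FAKE_PAGES[body.get("start_cursor")]


def fetch_all():
    results, cursor, has_next_page = [], None, True
    while has_next_page:
        body = {"query": "homework"}
        if cursor is not None:
            body["start_cursor"] = cursor  # same role as update_request
        response = fake_post(body)
        results.extend(response["results"])
        cursor = response.get("next_cursor")  # same role as update_state
        has_next_page = cursor is not None
    return results


all_results = fetch_all()
print(all_results)  # ['a', 'b', 'c', 'd', 'e']
```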


### 6. Visualize the output

In [6]:
import lancedb

db = lancedb.connect(".lancedb")
dbtable = db.open_table("notion_pages___homework") # previously it was notion_pages___employee_handbook

dbtable.to_pandas()

Unnamed: 0,id__,vector__,block_id,block_type,content,last_edited_time,inserted_at_time,_dlt_load_id,_dlt_id
0,c69f1ecf-7b02-5810-8286-3f42659ae9d4,"[-0.024265556, 0.04746074, -0.01179647, 0.0638...",a8196881-ae94-4767-8767-92fe1a327d24,paragraph,We owe our success to our employees. To show o...,2024-07-05 22:34:00+00:00,2024-07-23 18:18:46.885228+00:00,1721758723.7220953,dTuL82n/1Pwi/A
1,f2c18ac0-50f5-5b72-a871-dc5a46780353,"[-0.04966156, 0.10853508, -0.009762607, -0.036...",31fcbf26-2ca5-468a-8af8-d7eb4c2db8c8,paragraph,We want to ensure that private information abo...,2024-07-05 22:38:00+00:00,2024-07-23 18:18:46.888860+00:00,1721758723.7220953,H3PceR7PCPxWXA
2,4553193e-c655-54df-9a33-cfc570bf34d0,"[-0.06316319, 0.17331503, 0.025351755, -0.0191...",da7721fd-3d0f-4c04-bc5e-825ad60bed1c,paragraph,Employee health is important to us. We don’t d...,2024-07-05 22:52:00+00:00,2024-07-23 18:18:46.889079+00:00,1721758723.7220953,zUCY4UFOv0YW5A
3,791be1a1-6c67-530d-87ab-bd9912500ea5,"[-0.109743185, 0.10586075, 0.003290699, -0.021...",ff36dcf3-5faa-40b4-ad8e-92fdc952201e,paragraph,Our company is dedicated to maintaining a safe...,2024-07-05 22:52:00+00:00,2024-07-23 18:18:46.889234+00:00,1721758723.7220953,XopBTpizcZmS6w
4,a83497f4-922c-5d62-bab1-53804e93c811,"[0.05242333, -0.064576, 0.06586297, 0.01454380...",a1ff9697-4bb6-4f1e-b464-dda296dbd307,paragraph,If your job doesn’t require you to be present ...,2024-07-05 22:52:00+00:00,2024-07-23 18:18:46.889389+00:00,1721758723.7220953,zPfuOz+YzR9yBw
5,434b71e9-a11a-519d-a9fe-e3ade78d47d6,"[0.00052337867, -0.054883413, 0.043573413, -0....",e4ec9f4d-b687-4c28-a80d-985bfabcc2ba,paragraph,Remote working refers to working from a non-of...,2024-07-05 22:52:00+00:00,2024-07-23 18:18:46.889547+00:00,1721758723.7220953,6xLDAF+JyrjbmA
6,17816363-54b7-5ba7-b8d5-06d871a25414,"[0.03802633, -0.021509705, 0.04752782, 0.06470...",e6e550dc-b59e-4928-abd7-07eace948681,paragraph,There are some expenses that we will pay direc...,2024-07-05 22:52:00+00:00,2024-07-23 18:18:46.889710+00:00,1721758723.7220953,39xKi5O2lRrpZA
7,2a434cf9-09d9-5514-a88b-02977f2f953e,"[-0.05858811, -0.07540446, 0.033775203, 0.0096...",a269d0ca-ce14-481b-a5f4-9192d6840d6e,paragraph,Our company operates between 9 a.m. to 7 p.m. ...,2024-07-05 22:52:00+00:00,2024-07-23 18:18:46.889884+00:00,1721758723.7220953,b4EtxIWAuIMJdw
8,5f9384fa-7f98-5f52-a06e-05b05f42f69a,"[-0.013599302, 0.0047530197, 0.024835136, 0.01...",5b65f3e7-0a37-429a-818d-f99b53755ebd,paragraph,"In this section, we are going to be covering i...",2024-07-05 23:33:00+00:00,2024-07-23 18:18:46.890067+00:00,1721758723.7220953,45WKQ0JjR/hmKA
9,42af72f6-9db7-54a2-87b2-d466169078ff,"[0.032060888, 0.024244698, 0.008471344, 0.0317...",b27f7d80-f2f1-460e-aa0c-b8e770cf050a,paragraph,Our company observes the following holidays: N...,2024-07-05 22:52:00+00:00,2024-07-23 18:18:46.890214+00:00,1721758723.7220953,SUIP7Z4GK0igNA


## Q1: Rows in LanceDB
How many rows does the LanceDB table "notion_pages___homework" have?

In [7]:
len(dbtable)

17

---

## Q2: Running the Pipeline: Last edited time

In the demo, we created an incremental dlt resource rest_api_notion_incremental that keeps track of last_edited_time. What value does it store after you've run your pipeline once? (Hint: you will be able to get this value by performing some aggregation function on the column last_edited_time of the table)
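The hint can be sketched as follows, assuming the incremental resource keeps the largest `last_edited_time` it has seen (the default behaviour in dlt, as far as we know). The rows below are illustrative values only and are not the full table, so the printed value is not the homework answer.

```python
from datetime import datetime


def latest_edit(rows):
    """Return the maximum last_edited_time across all rows."""
    return max(row["last_edited_time"] for row in rows)


# Illustrative rows only; the real table has more rows than shown here
rows = [
    {"block_id": "a", "last_edited_time": datetime.fromisoformat("2024-07-05T22:34:00+00:00")},
    {"block_id": "b", "last_edited_time": datetime.fromisoformat("2024-07-05T22:52:00+00:00")},
    {"block_id": "c", "last_edited_time": datetime.fromisoformat("2024-07-05T22:38:00+00:00")},
]

print(latest_edit(rows))  # 2024-07-05 22:52:00+00:00
```

Against the real table, the same aggregation could be written as `dbtable.to_pandas()["last_edited_time"].max()`.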

TODO: From this point forward check if we need it for our homework.

Now we make a change to one of the paragraphs and run the pipeline again to see the effect of incremental loading. We observe two things:
1. The column `inserted_at_time` only changed for the updated row, implying that only this row was added
2. Looking at the primary key `block_id` we see that the original row was dropped and the updated row was inserted
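Conceptually, a merge keyed on `block_id` behaves like an upsert: a row with a known key replaces the stored row, and every other row stays untouched. The sketch below imitates this in plain Python under that assumption. `merge_rows` and the sample rows are hypothetical and do not reflect how dlt or LanceDB implement the merge internally.

```python
def merge_rows(existing, incoming, primary_key="block_id", sort_key="last_edited_time"):
    """Upsert incoming rows into existing ones, keeping the newest row per key."""
    merged = {row[primary_key]: row for row in existing}
    for row in incoming:
        current = merged.get(row[primary_key])
        # Replace only if the key is new or the incoming row is at least as recent
        if current is None or row[sort_key] >= current[sort_key]:
            merged[row[primary_key]] = row
    return list(merged.values())


# Hypothetical rows; timestamps are ISO strings in one format, so they sort correctly
table = [
    {"block_id": "a", "content": "old text", "last_edited_time": "2024-07-03T17:34:00Z"},
    {"block_id": "b", "content": "unchanged", "last_edited_time": "2024-06-26T08:46:00Z"},
]
update = [
    {"block_id": "a", "content": "new text", "last_edited_time": "2024-07-08T15:00:00Z"},
]

table = merge_rows(table, update)
print([row["content"] for row in table])  # ['new text', 'unchanged']
```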

In [None]:
db = lancedb.connect(".lancedb")
dbtable = db.open_table("notion_pages___employee_handbook")

dbtable.to_pandas()

Unnamed: 0,id__,vector__,block_id,block_type,content,last_edited_time,inserted_at_time,_dlt_load_id,_dlt_id
0,6adeb540-d180-5d40-bc84-c40e5c173ea1,"[-0.03892389, 0.1208173, 0.046208583, -0.00543...",baac0ba4-9b60-450e-8cc1-1e6e2a0fb7d9,paragraph,"In this section, we describe what we offer to ...",2024-07-03 17:34:00+00:00,2024-07-08 15:30:04.270715+00:00,1720452602.3108296,+LXDpddrXOJUXg
1,cffdb1bb-a146-5e90-8fbb-a1d577a2a98e,"[-0.0799329, 0.13477285, 0.0053403154, -0.0298...",0e429073-6383-4918-8961-fcc66346067f,paragraph,Employee health is important to us. We don’t d...,2024-06-26 08:46:00+00:00,2024-07-08 15:30:04.272891+00:00,1720452602.3108296,gDimzresa+mpsg
2,25cd721d-fd64-517f-9b3b-34e3fad3522e,"[-0.109743185, 0.10586075, 0.003290699, -0.021...",f4e006d7-9b38-49e9-94cf-552beaa75773,paragraph,Our company is dedicated to maintaining a safe...,2024-07-03 17:26:00+00:00,2024-07-08 15:30:04.273102+00:00,1720452602.3108296,1OZTtNPR9Ab8uA
3,c75b7ef9-96b6-551b-9cdd-795bbe01bb6e,"[0.050755523, -0.06461991, 0.06527383, 0.01465...",71618ca5-6c62-4b66-bc0f-3d855e0c4b8b,paragraph,If your job doesn’t require you to be present ...,2024-06-26 08:52:00+00:00,2024-07-08 15:30:04.273273+00:00,1720452602.3108296,t3k2vgTDh3Fc7Q
4,7a69c4c0-cd55-5090-903e-facf23eadde5,"[0.00052337867, -0.054883413, 0.043573413, -0....",cd15aaf5-6cdc-4a13-835c-2181fd7bf81e,paragraph,Remote working refers to working from a non-of...,2024-07-03 17:19:00+00:00,2024-07-08 15:30:04.273443+00:00,1720452602.3108296,R5wmdAkkOOmdpg
5,ff1141dc-88f6-500a-a8c3-c18e37661650,"[0.03802633, -0.021509705, 0.04752782, 0.06470...",a4b2f0c9-e0c8-4b3c-81e7-ef624809977d,paragraph,There are some expenses that we will pay direc...,2024-07-05 22:32:00+00:00,2024-07-08 15:30:04.273595+00:00,1720452602.3108296,qxJLuSZQaZq/fw
6,a28e913f-761f-5684-8cd5-0d0c49e0338c,"[-0.004968941, -0.003911972, 0.028705625, 0.00...",faacf4ec-90be-4e96-b8b9-29b5112bc7ca,paragraph,Employees receive [20 days] of Paid Time Off (...,2024-06-26 09:03:00+00:00,2024-07-08 15:30:04.662656+00:00,1720452602.3108296,/oDmr/7ulovYhQ
7,a18932d9-1583-5c42-bd0d-0f96738c5e6c,"[0.032060888, 0.024244698, 0.008471344, 0.0317...",e6021a51-f403-4950-80c2-ebff005c7289,paragraph,Our company observes the following holidays: N...,2024-06-26 09:08:00+00:00,2024-07-08 15:30:04.662820+00:00,1720452602.3108296,9I7CX2AaReDvng
8,93661874-13a2-5a43-bed8-868005dfd5e2,"[-0.0131553095, 0.008382407, 0.017044391, 0.05...",b8f4cc6d-c28c-4071-9545-caadce5eb37b,paragraph,These holidays are considered “off-days” for m...,2024-06-26 09:09:00+00:00,2024-07-08 15:30:04.662974+00:00,1720452602.3108296,HiI2XYEzmAEQMA
9,b220778f-1118-5c22-b614-3bc0fd0a602b,"[0.027987516, 0.067343615, 0.03980646, 0.00774...",ea7a1beb-6874-4f41-966d-dc1f80a1f635,paragraph,Employees who are unable to work due to illnes...,2024-06-26 09:11:00+00:00,2024-07-08 15:30:04.663125+00:00,1720452602.3108296,UwkDg5Htn2kTvA


## Part 2: Create a RAG bot using Ollama

With the contents from the employee handbook vectorized and stored in LanceDB, we're now ready to create our RAG with Ollama.



What is RAG?

Retrieval-Augmented Generation (RAG) is the framework of retrieving relevant documents from a database and passing them along with a query into an LLM so that the LLM can generate context-aware responses.

In our case, if we were to ask an LLM questions about our specific employee policies, then we would not get useful responses because the LLM has never seen these policies. A solution to this could be to paste all of the policies into the prompt and then ask our questions. However, this would not be feasible given the limitations on the size of the context window.

We can bypass this limitation using RAG:
1. Given a user question, we would first embed this question into a vector
2. Then we would do a vector search on our LanceDB table and retrieve top k results - which would be the most relevant paragraphs corresponding to the question
3. Finally, we would pass the original question along with the retrieved paragraphs as a prompt into the LLM
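The three steps above can be sketched end to end without any external library. This is only a toy illustration: `embed` uses word counts as a stand-in for a real embedding model, the search is a brute-force cosine similarity over a Python list instead of a LanceDB query, and the paragraphs are made up rather than taken from the handbook.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word counts (a real pipeline would use an embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, paragraphs, top_k=2):
    """Steps 1 and 2: embed the question and return the top_k closest paragraphs."""
    q = embed(question)
    ranked = sorted(paragraphs, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:top_k]


def build_prompt(question, context_paragraphs):
    """Step 3: combine the question and the retrieved paragraphs into one prompt."""
    context = "\n".join(context_paragraphs)
    return f"Question: '{question}'; Context:'{context}'"


# Made-up paragraphs for illustration only
paragraphs = [
    "employees receive paid vacation days every year",
    "remote working is allowed with manager approval",
    "the office kitchen is cleaned on fridays",
]
question = "how many vacation days do employees receive"

top = retrieve(question, paragraphs, top_k=1)
print(build_prompt(question, top))
```

The resulting prompt string is what would be sent to the LLM in place of the bare question.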


1. Install Ollama into the notebook's local runtime

In [None]:
!curl -fsSL https://ollama.com/install.sh | sh

>>> Downloading ollama...
############################################################################################# 100.0%
>>> Installing ollama to /usr/local/bin...
>>> Creating ollama user...
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
>>> The Ollama API is now available at 127.0.0.1:11434.
>>> Install complete. Run "ollama" from the command line.


2. Start Ollama using `ollama serve`. This needs to run in the background, so we run it using `nohup` (to see the output log, open nohup.out).

In [None]:
!nohup ollama serve > nohup.out 2>&1 &

3. Pull the desired model. We're going to be using `llama2-uncensored` (takes about 1 minute to download)

In [None]:
%%capture
!ollama pull llama2-uncensored

In this next part we're going to write functions that accept a user question, retrieve the relevant paragraphs from LanceDB, and then pass the question and the retrieved paragraphs as input to the Ollama chat assistant.

4. pip install ollama and import it

In [None]:
!pip install ollama

Collecting ollama
  Downloading ollama-0.2.1-py3-none-any.whl (9.7 kB)
Collecting httpx<0.28.0,>=0.27.0 (from ollama)
  Downloading httpx-0.27.0-py3-none-any.whl (75 kB)
Collecting httpcore==1.* (from httpx<0.28.0,>=0.27.0->ollama)
  Downloading httpcore-1.0.5-py3-none-any.whl (77 kB)
Collecting h11<0.15,>=0.13 (from httpcore==1.*->httpx<0.28.0,>=0.27.0->ollama)
  Downloading h11-0.14.0-py3-none-any.whl (58 kB)
Installing collected packages: h11, httpcore, httpx, ollama
Successfully installed h11-0.14.0 httpcore-1.0.5 

In [None]:
import ollama

5. Write a function that can retrieve content from lancedb relevant to the user query
  
  With LanceDB, you don't have to explicitly embed the question. LanceDB stores information on the embedding model used and automatically embeds the question.

  We use the `dbtable.search()` function to query the DB, limit the results to the top 2 most similar paragraphs, and return them as the context to pass to the LLM.

  Limiting results is important because otherwise there might be too much confusing information. Similarly, picking only the top result might not give enough information.

In [None]:
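def retrieve_context_from_lancedb(dbtable, question, top_k=2):
    # LanceDB embeds the question with the model recorded for the table
    # and returns the top_k rows whose vectors are closest to it
    query_results = dbtable.search(query=question).limit(top_k).to_list()

    # Join the retrieved paragraphs into a single context string
    context = "\n".join([result["content"] for result in query_results])

    return context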

6. Finally we define a very basic RAG. We define a simple system prompt, retrieve the relevant context for the user query with the function defined above and then send the user question and the context to the `llama2-uncensored` model.

In [None]:
def main():
  # Connect to the lancedb table
  db = lancedb.connect(".lancedb")
  dbtable = db.open_table("notion_pages___employee_handbook")

  # A system prompt telling ollama to accept input in the form of "Question: ... ; Context: ..."
  messages = [
      {"role": "system", "content": "You are a helpful assistant that helps users understand policies inside a company's employee handbook. The user will first ask you a question and then provide you relevant paragraphs from the handbook as context. Please answer the question based on the provided context. For any details missing in the paragraph, encourage the employee to contact the HR for that information. Please keep the responses conversational."}
  ]

  while True:
    # Accept user question
    question = input("You: ")

    # Retrieve the relevant paragraphs on the question
    context = retrieve_context_from_lancedb(dbtable,question,top_k=2)

    # Create a user prompt using the question and retrieved context
    messages.append(
        {"role": "user", "content": f"Question: '{question}'; Context:'{context}'"}
    )

    # Get the response from the LLM
    response = ollama.chat(
        model="llama2-uncensored",
        messages=messages
    )
    response_content = response['message']['content']
    print(f"Assistant: {response_content}")

    # Add the response into the context window
    messages.append(
        {"role": "assistant", "content":response_content}
    )

And we run the RAG! Some example questions you can ask:

* How many vacation days do I get?
* Can I get maternity leave?

**Note**: This is a very basic implementation of a RAG, since this workshop is mainly about data ingestion. So expect some weird answers. If you do stop and restart the cell, you will need to rerun the cell containing `ollama serve` first.

In [None]:
main()

You: How many vacation days do I get?
Assistant: Based on the provided context, the employee handbook states that employees are entitled to eight (8) paid vacation days per year. The first step would be to ask if they have been employed with the company for at least 90 days or more, as some companies may offer additional PTO upon meeting this requirement. If it has been over 90 days since employment, please provide details on what other options the employee has for PTO if they have not taken their floating day yet.
If the employee is an exempt employee, they will receive an additional day of PTO that they must take within 12 months after any holiday observed by the company. The employee should check with HR to see what holidays are counted for this purpose and if there are any special rules or requirements that apply. If it has been over 12 months since any holiday was observed, then the employee may be able to use their PTO for a personal day instead of taking an extra day off from wo

KeyboardInterrupt: Interrupted by user

There's a lot more to learn and do with dlt and LanceDB. Find more info in the [dlt docs](https://dlthub.com/docs/) and the [LanceDB docs](https://lancedb.github.io/lancedb/).

If you have questions about this workshop or dlt, feel free to join our [community on Slack](https://dlthub.com/community).

If you're at EuroPython in Prague this week, come see us at our booth!