# Basic RAG tutorial with templates

:::info
In this tutorial we show you how to do retrieval-augmented generation (RAG) with `superduper`.
Note that this is just one example of the flexibility and power that `superduper` gives
to developers. `superduper` is about much more than RAG and LLMs.
:::

As in the vector-search tutorial, we'll use the `superduper` documentation as our dataset.
We'll add it to a test database after downloading the data snapshot:

In [None]:
!curl -O https://superduper-public-demo.s3.amazonaws.com/text.json

In [4]:
import json

from superduper import superduper, Document

db = superduper('mongomock://test')

with open('text.json') as f:
    data = json.load(f)

_ = db['docu'].insert_many([{'txt': r} for r in data]).execute()

2024-Jun-17 09:11:37.77| INFO     | Duncans-MBP.fritz.box| superduper.base.build:71   | Data Client is ready. mongomock.MongoClient('localhost', 27017)
2024-Jun-17 09:11:37.77| INFO     | Duncans-MBP.fritz.box| superduper.base.build:44   | Connecting to Metadata Client with engine:  mongomock.MongoClient('localhost', 27017)
2024-Jun-17 09:11:37.77| INFO     | Duncans-MBP.fritz.box| superduper.base.build:162  | Connecting to compute client: None
2024-Jun-17 09:11:37.77| INFO     | Duncans-MBP.fritz.box| superduper.base.datalayer:86   | Building Data Layer
2024-Jun-17 09:11:37.77| INFO     | Duncans-MBP.fritz.box| superduper.base.build:227  | Configuration: 
 +---------------+------------------+
| Configuration |      Value       |
+---------------+------------------+
|  Data Backend | mongomock://test |
+---------------+------------------+
2024-Jun-17 09:11:37.86| INFO     | Duncans-MBP.fritz.box| superduper.backends.mongodb.data_backend:194  | Table docu does not exist, auto creating..

Let's verify the data in the `db` by querying one datapoint:

In [5]:
db['docu'].find_one().execute()

Document({'txt': "---\nsidebar_position: 5\n---\n\n# Encoding data\n\nIn AI, typical types of data are:\n\n- **Numbers** (integers, floats, etc.)\n- **Text**\n- **Images**\n- **Audio**\n- **Videos**\n- **...bespoke in house data**\n\nMost databases don't support any data other than numbers and text.\n enables the use of these more interesting data-types using the `Document` wrapper.\n\n### `Document`\n\nThe `Document` wrapper, wraps dictionaries, and is the container which is used whenever \ndata is exchanged with your database. That means inputs, and queries, wrap dictionaries \nused with `Document` and also results are returned wrapped with `Document`.\n\nWhenever the `Document` contains data which is in need of specialized serialization,\nthen the `Document` instance contains calls to `DataType` instances.\n\n### `DataType`\n\nThe [`DataType` class](../apply_api/datatype), allows users to create and encoder custom datatypes, by providing \ntheir own encoder/decoder pairs.\n\nHere is

The first step in a RAG application is to create a `VectorIndex`. The results of searching 
with this index will be used as input to the LLM for answering questions.

Read about `VectorIndex` [here](../apply_api/vector_index.md) and follow along with the tutorial on
vector-search [here](./vector_search.md).
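
To make the idea of searching a vector index concrete, here is a minimal pure-Python sketch of cosine-similarity search over a handful of toy vectors. It is an illustration only, not how `superduper` implements `VectorIndex`; the helper names `cosine_similarity` and `top_k` and the three-dimensional vectors are assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, stored, k=2):
    """Return the ids of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), doc_id) for doc_id, vec in stored.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


# Toy 3-dimensional "embeddings" (hypothetical data for illustration).
stored = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}

nearest = top_k([1.0, 0.01, 0.0], stored, k=2)
print(nearest)
```

In a RAG pipeline, the texts belonging to the nearest ids would then be passed to the LLM as context.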

In [7]:
from superduper import Application, Document, VectorIndex, Listener, vector
from superduper.ext.sentence_transformers.model import SentenceTransformer
from superduper.base.code import Code

# Convert the model's array output to a plain list before it is stored.
def postprocess(x):
    return x.tolist()

# all-MiniLM-L6-v2 produces 384-dimensional embeddings.
datatype = vector(shape=384, identifier="my-vec")

model = SentenceTransformer(
    identifier="my-embedding",
    datatype=datatype,
    predict_kwargs={"show_progress_bar": True},
    signature="*args,**kwargs",
    model="all-MiniLM-L6-v2",
    device="cpu",
    postprocess=Code.from_object(postprocess),
)

listener = Listener(
    identifier="my-listener",
    model=model,
    key='txt',
    select=db['docu'].find(),
    predict_kwargs={'max_chunk_size': 50},
)

vector_index = VectorIndex(
    identifier="my-index",
    indexing_listener=listener,
    measure="cosine"
)

db.apply(vector_index)

2024-Jun-17 09:11:48.66| INFO     | Duncans-MBP.fritz.box| superduper.base.code:33   | Created code object:
from superduper import code

@code
def postprocess(x):
    return x.tolist()





2024-Jun-17 09:11:50.56| INFO     | Duncans-MBP.fritz.box| superduper.base.document:362  | Building leaf <class 'superduper.components.listener.Listener'> with identifier: my-listener
2024-Jun-17 09:11:50.56| INFO     | Duncans-MBP.fritz.box| superduper.base.document:362  | Building leaf <class 'superduper.ext.sentence_transformers.model.SentenceTransformer'> with identifier: my-embedding
2024-Jun-17 09:11:50.56| INFO     | Duncans-MBP.fritz.box| superduper.base.document:362  | Building leaf <class 'superduper.backends.mongodb.query.MongoQuery'> with identifier: docu-find
2024-Jun-17 09:11:50.56| INFO     | Duncans-MBP.fritz.box| superduper.base.document:362  | Building leaf <class 'superduper.components.datatype.DataType'> with identifier: my-vec
2024-Jun-17 09:11:50.56| INFO     | Duncans-MBP.fritz.box| superduper.base.document:362  | Building leaf <class 'superduper.base.code.Code'> with identifier: postprocess




2024-Jun-17 09:11:51.64| INFO     | Duncans-MBP.fritz.box| superduper.backends.local.compute:42   | Submitting job. function:<function method_job at 0x10c0a1440>


210it [00:00, 118101.88it/s]

2024-Jun-17 09:11:52.72| INFO     | Duncans-MBP.fritz.box| superduper.components.model:752  | Computing chunk 0/4





Batches:   0%|          | 0/2 [00:00<?, ?it/s]

2024-Jun-17 09:11:53.31| INFO     | Duncans-MBP.fritz.box| superduper.components.model:778  | Adding 50 model outputs to `db`
2024-Jun-17 09:11:53.31| INFO     | Duncans-MBP.fritz.box| superduper.base.document:362  | Building leaf <class 'superduper.components.datatype.Native'> with identifier: 
2024-Jun-17 09:11:53.31| INFO     | Duncans-MBP.fritz.box| superduper.base.document:362  | Building leaf <class 'superduper.components.datatype.DataType'> with identifier: my-vec
2024-Jun-17 09:11:53.31| INFO     | Duncans-MBP.fritz.box| superduper.base.document:362  | Building leaf <class 'superduper.components.datatype.Native'> with identifier: 
2024-Jun-17 09:11:53.32| INFO     | Duncans-MBP.fritz.box| superduper.base.document:362  | Building leaf <class 'superduper.components.datatype.DataType'> with identifier: my-vec
2024-Jun-17 09:11:53.32| INFO     | Duncans-MBP.fritz.box| superduper.base.document:362  | Building leaf <class 'superduper.components.datatype.Native'> with identifier: 

Batches:   0%|          | 0/2 [00:00<?, ?it/s]

2024-Jun-17 09:11:53.94| INFO     | Duncans-MBP.fritz.box| superduper.components.model:778  | Adding 50 model outputs to `db`
2024-Jun-17 09:11:53.94| INFO     | Duncans-MBP.fritz.box| superduper.base.document:362  | Building leaf <class 'superduper.components.datatype.Native'> with identifier: 
2024-Jun-17 09:11:53.94| INFO     | Duncans-MBP.fritz.box| superduper.base.document:362  | Building leaf <class 'superduper.components.datatype.DataType'> with identifier: my-vec
2024-Jun-17 09:11:53.94| INFO     | Duncans-MBP.fritz.box| superduper.base.document:362  | Building leaf <class 'superduper.components.datatype.Native'> with identifier: 
2024-Jun-17 09:11:53.94| INFO     | Duncans-MBP.fritz.box| superduper.base.document:362  | Building leaf <class 'superduper.components.datatype.DataType'> with identifier: my-vec
2024-Jun-17 09:11:53.94| INFO     | Duncans-MBP.fritz.box| superduper.base.document:362  | Building leaf <class 'superduper.components.datatype.Native'> with identifier: 

Batches:   0%|          | 0/2 [00:00<?, ?it/s]

2024-Jun-17 09:11:54.57| INFO     | Duncans-MBP.fritz.box| superduper.components.model:778  | Adding 50 model outputs to `db`
2024-Jun-17 09:11:54.57| INFO     | Duncans-MBP.fritz.box| superduper.base.document:362  | Building leaf <class 'superduper.components.datatype.Native'> with identifier: 
2024-Jun-17 09:11:54.57| INFO     | Duncans-MBP.fritz.box| superduper.base.document:362  | Building leaf <class 'superduper.components.datatype.DataType'> with identifier: my-vec
2024-Jun-17 09:11:54.57| INFO     | Duncans-MBP.fritz.box| superduper.base.document:362  | Building leaf <class 'superduper.components.datatype.Native'> with identifier: 
2024-Jun-17 09:11:54.57| INFO     | Duncans-MBP.fritz.box| superduper.base.document:362  | Building leaf <class 'superduper.components.datatype.DataType'> with identifier: my-vec
2024-Jun-17 09:11:54.58| INFO     | Duncans-MBP.fritz.box| superduper.base.document:362  | Building leaf <class 'superduper.components.datatype.Native'> with identifier: 

Batches:   0%|          | 0/2 [00:00<?, ?it/s]

2024-Jun-17 09:11:55.26| INFO     | Duncans-MBP.fritz.box| superduper.components.model:778  | Adding 50 model outputs to `db`
2024-Jun-17 09:11:55.26| INFO     | Duncans-MBP.fritz.box| superduper.base.document:362  | Building leaf <class 'superduper.components.datatype.Native'> with identifier: 
2024-Jun-17 09:11:55.26| INFO     | Duncans-MBP.fritz.box| superduper.base.document:362  | Building leaf <class 'superduper.components.datatype.DataType'> with identifier: my-vec
2024-Jun-17 09:11:55.26| INFO     | Duncans-MBP.fritz.box| superduper.base.document:362  | Building leaf <class 'superduper.components.datatype.Native'> with identifier: 
2024-Jun-17 09:11:55.26| INFO     | Duncans-MBP.fritz.box| superduper.base.document:362  | Building leaf <class 'superduper.components.datatype.DataType'> with identifier: my-vec
2024-Jun-17 09:11:55.26| INFO     | Duncans-MBP.fritz.box| superduper.base.document:362  | Building leaf <class 'superduper.components.datatype.Native'> with identifier: 

Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2024-Jun-17 09:11:55.47| INFO     | Duncans-MBP.fritz.box| superduper.components.model:778  | Adding 10 model outputs to `db`
2024-Jun-17 09:11:55.47| INFO     | Duncans-MBP.fritz.box| superduper.base.document:362  | Building leaf <class 'superduper.components.datatype.Native'> with identifier: 
2024-Jun-17 09:11:55.47| INFO     | Duncans-MBP.fritz.box| superduper.base.document:362  | Building leaf <class 'superduper.components.datatype.DataType'> with identifier: my-vec
2024-Jun-17 09:11:55.47| INFO     | Duncans-MBP.fritz.box| superduper.base.document:362  | Building leaf <class 'superduper.components.datatype.Native'> with identifier: 
2024-Jun-17 09:11:55.47| INFO     | Duncans-MBP.fritz.box| superduper.base.document:362  | Building leaf <class 'superduper.components.datatype.DataType'> with identifier: my-vec
2024-Jun-17 09:11:55.47| INFO     | Duncans-MBP.fritz.box| superduper.base.document:362  | Building leaf <class 'superduper.components.datatype.Native'> with identifier: 

Loading vectors into vector-table...: 210it [00:00, 3589.49it/s]

2024-Jun-17 09:11:56.68| SUCCESS  | Duncans-MBP.fritz.box| superduper.backends.local.compute:48   | Job submitted on <superduper.backends.local.compute.LocalComputeBackend object at 0x28779e4d0>.  function:<function callable_job at 0x10c0a13a0> future:66146223-1232-409f-9ae7-2aa086879bc8





([<superduper.jobs.job.ComponentJob at 0x2edaeccd0>,
  <superduper.jobs.job.FunctionJob at 0x2e981ff90>],
 VectorIndex(identifier='my-index', uuid='1aab5610-85e9-48b3-b3cc-e44ab97e66e9', indexing_listener=Listener(identifier='my-listener', uuid='0246f030-5efd-48ab-9af4-a38140b6fef4', key='txt', model=SentenceTransformer(preferred_devices=('cuda', 'mps', 'cpu'), device='cpu', identifier='my-embedding', uuid='1472cac2-78bc-4345-ac79-2baf601253a3', signature='*args,**kwargs', datatype=DataType(identifier='my-vec', uuid='706fb6c5-8eeb-44b4-a7ab-9e39b92914f6', encoder=None, decoder=None, info=None, shape=(384,), directory=None, encodable='native', bytes_encoding=<BytesEncoding.BYTES: 'Bytes'>, intermediate_type='bytes', media_type=None), output_schema=None, flatten=False, model_update_kwargs={}, predict_kwargs={'show_progress_bar': True}, compute_kwargs={}, validation=None, metric_values={}, num_workers=0, object=SentenceTransformer(
   (0): Transformer({'max_seq_length': 256, 'do_lower_cas
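
The log output above reports 210 documents, with model outputs added in four groups of 50 and a final group of 10, which is consistent with the `max_chunk_size` of 50 passed to the `Listener`. The arithmetic can be sketched as follows; `chunk_sizes` is a hypothetical helper written for this illustration and is not part of `superduper`.

```python
def chunk_sizes(n_items, max_chunk_size):
    """Sizes of consecutive chunks when n_items are processed max_chunk_size at a time."""
    sizes = []
    for start in range(0, n_items, max_chunk_size):
        # The last chunk may be smaller than max_chunk_size.
        sizes.append(min(max_chunk_size, n_items - start))
    return sizes


sizes = chunk_sizes(210, 50)
print(sizes)  # [50, 50, 50, 50, 10]
```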

Now that we've set up a `VectorIndex`, we can connect this index with an LLM in a number of ways.
A simple way to do that is with the `SequentialModel`. The first part of the `SequentialModel`
executes a query and provides the results to the LLM in the second part. 
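
The two-stage flow described above can be sketched in plain Python. This is only a conceptual illustration with stand-in functions: `SequentialSketch`, `fake_retrieval`, and `fake_llm` are hypothetical names, and the sketch makes no claim about how `SequentialModel` is implemented.

```python
class SequentialSketch:
    """Feed the output of each step into the next one."""

    def __init__(self, steps):
        self.steps = list(steps)

    def predict(self, x):
        for step in self.steps:
            x = step(x)
        return x


def fake_retrieval(question):
    # Stand-in for the retrieval step: returns a prompt with fixed context.
    context = ["A vector index stores embeddings for similarity search."]
    return "Context:\n" + "\n".join(context) + "\n\nQuestion: " + question


def fake_llm(prompt):
    # Stand-in for an LLM call: reports the size of the prompt it received.
    return f"(answer based on a {len(prompt.splitlines())}-line prompt)"


seq_sketch = SequentialSketch([fake_retrieval, fake_llm])
answer = seq_sketch.predict("What is a vector index?")
print(answer)
```

The point of the design is that the LLM never sees the bare question: it always receives the question together with the retrieved context.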

The `RetrievalPrompt` component takes a query with a "free" variable as input. 
This gives users great flexibility with regard to how they fetch the context
for their downstream models.
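
One way to picture a "free" variable is as a placeholder inside a query template that is filled in only when a prediction is requested. The sketch below uses a made-up `{"$var": name}` placeholder convention and a hypothetical `fill_variables` helper; neither reflects how `superduper` represents `Variable` internally.

```python
def fill_variables(template, values):
    """Recursively replace {"$var": name} placeholders with values[name]."""
    if isinstance(template, dict):
        if set(template) == {"$var"}:
            return values[template["$var"]]
        return {key: fill_variables(val, values) for key, val in template.items()}
    if isinstance(template, list):
        return [fill_variables(item, values) for item in template]
    return template


# A query template with one free variable named "prompt" (hypothetical format).
query_template = {
    "like": {"txt": {"$var": "prompt"}},
    "vector_index": "my-index",
    "n": 5,
}

filled = fill_variables(query_template, {"prompt": "Tell me about vector-indexes"})
print(filled)
```

Because the template itself is left unchanged, the same query definition can be reused for every incoming question.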

We're using OpenAI, but you can use any type of LLM with `superduper`. We have several
native integrations (see [here](../ai_integraitons/)) but you can also [bring your own model](../models/bring_your_own_model.md).

In [None]:
from superduper.ext.llm.prompter import RetrievalPrompt
from superduper.base.variables import Variable
from superduper.base.code import Code
from superduper import Document
from superduper.components.model import SequentialModel
from superduper.ext.openai import OpenAIChatCompletion

# Vector-search query; `Variable('prompt')` is the free variable filled in at prediction time.
q = db['docu'].like(Document({'txt': Variable('prompt')}), vector_index='my-index', n=5).find().limit(10)

# Extract the raw text from each retrieved document.
def get_output(c):
    return [r['txt'] for r in c]

prompt_template = RetrievalPrompt('my-prompt', select=q, postprocess=Code.from_object(get_output))

# Chain the retrieval prompt and the LLM so that the retrieved context feeds the model.
llm = OpenAIChatCompletion('gpt-3.5-turbo')
seq = SequentialModel('rag', models=[prompt_template, llm])

db.apply(seq)

Now we can test the `SequentialModel` with a sample question:

In [None]:
seq.predict('Tell me about vector-indexes')

:::tip
Did you know that you can use any tool from the Python ecosystem with `superduper`?
That includes `langchain` and `llamaindex`, which can be very useful for RAG applications.
:::

Finally, we bundle the `VectorIndex` and the `SequentialModel` into a single `Application`, export it to the `rag` directory, and inspect the exported `component.json`:

In [None]:
from superduper import Application

app = Application('rag', components=[vector_index, seq])

In [None]:
app.export('rag')

In [None]:
!cat rag/component.json | jq .
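
If `jq` is not installed, the exported file can be pretty-printed with Python's standard `json` module instead. The sketch below defines a small hypothetical helper, `pretty_print_json`, and demonstrates it on a placeholder file written to a temporary directory; the placeholder content is invented for the demo and does not reflect the real structure of `rag/component.json`.

```python
import json
import os
import tempfile


def pretty_print_json(path):
    """Load a JSON file and return it as an indented string."""
    with open(path) as f:
        content = json.load(f)
    return json.dumps(content, indent=2)


# Demo on a placeholder file; with a real export you would call
# pretty_print_json('rag/component.json') instead.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "component.json")
    with open(demo_path, "w") as f:
        json.dump({"identifier": "rag"}, f)
    rendered = pretty_print_json(demo_path)

print(rendered)
```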