# Database-bound prompt templates

A common pattern with prompt templates is to fill the prompt variables with information read from external storage, such as a database. Conceptually, at "format-time" the inputs are the table's primary key(s), such as `user_id`, which are needed to retrieve the record(s) through a DB lookup; what ends up in the formatted prompt string, however, are _other columns_ from these records: for instance, `user_name` or `short_name`.

The "converter-based" prompt template, and its DB-specific implementations, are designed to enable this kind of usage in just a few lines of code.

In other words, the `ConverterPromptTemplate` class decouples the inputs to the `format` method from the set of variables actually needed to format the prompt. It does so by introducing a "converter" function that bridges the gap between the two.

In the above example, the "converter" retains a connection to a database and handles the DB I/O, thereby providing a function from `{"user_id": "..."}` to `{"user_name": "...", "short_name": "..."}`.
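To make the decoupling concrete, here is a minimal pure-Python sketch of the idea. It is not the LangChain class: `SketchConverterTemplate` and `fake_lookup` are hypothetical names invented for illustration, and the "database" is just an in-memory dict.

```python
from typing import Any, Callable, Dict, List


class SketchConverterTemplate:
    """Toy stand-in for a converter-based template (NOT the LangChain class)."""

    def __init__(
        self,
        template: str,
        converter: Callable[[Dict[str, Any]], Dict[str, Any]],
        converter_input_variables: List[str],
    ) -> None:
        self.template = template
        self.converter = converter
        self.converter_input_variables = converter_input_variables

    def format(self, **kwargs: Any) -> str:
        # The lookup keys are routed to the converter ...
        keys = {k: kwargs[k] for k in self.converter_input_variables}
        # ... while every other argument goes straight into the template.
        passthrough = {k: v for k, v in kwargs.items() if k not in keys}
        return self.template.format(**passthrough, **self.converter(keys))


def fake_lookup(keys: Dict[str, Any]) -> Dict[str, Any]:
    # Hypothetical in-memory "table" standing in for a real database.
    table = {"u1": {"user_name": "Ada Lovelace", "short_name": "Ada"}}
    return table[keys["user_id"]]


sketch = SketchConverterTemplate(
    template="Write a {adj} greeting for {user_name} ({short_name}).",
    converter=fake_lookup,
    converter_input_variables=["user_id"],
)
greeting_prompt = sketch.format(user_id="u1", adj="warm")
print(greeting_prompt)
```

The caller only ever supplies `user_id` (plus any variables used verbatim in the template); the names that actually appear in the template string are produced by the converter.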

While some custom flows may use converter-based prompt templates directly, in most cases you'll want to take advantage of one of the "DB reader prompt templates" that are built on top of them. In the following you'll first quickly see the general idea in action, then turn to a practical application using Apache Cassandra® as the database bound to the template.

### In pictures

The diagram below illustrates how the "converter function" (a map from a `dict` to a `dict`) binds to the database and provides a database-lookup bridge to convert the input variables into those required to complete the prompt.

![Cassandra-bound prompt template](https://user-images.githubusercontent.com/14221764/284892387-d8dcfd8f-828a-4ae8-9874-b4961747eaa5.png)

These prompt templates also support "passthrough" variables (the `adj` string in the example), as well as partialing.

The converter, in general, can behave in any way it wants - although the intended usage is that of a lookup on one (or more) database tables.
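As a rough illustration of a converter that draws on more than one table, the sketch below chains two lookups and merges the results into a single `dict`. The `USERS` and `CITIES` dictionaries, their columns, and the function name are all hypothetical stand-ins for real database tables.

```python
from typing import Any, Dict

# Hypothetical in-memory "tables" standing in for two database tables.
USERS = {"ann_smith": {"user_name": "Ann Smith", "city_id": "rome"}}
CITIES = {"rome": {"city_name": "Rome"}}


def two_table_converter(keys: Dict[str, Any]) -> Dict[str, Any]:
    # First lookup: the user row, addressed by its primary key.
    user = USERS[keys["user_id"]]
    # Second lookup: the city row, addressed by a column of the user row.
    city = CITIES[user["city_id"]]
    # Return only the variables the prompt will need.
    return {"user_name": user["user_name"], "city_name": city["city_name"]}


print(two_table_converter({"user_id": "ann_smith"}))
```

From the template's point of view nothing changes: the converter is still a map from a `dict` of keys to a `dict` of prompt variables, regardless of how many lookups happen inside it.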

## A simple example

Let's see the various parts in action with a mock "database retriever":

In [1]:
from typing import Any, Dict

from langchain.prompts.database.converter_prompt_template import ConverterPromptTemplate


def mock_db_reader(keys: Dict[str, str]) -> Dict[str, Any]:
    # we pretend we're reading this from a database to keep the demonstration simple:
    user_id = keys["user_id"]
    return {
        "user_name": user_id.replace("_", " ").title(),
        "short_name": user_id[:2].upper(),
    }


print(mock_db_reader({"user_id": "marc_triggiani"}))

{'user_name': 'Marc Triggiani', 'short_name': 'MA'}


When creating the template, we supply the converter and an f-string, and specify the input/output variables (taking special care to tell the template what the converter does):

In [2]:
prompt_fstring = (
    "Please write a {adj} greeting for {user_name} "
    "(you may use the informal name '{short_name}' where appropriate)"
)

c_p_template = ConverterPromptTemplate(
    template=prompt_fstring,
    input_variables=["adj"],
    converter=mock_db_reader,
    converter_input_variables=["user_id"],
    converter_output_variables=["user_name", "short_name"],
)

In [3]:
c_p_template.format(user_id="otto_schneider", adj="sassy")

"Please write a sassy greeting for Otto Schneider (you may use the informal name 'OT' where appropriate)"

## Cassandra-bound prompt templates

Now that the core logic is covered, let us use the `CassandraReaderPromptTemplate` specialized prompt template and bind the template formatting to actual data from a database table.

> If you want to run this part, you'll need access to a suitable database. The easiest path is to use [Astra DB through CQL](https://docs.datastax.com/en/astra-serverless/docs/), a serverless DB-as-a-service built on Cassandra (coming with a free tier for evaluation). Alternatively, you can get an [Apache Cassandra®](https://cassandra.apache.org) cluster running - we'll show how to connect to both types of database momentarily.

### Initialize DB access

In [4]:
import cassio

To connect to **Astra DB through CQL**, the quickest way is to get the necessary connection secrets from the Astra UI. These are [DB Admin Token](https://awesome-astra.github.io/docs/pages/astra/create-token/#c-procedure) and [Database ID](https://awesome-astra.github.io/docs/pages/astra/faq/#where-should-i-find-a-database-identifier), which respectively look like `"AstraCS:zW3fN4..."` and `"01234567-89ab-cdef-0123-456789abcdef"`. Further, you may optionally want to specify a keyspace name (or simply use the default one for the database).

In [None]:
cassio.init(
    database_id="REPLACE_ME",  # <-- "01234567-89ab-cdef-0123-456789abcdef"
    token="REPLACE_ME",  # <-- "AstraCS:zW3fN4..."
    # keyspace="my_keyspace"   #     Optional
)

If you want to connect to a **Cassandra cluster** instead, prepare a `cassandra.cluster.Session` object beforehand ([docs](https://docs.datastax.com/en/developer/python-driver/latest/getting_started/#connecting-to-cassandra)) and pass it, along with the keyspace, to CassIO's `init` method:

In [5]:
# Run this cell as an alternative to the Astra DB cell above!

from cassandra.cluster import Cluster

# This is an example - check your configuration and the docs for more options
cluster = Cluster(["127.0.0.1"])
session = cluster.connect()

cassio.init(session=session, keyspace="my_keyspace")

### Prepare data on DB

Do not mind the next cell too much: it has the sole purpose of preparing suitable data in your database for this demo to run properly.

In [6]:
_session = cassio.config.resolve_session()
_keyspace = cassio.config.resolve_keyspace()

_session.execute(
    f"CREATE TABLE IF NOT EXISTS {_keyspace}.demo_users (user_id TEXT PRIMARY KEY, user_name TEXT, short_name TEXT);"
)
_session.execute(
    f"INSERT INTO {_keyspace}.demo_users (user_id, user_name, short_name) VALUES ('john_doe', 'John Doe', 'JO');"
)
_session.execute(
    f"INSERT INTO {_keyspace}.demo_users (user_id, user_name, short_name) VALUES ('ann_smith', 'Ann Smith', 'AN');"
)

<cassandra.cluster.ResultSet at 0x7fe72663c7f0>

### Set up the prompt template

In [7]:
from langchain.prompts.database import CassandraReaderPromptTemplate

When creating the `CassandraReaderPromptTemplate`, you will pass a "field mapper" to list, for each of the variables in the prompt,
the source table and column. The prompt template will figure out which inputs are required (e.g. `user_id`) and, if necessary, will merge information from several tables with an optimized query plan.
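One way to picture how the required inputs could be derived from a field mapper is sketched below. This is only an illustration, not how CassIO does it: the `plan_lookups` helper is hypothetical, and the `primary_keys` argument is an assumption (the real template can inspect the table schema by itself).

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def plan_lookups(
    field_mapper: Dict[str, Tuple[str, str]],
    primary_keys: Dict[str, List[str]],
) -> Tuple[List[str], Dict[str, List[str]]]:
    # Group the requested columns by table, so each table is read only once.
    columns_by_table: Dict[str, List[str]] = defaultdict(list)
    for _variable, (table, column) in field_mapper.items():
        columns_by_table[table].append(column)
    # The inputs needed at format-time are the primary keys of those tables.
    required_inputs = sorted(
        {key for table in columns_by_table for key in primary_keys[table]}
    )
    return required_inputs, dict(columns_by_table)


demo_mapper = {
    "user_name": ("demo_users", "user_name"),
    "short_name": ("demo_users", "short_name"),
}
# Assumed schema information: the primary key of each table involved.
demo_primary_keys = {"demo_users": ["user_id"]}

required, per_table = plan_lookups(demo_mapper, demo_primary_keys)
print(required, per_table)
```

With the demo table used in this page, both prompt variables come from `demo_users`, so a single read keyed on `user_id` is enough.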

In [8]:
f_mapper = {
    "user_name": ("demo_users", "user_name"),
    "short_name": ("demo_users", "short_name"),
}

cassandra_prompt_template = CassandraReaderPromptTemplate(
    template=prompt_fstring,
    field_mapper=f_mapper,
)

Once the above template is created, you can format it over and over simply with:

In [9]:
cassandra_prompt_template.format(user_id="john_doe", adj="lackadaisical")

"Please write a lackadaisical greeting for John Doe (you may use the informal name 'JO' where appropriate)"

In [10]:
cassandra_prompt_template.format(user_id="ann_smith", adj="tongue-in-cheek")

"Please write a tongue-in-cheek greeting for Ann Smith (you may use the informal name 'AN' where appropriate)"

_Advanced topics_:
- the class has an `admit_nulls` boolean parameter to control the behaviour for missing data/rows;
- the full syntax for the field mapper values is `(table, column [, column-specific admit nulls [, default to supply if None found]])`;
- alternatively, the "column" can be any function of the row expressed as a dict: `("demo_users", lambda row: row["user_name"].upper())`.

Please check the [documentation](https://github.com/CassioML/cassio/blob/main/src/cassio/db_reader/multi_table_cassandra_reader.py) for CassIO's `MultiTableCassandraReader` for more details.
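The field-mapper options listed above can be pictured with the following sketch. It encodes one plausible reading of the semantics (raise on missing data unless nulls are admitted, in which case the default is supplied) and is not CassIO's implementation: `resolve_field` is a hypothetical helper, and the linked reader should be treated as the authority on the actual behaviour.

```python
from typing import Any, Dict, Optional, Tuple


def resolve_field(
    row: Optional[Dict[str, Any]],
    spec: Tuple[Any, ...],
    admit_nulls: bool = False,
) -> Any:
    # spec mirrors (table, column [, admit_nulls [, default]]).
    # The table name (spec[0]) is unused here: the row is already fetched.
    column = spec[1]
    field_admit_nulls = spec[2] if len(spec) > 2 else admit_nulls
    default = spec[3] if len(spec) > 3 else None

    value = None
    if row is not None:
        # The "column" may be a plain name or a function of the row dict.
        value = column(row) if callable(column) else row.get(column)

    if value is None:
        if not field_admit_nulls:
            raise ValueError(f"Missing value for field spec {spec!r}")
        return default
    return value


row = {"user_name": "Ann Smith", "short_name": None}
print(resolve_field(row, ("demo_users", "user_name")))
print(resolve_field(row, ("demo_users", lambda r: r["user_name"].upper())))
print(resolve_field(row, ("demo_users", "short_name", True, "friend")))
```

In this reading, a per-field setting in the tuple takes precedence over the template-wide `admit_nulls` flag.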

### As Runnable

These prompt templates implement the Runnable interface and as such can be seamlessly integrated in broader chains:

_(the next cells require OpenAI: make sure you have the `OPENAI_API_KEY` environment variable set)_

In [11]:
from langchain.llms import OpenAI
from langchain.schema.runnable import RunnableLambda

In [12]:
llm = OpenAI()


def output_cleaner(raw_output: str):
    return raw_output.strip().replace("\n", " ")

In [13]:
chain = c_p_template | llm | RunnableLambda(output_cleaner)

print(chain.invoke({"user_id": "ann_smith", "adj": "very enthusiastic"}))

Hello AN! It's so great to see you! We've been looking forward to your arrival and can't wait to get started! Welcome!


## Feast prompt templates

Another prompt template bound to a source of data is the `FeastReaderPromptTemplate`,
which behaves much like the Cassandra template seen so far. We will not detail
the individual steps (if you are a Feast user, you will immediately figure out what to do), but rather just show
a usage example to get you started:

In [None]:
from langchain.prompts.database import FeastReaderPromptTemplate

In [None]:
store = ...  # your Feast feature store
my_prompt_string = ...

ecommerce_template = FeastReaderPromptTemplate(
    feature_store=store,
    template=my_prompt_string,
    input_variables=[
        "user_question"
    ],  # <-- additional, "passthrough" template variables
    field_mapper={
        # {
        #    prompt_variable: (view_name, feature_name, [admit_nulls, [default]])
        # }
        "age": ("user_data", "age"),
        "purchases": ("user_data", "purchases"),
        "visit_frequency": ("user_data", "visit_frequency"),
        "active_cart": ("active_cart", "active_cart"),
    },
)

In [None]:
ecommerce_template.format(
    user_name="marilyn",  # <-- "join keys" (one in this case) for the store
    user_question="How to reach Support?",  # <-- additional variables
)

Additional control over defaults, and over the policy for items that are not found, is available, similarly to the Cassandra template.

This template supports partialing and implements the Runnable interface just like the Cassandra one.

For more details, check out the expanded example on [cassio.org](https://cassio.org/frameworks/langchain/prompt-templates-feast).

## Other DB-bound templates

This page not only documents the ready-to-use Cassandra and Feast "DB-bound prompt templates",
but also serves as a guide to implementing your own custom template of this kind, to better automate the very common
operation of filling a prompt template with data from a database query.

All that is needed is to prepare a suitable "converter" function, which will - in most cases - retain a reference
to an external storage backend.
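
As a minimal sketch of such a converter, the example below uses Python's built-in `sqlite3` module with an in-memory database. The `make_sqlite_converter` factory is a hypothetical name, and the table layout simply mirrors the `demo_users` table used earlier; adapt both to your own backend.

```python
import sqlite3
from typing import Any, Callable, Dict


def make_sqlite_converter(
    connection: sqlite3.Connection,
) -> Callable[[Dict[str, Any]], Dict[str, Any]]:
    # The returned closure retains a reference to the open connection.
    def converter(keys: Dict[str, Any]) -> Dict[str, Any]:
        row = connection.execute(
            "SELECT user_name, short_name FROM demo_users WHERE user_id = ?",
            (keys["user_id"],),
        ).fetchone()
        if row is None:
            raise KeyError(f"No user found for user_id={keys['user_id']!r}")
        return {"user_name": row[0], "short_name": row[1]}

    return converter


# Demo data in an in-memory database, for illustration only.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE demo_users "
    "(user_id TEXT PRIMARY KEY, user_name TEXT, short_name TEXT)"
)
connection.execute("INSERT INTO demo_users VALUES ('john_doe', 'John Doe', 'JO')")

sqlite_converter = make_sqlite_converter(connection)
print(sqlite_converter({"user_id": "john_doe"}))
```

A function like `sqlite_converter` could then be supplied as the `converter` of a `ConverterPromptTemplate`, with `converter_input_variables=["user_id"]` and `converter_output_variables=["user_name", "short_name"]`, exactly as in the mock example at the top of this page.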