# Prompt Templates, intro

## Colab-specific setup

Make sure you have a database, and be ready to upload the Secure Connect Bundle and supply the Token string
(see [Pre-requisites](https://cassio.org/start_here/#vector-database) on cassio.org for details). Remember that you need a **custom Token** with the role [Database Administrator](https://awesome-astra.github.io/docs/pages/astra/create-token/).

_Note: this notebook is part of the CassIO documentation. Visit [this page on cassio.org](https://cassio.org/frameworks/langchain/prompt-templates-basic/)._


In [2]:
# install required dependencies
! pip install \
    "git+https://github.com/hemidactylus/langchain@updated-full-preview--lab#egg=langchain&subdirectory=libs/langchain" \
    "cassio>=0.1.1" \
    "google-cloud-aiplatform>=1.25.0" \
    "jupyter>=1.0.0" \
    "openai==0.27.7" \
    "python-dotenv==1.0.0" \
    "tensorflow-cpu==2.12.0" \
    "tiktoken==0.4.0" \
    "transformers>=4.29.2"

Collecting langchain
  Cloning https://github.com/hemidactylus/langchain (to revision updated-full-preview--lab) to /tmp/pip-install-4z6yjq1n/langchain_3069a66035be45399254938b9783f587
  Running command git clone --filter=blob:none --quiet https://github.com/hemidactylus/langchain /tmp/pip-install-4z6yjq1n/langchain_3069a66035be45399254938b9783f587
  Running command git checkout -b updated-full-preview--lab --track origin/updated-full-preview--lab
  Switched to a new branch 'updated-full-preview--lab'
  Branch 'updated-full-preview--lab' set up to track remote branch 'updated-full-preview--lab' from 'origin'.
  Resolved https://github.com/hemidactylus/langchain to commit 96a26aad2ec787a138fe8c321009f38625e8f4ce
  Running command git submodule update --init --recursive -q
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting cassio>=0.1.1
  Downloading cassio-0

You will likely be asked to "Restart the Runtime" at this time, as some dependencies
have been upgraded. **Please do restart the runtime now** for a smoother execution from this point onward.

In [None]:
# Input your database keyspace name:
ASTRA_DB_KEYSPACE = input('Your Astra DB Keyspace name (e.g. cassio_tutorials): ')

In [None]:
# Input your Astra DB token string, the one starting with "AstraCS:..."
from getpass import getpass
ASTRA_DB_APPLICATION_TOKEN = getpass('Your Astra DB Token ("AstraCS:..."): ')

### Astra DB Secure Connect Bundle

Please upload the Secure Connect Bundle zipfile to connect to your Astra DB instance.

The Secure Connect Bundle is needed to establish a secure connection to the database.
Click [here](https://awesome-astra.github.io/docs/pages/astra/download-scb/#c-procedure) for instructions on how to download it from Astra DB.

In [None]:
# Upload your Secure Connect Bundle zipfile:
import os
from google.colab import files


print('Please upload your Secure Connect Bundle')
uploaded = files.upload()
if uploaded:
    astraBundleFileTitle = list(uploaded.keys())[0]
    ASTRA_DB_SECURE_BUNDLE_PATH = os.path.join(os.getcwd(), astraBundleFileTitle)
else:
    raise ValueError(
        'Cannot proceed without Secure Connect Bundle. Please re-run the cell.'
    )

In [None]:
# colab-specific override of helper functions
from cassandra.cluster import (
    Cluster,
)
from cassandra.auth import PlainTextAuthProvider


def getCQLSession(mode='astra_db'):
    if mode == 'astra_db':
        cluster = Cluster(
            cloud={
                "secure_connect_bundle": ASTRA_DB_SECURE_BUNDLE_PATH,
            },
            auth_provider=PlainTextAuthProvider(
                "token",
                ASTRA_DB_APPLICATION_TOKEN,
            ),
        )
        astraSession = cluster.connect()
        return astraSession
    else:
        raise ValueError('Unsupported CQL Session mode')

def getCQLKeyspace(mode='astra_db'):
    if mode == 'astra_db':
        return ASTRA_DB_KEYSPACE
    else:
        raise ValueError('Unsupported CQL Session mode')

## Populate the Database

This notebook requires some data to be pre-populated on your database. Please follow these steps (roughly equivalent to [this section](https://cassio.org/more_info/#pre-populate-the-database) of the instructions on cassio.org):

In [1]:
!curl https://raw.githubusercontent.com/CassioML/cassio-website/main/setup/provision_db/write_sample_data.cql --output write_sample_data.cql
!curl https://downloads.datastax.com/enterprise/cqlsh-astra-20230526-vectortype-bin.tar.gz --output cqlsh.tar.gz

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 12680  100 12680    0     0  27464      0 --:--:-- --:--:-- --:--:-- 27505
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1159k  100 1159k    0     0  1317k      0 --:--:-- --:--:-- --:--:-- 1317k




```
CREATE TABLE IF NOT EXISTS people (
    city text,
    name text,
    age int,
    PRIMARY KEY (city, name)
) WITH CLUSTERING ORDER BY (name ASC);

CREATE TABLE IF NOT EXISTS nickname_by_city (
    city text PRIMARY KEY,
    nickname text
);
```



In [None]:
ASTRA_DB_KEYSPACE = getCQLKeyspace(mode="astra_db")
!tar -xzf cqlsh.tar.gz
!./cqlsh-astra/bin/cqlsh \
    -b "$ASTRA_DB_SECURE_BUNDLE_PATH" \
    -u token \
    -p "$ASTRA_DB_APPLICATION_TOKEN" \
    -k "$ASTRA_DB_KEYSPACE" \
    -f write_sample_data.cql

### Colab preamble completed

The following cells constitute the demo notebook proper.

# Prompt Templates, intro

CassIO provides bindings that inject data
from Cassandra tables into your LangChain prompt templates.

## Basic usage

First, import the specialized Cassandra prompt template:

In [None]:
from langchain.prompts.database import CassandraReaderPromptTemplate

This cell simply obtains a `Session` object, i.e. an active connection to your database. Replace with custom code if you are not using Astra DB.

In [None]:
# creation of the DB connection
cqlMode = 'astra_db'
session = getCQLSession(mode=cqlMode)
keyspace = getCQLKeyspace(mode=cqlMode)

In [None]:
ctemplate0 = """Please answer a question from a user.
Keep in mind that the user's age is {user_age} and they live in a city with
nickname {city_nickname}.

USER'S QUESTION: {user_question}
YOUR ANSWER:
"""

### Natural binding with the DB

In the (string) template above, some variables are to be filled with a DB lookup.

The following instruction specifies the details of the binding: for instance,
the variable `user_age` is to be found in table `people`, specifically in column `age`:

In [None]:
cassPrompt = CassandraReaderPromptTemplate(
    session=session,
    keyspace=keyspace,
    field_mapper={
        'user_age': ('people', 'age'),
        'city_nickname': ('nickname_by_city', 'nickname'),
    },
    template=ctemplate0,
    input_variables=["user_question"],
)

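**Note** that in the command above, `input_variables` lists only `user_question`, the one variable that is not filled by a DB lookup. The variables bound to the database (`user_age` and `city_nickname`) are declared in `field_mapper`, and the _primary key columns_ needed for the lookups (`city` and `name`) are passed when formatting the template.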

When formatting the Prompt Template, you will have to specify the primary key values
for the DB lookup -- the rest is done by the prompt template.

In this case there are two lookups from as many tables: the prompt template
takes care of everything, provided you pass all the primary key columns required
across tables.

Note: this operation is essentially a _client-side join_ (a standard pattern with Cassandra).

In [None]:
print(cassPrompt.format(city='turin', name='beppe',
                        user_question='Is functional programming fun?'))

Please answer a question from a user.
Keep in mind that the user's age is 2 and they live in a city with
nickname CereaNeh.

USER'S QUESTION: Is functional programming fun?
YOUR ANSWER:
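
To make the client-side join idea concrete, here is a minimal plain-Python sketch that uses neither CassIO nor a database. The two dictionaries are hypothetical in-memory stand-ins for the `people` and `nickname_by_city` tables (holding only the values seen in the output above), the template is shortened, and `format_with_lookups` is an illustrative helper, not part of any library:

```python
# In-memory stand-ins for the two tables (assumption: only the rows used above).
PEOPLE = {("turin", "beppe"): {"city": "turin", "name": "beppe", "age": 2}}
NICKNAME_BY_CITY = {("turin",): {"city": "turin", "nickname": "CereaNeh"}}

TEMPLATE = (
    "Keep in mind that the user's age is {user_age} and they live in a city with\n"
    "nickname {city_nickname}.\n"
    "USER'S QUESTION: {user_question}"
)


def format_with_lookups(city, name, user_question):
    """Hypothetical helper: one lookup per table, then a single format call."""
    person = PEOPLE[(city, name)]          # primary key: (city, name)
    nickname = NICKNAME_BY_CITY[(city,)]   # primary key: (city,)
    # The "join" happens here, on the client, by combining the two rows.
    return TEMPLATE.format(
        user_age=person["age"],
        city_nickname=nickname["nickname"],
        user_question=user_question,
    )


print(format_with_lookups("turin", "beppe", "Is functional programming fun?"))
```

Each table is queried independently by its own primary key, and the results are merged in application code rather than by the database.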



#### Arbitrary row functions

You can specify an arbitrary function to transform the database row into the returned field. The function gets a `{column_name: value}` dictionary expressing the row and returns the value for the prompt template:

In [None]:
def nicknamer(row_dict):
    return f"{row_dict['nickname']} (i.e. {row_dict['city']})"

field_mapper_f = {
    'user_age': ('people', lambda row_dict: row_dict['age'] + 10000),
    'city_nickname': ('nickname_by_city', nicknamer),
}

cassPromptF = CassandraReaderPromptTemplate(
    session=session,
    keyspace=keyspace,
    field_mapper=field_mapper_f,
    template=ctemplate0,
    input_variables=["user_question"],
)

print(cassPromptF.format(city='milan', name='samanta',
                        user_question='Is there a square circle?'))

Please answer a question from a user.
Keep in mind that the user's age is 10014 and they live in a city with
nickname Taaac (i.e. milan).

USER'S QUESTION: Is there a square circle?
YOUR ANSWER:
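
As a rough illustration of the two kinds of mapper entries, the sketch below shows how a getter that is either a column name or a function of the row could be applied to a `{column_name: value}` dictionary. `resolve_field` is a hypothetical helper written for this example (it is not CassIO's implementation), and the rows are hand-written with the values implied by the output above:

```python
def resolve_field(row_dict, getter):
    """Hypothetical helper: a getter is either a column name or a function of the row."""
    if callable(getter):
        return getter(row_dict)
    return row_dict[getter]


def nicknamer(row_dict):
    return f"{row_dict['nickname']} (i.e. {row_dict['city']})"


# Rows as {column_name: value} dictionaries (values taken from the output above).
person_row = {"city": "milan", "name": "samanta", "age": 14}
nickname_row = {"city": "milan", "nickname": "Taaac"}

plain_age = resolve_field(person_row, "age")
shifted_age = resolve_field(person_row, lambda row: row["age"] + 10000)
nickname = resolve_field(nickname_row, nicknamer)
print(plain_age, shifted_age, nickname)
```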



#### Null and missing values

You can control how the prompt template behaves when a `None` value is encountered, or when a table has no row at all for a given primary key.

First, you can pass a boolean parameter `admit_nulls` to the prompt template.

Second, you can use the full four-element tuple format for the entries in the "field mapper": `(table_name, column_name_or_function, admit_nulls, default_value)`. The per-entry `admit_nulls` overrides the template-wide default.

In [None]:
field_mapper_n = {
    'user_age': ('people', 'age'),
    'city_nickname': ('nickname_by_city', 'nickname', True, '(no nickname)'),
}

cassPromptN = CassandraReaderPromptTemplate(
    session=session,
    keyspace=keyspace,
    field_mapper=field_mapper_n,
    template=ctemplate0,
    input_variables=["user_question"],
    admit_nulls=False,
)

# Note: there is no "tokyo" in the nicknames table
print(cassPromptN.format(city='tokyo', name='hideo',
                         user_question='What are we having for lunch?'))

Please answer a question from a user.
Keep in mind that the user's age is 144 and they live in a city with
nickname (no nickname).

USER'S QUESTION: What are we having for lunch?
YOUR ANSWER:



In [None]:
try:
    # Note: there are no rows with city='madrid' in the "people" table
    print(cassPromptN.format(city='madrid', name='alberto',
                             user_question='What are we having for lunch?'))
except Exception as e:
    print(f"Exception => {str(e)}")

Exception => Null data found for "user_age"
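
The behaviour seen in the two cells above can be approximated in plain Python. This is only a sketch under assumptions: `resolve_nullable` is a hypothetical helper, and treating a missing row the same way as a null column is an assumption suggested by the outputs above, not a statement about how the library is implemented:

```python
def resolve_nullable(variable, row_dict, column, admit_nulls=False, default=None):
    """Hypothetical helper mirroring the (column, admit_nulls, default) idea."""
    # A missing row (None) and a null column are treated the same way here.
    value = None if row_dict is None else row_dict.get(column)
    if value is not None:
        return value
    if admit_nulls:
        return default
    raise ValueError(f'Null data found for "{variable}"')


# No row for city='tokyo' in the nicknames table, but nulls are admitted:
nickname = resolve_nullable("city_nickname", None, "nickname", True, "(no nickname)")
print(nickname)

# No row for ('madrid', 'alberto') in the people table, and nulls are not admitted:
try:
    resolve_nullable("user_age", None, "age")
    message = None
except ValueError as e:
    message = str(e)
print(message)
```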


## Partialing Prompt Templates

Cassandra-powered prompt templates support partialing. Suppose you have just enough information to bind the template to the DB-lookup values: you can leave the `user_question` unspecified for later completion at "format-time":

In [None]:
cassPartialPrompt = cassPrompt.partial(city='lisbon', name='Pedro')

The partial prompt template will keep the provided inputs ready to execute the full lookup-and-format operation when needed:

In [None]:
print(cassPartialPrompt.format(user_question='Em verdade, o que quiseres?'))

Please answer a question from a user.
Keep in mind that the user's age is 1 and they live in a city with
nickname ACidade.

USER'S QUESTION: Em verdade, o que quiseres?
YOUR ANSWER:



You can partial on any choice of input variables, even mixing database-bound and regular inputs:

In [None]:
cassPartialPrompt2 = cassPrompt.partial(city='lisbon', user_question='Estou perto do Tejo?')

print(cassPartialPrompt2.format(name='Pedro'))

Please answer a question from a user.
Keep in mind that the user's age is 1 and they live in a city with
nickname ACidade.

USER'S QUESTION: Estou perto do Tejo?
YOUR ANSWER:
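
The same idea can be sketched with `functools.partial` from the standard library. The dictionaries are hypothetical in-memory stand-ins for the two tables (values taken from the output above) and `format_prompt` is an illustrative function, so this shows the pattern rather than the prompt template's actual code:

```python
from functools import partial

# Hypothetical in-memory stand-ins for the two tables (values from the output above).
PEOPLE = {("lisbon", "Pedro"): {"age": 1}}
NICKNAME_BY_CITY = {"lisbon": {"nickname": "ACidade"}}


def format_prompt(city, name, user_question):
    """Illustrative formatter: the lookups run only when this is finally called."""
    age = PEOPLE[(city, name)]["age"]
    nickname = NICKNAME_BY_CITY[city]["nickname"]
    return f"age={age}, nickname={nickname}, question={user_question}"


# Fix the DB keys now, supply the question later.
keys_fixed = partial(format_prompt, city="lisbon", name="Pedro")
# Mix a DB key and a regular input, supply the remaining key later.
mixed = partial(format_prompt, city="lisbon", user_question="Estou perto do Tejo?")

first = keys_fixed(user_question="Em verdade, o que quiseres?")
second = mixed(name="Pedro")
print(first)
print(second)
```

In both cases the stored arguments are simply kept aside, and the lookup runs once all inputs are available.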



## Chat Prompt Templates

The Cassandra-specific approach can be seamlessly integrated
with LangChain's "chat prompt templates", which represent a (template-based) way to manage chat exchanges.

Start with a prompt, not very different from what you've seen so far:

In [None]:
systemTemplate = """
You are a chat assistant, helping a user of age {user_age} from a city
they refer to as {city_nickname}.
"""

cassSystemPrompt = CassandraReaderPromptTemplate(
    session=session,
    keyspace=keyspace,
    template=systemTemplate,
    input_variables=[],
    field_mapper={
        'user_age': ('people', 'age'),
        'city_nickname': ('nickname_by_city', 'nickname'),
    },
)

Next, you need specific abstractions to wrap this "system prompt" as part of a broader chat exchange:

In [None]:
from langchain.prompts import (
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
)
systemMessagePrompt = SystemMessagePromptTemplate(prompt=cassSystemPrompt)

### A sequence of messages

Once you wrap a single prompt template as a "system message prompt", go ahead and make it part of a longer chat conversation:

In [None]:
humanTemplate = "{text}"
humanMessagePrompt = HumanMessagePromptTemplate.from_template(humanTemplate)

cassChatPrompt = ChatPromptTemplate.from_messages(
    [systemMessagePrompt, humanMessagePrompt]
)

### Formatting

LangChain takes care of correctly propagating the formatting steps throughout the sequence of messages, including the Cassandra-backed template:

In [None]:
print(cassChatPrompt.format_prompt(
    city='turin',
    name='beppe',
    text='Assistant, please help me!'
).to_string())

System: 
You are a chat assistant, helping a user of age 2 from a city
they refer to as CereaNeh.

Human: Assistant, please help me!
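
One way to picture this propagation is the plain-Python sketch below: every message receives the same set of inputs and uses only the ones it needs. All names here (`system_message`, `human_message`, `format_chat`) are hypothetical, the dictionaries stand in for the two tables, and the sketch does not reproduce LangChain's internals or its exact output layout:

```python
# Hypothetical in-memory stand-ins for the two tables (values from the output above).
PEOPLE = {("turin", "beppe"): {"age": 2}}
NICKNAME_BY_CITY = {"turin": {"nickname": "CereaNeh"}}


def system_message(city, name, **_unused):
    """DB-backed message: needs only the primary key values."""
    age = PEOPLE[(city, name)]["age"]
    nickname = NICKNAME_BY_CITY[city]["nickname"]
    return (
        f"You are a chat assistant, helping a user of age {age} "
        f"from a city they refer to as {nickname}."
    )


def human_message(text, **_unused):
    """Plain message: needs only the user's text."""
    return text


def format_chat(messages, **inputs):
    """Pass the same inputs to every message; each one picks what it needs."""
    return "\n".join(f"{role}: {render(**inputs)}" for role, render in messages)


chat = [("System", system_message), ("Human", human_message)]
rendered = format_chat(chat, city="turin", name="beppe", text="Assistant, please help me!")
print(rendered)
```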


### Partialing and Chat Prompt Templates

In some cases, you may want to partial with respect to the database lookup key(s) even within a chat prompt template:

In [None]:
cassChatPartialPrompt = cassChatPrompt.partial(
    city='turin',
    name='beppe'
)

In [None]:
print(cassChatPartialPrompt.format(text="Hahaha!"))

System: 
You are a chat assistant, helping a user of age 2 from a city
they refer to as CereaNeh.

Human: Hahaha!


## What now?

This demo is hosted [here](https://cassio.org/frameworks/langchain/prompt-templates-basic/) at cassio.org.

Discover the other ways you can integrate
Cassandra/Astra DB with your ML/GenAI needs,
right **within [your favorite framework](https://cassio.org/frameworks/langchain/about/)**.