# Demo: Templates

The base unit is a `Message(role, content)`, a structure that LLM chat APIs generally accept.

A list of Messages is a `Conversation`, which provides easy conversion to a messages array for API calls.

> _Hint: `yaaal` provides a `format_json()` function that pretty-formats JSON for printing, logging, and debugging._
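
To make the relationship between the two types concrete, here is a minimal standard-library sketch. The `Message` and `Conversation` classes below are hypothetical stand-ins, not `yaaal`'s actual classes, and `to_api_messages` is an assumed name for the conversion step; `json.dumps` stands in for `format_json()`.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class Message:
    """Hypothetical stand-in for a chat message: a role plus text content."""

    role: str
    content: str


@dataclass
class Conversation:
    """Hypothetical stand-in for a conversation: an ordered list of messages."""

    messages: list[Message] = field(default_factory=list)

    def to_api_messages(self) -> list[dict[str, str]]:
        # Chat APIs generally expect a list of {"role": ..., "content": ...} dicts.
        return [asdict(message) for message in self.messages]


conversation = Conversation(
    [
        Message("system", "You are a helpful assistant."),
        Message("user", "Hello!"),
    ]
)
api_messages = conversation.to_api_messages()

# Pretty-print the messages array for inspection.
print(json.dumps(api_messages, indent=2))
```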

Sometimes we may want to predefine the messages in the conversation via MessageTemplates.
A `MessageTemplate` defines the role, the template, and the rendering method to generate a Message.
It may also add variable validation with Pydantic through the `validation_model` attribute.

- `StaticMessageTemplate` provides a fixed prompt: it has no template variables and renders exactly the same string every time.
- `StringMessageTemplate` uses string templates (_`$varname`, not `{varname}`!_) to render a templated string based on a dict provided at render-time.
- `JinjaMessageTemplate` uses a jinja2 Template to render a templated string based on a dict provided at render-time.
- `UserMessageTemplate` uses `StringMessageTemplate` for user message passthrough with `$user` var

A `ConversationTemplate` is a way to use various MessageTemplates to render a `Conversation`.
We may want to treat a ConversationTemplate as a function or tool for a tool-calling API,
so ConversationTemplate provides a `signature` that mocks a Pydantic model representation of the function signature and a `schema` that provides the JSON schema.

In [None]:
import logging

from pydantic import BaseModel, Field, create_model

from yaaal.core.template import (
    ConversationTemplate,
    JinjaMessageTemplate,
    StaticMessageTemplate,
    StringMessageTemplate,
)
from yaaal.types.base import JSON
from yaaal.types.core import Conversation, Message
from yaaal.utilities import basic_log_config, format_json

In [None]:
%load_ext autoreload
%autoreload 2

In [None]:
basic_log_config()
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)

## Quick Start

A `StaticMessageTemplate` provides a fixed prompt: it has no template variables and renders exactly the same string every time.

In [None]:
template = StaticMessageTemplate(role="system", template="You are a helpful assistant.")
template.render()

A `StringMessageTemplate` uses string templates (_`$varname` or `${varname}`, not `{varname}`!_) to render a templated string based on a dict provided at render-time.

In [None]:
template = StringMessageTemplate(role="system", template="You are a helpful assistant who specializes in $expertise.")
template.render(template_vars={"expertise": "Star Wars trivia"})
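
The `$varname` syntax matches Python's standard-library `string.Template`. Assuming `StringMessageTemplate` builds on it (an assumption, not something this notebook confirms), the sketch below shows the substitution rules directly: `$name` and `${name}` are replaced, `{name}` is left untouched, and a missing variable raises `KeyError`.

```python
from string import Template

template = Template("You are a helpful assistant who specializes in $expertise.")
rendered = template.substitute({"expertise": "Star Wars trivia"})
print(rendered)

# ${name} is substituted as well; plain {name} is treated as literal text.
braces = Template("Specialty: ${expertise}; literal: {expertise}")
rendered_braces = braces.substitute(expertise="Star Wars trivia")
print(rendered_braces)

# substitute() raises KeyError when a variable is missing.
missing = None
try:
    template.substitute({})
except KeyError as err:
    missing = err.args[0]
print(f"missing variable: {missing}")
```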

A `JinjaMessageTemplate` uses a jinja2 Template to render a templated string based on a dict provided at render-time.

In [None]:
template = JinjaMessageTemplate(
    role="system", template="You are a helpful assistant who specializes in {{expertise}}."
)
template.render(template_vars={"expertise": "Star Wars trivia"})
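
For comparison, here is what the same kind of rendering looks like with `jinja2` used directly. This is a sketch of the templating language only, assuming `JinjaMessageTemplate` hands the template string and variables to jinja2 more or less as-is. It also shows a `for` loop, which string templates cannot express.

```python
from jinja2 import Template

# Simple variable substitution with {{ ... }}.
simple = Template("You are a helpful assistant who specializes in {{ expertise }}.")
rendered_simple = simple.render(expertise="Star Wars trivia")
print(rendered_simple)

# Control flow with {% ... %}: loop over a list of values.
looped = Template(
    "<sources>\n"
    "{% for source in sources %}"
    "<source>{{ source }}</source>\n"
    "{% endfor %}"
    "</sources>"
)
rendered_loop = looped.render(sources=["first page text", "second page text"])
print(rendered_loop)
```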

Note that `yaaal` has logged a warning message when we rendered our `StringMessageTemplate` and `JinjaMessageTemplate` messages.
This is because we did not provide a `validation_model` - a Pydantic model that defines the expectations for template variables.

Let's create a Pydantic model that defines what we expect to accept as input.

In [None]:
class Expertise(BaseModel):
    expertise: str

In [None]:
template = JinjaMessageTemplate(
    role="system",
    template="You are a helpful assistant who specializes in {{expertise}}.",
    validation_model=Expertise,
)
template.render(template_vars={"expertise": "Star Wars trivia"})

# No warning!

When we provide the validation model to the template, we no longer get the validation warning.
Less obviously, our input is now validated as well!

We can test this by using an invalid input, which will raise a `ValidationError`.

In [None]:
# An invalid input will raise a ValidationError
template = JinjaMessageTemplate(
    role="system",
    template="You are a helpful assistant who specializes in {{expertise}}.",
    validation_model=Expertise,
)
template.render(template_vars={"expertise": 8675309})
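
The error comes from Pydantic itself. The sketch below reproduces the check without `yaaal`, assuming Pydantic v2 (which this notebook's use of `model_json_schema()` implies): in v2, an `int` is not coerced to `str`, so validation fails.

```python
from pydantic import BaseModel, ValidationError


class Expertise(BaseModel):
    expertise: str


# A valid input passes validation and is available as an attribute.
valid = Expertise.model_validate({"expertise": "Star Wars trivia"})
print(valid.expertise)

# An integer is rejected for a `str` field in Pydantic v2.
errors = []
try:
    Expertise.model_validate({"expertise": 8675309})
except ValidationError as err:
    errors = err.errors()

for error in errors:
    print(error["loc"], error["type"], error["msg"])
```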

## Example

Objective: Define a `ConversationTemplate` that summarizes web content, with validation (this is a replica of the Summarizer provided as one of `yaaal`'s default ConversationTemplates).

- Define system prompt template
- Define user prompt template
- Define output format

### Templates

It is often easiest to start by drafting the instructions / system template before defining input/output validators.
Ultimately, the order doesn't particularly matter, except that all of the moving pieces must be defined before we use them with a `ConversationTemplate`.

> _Hint:_ [OpenAI](https://platform.openai.com/docs/guides/prompt-generation) and [Anthropic](https://www.anthropic.com/news/prompt-improver) provide meta-prompts that can help generate a well-defined set of instructions.

In [None]:
# this is a jinja string.
# Jinja is a powerful templating language that lets us do things like loop over variables (see 'for source in sources' at end)
summarizer_system_template_str = """
You are an AI research assistant. Your task is to summarize a piece of content and synthesize key takeaways. The user may provide additional guidance for topics of interest or directions for investigation.

Please follow these steps to complete your task:

1. Carefully read and analyze the provided content.
2. Summarize the main points of the content. Your summary should be detailed and comprehensive, capturing the essence of the content and the source's relevance with respect to the user's guidance.
3. If it exists, consider the user-provided guidance and ensure that your summary and analysis address the specified topics of interest or directions for investigation.
4. The summary may use up to three paragraphs to highlight the main idea, argument or goal, clarify critical information, and identify actionable insights or key takeaways.
5. Present your analysis adhering to the following json schema:

<schema>
{{summary_schema}}
</schema>

Here is the source you need to analyze:

<sources>
{% for source in sources %}
    <source>
    {{source}}
    </source>
{% endfor %}
</sources>
""".strip()

We will create Pydantic BaseModels to define our expectations around the source (`URLContent`, input) and response (`Summary`, output) schemas. Note that the Summary schema used to validate the model response is also used to tell the model how to respond in the system template!

In [None]:
# Assume our "sources" come as URLContent objects
class URLContent(BaseModel, extra="ignore"):
    """Text content from a webpage."""

    url: str = Field(description="The webpage url")
    title: str = Field(description="The page title")
    content: str = Field(description="The webpage's text content")


# We want our output to have the Summary structure
class Summary(BaseModel, extra="ignore"):
    url: str  # Annotated[str, AnyHttpUrl]
    title: str
    summary: str = Field(
        description="A comprehensive but concise summary of the source content that captures the essence of the original information."
    )


# This will ensure all inputs into the system template are valid
class SummarizerSystemVarsValidator(BaseModel):
    sources: list[URLContent] = Field(description="The text to be analyzed", min_length=1)
    summary_schema: dict[str, JSON] = Summary.model_json_schema()

Now we can construct and test the system prompt template

In [None]:
summarizer_system_template = JinjaMessageTemplate(
    role="system",
    template=summarizer_system_template_str,
    validation_model=SummarizerSystemVarsValidator,
)

In [None]:
# per our SummarizerSystemVarsValidator, our system template expects:
# - sources, list of URLContent objects
# - summary_schema, the json schema for the output (which is provided by default)
summarizer_system_template.render(
    template_vars={
        "sources": [
            URLContent(
                url="http://this.is/an/example",
                title="example",
                content="Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua",
            )
        ],
        # "summary_schema": Summary.model_json_schema(), # this has a default value in SummarizerSystemVarsValidator
    }
)

In [None]:
# we use a passthrough prompt to allow the user to provide their input using string templates
# a `PassthroughMessageTemplate` exists specifically for this reason;
# This example just recreates it.
summarizer_user_template_str = "$content"


# This will ensure all inputs into the user template are valid
class SummarizerUserVarsValidator(BaseModel):
    content: str


summarizer_user_template = StringMessageTemplate(
    role="user",
    template=summarizer_user_template_str,
    validation_model=SummarizerUserVarsValidator,
)

In [None]:
summarizer_user_template.render({"content": "Tell me about quantum entanglement."})

Great! We can use templates to render messages that change based on the variables we've configured, and we have validators that check to make sure the inputs are what we expect.

Now, we want to combine the message templates into a conversation so we can send the whole thing to an LLM to receive a response.

A `ConversationTemplate` is a way to use various MessageTemplates to render a `Conversation`.
ConversationTemplates render the Conversation based on a conversation_spec, a list of MessageTemplates and/or Messages that mimic the desired conversation.

In [None]:
summarizer_prompt = ConversationTemplate(
    name="Summarizer",
    description="Summarizes the content of web page(s)",
    conversation_spec=[summarizer_system_template, summarizer_user_template],
)

A `ConversationTemplate` does some magic behind the scenes to flatten the template variables of all its MessageTemplates. This makes it easy to pass `ConversationTemplate.render()` a single dictionary with all values needed to render every MessageTemplate.

In [None]:
summarizer_prompt.render(
    {
        "sources": [
            URLContent(
                url="http://this.is/an/example",
                title="example",
                content="Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua",
            )
        ],
        "content": "Tell me about quantum entanglement.",
    }
)
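
One way to picture the flattening is a renderer that hands each template only the variables it names, all drawn from one shared dictionary. The sketch below uses standard-library string templates; `variable_names` and `render_conversation` are hypothetical helpers for illustration, not `yaaal`'s implementation.

```python
from string import Template


def variable_names(template: Template) -> set[str]:
    """Collect the $name / ${name} identifiers used in a string template."""
    names = set()
    for match in template.pattern.finditer(template.template):
        name = match.group("named") or match.group("braced")
        if name:
            names.add(name)
    return names


def render_conversation(spec, variables):
    """Render each (role, template) pair from a single flat dict of variables."""
    messages = []
    for role, template in spec:
        # Pick out only the variables this template needs.
        needed = {name: variables[name] for name in variable_names(template)}
        messages.append({"role": role, "content": template.substitute(needed)})
    return messages


spec = [
    ("system", Template("You are a helpful assistant who specializes in $expertise.")),
    ("user", Template("$content")),
]
messages = render_conversation(
    spec,
    {"expertise": "physics", "content": "Tell me about quantum entanglement."},
)
for message in messages:
    print(message)
```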

You may notice that we had to provide `name` and `description` arguments to the `ConversationTemplate`.

This is because we may want to treat the `ConversationTemplate` as a tool for function-calling.  Tool use works best when the tools have a descriptive name and detailed description about their function so the LLM can determine when they are appropriate to use.

Concretely, `ConversationTemplate.signature()` returns a Pydantic model that defines the function signature of `ConversationTemplate.render()` for this use case.
We can convert the signature to a JSON schema with `model_json_schema()`, or use something like OpenAI's Pydantic integration with `openai.pydantic_function_tool(summarizer_prompt.signature())`.

In [None]:
# this is the BaseModel
display(summarizer_prompt.signature)
display(type(summarizer_prompt.signature))
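
A combined signature model like this can be built with `pydantic.create_model`. The sketch below merges the fields of two validation models into one; it assumes Pydantic v2, and `merge_models`, `SystemVars`, and `UserVars` are hypothetical names, so it may well differ from how `yaaal` does it.

```python
from pydantic import BaseModel, create_model


class SystemVars(BaseModel):
    """Hypothetical variables for a system template."""

    expertise: str


class UserVars(BaseModel):
    """Hypothetical variables for a user template."""

    content: str


def merge_models(name: str, *models: type[BaseModel]) -> type[BaseModel]:
    """Build one model whose fields are the union of the given models' fields."""
    fields = {}
    for model in models:
        for field_name, field_info in model.model_fields.items():
            # create_model accepts (annotation, FieldInfo) tuples as field definitions.
            fields[field_name] = (field_info.annotation, field_info)
    return create_model(name, **fields)


Signature = merge_models("Summarizer", SystemVars, UserVars)
signature_schema = Signature.model_json_schema()
print(signature_schema["title"], signature_schema["required"])
```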

In [None]:
# this is the json schema
print(format_json(summarizer_prompt.schema))

The schema shows the required variables that we passed:

```json
{
  ...,
  "required": [
    "sources",
    "content"
  ],
  ...
}
```
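
To see where `name`, `description`, and the schema end up, the sketch below wraps them in the tool definition shape used by OpenAI's Chat Completions API. The `as_tool` helper is hypothetical, and the `parameters_schema` here is a hand-written, simplified stand-in rather than the real output of `summarizer_prompt.schema`.

```python
import json


def as_tool(name: str, description: str, parameters_schema: dict) -> dict:
    """Wrap a name, description, and JSON schema in a function-tool definition."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": parameters_schema,
        },
    }


# Simplified, hand-written schema for illustration only.
parameters_schema = {
    "type": "object",
    "properties": {
        "sources": {"type": "array", "items": {"type": "object"}},
        "content": {"type": "string"},
    },
    "required": ["sources", "content"],
}

tool = as_tool("Summarizer", "Summarizes the content of web page(s)", parameters_schema)
print(json.dumps(tool, indent=2))
```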