# LangChain: A library for building language model applications

In [142]:
from dotenv import load_dotenv
from enum import Enum
from pydantic import BaseModel, Field
from langchain.chat_models import ChatOpenAI
from langchain.llms.openai import OpenAI
from langchain.llms.ollama import Ollama
from langchain.llms.huggingface_hub import HuggingFaceHub
from langchain.schema import HumanMessage, StrOutputParser, BaseOutputParser
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.prompts import PromptTemplate, FewShotChatMessagePromptTemplate
from langchain.prompts.few_shot import FewShotPromptTemplate
from langchain.prompts.chat import ChatPromptTemplate, HumanMessagePromptTemplate, SystemMessagePromptTemplate
from langchain.prompts.example_selector import SemanticSimilarityExampleSelector
from langchain.vectorstores.chroma import Chroma
from langchain.embeddings import OpenAIEmbeddings
from langchain.output_parsers import (
  PydanticOutputParser, 
  CommaSeparatedListOutputParser,
  DatetimeOutputParser,
  EnumOutputParser,
)

In [28]:
load_dotenv()

True

At its core, LangChain is organized around a `chain`, which combines three components:
1. LLM
2. Prompt
3. Parser
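As a rough mental model, the three parts above can be sketched in plain Python without LangChain. This is only an illustration under stated assumptions: `format_prompt`, `fake_llm`, `parse_output`, and `run_chain` are hypothetical names, and the "model" is a stub that returns a canned string.

```python
from typing import List

# A LangChain-free sketch of prompt -> LLM -> parser.
# All names are hypothetical; fake_llm is a stub, not a real model.

def format_prompt(product: str) -> str:
    """Prompt: turn user input into the text sent to the model."""
    return f"What is a good company that makes {product}?"

def fake_llm(prompt: str) -> str:
    """LLM: a stand-in that ignores the prompt and returns a canned completion."""
    return "\n\nTesla, Toyota, Ford\n"

def parse_output(text: str) -> List[str]:
    """Parser: turn the raw completion into a Python list."""
    return [item.strip() for item in text.strip().split(",")]

def run_chain(product: str) -> List[str]:
    """Chain: run the three steps in order."""
    return parse_output(fake_llm(format_prompt(product)))

result = run_chain("cars")
print(result)
```

Each stage only needs to agree with its neighbor on the type it hands over, which is what makes the pieces swappable.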

In its most basic form, with the default parser, you can invoke an LLM using:

In [12]:
llm = OpenAI()
print(llm("What do you think of the color green?"))

{
  "id": "cmpl-8UcGxfe7lXyRNq6e3JJpbuEHVnWn9",
  "object": "text_completion",
  "created": 1702307651,
  "model": "text-davinci-003",
  "choices": [
    {
      "text": "\n\nI think green is a beautiful and vibrant color. It symbolizes nature, growth, and prosperity, and is a great choice for many color schemes.",
      "index": 0,
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 32,
    "total_tokens": 41
  }
}



I think green is a beautiful and vibrant color. It symbolizes nature, growth, and prosperity, and is a great choice for many color schemes.


## LLM

There are two types of language models:
* `LLM`: a model that takes a string as input and returns a string
* `ChatModel`: a model that takes a list of messages as input and returns a message

The basic `LLM` is often referred to as an `Instruct` model, whereas the other is referred to as a `Chat` model. Ultimately, both are foundation models that have been fine-tuned, one for following instructions and the other for conversation.
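The difference between the two interfaces can be sketched with type signatures. This is a simplified illustration, not LangChain's actual class hierarchy: `Message`, `echo_llm`, and `echo_chat_model` are hypothetical names, and both "models" are stubs that simply echo their input.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Message:
    role: str      # e.g. "system", "human", or "ai"
    content: str

def echo_llm(prompt: str) -> str:
    """LLM-style interface: string in, string out."""
    return f"You said: {prompt}"

def echo_chat_model(messages: List[Message]) -> Message:
    """Chat-style interface: list of messages in, one message out."""
    # Reply to the most recent human message in the conversation.
    last_human = [m for m in messages if m.role == "human"][-1]
    return Message(role="ai", content=f"You said: {last_human.content}")

text_reply = echo_llm("hello")
chat_reply = echo_chat_model([
    Message("system", "You are a helpful assistant."),
    Message("human", "hello"),
])
print(text_reply)
print(chat_reply)
```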

We already saw the basic usage of an LLM. Here is a simple example of a ChatModel that uses messages: a `HumanMessage` is passed in and an `AIMessage` is returned. All messages derive from `BaseMessage`, which carries a role (exposed as the `type` attribute) and `content`:

In [18]:
llm = ChatOpenAI()
input_message = HumanMessage(content="how many days are in a year?")
llm([input_message])

{
  "id": "chatcmpl-8UcLScfkblk0G97doOy137W1acGYv",
  "object": "chat.completion",
  "created": 1702307930,
  "model": "gpt-3.5-turbo-0613",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "There are 365 days in a year, except in a leap year when there are 366 days."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 15,
    "completion_tokens": 21,
    "total_tokens": 36
  },
  "system_fingerprint": null
}



AIMessage(content='There are 365 days in a year, except in a leap year when there are 366 days.')

We are also not limited to hosted LLMs. Here is an example using `Ollama` with the `Mistral` LLM running locally:

In [22]:
llm = Ollama(model="mistral")
print(llm("The first man on the moon was ..."))

The first man on the moon was Neil Armstrong.


We can also stream the LLM response instead of waiting for the entire text to be generated:

In [25]:
llm = Ollama(
  model="mistral",
  callback_manager=CallbackManager([StreamingStdOutCallbackHandler()])
)
llm("Who is Elon Musk?")
print()

Elon Musk is an entrepreneur and business magnate, born on June 28, 1971 in South Africa. He is known for his advancements in sustainable technology, space exploration, and electric automobiles. Musk is the founder and CEO of multiple successful companies including Tesla Inc., SpaceX, Neuralink, and The Boring Company. He has also been involved in various other ventures such as SolarCity and Zip2. Musk is widely considered to be one of the most influential figures in the tech industry and has been named one of Time Magazine's 100 most influential people multiple times.
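Conceptually, streaming means the caller is handed each piece of text through a callback as it is produced, rather than one final string. A minimal sketch of that idea, assuming a stubbed model: `fake_token_stream`, `generate`, and `handler` are hypothetical names and do not mirror LangChain's callback API.

```python
import sys
from typing import Callable, Iterator, List

def fake_token_stream(text: str) -> Iterator[str]:
    """Stand-in for a model that yields its answer piece by piece."""
    for word in text.split(" "):
        yield word + " "

def generate(prompt: str, on_token: Callable[[str], None]) -> str:
    """Call on_token for every piece, then return the full text.

    The prompt is ignored by this stub; a real model would condition on it.
    """
    pieces: List[str] = []
    for token in fake_token_stream("Neil Armstrong was the first man on the moon."):
        on_token(token)        # the caller sees each piece immediately
        pieces.append(token)
    return "".join(pieces).strip()

seen: List[str] = []

def handler(token: str) -> None:
    """Record the token and write it to stdout right away."""
    seen.append(token)
    sys.stdout.write(token)
    sys.stdout.flush()

full = generate("Who was the first man on the moon?", handler)
print()
```

The return value is still the complete text, so streaming changes when the caller sees output, not what the call returns.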


It is also easy to use models hosted on Hugging Face through their `huggingface_hub` library; just make sure `HUGGINGFACEHUB_API_TOKEN` is set in the environment. Responses can be slow, since the models run on shared infrastructure:

In [None]:
llm = HuggingFaceHub(
  repo_id="google/flan-t5-xl", 
  model_kwargs={"temperature": 1}
)
llm("translate English to German: Hello, my name is John.", raw_response=True)

## Prompt

Prompts are the instructions to the LLM. There are two tools provided by LangChain for prompts:
1. `Prompt Templates`: parameterized prompts
2. `Example Selectors`: dynamically select examples to include in the prompts

Up to this point, the prompts have been simple strings. In practice, prompts are usually parameterized:

In [37]:
prompt = PromptTemplate.from_template("What is a good company that makes {product}?")
print(prompt.format(product="cars"))

What is a good company that makes cars?
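`from_template` infers the input variables from the `{placeholders}` in the string. Roughly the same idea can be sketched with the standard library's `string.Formatter`. `SimpleTemplate` is a hypothetical class for illustration, not LangChain's implementation.

```python
from string import Formatter
from typing import List

class SimpleTemplate:
    """Toy prompt template that discovers its own variables."""

    def __init__(self, template: str):
        self.template = template
        # Formatter().parse yields (literal_text, field_name, format_spec, conversion);
        # field_name is None for trailing literal text, so filter those out.
        self.input_variables: List[str] = [
            field for _, field, _, _ in Formatter().parse(template) if field
        ]

    def format(self, **kwargs: str) -> str:
        missing = set(self.input_variables) - set(kwargs)
        if missing:
            raise KeyError(f"missing variables: {sorted(missing)}")
        return self.template.format(**kwargs)

template = SimpleTemplate("What is a good company that makes {product}?")
print(template.input_variables)
print(template.format(product="cars"))
```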


The `PromptTemplate` works with basic strings but you can also use the more powerful `ChatPromptTemplate` which works with messages and `Chat` models. The types of possible messages are:

1. System
2. Human
3. AI

In [54]:
prompt = ChatPromptTemplate.from_messages([
  ("system", "You are able to translate from {in_language} to {out_language}."),
  ("human", "{text}")
])
print(prompt.format(in_language="English", out_language="German", text="Hello, my name is John."))

System: You are able to translate from English to German.
Human: Hello, my name is John.


The above is just shorthand for explicit message objects, which can be either fixed messages or message prompt templates:

In [57]:
prompt = ChatPromptTemplate.from_messages([
  SystemMessagePromptTemplate(prompt=PromptTemplate.from_template("You are able to translate from {in_language} to {out_language}.")),
  HumanMessage(content="USER:"),
  HumanMessagePromptTemplate(prompt=PromptTemplate.from_template("{text}")),
])
print(prompt.format(in_language="English", out_language="German", text="Hello, my name is John."))

System: You are able to translate from English to German.
Human: USER:
Human: Hello, my name is John.


Prompt templates also implement the `Runnable` interface, which is how they can be used with the LangChain Expression Language (LCEL):

In [66]:
prompt = PromptTemplate.from_template("My name is {name}?")
prompt.invoke({"name": "John"})

StringPromptValue(text='My name is John?')

In [68]:
prompt = ChatPromptTemplate.from_messages([('human', 'My name is {name}?')])
prompt.invoke({"name": "John"})

ChatPromptValue(messages=[HumanMessage(content='My name is John?')])
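The `|` composition that LCEL relies on can be imitated in plain Python by overloading `__or__`. This is a toy illustration only: `Step` is a hypothetical class and is far simpler than LangChain's `Runnable`.

```python
from typing import Any, Callable

class Step:
    """Wraps a function and supports `a | b` composition."""

    def __init__(self, func: Callable[[Any], Any]):
        self.func = func

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # The combined step feeds this step's output into the next step.
        return Step(lambda value: other.invoke(self.invoke(value)))

prompt_step = Step(lambda d: "My name is {name}?".format(**d))
shout = Step(lambda s: s.upper())            # stand-in for a model call
strip_mark = Step(lambda s: s.rstrip("?"))   # stand-in for a parser

chain = prompt_step | shout | strip_mark
result = chain.invoke({"name": "John"})
print(result)
```

Because every step exposes the same `invoke` method, the composed chain is itself a step and can be piped further.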

It is also very common to include a few examples within a prompt, referred to as `one-shot` or `few-shot` examples. The most basic way of doing that:

In [75]:
examples = [
  {
    "question": "is the name Brian a cool name?",
    "answer": 
"""
The length of the name Brian is 5 characters.
Because the name has an odd length, it is NOT a cool name.
"""
  },  
  {
    "question": "is the name Tami a cool name?",
    "answer": 
"""
The length of the name Tami is 4 characters.
Because the name has an even length, it is a cool name.
"""
  },  
  {
    "question": "is the name Jason a cool name?",
    "answer": 
"""
The length of the name Jason is 5 characters.
Because the name has an odd length, it is NOT a cool name.
"""
  },  
  {
    "question": "is the name Nick a cool name?",
    "answer": 
"""
The length of the name Nick is 4 characters.
Because the name has an even length, it is a cool name.
"""
  },
]

In [78]:
example_prompt = PromptTemplate.from_template("Question: {question}\n{answer}")
print(example_prompt.format(**examples[0]))

Question: is the name Brian a cool name?

The length of the name Brian is 5 characters.
Because the name has an odd length, it is NOT a cool name.



In [81]:
prompt = FewShotPromptTemplate(
  examples=examples,
  example_prompt=example_prompt,
  suffix="Question: {input}",
  input_variables=["input"]
)
print(prompt.format(input="is the name Jack a cool name?"))

Question: is the name Brian a cool name?

The length of the name Brian is 5 characters.
Because the name has an odd length, it is NOT a cool name.


Question: is the name Tami a cool name?

The length of the name Tami is 4 characters.
Because the name has an even length, it is a cool name.


Question: is the name Jason a cool name?

The length of the name Jason is 5 characters.
Because the name has an odd length, it is NOT a cool name.


Question: is the name Nick a cool name?

The length of the name Nick is 4 characters.
Because the name has an even length, it is a cool name.


Question: is the name Jack a cool name?


This works fine if you want to include all the examples in every prompt. However, if you only want to select some of the examples, you need to use an `ExampleSelector`. In this case, we will use the `SemanticSimilarityExampleSelector`, which decides which examples to include based on the similarity between the input and the examples:

In [86]:
example_selector = SemanticSimilarityExampleSelector.from_examples(
  examples, # examples to select from
  OpenAIEmbeddings(), # used to create the embeddings
  Chroma, # VectorStore class
  k=1 # number of examples to produce; one-shot in this case
)

Now we can select the examples based on a new question:

In [87]:
question = "is the name Jack a cool name?"
example_selector.select_examples({"question": question})

[{'answer': '\nThe length of the name Nick is 4 characters.\nBecause the name has an even length, it is a cool name.\n',
  'question': 'is the name Nick a cool name?'}]
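The selection step boils down to: embed the query, embed each example, and keep the `k` most similar. A minimal sketch of that ranking, assuming a toy word-count "embedding" in place of a learned model: `embed`, `cosine`, `select_examples`, and `toy_examples` are all hypothetical, and real embeddings capture meaning rather than shared words.

```python
import math
import re
from collections import Counter
from typing import Dict, List

def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. Real selectors use learned vectors."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def select_examples(query: str, examples: List[Dict[str, str]], k: int = 1) -> List[Dict[str, str]]:
    """Return the k examples whose question is most similar to the query."""
    q = embed(query)
    ranked = sorted(examples, key=lambda ex: cosine(q, embed(ex["question"])), reverse=True)
    return ranked[:k]

# Hypothetical examples, chosen so that word overlap is informative.
toy_examples = [
    {"question": "what is the capital of France?", "answer": "Paris"},
    {"question": "how many days are in a year?", "answer": "365"},
    {"question": "what color is the sky?", "answer": "blue"},
]

selected = select_examples("how many days are in a week?", toy_examples, k=1)
print(selected)
```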

Using the example selector we can now define a `FewShotPromptTemplate` without passing in all the examples:

In [88]:
prompt = FewShotPromptTemplate(
  example_selector=example_selector,
  example_prompt=example_prompt,
  suffix="Question: {input}",
  input_variables=["input"]
)

print(prompt.format(input="Is Brian a cool name?"))

Now that we have the prompt template defined, let's use it with an LLM:

In [93]:
llm = Ollama(model="mistral")
chain = prompt | llm
chain.invoke({"input": "is the name Nick a cool name?"})

'\nThe name Nick has a length of 4 characters, which makes it an even-length name. Therefore, according to the given criteria, the name Nick is considered cool. However, it\'s important to note that what constitutes a "cool" name can be subjective and may vary from person to person.'

Using examples with a `Chat` model is slightly different, but not by much.

Here is the simplest example, where the output of a `FewShotChatMessagePromptTemplate` is included in every prompt. It also demonstrates that the few-shot prompt template doesn't need an `input` suffix of its own; it can be embedded within another prompt template:

In [95]:
examples = [
  {"input": "2+2", "output": "4"},
  {"input": "2+3", "output": "5"}
]

In [97]:
example_prompt = ChatPromptTemplate.from_messages([
  ("human", "{input}"),
  ("ai", "{output}")
])
few_shot_prompt = FewShotChatMessagePromptTemplate(
  example_prompt=example_prompt,
  examples=examples
)
print(few_shot_prompt.format())

Human: 2+2
AI: 4
Human: 2+3
AI: 5


In [100]:
chat_prompt = ChatPromptTemplate.from_messages([
  ("system", "You are a wizard of math."),
  few_shot_prompt,
  ("human", "{input}")
])
print(chat_prompt.format(input="5+2"))

System: You are a wizard of math.
Human: 2+2
AI: 4
Human: 2+3
AI: 5
Human: 5+2
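The expansion shown in the output above is mechanical: each example becomes a human/AI message pair, placed between the system message and the final human message. A plain-Python sketch of that assembly, where `build_messages` and the `(role, content)` tuple format are hypothetical simplifications:

```python
from typing import Dict, List, Tuple

Message = Tuple[str, str]  # (role, content)

def build_messages(system: str, examples: List[Dict[str, str]], user_input: str) -> List[Message]:
    """Assemble system message, few-shot pairs, and the final user message."""
    messages: List[Message] = [("system", system)]
    for ex in examples:
        messages.append(("human", ex["input"]))   # example question
        messages.append(("ai", ex["output"]))     # example answer
    messages.append(("human", user_input))
    return messages

messages = build_messages(
    "You are a wizard of math.",
    [{"input": "2+2", "output": "4"}, {"input": "2+3", "output": "5"}],
    "5+2",
)
for role, content in messages:
    print(f"{role}: {content}")
```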


Now we will look at dynamically selected examples, which we have already seen. This time, I will also show how to work with a `VectorStore` directly to create an example selector from mixed examples, as might be seen in a chat history:

In [108]:
examples = [
    {"input": "2+2", "output": "4"},
    {"input": "2+3", "output": "5"},
    {"input": "2+4", "output": "6"},
    {"input": "Who are you?", "output": "My name is Mistral."},
    {"input": "Hello", "output": "Hello, my name is Mistral."}
]
to_vectorize = [" ".join(e.values()) for e in examples]
print(to_vectorize)

['2+2 4', '2+3 5', '2+4 6', 'Who are you? My name is Mistral.', 'Hello Hello, my name is Mistral.']


In [109]:
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_texts(to_vectorize, embeddings, metadatas=examples)

In [110]:
example_selector = SemanticSimilarityExampleSelector(
  vectorstore=vectorstore,
  k=2
)
example_selector.select_examples({"input": "2+2"})

[{'input': '2+2', 'output': '4'}, {'input': '2+4', 'output': '6'}]
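The key detail in this setup is that the vector store indexes the joined text but returns the original example dictionary, which was stored alongside it as metadata. A toy in-memory sketch of that pattern follows. `ToyVectorStore`, `embed`, and `cosine` are hypothetical, and the word-count "embedding" is only a stand-in, so its ranking will not match a real embedding model in general.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple

def embed(text: str) -> Counter:
    """Toy 'embedding': counts of word-like tokens."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class ToyVectorStore:
    """Stores (vector, metadata) rows and returns metadata for the nearest rows."""

    def __init__(self) -> None:
        self._rows: List[Tuple[Counter, Dict[str, str]]] = []

    def add(self, text: str, metadata: Dict[str, str]) -> None:
        self._rows.append((embed(text), metadata))

    def search(self, query: str, k: int) -> List[Dict[str, str]]:
        q = embed(query)
        ranked = sorted(self._rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [metadata for _, metadata in ranked[:k]]

toy_examples = [
    {"input": "2+2", "output": "4"},
    {"input": "2+3", "output": "5"},
    {"input": "Who are you?", "output": "My name is Mistral."},
    {"input": "Hello", "output": "Hello, my name is Mistral."},
]

store = ToyVectorStore()
for ex in toy_examples:
    # Index the joined text, keep the original dict as metadata.
    store.add(" ".join(ex.values()), ex)

math_hits = store.search("2+2", k=2)
greeting_hits = store.search("Hello", k=1)
print(math_hits)
print(greeting_hits)
```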

We will create the few-shot prompt template using the example selector and example prompt:

In [113]:
few_shot_prompt = FewShotChatMessagePromptTemplate(
  input_variables=["input"],
  example_selector=example_selector,
  example_prompt=ChatPromptTemplate.from_messages([
    ("human", "{input}"),
    ("ai", "{output}")
  ])
)

print(few_shot_prompt.format(input="2+7"))

In [114]:
final_prompt = ChatPromptTemplate.from_messages([
  ("system", "You are a wizard of math."),
  few_shot_prompt,
  ("human", "{input}")
])
print(final_prompt.format(input="5+2"))

Finally, we will use the prompt with a `Chat` model:

In [117]:
llm = ChatOpenAI()
chain = final_prompt | llm
chain.invoke({"input": "5+2"})

AIMessage(content='7')

## Output Parsers

Output parsers convert the raw output from the language model into the format you want to work with. Most models return a string, and the default, most basic parser is the `StrOutputParser`. All parsers are based on the `BaseOutputParser` interface and have a `parse()` method. Here is a simple example:

In [120]:
StrOutputParser().parse("my output")

'my output'

If you want structured output, such as JSON, you need to define the data structure with `pydantic`:

In [125]:
class Joke(BaseModel):
  setup: str = Field(description="set up for the joke")
  punchline: str = Field(description="punchline for the joke")

In [126]:
parser = PydanticOutputParser(pydantic_object=Joke)
parser.parse('{"setup": "What do you call a bear with no teeth?", "punchline": "A gummy bear!"}')

Joke(setup='What do you call a bear with no teeth?', punchline='A gummy bear!')
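At its simplest, a structured parser decodes the JSON text and then checks that the expected fields are present before building a typed object. A standard-library sketch of that idea, assuming a hypothetical `JokeRecord` dataclass and `parse_joke` function; `pydantic` does considerably more, including type coercion and detailed validation errors.

```python
import json
from dataclasses import dataclass, fields

@dataclass
class JokeRecord:
    setup: str
    punchline: str

def parse_joke(text: str) -> JokeRecord:
    """Decode JSON and verify every expected field is present."""
    data = json.loads(text)
    names = [f.name for f in fields(JokeRecord)]
    missing = [n for n in names if n not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return JokeRecord(**{n: str(data[n]) for n in names})

joke = parse_joke('{"setup": "What do you call a bear with no teeth?", "punchline": "A gummy bear!"}')
print(joke)

# Incomplete model output is rejected instead of silently accepted.
try:
    parse_joke('{"setup": "only half a joke"}')
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```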

Let's see it all together in a chain:

In [127]:
llm = ChatOpenAI()
prompt = ChatPromptTemplate.from_messages([
  ("system", "You create jokes with a <setup> and a <punchline> about the topic provided by the user. Return the joke as JSON with a <setup> and <punchline> property."),
  ("human", "{input}")
])
chain = prompt | llm | parser

In [128]:
chain.invoke({"input": "Tell me a joke about dentists."})

{
  "id": "chatcmpl-8UihHKSmwSubz3TnRLLevWnmauZWz",
  "object": "chat.completion",
  "created": 1702332347,
  "model": "gpt-3.5-turbo-0613",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "{\n  \"setup\": \"Why did the dentist take up gardening?\",\n  \"punchline\": \"Because they wanted to flossom their skills!\"\n}"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 52,
    "completion_tokens": 32,
    "total_tokens": 84
  },
  "system_fingerprint": null
}



Joke(setup='Why did the dentist take up gardening?', punchline='Because they wanted to flossom their skills!')

There are a number of built-in parsers like the two we have already seen. Here are a few more:

In [135]:
CommaSeparatedListOutputParser().parse("1, 2, 3, 4, 5") # the space between is required

['1', '2', '3', '4', '5']

In [139]:
DatetimeOutputParser().parse("2008-01-03T18:15:05.000000Z") # ISO 8601 format

datetime.datetime(2008, 1, 3, 18, 15, 5)

In [143]:
class Colors(Enum):
  RED = "red"
  BLUE = "blue"
  GREEN = "green"

EnumOutputParser(enum=Colors).parse("red")

<Colors.RED: 'red'>

Now let's look at how easy it is to create your own output parser:

In [144]:
class BetterCommaSeparatedListOutputParser(BaseOutputParser):
  def parse(self, text: str) -> list:
    # split on commas and trim whitespace so "1,2" and "1, 2" both work
    return [item.strip() for item in text.strip().split(",")]

BetterCommaSeparatedListOutputParser().parse("1,2,3,4,5")

['1', '2', '3', '4', '5']