**LCEL**: a declarative way to compose chains together

Basics:
1. **A unified interface**: every lcel object implements `Runnable` eiifccuherecdinterface (common invocation methods: `invoke`, `batch`, `stream`, `ainvoke`...)
2. **Composition Primitives**: easy to compose chains, parallelize components, add fallbacks, dynamically configure chain internal, and more.

Reasons to use:
1. **streaming support**: best possible TTFT. stream from llm -> op_parser -> user. (incremental chunks of op at the same rate as the LLM provider outputs the raw tokens)
2. **async support**: async API. handle many concurrent requests in the same server.
3. **optimized parallel execution**
4. **retries and fallbacks**: streaming for retries/fallbacks - not available yet
5. **access intermediate steps**: with streaming


In [None]:
!pip install -qu langchain langchain-core langchain-community

### basic example: prompt + model + output parser

In [11]:
import boto3
from langchain_community.chat_models import BedrockChat
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

prompt = ChatPromptTemplate.from_template(
    "tell me a short joke about {topic}"
)

llm = BedrockChat(
    model_id="anthropic.claude-3-haiku-20240307-v1:0",
    client=boto3.client("bedrock-runtime"),
    model_kwargs={"temperature": 0.0, "max_tokens":128}
)

output_parser = StrOutputParser()

chain = prompt | llm | output_parser

print(chain.invoke({"topic": "AI research"}))

Here's a short joke about AI research:

Why did the AI researcher cross the road? To get to the other side... and then analyze the data to optimize their crossing strategy.


 ### rag search example

In [15]:
# ! pip install faiss-cpu --quiet

import os, json
with open("/home/ubuntu/config.json") as file:
    config = json.load(file)
os.environ["COHERE_API_KEY"] = config["cohere_api_key"]

In [23]:
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import CohereEmbeddings
from langchain_core.prompts import ChatPromptTemplate
import boto3
from langchain_community.chat_models import BedrockChat
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough


vectorstore = FAISS.from_texts(
    ["harrison worked at kensho", "bears like to eat honey"],
    embedding=CohereEmbeddings(),
)
retriever = vectorstore.as_retriever()

prompt_template = """Answer the question based only on the following context:
{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template=prompt_template)

llm = BedrockChat(
    model_id="anthropic.claude-3-haiku-20240307-v1:0",
    client=boto3.client("bedrock-runtime"),
    model_kwargs={"temperature": 0.0, "max_tokens":128}
)

output_parser = StrOutputParser()

chain = (
    {
        "context": retriever,
        "question": RunnablePassthrough()
    }
    | prompt
    | llm
    | output_parser
)

print(chain.invoke("where did harrison work?"))


Based on the given context, which consists of two documents with the page contents "harrison worked at kensho" and "bears like to eat honey", the answer to the question "where did harrison work?" is that Harrison worked at Kensho.


### add tracing

In [22]:
os.environ["LANGCHAIN_TRACING_V2"]="true"
os.environ["LANGCHAIN_ENDPOINT"]="https://api.smith.langchain.com"
os.environ["LANGCHAIN_API_KEY"]=config["langsmith_api_key"]

In [24]:
# check in Langsmith UI
print(chain.invoke("where did harrison work?"))

Based on the given context, which consists of two documents with the page contents "harrison worked at kensho" and "bears like to eat honey", the answer to the question "where did harrison work?" is that Harrison worked at Kensho.


### add streaming

In [28]:
stream = chain.stream("where did harrison work?")
# stream = chain.stream("what is the current state of the art in AGI research ?")
for chunk in stream:
    print(chunk, end="", flush=True)

Based on the given context, which consists of two documents with the page contents "harrison worked at kensho" and "bears like to eat honey", the answer to the question "where did harrison work?" is that Harrison worked at Kensho.

### add fallbacks

In [30]:
prompt = ChatPromptTemplate.from_template(
    "tell me a short joke about {topic}"
)

haiku_llm = BedrockChat(
    model_id="anthropic.claude-3-haiku-20240307-v1:0",
    client=boto3.client("bedrock-runtime"),
    # model_kwargs={"temperature": 0.0, "max_tokens":128}
    model_kwargs={"temperature": 0.0, "max_tokens_to_sample":128} # intentionally done to test faillback
)

sonnet_llm  = BedrockChat(
    model_id="anthropic.claude-3-sonnet-20240229-v1:0",
    client=boto3.client("bedrock-runtime"),
    model_kwargs={"temperature": 0.0, "max_tokens":128}
)

output_parser = StrOutputParser()

haiku_chain = prompt | haiku_llm | output_parser
sonnet_chain = prompt | sonnet_llm | output_parser

In [33]:
print(haiku_chain.invoke({"topic": "AI research"})) # will throw an error

# print(sonnet_chain.invoke({"topic": "AI research"})) # will work perfectly

ValueError: Error raised by bedrock service: An error occurred (ValidationException) when calling the InvokeModel operation: Malformed input request: #: subject must not be valid against schema {"required":["messages"]}#: extraneous key [max_tokens_to_sample] is not permitted, please reformat your input and try again.

In [34]:
fallback_chain = haiku_chain.with_fallbacks([sonnet_chain])

print(fallback_chain.invoke({"topic": "AI research"}))

Why was the AI researcher's office so hot? The machine learning was overheating!


### add configurables

In [49]:
prompt = ChatPromptTemplate.from_template(
    "tell me a short joke about {topic}"
)

haiku = BedrockChat(
    model_id="anthropic.claude-3-haiku-20240307-v1:0",
    client=boto3.client("bedrock-runtime"),
    # model_kwargs={"temperature": 0.0, "max_tokens":128}
    model_kwargs={"temperature": 0.0, "max_tokens_to_sample": 128} # intentionally done to test faillback
)

sonnet  = BedrockChat(
    model_id="anthropic.claude-3-sonnet-20240229-v1:0",
    client=boto3.client("bedrock-runtime"),
    model_kwargs={"temperature": 0.0, "max_tokens": 128}
)

claude2 = BedrockChat(
    model_id="anthropic.claude-v2:1",
    client=boto3.client("bedrock-runtime"),
    model_kwargs={"temperature": 0.0, "max_tokens": 128}
)

output_parser = StrOutputParser()

In [50]:
from langchain_core.runnables import ConfigurableField

configurable_llm = llm.configurable_alternatives(
    ConfigurableField(id="llm"),
    default_key="claude2",
    haiku=haiku,
    sonnet=sonnet,
)

configurable_chain = (
    {
        "topic": RunnablePassthrough()
    }
    | prompt
    | configurable_llm
    | output_parser
)

In [51]:
configurable_chain.invoke(
    "AI research",
    config={
        "llm": "haiku"
    }
)

"Here's a short joke about AI research:\n\nWhy did the AI researcher cross the road? To get to the other side... and then analyze the data to optimize their crossing strategy."

In [52]:
configurable_chain.invoke(
    "AI research",
    # config={
    #     "llm": "haiku"
    # }
)

"Here's a short joke about AI research:\n\nWhy did the AI researcher cross the road? To get to the other side... and then analyze the data to optimize their crossing strategy."

In [53]:
configurable_chain.invoke(
    "AI research",
    config={
        "llm": "sonnet"
    }
)

"Here's a short joke about AI research:\n\nWhy did the AI researcher cross the road? To get to the other side... and then analyze the data to determine the optimal path for future crossings."

### combine - tracing + streaming + fallbacks + configurables

In [59]:
import boto3
from langchain_core.prompts import ChatPromptTemplate
from langchain_community.chat_models import BedrockChat
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import ( ConfigurableField, RunnablePassthrough )


os.environ["LANGCHAIN_TRACING_V2"]="true"
os.environ["LANGCHAIN_ENDPOINT"]="https://api.smith.langchain.com"
os.environ["LANGCHAIN_API_KEY"]=config["langsmith_api_key"]

prompt = ChatPromptTemplate.from_template(
    "tell me a short joke about {topic}"
)

haiku = BedrockChat(
    model_id="anthropic.claude-3-haiku-20240307-v1:0",
    client=boto3.client("bedrock-runtime"),
    model_kwargs={"temperature": 0.0, "max_tokens":128}
    # model_kwargs={"temperature": 0.0, "max_tokens_to_sample": 128} # intentionally done to test faillback
)

sonnet  = BedrockChat(
    model_id="anthropic.claude-3-sonnet-20240229-v1:0",
    client=boto3.client("bedrock-runtime"),
    model_kwargs={"temperature": 0.0, "max_tokens": 128}
)

claude2 = BedrockChat(
    model_id="anthropic.claude-v2:1",
    client=boto3.client("bedrock-runtime"),
    model_kwargs={"temperature": 0.0, "max_tokens": 128}
)

output_parser = StrOutputParser()

llm = (
    claude2
    .with_fallbacks([sonnet])
    .configurable_alternatives(
        ConfigurableField(id="llm"),
        default_key="claude2",
        haiku=haiku,
        sonnet=sonnet,
    )
)

chain = (
    {
        "topic": RunnablePassthrough()
    }
    | prompt
    | llm
    | output_parser
)

In [61]:
# chain.invoke("AI research", config={"llm": "haiku"})
response = chain.stream("AI research", config={"llm": "haiku"})

for chunk in response:
    print(chunk, end="", flush=True)

Why can't AI researchers tell the difference between Halloween and Christmas? Because Oct 31 = Dec 25!

### interface

- `stream`: stream back chunks of response
- `invoke`: call the chain on an input
- `batch`: call the chain on a list of inputs

async methods:
- `astream`: stream back chunks of response async
- `ainvoke`: call the chain on an input async
- `abatch`: call the chain on a list of inputs async
- `astream_log`: stream back intermediate steps as they happen, in addition to the final response
- `astream_events`:  stream events as they happen in chain

In [71]:
# response = chain.astream_log("AI research", config={"llm": "haiku"})
response = chain.astream_events("AI research", config={"llm": "haiku"}, version="v1")

async for chunk in response:
    print("-" * 70)
    print(chunk)


----------------------------------------------------------------------
{'event': 'on_chain_start', 'run_id': 'e348e1ac-9a95-46a1-b74c-6ac75ffd7a16', 'name': 'RunnableSequence', 'tags': [], 'metadata': {}, 'data': {'input': 'AI research'}}
----------------------------------------------------------------------
{'event': 'on_chain_start', 'name': 'RunnableParallel<topic>', 'run_id': '4b6586f5-24fc-4cb2-95ee-c37176f06e95', 'tags': ['seq:step:1'], 'metadata': {}, 'data': {}}
----------------------------------------------------------------------
{'event': 'on_chain_start', 'name': 'RunnablePassthrough', 'run_id': '26cd260f-f8aa-41d4-9996-602038031591', 'tags': ['map:key:topic'], 'metadata': {}, 'data': {}}
----------------------------------------------------------------------
{'event': 'on_chain_stream', 'name': 'RunnablePassthrough', 'run_id': '26cd260f-f8aa-41d4-9996-602038031591', 'tags': ['map:key:topic'], 'metadata': {}, 'data': {'chunk': 'AI research'}}
--------------------------------

----------------------------------------------------------------------
{'event': 'on_chat_model_end', 'name': 'BedrockChat', 'run_id': 'b94eb4ff-ec35-491f-91e0-a3d1365e2d11', 'tags': [], 'metadata': {}, 'data': {'input': {'messages': [[HumanMessage(content='tell me a short joke about AI research')]]}, 'output': {'generations': [[{'text': "Why can't AI researchers tell the difference between Halloween and Christmas? Because Oct 31 = Dec 25!", 'generation_info': None, 'type': 'ChatGeneration', 'message': AIMessage(content="Why can't AI researchers tell the difference between Halloween and Christmas? Because Oct 31 = Dec 25!")}]], 'llm_output': None, 'run': None}}}
----------------------------------------------------------------------
{'event': 'on_parser_start', 'name': 'StrOutputParser', 'run_id': 'b188dca1-1468-49d0-b730-fb1bf1e43b24', 'tags': ['seq:step:4'], 'metadata': {}, 'data': {}}
----------------------------------------------------------------------
{'event': 'on_parser_stream',

### streaming 🥷

1. `stream` and `astream` - for final output from chain
2. `astream_events`, `astream_log` - for both intermediate steps and final output from chain
3. `Streaming` is only possible if all steps in the program know how to process an input stream; i.e., process an input chunk one at a time, and yield a corresponding output chunk.
4. `Challenges` - (easy) to stream tokens produced by LLM -> (hard) to stream parts of JSON result before the entire JSON is complete

In [86]:
### streaming a json response
import boto3
from langchain_community.chat_models import BedrockChat
from langchain_core.output_parsers import JsonOutputParser


haiku = BedrockChat(
    model_id="anthropic.claude-3-haiku-20240307-v1:0",
    client=boto3.client("bedrock-runtime"),
    model_kwargs={"temperature": 0.0, "max_tokens":128}
)

chain =  haiku | JsonOutputParser()
# Due to a bug in older versions of Langchain, JsonOutputParser did not stream results from some models

input_text = 'output a list of the countries france, spain and japan and their populations in JSON format. Do not explain.  Use a dict with an outer key of "countries" which contains a list of countries. Each country should have the key `name` and `population`'
# async for text in chain.astream(input_text):
for text in chain.stream(input_text):
    print(text, flush=True)


{}
{'countries': []}
{'countries': [{}]}
{'countries': [{'name': ''}]}
{'countries': [{'name': 'France'}]}
{'countries': [{'name': 'France', 'population': 67}]}
{'countries': [{'name': 'France', 'population': 67059}]}
{'countries': [{'name': 'France', 'population': 67059887}]}
{'countries': [{'name': 'France', 'population': 67059887}, {}]}
{'countries': [{'name': 'France', 'population': 67059887}, {'name': ''}]}
{'countries': [{'name': 'France', 'population': 67059887}, {'name': 'Spain'}]}
{'countries': [{'name': 'France', 'population': 67059887}, {'name': 'Spain', 'population': 46}]}
{'countries': [{'name': 'France', 'population': 67059887}, {'name': 'Spain', 'population': 46754}]}
{'countries': [{'name': 'France', 'population': 67059887}, {'name': 'Spain', 'population': 46754778}]}
{'countries': [{'name': 'France', 'population': 67059887}, {'name': 'Spain', 'population': 46754778}, {}]}
{'countries': [{'name': 'France', 'population': 67059887}, {'name': 'Spain', 'population': 4675477

In [89]:
from langchain_core.output_parsers import (
    JsonOutputParser,
)


# A function that operates on finalized inputs
# rather than on an input_stream
def _extract_country_names(inputs):
    """A function that does not operates on input streams and breaks streaming."""
    if not isinstance(inputs, dict):
        return ""

    if "countries" not in inputs:
        return ""

    countries = inputs["countries"]

    if not isinstance(countries, list):
        return ""

    country_names = [
        country.get("name") for country in countries if isinstance(country, dict)
    ]
    return country_names


chain = haiku | JsonOutputParser() | _extract_country_names

async for text in chain.astream(
    'output a list of the countries france, spain and japan and their populations in JSON format. Do not explain.  Use a dict with an outer key of "countries" which contains a list of countries. Each country should have the key `name` and `population`'
):
    print(text, end="|", flush=True)

['France', 'Spain', 'Japan']|

In [92]:
from langchain_core.output_parsers import JsonOutputParser


async def _extract_country_names_streaming(input_stream):
    """A function that operates on input streams."""
    country_names_so_far = set()

    async for input in input_stream:
        if not isinstance(input, dict):
            continue

        if "countries" not in input:
            continue

        countries = input["countries"]

        if not isinstance(countries, list):
            continue

        for country in countries:
            name = country.get("name")
            if not name:
                continue
            if name not in country_names_so_far:
                yield name
                country_names_so_far.add(name)


chain = haiku | JsonOutputParser() | _extract_country_names_streaming

async for text in chain.astream(
    'output a list of the countries france, spain and japan and their populations in JSON format. Do not explain.  Use a dict with an outer key of "countries" which contains a list of countries. Each country should have the key `name` and `population`'
):
    print(text, end="|", flush=True)

France|Spain|Japan|

### runnables

1. RunnableParallel
2. RunnablePassthrough
3. RunnableLambda
4. RunnableConfig
5. RunnableBranch

### accepting a runnable config

In [102]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableConfig, RunnableLambda

In [157]:
import json

def parse_or_fix(
    text: str, 
    config: RunnableConfig
):
    # print(f"text: {text}")
    fix_prompt = ChatPromptTemplate.from_template(
        "\n\nHuman: Fix the following text:\n\n```text\n{input}\n```\nError: {error}\n"
        "Don't narrate, just respond with the fixed data. \nAssistant: "
    )

    llm = BedrockChat(
        # model_id="anthropic.claude-3-haiku-20240307-v1:0",
        model_id="anthropic.claude-3-sonnet-20240229-v1:0",
        client=boto3.client("bedrock-runtime"),
        model_kwargs={"temperature": 0.0, "max_tokens":128}
    )

    fixing_chain = fix_prompt | llm | StrOutputParser()

    for i in range(5):
        try:
            return json.loads(text)
        except Exception as e:
            # print(f"Error: {e}")
            text = fixing_chain.invoke(
                {
                    "input": text,
                    "error": e
                },
                # config
            )

    return "Failed to parse"

# result = parse_or_fix(
#     text="{foo: bar}",
#     config={}
)

In [158]:
result

{'foo': 'bar'}

In [159]:
from langchain.callbacks import get_openai_callback

# to use get_openai_callback - replace the llm with ChatOpenAI()

with get_openai_callback() as cb:
    output = RunnableLambda(parse_or_fix).invoke(
        "{foo: bar}", {"tags": ["my-tag"], "callbacks": [cb]}
    )
    print(output)
    print(cb)

{'foo': 'bar'}
Tokens Used: 0
	Prompt Tokens: 0
	Completion Tokens: 0
Successful Requests: 0
Total Cost (USD): $0.0


### dynamically route logic based on input

#### naive

In [162]:
import boto3
from langchain_community.chat_models import BedrockChat
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate

claude2 = BedrockChat(
    model_id="anthropic.claude-v2:1",
    client=boto3.client("bedrock-runtime"),
    model_kwargs={"temperature": 0.0, "max_tokens": 128}
)

chain = (
    PromptTemplate.from_template(
        """Given the user question below, classify it as either being about `Haiku`, `Sonnet`, or `Other`.

Do not respond with more than one word.

<question>
{question}
</question>

Classification:"""
    )
    | claude2
    | StrOutputParser()
)

chain.invoke({"question": "how do I call Haiku?"})

'Haiku'

In [163]:
haiku = BedrockChat(
    model_id="anthropic.claude-3-haiku-20240307-v1:0",
    client=boto3.client("bedrock-runtime"),
    model_kwargs={"temperature": 0.0, "max_tokens":128}
)

sonnet  = BedrockChat(
    model_id="anthropic.claude-3-sonnet-20240229-v1:0",
    client=boto3.client("bedrock-runtime"),
    model_kwargs={"temperature": 0.0, "max_tokens": 128}
)

claude2 = BedrockChat(
    model_id="anthropic.claude-v2:1",
    client=boto3.client("bedrock-runtime"),
    model_kwargs={"temperature": 0.0, "max_tokens": 128}
)


haiku_chain = (
    PromptTemplate.from_template(
        """You are an AI research assistant called Haiku. \
Always answer questions starting with "Haiku:". \
Respond to the following question:

Question: {question}
Answer:"""
    )
    | haiku
)

sonnet_chain = (
    PromptTemplate.from_template(
         """You are an AI research assistant called Sonnet. \
Always answer questions starting with "Sonnet:". \
Respond to the following question:

Question: {question}
Answer:"""
    )
    | sonnet
)
general_chain = (
    PromptTemplate.from_template(
        """Respond to the following question:

Question: {question}
Answer:"""
    )
    | claude2
)


def route(info):
    if "haiku" in info["topic"].lower():
        return haiku_chain
    elif "sonnet" in info["topic"].lower():
        return sonnet_chain
    else:
        return general_chain 

full_chain = (
    {
        "topic": chain,
        "question": lambda x: x["question"]
    }
    | RunnableLambda(route)
)

In [164]:
full_chain.invoke({"question": "how do I use Anthropic?"})

AIMessage(content='Here are a few tips for using Anthropic:\n\n- Ask Claude open-ended questions or have conversations. Claude is designed to have natural conversations and be helpful, harmless, and honest. Simply ask questions or make statements to have a dialogue.\n\n- Provide context and clarification when needed. If Claude seems confused or gives an unhelpful response, try rephrasing your question or providing more details to guide the conversation. \n\n- Adjust your expectations. Claude has limitations in its knowledge and capabilities. Expect insightful but simplistic responses rather than fully comprehensive expertise.\n\n- Check the website and documentation.')

In [165]:
full_chain.invoke({"question": "how do I use Anthropic Haiku?"})

AIMessage(content='Haiku: To use Anthropic Haiku, simply ask me questions and I will respond in the form of a haiku poem. I am an AI research assistant created by Anthropic to provide helpful information in a concise, poetic format. Feel free to ask me anything, and I will do my best to answer in a thoughtful, 3-line haiku.')

In [167]:
full_chain.invoke({"question": "how do I use Anthropic Sonnet for research?"})

AIMessage(content="Sonnet: As an AI research assistant, I can assist you with various tasks related to your research projects. Some ways I can help include literature review, data analysis, writing and editing research papers, creating visualizations, and answering specific questions within my knowledge domain. Please provide more details about your research area and specific needs, and I'll do my best to support you effectively.")

In [168]:
full_chain.invoke({"question": "whats 2 + 2"})

AIMessage(content='4')

#### using a RunnableBranch

In [170]:
from langchain_core.runnables import RunnableBranch

branch = RunnableBranch(
    (lambda x: "haiku" in x["topic"].lower(), haiku_chain),
    (lambda x: "sonnet" in x["topic"].lower(), sonnet_chain),
    general_chain,
)

full_chain = (
    {
        "topic": chain,
        "question": lambda x: x["question"]
    }
    | branch
)

In [171]:
full_chain.invoke({
    "question": "how to use Anthropic's Haiku to work on AGI research?"
})

AIMessage(content="Haiku: Anthropic's Haiku is a powerful AI research tool that can assist with advanced AI research, including work on artificial general intelligence (AGI). To use Haiku for AGI research, I recommend the following steps:\n\n1. Familiarize yourself with Haiku's capabilities and features. Explore the documentation and experiment with the tool to understand how it can support your research.\n\n2. Identify specific areas of your AGI research where Haiku could be beneficial, such as data processing, model training, or analysis of results.\n\n3. Integrate Haiku into")

In [172]:
full_chain.get_graph()

Graph(nodes={'39ba451d9f274f8c8484fe5a4b661f3f': Node(id='39ba451d9f274f8c8484fe5a4b661f3f', data=<class 'pydantic.main.RunnableParallel<topic,question>Input'>), '4d3b56a918c64317a5e863b64808ea1c': Node(id='4d3b56a918c64317a5e863b64808ea1c', data=<class 'pydantic.main.RunnableParallel<topic,question>Output'>), '40f95b98b41e400691f43883b4e402f9': Node(id='40f95b98b41e400691f43883b4e402f9', data=PromptTemplate(input_variables=['question'], template='Given the user question below, classify it as either being about `Haiku`, `Sonnet`, or `Other`.\n\nDo not respond with more than one word.\n\n<question>\n{question}\n</question>\n\nClassification:')), '4e16a4a65a2b43828e3d73261a15d0c8': Node(id='4e16a4a65a2b43828e3d73261a15d0c8', data=BedrockChat(client=<botocore.client.BedrockRuntime object at 0x7f9a2397f2e0>, model_id='anthropic.claude-v2:1', model_kwargs={'temperature': 0.0, 'max_tokens': 128})), '096bed4f6f564d3f9868c4db32b3a830': Node(id='096bed4f6f564d3f9868c4db32b3a830', data=StrOutput

In [174]:
# ! pip install grandalf --quiet

full_chain.get_graph().print_ascii()

[0m       +-------------------------------+       
       | Parallel<topic,question>Input |       
       +-------------------------------+       
                **            ***              
              **                 **            
            **                     **          
+----------------+                   **        
| PromptTemplate |                    *        
+----------------+                    *        
          *                           *        
          *                           *        
          *                           *        
  +-------------+                     *        
  | BedrockChat |                     *        
  +-------------+                     *        
          *                           *        
          *                           *        
          *                           *        
+-----------------+           +-------------+  
| StrOutputParser |           | Lambda(...) |  
+-----------------+           +-----

In [175]:
full_chain.get_prompts()

[PromptTemplate(input_variables=['question'], template='Given the user question below, classify it as either being about `Haiku`, `Sonnet`, or `Other`.\n\nDo not respond with more than one word.\n\n<question>\n{question}\n</question>\n\nClassification:')]

### add memory

purpose: add message history to chain

how: wrap Runnable with another that can manage the chat message history -> RunnableWithMessageHistory

In [1]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_community.chat_models import  BedrockChat
import boto3

llm = BedrockChat(
    model_id="anthropic.claude-3-haiku-20240307-v1:0",
    client=boto3.client("bedrock-runtime"),
    model_kwargs={"temperature": 0.0, "max_tokens":128}
)

prompt = ChatPromptTemplate.from_messages([
    (
        "system",
        "You're an assistant who's good at {ability}. Respond in 20 words or fewer"
    ),
    MessagesPlaceholder(variable_name="chat_history"),
    (
        "human",
        "{question}"
    )
])

chain = prompt | llm


In [2]:
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

store = {}

def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]

chain_with_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="question",
    history_messages_key="chat_history"
)

In [4]:
chain_with_history.invoke(
    {
        "ability": "math",
        "question": "What does cosine mean?"
    },
    config = {
        "configurable": {
            "session_id": "001"
        }
    }
)

AIMessage(content='Cosine is the trigonometric function that represents the ratio of the adjacent side to the hypotenuse of a right triangle.')

In [5]:
chain_with_history.invoke(
    {
        "ability": "math",
        "question": "Simplify it and give examples"
    },
    config = {
        "configurable": {
            "session_id": "001"
        }
    }
)

AIMessage(content='Cosine is the ratio of the adjacent side to the hypotenuse. Examples: cos(0°) = 1, cos(90°) = 0, cos(45°) = √2/2.')

In [6]:
chain_with_history.invoke(
    {
        "ability": "math",
        "question": "Simplify it and give examples"
    },
    config = {
        "configurable": {
            "session_id": "002"
        }
    }
)

AIMessage(content='Simplify mathematical expressions, provide step-by-step solutions, and give numerical examples.')