## Still under development
## Introduction to Chains using the LangChain Package
The concept of a "chain" is central to understanding and building generative AI applications. Chains are a key concept in LangChain: they allow developers to create complex workflows by linking together components such as language models, data sources, and processing steps.

Understanding chains is crucial because they enable the creation of more sophisticated and powerful generative AI systems. Chains allow analysts to seamlessly integrate multiple functionalities, such as retrieving relevant information from external data sources, processing that information using language models, and generating coherent and contextual outputs.

### Table of Contents <a name="top"></a>
1. [Create an LLM inside of LangChain](#llm)
2. [Prompt Templates](#template)
3. [Text Generation](#text-generation)
4. [Question Answering](#question-answering)
5. [Summarization](#summarization)
6. [What's inside a pipeline?](#pipeline)
7. [Your assignment](#assign)

This content was adapted from: https://python.langchain.com/docs/get_started/quickstart/


In [1]:
# Remember: to run this notebook, you must first run the notebook M2-8a-Config_SM_image to configure the
# SageMaker Docker image every time you stop and restart the JupyterLab Space.
from dotenv import load_dotenv
import os
import langchain

In [2]:
# Load environment variables from .env file
load_dotenv()
# Now you can access the environment variables
anthropic_api_key = os.getenv('ANTHROPIC_API_KEY')
openai_api_key = os.getenv('OPENAI_API_KEY')
langchain_api_key = os.getenv('LANGCHAIN_API_KEY')
huggingface_api_key = os.getenv('HUGGINGFACE_API_KEY')
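
`os.getenv` returns `None` when a variable is not set, so a missing key would only surface later as an authentication error. A minimal sketch of an early check, assuming the four variable names used above; the helper name `missing_keys` is hypothetical:

```python
import os

REQUIRED_KEYS = [
    "ANTHROPIC_API_KEY",
    "OPENAI_API_KEY",
    "LANGCHAIN_API_KEY",
    "HUGGINGFACE_API_KEY",
]

def missing_keys(names, env=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]

# Example with an explicit mapping instead of the real environment
example_env = {"ANTHROPIC_API_KEY": "sk-example", "OPENAI_API_KEY": ""}
print(missing_keys(REQUIRED_KEYS, example_env))
```

Calling `missing_keys(REQUIRED_KEYS)` with no second argument checks the real environment after `load_dotenv()` has run.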

In [80]:
# Now import everything we will need
# from langchain_anthropic import ChatAnthropic
# from langchain_openai import ChatOpenAI
# from langchain import HuggingFaceHub
# from langchain_core.prompts import ChatPromptTemplate
# from langchain_core.output_parsers import StrOutputParser
# import json

## Create a chat model inside LangChain <a name="llm"></a>
Let's start with creating a chat model. We can use one from a wide selection, some of which will be familiar to you.

https://python.langchain.com/docs/integrations/chat/

In [3]:
# Docs: https://api.python.langchain.com/en/latest/chat_models/langchain_anthropic.chat_models.ChatAnthropic.html

from langchain_anthropic import ChatAnthropic

claude_llm = ChatAnthropic(model="claude-3-sonnet-20240229",api_key=anthropic_api_key)
# Show some details about the model
claude_llm.dict()

{'model': 'claude-3-sonnet-20240229',
 'max_tokens': 1024,
 'temperature': None,
 'top_k': None,
 'top_p': None,
 'model_kwargs': {},
 'streaming': False,
 'max_retries': 2,
 'default_request_timeout': None,
 '_type': 'anthropic-chat'}

In [4]:
# Create a simple prompt
raw_input = "Please briefly explain the langchain package."

In [5]:
# Invoke the model with our prompt
response = claude_llm.invoke(raw_input, max_tokens = 100)
print("What is the type of response?:", type(response))

What is the type of response?: <class 'langchain_core.messages.ai.AIMessage'>


In [6]:
# Here is the documentation:
# https://api.python.langchain.com/en/latest/messages/langchain_core.messages.ai.AIMessage.html
# Just dump the AIMessage to the screen
response

AIMessage(content='LangChain is a Python library that aims to make it easier to build applications with large language models (LLMs) like GPT-3, BLOOM, and others. It provides a set of abstractions and utilities that simplify the process of working with LLMs, allowing developers to focus on building their applications rather than dealing with the low-level details of interacting with the models.\n\nHere are some key features and capabilities of LangChain:\n\n1.', response_metadata={'id': 'msg_01MRW7XeNiEqaqpwaPaj1a6x', 'model': 'claude-3-sonnet-20240229', 'stop_reason': 'max_tokens', 'stop_sequence': None, 'usage': {'input_tokens': 15, 'output_tokens': 100}}, id='run-3a9080a5-d75a-481d-8ee1-3f3cc9b425d4-0')

In [7]:
# We can also get a Python dictionary
response.dict()

{'content': 'LangChain is a Python library that aims to make it easier to build applications with large language models (LLMs) like GPT-3, BLOOM, and others. It provides a set of abstractions and utilities that simplify the process of working with LLMs, allowing developers to focus on building their applications rather than dealing with the low-level details of interacting with the models.\n\nHere are some key features and capabilities of LangChain:\n\n1.',
 'additional_kwargs': {},
 'response_metadata': {'id': 'msg_01MRW7XeNiEqaqpwaPaj1a6x',
  'model': 'claude-3-sonnet-20240229',
  'stop_reason': 'max_tokens',
  'stop_sequence': None,
  'usage': {'input_tokens': 15, 'output_tokens': 100}},
 'type': 'ai',
 'name': None,
 'id': 'run-3a9080a5-d75a-481d-8ee1-3f3cc9b425d4-0',
 'example': False,
 'tool_calls': [],
 'invalid_tool_calls': []}
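
The `stop_reason` of `'max_tokens'` in the metadata above explains why the answer breaks off after `1.`: the 100-token limit was reached. A minimal sketch of checking for this, using a trimmed copy of the dictionary shown above; the helper name `was_truncated` is hypothetical:

```python
def was_truncated(response_dict):
    """Return True when the metadata says the output hit the token limit."""
    metadata = response_dict.get("response_metadata", {})
    return metadata.get("stop_reason") == "max_tokens"

# Trimmed copy of the response.dict() output shown above
example = {
    "content": "LangChain is a Python library ...",
    "response_metadata": {
        "model": "claude-3-sonnet-20240229",
        "stop_reason": "max_tokens",
        "usage": {"input_tokens": 15, "output_tokens": 100},
    },
}

print(was_truncated(example))
```

With a real response, `was_truncated(response.dict())` would apply the same check.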

#### OpenAI Model

In [8]:
# The OpenAI Chat is very similar
# Documentation: https://python.langchain.com/docs/integrations/chat/openai/
from langchain_openai import ChatOpenAI

openai_llm = ChatOpenAI(model='gpt-3.5-turbo', api_key=openai_api_key)
openai_llm.dict()

{'model_name': 'gpt-3.5-turbo',
 'model': 'gpt-3.5-turbo',
 'stream': False,
 'n': 1,
 'temperature': 0.7,
 '_type': 'openai-chat'}

In [32]:
#  Invoke the same way
response = openai_llm.invoke("Please briefly explain the langchain package.", max_tokens = 100)
print('Check out the type of the return from the model:', type(response))
# Look at some detail of the return
response.dict()

Check out the type of the return from the model: <class 'langchain_core.messages.ai.AIMessage'>


{'content': 'The langchain package is a Python library for working with natural language processing (NLP) tasks, such as text classification, sentiment analysis, and named entity recognition. It provides a simple interface for building and training machine learning models for NLP tasks, as well as tools for preprocessing text data and evaluating model performance. The langchain package is designed to be easy to use and flexible, making it a useful tool for developers working on NLP projects.',
 'additional_kwargs': {},
 'response_metadata': {'token_usage': {'completion_tokens': 90,
   'prompt_tokens': 15,
   'total_tokens': 105},
  'model_name': 'gpt-3.5-turbo',
  'system_fingerprint': 'fp_c2295e73ad',
  'finish_reason': 'stop',
  'logprobs': None},
 'type': 'ai',
 'name': None,
 'id': 'run-6822ff4a-22e8-4bb7-acb7-fdc116871ece-0',
 'example': False,
 'tool_calls': [],
 'invalid_tool_calls': []}
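
The two providers report token usage under different keys: the Claude response uses `usage` with `input_tokens` and `output_tokens`, while the OpenAI response uses `token_usage` with `prompt_tokens` and `completion_tokens`. A minimal sketch of reading both layouts, based only on the two outputs shown above; `token_counts` is a hypothetical helper:

```python
def token_counts(response_metadata):
    """Return (input_tokens, output_tokens) from either metadata layout shown above."""
    if "usage" in response_metadata:  # layout in the Claude response
        usage = response_metadata["usage"]
        return usage["input_tokens"], usage["output_tokens"]
    if "token_usage" in response_metadata:  # layout in the OpenAI response
        usage = response_metadata["token_usage"]
        return usage["prompt_tokens"], usage["completion_tokens"]
    raise KeyError("no token usage found in response metadata")

# Values copied from the two outputs above
claude_meta = {"usage": {"input_tokens": 15, "output_tokens": 100}}
openai_meta = {"token_usage": {"completion_tokens": 90, "prompt_tokens": 15, "total_tokens": 105}}

print(token_counts(claude_meta))
print(token_counts(openai_meta))
```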

#### Hugging Face Models

In [35]:
# We can also use HuggingFace models inside of LangChain
# Documentation: https://python.langchain.com/docs/integrations/chat/huggingface/
#
from langchain_community.llms import HuggingFaceHub

hf_llm = HuggingFaceHub(
    huggingfacehub_api_token=huggingface_api_key,
    repo_id="HuggingFaceH4/zephyr-7b-beta",
    task="text-generation",
    model_kwargs={
        "max_new_tokens": 512,
        "top_k": 2,
        "temperature": 0.1,
        "repetition_penalty": 1.03,
    },
)
hf_llm.dict()

{'repo_id': 'HuggingFaceH4/zephyr-7b-beta',
 'task': 'text-generation',
 'model_kwargs': {'max_new_tokens': 512,
  'top_k': 2,
  'temperature': 0.1,
  'repetition_penalty': 1.03},
 '_type': 'huggingface_hub'}

In [36]:
# Invoke the same way.
response = hf_llm.invoke("Please briefly explain the langchain package.", max_tokens=100)
print(response)

Please briefly explain the langchain package. How does it help in building intelligent agents?

LangChain is a Python library for building intelligent agents that can interact with various data sources, including text, audio, and video. It provides a framework for working with large-scale language models, such as GPT-3, and enables developers to build applications that can understand natural language, generate responses, and reason about information. LangChain's modular architecture allows for easy integration of different components, such as vector databases, caching, and dialog management, making it a versatile tool for building intelligent agents. Overall, LangChain helps in building intelligent agents by providing a set of tools and libraries for working with large-scale language models, enabling developers to build applications that can understand natural language, generate responses, and reason about information.
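
Unlike the chat models, the output above begins with the prompt itself. A minimal sketch of removing that echoed prompt, assuming the completion starts with the exact prompt text; `strip_prompt_echo` is a hypothetical helper:

```python
def strip_prompt_echo(prompt, completion):
    """Drop the prompt from the start of the completion if the model echoed it."""
    if completion.startswith(prompt):
        return completion[len(prompt):].lstrip()
    return completion

prompt_text = "Please briefly explain the langchain package."
# Shortened copy of the output shown above
raw = prompt_text + " How does it help in building intelligent agents?\n\nLangChain is a Python library"
print(strip_prompt_echo(prompt_text, raw))
```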


In [113]:
# We now have 3 models defined
# claude_llm, openai_llm, hf_llm

## Prompt Templates <a name="template"></a>
Prompt templates are predefined recipes for generating prompts for language models. A template may include instructions, few-shot examples, and specific context and questions appropriate for a given task. LangChain provides tooling to create and work with prompt templates.
LangChain strives to create model-agnostic templates so that existing templates are easy to reuse across different language models. Typically, language models expect the prompt to be either a string or a list of chat messages.
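
At its simplest, a prompt template is a string with named placeholders. The following standard-library sketch illustrates the idea only and is not LangChain's implementation; `template_variables` is a hypothetical helper:

```python
from string import Formatter

def template_variables(template):
    """Return the placeholder names found in a format-style template, in order."""
    return [field for _, field, _, _ in Formatter().parse(template) if field]

template = "What is a {adjective} joke about {topic}?"

# The variable names can be read from the template itself
print(template_variables(template))

# Filling the placeholders produces the final prompt string
print(template.format(adjective="funny", topic="dogs"))
```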

### Simple Prompt Template

In [37]:
from langchain.prompts import PromptTemplate

prompt = PromptTemplate(
    input_variables=["adjective","topic"],
    template="What is a {adjective} joke about {topic}?",
)

prompt.input_schema()

PromptInput(adjective=None, topic=None)

In [38]:
# We can use this by passing the parameters
prompt.format(adjective="funny",topic="dogs")

'What is a funny joke about dogs?'

In [39]:
# But we can also use the invoke() method and pass a dictionary. This is more common
prompt.invoke({"adjective":"strange","topic": "bears"})

StringPromptValue(text='What is a strange joke about bears?')

## Chains
A chain is a fundamental concept in the LangChain framework: it lets developers create workflows by linking together components such as prompts, language models, parsers, data sources, and processing steps.

Chains allow you to go beyond a single API call to a language model and instead chain together multiple calls in a logical sequence.
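
The `|` syntax used to build chains in this notebook can be illustrated with a toy class. This is a sketch of the idea only, not LangChain's implementation; `Step`, `fake_model`, and the other names are hypothetical, and no model is called:

```python
class Step:
    """Toy stand-in for a chain component: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # Composing two steps yields a new step that runs them in order
        return Step(lambda value: other.invoke(self.invoke(value)))

# Three hypothetical steps: build a prompt, fake a model call, clean the output
make_prompt = Step(lambda d: "What is a {adjective} joke about {topic}?".format(**d))
fake_model = Step(lambda prompt: {"content": "  (model reply to: " + prompt + ")  "})
to_text = Step(lambda message: message["content"].strip())

chain = make_prompt | fake_model | to_text
print(chain.invoke({"adjective": "happy", "topic": "pumpkins"}))
```

Each step receives the previous step's output, which is why the input to the whole chain is whatever the first step expects (here, a dictionary of template variables).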

In [40]:
# Define a chain
chain = prompt | claude_llm 
type(chain)

langchain_core.runnables.base.RunnableSequence

In [41]:
# To use the chain, use the invoke() method
chain.invoke({"adjective":"happy","topic": "pumpkins"})

AIMessage(content="Here's a happy, pun-filled joke about pumpkins:\n\nWhy did the pumpkin cross the road? To get to the other celery!", response_metadata={'id': 'msg_015ewLVvffoCcpHF4sPzYHJS', 'model': 'claude-3-sonnet-20240229', 'stop_reason': 'end_turn', 'stop_sequence': None, 'usage': {'input_tokens': 17, 'output_tokens': 37}}, id='run-9c6fbe59-861b-43c9-9d7c-24251eb1e6d2-0')

## Parser
Output parsers are responsible for taking the output of an LLM and transforming it to a more suitable format. This is very useful when you are using LLMs to generate any form of structured data.
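
As an illustration of turning model text into structured data, here is a minimal standard-library sketch. It is not one of LangChain's parsers; `parse_json_reply` is a hypothetical helper, and the reply text is a made-up sample:

```python
import json

def parse_json_reply(text):
    """Decode the JSON object spanning the first '{' to the last '}' in a reply."""
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in reply")
    return json.loads(text[start:end + 1])

# Models often wrap structured output in extra prose
reply = 'Sure! Here is the joke:\n{"setup": "Why did the chef get fired?", "punchline": "He could not take the heat."}'
joke = parse_json_reply(reply)
print(joke["punchline"])
```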

In [61]:
from langchain_core.output_parsers import StrOutputParser

# This parser takes the AIMessage returned from the LLM and converts it to a string
output_parser = StrOutputParser()

In [43]:
# Create a chain with 3 steps
chain = prompt | openai_llm | output_parser
type(chain)

langchain_core.runnables.base.RunnableSequence

In [44]:
# Use the chain
response = chain.invoke({"adjective":"rude","topic": "chefs"})
# Check out the response now. No longer an AIMessage datatype
print("Type of the response:", type(response))
response

Type of the response: <class 'str'>


"Why did the chef get fired? Because he couldn't take the heat... or the criticism from Gordon Ramsay!"

## Message Prompt Template
Here is the message format that you have seen before in the API. This allows you to call the model with some more specific context or instruction.
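
Conceptually, a message template is a list of (role, text) pairs in which some texts contain placeholders. A standard-library sketch of that idea, not LangChain's implementation; this `format_messages` function is a hypothetical stand-in:

```python
def format_messages(message_templates, **values):
    """Fill the placeholders in each (role, template) pair."""
    return [(role, template.format(**values)) for role, template in message_templates]

templates = [
    ("system", "You are a helpful programming assistant. Your name is {name}."),
    ("human", "{user_input}"),
]

for role, text in format_messages(templates, name="Clyde", user_input="What is your name?"):
    print(f"{role}: {text}")
```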

In [58]:
from langchain_core.prompts import ChatPromptTemplate

chat_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful programming assistant. Your name is Clyde."),
        ("human", "Hello, I'm Kurt. Can you help me with my programming tasks?"),
        ("ai", "Yes, I am ready, willing and able."),
        ("human", "{user_input}"),
    ]
)
chat_prompt.input_schema()

PromptInput(user_input=None)

In [59]:
messages = chat_prompt.format_messages( user_input="What is your name?")
messages

[SystemMessage(content='You are a helpful programming assistant. Your name is Clyde.'),
 HumanMessage(content="Hello, I'm Kurt. Can you help me with my programming tasks?"),
 AIMessage(content='Yes, I am ready, willing and able.'),
 HumanMessage(content='What is your name?')]

In [60]:
claude_llm.invoke(messages)

AIMessage(content="Hello Kurt, my name is Claude. It's nice to meet you! I'm an AI assistant created by Anthropic to help with all sorts of tasks, including programming. How can I assist you today?", response_metadata={'id': 'msg_01NtWwphrypBUWHHEUrbfaqi', 'model': 'claude-3-sonnet-20240229', 'stop_reason': 'end_turn', 'stop_sequence': None, 'usage': {'input_tokens': 57, 'output_tokens': 46}}, id='run-b61b5eb2-b8a0-4e9c-869b-b85e0c4a74d1-0')

In [63]:
chain = chat_prompt | claude_llm | output_parser

In [69]:
current_input = input("Your turn: ")
print("You entered: ", current_input)
messages = chat_prompt.format_messages( user_input=current_input)
messages

Your turn:  Help me create a for loop.


You entered:  Help me create a for loop.


[SystemMessage(content='You are a helpful programming assistant. Your name is Clyde.'),
 HumanMessage(content="Hello, I'm Kurt. Can you help me with my programming tasks?"),
 AIMessage(content='Yes, I am ready, willing and able.'),
 HumanMessage(content='Help me create a for loop.')]

In [72]:
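# The chain starts with chat_prompt, so pass the template variables as a dictionary
response = chain.invoke({"user_input": current_input})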
print(response)

Hello Kurt, certainly! I'd be happy to help you create a for loop. A for loop is used to iterate over a sequence (such as a list, tuple, or string) or other iterable objects. Here's a basic example in Python:

```python
# Iterating over a list
fruits = ["apple", "banana", "cherry"]
for fruit in fruits:
    print(fruit)

# Output:
# apple
# banana
# cherry
```

In this example, the loop will iterate over each item in the `fruits` list, and for each iteration, the variable `fruit` will take the value of the current item. The loop body (indented code block) will be executed for each iteration.

You can also use the `range()` function to iterate over a sequence of numbers:

```python
# Iterating over a range of numbers
for i in range(5):
    print(i)

# Output:
# 0
# 1
# 2
# 3
# 4
```

Here, the `range(5)` generates a sequence of numbers from 0 to 4 (up to, but not including, 5), and the loop iterates over each number, assigning it to the variable `i`.

You can customize the start, stop, a

In [192]:
from langchain_core.prompts import ChatPromptTemplate

# The system message now has its own variable, {name}
chat_template = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful AI bot. Your name is {name}."),
        ("human", "Hello, how are you doing?"),
        ("ai", "I'm doing well, thanks!"),
        ("human", "{user_input}"),
    ]
)

messages = chat_template.format_messages(name="Bob", user_input="What is your name?")

In [193]:
messages

[SystemMessage(content='You are a helpful AI bot. Your name is Bob.'),
 HumanMessage(content='Hello, how are you doing?'),
 AIMessage(content="I'm doing well, thanks!"),
 HumanMessage(content='What is your name?')]

In [198]:
chain = chat_template | openai_llm 

In [199]:
chain.invoke({"name":"Clyde","user_input": "What is your name?"})

AIMessage(content='My name is Clyde. How can I assist you today?', response_metadata={'token_usage': {'completion_tokens': 12, 'prompt_tokens': 50, 'total_tokens': 62}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_d9767fc5b9', 'finish_reason': 'stop', 'logprobs': None}, id='run-a69b6f81-1872-40ad-b1d6-8c4558ea5e80-0')

In [180]:
# Define the chain
chain = prompt | openai_llm

In [185]:
# Now we can use the chain by calling the invoke method with a dictionary of the parameters
chain.invoke({"adjective":"silly","topic": "monkeys"})

AIMessage(content='Why did the monkey like the banana?\n\nBecause it had appeal!', response_metadata={'token_usage': {'completion_tokens': 13, 'prompt_tokens': 15, 'total_tokens': 28}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_d9767fc5b9', 'finish_reason': 'stop', 'logprobs': None}, id='run-8fff555a-246c-4e22-81c6-98bcf9da26d7-0')

In [None]:
# The prompt has two variables, so invoke() needs a dictionary with both
response = chain.invoke({"adjective": "funny", "topic": "horses"})

## Parser

In [188]:
from langchain_core.output_parsers import StrOutputParser

output_parser = StrOutputParser()

In [189]:
chain = prompt | openai_llm | output_parser

In [190]:
chain.invoke({"adjective":"rude","topic": "chefs"})

"Why did the chef get kicked out of the kitchen? \n\nBecause he couldn't take the heat and kept throwing pans at everyone!"

In [144]:
from langchain.chains import LLMChain

# LLMChain is the older, class-based way to combine a prompt and a model
chain = LLMChain(llm=openai_llm, prompt=prompt)
# invoke() takes the inputs as a single dictionary, not as keyword arguments.
# chain.invoke(topic = "lawyers") raises:
#   TypeError: Chain.invoke() missing 1 required positional argument: 'input'
chain.invoke({"adjective": "funny", "topic": "lawyers"})

In [133]:
from langchain.prompts import PromptTemplate

prompt = PromptTemplate(
    input_variables=["product"],
    template="What is a good name for a company that makes {product}?",
)

print(prompt.format(product="podcast player"))

What is a good name for a company that makes podcast player?


In [65]:
from langchain.prompts import PromptTemplate

prompt_template = PromptTemplate.from_template(
    "Tell me a {adjective} joke about {content}."
)

a = 'funny'
c = 'dogs'

prompt_template.format(adjective = a, content = c)

'Tell me a funny joke about dogs.'

In [114]:
a = 'silly'
c = 'cats'
prompt_template.format(adjective= a, content= c)

'Tell me a silly joke about cats.'

In [None]:
# Any of the chat models defined above can be used here
chain = prompt_template | openai_llm

In [159]:
# The template has two variables, so pass both in a dictionary
response = chain.invoke({"adjective": "funny", "content": "horses"})

In [160]:
response.dict()['content']

"Why couldn't the pony sing a lullaby? \n\nBecause he was a little hoarse!"

In [131]:
# We can quickly call any of the models using the Prompt Template
a = 'odd'
c = 'Llamas'

print('Claude:', claude_llm.invoke(prompt_template.format(adjective = a, content = c)).dict()['content'])
print('\nOpenAI:', openai_llm.invoke(prompt_template.format(adjective = a, content = c)).dict()['content'])
print('\nHF Zephyr:', hf_llm.invoke(prompt_template.format(adjective = a, content = c)))

Claude: Why did the llama go to the barbershop? To get a llama cut!

OpenAI: Why did the llama go to therapy? Because it had too much drama in its herd!

HF Zephyr: Tell me a odd joke about Llamas.
I once knew an llama that could play the piano, but it was all uphill from there.
How do you keep an llama in suspense?
I'll tell you tomorrow.
Why did the llama join the circus?
To spit in the lion's face!
Why did the llama get married?
Because he found his cam-mo-le!
What do you call a llama that can't be managed?
A failama!
Why did the llama wear sunglasses to the party?
Because he heard it was a blinder!
Why did the llama go to the doctor?
Because he was feeling spitty!
Why did the llama wear a lifejacket to the party?
Because he heard it was a float!
Why did the llama wear a hard hat to the party?
Because he heard it was a construction!
Why did the llama wear a helmet to the party?
Because he heard it was a shell-ebration!
Why did the llama wear a wetsuit to the party?
Because he heard 

In [10]:
from langchain_core.prompts import ChatPromptTemplate
prompt = ChatPromptTemplate.from_messages([
    # Adjacent string literals are joined, so the long system message can span two lines
    ("system", "You are a world class technical documentation writer writing "
               "specifically for business master's degree students."),
    ("user", "{input}")
])

In [9]:
# Chain
# llm can be any of the chat models defined above; claude_llm is assumed here
llm = claude_llm
chain = prompt | llm 

In [10]:
# Invoke
myInput= {"input": "how can langsmith help with testing?"}
#response = chain.invoke({"input": "how can langsmith help with testing?"})
response = chain.invoke(myInput)

In [11]:
print(response.content)

Langsmith can be a valuable tool for testing software applications and systems. Here are some ways Langsmith can assist with testing:

1. **Test Case Generation**: Langsmith's natural language processing capabilities can be used to generate test cases from requirements documents, user stories, or other textual descriptions of the system's expected behavior. This can help automate the process of creating comprehensive test suites.

2. **Test Data Generation**: Langsmith can analyze the application's data requirements and generate realistic test data, including edge cases and boundary conditions. This can be particularly useful for testing data-intensive applications or scenarios where manually creating test data is time-consuming or error-prone.

3. **Test Automation**: Langsmith can be integrated with testing frameworks and tools to automate the execution of test cases. By understanding natural language instructions, Langsmith can translate high-level test scenarios into executable tes

In [12]:
from langchain_core.output_parsers import StrOutputParser

output_parser = StrOutputParser()

In [13]:
# Define the 3-step chain
chain = prompt | llm | output_parser

In [14]:
# Now the response is the result of the 3-steps
response = chain.invoke(myInput)

In [15]:
print(response)

LangSmith can be a valuable tool for testing in several ways:

1. **Test Case Generation**: LangSmith can be used to generate test cases automatically based on the requirements or specifications provided. It can analyze the input and generate a comprehensive set of test cases, covering various scenarios and edge cases.

2. **Test Data Generation**: LangSmith can generate realistic and diverse test data for testing purposes. This can be particularly useful when testing with large datasets or when dealing with complex data structures.

3. **Test Script Generation**: LangSmith can generate test scripts in various programming languages, such as Python, Java, or JavaScript. These scripts can be used to automate the testing process, reducing the need for manual testing and increasing efficiency.

4. **Natural Language Test Case Description**: LangSmith can understand natural language descriptions of test cases and convert them into executable test scripts or test case specifications. This ca