# LangChain Basics

By the end of this notebook, you should understand some basic concepts: large language models (LLMs), prompts, and memory (chat history). These are the building blocks for designing generative AI applications.

## LLM

One of the main reasons to use LangChain is that it provides a standard interface for interacting with many different LLMs (although it is not the only one).

The quickest way to try LangChain is with OpenAI, since you don't need a powerful GPU on your machine. Below is an example.

```python

from langchain.llms import OpenAI

# Default model is "text-davinci-003"
llm = OpenAI()

# or
llm = OpenAI(openai_api_key="sk-xxx", model_name="text-davinci-003", temperature=1.0, max_tokens=512)
```

Apart from OpenAI, LangChain also integrates with many other language models (see the full list at https://python.langchain.com/docs/integrations/llms/). On the AWS side, it supports Bedrock, SageMaker Endpoint, API Gateway, etc.

In this notebook, we will use an open-source language model (`meta-llama/Llama-2-7b-chat-hf`) instead. You will need to accept the Llama license and then download the model by providing a Hugging Face Hub token. See https://huggingface.co/meta-llama/Llama-2-7b-chat-hf for more details.

In [1]:
import torch
from transformers import pipeline
from langchain.llms import HuggingFacePipeline

pipe = pipeline(
    task="text-generation",
    model="meta-llama/Llama-2-7b-chat-hf",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    temperature=0.0, 
    max_new_tokens=1024,
    repetition_penalty=1.1,
)

llm = HuggingFacePipeline(pipeline=pipe)

Loading checkpoint shards: 100%|██████████| 2/2 [00:04<00:00,  2.33s/it]


The basic use of an LLM is text generation. Once you have defined `llm`, you can use it to generate new text.

In [2]:
llm.predict("Once upon a time")

# This is the same as
# llm("Once upon a time")

', there was a young woman named Sophia who lived in a small village nestled in the rolling hills of Tuscany. She had always been fascinated by the ancient art of alchemy, and spent many hours studying the dusty tomes in the local library.\nOne day, while out on a walk, Sophia stumbled upon an old, mysterious-looking book hidden away in a forgotten corner of the library. The cover was worn and faded, but as she opened it, she found that the pages were filled with strange symbols and illustrations that seemed to glow with an otherworldly light.\nIntrigued, Sophia took the book home and began to study it more closely. As she delved deeper into its secrets, she discovered that it contained the knowledge of a powerful alchemist who had lived centuries before.\nSophia spent many long days and nights pouring over the book, learning the intricacies of alchemy and practicing the techniques described within its pages. And as she did so, she began to notice a change within herself - her body was

OpenAI provides different models for general text generation (e.g. `text-davinci-003`) and chat generation (e.g. `gpt-3.5-turbo`). LangChain likewise has a concept of `Chat Models`.

You can try chat models out yourself.
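
To make the idea concrete without calling any API, here is a minimal plain-Python sketch of what a chat-style input looks like: a list of role-tagged messages instead of one string. The `Message` class and the `to_llama2_prompt` helper are hypothetical names used only for illustration, not LangChain classes, and the `[INST]`/`<<SYS>>` layout follows the prompt format used in the next section of this notebook.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    """A hypothetical role-tagged chat message (not a LangChain class)."""
    role: str      # "system" or "human"
    content: str


def to_llama2_prompt(messages: List[Message]) -> str:
    """Flatten a single-turn chat (system + human messages) into one text prompt."""
    system = "\n".join(m.content for m in messages if m.role == "system")
    human = "\n".join(m.content for m in messages if m.role == "human")
    return f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{human} [/INST] "


messages = [
    Message(role="system", content="You are a helpful, respectful and honest assistant."),
    Message(role="human", content="Translate this sentence into Chinese: I love programming."),
]
chat_prompt = to_llama2_prompt(messages)
print(chat_prompt)
```

A chat model integration does this kind of conversion for you, so the application code only deals with messages.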

## PromptTemplate and Chain

Let's try some instructions.

In [3]:
llm.predict("Translate this sentence into Chinese: I love programming.")

'\n\nHere is the translation of "I love programming" in Chinese:\n\n我爱编程。\n\nNote: The word "编程" (pinyin: biān chéng) means "programming" in Chinese.'

Llama 2 Chat is already fine-tuned to follow instructions. We can wrap the instruction in its predefined prompt format, as below.

In [4]:
prompt = """[INST] <<SYS>>
 You are a helpful, respectful and honest assistant.
 <</SYS>>

Translate this sentence into Chinese: I love programming.[/INST] """

In [5]:
llm.predict(prompt)

' Sure! Here\'s the translation of "I love programming" in Chinese:\n\n我爱编程 (wǒ ài biǎo chéng)\n\nNote: In Chinese, the word for "love" is 爱 (ài), and the word for "programming" is 编程 (biǎo chéng).'

Now, what if we want to run other instructions without passing the system prompt every time? Of course, we can use Python's `str.format()`.

In [6]:
text_prompt_template = """[INST] <<SYS>>
 You are a helpful, respectful and honest assistant.
 <</SYS>>

{instruction}[/INST] """
prompt = text_prompt_template.format(instruction="Translate this sentence into Chinese: I love programming.")
prompt

'[INST] <<SYS>>\n You are a helpful, respectful and honest assistant.\n <</SYS>>\n\nTranslate this sentence into Chinese: I love programming.[/INST] '

In [7]:
llm.predict(prompt)

' Sure! Here\'s the translation of "I love programming" in Chinese:\n\n我爱编程 (wǒ ài biǎo chéng)\n\nNote: In Chinese, the word for "love" is 爱 (ài), and the word for "programming" is 编程 (biǎo chéng).'

In LangChain, you can achieve the same by using `PromptTemplate` and chains. A `Chain` is, very generically, a sequence of calls to components, which can include other chains.

The basic chain is `LLMChain` which takes in a prompt template, formats it with the user input and returns the response from an LLM. 
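
Conceptually, such a chain does two things: fill in the template, then call the model. A rough plain-Python sketch of that idea follows. `MiniChain` and `fake_llm` are hypothetical stand-ins so the snippet runs without a model, and this is not how `LLMChain` is actually implemented.

```python
from typing import Callable


class MiniChain:
    """Hypothetical stand-in: format a template, then pass the result to an LLM."""

    def __init__(self, llm: Callable[[str], str], template: str):
        self.llm = llm
        self.template = template

    def run(self, **variables: str) -> str:
        prompt = self.template.format(**variables)  # step 1: fill in the template
        return self.llm(prompt)                     # step 2: call the model


def fake_llm(prompt: str) -> str:
    """Placeholder model that only reports the length of the prompt it received."""
    return f"(model received {len(prompt)} characters)"


template = (
    "[INST] <<SYS>>\n You are a helpful, respectful and honest assistant.\n <</SYS>>\n\n"
    "{instruction}[/INST] "
)
chain = MiniChain(llm=fake_llm, template=template)
print(chain.run(instruction="Translate this sentence into Chinese: I love programming."))
```

Swapping `fake_llm` for any callable that maps a prompt string to a response string keeps the rest of the sketch unchanged.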


In [8]:

from langchain import PromptTemplate, LLMChain

prompt_template = PromptTemplate(
    input_variables=["instruction"],
    template=text_prompt_template,
)
llm_chain = LLMChain(llm=llm, prompt=prompt_template)


In [9]:
llm_chain.run(instruction="Translate this sentence into Chinese: I love programming.")

' Sure! Here\'s the translation of "I love programming" in Chinese:\n\n我爱编程 (wǒ ài biǎo chéng)\n\nNote: In Chinese, the word for "love" is 爱 (ài), and the word for "programming" is 编程 (biǎo chéng).'

There are some other ways of using the chain:

```python
# Use llm_chain.predict(...)
llm_chain.predict(instruction="Translate this sentence into Chinese: I love programming.")

# Or
llm_chain.run({"instruction": "Translate this sentence into Chinese: I love programming."})

# Or 
llm_chain(inputs={"instruction": "Translate this sentence into Chinese: I love programming."})

# If it only has one parameter, you can simply use:
llm_chain.run("Translate this sentence into Chinese: I love programming.")
```

In [10]:
#  We can check the prompt template of a chain. This is useful as sometimes the prompt template is not defined by us.
llm_chain.prompt

PromptTemplate(input_variables=['instruction'], output_parser=None, partial_variables={}, template='[INST] <<SYS>>\n You are a helpful, respectful and honest assistant.\n <</SYS>>\n\n{instruction}[/INST] ', template_format='f-string', validate_template=True)
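
The `input_variables` in this output are simply the placeholder names found in the f-string-style template. As a rough illustration using only the standard library (this is not LangChain's own code), the same names can be listed with `string.Formatter`:

```python
from string import Formatter

template = (
    "[INST] <<SYS>>\n You are a helpful, respectful and honest assistant.\n <</SYS>>\n\n"
    "{instruction}[/INST] "
)

# Formatter.parse yields (literal_text, field_name, format_spec, conversion) tuples;
# field_name is None for a trailing piece of literal text with no placeholder.
input_variables = sorted({name for _, name, _, _ in Formatter().parse(template) if name})
print(input_variables)
```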

You can add `verbose=True` to show more details (useful for debugging).

In [11]:
llm_chain = LLMChain(llm=llm, prompt=prompt_template, verbose=True)

In [12]:
llm_chain.run("Translate this sentence into Chinese: I love programming.")



> Entering new LLMChain chain...
Prompt after formatting:
[INST] <<SYS>>
 You are a helpful, respectful and honest assistant.
 <</SYS>>

Translate this sentence into Chinese: I love programming.[/INST]

> Finished chain.


' Sure! Here\'s the translation of "I love programming" in Chinese:\n\n我爱编程 (wǒ ài biǎo chéng)\n\nNote: In Chinese, the word for "love" is 爱 (ài), and the word for "programming" is 编程 (biǎo chéng).'

## Memory

Now let's try to translate that into a different language, but this time without providing the sentence again.

In [13]:
llm_chain.run("Translate the sentence into French.")



> Entering new LLMChain chain...
Prompt after formatting:
[INST] <<SYS>>
 You are a helpful, respectful and honest assistant.
 <</SYS>>

Translate the sentence into French.[/INST]

> Finished chain.


' Of course! Here is the translation of the sentence "You are a helpful, respectful and honest assistant" into French:\n\nVous êtes un assistante utile, respectueuse et honnête.'

Clearly it doesn't know which `sentence` we are referring to, because it has no access to the previous conversation (chat history). To resolve this, we will add memory to the chain (or LLM).

In [14]:
text_prompt_template_with_history = """[INST] <<SYS>>
 You are a helpful, respectful and honest assistant.
 <</SYS>>

Previous conversation:
{chat_history}

{instruction}[/INST] """

In [15]:
from langchain.memory import ConversationBufferMemory

prompt = PromptTemplate.from_template(text_prompt_template_with_history)
# Note: `memory_key` must match the `{chat_history}` variable in the prompt template
memory = ConversationBufferMemory(memory_key="chat_history")
llm_chain_with_memory = LLMChain(
    llm=llm,
    prompt=prompt,
    memory=memory,
    verbose=True,
)

In [16]:
llm_chain_with_memory.run("Translate this sentence into Chinese: I love programming.")



> Entering new LLMChain chain...
Prompt after formatting:
[INST] <<SYS>>
 You are a helpful, respectful and honest assistant.
 <</SYS>>

Previous conversation:


Translate this sentence into Chinese: I love programming.[/INST]

> Finished chain.


' Of course! Here\'s the translation of "I love programming" in Chinese:\n\n我爱编程 (wǒ ài biǎo chéng)\n\nI hope that helps! Let me know if you have any other questions or if there\'s anything else I can assist you with.'

In [17]:
llm_chain_with_memory.run("Translate the sentence into French.")



> Entering new LLMChain chain...
Prompt after formatting:
[INST] <<SYS>>
 You are a helpful, respectful and honest assistant.
 <</SYS>>

Previous conversation:
Human: Translate this sentence into Chinese: I love programming.
AI:  Of course! Here's the translation of "I love programming" in Chinese:

我爱编程 (wǒ ài biǎo chéng)

I hope that helps! Let me know if you have any other questions or if there's anything else I can assist you with.

Translate the sentence into French.[/INST]

> Finished chain.


' Of course! Here\'s the translation of "I love programming" in French:\nJe suis amoureux de la programmation.\nI hope that helps! Let me know if you have any other questions or if there\'s anything else I can assist you with.'

We can see that the chat history is added to the prompt for new instructions.
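
The mechanism can be pictured with a small plain-Python sketch: store each exchange, render the stored exchanges as text, and substitute that text into the history placeholder. `MiniBufferMemory` is a hypothetical class for illustration and the template is shortened; it is not LangChain's implementation.

```python
class MiniBufferMemory:
    """Hypothetical sketch: keep every exchange and render it as plain text."""

    def __init__(self):
        self.turns = []  # list of (human_text, ai_text) pairs

    def save(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    def render(self) -> str:
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


template = "Previous conversation:\n{chat_history}\n\n{instruction}"
memory = MiniBufferMemory()

# First turn: the history is still empty.
first_instruction = "Translate this sentence into Chinese: I love programming."
first_prompt = template.format(chat_history=memory.render(), instruction=first_instruction)
memory.save(first_instruction, "我爱编程")

# Second turn: the first exchange is now part of the prompt.
second_prompt = template.format(
    chat_history=memory.render(),
    instruction="Translate the sentence into French.",
)
print(second_prompt)
```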

`ConversationBufferMemory` simply puts the entire history into the prompt, and large language models always have a limited context length; for Llama 2, it is only 4096 tokens. LangChain provides many other types of memory.

For example, if we only want to keep the most recent exchanges in the prompt, we can use `ConversationBufferWindowMemory`, such as `memory=ConversationBufferWindowMemory(k=2)` to keep only the last two exchanges. You can try more yourself.
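
The windowing idea itself is easy to sketch in plain Python with a bounded `deque`. `MiniWindowMemory` below is a hypothetical illustration of "keep only the last k exchanges", not the LangChain class.

```python
from collections import deque


class MiniWindowMemory:
    """Hypothetical sketch: remember only the last k (human, ai) exchanges."""

    def __init__(self, k: int):
        self.turns = deque(maxlen=k)  # older exchanges are dropped automatically

    def save(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    def render(self) -> str:
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


window = MiniWindowMemory(k=2)
for i in range(1, 5):
    window.save(f"question {i}", f"answer {i}")

# Only exchanges 3 and 4 remain in the rendered history.
print(window.render())
```

The trade-off is that anything older than the window is forgotten, so a follow-up that refers to an early exchange can fail in the same way as the chain without memory.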


If you don't want to define the prompt template yourself, here is another chain you can use:

In [18]:
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

conversation = ConversationChain(
    llm=llm,
    memory=ConversationBufferMemory()
)

In [19]:
# check the prompt template it used
conversation.prompt

PromptTemplate(input_variables=['history', 'input'], output_parser=None, partial_variables={}, template='The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.\n\nCurrent conversation:\n{history}\nHuman: {input}\nAI:', template_format='f-string', validate_template=True)
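
To see what this default template produces, the sketch below fills it in with plain `str.format()`. The template text is copied from the output above; the `history` string is an illustrative value based on the earlier translation example. Note that this template uses plain `Human:`/`AI:` labels rather than the Llama 2 `[INST]` markers.

```python
# Template text copied from the `conversation.prompt` output.
default_template = (
    "The following is a friendly conversation between a human and an AI. "
    "The AI is talkative and provides lots of specific details from its context. "
    "If the AI does not know the answer to a question, it truthfully says it does not know."
    "\n\nCurrent conversation:\n{history}\nHuman: {input}\nAI:"
)

# Illustrative history in the "Human:/AI:" style used by the buffer memory.
history = "Human: Translate this sentence into Chinese: I love programming.\nAI: 我爱编程"

filled = default_template.format(history=history, input="Translate the sentence into French.")
print(filled)
```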

## Context

Llama 2 is pretrained on a lot of data, but that doesn't mean it knows everything. Let's try asking a question about a new AWS product: Amazon Bedrock.

In [20]:
text_prompt_template = """[INST] <<SYS>>
 You are a helpful, respectful and honest assistant. If you don't know the answer to a question, please don't share false information.
 <</SYS>>

{question} [/INST] """

prompt_template = PromptTemplate(
    template=text_prompt_template, input_variables=["question"]
)

llm_chain = LLMChain(llm=llm, prompt=prompt_template)


In [21]:
llm_chain.run(question="What is Bedrock?")

' Thank you for asking! I\'m happy to help. However, I must inform you that "Bedrock" is not a real place or location. It is actually a fictional setting from the classic cartoon show "The Flintstones," which was set in the prehistoric town of Bedrock. The show was created in the 1960s and featured a Stone Age family called the Flintstones, who lived in the town of Bedrock with their friends and neighbors. So, while Bedrock may not be a real place, it has become a cultural reference point and a beloved part of popular culture. Is there anything else I can help you with?'

One way to resolve this is to add more context to the prompt.

In [22]:
text_prompt_template_with_context = """[INST] <<SYS>>
 You are a helpful, respectful and honest assistant. If you don't know the answer to a question, please don't share false information. 
 <</SYS>>

Here is the context
{context}

{question} [/INST] """

prompt_template = PromptTemplate(
    template=text_prompt_template_with_context, input_variables=["context", "question"]
)

llm_chain_with_context = LLMChain(llm=llm, prompt=prompt_template)

In [23]:
# Copy the information from https://aws.amazon.com/bedrock/?nc1=h_ls as context

context = "Amazon Bedrock is a fully managed service that makes foundation models (FMs) from Amazon and leading AI startups available through an API, so you can choose from various FMs to find the model that's best suited for your use case. With the Amazon Bedrock serverless experience, you can quickly get started, easily experiment with FMs, privately customize FMs with your own data, and seamlessly integrate and deploy them into your applications using AWS tools and capabilities. Agents for Amazon Bedrock are fully managed and make it easier for developers to create generative-AI applications that can deliver up-to-date answers based on proprietary knowledge sources and complete tasks for a wide range of use cases."

llm_chain_with_context.run(context=context, question="What is Bedrock?")



" Hello! I'm here to help you understand what Bedrock is. Based on the context provided, Bedrock appears to be a fully managed service offered by Amazon that provides access to foundation models (FMs) from Amazon and leading AI startups through an API. This means that developers can choose from various FMs to find the one that best suits their needs and use case.\nWith Bedrock, developers can quickly get started, easily experiment with FMs, privately customize FMs with their own data, and seamlessly integrate and deploy them into their applications using AWS tools and capabilities. Additionally, agents for Bedrock are fully managed, making it easier for developers to create generative-AI applications that can deliver up-to-date answers based on proprietary knowledge sources and complete tasks for a wide range of use cases.\nIn summary, Bedrock is a service offered by Amazon that provides easy access to foundation models for generative-AI applications, allowing developers to quickly and

That sounds about right, because all the needed information was added to the prompt. But let's ask a different question.

In [24]:
llm_chain_with_context.run(context=context, question="What FMs are supported by bedrock?")

' As a responsible and honest assistant, I must inform you that I cannot provide a definitive list of FMs supported by Amazon Bedrock as the information is not publicly available. The Amazon Bedrock website does not mention any specific FMs that are supported, and there is no official documentation or announcement regarding the matter.\nHowever, based on the description provided in the context, it seems that Amazon Bedrock offers a variety of foundation models (FMs) from Amazon and leading AI startups through its API. These FMs may be tailored to different use cases and can be easily integrated and deployed into applications using AWS tools and capabilities.\nIf you have any further questions or need more information, please feel free to ask!'

As you can see, the context we provided is not enough to answer this question. However, we cannot predict what the questions will be, and we also can't put all the information into the prompt as context.

To resolve this issue, there are two other approaches:
 
1. Retrieval Augmented Generation
2. Using Agent/Tools
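
The first option can be pictured as: pick the snippets most relevant to the question and put only those into `{context}`. Below is a deliberately naive sketch that ranks snippets by the number of words they share with the question. Real systems typically use embeddings and a vector store instead; the `tokenize` and `retrieve` helpers are hypothetical names, not a LangChain API, and the snippets are shortened from the Bedrock context used earlier.

```python
import re

# Snippets shortened from the Bedrock context used earlier in this notebook.
documents = [
    "Amazon Bedrock is a fully managed service that makes foundation models (FMs) "
    "from Amazon and leading AI startups available through an API.",
    "With the Amazon Bedrock serverless experience, you can quickly get started "
    "and easily experiment with FMs.",
    "Agents for Amazon Bedrock are fully managed and make it easier for developers "
    "to create generative-AI applications.",
]


def tokenize(text: str) -> set:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, docs: list, top_k: int = 1) -> list:
    """Rank documents by how many words they share with the question (toy scoring)."""
    q_words = tokenize(question)
    ranked = sorted(docs, key=lambda d: len(q_words & tokenize(d)), reverse=True)
    return ranked[:top_k]


question = "What do Agents for Amazon Bedrock make easier for developers?"
context = "\n".join(retrieve(question, documents))
prompt = f"Here is the context\n{context}\n\n{question}"
print(prompt)
```

Only the retrieved snippet enters the prompt, so the context stays small no matter how many documents are available.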

Check out the next notebooks for more details.
