In [1]:
from decouple import AutoConfig
config = AutoConfig(search_path='./../.env')
import os

os.environ["AZURE_OPENAI_API_KEY"] = config('OPENAI_API_KEY')
os.environ["AZURE_OPENAI_ENDPOINT"] = config('AZURE_ENDPOINT')


## Models

### Vanilla Large Language Models (LLMs)

LLMs are designed primarily to generate contextually relevant text; their core tasks are text generation, text completion, and language understanding. They are pre-trained on a diverse corpus, which lets them capture the linguistic patterns needed for language understanding. They are widely used for downstream tasks such as translation, summarization, and task- or domain-specific fine-tuning.

Some prominent examples:
- GPT-3
- llama, llama-2, llama-3

#### OpenAI Model (Azure endpoint)

In [2]:
from langchain_openai import AzureOpenAI

temp = 0.3
max_tokens = 1024
llm = AzureOpenAI(
    deployment_name="gpt-3",
    model_name="text-davinci-003",
    api_version = "2022-12-01",
    temperature=temp,
    max_tokens=max_tokens
)

In [3]:
print(llm.invoke("What is the meaning of life in 10 words?"))



Living with purpose, love, and joy.


### Chat or Instruction tuned Models

Chat or instruction-tuned models are designed specifically to follow user instructions or to hold a conversation with the user. They are LLMs that have been further fine-tuned on instruction or dialogue datasets. Their main focus is to understand the context of a user query and respond accordingly. They are widely used for question answering, chatbots, dialogue systems, etc.

Some prominent examples:
- GPT-3.5-turbo, GPT-4
- llama-chat models
- claude-2

In langchain, a chat model is a language model that uses chat messages as inputs and returns chat messages as outputs.

##### Passing a user message to the model through `HumanMessage`

In [4]:
from langchain_core.messages import HumanMessage
message = [HumanMessage("What is the meaning of life in 10 words?")]

#### OpenAI models (Azure endpoints)

In [5]:
from langchain_openai import AzureChatOpenAI

In [8]:
chat_llm = AzureChatOpenAI(
    openai_api_version="2023-03-15-preview",
    azure_deployment="gpt-35-turbo-0613",
    temperature=temp,
    max_tokens=max_tokens
)

`invoke()` calls the model (or chain) on an input and returns the complete response.

In [9]:
print(chat_llm.invoke(message))

content='The meaning of life is subjective and varies for individuals.'


`stream()` streams back chunks of the response as they are generated.

In [10]:
for chunk in chat_llm.stream(message):
    print(chunk.content, end="", flush=True)

The meaning of life is subjective and open to interpretation.
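
The consumption pattern used with `stream()` can be tried without any API access. The sketch below is a plain-Python illustration; `fake_stream` is a hypothetical stand-in for a streaming model and not part of langchain. It yields the reply piece by piece, and the loop prints each piece as it arrives while also collecting the full text.

```python
from typing import Iterator


def fake_stream(text: str, chunk_size: int = 8) -> Iterator[str]:
    """Hypothetical stand-in for a streaming model: yields the reply in small pieces."""
    for start in range(0, len(text), chunk_size):
        yield text[start:start + chunk_size]


chunks = []
for piece in fake_stream("The meaning of life is subjective and open to interpretation."):
    print(piece, end="", flush=True)  # show each piece as soon as it arrives
    chunks.append(piece)
print()

# Joining the pieces reconstructs the complete reply.
full_reply = "".join(chunks)
```

With a real chat model the pieces are message chunks, so the loop reads `chunk.content` instead of using the chunk directly.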

In [13]:
chat_llm_gpt4 = AzureChatOpenAI(
    openai_api_version="2023-03-15-preview",
    azure_deployment="gpt-4-32k",
    temperature=temp,
    max_tokens=max_tokens
)

In [14]:
print(chat_llm_gpt4.invoke(message))

content='To learn, love, grow, contribute, experience, enjoy, understand, adapt, create, evolve.'


P.S.: The LLM returns a string, while the ChatModel returns a message.
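
This difference matters when the result is post-processed. The following sketch uses two hypothetical stand-in functions (no API calls; `Message`, `fake_llm`, and `fake_chat_model` are made-up names for illustration) to contrast the two return shapes.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """Hypothetical, simplified chat message: a role plus text content."""
    role: str
    content: str


def fake_llm(prompt_text: str) -> str:
    # Completion-style model: text in, text out.
    return "Living with purpose, love, and joy."


def fake_chat_model(messages: list[Message]) -> Message:
    # Chat-style model: a list of messages in, one assistant message out.
    return Message(role="ai", content="The meaning of life is subjective.")


question = "What is the meaning of life in 10 words?"
text_reply = fake_llm(question)
message_reply = fake_chat_model([Message("human", question)])

print(text_reply)             # already a plain string
print(message_reply.content)  # the text lives in the message's content attribute
```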

## Prompts and Prompt Templates

A **prompt** is an instruction or a query that is passed to the LLM. It can also carry additional details such as context, input data, or examples.

A **prompt template** is a wrapper around the user prompt that adds information specific to the model and the task. Because a template can contain placeholders, the same prompt can be reused with different user inputs.
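
To make the placeholder idea concrete, here is a minimal plain-Python sketch of what a string prompt template does. It is an illustration only, not langchain's implementation; `SimplePromptTemplate` and its attributes are hypothetical names.

```python
from string import Formatter


class SimplePromptTemplate:
    """Hypothetical stand-in for a string prompt template (illustration only)."""

    def __init__(self, template: str):
        self.template = template
        # Collect the placeholder names that appear inside {...}.
        self.input_variables = sorted(
            {name for _, name, _, _ in Formatter().parse(template) if name}
        )

    def format(self, **kwargs) -> str:
        # Fail early with a clear message if a placeholder has no value.
        missing = [v for v in self.input_variables if v not in kwargs]
        if missing:
            raise KeyError(f"missing template variables: {missing}")
        return self.template.format(**kwargs)


sketch_template = SimplePromptTemplate(
    "What is the meaning of life in less than {num_of_words} words {style}?"
)
filled_prompt = sketch_template.format(num_of_words=10, style="in a funny way")
print(sketch_template.input_variables)
print(filled_prompt)
```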

### PromptTemplate

`PromptTemplate` is used to create a template for a string prompt.

Important Functions:
- `PromptTemplate.from_template()` creates a prompt template from a template string.
- `PromptTemplate.format()` fills the template with user input and returns the formatted prompt as a string.

Reference: [langchain PromptTemplate](https://python.langchain.com/docs/modules/model_io/prompts/quick_start/#prompttemplate)

In [15]:
from langchain_core.prompts import PromptTemplate

In [16]:
prompt = PromptTemplate.from_template("What is the meaning of life in less than {num_of_words} words {style}?")
print(prompt.format(num_of_words=100, style=""))

What is the meaning of life in less than 100 words ?


In [17]:
prompt

PromptTemplate(input_variables=['num_of_words', 'style'], template='What is the meaning of life in less than {num_of_words} words {style}?')

In [18]:
print(llm.invoke(prompt.format(num_of_words=10, style="")))



Live with purpose and joy.


In [19]:
print(llm.invoke(prompt.format(num_of_words=50, style="")))



The meaning of life is to find joy and purpose in living, and to make a positive impact on the world.


In [20]:
print(llm.invoke(prompt.format(num_of_words=50, style="in a funny way")))



To find out what makes you laugh, then do it as much as possible!


### ChatPromptTemplate

`ChatPromptTemplate`, the prompt template for chat models, is a list of chat message templates. Each message template specifies how to format one chat message: its role and its content.

Important Classes:
- `SystemMessagePromptTemplate`
- `SystemMessage`
- `HumanMessagePromptTemplate`
- `HumanMessage`

Important Functions:
- `ChatPromptTemplate.from_messages()` creates a chat prompt template from a variety of message formats. It is the most common way to define a chat template.
- `ChatPromptTemplate.format_messages()` fills the template with user input and returns a list of finalized messages.

Reference: 
- [langchain ChatPromptTemplate](https://python.langchain.com/docs/modules/model_io/prompts/quick_start/#chatprompttemplate)
- [OpenAI ChatCompletion](https://platform.openai.com/docs/guides/text-generation/chat-completions-api)
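
The idea of a list of per-role message templates can be sketched in plain Python. This is an illustration under assumptions, not langchain's implementation: `format_chat_messages` is a hypothetical helper, and the `role`/`content` dictionaries only mimic the general shape of chat-style messages.

```python
def format_chat_messages(message_templates, **kwargs):
    """Fill each (role, template) pair and return a list of role-tagged messages.

    Hypothetical helper for illustration only.
    """
    return [
        {"role": role, "content": template.format(**kwargs)}
        for role, template in message_templates
    ]


translation_templates = [
    ("system", "You are a helpful assistant that translates {input_language} to {output_language}."),
    ("human", "{text}"),
]

formatted_messages = format_chat_messages(
    translation_templates,
    input_language="English",
    output_language="Hindi",
    text="Good morning!",
)
for formatted in formatted_messages:
    print(formatted)
```

Each template only uses the variables it mentions, so one set of inputs can fill the whole list.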

In [21]:
from langchain_core.prompts.chat import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template("What is the meaning of life in less than {num_of_words} words {style}?")
message = prompt.format(num_of_words=50, style="in a funny way")

In [22]:
print(message)
print(type(message))

Human: What is the meaning of life in less than 50 words in a funny way?
<class 'str'>


By default, a template created with `from_template()` becomes a `HumanMessage`, which represents the user's instruction.

In [28]:
template = "You are a helpful assistant that translates {input_language} to {output_language}."
human_template = "{text}"

chat_prompt = ChatPromptTemplate.from_messages([
    ("system", template),
    ("human", human_template),
])

chat_message = chat_prompt.format_messages(input_language="English", 
                            output_language="Hindi", 
                            text="The meaning of life is to find joy and purpose in living, and to make a positive impact on the world.")

In [29]:
print(chat_message)
print(type(chat_message))
for msg in chat_message:
    print(msg, type(msg))

[SystemMessage(content='You are a helpful assistant that translates English to Hindi.'), HumanMessage(content='The meaning of life is to find joy and purpose in living, and to make a positive impact on the world.')]
<class 'list'>
content='You are a helpful assistant that translates English to Hindi.' <class 'langchain_core.messages.system.SystemMessage'>
content='The meaning of life is to find joy and purpose in living, and to make a positive impact on the world.' <class 'langchain_core.messages.human.HumanMessage'>


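##### Using Messages

Message objects such as `SystemMessage` and `HumanMessage` are inserted into the prompt as-is: their placeholders are not filled in, as the output of the next cells shows. Use `(role, template)` tuples or the message prompt template classes when variables need to be substituted.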

In [31]:
from langchain_core.messages.system import SystemMessage
from langchain_core.messages.human import HumanMessage

template = "You are a helpful assistant that translates {input_language} to {output_language}."
human_template = "{text}"

chat_prompt = ChatPromptTemplate.from_messages([
    SystemMessage(template),
    HumanMessage(human_template),
])

chat_message = chat_prompt.format_messages(input_language="English", 
                            output_language="Hindi", 
                            text="The meaning of life is to find joy and purpose in living, and to make a positive impact on the world.")

In [32]:
print(chat_message)
print(type(chat_message))
for msg in chat_message:
    print(msg, type(msg))

[SystemMessage(content='You are a helpful assistant that translates {input_language} to {output_language}.'), HumanMessage(content='{text}')]
<class 'list'>
content='You are a helpful assistant that translates {input_language} to {output_language}.' <class 'langchain_core.messages.system.SystemMessage'>
content='{text}' <class 'langchain_core.messages.human.HumanMessage'>


In [30]:
print(chat_llm.invoke(chat_prompt.format_messages(input_language="English", 
                            output_language="Hindi", 
                            text="The meaning of life is to find joy and purpose in living, and to make a positive impact on the world.")))

content='जीवन का अर्थ है खुशी और जीने के उद्देश्य को ढूंढना, और दुनिया पर सकारात्मक प्रभाव डालना। (Jeevan ka arth hai khushi aur jeene ke uddeshya ko dhoondhna, aur duniya par sakaaratmak prabhaav daalna.)'


#### Using Placeholder

In [63]:
from langchain_core.prompts import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    MessagesPlaceholder
)

human_template = "Summarise the converstion in {word_count} words."
human_message_template = HumanMessagePromptTemplate.from_template(human_template)
print(human_message_template)

chat_prompt = ChatPromptTemplate.from_messages(
    [MessagesPlaceholder(variable_name="conversation"), human_message_template]
)
print(chat_prompt)

prompt=PromptTemplate(input_variables=['word_count'], template='Summarise the converstion in {word_count} words.')
input_variables=['conversation', 'word_count'] input_types={'conversation': typing.List[typing.Union[langchain_core.messages.ai.AIMessage, langchain_core.messages.human.HumanMessage, langchain_core.messages.chat.ChatMessage, langchain_core.messages.system.SystemMessage, langchain_core.messages.function.FunctionMessage, langchain_core.messages.tool.ToolMessage]]} messages=[MessagesPlaceholder(variable_name='conversation'), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['word_count'], template='Summarise the converstion in {word_count} words.'))]


In [69]:
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage
system_message = SystemMessage(content="You are a smart AI assistant.")
human_message = HumanMessage(content="What is the meaning of life in less than 20 words?")
ai_message = AIMessage(
    content="""The meaning of life is to find joy and purpose in living, and to make a positive impact on the world."""
)

chat_message = chat_prompt.format_prompt(
    conversation=[system_message, human_message, ai_message], word_count=20,
)
print(chat_message)

messages=[SystemMessage(content='You are a smart AI assistant.'), HumanMessage(content='What is the meaning of life in less than 20 words?'), AIMessage(content='The meaning of life is to find joy and purpose in living, and to make a positive impact on the world.'), HumanMessage(content='Summarise the converstion in 20 words.')]


In [70]:
print(chat_llm_gpt4.invoke(chat_message))

content='You asked for the meaning of life in less than 20 words and I provided a succinct, purpose-driven definition.'
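
What the placeholder does can be sketched in plain Python: a marker in the prompt layout is replaced by the full list of earlier messages, while the remaining templates are filled as usual. This is a simplified illustration, not langchain's implementation; `PLACEHOLDER` and `build_messages` are hypothetical names.

```python
PLACEHOLDER = object()  # hypothetical marker meaning "insert the conversation here"


def build_messages(layout, conversation, **kwargs):
    """Expand the placeholder with prior messages and fill the remaining templates."""
    built = []
    for item in layout:
        if item is PLACEHOLDER:
            built.extend(conversation)  # splice in the history unchanged
        else:
            role, template = item
            built.append((role, template.format(**kwargs)))
    return built


past_conversation = [
    ("system", "You are a smart AI assistant."),
    ("human", "What is the meaning of life in less than 20 words?"),
    ("ai", "To find joy and purpose, and to make a positive impact."),
]
summary_layout = [PLACEHOLDER, ("human", "Summarise the conversation in {word_count} words.")]

built_messages = build_messages(summary_layout, past_conversation, word_count=20)
for role, content in built_messages:
    print(f"{role}: {content}")
```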


In [62]:
from langchain.chains import LLMChain

chain = LLMChain(
    prompt=chat_prompt,
    llm=chat_llm,
    verbose=True)
chain.predict(conversation=[system_message, human_message, ai_message], word_count=20)



> Entering new LLMChain chain...
Prompt after formatting:
System: You are a smart AI assistant.
Human: What is the meaning of life in less than 20 words?
AI: The meaning of life is to find joy and purpose in living, and to make a positive impact on the world.
Human: Summarise the converstion in 20 words.

> Finished chain.


'Discussion on the meaning of life: finding joy, purpose, and making a positive impact on the world.'

For more examples, see the [langchain docs](https://api.python.langchain.com/en/latest/prompts/langchain_core.prompts.chat.ChatPromptTemplate.html) and the [langchain tutorials](https://python.langchain.com/docs/modules/model_io/prompts/quick_start/).

## Output Parser

`OutputParsers` convert the raw output of a language model into a format that can be used downstream.
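
As a rough picture of what such a conversion involves, the sketch below turns raw model text into a Python dictionary using only the standard library. It is an illustration, not langchain's parser; `parse_json_output` is a hypothetical helper, and the handling of a Markdown code fence around the JSON is an assumption about how a model might wrap its answer.

```python
import json
import re

FENCE = "`" * 3  # Markdown code-fence marker
FENCED_BLOCK = re.compile(
    re.escape(FENCE) + r"(?:json)?\s*(.*?)" + re.escape(FENCE), re.DOTALL
)


def parse_json_output(raw: str) -> dict:
    """Extract a JSON object from model text, tolerating a surrounding code fence."""
    match = FENCED_BLOCK.search(raw)
    payload = match.group(1) if match else raw
    parsed = json.loads(payload.strip())
    if not isinstance(parsed, dict):
        raise ValueError("expected a JSON object at the top level")
    return parsed


raw_output = (
    FENCE
    + 'json\n{"Question": "What is the meaning of life?", "Answer": "It varies."}\n'
    + FENCE
)
parsed_output = parse_json_output(raw_output)
print(parsed_output["Answer"])
```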

#### PydanticOutputParser

In [None]:
from langchain.output_parsers import PydanticOutputParser
from langchain_core.pydantic_v1 import BaseModel, Field

class Answer(BaseModel):
    thought: str = Field(description="answer with thought.")

parser = PydanticOutputParser(pydantic_object=Answer)

prompt = PromptTemplate(
    template="Answer the user query in less than {word_count} words.\n\n{format_instructions}\n\n{query}\n",
    input_variables=["word_count", "query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

response = chat_llm.invoke(prompt.format(word_count=20, query="What is the meaning of life?"))
print(response)

In [None]:
print(parser.invoke(response))

In [None]:
print(parser.invoke(response).thought)

#### Built-In Parsers

##### JsonOutputParser

In [53]:
from langchain_core.output_parsers import JsonOutputParser

parser = JsonOutputParser()

prompt = PromptTemplate(
    template="""Return the response in JSON format with keys Question and Answer by answering the user query in less than {word_count} words.\n\n{format_instructions}\n\n{query}\n""",
    input_variables=["word_count", "query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

response = chat_llm.invoke(prompt.format(word_count=20, query="What is the meaning of life?"))
print(response)

content='{\n  "Question": "What is the meaning of life?",\n  "Answer": "The meaning of life is subjective and can vary for each individual."\n}'


In [54]:
print(type(parser.invoke(response)))
print(parser.invoke(response))

<class 'dict'>
{'Question': 'What is the meaning of life?', 'Answer': 'The meaning of life is subjective and can vary for each individual.'}


In [55]:
print(parser.invoke(response)['Question'])

What is the meaning of life?


In [56]:
print(parser.invoke(response)['Answer'])

The meaning of life is subjective and can vary for each individual.


## Chains

Chains (for example `LLMChain`) are a concept native to langchain. A chain is a set of connected components that work together to generate an output for a given input. The simplest chain is a combination of a **prompt (instruction + user input)** and an **LLM**. It can be extended with other components, such as retrievers, input pre-processing, and output post-processing.

Reference: [langchain docs Chains](https://python.langchain.com/docs/modules/chains/)

### LLMChain

In [57]:
chat_llm

AzureChatOpenAI(client=<openai.resources.chat.completions.Completions object at 0x11a036890>, async_client=<openai.resources.chat.completions.AsyncCompletions object at 0x11a037b20>, temperature=0.3, openai_api_key='<redacted>', openai_proxy='', max_tokens=1024, azure_endpoint='https://gis-enlt-sbx-instance.openai.azure.com/', deployment_name='gpt-35-turbo-0613', openai_api_version='2023-03-15-preview', openai_api_type='azure')

In [58]:
prompt

PromptTemplate(input_variables=['query', 'word_count'], partial_variables={'format_instructions': 'Return a JSON object.'}, template='Return the response in JSON format with keys Question and Answer by answering the user query in less than {word_count} words.\n\n{format_instructions}\n\n{query}\n')

In [59]:
from langchain.chains import LLMChain

chain = LLMChain(
    prompt=prompt,
    llm=chat_llm
)
query = "What is the impact of LLMs on NLP?"
word_count = 50

response = chain.predict(
    query=query,
    word_count=word_count
)

In [None]:
print(response)

P.S.: `LLMChain` has been deprecated in newer versions following the introduction of **LCEL chains**.

#### LCEL Chain

In [None]:
lcel_chain = prompt | chat_llm
response = lcel_chain.invoke(
    {
        "query":query,
        "word_count":word_count
    }
)

In [None]:
print(response.content)
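
The `|` syntax is ordinary Python operator overloading. The sketch below shows the general idea with a hypothetical `Step` class; it is a simplified illustration and not how langchain implements its runnables. The "model" is a stand-in function, so no API is contacted.

```python
class Step:
    """Hypothetical minimal 'runnable': wraps a function and supports `|` composition."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # `a | b` builds a new step that runs a first, then feeds its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


prompt_step = Step(
    lambda inputs: "Answer in less than {word_count} words: {query}".format(**inputs)
)
# Stand-in for a model call: it only reports how long the prompt was.
fake_model_step = Step(lambda prompt_text: f"(model saw {len(prompt_text)} characters)")
upper_step = Step(str.upper)

pipeline = prompt_step | fake_model_step | upper_step
pipeline_result = pipeline.invoke(
    {"query": "What is the impact of LLMs on NLP?", "word_count": 50}
)
print(pipeline_result)
```

Because every step exposes the same `invoke` interface, a composed pipeline can itself be piped into further steps.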

### Stuff Document Chain

This chain takes a list of documents and first combines them into a single string. It does this by formatting each document into a string with the `document_prompt` and then joining them together with `document_separator`. It then adds that new string to the inputs with the variable name set by `document_variable_name`. Those inputs are then passed to the `llm_chain`.
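
The "stuffing" step described above can be sketched in a few lines of plain Python. This is an illustration of the idea, not the library's code; `stuff_documents` is a hypothetical helper whose parameter names simply mirror `document_prompt` and `document_separator`.

```python
def stuff_documents(docs, document_prompt="{page_content}", document_separator="\n\n"):
    """Format every document and join the results into one context string."""
    return document_separator.join(
        document_prompt.format(page_content=doc) for doc in docs
    )


sample_docs = ["First document.", "Second document.", "Third document."]
stuffed_context = stuff_documents(sample_docs, document_separator="\nEND\n")

# The joined string is then placed into the prompt under the chosen variable name.
summary_prompt = "Summarise the following content in less than {no_of_words} words:\n{context}"
final_prompt = summary_prompt.format(no_of_words=50, context=stuffed_context)
print(final_prompt)
```

Since every document ends up in a single prompt, the combined text has to fit within the model's context window.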



In [None]:
from langchain.chains import StuffDocumentsChain

In [None]:
document_prompt = PromptTemplate(
    input_variables = ["page_content"],
    template="{page_content}"
)

document_variable_name = "context"
document_separator = '\nEND\n'

In [None]:
template = "You are a helpful AI assistant."
summary_template = """Summarise the following content in less than {no_of_words} words:
{context}
"""

prompt = ChatPromptTemplate.from_messages([
    ("system", template),
    ("human", summary_template),
])

In [None]:
llm_chain = LLMChain(
    llm=chat_llm_gpt4,
    prompt=prompt,
    verbose=True
)

In [None]:
stuff_chain = StuffDocumentsChain(
    llm_chain = llm_chain,
    document_prompt = document_prompt,
    document_variable_name = document_variable_name,
    document_separator = document_separator,
    verbose=True,
)

In [None]:
from langchain.docstore.document import Document
# One string per document. The commas matter: without them Python
# concatenates adjacent string literals into a single string.
input_context = [
    "Stuff Document Chain: This chain takes a list of documents and formats them all into a prompt, then passes that prompt to an LLM. It passes ALL documents, so you should make sure it fits within the context window of the LLM you are using.",
    "Map-Reduce Document Chain: This chain first passes each document through an LLM, then reduces them using the ReduceDocumentsChain. Useful in the same situations as ReduceDocumentsChain, but does an initial LLM call before trying to reduce the documents.",
    "Refine Document Chain: This chain collapses documents by generating an initial answer based on the first document and then looping over the remaining documents to refine its answer. This operates sequentially, so it cannot be parallelized. It is useful in similar situations as MapReduceDocumentsChain, but for cases where you want to build up an answer by refining the previous answer (rather than parallelizing calls).",
]

docs = [Document(page_content=txt) for txt in input_context]

In [None]:
response = stuff_chain.invoke({"input_documents":docs, "no_of_words":50})

In [None]:
response

In [None]:
print(response['output_text'])

### Refine Document Chain
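
A rough plain-Python sketch of the refine pattern, based on the description quoted in the Stuff Document Chain example: an initial answer is built from the first document and then updated with each remaining document in turn. `refine_documents` and the two stand-in functions are hypothetical; in a real chain each step would be an LLM call.

```python
def refine_documents(docs, initial_step, refine_step):
    """Build an answer from the first document, then refine it with each later one."""
    if not docs:
        raise ValueError("at least one document is required")
    answer = initial_step(docs[0])
    for doc in docs[1:]:
        # Sequential by design: each step needs the previous answer.
        answer = refine_step(answer, doc)
    return answer


# Stand-ins for LLM calls (assumptions): keep only the first sentence of a document.
def first_sentence(doc: str) -> str:
    return doc.split(".")[0].strip() + "."


def append_first_sentence(answer: str, doc: str) -> str:
    return f"{answer} {first_sentence(doc)}"


refined_answer = refine_documents(
    [
        "Stuff passes all documents at once. It needs a large context window.",
        "Refine loops over documents. It cannot be parallelized.",
    ],
    initial_step=first_sentence,
    refine_step=append_first_sentence,
)
print(refined_answer)
```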

### Map-Reduce Document Chain
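
A rough plain-Python sketch of the map-reduce pattern, based on the description quoted in the Stuff Document Chain example: each document is processed independently, and the partial results are then combined. `map_reduce_documents` and the stand-in functions are hypothetical; in a real chain both steps would be LLM calls.

```python
def map_reduce_documents(docs, map_step, reduce_step):
    """Process each document independently, then combine the partial results."""
    # The map calls do not depend on each other, so they could run in parallel.
    partial_results = [map_step(doc) for doc in docs]
    return reduce_step(partial_results)


# Stand-ins for LLM calls (assumptions): "summarise" by keeping the first three words.
def short_summary(doc: str) -> str:
    return " ".join(doc.split()[:3])


def combine(summaries):
    return "; ".join(summaries)


combined_summary = map_reduce_documents(
    [
        "Stuff chains format all documents into one prompt.",
        "Refine chains update an answer document by document.",
    ],
    map_step=short_summary,
    reduce_step=combine,
)
print(combined_summary)
```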

## Memory

##### Using memory to store conversation history

In [None]:
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

In [None]:
chatbot = ConversationChain(
    llm = chat_llm,
    memory = ConversationBufferMemory(),
    verbose=True
)

In [None]:
chatbot.prompt

In [None]:
chatbot.predict(input="What are language models?")

In [None]:
chatbot.predict(input="What are the different types of transformer models?")
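
Conceptually, buffer memory stores every turn and prepends the stored history to the next prompt. The sketch below illustrates this in plain Python with a stand-in model; `BufferMemory`, `chat`, and `fake_model` are hypothetical names, and the real classes offer more than is shown here.

```python
class BufferMemory:
    """Hypothetical minimal conversation buffer: keeps every turn in order."""

    def __init__(self):
        self.turns = []  # list of (speaker, text) tuples

    def save(self, human_text: str, ai_text: str) -> None:
        self.turns.append(("Human", human_text))
        self.turns.append(("AI", ai_text))

    def as_text(self) -> str:
        return "\n".join(f"{speaker}: {text}" for speaker, text in self.turns)


def chat(memory: BufferMemory, user_input: str, model) -> str:
    # Prepend the stored history so the model can see earlier turns.
    history = memory.as_text()
    new_turn = f"Human: {user_input}\nAI:"
    prompt_text = f"{history}\n{new_turn}" if history else new_turn
    reply = model(prompt_text)
    memory.save(user_input, reply)
    return reply


def fake_model(prompt_text: str) -> str:
    # Stand-in for an LLM call: reports how many human turns it can see.
    return f"I can see {prompt_text.count('Human:')} human turn(s)."


memory_sketch = BufferMemory()
first_reply = chat(memory_sketch, "What are language models?", fake_model)
second_reply = chat(memory_sketch, "What are the different types of transformer models?", fake_model)
print(first_reply)
print(second_reply)
```

The second reply sees two human turns, which shows that the first exchange was carried over into the second prompt.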

## Chatbot with Chat History Memory

In [None]:
from langchain_core.prompts import SystemMessagePromptTemplate

prompt = ChatPromptTemplate(
    messages=[
        SystemMessagePromptTemplate.from_template(
            "You are a smart and humble AI assistant for having a conversation with a human."
        ),
        MessagesPlaceholder(variable_name="chat_history"),
        HumanMessagePromptTemplate.from_template("{question}")
    ]
)

memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
chatbot = LLMChain(
    llm=chat_llm,
    prompt=prompt,
    memory=memory,
    verbose=True
)

In [None]:
chatbot({"question": "Hello, My name is Akshay. How are you?"})

In [None]:
chatbot({"question": "What is my name?"})['text']

In [None]:
chatbot({"question": "What is an AI language model?"})['text']

## Evaluation