In [None]:
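import os

from decouple import AutoConfig

# Read credentials from the .env file one directory up and expose them
# under the environment variable names that AzureChatOpenAI reads.
config = AutoConfig(search_path='./../.env')
os.environ["AZURE_OPENAI_API_KEY"] = config('OPENAI_API_KEY')
os.environ["AZURE_OPENAI_ENDPOINT"] = config('AZURE_ENDPOINT')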


## Models

### Vanilla Large Language Models (LLMs)

LLMs are designed primarily to generate contextually relevant text, with a focus on text generation, completion, and language understanding. They are pre-trained on a diverse corpus, from which they learn general linguistic patterns. They are widely used for downstream tasks such as translation and summarization, and as a base for task- or domain-specific fine-tuning.

Some prominent examples:
- GPT-3
- llama, llama-2, llama-3

#### Loading open-source models using Ollama
Ollama lets you set up a local server for quantized models.

References:
1. [langchain](https://python.langchain.com/v0.1/docs/modules/model_io/)
2. [ollama github](https://github.com/ollama/ollama?tab=readme-ov-file)
3. [ollama model library](https://ollama.com/library)

In [None]:
!ollama pull "llama3:text"

In [None]:
from langchain_community.llms import Ollama
llama3 = Ollama(model="llama3:text")

In [None]:
print(llama3.invoke("What is the meaning of life in 10 words?"))

### Chat or Instruction tuned Models

Chat or instruction-tuned models are designed specifically to follow user instructions or hold a conversation with the user. They are LLMs that have been further fine-tuned on instruction or dialogue datasets. Their main focus is to understand the context of a user query and respond accordingly. They are widely used for question answering, chatbots, dialogue systems, etc.

Some prominent examples:
- GPT-3.5-turbo, GPT-4
- llama-chat models
- claude-2

In langchain, a chat model is a language model that uses chat messages as inputs and returns chat messages as outputs.

##### Passing user message to model through HumanMessage

In [None]:
from langchain_core.messages import HumanMessage
message = [HumanMessage("What is the meaning of life in 10 words?")]

#### Setting Model Parameters

In [None]:
temp = 0.0
max_tokens = 1024
top_k = 5
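
As a rough illustration of what `temp` and `top_k` control, here is a minimal plain-Python sketch of how a sampler might turn raw token scores into probabilities. It is not how any particular model or LangChain implements sampling; the function name, the toy scores, and the convention that a temperature of 0 means greedy decoding are all assumptions made for this example. `max_tokens` simply caps the length of the generated output.

```python
import math

def next_token_probs(logits, temperature=1.0, top_k=None):
    """Turn raw token scores into sampling probabilities (illustrative only)."""
    # Sort tokens by score and keep only the top_k best ones, if requested.
    items = sorted(logits.items(), key=lambda kv: kv[1], reverse=True)
    if top_k is not None:
        items = items[:top_k]
    # Assumption: temperature 0 is treated as greedy decoding,
    # so all probability mass goes to the best token.
    if temperature == 0:
        best = items[0][0]
        return {tok: (1.0 if tok == best else 0.0) for tok, _ in items}
    # Otherwise apply a softmax over the temperature-scaled scores.
    scaled = [score / temperature for _, score in items]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return {tok: w / total for (tok, _), w in zip(items, weights)}

# Hypothetical scores for four candidate next tokens.
toy_logits = {"joy": 2.0, "love": 1.5, "42": 1.0, "cheese": -1.0}
greedy = next_token_probs(toy_logits, temperature=0.0, top_k=5)
sampled = next_token_probs(toy_logits, temperature=1.0, top_k=2)
print(greedy)
print(sampled)
```

Lower temperatures concentrate probability on the best-scoring tokens, and a smaller `top_k` removes unlikely tokens from consideration altogether.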

#### OpenAI models (Azure endpoints)

In [None]:
from langchain_openai import AzureChatOpenAI

In [None]:
chat_llm = AzureChatOpenAI(
    openai_api_version="2023-03-15-preview",
    azure_deployment="gpt-35-turbo-0613",
    temperature=temp,
    max_tokens=max_tokens
)

`invoke()` calls the model on a single input and returns the complete response.

In [None]:
print(chat_llm.invoke(message))

`stream()` streams the response back in chunks as they are generated.
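
To make the idea of streaming concrete, here is a small stand-alone sketch that uses a Python generator in place of a real model. The function `fake_stream` and its `chunk_size` parameter are hypothetical; a real streaming API yields chunks as the model produces them rather than slicing a finished string.

```python
def fake_stream(text, chunk_size=8):
    """Yield a response a few characters at a time, like a streaming API."""
    for start in range(0, len(text), chunk_size):
        yield text[start:start + chunk_size]

full = "Love, learn, grow, and leave the world a little better."
chunks = list(fake_stream(full))

# Print each chunk as it arrives, without waiting for the full response.
for chunk in chunks:
    print(chunk, end="", flush=True)
print()
```

Joining the chunks gives back the complete response, which is what `invoke()` would return in one piece.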

In [None]:
for chunk in chat_llm.stream(message):
    print(chunk.content, end="", flush=True)

In [None]:
chat_llm_gpt4 = AzureChatOpenAI(
    openai_api_version="2023-03-15-preview",
    azure_deployment="gpt-4-32k",
    temperature=temp,
    max_tokens=max_tokens
)

In [None]:
print(chat_llm_gpt4.invoke(message))

P.S.: The LLM returns a string, while the ChatModel returns a message.
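
The difference in return types can be sketched without any model at all. The `Message` class and the two `toy_*` functions are hypothetical stand-ins written for this example, not LangChain classes.

```python
from dataclasses import dataclass

@dataclass
class Message:
    role: str      # e.g. "human" or "ai"
    content: str

def toy_llm(prompt: str) -> str:
    """Stand-in for a text-completion model: string in, string out."""
    return "ANSWER TO: " + prompt

def toy_chat_model(messages: list) -> Message:
    """Stand-in for a chat model: list of messages in, one message out."""
    last_human = [m for m in messages if m.role == "human"][-1]
    return Message(role="ai", content="ANSWER TO: " + last_human.content)

text_out = toy_llm("What is the meaning of life?")
chat_out = toy_chat_model([Message("human", "What is the meaning of life?")])
print(type(text_out).__name__, "->", text_out)
print(type(chat_out).__name__, "->", chat_out.content)
```

This is why the chat examples read the text from the `.content` attribute of the returned message, while the plain LLM output can be printed directly.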

#### Loading open-source chat models using Ollama

In [None]:
!ollama pull "llama3"

In [None]:
from langchain_community.chat_models import ChatOllama

llama3_chat = ChatOllama(
    model="llama3",
    temperature=temp,
    num_predict=max_tokens,  # Ollama's name for the maximum number of tokens to generate
    top_k=top_k,
)

In [None]:
print(llama3_chat.invoke(message).content)

## Prompts and Prompt Templates

A **prompt** is an instruction or query passed to the LLM. It can also carry additional details such as context, input data, or examples.

A **prompt template** wraps the user prompt with extra information specific to the model and the task. Because a template contains placeholders, the same prompt can be reused with different user inputs.

### PromptTemplate

`PromptTemplate` is used to create a template for a string prompt.

Important Functions:
- `PromptTemplate.from_template()` creates a prompt template from a template string.
- `PromptTemplate.format()` fills the template with user input and returns the prompt as a string.

Reference: [langchain PromptTemplate](https://python.langchain.com/docs/modules/model_io/prompts/quick_start/#prompttemplate)
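
Conceptually, a string prompt template is little more than a format string plus a record of its placeholders. The class `ToyPromptTemplate` is a hypothetical plain-Python sketch of that idea, not LangChain's implementation.

```python
from string import Formatter

class ToyPromptTemplate:
    """A format string that knows which placeholders it needs."""

    def __init__(self, template):
        self.template = template
        # Collect the placeholder names that appear in the template.
        self.input_variables = sorted(
            {name for _, name, _, _ in Formatter().parse(template) if name}
        )

    def format(self, **kwargs):
        # Fail early if the caller forgot a placeholder value.
        missing = set(self.input_variables) - set(kwargs)
        if missing:
            raise KeyError(f"missing variables: {sorted(missing)}")
        return self.template.format(**kwargs)

toy_prompt = ToyPromptTemplate(
    "What is the meaning of life in less than {num_of_words} words {style}?"
)
print(toy_prompt.input_variables)
print(toy_prompt.format(num_of_words=10, style="in a royal way"))
```

Keeping the list of input variables alongside the template is what lets a template report a missing value before anything is sent to a model.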

In [None]:
from langchain_core.prompts import PromptTemplate

In [None]:
prompt = PromptTemplate.from_template("What is the meaning of life in less than {num_of_words} words {style}?")
print(prompt.format(num_of_words=100, style=""))

In [None]:
prompt

In [None]:
print(llama3.invoke(prompt.format(num_of_words=10, style="")))

In [None]:
print(llama3.invoke(prompt.format(num_of_words=50, style="")))

In [None]:
print(llama3.invoke(prompt.format(num_of_words=50, style="in a royal way")))

### ChatPromptTemplate

`ChatPromptTemplate`, the prompt template for chat models, is a list of chat message templates. Each message template specifies how to format one chat message: its role and its content.

Important Classes:
- `SystemMessagePromptTemplate`
- `SystemMessage`: This represents a system message, which tells the model how to behave. This generally only consists of content. Not every model supports this.
- `HumanMessagePromptTemplate`
- `HumanMessage`: This represents a message from the user. Generally consists only of content.

Important Functions:
- `ChatPromptTemplate.from_messages()` creates a chat prompt template from a variety of message formats. This is the most common way to define a `ChatPromptTemplate`.
- `ChatPromptTemplate.format_messages()` fills the template with user input and returns a list of finalized messages.

References:
- [langchain ChatPromptTemplate](https://python.langchain.com/docs/modules/model_io/prompts/quick_start/#chatprompttemplate)
- [OpenAI ChatCompletion](https://platform.openai.com/docs/guides/text-generation/chat-completions-api)
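
The "list of message templates, each with a role and content" idea can be sketched in a few lines of plain Python. The helper `format_chat_messages` and the dictionary message format are assumptions made for illustration; LangChain returns message objects rather than dictionaries.

```python
def format_chat_messages(message_templates, **variables):
    """Fill each (role, template) pair and return a list of role/content dicts."""
    return [
        {"role": role, "content": template.format(**variables)}
        for role, template in message_templates
    ]

toy_chat_templates = [
    ("system", "You are a helpful assistant that translates {input_language} to {output_language}."),
    ("human", "{text}"),
]

toy_messages = format_chat_messages(
    toy_chat_templates,
    input_language="English",
    output_language="Hindi",
    text="Good morning.",
)
for m in toy_messages:
    print(m["role"], ":", m["content"])
```

Each template only picks up the variables it mentions, so one call can fill the system and human messages together.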

In [None]:
from langchain_core.prompts.chat import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template("What is the meaning of life in less than {num_of_words} words {style}?")
message = prompt.format(num_of_words=50, style="in a funny way")

In [None]:
print(message)
print(type(message))

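`from_template()` treats the template as a single `HumanMessage`, which represents the user's instruction. `format()` renders it as a string, so the printed text is prefixed with `Human:`.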

In [None]:
template = "You are a helpful assistant that translates {input_language} to {output_language}."
human_template = "{text}"

chat_prompt = ChatPromptTemplate.from_messages([
    ("system", template),
    ("human", human_template),
])

chat_message = chat_prompt.format_messages(input_language="English", 
                            output_language="Hindi", 
                            text="The meaning of life is to find joy and purpose in living, and to make a positive impact on the world.")

In [None]:
print(chat_message)
print(type(chat_message))
for msg in chat_message:
    print(msg, type(msg))

In [None]:
print(chat_llm.invoke(chat_prompt.format_prompt(input_language="English", 
                            output_language="Hindi", 
                            text="The meaning of life is to find joy and purpose in living, and to make a positive impact on the world.")))

#### Using Placeholder

In [None]:
from langchain_core.prompts import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    MessagesPlaceholder
)

human_template = "Summarise the conversation in {word_count} words."
human_message_template = HumanMessagePromptTemplate.from_template(template=human_template,
                                                                  partial_variables={"word_count": 10})
print(human_message_template)

chat_prompt = ChatPromptTemplate.from_messages(
    [MessagesPlaceholder(variable_name="conversation"), human_message_template]
)
print(chat_prompt)

In [None]:
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage
system_message = SystemMessage(content="You are a smart AI assistant.")
human_message = HumanMessage(content="What is the meaning of life in less than 50 words?")
ai_message = AIMessage(
    content="""The meaning of life is to find joy and purpose in living, and to make a positive impact on the world."""
)

chat_message = chat_prompt.format_messages(
    conversation=[system_message, human_message, ai_message], word_count=20,
)
print(chat_message)

In [None]:
print(chat_llm_gpt4.invoke(chat_message))

In [None]:
from langchain.chains import LLMChain

chain = LLMChain(
    prompt=chat_prompt,
    llm=chat_llm,
    verbose=True)
chain.predict(conversation=[system_message, human_message, ai_message], word_count=20)

For more examples, see the [langchain docs](https://api.python.langchain.com/en/latest/prompts/langchain_core.prompts.chat.ChatPromptTemplate.html) and the [langchain tutorials](https://python.langchain.com/docs/modules/model_io/prompts/quick_start/).

## Output Parser

`OutputParsers` convert the raw output of a language model into a format that can be used downstream.
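
As a minimal sketch of what "converting raw output" involves, the following plain-Python example pulls a JSON object out of free-form model text and validates it into a typed object. The names `ToyAnswer` and `parse_answer`, and the sample raw text, are hypothetical; real parsers handle more edge cases.

```python
import json
import re
from dataclasses import dataclass

@dataclass
class ToyAnswer:
    thought: str

def parse_answer(raw_output: str) -> ToyAnswer:
    """Extract a JSON object from raw model text and validate its fields."""
    # Grab the text between the first "{" and the last "}".
    match = re.search(r"\{.*\}", raw_output, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    data = json.loads(match.group(0))
    if not isinstance(data.get("thought"), str):
        raise ValueError("expected a string field named 'thought'")
    return ToyAnswer(thought=data["thought"])

# Hypothetical raw output: models often wrap the JSON in extra chatter.
raw = 'Sure! Here is the answer:\n{"thought": "To find joy and purpose."}\nHope that helps.'
parsed = parse_answer(raw)
print(parsed)
print(parsed.thought)
```

The format instructions that a parser injects into the prompt exist to make this extraction step reliable.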

#### PydanticOutputParser

In [None]:
from langchain.output_parsers import PydanticOutputParser
from langchain_core.pydantic_v1 import BaseModel, Field

class Answer(BaseModel):
    thought: str = Field(description="answer with thought.")

parser = PydanticOutputParser(pydantic_object=Answer)

prompt = PromptTemplate(
    template="Answer the user query in less than {word_count} words.\n\n{format_instructions}\n\n{query}\n",
    input_variables=["word_count", "query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

response = chat_llm.invoke(prompt.format(word_count=20, query="What is the meaning of life?"))
print(response)

In [None]:
print(parser.invoke(response))

In [None]:
print(parser.invoke(response).thought)

#### Built-In Parsers

##### JSONOutputParser

In [None]:
from langchain_core.output_parsers import JsonOutputParser

parser = JsonOutputParser()

prompt = PromptTemplate(
    template="""Return the response in JSON format with keys Question and Answer by answering the user query in less than {word_count} words.\n\n{format_instructions}\n\n{query}\n""",
    input_variables=["word_count", "query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

print(prompt)
response = chat_llm.invoke(prompt.format(word_count=20, query="What is the meaning of life?"))
print(response)

In [None]:
print(type(parser.invoke(response)))
print(parser.invoke(response))

In [None]:
print(parser.invoke(response)['Question'])

In [None]:
print(parser.invoke(response)['Answer'])

## Chains

Chains (or `LLMChain`s) are a concept native to langchain. A chain is a set of connected components that work together to generate an output for a given input. The simplest chain combines a **prompt (instruction + user-input)** with an **LLM**. It can be extended with other components such as retrievers, input pre-processing, and output post-processing.

Reference: [langchain docs Chains](https://python.langchain.com/docs/modules/chains/)
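
The idea of "connected components" can be sketched as a list of functions where each step consumes the previous step's output. All names here are hypothetical, and `toy_llm_call` merely echoes its input instead of calling a model.

```python
def build_prompt(inputs):
    """Prompt step: turn the input dictionary into a prompt string."""
    return "Answer in less than {word_count} words: {query}".format(**inputs)

def toy_llm_call(prompt_text):
    # Hypothetical stand-in for a model call; it just echoes the prompt.
    return "  [model reply to] " + prompt_text + "  "

def clean_output(text):
    """Post-processing step: strip surrounding whitespace."""
    return text.strip()

def run_chain(inputs, steps):
    """Feed the output of each step into the next one."""
    value = inputs
    for step in steps:
        value = step(value)
    return value

toy_result = run_chain(
    {"query": "What is the impact of LLMs on NLP?", "word_count": 50},
    [build_prompt, toy_llm_call, clean_output],
)
print(toy_result)
```

Adding a retriever or another post-processing step amounts to inserting one more function into the list.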

### LLMChain

In [None]:
chat_llm

In [None]:
prompt

In [None]:
from langchain.chains import LLMChain

chain = LLMChain(
    prompt=prompt,
    llm=chat_llm
)
query = "What is the impact of LLMs on NLP?"
word_count = 50

response = chain.predict(
    query=query,
    word_count=word_count
)

In [None]:
print(response)

P.S.: `LLMChain` has been deprecated in newer versions following the introduction of **LCEL Chains**.

#### LCEL Chain
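
The `prompt | chat_llm` syntax relies on Python's `|` operator being overloaded. The following is a minimal sketch of how such pipe-style composition can work in plain Python; the `Step` class is hypothetical and far simpler than LangChain's runnables.

```python
class Step:
    """Wrap a function so that steps can be chained with the | operator."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # a | b builds a new step that runs a first, then b on a's result.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)

toy_prompt_step = Step(lambda d: "Q: {query} (max {word_count} words)".format(**d))
toy_model_step = Step(lambda text: text.upper())  # hypothetical stand-in for a model

toy_pipeline = toy_prompt_step | toy_model_step
piped = toy_pipeline.invoke({"query": "why?", "word_count": 5})
print(piped)
```

Because the result of `|` is itself a `Step`, longer pipelines can be built by chaining more steps onto it.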

In [None]:
lcel_chain = prompt | chat_llm
response = lcel_chain.invoke(
    {
        "query":query,
        "word_count":word_count
    }
)

In [None]:
print(response.content)

### Stuff Document Chain

This chain takes a list of documents and first combines them into a single string. It does this by formatting each document into a string with the `document_prompt` and then joining them together with `document_separator`. It then adds that new string to the inputs with the variable name set by `document_variable_name`. Those inputs are then passed to the `llm_chain`.
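
The three steps described above (format each document, join with a separator, insert under a variable name) can be sketched in plain Python as follows. The function `stuff_documents` is a hypothetical illustration that only builds the final prompt; it uses plain strings in place of `Document` objects and does not call a model.

```python
def stuff_documents(docs, document_prompt, document_separator,
                    document_variable_name, prompt_template, **other_inputs):
    """Build the single prompt that a stuff-style chain would send to the model."""
    # 1. Format every document with the per-document template.
    formatted = [document_prompt.format(page_content=d) for d in docs]
    # 2. Join the formatted documents with the separator.
    combined = document_separator.join(formatted)
    # 3. Insert the combined text under the configured variable name.
    inputs = {document_variable_name: combined, **other_inputs}
    return prompt_template.format(**inputs)

toy_docs = ["First note.", "Second note."]
stuffed_prompt = stuff_documents(
    toy_docs,
    document_prompt="{page_content}",
    document_separator="\nEND\n",
    document_variable_name="context",
    prompt_template="Summarise the following content in less than {no_of_words} words:\n{context}",
    no_of_words=50,
)
print(stuffed_prompt)
```

Since every document ends up in one prompt, the combined text has to fit within the model's context window.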



In [None]:
from langchain.chains import StuffDocumentsChain

In [None]:
document_prompt = PromptTemplate(
    input_variables = ["page_content"],
    template="{page_content}"
)

document_variable_name = "context"
document_separator = '\nEND\n'

In [None]:
template = "You are a helpful AI assistant."
summary_template = """Summarise the following content in less than {no_of_words} words:
{context}
"""

prompt = ChatPromptTemplate.from_messages([
    ("system", template),
    ("human", summary_template),
])

In [None]:
llm_chain = LLMChain(
    llm=chat_llm_gpt4,
    prompt=prompt,
    verbose=True
)

In [None]:
stuff_chain = StuffDocumentsChain(
    llm_chain = llm_chain,
    document_prompt = document_prompt,
    document_variable_name = document_variable_name,
    document_separator = document_separator,
    verbose=True,
)

In [None]:
from langchain.docstore.document import Document
# Each string becomes one document. The trailing commas matter:
# without them Python concatenates adjacent string literals into one.
input_context = [
    "Stuff Document Chain: This chain takes a list of documents and formats them all into a prompt, then passes that prompt to an LLM. It passes ALL documents, so you should make sure it fits within the context window of the LLM you are using.",
    "Map-Reduce Document Chain: This chain first passes each document through an LLM, then reduces them using the ReduceDocumentsChain. Useful in the same situations as ReduceDocumentsChain, but does an initial LLM call before trying to reduce the documents.",
    "Refine Document Chain: This chain collapses documents by generating an initial answer based on the first document and then looping over the remaining documents to refine its answer. This operates sequentially, so it cannot be parallelized. It is useful in similar situations as MapReduceDocumentsChain, but for cases where you want to build up an answer by refining the previous answer (rather than parallelizing calls).",
]

docs = [Document(page_content=txt) for txt in input_context]

In [None]:
response = stuff_chain.invoke({"input_documents":docs, "no_of_words":50})

In [None]:
response

In [None]:
print(response['output_text'])

### Refine Document Chain
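
A refine-style chain generates an initial answer from the first document and then loops over the remaining documents, revising the answer each time. Here is a minimal plain-Python sketch of that control flow; the `toy_*` functions are hypothetical stand-ins for the two model calls and simply collect each document's title.

```python
def refine_documents(docs, initial_step, refine_step):
    """Answer from the first document, then revise once per remaining document."""
    answer = initial_step(docs[0])
    for doc in docs[1:]:
        # Each call sees the previous answer plus one new document.
        answer = refine_step(answer, doc)
    return answer

# Hypothetical stand-ins for the initial and refine model calls.
def toy_initial(doc):
    return doc.split(":")[0]

def toy_refine(previous, doc):
    return previous + " + " + doc.split(":")[0]

toy_docs = [
    "Stuff: puts every document into one prompt",
    "Map-Reduce: summarises each document, then combines",
    "Refine: revises one answer document by document",
]
refined = refine_documents(toy_docs, toy_initial, toy_refine)
print(refined)
```

Each step depends on the previous answer, so the calls run strictly one after another.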

In [None]:
from langchain.chains import RefineDocumentsChain, LLMChain
from langchain_core.prompts import PromptTemplate


# This controls how each document will be formatted. Specifically,
# it will be passed to `format_document` - see that function for more
# details.
document_prompt = PromptTemplate(
    input_variables=["page_content"]+[key for key in docs[0].metadata.keys()],
    template="\n".join([str(key)+": " "{"+key+"}" for key in docs[0].metadata.keys()])+"\n{page_content}"
)
document_variable_name = "text"

# The prompt here should take as an input variable the
# `document_variable_name`
template = "You are a helpful AI assistant."
summary_template = """Summarise the following content:
{text}
"""

prompt = ChatPromptTemplate.from_messages([
    ("system", template),
    ("human", summary_template),
])


initial_llm_chain = LLMChain(llm=chat_llm_gpt4, prompt=prompt, verbose=True)
initial_response_name = "prev_response"
# The prompt here should take as an input variable the
# `document_variable_name` as well as `initial_response_name`
prompt_refine = PromptTemplate.from_template(
    template="""Here's your Previous Response:
{prev_response}
Given the new context, refine the original response using the user message and additional instructions. If the context isn't useful, return the original response.

{text}
"""
)
refine_llm_chain = LLMChain(llm=chat_llm_gpt4, prompt=prompt_refine, verbose=True)
refine_chain = RefineDocumentsChain(
    initial_llm_chain=initial_llm_chain,
    refine_llm_chain=refine_llm_chain,
    document_prompt=document_prompt,
    document_variable_name=document_variable_name,
    initial_response_name=initial_response_name,
)


In [None]:
response = refine_chain.invoke({"input_documents":docs, "no_of_words":50})

In [None]:
response

### Map-Reduce Document Chain
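
A map-reduce-style chain first passes each document through the model on its own (map) and then combines the per-document results (reduce). The following plain-Python sketch shows that shape; the `toy_*` functions are hypothetical stand-ins for the model calls, and the thread pool is only there to show that the map calls are independent of one another.

```python
from concurrent.futures import ThreadPoolExecutor

def map_reduce_documents(docs, map_step, reduce_step):
    """Process each document independently, then combine the results."""
    # Map: the calls do not depend on each other, so they can run in parallel.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(map_step, docs))
    # Reduce: combine the per-document results into a single output.
    return reduce_step(partials)

# Hypothetical stand-ins for the map and reduce model calls.
def toy_map(doc):
    return doc.split(".")[0]  # "summary" = first sentence

def toy_reduce(summaries):
    return " | ".join(summaries)

toy_docs = [
    "Cats purr. They also nap.",
    "Dogs bark. They fetch.",
    "Birds sing. They fly.",
]
combined = map_reduce_documents(toy_docs, toy_map, toy_reduce)
print(combined)
```

Unlike the refine approach, the map calls can be parallelized, at the cost of each call seeing only one document.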

In [None]:
from langchain.chains import (
    StuffDocumentsChain,
    LLMChain,
    ReduceDocumentsChain,
    MapReduceDocumentsChain,
)
from langchain_core.prompts import PromptTemplate
# This controls how each document will be formatted. Specifically,
# it will be passed to `format_document` - see that function for more
# details.
document_prompt = PromptTemplate(
    input_variables=["page_content"],
    template="{page_content}"
)
document_variable_name = "text"
# The prompt here should take as an input variable the
# `document_variable_name`
template = "You are a helpful AI assistant."
summary_template = """Summarise the following content:
{text}
"""

prompt = ChatPromptTemplate.from_messages([
    ("system", template),
    ("human", summary_template),
])

llm_chain = LLMChain(llm=chat_llm_gpt4, prompt=prompt)
# We now define how to combine these summaries
reduce_prompt = PromptTemplate.from_template(
    "Combine these summaries: {text}"
)
reduce_llm_chain = LLMChain(llm=chat_llm_gpt4, prompt=reduce_prompt)
combine_documents_chain = StuffDocumentsChain(
    llm_chain=reduce_llm_chain,
    document_prompt=document_prompt,
    document_variable_name=document_variable_name
)
reduce_documents_chain = ReduceDocumentsChain(
    combine_documents_chain=combine_documents_chain,
)
map_reduce_chain = MapReduceDocumentsChain(
    llm_chain=llm_chain,
    reduce_documents_chain=reduce_documents_chain,
)

In [None]:
response = map_reduce_chain.invoke({"input_documents":docs, "no_of_words":50})

In [None]:
response

## Memory

##### Using memory to store conversation history
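
A buffer-style memory can be pictured as a list of past turns that is prepended to every new prompt. The following plain-Python sketch illustrates that; `ToyBufferMemory`, `toy_chat`, and `toy_model` are hypothetical names, and the "model" only reports how much history it was shown.

```python
class ToyBufferMemory:
    """Keep every past turn and render it as text for the next prompt."""

    def __init__(self):
        self.turns = []

    def save(self, human, ai):
        self.turns.append(("Human", human))
        self.turns.append(("AI", ai))

    def as_text(self):
        return "\n".join(f"{speaker}: {text}" for speaker, text in self.turns)

def toy_chat(memory, user_input, model):
    # Prepend the stored history to the new user input.
    history = memory.as_text()
    prompt_text = (history + "\n" if history else "") + f"Human: {user_input}\nAI:"
    reply = model(prompt_text)
    # Store the new exchange so that the next call can see it.
    memory.save(user_input, reply)
    return reply

def toy_model(prompt_text):
    # Hypothetical stand-in: reports how many earlier human turns it could see.
    return f"I can see {prompt_text.count('Human:') - 1} earlier human turn(s)."

toy_memory = ToyBufferMemory()
first = toy_chat(toy_memory, "Hello, my name is Akshay.", toy_model)
second = toy_chat(toy_memory, "What is my name?", toy_model)
print(first)
print(second)
```

The model itself is stateless; it can only "remember" the name because the earlier turn is sent again inside the prompt.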

In [None]:
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

In [None]:
chatbot = ConversationChain(
    llm = chat_llm,
    memory = ConversationBufferMemory(),
    verbose=True
)

In [None]:
chatbot.prompt

In [None]:
chatbot.predict(input="What are language models?")

In [None]:
chatbot.predict(input="What are the different types of transformer models?")

## Chatbot with Chat History Memory

In [None]:
from langchain_core.prompts import SystemMessagePromptTemplate

prompt = ChatPromptTemplate(
    messages=[
        SystemMessagePromptTemplate.from_template(
            "You are a smart and humble AI assistant for having a conversation with a human."
        ),
        MessagesPlaceholder(variable_name="chat_history"),
        HumanMessagePromptTemplate.from_template("{question}")
    ]
)

memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
chatbot = LLMChain(
    llm=chat_llm,
    prompt=prompt,
    memory=memory,
    verbose=True
)

In [None]:
chatbot.invoke({"question": "Hello, My name is Akshay. How are you?"})

In [None]:
chatbot.invoke({"question": "What is my name?"})['text']

In [None]:
chatbot.invoke({"question": "What is an AI language model?"})['text']