<center><h1>LangChain Retrieval Chain Guide:</h1></center>
<hr><hr><hr>

In [1]:
from dotenv import load_dotenv
load_dotenv()

True

In [2]:
import os

azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT")
api_key = os.getenv("AZURE_OPENAI_KEY")
api_version = "2023-05-15"

# With Azure OpenAI, the deployment name is used in place of the model name
deployment_name = os.getenv("DEPLOYMENT_NAME")

In [4]:
os.environ["OPENAI_API_TYPE"]     = "azure"
os.environ["OPENAI_API_VERSION"]  = api_version
os.environ["OPENAI_API_KEY"]      = api_key

In [5]:
from langchain_openai import AzureChatOpenAI

llm = AzureChatOpenAI(
    openai_api_version=api_version,
    azure_deployment=deployment_name,
)

question = "What is Integral calculus?"

print( llm.invoke( question ) )

content='Integral calculus is a branch of calculus that focuses on the concept of integration. Integration deals with finding the antiderivative of a function, which is the reverse process of differentiation. It involves determining the area under a curve or the accumulation of quantities over a given interval. Integration is used to solve problems related to areas, volumes, work, and other applications in mathematics, physics, engineering, and economics. It is an essential tool in mathematical analysis and is often used in conjunction with differential calculus to solve complex problems.'


# Guiding LLM response using prompt template:

### Creating langchain prompt template

In [8]:
# The {input} placeholder is replaced with the user's input when the chain is invoked.

from langchain_core.prompts import ChatPromptTemplate
my_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are world class technical documentation writer."),
    ("user", "{input}")
])

In [19]:
my_chain = my_prompt | llm 

- Using the chain `my_chain` created above, we can pass the question to the `input` placeholder in the prompt `my_prompt`.
- Because a system message is also passed, the response should be more refined and better formatted.
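To make the `|` composition less abstract, here is a minimal plain-Python sketch of the idea. It is not how LangChain implements its runnables; `Step`, `fill_prompt` and `fake_llm` are hypothetical names invented for this illustration, and the "model" only echoes what it received.

```python
class Step:
    """A tiny stand-in for a chain component (not LangChain's Runnable)."""

    def __init__(self, fn):
        self.fn = fn

    def __or__(self, other):
        # `a | b` builds a new Step that feeds a's output into b.
        return Step(lambda value: other.fn(self.fn(value)))

    def invoke(self, value):
        return self.fn(value)


def fill_prompt(variables):
    # Substitute the {input} placeholder, keeping the system message first.
    return [
        ("system", "You are world class technical documentation writer."),
        ("user", "{input}".format(**variables)),
    ]


def fake_llm(messages):
    # Hypothetical model: echoes the last message instead of calling an API.
    _role, text = messages[-1]
    return f"(answer to: {text})"


toy_chain = Step(fill_prompt) | Step(fake_llm)
print(toy_chain.invoke({"input": "how can langsmith help with testing?"}))
```

The dictionary passed to `invoke()` plays the same role as in the real chain: its keys must match the placeholder names used in the prompt.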

In [20]:
response = my_chain.invoke({"input": "how can langsmith help with testing?"})

In [22]:
response

AIMessage(content="Langsmith can greatly assist with testing in multiple ways:\n\n1. Test Case Documentation: Langsmith can generate comprehensive test case documentation by analyzing the software codebase. It can automatically identify the different modules, classes, and functions within the code, and generate test cases for each of them. This documentation provides a systematic approach to testing and ensures that all areas of the software are thoroughly tested.\n\n2. Test Data Generation: Langsmith can also assist in generating test data for various scenarios. By analyzing the code, it can identify the different types of input parameters and their possible values. Based on this analysis, Langsmith can generate a wide range of test data to cover different test scenarios, including boundary cases, edge cases, and negative test cases.\n\n3. Test Automation: Langsmith can help in automating the testing process by generating code snippets for test scripts. It can analyze the codebase to 

In [23]:
print( type(response) )

<class 'langchain_core.messages.ai.AIMessage'>


- The output of a chat model (here, the `my_chain.invoke()` call) is an `AIMessage` object, as can be seen above.
- To get this output as a string, we can use an output parser.
- Appending the output parser to the chain makes it return a plain string directly instead of a message object.
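Conceptually, a string output parser just keeps the text of the message and drops the wrapper object. The sketch below illustrates that with a hypothetical `ToyMessage` class and `to_text` helper; it is an assumption-level illustration, not the code of `StrOutputParser`.

```python
from dataclasses import dataclass


@dataclass
class ToyMessage:
    """Hypothetical stand-in for a chat message object with a `content` field."""
    content: str


def to_text(message):
    # Keep only the text of the message and drop the wrapper object.
    return message.content


reply = ToyMessage(content="Docker packages an app with its dependencies.")
text = to_text(reply)
print(type(reply).__name__, "->", type(text).__name__)
print(text)
```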

<img src="./images/02. langchain_model-io.jpg" />

In [24]:
from langchain_core.output_parsers import StrOutputParser

output_parser = StrOutputParser()

In [25]:
my_chain2 = my_prompt | llm | output_parser

In [26]:
response2 = my_chain2.invoke( {"input": "How docker solves the problem related to environment setup?"} )

In [27]:
response2

'Docker solves the problem of environment setup by providing a lightweight and isolated containerization platform. Traditionally, setting up a development or production environment can be a complex and time-consuming process, involving the installation and configuration of various dependencies, libraries, and tools.\n\nWith Docker, developers can define the entire environment in a Dockerfile, which is a text file that specifies the base image, dependencies, and configurations required for the application to run. Docker uses a layered file system and containerization technology to package the application and its dependencies into a portable container.\n\nThis container can then be easily shared and deployed across different environments, such as development, testing, and production. Docker ensures that the environment is consistent and reproducible, as the container includes all the required dependencies and configurations, eliminating the need for manual setup.\n\nFurthermore, Docker a

In [28]:
print( type(response2) )

<class 'str'>


# Basic Retrieval Chain :
- **Retrieval** is useful when you have **too much data** to pass to the LLM directly, for example when the answer must be based on a large body of context.
- We can use a retriever to fetch only the most relevant pieces of data from the entire context and pass those to the LLM. This reduces the number of tokens used.
- **Context** - For example, we have a document (Word, PDF, or some other format) describing something, and we want to ask questions about it. The answers should come from the details in that document. *That document data is referred to as the context.*
- The **retriever context** can be a local document, a document or webpage on the internet, an SQL table, etc.
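The token-saving idea can be sketched without any LLM: score each chunk against the question and forward only the best ones. The word-overlap scoring below is a deliberately crude assumption chosen for readability (real retrievers typically compare embeddings), and `tokenize`, `score` and `top_k` are hypothetical helper names.

```python
import re


def tokenize(text):
    # Lowercase and keep alphabetic words only, so punctuation does not block matches.
    return set(re.findall(r"[a-z]+", text.lower()))


def score(chunk, question):
    # Number of distinct words shared by the chunk and the question.
    return len(tokenize(chunk) & tokenize(question))


def top_k(chunks, question, k=1):
    # Keep only the k highest-scoring chunks.
    return sorted(chunks, key=lambda c: score(c, question), reverse=True)[:k]


chunks = [
    "LangSmith lets you run chains over datasets for testing.",
    "Docker packages applications into containers.",
    "Integral calculus studies areas under curves.",
]
question = "How does LangSmith help with testing?"

selected = top_k(chunks, question, k=1)
print(selected)
print(len(" ".join(chunks)), "characters in total vs", len(" ".join(selected)), "sent to the model")
```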

### A basic retrieval chain made with the `create_retrieval_chain()` function must have the following placeholders in its prompt:
- `{context}`
- `{input}`

Requirements:
- `pip install beautifulsoup4`
- `pip install faiss-cpu`

## Setting a web document as the retriever context and using it to answer queries:
- In this process, we look up relevant documents from a retriever and then pass them into the prompt.
- A retriever can be backed by anything: a SQL table, the internet, etc.
- In this example, we populate a vector store and use it as the retriever; thus, here, **the retriever context will be a vector store.**

### Loading the webpage data into a variable:

In [30]:
from langchain_community.document_loaders import WebBaseLoader
loader = WebBaseLoader("https://docs.smith.langchain.com/overview")

docs = loader.load()

In [45]:
docs

[Document(page_content='\n\n\n\n\nLangSmith Overview and User Guide | 🦜️🛠️ LangSmith\n\n\n\n\n\nSkip to main content🦜️🛠️ LangSmith DocsPython DocsJS/TS DocsSearchGo to AppLangSmithOverviewTracingTesting & EvaluationOrganizationsHubLangSmith CookbookRelease NotesOverviewOn this pageLangSmith Overview and User GuideBuilding reliable LLM applications can be challenging. LangChain simplifies the initial setup, but there is still work needed to bring the performance of prompts, chains and agents up the level where they are reliable enough to be used in production.Over the past two months, we at LangChain have been building and using LangSmith with the goal of bridging this gap. This is our tactical user guide to outline effective ways to use LangSmith and maximize its benefits.On by default\u200bAt LangChain, all of us have LangSmith’s tracing running in the background by default. On the Python side, this is achieved by setting environment variables, which we establish whenever we launch a 

### Indexing the loaded data into a vector store:
- This requires 2 main components: an **embedding model** and a **vector store**
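To show how the two components fit together, here is a toy sketch: an `embed` function turns text into a vector, and a store keeps those vectors and returns the nearest ones for a query. Everything here is an assumption made for illustration. The bag-of-words "embedding" and the `ToyVectorStore` class are hypothetical and far simpler than an Azure OpenAI embedding model or FAISS.

```python
import math
import re
from collections import Counter


def embed(text):
    # Toy "embedding": a bag-of-words count vector (real models return dense floats).
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory store: keeps (vector, text) pairs and searches by cosine."""

    def __init__(self, texts):
        self.entries = [(embed(t), t) for t in texts]

    def similarity_search(self, query, k=1):
        query_vec = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(query_vec, e[0]), reverse=True)
        return [text for _, text in ranked[:k]]


store = ToyVectorStore([
    "LangSmith supports tracing and testing of chains.",
    "FAISS is a library for similarity search.",
])
print(store.similarity_search("testing chains with LangSmith"))
```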

In [31]:
# Creating the embedding-model instance
from langchain_openai import AzureOpenAIEmbeddings

embeddings = AzureOpenAIEmbeddings(
    azure_deployment="text-embedding-ada-002",
    openai_api_version=api_version,
)

In [82]:
# Splitting the loaded documents into chunks before building the vector store
from langchain_community.vectorstores import FAISS
from langchain.text_splitter import RecursiveCharacterTextSplitter


text_splitter = RecursiveCharacterTextSplitter()
documents = text_splitter.split_documents(docs)

In [47]:
documents

[Document(page_content='LangSmith Overview and User Guide | 🦜️🛠️ LangSmith', metadata={'source': 'https://docs.smith.langchain.com/overview', 'title': 'LangSmith Overview and User Guide | 🦜️🛠️ LangSmith', 'description': 'Building reliable LLM applications can be challenging. LangChain simplifies the initial setup, but there is still work needed to bring the performance of prompts, chains and agents up the level where they are reliable enough to be used in production.', 'language': 'en'}),
 Document(page_content="Skip to main content🦜️🛠️ LangSmith DocsPython DocsJS/TS DocsSearchGo to AppLangSmithOverviewTracingTesting & EvaluationOrganizationsHubLangSmith CookbookRelease NotesOverviewOn this pageLangSmith Overview and User GuideBuilding reliable LLM applications can be challenging. LangChain simplifies the initial setup, but there is still work needed to bring the performance of prompts, chains and agents up the level where they are reliable enough to be used in production.Over the past

In [48]:
vector = FAISS.from_documents(documents, embeddings)

In [51]:
print( vector )

<langchain_community.vectorstores.faiss.FAISS object at 0x0000026349D64EE0>


- This `vector` variable contains our vector store of the web-page document.
- Now that we have this data indexed in a vectorstore, we will create a retrieval chain.
- This chain will perform the following things in sequence:
    1. Take an incoming question,
    2. Look up relevant documents,
    3. Pass those relevant documents along with the original question into an LLM and ask it to answer the original question.
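The three steps above can be sketched in plain Python with stand-ins for the retriever and the model. This is only an illustration under stated assumptions: `CORPUS`, `toy_retriever`, `fake_llm` and `toy_retrieval_chain` are hypothetical, and the fake model makes no API call.

```python
CORPUS = [
    "LangSmith can run a chain over a dataset and log the results.",
    "Docker images are built from a Dockerfile.",
]

PROMPT = """Answer the following question based only on the provided context:

<context>
{context}
</context>

Question: {input}"""


def toy_retriever(question):
    # Hypothetical retriever: keep documents that share at least one word with the question.
    words = set(question.lower().replace("?", "").split())
    return [doc for doc in CORPUS if words & set(doc.lower().rstrip(".").split())]


def fake_llm(prompt_text):
    # Hypothetical model: no API call, it only reports the size of the prompt it saw.
    return f"(model saw {len(prompt_text)} characters of prompt)"


def toy_retrieval_chain(inputs):
    question = inputs["input"]                      # 1. take the incoming question
    docs = toy_retriever(question)                  # 2. look up relevant documents
    prompt_text = PROMPT.format(context="\n\n".join(docs), input=question)
    answer = fake_llm(prompt_text)                  # 3. ask the model with documents + question
    return {"input": question, "context": docs, "answer": answer}


result = toy_retrieval_chain({"input": "How can langsmith help with testing?"})
print(list(result.keys()))
print(result["context"])
```

The returned dictionary deliberately uses the keys `input`, `context` and `answer`, mirroring the shape of the real chain's response.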

In [49]:
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template("""Answer the following question based only on the provided context:

<context>
{context}
</context>

Question: {input}""")

document_chain = create_stuff_documents_chain(llm, prompt)

- Now, we want the documents to come from the retriever we just set up. For a given question, the retriever dynamically selects the most relevant documents and passes them in as context through the `{context}` placeholder of the above `prompt`, so that only a small, relevant context is sent to the LLM.

In [65]:
from langchain.chains import create_retrieval_chain

# The retriever takes the user's "input" and returns the documents most relevant to it;
# these are passed as "context" to `document_chain`.
retriever = vector.as_retriever()
retrieval_chain = create_retrieval_chain(retriever, document_chain)

- We can now invoke the `retrieval_chain` created above.
- This chain returns a Python dictionary.
- The dictionary has a key named `answer`, which contains the LLM's answer to the user's question.

In [35]:
user_question = "How can langsmith help with testing?"

response = retrieval_chain.invoke({"input": user_question})

In [38]:
response

{'input': 'How can langsmith help with testing?',
 'context': [Document(page_content="You can also quickly edit examples and add them to datasets to expand the surface area of your evaluation sets or to fine-tune a model for improved quality or reduced costs.Monitoring\u200bAfter all this, your app might finally ready to go in production. LangSmith can also be used to monitor your application in much the same way that you used for debugging. You can log all traces, visualize latency and token usage statistics, and troubleshoot specific issues as they arise. Each run can also be assigned string tags or key-value metadata, allowing you to attach correlation ids or AB test variants, and filter runs accordingly.We’ve also made it possible to associate feedback programmatically with runs. This means that if your application has a thumbs up/down button on it, you can use that to log feedback back to LangSmith. This can be used to track performance over time and pinpoint under performing data

In [36]:
print( type(response) )

<class 'dict'>


In [37]:
print( response.keys() )

dict_keys(['input', 'context', 'answer'])


In [39]:
# The answer as returned by the LLM
print( response["answer"] )

LangSmith can help with testing by providing the ability to run chains over data points and visualize the outputs. It makes it easy to pull down a dataset, run a chain over the data points, and log the results to a new project associated with the dataset. This allows for reviewing and evaluating the outputs. LangSmith also offers a set of evaluators that can be specified during a test run to evaluate the results. Additionally, LangSmith provides annotation queues that allow for manual review and annotation of runs, which is useful for assessing subjective qualities and validating auto-evaluated runs.


In [40]:
print( response["input"] )

How can langsmith help with testing?


In [42]:
# We can view what relevant pieces are being selected and are being passed to the llm as context

print( response["context"] )

[Document(page_content="You can also quickly edit examples and add them to datasets to expand the surface area of your evaluation sets or to fine-tune a model for improved quality or reduced costs.Monitoring\u200bAfter all this, your app might finally ready to go in production. LangSmith can also be used to monitor your application in much the same way that you used for debugging. You can log all traces, visualize latency and token usage statistics, and troubleshoot specific issues as they arise. Each run can also be assigned string tags or key-value metadata, allowing you to attach correlation ids or AB test variants, and filter runs accordingly.We’ve also made it possible to associate feedback programmatically with runs. This means that if your application has a thumbs up/down button on it, you can use that to log feedback back to LangSmith. This can be used to track performance over time and pinpoint under performing data points, which you can subsequently add to a dataset for futur

# Conversation Retrieval Chain:
---------------------------------
- The chain we've created so far can only answer standalone questions. But the most common LLM applications are chatbots, which take the whole conversation history into account so that they can answer follow-up questions.
- To include the conversation history in the context, we can still use the `create_retrieval_chain()` function, but we need to change two things:
    1. <u>**Updating Retrieval**</u>: The **retrieval method** should no longer work on just the most recent input; it should take the whole history into account.
    2. <u>**Updating Chain**</u>: The **final LLM chain** should likewise take the whole history into account.

### A history-aware retriever made with the `create_history_aware_retriever()` function [it is a retriever, not a chain] must have the following placeholders in its prompt:
- `{chat_history}` : The previous messages of the conversation (both user and AI messages).
- `{input}` : Current user question / query.

### 1. Updating Retrieval

- This **retrieval <u>cannot be done</u> using only the FAISS vector store via `vector.as_retriever()`**.
- A new retriever is needed that takes the conversation details / chat history into account.
- To update retrieval, we create a new chain. This chain takes in the most recent input (`input`) and the conversation history (`chat_history`) and uses an LLM to generate a search query.
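The query-rewriting step can be sketched as follows. This is a plain-Python illustration, not the library's code: `toy_query_rewriter` stands in for the LLM call, and `echo_retriever` is a hypothetical retriever that only reports the query it received. The "no history, use the input as-is" branch reflects our understanding of how `create_history_aware_retriever()` is documented to behave.

```python
def toy_query_rewriter(chat_history, user_input):
    # Hypothetical stand-in for the LLM call: prepend the last human turn to the follow-up.
    last_human = [text for role, text in chat_history if role == "human"][-1]
    return f"{last_human} {user_input}"


def echo_retriever(query):
    # Hypothetical retriever that just reports the query it was given.
    return [f"docs for: {query}"]


def toy_history_aware_retrieve(inputs, retriever, rewriter):
    history = inputs.get("chat_history", [])
    if not history:
        # No history yet: the raw input already is the search query.
        query = inputs["input"]
    else:
        # With history: build a standalone search query from history + input.
        query = rewriter(history, inputs["input"])
    return retriever(query)


history = [("human", "Can LangSmith help test my LLM applications?"), ("ai", "Yes!")]

with_history = toy_history_aware_retrieve(
    {"chat_history": history, "input": "Tell me how"}, echo_retriever, toy_query_rewriter
)
without_history = toy_history_aware_retrieve(
    {"chat_history": [], "input": "What is LangSmith?"}, echo_retriever, toy_query_rewriter
)
print(with_history)
print(without_history)
```

A bare follow-up such as "Tell me how" carries no searchable content on its own, which is why the rewriting step is needed before the vector store is queried.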

In [66]:
from langchain.chains import create_history_aware_retriever
from langchain_core.prompts import MessagesPlaceholder

# First we need a prompt that we can pass into an LLM to generate this search query

prompt = ChatPromptTemplate.from_messages([
    MessagesPlaceholder(variable_name="chat_history"),
    ("user", "{input}"),
    ("user", "Given the above conversation, generate a search query to look up in order to get information relevant to the conversation")
])

# The retriever created earlier is reused here:
# retriever = vector.as_retriever()

history_aware_retriever = create_history_aware_retriever(llm, retriever, prompt)

### Testing the new retriever:
- We can test this out with a case where the user asks a follow-up question:

In [63]:
from langchain_core.messages import HumanMessage, AIMessage

chat_history = [HumanMessage(content="Can LangSmith help test my LLM applications?"), AIMessage(content="Yes!")]
response = history_aware_retriever.invoke({
    "chat_history": chat_history,
    "input": "Tell me how"
})

### The object returned by the retriever made using `create_history_aware_retriever()` is a `list`

In [64]:
print( type(response) )

<class 'list'>


In [58]:
for i in response:
    print(i, end="\n\n\n")

page_content='LangSmith Overview and User Guide | 🦜️🛠️ LangSmith' metadata={'source': 'https://docs.smith.langchain.com/overview', 'title': 'LangSmith Overview and User Guide | 🦜️🛠️ LangSmith', 'description': 'Building reliable LLM applications can be challenging. LangChain simplifies the initial setup, but there is still work needed to bring the performance of prompts, chains and agents up the level where they are reliable enough to be used in production.', 'language': 'en'}


page_content="Skip to main content🦜️🛠️ LangSmith DocsPython DocsJS/TS DocsSearchGo to AppLangSmithOverviewTracingTesting & EvaluationOrganizationsHubLangSmith CookbookRelease NotesOverviewOn this pageLangSmith Overview and User GuideBuilding reliable LLM applications can be challenging. LangChain simplifies the initial setup, but there is still work needed to bring the performance of prompts, chains and agents up the level where they are reliable enough to be used in production.Over the past two months, we at La

### 2. Updating Chain
Now we have the new retriever, i.e., the **history-aware retriever**. Using it, we can create a new chain that continues the conversation while taking the retrieved documents into account.

In [67]:
conversation_prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer the user's questions based on the below context:\n\n{context}"),
    MessagesPlaceholder(variable_name="chat_history"),
    ("user", "{input}"),
])
document_chain = create_stuff_documents_chain(llm, conversation_prompt)

history_aware_retrieval_chain = create_retrieval_chain(history_aware_retriever, document_chain)

- We can now test this out end-to-end:

In [68]:
chat_history = [HumanMessage(content="Can LangSmith help test my LLM applications?"), AIMessage(content="Yes!")]
chat_response = history_aware_retrieval_chain.invoke({
    "chat_history": chat_history,
    "input": "Tell me how"
})

In [69]:
print( type( chat_response ) )

<class 'dict'>


In [70]:
print( chat_response.keys() )

dict_keys(['chat_history', 'input', 'context', 'answer'])


In [71]:
print( chat_response )

{'chat_history': [HumanMessage(content='Can LangSmith help test my LLM applications?'), AIMessage(content='Yes!')], 'input': 'Tell me how', 'context': [Document(page_content='LangSmith Overview and User Guide | 🦜️🛠️ LangSmith', metadata={'source': 'https://docs.smith.langchain.com/overview', 'title': 'LangSmith Overview and User Guide | 🦜️🛠️ LangSmith', 'description': 'Building reliable LLM applications can be challenging. LangChain simplifies the initial setup, but there is still work needed to bring the performance of prompts, chains and agents up the level where they are reliable enough to be used in production.', 'language': 'en'}), Document(page_content="Skip to main content🦜️🛠️ LangSmith DocsPython DocsJS/TS DocsSearchGo to AppLangSmithOverviewTracingTesting & EvaluationOrganizationsHubLangSmith CookbookRelease NotesOverviewOn this pageLangSmith Overview and User GuideBuilding reliable LLM applications can be challenging. LangChain simplifies the initial setup, but there is still

In [72]:
from IPython.display import JSON

JSON( chat_response )

<IPython.core.display.JSON object>

In [74]:
JSON( chat_response["chat_history"] )

<IPython.core.display.JSON object>

## Continuous Chat Program:

In [75]:
from langchain_core.messages import HumanMessage, AIMessage

In [78]:
chat_history = []

while True:
    user_question = input("User:  ")
    if user_question in ("quit", "exit", "close"):
        break

    chat_response = history_aware_retrieval_chain.invoke({
        "chat_history": chat_history,
        "input": user_question
    })

    print("Assistant: ", chat_response["answer"])
    print("========================================================================================", end="\n\n\n")

    # Appending the current question and answer to chat_history
    chat_history.append( HumanMessage( content=user_question ) )
    chat_history.append( AIMessage( content=chat_response["answer"] ) )

User:   what is langsmith?


Assistant:  LangSmith is a tool developed by LangChain to simplify the process of building reliable Language Model (LLM) applications. It serves as a tactical user guide and provides effective ways to use LangSmith and maximize its benefits. LangSmith offers features such as tracing, debugging, monitoring, collecting examples, testing and evaluation, collaborative debugging, and exporting datasets. It helps developers understand and optimize the behavior of LLMs, chains, agents, and retrievers, and provides insights into input/output data, prompt editing, sequence of events, latency analysis, token usage, and more.




User:   Give 5 of its topmost features in bullet points


Assistant:  Certainly! Here are five top features of LangSmith:

1. Tracing: LangSmith provides built-in tracing capabilities that visualize the sequence of events in complicated chains and agents, making it easier to understand the interactions and inputs/outputs of each call.

2. Debugging: With LangSmith, you can debug LLMs, chains, and agents by accessing the exact input to the LLM, understanding the impact of prompt editing on the output, and identifying the cause of slow chain execution.

3. Monitoring: LangSmith allows you to monitor your application by logging all traces, visualizing latency and token usage statistics, and troubleshooting specific issues as they arise. It also supports adding feedback programmatically and tracking performance over time.

4. Collecting Examples: LangSmith enables you to collect and add examples to datasets for future testing and evaluation. You can edit the examples before adding them, making it valuable for tracking failures, testing different 

User:   Explain the 3rd bullet point above


Assistant:  Certainly! The third bullet point refers to the monitoring capabilities of LangSmith. Once your application is ready to go into production, LangSmith can be used to monitor its performance. Here's a breakdown of the features related to monitoring in LangSmith:

- Logging Traces: LangSmith allows you to log all traces of your application, capturing important information about the calls made to LLMs, chains, agents, tools, and retrievers. This comprehensive logging helps in understanding the behavior of your application and diagnosing any issues that may arise.

- Visualizing Latency and Token Usage Statistics: LangSmith provides visualizations of latency and token usage statistics. You can analyze the time taken by each step in a chain or agent, identify potential bottlenecks, and optimize the performance of your application. Additionally, you can track the token usage of each step, helping you identify costly parts of the chain and optimize token efficiency.

- Troubleshoot

User:   quit
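For readers who want to see how `chat_history` grows without typing into a prompt, the loop can be sketched non-interactively. This is an assumption-level illustration: `stub_chain` is a hypothetical replacement for `history_aware_retrieval_chain` that only reports how many earlier messages it received, and plain `(role, text)` tuples stand in for `HumanMessage` / `AIMessage`.

```python
def stub_chain(inputs):
    # Hypothetical chain: answers with the number of prior messages it was given.
    return {"answer": f"I can see {len(inputs['chat_history'])} earlier messages."}


def run_scripted_chat(chain, questions):
    chat_history = []
    for question in questions:
        if question in ("quit", "exit", "close"):
            break
        reply = chain({"chat_history": chat_history, "input": question})
        # Append the current question and answer so the next turn can see them.
        chat_history.append(("human", question))
        chat_history.append(("ai", reply["answer"]))
    return chat_history


history = run_scripted_chat(
    stub_chain, ["what is langsmith?", "Give 5 features", "quit", "ignored"]
)
for role, text in history:
    print(f"{role}: {text}")
```

Each turn adds two entries, so the history (and the prompt built from it) grows with every exchange.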


In [81]:
for obj in chat_history:
    print( f"Type: {type(obj)}", obj, sep="\n", end="\n\n\n" )

Type: <class 'langchain_core.messages.human.HumanMessage'>
content='what is langsmith?'


Type: <class 'langchain_core.messages.ai.AIMessage'>
content='LangSmith is a tool developed by LangChain to simplify the process of building reliable Language Model (LLM) applications. It serves as a tactical user guide and provides effective ways to use LangSmith and maximize its benefits. LangSmith offers features such as tracing, debugging, monitoring, collecting examples, testing and evaluation, collaborative debugging, and exporting datasets. It helps developers understand and optimize the behavior of LLMs, chains, agents, and retrievers, and provides insights into input/output data, prompt editing, sequence of events, latency analysis, token usage, and more.'


Type: <class 'langchain_core.messages.human.HumanMessage'>
content='Give 5 of its topmost features in bullet points'


Type: <class 'langchain_core.messages.ai.AIMessage'>
content='Certainly! Here are five top features of LangSmith:\n