In [3]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama.llms import OllamaLLM


In [5]:

template = """Question: {question}

Answer: Let's think step by step."""

prompt = ChatPromptTemplate.from_template(template)

model = OllamaLLM(model="llama3.1")

In [8]:
chain = prompt | model

print(chain.invoke({"question": "What is LangChain?"}))

To answer the question about LangChain, let's break it down into smaller, more manageable steps.

**Step 1: Understanding the context**
LangChain seems to be related to technology or programming. It might have something to do with AI, machine learning, or a specific tool or platform.

**Step 2: Researching possible meanings**
After some research, I found that LangChain is actually a Python library for building conversational AI and multimodal models. It provides a framework for creating robust and efficient chatbots, voice assistants, and other natural language processing (NLP) applications.

**Step 3: Confirming the information**
To ensure the accuracy of my answer, I'll take a look at some online resources, such as documentation, GitHub repositories, or forums related to LangChain. This will help me confirm that LangChain is indeed a Python library for conversational AI and multimodal models.

**Conclusion**
Based on my research and analysis, it appears that LangChain is a Python lib

In [10]:
from langchain_ollama import OllamaEmbeddings

embeddings = OllamaEmbeddings(
    model="llama3.1",
)

In [12]:
import bs4
from langchain import hub
from langchain_chroma import Chroma
from langchain_community.document_loaders import WebBaseLoader
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load, chunk and index the contents of the blog.
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
docs = loader.load()


In [13]:

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)
vectorstore = Chroma.from_documents(documents=splits, embedding=embeddings)
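
The `chunk_size` and `chunk_overlap` arguments above control how much text each chunk holds and how much neighbouring chunks share. The following is a minimal sketch of that idea using a plain fixed-size sliding window over characters. It is not how `RecursiveCharacterTextSplitter` works internally (that class prefers to split on separators such as paragraph breaks), and `split_with_overlap` is a hypothetical helper introduced only for illustration.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Cut text into fixed-size character windows that share chunk_overlap characters.

    Hypothetical helper: a simplified stand-in for the overlap idea, not the
    separator-aware algorithm used by RecursiveCharacterTextSplitter.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    # Stop early enough that the last window is not fully covered by the previous one.
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[start:start + chunk_size] for start in range(0, last_start, step)]


chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)
```

Each chunk repeats the last two characters of the one before it, so a sentence that straddles a boundary still appears whole in at least one chunk.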



In [14]:
# Retrieve and generate using the relevant snippets of the blog.
retriever = vectorstore.as_retriever()
prompt = hub.pull("rlm/rag-prompt")


def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | model
    | StrOutputParser()
)





'Task Decomposition is a technique that breaks down complex tasks into smaller, manageable steps. This involves instructing models like LLMs to "think step by step" to decompose hard tasks, transforming them into multiple simpler ones. It helps shed light on the model\'s thought process and enhances performance on complex tasks.'

In [16]:
prompt

ChatPromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, metadata={'lc_hub_owner': 'rlm', 'lc_hub_repo': 'rag-prompt', 'lc_hub_commit_hash': '50442af133e61576e74536c6556cefe1fac147cad032f4377b60c436e6cdcb6e'}, messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, template="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\nQuestion: {question} \nContext: {context} \nAnswer:"), additional_kwargs={})])

In [15]:
print(rag_chain.invoke("What is Task Decomposition?"))

Task Decomposition is a technique that breaks down complex tasks into smaller, simpler steps, allowing the agent to plan ahead and manage each step individually. This approach is inspired by the Chain of Thought (CoT) prompting technique, which instructs models to "think step by step" to decompose hard tasks into manageable ones. By doing so, agents can utilize more test-time computation and shed light on their thinking process.


# Retriever: An object that returns Documents given a text query

1. MultiQueryRetriever generates variants of the input question to improve retrieval hit rate.
2. MultiVectorRetriever instead generates variants of the embeddings, also in order to improve retrieval hit rate.
3. Max marginal relevance selects for relevance and diversity among the retrieved documents to avoid passing in duplicate context.
4. Documents can be filtered during vector store retrieval using metadata filters, such as with a Self Query Retriever.
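
To make item 3 concrete, here is a minimal, library-free sketch of max marginal relevance selection over toy 2-D vectors. The helper names (`cosine`, `mmr_select`), the `lambda_mult` weighting, and the example vectors are assumptions made for illustration; a real vector store may implement the scoring differently. In LangChain the equivalent behaviour is normally requested with `vectorstore.as_retriever(search_type="mmr")` rather than written by hand.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query_vec, doc_vecs, k=2, lambda_mult=0.5):
    """Pick up to k document indices, trading relevance against redundancy."""
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:

        def score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            # Similarity to the closest document that has already been picked.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


query = [0.8, 0.6]
doc_vectors = [
    [1.0, 0.0],  # relevant
    [1.0, 0.0],  # exact duplicate of the first document
    [0.0, 1.0],  # somewhat less relevant, but different content
]
picked = mmr_select(query, doc_vectors, k=2)
print(picked)
```

A plain top-2 similarity search would return both copies of the duplicate; the redundancy penalty makes the second pick the distinct document instead.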

In [17]:
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 6})

retrieved_docs = retriever.invoke("What are the approaches to Task Decomposition?")

len(retrieved_docs)

6

In [20]:

print(retrieved_docs[0].page_content)

Fig. 4. Experiments on AlfWorld Env and HotpotQA. Hallucination is a more common failure than inefficient planning in AlfWorld. (Image source: Shinn & Labash, 2023)


## Retrieval and Generation: Generate

In [22]:
from langchain import hub

prompt = hub.pull("rlm/rag-prompt")

example_messages = prompt.invoke(
    {"context": "filler context", "question": "filler question"}
).to_messages()

print(example_messages[0].content)



You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: filler question 
Context: filler context 
Answer:


In [23]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough


In [24]:
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | model
    | StrOutputParser()
)


In [25]:
for chunk in rag_chain.stream("What is Task Decomposition?"):
    print(chunk, end="", flush=True)

Task Decomposition is the process of breaking down complex tasks into smaller, manageable steps. This can be achieved through prompting techniques such as Chain of Thought (CoT), which instructs the model to "think step by step" and decompose hard tasks into simpler ones. Task decomposition transforms big tasks into multiple manageable tasks, allowing models to plan ahead and solve problems more efficiently.

## Let's trace how the input question flows through the above runnables.

1. As we've seen above, the input to prompt is expected to be a dict with keys "context" and "question". So the first element of this chain builds runnables that will calculate both of these from the input question:

    1. retriever | format_docs passes the question through the retriever, generating Document objects, and then to format_docs to generate strings;
    2. RunnablePassthrough() passes through the input question unchanged.
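
The fan-out described above can be imitated with plain Python functions. This is only a sketch of the data flow, not LangChain's implementation: `fake_retriever`, `fan_out`, `pipe`, and `passthrough` are hypothetical stand-ins, and the retriever returns strings instead of Document objects to keep the example self-contained.

```python
def fake_retriever(question):
    """Hypothetical stand-in for a retriever: returns canned text chunks."""
    return [f"chunk A about: {question}", f"chunk B about: {question}"]


def format_docs(docs):
    """Join the retrieved chunks into one context string."""
    return "\n\n".join(docs)


def passthrough(value):
    """Return the input unchanged, like RunnablePassthrough()."""
    return value


def pipe(*steps):
    """Compose steps left to right, mimicking the | operator."""
    def run(value):
        for step in steps:
            value = step(value)
        return value
    return run


def fan_out(branches):
    """Feed the same input to every branch and collect results by key."""
    def run(value):
        return {key: branch(value) for key, branch in branches.items()}
    return run


first_step = fan_out(
    {
        "context": pipe(fake_retriever, format_docs),
        "question": passthrough,
    }
)

prompt_inputs = first_step("What is Task Decomposition?")
print(prompt_inputs["question"])
print(prompt_inputs["context"])
```

The resulting dict has exactly the two keys the prompt expects, which is why the dict can be piped straight into `prompt` in the real chain.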



In [27]:
# Calling chain.invoke(question) on this chain returns the formatted prompt, ready for inference.

chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
)

## Prompt Creation

In [29]:
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate

system_prompt = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer "
    "the question. If you don't know the answer, say that you "
    "don't know. Use three sentences maximum and keep the "
    "answer concise."
    "\n\n"
    "{context}"
)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}"),
    ]
)

In [33]:
prompt

ChatPromptTemplate(input_variables=['context', 'input'], input_types={}, partial_variables={}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context'], input_types={}, partial_variables={}, template="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, say that you don't know. Use three sentences maximum and keep the answer concise.\n\n{context}"), additional_kwargs={}), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], input_types={}, partial_variables={}, template='{input}'), additional_kwargs={})])

In [30]:

question_answer_chain = create_stuff_documents_chain(model, prompt)


In [40]:
retriever.invoke("What is Decomposition")

[Document(metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}, page_content='inquired about current trends in anticancer drug discovery;\nselected a target;\nrequested a scaffold targeting these compounds;\nOnce the compound was identified, the model attempted its synthesis.'),
 Document(metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}, page_content='Fig. 2.  Examples of reasoning trajectories for knowledge-intensive tasks (e.g. HotpotQA, FEVER) and decision-making tasks (e.g. AlfWorld Env, WebShop). (Image source: Yao et al. 2023).\nIn both experiments on knowledge-intensive tasks and decision-making tasks, ReAct works better than the Act-only baseline where Thought: … step is removed.\nReflexion (Shinn & Labash 2023) is a framework to equips agents with dynamic memory and self-reflection capabilities to improve reasoning skills. Reflexion has a standard RL setup, in which the reward model provides a simple binary reward and the actio

In [31]:
rag_chain = create_retrieval_chain(retriever, question_answer_chain)



In [32]:
response = rag_chain.invoke({"input": "What is Task Decomposition?"})
print(response["answer"])

Task decomposition is the process of breaking down complex tasks into smaller and simpler steps. This can be done through techniques such as Chain of Thought (CoT), Tree of Thoughts, or using task-specific instructions, and enables models to plan ahead and utilize more test-time computation. It helps transform big tasks into multiple manageable tasks.


In [42]:
for document in response["context"]:
    print(document)
    print()

page_content='This benchmark evaluates the agent’s tool use capabilities at three levels:

Level-1 evaluates the ability to call the API. Given an API’s description, the model needs to determine whether to call a given API, call it correctly, and respond properly to API returns.
Level-2 examines the ability to retrieve the API. The model needs to search for possible APIs that may solve the user’s requirement and learn how to use them by reading documentation.
Level-3 assesses the ability to plan API beyond retrieve and call. Given unclear user requests (e.g. schedule group meetings, book flight/hotel/restaurant for a trip), the model may have to conduct multiple API calls to solve it.' metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}

page_content='Fig. 4. Experiments on AlfWorld Env and HotpotQA. Hallucination is a more common failure than inefficient planning in AlfWorld. (Image source: Shinn & Labash, 2023)' metadata={'source': 'https://lilianweng.github.i