# WS24 - Intelligente Informationssysteme

## Block 3: Retrieval Augmented Generation

**Part 3: Understand LangChain Chains**


In [1]:
## Understand the concept of Chains
# LangChain uses the LCEL (LangChain Expression Language) Runnable protocol to define chains
# https://python.langchain.com/docs/concepts/lcel/#runnablesequence
#
# In LangChain, Retrievers, Prompts, and LLMs are all instances of Runnable.
# This means they implement the same methods, such as the sync and async variants of .invoke, .stream, and .batch,
# which makes them easy to connect.
# They can be connected into a RunnableSequence (itself a Runnable) via the | operator.
#
# Look at some examples from https://python.langchain.com/docs/how_to/lcel_cheatsheet/
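
To make the idea of a shared interface plus the `|` operator concrete, here is a toy sketch in plain Python. `MiniRunnable` is a hypothetical class written only for illustration; it is not how LangChain implements `Runnable`, it only mimics the behaviour described above (a common `invoke`/`batch` interface and composition via `|`).

```python
from typing import Any, Callable, Iterable, List


class MiniRunnable:
    """Toy stand-in for a Runnable (hypothetical, not LangChain's class)."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def invoke(self, value: Any) -> Any:
        # Process a single input.
        return self.func(value)

    def batch(self, values: Iterable[Any]) -> List[Any]:
        # Process several inputs; in this sketch simply one after another.
        return [self.invoke(v) for v in values]

    def __or__(self, other: "MiniRunnable") -> "MiniRunnable":
        # a | b builds a new runnable that feeds the output of a into b.
        return MiniRunnable(lambda value: other.invoke(self.invoke(value)))


add_one = MiniRunnable(lambda x: x + 1)
to_text = MiniRunnable(lambda x: f"value={x}")

pipeline = add_one | to_text
print(pipeline.invoke(1))
print(pipeline.batch([1, 2]))
```

Because the result of `|` is again a `MiniRunnable`, pipelines can be extended with further `|` steps, which is the same property the text attributes to `RunnableSequence`.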

In [2]:
# Start with the concept of a Runnable
from langchain_core.runnables import RunnableLambda

def some_function(x: int) -> str:
    return str(x)

runnable = RunnableLambda(some_function)
print(type(runnable.invoke(5)))

<class 'str'>


In [3]:
# Async variant:
print(await runnable.ainvoke(7))

# or as batch
runnable = RunnableLambda(lambda x: str(x))
print(runnable.batch([5, 6, 7]))

7
['5', '6', '7']


In [4]:
# Runnables can be streamed
import time
def some_function(x: list):
    for y in x:
        time.sleep(1) # wait 1 second!
        yield str(y)


for i in some_function([1,2,3,4,5]):
    print(i)

print("----- now with a streamed runnable -----")
runnable = RunnableLambda(some_function)

for chunk in runnable.stream(range(6,10,1)):
    print(chunk)

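# Async variant: astream() returns an async iterator, so there is no `await` before it.
# Note: for async streaming, wrap an async generator function (async def ... yield) instead of the sync one above.
# async for chunk in runnable.astream(range(5)):
#     print(chunk)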

1
2
3
4
5
----- now with a streamed runnable -----
6
7
8
9


In [5]:
# Compose two runnables into a chain; data flows from left to right
runnable1 = RunnableLambda(lambda x: {"context": x})
runnable2 = RunnableLambda(lambda x: [x]*3) # returns a list of 3 elements

chain = runnable2 | runnable1
print(chain.invoke(2))

chain = runnable1 | runnable2
print(chain.invoke(2))

{'context': [2, 2, 2]}
[{'context': 2}, {'context': 2}, {'context': 2}]
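
One way to read the two results is to rewrite each chain as nested function calls: `a | b` applied to `x` behaves like `b(a(x))`. The sketch below uses plain functions with illustrative names in place of the two lambdas.

```python
def wrap_in_context(x):
    # Same logic as runnable1.
    return {"context": x}


def repeat_three_times(x):
    # Same logic as runnable2.
    return [x] * 3


# runnable2 | runnable1: repeat first, then wrap the list.
repeat_then_wrap = wrap_in_context(repeat_three_times(2))

# runnable1 | runnable2: wrap first, then repeat the dict.
wrap_then_repeat = repeat_three_times(wrap_in_context(2))

print(repeat_then_wrap)
print(wrap_then_repeat)
```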


In [6]:
# A runnable that reads a key from the input dict and returns a single value
from langchain_core.runnables import RunnableLambda, RunnablePassthrough

runnable1 = RunnableLambda(lambda x: x["foo"] + 7)

runnable1.invoke({"foo": 10})

17

In [7]:
# Merge input dict with output dict
from langchain_core.runnables import RunnableLambda, RunnablePassthrough

runnable1 = RunnableLambda(lambda x: x["foo"] + 7)

chain = RunnablePassthrough.assign(bar=runnable1)

chain.invoke({"foo": 10})

{'foo': 10, 'bar': 17}

In [8]:
# Include the input dict in the output dict
from langchain_core.runnables import RunnableLambda, RunnableParallel, RunnablePassthrough

runnable1 = RunnableLambda(lambda x: x["foo"] + 7)

chain = RunnableParallel(bar=runnable1, baz=RunnablePassthrough())

chain.invoke({"foo": 10})

{'bar': 17, 'baz': {'foo': 10}}

In [40]:
# Compare three ways of combining the input dict with a computed value
from langchain_core.runnables import RunnableLambda, RunnableParallel, RunnablePassthrough

runnable1 = RunnableLambda(lambda x: x["foo"] + 7)

chain = RunnableParallel(bar=runnable1, baz=RunnablePassthrough())
print(chain.invoke({"foo": 10}))

chain = RunnablePassthrough.assign(baz=RunnablePassthrough() | runnable1)
print(chain.invoke({"foo": 10}))

chain = RunnableParallel(bar=RunnablePassthrough(), baz=RunnablePassthrough()| runnable1)
print(chain.invoke({"foo": 10}))

{'bar': 17, 'baz': {'foo': 10}}
{'foo': 10, 'baz': 17}
{'bar': {'foo': 10}, 'baz': 17}
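
The three outputs above can be reproduced with plain dictionaries, which may help when deciding between `RunnableParallel` and `RunnablePassthrough.assign`. The helpers `parallel`, `assign`, and `identity` below are hypothetical stand-ins written for this sketch; they imitate the observed behaviour only and run their steps one after another.

```python
from typing import Any, Callable, Dict


def add_seven(x: Dict[str, int]) -> int:
    return x["foo"] + 7


def identity(x: Any) -> Any:
    # Plays the role of RunnablePassthrough(): returns its input unchanged.
    return x


def parallel(x: Any, **steps: Callable[[Any], Any]) -> Dict[str, Any]:
    # Like RunnableParallel: every step receives the same input,
    # and the results are collected under the step names.
    return {name: step(x) for name, step in steps.items()}


def assign(x: Dict[str, Any], **steps: Callable[[Any], Any]) -> Dict[str, Any]:
    # Like RunnablePassthrough.assign: keep the input dict and add new keys.
    return {**x, **parallel(x, **steps)}


data = {"foo": 10}
first = parallel(data, bar=add_seven, baz=identity)
second = assign(data, baz=add_seven)
third = parallel(data, bar=identity, baz=add_seven)
print(first)
print(second)
print(third)
```

In short: `parallel` builds a new dict from scratch, so the input only survives if a passthrough step is included, while `assign` merges new keys into the input dict.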


In [42]:
# get_graph().print_ascii() needs the grandalf package; uncomment to install it
#!pip install grandalf



In [9]:
chain.get_graph().print_ascii()

   +------------------------+        
   | Parallel<bar,baz>Input |        
   +------------------------+        
          **         ***             
        **              *            
       *                 **          
+--------+          +-------------+  
| Lambda |          | Passthrough |  
+--------+          +-------------+  
          **         ***             
            **      *                
              *   **                 
   +-------------------------+       
   | Parallel<bar,baz>Output |       
   +-------------------------+       


In [10]:
def prompt_funct(x: dict) -> str:
    return f"This is the context:\n\n {x['context']} \n\nThis is the question:\n\n {x['question']}"

runnable_1 = RunnableLambda(prompt_funct)

print(runnable_1.invoke({"context": "CONTEXT", "question": "QUESTION"}))

This is the context:

 CONTEXT 

This is the question:

 QUESTION


In [12]:
# Put it all together: set up embeddings, vector store, retriever, LLM, and prompt
from langchain_ollama import OllamaEmbeddings
from langchain_ollama import ChatOllama
from langchain_chroma import Chroma
from langchain_core.prompts import PromptTemplate

embeddings = OllamaEmbeddings(model="nomic-embed-text")
vectorstore = Chroma(persist_directory="vector_store", collection_name="lils_blogs", embedding_function=embeddings)
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 6})

llm = ChatOllama(model="llama3.2:latest")

prompt = """You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question} 
Context: {context} 
Answer:"""

prompt_template = PromptTemplate.from_template(prompt)

question = "What are the approaches to task decomposition?"

In [13]:
# Build the retrieval part of the chain: fetch documents and format them as numbered text
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

def format_docs(docs):
    return "\n\n".join(f"{i}.) {doc.page_content}" for i, doc in enumerate(docs))

retriever_chain = retriever | format_docs
print(retriever_chain.invoke(question))

0.) Fig. 1. Overview of a LLM-powered autonomous agent system.
Component One: Planning#
A complicated task usually involves many steps. An agent needs to know what they are and plan ahead.
Task Decomposition#
Chain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.

1.) (3) Task execution: Expert models execute on the specific tasks and log results.
Instruction:

With the input and the inference results, the AI assistant needs to describe the process and results. The previous stages can be formed as - User Input: {{ User Input }}, Task Planning: {{ Tasks }}, Model Selection: {{ Model Assignment }}, Task Execution: {{ Predictions }}. You must first

In [14]:
rag_chain = (
    {"context": retriever_chain, "question": RunnablePassthrough()}
    | prompt_template
    | llm
    | StrOutputParser()
)
rag_chain.get_graph().print_ascii()

            +---------------------------------+          
            | Parallel<context,question>Input |          
            +---------------------------------+          
                    ***               ***                
                 ***                     ***             
               **                           ***          
+----------------------+                       **        
| VectorStoreRetriever |                        *        
+----------------------+                        *        
            *                                   *        
            *                                   *        
            *                                   *        
    +-------------+                     +-------------+  
    | format_docs |                     | Passthrough |  
    +-------------+*                    +-------------+  
                    ***               ***                
                       ***         ***                   
              

In [15]:
for chunk in rag_chain.stream("What is Task Decomposition?"):
    print(chunk, end="", flush=True)

Task decomposition is a technique used to break down complex tasks into smaller, simpler steps. It involves using natural language processing (NLP) techniques, such as chain of thought (CoT) or tree of thoughts, to transform big tasks into manageable ones. This allows the agent to plan ahead and utilize more test-time computation to decompose hard tasks into smaller and simpler steps.