## Understanding Chaining And Runnables

### Load the ENV file

In [5]:
from dotenv import load_dotenv
import os

# Load variables from the project's .env file, overriding any that are already set
load_dotenv('../.env', override=True)

print(os.getenv("LANGSMITH_PROJECT"))

MyFirstLangchainAutomation


### Create LLM Object

In [None]:
from langchain_ollama import ChatOllama
from langchain_core.runnables import chain
from langchain_core.messages import AIMessage
from typing import Union
import re

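# ChatOllama implements the Runnable interface, so it can be chained directly
llm_qwen = ChatOllama(
    base_url="http://localhost:11434",
    model="qwen3:latest",
    temperature=0.6,
    num_predict=200,  # Ollama's option for capping the number of generated tokens
)

# A second model, used later to compare and to route between LLMs
llm_llama = ChatOllama(
    base_url="http://localhost:11434",
    model="llama3.1:8b",
    temperature=0.6,
    num_predict=200,  # Ollama's option for capping the number of generated tokens
)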

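# Wrap remove_any_think_block with @chain so it becomes a Runnable and can be piped
@chain
def remove_any_think_block(message: Union[str, AIMessage]) -> str:
    # Accept either a raw string or an AIMessage and work on its text
    text = message.content if isinstance(message, AIMessage) else message

    # Strip every <think>...</think> block, including ones that span multiple lines
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL | re.IGNORECASE)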
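
The pattern can be checked without a running model. The sketch below applies the same regex to hand-written strings; `strip_think` is a hypothetical plain-function version of `remove_any_think_block` (no `@chain`), and the trailing `.strip()` is an addition that the original function does not perform.

```python
import re

# Same pattern as in remove_any_think_block, compiled once for reuse.
THINK_BLOCK = re.compile(r"<think>.*?</think>", flags=re.DOTALL | re.IGNORECASE)

def strip_think(text: str) -> str:
    """Remove every <think>...</think> block, then trim surrounding whitespace."""
    return THINK_BLOCK.sub("", text).strip()

# DOTALL lets `.` match newlines, so a multi-line block is removed as a whole.
sample = "<think>\nLet me reason...\n</think>\n\nParis is the capital."
cleaned = strip_think(sample)
print(cleaned)

# Non-greedy matching removes each block separately and keeps the text between them.
two_blocks = strip_think("<THINK>a</THINK>keep<think>b</think> this")
print(two_blocks)
```

Without the non-greedy `.*?`, the second call would delete everything from the first opening tag to the last closing tag, including the word `keep`.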


### Chaining the Runnables

In [15]:
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate([
    ("system", "You are an LLM expert"), 
    ("user", "What is the advantage of {action} AI models on Cloud ? /no_think")
    ])

# Chain the prompt and the LLM: the prompt's output becomes the LLM's input
chain = prompt | llm_qwen

chain.invoke({"action": "running"})

AIMessage(content='<think>\n\n</think>\n\nRunning AI models on the **cloud** offers numerous advantages, making it a preferred choice for developers, data scientists, and organizations. Here are the key benefits of running AI models on the cloud:\n\n---\n\n### 1. **Scalability**\n- **On-demand resources**: Cloud platforms provide the ability to scale computing power, storage, and memory up or down based on the workload.\n- **Elasticity**: AI models can be scaled to handle large volumes of data or increased inference requests without the need for physical hardware upgrades.\n\n---\n\n### 2. **Cost Efficiency**\n- **Pay-as-you-go model**: You only pay for the resources you actually use, which reduces upfront capital expenditure.\n- **No need for on-premises infrastructure**: Avoid the costs of maintaining physical servers, cooling, and other infrastructure.\n\n---\n\n### 3. **Access to Advanced Tools and Services**\n- **AI/ML platforms**: Cloud providers offer integrated AI/ML platforms 
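
To make the `|` operator less magical, here is a toy model of left-to-right composition in plain Python. It is a simplified stand-in, not LangChain's implementation: `Step`, `fake_prompt`, and `fake_llm` are hypothetical names, and the "model" just echoes its input.

```python
class Step:
    """Toy stand-in for a Runnable: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # Compose left to right: run self first, then feed the result to `other`.
        return Step(lambda value: other.invoke(self.invoke(value)))

# Hypothetical stand-ins for the prompt template and the model.
fake_prompt = Step(lambda d: f"What is the advantage of {d['action']} AI models on Cloud?")
fake_llm = Step(lambda text: f"[answer to: {text}]")

toy_chain = fake_prompt | fake_llm
result = toy_chain.invoke({"action": "running"})
print(result)
```

The real chain follows the same shape: `invoke` takes the dictionary of template variables, the prompt formats it, and the formatted prompt is passed on to the LLM.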

### Output String Parsing

In [26]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

prompt = ChatPromptTemplate([
    ("system", "You are the {tool} expert"),
    ("user", "What is the advantage of {action} the {tool} {article} {var} ?"),
])

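# Add a final step after the prompt and the LLM that returns plain text.
# StrOutputParser() would return the message content unchanged:
# chain = prompt | llm_qwen | StrOutputParser()
# remove_any_think_block also strips any <think>...</think> block:
chain = prompt | llm_qwen | remove_any_think_block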

output = chain.invoke({"action": "running", "tool": "LLM", "article": "on", "var": "Local machine"})

print(output)



Running a Large Language Model (LLM) on a **local machine** offers several advantages, depending on the use case, infrastructure, and requirements. Here’s a structured breakdown of the key benefits:

---

### **1. Data Privacy and Security**
- **Sensitivity of Data**: When handling **sensitive or proprietary data** (e.g., healthcare records, financial information, or intellectual property), running the model locally avoids transmitting data over the internet, reducing the risk of data breaches or unauthorized access.
- **Compliance**: Satisfies **data localization laws** (e.g., GDPR in the EU, CCPA in the US) that require sensitive data to be processed within specific geographic boundaries.

---

### **2. Control and Customization**
- **Fine-Tuning and Optimization**: Local deployment allows **hyperparameter tuning**, **model quantization**, or **custom training** to adapt the LLM to specific tasks (e.g., domain-specific language, niche applications).
- **Avoiding Vendor Lock-in**: R

### Chaining Multiple Chains | RunnableSequence

In [None]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

firstPrompt = ChatPromptTemplate([
    ("system", "You are the {tool} expert"),
    ("user", "What is the advantage of {action} the {tool} {article} {var} ? /no_think"),
])

detailedResponseChain = firstPrompt | llm_qwen | StrOutputParser()

followingPrompt = ChatPromptTemplate.from_template("""
                                     Analyse the response & get me just the headings from the {response}.
                                     Response should be in bullet points. Please do not provide explanation, but only precise output.
                                     """)

chainWithHeading = {"response": detailedResponseChain} | followingPrompt | llm_llama | StrOutputParser()

output = chainWithHeading.invoke({"action": "running", "tool": "LLM", "article": "on", "var": "Local machine"})

print(output)


### Running Chains in Parallel | RunnableParallel

In [None]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableParallel

firstPrompt = ChatPromptTemplate([
    ("system", "You are the {tool} expert"),
    ("user", "What is the advantage of {action} the {tool} {article} {var} ? /no_think"),
])

detailedResponseChain = firstPrompt | llm_qwen | StrOutputParser()

nextPrompt = ChatPromptTemplate.from_template("""
                                     What is the capital of India ?
                                     Please do not provide LLM thoughts but rather directly the precise output.
                                     """)

capitalChain = nextPrompt | llm_llama | StrOutputParser()

parallelChain = RunnableParallel(chain1=detailedResponseChain, chain2=capitalChain)

output = parallelChain.invoke({"action": "running", "tool": "LLM", "article": "on", "var": "Local machine"})

# Both chains receive the same input and run concurrently, which can reduce the total wall-clock time
print(output['chain1'])
print("\n\n")
print(output['chain2'])
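
The fan-out pattern can be sketched with the standard library alone. This is a simplified model of the idea, not LangChain's implementation: `run_parallel`, `describe`, and `constant` are hypothetical names, and the two functions stand in for the two chains.

```python
from concurrent.futures import ThreadPoolExecutor

def run_parallel(branches, value):
    """Call every branch with the same input and collect the results by key."""
    with ThreadPoolExecutor(max_workers=len(branches)) as pool:
        # Submit all branches first so they can run at the same time.
        futures = {name: pool.submit(func, value) for name, func in branches.items()}
        return {name: future.result() for name, future in futures.items()}

def describe(inputs):
    # Uses the input variables, like detailedResponseChain.
    return f"{inputs['action']} the {inputs['tool']}"

def constant(_inputs):
    # Ignores the input, like a prompt with no template variables.
    return "New Delhi"

parallel_output = run_parallel(
    {"chain1": describe, "chain2": constant},
    {"action": "running", "tool": "LLM"},
)
print(parallel_output)
```

As in the notebook cell, the result is a dictionary whose keys match the names given to each branch, and a branch that needs no input variables simply ignores the dictionary it receives.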

### Runnable Lambda

In [27]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableLambda

firstPrompt = ChatPromptTemplate([
    ("system", "You are the {tool} expert"),
    ("user", "What is the advantage of {action} the {tool} {article} {var} ? /no_think"),
])

# Chain 1
firstResponseChain = firstPrompt | llm_qwen | StrOutputParser()

followingPrompt = ChatPromptTemplate.from_template("""
                                     Analyse the response & get me just the headings from the {response}.
                                     Response should be in bullet points. Without explanation, but just the headings.
                                     """)

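# Create the RunnableLambda: the function receives the formatted prompt and returns
# the LLM to use; LangChain then invokes the returned Runnable with the same input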
decide_llm = lambda response: llm_llama if len(str(response)) > 300 else llm_qwen

llm_selector = RunnableLambda(decide_llm)

# Chain 2
chainWithHeading = {"response": firstResponseChain} | followingPrompt | llm_selector | StrOutputParser() | remove_any_think_block

output = chainWithHeading.invoke({"action": "running", "tool": "LLM", "article": "on", "var": "Local machine"})

print(output)

• **Privacy and Data Security**
• **Customization and Flexibility**
• **Performance and Latency**
• **Cost Efficiency**
• **Avoiding Vendor Lock-in**
• **Regulatory Compliance**
• **Scalability and Integration**
• **Training and Development** 
• **Summary**
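
The routing step relies on one behaviour: when the wrapped function returns something that can itself be invoked, that object is called with the same input. A minimal plain-Python sketch of that idea follows; `invoke_with_routing`, `short_model`, `long_model`, and `decide` are hypothetical names, and the two "models" only report the length of what they received.

```python
def invoke_with_routing(router, value):
    """Call `router`; if it returns a callable, call that with the same input."""
    chosen = router(value)
    return chosen(value) if callable(chosen) else chosen

def short_model(text):
    return f"short:{len(text)}"

def long_model(text):
    return f"long:{len(text)}"

def decide(text):
    # Mirrors decide_llm: pick a handler based on input length (threshold 300).
    return long_model if len(text) > 300 else short_model

short_result = invoke_with_routing(decide, "x" * 10)
long_result = invoke_with_routing(decide, "x" * 301)
print(short_result, long_result)
```

Note that the selected handler sees the original input, not the router's return value, which is why `decide_llm` can return an LLM object and still have the prompt reach it.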


### Using @chain Decorator

In [None]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import chain

firstPrompt = ChatPromptTemplate([
    ("system", "You are the {tool} expert"),
    ("user", "What is the advantage of {action} the {tool} {article} {var} ? /no_think"),
])

# Chain 1
firstResponseChain = firstPrompt | llm_qwen | StrOutputParser()

followingPrompt = ChatPromptTemplate.from_template("""
                                     Analyse the response & get me just the headings from the {response}.
                                     Response should be in bullet points.
                                     """)

# Use the @chain decorator to turn `decide_llm` into a Runnable.
# It receives the formatted prompt (named `response` here) and returns the LLM to invoke.
@chain
def decide_llm(response):
    response_text = str(response)
    if len(response_text) > 300:
        return llm_llama
    return llm_qwen

# Chain 2
chainWithHeading = {"response": firstResponseChain} | followingPrompt | decide_llm | remove_any_think_block

output = chainWithHeading.invoke({"action": "running", "tool": "LLM", "article": "on", "var": "Local machine"})

print(output)