# Integration
LangChain offers a variety of ways to plug external tools and systems (APIs, databases, vector stores, file loaders, and LLM providers) directly into your LLM workflow. These integrations connect your agent or app to real-world data and make it more capable and interactive.

## Setup

In [21]:
# =================== Section to change according to your choice of APIs you have access to===============
from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint

repo_id = "mistralai/Mistral-7B-Instruct-v0.3"
# llm
llm = HuggingFaceEndpoint(
    repo_id=repo_id,
    task="text-generation",  # Ensure the correct task type
    temperature=0.5,
    max_new_tokens=128  # Adjust max_new_tokens for generation
)

# Set up ChatHuggingFace
chatllm = ChatHuggingFace(llm=llm)


# Test with a prompt
try:
    response = chatllm.invoke("Explain the concept of quantum computing.")
    print(response.content)
except Exception as e:
    print(e)

Note: Environment variable`HF_TOKEN` is set and is the current active token independently from the token you've just configured.


Quantum computing is a branch of computing that utilizes the principles of quantum mechanics to perform operations on data. Unlike classical computers that use bits (0s or 1s) for processing information, quantum computers use quantum bits, or qubits. A qubit can exist in multiple states at once—a property called superposition—and can be entangled with other qubits, meaning the state of one qubit can be dependent on the state of another, no matter the distance between them.

Two fundamental ideas in quantum computing are quantum parallelism and quantum tunneling. Quantum parallelism allows a quantum computer to perform many calculations simultaneously due to the superposition property of qubits. Instead of following a sequential path like a classical computer, a quantum computer can exist in many states at once, each representing a possible solution. Quantum tunneling, on the other hand, refers to the ability of particles to pass through barriers that they wouldn't be able to in classic

## OpenAI

In [None]:
from langchain_openai import OpenAI, ChatOpenAI
from langchain_openai import OpenAIEmbeddings

# Text completion models
llm = OpenAI(
    model_name="gpt-3.5-turbo-instruct",
    temperature=0.7,
    max_tokens=256
)

# Chat models
chat_model = ChatOpenAI(
    model_name="gpt-4",
    temperature=0.7
)

# Embeddings
embeddings = OpenAIEmbeddings(
    model="text-embedding-3-small"
)

## Anthropic


In [None]:
from langchain_anthropic import ChatAnthropic

# Chat models
claude = ChatAnthropic(
    model="claude-3-opus-20240229",
    temperature=0.7,
    max_tokens_to_sample=1000
)

response = claude.invoke("Explain the concept of quantum entanglement")

## Hugging Face

In [None]:
from langchain_huggingface import HuggingFaceEndpoint, HuggingFaceEmbeddings

# LLM via Inference API
llm = HuggingFaceEndpoint(
    endpoint_url="https://api-inference.huggingface.co/models/google/flan-t5-xxl",
    task="text2text-generation"
)

# Local pipeline
from langchain_huggingface import HuggingFacePipeline
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

model_id = "mistralai/Mistral-7B-Instruct-v0.2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

local_llm = HuggingFacePipeline(pipeline=pipe)

# Embeddings
hf_embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-mpnet-base-v2"
)

## Google Cloud Vertex AI

Set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of a JSON credentials file. google-auth loads the credentials from this file.

In [None]:
from langchain_google_vertexai import VertexAI, ChatVertexAI, VertexAIEmbeddings

# Text models
vertex_llm = VertexAI(
    model_name="text-bison@002",
    max_output_tokens=256,
    temperature=0.7,
    top_p=0.8,
    top_k=40,
    project="your-project-id",
    location="us-central1"
)

# Chat models
vertex_chat = ChatVertexAI(
    model_name="chat-bison@002",
    max_output_tokens=256,
    temperature=0.7,
    project="your-project-id",
    location="us-central1"
)

# Embeddings
vertex_embeddings = VertexAIEmbeddings(
    model_name="textembedding-gecko@001",
    project="your-project-id",
    location="us-central1"
)
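
Before constructing the Vertex AI clients, it can help to confirm that the credentials variable actually points at a readable JSON file. The following is a minimal sketch using only the standard library; `check_google_credentials` is a hypothetical helper written for this example and is not part of google-auth or LangChain.

```python
import json
import os
from typing import Optional


def check_google_credentials(env_var: str = "GOOGLE_APPLICATION_CREDENTIALS") -> Optional[str]:
    """Return the credentials path if it is set and holds valid JSON, else None.

    Hypothetical helper for illustration; google-auth performs its own,
    more thorough validation when it loads the file.
    """
    path = os.environ.get(env_var)
    if not path or not os.path.isfile(path):
        return None
    try:
        with open(path, "r", encoding="utf-8") as fh:
            json.load(fh)
    except (OSError, ValueError):
        # ValueError covers both invalid JSON and undecodable bytes
        return None
    return path


if check_google_credentials() is None:
    print("GOOGLE_APPLICATION_CREDENTIALS is unset or does not point to a JSON file")
```

A check like this only catches a missing or malformed file early; whether the credentials grant access to the project is still decided when the first request is made.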

## AWS Bedrock

`credentials_profile_name` is the name of the profile in the ~/.aws/credentials or ~/.aws/config files, which has either access keys or role information specified. If it is not specified, the default credential profile is used or, on an EC2 instance, credentials from IMDS.

In [None]:
from langchain_aws import BedrockLLM, BedrockEmbeddings

# Text models
bedrock_llm = BedrockLLM(
    model_id="anthropic.claude-v2",
    region_name="us-west-2",
    credentials_profile_name="default"
)

# Embeddings
bedrock_embeddings = BedrockEmbeddings(
    model_id="amazon.titan-embed-text-v1",
    region_name="us-west-2"
)

# With custom model parameters
from langchain_aws import ChatBedrock  # chat model class in langchain_aws (BedrockChat is the older, deprecated name)

bedrock_chat = ChatBedrock(
    model_id="anthropic.claude-3-sonnet-20240229-v1:0",
    model_kwargs={
        "temperature": 0.7,
        "max_tokens": 500,
        "top_p": 0.9
    }
)
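
To see which profile names are available for `credentials_profile_name`, the shared credentials file can be read with the standard library. This is a sketch under the assumption that the file lives at the usual `~/.aws/credentials` location; `list_aws_profiles` is a hypothetical helper, and boto3 performs its own profile resolution independently of it.

```python
import configparser
import os
from typing import List


def list_aws_profiles(credentials_path: str = "~/.aws/credentials") -> List[str]:
    """Return the profile names found in an AWS shared credentials file.

    Hypothetical helper for illustration; boto3 resolves profiles itself.
    """
    parser = configparser.ConfigParser()
    # read() silently skips files that do not exist, so a missing file yields []
    parser.read(os.path.expanduser(credentials_path))
    return parser.sections()


print(list_aws_profiles())
```

Note that `~/.aws/config` names non-default profiles as `[profile name]`, so the same function applied to that file returns section names with the `profile ` prefix.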

## Templates and Best Practices

### RAG (Retrieval-Augmented Generation) Pattern


In [23]:
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_chroma import Chroma
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

# 1. Load documents
loader = WebBaseLoader("https://docs.langchain.com/docs/")
docs = loader.load()

# 2. Split into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)

# 3. Create embeddings and vector store
vectorstore = Chroma.from_documents(documents=splits, embedding=embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

# 4. Create prompt template
template = """Answer the question based only on the following context:
{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)


# 5. Create RAG chain
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | chatllm
    | StrOutputParser()
)

# 6. Run the chain
response = rag_chain.invoke("What is LangChain?")
print(response)

USER_AGENT environment variable not set, consider setting it to identify your requests.


LangChain is a framework for developing applications powered by large language models (LLMs). It simplifies every stage of the LLM application lifecycle and is part of a rich ecosystem of tools that integrate with it. LangChain provides tutorials, conceptual guides, how-to guides, API reference, and integrations to help in the development process. LangGraph, a part of the LangChain ecosystem, is a tool for building stateful, multi-actor applications with LLMs.
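
The `chunk_size=1000, chunk_overlap=200` settings in step 2 mean that consecutive chunks share up to 200 characters, so a sentence cut at a chunk boundary still appears whole in one of the two chunks. The sketch below illustrates that idea with a plain sliding window. It is a simplified stand-in, not the algorithm used by `RecursiveCharacterTextSplitter`, which additionally tries to break on separators such as paragraphs and lines; `sliding_chunks` is a hypothetical name.

```python
from typing import List


def sliding_chunks(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> List[str]:
    """Split text into fixed-size character windows that overlap.

    Simplified illustration of chunk_size / chunk_overlap only.
    """
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


chunks = sliding_chunks("abcdefghij" * 250, chunk_size=1000, chunk_overlap=200)
print([len(c) for c in chunks])  # [1000, 1000, 900]
```

Larger overlaps reduce the chance of splitting a relevant passage but increase the number of chunks that must be embedded and stored.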


### Agent with Memory

In [None]:
from langchain.agents import Tool, AgentExecutor, create_openai_functions_agent
from langchain.memory import ConversationBufferMemory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

# 1. Define tools
tools = [
    Tool(
        name="Search",
        func=lambda x: f"Search results for: {x}",
        description="Useful for searching information"
    ),
    Tool(
        name="Calculator",
        func=lambda x: eval(x),  # WARNING: eval() runs arbitrary Python; only use it with trusted input
        description="Useful for performing calculations"
    )
]

# 2. Setup memory
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

# 3. Create prompt with memory
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant with access to tools."),
    MessagesPlaceholder(variable_name="chat_history"),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad")
])

# 4. Create the agent
agent = create_openai_functions_agent(chatllm, tools, prompt)

# 5. Create the executor
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    memory=memory,
    verbose=True
)

# 6. Run the agent
response = agent_executor.invoke({"input": "What's the capital of France?"})
follow_up = agent_executor.invoke({"input": "What's the population of its capital?"})
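
The Calculator tool above passes model-generated text to `eval`, which executes arbitrary Python. One possible alternative is a restricted evaluator that walks the parsed expression and accepts only numbers and basic arithmetic. This is a sketch, not a hardened sandbox; `safe_calculate` is a hypothetical name introduced for this example.

```python
import ast
import operator
from typing import Union

Number = Union[int, float]

# Only these arithmetic operators are accepted
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calculate(expression: str) -> Number:
    """Evaluate a basic arithmetic expression without calling eval()."""

    def _eval(node: ast.AST) -> Number:
        if isinstance(node, ast.Constant):
            # bool is a subclass of int, so exclude it explicitly
            if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
                return node.value
        elif isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        elif isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError("Unsupported expression")

    return _eval(ast.parse(expression, mode="eval").body)


print(safe_calculate("2 + 3 * 4"))  # 14
```

It could be wired into the tool with `func=lambda x: str(safe_calculate(x))`, so that function calls, attribute access, and names in the model's input are rejected with a `ValueError` instead of being executed.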

### Custom Tool Creation Workflow

In [27]:
from langchain.tools import StructuredTool, tool, Tool
from pydantic import BaseModel, Field
from langchain.agents import AgentType, initialize_agent
from typing import Optional, List

# 1. Simple function-based tool with @tool decorator
@tool
def search_database(query: str) -> str:
    """Search the database for information."""
    # Implement your database search logic here
    return f"Database results for: {query}"

# 2. Tool with Pydantic schema
class WeatherInput(BaseModel):
    location: str = Field(..., description="City and country or ZIP code")
    unit: str = Field("celsius", description="Temperature unit (celsius/fahrenheit)")
    
@tool(args_schema=WeatherInput)
def get_weather(location: str, unit: str = "celsius") -> str:
    """Get the current weather for a location."""
    # Implement weather API call here
    return f"Weather in {location} is sunny and 22°{unit[0].upper()}"

# 3. Create agent with tools
tools = [search_database, get_weather]

agent = initialize_agent(
    tools,
    chatllm,
    agent=AgentType.OPENAI_FUNCTIONS,
    verbose=True
)

### LangSmith Integration
LangSmith is a tool within the LangChain ecosystem designed to help you manage, monitor, and evaluate the performance of your large language model (LLM) applications. It provides features for debugging, tracking, and analyzing the behavior of models during real-world use.

In [None]:
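import os
from langsmith import Client
from langsmith.evaluation import evaluate
from langchain_core.tracers.context import collect_runs

# Tracing is switched on through environment variables (LANGCHAIN_API_KEY must also be set)
os.environ.setdefault("LANGCHAIN_TRACING_V2", "true")

# Create LangSmith client
client = Client()

# Normal LangChain operations will now be traced automatically.
# collect_runs captures the run so feedback is attached to a run that actually exists.
with collect_runs() as cb:
    result = llm.invoke("Tell me a joke")
    run_id = cb.traced_runs[0].id

# Log feedback after execution
client.create_feedback(
    run_id=run_id,
    key="helpfulness",
    score=0.9,
    comment="Very helpful response"
)

# Run evaluations
# A custom evaluator is a function of the run and the dataset example
def non_empty(run, example):
    return {"key": "non_empty", "score": int(bool(run.outputs))}

# Evaluate a target function against an existing dataset
evaluation_results = evaluate(
    lambda inputs: {"output": llm.invoke(inputs["question"])},  # assumes each example has a "question" input
    data="your_dataset",  # Ensure you have a dataset
    evaluators=[non_empty],
    experiment_prefix="your_experiment"
)

print(evaluation_results)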


### Custom Chain with LCEL 

In [28]:
from langchain_core.runnables import RunnablePassthrough, RunnableLambda
from typing import Any, Dict, List

# 1. Define component functions
def fetch_data(query: str) -> Dict[str, Any]:
    # Fetch relevant data for the query
    return {
        "query": query,
        "timestamp": "2024-04-23",
        "results": ["Result 1", "Result 2", "Result 3"]
    }

def format_results(results: List[str]) -> str:
    return "\n".join([f"- {item}" for item in results])

# 2. Create a custom data processing chain
data_chain = RunnableLambda(fetch_data)

# 3. Create a prompt template
template = """Based on the following information:
Query: {query}
Timestamp: {timestamp}
Results:
{formatted_results}

Provide a comprehensive answer to the query."""

prompt = ChatPromptTemplate.from_template(template)

# 4. Build the full chain
chain = (
    RunnablePassthrough() 
    | data_chain 
    | {
        "query": lambda x: x["query"],
        "timestamp": lambda x: x["timestamp"],
        "formatted_results": lambda x: format_results(x["results"])
    }
    | prompt
    | chatllm
    | StrOutputParser()
)

# 5. Run the chain
response = chain.invoke("What's the latest information?")

print(response)

The latest information, as of April 23, 2024, includes the following:
1. Result 1
2. Result 2
3. Result 3

Please note that the time and date of the information are crucial for understanding the context. If more specific details are needed, it would be beneficial to know the topic or field to which these results pertain, as the meaning of "latest" can vary depending on the context. For example, "latest" in the field of technology might mean something different from "latest" in the field of politics.
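
The `|` operator in the chain above composes steps so that the output of one becomes the input of the next. The toy class below sketches that composition idea in plain Python. It is an illustration only and not how LangChain implements runnables; `Step` and the two example steps are hypothetical names.

```python
from typing import Any, Callable


class Step:
    """Toy stand-in for a runnable: wraps a function and composes with `|`."""

    def __init__(self, func: Callable[[Any], Any]):
        self.func = func

    def __or__(self, other: "Step") -> "Step":
        # (a | b).invoke(x) is b.invoke(a.invoke(x))
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value: Any) -> Any:
        return self.func(value)


fetch = Step(lambda query: {"query": query, "results": ["Result 1", "Result 2"]})
render = Step(lambda data: "\n".join(f"- {item}" for item in data["results"]))

toy_chain = fetch | render
print(toy_chain.invoke("What's the latest information?"))
```

Because each composed object exposes the same `invoke` method as its parts, chains can be nested and reused as steps in larger chains, which is the property the LCEL examples rely on.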


### Error Handling and Retry Logic

In [29]:
from langchain_core.runnables.retry import RunnableRetry
from langchain_core.runnables import RunnableBranch

# Assuming 'chatllm' is defined earlier as an instance of an LLM

# 1. Simple retry with exponential backoff
llm_with_retry = chatllm.with_retry(
    retry_if_exception_type=(ValueError, ConnectionError),
    stop_after_attempt=5,  # Set retry limit
    wait_exponential_jitter=True,  # Exponential backoff with jitter between attempts
)

# 2. Advanced retry with RunnableRetry
# Note: the field names differ from the with_retry keyword arguments
llm_with_advanced_retry = RunnableRetry(
    bound=chatllm,
    retry_exception_types=(ValueError, ConnectionError),
    max_attempt_number=5,
    wait_exponential_jitter=True,
)

# 3. Try/except pattern for more control
def safe_parse_json(text):
    import json  # imported before the try block so the except clause can always resolve json
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return {"error": "Failed to parse JSON", "text": text}

# 4. Routing with RunnableBranch (use .with_fallbacks([...]) for true fallback behavior)
def is_complex_question(input):
    # Determine if a question requires advanced reasoning
    return len(input.split()) > 10

chain = RunnableBranch(
    (is_complex_question, chatllm),  # Model used for complex questions
    chatllm  # Default branch for simpler questions (same model here; swap in another one if available)
)

chain.invoke("What is the capital of France?")  # Simple question

AIMessage(content='The capital of France is Paris. It is one of the most significant cities in the world for art, fashion, gastronomy, and culture and is known for its iconic landmarks like the Eiffel Tower, Louvre Museum, and Notre-Dame Cathedral. Paris has long been a leading global city and a major center for academics, arts, commerce, and politics.', additional_kwargs={}, response_metadata={'token_usage': ChatCompletionOutputUsage(completion_tokens=82, prompt_tokens=10, total_tokens=92), 'model': '', 'finish_reason': 'stop'}, id='run-cc38cb3a-7852-4f9a-8086-bf7af6a2ae1d-0')
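
To make the retry behavior concrete, here is a plain-Python sketch of retrying with exponential backoff: the delay doubles after each failed attempt up to a cap, and the last error is re-raised once the attempts are used up. It illustrates the general technique, not LangChain's internal implementation; `call_with_backoff` and its parameters are hypothetical names.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_backoff(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (ValueError, ConnectionError),
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on the listed exceptions with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 1
    while True:
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Delays grow as base_delay * 1, 2, 4, ... and are capped at max_delay
            sleep(min(base_delay * 2 ** (attempt - 1), max_delay))
            attempt += 1
```

A call such as `call_with_backoff(lambda: chatllm.invoke("What is the capital of France?"))` would retry only on the listed exception types; any other exception propagates immediately. The `sleep` parameter is injectable so the function can be tested without waiting.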

### Streaming Response Handling

In [None]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain_core.runnables import RunnableConfig
from langchain_core.prompts import ChatPromptTemplate

# 1. Basic streaming with callbacks
# =================== Section to change according to your choice of APIs you have access to===============
streaming_llm  = ChatHuggingFace(llm=llm,streaming=True, callbacks=[StreamingStdOutCallbackHandler()])
#=========================================================================================================


streaming_llm.invoke("Write a short story about space exploration")

# 2. Custom streaming handler
class CustomStreamingHandler(StreamingStdOutCallbackHandler):
    def __init__(self):
        super().__init__()
        self.text = ""  # accumulates every streamed token

    def on_llm_new_token(self, token: str, **kwargs) -> None:
        self.text += token
        print(token, end="", flush=True)

handler = CustomStreamingHandler()

# =================== Section to change according to your choice of APIs you have access to===============
chatllm = ChatHuggingFace(llm=llm,streaming=True, callbacks=[handler])
#=========================================================================================================

chatllm.invoke("Write a poem about AI")  # call the chat model that has the handler attached

# 3. Async streaming with LCEL (Jupyter-friendly)
async def generate_stream():
    prompt = ChatPromptTemplate.from_template("Write a {length} story about {topic}")
    chain = prompt | chatllm | StrOutputParser()

    async for chunk in chain.astream({
        "length": "short",
        "topic": "robots learning to paint"
    }):
        print(chunk, end="", flush=True)

# If running in notebook, use this instead of asyncio.run
await generate_stream()  # ✅ Works in notebooks or any already-running event loop

# 4. Stream with config
config = RunnableConfig(
    callbacks=[StreamingStdOutCallbackHandler()],
    run_name="Streaming Example"
)

chain = ChatPromptTemplate.from_template("Explain {topic} in detail") | chatllm 
chain.invoke({"topic": "quantum computing"}, config=config)
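
The top-level `await` in step 3 works because a notebook already has a running event loop. In a plain script there is none, so the coroutine has to be started with `asyncio.run`. The sketch below shows that script-side pattern using a stand-in async generator in place of `chain.astream(...)`, so it runs without a model; `fake_token_stream` and `collect_stream` are hypothetical names.

```python
import asyncio
from typing import AsyncIterator, List


async def fake_token_stream(text: str) -> AsyncIterator[str]:
    """Stand-in for chain.astream(...): yields one word at a time."""
    for word in text.split():
        await asyncio.sleep(0)  # hand control back to the event loop
        yield word + " "


async def collect_stream(text: str) -> str:
    """Print chunks as they arrive and return the full text."""
    pieces: List[str] = []
    async for chunk in fake_token_stream(text):
        print(chunk, end="", flush=True)
        pieces.append(chunk)
    return "".join(pieces)


# In a plain script there is no running event loop, so asyncio.run starts one
full_text = asyncio.run(collect_stream("robots learning to paint"))
```

Collecting the chunks while printing them is useful when the complete response is needed afterwards, for example to store it in a chat history.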


### Building a Chatbot with LCEL and Streaming

In [15]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_openai import ChatOpenAI
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain.memory import ChatMessageHistory
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

# 1. Create chat template with history
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful AI assistant."),
    MessagesPlaceholder(variable_name="chat_history"),
    ("human", "{input}")
])

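# 2. Create LLM
# GroqChatLLM is not defined or imported in this section; it is assumed to be a chat model
# wrapper defined elsewhere in the notebook. Any chat model (for example chatllm) can be used instead.
model = GroqChatLLM(temperature=0.7, streaming=True, callbacks=[StreamingStdOutCallbackHandler()])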

# 3. Create the chain
chain = prompt | model

# 4. Add message history
message_history = {}  # store histories by session_id

def get_session_history(session_id: str) -> ChatMessageHistory:
    if session_id not in message_history:
        message_history[session_id] = ChatMessageHistory()
    return message_history[session_id]

chain_with_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="chat_history"
)

# 5. Run the chatbot
response1 = chain_with_history.invoke(
    {"input": "Hello, my name is Alice."},
    config={"configurable": {"session_id": "alice_session"}}
)

response2 = chain_with_history.invoke(
    {"input": "What's my name?"},
    config={"configurable": {"session_id": "alice_session"}}
)

response1, response2

(AIMessage(content="Nice to meet you, Alice! I'm happy to be your helpful AI assistant. Is there something I can assist you with, or would you like to chat for a bit?", additional_kwargs={}, response_metadata={}, id='run-0376cea7-5d86-451c-8bf8-6b25bae51ca1-0'),
 AIMessage(content='Your name is Alice!', additional_kwargs={}, response_metadata={}, id='run-cc629b10-0d3d-4b3e-b810-bffa5e4590c0-0'))
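
The second reply can recall the name because both calls use the same `session_id`, which selects the same stored history. The sketch below shows that keying idea with plain Python lists in place of `ChatMessageHistory`; `histories` and `get_history` are hypothetical names used only for this illustration.

```python
from typing import Dict, List, Tuple

Message = Tuple[str, str]  # (role, content)
histories: Dict[str, List[Message]] = {}


def get_history(session_id: str) -> List[Message]:
    """Return the message list for a session, creating it on first use."""
    return histories.setdefault(session_id, [])


get_history("alice_session").append(("human", "Hello, my name is Alice."))
get_history("bob_session").append(("human", "Hello, my name is Bob."))

# Each session only sees its own messages
print(get_history("alice_session"))
```

A different `session_id` starts from an empty history, so a question like "What's my name?" asked in a new session would not have Alice's introduction available.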