# 1. Fundamentals of LangChain

## Overview of LangChain

LangChain is a framework designed to simplify the development of applications powered by Large Language Models (LLMs). Think of it as a toolkit that helps you build sophisticated AI applications by connecting components such as models, prompts, memory, and tools.

### Why Use LangChain?
- **Simplifies LLM Integration**: Makes it easier to work with language models
- **Promotes Reusability**: Provides ready-to-use components
- **Enhances Flexibility**: Supports multiple LLM providers (OpenAI, Anthropic, etc.)
- **Standardizes Development**: Offers consistent patterns for building applications

![LangChain Image](https://daxg39y63pxwu.cloudfront.net/images/blog/langchain/LangChain.webp)


## **Main Components of LangChain**

### 1. Language Model (LLM)
- The core component that powers text generation.
- Supports various LLM providers like OpenAI, Anthropic, and Google.

### 2. Prompt Templates
- Helps structure and format prompts dynamically.
- Useful for standardizing inputs to the LLM.
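
To make the idea concrete, here is a minimal plain-Python sketch of what a prompt template does. It deliberately does not use LangChain's own classes; `TEMPLATE` and `render_prompt` are hypothetical names chosen only for illustration.

```python
# Plain-Python illustration of prompt templating (not the LangChain API).
# TEMPLATE and render_prompt are hypothetical names used only in this sketch.
TEMPLATE = "You are a {role}. Answer the question: {question}"


def render_prompt(template: str, **variables: str) -> str:
    """Fill the named placeholders in a template with concrete values."""
    return template.format(**variables)


prompt = render_prompt(
    TEMPLATE,
    role="helpful tutor",
    question="What is a vector store?",
)
print(prompt)
```

The same template can be reused with different values, which is what keeps the inputs to the model consistent.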

### 3. Chains
- Sequences multiple steps together, such as retrieving data and generating responses.
- Example: Input → Retrieval → LLM → Output.
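
The Input → Retrieval → LLM → Output flow can be sketched with ordinary functions. This is only an illustration of the chaining idea, assuming a stubbed model: `retrieve`, `fake_llm`, and `run_chain` are hypothetical helpers, not LangChain functions, and no real model is called.

```python
# Plain-Python sketch of chaining steps (not the LangChain API).
from typing import Any, Callable, List


def retrieve(question: str) -> dict:
    # Stand-in for a retrieval step: look up context for the question.
    knowledge = {"langchain": "LangChain is a framework for building LLM applications."}
    context = next(
        (text for key, text in knowledge.items() if key in question.lower()), ""
    )
    return {"question": question, "context": context}


def fake_llm(inputs: dict) -> str:
    # Stand-in for a model call: a real chain would send a prompt to an LLM here.
    return f"Answer based on context: {inputs['context']}"


def run_chain(steps: List[Callable[[Any], Any]], value: Any) -> Any:
    # Feed the output of each step into the next one.
    for step in steps:
        value = step(value)
    return value


answer = run_chain([retrieve, fake_llm], "What is LangChain?")
print(answer)
```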

### 4. Memory
- Enables storing and recalling conversation history.
- Useful for chatbots and contextual interactions.
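
As a rough picture of what conversation memory does, the sketch below stores each turn and replays the history as text that could be prepended to the next prompt. `ConversationBuffer` is a hypothetical class written for this example; it is not LangChain's memory implementation.

```python
# Plain-Python sketch of a conversation buffer (not the LangChain API).
from typing import List, Tuple


class ConversationBuffer:
    """Keeps every turn of a conversation in order."""

    def __init__(self) -> None:
        self.turns: List[Tuple[str, str]] = []

    def add(self, role: str, content: str) -> None:
        self.turns.append((role, content))

    def as_text(self) -> str:
        # Render the history so it can be included in the next prompt.
        return "\n".join(f"{role}: {content}" for role, content in self.turns)


memory = ConversationBuffer()
memory.add("user", "My name is Ada.")
memory.add("assistant", "Nice to meet you, Ada!")
memory.add("user", "What is my name?")
print(memory.as_text())
```

Because the earlier turns are part of the prompt, the model has what it needs to answer the last question.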

### 5. Agents
- AI components that make decisions dynamically.
- Can decide which tools to call based on the input.

### 6. Tools
- External functionalities an agent can use, such as API calls, web searches, or calculations.
- Extends the model’s capabilities beyond text generation.
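
The relationship between agents and tools can be illustrated without any model at all. In the sketch below a simple rule stands in for the LLM's decision about which tool to call; in a real agent the model makes that choice. All names here (`add_tool`, `echo_tool`, `choose_tool`, `run_agent`) are hypothetical.

```python
# Plain-Python sketch of tool selection (not the LangChain API).
import re
from typing import Callable, Dict


def add_tool(text: str) -> str:
    # Adds two integers written as "a + b".
    left, right = text.split("+")
    return str(int(left) + int(right))


def echo_tool(text: str) -> str:
    return f"You provided: {text}"


TOOLS: Dict[str, Callable[[str], str]] = {"add": add_tool, "echo": echo_tool}


def choose_tool(text: str) -> str:
    # Hypothetical rule standing in for the LLM's decision.
    if re.fullmatch(r"\s*\d+\s*\+\s*\d+\s*", text):
        return "add"
    return "echo"


def run_agent(text: str) -> str:
    return TOOLS[choose_tool(text)](text)


print(run_agent("2 + 3"))
print(run_agent("hello"))
```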

### 7. Retrieval & Vector Stores
- Helps fetch relevant data using embeddings.
- Supports databases like FAISS, Pinecone, and Chroma.
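
Retrieval with embeddings boils down to comparing vectors. The sketch below uses tiny hand-written vectors in place of real embeddings and ranks texts by cosine similarity; the vectors, the `store` dictionary, and `similarity_search` are made up for illustration and do not come from any embedding model or vector database.

```python
# Plain-Python sketch of similarity search (not the LangChain API).
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy "embeddings": hand-picked numbers, not output from a real model.
store: Dict[str, List[float]] = {
    "cats are small pets": [0.9, 0.1, 0.0],
    "dogs are loyal pets": [0.8, 0.2, 0.1],
    "stocks fell on Monday": [0.0, 0.1, 0.9],
}


def similarity_search(query_vector: List[float], k: int = 2) -> List[str]:
    # Rank stored texts from most to least similar and keep the top k.
    ranked = sorted(
        store,
        key=lambda text: cosine_similarity(query_vector, store[text]),
        reverse=True,
    )
    return ranked[:k]


results = similarity_search([1.0, 0.0, 0.0], k=2)
print(results)
```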

### 8. Document Loaders
- Reads and processes files (PDFs, CSVs, text, etc.).
- Useful for knowledge retrieval applications.

### 9. Output Parsers
- Structures the output into a machine-readable format.
- Converts responses into JSON, tables, or structured data.
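
An output parser's job can be sketched as "find the structured part of the reply and load it". The helper below, a hypothetical `parse_structured` written for this example, assumes the model was asked to reply with a single JSON object and tolerates extra text around it.

```python
# Plain-Python sketch of output parsing (not the LangChain API).
import json


def parse_structured(text: str) -> dict:
    """Extract the first-to-last brace span from a reply and parse it as JSON."""
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("No JSON object found in model output")
    return json.loads(text[start : end + 1])


# Example reply invented for this sketch.
raw = 'Sure! Here is the result: {"answer": "Paris", "confidence": 0.9}'
parsed = parse_structured(raw)
print(parsed["answer"], parsed["confidence"])
```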



## Key Features of LangChain

### 1. Component-Based Architecture
- **Modular Design**: Components can be mixed, matched, and customized
- **Composability**: Build complex applications by combining simple components
- **Extensibility**: Easy to create custom components to meet specific needs

### 2. LLM Abstraction and Integration
- **Model Agnostic**: Works with multiple LLM providers (OpenAI, Anthropic, local models, etc.)
- **Simple Switching**: Easily switch between different models without changing application logic
- **Parameter Standardization**: Consistent interface across different model providers
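
The benefit of a shared interface is that application code does not care which model sits behind it. The sketch below shows the idea with two stand-in classes; `ChatModel`, `EchoModel`, `ShoutModel`, and `summarize` are hypothetical and call no real provider.

```python
# Plain-Python sketch of a provider-agnostic interface (not the LangChain API).
from typing import Protocol


class ChatModel(Protocol):
    def invoke(self, prompt: str) -> str: ...


class EchoModel:
    # Stand-in for one provider.
    def invoke(self, prompt: str) -> str:
        return f"echo: {prompt}"


class ShoutModel:
    # Stand-in for a different provider.
    def invoke(self, prompt: str) -> str:
        return prompt.upper()


def summarize(model: ChatModel, text: str) -> str:
    # Application logic depends only on the shared invoke() method.
    return model.invoke(f"Summarize: {text}")


print(summarize(EchoModel(), "LangChain basics"))
print(summarize(ShoutModel(), "LangChain basics"))
```

Swapping the model is a one-argument change; `summarize` itself stays the same.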

### 3. Advanced Chains
- **Sequential Processing**: Chain multiple steps together in logical sequences
- **Conditional Logic**: Implement branching and decision-making in workflows
- **Specialized Chain Types**: SequentialChain, RouterChain, TransformChain, and more

### 4. Memory Systems
- **Conversation Buffers**: Store and retrieve conversation history
- **Summary Memory**: Maintain summaries of longer conversations
- **Vector-Based Memory**: Store information based on semantic relevance
- **Multiple Memory Types**: ConversationBufferMemory, ConversationSummaryMemory, VectorStoreRetrieverMemory

### 5. Agent Frameworks
- **Autonomous Decision-Making**: Agents that can plan and execute multi-step tasks
- **Tool Integration**: Enables LLMs to use external tools and functions
- **ReAct Framework**: Reasoning and acting based on environment feedback
- **Agent Types**: Zero-shot, Plan-and-execute, OpenAI functions agents, etc.

### 6. Document Processing
- **Multiple Loaders**: Import data from diverse sources (PDFs, websites, databases)
- **Text Splitters**: Divide documents into chunks for processing
- **Document Transformers**: Process and enhance document content
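
To show what a text splitter does, here is a simplified fixed-size splitter with overlap. It is a sketch only: LangChain's splitters also try to break on separators such as paragraphs and sentences, which this hypothetical `split_text` function does not attempt.

```python
# Plain-Python sketch of chunking with overlap (not the LangChain API).
from typing import List


def split_text(text: str, chunk_size: int, chunk_overlap: int = 0) -> List[str]:
    """Cut text into chunks of chunk_size characters that share chunk_overlap characters."""
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("Need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    if not text:
        return []
    step = chunk_size - chunk_overlap
    # Stop early so the final chunk is not just a repeat of the overlap.
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[i : i + chunk_size] for i in range(0, last_start, step)]


chunks = split_text("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)
```

Each chunk starts with the last character of the previous one, so context that straddles a boundary is not lost entirely.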

### 7. Vectorstores and Retrieval
- **Vector Databases**: Store and query data based on semantic similarity
- **Embedding Integration**: Works with multiple embedding models
- **Retrieval Types**: Similarity search, MMR, filtering
- **Integration**: Connects with many vector databases (Pinecone, Chroma, FAISS, etc.)

### 8. Prompt Management
- **Templating**: Create and reuse prompt templates
- **Dynamic Generation**: Dynamically construct prompts based on context
- **Optimization Tools**: Improve prompts for better results

### 9. Evaluation and Debugging
- **Tracing Framework**: Track execution of chains and agents
- **Callbacks System**: Hook into the execution process
- **Metrics Collection**: Evaluate performance of components

### 10. Streaming Support
- **Token Streaming**: Process outputs as they're generated
- **WebSocket Integration**: Real-time communication in web applications
- **Incremental Processing**: Handle partial results effectively
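
Streaming is easiest to picture as a generator that yields pieces of the answer as they become available. In the sketch below `fake_stream` is a hypothetical stand-in that splits a fixed sentence into words; a real model would yield tokens as it generates them.

```python
# Plain-Python sketch of token streaming (not the LangChain API).
from typing import Iterator, List


def fake_stream(text: str) -> Iterator[str]:
    # Stand-in for a streaming model: yield one word at a time.
    for token in text.split(" "):
        yield token + " "


pieces: List[str] = []
for token in fake_stream("Streaming lets you show partial output"):
    pieces.append(token)  # A UI would render each piece immediately.

full = "".join(pieces).strip()
print(full)
```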

These features make LangChain particularly powerful for building sophisticated AI applications that go beyond simple prompt-response interactions, enabling complex workflows that combine language models with external data and tools.

## **Simple LLM call using LangChain**

In [3]:
from langchain_openai import ChatOpenAI

# Initialize the LLM. The API key is read from the OPENAI_API_KEY environment
# variable, so it never has to appear in the notebook.
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.7)

# Example of a simple LLM call
response = llm.invoke("What is LangChain?")

print("LLM Response:", response.content)


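LLM Response: LangChain is a decentralized platform that aims to provide language learning opportunities for users through a combination of artificial intelligence, blockchain technology, and community-based learning. Users can access language courses, practice speaking with native speakers, and earn rewards for their participation in the platform. The use of blockchain technology ensures transparency and security of user data and transactions.

Note that this answer is wrong: LangChain is a framework for building LLM applications, not a blockchain-based language-learning platform. The model has produced a plausible-sounding but invented description, which is a good reminder to verify what an LLM tells you.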


## Example code for LangChain components

**Don't worry if you don't understand this right now. We'll explore all of this in detail in later notebooks!**

In [2]:
import os
from langchain_openai import ChatOpenAI, OpenAIEmbeddings  # Updated OpenAI imports
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.tools import Tool
from langchain.chains import LLMChain
from langchain.memory import ConversationBufferMemory
from langchain.agents import create_openai_tools_agent, AgentExecutor
from langchain_community.vectorstores import FAISS
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.output_parsers.structured import StructuredOutputParser, ResponseSchema

# Read the OpenAI API key from the environment; never hardcode it in a notebook
if "OPENAI_API_KEY" not in os.environ:
    raise RuntimeError("Set the OPENAI_API_KEY environment variable before running this cell.")

# 1. Language Model (LLM)
llm = ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo")

# 2. Prompt Templates
# Agent prompt template (requires agent_scratchpad)
agent_prompt_template = ChatPromptTemplate.from_messages([
    MessagesPlaceholder(variable_name="chat_history"),  # For memory
    ("user", "{input}"),  # User input
    MessagesPlaceholder(variable_name="agent_scratchpad")  # Required for OpenAI tools agent
])

# QA prompt template (doesn't need agent_scratchpad)
qa_prompt_template = ChatPromptTemplate.from_messages([
    MessagesPlaceholder(variable_name="chat_history"),  # For memory
    ("system", "You are a helpful assistant that answers questions based on the provided context."),
    ("user", "Question: {input}\nContext: {context}")  # User input with context
])

# 3. Memory
# For the agent - uses default input_key="input"
agent_memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

# For the QA chain - specify the input_key explicitly
qa_memory = ConversationBufferMemory(memory_key="chat_history", input_key="input", return_messages=True)

# 4. Agents and Tools
def custom_tool_function(input_text):
    return f"You provided: {input_text}"

tools = [
    Tool(
        name="CustomTool",
        func=custom_tool_function,
        description="A tool that echoes back the input text."
    )
]

# Create the OpenAI tools agent
agent_executor = create_openai_tools_agent(llm, tools, agent_prompt_template)
agent = AgentExecutor(agent=agent_executor, tools=tools, memory=agent_memory)

# 5. Document Loaders
loader = TextLoader("sample.txt")  # Create a file named "sample.txt" with some text content.
documents = loader.load()

# Split documents into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_documents(documents)

# 6. Retrieval & Vector Stores
embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_documents(texts, embeddings)

# 7. Create a QA chain with the correct prompt template
qa_chain = LLMChain(
    llm=llm,
    prompt=qa_prompt_template,  # Using the QA-specific prompt template
    memory=qa_memory
)

# 8. Output Parsers
response_schemas = [
    ResponseSchema(name="answer", type="string", description="The answer to the question"),
    ResponseSchema(name="confidence", type="float", description="Confidence score between 0 and 1")
]
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)
format_instructions = output_parser.get_format_instructions()

# Combine everything
def process_query(question):
    # Use the agent for dynamic decision-making
    agent_response = agent.invoke({"input": question})
    
    # Retrieve relevant context from the vector store
    relevant_context = vectorstore.similarity_search(question, k=3)
    context = "\n".join([doc.page_content for doc in relevant_context])
    
    # Use the QA chain for knowledge-based responses with instructions
    formatted_question = f"{question}\n\n{format_instructions}"
    qa_response = qa_chain.invoke({
        "input": formatted_question, 
        "context": context
    })
    
    try:
        # Parse the output into structured data
        parsed_output = output_parser.parse(qa_response["text"])
    except Exception as e:
        # Handle parsing errors gracefully
        parsed_output = {"answer": qa_response["text"], "confidence": 0.5}
        print(f"Warning: Could not parse response - {str(e)}")
    
    return {
        "agent_response": agent_response["output"],
        "qa_response": parsed_output
    }

# Example usage
if __name__ == "__main__":
    question = "What is AI"
    result = process_query(question)
    print("Agent Response:", result["agent_response"])
    print("QA Response:", result["qa_response"])



Agent Response: AI stands for Artificial Intelligence, which refers to the simulation of human intelligence processes by machines, especially computer systems. These processes include learning, reasoning, problem-solving, perception, and language understanding.
QA Response: {'answer': 'Artificial Intelligence (AI) has transformed various industries, from healthcare to finance. Companies are leveraging AI to automate processes, improve decision-making, and enhance customer experiences. One of the most prominent applications of AI is in Natural Language Processing (NLP). Technologies like chatbots, virtual assistants, and AI-powered content generation have made communication more efficient. Moreover, AI in data analytics helps businesses extract valuable insights from large datasets, enabling smarter strategies and more informed decisions. As AI continues to evolve, ethical considerations, data privacy, and bias mitigation remain critical areas of focus.', 'confidence': 0.95}


This code creates an AI-powered question-answering system using LangChain and OpenAI. Here's what it does step by step:

1. **Setup**: It imports necessary libraries and sets up an OpenAI API key to access AI models.

2. **Two AI Components**:
   - An "Agent" that can use tools and make decisions
   - A "QA Chain" that answers questions using relevant information

3. **Memory System**: Both components remember previous conversations so they can maintain context.

4. **Document Processing**:
   - The system loads text from a file called "sample.txt"
   - It breaks this text into smaller chunks
   - It creates a searchable database of these chunks using embeddings (vector representations of text)

5. **Question Processing Flow**:
   - When you ask a question, the Agent tries to answer it using its tools
   - The system also searches for relevant information in the document database
   - The QA Chain uses this relevant information to provide a more informed answer
   - The system tries to structure the answer with a confidence score

6. **Output Format**: For each question, you get:
   - An "agent response" (direct from the AI)
   - A "QA response" (based on the document knowledge with structured format)

The main advantage of this system is that it combines general AI capabilities with specific knowledge from your documents. 



##  **Setup**

Let's get started with installing LangChain and its dependencies. We'll go through this step by step.

1. Install Python (if not installed)
Download and install Python (version 3.8 or later) from:
🔗 https://www.python.org/downloads/

2. Create a Virtual Environment (Optional but Recommended)

###### On Windows (Command Prompt or PowerShell)
- python -m venv langchain_env
- langchain_env\Scripts\activate

###### On macOS/Linux (Terminal)
- python -m venv langchain_env
- source langchain_env/bin/activate


3. Install Required Packages

- pip install langchain langchain-openai langchain-community faiss-cpu tiktoken

4. Set Up OpenAI API Key
You need an API key from OpenAI:

- Sign up at 🔗 https://platform.openai.com/signup
- Get your API key from 🔗 https://platform.openai.com/api-keys

##### On Windows (Command Prompt)
set OPENAI_API_KEY=your-api-key

##### On PowerShell
$env:OPENAI_API_KEY="your-api-key"

##### On macOS/Linux (Terminal)
export OPENAI_API_KEY="your-api-key"
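
Before calling the model it can help to confirm that Python actually sees the key. The helper below is a small sketch using only the standard library; `get_api_key` is a hypothetical name, and the check only confirms the variable is set, not that the key is valid.

```python
# Sketch: fail early with a clear message if the API key is missing.
import os


def get_api_key(name: str = "OPENAI_API_KEY") -> str:
    """Return the value of an environment variable or raise a helpful error."""
    key = os.environ.get(name)
    if not key:
        raise RuntimeError(
            f"{name} is not set. Set it in your shell and restart the notebook."
        )
    return key
```

Calling `get_api_key()` at the top of a notebook surfaces a missing key immediately. If you set the variable after starting Jupyter, restart the kernel from a shell where it is set so the new value is picked up.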




5. Run a Simple Test Script

In [15]:
from langchain_openai import ChatOpenAI

# Initialize LLM
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.7)

# Test call
response = llm.invoke("Tell me a fun fact about space.")

print("LLM Response:", response.content)


LLM Response: There is a planet called HD 189733b where it rains glass sideways in 7,000 km/h winds.


## What's Next?

Now that we have covered the fundamentals and set up our environment, in the next notebooks we'll explore:
1. Working with different types of LLMs
2. Creating and using Chains
3. Understanding Prompts and Templates
4. Implementing Memory in our applications
5. Working with Agents and Tools

Each topic will include practical examples and real-world use cases!