
Notes to follow up: compare LlamaIndex vs. LangChain (check stars on GitHub), and CrewAI vs. LangGraph.



#### Drawbacks of Using LLMs Directly

The instructor highlighted the limitations of Large Language Models (LLMs) when building generative AI applications:

1. **Limited Context Window:** LLMs have input limits (token size).
2. **Outdated Knowledge Base:** LLMs are trained on a fixed dataset and do not know about anything after their training cutoff. For example, at the time of the session, ChatGPT's knowledge was generally considered to extend only to October 2023.
3. **Data Privacy Concerns:** Using LLMs directly can pose data privacy risks.
4. **Cost:** Calling LLMs directly can be expensive, particularly with large inputs and outputs.
5. **Lack of Third-Party Tool Connections:** LLMs generally lack direct integration with external tools and data sources.

#### Why LangChain?

LangChain addresses the limitations of using LLMs directly by providing a universal framework for building generative AI applications. It acts as an orchestration framework, integrating various tools and technologies commonly used in generative AI, including:

- Large Language Models (LLMs) from various providers (OpenAI, Llama, Mistral, etc.)
- Vector databases
- LangSmith (for monitoring)
- LangServe (for deployment)
- LangGraph (for agents and multi-agent systems)

The instructor contrasted LangChain with LlamaIndex, stating that both are good but have different capabilities. LangChain was chosen because of the series' focus and its comprehensive ecosystem. The instructor also mentioned CrewAI as an alternative that builds on LangChain, offering a higher-level abstraction with easier-to-understand syntax but less customization. LangChain, in comparison, provides greater customization.



#### LangChain Evaluation and Evolution

The evolution of LangChain was discussed, covering these key points:

- **From Completion Models to Chat Models:** LangChain initially primarily supported completion models (like older GPT models) but has evolved to integrate chat models, which allow for conversational interactions and system prompts. The instructor demonstrated the difference between the two using the OpenAI playground.
  
- **From Legacy LangChain to LangChain Expression Language (LCEL):** LangChain's syntax has evolved from a function-based approach to a pipeline-based approach built on the `|` (pipe) operator. This shift simplifies chain creation.
  
- **LangChain Ecosystem:** The development of a robust ecosystem, including LangSmith, LangServe, and LangGraph, enhanced LangChain’s capabilities.
  
- **Documentation:** The instructor noted that the LangChain documentation has historically been considered difficult to navigate.
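
To make the pipeline idea concrete, here is a minimal plain-Python sketch of how a pipe operator can compose steps. It does not use LangChain: the `Step` class and the three toy stages are hypothetical names introduced only for illustration, and LangChain's real runnables do considerably more (streaming, batching, async).

```python
from typing import Any, Callable


class Step:
    """A toy stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # `a | b` builds a new step that feeds a's output into b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Hypothetical stages standing in for a prompt, a model, and a parser.
prompt = Step(lambda d: f"Tell me a fact about {d['topic']}.")
fake_model = Step(lambda text: f"MODEL OUTPUT FOR: {text}")
parser = Step(lambda text: text.lower())

chain = prompt | fake_model | parser
result = chain.invoke({"topic": "Python"})
print(result)
```

Each `|` returns a new `Step`, so a chain of any length is itself just one object with an `invoke` method.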



#### Learning Roadmap

The proposed learning path for LangChain was outlined as follows:

1. **Introduction and Overview:** (Covered in this session)
2. **Interacting with LLMs using LangChain:** Using different LLMs (OpenAI, open-source).
3. **Working with Data and RAG:** Loading custom data, connecting LLMs with data sources.
4. **Chains and Runnables:** Building chains of execution using LangChain’s expression language.
5. **Memory:** Implementing memory to maintain conversational context.
6. **Ecosystem Exploration:** Using LangSmith, LangServe, and LangGraph.
7. **Application Development:** Creating applications at three levels of complexity:
   - **Level 1:** Basic Python-based applications without a UI.
   - **Level 2:** Applications with a proof-of-concept (POC) UI using frameworks like Streamlit, Flask, or FastAPI.
   - **Level 3:** Professional, full-stack applications.
  

# LangChain: Interacting with Large Language Models (LLMs) - Session Summary

## Introduction

This session focuses on interacting with Large Language Models (LLMs) using LangChain, emphasizing practical application and local environment setup. The agenda includes connecting with OpenAI and alternative LLMs, interacting with LLMs in different languages, and an introduction to prompt templates and chains.

## Rationale for Using OpenAI

The instructor prefers OpenAI for production environments, citing its robustness, accuracy, and cost-effectiveness when combined with efficient prompt engineering. Open-source models like Meta Llama and Mistral are considered suitable for proofs of concept (POCs), but they may not be as reliable for production use.

## Environment Setup

The session advocates for local development to avoid dependencies on cloud environments like Google Colab. The setup involves:

1. **Creating a Conda Environment:**
   ```bash
   conda create -n llm_app python=3.11 -y
   conda activate llm_app
   ```

   Notes:
   - To review: the difference between a conda environment and a virtualenv.
   - Avoid installing the very latest release of a package; prefer the previous stable version.

2. **Installing Required Packages:**
   ```bash
   pip install -r requirements.txt
   ```
   - `requirements.txt` includes:
     ```
     langchain==0.2.10
     langchain-openai==0.1.17
     python-dotenv
     langchain-groq==0.1.6
     ```

## Connecting to OpenAI Models

### Loading API Keys
```python
from dotenv import load_dotenv
import os

def load_env():
    load_dotenv()
    return os.getenv("OPENAI_API_KEY")

openai_api_key = load_env()
```
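
A missing or misspelled key often only surfaces on the first API call. The sketch below, using only the standard library, fails early instead. `require_env` is a hypothetical helper name, not part of `python-dotenv` or LangChain, and `DEMO_API_KEY` is a placeholder variable so the example runs without a real key.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value


# Demo with a placeholder variable so the sketch runs without a real key.
os.environ["DEMO_API_KEY"] = "sk-demo"
demo_key = require_env("DEMO_API_KEY")
print(demo_key[:3] + "...")  # avoid printing full secrets
```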

Completion models only continue the text they are given; they do not hold a conversation. Chat models are built for conversation and take a list of messages with roles.

### Using OpenAI Completion Model
```python
from langchain_openai import OpenAI

llm_model = OpenAI()
response = llm_model.invoke("Tell me one fact about the Kennedy family.")
print(response)

# Streaming: print the response chunk by chunk as it is generated
for chunk in llm_model.stream(
    "Tell me one fun fact about the Kennedy family in detail"
):
    print(chunk, end="", flush=True)
```

### Using OpenAI Chat Model
```python
from langchain_openai import ChatOpenAI

chat_model = ChatOpenAI(model_name="gpt-3.5-turbo")
messages = [
    {"role": "system", "content": "You are a historian expert in the Kennedy family."},
    {"role": "user", "content": "Tell me one curious thing about JFK."},
]
# Equivalent shorthand using (role, content) tuples:
messages = [
    ("system", "You are a historian expert in the Kennedy family."),
    ("user", "Tell me one curious thing about JFK."),
]
response = chat_model.invoke(messages)
print(response.content)
response.schema()
```



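## Alternative LLMs via Groq

### Loading and Using Groq LLMs
```python
import os

from dotenv import load_dotenv
from langchain_groq import ChatGroq

load_dotenv()  # loads GROQ_API_KEY from .env; ChatGroq reads it from the environment
if not os.getenv("GROQ_API_KEY"):
    raise RuntimeError("GROQ_API_KEY is not set")

# Model IDs change over time; check Groq's current model list before running
llama_model = ChatGroq(model="llama3-8b-8192")
mistral_model = ChatGroq(model="mixtral-8x7b-32768")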

messages = [
    {"role": "system", "content": "You are a historian expert in the Kennedy family."},
    {"role": "user", "content": "How many members of the family died tragically?"},
]
llama_response = llama_model.invoke(messages)
print(llama_response.content)
```

## Tips and Best Practices

- **Stable Versions:** Use specific versions of packages to avoid breaking changes.
- **Local Setup:** Focus on local development for real-world projects.
- **Environment Variables:** Manage sensitive information like API keys using `.env` files.

## Conclusion and Resources

The session concludes with a summary, emphasizing the importance of using chat models over completion models in modern applications. 


## LangChain & LLM Ops Series: Day 3 - Prompt Templates, Chains, and Output Parsers

This document summarizes Day 3 of a LangChain and LLM Ops series, focusing on prompt templates, chains, and output parsers.


### II. Today's Agenda (Day 3)

1. **Prompt Templates**
2. **Chains**
3. **Output Parsers**

### III. Prompt Templates

**A. What is a Prompt?**

A prompt is a way to communicate with LLMs, essentially a programming language for them.

**B. What is a Prompt Template?**

A prompt template allows dynamic input during runtime, enabling the insertion of variables instead of hardcoding prompts.

**Example:**

Instead of writing multiple hardcoded prompts, a single template with placeholders can be reused. `PromptTemplate` is intended for completion models, and `ChatPromptTemplate` for chat models.


```python
from langchain_core.prompts import PromptTemplate
from langchain_openai import OpenAI

llm = OpenAI(temperature=0)
prompt_template = PromptTemplate(
    input_variables=["adjective", "topic"],
    template="Tell me a {adjective} story about {topic}.",
)

llm_prompt = prompt_template.format(adjective="curious", topic="the Kennedy family")
response = llm.invoke(llm_prompt)  # calling llm(...) directly is deprecated
print(response)
```

```python
from langchain_core.prompts import ChatPromptTemplate

chat_template = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a {profession} expert on {topic}."),
        ("user", "Hello, Mr. {profession}, can you please answer a question?"),
        ("assistant", "Sure"),
        ("user", "{user_input}"),
    ]
)

message = chat_template.format_messages(profession="software engineer", topic="python", user_input="What is the output of print(1+1)?")

```

### Prompt Template Classes

The prompt template classes differ as follows. Each entry describes the class and shows when to use it.

### 1. **BasePromptTemplate**
   - **Description**: This is the base class for all prompt templates. It defines the common interface and methods that all prompt templates should implement.
   - **Usage**: You typically don't use this directly but rather as a base class for other prompt templates.

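### 2. **StringPromptTemplate**
   - **Description**: This is the abstract base class for templates that produce a single string. It cannot be instantiated directly; subclass it and implement `format` when you need custom formatting logic.
   - **Usage**: 
     ```python
     from langchain_core.prompts import StringPromptTemplate

     class TranslationPrompt(StringPromptTemplate):
         """Custom string template; the subclass supplies format()."""

         def format(self, **kwargs) -> str:
             return f"Translate the following text to {kwargs['language']}: {kwargs['text']}"

     template = TranslationPrompt(input_variables=["language", "text"])
     prompt = template.format(language="French", text="Hello, how are you?")
     print(prompt)  # Output: "Translate the following text to French: Hello, how are you?"
     ```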

### 3. **PromptTemplate**
   - **Description**: This is a more general prompt template that can handle multiple input variables and format them into a single string.
   - **Usage**:
     ```python
     from langchain_core.prompts import PromptTemplate

     template = PromptTemplate(template="Translate the following text to {language}: {text}", input_variables=["language", "text"])
     prompt = template.format(language="Spanish", text="Hello, how are you?")
     print(prompt)  # Output: "Translate the following text to Spanish: Hello, how are you?"
     ```

### 4. **ChatPromptTemplate**
   - **Description**: This is used for generating prompts for chat-based models. It can handle multiple messages (e.g., system, user, assistant messages) and format them into a structured chat prompt.
   - **Usage**:
     ```python
     from langchain_core.prompts import ChatPromptTemplate, SystemMessagePromptTemplate, HumanMessagePromptTemplate

     system_template = SystemMessagePromptTemplate.from_template("You are a helpful assistant.")
     human_template = HumanMessagePromptTemplate.from_template("{text}")

     chat_prompt = ChatPromptTemplate.from_messages([system_template, human_template])
     prompt = chat_prompt.format_prompt(text="Translate the following text to French: Hello, how are you?").to_messages()
     print(prompt)
     # Output: [SystemMessage(content='You are a helpful assistant.'), HumanMessage(content='Translate the following text to French: Hello, how are you?')]
     ```

### 5. **SystemMessagePromptTemplate**
   - **Description**: This is used to create a system message in a chat prompt. System messages typically set the context or role of the assistant.
   - **Usage**:
     ```python
     from langchain_core.prompts import SystemMessagePromptTemplate

     system_template = SystemMessagePromptTemplate.from_template("You are a helpful assistant.")
     print(system_template.format())  # A SystemMessage whose content is "You are a helpful assistant."
     ```

### 6. **HumanMessagePromptTemplate**
   - **Description**: This is used to create a human message in a chat prompt. Human messages are typically the input from the user.
   - **Usage**:
     ```python
     from langchain_core.prompts import HumanMessagePromptTemplate

     human_template = HumanMessagePromptTemplate.from_template("{text}")
     print(human_template.format(text="Hello, how are you?"))  # A HumanMessage whose content is "Hello, how are you?"
     ```

### 7. **AIMessagePromptTemplate**
   - **Description**: This is used to create an AI message in a chat prompt. AI messages are typically the responses from the assistant.
   - **Usage**:
     ```python
     from langchain_core.prompts import AIMessagePromptTemplate

     ai_template = AIMessagePromptTemplate.from_template("{response}")
     print(ai_template.format(response="I'm doing well, thank you!"))  # An AIMessage whose content is "I'm doing well, thank you!"
     ```

### 8. **FewShotPromptTemplate**
   - **Description**: This is used to create a prompt that includes few-shot examples. Few-shot examples are used to guide the model by showing it how to respond to similar inputs.
   - **Usage**:
     ```python
     from langchain_core.prompts import FewShotPromptTemplate, PromptTemplate

     # The examples should demonstrate the task named in the prefix (translation)
     examples = [
         {"input": "Hello", "output": "Bonjour"},
         {"input": "How are you?", "output": "Comment ça va ?"}
     ]

     example_prompt = PromptTemplate(input_variables=["input", "output"], template="Input: {input}\nOutput: {output}")
     few_shot_prompt = FewShotPromptTemplate(
         examples=examples,
         example_prompt=example_prompt,
         prefix="Translate the following text to French:",
         suffix="Input: {text}\nOutput:",
         input_variables=["text"]
     )

     prompt = few_shot_prompt.format(text="Hello, how are you?")
     print(prompt)
     # Output (parts are joined by the default "\n\n" separator):
     # "Translate the following text to French:\n\nInput: Hello\nOutput: Bonjour\n\nInput: How are you?\nOutput: Comment ça va ?\n\nInput: Hello, how are you?\nOutput:"
     ```
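
To show what a few-shot prompt amounts to as plain text, the following sketch assembles a prefix, formatted examples, and a suffix by hand. It is an illustration, not LangChain's implementation; `build_few_shot_prompt` and its blank-line separator are assumptions made for the sketch.

```python
from typing import Dict, List


def build_few_shot_prompt(
    prefix: str,
    examples: List[Dict[str, str]],
    example_template: str,
    suffix: str,
    separator: str = "\n\n",
    **variables: str,
) -> str:
    """Join the prefix, each formatted example, and the formatted suffix."""
    parts = [prefix]
    parts += [example_template.format(**example) for example in examples]
    parts.append(suffix.format(**variables))
    return separator.join(parts)


few_shot_text = build_few_shot_prompt(
    prefix="Translate the following text to French:",
    examples=[{"input": "Hello", "output": "Bonjour"}],
    example_template="Input: {input}\nOutput: {output}",
    suffix="Input: {text}\nOutput:",
    text="Thank you",
)
print(few_shot_text)
```

The model sees one long string: the examples are simply text that precedes the real input.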

### 9. **PipelinePromptTemplate**
   - **Description**: This is used to create a pipeline of prompts where the output of one prompt is used as the input for the next.
   - **Usage**:
     ```python
     from langchain_core.prompts import PipelinePromptTemplate, PromptTemplate

     full_template = PromptTemplate(template="Final answer: {answer}", input_variables=["answer"])
     partial_template = PromptTemplate(template="Translate the following text to {language}: {text}", input_variables=["language", "text"])

     pipeline_prompt = PipelinePromptTemplate(
         final_prompt=full_template,
         pipeline_prompts=[("answer", partial_template)]
     )

     prompt = pipeline_prompt.format(language="French", text="Hello, how are you?")
     print(prompt)  # Output: "Final answer: Translate the following text to French: Hello, how are you?"
     ```

### 10. **MessagesPlaceholder**
   - **Description**: This is used to dynamically insert a list of messages into a chat prompt. It allows you to dynamically generate or modify the messages in the prompt.
   - **Usage**:
     ```python
     from langchain_core.messages import AIMessage, HumanMessage
     from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder, HumanMessagePromptTemplate

     chat_prompt = ChatPromptTemplate.from_messages([
         MessagesPlaceholder(variable_name="history"),
         HumanMessagePromptTemplate.from_template("{input}")
     ])

     messages = [HumanMessage(content="Hello"), AIMessage(content="Hi there!")]
     prompt = chat_prompt.format_prompt(history=messages, input="How are you?").to_messages()
     print(prompt)
     # Output: [HumanMessage(content='Hello'), AIMessage(content='Hi there!'), HumanMessage(content='How are you?')]
     ```

### Summary
- **BasePromptTemplate**: Base class, not used directly.
- **StringPromptTemplate**: Abstract base class for prompts that produce a single string.
- **PromptTemplate**: General-purpose prompt with multiple input variables.
- **ChatPromptTemplate**: Structured chat prompts with multiple message types.
- **SystemMessagePromptTemplate**: System messages in chat prompts.
- **HumanMessagePromptTemplate**: Human messages in chat prompts.
- **AIMessagePromptTemplate**: AI messages in chat prompts.
- **FewShotPromptTemplate**: Prompts with few-shot examples.
- **PipelinePromptTemplate**: Pipelined prompts.
- **MessagesPlaceholder**: Dynamically insert messages into chat prompts.

Each of these templates is designed for a specific use case, and choosing the right one depends on the structure and complexity of the prompt you need to generate.


LangChain provides convenience class methods, such as `from_template`, that simplify the creation of prompt templates. The main ones are described below.

### 1. **from_template**
   - **Description**: This method is used to create a prompt template from a string template. It automatically infers the input variables from the template string.
   - **Usage**:
     ```python
     from langchain_core.prompts import PromptTemplate

     template = PromptTemplate.from_template("Translate the following text to {language}: {text}")
     prompt = template.format(language="French", text="Hello, how are you?")
     print(prompt)  # Output: "Translate the following text to French: Hello, how are you?"
     ```
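
The variable inference described above can be sketched with the standard library's `string.Formatter`, which parses `{name}` placeholders. This illustrates the idea only and is not a copy of LangChain's code; `infer_variables` is a hypothetical helper.

```python
from string import Formatter
from typing import List


def infer_variables(template: str) -> List[str]:
    """Return the sorted, de-duplicated placeholder names in a template string."""
    names = {
        field_name
        for _, field_name, _, _ in Formatter().parse(template)
        if field_name  # skip literal-only segments (None) and empty `{}`
    }
    return sorted(names)


template_text = "Translate the following text to {language}: {text}"
variables = infer_variables(template_text)
print(variables)
```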

### 2. **from_messages**
   - **Description**: `ChatPromptTemplate.from_messages` creates a chat prompt template from a list of messages. Each entry can be a `(role, template)` tuple, a message prompt template, or a message object. The message prompt template classes themselves do not have a `from_message` method.
   - **Usage**:
     ```python
     from langchain_core.prompts import ChatPromptTemplate

     chat_prompt = ChatPromptTemplate.from_messages([
         ("system", "You are a helpful assistant."),
         ("user", "{text}"),
     ])
     print(chat_prompt.format_messages(text="Hello, how are you?"))  # A SystemMessage followed by a HumanMessage
     ```

### 3. **from_template with a role**
   - **Description**: There is no `from_role` method. To create a message template with an arbitrary role, use `ChatMessagePromptTemplate.from_template` and pass `role`; for the standard roles, use `(role, template)` tuples with `ChatPromptTemplate.from_messages`.
   - **Usage**:
     ```python
     from langchain_core.prompts import ChatMessagePromptTemplate

     message_template = ChatMessagePromptTemplate.from_template("{content}", role="reviewer")
     print(message_template.format(content="Hello, how are you?"))  # A ChatMessage with role "reviewer"
     ```

### 4. **from_examples**
   - **Description**: `PromptTemplate.from_examples` builds a single string template from a list of already-formatted example strings, placed between a prefix and a suffix. `FewShotPromptTemplate` has no `from_examples` method; it takes `examples` in its constructor.
   - **Usage**:
     ```python
     from langchain_core.prompts import PromptTemplate

     # Each example is a ready-made string, not a dictionary
     examples = [
         "Input: Hello\nOutput: Bonjour",
         "Input: How are you?\nOutput: Comment ça va ?",
     ]

     few_shot_prompt = PromptTemplate.from_examples(
         examples=examples,
         suffix="Input: {text}\nOutput:",
         input_variables=["text"],
         prefix="Translate the following text to French:",
     )

     prompt = few_shot_prompt.format(text="Hello, how are you?")
     print(prompt)
     # Output: the prefix, the two examples, and the filled-in suffix, separated by blank lines
     ```

### 5. **from_file**
   - **Description**: This method is used to create a prompt template from a file. It reads the template from a file and infers the input variables.
   - **Usage**:
     ```python
     from langchain_core.prompts import PromptTemplate

     # Assumes the file contains: Translate the following text to {language}: {text}
     template = PromptTemplate.from_file("path/to/template.txt")
     prompt = template.format(language="French", text="Hello, how are you?")
     print(prompt)  # Output: "Translate the following text to French: Hello, how are you?"
     ```

### 6. **partial**
   - **Description**: There is no `from_prompt` method. To derive a new prompt template from an existing one, use `partial`, which returns a copy of the template with some variables already filled in.
   - **Usage**:
     ```python
     from langchain_core.prompts import PromptTemplate

     base_template = PromptTemplate(template="Translate the following text to {language}: {text}", input_variables=["language", "text"])
     french_template = base_template.partial(language="French")  # base_template is unchanged
     prompt = french_template.format(text="Hello, how are you?")
     print(prompt)  # Output: "Translate the following text to French: Hello, how are you?"
     ```

### Summary
- **from_template**: Create a prompt template from a string template (`PromptTemplate`, message prompt templates).
- **from_messages**: Create a chat prompt template from a list of messages (`ChatPromptTemplate`).
- **from_template with a role**: Create a message template for an arbitrary role (`ChatMessagePromptTemplate`).
- **from_examples**: Create a string prompt template from a list of example strings (`PromptTemplate`).
- **from_file**: Create a prompt template from a file (`PromptTemplate`).
- **partial**: Create a new prompt template from an existing one with some variables pre-filled.

These methods provide a convenient way to create and manage prompt templates, making it easier to work with different types of prompts and messages in LangChain.



Few-shot examples can also be supplied to chat models with `FewShotChatMessagePromptTemplate`:

```python
from langchain_core.prompts import ChatPromptTemplate, FewShotChatMessagePromptTemplate
from langchain_openai import ChatOpenAI

# Example keys must match the variables used in example_prompt
examples = [
    {"input": "hi!", "output": "¡hola!"},
    {"input": "bye!", "output": "¡adiós!"},
]

example_prompt = ChatPromptTemplate.from_messages(
    [
        ("user", "{input}"),
        ("assistant", "{output}"),
    ]
)

few_shot_prompt = FewShotChatMessagePromptTemplate(
    example_prompt=example_prompt,
    examples=examples,
)

final_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are an English-Spanish translator."),
        few_shot_prompt,
        ("human", "{input}"),
    ]
)

chat_model = ChatOpenAI(model_name="gpt-3.5-turbo")
chain = final_prompt | chat_model

print(chain.invoke({"input": "How are you?"}).content)
```



### IV. Chains

**A. Understanding Chains**

Chains represent sequences of executable actions, including prompts, models, and output parsers.

**Example:**

```python
from langchain.output_parsers.json import SimpleJsonOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_openai import OpenAI

# from_template infers the input variable ({question}) from the template string
prompt_template = PromptTemplate.from_template(
    "Return a JSON object with an 'answer' key that answers the following question: {question}."
)

llm = OpenAI()
output_parser = SimpleJsonOutputParser()

# Prompt -> model -> parser, composed with the LCEL pipe operator
chain = prompt_template | llm | output_parser

response = chain.invoke({"question": "What is the biggest country?"})
print(response)  # A Python dict parsed from the model's JSON
```

### V. Output Parsers

**A. Understanding Output Parsers**

Output parsers format the LLM's response into a specific structure.
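
As a rough illustration of what a JSON output parser has to do, the sketch below strips an optional Markdown code fence from a model reply and decodes the JSON. It uses only the standard library; `parse_json_reply` is a hypothetical helper, the reply text is made up, and real parsers handle more edge cases (partial streams, trailing text).

```python
import json
import re
from typing import Any

FENCE = "`" * 3  # built programmatically so it does not clash with this block's own fence


def parse_json_reply(reply: str) -> Any:
    """Extract and decode JSON from a reply that may be wrapped in a code fence."""
    pattern = re.compile(FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE, re.DOTALL)
    match = pattern.search(reply)
    payload = match.group(1) if match else reply.strip()
    return json.loads(payload)


# A made-up model reply wrapped in a fenced JSON block.
raw_reply = FENCE + 'json\n{"answer": "Russia"}\n' + FENCE
parsed = parse_json_reply(raw_reply)
print(parsed["answer"])
```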

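**Example: JSON Output Parser**

The chain example in the previous section uses `SimpleJsonOutputParser`, which parses the model's JSON text into a Python dictionary.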



**B. Custom Output Parser using Pydantic**

```python
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_core.pydantic_v1 import BaseModel, Field


class Joke(BaseModel):
    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline of the joke")
    
    
parser = JsonOutputParser(pydantic_object=Joke)

prompt = PromptTemplate(
    template="Answer the user query.\n {format_instructions}\n{user_query}\n",
    input_variables=["user_query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

chain = prompt | llm | parser

chain.invoke({"user_query": "Tell me a joke."})  # returns a dict with "setup" and "punchline" keys
```


The `partial_variables` parameter in the `PromptTemplate` class allows you to specify variables that are pre-filled with values when the template is created. These variables are not expected to be provided as input when the template is used. Instead, they are included in the template with predefined values.

In the example above, `partial_variables` is used to pre-fill the `format_instructions` variable with the value returned by `parser.get_format_instructions()`. This means that when you create the `PromptTemplate`, the `format_instructions` variable already has a value, and you do not need to provide it as input when you use the template.

The following self-contained example shows how `partial_variables` works:

### Example Code

```python
from langchain_core.prompts import PromptTemplate

# Assume parser is an object with a method get_format_instructions()
class Parser:
    def get_format_instructions(self):
        return "Please provide the answer in a clear and concise manner."

parser = Parser()

# Create the prompt template with partial variables
prompt = PromptTemplate(
    template="Answer the user query.\n {format_instructions}\n{user_query}\n",
    input_variables=["user_query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

# Example usage
user_query = "What is the capital of Italy?"
formatted_prompt = prompt.format(user_query=user_query)
print(formatted_prompt)
```

### Explanation

1. **Parser Class**:
   - The `Parser` class has a method `get_format_instructions()` that returns a string with format instructions.

2. **PromptTemplate**:
   - The `PromptTemplate` class is used to create a prompt template.
   - The `template` parameter defines the structure of the prompt, including placeholders for variables.
   - The `input_variables` parameter specifies the variables that will be provided as input when the template is used. In this case, `user_query` is the input variable.
   - The `partial_variables` parameter specifies the variables that are pre-filled with values. In this case, `format_instructions` is pre-filled with the value returned by `parser.get_format_instructions()`.

3. **Example Usage**:
   - The `format` method of the `PromptTemplate` instance is used to generate the formatted prompt.
   - The `user_query` variable is provided as input, and the `format_instructions` variable is already pre-filled.

### Output

The output will be:

```
Answer the user query.
 Please provide the answer in a clear and concise manner.
What is the capital of Italy?
```

### Benefits of Using `partial_variables`

1. **Pre-filled Values**:
   - You can pre-fill certain variables with values that are constant or predefined, reducing the need to provide them as input every time you use the template.

2. **Simplified Input**:
   - By pre-filling some variables, you simplify the input required to use the template, making it easier to use and reducing the risk of errors.

3. **Consistency**:
   - Pre-filled variables ensure consistency in the prompt, as the same values are used every time the template is used.

By using `partial_variables`, you can create more flexible and reusable prompt templates that are easier to work with.
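
The mechanism can be mimicked in a few lines of plain Python. The sketch below is not LangChain's implementation; `MiniTemplate` is a hypothetical class that shows how pre-filled values are merged with call-time inputs.

```python
from typing import Dict, Optional


class MiniTemplate:
    """A toy template with pre-filled (partial) variables."""

    def __init__(self, template: str, partial_variables: Optional[Dict[str, str]] = None) -> None:
        self.template = template
        self.partial_variables = dict(partial_variables or {})

    def partial(self, **values: str) -> "MiniTemplate":
        # Return a new template; the original stays unchanged.
        return MiniTemplate(self.template, {**self.partial_variables, **values})

    def format(self, **values: str) -> str:
        # Call-time values are merged on top of the pre-filled ones.
        return self.template.format(**{**self.partial_variables, **values})


base = MiniTemplate("{format_instructions}\n{user_query}")
with_instructions = base.partial(format_instructions="Answer in one sentence.")
formatted = with_instructions.format(user_query="What is the capital of Italy?")
print(formatted)
```

Returning a new object from `partial` keeps the base template reusable with different pre-filled values.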



# Day Four: LangChain and LLM Ops Series - Working with Data and RAG Basic Demo

This document summarizes a technical session on working with data and Retrieval-Augmented Generation (RAG) using LangChain.

## I. Introduction

The session is part of a series covering LangChain and LLM Ops. The focus is on the basics of RAG, specifically how to work with custom data and overcome the limitations of feeding large datasets directly to LLMs. The session covers data loading, an introduction to RAG, its components, and a basic RAG demo with the LangChain Expression Language.

## II. Agenda

1. **Data Loading:** Techniques for loading custom data in various formats.
2. **Introduction to RAG:** Explanation of RAG and its components.
3. **RAG Components:** Detailed explanation of splitter, embeddings, vector store, retriever, and top-k.
4. **RAG Demo:** Building a RAG system using the LangChain Expression Language and custom data.

## III. Limitations of Direct LLM Input

Directly using pre-trained LLMs without custom data has limitations. For example, querying about specific information not known to the model results in unsatisfactory responses.

```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model_name="gpt-3.5-turbo")

prompt = """You are a helpful assistant.
Tell me about DS with Bubby"""

response = llm.invoke(prompt)
print(response.content)  # Output indicates lack of specific information.
```

## IV. Data Loading with LangChain

LangChain provides loaders for data in various formats. Install the necessary package:

```bash
pip install langchain-community
```

### A. TXT Data Loading

```python
from langchain_community.document_loaders import TextLoader

loader = TextLoader('data/b_good.txt')
loaded_data = loader.load()
print(loaded_data)  # List of Document objects
```

### B. CSV Data Loading

```python
from langchain_community.document_loaders import CSVLoader

loader = CSVLoader('data/straight_tree_list.csv')
loaded_data = loader.load()
print(loaded_data)  # List of Document objects per row
```

### C. HTML Data Loading

```python
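from langchain_community.document_loaders import UnstructuredHTMLLoader, UnstructuredURLLoader

# UnstructuredHTMLLoader takes a path to a local HTML file;
# UnstructuredURLLoader takes a list of URLs. Both need the `unstructured` package.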
```

### D. PDF Data Loading

```python
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader('data/mypdf.pdf')
loaded_data = loader.load()
print(loaded_data)  # List of Document objects per page
```

### E. Wikipedia Data Loading

```python
from langchain_community.document_loaders import WikipediaLoader  # requires `pip install wikipedia`

loader = WikipediaLoader(query="Tesla", load_max_docs=1)
loaded_data = loader.load()[0].page_content
print(loaded_data)  # Text content of the first matching page
```

## V. Drawbacks of Directly Loading Large Datasets

1. **Input Limit:** LLMs have context window limits (e.g., 16385 tokens for `gpt-3.5-turbo`).
2. **Cost:** Processing large datasets increases costs due to per-token charges.
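
A back-of-the-envelope check makes both drawbacks concrete. The sketch below assumes roughly 4 characters per token, which is only a rule of thumb for English text, and uses a made-up price. Both numbers are placeholders; in practice, use a real tokenizer and the provider's current price list.

```python
CONTEXT_LIMIT_TOKENS = 16_385        # figure quoted above for gpt-3.5-turbo
CHARS_PER_TOKEN = 4                  # rough heuristic, an assumption
PRICE_PER_1K_INPUT_TOKENS = 0.0005   # hypothetical price in USD, for illustration only


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    return max(1, len(text) // CHARS_PER_TOKEN)


def fits_in_context(text: str, limit: int = CONTEXT_LIMIT_TOKENS) -> bool:
    """True if the estimated token count is within the context window."""
    return estimate_tokens(text) <= limit


def estimate_cost(text: str) -> float:
    """Estimated input cost in USD under the placeholder price."""
    return estimate_tokens(text) / 1000 * PRICE_PER_1K_INPUT_TOKENS


document = "word " * 20_000  # 100,000 characters of dummy text
tokens = estimate_tokens(document)
print(tokens, fits_in_context(document), round(estimate_cost(document), 4))
```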

## VI. Introduction to RAG (Retrieval-Augmented Generation)

RAG overcomes these limitations by:

1. **Splitting:** Dividing documents into smaller chunks.
2. **Embedding:** Converting text chunks into numerical vectors.
3. **Vector Database:** Storing embeddings for efficient retrieval.
4. **Retrieval:** Fetching the chunks whose embeddings are most similar to the user's query.
5. **LLM Processing:** Using retrieved context with queries to generate responses.
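
The five steps can be sketched end to end in plain Python. This toy version replaces real embeddings with word-count vectors and the vector database with a list, so it only illustrates the flow. Every function name here is hypothetical, and the final LLM call is left out.

```python
import math
from collections import Counter
from typing import List, Tuple


def split(text: str) -> List[str]:
    """Step 1: split into chunks (here: one chunk per sentence)."""
    return [s.strip() for s in text.split(".") if s.strip()]


def embed(text: str) -> Counter:
    """Step 2: toy 'embedding' as a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, store: List[Tuple[str, Counter]], k: int = 1) -> List[str]:
    """Step 4: return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


corpus = "Paris is the capital of France. Berlin is the capital of Germany. Llamas live in the Andes."
store = [(chunk, embed(chunk)) for chunk in split(corpus)]  # Step 3: the 'vector store'

question = "Where do llamas live"
context = retrieve(question, store, k=1)
# Step 5: the retrieved context and the question would be sent to the LLM.
rag_prompt = f"Context: {context[0]}\nQuestion: {question}"
print(rag_prompt)
```

Only the retrieved chunk reaches the model, which is how RAG stays within the context window and keeps per-token cost down.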

## VII. Next Steps and Conclusion

The next session will delve into RAG components and demonstrate building a RAG application. The speaker shares resources like GitHub repositories and suggests skills for generative AI engineering. The session concludes with a Q&A and a thank you to viewers.


## Summary of Technical Session on RAG Components

### I. Introduction

The session focuses on Retrieval-Augmented Generation (RAG) components, emphasizing the importance of understanding these components before implementing a RAG-based system. The previous session discussed data loaders and the necessity of splitting large datasets for large language models (LLMs). The current session covers the core components of RAG: data splitting, embedding generation, vector database storage, and retrievers. The next session will cover the full implementation of RAG using LangChain.

### II. Data Splitting (Chunking)

**Purpose:**  
Dividing large documents into smaller chunks that fit within an LLM's input limit.

**Techniques:**

1. **Character Text Splitter:**
   - **Parameters:** `separator`, `chunk_size`, `chunk_overlap`, `length_function`, `is_separator_regex`.
   - **Example Code:**
     ```python
     from langchain_text_splitters import CharacterTextSplitter
     from langchain_community.document_loaders import TextLoader

     # Load data
     data = TextLoader("in.txt").load()

     # Initialize CharacterTextSplitter
     text_splitter = CharacterTextSplitter(
         separator='\n',
         chunk_size=1000,
         chunk_overlap=200,
         length_function=len,
     )

     # Split the data
     chunks = text_splitter.create_documents([data[0].page_content])

     # Access individual chunks (example)
     print(chunks[0].page_content)  # First chunk
     print(len(chunks)) # Number of chunks
     ```

2. **Recursive Character Text Splitter:**
   - **How it works:** The text is first split on one separator; any piece that is still too large is split on the next separator, and so on recursively. You can inspect (or override) the list of separators it splits on.
   - **Example Code:**
     ```python
     from langchain.text_splitter import RecursiveCharacterTextSplitter

     # ... (load data as before) ...
     recursive_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
     # `data` is a list of Documents, so split_documents is the matching method
     chunks = recursive_splitter.split_documents(data)
     # ... (access chunks as before) ...
     ```
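
The recursive idea can be sketched in plain Python. This is a simplified illustration under stated assumptions: the function name `recursive_split` and the two-separator list are made up for the example, and unlike the real splitter it does not merge small pieces back together up to `chunk_size` or apply overlap.

```python
def recursive_split(text: str, chunk_size: int, separators: list[str]) -> list[str]:
    """Split on the first separator; recurse with finer separators on oversized pieces."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators:
        # No separators left: fall back to hard character cuts
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    first, rest = separators[0], separators[1:]
    pieces = []
    for part in text.split(first):
        pieces.extend(recursive_split(part, chunk_size, rest))
    return pieces


sample = "First paragraph here.\n\nSecond one is a bit longer than the limit."
result = recursive_split(sample, chunk_size=25, separators=["\n\n", " "])
print(result)
```

The first paragraph fits within 25 characters and is kept whole; the second does not, so it is broken down further on spaces.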

### III. Embedding Generation

**Purpose:**  
Transforming text chunks into numerical vector representations (embeddings) for efficient similarity searching in vector databases.

**Example Code:**
```python
from langchain_openai import OpenAIEmbeddings  # requires OPENAI_API_KEY to be set

# Initialize embedding model
embeddings = OpenAIEmbeddings()

# Generate embeddings
text_chunks = ["hi there", "hello", "what's your name", "james bond"]
embeddings_list = embeddings.embed_documents(text_chunks)

# Access the embeddings (example)
print(embeddings_list[0]) #First embedding.
print(len(embeddings_list[0])) # Dimension of embeddings.
```
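
Similarity between embeddings is commonly measured with cosine similarity. The sketch below uses hand-made three-dimensional toy vectors (an assumption for illustration; real embedding vectors have hundreds or thousands of dimensions) to show that texts with similar meaning should score close to 1 and unrelated ones close to 0.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors standing in for the embeddings of three short texts
v_hi = [1.0, 0.0, 1.0]
v_hello = [1.0, 0.0, 0.8]
v_bond = [0.0, 1.0, 0.0]

sim_greetings = cosine_similarity(v_hi, v_hello)
sim_unrelated = cosine_similarity(v_hi, v_bond)
print(round(sim_greetings, 3), sim_unrelated)
```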

### IV. Vector Database Storage

**Purpose:**  
Storing and retrieving embeddings efficiently.

**Vector Databases Mentioned:** ChromaDB, FAISS, Pinecone, Weaviate, Qdrant.

**Example using ChromaDB:**
```python
from langchain_community.document_loaders import TextLoader
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import CharacterTextSplitter
from langchain_chroma import Chroma

# Load the document, split it into chunks, embed each chunk and load it into the vector database
loaded_document = TextLoader('./data/state_of_the_union.txt').load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
chunks_of_text = text_splitter.split_documents(loaded_document)
db = Chroma.from_documents(chunks_of_text, OpenAIEmbeddings())

# Perform similarity search
query = "What did the president say about the John Lewis Voting Rights Act?"
results = db.similarity_search(query)

# Print results
print(results)
```

**Example using FAISS:**
```python
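from langchain_community.vectorstores import FAISS

# ... (load data and split it as before) ...
# `chunks_of_text` and OpenAIEmbeddings are the same as in the ChromaDB example
db = FAISS.from_documents(chunks_of_text, OpenAIEmbeddings())
# ... (perform similarity search and print results as before) ...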
```
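
Conceptually, a vector store keeps each text next to its embedding and ranks stored texts by similarity to the query's embedding. The toy class below is a sketch of that idea only: `ToyVectorStore` and the word-count "embedding" are assumptions made for illustration, and real stores such as ChromaDB or FAISS use learned embeddings and optimized indexes.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag of lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    def __init__(self, texts: list[str]):
        # Store each text together with its embedding
        self.entries = [(text, embed(text)) for text in texts]

    def similarity_search(self, query: str, k: int = 4) -> list[str]:
        query_vec = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(query_vec, e[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


store = ToyVectorStore([
    "the cat sat on the mat",
    "voting rights act passed by congress",
    "stock markets fell sharply today",
])
top = store.similarity_search("what about the voting rights act", k=1)
print(top)  # ['voting rights act passed by congress']
```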

### V. Retrievers

**Purpose:**  
Providing a more powerful method for accessing information than simple similarity search, returning metadata (source document and location) along with the retrieved text.

**Example using FAISS Retriever:**
```python
# ... (load data, split, embed, and create FAISS vectorstore as before) ...

# Create retriever
retriever = db.as_retriever()

# Retrieve information (define the query before invoking the retriever)
query = "What did he say about kitten?"
results = retriever.invoke(query)  # older releases: retriever.get_relevant_documents(query)

# Print results with metadata
print(results)
```

**Parameter `k` (top_k):**  
Controls the number of retrieved documents. The default is 4. This can be adjusted in the retriever creation (e.g., `retriever = db.as_retriever(search_kwargs={"k": 2})`).
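
The effect of `k` can be illustrated without a vector database. In this sketch the similarity scores are made-up numbers and `top_k` is a hypothetical helper; it only shows that the retriever keeps the `k` highest-scoring chunks.

```python
import heapq

# Made-up similarity scores between one query and five stored chunks
scores = {"chunk A": 0.91, "chunk B": 0.42, "chunk C": 0.77, "chunk D": 0.15, "chunk E": 0.66}


def top_k(scores: dict[str, float], k: int = 4) -> list[str]:
    """Return the k chunk names with the highest scores, best first."""
    return heapq.nlargest(k, scores, key=scores.get)


default_results = top_k(scores)      # k=4
narrow_results = top_k(scores, k=2)
print(default_results)  # ['chunk A', 'chunk C', 'chunk E', 'chunk B']
print(narrow_results)   # ['chunk A', 'chunk C']
```

A smaller `k` gives the LLM less, but more focused, context; a larger `k` increases recall at the cost of a longer prompt.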

### VI. Tips

- **Practice:** Download and execute the provided code to understand the concepts better.
- **Resources:** Utilize the provided playlists and documentation for further learning.
- **Model Selection:** For professional applications, prefer using powerful models like OpenAI's.

# Summary: Implementing Basic RAG with LangChain Expression Language

## Introduction

This session focuses on implementing a basic Retrieval-Augmented Generation (RAG) system using LangChain's Expression Language (LCEL). Building on previous sessions that covered LangChain introductions, connecting to LLMs, prompt templates, and RAG components, this tutorial provides a comprehensive guide to creating a RAG system.

## Key Concepts and Components

### RAG Components

1. **Data Splitting (Chunking)**:
   - Custom data, such as PDFs, TXT, or XML, is split into manageable chunks.

2. **Embedding Generation**:
   - Vector embeddings are generated using models like OpenAI's embeddings.

3. **Vector Storage**:
   - Embeddings are stored in a vector database, such as FAISS, forming a knowledge base.

4. **Retriever**:
   - A retriever fetches the most relevant chunks based on user queries.

5. **Large Language Model (LLM)**:
   - The LLM processes the retrieved context and query to generate accurate responses.

### Architecture Overview

1. **Data Loading**:
   - Custom data is loaded into the system.

2. **Chunking**:
   - Data is split into smaller chunks using techniques like `CharacterTextSplitter`.

3. **Embedding Generation**:
   - Vector embeddings are generated for each chunk.

4. **Vector Storage**:
   - Embeddings are stored in a vector database.

5. **Retriever**:
   - Retrieves relevant information based on user queries.

6. **LLM Connection**:
   - The LLM generates responses based on retrieved context and user queries.

## Code Implementation

### Step-by-Step Implementation

1. **Loading the API Key**

   ```python
   import os
   from dotenv import load_dotenv

   load_dotenv()
   openai_api_key = os.getenv("OPENAI_API_KEY")
   ```

2. **Importing Libraries**

   ```python
   from langchain_community.vectorstores import FAISS
   from langchain_community.document_loaders import TextLoader
   from langchain_openai import ChatOpenAI, OpenAIEmbeddings
   from langchain_text_splitters import CharacterTextSplitter
   from langchain_core.prompts import ChatPromptTemplate
   from langchain_core.output_parsers import StrOutputParser
   from langchain_core.runnables import RunnablePassthrough
   ```

3. **Loading Data**

   ```python
   loader = TextLoader('state_of_the_union.txt')
   documents = loader.load()
   ```

4. **Data Splitting (Chunking)**

   ```python
   text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
   chunks = text_splitter.split_documents(documents)
   print(f"Number of chunks: {len(chunks)}")
   ```

5. **Embedding Generation and Vector Storage**

   ```python
   embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)
   vectorstore = FAISS.from_documents(chunks, embeddings)
   vectorstore.save_local("faiss_index")
   ```

6. **Creating the Retriever**

   ```python
   retriever = vectorstore.as_retriever(search_kwargs={"k": 3})
   ```

7. **Connecting to the LLM**

   ```python
   template = """Answer the question based on the context below.
   Context: {context}
   Question: {question}
   """
   prompt = ChatPromptTemplate.from_template(template)

   llm = ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo")

   def format_docs(docs):
       # Join the retrieved chunks into a single context string
       return "\n\n".join(d.page_content for d in docs)

   chain = (
       {"context": retriever | format_docs, "question": RunnablePassthrough()}
       | prompt
       | llm
       | StrOutputParser()
   )
   ```

8. **Querying the Chain**

   ```python
   response = chain.invoke("What did he say about Ketanji Brown Jackson?")
   print(response)
   ```
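
What the first step of the chain hands to the prompt can be reproduced with plain Python. In this sketch `Doc` is a hypothetical stand-in for a retrieved document and `str.format` stands in for the prompt template; no retriever or LLM is involved.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Stand-in for a retrieved document; only `page_content` is modelled."""
    page_content: str


def format_docs(docs):
    # Join the retrieved chunks into a single context string
    return "\n\n".join(d.page_content for d in docs)


TEMPLATE = """Answer the question based on the context below.
Context: {context}
Question: {question}
"""

retrieved = [Doc("Chunk one text."), Doc("Chunk two text.")]
question = "What does chunk two say?"

# The dict step of the chain builds exactly these two keys
prompt_inputs = {"context": format_docs(retrieved), "question": question}
final_prompt = TEMPLATE.format(**prompt_inputs)
print(final_prompt)
```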

## Tips and Best Practices

- **Parameter Tuning**: Adjust `k` (top-k) through the retriever's `search_kwargs` to control the number of retrieved results.
- **Context Management**: Ensure the context provided to the LLM is accurate and relevant.
- **Output Parsing**: Use output parsers to format the LLM's response as needed.

# LangChain Expression Language (LCEL) - Day 7 Summary


## Introduction to LCEL

LCEL is presented as a newer, more efficient approach to creating chains in LangChain compared to the legacy method. The instructor emphasizes that while legacy chains are still supported, LCEL is favored for its enhanced functionality and integration with newer LangChain tools. The core concept is building chains as sequences of "runnables," which are executable objects. The primary difference between the legacy and LCEL approaches lies in how chains are defined: legacy chains use predefined functions from the `langchain.chains` module, while LCEL uses a pipe (`|`) symbol to sequence runnables.

## High-Level Architecture of LangChain Applications

The lecture illustrates the typical architecture of a LangChain application:

1. **Data Input:** The process begins with input data.
2. **Action:** Actions are performed on the data, including prompt creation and chain execution.
3. **Runnables:** These actions involve the use of runnables (executable objects like prompts, models, and parsers).
4. **LLM Interaction:** The data is passed to the Large Language Model (LLM).
5. **Output:** The LLM generates an output.

LCEL plays a crucial role in the "Action" stage.

## Legacy LangChain vs. LCEL

The instructor contrasts the legacy LangChain approach with the new LCEL method. Legacy chains utilize predefined chain functions (e.g., `RetrievalQA`, `LLMMathChain`, `LLMCheckerChain`), requiring explicit function calls and parameter passing. LCEL, in contrast, uses a more concise syntax with the pipe symbol (`|`) to chain runnables together. The order of runnables in LCEL is strictly left-to-right, unlike the legacy approach where order may not always be critical. The instructor notes that legacy methods are still relevant in some cases and that LCEL is under active development.
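
How a pipe symbol can chain steps left to right is easy to sketch with Python's `__or__` operator overloading. The `Step` class below is a toy written for illustration; it is an assumption about the general mechanism, not LangChain's actual `Runnable` implementation, and the "LLM" is a fake function.

```python
class Step:
    """Toy stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # a | b builds a new Step that runs a first, then feeds its output to b
        return Step(lambda value: other.invoke(self.invoke(value)))


make_prompt = Step(lambda d: f"Tell me a curious fact about {d['politician']}")
fake_llm = Step(lambda prompt: {"content": prompt.upper()})  # pretend model reply
parse = Step(lambda message: message["content"])

chain = make_prompt | fake_llm | parse
result = chain.invoke({"politician": "JFK"})
print(result)  # TELL ME A CURIOUS FACT ABOUT JFK
```

Swapping the order (for example `parse | make_prompt`) would fail, because each step expects the output type of the step to its left.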

## Example: Basic LCEL Chain

The following Python code demonstrates a basic LCEL chain using OpenAI's `gpt-3.5-turbo` model:

```python
# Import necessary modules
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# The model used by the chain (requires OPENAI_API_KEY to be set)
llm = ChatOpenAI(model="gpt-3.5-turbo")

# Create a prompt template
prompt_template = """Tell me a curious fact about {politician}"""

# Define the chain using LCEL syntax
chain = (
    ChatPromptTemplate.from_template(prompt_template)
    | llm
    | StrOutputParser()
)

# Invoke the chain with an input
politician = "JFK"
output = chain.invoke({"politician": politician})
print(output)

# Alternatively, stream the output piece by piece
for s in chain.stream({"politician": politician}):
    print(s, end="", flush=True)

# Or process several inputs in one call
chain.batch([{"politician": politician}, {"politician": politician}])
```

Two rules of thumb from the session: anything imported from `langchain.chains` is legacy code, and every component of an LCEL chain is a runnable, meaning it can be executed (invoked).

This code creates a chain that:

1. Constructs a prompt using `ChatPromptTemplate`.
2. Passes the prompt to the LLM (`llm`).
3. Uses `StrOutputParser` to extract the string content from the LLM's AI message format response.

**Key Point:** The pipe symbol (`|`) connects the runnables, enforcing left-to-right execution. The `StrOutputParser` is crucial for converting the LLM's response into a usable string.

## Runnable Execution Order and Alternatives

The lecture emphasizes the left-to-right execution order of runnables in LCEL chains. An image visually demonstrates the flow: Input -> Prompt -> LLM -> Output Parser -> Output.

The lecture also presents alternatives to the `invoke` method for executing chains:

- **Streaming:** Using `chain.stream()` allows for streaming the output from the LLM, word by word, instead of receiving the entire output at once. Example code is provided demonstrating this approach.

- **Batching:** Using `chain.batch()` with a list of inputs processes multiple inputs in one call. Example code is provided, demonstrating how to pass a list of soccer players to the chain and receive a response for each.
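
The difference between the three calling styles can be sketched with a toy class. `EchoChain` is hypothetical and its "model" just returns a canned sentence; the sketch runs batch inputs one after another, whereas a real implementation may run them concurrently.

```python
class EchoChain:
    """Toy chain whose 'model' returns a canned sentence about the input."""

    def invoke(self, inputs):
        # Whole answer returned at once
        return f"{inputs['player']} is a famous soccer player"

    def stream(self, inputs):
        # Yield the answer piece by piece instead of all at once
        for word in self.invoke(inputs).split(" "):
            yield word + " "

    def batch(self, inputs_list):
        # One result per input, in the same order
        return [self.invoke(inputs) for inputs in inputs_list]


chain = EchoChain()

streamed = ""
for piece in chain.stream({"player": "Messi"}):
    streamed += piece  # a UI would print each piece as it arrives

batched = chain.batch([{"player": "Messi"}, {"player": "Ronaldo"}])
print(streamed.strip())
print(batched)
```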




## LangChain Expression Language Session Summary


### Agenda
- **Focus:** Built-in Runnables in LangChain Expression Language (LCEL).
- **Subtopics:**
  - Runnable Pass-Through
  - Runnable Lambda
  - Runnable Parallel
  - Item Getter

### Built-in Runnables

#### Runnable Pass-Through
- **Description:** Outputs the input unchanged.
- **Purpose:** Useful for passing user input directly.
- **Example:**
  ```python
  from langchain_core.runnables import RunnablePassthrough

  chain = RunnablePassthrough()
  output = chain.invoke("Puppy")
  print(output)  # Output: Puppy
  ```

#### Runnable Lambda
- **Description:** Wraps custom functions for use in chains ("lambda" here simply means a function).
- **Purpose:** Integrates custom logic into the chain.
- **Example:**
  ```python
  from langchain_core.runnables import RunnableLambda

  def russian_last_name(name: str) -> str:
      return name + "ovs"

  chain = RunnableLambda(russian_last_name)
  output = chain.invoke("Puppy")
  print(output)  # Output: Puppyovs
  ```

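**Difference in brief:** `RunnablePassthrough` returns its input unchanged, `RunnableLambda` wraps a custom function, and `RunnableParallel` runs several runnables on the same input and returns a dictionary of their outputs.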

#### Runnable Parallel
- **Description:** Executes multiple runnables concurrently.
- **Purpose:** Improves efficiency in complex chains.
- **Example:**
  ```python
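  from langchain_core.runnables import RunnableLambda, RunnableParallel, RunnablePassthrough

  # Each keyword is one branch; every branch receives the same input.
  # `russian_last_name` is the function defined in the Runnable Lambda example.
  runnable_parallel = RunnableParallel(
      operation_0=RunnablePassthrough(),
      operation_1=RunnableLambda(russian_last_name),
  )

  output = runnable_parallel.invoke("Puppy")
  print(output)  # Output: {'operation_0': 'Puppy', 'operation_1': 'Puppyovs'}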
  ```
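
The behaviour described above (same input to every branch, dictionary of results out) can be sketched with the standard library. `run_parallel` is a hypothetical helper using a thread pool; it illustrates the idea and is not how LangChain implements `RunnableParallel`.

```python
from concurrent.futures import ThreadPoolExecutor


def run_parallel(branches, value):
    """Run every branch function on the same input; return {name: result}."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(func, value) for name, func in branches.items()}
        return {name: future.result() for name, future in futures.items()}


def russian_last_name(name: str) -> str:
    return name + "ovs"


output = run_parallel(
    {"operation_0": lambda x: x, "operation_1": russian_last_name},
    "Puppy",
)
print(output)  # {'operation_0': 'Puppy', 'operation_1': 'Puppyovs'}
```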

### Advanced Use of Runnable Parallel
- **Example with Vector Database:**
  - Demonstrates parallel execution with a vector database and retriever.
  - Combines context retrieval and question handling efficiently.

```python
# Assumes `russian_lastname_from_dictionary`, `prompt`, `model`, and `output_parser`
# are defined elsewhere in the session notebook
chain = RunnableParallel({
    "operation_a": RunnablePassthrough(),
    "soccer_player": RunnableLambda(russian_lastname_from_dictionary),
    "operation_c": RunnablePassthrough(),
}) | prompt | model | output_parser

chain.invoke({
    "name1": "Jordon",
    "name": "Abhram",
})
```

```python
from langchain_community.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough, RunnableParallel
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

vectorstore = FAISS.from_texts(
    ["dswithbappy focuses on providing content on Data Science, AI, ML, DL, CV, NLP, Python programming, etc. in English."],
    embedding=OpenAIEmbeddings()
)
retriever = vectorstore.as_retriever()

template = """Answer the question based only on the following context:
{context}
Question: {question}"""
prompt = ChatPromptTemplate.from_template(template)
model = ChatOpenAI(model="gpt-3.5-turbo")

# The retriever and the passthrough both receive the user's question
retrieval_chain = (
    RunnableParallel({"context": retriever, "question": RunnablePassthrough()})
    | prompt
    | model
    | StrOutputParser()
)
retrieval_chain.invoke("What is dswithbappy?")
```




### Item Getter


- **Description:** `itemgetter` (from Python's `operator` module) extracts a specific key from the input dictionary, so each branch of the parallel step receives only the value it needs.
- **Purpose:** Lets a chain accept several inputs at once (here `question` and `language`) and route each one to the right place.
- **Example:**
  ```python
  from operator import itemgetter

  from langchain_community.vectorstores import FAISS
  from langchain_core.output_parsers import StrOutputParser
  from langchain_core.prompts import ChatPromptTemplate
  from langchain_openai import ChatOpenAI, OpenAIEmbeddings

  model = ChatOpenAI(model="gpt-3.5-turbo")

  vectorstore = FAISS.from_texts(
      ["dswithbappy focuses on providing content on Data Science, AI, ML, DL, CV, NLP, Python programming, etc. in English."],
      embedding=OpenAIEmbeddings()
  )
  retriever = vectorstore.as_retriever()

  template = """Answer the question based only on the following context:
  {context}
  Question: {question}
  Answer in the following language: {language}
  """
  prompt = ChatPromptTemplate.from_template(template)

  chain = (
      {
          # Only the question is sent to the retriever
          "context": itemgetter("question") | retriever,
          "question": itemgetter("question"),
          "language": itemgetter("language"),
      }
      | prompt
      | model
      | StrOutputParser()
  )

  chain.invoke({"question": "What is dswithbappy?", "language": "Pirate English"})
  # Example output from the session (wording will vary):
  # dswithbappy be a platform focusin' on content like data science, AI, ML, DL, CV, NLP, Python programmin'.
  ```
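
`itemgetter` itself is plain Python and can be tried without LangChain. In this sketch `fake_retriever` is a hypothetical stand-in for a real retriever; the point is only how each key of the input dictionary is routed to a different slot of the prompt inputs.

```python
from operator import itemgetter

inputs = {"question": "What is dswithbappy?", "language": "Pirate English"}

get_question = itemgetter("question")  # callable: d -> d["question"]
get_language = itemgetter("language")  # callable: d -> d["language"]


def fake_retriever(question: str) -> list[str]:
    """Stand-in for a retriever; returns a placeholder context."""
    return [f"(context retrieved for: {question})"]


prompt_inputs = {
    "context": fake_retriever(get_question(inputs)),
    "question": get_question(inputs),
    "language": get_language(inputs),
}
print(prompt_inputs)
```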


### Code Snippets
- **Runnable Pass-Through Example:**
  ```python
  from langchain_core.runnables import RunnablePassthrough

  chain = RunnablePassthrough()
  output = chain.invoke("Puppy")
  print(output)  # Output: Puppy
  ```

- **Runnable Lambda Example:**
  ```python
  from langchain_core.runnables import RunnableLambda

  def russian_last_name(name: str) -> str:
      return name + "ovs"

  chain = RunnableLambda(russian_last_name)
  output = chain.invoke("Puppy")
  print(output)  # Output: Puppyovs
  ```

- **Runnable Parallel Example:**
  ```python
  from langchain_core.runnables import RunnableLambda, RunnableParallel, RunnablePassthrough

  # `russian_last_name` is the function from the Runnable Lambda example
  runnable_parallel = RunnableParallel(
      operation_0=RunnablePassthrough(),
      operation_1=RunnableLambda(russian_last_name),
  )

  output = runnable_parallel.invoke("Puppy")
  print(output)  # Output: {'operation_0': 'Puppy', 'operation_1': 'Puppyovs'}
  ```

- **Advanced Use of Runnable Parallel with Vector Database:**
  - Combines context retrieval and question handling in parallel.

- **Item Getter Example:**
  - Handles multiple inputs in parallel using `itemgetter`.

**Merged Summary of LangChain Session**

---
Use `.bind()` to attach arguments to a runnable in an LCEL chain. For example, we can add a `stop` argument so that the model's response ends when it reaches the word "Ronaldo".

### Built-in Functions in Runnables

- **`.bind` Function:**
  - Example: Using `.bind` to stop the model's response at a specific point.
  ```python
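  from langchain_core.output_parsers import StrOutputParser
  from langchain_core.prompts import ChatPromptTemplate
  from langchain_openai import ChatOpenAI

  # Illustrative prompt; the session's exact prompt is not shown in these notes
  prompt = ChatPromptTemplate.from_template("Tell me a curious fact about {soccer_player}")
  model = ChatOpenAI(model="gpt-3.5-turbo")

  # .bind() attaches stop=["Ronaldo"] to every call of the model in this chain
  chain = prompt | model.bind(stop=["Ronaldo"]) | StrOutputParser()
  response = chain.invoke({"soccer_player": "Ronaldo"})
  print(response)  # Generation halts before the word "Ronaldo" would be emitted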
  ```
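
Binding an argument is conceptually similar to `functools.partial`: the argument is fixed once and applied on every later call. The sketch below uses a hypothetical `fake_model` with a canned reply to show the effect of a bound `stop` list; it is an analogy, not LangChain's `.bind()` implementation.

```python
from functools import partial


def fake_model(prompt: str, stop=None) -> str:
    """Toy 'model' returning a canned answer, truncated at the first stop word."""
    text = "He played alongside Ronaldo at Real Madrid"
    for word in stop or []:
        index = text.find(word)
        if index != -1:
            text = text[:index]  # cut everything from the stop word onward
    return text


# Fix stop=["Ronaldo"] once; callers no longer need to pass it
bound_model = partial(fake_model, stop=["Ronaldo"])

full = fake_model("Tell me about a soccer player")
cut = bound_model("Tell me about a soccer player")
print(repr(full))
print(repr(cut))
```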

### Combining Chains

- **Simple Chain Combination:**
  - Example: The first chain finds the country a politician is from; the second chain feeds that result into a prompt asking for the country's continent, answered in a chosen language.
  ```python
  from operator import itemgetter

  from langchain_core.output_parsers import StrOutputParser
  from langchain_core.prompts import ChatPromptTemplate
  from langchain_openai import ChatOpenAI

  prompt1 = ChatPromptTemplate.from_template("what is the country {politician} is from?")
  prompt2 = ChatPromptTemplate.from_template(
      "what continent is the country {country} in? respond in {language}"
  )

  model = ChatOpenAI()

  chain1 = prompt1 | model | StrOutputParser()

  # chain1 runs first; its output fills {country} in the second prompt
  chain2 = (
      {"country": chain1, "language": itemgetter("language")}
      | prompt2
      | model
      | StrOutputParser()
  )

  response = chain2.invoke({"politician": "Emmanuel Macron", "language": "French"})
  print(response)  # e.g. L'Europe
  # chain2.invoke({"politician": "Mitterrand", "language": "French"}) works the same way
  ```


### RAG Application Demo

See `WebBaseLoader` (for loading web pages) and LangChain Hub (for ready-made prompts).

- **Process:**
  - Loading data from URLs using `WebBaseLoader`.
  - Chunking and vector storage with ChromaDB.
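  - Creating the RAG chain with `RunnableParallel` for efficiency.
  ```python
  from langchain_core.output_parsers import StrOutputParser
  from langchain_core.runnables import RunnablePassthrough

  # Assumes `retriever`, `prompt`, `llm` (a chat model such as ChatOpenAI), and
  # `format_docs` are defined as in the earlier RAG implementation steps
  rag_chain = (
      {"context": retriever | format_docs, "question": RunnablePassthrough()}
      | prompt
      | llm
      | StrOutputParser()
  )
  response = rag_chain.invoke("What is task decomposition?")
  ```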





You are an assistant for question-answering tasks. 
Use the following pieces of retrieved context as a reference to answer the question. 
If you don't know the answer, just say that you don't know. 
Context:

Solidity

Here's an example using the Solidity text splitter:

```python
SOL_CODE = """
pragma solidity ^0.8.20;
contract HelloWorld {
   function add(uint a, uint b) pure public returns(uint) {
       return a + b;
   }
}
"""

sol_splitter = RecursiveCharacterTextSplitter.from_language(
    language=Language.SOL, chunk_size=128, chunk_overlap=0
)
sol_docs = sol_splitter.create_documents([SOL_CODE])
sol_docs
```

[Document(page_content='pragma solidity ^0.8.20;'), Document(page_content='contract HelloWorld {\n   function add(uint a, uint b) pure public returns(uint) {\n       return a + b;\n   }\n}')]

```python
# Excerpt begins partway through the sample `markdown_document` string, which ends with:
# "... for over a dozen programming languages."
headers_to_split_on = [
    ("#", "Header 1"),
    ("##", "Header 2"),
]

# MD splits
markdown_splitter = MarkdownHeaderTextSplitter(
    headers_to_split_on=headers_to_split_on, strip_headers=False
)
md_header_splits = markdown_splitter.split_text(markdown_document)

# Char-level splits
from langchain_text_splitters import RecursiveCharacterTextSplitter

chunk_size = 250
chunk_overlap = 30
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=chunk_size, chunk_overlap=chunk_overlap
)

# Split
splits = text_splitter.split_documents(md_header_splits)
splits
```

text_splitter.split_text(state_of_the_union)[:2]
['Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and', 'of Congress and the Cabinet. Justices of the Supreme Court. My fellow Americans.']
Let's go through the parameters set above for RecursiveCharacterTextSplitter:

C#

Here's an example using the C# text splitter:

```python
C_CODE = """
using System;
class Program
{
    static void Main()
    {
        int age = 30; // Change the age value as needed

        // Categorize the age without any console output
        if (age < 18)
        {
            // Age is under 18
        }
        else if (age >= 18 && age < 65)
        {
            // Age is an adult
        }
        else
        {
            // Age is a senior citizen
        }
    }
}
"""
c_splitter = RecursiveCharacterTextSplitter.from_language(
    language=Language.CSHARP, chunk_size=128, chunk_overlap=0
)
c_docs = c_splitter.create_documents([C_CODE])
c_docs
```

How to recursively split text by characters
This text splitter is the recommended one for generic text. It is parameterized by a list of characters. It tries to split on them in order until the chunks are small enough. The default list is ["\n\n", "\n", " ", ""]. This has the effect of trying to keep all paragraphs (and then sentences, and then words) together as long as possible, as those would generically seem to be the strongest semantically related pieces of text.

How the text is split: by list of characters.
How the chunk size is measured: by number of characters.

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,  # chunk size (characters)
    chunk_overlap=200,  # chunk overlap (characters)
    add_start_index=True,  # track index in original document
)
all_splits = text_splitter.split_documents(docs)

print(f"Split blog post into {len(all_splits)} sub-documents.")
```

Split blog post into 66 sub-documents.

Go deeper

TextSplitter: Object that splits a list of Documents into smaller chunks. Subclass of DocumentTransformers.

How to split code

RecursiveCharacterTextSplitter includes pre-built lists of separators that are useful for splitting text in a specific programming language.
Supported languages are stored in the `langchain_text_splitters.Language` enum. They include:
"cpp","go","java","kotlin","js","ts","php","proto","python","rst","ruby","rust","scala","swift","markdown","latex","html","sol","csharp","cobol","c","lua","perl","haskell"
To view the list of separators for a given language, pass a value from this enum into `RecursiveCharacterTextSplitter.get_separators_for_language`.
To instantiate a splitter that is tailored for a specific language, pass a value from the enum into `RecursiveCharacterTextSplitter.from_language`.
Below we demonstrate examples for the various languages.

```python
%pip install -qU langchain-text-splitters

from langchain_text_splitters import (
    Language,
    RecursiveCharacterTextSplitter,
)
```

This can be done using the `.split_documents` method of the second splitter:

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter

chunk_size = 500
chunk_overlap = 30
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=chunk_size, chunk_overlap=chunk_overlap
)

# Split
splits = text_splitter.split_documents(html_header_splits)
splits[80:85]
```

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    model_name="gpt-4",
    chunk_size=100,
    chunk_overlap=0,
)
```

We can also load a TokenTextSplitter splitter, which works with tiktoken directly and will ensure each split is smaller than chunk size.

```python
from langchain_text_splitters import TokenTextSplitter

text_splitter = TokenTextSplitter(chunk_size=10, chunk_overlap=0)

texts = text_splitter.split_text(state_of_the_union)
print(texts[0])
```

Madam Speaker, Madam Vice President, our


```python
latex_splitter = RecursiveCharacterTextSplitter.from_language(
    language=Language.MARKDOWN, chunk_size=60, chunk_overlap=0
)
latex_docs = latex_splitter.create_documents([latex_text])
latex_docs
```

The RecursiveCharacterTextSplitter attempts to keep larger units (e.g., paragraphs) intact.
If a unit exceeds the chunk size, it moves to the next level (e.g., sentences).
This process continues down to the word level if necessary.

Here is example usage:

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=100, chunk_overlap=0)
texts = text_splitter.split_text(document)
```

Further reading: see the how-to guide for recursive text splitting.

Document-structured based
Some documents have an inherent structure, such as HTML, Markdown, or JSON files.
In these cases, it's beneficial to split the document based on its structure, as it often naturally groups semantically related text.
Key benefits of structure-based splitting:

Below we show example usage.
To obtain the string content directly, use `.split_text`.
To create LangChain Document objects (e.g., for use in downstream tasks), use `.create_documents`.

```python
%pip install -qU langchain-text-splitters

from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load example document
with open("state_of_the_union.txt") as f:
    state_of_the_union = f.read()

text_splitter = RecursiveCharacterTextSplitter(
    # Set a really small chunk size, just to show.
    chunk_size=100,
    chunk_overlap=20,
    length_function=len,
    is_separator_regex=False,
)
texts = text_splitter.create_documents([state_of_the_union])
print(texts[0])
print(texts[1])
```

page_content='Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and'
page_content='of Congress and the Cabinet. Justices of the Supreme Court. My fellow Americans.'

```python
text_splitter.split_text(state_of_the_union)[:2]
```

Example implementation using LangChain's CharacterTextSplitter with token-based splitting:

```python
from langchain_text_splitters import CharacterTextSplitter

text_splitter = CharacterTextSplitter.from_tiktoken_encoder(
    encoding_name="cl100k_base", chunk_size=100, chunk_overlap=0
)
texts = text_splitter.split_text(document)
```

Further reading: see the how-to guide for token-based splitting and the how-to guide for character-based splitting.

Text-structured based

Text is naturally organized into hierarchical units such as paragraphs, sentences, and words.
We can leverage this inherent structure to inform our splitting strategy, creating splits that maintain natural language flow, maintain semantic coherence within each split, and adapt to varying levels of text granularity.
LangChain's RecursiveCharacterTextSplitter implements this concept:

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter

html_string = """
    <!DOCTYPE html>
    <html>
    <body>
        <div>
            <h1>Foo</h1>
            <p>Some intro text about Foo.</p>
            <div>
                <h2>Bar main section</h2>
                <p>Some intro text about Bar.</p>
                <h3>Bar subsection 1</h3>
                <p>Some text about the first subtopic of Bar.</p>
                <h3>Bar subsection 2</h3>
                <p>Some text about the second subtopic of Bar.</p>
            </div>
            <div>
                <h2>Baz</h2>
                <p>Some text about Baz</p>
            </div>
            <br>
            <p>Some concluding text about Foo</p>
        </div>
    </body>
    </html>
"""

headers_to_split_on = [
    ("h1", "Header 1"),
    ("h2", "Header 2"),
    ("h3", "Header 3"),
    ("h4", "Header 4"),
]

html_splitter = HTMLSectionSplitter(headers_to_split_on)
html_header_splits = html_splitter.split_text(html_string)
# (the excerpt is cut off here, at `chunk_size =`)
```

JS

Here's an example using the JS text splitter:

```python
JS_CODE = """
function helloWorld() {
  console.log("Hello, World!");
}

// Call the function
helloWorld();
"""

js_splitter = RecursiveCharacterTextSplitter.from_language(
    language=Language.JS, chunk_size=60, chunk_overlap=0
)
js_docs = js_splitter.create_documents([JS_CODE])
js_docs
```

[Document(page_content='function helloWorld() {\n  console.log("Hello, World!");\n}'), Document(page_content='// Call the function\nhelloWorld();')]

TS

Here's an example using the TS text splitter:

```python
TS_CODE = """
function helloWorld(): void {
  console.log("Hello, World!");
}

// Call the function
helloWorld();
"""

ts_splitter = RecursiveCharacterTextSplitter.from_language(
    language=Language.TS, chunk_size=60, chunk_overlap=0
)
ts_docs = ts_splitter.create_documents([TS_CODE])
ts_docs
```

[Document(page_content='function helloWorld(): void {'), Document(page_content='console.log("Hello, World!");\n}'), Document(page_content='// Call the function\nhelloWorld();')]

Markdown

```python
text_splitter = CharacterTextSplitter.from_tiktoken_encoder(
    encoding_name="cl100k_base", chunk_size=100, chunk_overlap=0
)
texts = text_splitter.split_text(state_of_the_union)
print(texts[0])
```

Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and the Cabinet. Justices of the Supreme Court. My fellow Americans.  Last year COVID-19 kept us apart. This year we are finally together again. Tonight, we meet as Democrats Republicans and Independents. But most importantly as Americans. With a duty to one another to the American people to the Constitution.

To implement a hard constraint on the chunk size, we can use `RecursiveCharacterTextSplitter.from_tiktoken_encoder`, where each split will be recursively split if it has a larger size:

mitigate the possibility of separating a statement from important
context related to it. We use the
RecursiveCharacterTextSplitter,
which will recursively split the document using common separators like
new lines until each chunk is the appropriate size. This is the
recommended text splitter for generic text use cases.
We set add_start_index=True so that the character index where each
split Document starts within the initial Document is preserved as
metadata attribute “start_index”.
See this guide for more detail about working with PDFs, including how to extract text from specific sections and images.

    from langchain_text_splitters import RecursiveCharacterTextSplitter

    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=1000, chunk_overlap=200, add_start_index=True
    )
    all_splits = text_splitter.split_documents(docs)

    len(all_splits)
514
Embeddings

    from langchain_text_splitters import SentenceTransformersTokenTextSplitter

    splitter = SentenceTransformersTokenTextSplitter(chunk_overlap=0)
    text = "Lorem "

    count_start_and_stop_tokens = 2
    text_token_count = splitter.count_tokens(text=text) - count_start_and_stop_tokens
    print(text_token_count)

2

    token_multiplier = splitter.maximum_tokens_per_chunk // text_token_count + 1

    # `text_to_split` does not fit in a single chunk
    text_to_split = text * token_multiplier

    print(f"tokens in text to split: {splitter.count_tokens(text=text_to_split)}")

tokens in text to split: 514

    text_chunks = splitter.split_text(text=text_to_split)

    print(text_chunks[1])

lorem
NLTK​
Note: The Natural Language Toolkit, more commonly known as NLTK, is a suite of libraries and programs for symbolic and statistical natural language processing (NLP) for English, written in the Python programming language.

How to split by character
This is the simplest method. This splits based on a given character sequence, which defaults to "\n\n". Chunk length is measured by number of characters.

How the text is split: by single character separator.
How the chunk size is measured: by number of characters.
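
The two rules above can be sketched with the standard library alone: split on a single separator, then merge neighbouring pieces while the merged chunk stays within `chunk_size` characters. The function name `split_by_separator` and the sample text are illustrative assumptions. This is not LangChain's `CharacterTextSplitter` implementation; it ignores overlap, for instance.

```python
def split_by_separator(text: str, separator: str = "\n\n", chunk_size: int = 40) -> list[str]:
    """Split on one separator, then merge pieces up to chunk_size characters."""
    pieces = [p for p in text.split(separator) if p.strip()]
    chunks: list[str] = []
    current = ""
    for piece in pieces:
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            # Either the merged text still fits, or a single piece is already
            # oversized and has to become a chunk of its own.
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample = "First paragraph.\n\nSecond one.\n\nA third, rather longer paragraph that exceeds the limit."
chunks = split_by_separator(sample, chunk_size=40)
for chunk in chunks:
    print(len(chunk), repr(chunk))
```

Because the only cut points are occurrences of the separator, a paragraph longer than `chunk_size` comes out as an oversized chunk. The recursive splitter avoids this by falling back to finer separators.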

[Document(page_content='main :: IO ()'), Document(page_content='main = do\n    putStrLn "Hello, World!"\n-- Some'), Document(page_content='sample functions\nadd :: Int -> Int -> Int\nadd x y'), Document(page_content='= x + y')]
PHP
Here's an example using the PHP text splitter:

    PHP_CODE = """<?php
    namespace foo;
    class Hello {
        public function __construct() { }
    }
    function hello() {
        echo "Hello World!";
    }
    interface Human {
        public function breath();
    }
    trait Foo { }
    enum Color
    {
        case Red;
        case Blue;
    }"""

    php_splitter = RecursiveCharacterTextSplitter.from_language(
        language=Language.PHP, chunk_size=50, chunk_overlap=0
    )
    php_docs = php_splitter.create_documents([PHP_CODE])
    php_docs

Rather than just splitting on "\n\n", we can use NLTK to split based on NLTK tokenizers.

CharacterTextSplitter, RecursiveCharacterTextSplitter, and TokenTextSplitter can be used with tiktoken directly.

    %pip install --upgrade --quiet langchain-text-splitters tiktoken

    from langchain_text_splitters import CharacterTextSplitter

    # This is a long document we can split up.
    with open("state_of_the_union.txt") as f:
        state_of_the_union = f.read()

To split with a CharacterTextSplitter and then merge chunks with tiktoken, use its .from_tiktoken_encoder() method. Note that splits from this method can be larger than the chunk size measured by the tiktoken tokenizer.
The .from_tiktoken_encoder() method takes either encoding_name as an argument (e.g. cl100k_base), or the model_name (e.g. gpt-4). All additional arguments like chunk_size, chunk_overlap, and separators are used to instantiate CharacterTextSplitter:

Markdown
Here's an example using the Markdown text splitter:

    markdown_text = """
    # 🦜️🔗 LangChain

    ⚡ Building applications with LLMs through composability ⚡

    ## What is LangChain?

    # Hopefully this code block isn't split
    LangChain is a framework for...

    As an open-source project in a rapidly developing field, we are extremely open to contributions.
    """

    md_splitter = RecursiveCharacterTextSplitter.from_language(
        language=Language.MARKDOWN, chunk_size=60, chunk_overlap=0
    )
    md_docs = md_splitter.create_documents([markdown_text])
    md_docs

    # pip install nltk

    # This is a long document we can split up.
    with open("state_of_the_union.txt") as f:
        state_of_the_union = f.read()

    from langchain_text_splitters import NLTKTextSplitter

    text_splitter = NLTKTextSplitter(chunk_size=1000)

    texts = text_splitter.split_text(state_of_the_union)
    print(texts[0])

HTML
Here's an example using an HTML text splitter:

    html_text = """
    <!DOCTYPE html>
    <html>
        <head>
            <title>🦜️🔗 LangChain</title>
            <style>
                body {
                    font-family: Arial, sans-serif;
                }
                h1 {
                    color: darkblue;
                }
            </style>
        </head>
        <body>
            <div>
                <h1>🦜️🔗 LangChain</h1>
                <p>⚡ Building applications with LLMs through composability ⚡</p>
            </div>
            <div>
                As an open-source project in a rapidly developing field, we are extremely open to contributions.
            </div>
        </body>
    </html>
    """

    html_splitter = RecursiveCharacterTextSplitter.from_language(
        language=Language.HTML, chunk_size=60, chunk_overlap=0
    )
    html_docs = html_splitter.create_documents([html_text])
    html_docs

    # This is a custom parser that splits an iterator of llm tokens
    # into a list of strings separated by commas
    def split_into_list(input: Iterator[str]) -> Iterator[List[str]]:
        # hold partial input until we get a comma
        buffer = ""
        for chunk in input:
            # add current chunk to buffer
            buffer += chunk
            # while there are commas in the buffer
            while "," in buffer:
                # split buffer on comma
                comma_index = buffer.index(",")
                # yield everything before the comma
                yield [buffer[:comma_index].strip()]
                # save the rest for the next iteration
                buffer = buffer[comma_index + 1 :]
        # yield the last chunk
        yield [buffer.strip()]

    list_chain = str_chain | split_into_list

    for chunk in list_chain.stream({"animal": "bear"}):
        print(chunk, flush=True)

['lion']['tiger']['wolf']['gorilla']['raccoon']
Invoking it gives a full array of values:

    list_chain.invoke({"animal": "bear"})
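
To see the buffering logic without an LLM, the same generator can be fed a hand-made token stream. In this sketch `fake_stream` is an invented stand-in for model output, chosen so that the commas fall in the middle of chunks. Flattening the streamed pieces gives one combined list, comparable to the full array described above.

```python
from typing import Iterator, List


def split_into_list(tokens: Iterator[str]) -> Iterator[List[str]]:
    """Yield one single-item list per comma-separated value in a token stream."""
    buffer = ""
    for chunk in tokens:
        buffer += chunk
        # Emit every complete item that is already in the buffer.
        while "," in buffer:
            comma_index = buffer.index(",")
            yield [buffer[:comma_index].strip()]
            buffer = buffer[comma_index + 1:]
    # Whatever is left after the last comma is the final item.
    yield [buffer.strip()]


# Hypothetical token chunks; a real model decides where the boundaries fall.
fake_stream = ["li", "on, ti", "ger,", " wolf", ", gor", "illa, raccoon"]

streamed = list(split_into_list(iter(fake_stream)))
combined = [item for piece in streamed for item in piece]

print(streamed)
print(combined)
```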

Use .split_text to obtain the string content directly:
text_splitter.split_text(state_of_the_union)[0]

1) How to split HTML strings:

    from langchain_text_splitters import HTMLSectionSplitter

    html_string = """
        <!DOCTYPE html>
        <html>
        <body>
            <div>
                <h1>Foo</h1>
                <p>Some intro text about Foo.</p>
                <div>
                    <h2>Bar main section</h2>
                    <p>Some intro text about Bar.</p>
                    <h3>Bar subsection 1</h3>
                    <p>Some text about the first subtopic of Bar.</p>
                    <h3>Bar subsection 2</h3>
                    <p>Some text about the second subtopic of Bar.</p>
                </div>
                <div>
                    <h2>Baz</h2>
                    <p>Some text about Baz</p>
                </div>
                <br>
                <p>Some concluding text about Foo</p>
            </div>
        </body>
        </html>
    """

    headers_to_split_on = [("h1", "Header 1"), ("h2", "Header 2")]

    html_splitter = HTMLSectionSplitter(headers_to_split_on)
    html_header_splits = html_splitter.split_text(html_string)
    html_header_splits

To obtain the string content directly, use .split_text.
To create LangChain Document objects (e.g., for use in downstream tasks), use .create_documents.

    %pip install -qU langchain-text-splitters

    from langchain_text_splitters import CharacterTextSplitter

    # Load an example document
    with open("state_of_the_union.txt") as f:
        state_of_the_union = f.read()

    text_splitter = CharacterTextSplitter(
        separator="\n\n",
        chunk_size=1000,
        chunk_overlap=200,
        length_function=len,
        is_separator_regex=False,
    )
    texts = text_splitter.create_documents([state_of_the_union])
    print(texts[0])

Question: write code for recursive text splitter
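
One possible answer, as a standard-library sketch of the recursive idea: try the coarsest separator first and fall back to finer ones only for pieces that are still too long. The function `recursive_split` and its separator order are assumptions made for illustration. It is not LangChain's `RecursiveCharacterTextSplitter`, and it leaves out chunk overlap and metadata.

```python
def recursive_split(text: str, chunk_size: int, separators=("\n\n", "\n", " ", "")) -> list[str]:
    """Split text into pieces of at most chunk_size characters."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators or separators[0] == "":
        # Last resort: hard cut every chunk_size characters.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    sep, finer = separators[0], separators[1:]
    chunks: list[str] = []
    current = ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate
            continue
        # The merged text would be too long: flush what has been collected.
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too long: retry it with finer separators.
            chunks.extend(recursive_split(piece, chunk_size, finer))
    if current:
        chunks.append(current)
    return chunks


text = (
    "LangChain splits long documents.\n\n"
    "Short paragraph.\n\n"
    "ThisParagraphHasOneVeryLongWordWithoutSpaces and more"
)
chunks = recursive_split(text, chunk_size=20)
for chunk in chunks:
    print(len(chunk), repr(chunk))
```

Unlike a single-separator splitter, every chunk respects the size limit here, at the cost of cutting inside a word when nothing finer than a character boundary is available.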


https://python.langchain.com/docs/how_to/message_history/

## LangChain Ecosystem and Q&A Session Summary

### Introduction

- **Session Overview**: The session focuses on the LangChain ecosystem, including LangSmith, LangServe, and LangGraph. The instructor emphasizes the importance of understanding the ecosystem and encourages participants to ask questions.

### LangChain Ecosystem

- **LangChain Overview**: LangChain is a framework for building applications on top of large language models (LLMs). It has evolved significantly and now includes companion tools: LangSmith, LangServe, and LangGraph.

- **LangSmith**:
  - **Purpose**: An LLM Ops tool for monitoring and evaluating LLM applications.
  - **Functionality**: Helps monitor the performance of LLM applications, including relevance, accuracy, and hallucination detection.

- **LangServe**:
  - **Purpose**: A tool for deploying LLM applications.
  - **Functionality**: Uses FastAPI to create APIs for LLM applications, making deployment easier.

- **LangGraph**:
  - **Purpose**: A tool for creating agents and multi-agent systems.
  - **Functionality**: Connects different third-party tools to fetch real-time information, enhancing the capabilities of LLMs.

### Tools and Ecosystem

- **Integration Tools**:
  - LangSmith, LangServe, and LangGraph are part of the LangChain ecosystem, aimed at making LLM applications more robust and versatile.
  - **Crew AI**: A high-level framework built on top of LangChain, similar to how Keras is built on TensorFlow.

- **Documentation and Learning**:
  - **Importance of Documentation**: The instructor emphasizes the importance of referring to the latest documentation for updates and new features.
  - **Stable Versions**: The instructor uses stable versions of packages to avoid installation issues.


## LANGSERVE
```python

import os
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI
from dotenv import load_dotenv, find_dotenv
from fastapi import FastAPI
from langserve import add_routes
import uvicorn

_ = load_dotenv(find_dotenv())
openai_api_key = os.environ["OPENAI_API_KEY"]
llm = ChatOpenAI(model="gpt-3.5-turbo")
parser = StrOutputParser()
system_template = "Translate the following into {language}:"
prompt_template = ChatPromptTemplate.from_messages([
    ('system', system_template),
    ('user', '{text}')
])
chain = prompt_template | llm | parser
app = FastAPI(
    title="simpleTranslator",
    version="1.0",
    description="A simple API server using LangChain's Runnable interfaces",
)

add_routes(
    app,
    chain,
    path="/chain",
)

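if __name__ == "__main__":
    # Start the server, then open http://localhost:8000/chain/playground/
    uvicorn.run(app, host="localhost", port=8000)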

```

```python
from langserve import RemoteRunnable
# Point the client at the route registered with add_routes (path="/chain")
chain = RemoteRunnable("http://localhost:8000/chain/")
print(chain.invoke({"language": "Spanish", "text": "Generative AI is a bigger opport"}))

```

check langserve docs

## LangChain and LangGraph: Building AI Agents for Real-Time Search

This document details a walkthrough of building AI agents using LangChain and LangGraph, focusing on real-time search capabilities. The tutorial uses Tavily, a search API, and demonstrates agent creation, memory management, and stream processing. A course announcement for advanced LLM application development is also included.

### I. Setup and API Keys

1. **Environment:** The tutorial begins by creating a Jupyter Notebook in VS Code named "line_graph_demo.ipynb".

2. **OpenAI API Key:** The first step involves loading the OpenAI API key. This is crucial for interacting with the language model (LLM). The key is loaded from an `.env` file.

   ```bash
   # .env file content (Example)
   OPENAI_API_KEY=your_actual_api_key
   TAVILY_API_KEY=your_actual_tavily_api_key
   ```

3. **LLM Loading:** The LLM (GPT-3.5-turbo) is loaded.

   ```python
   from langchain_openai import ChatOpenAI

   # Reads OPENAI_API_KEY from the environment
   llm = ChatOpenAI(model="gpt-3.5-turbo")
   ```





2. **Tavily Search:** The tutorial uses LangChain's `TavilySearchResults` tool to query Tavily. The following snippet runs a search and returns the results.

   ```python
   from langchain_community.tools.tavily_search import TavilySearchResults

   # Reads TAVILY_API_KEY from the environment
   search = TavilySearchResults(max_results=2)
   search.invoke("Who are the top stars of the 2024 Eurocup?")
   ```

```python

from langgraph.prebuilt import create_react_agent

# `search` is the Tavily tool created above
tools = [search]
agent_executor = create_react_agent(llm, tools)

from langchain_core.messages import HumanMessage
response = agent_executor.invoke({"messages": [HumanMessage(content="Where is the soccer Eurocup 2024")]})
response["messages"]

```
   This code snippet runs the query through `agent_executor` and returns the resulting messages, showing how the agent calls the Tavily API when necessary. The difference in response time and the presence or absence of URLs indicate the source used (real-time search vs. the model's own knowledge).

3. **Stream Processing:** The agent's responses can also be streamed with `agent_executor.stream`, which yields chunks as the agent produces them:
```python


for chunk in agent_executor.stream(
    {"messages": [HumanMessage(content="When and where will it be the 2024 Eurocup final match?")]}
):
    print(chunk)
    print("----")


```
### IV. Memory Management with LangGraph

1. **Memory Implementation:** LangGraph's `MemorySaver` gives the agent conversational context. A `MemorySaver` object is passed to `create_react_agent` as the `checkpointer`, and each conversation is identified by a `thread_id` supplied in the `config` at call time.

   ```python
   from langgraph.checkpoint.memory import MemorySaver

   memory = MemorySaver()
   # The checkpointer stores the message history per thread_id
   graph = create_react_agent(llm, tools=[search], checkpointer=memory)
   ```

```python
config = {"configurable": {"thread_id": "1"}}
input_message = {"type": "user", "content": "hi! I'm bob"}
for chunk in graph.stream({"messages": [input_message]}, config, stream_mode="values"):
    chunk["messages"][-1].pretty_print()

# Same thread_id, so the agent can still see the earlier message
input_message = {"type": "user", "content": "what's my name?"}
for chunk in graph.stream({"messages": [input_message]}, config, stream_mode="values"):
    chunk["messages"][-1].pretty_print()
```

   This shows how to incorporate memory into the agent. The `thread_id` keeps a separate conversation history for each user or session.

2. **Memory Demonstration:** Examples are given to demonstrate how the agent retains conversation history when using the `MemorySaver`. Switching the `thread_id` shows that separate conversations are maintained for different users.
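
The separation of histories can be pictured with a toy store keyed by the identifier passed in `config` (`thread_id` in the streaming example above). The class `ThreadHistory` below is a hypothetical illustration written for these notes, not LangGraph's `MemorySaver`, which checkpoints full graph state rather than a plain message list.

```python
from collections import defaultdict


class ThreadHistory:
    """Toy per-thread message store (illustrative only)."""

    def __init__(self) -> None:
        self._messages = defaultdict(list)

    def append(self, thread_id: str, role: str, content: str) -> None:
        self._messages[thread_id].append({"role": role, "content": content})

    def history(self, thread_id: str) -> list:
        # Return a copy so callers cannot mutate the stored history.
        return list(self._messages[thread_id])


store = ThreadHistory()
store.append("1", "user", "hi! I'm bob")
store.append("1", "assistant", "Hello Bob!")
store.append("2", "user", "what's my name?")

print(store.history("1"))  # thread "1" contains Bob's introduction
print(store.history("2"))  # thread "2" never saw it
```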


## LangSmith: An LLM Ops Platform - Session Summary

This document summarizes a session introducing LangSmith, an LLM Ops platform within the LangChain ecosystem.

To do: create a LangChain account, create a project, open the LangSmith docs, and get a LangSmith API key. Review LangSmith and LangGraph once more, and look at memory in LangChain.

### Session Overview

The session (Day 14 of a series) demonstrated LangSmith's capabilities for monitoring, debugging, tracing, and evaluating LLM-powered applications. The instructor emphasized practical application over theoretical explanations. The session began with an overview of LangSmith's place within the LangChain ecosystem, then proceeded to a hands-on demo. The course announcement for a LangChain live batch starting December 17th, 2024, was also included.

### What is LangSmith?

LangSmith is an LLM Ops platform developed by LangChain. It allows for:

- **Monitoring:** Tracking application performance, costs, and errors.
- **Debugging:** Identifying and resolving issues within LLM applications.
- **Tracing:** Observing the execution flow of applications.
- **Evaluation:** Assessing the performance and quality of LLM applications (briefly mentioned, further exploration promised for the future).

LangSmith works independently of LangChain, though integration is straightforward. It requires setting environment variables for API key and project name.

### Setting up LangSmith

1. **Create a LangChain Account:** If you don't have one, sign up on the LangChain website.
2. **Generate an API Key:** Navigate to the settings in your LangChain dashboard and generate a personal access token.
3. **Set Environment Variables:** Add the API key and project name to your `.env` file. The example shown used the following variables:

   ```bash
   LANGCHAIN_API_KEY=<your_api_key>
   LANGCHAIN_PROJECT=test_app 
   ```

**Tip:** Ensure you replace `<your_api_key>` with your actual API key and choose a descriptive project name. The instructor strongly cautioned against using the provided key.
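
A quick pre-flight check can catch a missing or empty variable before the first traced run. The helper below is a hypothetical sketch for these notes: the name `missing_vars` is made up, and the list of required names simply mirrors the two variables shown in this section.

```python
import os

# The two variables named in this section.
REQUIRED_VARS = ("LANGCHAIN_API_KEY", "LANGCHAIN_PROJECT")


def missing_vars(env=os.environ, required=REQUIRED_VARS) -> list:
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
example_env = {"LANGCHAIN_API_KEY": "<your_api_key>", "LANGCHAIN_PROJECT": ""}
print(missing_vars(example_env))
```

Calling `missing_vars()` with no arguments checks the real process environment, for instance right after `load_dotenv()`.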

### Practical Demo: Monitoring an LLM Application

The demo used a Jupyter Notebook with the following code:

```python
import os
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Load API keys and configure LangChain tracing
openai_api_key = os.getenv("OPENAI_API_KEY")
langchain_api_key = os.getenv("LANGCHAIN_API_KEY")

# Optional: You can use other providers such as Groq here
model = ChatOpenAI(temperature=0, openai_api_key=openai_api_key)

# LangChain tracing setup (Crucial for LangSmith integration)
os.environ["LANGCHAIN_TRACING_V2"] = "true"

# Create a ChatPromptTemplate
prompt_template = """You are a helpful assistant. Please respond to the user request based only on the given context.
{context}
User: {question}"""
chat_prompt = ChatPromptTemplate.from_template(prompt_template)

# Create a simple chain
chain = chat_prompt | model | StrOutputParser()

# Example question and context
question = "Can you summarize this story?"
context = """[Insert story here. Example from Google search]"""

# Invoke the chain
summary = chain.invoke({"question": question, "context": context})

# Print the summary
print(summary)
```

This code snippet shows a simple LangChain application using ChatOpenAI. The crucial part for LangSmith integration is `os.environ["LANGCHAIN_TRACING_V2"] = "true"`, which turns on tracing so that each run is sent to LangSmith under the project named in `LANGCHAIN_PROJECT`.

The instructor stressed that this relies on LangChain's second-generation (v2) tracing; legacy LangChain versions that predate it will not send traces to LangSmith.

After running this code, the instructor demonstrated viewing the execution details within the LangSmith platform.

### LangSmith Dashboard Features

The LangSmith dashboard displays various metrics and logs:

- **Project Overview:** Shows high-level statistics like total cost, total tokens used, and error rates.
- **Run Details:** Provides detailed information for each execution, including inputs, outputs, logs, and execution time.
- **Monitor Section:** Presents graphical visualizations of application performance over time (hours and days).
- **Runs Section:** Displays all completed runs, allowing for individual inspection of the runnables (prompt, model, parser, etc.).

The ability to view costs and resource utilization was emphasized. The dashboard also displays the execution as a sequence of runnables, allowing detailed analysis of each step. The instructor mentioned the ability to view data in JSON and YAML formats.

### Comparison to MLflow

The instructor drew parallels between LangSmith and MLflow, both enabling experiment tracking and monitoring.

### Conclusion

The session provided a practical introduction to LangSmith, demonstrating its ease of use and the valuable insights it provides for developing and deploying production-ready LLM applications. The instructor encouraged viewers to explore the LangSmith documentation further and announced an upcoming course focused on practical LLM application development.

In [1]:
# !pip install langchain

In [2]:
# !pip install -qU langchain-google-vertexai langchain-groq langchain-mistralai google-generativeai langchain-google-genai

In [3]:
from dotenv import load_dotenv
import os
load_dotenv()

True

In [None]:
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_groq import ChatGroq
from langchain_mistralai import ChatMistralAI

# setting api keys
os.environ["MISTRAL_API_KEY"] = os.getenv("MISTRAL_API_KEY")
os.environ["GROQ_API_KEY"] = os.getenv("GROQ_API_KEY")
GEMINI_API_KEY = os.getenv("GEMINI_API_KEY")

"""
llm = ChatGoogleGenerativeAI(model="gemini-pro", google_api_key=GEMINI_API_KEY)
llm = ChatMistralAI(model="mistral-large-latest")
llm = ChatGroq(model="llama3-8b-8192")
"""
llm = ChatGroq(model="llama3-8b-8192")



LangChain also supports chat model inputs via strings or OpenAI format. The following are equivalent:

model.invoke("Hello")

model.invoke([{"role": "user", "content": "Hello"}])

model.invoke([HumanMessage("Hello")])

In [None]:
from langchain.prompts import PromptTemplate

prompt_template = PromptTemplate(
    input_variables=["animal_type", "pet_color"],
    template="I have a {animal_type} pet, and it is {pet_color} in color. Suggest me five cool names for it."
)
prompt_template

In [None]:
prompt_template.invoke({
    "animal_type" : "dog",
    "pet_color" : "pink"
})

In [None]:
prompt_template.format(**{
    "animal_type" : "dog",
    "pet_color" : "pink"
})

In [None]:
prompt_template.format(animal_type = "dog", pet_color = "pink")

In [None]:
from langchain_core.prompts import ChatPromptTemplate

system_template = "Translate the following from English into {language}"

prompt_template = ChatPromptTemplate.from_messages(
    [("system", system_template), ("user", "{text}")]
)

In [None]:
prompt_template.format(text="Hello, how are you?", language="Spanish")

In [None]:
prompt_template.invoke({"language": "Italian", "text": "hi!"})

In [None]:
prompt_template.invoke({"language": "Italian", "text": "hi!"}).to_messages()

In [None]:
from langchain_core.prompts import PromptTemplate

prompt_template = PromptTemplate.from_template("Tell me a joke about {topic}")

prompt_template.invoke({"topic": "cats"})

In [None]:
from langchain_core.prompts import PromptTemplate

prompt = PromptTemplate.from_template("Tell me a joke about {topic}")

In [None]:
chain = prompt | llm
chain.invoke({"topic":"cat"})
# chain.run({"topic":"sheep"})  ## .run() belongs to the deprecated legacy Chain classes; an LCEL pipeline has no .run(), use .invoke()

### basic rag working example

In [None]:
from sentence_transformers import SentenceTransformer

In [None]:
sentences = ["This is an example sentence", "Each sentence is converted"]

model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')
embeddings = model.encode(sentences)
print(embeddings)

In [None]:
len(embeddings[0])

In [None]:
from langchain_huggingface.embeddings import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")    # sentence-transformers/all-mpnet-base-v2


In [None]:
from bs4 import BeautifulSoup
import requests

url = 'https://python.langchain.com/docs/how_to/'

response = requests.get(url)

soup = BeautifulSoup(response.content, 'html.parser')


In [None]:
article = soup.find_all('article')[0]
text = article.get_text()

In [None]:
from bs4 import BeautifulSoup
import requests
from langchain.text_splitter import RecursiveCharacterTextSplitter
import os
from langchain_mistralai import MistralAIEmbeddings
from langchain_chroma import Chroma
from langchain_huggingface.embeddings import HuggingFaceEmbeddings



os.environ["MISTRAL_API_KEY"] = os.getenv("MISTRAL_API_KEY")
# embeddings = MistralAIEmbeddings(model="mistral-embed")
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")    # sentence-transformers/all-mpnet-base-v2
vector_store = Chroma(collection_name="langchain",  embedding_function=embeddings, persist_directory="./chroma_db")

# Set of visited links to prevent loops

In [None]:
visited_links = set()

In [None]:
import time

def scrape_and_store(url, base_url):
    """
    Recursively scrape a page and store its text chunks in the Chroma vector store.
    """
    
    try:
        # Skip visited pages, pages outside the how-to docs, older doc versions, and anchors
        if (
            url in visited_links
            or "/docs/how_to" not in url
            or base_url not in url
            or "v0.2" in url
            or "v0.1" in url
            or "#" in url
        ):
            return
        
        print(url)
        # Mark this URL as visited
        visited_links.add(url)
        
        # Fetch the page content
        response = requests.get(url)
        soup = BeautifulSoup(response.content, 'html.parser')
        
        # Find the `article` tag and extract text
        article = soup.find_all('article')[0]
        text = article.get_text()
        
        # Use LangChain's RecursiveCharacterTextSplitter to split the text
        splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
        docs = splitter.create_documents([text])
        
        # Add chunks to the Chroma vector store
        vector_store.add_documents(docs)
        
        # Find all links within the article tag
        links = article.find_all('a', href=True)
        for link in links:
            href = link['href']
            # Make sure the link is absolute
            if href.startswith('/'):
                href = base_url + href
            elif not href.startswith('http'):
                continue  # Skip malformed or relative links
            
            # Recursively scrape linked pages
            scrape_and_store(href, base_url)
            
        time.sleep(0.5)  # Add a delay to avoid overloading the server
    except Exception as e:
        # print("Troubled url : ", url)
        print(f"Error scraping {url}: {e}")
        

from concurrent.futures import ThreadPoolExecutor


def scrape_concurrently(start_url, base_url):
    """
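    Use a thread pool to scrape URLs.

    Note: scrape_and_store follows links recursively itself and returns None,
    so only the start URL is ever submitted here and the crawl runs in one worker.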
    """
    with ThreadPoolExecutor(max_workers=10) as executor:
        urls_to_process = [start_url]
        while urls_to_process:
            # Submit scraping tasks for all URLs
            futures = [executor.submit(scrape_and_store, url, base_url) for url in urls_to_process]
            urls_to_process = []

            # Collect results and add new URLs to the queue
            for future in futures:
                result = future.result()
                if result:
                    urls_to_process.extend(result)



# Start scraping


# Base URL of the website
base_url = 'https://python.langchain.com'

# Starting URL
# Must contain "/docs/how_to", otherwise scrape_and_store returns immediately
start_url = 'https://python.langchain.com/docs/how_to/'

scrape_concurrently(start_url, base_url)
# Start scraping and storing
# scrape_and_store(start_url, base_url)

# Chroma writes to persist_directory on its own, so no explicit save step is needed
# (save_local is a FAISS method and does not apply here)

print("Scraping and storage complete!")
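
The early-return test at the top of `scrape_and_store` decides which pages are crawled at all, so it helps to isolate it as a pure function and try a few URLs. This is a sketch: the name `should_visit` and the example page slugs are assumptions for illustration, while the conditions themselves are copied from the function above.

```python
def should_visit(url: str, base_url: str, visited: set) -> bool:
    """Mirror the filter at the top of scrape_and_store."""
    if url in visited:
        return False
    if base_url not in url or "/docs/how_to" not in url:
        return False
    # Skip older doc versions and in-page anchors.
    return not any(marker in url for marker in ("v0.1", "v0.2", "#"))


base = "https://python.langchain.com"
seen = {base + "/docs/how_to/"}

results = {
    "already visited": should_visit(base + "/docs/how_to/", base, seen),
    "new how-to page": should_visit(base + "/docs/how_to/recursive_text_splitter/", base, seen),
    "tutorials page": should_visit(base + "/docs/tutorials/", base, seen),
    "anchor link": should_visit(base + "/docs/how_to/#qa-with-rag", base, seen),
}
for label, allowed in results.items():
    print(f"{label}: {allowed}")
```

Only the new how-to page passes. A URL under `/docs/tutorials/` is rejected by the `"/docs/how_to"` check, which matters when choosing the starting URL.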



In [None]:
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_groq import ChatGroq
from langchain_mistralai import ChatMistralAI

# setting api keys
os.environ["MISTRAL_API_KEY"] = os.getenv("MISTRAL_API_KEY")
os.environ["GROQ_API_KEY"] = os.getenv("GROQ_API_KEY")
GEMINI_API_KEY = os.getenv("GEMINI_API_KEY")

"""
llm = ChatMistralAI(model="mistral-large-latest")
llm = ChatGroq(model="llama3-8b-8192")
"""
llm = ChatGroq(model="llama3-8b-8192")
# llm = ChatGoogleGenerativeAI(model="gemini-pro", google_api_key=GEMINI_API_KEY)



In [None]:


from langchain_core.runnables import RunnableParallel, RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
question = "from_messages create prompt from message? explain it in detail with examples explain to me like a 13 year old child" 

prompt = """You are an assistant for question-answering tasks. 
Use the following pieces of retrieved context as a reference to come up with an answer. 
Context: {context}:

Question: {question}

it may happen that all information is not available in the context. In that case, try to come up with best guesses.
"""

prompt = ChatPromptTemplate.from_template(prompt)

retriever = vector_store.as_retriever(search_kwargs={'k': 100})
# docs = retriever.invoke(question)
# docs_text = "\n\n".join(d.page_content for d in docs)

def func(docs):
    return "\n\n".join(d.page_content for d in docs)

docs_text = RunnablePassthrough() | {"context" : retriever | func, "question" : RunnablePassthrough()} | prompt | llm | StrOutputParser() 


In [None]:

docs_text.invoke(question)


In [None]:
ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("user", "Who won the world series in 2020?"),
    ("assistant", "The Los Angeles Dodgers won the World Series in 2020."),
])

def russian_lastname(name: str) -> str:
    return f"{name}ovich"


In [None]:

from langchain_core.runnables import RunnableLambda, RunnablePassthrough
chain2 = RunnableLambda(russian_lastname)
prompt = ChatPromptTemplate([
    ("system", "You are a helpful assistant."),
    ("user", "{input}"),
])
prompt.invoke({"input":"mridul", "age":"12"}).messages[1].content


In [None]:
from operator import itemgetter


chain1 = (
    ChatPromptTemplate.from_template("What is the country {politician} is from?")
    | llm
    | StrOutputParser()
)
chain2 = (
    {"country" : chain1, "language":itemgetter("language")}
    | ChatPromptTemplate.from_template("What is the continent of {country}? respond in {language}")
    | llm
    | StrOutputParser()
)
response = chain2.invoke({"politician": "Emmanuel Macron", "language":"hindi"})
print(response)  # The continent (Europe), answered in the requested language (Hindi here)


In [None]:

from dotenv import load_dotenv
import os

load_dotenv()

# os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY")
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = os.getenv("LANGCHAIN_API_KEY")
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages(
[
    ("system", "You are a helpful assistant. Please respond to the user request based only on the given context"),
    ("user", "Question: {question}\nContext: {context}")
]
)
parser = StrOutputParser()
chain = prompt | llm | parser
question = "Can you summarize this text?"

context = "A dragon is a mythical beast that is said to live in the mountains of a fictional country. It is said to be the king of the mountain and is feared by the people of the country. The dragon is often depicted as a fierce and powerful creature, with scales and wings. It is also said to have the ability to breathe fire and other powerful magics. In some stories, dragons are depicted as having the body of a lion, while in others, they are depicted as having the body of a serpent. Overall, dragons are a popular and fascinating subject in mythology and folklore. They are often seen as symbols of power, strength, and wisdom. They are also often associated with good fortune and prosperity. In some stories, dragons are said to bring good luck and blessings to those who cross their paths. They are also said to be protectors of the forest and the animals that live in it. Overall, dragons are a powerful and mysterious creature that continues to captivate people's imaginations and imaginations."
res = chain.invoke({"question":question, "context":context})
print(res)




LANGCHAIN

Langchain Framework Overview
Langchain is a framework designed to simplify the creation of applications using large language models (LLMs). It allows developers to connect AI models with various data sources to create customized NLP applications. The framework is open-source and currently offered in Python and JavaScript (TypeScript). It supports LLMs like GPT-4 from OpenAI or HuggingFace.
Key Concepts
Components:
LLM Wrappers: Connect to LLMs like GPT-4 or HuggingFace.
Prompt Templates: Avoid hardcoding text inputs.
Indexes: Extract relevant information for LLMs.
Chains: Combine multiple components to solve specific tasks.
Agents: Allow LLMs to interact with external environments and APIs.
Prerequisites
Python: Version 3.8 or higher.
Pip: Python package manager.
Code Editor: Visual Studio Code (or any other preferred editor).
OpenAI Account: For using OpenAI's LLM.
Setting Up the Environment
Create an OpenAI Account:
Go to OpenAI and sign up.
Generate an API key from the user account settings.
Create a Project Directory:
Open a terminal and navigate to your desired directory.
Create a new directory: mkdir langchain-llm-app.
Set Up a Virtual Environment:
Create a virtual environment: python -m venv .venv.
Activate the virtual environment: .venv\Scripts\Activate.ps1 (Windows PowerShell) or source .venv/bin/activate (macOS/Linux).
Install Required Packages:
Use pip to install necessary packages: pip install langchain openai streamlit python-dotenv.

Using Agents
Import Agent Components:
Import the necessary components:
from langchain.agents import AgentType, initialize_agent, load_tools
from langchain.llms import OpenAI


Define the Agent:
Create an agent function:
def langchain_agent():
    llm = OpenAI(temperature=0.5)
    # load_tools builds the Wikipedia and calculator tools by name
    tools = load_tools(["wikipedia", "llm-math"], llm=llm)
    # The keyword argument is `agent`, not `agent_type`
    agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
    result = agent.run("What is the average age of a dog? Multiply the age by 3.")
    return result


Run the Agent Function:
Add the following code to run the agent function:
if __name__ == "__main__":
    print(langchain_agent())


Building a YouTube Assistant
Create a youtube_assistant Directory:
Create a new directory for the YouTube assistant.
Set Up the Environment:
Create a virtual environment and install necessary packages:
python -m venv .venv
source .venv/bin/activate
pip install langchain openai youtube-transcript-api faiss-cpu python-dotenv


Create the langchain_helper.py File:
Import necessary libraries:
from langchain.document_loaders import YoutubeLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from dotenv import load_dotenv
import os


Load Environment Variables:
Load the environment variables:
load_dotenv()


Define the Function to Create Vector DB:
Create a function to load and split the YouTube transcript:
def create_vector_db_from_youtube(video_url: str):
    loader = YoutubeLoader.from_youtube_url(video_url)
    transcript = loader.load()
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
    docs = text_splitter.split_documents(transcript)
    db = FAISS.from_documents(docs, OpenAIEmbeddings())
    return db


Define the Function to Get Response:
Create a function to get the response from the query:
def get_response_from_query(db, query: str, k: int = 4):
    docs = db.similarity_search(query, k=k)
    docs_page_content = " ".join([doc.page_content for doc in docs])
    llm = OpenAI(model_name="text-davinci-003")
    prompt = PromptTemplate(
        input_variables=["question", "docs"],
        template="You are a helpful YouTube assistant that can answer questions about videos based on the video's transcript.\n\nQuestion: {question}\n\nVideo Transcript: {docs}\n\nAnswer:"
    )
    llm_chain = LLMChain(llm=llm, prompt=prompt)
    response = llm_chain.run(question=query, docs=docs_page_content)
    return response


Create the Streamlit Interface:
Update main.py to create the interface:
import streamlit as st
import langchain_helper as lch
import textwrap

st.title("YouTube Assistant")

with st.form(key="my_form"):
    video_url = st.text_input("What is the YouTube video URL?", max_chars=50)
    query = st.text_area("Ask me about the video", max_chars=50, key="curie")
    submit_button = st.form_submit_button(label="Submit")

if submit_button:
    db = lch.create_vector_db_from_youtube(video_url)
    response = lch.get_response_from_query(db, query)
    st.subheader("Answer")
    st.text(textwrap.fill(response, width=80))


Run the Streamlit App:
Run the app in the terminal: streamlit run main.py.
Cost Considerations
OpenAI API Costs:
The cost of using OpenAI's API is relatively low, around $0.002 per 1,000 tokens.
For the course, the total cost was less than $0.50.
Public App Considerations:
To avoid being charged for public use, add a field for users to input their OpenAI API key.
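As a rough worked example of the pricing above, the sketch below converts token counts to dollars using the $0.002 per 1,000 tokens figure quoted in these notes. The function names are illustrative only, and real prices vary by model and change over time.

```python
PRICE_PER_1K_TOKENS = 0.002  # USD per 1,000 tokens, the figure quoted in the notes


def estimate_cost(tokens: int, price_per_1k: float = PRICE_PER_1K_TOKENS) -> float:
    """Return the approximate cost in USD for a given number of tokens."""
    return tokens / 1000 * price_per_1k


def tokens_for_budget(budget_usd: float, price_per_1k: float = PRICE_PER_1K_TOKENS) -> int:
    """Return roughly how many tokens a budget buys at the given price."""
    return round(budget_usd / price_per_1k * 1000)


print(f"10,000 tokens cost about ${estimate_cost(10_000):.2f}")
print(f"A $0.50 budget covers about {tokens_for_budget(0.50):,} tokens")
```

At that rate, a total spend of under $0.50 corresponds to fewer than roughly 250,000 tokens.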
Conclusion
Langchain is a powerful framework for building applications using large language models. By understanding its key components—LLMs, prompt templates, chains, agents, and indexes—you can create innovative applications that leverage the power of LLMs.
Tips
Streamlit Course: If interested, consider creating a course on Streamlit to build cool Python interfaces.
Environment Variables: Store sensitive information like API keys in environment variables for security.


LangChain Framework for Building LLM Applications: A Beginner's Course Summary
This document summarizes a LangChain course for beginners, covering core concepts, setup, and example applications.
I. Course Overview and Requirements
Course Goal: To simplify the creation of applications using large language models (LLMs) with LangChain. The course focuses on connecting AI models to various data sources for customized NLP applications.
Instructor: Rishabh Kumar, an experienced engineer.
LangChain: An open-source framework (Python and TypeScript) enabling developers to combine LLMs (e.g., OpenAI's GPT-4 or models from Hugging Face) with external computation and data sources. It goes beyond simple text input, allowing integration with databases, PDFs (converted to vector databases), and external APIs.


Components
LLM wrappers
Prompt templates
Indexes for information retrieval
Chains
Agents

II. Setup and Environment
OpenAI Account and API Key:
Create an OpenAI account at openai.com.
Generate an API key (found under your user account -> view API keys -> create new API key). Important: Save this key securely; it's only shown once.
Project Setup:
Create a project directory (e.g., langchain-llm-app) using the command mkdir langchain-llm-app.
Navigate to the directory using cd langchain-llm-app.
Open the directory in your code editor.
Virtual Environment:
Create a virtual environment: python -m venv .venv
Activate the virtual environment (Windows): .\.venv\Scripts\activate.ps1
Package Installation:
Install necessary packages using pip:
pip install langchain openai streamlit python-dotenv


.env File:
Create a .env file in your project directory.
Add your OpenAI API key as an environment variable: OPENAI_API_KEY=your_api_key


V. LangChain Agents: Interacting with the Environment
Agents allow LLMs to interact with external APIs and tools. The course demonstrates this with a function that retrieves the average age of a dog from Wikipedia and performs a calculation:
from langchain.agents import initialize_agent, Tool
from langchain.agents import AgentType
from langchain.chains import LLMMathChain
from langchain.llms import OpenAI
from langchain.utilities import WikipediaAPIWrapper

llm = OpenAI(temperature=0.5)

# Wrap each capability in a Tool so the agent can choose between them
tools = [
    Tool(
        name="Wikipedia",
        func=WikipediaAPIWrapper().run,
        description="useful for looking up information",
    ),
    Tool(name="Calculator", func=LLMMathChain.from_llm(llm).run, description="useful for math"),
]

agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
result = agent.run("What is the average age of a dog? Multiply the age by 3.")
print(result)



from langchain.agents import initialize_agent, load_tools
from langchain.agents import AgentType
from langchain.llms import OpenAI


llm = OpenAI(temperature=0.5)

tools = load_tools(["wikipedia", "llm-math"], llm=llm)
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
result = agent.run("What is the average age of a dog? Multiply the age by 3.")
print(result)



VI. Indexes and Vector Databases: A YouTube Assistant
This section builds a YouTube assistant that answers questions based on a YouTube video transcript. It demonstrates using document loaders, text splitters, and vector databases (using FAISS) to handle large text data.
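To make the role of chunk size and chunk overlap concrete, here is a minimal standard-library sketch of fixed-width splitting with overlap. It is a simplification under stated assumptions: LangChain's RecursiveCharacterTextSplitter prefers to break on separators such as paragraphs and lines, whereas the hypothetical `split_with_overlap` below cuts purely by character position.

```python
def split_with_overlap(text: str, chunk_size: int = 1000, chunk_overlap: int = 100) -> list:
    """Split text into character chunks where each chunk repeats the
    last `chunk_overlap` characters of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current chunk already reaches the end of the text
    return chunks


# A stand-in "transcript" of 2,500 characters
transcript = "".join(str(i % 10) for i in range(2500))
chunks = split_with_overlap(transcript)
print([len(chunk) for chunk in chunks])
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval.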
1. Vector Database Creation:
from langchain.document_loaders import YoutubeLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.llms import OpenAI
from langchain import PromptTemplate
from langchain.chains import LLMChain
from langchain.vectorstores import FAISS
from langchain.embeddings.openai import OpenAIEmbeddings
from dotenv import load_dotenv

load_dotenv()

def create_vector_db_from_youtube(video_url):
    # Load the transcript and split it into overlapping chunks
    loader = YoutubeLoader.from_youtube_url(video_url)
    transcript = loader.load()
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
    docs = text_splitter.split_documents(transcript)
    # Embed each chunk and index the vectors with FAISS
    embeddings = OpenAIEmbeddings()
    db = FAISS.from_documents(docs, embeddings)
    return db


2. Querying the Database:
def get_response_from_query(db, query, k=4):
    docs = db.similarity_search(query, k=k)
    docs_page_content = " ".join([doc.page_content for doc in docs])
    llm = OpenAI(temperature=0, model_name="text-davinci-003") # Note temperature = 0 for factual answers
    prompt_template = PromptTemplate(
        input_variables=["question", "docs"],
        template="""You are a helpful YouTube assistant that can answer questions about videos based on the video's transcript.
        The following is the question and the video transcript. Use only factual information from the transcript to answer the question. If you feel like you don't have enough information simply say I don't know.

        Question: {question}
        Video Transcript: {docs}""",
    )

    chain = LLMChain(llm=llm, prompt=prompt_template)
    response = chain.run({"question": query, "docs": docs_page_content})
    return response.replace("\n", "")
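
Conceptually, the similarity search step ranks stored chunks by how close their embedding vectors are to the query's embedding and keeps the top k. The sketch below illustrates that idea with hand-written three-dimensional vectors and cosine similarity. The vectors, chunk texts, and function names are made up for illustration; a real FAISS index uses optimized index structures and its own distance metric.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vector, indexed_chunks, k=4):
    """Return the texts of the k chunks most similar to the query vector."""
    ranked = sorted(
        indexed_chunks,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Toy index of (chunk text, embedding) pairs; the embeddings are invented
toy_index = [
    ("chunk about dogs", [1.0, 0.0, 0.0]),
    ("chunk about cats", [0.9, 0.1, 0.0]),
    ("chunk about cars", [0.0, 1.0, 0.0]),
    ("chunk about planes", [0.0, 0.9, 0.1]),
]

print(top_k([1.0, 0.05, 0.0], toy_index, k=2))
```

Only the retrieved chunks are placed into the prompt, which is how the assistant stays within the model's input limit even for long transcripts.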



3. Streamlit Interface for YouTube Assistant: Similar to the pet name generator, a Streamlit app is created to allow users to input a YouTube URL and a question. The create_vector_db_from_youtube and get_response_from_query functions are called to generate the answer.
=================================================================


Summary of LangChain Crash Course
Introduction to LangChain
LangChain is a framework designed to build applications on top of Large Language Models (LLMs) like GPT-3.5 or GPT-4. This course covers the basics of LangChain and demonstrates how to build a restaurant idea generator application using Streamlit.
Key Points
LLMs vs. Applications: LLMs are models like GPT-3.5 or GPT-4, while applications like ChatGPT use these models via APIs.
Limitations of Direct API Use:
Cost associated with API calls.
Limited knowledge (e.g., ChatGPT's knowledge cutoff is September 2021).
No access to internal organizational data.
Need for a Framework: LangChain provides a framework to build applications using LLMs, supporting various models and integrations.
Setting Up LangChain
Steps to Install and Set Up LangChain
Create an OpenAI Account:
Go to the OpenAI website and create an account.
Obtain an API key from the dashboard.
Install Required Modules:
pip install langchain
pip install openai


Set Up Environment Variables:
import os
os.environ["OPENAI_API_KEY"] = "sk-your-api-key"


Code Snippets
Importing and Setting Up OpenAI
from langchain import OpenAI

# Create an OpenAI model instance
llm = OpenAI(temperature=0.7)


Creating a Prompt Template
from langchain.prompts import PromptTemplate

# Define a prompt template
prompt_template = PromptTemplate(
    input_variables=["cuisine"],
    template="I want to open a restaurant for {cuisine} food."
)


Using the Prompt Template
formatted_prompt = prompt_template.format(cuisine="Mexican")
print(formatted_prompt)


Creating a Chain
from langchain.chains import LLMChain

# Create a chain
chain = LLMChain(llm=llm, prompt=prompt_template)

# Run the chain
response = chain.run(cuisine="American")
print(response)


Building a Sequential Chain
Steps to Create a Sequential Chain
Create Individual Chains:
One for generating restaurant names.
Another for generating menu items.
Combine Chains into a Sequential Chain:
Code Snippets
Creating Individual Chains
# Chain for generating restaurant names; output_key names its result
name_chain = LLMChain(llm=llm, prompt=name_prompt_template, output_key="restaurant_name")

# Chain for generating menu items; reads restaurant_name from the first chain
menu_chain = LLMChain(llm=llm, prompt=menu_prompt_template, output_key="menu_items")


Combining Chains
from langchain.chains import SequentialChain

# SequentialChain (unlike SimpleSequentialChain) accepts named inputs and outputs
sequential_chain = SequentialChain(chains=[name_chain, menu_chain], input_variables=["cuisine"], output_variables=["restaurant_name", "menu_items"])

# Run the sequential chain with a dict of inputs
response = sequential_chain({"cuisine": "Indian"})
print(response)


Building a Streamlit Application
Steps to Create a Streamlit Application
Install Streamlit:
pip install streamlit


Create the Streamlit App:
Create a main.py file.
Import Streamlit and create the UI.
Code Snippets
Basic Streamlit App
import streamlit as st

# Title of the app
st.title("Restaurant Name Generator")

# Sidebar for cuisine selection
cuisine = st.sidebar.selectbox("Pick a cuisine", ["Indian", "American", "Mexican"])

# Function to generate restaurant name and menu items
def get_restaurant_name_and_items(cuisine):
    # Placeholder for actual implementation
    return {"restaurant_name": "Curry Delight", "menu_items": "Biryani, Naan, Samosa"}

# Generate and display the restaurant name and menu items
if cuisine:
    response = get_restaurant_name_and_items(cuisine)
    st.header(response["restaurant_name"].strip())
    st.write("Menu Items:")
    for item in response["menu_items"].split(","):
        st.write(f"- {item.strip()}")


Modularizing the Code
Create a langchain_helper.py file for the LangChain logic.
Create a secret_key.py file for the OpenAI API key.
Example of langchain_helper.py
from langchain import OpenAI
from langchain.chains import SequentialChain, LLMChain
from langchain.prompts import PromptTemplate
from secret_key import OPENAI_API_KEY

# Set up OpenAI
llm = OpenAI(temperature=0.7, openai_api_key=OPENAI_API_KEY)

# Define prompt templates
name_prompt_template = PromptTemplate(
    input_variables=["cuisine"],
    template="I want to open a restaurant for {cuisine} food. Suggest a fancy name for it."
)

menu_prompt_template = PromptTemplate(
    input_variables=["restaurant_name"],
    template="Suggest some food menu items for {restaurant_name}."
)

# Create chains; output_key names each chain's result for the next step
name_chain = LLMChain(llm=llm, prompt=name_prompt_template, output_key="restaurant_name")
menu_chain = LLMChain(llm=llm, prompt=menu_prompt_template, output_key="menu_items")

# Create a sequential chain (SequentialChain supports named inputs and outputs)
sequential_chain = SequentialChain(chains=[name_chain, menu_chain], input_variables=["cuisine"], output_variables=["restaurant_name", "menu_items"])

# Function to generate restaurant name and menu items
def generate_restaurant_name_and_items(cuisine):
    response = sequential_chain({"cuisine": cuisine})
    return response

# Main function for testing
if __name__ == "__main__":
    print(generate_restaurant_name_and_items("Italian"))


Agents in LangChain
Key Points
Agents: Use LLM's reasoning capabilities to perform tasks by connecting with external tools.
Tools: Wikipedia, Google Search API, LLM Math, etc.
Example: Finding flight options, calculating age, etc.
Code Snippets
Setting Up Agents
from langchain.agents import initialize_agent, Tool
from langchain.chains import LLMMathChain
from langchain.llms import OpenAI
from langchain.utilities import WikipediaAPIWrapper

# The agent and the math tool both need an LLM instance
llm = OpenAI(temperature=0.7)

# Initialize tools
tools = [
    Tool(
        name="Wikipedia",
        func=WikipediaAPIWrapper().run,
        description="A wrapper around Wikipedia."
    ),
    Tool(
        name="LLM Math",
        func=LLMMathChain.from_llm(llm).run,
        description="A wrapper for performing mathematical calculations."
    )
]

# Initialize agent
agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)

# Run the agent
response = agent.run("When was Elon Musk born and what is his age in 2023?")
print(response)


Memory in LangChain
Key Points
Memory: Allows chains to remember past conversations.
Conversational Buffer Memory: Stores conversation history.
Conversational Buffer Window Memory: Limits the size of the conversation history.
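The difference between the two memory types can be sketched with the standard library alone. The `WindowMemory` class below is a hypothetical stand-in rather than LangChain's implementation: it keeps only the last k exchanges, which is the idea behind a windowed buffer, while a plain buffer would keep all of them.

```python
from collections import deque


class WindowMemory:
    """Keeps only the most recent k (human, ai) exchanges."""

    def __init__(self, k: int = 1):
        # A deque with maxlen silently drops the oldest entry when full
        self.turns = deque(maxlen=k)

    def save(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    @property
    def buffer(self) -> str:
        """Render the remembered turns as text to prepend to the next prompt."""
        return "\n".join(f"Human: {human}\nAI: {ai}" for human, ai in self.turns)


memory = WindowMemory(k=1)
memory.save("Who won the first Cricket World Cup?", "West Indies")
memory.save("What is 6+6?", "12")
print(memory.buffer)
```

With k=1 the first exchange is dropped as soon as the second is saved, so fewer tokens are sent on each call at the price of forgetting older context.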
Code Snippets
Conversational Buffer Memory
from langchain.memory import ConversationBufferMemory

# Create a memory object
memory = ConversationBufferMemory()

# Attach memory to a chain
chain = LLMChain(llm=llm, prompt=prompt_template, memory=memory)

# Run the chain
response = chain.run(cuisine="Indian")
print(chain.memory.buffer)


Conversational Buffer Window Memory
from langchain.memory import ConversationBufferWindowMemory

# Create a memory object with a window size
memory = ConversationBufferWindowMemory(k=1)

# Attach memory to a chain
chain = LLMChain(llm=llm, prompt=prompt_template, memory=memory)

# Run the chain
response = chain.run(cuisine="Indian")
print(chain.memory.buffer)


Conclusion
This crash course covered the basics of LangChain, including setting up the environment, building a restaurant idea generator application, creating sequential chains, building a Streamlit application, and exploring agents and memory in LangChain. For more advanced features and future updates, stay tuned.
LangChain Crash Course: Building a Restaurant Idea Generator
This document summarizes a crash course video on LangChain, a framework for building applications using Large Language Models (LLMs). The video culminates in building a Streamlit application that generates restaurant names and menu items based on a chosen cuisine.
1. Introduction to LangChain and its Advantages
LangChain addresses limitations of directly using LLMs like OpenAI's GPT models for application development. These limitations include:
Cost: OpenAI API calls are costly.
Limited Knowledge: LLMs have knowledge cutoffs (e.g., September 2021).
Lack of Access to Internal Data: LLMs cannot access private organizational data.
LangChain overcomes these by:
Providing a framework to integrate various LLMs (OpenAI, Hugging Face, etc.)
Enabling integration with external data sources (Google Search, Wikipedia, databases).
Offering a plug-and-play architecture for different models.
2. Setup and Initial LangChain Usage
2.1 OpenAI API Key:
Obtain an API key from your OpenAI account. Store it securely (e.g., in a separate Python file, environment variable). The video uses a secret_key.py file. Example:
# secret_key.py
OPENAI_API_KEY = "sk-YOUR_API_KEY_HERE"


# main.py
import os
from secret_key import OPENAI_API_KEY
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY


2.2 Installation:
Install necessary packages:
pip install langchain openai


2.3 Basic LLM Interaction:
from langchain.llms import OpenAI

llm = OpenAI(temperature=0.7) # temperature controls creativity (0-1)
print(llm("I want to open a restaurant for Indian food and need a fancy name."))


This code snippet shows a basic interaction with the OpenAI LLM. The temperature parameter influences the creativity of the response.
3. Prompt Templates and Chains
To avoid hardcoding prompts, use PromptTemplate:
from langchain.prompts import PromptTemplate

template = """I want to open a restaurant for {cuisine} food. Suggest a fancy name for it."""
prompt = PromptTemplate(input_variables=["cuisine"], template=template)
print(prompt.format(cuisine="Mexican"))


This code creates a prompt template that can be dynamically populated with different cuisines. The format method substitutes the input variable.
LangChain's LLMChain simplifies the process of combining an LLM and a prompt:
from langchain.chains import LLMChain

llm = OpenAI(temperature=0.7)
chain = LLMChain(llm=llm, prompt=prompt)
print(chain.run("Italian"))


4. Sequential Chains
To perform multiple tasks sequentially, use SimpleSequentialChain or SequentialChain. The output of one chain becomes the input for the next.
4.1 SimpleSequentialChain Example:
This example generates a restaurant name and then menu items:
from langchain.chains import SimpleSequentialChain

name_chain = LLMChain(llm=llm, prompt=PromptTemplate(input_variables=["cuisine"], template="Suggest a fancy name for an {cuisine} restaurant."))
menu_chain = LLMChain(llm=llm, prompt=PromptTemplate(input_variables=["restaurant_name"], template="Suggest some food menu items for {restaurant_name} restaurant, separated by commas."))

overall_chain = SimpleSequentialChain(chains=[name_chain, menu_chain], verbose=True)
print(overall_chain.run("Indian"))


4.2 SequentialChain for Multiple Outputs:
To obtain multiple outputs from a sequential chain, use SequentialChain. This example gets both the restaurant name and menu items:
from langchain.chains import SequentialChain

name_chain = LLMChain(llm=llm, prompt=PromptTemplate(input_variables=["cuisine"], template="Suggest a fancy name for an {cuisine} restaurant."), output_key="restaurant_name")
menu_chain = LLMChain(llm=llm, prompt=PromptTemplate(input_variables=["restaurant_name"], template="Suggest some food menu items for the {restaurant_name} restaurant, separated by commas."), output_key="menu_items")

chain = SequentialChain(chains=[name_chain, menu_chain], input_variables=["cuisine"], output_variables=["restaurant_name", "menu_items"])
print(chain({"cuisine": "Arabic"}))
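
The way a sequential chain passes named outputs forward can be sketched without any LLM. In the illustration below each step is a plain function paired with an output key, and the hypothetical `run_sequential` helper merges each result into a shared dict so later steps can read it. The step functions are fakes standing in for LLM calls, and this is not how LangChain implements SequentialChain internally.

```python
def run_sequential(steps, inputs, output_variables):
    """Run (function, output_key) steps in order, sharing one state dict."""
    state = dict(inputs)
    for func, output_key in steps:
        # Each step can read the inputs and everything produced so far
        state[output_key] = func(state)
    # Return the original inputs plus the requested outputs
    return {**inputs, **{key: state[key] for key in output_variables}}


def fake_name_step(state):
    # Stand-in for the LLM call that suggests a restaurant name
    return f"The {state['cuisine']} Table"


def fake_menu_step(state):
    # Stand-in for the LLM call that suggests menu items
    return f"Menu for {state['restaurant_name']}: dish A, dish B"


result = run_sequential(
    steps=[(fake_name_step, "restaurant_name"), (fake_menu_step, "menu_items")],
    inputs={"cuisine": "Arabic"},
    output_variables=["restaurant_name", "menu_items"],
)
print(result)
```

The output keys are what connect the steps: the second step only works because the first one stored its result under the name the second prompt expects.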


5. Streamlit Application
The video demonstrates creating a Streamlit application using the LangChain functionality developed previously.
5.1 Streamlit Setup:
Install Streamlit:
pip install streamlit


6. LangChain Agents
Agents enhance LLMs by allowing interaction with external tools (e.g., Wikipedia, search engines).
6.1 Agent Example with Search and Math Tools:
from langchain.agents import load_tools
from langchain.agents import initialize_agent
from langchain.agents import AgentType
from langchain.llms import OpenAI

tools = load_tools(["serpapi", "llm-math"], llm=llm)
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
print(agent.run("What was the US GDP in 2022, plus 5?"))


This example uses the serpapi tool (Google Search API) and llm-math to answer a question requiring external data and calculation. Remember to set your SerpApi key (the SERPAPI_API_KEY environment variable) before running it.
7. LangChain Memory
LLMs are typically stateless. LangChain provides mechanisms to add memory, making them remember past conversations.
7.1 Conversational Buffer Memory:
from langchain.memory import ConversationBufferMemory
from langchain.chains import LLMChain

memory = ConversationBufferMemory()
chain = LLMChain(llm=llm, prompt=prompt, memory=memory)
print(chain.run("Who won the first Cricket World Cup?"))
print(chain.run("Who was the captain of the winning team?"))
print(chain.memory)
print(chain.memory.buffer)



from langchain.chains import ConversationChain
convo = ConversationChain(llm=OpenAI(temperature=0.6))
print(convo.prompt.template)

convo.run("who won the first world cup in cricket")
convo.run("what is 6+6")

print(convo.memory)
print(convo.memory.buffer)




from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferWindowMemory

# Keep only the most recent exchange in memory
memory = ConversationBufferWindowMemory(k=1)
convo = ConversationChain(llm=OpenAI(temperature=0.6), memory=memory)
print(convo.prompt.template)

convo.run("who won the first world cup in cricket")
convo.run("what is 6+6")

print(convo.memory)
print(convo.memory.buffer)





This adds conversation history to the chain's memory.
7.2 Conversational Buffer Window Memory (for cost optimization):
To limit memory size and reduce costs, use ConversationBufferWindowMemory:
from langchain.memory import ConversationBufferWindowMemory

memory = ConversationBufferWindowMemory(k=1) # remember only the last conversation
chain = LLMChain(llm=llm, prompt=prompt, memory=memory)
# ... (Run multiple conversations here)


This example limits the memory to the most recent conversation turn.
Remember to replace placeholders like "YOUR_API_KEY_HERE" with your actual keys.





======================================================

LANGCHAIN DOCUMENTATION

Chat Models Overview
Large Language Models (LLMs) are advanced machine learning models that excel in a wide range of language-related tasks such as text generation, translation, summarization, question answering, and more, without needing task-specific fine-tuning for every scenario.
Modern LLMs are typically accessed through a chat model interface that takes a list of messages as input and returns a message as output.
Features of New Generation Chat Models
Tool Calling
Many popular chat models offer a native tool calling API. This API allows developers to build rich applications that enable LLMs to interact with external services, APIs, and databases. Tool calling can also be used to extract structured information from unstructured data and perform various other tasks.
Structured Output
A technique to make a chat model respond in a structured format, such as JSON that matches a given schema.
Multimodality
The ability to work with data other than text; for example, images, audio, and video.
LangChain Features
LangChain provides a consistent interface for working with chat models from different providers while offering additional features for monitoring, debugging, and optimizing the performance of applications that use LLMs.
Integrations
Integrations with many chat model providers: e.g., Anthropic, OpenAI, Ollama, Microsoft Azure, Google Vertex, Amazon Bedrock, Hugging Face, Cohere, Groq.
Message Formats: Use either LangChain's messages format or OpenAI format.
Standard Tool Calling API: Standard interface for binding tools to models, accessing tool call requests made by models, and sending tool results back to the model.
Standard API for Structuring Outputs: Via the with_structured_output method.
Support for Async Programming, Efficient Batching, Rich Streaming API.
Integration with LangSmith: For monitoring and debugging production-grade applications based on LLMs.
Additional Features: Standardized token usage, rate limiting, caching, and more.
Types of Integrations
Official Models: Supported by LangChain and/or model provider. Found in langchain-<provider> packages.
Community Models: Mostly contributed and supported by the community. Found in the langchain-community package.
Naming Convention
LangChain chat models are named with a convention that prefixes "Chat" to their class names (e.g., ChatOllama, ChatAnthropic, ChatOpenAI, etc.).
Note
Models that do not include the prefix "Chat" in their name or include "LLM" as a suffix in their name typically refer to older models that do not follow the chat model interface and instead use an interface that takes a string as input and returns a string as output.
Interface
LangChain chat models implement the BaseChatModel interface. Because BaseChatModel also implements the Runnable Interface, chat models support a standard streaming interface, async programming, optimized batching, and more.
Key Methods
invoke: The primary method for interacting with a chat model. It takes a list of messages as input and returns a single message (an AIMessage) as output.
stream: A method that allows you to stream the output of a chat model as it is generated.
batch: A method that allows you to batch multiple requests to a chat model together for more efficient processing.
bind_tools: A method that allows you to bind a tool to a chat model for use in the model's execution context.
with_structured_output: A wrapper around the invoke method for models that natively support structured output.
Inputs and Outputs
Modern LLMs are typically accessed through a chat model interface that takes messages as input and returns messages as output. Messages are typically associated with a role (e.g., "system", "human", "assistant") and one or more content blocks that contain text or potentially multimodal data (e.g., images, audio, video).
Message Formats
LangChain Message Format: LangChain's own message format, which is used by default and is used internally by LangChain.
OpenAI's Message Format: OpenAI's message format.
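As a rough illustration of the two formats, the sketch below converts (role, content) tuples, the shorthand used with ChatPromptTemplate earlier in these notes, into OpenAI-style role/content dicts. It is a simplification: LangChain's own format uses message objects such as SystemMessage, HumanMessage, and AIMessage, and the `to_openai_format` helper and its role mapping are written here for illustration only.

```python
# LangChain-style role names mapped to OpenAI-style role names
ROLE_TO_OPENAI = {
    "system": "system",
    "human": "user",
    "user": "user",
    "ai": "assistant",
    "assistant": "assistant",
}


def to_openai_format(messages):
    """Convert (role, content) tuples into OpenAI-style message dicts."""
    return [
        {"role": ROLE_TO_OPENAI[role], "content": content}
        for role, content in messages
    ]


conversation = [
    ("system", "You are a helpful assistant."),
    ("human", "What is LangChain?"),
    ("ai", "A framework for building LLM applications."),
]

for message in to_openai_format(conversation):
    print(message)
```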
Standard Parameters
| Parameter | Description |
| --- | --- |
| model | The name or identifier of the specific AI model you want to use (e.g., "gpt-3.5-turbo" or "gpt-4"). |
| temperature | Controls the randomness of the model's output. A higher value (e.g., 1.0) makes responses more creative, while a lower value (e.g., 0.0) makes them more deterministic and focused. |
| timeout | The maximum time (in seconds) to wait for a response from the model before canceling the request. Ensures the request doesn’t hang indefinitely. |
| max_tokens | Limits the total number of tokens (words and punctuation) in the response. This controls how long the output can be. |
| stop | Specifies stop sequences that indicate when the model should stop generating tokens. For example, you might use specific strings to signal the end of a response. |
| max_retries | The maximum number of attempts the system will make to resend a request if it fails due to issues like network timeouts or rate limits. |
| api_key | The API key required for authenticating with the model provider. This is usually issued when you sign up for access to the model. |
| base_url | The URL of the API endpoint where requests are sent. This is typically provided by the model's provider and is necessary for directing your requests. |
| rate_limiter | An optional BaseRateLimiter to space out requests to avoid exceeding rate limits. See rate-limiting below for more details. |

Important Notes
Standard parameters only apply to model providers that expose parameters with the intended functionality.
Standard parameters are currently only enforced on integrations that have their own integration packages (e.g., langchain-openai, langchain-anthropic, etc.); they are not enforced on models in langchain-community.
Chat models also accept other parameters that are specific to that integration. To find all the parameters supported by a Chat model, head to the respective API reference for that model.
Tool Calling
Chat models can call tools to perform tasks such as fetching data from a database, making API requests, or running custom code. Please see the tool calling guide for more information.
Structured Outputs
Chat models can be requested to respond in a particular format (e.g., JSON or matching a particular schema). This feature is extremely useful for information extraction tasks. Please read more about the technique in the structured outputs guide.
Multimodality
Large Language Models (LLMs) are not limited to processing text. They can also be used to process other types of data, such as images, audio, and video. This is known as multimodality.
Currently, only some LLMs support multimodal inputs, and almost none support multimodal outputs. Please consult the specific model documentation for details.
Context Window
A chat model's context window refers to the maximum size of the input sequence the model can process at one time. While the context windows of modern LLMs are quite large, they still present a limitation that developers must keep in mind when working with chat models.
If the input exceeds the context window, the model may not be able to process the entire input and could raise an error. In conversational applications, this is especially important because the context window determines how much information the model can "remember" throughout a conversation. Developers often need to manage the input within the context window to maintain a coherent dialogue without exceeding the limit. For more details on handling memory in conversations, refer to the memory documentation.
The size of the input is measured in tokens which are the unit of processing that the model uses.
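As a rough illustration of keeping a conversation inside the context window, the sketch below keeps system messages and then as many of the most recent messages as fit an estimated token budget. The `estimate_tokens` heuristic (about four characters per token) and the function names are assumptions made for this example; a real application would count tokens with the model's own tokenizer, or use a utility such as `trim_messages` from `langchain_core.messages`.

```python
from typing import Dict, List

Message = Dict[str, str]


def estimate_tokens(text: str) -> int:
    """Crude heuristic (assumption): roughly four characters per token."""
    return max(1, len(text) // 4)


def trim_history(messages: List[Message], max_tokens: int) -> List[Message]:
    """Keep system messages plus the most recent messages that fit the budget."""
    system = [m for m in messages if m["role"] == "system"]
    others = [m for m in messages if m["role"] != "system"]

    budget = max_tokens - sum(estimate_tokens(m["content"]) for m in system)
    kept: List[Message] = []
    # Walk backwards so the newest messages are considered first.
    for message in reversed(others):
        cost = estimate_tokens(message["content"])
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    return system + list(reversed(kept))


history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "I'm doing well, thank you for asking."},
    {"role": "user", "content": "Can you tell me a joke?"},
]
trimmed = trim_history(history, max_tokens=15)
print([m["role"] for m in trimmed])
```

With this small budget only the system message and the latest user message survive. A trimmed history may still need a check that it starts and ends with valid roles.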
Advanced Topics
Rate-Limiting
Many chat model providers impose a limit on the number of requests that can be made in a given time period.
If you hit a rate limit, you will typically receive a rate limit error response from the provider, and will need to wait before making more requests.
Options to Deal with Rate Limits
Try to avoid hitting rate limits by spacing out requests: Chat models accept a rate_limiter parameter that can be provided during initialization. This parameter is used to control the rate at which requests are made to the model provider. Spacing out the requests to a given model is a particularly useful strategy when benchmarking models to evaluate their performance. Please see the how-to guide on handling rate limits for more information on this feature.
Try to recover from rate limit errors: If you receive a rate limit error, you can wait a certain amount of time before retrying the request. The amount of time to wait can be increased with each subsequent rate limit error. Chat models have a max_retries parameter that can be used to control the number of retries. See the standard parameters section for more information.
Fallback to another chat model: If you hit a rate limit with one chat model, you can switch to another chat model that is not rate-limited.
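The retry and fallback options above can be sketched in plain Python. This is only an illustration of the idea, not LangChain's implementation: `RateLimitError`, `call_with_retries`, `call_with_fallbacks`, and the toy model functions are all hypothetical names introduced for the example.

```python
import time
from typing import Callable, Sequence


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate limit error."""


def call_with_retries(
    call: Callable[[str], str],
    prompt: str,
    max_retries: int = 2,
    base_delay: float = 0.01,
) -> str:
    """Retry on rate limit errors, doubling the wait after each failure."""
    for attempt in range(max_retries + 1):
        try:
            return call(prompt)
        except RateLimitError:
            if attempt == max_retries:
                raise
            time.sleep(base_delay * (2 ** attempt))
    raise RuntimeError("max_retries must be non-negative")


def call_with_fallbacks(models: Sequence[Callable[[str], str]], prompt: str) -> str:
    """Try each model in order, moving on when one stays rate-limited."""
    last_error = None
    for model in models:
        try:
            return call_with_retries(model, prompt)
        except RateLimitError as error:
            last_error = error
    raise RuntimeError("all models were rate-limited") from last_error


# Toy models (hypothetical): one fails twice then succeeds, one always fails.
attempts = {"flaky": 0}


def flaky_model(prompt: str) -> str:
    attempts["flaky"] += 1
    if attempts["flaky"] < 3:
        raise RateLimitError("slow down")
    return f"flaky answered: {prompt}"


def limited_model(prompt: str) -> str:
    raise RateLimitError("always limited")


def backup_model(prompt: str) -> str:
    return f"backup answered: {prompt}"


retry_result = call_with_retries(flaky_model, "hi")
fallback_result = call_with_fallbacks([limited_model, backup_model], "hi")
```

The delays here are tiny so the example runs quickly; real waits would typically be seconds, often guided by the provider's error response.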
Caching
Chat model APIs can be slow, so a natural question is whether to cache the results of previous conversations. Theoretically, caching can help improve performance by reducing the number of requests made to the model provider. In practice, caching chat model responses is a complex problem and should be approached with caution.
The reason is that getting a cache hit is unlikely after the first or second interaction in a conversation if relying on caching the exact inputs into the model. For example, how likely do you think that multiple conversations start with the exact same message? What about the exact same three messages?
An alternative approach is to use semantic caching, where you cache responses based on the meaning of the input rather than the exact input itself. This can be effective in some situations, but not in others.
A semantic cache introduces a dependency on another model on the critical path of your application (e.g., the semantic cache may rely on an embedding model to convert text to a vector representation), and it's not guaranteed to capture the meaning of the input accurately.
However, there might be situations where caching chat model responses is beneficial. For example, if you have a chat model that is used to answer frequently asked questions, caching responses can help reduce the load on the model provider, costs, and improve response times.
Please see the how to cache chat model responses guide for more details.
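To make the cache-hit argument concrete, here is a minimal sketch of an exact-match cache keyed on the full list of input messages. The `ExactMatchCache` class and `fake_model` stub are assumptions for illustration and are not LangChain's caching API.

```python
import json
from typing import Callable, Dict, List

Message = Dict[str, str]


class ExactMatchCache:
    """Caches responses keyed on the full, exact list of input messages."""

    def __init__(self, model: Callable[[List[Message]], str]) -> None:
        self._model = model
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def invoke(self, messages: List[Message]) -> str:
        # Serialize deterministically so identical inputs share one key.
        key = json.dumps(messages, sort_keys=True)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        response = self._model(messages)
        self._store[key] = response
        return response


def fake_model(messages: List[Message]) -> str:
    """Hypothetical model stub that echoes the last message."""
    return f"echo: {messages[-1]['content']}"


cached = ExactMatchCache(fake_model)
faq = [{"role": "user", "content": "What are your opening hours?"}]
cached.invoke(faq)  # miss: first time this input is seen
cached.invoke(faq)  # hit: identical input
cached.invoke(faq + [{"role": "user", "content": "And on Sundays?"}])  # miss
print(cached.hits, cached.misses)
```

The FAQ-style repeat hits the cache, but as soon as the conversation grows by one message the key changes and the lookup misses, which is why exact-match caching rarely helps beyond the first turn.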
Messages
Messages are the unit of communication in chat models. They are used to represent the input and output of a chat model, as well as any additional context or metadata that may be associated with a conversation.
Each message has a role (e.g., "user", "assistant") and content (e.g., text, multimodal data) with additional metadata that varies depending on the chat model provider.
LangChain provides a unified message format that can be used across chat models, allowing users to work with different chat models without worrying about the specific details of the message format used by each model provider.
What is Inside a Message?
A message typically consists of the following pieces of information:
Role: The role of the message (e.g., "user", "assistant").
Content: The content of the message (e.g., text, multimodal data).
Additional Metadata: id, name, token usage, and other model-specific metadata.
Role
Roles are used to distinguish between different types of messages in a conversation and help the chat model understand how to respond to a given sequence of messages.
| Role | Description |
| --- | --- |
| system | Used to tell the chat model how to behave and provide additional context. Not supported by all chat model providers. |
| user | Represents input from a user interacting with the model, usually in the form of text or other interactive input. |
| assistant | Represents a response from the model, which can include text or a request to invoke tools. |
| tool | A message used to pass the results of a tool invocation back to the model after external data or processing has been retrieved. Used with chat models that support tool calling. |
| function (legacy) | This is a legacy role, corresponding to OpenAI's legacy function-calling API. The tool role should be used instead. |

Content
The content of a message is either text or a list of dictionaries representing multimodal data (e.g., images, audio, video). The exact format of the content can vary between different chat model providers.
Currently, most chat models support text as the primary content type, with some models also supporting multimodal data. However, support for multimodal data is still limited across most chat model providers.
For more information see:
SystemMessage: for content which should be passed to direct the conversation
HumanMessage: for content in the input from the user.
AIMessage: for content in the response from the model.
Multimodality: for more information on multimodal content.
Other Message Data
Depending on the chat model provider, messages can include other data such as:
ID: An optional unique identifier for the message.
Name: An optional name property that allows differentiating between different entities/speakers with the same role. Not all models support this!
Metadata: Additional information about the message, such as timestamps, token usage, etc.
Tool Calls: A request made by the model to call one or more tools. See tool calling for more information.
Conversation Structure
The sequence of messages into a chat model should follow a specific structure to ensure that the chat model can generate a valid response.
For example, a typical conversation structure might look like this:
User Message: "Hello, how are you?"
Assistant Message: "I'm doing well, thank you for asking."
User Message: "Can you tell me a joke?"
Assistant Message: "Sure! Why did the scarecrow win an award? Because he was outstanding in his field!"
Please read the chat history guide for more information on managing chat history and ensuring that the conversation structure is correct.
LangChain Messages
LangChain provides a unified message format that can be used across all chat models, allowing users to work with different chat models without worrying about the specific details of the message format used by each model provider.
LangChain messages are Python objects that subclass BaseMessage.
The five main message types are:
SystemMessage: corresponds to system role
HumanMessage: corresponds to user role
AIMessage: corresponds to assistant role
AIMessageChunk: corresponds to assistant role, used for streaming responses
ToolMessage: corresponds to tool role
Other important messages include:
RemoveMessage: does not correspond to any role. This is an abstraction, mostly used in LangGraph to manage chat history.
Legacy FunctionMessage: corresponds to the function role in OpenAI's legacy function-calling API.
You can find more information about messages in the API Reference.
SystemMessage
A SystemMessage is used to prime the behavior of the AI model and provide additional context, such as instructing the model to adopt a specific persona or setting the tone of the conversation (e.g., "This is a conversation about cooking").
Different chat providers may support system messages in one of the following ways:
Through a "system" message role: In this case, a system message is included as part of the message sequence with the role explicitly set as "system."
Through a separate API parameter for system instructions: Instead of being included as a message, system instructions are passed via a dedicated API parameter.
No support for system messages: Some models do not support system messages at all.
Most major chat model providers support system instructions via either a chat message or a separate API parameter. LangChain will automatically adapt based on the provider’s capabilities. If the provider supports a separate API parameter for system instructions, LangChain will extract the content of a system message and pass it through that parameter.
If no system message is supported by the provider, in most cases LangChain will attempt to incorporate the system message's content into a HumanMessage or raise an exception if that is not possible. However, this behavior is not yet consistently enforced across all implementations, and if using a less popular implementation of a chat model (e.g., an implementation from the langchain-community package) it is recommended to check the specific documentation for that model.
HumanMessage
The HumanMessage corresponds to the "user" role. A human message represents input from a user interacting with the model.
Text Content
Most chat models expect the user input to be in the form of text.
from langchain_core.messages import HumanMessage

model.invoke([HumanMessage(content="Hello, how are you?")])

API Reference: HumanMessage
Tip
When invoking a chat model with a string as input, LangChain will automatically convert the string into a HumanMessage object. This is mostly useful for quick testing.
model.invoke("Hello, how are you?")

Multi-modal Content
Some chat models accept multimodal inputs, such as images, audio, video, or files like PDFs.
Please see the multimodality guide for more information.
AIMessage
AIMessage is used to represent a message with the role "assistant". This is the response from the model, which can include text or a request to invoke tools. It could also include other media types like images, audio, or video -- though this is still uncommon at the moment.
from langchain_core.messages import HumanMessage
ai_message = model.invoke([HumanMessage("Tell me a joke")])
ai_message # <-- AIMessage

API Reference: HumanMessage
An AIMessage has the following attributes. Attributes marked Standardized are the ones LangChain attempts to keep consistent across different chat model providers. Raw fields are specific to the model provider and may vary.
| Attribute | Standardized/Raw | Description |
| --- | --- | --- |
| content | Raw | Usually a string, but can be a list of content blocks. See content for details. |
| tool_calls | Standardized | Tool calls associated with the message. See tool calling for details. |
| invalid_tool_calls | Standardized | Tool calls with parsing errors associated with the message. See tool calling for details. |
| usage_metadata | Standardized | Usage metadata for a message, such as token counts. See Usage Metadata API Reference. |
| id | Standardized | An optional unique identifier for the message, ideally provided by the provider/model that created the message. |
| response_metadata | Raw | Response metadata, e.g., response headers, logprobs, token counts. |

Content
The content property of an AIMessage represents the response generated by the chat model.
The content is either:
text: the norm for virtually all chat models.
A list of dictionaries: Each dictionary represents a content block and is associated with a type.
Used by Anthropic for surfacing agent thought process when doing tool calling.
Used by OpenAI for audio outputs. Please see multi-modal content for more information.
Important
The content property is not standardized across different chat model providers, mostly because there are still few examples to generalize from.
AIMessageChunk
It is common to stream responses for the chat model as they are being generated, so the user can see the response in real-time instead of waiting for the entire response to be generated before displaying it.
AIMessageChunk is the message type returned from the stream, astream, and astream_events methods of the chat model.
For example,
for chunk in model.stream([HumanMessage("what color is the sky?")]):
    print(chunk)

AIMessageChunk follows nearly the same structure as AIMessage, but uses a different ToolCallChunk to be able to stream tool calling in a standardized manner.
Aggregating
AIMessageChunks support the + operator to merge them into a single AIMessage. This is useful when you want to display the final response to the user.
ai_message = chunk1 + chunk2 + chunk3 + ...
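The aggregation idea can be illustrated with a toy class that supports `+` in the same spirit. `TextChunk` is a hypothetical stand-in written for this sketch, not LangChain's AIMessageChunk, which also merges tool call chunks and metadata.

```python
from dataclasses import dataclass
from functools import reduce
from operator import add


@dataclass
class TextChunk:
    """Toy stand-in for a streamed chunk; not LangChain's AIMessageChunk."""

    content: str

    def __add__(self, other: "TextChunk") -> "TextChunk":
        # Merging two chunks concatenates their text content.
        return TextChunk(self.content + other.content)


chunks = [TextChunk("The sky "), TextChunk("is "), TextChunk("blue.")]
# reduce applies + pairwise, which is handy when the chunk count is unknown.
merged = reduce(add, chunks)
print(merged.content)
```

Folding with `reduce` (or accumulating inside the streaming loop) avoids having to name each chunk individually.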

ToolMessage
This represents a message with role "tool", which contains the result of calling a tool. In addition to role and content, this message has:
a tool_call_id field: which conveys the id of the call to the tool that was called to produce this result.
an artifact field: which can be used to pass along arbitrary artifacts of the tool execution which are useful to track but which should not be sent to the model.
Please see tool calling for more information.
RemoveMessage
This is a special message type that does not correspond to any roles. It is used for managing chat history in LangGraph.
Please see the following for more information on how to use the RemoveMessage:
Memory conceptual guide
How to delete messages
OpenAI Format
Inputs
Chat models also accept messages in OpenAI's format as input:
chat_model.invoke([
   {
       "role": "user",
       "content": "Hello, how are you?",
   },
   {
       "role": "assistant",
       "content": "I'm doing well, thank you for asking.",
   },
   {
       "role": "user",
       "content": "Can you tell me a joke?",
   }
])
Outputs
Model output is currently returned as LangChain messages, so you will need to convert it if you also need the output in OpenAI format.
The convert_to_openai_messages utility function can be used to convert from LangChain messages to OpenAI format.
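To show what such a conversion involves at its simplest, the sketch below maps LangChain message type names ("human", "ai", and so on) to OpenAI role names. It is a hand-rolled illustration operating on plain `(type, content)` tuples, with `to_openai_format` as a hypothetical helper; it ignores tool call ids and multimodal content, which the real utility has to handle.

```python
from typing import Dict, List, Tuple

# LangChain message type names mapped to OpenAI-style role names.
ROLE_BY_TYPE = {"system": "system", "human": "user", "ai": "assistant", "tool": "tool"}


def to_openai_format(messages: List[Tuple[str, str]]) -> List[Dict[str, str]]:
    """Convert (message_type, content) pairs into OpenAI-style role dicts."""
    return [
        {"role": ROLE_BY_TYPE[message_type], "content": content}
        for message_type, content in messages
    ]


converted = to_openai_format(
    [("system", "You are terse."), ("human", "Hello"), ("ai", "Hi.")]
)
print(converted)
```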

Summary 2
LangChain Chat Models: A Comprehensive Summary
This document summarizes key aspects of LangChain's chat model functionality, including features, integrations, interface, key methods, inputs/outputs, standard parameters, advanced topics (rate-limiting, caching), and message structure.
I. Overview of LangChain Chat Models
LangChain provides a unified interface for interacting with various Large Language Models (LLMs), primarily through a chat model interface. Modern LLMs are accessed via a chat model interface, taking a list of messages as input and returning a message as output. LangChain supports a wide range of providers (Anthropic, OpenAI, Ollama, Microsoft Azure, Google Vertex, Amazon Bedrock, Hugging Face, Cohere, Groq). Older LLMs, which use a string input/string output interface, are also supported but generally discouraged.
Key Features of LangChain's Chat Model Support:
Consistent Interface: Provides a standardized way to interact with diverse chat models.
Tool Calling: Native support for integrating LLMs with external tools and APIs.
Structured Output: Ability to receive responses in structured formats like JSON.
Multimodality: Support for various data types beyond text (images, audio, video - support varies by model).
Monitoring & Debugging: Integration with LangSmith for production-level applications.
Asynchronous Programming, Batching, Streaming: Enhanced efficiency and real-time feedback.
Official and Community Models: Access to models officially supported by LangChain and community-contributed models. Official models are found in langchain-<provider> packages; community models in langchain-community.
Naming Convention: LangChain chat models have class names prefixed with "Chat" (e.g., ChatOllama, ChatOpenAI). Models without "Chat" or with "LLM" as a suffix are typically older, string-based models.
II. LangChain Chat Model Interface
LangChain chat models adhere to the BaseChatModel interface, which extends the Runnable interface. This enables streaming, async programming, and optimized batching.
Key Methods:
invoke(messages): Primary method; takes a list of messages as input and returns a single message (an AIMessage) as output.
stream(messages): Streams the model's output as it's generated.
batch(messages): Processes multiple requests efficiently.
bind_tools(tools): Binds tools to the model for tool calling.
with_structured_output(schema): Wraps invoke for structured output.
Note: The terms "LLM" and "Chat Model" are often used interchangeably in the documentation, as most modern LLMs are accessed via a chat interface.
III. Inputs and Outputs
Message Format: LangChain supports two message formats:
LangChain Message Format: LangChain's default, internal format.
OpenAI Message Format: OpenAI's message format. Conversion between formats is possible using convert_to_openai_messages.
Message Structure: Messages consist of:
Role: ("system", "user", "assistant", "tool")
Content: Text or multimodal data.
Metadata: Optional additional information (ID, name, token usage, etc.).
IV. Standard Parameters
Many chat models use standardized parameters:
| Parameter | Description |
| --- | --- |
| model | Model name/identifier (e.g., "gpt-3.5-turbo"). |
| temperature | Controls randomness of output (0.0 = deterministic, 1.0 = creative). |
| timeout | Maximum response time (seconds). |
| max_tokens | Maximum number of tokens in the response. |
| stop | Stop sequences to halt generation. |
| max_retries | Maximum retry attempts for failed requests. |
| api_key | API key for authentication. |
| base_url | API endpoint URL. |
| rate_limiter | BaseRateLimiter for controlling request rate. |

Important Notes:
Not all providers support all parameters.
Standard parameters are primarily enforced on models with dedicated integration packages.
V. Advanced Topics
A. Rate-Limiting
Strategies for handling rate limits:
Spacing out requests: Use rate_limiter parameter during initialization.
Handling rate limit errors: Use max_retries parameter.
Fallback to another model: Switch to a non-rate-limited model.
B. Caching
Caching chat model responses is complex due to the low likelihood of exact input repetition. Semantic caching (caching based on meaning) is an alternative, but introduces dependencies and accuracy concerns. Caching may be beneficial for frequently asked questions.
VI. LangChain Messages
LangChain provides a unified message format with several core message types:
SystemMessage: Provides instructions or context to the model. Support varies across providers.
HumanMessage: Represents user input (text or multimodal data). Example: model.invoke([HumanMessage(content="Hello, how are you?")])
AIMessage: Represents the model's response (text or tool calls). Example: ai_message = model.invoke([HumanMessage("Tell me a joke")])
Attributes: content (text or list of dictionaries for multimodal data), tool_calls, invalid_tool_calls, usage_metadata, id, response_metadata.
AIMessageChunk: Used for streaming responses. Supports + operator for aggregation.
ToolMessage: Represents the result of a tool invocation.
RemoveMessage: Used for managing chat history in LangGraph.
Legacy FunctionMessage: Legacy type; use ToolMessage instead.
VII. OpenAI Format
LangChain chat models accept messages in OpenAI's format as input, but they return LangChain messages:
Input Example:
chat_model.invoke([
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "I'm doing well, thank you for asking."},
    {"role": "user", "content": "Can you tell me a joke?"}
])

Output: Output from the model is in LangChain's message format; conversion to OpenAI format is required if needed.
VIII. Tool Calling, Structured Outputs, and Multimodality
These features are briefly mentioned but detailed explanations are deferred to separate guides within the LangChain documentation. In short:
Tool Calling: Enables LLMs to interact with external services.
Structured Outputs: Allows responses in specific formats (e.g., JSON).
Multimodality: Supports input/output of data types beyond text (limited support currently).
IX. Context Window
The context window limits the input size (measured in tokens) a model can process. Exceeding the limit can cause errors. Managing conversation history within the context window is crucial for coherent dialogues.
This structured summary provides a comprehensive overview of LangChain's chat model capabilities. Refer to the original documentation for detailed explanations and further examples.



Structured Summary of the Provided Text
Introduction to Chat History
Chat History:
A record of the conversation between the user and the chat model.
Maintains context and state throughout the conversation.
Each message is associated with a specific role: "user", "assistant", "system", or "tool".
Conversation Patterns
Typical Conversation Flow:
System Message: Sets the context for the conversation.
User Message: Contains the user's input.
Assistant Message: Contains the model's response.
Conversation Patterns:
Back-and-Forth Conversation: Alternating messages between the user and the assistant.
Agentic Workflow: The assistant invokes tools to perform specific tasks.
Managing Chat History
Key Guidelines:
Input Size Limit: Manage chat history to avoid exceeding the context window.
Conversation Structure:
The first message should be a "user" or "system" message.
The last message should be a "user" or "tool" message.
A "tool" message should follow an "assistant" message requesting the tool invocation.
Tip:
Understanding correct conversation structure is essential for implementing memory in chat models.
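The structural guidelines above can be expressed as a small checker over OpenAI-style message dicts. This is a sketch under assumptions: `validate_history` is a hypothetical helper, and allowing one tool message to follow another (for an assistant turn that requested several tools) is an interpretation added for the example.

```python
from typing import Any, Dict, List

Message = Dict[str, Any]


def validate_history(messages: List[Message]) -> List[str]:
    """Return a list of structural problems; an empty list means none found."""
    problems: List[str] = []
    if not messages:
        return ["history is empty"]

    if messages[0]["role"] not in ("user", "system"):
        problems.append("first message must be 'user' or 'system'")
    if messages[-1]["role"] not in ("user", "tool"):
        problems.append("last message must be 'user' or 'tool'")

    for index, message in enumerate(messages):
        if message["role"] != "tool":
            continue
        previous = messages[index - 1] if index > 0 else None
        follows_request = (
            previous is not None
            and previous["role"] == "assistant"
            and bool(previous.get("tool_calls"))
        )
        # Assumption: several tool results may follow one assistant request.
        follows_tool = previous is not None and previous["role"] == "tool"
        if not (follows_request or follows_tool):
            problems.append(f"tool message at index {index} has no tool request")
    return problems


good = [
    {"role": "system", "content": "You are a calculator."},
    {"role": "user", "content": "What is 2 multiplied by 3?"},
    {"role": "assistant", "content": "", "tool_calls": [{"name": "multiply"}]},
    {"role": "tool", "content": "6"},
]
bad = [
    {"role": "assistant", "content": "Hi!"},
    {"role": "tool", "content": "6"},
]
print(validate_history(good))
print(validate_history(bad))
```

Running a check like this after trimming history helps catch cases where a tool result was kept but the assistant request that triggered it was dropped.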
Tool Abstraction in LangChain
Key Concepts:
Tools encapsulate a function and its schema.
Tools can be passed to chat models that support tool calling.
Tool Interface:
BaseTool Class: Subclass of the Runnable Interface.
Key Attributes:
name: The name of the tool.
description: A description of what the tool does.
args: JSON schema for the tool's arguments.
Key Methods:
invoke: Invokes the tool with the given arguments.
ainvoke: Asynchronously invokes the tool.
Creating Tools Using the @tool Decorator
Code Snippet:
from langchain_core.tools import tool

@tool
def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b

Usage:
Direct Invocation:
multiply.invoke({"a": 2, "b": 3})


Inspecting Tool Schema:
print(multiply.name) # multiply
print(multiply.description) # Multiply two numbers.
print(multiply.args)
# {
# 'type': 'object',
# 'properties': {'a': {'type': 'integer'}, 'b': {'type': 'integer'}},
# 'required': ['a', 'b']
# }
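
To give a feel for how a name, description, and argument schema can be derived from a plain function, here is a standard-library sketch. It is not how the @tool decorator is implemented; `infer_args_schema` and the `JSON_TYPES` mapping are assumptions for illustration, and only a few primitive types are handled.

```python
import inspect
from typing import Any, Callable, Dict, get_type_hints

# Assumption: only a few primitive types are mapped in this sketch.
JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}


def infer_args_schema(func: Callable[..., Any]) -> Dict[str, Any]:
    """Build a small JSON-schema-like dict from a function's type hints."""
    hints = get_type_hints(func)
    properties: Dict[str, Any] = {}
    required = []
    for name, parameter in inspect.signature(func).parameters.items():
        # Unknown or missing hints fall back to "string" in this sketch.
        properties[name] = {"type": JSON_TYPES.get(hints.get(name), "string")}
        if parameter.default is inspect.Parameter.empty:
            required.append(name)
    return {"type": "object", "properties": properties, "required": required}


def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b


schema = infer_args_schema(multiply)
description = inspect.getdoc(multiply)
print(multiply.__name__, description, schema)
```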


Note:
Direct interaction with tools might not be necessary when using pre-built LangChain components.
Configuring the Schema
Additional Options:
Modify name, description, or parse the function's doc-string to infer the schema.
Tool Artifacts
Code Snippet:
from typing import Any, Tuple

from langchain_core.tools import tool

@tool(response_format="content_and_artifact")
def some_tool(...) -> Tuple[str, Any]:
    """Tool that does something."""
    ...
    return 'Message for chat model', some_artifact

Special Type Annotations
Type Annotations:
InjectedToolArg: Hides the argument from the tool's schema.
RunnableConfig: Passes the RunnableConfig object to the tool.
InjectedState: Passes the overall state of the LangGraph graph to the tool.
InjectedStore: Passes the LangGraph store object to the tool.
Annotated: Adds a description to the argument exposed in the tool's schema.
Example:
from langchain_core.tools import tool, InjectedToolArg

@tool
def user_specific_tool(input_data: str, user_id: InjectedToolArg) -> str:
    """Tool that processes input data."""
    return f"User {user_id} processed {input_data}"

RunnableConfig Example:
from langchain_core.runnables import RunnableConfig
from langchain_core.tools import tool

@tool
async def some_func(..., config: RunnableConfig) -> ...:
    """Tool that does something."""
    # do something with config
    ...

await some_func.ainvoke(..., config={"configurable": {"value": "some_value"}})

Best Practices
Designing Tools:
Well-named, correctly-documented, and properly type-hinted tools are easier for models to use.
Design simple and narrowly scoped tools.
Use chat models that support tool-calling APIs.
Toolkits
Interface:
get_tools Method: Returns a list of tools.
Code Snippet:
# Initialize a toolkit
toolkit = ExampleToolkit(...)

# Get list of tools
tools = toolkit.get_tools()

Conceptual Overview of Tool Calling
Key Concepts:
Tool Creation: Use the @tool decorator.
Tool Binding: Connect the tool to a model that supports tool calling.
Tool Calling: The model decides when to call a tool.
Tool Execution: Execute the tool using the arguments provided by the model.
Recommended Usage:
# Tool creation
tools = [my_tool]
# Tool binding
model_with_tools = model.bind_tools(tools)
# Tool calling
response = model_with_tools.invoke(user_input)

Example:
def multiply(a: int, b: int) -> int:
    """Multiply a and b.

    Args:
        a: first int
        b: second int
    """
    return a * b

llm_with_tools = tool_calling_model.bind_tools([multiply])

Tool Calling Example:
result = llm_with_tools.invoke("What is 2 multiplied by 3?")
result.tool_calls
# [{'name': 'multiply', 'args': {'a': 2, 'b': 3}, 'id': 'xxx', 'type': 'tool_call'}]

Further Reading
Conceptual Guide on Tools
Model Integrations that Support Tool Calling
How-to Guide on Tool Calling
LangGraph Documentation on Using ToolNode
Best Practices for Tool Design
Models with explicit tool-calling APIs perform better.
Tools with well-chosen names and descriptions are easier for models to use.
Simple, narrowly scoped tools are easier for models to use.
Avoid asking the model to select from a large list of tools.
Conclusion
The text provides a comprehensive overview of managing chat history, creating and using tools in LangChain, and best practices for designing tools for chat models. It includes code snippets, examples, and tips to ensure a thorough understanding of the concepts.
Summary 2
Chat History and Conversation Patterns
Chat history is a sequential record of messages between a user and a chat model, maintaining context and state. Each message is tagged with a role ("user," "assistant," "system," or "tool").
Conversation Patterns:
Most conversations begin with a system message setting the context, followed by a user message and an assistant's response. The assistant might directly respond or invoke a tool for specific tasks. Conversations typically alternate between user-assistant exchanges and assistant-tool interactions (agentic workflow).
Managing Chat History:
Chat models have input size limits; therefore, managing and trimming chat history is crucial to avoid exceeding the context window. Maintain correct conversation structure:
The first message is either "user" or "system," followed by "user" then "assistant."
The last message is either "user" or a "tool" message (tool call result).
"tool" messages only follow "assistant" messages requesting tool invocation.
Tip: Correct conversation structure is essential for proper memory implementation in chat models.
Tools and the @tool Decorator
The LangChain tool abstraction associates a Python function with a schema defining its name, description, and arguments. Tools can be passed to chat models supporting tool calling.
Key Concepts:
Tools encapsulate functions and their schemas for chat models. The @tool decorator simplifies tool creation:
Infers tool name, description, and arguments (customization supported).
Supports tools returning artifacts (images, dataframes, etc.).
Hides input arguments from the schema using injected tool arguments.
Tool Interface (BaseTool):
The BaseTool class (subclass of Runnable interface) defines the tool interface:
name: Tool name.
description: Tool description.
args: JSON schema for tool arguments.
invoke(): Invokes the tool with given arguments.
ainvoke(): Asynchronous invocation (see LangChain async programming).
Creating Tools with @tool:
The recommended method is using the @tool decorator.
from langchain_core.tools import tool

@tool
def multiply(a: int, b: int) -> int:
   """Multiply two numbers."""
   return a * b

This code defines a multiply tool using the @tool decorator. The docstring provides the description.
API Reference: tool
Note: Other methods (subclassing BaseTool or using StructuredTool) exist but the @tool decorator is generally preferred.
Using the Tool:
multiply.invoke({"a": 2, "b": 3})

This invokes the multiply tool.
Inspecting the Tool:
print(multiply.name)       # multiply
print(multiply.description) # Multiply two numbers.
print(multiply.args) 
# {
# 'type': 'object', 
# 'properties': {'a': {'type': 'integer'}, 'b': {'type': 'integer'}}, 
# 'required': ['a', 'b']
# }

This demonstrates how to access the tool's properties.
Note: Direct tool interaction might not be needed with pre-built LangChain components but is valuable for debugging, testing, and custom LangGraph workflows.
Configuring the Schema:
The @tool decorator allows schema configuration (name, description, docstring parsing). Refer to the @tool API reference and the custom tools guide for details.
Tool Artifacts:
Tools' outputs can be fed back to the model. Sometimes, artifacts (custom objects, dataframes, images) should be accessible downstream but not exposed to the model.
from typing import Any, Tuple

from langchain_core.tools import tool

@tool(response_format="content_and_artifact")
def some_tool(...) -> Tuple[str, Any]:
    """Tool that does something."""
    ...
    return 'Message for chat model', some_artifact 

This example uses response_format="content_and_artifact" to handle artifacts. See "how to return artifacts from tools" for more details.
Special Type Annotations:
Special type annotations control tool runtime behavior:
InjectedToolArg: Arguments injected manually at runtime (hidden from schema).
RunnableConfig: Pass RunnableConfig object to the tool.
InjectedState: Pass LangGraph graph state.
InjectedStore: Pass LangGraph store object.
Annotated[..., "string literal"]: Adds a description to an argument exposed in the schema.
InjectedToolArg:
Hides arguments from the schema, allowing runtime injection.
from langchain_core.tools import tool, InjectedToolArg

@tool
def user_specific_tool(input_data: str, user_id: InjectedToolArg) -> str:
    """Tool that processes input data."""
    return f"User {user_id} processed {input_data}"

API Reference: tool | InjectedToolArg
See "how to pass runtime values to tools" for details on InjectedToolArg.
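The idea of hiding an argument from the model while still supplying it at runtime can be sketched without LangChain. Here `Injected` is a hypothetical marker class playing the role that InjectedToolArg plays, and `visible_args` and `run_tool` are helper names invented for the example.

```python
import inspect
from typing import Annotated, Any, Callable, Dict, get_args, get_origin, get_type_hints


class Injected:
    """Hypothetical marker; plays the role InjectedToolArg plays in LangChain."""


def visible_args(func: Callable[..., Any]) -> list:
    """List parameter names the model should see (injected ones are hidden)."""
    hints = get_type_hints(func, include_extras=True)
    names = []
    for name in inspect.signature(func).parameters:
        hint = hints.get(name)
        is_injected = get_origin(hint) is Annotated and Injected in get_args(hint)[1:]
        if not is_injected:
            names.append(name)
    return names


def user_specific_tool(input_data: str, user_id: Annotated[str, Injected]) -> str:
    """Tool that processes input data."""
    return f"User {user_id} processed {input_data}"


def run_tool(model_args: Dict[str, Any], user_id: str) -> str:
    # The application, not the model, supplies user_id at runtime.
    return user_specific_tool(**model_args, user_id=user_id)


print(visible_args(user_specific_tool))
print(run_tool({"input_data": "report"}, user_id="u42"))
```

Keeping values such as user ids out of the schema means the model can neither see nor choose them; the application merges them in just before execution.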
RunnableConfig:
Allows passing custom runtime values.
from langchain_core.runnables import RunnableConfig
from langchain_core.tools import tool

@tool
async def some_func(..., config: RunnableConfig) -> ...:
    """Tool that does something."""
    # do something with config
    ...

await some_func.ainvoke(..., config={"configurable": {"value": "some_value"}})

config is not part of the schema and is injected at runtime.
Note: Manual RunnableConfig propagation might be needed in Python 3.9/3.10 async environments (not an issue in Python 3.11 and later). See "Propagation RunnableConfig" for details.
InjectedState and InjectedStore: See respective documentation for details.
Best Practices for Tool Design
Use clear names, documentation, and type hints.
Design simple, narrowly scoped tools.
Use chat models with tool-calling APIs.
Toolkits
Toolkits group tools for specific tasks. They expose a get_tools() method returning a list of tools.
Tool Calling
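Many AI applications need to interact with systems such as databases or APIs that require input in a particular structure. Tool calling (sometimes called function calling) lets a model produce requests that match such a schema.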
Conceptual Overview:
Tool Creation: Create tools using @tool.
Tool Binding: Connect tools to a tool-calling model.
Tool Calling: The model decides when to call a tool.
Tool Execution: The tool executes using model-provided arguments.
Recommended Usage (Pseudo-code):
# Tool creation
tools = [my_tool]
# Tool binding
model_with_tools = model.bind_tools(tools)
# Tool calling 
response = model_with_tools.invoke(user_input)

Tool Creation (using @tool):
from langchain_core.tools import tool

@tool
def multiply(a: int, b: int) -> int:
    """Multiply a and b."""
    return a * b

API Reference: tool
Tool Binding:
LangChain provides a standardized interface (bind_tools()) for connecting tools to models.
model_with_tools = model.bind_tools(tools_list)

Example:
def multiply(a: int, b: int) -> int:
    """Multiply a and b.

    Args:
        a: first int
        b: second int
    """
    return a * b

llm_with_tools = tool_calling_model.bind_tools([multiply])

Tip: See the model integration page for tool-calling model providers.
Tool Calling:
The model decides when to use a tool based on input relevance.
Example:
result = llm_with_tools.invoke("Hello world!") # No tool call
result = llm_with_tools.invoke("What is 2 multiplied by 3?") # Tool call
result.tool_calls # Contains tool call information

Tool Execution:
Tools implement the Runnable interface; they can be invoked directly (tool.invoke(args)). LangGraph provides components (e.g., ToolNode) for tool invocation.
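As a rough picture of the execution step, the sketch below dispatches tool call dicts (shaped like the `tool_calls` entries shown earlier) to plain Python functions and wraps each result in a "tool" role message. The `TOOLS` registry and `execute_tool_calls` helper are assumptions for illustration; in practice a component such as LangGraph's ToolNode handles this.

```python
from typing import Any, Callable, Dict, List


def multiply(a: int, b: int) -> int:
    """Multiply a and b."""
    return a * b


# Registry mapping tool names to plain Python callables (hypothetical).
TOOLS: Dict[str, Callable[..., Any]] = {"multiply": multiply}


def execute_tool_calls(tool_calls: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Run each requested tool and return 'tool' role messages with results."""
    results = []
    for call in tool_calls:
        tool = TOOLS.get(call["name"])
        if tool is None:
            content = f"Unknown tool: {call['name']}"
        else:
            content = str(tool(**call["args"]))
        # tool_call_id links the result back to the request that produced it.
        results.append(
            {"role": "tool", "content": content, "tool_call_id": call["id"]}
        )
    return results


tool_calls = [
    {"name": "multiply", "args": {"a": 2, "b": 3}, "id": "xxx", "type": "tool_call"}
]
tool_messages = execute_tool_calls(tool_calls)
print(tool_messages)
```

The resulting messages would be appended to the chat history after the assistant message that requested the tools, so the model can use the results in its next turn.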
Best Practices for Tool Calling
Use models with explicit tool-calling APIs.
Use well-named and described tools.
Use simple, narrowly scoped tools.
Avoid large tool lists.

