# **LangChain - A Framework for developing applications powered by LLMs**

## **What's Covered?**
1. Introduction to LangChain
    - What is LangChain?
    - Why use LangChain?
    - Architecture
    - LangChain Packages
2. Building Blocks
    - Prompt Template
    - Chat Model
    - Output Parser
    - Chains
3. Case Study
    - Building a Chat Prompt Template
    - Building a Chat Model
    - Building an Output Parser
    - Putting it all together

## **Introduction to LangChain**

### **What is LangChain?**
LangChain is an open-source framework that acts like a toolkit and orchestration layer for building applications with Large Language Models.

It simplifies the process of chaining together different components (like LLMs, data sources, and tools) to create complex, intelligent applications. It's designed to make prompt engineering more efficient and allow developers to adapt language models to specific business contexts.

Imagine you want to build a truly smart application powered by a Large Language Model (LLM), like OpenAI's GPT or Google's Gemini. Simply sending a prompt to an LLM isn't usually enough for real-world scenarios. You often need to:

1. **Get relevant information:** Your LLM might need to access external data (your documents, a database, the internet) to answer a user's question accurately.
2. **Remember past conversations:** For a coherent dialogue, the LLM needs a memory of what was said before.
3. **Perform actions:** Sometimes, the LLM needs to "do" something, like search the web, call an API, or run a calculator.
4. **Structure its output:** You might want the LLM's response to be in a specific format (e.g., JSON, a bulleted list).

This is where LangChain comes in.

### **Why use LangChain?**
- **Abstraction:** It provides high-level abstractions over common LLM functionalities, so you don't have to worry about the low-level API calls for every LLM.
- **Modularity:** It offers modular components that you can mix and match to build various LLM-powered applications (chatbots, Q&A systems, content generators, agents).
- **Integration:** It provides numerous integrations with different LLMs, vector databases, document loaders, and other tools.
- **Orchestration:** It excels at orchestrating sequences of operations, allowing you to build multi-step workflows.

### **Architecture**
LangChain is a framework that consists of a number of packages.

<img src="images/langchain_stack_updated.png" width="500" height="600">

### **LangChain Packages**
**langchain-core**  
This package contains base abstractions for different components and ways to compose them together. The interfaces for core components like **chat models**, **vector stores**, **tools** and more are defined here. No third-party integrations are defined here. The dependencies are very lightweight.

**langchain**  
The main langchain package contains chains and retrieval strategies that make up an application's cognitive architecture. These are NOT third-party integrations. All chains, agents, and retrieval strategies here are NOT specific to any one integration, but rather generic across all integrations.

**Integration packages**  
Popular integrations have their own packages (e.g. langchain-openai, langchain-anthropic, etc) so that they can be properly versioned and appropriately lightweight.

**langchain-community**  
This package contains third-party integrations that are maintained by the LangChain community. Key integration packages are separated out (see above). This contains integrations for various components (chat models, vector stores, tools, etc). All dependencies in this package are optional to keep the package as lightweight as possible.

**langgraph**  
langgraph is an extension of langchain aimed at building robust and stateful multi-actor applications with LLMs by modeling steps as edges and nodes in a graph.

LangGraph exposes high level interfaces for creating common types of agents, as well as a low-level API for composing custom flows.

## **Building Blocks**

1. **Prompt Template**
2. **Chat Model**
3. **Output Parser**
4. **Chains**

<img src="images/langchain_LCEL.JPG">

### **Installation**

1. Install `langchain` 
```python
! pip install langchain
```
> The above command installs the following packages:
> - langchain
> - langchain-core
> - langchain-text-splitters
> - langsmith

2. Install `langchain-google-genai` and `langchain-openai`
```python
! pip install langchain-openai langchain-google-genai
```
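After installing, it can help to confirm which of these packages are actually present in the active environment. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical and not part of LangChain.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(package_names):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in package_names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in this environment.
            found[name] = None
    return found


packages = ["langchain", "langchain-core", "langchain-openai", "langchain-google-genai"]
for name, ver in installed_versions(packages).items():
    print(f"{name}: {ver or 'not installed'}")
```

A `None` entry suggests the corresponding `pip install` step above still needs to be run in this environment.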

In [1]:
# # Install langchain

# ! pip install langchain
# ! pip install langchain-openai langchain-google-genai

In [2]:
import langchain

print(langchain.__version__)

0.3.26


### **Chat Models**

LLMs handle various language operations such as translation, summarization, question answering, and content creation.

In [3]:
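# Setup API Key

# Read the key with a context manager so the file is closed,
# and strip the trailing newline that text files usually end with
with open('keys/.gemini.txt') as f:
    GOOGLE_API_KEY = f.read().strip()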
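Reading the key from a file works for a notebook, but an environment variable is a common alternative that keeps the key out of the project folder. The sketch below assumes an environment variable named `GOOGLE_API_KEY` and falls back to the key file used in this notebook; the helper `load_api_key` is hypothetical, not a LangChain function.

```python
import os
from pathlib import Path


def load_api_key(env_var="GOOGLE_API_KEY", fallback_path="keys/.gemini.txt"):
    """Return the key from the environment if set, else from the fallback file."""
    key = os.environ.get(env_var)
    if key:
        return key.strip()
    # Fall back to the key file; strip() removes any trailing newline.
    return Path(fallback_path).read_text().strip()
```

With this helper, the cell could read `GOOGLE_API_KEY = load_api_key()`, and the same notebook would run on machines where the key is exported in the shell instead of stored on disk.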

In [4]:
# Import ChatModel
from langchain_google_genai import ChatGoogleGenerativeAI

# Pass the standard parameters during initialization
chat_model = ChatGoogleGenerativeAI(api_key=GOOGLE_API_KEY, 
                                    model="gemini-2.0-flash", 
                                    temperature=1)

# Creating a Prompt
prompt = "What is LangChain? Explain in 200 words."

# Printing the output of model
model_response = chat_model.invoke(prompt)

model_response



AIMessage(content='LangChain is a framework designed to simplify the development of applications powered by large language models (LLMs).  It provides tools and abstractions to connect LLMs to various data sources and other components, enabling more complex and powerful applications than simply prompting an LLM directly.\n\nThink of it as a toolkit for building LLM-powered applications. It includes modules for things like:\n\n*   **Chains:** Sequences of calls to LLMs or other utilities.\n*   **Indexes:** Ways to structure data so LLMs can effectively access and retrieve relevant information.\n*   **Memory:** Mechanisms for LLMs to remember previous interactions.\n*   **Agents:** Allowing LLMs to use external tools and decide the best course of action.\n\nEssentially, LangChain lets you orchestrate LLMs with other tools to build more sophisticated and useful AI applications, like chatbots that can answer questions based on specific documents, or automated agents that can perform tasks 

### **Output Parser**

- LLMs primarily generate free-form text. However, in many applications, you need structured data (e.g., a list of items, a JSON object, a specific format). OutputParsers bridge this gap.
- These are the objects that take the raw string output from an LLM and transform it into a more usable Python data structure (e.g., a list, a dictionary, a Pydantic object).
- **Input** to the output parser is typically the **AIMessage** returned by a chat model.

**The output of a ChatModel is an AIMessage. However, it's often much more convenient to work with strings. Let's add a simple output parser to convert the chat message to a string.**

In [5]:
# Output Parsing

from langchain_core.output_parsers import StrOutputParser

output_parser = StrOutputParser()

In [6]:
output_parser.invoke(model_response)

'LangChain is a framework designed to simplify the development of applications powered by large language models (LLMs).  It provides tools and abstractions to connect LLMs to various data sources and other components, enabling more complex and powerful applications than simply prompting an LLM directly.\n\nThink of it as a toolkit for building LLM-powered applications. It includes modules for things like:\n\n*   **Chains:** Sequences of calls to LLMs or other utilities.\n*   **Indexes:** Ways to structure data so LLMs can effectively access and retrieve relevant information.\n*   **Memory:** Mechanisms for LLMs to remember previous interactions.\n*   **Agents:** Allowing LLMs to use external tools and decide the best course of action.\n\nEssentially, LangChain lets you orchestrate LLMs with other tools to build more sophisticated and useful AI applications, like chatbots that can answer questions based on specific documents, or automated agents that can perform tasks using external API
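`StrOutputParser` only extracts the text. To illustrate the idea of turning free-form text into Python data structures described above, here is a minimal standard-library sketch. The functions `parse_comma_separated` and `parse_json_object` are hypothetical stand-ins written for this example; they are not LangChain's parser classes.

```python
import json


def parse_comma_separated(text):
    """Split 'a, b, c' style model output into a list of trimmed items."""
    return [item.strip() for item in text.split(",") if item.strip()]


def parse_json_object(text):
    """Extract and parse the first {...} span found in the model output."""
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    return json.loads(text[start:end + 1])


print(parse_comma_separated("pandas, numpy , scikit-learn"))
print(parse_json_object('Sure! {"topic": "NLP", "level": 1}'))
```

Real model output is less predictable than these strings, so a production parser would need stricter validation and error handling.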

### **Prompt Template**

- Writing static strings as prompts quickly becomes unmanageable. We need a way to inject dynamic information. This is where prompt templates come in.
- These are the objects that help you construct prompts dynamically by accepting input variables. Think of them as blueprints for your prompts.
- There are two types: **PromptTemplate** and **ChatPromptTemplate**.
- **Input** to the Prompt Template should be a **dictionary** containing raw user inputs.
- **Output** of a Prompt Template will be a **prompt value** that wraps either a **string** or a **list of chat messages**.

In [7]:
from langchain_core.prompts import ChatPromptTemplate

prompt_template = ChatPromptTemplate(
                    messages = [
                    ("system", "You are a helpful AI Tutor with expertise in Data Science and Artificial Intelligence. "),
                    ("human", "What is {topic}? Explain in 200 words."),
                    ]
)

In [8]:
prompt_template.input_variables

['topic']

In [9]:
prompt_template.invoke({"topic": "LangChain"})

ChatPromptValue(messages=[SystemMessage(content='You are a helpful AI Tutor with expertise in Data Science and Artificial Intelligence. ', additional_kwargs={}, response_metadata={}), HumanMessage(content='What is LangChain? Explain in 200 words.', additional_kwargs={}, response_metadata={})])

In [10]:
print(prompt_template.invoke({"topic": "LangChain"}).to_string())

System: You are a helpful AI Tutor with expertise in Data Science and Artificial Intelligence. 
Human: What is LangChain? Explain in 200 words.
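Conceptually, a chat prompt template finds the placeholders in each message and fills them from the input dictionary. The sketch below imitates that behaviour with the standard library only; `input_variables` and `render_messages` are hypothetical helpers for illustration and do not reflect how `ChatPromptTemplate` is implemented.

```python
from string import Formatter


def input_variables(template):
    """Collect the {placeholder} names that appear in a template string."""
    return [field for _, field, _, _ in Formatter().parse(template) if field]


def render_messages(messages, values):
    """Fill each (role, template) pair with the supplied values."""
    return [(role, template.format(**values)) for role, template in messages]


messages = [
    ("system", "You are a helpful AI Tutor."),
    ("human", "What is {topic}? Explain in 200 words."),
]

print(input_variables(messages[1][1]))
print(render_messages(messages, {"topic": "LangChain"}))
```

The blueprint analogy holds here too: the `messages` list is written once, and only the dictionary of values changes between calls.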


### **Chains**

- A chain is simply a sequence of operations where the output of one step becomes the input of the next. This allows you to build more complex and intelligent workflows than a single LLM call could achieve.
- The `|` symbol chains the components together: it feeds the output of one component as input into the next component.
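To make the piping idea concrete, here is a toy sketch of how a `|` operator can compose steps in plain Python by overloading `__or__`. The `Step` class and the fake model are assumptions made for illustration; this is not LangChain's actual implementation.

```python
class Step:
    """A toy pipeline step; an illustration only, not a LangChain class."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # `a | b` builds a new step that runs a, then feeds its result to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Three stand-in components: prompt filling, a fake model, and text extraction.
fill_prompt = Step(lambda inputs: f"What is {inputs['topic']}?")
fake_model = Step(lambda prompt: {"content": f"(model reply to: {prompt})"})
to_text = Step(lambda message: message["content"])

toy_chain = fill_prompt | fake_model | to_text
print(toy_chain.invoke({"topic": "NLP"}))  # prints: (model reply to: What is NLP?)
```

Each step only needs to agree with its neighbours on input and output types, which is why a prompt template, a chat model, and an output parser can be combined in a single expression.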


In [11]:
# 1. Prompt Template
prompt = prompt_template.invoke({"topic": "Linear Regression"})
print(type(prompt))

# 2. Chat Model
model_response = chat_model.invoke(prompt)
print(type(model_response))

# 3. Output Parser
final_output = output_parser.invoke(model_response)
print(type(final_output))

<class 'langchain_core.prompt_values.ChatPromptValue'>
<class 'langchain_core.messages.ai.AIMessage'>
<class 'str'>


In [12]:
chain = prompt_template | chat_model | output_parser

user_input = {"topic": "NLP"}

chain.invoke(user_input)

"Okay, let's break down Natural Language Processing (NLP) in a concise way.\n\n**NLP, or Natural Language Processing, is a branch of Artificial Intelligence focused on enabling computers to understand, interpret, and generate human language.** Think of it as bridging the gap between how humans communicate and how computers process information.\n\nInstead of relying on structured data, NLP algorithms are designed to work with unstructured text and speech data. This involves a range of tasks including:\n\n*   **Understanding meaning:** Analyzing sentiment, identifying key entities, and discerning relationships between words and concepts.\n*   **Generating text:** Creating summaries, translating languages, writing different kinds of creative content, and answering questions in a comprehensive manner.\n*   **Speech recognition & synthesis:** Converting spoken language into text and vice versa.\n\nNLP combines computer science, linguistics, and machine learning techniques to build models th

## **Case Study**

**Create an AI Tutor app that uses a chat prompt template and a chat model internally to give Python implementation tutorials for Data Science topics**

In order to build this, we will first create the following three components:
1. Chat Prompt Template
2. Chat Model
3. Output Parser

In [13]:
## Building a Chat Prompt Template

from langchain_core.prompts import ChatPromptTemplate, SystemMessagePromptTemplate, HumanMessagePromptTemplate

# Constructing System Prompt
system_prompt = SystemMessagePromptTemplate.from_template("""
You are a friendly AI Tutor with expertise in Data Science and AI who tells step by step Python Implementation for topics asked by user.
""")

# Constructing Human Prompt
human_prompt = HumanMessagePromptTemplate.from_template("Tell me a python implementation for {topic_name}.")

# Compiling Chat Prompt
chat_prompt = ChatPromptTemplate(messages=[system_prompt, human_prompt])

chat_prompt

ChatPromptTemplate(input_variables=['topic_name'], input_types={}, partial_variables={}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=[], input_types={}, partial_variables={}, template='\nYou are a friendly AI Tutor with expertise in Data Science and AI who tells step by step Python Implementation for topics asked by user.\n'), additional_kwargs={}), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['topic_name'], input_types={}, partial_variables={}, template='Tell me a python implementation for {topic_name}.'), additional_kwargs={})])

In [14]:
## Building a Chat Model

from langchain_google_genai import ChatGoogleGenerativeAI

# Pass the standard parameters during initialization
chat_model = ChatGoogleGenerativeAI(api_key=GOOGLE_API_KEY, 
                                    model="gemini-2.0-flash", 
                                    temperature=1)




In [15]:
## Building an Output Parser

from langchain_core.output_parsers import StrOutputParser

output_parser = StrOutputParser()

In [16]:
## Chaining All the components together

chain = chat_prompt | chat_model | output_parser

In [17]:
## Calling the chain

user_input = {"topic_name": "Logistic Regression"}

output = chain.invoke(user_input)

print(output)

Certainly, let's break down the implementation of Logistic Regression in Python step by step.

**Understanding Logistic Regression**

Logistic Regression is a classification algorithm used when the dependent variable is categorical. It models the probability of a binary outcome (0 or 1) using a sigmoid function.

**Steps for Implementation**

1.  **Import Libraries:**
    *   `numpy` for numerical operations
    *   `sklearn.model_selection` for train/test split
    *   `sklearn.preprocessing` for feature scaling (optional but often recommended)

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler  # Optional
```

2.  **Define the Sigmoid Function:**
    *   The sigmoid function maps any real value to a value between 0 and 1.
    *   Formula: `sigmoid(z) = 1 / (1 + exp(-z))`

```python
def sigmoid(z):
    return 1 / (1 + np.exp(-z))
```

3.  **Define the Logistic Regression Model:**

```python
class Logistic