# 2.3 LLMs vs Chat Models in LangChain

## 🎯 Learning Objectives

This notebook explores the **two types of language model interfaces** in LangChain:

1. **LLMs (Text Completion)** - Basic text-in, text-out models
2. **Chat Models** - Message-based conversational models

## 🔑 Key Differences

| Feature | LLMs | Chat Models |
|---------|------|-------------|
| **Input** | String | List of Messages |
| **Output** | String | AIMessage |
| **Examples** | `gpt-3.5-turbo-instruct` | `gpt-4o-mini`, `gemini-1.5-flash` |
| **Use Case** | Simple completions | Conversations, complex tasks |

## 📚 What You'll Learn

- Access different LLM providers (OpenAI, Google, HuggingFace)
- Understand the LLM vs ChatModel distinction
- Use message types for conversational prompting
- Build multi-turn conversations with history

---

## 📦 Install Dependencies (Run Once)

Uncomment and run the cells below if you haven't installed the required packages.

In [None]:
# ============================================================================
# INSTALLATION (Uncomment if needed)
# ============================================================================
# !pip install -qq langchain==0.3.11
# !pip install -qq langchain-openai==0.2.12
# !pip install -qq langchain-community==0.3.11
# !pip install -qq huggingface_hub==0.30.2
# !pip install -qq langchain-core==0.3.63

In [None]:
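# Skip this cell if you only plan to use OpenAI models.
# It is needed for running open LLMs from HuggingFace locally.
# !pip install -qq transformers==4.46.3

In [None]:
# !pip install -qq langchain_google_genai
# Also imported later in this notebook (versions not pinned here):
# !pip install -qq langchain-huggingface python-dotenv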

## 🔐 Environment Setup

In [None]:
# ============================================================================
# LOAD API KEYS FROM .env FILE
# ============================================================================
import os 
from dotenv import load_dotenv

load_dotenv()
print("✅ Environment variables loaded!")
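
`load_dotenv()` does not raise when a key is absent, so it can help to confirm that the keys are actually set before calling any provider. A minimal sketch, assuming the variable names `OPENAI_API_KEY`, `GOOGLE_API_KEY`, and `HUGGINGFACEHUB_API_TOKEN` (the last two are named later in this notebook). `missing_keys` is a hypothetical helper, not part of LangChain or python-dotenv:

```python
import os

# Assumed variable names; adjust to the providers you actually use.
REQUIRED_KEYS = ["OPENAI_API_KEY", "GOOGLE_API_KEY", "HUGGINGFACEHUB_API_TOKEN"]


def missing_keys(required, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


missing = missing_keys(REQUIRED_KEYS)
if missing:
    print("Missing keys:", ", ".join(missing))
else:
    print("All expected keys are set.")
```

Passing a plain dictionary as `env` makes the helper easy to check without touching the real environment.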

---

## 📖 Understanding Model I/O in LangChain

In LangChain, the **language model** is the central component. LangChain provides a unified interface to work with different types of models.

### Key Components

| Type | Definition | Input | Output |
|------|------------|-------|--------|
| **LLMs** | Pure text completion models | Text string | Text string |
| **Chat Models** | Message-based conversational models | List of messages | AIMessage |

> **Note:** In practice, Chat Models are preferred for most applications because they support system prompts, conversation history, and structured outputs.


## 🏢 LLM Providers

LangChain provides a **unified API** for many LLM providers:

| Provider | Commercial | Chat Model Class | LLM Class |
|----------|------------|------------------|-----------|
| OpenAI | Yes | `ChatOpenAI` | `OpenAI` |
| Google | Yes | `ChatGoogleGenerativeAI` | - |
| HuggingFace | Open Source | `ChatHuggingFace` | `HuggingFaceEndpoint` |
| Anthropic | Yes | `ChatAnthropic` | - |

The consistent API means you can **swap providers with minimal code changes**.

---

## 🤖 Part 1: Accessing OpenAI Models

### Option A: Using OpenAI as an LLM (Text Completion)

The `OpenAI` class provides access to **text completion models** (like `gpt-3.5-turbo-instruct`). These models:
- Take a string prompt
- Return a string response
- Have no notion of roles or turns, so any conversation context must be packed into the prompt string

> **Note:** This is the older API. Most new applications should use **ChatOpenAI** instead.

In [None]:
# ============================================================================
# OPENAI LLM (Text Completion Model)
# ============================================================================
# The OpenAI class uses the older completion API
# - model_name: The model to use (instruct models only)
# - temperature: 0 = deterministic, 1 = more creative
# ============================================================================

from langchain_openai import OpenAI, ChatOpenAI

# Initialize the LLM (text completion model)
llm_openai = OpenAI(model_name="gpt-3.5-turbo-instruct", temperature=0)
print("✅ OpenAI LLM initialized: gpt-3.5-turbo-instruct")

In [None]:
# ============================================================================
# USING THE LLM
# ============================================================================
# Input: A string prompt
# Output: A string response (notice: NOT an AIMessage object)
# ============================================================================

prompt = """Explain what is Generative AI in 3 bullet points"""
response = llm_openai.invoke(prompt)

print("📝 Prompt:", prompt)
print("\n🤖 Response (type: string):")
print(response)
print(f"\n📊 Response type: {type(response)}")

### Option B: Using OpenAI as a Chat Model (Recommended)

The `ChatOpenAI` class provides access to **chat models** (like `gpt-3.5-turbo`, `gpt-4o-mini`). These models:
- Support system prompts to set behavior
- Accept prior turns as part of the message list, so conversation history can be carried across calls
- Return structured `AIMessage` objects

In [None]:
# ============================================================================
# OPENAI CHAT MODEL (Recommended)
# ============================================================================
# ChatOpenAI uses the chat completions API
# - Supports newer models like gpt-3.5-turbo, gpt-4o-mini, gpt-4o
# - Returns AIMessage objects with metadata
# ============================================================================

from langchain_openai import ChatOpenAI

# Initialize the Chat Model
chatgpt = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)

prompt = """Explain what is Generative AI in 3 bullet points"""
response = chatgpt.invoke(prompt)

print("📝 Prompt:", prompt)
print("\n🤖 Response (type: AIMessage):")
print(response.content)
print(f"\n📊 Response type: {type(response)}")

---

## 🌐 Part 2: Accessing Google Gemini

Google's Gemini models are accessed via the `ChatGoogleGenerativeAI` class.

In [None]:
# ============================================================================
# GOOGLE GEMINI CHAT MODEL
# ============================================================================
# Requires: GOOGLE_API_KEY in your .env file
# Available models: gemini-1.5-flash, gemini-1.5-pro, gemini-2.0-flash
# ============================================================================

from langchain_google_genai import ChatGoogleGenerativeAI

gemini = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0)
print("✅ Google Gemini initialized: gemini-1.5-flash")

In [None]:
# Same API as ChatOpenAI - LangChain's unified interface!
prompt = """Explain what is Generative AI in 3 bullet points"""
response = gemini.invoke(prompt)

print("🤖 Gemini Response:")
print(response.content)

---

## 🏠 Part 3: Accessing Local/Open LLMs with HuggingFace

HuggingFace offers **500k+ models** including **90k+ open LLMs**. You can access them in two ways:

| Method | Class | Pros | Cons |
|--------|-------|------|------|
| **Local** | `HuggingFacePipeline` | Privacy, no API costs | Requires GPU |
| **Remote** | `HuggingFaceEndpoint` | No GPU needed | API rate limits |

> **Note:** Local inference requires the `transformers` and `torch` (PyTorch) packages.

In [None]:
# ============================================================================
# HUGGINGFACE LOCAL PIPELINE (Commented - requires GPU)
# ============================================================================
from langchain_huggingface import HuggingFacePipeline

# The code below is commented because it requires significant compute resources

In [None]:
# local_params = {
#     "do_sample": False,         # greedy decoding (equivalent to temperature = 0)
#     "return_full_text": False,  # don't echo the input prompt
#     "max_new_tokens": 1000,     # upper bound on generated tokens
# }

# local_llm = HuggingFacePipeline.from_model_id(
#     model_id="microsoft/Phi-3.5-mini-instruct",
#     task="text-generation",
#     pipeline_kwargs=local_params,
#     # device=0,  # selects the first GPU (e.g. on Colab); adjust for your own machine
# )

In [None]:
# local_llm

In [None]:
# # Chat templates are model-specific. The format below is Gemma's and applies only
# # when a Gemma model is loaded; for other models (such as the Phi-3.5 model
# # configured above), use the template from that model's card instead.
# # Details: https://huggingface.co/google/gemma-1.1-2b-it#chat-template
# gemma_prompt = """<bos><start_of_turn>user\n""" + prompt + """\n<end_of_turn>
# <start_of_turn>model
# """
# print(gemma_prompt)

In [None]:
# response = local_llm.invoke(gemma_prompt)
# print(response)
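
Prompt templates like the Gemma one above are model-specific, so it can be convenient to wrap the formatting in a small function. A sketch that mirrors the Gemma-style string built in the commented cell (`format_gemma_prompt` is a hypothetical helper; consult the model card for the exact template of whichever model you load):

```python
def format_gemma_prompt(user_text: str) -> str:
    """Wrap user text in the Gemma-style turn markers used in this notebook."""
    return (
        "<bos><start_of_turn>user\n"
        + user_text
        + "\n<end_of_turn>\n"
        + "<start_of_turn>model\n"
    )


formatted = format_gemma_prompt("Explain what is Generative AI in 3 bullet points")
print(formatted)
```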

### Using HuggingFace Endpoint (Remote Inference)

For users without GPUs, `HuggingFaceEndpoint` provides remote inference through HuggingFace's servers.

In [None]:
# ============================================================================
# HUGGINGFACE ENDPOINT (Remote Inference)
# ============================================================================
# Access HuggingFace models without local GPU
# Requires: HUGGINGFACEHUB_API_TOKEN in your .env file
# ============================================================================

from langchain_huggingface import HuggingFaceEndpoint

repo_id = "microsoft/Phi-3.5-mini-instruct"

phi3_params = {
    "wait_for_model": True,      # Wait if model is loading
    "do_sample": False,          # Greedy decoding (temperature = 0)
    "return_full_text": False,   # Don't echo the input prompt
    "max_new_tokens": 1000,      # Max tokens in response
}

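# Note: Uncomment and add your token to use.
# temperature=0.5 and do_sample=False express conflicting intent (sampling vs
# greedy decoding); which one wins depends on the backend, so set only one.
# llm = HuggingFaceEndpoint(
#     repo_id=repo_id,
#     temperature=0.5,
#     huggingfacehub_api_token=os.getenv("HUGGINGFACEHUB_API_TOKEN"),
#     **phi3_params
# )
print(f"📋 HuggingFace Endpoint configured for: {repo_id}")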

In [None]:
# ============================================================================
# WRAPPING AS A CHAT MODEL
# ============================================================================
# ChatHuggingFace wraps a HuggingFace model to use the Chat Model interface
# ============================================================================

from langchain_huggingface import ChatHuggingFace

# Uncomment if you have the llm variable defined above.
# model_id should match the model behind llm so the right chat template is used:
# chat_phi3 = ChatHuggingFace(llm=llm, model_id=repo_id)
print("💡 ChatHuggingFace wraps a HuggingFace LLM so it can be used as a Chat Model")

In [None]:
# ============================================================================
# DISPLAY RESPONSE (from earlier Gemini call)
# ============================================================================
print("🤖 Previous Gemini response:")
print(response.content)

---

## 💬 Part 4: Message Types for Conversational Prompting

Chat Models process **lists of messages**, where each message has a **role** and **content**.

### Message Types in LangChain

| Message Type | Role | Purpose | Example |
|--------------|------|---------|---------|
| `SystemMessage` | System | Set AI behavior | "You are a helpful assistant" |
| `HumanMessage` | User | User input | "What is AI?" |
| `AIMessage` | Assistant | AI response | "AI stands for..." |

### Message Properties

- **content**: The actual text (string or list for multi-modal)
- **additional_kwargs**: Extra info like `tool_calls` for function calling
- **response_metadata**: Metadata about the response


### Example: Building a Conversation with ChatGPT

Let's build a multi-turn conversation using message types.

In [None]:
# ============================================================================
# INITIALIZE CHAT MODEL FOR CONVERSATION
# ============================================================================

from langchain_openai import ChatOpenAI

chatgpt = ChatOpenAI(model_name="gpt-4o-mini", temperature=0)
print("✅ Chat model ready: gpt-4o-mini")

In [None]:
# ============================================================================
# METHOD 1: Dictionary Format (OpenAI-style)
# ============================================================================
# You can pass messages as dictionaries with "role" and "content" keys
# This is familiar if you've used the OpenAI API directly
# ============================================================================

from langchain_core.messages import HumanMessage, AIMessage, SystemMessage

prompt = """Can you explain what is Generative AI in 3 bullet points?"""
sys_prompt = """Act as a helpful assistant and give meaningful examples in your responses."""

# Dictionary format (OpenAI-compatible)
message = [
    {"role": "system", "content": sys_prompt},
    {"role": "user", "content": prompt}
]

response = chatgpt.invoke(message)
print("📝 Using dictionary format for messages")

In [None]:
# View the message structure
print("📋 Message structure (dictionary format):")
for i, msg in enumerate(message):
    print(f"  [{i}] {msg['role']}: {msg['content'][:50]}...")

In [None]:
print("🤖 AI Response:")
print(response.content)

In [None]:
# View the full AIMessage object with metadata
print("📊 Full AIMessage object:")
print(f"  Type: {type(response)}")
print(f"  Content length: {len(response.content)} chars")
print(f"  Token usage: {response.usage_metadata}")

In [None]:
# ============================================================================
# METHOD 2: LangChain Message Objects (Recommended)
# ============================================================================
# Using Message classes provides better type safety and IDE support
# ============================================================================

message = [
    SystemMessage(content=sys_prompt),
    HumanMessage(content=prompt)
]

response = chatgpt.invoke(message)
print("📝 Using LangChain Message objects:")
print(response.content)

In [None]:
# View message objects
print("📋 Message objects:")
for i, msg in enumerate(message):
    print(f"  [{i}] {type(msg).__name__}")

In [None]:
# ============================================================================
# BUILDING CONVERSATION HISTORY
# ============================================================================
# To maintain context, append the AI response and new user message to the list
# This is how you build multi-turn conversations!
# ============================================================================

# Add the AI's response to the conversation history
message.append(response)

# Add a new user question
follow_up = """What did we discuss so far?"""
message.append(HumanMessage(content=follow_up))

print("📋 Conversation history:")
for i, msg in enumerate(message):
    role = type(msg).__name__.replace("Message", "")
    content_preview = (msg.content[:40] + "...") if len(msg.content) > 40 else msg.content
    print(f"  [{i}] {role}: {content_preview}")

In [None]:
# ============================================================================
# CONTINUE THE CONVERSATION
# ============================================================================
# The model now has full context of the conversation!
# ============================================================================

response = chatgpt.invoke(message)

print("🤖 AI Response (with conversation context):")
print(response.content)

# ============================================================================
# 📝 KEY TAKEAWAYS FROM THIS NOTEBOOK:
# ============================================================================
# 1. LLMs: Text completion models (string in → string out)
# 2. Chat Models: Message-based models (messages in → AIMessage out)
# 3. Chat Models are preferred for most applications
# 4. LangChain provides a unified API across providers (OpenAI, Google, HuggingFace)
# 5. Message types: SystemMessage, HumanMessage, AIMessage
# 6. Build conversations by appending messages to a list
# ============================================================================