## 📋 Types of Models in LangChain

### **1. LLMs (Language Models)**
- Input: String
- Output: String
- Use case: Simple text completion (legacy, less common now)

### **2. Chat Models**
- Input: List of messages
- Output: Message
- Use case: **Most common** - conversational AI, structured interactions
- Examples: GPT-4, Claude, Gemini, Llama

### **3. Text Embedding Models**
- Input: Text
- Output: Vector of numbers (embeddings)
- Use case: Semantic search, RAG, similarity comparison
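
To make the similarity use case concrete, here is a minimal sketch of cosine similarity, a measure commonly used to compare embedding vectors. The three-dimensional vectors are invented for illustration (a real embedding model returns hundreds or thousands of dimensions), and `cosine_similarity` is a helper defined here, not a LangChain function.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy "embeddings": hypothetical values, not produced by any model
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
car = [0.0, 0.2, 0.9]

print(f"cat vs kitten: {cosine_similarity(cat, kitten):.3f}")
print(f"cat vs car:    {cosine_similarity(cat, car):.3f}")
```

Related texts should score closer to 1 than unrelated ones, which is the property semantic search and RAG retrieval rely on.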

In [1]:
from langchain_ollama import ChatOllama

In [2]:
llm = ChatOllama(
    model="llama3.2",
    temperature=0.8
)

In [3]:
response = llm.invoke("Explain machine learning to a 10 year old")

In [5]:
print(response.content)

Imagine you have a super smart robot friend who can learn new things from you. That's basically what machine learning is!

Machine learning is a way for computers to get better at doing things on their own, without being told exactly how to do it. It's like teaching your robot friend a game or a skill.

Here's an example: Let's say you want your robot friend to recognize pictures of different animals, like dogs and cats. At first, the robot might not be very good at it, but as you show it more and more pictures, it starts to learn what makes a dog look like a dog and what makes a cat look like a cat.

Over time, the robot gets better and better at recognizing animals, until it can do it all by itself! That's basically machine learning in action.

There are different types of machine learning, but one common type is called "supervised learning". This means that the computer is shown lots of examples of something (like pictures of animals) and then tries to figure out how to recognize it

In [6]:
llm = ChatOllama(
    model="llama3.2",

    # Creativity control
    temperature=0.7,  # Higher = more random; values above 1 are allowed

    # Length control
    num_predict=500,  # Maximum number of tokens to generate

    # Determinism
    seed=42,  # Fixed seed for reproducible sampling

    # Advanced sampling
    top_p=0.9,  # Nucleus sampling: keep the smallest token set with cumulative probability 0.9
    repeat_penalty=1.1,  # Penalize repetition (1.0 = no penalty)
)

# Note: max_tokens, frequency_penalty, presence_penalty and streaming are
# ChatOpenAI-style parameters and are not listed among ChatOllama's fields;
# check the API reference for your installed langchain_ollama version.
# Streaming needs no flag: call llm.stream(...) instead of llm.invoke(...).

### **Temperature Guide**

In [9]:
# Temperature = 0 → Near-deterministic, focused
llm_factual = ChatOllama(model="llama3.2", temperature=0)
# Good for: Math, coding, factual Q&A

# Temperature = 0.7 → Balanced
llm_balanced = ChatOllama(model="llama3.2", temperature=0.7)
# Good for: General conversation, explanations

# Temperature = 1.0+ → Creative
llm_creative = ChatOllama(model="llama3.2", temperature=1.2)
# Good for: Creative writing, brainstorming
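
Why temperature has this effect can be sketched in plain Python: the model's raw scores (logits) are divided by the temperature before the softmax, so low values sharpen the distribution and high values flatten it. The logits below are made up, and `softmax_with_temperature` is an illustrative helper, not part of LangChain or Ollama. A temperature of exactly 0 cannot be plugged into this formula; samplers typically treat it as "always pick the top token".

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert logits to probabilities after scaling by 1 / temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0; treat 0 as a greedy argmax")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens
logits = [2.0, 1.0, 0.5]

for t in (0.2, 0.7, 1.2):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

At low temperature nearly all probability mass sits on the top-scoring token; at higher temperature the alternatives become more likely to be sampled.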

## 🔍 Understanding Model Response Objects

In [10]:
llm = ChatOllama(model="llama3.2")
response = llm.invoke("Hello!")

In [11]:
print(type(response))

<class 'langchain_core.messages.ai.AIMessage'>


In [12]:
print(response.content)

How can I assist you today?


In [13]:
print(response.response_metadata)

{'model': 'llama3.2', 'created_at': '2025-11-14T16:30:07.718005278Z', 'done': True, 'done_reason': 'stop', 'total_duration': 90156590, 'load_duration': 26800635, 'prompt_eval_count': 27, 'prompt_eval_duration': 9893106, 'eval_count': 8, 'eval_duration': 53180020, 'model_name': 'llama3.2', 'model_provider': 'ollama'}
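
The timing fields in this metadata can be turned into a throughput figure. The sketch below copies a few values from the output above and assumes, as the Ollama API documentation states, that durations are reported in nanoseconds. `tokens_per_second` is a helper written for this example, not a LangChain function.

```python
def tokens_per_second(metadata: dict) -> float:
    """Generated tokens divided by generation time (durations in nanoseconds)."""
    eval_seconds = metadata["eval_duration"] / 1e9
    if eval_seconds <= 0:
        raise ValueError("eval_duration must be positive")
    return metadata["eval_count"] / eval_seconds


# Subset of the response_metadata printed above
metadata = {
    "prompt_eval_count": 27,     # tokens in the prompt
    "eval_count": 8,             # tokens generated
    "eval_duration": 53180020,   # time spent generating, in ns
    "total_duration": 90156590,  # end-to-end time, in ns
}

rate = tokens_per_second(metadata)
total_ms = metadata["total_duration"] / 1e6
print(f"{rate:.1f} tokens/s, {total_ms:.1f} ms total")
```

With a live model, the same function can be called on `response.response_metadata` directly.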


## 🚀 Async Support

In [14]:
import asyncio
from langchain_ollama import ChatOllama

In [16]:
async def async_example():
    llm = ChatOllama(model="llama3.2")

    # Single async call
    response = await llm.ainvoke("What is async programming?")
    print(response.content)

    # Batch async calls (run concurrently, not one after another)
    prompts = [
        "What is Python?",
        "What is JavaScript?",
        "What is Rust?",
    ]

    responses = await llm.abatch(prompts)
    for prompt, resp in zip(prompts, responses):
        print(f"\n{prompt}")
        print(resp.content)

# Top-level await works in Jupyter; in a plain script use asyncio.run(async_example())
await async_example()

Async (Asynchronous) Programming is a paradigm that allows for non-blocking, concurrent execution of tasks. It enables multiple tasks to run simultaneously without waiting for each other, improving the overall performance and responsiveness of an application.

**What is asynchronous?**

In traditional synchronous programming, a task completes one after another, where each task waits for the previous one to finish before starting. In contrast, asynchronous programming allows tasks to execute concurrently, allowing them to overlap in time. This means that while one task is executing, another task can start running without waiting for the first one to complete.

**Key characteristics of async programming:**

1. **Non-blocking**: Async tasks do not block or wait for each other to finish.
2. **Concurrent execution**: Multiple tasks can run simultaneously.
3. **Event-driven**: Async programming is often driven by events, such as user input, network requests, or timer events.

**Benefits of a
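
The benefit of `abatch` comes from overlapping the time spent waiting on each request. The sketch below illustrates that idea with `asyncio.gather`, using `asyncio.sleep` as a stand-in for model latency. `fake_llm_call` is a hypothetical placeholder, and this is an illustration of the concept rather than a description of how `abatch` is implemented. It uses `asyncio.run` so that it works as a plain script; in a notebook, use `await main()` instead.

```python
import asyncio
import time


async def fake_llm_call(prompt: str, delay: float = 0.2) -> str:
    """Pretend to call a model: wait without blocking, then return a reply."""
    await asyncio.sleep(delay)
    return f"answer to: {prompt}"


async def main() -> tuple[list[str], float]:
    prompts = ["What is Python?", "What is JavaScript?", "What is Rust?"]
    start = time.perf_counter()
    # All three coroutines wait at the same time; results keep prompt order
    results = await asyncio.gather(*(fake_llm_call(p) for p in prompts))
    elapsed = time.perf_counter() - start
    return list(results), elapsed


results, elapsed = asyncio.run(main())
for line in results:
    print(line)
print(f"elapsed: {elapsed:.2f}s")
```

Three 0.2-second waits finish in roughly the time of one, whereas awaiting them one after another would take about 0.6 seconds.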

## 🌊 Streaming Responses

In [17]:
for chunk in llm.stream("Write a short poem about Python programming."):
    print(chunk.content, end="", flush=True)
print()

Here's a short poem:

In the land of code, where variables roam,
Python reigns supreme, in its coding home.
Indentation is key, to logic so fine,
A syntax so gentle, it's truly divine.

Indians and foreigners alike unite,
To write with ease, through day and night.
Loops and lists, they dance with glee,
In this world of code, where Python sets me free.
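
When the full text is also needed after streaming, for logging or further processing, the chunks can be collected as they arrive. The sketch below uses a hypothetical generator, `fake_stream`, in place of `llm.stream(...)` so that it runs without a model. With a real chat model, the loop would append `chunk.content` rather than the bare string.

```python
from typing import Iterator


def fake_stream(text: str, chunk_size: int = 4) -> Iterator[str]:
    """Yield the text in small pieces, imitating token-by-token streaming."""
    for i in range(0, len(text), chunk_size):
        yield text[i:i + chunk_size]


pieces: list[str] = []
for piece in fake_stream("Python sets me free."):
    print(piece, end="", flush=True)  # show each piece as soon as it arrives
    pieces.append(piece)              # keep it for later use
print()

full_text = "".join(pieces)
```

Joining once at the end avoids rebuilding the string on every chunk.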
