# Comprehensive Guide to Using Mistral AI with LangChain

- Author: Martin Fockedey, with the help of Copilot
- This is based on [LangChain Open Tutorial](https://github.com/LangChain-OpenTutorial/LangChain-OpenTutorial)

## Overview

This comprehensive tutorial demonstrates how to effectively use **Mistral AI models** with **LangChain**. You'll learn to:

- Set up and configure Mistral AI chat models
- Create and use prompt templates with Mistral
- Build LCEL (LangChain Expression Language) chains
- Implement streaming, batch processing, and async operations
- Execute parallel chains for efficient processing

### Table of Contents

- [Overview](#overview)
- [Environment Setup](#environment-setup)
- [LangChain](#langchain)
- [Mistral AI Models Overview](#mistral-ai-models-overview)
- [Basic Usage of ChatMistralAI](#basic-usage-of-chatmistralai)
- [Building Chains with LCEL](#building-chains-with-lcel)
- [LCEL Interfaces](#lcel-interfaces)
- [Parallel Execution](#parallel-execution)
- [Using Mistral AI Without LangChain](#using-mistral-ai-without-langchain)
- [Summary](#summary)

### References

- [Mistral AI Documentation](https://docs.mistral.ai/)
- [LangChain Mistral Integration](https://python.langchain.com/docs/integrations/chat/mistralai)
- [LangChain Expression Language](https://python.langchain.com/docs/expression_language/)

---

## Environment Setup
Install the required packages and load your Mistral API key before running the examples.

**[Note]**
- An API key is available on your Teams group channel.
- Store your API key in a `.env` file as `MISTRAL_API_KEY`.
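A missing or empty key usually surfaces later as an authentication error, so it can help to check for it right after loading the `.env` file. The sketch below uses only the standard library; `require_api_key` is a hypothetical helper name, not part of `python-dotenv` or LangChain.

```python
import os


def require_api_key(name: str = "MISTRAL_API_KEY") -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file and call load_dotenv() first."
        )
    return value


try:
    key = require_api_key()
    # Never print the key itself; its length is enough to confirm it was loaded.
    print(f"MISTRAL_API_KEY loaded ({len(key)} characters)")
except RuntimeError as err:
    print(f"Configuration problem: {err}")
```

Run this after `load_dotenv(override=True)` so the variables from `.env` are already in the environment.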


In [1]:
%pip install -q python-dotenv langchain_mistralai

Note: you may need to restart the kernel to use updated packages.





In [4]:
# Configuration file to manage the API KEY as an environment variable
from dotenv import load_dotenv

# Load API KEY information
load_dotenv(override=True)

True

## LangChain
We use LangChain because it provides a unified, modular framework for building complex AI applications that can **easily switch between different LLM providers without rewriting code**. It offers **powerful abstractions** like chains, agents, and memory that enable developers to create sophisticated workflows such as conversational AI with context, retrieval-augmented generation (RAG), and autonomous agents that can use tools. Additionally, its **composable architecture (LCEL)** makes it easy to build, test, and maintain complex AI pipelines by breaking them into reusable components that can be combined in different ways.

## Mistral AI Models Overview

Mistral AI offers several powerful language models:

### Text-Only Models

| Model | Description | Use Case |
|-------|-------------|----------|
| `mistral-large-latest` | Most capable text model | Complex reasoning, coding |
| `mistral-small-latest` | Efficient and cheaper text model | General tasks, cost-effective |
| `open-mistral-7b` | Open-source base model | Fine-tuning, experimentation |
| `open-mixtral-8x7b` | Mixture of experts | High performance tasks |
| `open-mixtral-8x22b` | Larger MoE model | Advanced reasoning |

### Multimodal Models (Text + Images)

| Model | Description | Use Case |
|-------|-------------|----------|
| `pixtral-12b-2409` | Multimodal model | Image + text tasks |
| `pixtral-large-latest` | Larger multimodal | Complex vision tasks |

**Note**: Mistral models do NOT support:
- ❌ `logprobs` (token log probabilities)
- ❌ Some OpenAI-specific features

For this tutorial, we'll primarily use `mistral-small-latest` for text-based tasks.

## Basic Usage of ChatMistralAI

Let's start by creating a basic ChatMistralAI object and making simple queries.

### Parameters:

- `temperature`: Controls randomness (0.0 = deterministic, 1.0 = sample from the output distribution, 2.0 = sample from a modified output distribution)
- `model`: Specifies which Mistral model to use
- `max_tokens`: Maximum number of tokens in the response
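To build intuition for `temperature`, the sketch below shows the textbook way temperature reshapes a probability distribution: scores are divided by the temperature before a softmax. This is a general illustration with made-up scores, not a description of how Mistral's API samples internally.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0 here; 0 means 'always pick the top score'")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens.
demo_logits = [2.0, 1.0, 0.5]
for t in (0.1, 1.0, 2.0):
    probs = softmax_with_temperature(demo_logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

Low temperatures concentrate almost all probability on the top-scoring token, while higher temperatures spread it across the alternatives.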

In [None]:
from langchain_mistralai import ChatMistralAI

# Create the ChatMistralAI object
llm = ChatMistralAI(
    temperature=0.1,  # Low temperature for more focused responses
    model="mistral-small-latest", 
)

question = "What is the capital of France?"

print(f"[Answer]: {llm.invoke(question)}")

[Answer]: content='The capital of France is **Paris**.\n\nParis is known for its iconic landmarks such as the **Eiffel Tower**, **Louvre Museum**, and **Notre-Dame Cathedral**, as well as its rich history, culture, and cuisine. It is one of the most visited cities in the world.\n\nWould you like to know more about Paris or France? 😊' additional_kwargs={} response_metadata={'token_usage': {'prompt_tokens': 10, 'total_tokens': 86, 'completion_tokens': 76}, 'model': 'mistral-small-latest', 'finish_reason': 'stop'} id='run--94da3af0-5781-474c-a4c6-4f6b129d011d-0' usage_metadata={'input_tokens': 10, 'output_tokens': 76, 'total_tokens': 86}


### Response Format (AI Message)

When using the `ChatMistralAI` object, the response is returned as an AI Message. It contains the text generated by the model along with metadata that describes how the response was produced.

**Key Components of AI Message**
1. **`content`**  
   - **Definition:** The primary response text generated by the AI.  
   - **Example:** **"The capital of South Korea is Seoul."**
   - **Purpose:** This is the main part of the response that users interact with.

2. **`response_metadata`**  
   - **Definition:** Metadata about the response generation process.  
   - **Key Fields:**
     - **`model`:** Name of the model used (e.g., `"mistral-small-latest"`).
     - **`finish_reason`:** Reason the generation stopped (`stop` for normal completion).
     - **`token_usage`:** Token usage details:
       - **`prompt_tokens`:** Tokens used for the input query.
       - **`completion_tokens`:** Tokens used for the response.
       - **`total_tokens`:** Combined token count.

3. **`id`**  
   - **Definition:** A unique identifier for the response.  
   - **Purpose:** Useful for tracking or debugging specific interactions.
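As a small illustration of how these fields can be used, the sketch below inspects a plain dictionary copied from the `response_metadata` printed earlier in this tutorial. `check_response_metadata` is a hypothetical helper; it only assumes the key names visible in that output.

```python
# Copied from the response_metadata shown in the first example's output.
sample_metadata = {
    "token_usage": {"prompt_tokens": 10, "total_tokens": 86, "completion_tokens": 76},
    "model": "mistral-small-latest",
    "finish_reason": "stop",
}


def check_response_metadata(metadata: dict) -> dict:
    """Summarize a response_metadata dict, tolerating missing keys."""
    usage = metadata.get("token_usage", {})
    prompt = usage.get("prompt_tokens", 0)
    completion = usage.get("completion_tokens", 0)
    total = usage.get("total_tokens", 0)
    return {
        "model": metadata.get("model", "Unknown"),
        # Anything other than "stop" suggests the answer did not end naturally.
        "completed_normally": metadata.get("finish_reason") == "stop",
        "tokens_add_up": prompt + completion == total,
        "total_tokens": total,
    }


report = check_response_metadata(sample_metadata)
print(report)
```

With a live response, you would pass `response.response_metadata` instead of the sample dictionary.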

In [32]:
# Query content
question = "What is the capital of Japan?"

# Get full response
response = llm.invoke(question)
response

AIMessage(content="The capital of Japan is **Tokyo**. It is the largest city in Japan and serves as the country's political, economic, and cultural center. Tokyo is known for its bustling streets, modern technology, and rich history, including landmarks like the Imperial Palace, Shibuya Crossing, and historic temples.", additional_kwargs={}, response_metadata={'token_usage': {'prompt_tokens': 10, 'total_tokens': 72, 'completion_tokens': 62}, 'model': 'mistral-small-latest', 'finish_reason': 'stop'}, id='run--142e288b-a0c3-4993-86d1-689f1998c932-0', usage_metadata={'input_tokens': 10, 'output_tokens': 62, 'total_tokens': 72})

In [33]:
# Extract key components
content = response.content
model_name = response.response_metadata.get("model", "Unknown")
token_usage = response.response_metadata.get("token_usage", {})

# Print results
print(f"Response: {content}")
print(f"Model: {model_name}")
print(f"Token Usage: {token_usage}")

Response: The capital of Japan is **Tokyo**. It is the largest city in Japan and serves as the country's political, economic, and cultural center. Tokyo is known for its bustling streets, modern technology, and rich history, including landmarks like the Imperial Palace, Shibuya Crossing, and historic temples.
Model: mistral-small-latest
Token Usage: {'prompt_tokens': 10, 'total_tokens': 72, 'completion_tokens': 62}


### Streaming Output

The `stream()` method yields the response in small chunks as the model generates them, so you can display text before the full answer is ready.
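When streaming, you often want both the live display and the complete text afterwards. The sketch below shows that pattern with a stand-in generator, `fake_stream`, which is hypothetical and simply slices a string. With a real model, each chunk's text is read from its `.content` attribute instead.

```python
import time
from typing import Iterator


def fake_stream(text: str, delay: float = 0.0) -> Iterator[str]:
    """Stand-in for a model stream: yields the text four characters at a time."""
    for start in range(0, len(text), 4):
        time.sleep(delay)
        yield text[start:start + 4]


def consume_stream(chunks: Iterator[str]) -> str:
    """Print each chunk as it arrives and return the assembled text."""
    parts = []
    for chunk in chunks:
        print(chunk, end="", flush=True)
        parts.append(chunk)
    print()  # finish the line once the stream ends
    return "".join(parts)


full_text = consume_stream(fake_stream("Paris is the capital of France.", delay=0.01))
```

Collecting the chunks in a list and joining once at the end avoids repeated string concatenation on long responses.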

In [8]:
answer = llm.stream(
    "Please provide 5 beautiful tourist destinations in Europe along with brief descriptions!"
)

# Streaming real-time output
for token in answer:
    print(token.content, end="", flush=True)

Certainly! Here are five stunning tourist destinations in Europe, each offering unique beauty and cultural richness:

### 1. **Santorini, Greece**
   - Famous for its whitewashed buildings, blue-domed churches, and breathtaking sunsets over the Aegean Sea. The volcanic island offers luxury resorts, wine tours, and stunning cliffside views in Oia.

### 2. **Plitvice Lakes National Park, Croatia**
   - A UNESCO World Heritage Site featuring cascading turquoise lakes, waterfalls, and lush forests. The wooden walkways and boat rides make it a paradise for nature lovers.

### 3. **Hallstatt, Austria**
   - A picturesque lakeside village nestled in the Alps, known for its charming pastel houses, crystal-clear lake, and salt mines. It’s often called one of the most beautiful places on Earth.

### 4. **Cinque Terre, Italy**
   - A colorful coastal region in Liguria with five vibrant villages (Monterosso, Vernazza, Corniglia, Manarola, and Riomaggiore). Hiking trails, vineyards, and cliffside

## Building Chains with LCEL

LCEL (LangChain Expression Language) uses the pipe operator (`|`) to chain components together into data processing pipelines. This pattern offers several benefits:

- **Modularity**: Each component (prompt, model, parser) can be developed and tested independently.
- **Reusability**: Components can be mixed and matched across different chains.
- **Readability**: The pipe operator makes the data flow easy to follow.
- **Composability**: Complex workflows are built by combining simple chains.
- **Maintainability**: A component can be changed without touching the others, as long as its input and output types stay the same.
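One way to picture how `|` can chain components is Python's `__or__` operator overloading. The toy class below is a minimal sketch of that idea; `Step` and the three demo components are hypothetical and are not LangChain's actual `Runnable` implementation.

```python
class Step:
    """A tiny stand-in for a chain component; NOT LangChain's Runnable."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # `a | b` builds a new Step that runs a, then feeds its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Hypothetical components mirroring prompt -> model -> parser.
demo_prompt = Step(lambda inputs: f"What is the capital of {inputs['country']}?")
demo_model = Step(lambda text: {"content": f"(model reply to: {text})"})
demo_parser = Step(lambda message: message["content"])

demo_chain = demo_prompt | demo_model | demo_parser
print(demo_chain.invoke({"country": "Spain"}))
```

Because every step exposes the same `invoke` method, any step can be swapped out without changing the others, which is the property the benefits above rely on.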

### Basic Chain: Prompt + Model + Output Parser

In [9]:
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Define template
template = "What is the capital of {country}?"

# Create a PromptTemplate
prompt_template = PromptTemplate.from_template(template)
prompt_template

PromptTemplate(input_variables=['country'], input_types={}, partial_variables={}, template='What is the capital of {country}?')

In [10]:
# Test the prompt
prompt = prompt_template.format(country="Germany")
print(prompt)

What is the capital of Germany?


In [None]:
# Initialize the Mistral model
model = ChatMistralAI(model="mistral-small-latest", temperature=0.1)

# Create a simple chain that would take country as input and return the 
# capital city without applying the format function
chain = prompt_template | model

# Test the chain
input_data = {"country": "Spain"}
chain.invoke(input_data)

AIMessage(content="The capital of Spain is **Madrid**. It is the largest city in Spain and serves as the country's political, economic, and cultural center. Madrid is also home to the Spanish royal palace, the government headquarters, and numerous museums, including the famous **Prado Museum** and **Reina Sofía Museum**.\n\nWould you like to know more about Madrid or Spain in general? 😊", additional_kwargs={}, response_metadata={'token_usage': {'prompt_tokens': 10, 'total_tokens': 90, 'completion_tokens': 80}, 'model': 'mistral-small-latest', 'finish_reason': 'stop'}, id='run--c5c2bf2f-8a4f-4776-9566-5b583b93fd31-0', usage_metadata={'input_tokens': 10, 'output_tokens': 80, 'total_tokens': 90})

### Adding an Output Parser

`StrOutputParser` extracts the `content` of the AI message and returns it as a plain string.

In [None]:
# Add an output parser to the chain to get a plain string instead of an AIMessage
output_parser = StrOutputParser()
chain = prompt_template | model | output_parser

# Now the output is a clean string
result = chain.invoke({"country": "Italy"})
print(f"Result type: {type(result)}")
print(f"Result: {result}")

Result type: <class 'str'>
Result: The capital of Italy is **Rome**.

Rome is not only the capital city but also one of the most historically significant cities in the world, known for its ancient ruins, art, and cultural heritage. It was the center of the Roman Empire and remains a major global city today.


### Complex Template Example

In [13]:
template = """You are an expert {role} with 10 years of experience.
Please provide advice on the following topic.

Topic: {topic}

Format your response as:
- Main Advice:
- Key Points:
- Recommendations:
"""

# Create prompt and chain
prompt = PromptTemplate.from_template(template)
model = ChatMistralAI(model="mistral-small-latest", temperature=0.3)
output_parser = StrOutputParser()

chain = prompt | model | output_parser

In [14]:
# Execute the chain
response = chain.invoke({
    "role": "software architect",
    "topic": "designing scalable microservices"
})

print(response)

### **Designing Scalable Microservices**

#### **Main Advice:**
When designing scalable microservices, prioritize **loose coupling, high cohesion, and resilience** while ensuring the system can handle growth in traffic, data, and complexity. Focus on **autonomy, observability, and fault tolerance** to maintain performance and reliability as the system evolves.

---

#### **Key Points:**
1. **Domain-Driven Design (DDD)**
   - Align microservices with business domains (bounded contexts) to ensure clear ownership and minimal inter-service dependencies.
   - Use **event-driven architecture** for asynchronous communication where appropriate.

2. **Statelessness & Caching**
   - Design services to be stateless where possible to enable horizontal scaling.
   - Implement **distributed caching** (e.g., Redis, Memcached) to reduce database load.

3. **Resilience & Fault Tolerance**
   - Use **circuit breakers** (e.g., Hystrix, Resilience4j) to prevent cascading failures.
   - Implement **retries

## LCEL Interfaces

Every LCEL chain implements the Runnable protocol, which provides a standard set of methods. Which one you call depends on your use case:

### Synchronous Methods:

- **`invoke()`**: Use when you need a single, immediate response. Perfect for one-off queries or interactive applications where you process one request at a time.

- **`stream()`**: Use when you want to display responses progressively to users. Ideal for chatbots or UIs where showing partial results improves user experience and perceived responsiveness.

- **`batch()`**: Use when you have multiple inputs to process. By default the inputs are run concurrently rather than one after another, which is useful for bulk data processing or generating multiple variations.

### Asynchronous Methods:

Use these in `async`/`await` contexts when you need non-blocking execution:

- `ainvoke()`: Async single execution
- `astream()`: Async streaming
- `abatch()`: Async batch execution

In [16]:
# Create a simple chain for demonstration
prompt = PromptTemplate.from_template("Explain {topic} in 2 sentences.")
model = ChatMistralAI(model="mistral-small-latest")
chain = prompt | model | StrOutputParser()

### invoke(): Single Execution

In [17]:
# Single invocation
result = chain.invoke({"topic": "quantum computing"})
print(result)

Quantum computing leverages the principles of quantum mechanics, such as superposition and entanglement, to process information in ways that classical computers cannot, enabling faster solutions to complex problems like cryptography, optimization, and simulation. Unlike classical bits, quantum bits (qubits) can exist in multiple states simultaneously, allowing for exponential computational power in certain tasks.


### stream(): Real-time Streaming

In [18]:
# Streaming output
for token in chain.stream({"topic": "blockchain technology"}):
    print(token, end="", flush=True)

Blockchain is a decentralized digital ledger that records transactions across a network of computers, ensuring transparency, security, and immutability through cryptographic hashing and consensus mechanisms. It eliminates the need for intermediaries by allowing peer-to-peer verification and trustless transactions.

### batch(): Process Multiple Inputs

In [19]:
# Batch processing
topics = [
    {"topic": "artificial intelligence"},
    {"topic": "machine learning"},
    {"topic": "deep learning"}
]

results = chain.batch(topics)
for i, result in enumerate(results):
    print(f"\n{i+1}. {result}")


1. Artificial intelligence (AI) refers to the simulation of human intelligence in machines that are programmed to learn, reason, and perform tasks autonomously. It encompasses technologies like machine learning, natural language processing, and robotics to solve complex problems and improve decision-making.

2. Machine learning is a subset of artificial intelligence that enables systems to learn and improve from experience without being explicitly programmed, using algorithms to identify patterns in data. It involves training models on data to make predictions or decisions, improving accuracy over time as they process more information.

3. Deep learning is a subset of machine learning that uses artificial neural networks with multiple layers to automatically learn and extract hierarchical features from data. It excels at tasks like image recognition, natural language processing, and decision-making by processing vast amounts of data through deep, interconnected layers.


### Controlling Concurrency in Batch

Use `max_concurrency` to control how many requests run simultaneously. This is useful for respecting the rate limits set by LLM API providers and for smoothing the workload.
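To see what a concurrency cap does, the sketch below runs five simulated requests through a thread pool limited to two workers and records the highest number of calls in flight. It is only an illustration using the standard library: `fake_call`, `bounded_batch`, and `ConcurrencyTracker` are hypothetical names, and the sleep stands in for network latency.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class ConcurrencyTracker:
    """Context manager that records the peak number of simultaneous entries."""

    def __init__(self):
        self._lock = threading.Lock()
        self.active = 0
        self.peak = 0

    def __enter__(self):
        with self._lock:
            self.active += 1
            self.peak = max(self.peak, self.active)
        return self

    def __exit__(self, *exc_info):
        with self._lock:
            self.active -= 1


def fake_call(topic, tracker):
    """Stand-in for one model request; sleeps instead of using the network."""
    with tracker:
        time.sleep(0.05)
    return f"summary of {topic}"


def bounded_batch(topics, max_concurrency, tracker):
    """Process all topics with at most max_concurrency calls in flight."""
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        # map() returns results in input order, regardless of completion order.
        return list(pool.map(lambda t: fake_call(t, tracker), topics))


demo_tracker = ConcurrencyTracker()
demo_topics = ["neural networks", "NLP", "computer vision", "RL", "generative AI"]
demo_results = bounded_batch(demo_topics, max_concurrency=2, tracker=demo_tracker)
print(demo_results)
print("peak concurrent calls:", demo_tracker.peak)
```

A lower cap lengthens total run time but keeps the request rate predictable, which is the trade-off to tune against your provider's limits.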

In [20]:
# Batch with concurrency control
topics = [
    {"topic": "neural networks"},
    {"topic": "natural language processing"},
    {"topic": "computer vision"},
    {"topic": "reinforcement learning"},
    {"topic": "generative AI"}
]

results = chain.batch(topics, config={"max_concurrency": 2})
for i, result in enumerate(results):
    print(f"\n{i+1}. {result}")


1. Neural networks are computational models inspired by the human brain, consisting of interconnected layers of artificial neurons that process and transmit information to learn patterns from data. They can recognize complex relationships, make predictions, and improve their accuracy through training on large datasets.

2. Natural language processing (NLP) is a branch of artificial intelligence that enables computers to understand, interpret, and generate human language by analyzing text and speech data. It powers applications like chatbots, translation tools, and sentiment analysis by using algorithms to process and derive meaning from language patterns.

3. Computer vision is a field of artificial intelligence that enables machines to interpret and understand visual data from the world, such as images or videos, by using algorithms and deep learning models. It allows applications like object detection, facial recognition, and autonomous navigation by analyzing patterns and extractin

### Async Methods

Async methods are useful for concurrent operations in async environments.
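The sketch below shows the underlying pattern with the standard library only: several awaitable calls are started together with `asyncio.gather`, so the total wait is close to the slowest call rather than the sum of all calls. `fake_ainvoke` is a hypothetical stand-in for `chain.ainvoke` that awaits a timer instead of the network.

```python
import asyncio
import time


async def fake_ainvoke(topic: str, delay: float = 0.1) -> str:
    """Stand-in for an async model call; awaits a timer instead of the network."""
    await asyncio.sleep(delay)
    return f"explanation of {topic}"


async def run_concurrently(topics):
    """Start every call at once and wait for all of them to finish."""
    return await asyncio.gather(*(fake_ainvoke(t) for t in topics))


async def main():
    start = time.perf_counter()
    results = await run_concurrently(["IoT", "5G technology", "cybersecurity"])
    elapsed = time.perf_counter() - start
    print(results)
    print(f"elapsed: {elapsed:.2f}s")
    return results, elapsed

# In a script, run this with: asyncio.run(main())
# In a notebook, an event loop is already running, so use: await main()
```

`asyncio.gather` returns results in the order the calls were passed in, even if they finish in a different order.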

In [None]:
import asyncio

# Async stream - demonstrate asynchronicity with multiple concurrent streams running at the same time
async def demonstrate_async_streaming():
    topics = ["edge computing", "quantum computing", "blockchain"]
    
    async def stream_topic(topic_name):
        print(f"\n\n[Starting stream for: {topic_name}]")
        async for token in chain.astream({"topic": topic_name}):
            print(f"[{topic_name[:4]}] {token}", end="", flush=True)
        print(f"\n[Completed: {topic_name}]")
    
    # Run all streams concurrently to show async behavior
    await asyncio.gather(*[stream_topic(topic) for topic in topics])

# Execute the async function
await demonstrate_async_streaming()



[Starting stream for: edge computing]


[Starting stream for: quantum computing]


[Starting stream for: blockchain]
[edge] [quan] [bloc] [edge] Edge[quan] Quant[bloc] Block[edge]  computing[quan] um[bloc] chain[quan]  computing[edge]  processes[bloc]  is[edge]  data[quan]  lever[quan] ages[quan]  the[bloc]  a[edge]  closer[quan]  principles[bloc]  decentral[quan]  of[bloc] ized[edge]  to[bloc]  digital[quan]  quantum[quan]  mechanics[bloc]  led[edge]  where[edge]  it[edge] ’s[quan] ,[quan]  such[edge]  generated[bloc] ger[edge]  ([edge] like[quan]  as[bloc]  that[quan]  super[bloc]  records[quan] position[edge]  Io[edge] T[edge]  devices[bloc]  transactions[quan]  and[edge]  or[quan]  ent[bloc]  across[quan] ang[edge]  sensors[bloc]  a[quan] lement[quan] ,[bloc]  network[edge] )[quan]  to[bloc]  of[edge]  to[edge]  reduce[quan]  perform[bloc]  computers[edge]  latency[quan]  calculations[bloc]  in[edge]  and[quan]  exponentially[edge]  bandwidth[bloc]  a[quan]  faster[edge]  usage

In [22]:
# Async invoke
result = await chain.ainvoke({"topic": "cloud computing"})
print(result)

Cloud computing delivers computing services—like servers, storage, databases, and software—over the internet, allowing users to access and scale resources on demand without managing physical infrastructure. It offers flexibility, cost-efficiency, and global accessibility by leveraging remote data centers.


In [23]:
# Async batch
topics = [
    {"topic": "IoT"},
    {"topic": "5G technology"},
    {"topic": "cybersecurity"}
]

results = await chain.abatch(topics)
for i, result in enumerate(results):
    print(f"\n{i+1}. {result}")


1. The Internet of Things (IoT) refers to a network of physical devices embedded with sensors, software, and connectivity to collect and exchange data over the internet. It enables smart automation, real-time monitoring, and data-driven decision-making across industries like healthcare, manufacturing, and smart homes.

2. 5G is the fifth generation of wireless technology, offering significantly faster speeds, lower latency, and greater capacity than previous generations, enabling advanced applications like IoT, autonomous vehicles, and virtual reality. It operates on higher frequency bands and uses advanced technologies like beamforming and network slicing to deliver more efficient and reliable connectivity.

3. Cybersecurity is the practice of protecting internet-connected systems, including hardware, software, and data, from digital attacks, damage, or unauthorized access. It involves implementing technologies, processes, and policies to safeguard sensitive information and maintain 

## Parallel Execution

LCEL supports parallel execution using `RunnableParallel`, allowing multiple chains to run simultaneously.

This is useful when you need to:
- Get different types of information about the same topic
- Process the same input through different chains
- Combine results from multiple models or prompts
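Conceptually, a parallel step sends the same input to several named branches and gathers their outputs into one dictionary. The sketch below reproduces that fan-out with a thread pool; `run_parallel` and the two demo branches are hypothetical and only illustrate the idea, not how `RunnableParallel` is implemented.

```python
from concurrent.futures import ThreadPoolExecutor


def run_parallel(branches: dict, inputs: dict) -> dict:
    """Call every branch with the same inputs and collect results by branch name."""
    with ThreadPoolExecutor(max_workers=len(branches)) as pool:
        # Submit all branches first so they run concurrently...
        futures = {name: pool.submit(func, inputs) for name, func in branches.items()}
        # ...then wait for each result and key it by the branch name.
        return {name: future.result() for name, future in futures.items()}


# Hypothetical branches; in practice each would be a prompt | model | parser chain.
demo_branches = {
    "capital": lambda inputs: f"capital question about {inputs['country']}",
    "population": lambda inputs: f"population question about {inputs['country']}",
}

print(run_parallel(demo_branches, {"country": "Japan"}))
```

The keys of the result match the branch names, which is the same shape you get back from the `RunnableParallel` example that follows.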

In [24]:
from langchain_core.runnables import RunnableParallel

# Create multiple chains
capital_chain = (
    PromptTemplate.from_template("What is the capital of {country}?")
    | model
    | StrOutputParser()
)

population_chain = (
    PromptTemplate.from_template("What is the population of {country}?")
    | model
    | StrOutputParser()
)

language_chain = (
    PromptTemplate.from_template("What languages are spoken in {country}?")
    | model
    | StrOutputParser()
)

# Combine chains in parallel
parallel_chain = RunnableParallel(
    capital=capital_chain,
    population=population_chain,
    language=language_chain
)

In [25]:
# Execute individual chains
print("Capital:", capital_chain.invoke({"country": "Japan"}))
print("\nPopulation:", population_chain.invoke({"country": "Japan"}))
print("\nLanguage:", language_chain.invoke({"country": "Japan"}))

Capital: The capital of Japan is **Tokyo**. It is the largest city in Japan and serves as the country's political, economic, and cultural center. Tokyo is also home to the Imperial Palace, the primary residence of the Emperor of Japan.

Would you like to know more about Tokyo or other aspects of Japan?

Population: As of the latest estimates (2024), the population of Japan is approximately **123.7 million** people.

Japan has been experiencing a long-term decline in population due to low birth rates and an aging society. The population has been decreasing since around 2010, and projections suggest it will continue to shrink in the coming decades.

For the most precise and up-to-date figures, you can refer to official sources like the **Statistics Bureau of Japan** or the **United Nations World Population Prospects**.

Would you like additional demographic details (e.g., age distribution, urban vs. rural populations)?

Language: In Japan, the primary language spoken is **Japanese (日本

In [26]:
# Execute all chains in parallel
result = parallel_chain.invoke({"country": "Japan"})
print("Parallel execution results:")
print(f"Capital: {result['capital']}")
print(f"Population: {result['population']}")
print(f"Language: {result['language']}")

Parallel execution results:
Capital: The capital of Japan is **Tokyo**. It is the largest city in Japan and serves as the country's political, economic, and cultural center.

Tokyo became the official capital in 1869 when the Meiji government moved the imperial court from Kyoto to Edo (renamed Tokyo). Today, it is a bustling metropolis known for its modern skyscrapers, historic temples, and vibrant districts like Shibuya and Shinjuku.
Population: As of 2024, the estimated population of **Japan** is approximately **124.3 million** people, according to the latest data from the **Statistics Bureau of Japan** and the **United Nations**.

Japan has been experiencing a **declining population** due to low birth rates and an aging society. The population has been decreasing since 2010, and projections suggest it could fall below **100 million by 2050** if current trends continue.

Would you like more details on demographics, such as age distribution or regional population changes?
Language: In

### Batch Processing with Parallel Chains

In [27]:
# Batch process multiple countries
countries = [
    {"country": "Brazil"},
    {"country": "Australia"}
]

results = parallel_chain.batch(countries)

for i, result in enumerate(results):
    print(f"\n=== Country {i+1} ===")
    print(f"Capital: {result['capital']}")
    print(f"Population: {result['population']}")
    print(f"Language: {result['language']}")


=== Country 1 ===
Capital: The capital of Brazil is **Brasília**. It was officially inaugurated as the capital on April 21, 1960, replacing Rio de Janeiro. Brasília is known for its modernist architecture and urban planning, designed by urban planner Lúcio Costa and architect Oscar Niemeyer. It is a UNESCO World Heritage Site.
Population: As of 2024, the estimated population of Brazil is approximately **216 million people**, making it the **6th most populous country in the world**.

For the most up-to-date figures, you can check sources like:
- **IBGE (Brazilian Institute of Geography and Statistics)** – [www.ibge.gov.br](https://www.ibge.gov.br)
- **World Bank** – [data.worldbank.org](https://data.worldbank.org)
- **United Nations Population Division** – [population.un.org](https://population.un.org)

Would you like historical population trends or demographic breakdowns?
Language: Brazil is a linguistically diverse country, but the official language is **Portuguese**, spoken

## Using Mistral AI Without LangChain

While LangChain provides a convenient abstraction layer, you can also use Mistral AI directly through their official Python client library. This approach gives you more direct control and can be simpler for basic use cases.

### Installation

Install the official Mistral AI client library:

In [1]:
%pip install -q mistralai

Note: you may need to restart the kernel to use updated packages.




### Basic Example: Simple Chat Completion

Here's a simple example of using Mistral AI directly:

In [2]:
import os
from mistralai import Mistral

# Initialize the Mistral client with your API key
client = Mistral(api_key=os.getenv("MISTRAL_API_KEY"))

# Create a simple chat completion
response = client.chat.complete(
    model="mistral-small-latest",
    messages=[
        {
            "role": "user",
            "content": "What is the capital of France?"
        }
    ]
)

# Print the response
print("Response:", response.choices[0].message.content)
print("\nModel used:", response.model)
print("Tokens used:", response.usage)

Response: The capital of France is **Paris**.

Paris is known for its iconic landmarks such as the **Eiffel Tower**, **Louvre Museum**, and **Notre-Dame Cathedral**, as well as its rich history, culture, and influence in art, fashion, and cuisine. It is one of the most visited cities in the world.

Would you like to know more about Paris or France? 😊

Model used: mistral-small-latest
Tokens used: prompt_tokens=10 completion_tokens=83 total_tokens=93 prompt_audio_seconds=Unset()


## Summary

In this comprehensive tutorial, you learned:

1. **Mistral AI Models**: Different model types and their capabilities
2. **Basic Usage**: How to create and use ChatMistralAI objects
3. **LCEL Chains**: Building chains with prompts, models, and parsers
4. **LCEL Interfaces**: Using invoke, stream, batch, and async methods
5. **Parallel Execution**: Running multiple chains simultaneously

### Key Takeaways:

- Mistral models are powerful alternatives to OpenAI for text-based tasks
- LCEL provides a clean, composable way to build AI applications
- Parallel execution reduces total waiting time when independent chains share the same input
- Streaming enables real-time user feedback
- Batch processing is useful for handling multiple inputs

### Next Steps:

- Explore more complex prompt templates
- Implement error handling and retries
- Combine Mistral with other LangChain components (memory, tools, agents)
- Try multimodal models like Pixtral for vision tasks
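As a starting point for the error-handling step, here is a minimal retry sketch with exponential backoff, written with the standard library only. `call_with_retries` and `flaky_call` are hypothetical names, and catching every `Exception` is a simplification: real code should retry only on transient errors such as timeouts or rate limits. LangChain runnables also expose a `with_retry()` method, which may be a better fit inside a chain.

```python
import time


def call_with_retries(func, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**n, then retry, up to `attempts` calls."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))


# Hypothetical flaky operation: fails twice, then succeeds.
demo_state = {"calls": 0}


def flaky_call():
    demo_state["calls"] += 1
    if demo_state["calls"] < 3:
        raise ConnectionError("simulated transient failure")
    return "ok"


print(call_with_retries(flaky_call, base_delay=0.01))
```

Passing `sleep` as a parameter makes the helper easy to test without real delays.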