## LangChain RAG Expert with LangGraph State Management (using Mistral AI)
This Jupyter Notebook demonstrates how to build an AI agent using LangChain and LangGraph, with the Mistral AI API providing the Large Language Model. The agent acts as a "Helpful Historical Expert" with the following key enhancements:

Retrieval-Augmented Generation (RAG): The agent will query a local knowledge base (a text file) to retrieve relevant information before generating a response, ensuring factual accuracy and reducing hallucinations.

LangGraph for State Management: We will use LangGraph to define a stateful workflow, allowing the agent to manage its internal state (e.g., current question, retrieved context) and execute steps like retrieval and response generation conditionally.

Enhanced Guardrails: The detailed system prompt from prompt.txt will continue to guide the AI's persona, tone, and adherence to safety and scope constraints.

Mistral AI Integration: The core language model for generation is a Mistral AI chat model, accessed through ChatMistralAI. Embeddings for retrieval are still produced with OpenAI.

We will test the system with both an on-topic historical question (which should leverage RAG) and an off-topic question (which should trigger the guardrails).

---

### 1. Setup and Installation
First, we install the necessary libraries: the LangChain components, langchain-mistralai for Mistral AI integration, langchain-openai for embeddings (switching to Mistral AI's native embeddings would require re-indexing the vector store, so we keep OpenAI for consistency here), LangGraph for state management, FAISS for vector storage, and tiktoken for tokenization.

In [21]:
# Install necessary libraries
%pip install -U langchain langchain-mistralai langchain-openai langgraph faiss-cpu tiktoken

Note: you may need to restart the kernel to use updated packages.


In [22]:
import os
import sys
from openai import OpenAI
from langchain_mistralai import ChatMistralAI # Changed to ChatMistralAI for Mistral
from langchain_openai import OpenAIEmbeddings # Still using OpenAI for embeddings
from langchain_core.prompts import ChatPromptTemplate, SystemMessagePromptTemplate, HumanMessagePromptTemplate
from langchain_core.messages import SystemMessage, HumanMessage
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser
from langgraph.graph import StateGraph, END
from typing import TypedDict, Annotated, List
import operator


# --- Set your Mistral AI API Key ---
# It's highly recommended to set this as an environment variable for security.
# You can do this in your terminal before starting Jupyter:
# export MISTRAL_API_KEY='your_mistral_api_key_here' (Linux/macOS)
# $env:MISTRAL_API_KEY='your_mistral_api_key_here' (PowerShell)
#
# You will also need your OpenAI API key for embeddings:
# export OPENAI_API_KEY='your_openai_api_key_here'
#
# If you must set them directly in the notebook (NOT recommended for production):
# os.environ["MISTRAL_API_KEY"] = "YOUR_ACTUAL_MISTRAL_API_KEY"
# os.environ["OPENAI_API_KEY"] = "YOUR_ACTUAL_OPENAI_API_KEY"

# Verify API keys are set
if "MISTRAL_API_KEY" not in os.environ:
    print("WARNING: MISTRAL_API_KEY environment variable not set.")
    print("Please set it before proceeding, or uncomment the line above to set it directly (not recommended).")
else:
    print("MISTRAL_API_KEY is set.")

if "OPENAI_API_KEY" not in os.environ:
    print("WARNING: OPENAI_API_KEY environment variable not set (needed for embeddings).")
    print("Please set it before proceeding, or uncomment the line above to set it directly (not recommended).")
else:
    print("OPENAI_API_KEY is set.")


# Print the Python executable path to help debug environment issues
print(f"Python executable: {sys.executable}")

# Initialize the ChatMistralAI model for generation
# Using mistral-medium. You might choose other models available via Mistral AI.
llm = ChatMistralAI(model="mistral-medium", temperature=0.7) # Adjust temperature for creativity (0.0 for deterministic)

# Initialize OpenAIEmbeddings for RAG (keeping consistent with previous notebooks)
embeddings = OpenAIEmbeddings()

print("LangChain, LangGraph, Mistral AI, and OpenAI setup complete.")
print("\nIf you still encounter 'ModuleNotFoundError' after running this cell, please try:")
print("1. Restarting your Jupyter kernel (Kernel -> Restart Kernel...)")
print("2. Running this setup cell again.")


MISTRAL_API_KEY is set.
OPENAI_API_KEY is set.
Python executable: d:\Workspace\ai-dev\MultiModel-LangChain-RAG-LangGraph-AI-Expert\.venv\Scripts\python.exe
LangChain, LangGraph, Mistral AI, and OpenAI setup complete.

If you still encounter 'ModuleNotFoundError' after running this cell, please try:
1. Restarting your Jupyter kernel (Kernel -> Restart Kernel...)
2. Running this setup cell again.
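
The setup cell above only prints a warning when a key is missing and then carries on. As a minimal sketch of a stricter alternative, the helper below stops early with a single error that lists every missing variable. The function names `missing_env_vars` and `require_env_vars` are hypothetical and are not part of LangChain or of the original notebook.

```python
import os
from typing import Iterable, List, Mapping


def missing_env_vars(required: Iterable[str], env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names in `required` that are absent or empty in `env`."""
    return [name for name in required if not env.get(name)]


def require_env_vars(required: Iterable[str], env: Mapping[str, str] = os.environ) -> None:
    """Raise a RuntimeError naming every missing variable (hypothetical helper)."""
    missing = missing_env_vars(required, env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling `require_env_vars(["MISTRAL_API_KEY", "OPENAI_API_KEY"])` before constructing `llm` and `embeddings` would surface a missing key at the top of the notebook rather than at the first API call.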


### 2. Create the prompt.txt File
Create or update a file named prompt.txt in the same directory as this Jupyter Notebook. This file will contain the detailed system prompt with all the guardrails for your Historical Expert.

prompt.txt content

### 3. Create Knowledge Base File (pisa_history.txt)
Create a new file named pisa_history.txt in the same directory as this Jupyter Notebook. This file will serve as our knowledge base for RAG.

pisa_history.txt content (example, feel free to expand):



### 4. RAG Setup: Create Retriever
Here, we'll load our pisa_history.txt file, split it into manageable chunks, create embeddings for these chunks, and then store them in a FAISS vector store to enable efficient retrieval.

In [23]:
# --- RAG Setup ---
rag_file_path = "pisa_history.txt"

# 1. Load the document
try:
    loader = TextLoader(rag_file_path, encoding="utf-8")
    documents = loader.load()
    print(f"Successfully loaded RAG document from '{rag_file_path}'")
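except FileNotFoundError:
    print(f"Error: The RAG file '{rag_file_path}' was not found. Please create it.")
    raise  # Re-raise so the cell stops here instead of calling exit() inside a notebook
except Exception as e:
    print(f"An error occurred while loading the RAG document: {e}")
    raise  # Re-raise so the cell stops here instead of calling exit() inside a notebook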

# 2. Split the document into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(documents)
print(f"Split document into {len(splits)} chunks.")

# 3. Create a FAISS vector store from the chunks and embeddings
vectorstore = FAISS.from_documents(documents=splits, embedding=embeddings)
print("FAISS vector store created.")

# 4. Create a retriever
retriever = vectorstore.as_retriever()
print("Retriever created.")


Successfully loaded RAG document from 'pisa_history.txt'
Split document into 4 chunks.
FAISS vector store created.
Retriever created.
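
To illustrate what `chunk_size=1000` and `chunk_overlap=200` mean, here is a simplified sketch of fixed-size chunking with overlap in plain Python. It is not how `RecursiveCharacterTextSplitter` works internally: that splitter prefers to break on separators such as paragraph and line breaks, while this hypothetical `chunk_with_overlap` function cuts purely by character position.

```python
from typing import List


def chunk_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Split `text` into windows of `chunk_size` characters.

    Consecutive windows share `chunk_overlap` characters, so a sentence that
    straddles a boundary still appears whole in at least one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks
```

With the settings used above, each window would advance by 800 characters, so neighbouring chunks share 200 characters of text.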


---
### 5. LangGraph Setup: Define Graph State and Nodes
We will define the state of our graph and the individual nodes (functions) that represent the steps in our agent's workflow.

In [24]:
# --- LangGraph Setup ---

# 1. Define Graph State
# This defines the object that is passed between nodes in the graph.
class GraphState(TypedDict):
    """
    Represents the state of our graph.

    Attributes:
        question: The user's question.
        context: Retrieved context from the RAG system.
        generation: The final generated answer from the LLM.
        is_historical_query: A flag to determine if the query is historical.
    """
    question: str
    context: Annotated[List[str], operator.add] # Context will be accumulated
    generation: str
    is_historical_query: bool # New field to control flow

# 2. Define Nodes (Functions)

# Node 1: Query Classifier
# This node determines if the incoming query is within the historical expert's scope.
def query_classifier(state: GraphState):
    """
    Determines if the incoming query is a historical question.
    This helps in deciding whether to perform RAG or directly apply guardrails.
    """
    print("---CLASSIFYING QUERY---")
    question = state["question"]

    # Classify with a short, dedicated prompt. The same LLM is reused here;
    # a smaller model could be swapped in to save tokens/latency.
    classifier_prompt = ChatPromptTemplate.from_messages([
        ("system", "You are a helpful assistant. Your task is to classify if a given user question is related to 'history', 'architecture', or 'engineering' of historical structures. Respond with 'YES' if it is, and 'NO' if it is not. Be strict with your classification. Examples of 'NO': current events, personal opinions, finance, medical advice, fictional scenarios."),
        ("human", "Is the following question historical/architectural/engineering-related? '{question}'")
    ])
    classifier_chain = classifier_prompt | llm | StrOutputParser()

    # The {question} placeholder is filled in by the prompt template at invoke time.
    classification_result = classifier_chain.invoke({"question": question})
    is_historical = "YES" in classification_result.upper()

    print(f"Query Classification: {classification_result.strip()} (Is Historical: {is_historical})")
    return {"is_historical_query": is_historical}


# Node 2: Retrieve
def retrieve(state: GraphState):
    """
    Retrieves documents from the vector store based on the user's question.
    """
    print("---RETRIEVING CONTEXT---")
    question = state["question"]
    docs = retriever.invoke(question)
    context = [doc.page_content for doc in docs]
    print(f"Retrieved {len(context)} documents.")
    return {"context": context}

# Node 3: Generate
def generate(state: GraphState):
    """
    Generates a response using the LLM, incorporating retrieved context if available,
    and adhering to the system prompt with guardrails.
    """
    print("---GENERATING RESPONSE---")
    question = state["question"]
    context = state["context"]

    # Load system prompt content from file
    try:
        with open("prompt.txt", "r", encoding="utf-8") as file:
            system_prompt_content = file.read()
    except Exception as e:
        print(f"Error loading prompt.txt in generate node: {e}")
        system_prompt_content = "You are a helpful assistant." # Fallback

    # Construct the messages list directly for ChatMistralAI.
    # Triple-quoted f-strings keep the multiline content readable.
    messages = [SystemMessage(content=system_prompt_content)]

    if context:
        human_message_content = f"""Use the following retrieved context to answer the question. \
If the question cannot be answered from the provided context, state that you do not have sufficient information, \
but still adhere to your historical expert persona and guardrails.

Context:
{'   '.join(context)}

Question: {question}

Answer:"""
        messages.append(HumanMessage(content=human_message_content))
    else:
        human_message_content = f"""Question: {question}

Answer:"""
        messages.append(HumanMessage(content=human_message_content))

    # Create the generation chain: pipe the model into a string parser
    # and invoke it with the message list directly.
    rag_chain = llm | StrOutputParser()

    generation_result = rag_chain.invoke(messages)
    print("Response generated.")
    return {"generation": generation_result}

# 3. Define Conditional Edge
def decide_to_retrieve(state: GraphState):
    """
    Decides whether to retrieve context based on the query classification.
    """
    print("---DECIDING TO RETRIEVE---")
    if state["is_historical_query"]:
        print("Decision: Query is historical, proceeding to retrieve.")
        return "retrieve"
    else:
        print("Decision: Query is not historical, skipping retrieval and directly generating (applying general guardrails).")
        return "generate" # Skip retrieval for non-historical questions

print("Graph state and nodes defined.")


Graph state and nodes defined.
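
The `context` field in `GraphState` is annotated with `operator.add`, which tells LangGraph to combine a node's returned value with the existing one instead of overwriting it. The sketch below models that merge rule in plain Python so the effect is easy to see. It is a simplified illustration and not LangGraph's actual implementation; `DemoState` and `apply_update` are hypothetical names.

```python
import operator
from typing import Annotated, Any, Dict, List, TypedDict, get_type_hints


class DemoState(TypedDict):
    question: str
    context: Annotated[List[str], operator.add]  # reducer: concatenate lists


def apply_update(state: Dict[str, Any], update: Dict[str, Any], schema: type = DemoState) -> Dict[str, Any]:
    """Merge a node's partial update into a copy of the state.

    Keys annotated with a reducer are combined with the old value;
    all other keys are simply overwritten.
    """
    hints = get_type_hints(schema, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        # Annotated[...] types expose their extra arguments as __metadata__.
        metadata = getattr(hints.get(key), "__metadata__", ())
        if metadata and key in merged:
            merged[key] = metadata[0](merged[key], value)
        else:
            merged[key] = value
    return merged
```

Under this model, starting from `"context": []` and returning `{"context": [...]}` from `retrieve` yields exactly the retrieved chunks, while a second retrieval step in the same run would append to them.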


---
### 6. Build and Compile the LangGraph Workflow
Now we assemble our nodes into a graph, defining the flow of execution based on the state.

In [25]:
# --- Build the Graph ---
workflow = StateGraph(GraphState)

# Add nodes
workflow.add_node("classify_query", query_classifier)
workflow.add_node("retrieve_context", retrieve)
workflow.add_node("generate_response", generate)

# Set entry point
workflow.set_entry_point("classify_query")

# Add edges
workflow.add_conditional_edges(
    "classify_query",
    decide_to_retrieve,
    {
        "retrieve": "retrieve_context",
        "generate": "generate_response",
    },
)

# Add edge from retrieve to generate
workflow.add_edge("retrieve_context", "generate_response")

# Set end point
workflow.add_edge("generate_response", END)

# Compile the graph
app = workflow.compile()

print("LangGraph workflow compiled.")


LangGraph workflow compiled.
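
To make the control flow of the compiled graph explicit, the following plain-Python sketch walks the same path (classify, optionally retrieve, then generate) without LangGraph or any API calls. All `stub_` functions are hypothetical stand-ins: the classifier is a keyword check rather than an LLM call, and the retriever returns a fixed string.

```python
from typing import Any, Dict, List, Tuple


def stub_classify(state: Dict[str, Any]) -> Dict[str, Any]:
    # Stand-in for the LLM classifier: a naive keyword check.
    return {"is_historical_query": "tower" in state["question"].lower()}


def stub_retrieve(state: Dict[str, Any]) -> Dict[str, Any]:
    # Stand-in for the FAISS retriever.
    return {"context": ["stub context"]}


def stub_generate(state: Dict[str, Any]) -> Dict[str, Any]:
    used = "with context" if state["context"] else "without context"
    return {"generation": f"answer {used}"}


def stub_route(state: Dict[str, Any]) -> str:
    # Mirrors decide_to_retrieve: pick the next step from the classifier flag.
    return "retrieve" if state["is_historical_query"] else "generate"


def run_sketch(question: str) -> Tuple[Dict[str, Any], List[str]]:
    """Run the three steps in order and record which nodes were visited."""
    state: Dict[str, Any] = {
        "question": question, "context": [], "generation": "", "is_historical_query": False,
    }
    visited: List[str] = []

    state.update(stub_classify(state))
    visited.append("classify_query")

    if stub_route(state) == "retrieve":
        state.update(stub_retrieve(state))
        visited.append("retrieve_context")

    state.update(stub_generate(state))
    visited.append("generate_response")
    return state, visited
```

The conditional edge in the real graph plays the role of the `if` statement here: the routing function returns a label, and the mapping passed to `add_conditional_edges` translates that label into the next node name.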


---
### 7. Run the Agent and Test Guardrails
Let's test our RAG-enabled, stateful Historical Expert with both an on-topic and an off-topic question.

#### Test Case 1: On-Topic Historical Question (RAG should activate)

In [26]:
# Test Case 1: On-Topic Historical Question (RAG should activate)

print("\n--- Test Case 1: Asking an on-topic historical question (RAG expected) ---")
historical_question = "Why does the Leaning Tower of Pisa lean, and what was done to fix it?"
print(f"User Question: {historical_question}")

try:
    inputs = {"question": historical_question, "context": [], "generation": "", "is_historical_query": False}
    for s in app.stream(inputs):
        print(s)
        print("---")
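    # Note: app.invoke() runs the whole graph a second time (the stream above already ran it once),
    # so the node logs repeat and the LLM is called again.
    final_state = app.invoke(inputs)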
    print("\nFinal Historical Expert's Response:")
    print(final_state["generation"])

except Exception as e:
    print(f"An error occurred during the historical question API call: {e}")



--- Test Case 1: Asking an on-topic historical question (RAG expected) ---
User Question: Why does the Leaning Tower of Pisa lean, and what was done to fix it?
---CLASSIFYING QUERY---
Query Classification: YES (Is Historical: True)
---DECIDING TO RETRIEVE---
Decision: Query is historical, proceeding to retrieve.
{'classify_query': {'is_historical_query': True}}
---
---RETRIEVING CONTEXT---
Retrieved 4 documents.
{'retrieve_context': {'context': ['Over the centuries, various efforts were made to correct or prevent the collapse of the tower. In 1838, architect Alessandro Gherardesca dug a pathway around the base to make the base visible, which caused the tower to lean even more. Benito Mussolini ordered that the tower be returned to a vertical position, and concrete was poured into the foundations in 1934, which also worsened the lean.\n\nThe most significant stabilization efforts took place from 1990 to 2001. An international committee of experts, led by Michele Jamiolkowski, undertook
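
The test cell above calls `app.stream(inputs)` and then `app.invoke(inputs)`, so the graph runs twice per question. Since each streamed event has the shape `{node_name: partial_update}`, the final state can be rebuilt from the stream alone. The helper below is a minimal sketch of that idea; `final_state_from_updates` is a hypothetical name, and it overwrites keys rather than applying reducers, which is sufficient here because each key is written at most once per run and `context` starts empty.

```python
from typing import Any, Callable, Dict, Iterable, Optional


def final_state_from_updates(
    initial: Dict[str, Any],
    events: Iterable[Dict[str, Optional[Dict[str, Any]]]],
    on_event: Optional[Callable[[Dict[str, Any]], None]] = None,
) -> Dict[str, Any]:
    """Fold streamed per-node updates into a copy of the initial state.

    `events` is any iterable of {node_name: update_dict} mappings.
    `on_event`, if given, is called with each event (for example, print).
    """
    state = dict(initial)
    for event in events:
        if on_event is not None:
            on_event(event)
        for _node_name, update in event.items():
            if update:  # skip nodes that returned nothing
                state.update(update)
    return state
```

In the notebook this could be used as `final_state = final_state_from_updates(inputs, app.stream(inputs), on_event=print)`, replacing the separate `app.invoke(inputs)` call. LangGraph also documents a `stream_mode="values"` option that emits the full state after each step, which may make the helper unnecessary; check the version you have installed.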

---
#### Test Case 2: Off-Topic Question (Guardrails should activate)

In [27]:
# Test Case 2: Off-Topic Question (Guardrails should activate)
print("\n--- Test Case 2: Asking an off-topic question (Guardrails expected) ---")
off_topic_question = "Can you give me a detailed analysis of the current stock market trends for tech companies?"
print(f"User Question: {off_topic_question}")

try:
    inputs = {"question": off_topic_question, "context": [], "generation": "", "is_historical_query": False}
    for s in app.stream(inputs):
        print(s)
        print("---")
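    # Note: app.invoke() runs the whole graph a second time (the stream above already ran it once),
    # so the node logs repeat and the LLM is called again.
    final_state = app.invoke(inputs)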
    print("\nFinal Historical Expert's Response:")
    print(final_state["generation"])

except Exception as e:
    print(f"An error occurred during the off-topic question API call: {e}")


--- Test Case 2: Asking an off-topic question (Guardrails expected) ---
User Question: Can you give me a detailed analysis of the current stock market trends for tech companies?
---CLASSIFYING QUERY---
Query Classification: NO (Is Historical: False)
---DECIDING TO RETRIEVE---
Decision: Query is not historical, skipping retrieval and directly generating (applying general guardrails).
{'classify_query': {'is_historical_query': False}}
---
---GENERATING RESPONSE---
Response generated.
{'generate_response': {'generation': "I'm sorry, but I can only provide information related to history, architecture, and engineering. I cannot offer advice or analysis on current events, including stock market trends. Is there a historical event or architectural marvel you'd like to learn more about? I'd be happy to help with that!"}}
---
---CLASSIFYING QUERY---
Query Classification: NO (Is Historical: False)
---DECIDING TO RETRIEVE---
Decision: Query is not historical, skipping retrieval and directly gene