# Intro to Data Science 
## Part XI. - Complex LLM Applications

### Table of Contents
1. Lord of the Rings Fact Checker with RAG 
2. Guess Who? with a LangChain SQL Agent
3. Exercise: Trolley Dilemma, D&D Edition: Morality by Alignment

----

### 🧙‍♂️ Lord of the Rings Fact Checker with RAG

We'll walk through building a simple **fact-checking assistant** that can answer questions about the *Lord of the Rings* books using semantic retrieval and an LLM. This demonstrates a Retrieval-Augmented Generation (RAG) pipeline with LangChain.


### ⚙️ Installation Requirements

You'll need the following Python packages:

```bash
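pip install langchain langchain-community langchain-experimental langchain-google-genai langchain-huggingface langchain-ollama langchainhub faiss-cpu sentence-transformers datasets huggingface_hub hf_xet tf-keras sqlite-utils mlflow python-dotenv requests beautifulsoup4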
```
> Note: If you're using a GPU-enabled environment, you can optionally install `faiss-gpu` instead of `faiss-cpu`.

### 🔄 Overview of the Workflow

1. **Chunk** the documents using `SemanticChunker` from `langchain_experimental`
2. **Embed** the semantic chunks with an embedding model (Gemini `text-embedding-004`, or a SentenceTransformer model as a fallback)
3. **Store** the embeddings in a FAISS in-memory vector index
4. **Build a LangChain** pipeline:
   - Step 1: Convert the user question into a clearer full sentence for better embedding retrieval
   - Step 2: Use the sentence to **query** the FAISS index
   - Step 3: Feed the **retrieved chunks + original question** to Gemini for answering


### 🧩 Why is this useful?

- Embedding the full book allows **semantic search**: the model can find relevant parts even if the query is phrased differently.
- Turning the question into a clearer sentence improves retrieval performance.
- Using a vector database makes retrieval fast and scalable.
- The LLM can reason with the context retrieved from your documents, not just its training data.

This is a classic example of **grounding** the model's answers in external knowledge.

### Step 0: Logging in, Setting up tracing

In [None]:
import dotenv
import mlflow

dotenv.load_dotenv(".env")

mlflow.langchain.autolog()
mlflow.set_tracking_uri('http://127.0.0.1:5000')

Set the experiment name to track in MLflow. This is optional but recommended.
If you don't set an experiment, all runs are tracked under MLflow's default experiment.

In [None]:
mlflow.set_experiment("LOTR_Fact_Checker_RAG")

### 🧩 Step 1: Chunking the Book

We use a semantic-aware chunking strategy to split the document while preserving meaning. This is especially important when embedding large passages like full books or long-form documents.
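
To make the idea concrete, here is a minimal, dependency-free sketch of semantic chunking. It is not how `SemanticChunker` is implemented: `toy_embed` is a hypothetical bag-of-words stand-in for a real embedding model, and the fixed `threshold` is an assumption chosen for this toy text. The principle is the same, though: compare neighbouring sentences and start a new chunk where the meaning shifts.

```python
import math
import re
from collections import Counter


def toy_embed(sentence: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z']+", sentence.lower()))


def cosine_distance(a: Counter, b: Counter) -> float:
    """1 - cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return 1.0 - (dot / norm if norm else 0.0)


def semantic_chunks(text: str, threshold: float = 0.8) -> list[str]:
    """Start a new chunk wherever adjacent sentences are farther apart than `threshold`."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if not sentences:
        return []
    chunks, current = [], [sentences[0]]
    for prev, nxt in zip(sentences, sentences[1:]):
        if cosine_distance(toy_embed(prev), toy_embed(nxt)) > threshold:
            chunks.append(" ".join(current))  # topic shift: close the current chunk
            current = []
        current.append(nxt)
    chunks.append(" ".join(current))
    return chunks


text = (
    "Frodo carried the ring. The ring was heavy for Frodo. "
    "Meanwhile Aragorn rode to Gondor. Aragorn reached Gondor at dawn."
)
for chunk in semantic_chunks(text):
    print(repr(chunk))
```

With this toy text, the two Frodo sentences share vocabulary and stay together, while the switch to Aragorn triggers a break. A real embedding model would pick up similarity in meaning even when the sentences share no words.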


In [None]:
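# Download the LOTR text from archive.org and save it locally
import os

import requests
from bs4 import BeautifulSoup


url = 'https://archive.org/stream/tolkien-j.-the-lord-of-the-rings-harper-collins-ebooks-2010/Tolkien-J.-The-lord-of-the-rings-HarperCollins-ebooks-2010_djvu.txt'
file_path = "./data/lotr.txt"

print('Processing book LOTR.', end='')
response = requests.get(url, timeout=60)
response.raise_for_status()  # fail early if the download did not succeed
print('.', end='')
# The page is HTML that wraps the plain text, so strip the markup
LOTR_book = BeautifulSoup(response.content, "html.parser").get_text()
print("done.")

print(f"Saving as {file_path}...", end='')
os.makedirs(os.path.dirname(file_path), exist_ok=True)  # make sure ./data exists
with open(file_path, "w", encoding="utf-8") as f:
    f.write(LOTR_book)
print("done.")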

#### [Loading with TextLoader](https://python.langchain.com/api_reference/community/document_loaders/langchain_community.document_loaders.text.TextLoader.html#textloader)

In [None]:
from langchain_community.document_loaders.text import TextLoader


# Load the saved Lord of the Rings text as a single document
loader = TextLoader(file_path, encoding="utf-8")
docs = loader.load()

#### [Embedding with Google Gemini embedding](https://python.langchain.com/docs/integrations/text_embedding/google_generative_ai/) & [Chunking with semantic chunker](https://python.langchain.com/docs/how_to/semantic-chunker/)



In [None]:
from langchain_experimental.text_splitter import SemanticChunker
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_huggingface import HuggingFaceEmbeddings

# Use Google Gemini embeddings if they are available to you:
embeddings = GoogleGenerativeAIEmbeddings(model="models/text-embedding-004")

# Fallback: a local Hugging Face embedding model (uncomment to use it instead)
#embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
splitter = SemanticChunker(embeddings)

chunks = splitter.split_documents(docs)

### 🧠 Step 2: Embedding and Storing in FAISS

We use Google Gemini's [`text-embedding-004`](https://ai.google.dev/gemini-api/docs/models#text-embedding) or Hugging Face's [MiniLM](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) model to compute embeddings. The `FAISS` vector store allows for efficient similarity-based retrieval during the question-answering phase.
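
As a rough illustration of what similarity-based retrieval means, the sketch below does a brute-force nearest-neighbour search in plain Python. The three-dimensional vectors are made up for the example, and cosine similarity is used for readability. The metric FAISS actually applies depends on how the index is configured, and FAISS is far more efficient than this loop.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, index, k=2):
    """Return the k (score, text) pairs most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in index]
    return sorted(scored, reverse=True)[:k]


# Hypothetical 3-dimensional "embeddings"; real models emit hundreds of dimensions.
index = [
    ("Aragorn is the heir of Isildur.", [0.9, 0.1, 0.0]),
    ("Elrond is the lord of Rivendell.", [0.7, 0.6, 0.1]),
    ("Hobbits enjoy second breakfast.", [0.0, 0.1, 0.9]),
]
query = [0.8, 0.3, 0.0]  # made-up vector for a question about Aragorn's lineage

for score, text in top_k(query, index):
    print(f"{score:.3f}  {text}")
```

The two chunks about Aragorn and Elrond rank above the unrelated hobbit chunk because their vectors point in a similar direction to the query.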

In [None]:
from langchain_community.vectorstores import FAISS

# Create FAISS index
vectorstore = FAISS.from_documents(chunks, embedding=embeddings)

# Set up retriever
retriever = vectorstore.as_retriever()

### 🤖 Step 3: Building the Chain

The final chain uses three stages:
- **Rewrite Stage**: Rephrases the user's original question into a complete sentence.
- **Retrieval Stage**: Embeds the rewritten query and retrieves the most relevant chunks.
- **Answer Generation**: Combines the user's original question with the retrieved context and generates an answer using the Gemini LLM.

This design allows better control and debuggability at each stage. It also makes each component replaceable or upgradable in future iterations.
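
The data flow between the three stages can be sketched with plain functions before bringing in LangChain. Everything here is a stand-in: `rewrite`, `retrieve`, and `answer` are hypothetical stubs that return canned values instead of calling an LLM or a vector store. Only the shape of the dictionaries passed between stages mirrors the chain.

```python
def rewrite(question: str) -> str:
    """Stand-in for the rewrite LLM call (hypothetical, rule-based)."""
    return f"The user wants to know: {question}"


def retrieve(expanded: str) -> list[str]:
    """Stand-in for the retriever; returns canned chunks."""
    return ["Chunk about Elrond and Elros.", "Chunk about the heirs of Isildur."]


def answer(question: str, context: str) -> str:
    """Stand-in for the answering LLM call."""
    return f"[answer to {question!r} grounded in {len(context)} characters of context]"


def rag(inputs: dict) -> str:
    # Stage 1: keep the original question and add the expanded one.
    state = {"question": inputs["question"], "expanded": rewrite(inputs["question"])}
    # Stage 2: retrieve with the expanded query, but carry the original question forward.
    state = {"question": state["question"], "context": retrieve(state["expanded"])}
    # Stage 3: join the chunks into one string and answer the original question.
    context = "\n\n".join(state["context"])
    return answer(state["question"], context)


print(rag({"question": "Was Aragorn related to Elrond?"}))
```

Note that the expanded sentence is used only for retrieval. The answer stage sees the user's original wording, so the rewrite cannot silently change what is being asked.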

In [None]:
from langchain_core.runnables import RunnableMap, RunnableLambda
from langchain_core.prompts import ChatPromptTemplate
from langchain_google_genai import ChatGoogleGenerativeAI


# Step 0: Define LLM endpoint
llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash")


# Step 1: Expand the query to a full sentence
rewrite_prompt = ChatPromptTemplate.from_messages([
    ("system", "You help convert user questions into full, clear sentences."),
    ("human", "{question}")
])

rewrite_chain = (
    rewrite_prompt
    | llm
    | RunnableLambda(lambda x: x.content)
)


# Step 2: Use expanded query to retrieve relevant chunks
retrieval_chain = (
    RunnableMap({
        "question": lambda x: x["question"],
        "expanded": rewrite_chain,
    }) 
    | RunnableMap({
        "question": lambda x: x["question"],
        "context": lambda x: retriever.invoke(x["expanded"]),
    })
)


# Step 3: Answer with context
answer_prompt = ChatPromptTemplate.from_messages([
    ("system", "Use the context from the Lord of the Rings books to answer the user's question."),
    ("human", "Context:\n{context}\n\nQuestion:\n{question}")
])

rag_chain = (
    retrieval_chain
    | RunnableMap({
        "question": lambda x: x["question"],
        "context": lambda x: "\n\n".join(doc.page_content for doc in x["context"]),
    }) 
    | answer_prompt
    | llm
)

### Step 4: Executing Queries

In [None]:
# Query example
question = "Was Aragorn related to Elrond?"
response = rag_chain.invoke({"question": question})
print(response.content)

This example illustrates key LangChain concepts such as:
- Custom chaining with `RunnableLambda` and `RunnableMap`
- Composability and modularity
- Use of external vectorstores and semantic retrievers
- Applying LLMs at multiple stages of the chain

----

### 🧠 Guess Who? with a LangChain SQL Agent

In this section, we'll build a **game-like agent** where an LLM tries to **guess the character** you're thinking of from a small SQLite database.

### 🎯 Objective

The agent's goal is to identify a character based on answers to yes/no questions. It does this by:
1. Extracting history of prior interactions
2. Generating SQL queries to narrow down candidates
3. Using retrieved data to make an educated guess

### 🧰 Tools Used

- **SQLite**: Small local database with structured data
- **LangChain SQLAgent**: Lets the LLM reason over SQL queries via tools
- **Output parsers**: To ensure the final answer is clean and structured
- **RunnableLambda**: For injecting custom transformation steps into the chain
- **RunnablePassthrough**: For adding new items to the input
- **RunnableBranch**: To create a branch in the chain based on the output of a previous step

### 🗃️ Our Database

We'll define three tables:
- `students`: each person's attributes like height, eye color, and favorite food
- `tracks`: areas of specialization
- `student_tracks`: a connection table (many-to-many)

```sql

-- students
id | name       | height_cm | eye_color | hair_color | favorite_food
1  | Alice      | 155       | blue      | black      | sushi
2  | Bob        | 180       | brown     | blonde     | pizza
...

-- tracks
id | name         | focus_area
1  | Robotics     | Hardware
2  | NLP          | Language

-- student_tracks
student_id | track_id
1          | 1
2          | 2
...
```

### 📦 1. Install and Setup

In [None]:
import dotenv
import mlflow

dotenv.load_dotenv(".env")

mlflow.langchain.autolog()
mlflow.set_tracking_uri('http://127.0.0.1:5000')

In [None]:
mlflow.set_experiment("Guess_Who_SQLAgent")

In [None]:
import json
from operator import itemgetter

from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnableLambda, RunnableBranch, RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser


# create_sql_agent now lives alongside the toolkit in langchain_community
from langchain_community.agent_toolkits import SQLDatabaseToolkit, create_sql_agent
from langchain_community.utilities.sql_database import SQLDatabase
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_ollama import ChatOllama

### 🏗️ 2. Create a Local SQLite Database

In [None]:
import sqlite3

# Create database and tables
conn = sqlite3.connect("students.db")
cursor = conn.cursor()


# Create tables
cursor.executescript("""
DROP TABLE IF EXISTS students;
DROP TABLE IF EXISTS tracks;
DROP TABLE IF EXISTS student_tracks;

CREATE TABLE students (
    id INTEGER PRIMARY KEY,
    name TEXT,
    height_cm INTEGER,
    eye_color TEXT,
    hair_color TEXT,
    favorite_food TEXT
);

CREATE TABLE tracks (
    id INTEGER PRIMARY KEY,
    name TEXT,
    focus_area TEXT
);

CREATE TABLE student_tracks (
    student_id INTEGER,
    track_id INTEGER,
    FOREIGN KEY(student_id) REFERENCES students(id),
    FOREIGN KEY(track_id) REFERENCES tracks(id)
);
""")


# Insert data
cursor.executescript("""
INSERT INTO students VALUES (1, 'Alice', 155, 'blue', 'black', 'sushi');
INSERT INTO students VALUES (2, 'Bob', 180, 'brown', 'blonde', 'pizza');
INSERT INTO students VALUES (3, 'Charlie', 165, 'green', 'brown', 'pasta');

INSERT INTO tracks VALUES (1, 'Robotics', 'Hardware');
INSERT INTO tracks VALUES (2, 'NLP', 'Language');

INSERT INTO student_tracks VALUES (1, 1);
INSERT INTO student_tracks VALUES (2, 2);
INSERT INTO student_tracks VALUES (3, 2);
""")

# Persist the changes and close the connection
conn.commit()
conn.close()
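
To get a feel for the kind of narrowing queries an agent might generate against this schema, here is a hand-written sketch. It rebuilds the same tables in an in-memory database so it can run on its own. The queries are illustrative assumptions, not output captured from the agent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory copy so the sketch is self-contained
conn.executescript("""
CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, height_cm INTEGER,
                       eye_color TEXT, hair_color TEXT, favorite_food TEXT);
CREATE TABLE tracks (id INTEGER PRIMARY KEY, name TEXT, focus_area TEXT);
CREATE TABLE student_tracks (student_id INTEGER, track_id INTEGER);
INSERT INTO students VALUES (1, 'Alice', 155, 'blue', 'black', 'sushi');
INSERT INTO students VALUES (2, 'Bob', 180, 'brown', 'blonde', 'pizza');
INSERT INTO students VALUES (3, 'Charlie', 165, 'green', 'brown', 'pasta');
INSERT INTO tracks VALUES (1, 'Robotics', 'Hardware');
INSERT INTO tracks VALUES (2, 'NLP', 'Language');
INSERT INTO student_tracks VALUES (1, 1);
INSERT INTO student_tracks VALUES (2, 2);
INSERT INTO student_tracks VALUES (3, 2);
""")

# Candidates consistent with "the student is on the NLP track".
nlp_students = conn.execute("""
    SELECT s.name
    FROM students AS s
    JOIN student_tracks AS st ON st.student_id = s.id
    JOIN tracks AS t ON t.id = st.track_id
    WHERE t.name = ?
    ORDER BY s.name
""", ("NLP",)).fetchall()
print(nlp_students)  # [('Bob',), ('Charlie',)]

# Adding a second known fact narrows the list to a single candidate.
brown_eyed = conn.execute("""
    SELECT s.name
    FROM students AS s
    JOIN student_tracks AS st ON st.student_id = s.id
    JOIN tracks AS t ON t.id = st.track_id
    WHERE t.name = ? AND s.eye_color = ?
""", ("NLP", "brown")).fetchall()
print(brown_eyed)  # [('Bob',)]

conn.close()
```

Each confirmed answer becomes another `WHERE` condition. Once a query returns a single row, the agent has enough information to make its guess.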

### 🤖 3. Set Up the SQL Agent with Gemini
#### (with a local Ollama model such as `codellama` as a fallback option)

In [None]:
llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash", temperature=0.0)
# llm = ChatOllama(model="codellama:latest", temperature=0.0)
db = SQLDatabase.from_uri("sqlite:///students.db")

toolkit = SQLDatabaseToolkit(db=db, llm=llm)
agent = create_sql_agent(llm=llm, toolkit=toolkit, verbose=True)

### 🤖 4. A Smarter Agent: Handling History, Decisions, and Exit Conditions

Our updated version of the agent is significantly more capable. In addition to asking and answering SQL-based questions, it now includes:

#### ✅ Enhanced Prompt Logic and Decision Making

The chain starts by building a **conversation history**, which allows the agent to:
- Recall what questions the user already asked
- Avoid repeating queries
- Use prior responses to guide new decisions

To achieve this, we use a custom `format_question_with_history()` function that builds a full interaction context using:
- A `previous_qa` list, storing each (question, answer) pair
- A formatted string combining this history and the current question
- A carefully crafted prompt instructing the agent to:
  - Ask questions that narrow the possible candidates
  - Avoid redundant SQL queries (e.g., if it has just asked a question, it shouldn't try to answer it immediately)
  - Guess the identity when there's only one candidate left
  - Gracefully exit if no matches are possible or the user confirms the guess

#### 🔀 Branching Execution with `RunnableBranch`

To decide whether to **continue** playing or **end** the session, we introduce a **reasoning LLM sub-chain**:
- A separate `early_exit_chain` evaluates whether the agent should proceed or not
- The decision is parsed and attached to the output as a `"decision"` key
- The final chain uses `RunnableBranch` to:
  - Route `"continue"` cases into the agent execution path
  - Return a thank-you message if the decision is `"stop"`

This makes the agent more dynamic and user-friendly:
- It doesn't just mindlessly execute SQL queries
- It knows when to stop, either because it made a confident guess or because no candidates remain
- It waits for user confirmation before finalizing a match
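
The routing idea can be sketched without LangChain. In this hedged example, `normalize_decision`, `route`, and `fake_agent` are hypothetical helpers written for illustration. They mimic what the branch does and show why it helps to clean up the model's reply before comparing it to `"continue"`.

```python
def normalize_decision(raw: str) -> str:
    """Map free-form model output to 'continue' or 'stop' (defaulting to 'stop')."""
    cleaned = raw.strip().strip('"`').lower()
    return "continue" if cleaned == "continue" else "stop"


def route(state: dict, run_agent) -> str:
    """Mimic the branch: run the agent on 'continue', otherwise end the game."""
    if normalize_decision(state["decision"]) == "continue":
        return run_agent(state["question"])
    return "Thank you for playing!"


def fake_agent(question: str) -> str:
    """Hypothetical stand-in for the SQL agent."""
    return "Does the student have blue eyes?"


# A trailing newline or different casing should not change the route taken.
print(route({"question": "...", "decision": "Continue\n"}, fake_agent))
print(route({"question": "...", "decision": "stop"}, fake_agent))
```

Defaulting to `"stop"` for anything unrecognized is a design choice made for this sketch. It matches a branch whose fallback path is the thank-you message.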

#### 🛠️ Modular and Extensible Design

Because the chain is broken into:
- Formatting
- Reasoning
- Querying
- Output handling

...you can easily modify or extend each part. For example:
- Replace the LLM used for decision-making
- Add memory or external logging
- Customize prompts for different personalities or styles

This advanced version simulates **stateful interaction**, **tool use**, and **self-reflection**, all critical steps in designing intelligent assistants. It's a strong pattern for production-ready LLM chains that need to maintain coherence across turns.

In [None]:
previous_qa = []


def format_question_with_history(new_question):
    history = ("\n" + "\n".join(f" - User: {q}\n - Agent: {a}" for q, a in previous_qa)) if len(previous_qa) else ""
    return (
        f"You are playing a game of 'Guess Who?' but this time the candidates are students from a database. "
        f"\nYour task is to figure out which student the user is thinking of:"
        f"\n- Ask relevant questions to shortlist the possible candidates."
        f"\n- If you come up with a question do not query the database again, use that question as your final answer."
        f"\n- If you know who the student is, make your guess (eg. your filtered query returned a single student)."
        f"\n- If you think that there is no possible candidate left, say so."
        f"\n- If you found out who the student was, and the user has confirmed, thank them for the game and stop playing."
        f"\n"
        f"\nHere is the conversation so far:"
        f"{history}"
        f"\n - User: {new_question}"
        f"\n"
        f"\nNow try asking relevant questions. "
        f"\n"
        
    )


reasoning_prompt = PromptTemplate.from_template(
    "Based on the input, decide if you should continue or stop playing."
    '\nIf you would continue playing, output the following message: "continue".'
    '\nIf you think, that you should stop playing, output the following message: "stop".'
    '\nWhen outputting the result, do not include anything but the string. Do not add backticks or markdown formatting around the text.'
    "\n"
    "\nHere is the input:"
    "\n{question}"
)


agent_chain = (
    agent
    | RunnableLambda(lambda response: response['output'])
    | StrOutputParser()
)


early_exit_chain = (
    reasoning_prompt
    | llm
    | StrOutputParser()
)


chain = (
    {
        "question": RunnableLambda(format_question_with_history)
    }
    | RunnablePassthrough.assign(decision=early_exit_chain)
    | RunnableBranch(
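        # Normalize the decision: LLM replies often carry trailing whitespace or different casing
        (lambda result: result['decision'].strip().lower() == 'continue', itemgetter('question') | agent_chain),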
        lambda result: "Thank you for playing!"
    )
    | StrOutputParser()
)

### 🎮 5. Play the Game

In [None]:
question = "Let's play a new round! Who is the student I'm thinking of?"
response = chain.invoke(question)
previous_qa.append((question, response))

print(response)

In [None]:
question = "Yes, the student has blue eyes."
response = chain.invoke(question)
previous_qa.append((question, response))

print(response)

In [None]:
question = "Yes, the student has black hair."
response = chain.invoke(question)
previous_qa.append((question, response))

print(response)

In [None]:
question = "Yes, I was thinking of Alice."
response = chain.invoke(question)
previous_qa.append((question, response))

print(response)

----

## 3. Exercise: ⚖️ Trolley Dilemma, D&D Edition: Morality by Alignment

In this challenge, you'll build a custom LangChain app that simulates moral judgments on classic trolley problem scenarios, but with a twist.

Your judge isn't just any LLM: it's one role-playing a **Dungeons & Dragons moral alignment**, such as *Lawful Good*, *Chaotic Neutral*, or *Neutral Evil*. Each alignment will process the same moral dilemma differently, applying wildly different ethics, logic, and values.

### 🔍 Goal

You will:

- Collect the entities on the different tracks
- Allow the user to select a D&D alignment (or pick one randomly)
- Prompt the LLM to generate a short, ridiculous reason why **each** track should be spared
- If both justifications are equally absurd or equally convincing (based on a second LLM call or logic), flip a coin.
- Parse the response to extract the **verdict** and **rationale**
- Generate a dramatic verdict monologue like a courtroom closing argument.


### 🧠 Chain Design

1. LLM Call 1: **Absurd Justification Generator**
   Prompt the LLM to generate a short, ridiculous reason why each track should be spared.

2. LLM Call 2 or Custom Logic: **Decide, or flip a coin if indecisive**
   If both justifications are equally absurd or equally convincing (based on a second LLM call or custom logic), flip a coin.

3. LLM Call 3: **Final Verdict Generator**
   Generate a dramatic verdict monologue like a courtroom closing argument.

Output structure:
```json
{
  "alignment": "Chaotic Neutral",
  "track_spared": "track_1",
  "justification": "The sentient toaster deserves a chance at fame.",
  "final_statement": "In the great cosmic balance of toaster art and clone anxiety, the answer is clear."
}
```
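
Since the final step has to emit this structure, it can help to validate the model's reply before using it. The sketch below uses only the standard library. `Verdict` and `parse_verdict` are hypothetical names, and the rule that `track_spared` must be `track_1` or `track_2` is an assumption based on the example above.

```python
import json
from dataclasses import dataclass


@dataclass
class Verdict:
    alignment: str
    track_spared: str
    justification: str
    final_statement: str


def parse_verdict(raw: str) -> Verdict:
    """Parse the model's JSON reply and check it has exactly the expected fields."""
    data = json.loads(raw)
    verdict = Verdict(**data)  # raises TypeError on missing or unexpected keys
    if verdict.track_spared not in {"track_1", "track_2"}:
        raise ValueError(f"unexpected track: {verdict.track_spared!r}")
    return verdict


raw_reply = json.dumps({
    "alignment": "Chaotic Neutral",
    "track_spared": "track_1",
    "justification": "The sentient toaster deserves a chance at fame.",
    "final_statement": "In the great cosmic balance of toaster art and clone anxiety, the answer is clear.",
})
verdict = parse_verdict(raw_reply)
print(verdict.track_spared, "-", verdict.justification)
```

Failing loudly on a malformed reply makes it easier to spot when a prompt change causes the model to drift from the agreed format.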

### 🧪 Example Scenario

```python
{
  "track_1": [
    "A sentient toaster that just wants to sing",
    "Your childhood imaginary friend",
    "A mime committing tax fraud"
  ],
  "track_2": [
    "A talking dog who thinks he's a lawyer",
    "Elon Musk tweeting about Dogecoin",
    "Three clones of you, but more charismatic"
  ]
}
```

The Absurd Justification Generator could generate:
- Track 1: The toaster has potential for a Grammy.
- Track 2: The clones might replace the user and do a better job at work.

Then the final verdict could depend on the alignment:
- A **Lawful Good** judge might weigh lives saved.
- A **Chaotic Evil** judge might cheer for the chaos.
- A **True Neutral** judge might refuse to interfere.

And the closing argument could be output as:
```json
{
  "alignment": "Chaotic Neutral",
  "track_spared": "track_1",
  "justification": "The sentient toaster deserves a chance at fame.",
  "final_statement": "In the great cosmic balance of toaster art and clone anxiety, the answer is clear."
}
```

You'll explore how prompt framing affects output, and how to model personality-driven logic flows.

Let's get started!

### 📦 Step 1: Prepare the environment

Set the MLflow experiment to `"DND_Trolley_Dilemma_chain"`.

### 🔧 Step 2: Define the User Inputs

Ask the user for two things:
1. A trolley-style moral scenario with two tracks
2. A D&D alignment (like "Lawful Good", "Chaotic Neutral", etc.)

Example:
```python
scenario = {
    "track_1": [
        "A sentient toaster that just wants to sing",
        "Your childhood imaginary friend",
        "A mime committing tax fraud"
    ],
    "track_2": [
        "A talking dog who thinks he's a lawyer",
        "Elon Musk tweeting about Dogecoin",
        "Three clones of you, but more charismatic"
    ]
}
alignment = "Chaotic Neutral"
```

### 🧠 Step 3: Build the Absurd Justification Generator Chain

- Prompt the LLM to generate a short, ridiculous reason why each track should be spared, one justification per track. Use a `PromptTemplate`.
- Call the LLM
- Parse the outputs

### 🪵 Step 4: Make a Decision, or Flip a Coin

- Create a prompt that judges the justifications for each track. Make sure it role-plays according to the given alignment. Allow it to be indecisive (e.g. three possible judgments: `track_1`, `track_2`, `indecisive`).
- Call the LLM
- If indecisive, pick a random track.
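
One possible shape for the coin-flip logic is sketched below. The function name `resolve_judgment` and the exact label strings are assumptions for illustration. The optional `rng` parameter is there only so the random choice can be made reproducible in tests.

```python
import random
from typing import Optional

TRACKS = ("track_1", "track_2")


def resolve_judgment(judgment: str, rng: Optional[random.Random] = None) -> str:
    """Return the judge's track, or flip a coin when the judge is indecisive."""
    choice = judgment.strip().lower()
    if choice in TRACKS:
        return choice
    if choice == "indecisive":
        # Fall back to the module-level generator when no rng is supplied.
        return (rng or random).choice(TRACKS)
    raise ValueError(f"unexpected judgment: {judgment!r}")


print(resolve_judgment("track_2"))
print(resolve_judgment("Indecisive", rng=random.Random(0)))
```

Wrapped in a `RunnableLambda`, a function like this could sit between the judging call and the final verdict prompt.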

### ⚖️ Step 5: Final Verdict Generator

- Write a prompt that takes in the output of the previous step and generates a final verdict in the pre-determined format.
- Call the LLM

### Step 6: Assemble the Chain

Put it all together into a single chain.

### 🧪 Step 7: Run the Chain

Execute the full flow using the user's scenario and alignment.

```python
response = chain.invoke({
    "track_1": scenario['track_1'],
    "track_2": scenario['track_2'],
    "alignment": alignment
})

print("🧑‍⚖️ Moral Judge's Decision:\n")
print(response.content)
```

----