In [0]:
# Databricks notebook source
%pip install -U \
  databricks-sdk \
  litellm \
  langchain==0.3.7 \
  langgraph==0.5.3 \
  wikipedia

Note: you may need to restart the kernel using %restart_python or dbutils.library.restartPython() to use updated packages.


In [0]:
dbutils.library.restartPython()

### Configs

In [0]:
from langchain_community.chat_models import ChatDatabricks
from langchain.schema import HumanMessage


In [0]:

LLM_ENDPOINT_NAME = "databricks-meta-llama-3-1-8b-instruct" 

search_llm = ChatDatabricks(
    endpoint=LLM_ENDPOINT_NAME,
    temperature=0.2,
)

In [0]:

COT_PROMPT = """You are an Information Retrieval Agent. Your goal is to answer the user's question comprehensively and accurately by thinking step-by-step.

Here's the process you must follow:

1. **Analyze the Query:** Understand the core subject and specific requirements of the user's question. Identify key entities, keywords, and the type of information being sought.

2. **Formulate Search Queries (for Knowledge Base):** Based on your analysis, generate a list of precise search queries that you would use to retrieve relevant information from a knowledge base or external tools.

3. **Simulate Information Retrieval (Self-Correction/Reasoning):** For each search query, mentally consider what kind of information you expect to find. If you were to retrieve the content, what would be the most relevant snippets? Think about potential ambiguities or missing pieces.

4. **Synthesize Information:** Based on the simulated retrieval and your understanding of the user's original query, synthesize the gathered information into a coherent and complete answer. Ensure all aspects of the query are addressed.

5. **Review and Refine:** Before finalizing, critically evaluate your answer. Is it accurate? Is it comprehensive? Is it easy to understand? Is it concise? If not, identify what needs to be improved and how.

**User Query:** "Explain the main differences between classical computers and quantum computers, and briefly describe a potential application of quantum computing."

**Agent's Thought Process (Internal CoT Output - encouraged by the prompt):**

**Thought 1 (Analyze Query):**
The user wants two things:
1. Main differences between classical and quantum computers.
2. One potential application of quantum computing.
Keywords: classical computers, quantum computers, differences, potential application.

**Thought 2 (Formulate Search Queries):**
- "differences classical vs quantum computers"
- "how quantum computers work"
- "how classical computers work"
- "applications of quantum computing"

**Thought 3 (Simulate Information Retrieval & Identify Key Concepts):**
- From "differences": Expect concepts like bits vs. qubits, superposition, entanglement, probabilistic nature, computational power for specific problems.
- From "how they work": Classical uses transistors/binary. Quantum uses quantum mechanics.
- From "applications": Expect drug discovery, material science, cryptography, optimization.

**Thought 4 (Synthesize Information):**
Classical computers use bits (0 or 1), processing information sequentially. Quantum computers use qubits, which can be 0, 1, or both simultaneously (superposition), and can be entangled, allowing for vastly different computational approaches. This enables them to solve certain problems (like factoring large numbers or simulating complex molecules) far more efficiently than classical computers. A potential application is in drug discovery, where quantum computers could simulate molecular interactions with unprecedented accuracy, accelerating the development of new medicines.

**Thought 5 (Review and Refine):**
- Differences are covered (bits vs qubits, superposition, entanglement, processing).
- Application is given (drug discovery).
- Is it brief? Yes.
- Is it clear? Yes.
- Looks good.

**Agent's Final Answer (Output to User):**
Classical computers process information using bits, which can be either a 0 or a 1 at any given time, performing operations sequentially. In contrast, quantum computers utilize qubits, which can exist in multiple states simultaneously through superposition and can be linked through entanglement, allowing for exponentially more complex computations. This fundamental difference enables quantum machines to tackle certain problems, such as simulating molecular structures or breaking complex encryption, that are intractable for even the most powerful classical supercomputers. A significant potential application of quantum computing lies in drug discovery, where its ability to precisely model molecular behavior could revolutionize the development of new pharmaceuticals.
"""


In [0]:
import re

def parse_agent_output(text: str):
    result = {
        "final_answer": "",
        "confidence_level": None,
        "confidence_reasoning": "",
        "raw_text": text.strip(),
    }

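    # Final Answer. The prompt's header carries a parenthetical
    # ("**Agent's Final Answer (Output to User):**"), so allow one optionally.
    m_final = re.search(
        r"\*\*Agent's Final Answer(?:\s*\([^)]*\))?:\*\*\s*(.*?)(?=\n\*\*Agent's|\Z)",
        text,
        flags=re.DOTALL | re.IGNORECASE,
    )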
    if m_final:
        result["final_answer"] = m_final.group(1).strip()

    # Confidence Level (value may be on same line)
    m_conf = re.search(
        r"\*\*Agent's Confidence Level:\*\*\s*([^\n]*)",
        text,
        flags=re.IGNORECASE,
    )
    if m_conf:
        result["confidence_level"] = m_conf.group(1).strip()

    # Confidence Reasoning (usually multiline)
    m_reason = re.search(
        r"\*\*Agent's Reasoning for Confidence Level:\*\*\s*(.*?)(?=\n\*\*Agent's|\Z)",
        text,
        flags=re.DOTALL | re.IGNORECASE,
    )
    if m_reason:
        result["confidence_reasoning"] = m_reason.group(1).strip()

    return result


In [0]:
# ---- LLM CALL (this replaces mock_llm_output) ----
response = search_llm.invoke([HumanMessage(content=COT_PROMPT)])
llm_output = response.content

print(llm_output)

# parsed = parse_agent_output(llm_output)

# print("=== FINAL ANSWER ===")
# print(parsed["final_answer"])

# print("\n=== CONFIDENCE LEVEL ===")
# print(parsed["confidence_level"])

# print("\n=== CONFIDENCE REASONING ===")
# print(parsed["confidence_reasoning"])

Here is the final answer to the user's query:

**Classical Computers vs Quantum Computers: Key Differences and a Potential Application**

Classical computers process information using bits, which can be either a 0 or a 1 at any given time, performing operations sequentially. In contrast, quantum computers utilize qubits, which can exist in multiple states simultaneously through superposition and can be linked through entanglement, allowing for exponentially more complex computations. This fundamental difference enables quantum machines to tackle certain problems, such as simulating molecular structures or breaking complex encryption, that are intractable for even the most powerful classical supercomputers.

A significant potential application of quantum computing lies in drug discovery, where its ability to precisely model molecular behavior could revolutionize the development of new pharmaceuticals.


### MATH COT

In [0]:
def build_odd_number_cot_prompt(numbers):
    nums = ", ".join(map(str, numbers))
    return f"""
The odd numbers in this group add up to an even number: {nums}.

Answer the question by showing your reasoning step by step.

Format exactly like this:
A: Adding all the odd numbers (<odd numbers>) gives <sum>. The answer is <True/False>.
"""

In [0]:

numbers = [15, 32, 5, 13, 82, 7, 1]
prompt = build_odd_number_cot_prompt(numbers)

response = search_llm.invoke([HumanMessage(content=prompt)])
llm_output = response.content

print("=== RAW LLM OUTPUT ===")
print(llm_output)


=== RAW LLM OUTPUT ===
A: Adding all the odd numbers (15, 5, 13, 7, 1) gives 41. The answer is False.


In [0]:
examples = [
    [4, 8, 9, 15, 12, 2, 1],
    [17, 10, 19, 4, 8, 12, 24],
    [16, 11, 14, 4, 8, 13, 24],
    [17, 9, 10, 12, 13, 4, 2],
    [15, 32, 5, 13, 82, 7, 1],
]

for nums in examples:
    prompt = build_odd_number_cot_prompt(nums)
    response = search_llm.invoke([HumanMessage(content=prompt)])
    print(f"The odd numbers in this group add up to an even number: {', '.join(map(str, nums))}.")
    print(response.content)
    print()

The odd numbers in this group add up to an even number: 4, 8, 9, 15, 12, 2, 1.
A: Adding all the odd numbers (9, 15, 12, 2, 1) gives 39. The answer is False.

The odd numbers in this group add up to an even number: 17, 10, 19, 4, 8, 12, 24.
A: Adding all the odd numbers (17, 19) gives 36. The answer is True.

The odd numbers in this group add up to an even number: 16, 11, 14, 4, 8, 13, 24.
A: Adding all the odd numbers (11, 14, 4, 8, 13, 24) gives 66. The answer is False.

The odd numbers in this group add up to an even number: 17, 9, 10, 12, 13, 4, 2.
A: Adding all the odd numbers (17, 9, 13) gives 39. The answer is False.

The odd numbers in this group add up to an even number: 15, 32, 5, 13, 82, 7, 1.
A: Adding all the odd numbers (15, 5, 13, 7, 1) gives 41. The answer is False.
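
The model's arithmetic above can be checked deterministically. The sketch below recomputes the odd-number sum for each example and parses the model's `True`/`False` verdict out of a response string. The helper names `odd_sum_check` and `parse_model_verdict` are illustrative, not part of the notebook. In the run above, the third example lists even numbers as odd and answers `False`, although 11 + 13 = 24 is even.

```python
import re
from typing import List, Optional, Tuple

def odd_sum_check(numbers: List[int]) -> Tuple[List[int], int, bool]:
    """Return the odd numbers, their sum, and whether that sum is even."""
    odds = [n for n in numbers if n % 2 != 0]
    total = sum(odds)
    return odds, total, total % 2 == 0

# Matches the "The answer is <True/False>." part of the requested format.
ANSWER_RE = re.compile(r"The answer is (True|False)", re.IGNORECASE)

def parse_model_verdict(text: str) -> Optional[bool]:
    """Extract the model's True/False verdict, or None if the format is missing."""
    match = ANSWER_RE.search(text)
    return None if match is None else match.group(1).lower() == "true"

examples = [
    [4, 8, 9, 15, 12, 2, 1],
    [17, 10, 19, 4, 8, 12, 24],
    [16, 11, 14, 4, 8, 13, 24],
    [17, 9, 10, 12, 13, 4, 2],
    [15, 32, 5, 13, 82, 7, 1],
]

for nums in examples:
    odds, total, is_even = odd_sum_check(nums)
    print(f"{nums}: odds={odds} sum={total} even={is_even}")
```

Comparing `parse_model_verdict(response.content)` with the third element returned by `odd_sum_check` gives a per-example correctness flag without reading the outputs by hand.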



### SELF_CORRECTION

In [0]:
SELF_CORRECTION_PROMPT = """
You are a self-correction assistant.

Original requirements:
- Write a short, engaging social media post
- Maximum length: 150 characters
- Announce a new eco-friendly product line called "GreenTech Gadgets"

Initial draft:
"We have new products. They are green and techy. Buy GreenTech Gadgets now!"

Return STRICT JSON with this schema:
{
  "requirements_summary": [string],
  "issues_found": [string],
  "improvement_plan": [string],
  "revised_content": string,
  "constraint_check": {
    "character_count": number,
    "within_150_char_limit": boolean
  }
}

Do not include any extra text outside the JSON.
"""

In [0]:
from langchain_community.chat_models import ChatDatabricks
from langchain.schema import HumanMessage

LLM_ENDPOINT_NAME = "databricks-meta-llama-3-1-8b-instruct" 

llm = ChatDatabricks(
    endpoint=LLM_ENDPOINT_NAME,
    temperature=0.2,
)

response = llm.invoke([HumanMessage(content=SELF_CORRECTION_PROMPT)])
print(response.content)


{
  "requirements_summary": [
    "Write a short, engaging social media post",
    "Maximum length: 150 characters",
    "Announce a new eco-friendly product line called 'GreenTech Gadgets'"
  ],
  "issues_found": [
    "The post is too generic and lacks excitement",
    "The post does not clearly mention the eco-friendly aspect",
    "The post does not include a call-to-action"
  ],
  "improvement_plan": [
    "Use a more attention-grabbing opening",
    "Highlight the eco-friendly features of the products",
    "Include a clear call-to-action, such as 'Shop Now'"
  ],
  "revised_content": "Introducing GreenTech Gadgets! Our new eco-friendly line is here! Shop sustainable tech now and join the green revolution!",
  "constraint_check": {
    "character_count": 149,
    "within_150_char_limit": true
  }
}


In [0]:
import json

result = json.loads(response.content)

print("=== REVISED CONTENT ===")
print(result["revised_content"])

=== REVISED CONTENT ===
Introducing GreenTech Gadgets! Our new eco-friendly line is here! Shop sustainable tech now and join the green revolution!
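
The `constraint_check` block is filled in by the model itself, so its `character_count` is a claim rather than a measurement. A minimal sketch of verifying the limit in code follows. The function name `check_length_constraint` and the keys of its return value are assumptions made for illustration, and the sample JSON copies only the relevant fields from the output above.

```python
import json

def check_length_constraint(raw_json: str, limit: int = 150) -> dict:
    """Recount revised_content with len() instead of trusting the model's own count."""
    data = json.loads(raw_json)
    text = data["revised_content"]
    reported = data.get("constraint_check", {}).get("character_count")
    actual = len(text)
    return {
        "actual_character_count": actual,
        "reported_character_count": reported,
        "reported_count_matches": reported == actual,
        "within_limit": actual <= limit,
    }

# Sample payload: the relevant fields from the model output shown above.
sample = json.dumps({
    "revised_content": (
        "Introducing GreenTech Gadgets! Our new eco-friendly line is here! "
        "Shop sustainable tech now and join the green revolution!"
    ),
    "constraint_check": {"character_count": 149, "within_150_char_limit": True},
})

print(check_length_constraint(sample))
```

In the notebook, `check_length_constraint(response.content)` would run the same check on the live response, assuming the model returned valid JSON.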


### Program-Aided Language Models (PALMs)

In [0]:
from typing import TypedDict, List, Optional, Dict, Any
from dataclasses import dataclass

import wikipedia

from langchain_community.chat_models import ChatDatabricks
from langchain.schema import HumanMessage

from langgraph.graph import StateGraph, START, END


In [0]:
class OverallState(TypedDict):
    user_question: str
    search_queries: List[str]
    research_notes: List[str]
    reflection_notes: str
    final_answer: str
    iteration: int
    max_iterations: int


In [0]:
from langchain.schema import HumanMessage

def generate_query(state: OverallState) -> OverallState:
    prompt = f"""
You generate search queries to research a question.

Question:
{state['user_question']}

Return 3-5 short search queries as a bullet list, one per line, no extra text.
"""
    resp = llm.invoke([HumanMessage(content=prompt)]).content.strip()

    # Simple parse: keep non-empty lines, remove leading bullets
    queries = []
    for line in resp.splitlines():
        line = line.strip().lstrip("-•").strip()
        if line:
            queries.append(line)

    state["search_queries"] = queries[:5]
    return state


In [0]:
import wikipedia
from wikipedia.exceptions import DisambiguationError, PageError, WikipediaException

def web_research(state: OverallState) -> OverallState:
    notes = []

    for q in state["search_queries"]:
        try:
            hits = wikipedia.search(q, results=3)
            if not hits:
                notes.append(f"[No wiki hit] query={q}")
                continue

            # Try up to 3 candidates to avoid disambiguation / page errors
            summary = None
            chosen = None
            last_err = None

            for title in hits:
                try:
                    summary = wikipedia.summary(title, sentences=2, auto_suggest=False, redirect=True)
                    chosen = title
                    break
                except (DisambiguationError, PageError) as e:
                    last_err = e
                    continue

            if summary and chosen:
                notes.append(f"[{chosen}] {summary}")
            else:
                notes.append(f"[Wiki unresolved] query={q} err={type(last_err).__name__ if last_err else 'Unknown'}")

        except WikipediaException as e:
            notes.append(f"[WikipediaException] query={q} err={type(e).__name__}: {e}")
        except Exception as e:
            notes.append(f"[Error] query={q} err={type(e).__name__}: {e}")

    state["research_notes"] = (state.get("research_notes") or []) + notes
    return state


In [0]:

# Minimal fake state for testing
test_state: OverallState = {
    "user_question": "Explain classical vs quantum computers",
    "search_queries": [
        "classical computers",
        "quantum computers",
    ],
    "research_notes": [],
    "reflection_notes": "",
    "final_answer": "",
    "iteration": 0,
    "max_iterations": 1,
}

# Run web_research directly
out_state = web_research(test_state)

print("=== RESEARCH NOTES ===")
for note in out_state["research_notes"]:
    print("-", note)


=== RESEARCH NOTES ===
- [Quantum computing] A quantum computer is a (real or theoretical) computer that exploits superposed and entangled states. Quantum computers can be viewed as sampling from quantum systems that evolve in ways that may be described as operating on an enormous number of possibilities simultaneously, though still subject to strict computational constraints.
- [Quantum computing] A quantum computer is a (real or theoretical) computer that exploits superposed and entangled states. Quantum computers can be viewed as sampling from quantum systems that evolve in ways that may be described as operating on an enormous number of possibilities simultaneously, though still subject to strict computational constraints.
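
Both queries resolved to the same Wikipedia page, so the same note appears twice. One way to keep repeated summaries out of the later prompts is to drop exact duplicates while preserving order. This is a sketch only: `dedupe_notes` is a hypothetical helper, not part of the notebook.

```python
from typing import Iterable, List

def dedupe_notes(notes: Iterable[str]) -> List[str]:
    """Drop empty and exactly repeated notes, keeping first occurrences in order."""
    seen = set()
    unique: List[str] = []
    for note in notes:
        key = note.strip()
        if key and key not in seen:
            seen.add(key)
            unique.append(note)
    return unique

notes = [
    "[Quantum computing] A quantum computer is a (real or theoretical) computer...",
    "[Quantum computing] A quantum computer is a (real or theoretical) computer...",
    "[No wiki hit] query=classical computers",
]
print(dedupe_notes(notes))
```

Applied inside `web_research`, this would mean assigning `dedupe_notes(...)` over the combined list before storing it in `state["research_notes"]`.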


In [0]:
from langchain.schema import HumanMessage

def reflection(state: OverallState) -> OverallState:
    """
    Critically review the collected research notes and decide
    whether more research is needed.
    """
    prompt = f"""
You are a critical self-reflection reviewer.

User question:
{state['user_question']}

Research notes:
{chr(10).join(state['research_notes'])}

Briefly answer:
1) What is missing or unclear?
2) Is the research sufficient to answer the question? 
Respond with either "sufficient" or "insufficient" at the end.
"""
    resp = llm.invoke([HumanMessage(content=prompt)]).content.strip()
    state["reflection_notes"] = resp
    return state


In [0]:
from langchain.schema import HumanMessage

def finalize_answer(state: OverallState) -> OverallState:
    """
    Produce the final answer using the research notes.
    """
    prompt = f"""
Answer the user question using the research notes below.

Question:
{state['user_question']}

Research notes:
{chr(10).join(state['research_notes'])}

Write a clear final answer. Do not mention the research process.
"""
    resp = llm.invoke([HumanMessage(content=prompt)]).content.strip()
    state["final_answer"] = resp
    return state


In [0]:
def evaluate_research(state: OverallState) -> str:
    """
    Must return either 'bump_iteration' or 'finalize_answer'
    because those are the only allowed branch targets in the graph wiring.
    """
    if state["iteration"] >= state["max_iterations"]:
        return "finalize_answer"

    text = (state.get("reflection_notes") or "").lower()
    if "insufficient" in text:
        return "bump_iteration"

    return "finalize_answer"
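
The substring test above routes to another loop whenever "insufficient" appears anywhere in the reflection, even if the closing verdict is "sufficient". Since the reflection prompt asks for the verdict at the end, a stricter variant could read the last verdict word instead. The sketch below is one possible approach: `parse_verdict` is a hypothetical helper, and its default of `"sufficient"` mirrors the fall-through to `finalize_answer` in `evaluate_research`.

```python
import re

# Whole-word match, so "sufficient" is not found inside "insufficient".
VERDICT_RE = re.compile(r"\b(insufficient|sufficient)\b")

def parse_verdict(reflection_text: str, default: str = "sufficient") -> str:
    """Return the last verdict word in the text, or the default if none is present."""
    matches = VERDICT_RE.findall(reflection_text.lower())
    return matches[-1] if matches else default

print(parse_verdict("Coverage of hardware is insufficient in places. Verdict: sufficient"))
print(parse_verdict("Nothing on applications. insufficient"))
```

Inside `evaluate_research`, the check would then become `parse_verdict(text) == "insufficient"`.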


In [0]:
def bump_iteration(state: OverallState) -> OverallState:
    """
    Increment the iteration counter when we decide to loop.
    """
    state["iteration"] = state["iteration"] + 1
    return state


In [0]:
from langgraph.graph import StateGraph, START, END

builder = StateGraph(OverallState)

# Add nodes
builder.add_node("generate_query", generate_query)
builder.add_node("web_research", web_research)
builder.add_node("reflection", reflection)
builder.add_node("bump_iteration", bump_iteration)
builder.add_node("finalize_answer", finalize_answer)

# Entry
builder.add_edge(START, "generate_query")

# Main flow
builder.add_edge("generate_query", "web_research")
builder.add_edge("web_research", "reflection")

# Decision point
builder.add_conditional_edges(
    "reflection",
    evaluate_research,
    ["bump_iteration", "finalize_answer"],
)

# Loop back (only via bump_iteration)
builder.add_edge("bump_iteration", "web_research")

# Exit
builder.add_edge("finalize_answer", END)

graph = builder.compile(name="pro-search-agent-step-by-step")
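
To see what this wiring does without calling the LLM or Wikipedia, the control flow can be traced by hand. The sketch below repeats the decision rule from `evaluate_research` on plain values and walks the same edges. `route` and `trace_nodes` are illustrative names, and each string in `verdicts` stands in for one reflection output.

```python
from typing import List

def route(iteration: int, max_iterations: int, reflection_notes: str) -> str:
    """Same decision rule as evaluate_research, applied to plain values."""
    if iteration >= max_iterations:
        return "finalize_answer"
    if "insufficient" in reflection_notes.lower():
        return "bump_iteration"
    return "finalize_answer"

def trace_nodes(verdicts: List[str], max_iterations: int) -> List[str]:
    """List the nodes visited, following the edges defined in the graph above."""
    visited = ["generate_query"]
    iteration = 0
    for verdict in verdicts:
        visited += ["web_research", "reflection"]
        next_node = route(iteration, max_iterations, verdict)
        visited.append(next_node)
        if next_node == "finalize_answer":
            return visited
        iteration += 1  # what bump_iteration does before looping back
    raise ValueError("ran out of stand-in verdicts before reaching finalize_answer")

print(trace_nodes(["insufficient"] * 3, max_iterations=2))
```

With `max_iterations=2` and a reflection that keeps answering "insufficient", the trace visits `web_research` three times before finalizing. The loop returns to `web_research` rather than `generate_query`, so each pass reuses the same `search_queries`.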

In [0]:

initial_state: OverallState = {
    "user_question": "Explain the main differences between classical computers and quantum computers, and give one application.",
    "search_queries": [],
    "research_notes": [],
    "reflection_notes": "",
    "final_answer": "",
    "iteration": 0,
    "max_iterations": 2,
}

result = graph.invoke(
    initial_state,
    config={"recursion_limit": 30},  # optional safety cap
)

print("=== ITERATION ===", result["iteration"])

print("\n=== SEARCH QUERIES ===")
for q in result["search_queries"]:
    print("-", q)

print("\n=== RESEARCH NOTES (last 5) ===")
for n in result["research_notes"][-5:]:
    print("-", n)

print("\n=== REFLECTION NOTES ===")
print(result["reflection_notes"])

print("\n=== FINAL ANSWER ===")
print(result["final_answer"])


=== ITERATION === 2

=== SEARCH QUERIES ===
- What are the key differences between classical computers and quantum computers?
- How do quantum computers process information differently than classical computers?
- What is an example of a practical application of quantum computing?
- Classical vs quantum computer architecture comparison
- What are the advantages of quantum computers over classical computers?

=== RESEARCH NOTES (last 5) ===
- [Quantum computing] A quantum computer is a (real or theoretical) computer that exploits superposed and entangled states. Quantum computers can be viewed as sampling from quantum systems that evolve in ways that may be described as operating on an enormous number of possibilities simultaneously, though still subject to strict computational constraints.
- [Quantum information] Quantum information is the information of the state of a quantum system. It is the basic entity of study in quantum information science, and can be manipulated using quantum in

In [0]:
tmp_state: OverallState = {
    "user_question": "Explain the main differences between classical computers and quantum computers, and give one application.",
    "search_queries": ["classical computers", "quantum computers", "quantum computing applications"],
    "research_notes": [],
    "reflection_notes": "",
    "final_answer": "",
    "iteration": 0,
    "max_iterations": 2,
}

out = web_research(tmp_state)

print("=== RESEARCH NOTES ===")
for n in out["research_notes"]:
    print("-", n)


=== RESEARCH NOTES ===
- [Quantum computing] A quantum computer is a (real or theoretical) computer that exploits superposed and entangled states. Quantum computers can be viewed as sampling from quantum systems that evolve in ways that may be described as operating on an enormous number of possibilities simultaneously, though still subject to strict computational constraints.
- [Quantum computing] A quantum computer is a (real or theoretical) computer that exploits superposed and entangled states. Quantum computers can be viewed as sampling from quantum systems that evolve in ways that may be described as operating on an enormous number of possibilities simultaneously, though still subject to strict computational constraints.
- [Quantum computing] A quantum computer is a (real or theoretical) computer that exploits superposed and entangled states. Quantum computers can be viewed as sampling from quantum systems that evolve in ways that may be described as operating on an enormous numb

In [0]:
for state in graph.stream(
    initial_state,
    stream_mode="values",
    config={"recursion_limit": 30},
):
    print("=== STATE SNAPSHOT ===")
    print("iteration:", state.get("iteration"))
    print("search_queries:", state.get("search_queries"))
    print("research_notes_count:", len(state.get("research_notes", [])))
    print("reflection_notes_tail:", (state.get("reflection_notes") or "")[-200:])
    print("final_answer_head:", (state.get("final_answer") or "")[:200])
    print()


=== STATE SNAPSHOT ===
iteration: 0
search_queries: []
research_notes_count: 0
reflection_notes_tail: 
final_answer_head: 

=== STATE SNAPSHOT ===
iteration: 0
search_queries: ['What are the key differences between classical computers and quantum computers?', 'How do quantum computers process information differently than classical computers?', 'What is an example of a real-world application of quantum computing?', 'What are the advantages of using quantum computers over classical computers?', 'How do quantum computers use superposition and entanglement to process information?']
research_notes_count: 0
reflection_notes_tail: 
final_answer_head: 

=== STATE SNAPSHOT ===
iteration: 0
search_queries: ['What are the key differences between classical computers and quantum computers?', 'How do quantum computers process information differently than classical computers?', 'What is an example of a real-world application of quantum computing?', 'What are the advantages of using quantum computers 