# Script 04 — Deploy to the Cloud with ThoughtBase

This script takes the research agent from Script 03 and deploys it
as a **live cloud API** using ThoughtBase.

You will:
1. Verify your ThoughtBase account
2. Store API keys as server-side secrets
3. Deploy a simple function as a sanity check
4. Deploy a ThoughtFlow-powered research agent
5. Have a multi-turn conversation with the deployed agent
6. See how to call the agent from curl or plain Python

**Most cells make real API calls** against your ThoughtBase account, and
the deployed agent becomes a real HTTP endpoint reachable from anywhere.

Prerequisites:
  - ThoughtBase API key in `.env` (read as `THB_API_KEY`)
  - Groq and Brave API keys in `.env` (read as `GROQ_API_KEY` and `BRAVE_API_KEY`)
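
Before running the cells, it can help to confirm that all three keys are actually present. The sketch below assumes the variable names this script reads from the environment (`THB_API_KEY`, `GROQ_API_KEY`, `BRAVE_API_KEY`). `missing_keys` is a hypothetical helper written for illustration, not part of ThoughtBase, and it should run after `.env` has been loaded.

```python
import os

# Names taken from the os.environ lookups used later in this script.
REQUIRED_KEYS = ("THB_API_KEY", "GROQ_API_KEY", "BRAVE_API_KEY")


def missing_keys(env=None, required=REQUIRED_KEYS):
    """Return the required names that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not (env.get(name) or "").strip()]


if __name__ == "__main__":
    absent = missing_keys()
    if absent:
        print("Missing keys:", ", ".join(absent))
    else:
        print("All required keys are set.")
```

Passing a plain dict instead of `os.environ` makes the check easy to exercise without touching real credentials.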

In [1]:
from _setup import load_env, print_heading, print_separator
import os

env = load_env()

from thoughtbase import (
    set_api_key,
    get_balance,
    set_secrets,
    list_secrets,
    test_agent,
    deploy_agent,
    call_agent,
    list_agents,
)

---
## Part 1 — Connect to ThoughtBase

In [2]:
print_heading("1.1  Set API key and verify account")

thb_key = os.environ.get("THB_API_KEY", "")
if not thb_key:
    raise RuntimeError("THB_API_KEY is not set; add it to .env before continuing.")
set_api_key(thb_key)

balance = get_balance()
print("Balance:", balance)


══════════════════════════════════════════════════════════════════════
  1.1  Set API key and verify account
══════════════════════════════════════════════════════════════════════

Balance: {'user_id': 'j2RJkFaozg1ASOLAD', 'balance_usd': '$9.98789'}


---
## Part 2 — Secrets Management

Your deployed agent needs API keys for Groq and Brave Search.
ThoughtBase stores these as **secrets** — they are injected into
every execution sandbox as a `SECRETS` dict, so you never embed
credentials in your agent code.
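
To get a feel for what "injected as a `SECRETS` dict" means for agent code, the sketch below mimics the idea locally: the code string is executed in a namespace where `SECRETS` already exists as a global. This is only a local stand-in under that assumption. `run_with_secrets` is a hypothetical helper, and nothing here describes how ThoughtBase actually builds its sandbox.

```python
def run_with_secrets(code, fname, input_obj, secrets):
    """Run `fname` from `code` with a SECRETS dict visible as a global.

    Local illustration only; it is not ThoughtBase's sandbox.
    """
    namespace = {"SECRETS": dict(secrets)}
    exec(code, namespace)  # defines the agent's functions inside `namespace`
    return namespace[fname](input_obj)


# The function body never mentions a key value, only the SECRETS name.
code = 'def check(x): return {"keys": sorted(SECRETS), "count": len(SECRETS)}'
result = run_with_secrets(
    code,
    "check",
    {},
    {"GROQ_API_KEY": "dummy-value", "BRAVE_API_KEY": "dummy-value"},
)
print(result)
```

The dummy values stand in for real keys, so the same agent string can be exercised locally without exposing credentials.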

In [3]:
print_heading("2.1  Store secrets")

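groq_key = os.environ.get("GROQ_API_KEY", "")
brave_key = os.environ.get("BRAVE_API_KEY", "")
if not (groq_key and brave_key):
    raise RuntimeError("GROQ_API_KEY and BRAVE_API_KEY must both be set in .env.")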

result = set_secrets({
    "GROQ_API_KEY": groq_key,
    "BRAVE_API_KEY": brave_key,
})
print("set_secrets:", result)


══════════════════════════════════════════════════════════════════════
  2.1  Store secrets
══════════════════════════════════════════════════════════════════════

set_secrets: {'stored': ['GROQ_API_KEY', 'BRAVE_API_KEY']}


In [4]:
print_heading("2.2  Verify secrets are stored")

stored = list_secrets()
print("Stored secret names:", stored)


══════════════════════════════════════════════════════════════════════
  2.2  Verify secrets are stored
══════════════════════════════════════════════════════════════════════

Stored secret names: {'secret_names': ['GROQ_API_KEY', 'BRAVE_API_KEY']}


In [5]:
print_heading("2.3  Prove that SECRETS are injected at runtime")

# This code runs in the cloud sandbox and reads SECRETS.
# Values are never logged — we just confirm the keys are present.
result = test_agent(
    code='def check(x): return {"keys": list(SECRETS.keys()), "count": len(SECRETS)}',
    fname="check",
    input_obj={},
)
print("Secrets visible in sandbox:", result)


══════════════════════════════════════════════════════════════════════
  2.3  Prove that SECRETS are injected at runtime
══════════════════════════════════════════════════════════════════════

Secrets visible in sandbox: {'keys': ['GROQ_API_KEY', 'BRAVE_API_KEY'], 'count': 2}


In [6]:
print_heading("2.4  Quick smoke test — run code without deploying")

result = test_agent(
    code="def add(x): return x['a'] + x['b']",
    fname="add",
    input_obj={"a": 17, "b": 25},
)
print("17 + 25 =", result)
assert result == 42, "Smoke test failed — check your THB_API_KEY"
print("Cloud execution works!")


══════════════════════════════════════════════════════════════════════
  2.4  Quick smoke test — run code without deploying
══════════════════════════════════════════════════════════════════════

17 + 25 = 42
Cloud execution works!


---
## Part 3 — Deploy a Simple Agent

Before deploying the full research agent, let us start with something
trivial to confirm the deploy → call round-trip works.

In [7]:
print_heading("3.1  Deploy a simple function")

simple_code = '''
def greet(name):
    """A trivially simple deployed function."""
    return "Hello from the cloud, {}!".format(name)

def multiply(data):
    """Multiply two numbers passed as a dict."""
    return data["a"] * data["b"]
'''

deploy_result = deploy_agent(simple_code)
simple_agent_id = deploy_result.get("api_id", "")
print("Deployed! Agent ID:", simple_agent_id)


══════════════════════════════════════════════════════════════════════
  3.1  Deploy a simple function
══════════════════════════════════════════════════════════════════════

Deployed! Agent ID: DvrtH1oMXHbC2doIu4x7ipDNNmGsU58QSE


In [8]:
print_heading("3.2  Call the deployed functions")

greeting = call_agent(simple_agent_id, "greet", "World")
print("greet('World') →", greeting)

product = call_agent(simple_agent_id, "multiply", {"a": 7, "b": 6})
print("multiply(7, 6) →", product)


══════════════════════════════════════════════════════════════════════
  3.2  Call the deployed functions
══════════════════════════════════════════════════════════════════════

greet('World') → Hello from the cloud, World!
multiply(7, 6) → 42


---
## Part 4 — Deploy the Research Agent

Now the real thing.  We package the research agent as a self-contained
string of Python code and deploy it to ThoughtBase.

**How API keys reach the cloud:**
The agent code reads credentials from the `SECRETS` dict, which
ThoughtBase automatically injects into every execution sandbox.
You stored the keys in Part 2, so the agent source contains no
credentials; the values are read from `SECRETS` only at run time.
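
A bare `SECRETS["GROQ_API_KEY"]` lookup raises a `KeyError` if the secret was never stored, which can be hard to read in a remote traceback. One possible guard, assuming `SECRETS` behaves like an ordinary dict, is sketched below. `require_secret` is a hypothetical helper that would live inside the agent code string and be called as `require_secret(SECRETS, "GROQ_API_KEY")`.

```python
def require_secret(secrets, name):
    """Return secrets[name], raising a readable error if it is missing or blank."""
    value = secrets.get(name)
    if not value:
        raise RuntimeError(
            "Secret {!r} is not set; store it with set_secrets() first.".format(name)
        )
    return value


# Local demonstration with a stand-in dict instead of the injected SECRETS.
demo_secrets = {"GROQ_API_KEY": "dummy-value"}
print(require_secret(demo_secrets, "GROQ_API_KEY"))
```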

In [9]:
print_heading("4.1  The agent code (cloud version)")

# The agent reads API keys from SECRETS — no credentials in the code
# or in the request payload.

agent_code = '''
from thoughtflow import LLM, MEMORY, THOUGHT, DECIDE, PLAN, SEARCH

# --- Credentials from SECRETS (injected by ThoughtBase) ---
llm = LLM("groq:llama-3.3-70b-versatile", key=SECRETS["GROQ_API_KEY"])
brave_key = SECRETS["BRAVE_API_KEY"]

SYSTEM_PROMPT = """You are Ariel, a sharp and curious research analyst.

Your personality:
- You are direct and opinionated.
- You ask probing questions before researching.
- When you find conflicting information, you surface the conflict honestly.
- You are concise but thorough.

Rules:
- Never fabricate sources.
- If you are unsure, say so.
- Keep responses focused."""

# --- Components ---

intent_classifier = DECIDE(
    name="intent",
    llm=llm,
    choices={
        "new_topic": "User is introducing a new research topic",
        "clarify": "User is answering a clarifying question",
        "research": "User wants to go ahead and research",
        "deeper": "User wants to dig deeper into a sub-topic",
        "done": "User wants to wrap up",
    },
    prompt=(
        "Based on the conversation and the latest message, "
        "determine the user intent.\\n\\n"
        "Conversation:\\n{conversation_context}\\n\\n"
        "Latest message: {last_user_msg}"
    ),
    system_prompt=SYSTEM_PROMPT,
    max_retries=3,
)

discovery_thought = THOUGHT(
    name="discovery",
    llm=llm,
    prompt=(
        "The user wants to research: {last_user_msg}\\n\\n"
        "Context so far:\\n{conversation_context}\\n\\n"
        "Ask 2-3 sharp clarifying questions."
    ),
    system_prompt=SYSTEM_PROMPT,
)

research_planner = PLAN(
    name="research_plan",
    llm=llm,
    actions={
        "search": {"description": "Search the web", "params": {"query": "str"}},
        "analyze": {"description": "Analyze information", "params": {"focus": "str"}},
        "synthesize": {"description": "Create a synthesis", "params": {"format": "str?"}},
    },
    prompt=(
        "Topic: {research_topic}\\nAngle: {research_angle}\\n\\n"
        "Create a focused research plan with 2-3 search queries."
    ),
    system_prompt=SYSTEM_PROMPT,
    max_steps=5,
)

synthesis_thought = THOUGHT(
    name="synthesis",
    llm=llm,
    prompt=(
        "Topic: {research_topic}\\nAngle: {research_angle}\\n\\n"
        "Search results:\\n{search_findings}\\n\\n"
        "Conversation:\\n{conversation_context}\\n\\n"
        "Present your findings clearly and with your own perspective."
    ),
    system_prompt=SYSTEM_PROMPT,
)

deeper_thought = THOUGHT(
    name="deeper_dive",
    llm=llm,
    prompt=(
        "The user wants to go deeper:\\n{last_user_msg}\\n\\n"
        "Previous findings:\\n{search_findings}\\n\\n"
        "Provide a more detailed analysis."
    ),
    system_prompt=SYSTEM_PROMPT,
)

wrapup_thought = THOUGHT(
    name="wrapup",
    llm=llm,
    prompt=(
        "The user is wrapping up.\\n"
        "Conversation:\\n{conversation_context}\\n\\n"
        "Give a brief, warm closing with a 1-2 sentence takeaway."
    ),
    system_prompt=SYSTEM_PROMPT,
)


def _do_research(memory):
    """Execute the search pipeline and return the synthesized text."""
    topic = memory.get_var("research_topic") or memory.last_user_msg(content_only=True)
    angle = memory.get_var("research_angle") or ""
    memory.set_var("research_topic", topic)
    memory.set_var("research_angle", angle)

    memory = research_planner(memory)
    plan = memory.get_var("research_plan_result") or []

    queries = []
    for step in plan:
        for task in step:
            if task.get("action") == "search":
                queries.append(task.get("params", {}).get("query", topic))

    snippets = []
    for q in queries[:3]:
        s = SEARCH(name="s", provider="brave", query=q, api_key=brave_key, max_results=4)
        memory = s(memory)
        for r in (memory.get_var("s_results") or {}).get("results", []):
            snippets.append("- [{}] {}".format(r.get("title", ""), r.get("snippet", "")))

    memory.set_var("search_findings", "\\n".join(snippets) or "(No results.)")
    memory = synthesis_thought(memory)
    return memory


def research_turn(input_obj):
    """
    Process one conversation turn.

    Args:
        input_obj: dict with "message" and optional "memory_json".

    Returns:
        dict with "response" and "memory_json" for the next turn.
    """
    message = input_obj.get("message", "")
    memory_json = input_obj.get("memory_json")

    if memory_json:
        memory = MEMORY.from_json(memory_json)
    else:
        memory = MEMORY()

    memory.add_msg("user", message, channel="api")
    conversation_context = memory.render(format="conversation", max_total_length=3000)
    memory.set_var("conversation_context", conversation_context)

    memory = intent_classifier(memory)
    intent = memory.get_var("intent_result")

    if intent == "new_topic":
        memory.set_var("research_topic", message)
        memory = discovery_thought(memory)
        response = memory.get_var("discovery_result")
    elif intent == "clarify":
        memory.set_var("research_angle", message)
        response = "Got it. Let me search for current information. One moment..."
    elif intent == "research":
        memory = _do_research(memory)
        response = memory.get_var("synthesis_result")
    elif intent == "deeper":
        memory = deeper_thought(memory)
        response = memory.get_var("deeper_dive_result")
    elif intent == "done":
        memory = wrapup_thought(memory)
        response = memory.get_var("wrapup_result")
    else:
        response = "Could you rephrase that?"

    memory.add_msg("assistant", response, channel="api")

    return {
        "response": response,
        "memory_json": memory.to_json(),
    }
'''

print("Agent code prepared ({} chars)".format(len(agent_code)))


══════════════════════════════════════════════════════════════════════
  4.1  The agent code (cloud version)
══════════════════════════════════════════════════════════════════════

Agent code prepared (5706 chars)


---
### 4.2 — Test the agent in the cloud before deploying

`test_agent()` runs code on the ThoughtBase serverless backend without
creating a persistent deployment.  The agent reads its API keys from
`SECRETS` — no credentials in the request.
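
Because each cloud test is a real call, a cheap local pre-check can catch typos first. The sketch below uses only the standard-library `ast` module to confirm that a code string parses and defines the function name you plan to pass as `fname`. `defines_function` is a hypothetical helper. It never executes the code, so it says nothing about whether the agent will run correctly in the sandbox.

```python
import ast


def defines_function(code, fname):
    """Return True if `code` parses and has a top-level function named `fname`."""
    try:
        tree = ast.parse(code)
    except SyntaxError:
        return False
    return any(
        isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == fname
        for node in tree.body
    )


print(defines_function("def add(x): return x['a'] + x['b']", "add"))
```

With the agent from this part, the equivalent check would be `defines_function(agent_code, "research_turn")`.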

In [10]:
print_heading("4.2  Test the agent code in the cloud")

test_result = test_agent(
    code=agent_code,
    fname="research_turn",
    input_obj={
        "message": "What are the biggest challenges with deploying LLMs in production?",
        "memory_json": None,
    },
)

if isinstance(test_result, dict) and "response" in test_result:
    print("Ariel says:")
    print(test_result["response"][:500])
    print("\nMemory preserved:", "memory_json" in test_result)
else:
    print("Raw result:", str(test_result)[:500])


══════════════════════════════════════════════════════════════════════
  4.2  Test the agent code in the cloud
══════════════════════════════════════════════════════════════════════

Ariel says:
To better understand the challenges of deploying LLMs (Large Language Models) in production, I have a few questions:

1. Are you referring to a specific industry or application, such as chatbots, text classification, or language translation?
2. What is the scale of deployment you're considering - is it a small-scale pilot or a large-scale enterprise-wide rollout?
3. Are you more concerned with technical challenges, such as model drift or computational resources, or operational challenges, like d

Memory preserved: True


In [11]:
print_heading("4.3  Deploy the agent")

deploy_result = deploy_agent(agent_code)
agent_id = deploy_result.get("api_id", "")
print("Deployed! Agent ID:", agent_id)


══════════════════════════════════════════════════════════════════════
  4.3  Deploy the agent
══════════════════════════════════════════════════════════════════════

Deployed! Agent ID: NhLNcLc93zp01DrOkxzSNwZ7PjD3LN39KA


---
## Part 5 — Multi-turn Conversation with the Deployed Agent

The agent is now live.  Let us have a real conversation with it
across multiple turns, passing memory between calls so the agent
remembers context.

Notice that the call payloads contain **only the message and memory** —
no API keys.  Credentials come from SECRETS on the server side.

In [12]:
print_heading("5.1  Turn 1 — introduce a topic")

def _extract(result):
    """Pull the response string from a call_agent result."""
    if isinstance(result, dict):
        return result.get("response", str(result))
    return str(result)

turn1 = call_agent(agent_id, "research_turn", {
    "message": "I want to learn about vector databases and how they are used with LLMs.",
    "memory_json": None,
})

print("Ariel:", _extract(turn1)[:500])
memory_state = turn1.get("memory_json") if isinstance(turn1, dict) else None


══════════════════════════════════════════════════════════════════════
  5.1  Turn 1 — introduce a topic
══════════════════════════════════════════════════════════════════════

Ariel: To better understand your research needs, I have a few questions:

1. What specific aspects of vector databases are you interested in learning about (e.g., architecture, use cases, performance optimization)?
2. How do you envision using vector databases in conjunction with Large Language Models (LLMs) - are you looking to improve model training, efficient similarity searches, or something else?
3. Are there any particular applications or industries (e.g., natural language processing, computer vi


In [13]:
print_heading("5.2  Turn 2 — clarify the angle")

turn2 = call_agent(agent_id, "research_turn", {
    "message": "I care most about performance and cost at scale — millions of embeddings.",
    "memory_json": memory_state,
})

print("Ariel:", _extract(turn2)[:500])
memory_state = turn2.get("memory_json") if isinstance(turn2, dict) else None


══════════════════════════════════════════════════════════════════════
  5.2  Turn 2 — clarify the angle
══════════════════════════════════════════════════════════════════════

Ariel: Got it. Let me search for current information. One moment...


In [14]:
print_heading("5.3  Turn 3 — trigger research")

turn3 = call_agent(agent_id, "research_turn", {
    "message": "Yes, go ahead and research that.",
    "memory_json": memory_state,
})

print("Ariel:", _extract(turn3)[:800])


══════════════════════════════════════════════════════════════════════
  5.3  Turn 3 — trigger research
══════════════════════════════════════════════════════════════════════

Ariel: Based on the search results, I've gathered information on vector databases and their use with Large Language Models (LLMs). Vector databases are designed to efficiently store and query high-dimensional vector embeddings, which are crucial for many AI and machine learning applications, including LLMs.

The key benefits of using vector databases with LLMs include:

1. **Efficient similarity searches**: Vector databases can measure the distance between vectors, enabling efficient similarity searches and retrieval of relevant information.
2. **Improved performance**: Vector databases can handle millions of embeddings, making them ideal for large-scale LLM applications.
3. **Cost-effectiveness**: Specialized vector databases can reduce the costs associated with storing and querying large amount


---
## Part 6 — Calling from Outside Python

Your deployed agent is a standard HTTP API.  You can call it from
any language or tool.  The caller only needs the ThoughtBase API key —
the agent's LLM and search credentials live in SECRETS on the server.
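
Whatever the client language, the work comes down to building one JSON body and unpacking one JSON reply. The sketch below isolates those two steps. The field names (`api_key`, `api_id`, `fname`, `encoded`, `zipped`, `input`) and the `output` → `result` nesting are copied from the examples in this part; the meaning of `encoded` and `zipped` is not documented here, so the values are passed through unchanged. `build_request_body` and `unpack_result` are hypothetical helper names, and the shape of an error reply is an assumption.

```python
import json


def build_request_body(api_key, agent_id, fname, input_obj):
    """Assemble the request body using the fields shown in this part."""
    return {
        "api_key": api_key,
        "api_id": agent_id,
        "fname": fname,
        "encoded": 1,  # copied as-is from the examples in this part
        "zipped": 0,   # copied as-is from the examples in this part
        "input": input_obj,
    }


def unpack_result(data):
    """Return data["output"]["result"], or raise with the raw body.

    The error format is not documented here, so any other shape raises.
    """
    try:
        return data["output"]["result"]
    except (KeyError, TypeError):
        raise ValueError("Unexpected response body: {!r}".format(data)) from None


body = build_request_body(
    "<your-thoughtbase-api-key>",
    "<your-agent-id>",
    "research_turn",
    {"message": "What is retrieval-augmented generation?", "memory_json": None},
)
print(json.dumps(body, indent=2))
```

Raising on an unexpected body, rather than returning it, keeps a failed call from being mistaken for an agent reply further down the line.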

In [15]:
print_heading("6.1  The equivalent curl command")

curl_cmd = '''curl -X POST "{exec_url}" \\
  -H "Content-Type: application/json" \\
  -d '{{
    "api_key": "{api_key}",
    "api_id": "{agent_id}",
    "fname": "research_turn",
    "encoded": 1,
    "zipped": 0,
    "input": {{
      "message": "What is retrieval-augmented generation?",
      "memory_json": null
    }}
  }}'
'''.format(
    exec_url="https://bdxwb8xftj.execute-api.us-east-1.amazonaws.com/prod/invoke",
    api_key=(thb_key[:8] + "...") if len(thb_key) > 8 else "<your-key>",
    agent_id=agent_id,
)

print("You can call this agent from any terminal:\n")
print(curl_cmd)
print("(Replace api_key with your full ThoughtBase key.)")


══════════════════════════════════════════════════════════════════════
  6.1  The equivalent curl command
══════════════════════════════════════════════════════════════════════

You can call this agent from any terminal:

curl -X POST "https://bdxwb8xftj.execute-api.us-east-1.amazonaws.com/prod/invoke" \
  -H "Content-Type: application/json" \
  -d '{
    "api_key": "qL58Gs2y...",
    "api_id": "NhLNcLc93zp01DrOkxzSNwZ7PjD3LN39KA",
    "fname": "research_turn",
    "encoded": 1,
    "zipped": 0,
    "input": {
      "message": "What is retrieval-augmented generation?",
      "memory_json": null
    }
  }'

(Replace api_key with your full ThoughtBase key.)


In [16]:
print_heading("6.2  Standalone Python script (no thoughtbase import)")

standalone_script = '''import requests

EXEC_URL = "https://bdxwb8xftj.execute-api.us-east-1.amazonaws.com/prod/invoke"
API_KEY  = "<your-thoughtbase-api-key>"
AGENT_ID = "{agent_id}"


def call(message, memory_json=None):
    """Call the deployed research agent over HTTP.

    No LLM credentials needed — the agent reads them from SECRETS
    on the server side.
    """
    body = {{
        "api_key": API_KEY,
        "api_id": AGENT_ID,
        "fname": "research_turn",
        "encoded": 1,
        "zipped": 0,
        "input": {{
            "message": message,
            "memory_json": memory_json,
        }},
    }}
    resp = requests.post(EXEC_URL, json=body)
    data = resp.json()
    try:
        return data["output"]["result"]
    except Exception:
        return data


# --- Usage ---
result = call("What is the current state of quantum computing?")
print("Agent says:", result.get("response"))

# Multi-turn — pass memory forward:
result2 = call("Tell me more about error correction.", result.get("memory_json"))
print("Agent says:", result2.get("response"))
'''.format(agent_id=agent_id)

print("Save this as call_agent.py and run it standalone:\n")
print(standalone_script)


══════════════════════════════════════════════════════════════════════
  6.2  Standalone Python script (no thoughtbase import)
══════════════════════════════════════════════════════════════════════

Save this as call_agent.py and run it standalone:

import requests

EXEC_URL = "https://bdxwb8xftj.execute-api.us-east-1.amazonaws.com/prod/invoke"
API_KEY  = "<your-thoughtbase-api-key>"
AGENT_ID = "NhLNcLc93zp01DrOkxzSNwZ7PjD3LN39KA"


def call(message, memory_json=None):
    """Call the deployed research agent over HTTP.

    No LLM credentials needed — the agent reads them from SECRETS
    on the server side.
    """
    body = {
        "api_key": API_KEY,
        "api_id": AGENT_ID,
        "fname": "research_turn",
        "encoded": 1,
        "zipped": 0,
        "input": {
            "message": message,
            "memory_json": memory_json,
        },
    }
    resp = requests.post(EXEC_URL, json=body)
    data = resp.json()
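    try:
        return data["output"]["result"]
    except Exception:
        return data


# --- Usage ---
result = call("What is the current state of quantum computing?")
print("Agent says:", result.get("response"))

# Multi-turn — pass memory forward:
result2 = call("Tell me more about error correction.", result.get("memory_json"))
print("Agent says:", result2.get("response"))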

---
## Part 7 — What You Have Deployed

In [17]:
print_heading("7.1  List all deployed agents")

agents = list_agents()

if isinstance(agents, dict) and "api_list" in agents:
    agent_ids = agents["api_list"]
    print("You have {} deployed agent(s):".format(len(agent_ids)))
    for aid in agent_ids:
        print("  -", aid)
elif isinstance(agents, list):
    print("You have {} deployed agent(s):".format(len(agents)))
    for a in agents:
        aid = a.get("api_id", a.get("id", str(a)))
        print("  -", aid)
else:
    print("Agents:", agents)


══════════════════════════════════════════════════════════════════════
  7.1  List all deployed agents
══════════════════════════════════════════════════════════════════════

You have 6 deployed agent(s):
  - t6mR0djfIxBJLoXjDwp0jvCDF4hYGAEL49
  - OOAt4aJrjfVJaMZE31erYvvjhqbGpdyxZP
  - hya6UpZrvAQzPqNltsczK2NQ25BFs81Te3
  - cW3X56SRSWTZxOsRMqJC0kvgRafuAdeuE2
  - DvrtH1oMXHbC2doIu4x7ipDNNmGsU58QSE
  - NhLNcLc93zp01DrOkxzSNwZ7PjD3LN39KA


In [18]:
print_heading("7.2  Summary")

print("Simple agent ID :", simple_agent_id)
print("Research agent ID:", agent_id)
print()
print("Both agents are now live cloud APIs.")
print("They auto-scale, require no servers, and cost only when called.")
print()
print("To update the research agent later:")
print('  from thoughtbase import update_agent')
print('  update_agent("{}", new_code)'.format(agent_id))


══════════════════════════════════════════════════════════════════════
  7.2  Summary
══════════════════════════════════════════════════════════════════════

Simple agent ID : DvrtH1oMXHbC2doIu4x7ipDNNmGsU58QSE
Research agent ID: NhLNcLc93zp01DrOkxzSNwZ7PjD3LN39KA

Both agents are now live cloud APIs.
They auto-scale, require no servers, and cost only when called.

To update the research agent later:
  from thoughtbase import update_agent
  update_agent("NhLNcLc93zp01DrOkxzSNwZ7PjD3LN39KA", new_code)


---
## Recap — The Full Journey

Across four scripts, you have:

| Script | What you built |
|--------|---------------|
| **01** | LLM calls, MEMORY state management, THOUGHT reasoning |
| **02** | DECIDE classification, PLAN generation, ACTION primitives |
| **03** | A conversational research agent with personality |
| **04** | Secrets, deployment, multi-turn cloud API, external access |

The entire stack — from a single LLM call to a production API —
was built with **6 primitives**: LLM, MEMORY, THOUGHT, DECIDE,
PLAN, and ACTION.

That is ThoughtFlow + ThoughtBase.
