# 16 - Resource-Aware Optimization
- You have an intelligent system designed to answer user questions. 
- This system first assesses how difficult the question is. For easy questions, it uses a quick, cost-effective language model (like Gemini Flash 1.5). But if the question is complex, the agent considers using a more powerful, albeit more expensive, language model (like Gemini 2.5 Pro). 
- The decision to use the more powerful model also depends on whether there's enough "budget" or time left. The system dynamically makes these choices. 

In [None]:
# Conceptual Python-like structure, not runnable code
# In a real-world scenario, you wouldn't just rely on query length. Imagine an advanced setup where an LLM itself acts as the router. This LLM could analyze the query, understand its nuances, and assess its true complexity before deciding which downstream language model is best suited to answer it. For instance, a query asking for a simple factual recall ("What is the capital of France?") would be routed to a flash model, while a request requiring deep analysis or creative generation ("Explain the economic impact of global warming on developing nations") would be sent to a pro model.

from google.adk.agents import Agent

# from google.adk.models.lite_llm import LiteLlm # If using models not directly supported by ADK's default Agent

# Agent using the more expensive Gemini Pro 2.5
gemini_pro_agent = Agent(
    name="GeminiProAgent",
    model="gemini-2.5-pro",  # Placeholder for actual model name if different
    description="A highly capable agent for complex queries.",
    instruction="You are an expert assistant for complex problem-solving.",
)

# Agent using the less expensive Gemini Flash 2.5
gemini_flash_agent = Agent(
    name="GeminiFlashAgent",
    model="gemini-2.5-flash",  # Placeholder for actual model name if different
    description="A fast and efficient agent for simple queries.",
    instruction="You are a quick assistant for straightforward questions.",
)

In [None]:
# Conceptual Python-like structure, not runnable code

from google.adk.agents import Agent, BaseAgent
from google.adk.events import Event
from google.adk.agents.invocation_context import InvocationContext
import asyncio


class QueryRouterAgent(BaseAgent):
    name: str = "QueryRouter"
    description: str = (
        "Routes user queries to the appropriate LLM agent based on complexity."
    )

    async def _run_async_impl(
        self, context: InvocationContext
    ) -> AsyncGenerator[Event, None]:
        user_query = context.current_message.text  # Assuming text input
        query_length = len(user_query.split())  # Simple metric: number of words

        if query_length < 20:  # Example threshold for simplicity vs. complexity
            print(
                f"Routing to Gemini Flash Agent for short query (length: {query_length})"
            )
            # In a real ADK setup, you would 'transfer_to_agent' or directly invoke
            # For demonstration, we'll simulate a call and yield its response
            response = await gemini_flash_agent.run_async(context.current_message)
            yield Event(author=self.name, content=f"Flash Agent processed: {response}")
        else:
            print(
                f"Routing to Gemini Pro Agent for long query (length: {query_length})"
            )
            response = await gemini_pro_agent.run_async(context.current_message)
            yield Event(author=self.name, content=f"Pro Agent processed: {response}")


CRITIC_SYSTEM_PROMPT = """
You are the **Critic Agent**, serving as the quality assurance arm of our collaborative research assistant system. Your primary function is to **meticulously review and challenge** information from the Researcher Agent, guaranteeing **accuracy, completeness, and unbiased presentation**.

Your duties encompass:
* **Assessing research findings** for factual correctness, thoroughness, and potential leanings.
* **Identifying any missing data** or inconsistencies in reasoning.
* **Raising critical questions** that could refine or expand the current understanding.
* **Offering constructive suggestions** for enhancement or exploring different angles.
* **Validating that the final output is comprehensive** and balanced.

All criticism must be constructive. Your goal is to fortify the research, not invalidate it. Structure your feedback clearly, drawing attention to specific points for revision. Your overarching aim is to ensure the final research product meets the highest possible quality standards.
"""

## OpenAI-based Resource-Aware Query Router
This code demonstrates a resource-aware agent using OpenAI models. It:
1. Checks if a prompt is simple (factual), needs reasoning, or needs new information (internet search).
2. Uses `gpt-4o-mini` for simple queries, `o4-mini` for reasoning, and for search, calls Google Search API and then passes the result to `gpt-4o-mini`.

Prompt used - Using this context here can you write a cod eusing OpenAI which 1. Checks the Prompt and evaluates if its a simple query and then it can be handled using "gpt-4o-mini" and its a Prompt which needs Reasoning then "o4-mini" and if this needs a new infor or internet search then it first calls google search API, gets result and passes that to "gpt-4o-mini"

In [6]:
import openai
import requests
import os
from dotenv import load_dotenv, find_dotenv
import asyncio


# Load environment variables
_ = load_dotenv(find_dotenv())
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
if not OPENAI_API_KEY:
    raise EnvironmentError("Please set the OPENAI_API_KEY environment variable.")

GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")
if not GOOGLE_API_KEY:
    raise EnvironmentError("Please set the GOOGLE_API_KEY environment variable.")

GOOGLE_CSE_ID = os.getenv("GOOGLE_CSE_ID")
if not GOOGLE_CSE_ID:
    raise EnvironmentError("Please set the GOOGLE_CSE_ID environment variable.")


def is_simple_query(prompt):
    # Heuristic: short, factual, no reasoning words
    reasoning_words = [
        "why",
        "explain",
        "reason",
        "analyze",
        "compare",
        "how",
        "impact",
        "effect",
        "cause",
    ]
    if len(prompt.split()) < 15 and not any(
        w in prompt.lower() for w in reasoning_words
    ):
        return True
    return False


def needs_search(prompt):
    # Heuristic: asks for latest, current, or "search"/"find"
    search_words = ["latest", "current", "today", "search", "find", "news"]
    return any(w in prompt.lower() for w in search_words)


def google_search(query):
    loop = (
        asyncio.get_event_loop()
        if asyncio.get_event_loop().is_running()
        else asyncio.new_event_loop()
    )

    async def fetch():
        url = f"https://www.googleapis.com/customsearch/v1?q={query}&key={GOOGLE_API_KEY}&cx={GOOGLE_CSE_ID}"

        # Use requests in a thread to avoid blocking event loop
        def req():
            resp = requests.get(url)
            if resp.status_code == 200:
                results = resp.json().get("items", [])
                return "\n".join([item["snippet"] for item in results[:3]])
            return "No results found."

        return await asyncio.to_thread(req)

    if asyncio.get_event_loop().is_running():
        return loop.run_until_complete(fetch())
    else:
        return asyncio.run(fetch())


def ask_openai(prompt, model="gpt-4o-mini"):
    response = openai.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        max_completion_tokens=512,
    )
    return response.choices[0].message.content.strip()


def route_query(prompt):
    if is_simple_query(prompt):
        print("Routing: Using gpt-4o-mini for simple query")
        return ask_openai(prompt, model="gpt-4o-mini")

    elif needs_search(prompt):
        print("Routing: Using Google Search API + gpt-4o-mini")
        search_result = google_search(prompt)  # Now waits for search to complete
        new_prompt = f"Using this information from Google search, answer the following question as best as possible.\nSearch Results:\n{search_result}\n\nQuestion: {prompt}"
        return ask_openai(new_prompt, model="gpt-4o-mini")

    else:
        print("Routing: Using o4-mini for reasoning query")
        return ask_openai(prompt, model="o4-mini")


# Example usage:
user_prompt = "What is the capital of France?"
print(route_query(user_prompt))
print("-------------------------")

user_prompt2 = "Explain the impact of quantum computing on cryptography."
print(route_query(user_prompt2))
print("-------------------------")

user_prompt3 = "What is the latest news about AI regulation?"
print(route_query(user_prompt3))

Routing: Using gpt-4o-mini for simple query
The capital of France is Paris.
-------------------------
Routing: Using o4-mini for reasoning query
The capital of France is Paris.
-------------------------
Routing: Using o4-mini for reasoning query

-------------------------
Routing: Using gpt-4o-mini for simple query

-------------------------
Routing: Using gpt-4o-mini for simple query
As of my last update in October 2023, discussions and developments around AI regulation have been highly active globally. Here are some of the key trends and developments:

1. **European Union AI Act**: The EU was in the process of finalizing its AI Act, which aims to establish a comprehensive regulatory framework for AI technologies. This legislation categorizes AI systems based on risk levels and imposes stricter requirements on high-risk applications, particularly those in critical sectors such as healthcare, transportation, and law enforcement.

2. **U.S. Initiatives**: In the United States, various f